CN117333844A - Foggy vehicle detection model constraint method and device, foggy vehicle detection method and device - Google Patents

Foggy vehicle detection model constraint method and device, foggy vehicle detection method and device

Info

Publication number
CN117333844A
CN117333844A
Authority
CN
China
Prior art keywords
model
foggy
domain
image
source domain
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202311317651.9A
Other languages
Chinese (zh)
Inventor
刘明亮
陈广振
李孟珍
郑强
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Heilongjiang University
Original Assignee
Heilongjiang University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Heilongjiang University filed Critical Heilongjiang University
Priority to CN202311317651.9A priority Critical patent/CN117333844A/en
Publication of CN117333844A publication Critical patent/CN117333844A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/50 Context or environment of the image
    • G06V20/56 Context or environment of the image exterior to a vehicle by using sensors mounted on the vehicle
    • G06V20/58 Recognition of moving objects or obstacles, e.g. vehicles or pedestrians; Recognition of traffic objects, e.g. traffic signs, traffic lights or roads
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/0464 Convolutional networks [CNN, ConvNet]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/40 Scenes; Scene-specific elements in video content

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Multimedia (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Computing Systems (AREA)
  • Software Systems (AREA)
  • Biomedical Technology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Molecular Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Medical Informatics (AREA)
  • Databases & Information Systems (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Image Analysis (AREA)

Abstract

The foggy-day vehicle detection model constraint method and device and the foggy-day vehicle detection method and device relate to the field of image data processing. To solve the technical problems in the prior art that traffic flow statistics do not specifically address foggy environments, that existing target detection algorithms cannot meet the requirements of real-time vehicle detection and statistics in fog, and that a highly integrated traffic flow detection system is lacking, the invention provides the following technical scheme. The foggy-day vehicle detection model constraint method comprises the following steps: collecting a source domain; collecting a target domain; generating a class source domain; generating a class target domain; inputting the class source domain to a teacher model; inputting the target domain and the class target domain to a student model; inputting the source domain to a guide model; obtaining a distillation loss; obtaining a consistency loss; and constraining the student model according to the distillation loss and the consistency loss. The method is suitable for application in foggy-day traffic flow detection.

Description

Foggy vehicle detection model constraint method and device, foggy vehicle detection method and device
Technical Field
The invention relates to the field of image data processing, and in particular to the detection of vehicles in foggy-day images.
Background
In severe weather such as fog and haze, visual perception and scene understanding for applications such as urban security and autonomous driving become extremely difficult. These atmospheric phenomena introduce nonlinear noise, blurring, contrast degradation, and darkening of the image, which makes traffic detection challenging. Because the quality of images captured by cameras drops in fog, false-detection and missed-detection rates rise and overall accuracy falls, traffic problems become more likely, and serious traffic accidents can result. Therefore, acquiring the types and numbers of motor vehicles on the road more accurately and effectively, and providing higher-quality data, is a precondition for intelligent road traffic management and has great practical significance. To cope with foggy environments, this work studies traffic images and video taken in fog and realizes traffic flow detection in fog through a domain-adaptive target detection algorithm and a vehicle tracking algorithm, providing higher-quality services in a targeted manner; this is of great practical significance for reducing unnecessary economic loss and labor cost.
Timely collection and analysis of traffic flow information directly reflects current road conditions, helps traffic departments plan and allocate traffic resources, and relieves congestion in time, providing a scientific basis for more accurate, comprehensive, and reliable decisions. In haze, however, visibility is low and visual quality is poor: vehicle targets are easily occluded and blurred, distant vehicles are hard to recognize, and large numbers of false and missed detections occur, degrading the accuracy of traffic flow detection.
Accurate traffic flow statistics are an important aspect of realizing an intelligent transportation system, and the analysis of traffic flow information matters for subsequent processing. Early traffic flow statistics mainly relied on means such as ultrasonic and infrared sensing, with detection equipment consisting mainly of mechanical piezoelectric detectors, electromagnetic detectors, and the like. When a vehicle passes, a mechanical piezoelectric sensor in the road surface deforms and generates pressure, thereby sensing the passing vehicle; electromagnetic detection judges the vehicle from the change in the magnetic field between a magnetic induction coil in the road surface and the metal of the vehicle. Both methods require embedding sensors in the ground, which damages the road, makes maintenance difficult, and is costly. Compared with these traditional methods, detection based on video images is environmentally friendly, simple, and efficient, so traffic flow statistics based on video image detection have gradually become a major research hotspot.
At present, however, traffic flow statistics do not specifically address foggy environments, existing target detection algorithms cannot meet the requirements of real-time vehicle detection and statistics in fog, and a highly integrated traffic flow detection system is lacking.
Disclosure of Invention
To solve the technical problems in the prior art that traffic flow statistics do not specifically address foggy environments, that existing target detection algorithms cannot meet the requirements of real-time vehicle detection and statistics in fog, and that a highly integrated traffic flow detection system is lacking, the invention provides the following technical scheme:
A foggy-day vehicle detection model constraint method, the method comprising:
collecting image data under normal weather as a source domain;
collecting image data under foggy weather as a target domain;
generating labeled images from the source domain as a class source domain;
generating labeled images from the target domain as a class target domain;
inputting the class source domain to a teacher model after weak data enhancement;
inputting the target domain and the class target domain to a student model after strong data enhancement;
inputting the source domain to a guide model after strong data enhancement;
obtaining a distillation loss according to the teacher model and the student model;
obtaining a consistency loss according to the student model and the guide model;
and constraining the student model according to the distillation loss and the consistency loss.
Further, a preferred embodiment is provided, wherein the class source domain and the class target domain are obtained by a CycleGAN method from the source domain and the target domain, respectively.
Further, a preferred embodiment is provided, wherein the weak data enhancement specifically comprises: subjecting the image to random horizontal flipping and cropping.
Further, a preferred embodiment is provided, wherein the strong data enhancement specifically comprises: subjecting the image to random color jitter, grayscale processing, and Gaussian blur processing.
Further, a preferred embodiment is provided, wherein the student model uses the YOLOv8s model.
Based on the same inventive concept, the invention also provides a foggy-day vehicle detection model constraint device, the device comprising:
a module for collecting image data under normal weather as a source domain;
a module for collecting image data under foggy weather as a target domain;
a module for generating labeled images from the source domain as a class source domain;
a module for generating labeled images from the target domain as a class target domain;
a module for inputting the class source domain to a teacher model after weak data enhancement;
a module for inputting the target domain and the class target domain to a student model after strong data enhancement;
a module for inputting the source domain to a guide model after strong data enhancement;
a module for obtaining a distillation loss according to the teacher model and the student model;
a module for obtaining a consistency loss according to the student model and the guide model;
and a module for constraining the student model according to the distillation loss and the consistency loss.
Based on the same inventive concept, the invention also provides a foggy-day vehicle detection method, the method comprising:
collecting a scene image to be detected;
and detecting vehicles in the image according to the method described above.
Based on the same inventive concept, the invention also provides a foggy-day vehicle detection device, the device comprising:
a module for collecting a scene image to be detected;
and a module for detecting vehicles in the image according to the device described above.
Based on the same inventive concept, the invention also provides a computer storage medium for storing a computer program which, when read by a computer, performs the method described above.
Based on the same inventive concept, the invention also provides a computer comprising a processor and a storage medium; when the processor executes a computer program stored in the storage medium, the computer performs the method described above.
Compared with the prior art, the technical scheme provided by the invention has the following advantages:
the foggy-day vehicle detection method provided by the invention can adapt to foggy-day environments, obtain the category and detection frame coordinate information of the vehicle, and has higher detection accuracy than that of an equal-magnitude detection algorithm.
The foggy day vehicle detection method provided by the invention supports the real-time tracking function of the detected vehicle on the basis of the ByteTrack tracking algorithm, and the tracking effect is better than that of classical tracking algorithms such as DeepSORT, MOTDT.
The foggy vehicle detection method provided by the invention supports two detection modes of local server detection and soft and hard integrated privately-arranged detection, and can meet the requirements of different scenes of users.
The foggy-day vehicle detection method provided by the invention supports the development of a foggy-day vehicle flow detection system based on a PyQt frame in cooperation with the prior art, is beneficial to the detection method provided by the invention, can realize the vehicle flow detection of online video stream or offline video, supports simultaneous detection of multiple lanes, can customize a detection area, and can automatically save detection results and record and analyze.
According to the foggy day vehicle detection method provided by the invention, the source domain and the target domain and the class source domain and class target domain data set generated through the CycleGAN image conversion algorithm are used as model input, and the offset distribution of the vehicle characteristics in normal weather and foggy days is reduced through four loss functions.
The method is suitable for being applied to the work of detecting the flow of the foggy weather.
Drawings
Fig. 1 is a flow chart of a foggy-day vehicle detection model constraint method according to an embodiment;
FIG. 2 is an illustration of an original Cityscapes dataset image and the corresponding Cityscapes_foggy dataset images as referred to in embodiment eleven;
wherein (a) is the original image, (b) is a light fog image, (c) is a medium fog image, and (d) is a dense fog image;
FIG. 3 is an illustration of an original UA-DETRAC dataset image and the corresponding UA-DETRAC_foggy dataset images as referred to in embodiment eleven;
wherein (a) is the original image, (b) is a light fog image, (c) is a medium fog image, and (d) is a dense fog image;
FIG. 4 is an example of class source domain and class target domain images referred to in embodiment eleven;
wherein (a) is a source domain image, (b) is a class source domain image, (c) is a target domain image, and (d) is a class target domain image;
FIG. 5 is a Cityscapes dataset detection example referred to in embodiment eleven;
wherein (a) is a light fog image and (b) is a dense fog image;
FIG. 6 is a UA-DETRAC dataset detection example mentioned in embodiment eleven;
wherein (a) is a light fog scene and (b) is a medium fog scene.
Detailed Description
To make the advantages and benefits of the technical solution provided by the present invention more apparent, the technical solution will now be described in further detail with reference to the accompanying drawings:
in an embodiment, the present embodiment is described with reference to fig. 1, and provides a foggy vehicle detection model constraint method, including:
collecting image data under normal weather as a source domain;
collecting image data under foggy weather as a target domain;
generating an image with a label through the source domain as a source domain-like step;
generating an image with a label through the target domain as a class target domain;
the class source domain is input to a teacher model after being enhanced by weak data;
the target domain and the category target domain are enhanced by strong data and then input into a student model;
the source domain is enhanced by strong data and then is input into a guiding model;
obtaining distillation loss according to the teacher model and the student model;
obtaining consistency loss according to the student model and the guiding model;
and constraining the student model according to the distillation loss and the consistency loss.
Specifically, DAOD (Domain Adaptive Object Detection) is a deep learning approach that, unlike traditional target detection, considers the adaptability of the model across different data domains; through specific domain adaptation techniques, the detection model gains stronger generalization capability and practical application value.
A model that detects well in normal weather degrades sharply in fog because the data distributions of the two environment domains differ: the vehicle features the model learns in normal weather differ greatly from vehicle features in fog, so the detection effect deteriorates.
To reduce the data distribution difference between the normal-weather and foggy-weather domains and thereby improve vehicle detection in fog, this embodiment designs a domain-adaptive detection model, TIMT, based on target detection technology combined with the Mean Teacher (MT) and CycleGAN image style-transfer algorithms; the overall structure of the model is shown in fig. 1.
In the figure, the source domain and the target domain represent image data under normal weather and fog respectively; the class source domain is converted from the target domain and the class target domain from the source domain by the CycleGAN method, so the source domain and class target domain images have labels while the target domain and class source domain images do not. The class source domain images are input to the teacher model after weak data enhancement such as random horizontal flipping and cropping; the images of the remaining three domains are input to the student model and the guide model after strong data enhancement such as random color jitter, grayscale, and Gaussian blur. The models are constrained by two detection losses, a consistency loss, and a distillation loss, which reduce the data distribution difference between the source and target domains; the trained student model finally serves as the detection model for the foggy-day vehicle detection task. The teacher, student, and guide models can be common target detection models such as YOLO, SSD, or Faster RCNN; this embodiment selects the one-stage detection model YOLOv8s for its better detection performance and efficiency.
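To make the training flow concrete, the following is a minimal sketch of one training step under the structure just described; it is an illustration, not the patent's code. The helpers detect_loss (the YOLOv8 box-plus-classification loss of formula (1-7) below) and filter_pseudo_labels (NMS plus confidence filtering, sketched later in this section), the teacher's output format, and the pairing of the two loss weights with the loss terms are all assumptions:

```python
import torch
import torch.nn.functional as F

def train_step(student, teacher, guide, batch, rho=0.004, sigma=2.5):
    """One TIMT training step (sketch). detect_loss and filter_pseudo_labels
    are assumed helpers, not an API given in the patent."""
    # Batch contents: weakly augmented class source images; strongly augmented
    # target, class target, and source images; source-domain labels.
    cls_src_w, tgt_s, cls_tgt_s, src_s, boxes_s, cats_s = batch

    # Teacher generates pseudo labels on the weakly augmented class source domain.
    with torch.no_grad():
        pseudo_boxes, pseudo_cats = filter_pseudo_labels(*teacher(cls_src_w))

    # Distillation loss (1-6): student on the target domain vs teacher pseudo labels.
    l_dis = detect_loss(student(tgt_s), pseudo_boxes, pseudo_cats)

    # Detection loss 1 (1-8): student on the class target domain, which keeps
    # the source-domain labels after conversion.
    pred_cls_tgt = student(cls_tgt_s)
    l_det1 = detect_loss(pred_cls_tgt, boxes_s, cats_s)

    # Detection loss 2 (1-9): independent supervised branch through the guide model.
    pred_src = guide(src_s)
    l_det2 = detect_loss(pred_src, boxes_s, cats_s)

    # Consistency loss (1-10): MAE between student and guide predictions.
    l_con = F.l1_loss(pred_cls_tgt, pred_src)

    # Total loss (1-11); which weight multiplies which term is an assumption.
    return l_det1 + l_det2 + rho * l_dis + sigma * l_con
```

After each optimizer step on the student, the teacher weights would be refreshed by the EMA update of formula (1-5), sketched further below.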
The input data generation and definition are specifically as follows:
for a domain self-adaptive detection task of a foggy vehicle, defining normal weather data as a source domain and foggy data as a target domain, wherein the foggy data are represented by a formula (1-1) and a formula (1-2) respectively:
wherein D is s And D t Respectively representing source domain and target domain data, N s And N t The number of images representing the source domain and the target domain respectively,and->Respectively representing a labeled source domain image and an unlabeled target domain image, wherein i represents an ith image; for source domain data->And->The sub-table represents all target frame coordinate information and category information of the label in the ith image, and the specific definition is defined by formulas (1-3) and (1-4):
wherein B is j The j-th coordinate frame information in the image is M is the number of target frames in the image, and x is the number of target frames in the image j ,y j ,w j ,h j The j-th target frame left upper corner abscissa, left upper corner ordinate, target frame width and target frame height; c (C) j And c is the total number of categories for the category information of the jth target frame.
In addition to the source-domain and target-domain data, this embodiment generates class source domain and class target domain images with the CycleGAN image style-transfer algorithm; they are converted from the target-domain images $I_t$ and the source-domain images $I_s$ respectively, as shown on the left side of fig. 1. The conversion does not change the label information (bounding boxes and categories) of the class source domain and class target domain data, which are used together with the source and target domains as inputs to the model.
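As an illustration of this conversion step, the sketch below runs images through trained CycleGAN generators. The generator objects, file paths, and normalization are assumptions; the patent specifies only that CycleGAN performs the style transfer:

```python
import torch
from torchvision import transforms
from PIL import Image

to_tensor = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize([0.5] * 3, [0.5] * 3),   # CycleGAN's usual [-1, 1] range
])

def translate(generator, image_path):
    """Push one image through a CycleGAN generator G: X -> Y."""
    x = to_tensor(Image.open(image_path).convert("RGB")).unsqueeze(0)
    generator.eval()
    with torch.no_grad():
        y = generator(x)
    return (y.squeeze(0) * 0.5 + 0.5).clamp(0, 1)  # map back to [0, 1]

# G_clear2fog and G_fog2clear would be the two trained CycleGAN generators:
# class_target = translate(G_clear2fog, "source/0001.png")  # source -> class target
# class_source = translate(G_fog2clear, "target/0001.png")  # target -> class source
```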
The mean teacher model is specifically as follows:
The Mean Teacher (MT) model is a knowledge distillation technique widely used in deep learning, first applied in semi-supervised learning. It consists of two networks with the same architecture, a teacher model and a student model, both YOLOv8s in this embodiment. The student model performs supervised learning on the labeled data, while the weights of the teacher model are updated through an exponential moving average (EMA) of the student model's weights; the EMA formula is shown in formula (1-5):

$$W_t = \gamma W_t + (1 - \gamma) W_s \tag{1-5}$$

where $W_t$ and $W_s$ denote the weights of the teacher model and the student model respectively, and $\gamma$ is a decay index close to 1, typically taken as 0.99 or 0.999.
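A direct PyTorch rendering of formula (1-5), applied after each student update, could look like the following sketch; buffer handling and update scheduling are implementation choices the patent does not fix:

```python
import torch

@torch.no_grad()
def ema_update(teacher, student, gamma=0.999):
    # W_t = gamma * W_t + (1 - gamma) * W_s, per formula (1-5)
    for w_t, w_s in zip(teacher.parameters(), student.parameters()):
        w_t.mul_(gamma).add_(w_s, alpha=1.0 - gamma)
```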
This embodiment improves the mean teacher model as follows:
Conventionally, the student model undergoes supervised training on source-domain data while the teacher model is trained on unlabeled target-domain data. Because the teacher model's weight parameters are obtained from the student model through EMA, the teacher's prediction capability becomes biased toward the source-domain data, so the student model cannot learn the features of the target domain and distillation fails.
To solve this problem, as shown in fig. 1, this embodiment uses the class source domain data as the teacher model's input after weak data enhancement such as random horizontal flipping and cropping, and uses the target domain and class target domain data as the student model's input after strong data enhancement such as random color jitter, grayscale, and Gaussian blur; the data enhancement increases the robustness of the model. During distillation of the MT model, target boxes generated by the teacher model with high confidence are selected as pseudo labels to guide the training of the target-domain data branch in the student model, and the distillation loss is defined as formula (1-6):
$$L_{dis} = L_{det}(I^t, \hat{B}^t, \hat{C}^t) \tag{1-6}$$

where $L_{dis}$ is the distillation loss, and $\hat{B}^t$ and $\hat{C}^t$ are the coordinate-box information and category information of the pseudo labels generated by the teacher model after non-maximum suppression (NMS) and confidence sorting; $L_{det}$ is the ordinary detection loss, defined as formula (1-7):

$$L_{det}(I, B, C) = L_{box}(I, B) + L_{cls}(I, C) \tag{1-7}$$

where $L_{box}(I, B)$ is the CIoU loss plus DFL loss of the predicted bounding boxes, and $L_{cls}(I, C)$ is the binary cross-entropy loss of the category information.
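The pseudo-label selection described above (NMS followed by confidence screening) could look like the following sketch; the IoU and confidence thresholds are assumed values, since the patent states only that high-confidence boxes are kept:

```python
import torch
from torchvision.ops import nms

def filter_pseudo_labels(boxes, scores, classes, iou_thr=0.5, conf_thr=0.8):
    """boxes: (N, 4) in xyxy format; scores: (N,); classes: (N,) from the teacher."""
    keep = nms(boxes, scores, iou_thr)            # non-maximum suppression
    boxes, scores, classes = boxes[keep], scores[keep], classes[keep]
    confident = scores > conf_thr                 # keep only high-confidence boxes
    return boxes[confident], classes[confident]
```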
As can be seen from fig. 1, in addition to the distillation loss $L_{dis}$ there is a detection loss 1, obtained by feeding the class target domain data into the student model. Since the class target domain data is converted from the source domain, its label information is the same as the source domain's, and detection loss 1 can be defined as formula (1-8):

$$L_{det1}(I^{t'}, B^s, C^s) = L_{box}(I^{t'}, B^s) + L_{cls}(I^{t'}, C^s) \tag{1-8}$$

where $I^{t'}$ denotes a class target domain image.
in order to further ensure that the teacher model and the student model do not deviate from the source domain data and learn more target domain feature information as much as possible, the source domain data is not directly input into the student model, but rather is used as an additional branch (the bottom yellow arrow part in fig. 1) to perform independent supervised training through a guiding model YOLOv8s, and the training loss is recorded as detection loss 2. Since the source domain and the class target domain have the same label information, and the class target domain data is converted from the source domain data, and the prediction information of the source domain data and the class target domain data is identical, the consistency loss is added as a prediction constraint, and the detection loss 2 and the consistency loss are respectively defined as the following formulas (1-9) and (1-10):
L det2 (I s ,B s ,C s )=L box (I s ,B s )+L cls (I s ,B s ) (1-9)
wherein, the consistency loss adopts an average absolute error (MAE) loss. Such total loss L can be expressed as formula (1-11):
where ρ and σ are hyper-parameters, the specific values will be determined experimentally.
Embodiment two further limits the foggy-day vehicle detection model constraint method provided in embodiment one: the class source domain and the class target domain are obtained by the CycleGAN method from the source domain and the target domain, respectively.
Embodiment three further limits the foggy-day vehicle detection model constraint method provided in embodiment one, the weak data enhancement specifically comprising: subjecting the image to random horizontal flipping and cropping.
Embodiment four further limits the foggy-day vehicle detection model constraint method provided in embodiment one, the strong data enhancement specifically comprising: subjecting the image to random color jitter, grayscale processing, and Gaussian blur processing.
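As a concrete reading of embodiments three and four, a torchvision sketch of the two pipelines follows; the crop size, jitter strengths, probabilities, and blur kernel are assumed values, since the patent fixes only the operation types:

```python
from torchvision import transforms

# Weak enhancement (embodiment three): flip and crop only.
weak_aug = transforms.Compose([
    transforms.RandomHorizontalFlip(p=0.5),          # random horizontal flip
    transforms.RandomCrop(640, padding=16),          # random crop
    transforms.ToTensor(),
])

# Strong enhancement (embodiment four): color jitter, grayscale, Gaussian blur.
strong_aug = transforms.Compose([
    transforms.ColorJitter(0.4, 0.4, 0.4, 0.1),      # random color jitter
    transforms.RandomGrayscale(p=0.2),               # random grayscale
    transforms.GaussianBlur(kernel_size=5, sigma=(0.1, 2.0)),  # Gaussian blur
    transforms.ToTensor(),
])
```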
Embodiment five further limits the foggy-day vehicle detection model constraint method provided in embodiment one: the student model uses a YOLOv8s model.
Embodiment six: this embodiment provides a foggy-day vehicle detection model constraint device, the device comprising:
a module for collecting image data under normal weather as a source domain;
a module for collecting image data under foggy weather as a target domain;
a module for generating labeled images from the source domain as a class source domain;
a module for generating labeled images from the target domain as a class target domain;
a module for inputting the class source domain to a teacher model after weak data enhancement;
a module for inputting the target domain and the class target domain to a student model after strong data enhancement;
a module for inputting the source domain to a guide model after strong data enhancement;
a module for obtaining a distillation loss according to the teacher model and the student model;
a module for obtaining a consistency loss according to the student model and the guide model;
and a module for constraining the student model according to the distillation loss and the consistency loss.
Embodiment seven: this embodiment provides a foggy-day vehicle detection method, the method comprising:
collecting a scene image to be detected;
and a step of detecting vehicles in the image according to the method provided in embodiment one.
Embodiment eight: this embodiment provides a foggy-day vehicle detection device, the device comprising:
a module for collecting a scene image to be detected;
and a module for detecting vehicles in the image according to the device provided in embodiment six.
Embodiment nine: this embodiment provides a computer storage medium storing a computer program which, when read by a computer, performs the method provided in any one of embodiments one to five and seven.
Embodiment ten: this embodiment provides a computer comprising a processor and a storage medium; when the processor executes a computer program stored in the storage medium, the computer performs the method provided in any one of embodiments one to five and seven.
Embodiment eleven: described with reference to fig. 2 to 6, this embodiment further illustrates the foggy-day vehicle detection model constraint method provided in embodiment one through specific examples and experiments, specifically:
Datasets:
This embodiment uses the Cityscapes, Cityscapes_foggy, and UA-DETRAC datasets for model training and testing.
The Cityscapes dataset, i.e. the city landscape dataset, is a widely used open-source dataset designed for autonomous driving and computer vision tasks. It comprises 2975 training images and 500 test images recorded on the streets of 50 different German cities, with detailed information on streets, traffic signs, pedestrians, vehicles, and so on. The Cityscapes_foggy dataset is obtained by fogging Cityscapes, computing per-pixel gray-level weights from distance factors and projection transformation; according to the degree of fogging there are three versions, light fog, medium fog, and dense fog, as shown in fig. 2.
UA-DETRAC is a challenging real-world multi-target detection and multi-target tracking benchmark. The dataset contains 10 hours of video taken at 24 different locations in Beijing and Tianjin, China, with a Canon EOS 550D camera; video is recorded at 25 frames per second (fps) at a resolution of 960 x 540 pixels. The UA-DETRAC dataset has more than 140,000 frames and 8250 manually labeled vehicles, with 1.21 million labeled object bounding boxes in total; the training set contains 82085 images and the test set 56167 images. The dataset can be used for multi-target detection and multi-target tracking algorithm development. The UA-DETRAC dataset is fogged in the same way as Cityscapes_foggy to obtain UA-DETRAC_foggy datasets in light fog, medium fog, and dense fog versions; partial images are shown in fig. 3.
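For reference, foggy variants of clear-weather datasets are commonly synthesized with the atmospheric scattering model I = J*t + A*(1 - t), with transmission t = exp(-beta*d) for per-pixel distance d. The sketch below follows that common approach and is an assumption, since the patent does not spell out its exact fogging procedure:

```python
import numpy as np

def add_fog(image, depth, beta=0.02, airlight=0.9):
    """image: HxWx3 float array in [0, 1]; depth: HxW per-pixel distance.
    beta controls fog density (larger = denser); airlight is the atmospheric light A."""
    t = np.exp(-beta * depth)[..., None]        # transmission map t(x) = exp(-beta*d(x))
    return image * t + airlight * (1.0 - t)     # I = J*t + A*(1 - t)
```

Running this with three beta values would yield the light, medium, and dense fog versions.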
generating class source domain and class target domain images by using a CycleGAN style conversion algorithm on Cityscapes, cityscapes _foggy, UA-DETRAC and UA-DETRAC_foggy data sets respectively, taking Cityscapes and Cityscapes_foggy thick fog data sets as examples, and the conversion results are shown in FIG. 4:
it can be seen that the class source domain and class target domain images, while distorted in some detail, generally better restore the spatial features of the source and target domains. In order to further ensure the effectiveness of the converted data set, a similarity test is performed based on the data sets before and after conversion, and the evaluation index is based on FID (fre chet Inception Distance), which is a measure for calculating the distance between the feature vectors of the real image and the generated image, and a smaller value represents a higher degree of similarity. The test results are shown in table 1:
TABLE 1 FID values of images before and after conversion

Source domain / target domain    Class source domain / class target domain    FID
Cityscapes                       Cityscapes_foggy                             34.58
Cityscapes_foggy                 Cityscapes                                   32.12
UA-DETRAC                        UA-DETRAC_foggy                              33.57
UA-DETRAC_foggy                  UA-DETRAC                                    30.85

The average FID over the four conversions is 32.78, below 35, indicating that the conversion effect is good and meets the requirements of the class source domain and class target domain datasets.
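A sketch of such a similarity check using the torchmetrics FID implementation follows; torchmetrics is an assumed tooling choice (the patent names no library), and the random tensors stand in for batches of original and converted images:

```python
import torch
from torchmetrics.image.fid import FrechetInceptionDistance

fid = FrechetInceptionDistance(feature=2048)

# Stand-ins for uint8 image batches of shape (N, 3, H, W).
real_batch = torch.randint(0, 256, (16, 3, 299, 299), dtype=torch.uint8)
fake_batch = torch.randint(0, 256, (16, 3, 299, 299), dtype=torch.uint8)

fid.update(real_batch, real=True)    # e.g. Cityscapes_foggy images
fid.update(fake_batch, real=False)   # e.g. class target domain (fogged) images
print(float(fid.compute()))          # lower value = more similar distributions
```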
Considering that the application scenario of this embodiment is mainly traffic flow statistics of motor vehicles on roads, three categories in Cityscapes and UA-DETRAC, namely car, minibus, and bus, are selected for detection.
Training details and evaluation indexes:
and selecting YOLOv8s as a backbone detection network, wherein the training data set uses a labeled source domain and category target domain data set and a label-free target domain and category source domain data set, the training test proportion of the Cityscapes and Cityscapes_foggy data sets is 6:1, the training test proportion of the UA-DETRAC and UA-DETRAC data sets is 7:3, the image input size is 640 x 640, and the Batchsize is set to 16. The model training framework is based on PyTorch, the training platform is based on NVIDIA GeForce RTX 4090, the video memory size is 24GB, the training round is 300, the EMA attenuation index gamma is set to be 0.999, the loss coefficient is set to be 0.004 and 2.5 respectively.
mAP (mean Average Precision) is adopted as the evaluation index for target detection; mAP is defined by formula (1-12):

$$mAP = \frac{1}{N} \sum_{i=1}^{N} AP_i \tag{1-12}$$

where $N$ is the number of categories (3 in this embodiment) and $AP$ is the average precision, i.e. the area enclosed under the PR curve formed by precision and recall, which are given by formulas (1-13) and (1-14) respectively:

$$Precision = \frac{TP}{TP + FP} \tag{1-13}$$

$$Recall = \frac{TP}{TP + FN} \tag{1-14}$$

where TP is a positive sample predicted as positive by the model, FP is a negative sample predicted as positive, and FN is a positive sample predicted as negative.
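A small worked example of formulas (1-13) and (1-14); the counts are purely illustrative:

```python
def precision_recall(tp, fp, fn):
    precision = tp / (tp + fp) if tp + fp else 0.0   # formula (1-13)
    recall = tp / (tp + fn) if tp + fn else 0.0      # formula (1-14)
    return precision, recall

# Example: 80 correct detections, 10 false alarms, 20 missed vehicles.
p, r = precision_recall(tp=80, fp=10, fn=20)
print(f"precision={p:.3f}, recall={r:.3f}")   # precision=0.889, recall=0.800
```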
Results and analysis:
training tests were performed on the Cityscapes dataset and the UA-detac dataset for three fog concentrations, with partial test results shown in fig. 5, 6 and tables 2 and 3, respectively:
TABLE 2 Cityscapes dataset test results

Model / AP@0.5    Car     Minibus    Bus     mAP@0.5
YOLOv5s           58.5    48.9       72.1    59.8
YOLOXs            59.2    50.1       73.4    60.9
YOLOv8s           60.2    53.8       74.1    62.7
TIMT              64.7    56.5       76.4    65.8
TABLE 3 UA-DETRAC dataset test results

Model / AP@0.5    Car     Minibus    Bus     mAP@0.5
YOLOv5s           69.5    42.1       70.7    60.7
YOLOXs            69.6    43.5       70.9    61.3
YOLOv8s           70.2    45.2       71.5    62.3
TIMT              72.1    45.4       75.1    64.2
The model designed in this embodiment achieves the best results on both the Cityscapes and UA-DETRAC datasets, reaching mAP values of 65.8 and 64.2 respectively at an IoU threshold of 0.5. Buses are detected best in both datasets, with AP values of 76.4 and 75.1 respectively; minibuses are detected worst, at 56.5 and 45.4, probably because minibuses and cars look similar and become even harder to distinguish in fog. As can be seen from fig. 6, although distant targets are very difficult to detect in fog because of their small area and frequent occlusion, medium and large targets carry richer feature information and are detected more reliably, so target vehicles can be tracked to complete the traffic flow statistics task.
The technical solution provided by the present invention has been described in further detail through several specific embodiments to highlight its advantages and benefits. The above specific embodiments are not intended to be limiting; any reasonable modification and improvement of the present invention, combination of embodiments, equivalent substitution, and the like, based on the spirit and principles of the present invention, falls within its scope of protection.
In the description of the present invention, only the preferred embodiments are described, and the scope of the claims should not be limited thereby. Furthermore, references to "one embodiment," "some embodiments," "example," "specific example," or "some examples" mean that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the present invention. In this specification, schematic representations of these terms do not necessarily refer to the same embodiment or example, and the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples; those skilled in the art may combine the different embodiments or examples described in this specification, and their features, without contradiction. Furthermore, the terms "first" and "second" are used for descriptive purposes only and are not to be construed as indicating or implying relative importance or implicitly indicating the number of technical features; a feature defined as "first" or "second" may explicitly or implicitly include at least one such feature. In the description of the present invention, "a plurality" means at least two, for example two or three, unless specifically defined otherwise. Any process or method description in a flowchart or otherwise described herein may be understood as representing a module, segment, or portion of code including one or more executable instructions for implementing specific logical functions or steps of the process; the scope of the preferred embodiments of the present invention includes further implementations in which functions may be executed out of the order shown or discussed, including substantially concurrently or in reverse order depending on the functionality involved, as would be understood by those skilled in the art. The logic and/or steps represented in the flowcharts or otherwise described herein, for example an ordered listing of executable instructions for implementing logical functions, can be embodied in any computer-readable medium for use by or in connection with an instruction execution system, apparatus, or device, such as a computer-based system, a processor-containing system, or another system that can fetch instructions from the instruction execution system, apparatus, or device and execute them. For the purposes of this description, a "computer-readable medium" can be any means that can contain, store, communicate, propagate, or transport the program for use by or in connection with the instruction execution system, apparatus, or device. More specific examples (a non-exhaustive list) of the computer-readable medium include: an electrical connection (electronic device) having one or more wires, a portable computer cartridge (magnetic device), a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber device, and a portable compact disc read-only memory (CD-ROM).
In addition, the computer-readable medium may even be paper or another suitable medium on which the program is printed, since the program can be obtained electronically, for instance by optically scanning the paper or other medium and then compiling, interpreting, or otherwise processing it in a suitable manner if necessary, before storing it in a computer memory. It should be understood that portions of the present invention may be implemented in hardware, software, firmware, or a combination thereof. In the above embodiments, multiple steps or methods may be implemented in software or firmware stored in a memory and executed by a suitable instruction execution system. If implemented in hardware, as in another embodiment, they may be implemented using any one or a combination of the following techniques well known in the art: discrete logic circuits having logic gates for implementing logic functions on data signals, application-specific integrated circuits having suitable combinational logic gates, programmable gate arrays (PGAs), field-programmable gate arrays (FPGAs), and the like.
Those of ordinary skill in the art will appreciate that all or part of the steps carried by the methods of the above embodiments may be implemented by a program instructing related hardware, where the program may be stored in a computer-readable storage medium and, when executed, includes one of or a combination of the steps of the method embodiments. In addition, the functional units in the embodiments of the present invention may be integrated in one processing module, or each unit may exist alone physically, or two or more units may be integrated in one module. The integrated module may be implemented in hardware or as a software functional module; if implemented as a software functional module and sold or used as a stand-alone product, it may also be stored in a computer-readable storage medium.

Claims (10)

1. A foggy-day vehicle detection model constraint method, characterized in that the method comprises:
collecting image data under normal weather as a source domain;
collecting image data under foggy weather as a target domain;
generating labeled images from the source domain as a class source domain;
generating labeled images from the target domain as a class target domain;
inputting the class source domain to a teacher model after weak data enhancement;
inputting the target domain and the class target domain to a student model after strong data enhancement;
inputting the source domain to a guide model after strong data enhancement;
obtaining a distillation loss according to the teacher model and the student model;
obtaining a consistency loss according to the student model and the guide model;
and constraining the student model according to the distillation loss and the consistency loss.
2. The foggy-day vehicle detection model constraint method of claim 1, wherein the class source domain and the class target domain are obtained by a CycleGAN method from the source domain and the target domain, respectively.
3. The foggy-day vehicle detection model constraint method of claim 1, wherein the weak data enhancement specifically comprises: subjecting the image to random horizontal flipping and cropping.
4. The foggy-day vehicle detection model constraint method of claim 1, wherein the strong data enhancement specifically comprises: subjecting the image to random color jitter, grayscale processing, and Gaussian blur processing.
5. The foggy-day vehicle detection model constraint method of claim 1, wherein the student model employs a YOLOv8s model.
6. A foggy-day vehicle detection model constraint device, characterized in that the device comprises:
a module for collecting image data under normal weather as a source domain;
a module for collecting image data under foggy weather as a target domain;
a module for generating labeled images from the source domain as a class source domain;
a module for generating labeled images from the target domain as a class target domain;
a module for inputting the class source domain to a teacher model after weak data enhancement;
a module for inputting the target domain and the class target domain to a student model after strong data enhancement;
a module for inputting the source domain to a guide model after strong data enhancement;
a module for obtaining a distillation loss according to the teacher model and the student model;
a module for obtaining a consistency loss according to the student model and the guide model;
and a module for constraining the student model according to the distillation loss and the consistency loss.
7. A foggy-day vehicle detection method, characterized in that the method comprises:
collecting a scene image to be detected;
and a step of detecting vehicles in the image according to the method of claim 1.
8. A foggy-day vehicle detection device, characterized in that the device comprises:
a module for collecting a scene image to be detected;
and a module for detecting vehicles in the image according to the device of claim 6.
9. A computer storage medium for storing a computer program, characterized in that a computer performs the method of any one of claims 1-5 and 7 when the computer reads the computer program.
10. A computer comprising a processor and a storage medium, characterized in that the computer performs the method of any one of claims 1-5 and 7 when the processor executes a computer program stored in the storage medium.
CN202311317651.9A 2023-10-12 2023-10-12 Foggy vehicle detection model constraint method and device, foggy vehicle detection method and device Pending CN117333844A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311317651.9A CN117333844A (en) 2023-10-12 2023-10-12 Foggy vehicle detection model constraint method and device, foggy vehicle detection method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202311317651.9A CN117333844A (en) 2023-10-12 2023-10-12 Foggy vehicle detection model constraint method and device, foggy vehicle detection method and device

Publications (1)

Publication Number Publication Date
CN117333844A true CN117333844A (en) 2024-01-02

Family

ID=89290029

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311317651.9A Pending CN117333844A (en) 2023-10-12 2023-10-12 Foggy vehicle detection model constraint method and device, foggy vehicle detection method and device

Country Status (1)

Country Link
CN (1) CN117333844A (en)

Similar Documents

Publication Publication Date Title
CN111368687B (en) Sidewalk vehicle illegal parking detection method based on target detection and semantic segmentation
CN112233097B (en) Road scene other vehicle detection system and method based on space-time domain multi-dimensional fusion
CN111626277B (en) Vehicle tracking method and device based on over-station inter-modulation index analysis
CN111666854B (en) High-resolution SAR image vehicle target detection method fusing statistical significance
Chen et al. YOLOv5-based vehicle detection method for high-resolution UAV images
CN112801227B (en) Typhoon identification model generation method, device, equipment and storage medium
CN114495064A (en) Monocular depth estimation-based vehicle surrounding obstacle early warning method
CN113111722A (en) Automatic driving target identification method based on improved Mask R-CNN
Wang et al. FarNet: An attention-aggregation network for long-range rail track point cloud segmentation
CN117079117B (en) Underwater image processing and target identification method and device, storage medium and electronic equipment
Mukhopadhyay et al. Combating bad weather part i: Rain removal from video
CN112418149A (en) Abnormal behavior detection method based on deep convolutional neural network
JP2018124963A (en) Image processing device, image recognition device, image processing program, and image recognition program
CN115359306B (en) Intelligent identification method and system for high-definition images of railway freight inspection
Lashkov et al. Edge-computing-facilitated nighttime vehicle detection investigations with CLAHE-enhanced images
Lai et al. Super resolution of car plate images using generative adversarial networks
CN117333844A (en) Foggy vehicle detection model constraint method and device, foggy vehicle detection method and device
WO2018143278A1 (en) Image processing device, image recognition device, image processing program, and image recognition program
CN112348011B (en) Vehicle damage assessment method and device and storage medium
CN112016534A (en) Neural network training method for vehicle parking violation detection, detection method and device
Yu et al. Leveraging temporal information for 3d detection and domain adaptation
Ziaratnia et al. Development of Robust Vehicle Classification and Tracking for Traffic Analysis
Hu Logistics vehicle tracking method based on intelligent vision
Zeng et al. Detection of vehicle pressure line based on machine learning
Jin et al. Damage detection of road domain waveform guardrail structure based on machine learning multi-module fusion

Legal Events

PB01 Publication
SE01 Entry into force of request for substantive examination