CN111126271B - Bayonet snap image vehicle detection method, computer storage medium and electronic equipment - Google Patents


Info

Publication number
CN111126271B
CN111126271B (application CN201911347011.6A)
Authority
CN
China
Prior art keywords
detection
vehicle body
vehicle
bayonet
target
Prior art date
Legal status
Active
Application number
CN201911347011.6A
Other languages
Chinese (zh)
Other versions
CN111126271A (en)
Inventor
王祥雪
贺迪龙
林焕凯
汪刚
刘双广
Current Assignee
Gosuncn Technology Group Co Ltd
Original Assignee
Gosuncn Technology Group Co Ltd
Priority date
Filing date
Publication date
Application filed by Gosuncn Technology Group Co Ltd filed Critical Gosuncn Technology Group Co Ltd
Priority to CN201911347011.6A
Publication of CN111126271A
Application granted
Publication of CN111126271B

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00: Scenes; Scene-specific elements
    • G06V20/10: Terrestrial scenes
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00: Pattern recognition
    • G06F18/20: Analysing
    • G06F18/24: Classification techniques
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/04: Architecture, e.g. interconnection topology
    • G06N3/045: Combinations of networks
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/08: Learning methods
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00: Arrangements for image or video recognition or understanding
    • G06V10/20: Image preprocessing
    • G06V10/26: Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
    • G06V10/267: Segmentation of patterns in the image field by performing operations on regions, e.g. growing, shrinking or watersheds
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V2201/00: Indexing scheme relating to image or video recognition or understanding
    • G06V2201/08: Detecting or categorising vehicles
    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02T: CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00: Road transport of goods or passengers
    • Y02T10/10: Internal combustion engine [ICE] based vehicles
    • Y02T10/40: Engine management systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Molecular Biology (AREA)
  • Computational Linguistics (AREA)
  • Multimedia (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Mathematical Physics (AREA)
  • General Health & Medical Sciences (AREA)
  • Computing Systems (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Image Analysis (AREA)

Abstract

The application provides a vehicle detection method for bayonet snapshot images, a computer storage medium and an electronic device, wherein the method comprises the following steps: S1, acquiring a bayonet snapshot image; S2, recognizing the bayonet snapshot image, and performing vehicle body detection on the recognized vehicle by using a vehicle body detection model so as to detect a vehicle body area; S3, establishing filtering rules, and screening out the image information of incomplete vehicle bodies to obtain complete vehicle body areas; S4, classifying the complete vehicle body areas by using a front/rear-shot vehicle body classification model; S5, outputting complete vehicle information. According to the bayonet snapshot image vehicle detection method of the embodiment of the application, the existing vehicle detection scheme is optimized: the end-to-end approach of the SSD, in which detection and classification are performed simultaneously, is replaced by a two-stage approach in which detection is performed first and classification afterwards. This greatly improves the detection rate and accuracy for vehicles and makes the model lightweight.

Description

Bayonet snap image vehicle detection method, computer storage medium and electronic equipment
Technical Field
The application relates to the field of vehicle detection, and in particular to a vehicle detection method for bayonet snapshot images, a computer storage medium and an electronic device.
Background
Vehicle detection in bayonet snapshot images is a key link of a vehicle video surveillance system: quickly and accurately detecting vehicle targets in an image is essential for subsequent vehicle attribute analysis and recognition, and strongly affects the processing performance of the whole system. The detection rate is the ratio of the number of targets detected in an image to the total number of targets to be detected; the closer this index is to 1, the more targets are detected and the fewer are missed, and the more reliable the output of the vehicle video surveillance system becomes. A vehicle detection scheme with a high detection rate and high accuracy is therefore important for improving the functionality and performance of a road vehicle video surveillance system. In recent years, with the growth of computing power and the spread of deep learning theory and methods, a large number of end-to-end target detection models based on deep neural networks have emerged, such as YOLO, SSD, MobileNet-SSD and TinyDSOD. The accuracy of the YOLO algorithm ranks first among these algorithms, but its computation is too slow for engineering applications. SSD and its improved variants are widely used in industrial target detection tasks because of their relatively high detection precision and fast detection speed, but the parameter scale of the SSD model is inevitably large: the model exceeds 100 megabytes and the detection time on a GTX1080 is about 25 milliseconds, which makes it difficult to meet the demands of practical applications. MobileNet is a representative lightweight deep neural network proposed in the last two years; by adopting depthwise separable convolution it greatly reduces the model parameter scale and increases the running speed, meeting a considerable part of lightweight application requirements. TinyDSOD is a deep neural network that combines DenseBlock, depthwise separable convolution and the feature pyramid idea; because of its high speed it is gradually being applied to lightweight front-end detection devices, but the network focuses too much on speed and the model is too small, so its feature extraction capability is limited and its detection accuracy still needs to be improved.
Supported by massive data, a deep neural network model with strong feature extraction capability can realize end-to-end target detection, i.e. the class of a target is obtained at the same time as the target is detected. Taking vehicle detection in bayonet snapshot images as an example, the existing scheme usually includes the following four steps. First step: define the classes of the objects to be detected, for example a front-shot vehicle body and a rear-shot vehicle body (subsequent vehicle attribute analysis is mostly performed on front-shot or rear-shot vehicle bodies). Second step: collect snapshot images from different bayonet cameras in different scenes, and label the targets in the snapshot images according to class. Third step: train a deep neural network model with the labeled data. Fourth step: deploy the trained model into the video analysis system to detect vehicles in bayonet snapshot images.
In the above technical solution, the selection of the deep neural network model is critical. Recently, the TinyDSOD model has begun to be used in engineering applications owing to its good performance. TinyDSOD is built on the ideas of a backbone network plus a feature pyramid: the DenseBlock of DenseNet is used as the basic component of the backbone network, and the convolution operations inside the DenseBlock are replaced with depthwise separable convolutions, which preserves the feature extraction capability of the network while increasing the detection speed. In addition, the idea of feature fusion is introduced into TinyDSOD: higher-level features are fused upwards with the adjacent lower-level features, which improves the detection of small targets. The backbone network structure of TinyDSOD is shown in Table 1.
Table 1: TinyDSOD backbone network structure
The structure of the DDB-b module is shown in FIG. 1.
The prior art generally has the following disadvantages:
(1) The division of the classes to be detected is unreasonable: the intra-class differences are large while the inter-class differences are small, which weakens the learning ability of the deep neural network. As shown in fig. 2 and 3, the front-shot and rear-shot bodies of a bus differ very little, yet they belong to two different classes, while the rear-shot body of a large truck and the rear-shot body of a car differ greatly, yet they belong to the same class. Such a class division means that some targets of different classes differ too little, or targets of the same class differ too much, which is an important cause of false detections in current vehicle detection.
(2) The characteristics of the objects to be detected are complex, and the increase in the number of classes makes it difficult to keep the model lightweight. Apart from special vehicles, there are up to a dozen types of vehicles travelling on roads, such as cars, SUVs, minibuses, pick-ups, buses, vans, and so on. If the classes of the objects to be detected are divided very finely, for example by vehicle type, a neural network model with stronger feature extraction capability is required; this requirement contradicts the lightweight design needed in engineering, and practical experience shows that it is difficult to balance the detection capability and the timeliness of the model.
Disclosure of Invention
In view of the above, the application provides a vehicle detection method for bayonet snapshot images, a computer storage medium and an electronic device, which improve the detection precision and the detection efficiency.
In order to solve the above technical problems, in one aspect, the application provides a vehicle detection method for bayonet snapshot images, which comprises the following steps: S1, acquiring a bayonet snapshot image; S2, recognizing the bayonet snapshot image, and performing vehicle body detection on the recognized vehicle by using a vehicle body detection model so as to detect a vehicle body area; S3, establishing filtering rules, and screening out the image information of incomplete vehicle bodies to obtain complete vehicle body areas; S4, classifying the complete vehicle body areas by using a front/rear-shot vehicle body classification model; S5, outputting complete vehicle information.
According to the bayonet snapshot image vehicle detection method of the embodiment of the application, the existing vehicle detection scheme is optimized: the end-to-end approach of the SSD, in which detection and classification are performed simultaneously, is replaced by a two-stage approach in which detection is performed first and classification afterwards. This greatly improves the detection rate and accuracy for vehicles and makes the model lightweight.
According to some embodiments of the application, in step S2, the detection network of the vehicle body detection model includes an extraction module for feature extraction, where the extraction module includes 4 DenseBlock modules, each DenseBlock is composed of DDB-b-plus modules, and the DenseBlocks are connected by TransitionLayers, where a TransitionLayer consists of a convolution operation and a pooling layer.
According to some embodiments of the present application, in step S2, feature extraction is performed using the loss function of the SSD, where the loss function L is the sum of the classification confidence loss and the position loss, as shown in formula (1):
wherein: N is the number of anchor boxes matched with actual objects; L_conf(z, c) is the classification confidence loss; L_loc(z, l, g) is the position loss of the anchor boxes; z is the matching result between the anchor boxes and the reference object frames of different classes; c is the confidence of the predicted object frame; l is the position information of the predicted object frame; g is the position information of the labeling frame of the actual object; and α is a parameter that balances the confidence loss against the position loss.
According to some embodiments of the application, in the detection network, the number of anchor boxes corresponding to each of the 6 feature maps output by the feature pyramid is 6, and the aspect ratios of the anchor boxes include 1, 2, 3, 1/2 and 1/3.
According to some embodiments of the application, in step S3, the filtering rules include: if the height of a target frame is less than 10% of the image height, the target is deleted; if the aspect ratio of a target frame is greater than 1.5, the target is deleted; if two target frames overlap and the area of the overlapping region is greater than 20% of the total area of the target, the target is deleted.
According to some embodiments of the application, the filtering rules further include: assuming the image has width w and height h, with corners at the upper left (0, 0), lower left (0, h), upper right (w, 0) and lower right (w, h), and the target frame has corners at the upper left (x1, y1), lower left (x1, y2), upper right (x2, y1) and lower right (x2, y2): if x1 < 5, the left-edge target is deleted; if y2 < 0.3h, the target in the upper third of the image is deleted; if w - x2 < 5, the right-edge target is deleted; if (h - y2 < 2) && ((y2 - y1) < (x2 - x1)), the lower-edge target is deleted.
According to some embodiments of the application, the detection network of the front/rear-shot vehicle body classification model is a SqueezeNet network.
In a second aspect, embodiments of the present application provide a computer storage medium comprising one or more computer instructions which, when executed, implement a method as described in the above embodiments.
An electronic device according to an embodiment of the third aspect of the present application includes a memory for storing one or more computer instructions and a processor; the processor is configured to invoke and execute the one or more computer instructions to implement the method as described in any of the embodiments above.
Drawings
FIG. 1 is a block diagram of DDB-b in a TinyDSOD network of the prior art;
FIG. 2 is a schematic diagram of a prior-art comparison of a front-shot vehicle body and a rear-shot vehicle body of a bus, wherein the left side of FIG. 2 is the rear-shot body of the bus and the right side of FIG. 2 is the front-shot body of the bus;
FIG. 3 is a schematic diagram of a prior-art comparison of the rear-shot body of a large truck and the rear-shot body of a car, wherein the left side of FIG. 3 is the rear-shot body of the large truck and the right side of FIG. 3 is the rear-shot body of the car;
FIG. 4 is a flow chart of a method for detecting a vehicle in a bayonet snap image in accordance with an embodiment of the present application;
FIG. 5 is a block diagram of a DDB-b-plus of an improved TinyDSOD network in a bayonet snap image vehicle detection method according to an embodiment of the present application;
fig. 6 is a schematic diagram of the effect of applying the filtering rules in the bayonet snapshot image vehicle detection method according to an embodiment of the present application;
fig. 7 is a schematic diagram of an electronic device according to an embodiment of the application.
Reference numerals:
an electronic device 300;
a memory 310; an operating system 311; an application 312;
a processor 320; a network interface 330; an input device 340; a hard disk 350; and a display device 360.
Detailed Description
The following describes in further detail the embodiments of the present application with reference to the drawings and examples. The following examples are illustrative of the application and are not intended to limit the scope of the application.
The following first describes the bayonet snapshot image vehicle detection method according to an embodiment of the present application in detail with reference to the accompanying drawings.
As shown in fig. 4, the bayonet snapshot image vehicle detection method according to an embodiment of the present application includes the following steps:
S1, acquiring a bayonet snapshot image.
S2, recognizing the bayonet snapshot image, and performing vehicle body detection on the recognized vehicle by using a vehicle body detection model so as to detect a vehicle body area.
S3, establishing filtering rules, and screening out the image information of incomplete vehicle bodies to obtain complete vehicle body areas.
S4, classifying the complete vehicle body areas by using a front/rear-shot vehicle body classification model.
S5, outputting complete vehicle information.
That is, according to the bayonet snapshot image vehicle detection method of the embodiment of the application, the existing vehicle detection scheme is optimized with the aim of improving the vehicle detection rate in bayonet snapshot images: the existing end-to-end vehicle detection task is split into two tasks, i.e. detecting and classifying at the same time is replaced by detecting first and classifying afterwards. As shown in fig. 4, first, the objects to be detected are merged into a single class, that is, front-shot and rear-shot vehicle bodies are no longer distinguished in the training set, and a deep neural network model is selected and trained; then filtering rules are designed to filter out detected incomplete vehicle bodies; finally, the detected vehicle bodies are classified into front shots and rear shots by a front/rear-shot classification model.
According to this method, all the classes of the objects to be detected are merged into one class, so the differences between classes disappear and the model only needs to detect vehicle body targets in the image, which greatly improves the detection rate. The reduced complexity of the detection task also makes it possible to design a lighter vehicle body detection model, so the detection-and-classification scheme is optimized while the high detection rate, high accuracy and high efficiency of the detection device are preserved.
Therefore, according to the bayonet snapshot image vehicle detection method of the embodiment of the application, the existing vehicle detection scheme is optimized: the end-to-end approach of the SSD, in which detection and classification are performed simultaneously, is replaced by a two-stage approach in which detection is performed first and classification afterwards. This greatly improves the detection rate and accuracy for vehicles and makes the model lightweight. A sketch of the overall flow is given below.
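The detect-first, classify-later flow of steps S1 to S5 can be summarized by the following minimal Python sketch. The detector, filtering function and classifier are passed in as callables; their names and signatures are illustrative assumptions rather than interfaces defined by the application.

```python
from typing import Callable, List, Tuple

Box = Tuple[int, int, int, int]  # (x1, y1, x2, y2) in pixel coordinates

def detect_pipeline(
    image,                                              # S1: bayonet snapshot image
    detector: Callable[[object], List[Box]],            # S2: vehicle body detection model
    is_complete: Callable[[Box], bool],                  # S3: filtering rules
    classify_front_rear: Callable[[object, Box], str],   # S4: front/rear-shot classifier
) -> List[Tuple[Box, str]]:
    """Detect vehicle bodies, drop incomplete ones, then classify front/rear shots."""
    body_boxes = detector(image)                              # S2: single-class detection
    complete = [b for b in body_boxes if is_complete(b)]      # S3: filter incomplete bodies
    return [(b, classify_front_rear(image, b)) for b in complete]  # S4/S5: label and output
```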
According to one embodiment of the application, the detection network of the vehicle body detection model comprises an extraction module for feature extraction, where the extraction module comprises 4 DenseBlock modules, each DenseBlock is composed of DDB-b-plus modules, and the DenseBlocks are connected by TransitionLayers, where a TransitionLayer consists of a convolution operation and a pooling layer. Specifically, as shown in Table 2, the main improvement is the introduction of dilated convolution into the DDB-b modules of the TinyDSOD backbone network; the improved DDB-b-plus structure is shown in FIG. 5.
That is, with the improved detect-then-classify scheme, merging the classes greatly reduces the complexity of the vehicle body detection task, so a lighter model can be selected. The overall backbone network of the model is kept, as shown in Table 2; only the DDB-b modules in the backbone network are replaced by the DDB-b-plus structure shown in FIG. 5.
Table 2: Backbone network structure of the improved TinyDSOD of the application
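As a rough PyTorch sketch of the dilated depthwise separable convolution introduced into DDB-b-plus (the exact layer arrangement, channel counts and growth rates follow FIG. 5 and Table 2 and are not reflected here), one building block might look as follows; it is an assumption-labelled illustration of the idea, not the patented module.

```python
import torch.nn as nn

class DilatedDepthwiseSeparableConv(nn.Module):
    """Illustrative only: a depthwise separable 3x3 convolution with a dilation
    coefficient, approximating the change described for DDB-b-plus."""
    def __init__(self, in_ch: int, out_ch: int, dilation: int = 2):
        super().__init__()
        # depthwise 3x3 convolution with dilation enlarges the effective receptive field
        self.depthwise = nn.Conv2d(in_ch, in_ch, kernel_size=3,
                                   padding=dilation, dilation=dilation,
                                   groups=in_ch, bias=False)
        # pointwise 1x1 convolution mixes channels
        self.pointwise = nn.Conv2d(in_ch, out_ch, kernel_size=1, bias=False)
        self.bn = nn.BatchNorm2d(out_ch)
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        return self.relu(self.bn(self.pointwise(self.depthwise(x))))
```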
In some embodiments of the present application, in step S2, feature extraction is performed using the loss function of the SSD. Specifically, the loss function L is the sum of the classification confidence loss and the position loss, as shown in formula (1):
wherein: N is the number of anchor boxes matched with actual objects; L_conf(z, c) is the classification confidence loss; L_loc(z, l, g) is the position loss of the anchor boxes; z is the matching result between the anchor boxes and the reference object frames of different classes; c is the confidence of the predicted object frame; l is the position information of the predicted object frame; g is the position information of the labeling frame of the actual object; and α is a parameter that balances the confidence loss against the position loss, typically set to 1, i.e. the two losses are weighted equally.
That is, the bayonet snapshot image vehicle detection method according to the embodiment of the application adopts the same loss function as the TinyDSOD network, namely the loss function of the SSD, which performs position regression and target classification simultaneously.
According to some embodiments of the application, the improved TinyDSOD network includes a data enhancement module in which the area ratios of the randomly cropped image to the labeling frame of the actual object are 0.8, 0.8, 0.9, 0.9, 1.0 and 1.0, respectively. Optionally, in the improved TinyDSOD network, the number of anchor boxes corresponding to each of the 6 feature maps output by the feature pyramid is 6, where the feature pyramid is a pyramid structure composed of feature maps of different sizes. Further, in the improved TinyDSOD network, the aspect ratios of the anchor boxes include 1, 2, 3, 1/2 and 1/3.
TinyDSOD is currently one of the few deep neural networks that can be trained from random initialization without a pre-trained model. A large number of experiments on public data sets show that training started from random initialization with the parameters of the original work reaches good results, so the bayonet snapshot image vehicle detection method of the embodiment of the application adopts the same strategy.
In order to reduce the detection of vehicles that are occluded or incomplete at the image edge in bayonet snapshot images, and to greatly improve the efficiency of the detection device and of the subsequent vehicle recognition system, the bayonet snapshot image detection method improves the data enhancement module of the original TinyDSOD network. The original data enhancement module expands the training data set by randomly cropping and flipping the original data, where the area ratios of the randomly cropped image to the labeling frame of the GroundTruth (actual object) are 0.1, 0.3, 0.5, 0.7, 0.9 and 1.0. Taking 0.1 as an example, this means that as long as the selected image region contains 10% of a GroundTruth, the region is cropped and used to expand the training set; however, such cropping produces a large number of images of incomplete targets in the training set. In order to make the cropped images contain more complete GroundTruths, these parameters are changed to 0.8, 0.8, 0.9, 0.9, 1.0 and 1.0 in turn, that is, a region image is cropped into the training set only if the selected region covers more than 80% of the area of a GroundTruth.
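As a minimal sketch of this cropping criterion (box coordinates are assumed to be axis-aligned (x1, y1, x2, y2) pixel tuples, and the function names are illustrative), a candidate crop is kept for the training set only if it covers at least the configured fraction of some GroundTruth box:

```python
def covered_fraction(crop, gt):
    """Fraction of the ground-truth box area that falls inside the crop.
    Boxes are (x1, y1, x2, y2) in pixel coordinates."""
    ix1, iy1 = max(crop[0], gt[0]), max(crop[1], gt[1])
    ix2, iy2 = min(crop[2], gt[2]), min(crop[3], gt[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    gt_area = (gt[2] - gt[0]) * (gt[3] - gt[1])
    return inter / gt_area if gt_area > 0 else 0.0

def accept_crop(crop, gt_boxes, min_cover=0.8):
    """Keep the crop only if it covers >= min_cover of at least one GroundTruth."""
    return any(covered_fraction(crop, gt) >= min_cover for gt in gt_boxes)
```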
Another improvement of the training strategy is a targeted modification of the number and aspect ratios of the DefaultBoxes. Bayonet snapshot images contain a variety of vehicles, which present different aspect ratios in the image: for example, the aspect ratio of a container truck is close to 1:3, while the aspect ratio of a car is distributed around 1:1.5, with the other vehicle types distributed between these two values. The DefaultBox number and ratios of the original TinyDSOD network are those set in the SSD, so in order to achieve a good effect on the bayonet snapshot image data set they need to be modified for the detection target. The number of DefaultBoxes corresponding to each of the 6 feature maps output by the feature pyramid is set to 6, and the aspect ratios include 1, 2, 3, 1/2 and 1/3, so that each GroundTruth has as many DefaultBoxes as possible to match, giving a better matching effect; a sketch of this setting is given below.
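The following SSD-style sketch generates six DefaultBox shapes per feature-map cell from the aspect ratios {1, 2, 3, 1/2, 1/3}; the sixth box is taken as the usual extra square box at an intermediate scale, which is an assumption borrowed from the SSD convention rather than a detail stated in the text, and the per-layer scales are placeholders.

```python
import math

def default_box_shapes(scale, next_scale,
                       aspect_ratios=(1.0, 2.0, 3.0, 0.5, 1.0 / 3.0)):
    """SSD-style default box (w, h) pairs for one feature-map cell, expressed as
    fractions of the input image size. The sixth box is the extra square box at
    sqrt(scale * next_scale), giving 6 boxes per cell in total."""
    shapes = [(scale * math.sqrt(ar), scale / math.sqrt(ar)) for ar in aspect_ratios]
    shapes.append((math.sqrt(scale * next_scale),) * 2)  # extra square box
    return shapes
```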
Therefore, the bayonet snapshot image vehicle detection method according to the embodiment of the application takes vehicle detection in bayonet snapshot images as its target and optimizes the network structure and training strategy of TinyDSOD in a targeted way, which improves the feature extraction capability of the algorithm and makes it suitable for a lightweight front-end vehicle detection device.
In some embodiments of the present application, in step S3, the filtering rules include:
if the height of the target frame is less than 10% of the height of the image, the target is deleted;
if the aspect ratio of the target frame is greater than 1.5, the target is deleted;
if two target frames overlap and the area of the overlapping region is greater than 20% of the total area of the target, the target is deleted.
Further, the filtering rules also include:
assuming the image has width w and height h, with corners at the upper left (0, 0), lower left (0, h), upper right (w, 0) and lower right (w, h), and the target frame has corners at the upper left (x1, y1), lower left (x1, y2), upper right (x2, y1) and lower right (x2, y2):
if x1 < 5, the left-edge target is deleted;
if y2 < 0.3h, the target in the upper third of the image is deleted;
if w - x2 < 5, the right-edge target is deleted;
if (h - y2 < 2) && ((y2 - y1) < (x2 - x1)), the lower-edge target is deleted.
In other words, the filtering rules are designed to filter out detected targets that are not needed, thereby reducing the computational overhead of the subsequent vehicle attribute analysis and recognition modules. An occluded vehicle is defined here as one whose front (vehicle face) region or more is blocked; a non-occluded vehicle is defined as one whose body region is completely exposed or only slightly occluded, with the license plate region clearly visible.
The pseudo code of the filtering rules in the application is as follows:
1) If the height of the target frame is less than 10% of the image height, delete the target (a small, distant vehicle body);
2) If the aspect ratio of the target frame is greater than 1.5, delete the target (a side-shot vehicle body);
3) If two target frames overlap and the area of the overlapping region is greater than 20% of the total area of the target, delete the target;
4) Assuming the image has width w and height h, with corners at the upper left (0, 0), lower left (0, h), upper right (w, 0) and lower right (w, h), and the target frame has corners at the upper left (x1, y1), lower left (x1, y2), upper right (x2, y1) and lower right (x2, y2):
a) If x1 < 5, delete the left-edge target;
b) If y2 < 0.3h, delete the target in the upper third of the image;
c) If w - x2 < 5, delete the right-edge target;
d) If (h - y2 < 2) && ((y2 - y1) < (x2 - x1)), delete the lower-edge target.
The result of applying the filtering rules is shown in fig. 6: the left side of fig. 6 is the detection result without filtering, and the right side of fig. 6 is the detection result with the filtering rules applied. Occluded targets, small targets, edge targets and other targets that do not meet the requirements are removed, which greatly reduces the computational cost of subsequent analysis. A minimal implementation sketch of these rules is given below.
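The following Python sketch implements filtering rules 1) to 4) under the stated assumptions (target frames as (x1, y1, x2, y2) in pixel coordinates, image width w and height h). The aspect ratio in rule 2) is read as width divided by height, and the overlap in rule 3) is compared against the current target's own area; both are interpretations rather than details spelled out in the text.

```python
def overlap_area(a, b):
    """Intersection area of two boxes given as (x1, y1, x2, y2)."""
    iw = min(a[2], b[2]) - max(a[0], b[0])
    ih = min(a[3], b[3]) - max(a[1], b[1])
    return max(0, iw) * max(0, ih)

def filter_targets(boxes, w, h):
    """Apply filtering rules 1)-4) to a list of detected target frames."""
    kept = []
    for i, (x1, y1, x2, y2) in enumerate(boxes):
        bw, bh = x2 - x1, y2 - y1
        if bw <= 0 or bh <= 0:
            continue
        if bh < 0.1 * h:                               # rule 1: small, distant body
            continue
        if bw / bh > 1.5:                              # rule 2: side-shot body
            continue
        if any(j != i and                              # rule 3: overlap > 20% of this target
               overlap_area((x1, y1, x2, y2), other) > 0.2 * bw * bh
               for j, other in enumerate(boxes)):
            continue
        if x1 < 5:                                     # rule 4a: left edge
            continue
        if y2 < 0.3 * h:                               # rule 4b: upper third
            continue
        if w - x2 < 5:                                 # rule 4c: right edge
            continue
        if (h - y2 < 2) and (y2 - y1) < (x2 - x1):     # rule 4d: lower edge
            continue
        kept.append((x1, y1, x2, y2))
    return kept
```

For example, filter_targets(detections, w=1920, h=1080) would return only the target frames that survive all four rules.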
According to one embodiment of the application, the detection network of the front/rear-shot vehicle body classification model is a SqueezeNet network.
Specifically, in the bayonet snapshot image vehicle detection method according to the embodiment of the application, the second model is the front/rear-shot vehicle body classification model, whose function is to classify the vehicle targets output by the filtering rules into front shots and rear shots. The training set of this model is particularly important: because the front and rear shots of different vehicle types differ greatly, the training set must cover front-shot and rear-shot data of all vehicle types and keep the data balanced, so that the binary classification model can accurately classify front and rear shots for all vehicle types.
The front/rear-shot classification model in the application uses a SqueezeNet network. The training strategy is direct training from random initialization with an initial learning rate of 0.01, a multistep learning rate schedule (steps at 20000, 40000, 60000 and 80000 iterations), max_iter of 100000, momentum of 0.9, weight decay of 0.0005, and SGD as the optimization method; the final verification accuracy reaches 0.99. These settings are summarized below.
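For reference, the listed values correspond to a Caffe-style solver configuration; the dictionary below is only an illustrative restatement of them, and the field names follow common Caffe solver conventions rather than text from the application.

```python
# Illustrative restatement of the SqueezeNet training settings listed above.
squeezenet_solver = {
    "base_lr": 0.01,                           # initial learning rate
    "lr_policy": "multistep",                  # learning rate change strategy
    "stepvalue": [20000, 40000, 60000, 80000],
    "max_iter": 100000,
    "momentum": 0.9,
    "weight_decay": 0.0005,
    "type": "SGD",                             # optimization method
}
```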
In addition, for the task of detecting vehicles in bayonet snapshot images, the evaluation indexes used for reference in the bayonet snapshot image vehicle detection method according to the embodiment of the application include the accuracy rate and the recall rate: the accuracy rate is the proportion of correct detections among all detected targets, and the recall rate is the proportion of correct detections in the total detection count, where the total detection count includes the number of correct detections, the number of missed detections and the number of false detections, as shown in formula (2) and formula (3).
Accuracy rate = correct detections / (correct detections + false detections) (2)
Recall rate = correct detections / (correct detections + missed detections + false detections) (3)
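A direct Python transcription of formulas (2) and (3) follows; note that the recall denominator as defined here also includes false detections, which is the definition given in the text rather than the conventional recall.

```python
def accuracy_rate(correct, false_det):
    """Formula (2): correct detections / all detections."""
    total = correct + false_det
    return correct / total if total else 0.0

def recall_rate(correct, missed, false_det):
    """Formula (3): correct detections / (correct + missed + false detections),
    following the definition given in the text."""
    total = correct + missed + false_det
    return correct / total if total else 0.0
```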
According to the bayonet snapshot image vehicle detection method of the embodiment of the application, the detection and classification tasks are decoupled, a lightweight detection model is optimized on the basis of TinyDSOD, filtering rules are introduced, and the existing SqueezeNet is used for the front/rear-shot binary classification. Compared with the original TinyDSOD on a self-built vehicle data set, with a GTX1080 as the test environment, the detection rate and the accuracy rate are greatly improved; the detailed results are shown in Table 3. The model size and the detection time of the vehicle detection method for bayonet snapshot images increase only slightly, which meets the lightweight application requirements of engineering, so the method can be popularized and applied in target detection tasks based on snapshot images.
Table 3: Comparison of detection results between the application and the existing TinyDSOD network
In summary, according to the bayonet snapshot image vehicle detection method of the embodiment of the application, the existing detection task is decomposed according to the requirements of the actual use scene (high detection rate, high accuracy and lightweight): in the detection stage the class labels are merged and only the vehicle body is detected, while in the classification stage the training set is organized by vehicle type, which improves the classification accuracy. At the same time, because the complexity of the detection task is reduced, a lighter model can be adopted: a dilation coefficient is introduced into the depthwise separable convolutions of the TinyDSOD network, which enhances the feature extraction capability and the effective receptive field of the convolution computation, improves the detection accuracy of the algorithm, and achieves a better balance between the capability and the performance of the model.
In addition, the application also provides a computer storage medium comprising one or more computer instructions, wherein the one or more computer instructions, when executed, implement the above bayonet snapshot image vehicle detection method.
That is, the computer storage medium stores a computer program which, when executed by a processor, causes the processor to perform any of the above-described bayonet snap image vehicle detection methods.
As shown in fig. 7, an embodiment of the present application provides an electronic device 300, including a memory 310 and a processor 320, where the memory 310 is configured to store one or more computer instructions, and the processor 320 is configured to invoke and execute the one or more computer instructions, thereby implementing any of the methods described above.
That is, the electronic device 300 includes: a processor 320 and a memory 310, in which memory 310 computer program instructions are stored which, when executed by the processor, cause the processor 320 to perform any of the methods described above.
Further, as shown in fig. 7, the electronic device 300 further includes a network interface 330, an input device 340, a hard disk 350, and a display device 360.
The interfaces and devices described above may be interconnected by a bus architecture. The bus architecture may include any number of interconnected buses and bridges, which connect together one or more Central Processing Units (CPUs), represented by the processor 320, and various circuits of one or more memories, represented by the memory 310. The bus architecture may also connect together various other circuits, such as peripheral devices, voltage regulators and power management circuits. It is understood that the bus architecture is used to enable communication between these components. In addition to a data bus, the bus architecture includes a power bus, a control bus and a status signal bus, all of which are well known in the art and therefore are not described in detail herein.
The network interface 330 may be connected to a network (e.g., the internet, a local area network, etc.), and may obtain relevant data from the network and store the relevant data in the hard disk 350.
The input device 340 may receive various instructions from an operator and transmit the instructions to the processor 320 for execution. The input device 340 may include a keyboard or pointing device (e.g., a mouse, a trackball, a touch pad, or a touch screen, among others).
The display device 360 may display results obtained by the processor 320 executing instructions.
The memory 310 is used for storing programs and data necessary for the operation of the operating system, and data such as intermediate results in the calculation process of the processor 320.
It will be appreciated that memory 310 in embodiments of the application may be volatile memory or nonvolatile memory, or may include both volatile and nonvolatile memory. The nonvolatile memory may be Read Only Memory (ROM), programmable Read Only Memory (PROM), erasable Programmable Read Only Memory (EPROM), electrically Erasable Programmable Read Only Memory (EEPROM), or flash memory, among others. Volatile memory can be Random Access Memory (RAM), which acts as external cache memory. The memory 310 of the apparatus and methods described herein is intended to comprise, without being limited to, these and any other suitable types of memory.
In some implementations, the memory 310 stores the following elements, executable modules or data structures, or a subset thereof, or an extended set thereof: an operating system 311 and applications 312.
The operating system 311 includes various system programs, such as a framework layer, a core library layer, a driver layer, and the like, for implementing various basic services and processing hardware-based tasks. The application programs 312 include various application programs such as a Browser (Browser) and the like for implementing various application services. A program implementing the method of the embodiment of the present application may be included in the application program 312.
The method disclosed in the above embodiment of the present application may be applied to the processor 320 or implemented by the processor 320. Processor 320 may be an integrated circuit chip with signal processing capabilities. In implementation, the steps of the above method may be performed by integrated logic circuitry in hardware or instructions in software in processor 320. The processor 320 may be a general purpose processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), an off-the-shelf programmable gate array (FPGA) or other programmable logic device, discrete gate or transistor logic, or discrete hardware components, which may implement or perform the methods, steps, and logic blocks disclosed in embodiments of the present application. A general purpose processor may be a microprocessor or the processor may be any conventional processor or the like. The steps of the method disclosed in connection with the embodiments of the present application may be embodied directly in the execution of a hardware decoding processor, or in the execution of a combination of hardware and software modules in a decoding processor. The software modules may be located in a random access memory, flash memory, read only memory, programmable read only memory, or electrically erasable programmable memory, registers, etc. as well known in the art. The storage medium is located in the memory 310 and the processor 320 reads the information in the memory 310 and in combination with its hardware performs the steps of the method described above.
It is to be understood that the embodiments described herein may be implemented in hardware, software, firmware, middleware, microcode, or a combination thereof. For a hardware implementation, the processing units may be implemented within one or more Application Specific Integrated Circuits (ASICs), digital Signal Processors (DSPs), digital Signal Processing Devices (DSPDs), programmable Logic Devices (PLDs), field Programmable Gate Arrays (FPGAs), general purpose processors, controllers, micro-controllers, microprocessors, other electronic units designed to perform the functions described herein, or a combination thereof.
For a software implementation, the techniques described herein may be implemented with modules (e.g., procedures, functions, and so on) that perform the functions described herein. The software codes may be stored in a memory and executed by a processor. The memory may be implemented within the processor or external to the processor.
In particular, the processor 320 is further configured to read the computer program and execute any of the methods described above.
In the several embodiments provided in the present application, it should be understood that the disclosed methods and apparatus may be implemented in other ways. For example, the apparatus embodiments described above are merely illustrative, e.g., the division of the units is merely a logical function division, and there may be additional divisions when actually implemented, e.g., multiple units or components may be combined or integrated into another system, or some features may be omitted or not performed. Alternatively, the coupling or direct coupling or communication connection shown or discussed with each other may be an indirect coupling or communication connection via some interfaces, devices or units, which may be in electrical, mechanical or other form.
In addition, each functional unit in the embodiments of the present application may be integrated in one processing unit, or each unit may be physically included separately, or two or more units may be integrated in one unit. The integrated units may be implemented in hardware or in hardware plus software functional units.
The integrated units implemented in the form of software functional units described above may be stored in a computer readable storage medium. The software functional unit is stored in a storage medium, and includes several instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to perform part of the steps of the transceiving method according to the embodiments of the present application. And the aforementioned storage medium includes: a U-disk, a removable hard disk, a Read-Only Memory (ROM), a random access Memory (Random Access Memory, RAM), a magnetic disk, or an optical disk, or other various media capable of storing program codes.
While the foregoing is directed to the preferred embodiments of the present application, it will be appreciated by those skilled in the art that various modifications and adaptations can be made without departing from the principles of the present application, and such modifications and adaptations are intended to be comprehended within the scope of the present application.

Claims (8)

1. A vehicle detection method for bayonet snapshot images, characterized by comprising the following steps:
S1, acquiring a bayonet snapshot image;
S2, recognizing the bayonet snapshot image, and performing vehicle body detection on the recognized vehicle by using a vehicle body detection model so as to detect a vehicle body area;
S3, establishing filtering rules, and screening out the image information of incomplete vehicle bodies to obtain complete vehicle body areas;
S4, classifying the complete vehicle body areas by using a front/rear-shot vehicle body classification model;
S5, outputting complete vehicle information;
wherein in step S2, the detection network of the vehicle body detection model comprises an extraction module for performing feature extraction, the extraction module comprises 4 DenseBlock modules, each DenseBlock is composed of DDB-b-plus modules, and the DenseBlocks are connected by TransitionLayers, wherein a TransitionLayer consists of a convolution operation and a pooling layer.
2. The method according to claim 1, wherein in step S2, feature extraction is performed using the loss function of the SSD, and the loss function L is the sum of the classification confidence loss and the position loss, as shown in formula (1):
wherein: N is the number of anchor boxes matched with actual objects; L_conf(z, c) is the classification confidence loss; L_loc(z, l, g) is the position loss of the anchor boxes; z is the matching result between the anchor boxes and the reference object frames of different classes; c is the confidence of the predicted object frame; l is the position information of the predicted object frame; g is the position information of the labeling frame of the actual object; and α is a parameter that balances the confidence loss against the position loss.
3. The method of claim 1, wherein in the detection network, the number of anchor boxes corresponding to each of the 6 feature maps output by the feature pyramid is 6, and the aspect ratios of the anchor boxes include 1, 2, 3, 1/2 and 1/3.
4. The method according to claim 1, wherein in step S3, the filtering rules comprise:
if the height of the target frame is less than 10% of the image height, deleting the target;
if the aspect ratio of the target frame is greater than 1.5, deleting the target;
if two target frames overlap and the area of the overlapping region is greater than 20% of the total area of the target, the target is deleted.
5. The method of claim 4, wherein the filtering rules further comprise:
assuming the image has width w and height h, with corners at the upper left (0, 0), lower left (0, h), upper right (w, 0) and lower right (w, h), and the target frame has corners at the upper left (x1, y1), lower left (x1, y2), upper right (x2, y1) and lower right (x2, y2):
if x1 < 5, the left-edge target is deleted;
if y2 < 0.3h, the target in the upper third of the image is deleted;
if w - x2 < 5, the right-edge target is deleted;
if (h - y2 < 2) && ((y2 - y1) < (x2 - x1)), the lower-edge target is deleted.
6. The method of claim 1, wherein the detection network of the front/rear-shot vehicle body classification model is a SqueezeNet network.
7. A computer storage medium comprising one or more computer instructions which, when executed, implement the method of any of claims 1-6.
8. An electronic device comprising a memory and a processor, characterized in that,
the memory is used for storing one or more computer instructions;
the processor is configured to invoke and execute the one or more computer instructions to implement the method of any of claims 1-6.
CN201911347011.6A 2019-12-24 2019-12-24 Bayonet snap image vehicle detection method, computer storage medium and electronic equipment Active CN111126271B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911347011.6A CN111126271B (en) 2019-12-24 2019-12-24 Bayonet snap image vehicle detection method, computer storage medium and electronic equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911347011.6A CN111126271B (en) 2019-12-24 2019-12-24 Bayonet snap image vehicle detection method, computer storage medium and electronic equipment

Publications (2)

Publication Number Publication Date
CN111126271A CN111126271A (en) 2020-05-08
CN111126271B (en) 2023-08-29

Family

ID=70500379

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911347011.6A Active CN111126271B (en) 2019-12-24 2019-12-24 Bayonet snap image vehicle detection method, computer storage medium and electronic equipment

Country Status (1)

Country Link
CN (1) CN111126271B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117935186B (en) * 2024-03-25 2024-06-14 福建省高速公路科技创新研究院有限公司 Method for identifying dangerous goods vehicles in tunnel under strong light inhibition

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106874840A (en) * 2016-12-30 2017-06-20 东软集团股份有限公司 Vehicle information recognition method and device
CN107704866A (en) * 2017-06-15 2018-02-16 清华大学 Multitask Scene Semantics based on new neural network understand model and its application
CN108319907A (en) * 2018-01-26 2018-07-24 腾讯科技(深圳)有限公司 A kind of vehicle identification method, device and storage medium
CN109657590A (en) * 2018-12-11 2019-04-19 合刃科技(武汉)有限公司 A kind of method, apparatus and storage medium detecting information of vehicles
CN110232316A (en) * 2019-05-05 2019-09-13 杭州电子科技大学 A kind of vehicle detection and recognition method based on improved DSOD model
CN110490156A (en) * 2019-08-23 2019-11-22 哈尔滨理工大学 A kind of fast vehicle detection method based on convolutional neural networks

Also Published As

Publication number Publication date
CN111126271A (en) 2020-05-08

Similar Documents

Publication Publication Date Title
Zhang et al. CCTSDB 2021: a more comprehensive traffic sign detection benchmark
Hassaballah et al. Vehicle detection and tracking in adverse weather using a deep learning framework
CN106295541A (en) Vehicle type recognition method and system
CN113420607A (en) Multi-scale target detection and identification method for unmanned aerial vehicle
CN112070713A (en) Multi-scale target detection method introducing attention mechanism
CN111274926B (en) Image data screening method, device, computer equipment and storage medium
CN107808126A (en) Vehicle retrieval method and device
CN112738470B (en) Method for detecting parking in highway tunnel
CN111340026A (en) Training method of vehicle annual payment identification model and vehicle annual payment identification method
CN114419583A (en) Yolov4-tiny target detection algorithm with large-scale features
CN110909656B (en) Pedestrian detection method and system integrating radar and camera
Rajendran et al. Fast and accurate traffic sign recognition for self driving cars using retinanet based detector
Jin et al. A deep-learning-based scheme for detecting driver cell-phone use
Peng et al. Real-time illegal parking detection algorithm in urban environments
CN110991421B (en) Bayonet snap image vehicle detection method, computer storage medium and electronic equipment
CN111126271B (en) Bayonet snap image vehicle detection method, computer storage medium and electronic equipment
Wang et al. Real-time vehicle target detection in inclement weather conditions based on YOLOv4
CN114419584A (en) Improved traffic sign identification and positioning method by inhibiting YOLOv4 by using non-maximum value
CN111709377A (en) Feature extraction method, target re-identification method and device and electronic equipment
CN114724128B (en) License plate recognition method, device, equipment and medium
Wu et al. Research on asphalt pavement disease detection based on improved YOLOv5s
CN115984786A (en) Vehicle damage detection method and device, terminal and storage medium
Srikanth et al. Automatic vehicle number plate detection and recognition systems: Survey and implementation
CN115690752A (en) Driver behavior detection method and device
CN114359884A (en) License plate recognition method and device, electronic equipment and storage medium

Legal Events

Code Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant