CN112215103A - Vehicle and pedestrian multi-class detection method and device based on improved ACF - Google Patents


Info

Publication number
CN112215103A
Authority
CN
China
Prior art keywords
vehicle
pedestrian
detection
height
calibration frame
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202011034733.9A
Other languages
Chinese (zh)
Other versions
CN112215103B (en
Inventor
石英
黄紫旗
谢长君
张晖
华捷
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Wuhan University of Technology WUT
Original Assignee
Wuhan University of Technology WUT
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Wuhan University of Technology WUT filed Critical Wuhan University of Technology WUT
Priority to CN202011034733.9A priority Critical patent/CN112215103B/en
Publication of CN112215103A publication Critical patent/CN112215103A/en
Application granted granted Critical
Publication of CN112215103B publication Critical patent/CN112215103B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/50Context or environment of the image
    • G06V20/56Context or environment of the image exterior to a vehicle by using sensors mounted on the vehicle
    • G06V20/58Recognition of moving objects or obstacles, e.g. vehicles or pedestrians; Recognition of traffic objects, e.g. traffic signs, traffic lights or roads
    • G06V20/584Recognition of moving objects or obstacles, e.g. vehicles or pedestrians; Recognition of traffic objects, e.g. traffic signs, traffic lights or roads of vehicle lights or traffic lights
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/23Clustering techniques
    • G06F18/232Non-hierarchical techniques
    • G06F18/2321Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
    • G06F18/23213Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions with fixed number of clusters, e.g. K-means clustering
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V2201/00Indexing scheme relating to image or video recognition or understanding
    • G06V2201/08Detecting or categorising vehicles
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02TCLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00Road transport of goods or passengers
    • Y02T10/10Internal combustion engine [ICE] based vehicles
    • Y02T10/40Engine management systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Multimedia (AREA)
  • Artificial Intelligence (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • General Engineering & Computer Science (AREA)
  • Probability & Statistics with Applications (AREA)
  • Human Computer Interaction (AREA)
  • Traffic Control Systems (AREA)
  • Image Analysis (AREA)

Abstract

The invention relates to a vehicle and pedestrian multi-class detection method and device based on an improved ACF. The method comprises the following steps: obtaining vehicle training samples and pedestrian training samples and preprocessing them; using a vehicle and pedestrian detection framework to extract multi-view aggregated channel features from the preprocessed vehicle training samples and context pixel aggregated channel features from the preprocessed pedestrian training samples, and establishing a vehicle detector from the former and a pedestrian detector from the latter; sharing the aggregated channel features of the image to be detected between the vehicle detector and the pedestrian detector to obtain a vehicle detection result and a pedestrian detection result; and applying a road-constraint-based false detection rejection strategy to remove false detections from the vehicle detection result and the pedestrian detection result. The invention addresses the problems of existing methods: a single detection target, low detection precision, and frequent false detections.

Description

Vehicle and pedestrian multi-class detection method and device based on improved ACF
Technical Field
The invention relates to the technical field of visual analysis for unmanned driving, and in particular to a vehicle and pedestrian multi-class detection method, device, and storage medium based on an improved ACF.
Background
With advances in science and technology and rising living standards, the number of automobiles in use has increased greatly, and a variety of factors lead to frequent traffic accidents. Unmanned driving seeks to ameliorate this problem, and vehicle and pedestrian detection is central to it. The accuracy and real-time performance of the vehicle and pedestrian detection algorithm directly affect the safety of the unmanned vehicle.
The current mainstream vehicle and pedestrian detection algorithms fall into deep learning detection algorithms and statistical feature detection algorithms. In deep learning, training CNN features takes a long time and the amount of computation is large. Depending on the detection strategy, statistical feature detection can be subdivided into the DPM method and the decision tree method, which are rarely applied to unmanned systems because of their high complexity and low operation speed. In the decision tree method, the design of the feature descriptor is the key to the detection algorithm and is currently the most studied topic; descriptors are mainly built from gradient, texture, and color features and their fusion. Haar features mainly extract texture information of the target and are widely used in vehicle detection. HOG features capture information such as the contour and shape of the target, are representative of gradient features, and are generally used for pedestrian detection. Color features such as grayscale, RGB, and LUV can also be used to represent the target. However, each of these features can generally detect only a specific kind of target, and their expressive power is limited in complex road scenes. To address these problems, Integral Channel Features (ICF), which fuse gradient, color, and texture features, were proposed first, followed by the Aggregated Channel Feature (ACF), which improves detection performance. LDCF then added a filtering operation on top of ACF to strengthen its expressive power. This further improves detection precision but adds a large amount of computation, so real-time performance drops sharply and LDCF is difficult to apply to lightweight vehicle and pedestrian detection.
Therefore, when conventional vehicle and pedestrian detection methods are applied to road scenes, they suffer from a single detection target, low detection precision, and frequent false detections.
Disclosure of Invention
In view of the above, it is desirable to provide a vehicle and pedestrian multi-class detection method, device, and storage medium based on an improved ACF, so as to solve the problems of a single detection target, low detection precision, and frequent false detections in vehicle and pedestrian detection.
In a first aspect, the invention provides a vehicle and pedestrian multi-class detection method based on an improved ACF, comprising the following steps:
obtaining vehicle training samples and pedestrian training samples, and preprocessing the vehicle training samples and the pedestrian training samples;
extracting multi-view aggregated channel features of the preprocessed vehicle training samples by using a vehicle and pedestrian detection framework, and establishing a vehicle detector according to the multi-view aggregated channel features;
extracting context pixel aggregated channel features of the preprocessed pedestrian training samples by using the vehicle and pedestrian detection framework, and establishing a pedestrian detector according to the context pixel aggregated channel features;
acquiring a preprocessed image to be detected, and sharing the aggregated channel features of the image to be detected between the vehicle detector and the pedestrian detector to obtain a vehicle detection result and a pedestrian detection result;
and applying a road-constraint-based false detection rejection strategy to remove false detections from the vehicle detection result and the pedestrian detection result.
Preferably, in the method for detecting multiple categories of vehicles and pedestrians based on the improved ACF, the method for preprocessing the vehicle training samples and the pedestrian training samples specifically includes:
scaling the vehicle training sample and the pedestrian training sample in horizontal and vertical directions, and keeping the center position of each target in the vehicle training sample and the pedestrian training sample normalized.
Preferably, in the method for detecting multiple classes of vehicles and pedestrians based on the improved ACF, the step of extracting the multi-view aggregation channel feature of the vehicle training sample after the preprocessing by using the vehicle and pedestrian detection framework and establishing the vehicle detector according to the multi-view aggregation channel feature specifically includes:
calculating a similarity (affinity) matrix among the sample points in the preprocessed vehicle training samples by using a spectral clustering algorithm, obtaining eigenvectors of multiple dimensions through spectral decomposition of the matrix, clustering the eigenvectors with a K-means algorithm to extract the aggregated channel features of multiple viewing angles, and then training the vehicle detector of each viewing angle with the aggregated channel features of that viewing angle.
Preferably, in the method for detecting multiple classes of vehicles and pedestrians based on the improved ACF, the step of extracting the context pixel aggregation channel feature of the preprocessed pedestrian training sample by using the vehicle and pedestrian detection framework, and establishing the pedestrian detector according to the context pixel aggregation channel feature specifically includes:
extracting 10 feature channels from the preprocessed pedestrian training samples; processing the ten channels with 2 × 2 average pooling to obtain the aggregated channel feature $F_{2\times 2}$ with n = 2; applying 2 × 2 average pooling twice more to $F_{2\times 2}$ to obtain the region context pixel aggregated channel features $F_{4\times 4}$ and $F_{8\times 8}$; upsampling $F_{4\times 4}$ and $F_{8\times 8}$ to the resolution of $F_{2\times 2}$ and combining them into 30 deformation-resistant context pixel aggregated channel features of the same size; and establishing the pedestrian detector according to the context pixel aggregated channel features.
Preferably, in the method for detecting multiple categories of vehicles and pedestrians based on the improved ACF, the step of obtaining the preprocessed image to be detected and sharing the aggregate channel characteristics of the image to be detected to the vehicle detector and the pedestrian detector to obtain the vehicle detection result and the pedestrian detection result further includes:
and carrying out confidence score calibration on the vehicle detection result by adopting a parameterized Logistic regression calibration method.
Preferably, in the method for detecting multiple categories of vehicles and pedestrians based on the improved ACF, the step of performing false detection and rejection on the vehicle detection result and the pedestrian detection result by using a false detection and rejection strategy based on road constraints includes:
normalizing the height of the vehicle calibration frame of the vehicle training sample and the position coordinate of the lower edge of the vehicle calibration frame, and the height of the pedestrian calibration frame of the pedestrian training sample and the position coordinate of the lower edge of the pedestrian calibration frame;
training a first regression model between the height of the vehicle calibration frame and the position coordinates of the lower edge of the vehicle calibration frame and a second regression model between the height of the pedestrian calibration frame and the position coordinates of the lower edge of the pedestrian calibration frame by using a support vector machine;
calculating the height of a predicted vehicle calibration frame corresponding to the lower edge position of the vehicle detection result by adopting a first regression model, and calculating the height of a predicted pedestrian calibration frame corresponding to the lower edge position of a pedestrian in a pedestrian detection result by adopting a second regression model;
calculating a first error value between the height of the predicted vehicle calibration frame and the height of an actual vehicle calibration frame in the vehicle detection result, and a second error value between the height of the predicted pedestrian calibration frame and the height of the actual pedestrian calibration frame in the pedestrian detection result;
when the first error value is larger than a first threshold, judging the vehicle detection result to be a false detection, and otherwise accepting the vehicle detection result; and when the second error value is larger than a second threshold, judging the pedestrian detection result to be a false detection, and otherwise accepting the pedestrian detection result.
Preferably, in the method for detecting multiple categories of vehicles and pedestrians based on the improved ACF, the first regression model is:
$H_1 = f(Y_1)$,
wherein $H_1$ represents the height of the vehicle calibration frame and $Y_1$ represents the position coordinate of the lower edge of the vehicle calibration frame;
the first error value is calculated by:
Figure BDA0002704856750000051
wherein $E_1$ represents the first error value, $h_1$ represents the height of the actual vehicle calibration frame, $h'_1$ represents the height of the predicted vehicle calibration frame, and abs denotes the absolute value.
Preferably, in the vehicle and pedestrian multi-category detection method based on the improved ACF, the second regression model is:
$H_2 = f(Y_2)$,
wherein $H_2$ represents the height of the pedestrian calibration frame and $Y_2$ represents the position coordinate of the lower edge of the pedestrian calibration frame;
the second error value is calculated by:
Figure BDA0002704856750000052
wherein $E_2$ represents the second error value, $h_2$ represents the height of the actual pedestrian calibration frame, $h'_2$ represents the height of the predicted pedestrian calibration frame, and abs denotes the absolute value.
In a second aspect, the present invention further provides a vehicle and pedestrian multi-category detection apparatus based on the improved ACF, including: a processor and a memory;
the memory has stored thereon a computer readable program executable by the processor;
the processor, when executing the computer readable program, implements the steps in the improved ACF-based vehicle pedestrian multi-class detection method as described above.
In a third aspect, the present invention also provides a computer readable storage medium storing one or more programs, which are executable by one or more processors to implement the steps in the improved ACF-based vehicle pedestrian multi-class detection method as described above.
Advantageous Effects
In the vehicle and pedestrian multi-class detection method, device, and storage medium based on the improved ACF provided by the invention: to overcome the single detection class of the Adaboost classifier in the ACF detection algorithm, a multi-class detection framework is adopted so that vehicles and pedestrians are detected simultaneously; to overcome low vehicle and pedestrian detection precision, a multi-view vehicle detector and a context pixel pedestrian detector are adopted, which effectively capture the viewing-angle differences among vehicle samples and the deformation caused by pedestrian posture during walking, improving detection precision; and to overcome false detections in vehicle and pedestrian detection, road prior information is used to remove false detections effectively.
Drawings
FIG. 1 is a flow chart of a preferred embodiment of the vehicle and pedestrian multi-class detection method based on an improved ACF according to the present invention;
FIG. 2 is a flowchart of the operation of a preferred embodiment of the vehicle pedestrian detection framework of the present invention;
FIG. 3 is a schematic illustration of a training process for the vehicle detector of the present invention;
FIG. 4 is a schematic diagram of a training process for a pedestrian detector according to the present invention;
FIG. 5 is a statistical graph of the relationship between the target height of the calibration frame and the coordinates of its lower edge position according to the present invention;
FIG. 6 is a schematic diagram of the operating environment of a preferred embodiment of the vehicle and pedestrian multi-class detection program based on an improved ACF.
Detailed Description
The accompanying drawings, which are incorporated in and constitute a part of this application, illustrate preferred embodiments of the invention and together with the description, serve to explain the principles of the invention and not to limit the scope of the invention.
Referring to fig. 1, a vehicle and pedestrian multi-category detection method based on an improved ACF provided in an embodiment of the present invention includes the following steps:
S100, obtaining vehicle training samples and pedestrian training samples, and preprocessing the vehicle training samples and the pedestrian training samples.
In this embodiment, detectors must first be trained on samples before vehicles and pedestrians can be detected. To ensure detection precision, the sample data is first augmented and preprocessed. Specifically, in step S100, the samples are preprocessed as follows:
scaling the vehicle training sample and the pedestrian training sample in horizontal and vertical directions, and keeping the center position of each target in the vehicle training sample and the pedestrian training sample normalized.
Specifically, the current ACF detection algorithm augments the data set by horizontal flipping, which ignores the influence of labeling errors in the data set. In addition, images are generally standardized during training, and the targets in the scaled images are easily misaligned, which seriously degrades detection precision. The invention therefore removes horizontal flipping and adds multi-scale data augmentation: the original training samples are scaled by a factor of 1.1 in the horizontal direction, the vertical direction, and so on, while the target center position is kept normalized. Multi-scale augmentation reduces sensitivity to the background around the labeling frame and improves classification robustness.
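The multi-scale augmentation described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the `Box` type, the function name `multiscale_boxes`, and the choice to also generate a variant scaled in both directions (one reading of "and so on") are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Box:
    """Hypothetical labeling frame, stored as center plus size (pixels)."""
    cx: float
    cy: float
    w: float
    h: float


def multiscale_boxes(box: Box, factor: float = 1.1) -> List[Box]:
    """Return the original frame plus scaled variants.

    Every variant keeps the same center, so the target stays aligned
    after the crop is resized to the detector's model size.
    """
    return [
        box,                                               # original
        Box(box.cx, box.cy, box.w * factor, box.h),        # horizontal only
        Box(box.cx, box.cy, box.w, box.h * factor),        # vertical only
        Box(box.cx, box.cy, box.w * factor, box.h * factor),  # both (assumed)
    ]
```

Each returned frame would then be cropped from the image and resized to the training resolution, which exposes the classifier to slightly different amounts of background around the same target.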
S200, extracting multi-view aggregated channel features of the preprocessed vehicle training samples by using a vehicle and pedestrian detection framework, and establishing a vehicle detector according to the multi-view aggregated channel features;
S300, extracting context pixel aggregated channel features of the preprocessed pedestrian training samples by using the vehicle and pedestrian detection framework, and establishing a pedestrian detector according to the context pixel aggregated channel features.
In this embodiment, because the traditional ACF algorithm detects only a single kind of target, a multi-class vehicle and pedestrian detection framework is introduced: features are extracted from the vehicle samples and the pedestrian samples at the same time, and the ACF features are shared by the vehicle detector and the pedestrian detector for vehicle and pedestrian detection. Specifically, the invention provides a vehicle and pedestrian detection framework based on feature sharing, in which the vehicle detector and the pedestrian detector share the ACF features, improving detection efficiency while completing multi-class detection. In the training stage, the framework trains the pedestrian detector and the vehicle detector at the same time and lets them share ACF features, which significantly improves training efficiency. The framework is also general: other detectors can be added, so it is easy to extend to other detection classes.
Further, the step S200 specifically includes:
calculating a similarity (affinity) matrix among the sample points in the preprocessed vehicle training samples by using a spectral clustering algorithm, obtaining eigenvectors of multiple dimensions through spectral decomposition of the matrix, clustering the eigenvectors with a K-means algorithm to extract the aggregated channel features of multiple viewing angles, and then training the vehicle detector of each viewing angle with the aggregated channel features of that viewing angle.
Specifically, as shown in FIG. 3, when training the multi-view aggregated channel feature vehicle detector (Mv-ACF), the invention clusters the vehicle training samples, extracts the features of the samples of each viewing angle, and then trains the corresponding vehicle detector. Because the extracted ACF features are high-dimensional, the invention uses the unsupervised K-means algorithm. To guard against possible clustering degradation, a spectral clustering algorithm is adopted: the similarity (affinity) matrix among the sample points is calculated first, eigenvectors are obtained through spectral decomposition of the matrix to construct a new feature space, and K-means clustering is then performed in that space.
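The clustering pipeline described above (affinity matrix, spectral decomposition, K-means in the new feature space) can be sketched with NumPy. The patent does not state the affinity kernel, the Laplacian variant, or the K-means initialization, so the Gaussian kernel with width `sigma`, the symmetric normalized Laplacian, and the farthest-point initialization below are all assumptions; the function names are hypothetical.

```python
import numpy as np


def spectral_embed(X: np.ndarray, n_clusters: int, sigma: float = 1.0) -> np.ndarray:
    """Map samples (rows of X) into the space of leading Laplacian eigenvectors."""
    # Gaussian affinity between every pair of samples (assumed kernel).
    sq_dist = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    W = np.exp(-sq_dist / (2.0 * sigma ** 2))
    np.fill_diagonal(W, 0.0)
    # Symmetric normalized Laplacian L = I - D^{-1/2} W D^{-1/2}.
    d_inv_sqrt = 1.0 / np.sqrt(W.sum(axis=1))
    L = np.eye(len(X)) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
    # eigh returns eigenvalues in ascending order; keep the smallest ones.
    _, vecs = np.linalg.eigh(L)
    U = vecs[:, :n_clusters]
    return U / np.linalg.norm(U, axis=1, keepdims=True)


def kmeans(U: np.ndarray, k: int, n_iter: int = 50) -> np.ndarray:
    """Plain Lloyd iterations with deterministic farthest-point initialization."""
    centers = [U[0]]
    for _ in range(1, k):
        dist = np.min([np.sum((U - c) ** 2, axis=1) for c in centers], axis=0)
        centers.append(U[np.argmax(dist)])
    centers = np.array(centers)
    for _ in range(n_iter):
        dists = ((U[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1)
        labels = np.argmin(dists, axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = U[labels == j].mean(axis=0)
    return labels
```

In this setting each row of `X` would be the feature vector of one vehicle sample, and each resulting label would select the view-specific detector that the sample helps train.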
The effectiveness of the spectral clustering algorithm used by the invention was verified experimentally. The training samples were clustered into 20 classes with plain K-means, the clustered samples were used to train and test an Mv-ACF detector, and the results were compared with those of the spectral clustering algorithm, as shown in Table 1. The AP of the spectral clustering algorithm is higher than that of the K-means clustering algorithm at all of the different levels, so the spectral clustering algorithm achieves the expected effect.
Figure BDA0002704856750000091
In a preferred embodiment, the step S300 specifically includes:
extracting 10 feature channels from the preprocessed pedestrian training samples; processing the ten channels with 2 × 2 average pooling to obtain the aggregated channel feature $F_{2\times 2}$ with n = 2; applying 2 × 2 average pooling twice more to $F_{2\times 2}$ to obtain the region context pixel aggregated channel features $F_{4\times 4}$ and $F_{8\times 8}$; upsampling $F_{4\times 4}$ and $F_{8\times 8}$ to the resolution of $F_{2\times 2}$ and combining them into 30 deformation-resistant context pixel aggregated channel features of the same size; and establishing the pedestrian detector according to the context pixel aggregated channel features.
Specifically, a pedestrian's posture while walking causes deformation, which makes pedestrian detection more difficult. To this end, the invention proposes the Context Pixel Aggregated Channel feature (CP-ACF). As shown in FIG. 4, 10 feature channels are first extracted as in the ACF algorithm, and the ten channels are then processed with 2 × 2 average pooling to obtain the ACF feature $F_{2\times 2}$ with n = 2. On this basis, 2 × 2 average pooling is applied twice more to obtain the region context pixel aggregated features $F_{4\times 4}$ and $F_{8\times 8}$. Finally, $F_{4\times 4}$ and $F_{8\times 8}$ are upsampled to the resolution of $F_{2\times 2}$ and combined with it to form 30 deformation-resistant CP-ACF channel features of the same size, fusing local and context features. During soft-cascade Adaboost classification, the weak classifiers can adaptively select local and context features of different regions in the CP-ACF channels; compared with ACF, which can only select a fixed area, CP-ACF is more robust to deformation. The AP of CP-ACF and ACF at the different levels of the KITTI validation set is shown in the following table.
Figure BDA0002704856750000101
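The CP-ACF construction described above can be sketched with NumPy. The sketch assumes the 10 input channels are already computed and stored as an array of shape (10, H, W) with H and W divisible by 8; the patent does not say how the upsampling is done, so nearest-neighbour repetition is an assumption, and the function names are hypothetical.

```python
import numpy as np


def avg_pool2(x: np.ndarray) -> np.ndarray:
    """2 x 2 average pooling over the last two axes of a (C, H, W) array."""
    c, h, w = x.shape
    return x.reshape(c, h // 2, 2, w // 2, 2).mean(axis=(2, 4))


def upsample(x: np.ndarray, factor: int) -> np.ndarray:
    """Nearest-neighbour upsampling of a (C, H, W) array (assumed method)."""
    return x.repeat(factor, axis=1).repeat(factor, axis=2)


def cp_acf(channels: np.ndarray) -> np.ndarray:
    """Stack F2x2 with upsampled F4x4 and F8x8 into one feature map."""
    f2 = avg_pool2(channels)   # local aggregation (plain ACF)
    f4 = avg_pool2(f2)         # context over 4 x 4 pixel regions
    f8 = avg_pool2(f4)         # context over 8 x 8 pixel regions
    return np.concatenate([f2, upsample(f4, 2), upsample(f8, 4)], axis=0)
```

For a (10, 16, 16) input the result has shape (30, 8, 8): every spatial position carries the local value together with the averages of the two larger regions that contain it, which is what lets a weak classifier pick either local or context evidence.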
The invention designs the vehicle detector and the pedestrian detector within the framework separately and integrates road information to improve detection precision, while the feature sharing between the vehicle detector and the pedestrian detector improves the real-time performance of the algorithm.
S400, acquiring the preprocessed image to be detected, and sharing the aggregation channel characteristics of the image to be detected into the vehicle detector and the pedestrian detector to obtain a vehicle detection result and a pedestrian detection result.
Specifically, image feature extraction is usually the most time-consuming part of a detection algorithm. Conventional vehicle and pedestrian detection algorithms usually perform vehicle detection and pedestrian detection separately, extracting features from the image once for each, which takes a long time. The invention therefore proposes a vehicle and pedestrian detection framework based on feature sharing, shown in FIG. 2, in which the vehicle detector and the pedestrian detector share the ACF features, improving detection efficiency while completing multi-class detection.
Further, Mv-ACF and CP-ACF use the same 10 original ACF feature channels. The former extracts features with 2 × 2 average pooling, and the latter with three kinds of average pooling: 2 × 2, 4 × 4, and 8 × 8. The feature channels used by the latter therefore contain those of the former, and both build their feature pyramids in the same way, so the vehicle detector and the pedestrian detector can share the feature pyramid of the latter.
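The feature-sharing idea above amounts to computing the feature pyramid once per image and handing the same object to every detector. The sketch below illustrates only that control flow; `detect_all`, `extract_pyramid`, and the detector callables are hypothetical names, and the real detectors would be the trained Mv-ACF and CP-ACF classifiers.

```python
from typing import Any, Callable, Dict


def detect_all(image: Any,
               extract_pyramid: Callable[[Any], Any],
               detectors: Dict[str, Callable[[Any], list]]) -> Dict[str, list]:
    """Run every detector on one shared feature pyramid.

    The expensive extraction step is executed exactly once per image,
    regardless of how many detectors are registered.
    """
    pyramid = extract_pyramid(image)
    return {name: detector(pyramid) for name, detector in detectors.items()}
```

Because the vehicle detector only needs the 2 × 2 pooled channels, which are a subset of what the pedestrian detector uses, it can simply read its part of the shared pyramid. Adding a further detection class means registering one more entry in `detectors`.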
In a preferred embodiment, after the step S400, the method further includes:
and carrying out confidence score calibration on the vehicle detection result by adopting a parameterized Logistic regression calibration method.
Specifically, during testing, each subclass detector of the Mv-ACF uses a model trained on data from a different viewing angle, so the detection results contain detection frames whose confidence scores follow different distributions and whose geometric features (such as aspect ratio) are inconsistent. Merging them directly introduces noise, which destabilizes the subsequent NMS and reduces accuracy. The invention introduces a parameterized Logistic regression calibration method to calibrate the confidence scores of the detection results so that their distribution is more reasonable.
Specifically, let $\mathrm{Det}_i = \{d_{i1}, d_{i2}, \dots, d_{ij}, \dots, d_{ir}\}$ be the r detection results of the i-th subclass detector, where $d_{ij} = \{R_{ij}, c_{ij}\}$ denotes the j-th detection result, consisting of the detection frame $R_{ij}$ and the confidence score $c_{ij}$. Let $\mathrm{mDet}_i = \{md_{i1}, md_{i2}, \dots, md_{ij}, \dots, md_{ir}\}$ be the calibrated result, where $md_{ij} = \{R_{ij}, c'_{ij}\}$. The purpose of confidence score calibration is to find a calibration function $g_i$ such that $c'_{ij} = g_i(c_{ij})$. A parameterized Logistic regression calibration method is introduced to normalize the scores, namely:
Figure BDA0002704856750000111
wherein the parameters $A_i$ and $B_i$ of the i-th subclass detector are obtained by solving a regularized maximum likelihood problem:
Figure BDA0002704856750000112
substituting formula (1) into formula (2) to obtain
Figure BDA0002704856750000113
Wherein,
Figure BDA0002704856750000114
wherein $r_+$ and $r_-$ are, respectively, the numbers of positive and negative samples used to train the parameters $A_i$ and $B_i$ of the i-th subclass; $y_j$ is the label of the j-th sample, with $y_j = 1$ denoting target and $y_j = -1$ denoting background. This completes the confidence score calibration of the vehicle subclass detectors.
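The formula images for this calibration are not reproduced in this text, so the following is only a sketch of the standard parameterized logistic (Platt) calibration, which matches the surrounding description: a sigmoid with two parameters per subclass detector, fitted by regularized maximum likelihood with targets derived from the positive and negative sample counts. The exact sigmoid form $1/(1+\exp(A c + B))$, the smoothed targets, and the use of plain gradient descent as the solver are assumptions, and the function names are hypothetical.

```python
import math
from typing import List, Tuple


def platt_targets(labels: List[int]) -> List[float]:
    """Smoothed targets from the sample counts (standard Platt regularization)."""
    r_pos = sum(1 for y in labels if y == 1)
    r_neg = len(labels) - r_pos
    t_pos = (r_pos + 1.0) / (r_pos + 2.0)
    t_neg = 1.0 / (r_neg + 2.0)
    return [t_pos if y == 1 else t_neg for y in labels]


def calibrate(score: float, a: float, b: float) -> float:
    """Calibrated confidence c' = 1 / (1 + exp(a * c + b)) (assumed form)."""
    return 1.0 / (1.0 + math.exp(a * score + b))


def fit_platt(scores: List[float], labels: List[int],
              lr: float = 0.01, n_iter: int = 5000) -> Tuple[float, float]:
    """Fit (a, b) for one subclass detector by minimizing cross-entropy."""
    targets = platt_targets(labels)
    a, b = 0.0, 0.0
    for _ in range(n_iter):
        grad_a = grad_b = 0.0
        for s, t in zip(scores, targets):
            p = calibrate(s, a, b)
            # For this sigmoid form, d(loss)/d(a*s + b) = t - p.
            grad_a += (t - p) * s
            grad_b += (t - p)
        a -= lr * grad_a
        b -= lr * grad_b
    return a, b
```

One pair `(a, b)` would be fitted per view-specific detector on held-out scores labeled 1 (target) or -1 (background). After calibration, the scores from different subclass detectors are on a common scale and can be merged before NMS.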
S500, applying a road-constraint-based false detection rejection strategy to remove false detections from the vehicle detection result and the pedestrian detection result.
In this embodiment, to reduce the false detection rate, the output results are further subjected to a false detection rejection step: a road-constraint-based false detection rejection strategy is introduced, which lowers the false detection rate and refines the detection algorithm so that it is suitable for lightweight vehicle and pedestrian detection. Specifically, step S500 includes:
normalizing the height of the vehicle calibration frame of the vehicle training sample and the position coordinate of the lower edge of the vehicle calibration frame, and the height of the pedestrian calibration frame of the pedestrian training sample and the position coordinate of the lower edge of the pedestrian calibration frame;
training a first regression model between the height of the vehicle calibration frame and the position coordinates of the lower edge of the vehicle calibration frame and a second regression model between the height of the pedestrian calibration frame and the position coordinates of the lower edge of the pedestrian calibration frame by using a support vector machine;
calculating the height of a predicted vehicle calibration frame corresponding to the lower edge position of the vehicle detection result by adopting a first regression model, and calculating the height of a predicted pedestrian calibration frame corresponding to the lower edge position of a pedestrian in a pedestrian detection result by adopting a second regression model;
calculating a first error value between the height of the predicted vehicle calibration frame and the height of an actual vehicle calibration frame in the vehicle detection result, and a second error value between the height of the predicted pedestrian calibration frame and the height of the actual pedestrian calibration frame in the pedestrian detection result;
judging the vehicle detection result to be a false detection when the first error value is larger than a first threshold value, and otherwise accepting the vehicle detection result; and judging the pedestrian detection result to be a false detection when the second error value is larger than a second threshold value, and otherwise accepting the pedestrian detection result.
Wherein the first regression model is:
$H_1 = f(Y_1)$,
where $H_1$ denotes the height of the vehicle calibration frame and $Y_1$ denotes the position coordinate of the lower edge of the vehicle calibration frame;
the first error value is calculated by:
Figure BDA0002704856750000121
where $E_1$ denotes the first error value, $h_1$ denotes the height of the actual vehicle calibration frame, $h'_1$ denotes the height of the predicted vehicle calibration frame, and abs denotes the absolute value.
The second regression model is:
$H_2 = f(Y_2)$,
where $H_2$ denotes the height of the pedestrian calibration frame and $Y_2$ denotes the position coordinate of the lower edge of the pedestrian calibration frame;
the second error value is calculated by:
Figure BDA0002704856750000131
where $E_2$ denotes the second error value, $h_2$ denotes the height of the actual pedestrian calibration frame, $h'_2$ denotes the height of the predicted pedestrian calibration frame, and abs denotes the absolute value.
In other words, the false detections that occur during vehicle and pedestrian detection can be eliminated by using road prior information. To exploit this information, the invention first normalizes and collects statistics on the heights H and the lower-edge position coordinates Y of the 12186 pedestrian and 15891 vehicle calibration frames in the Caltech and KITTI training data sets. As shown in fig. 5, a certain statistical relationship exists between H and Y. Based on this relationship, the invention proposes a simple and efficient road constraint (GPC) false detection rejection strategy: a target that does not conform to the relationship is regarded as a false detection. The statistical relationship f is determined with a regression model between H and Y: H and Y of the training samples are first normalized, and a regression model W between H and Y is then trained using an SVM. The vehicle and pedestrian calibration frames given in the detection result are then compared with the corresponding ground truth to obtain the calibration frames closest to the true values. After the model is trained, for a detection frame {x, y, w, h} obtained after NMS, the lower edge position of the detection frame is y + h; the trained regression model W is used to calculate the corresponding h', and finally the relative error between h and h' is calculated.
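The rejection procedure described above can be sketched as follows. This is a minimal illustration under stated assumptions: an ordinary least-squares line stands in for the SVM regression named in the text, coordinates are normalized by dividing by the image height, and the relative error is taken as abs(h - h') / h because the formula images are not reproduced here. All function names and the threshold value are hypothetical.

```python
def fit_height_model(bottoms, heights):
    """Fit H = w0 + w1 * Y by least squares (stand-in for the SVM regression)."""
    n = len(bottoms)
    mean_y = sum(bottoms) / n
    mean_h = sum(heights) / n
    var_y = sum((y - mean_y) ** 2 for y in bottoms)
    cov = sum((y - mean_y) * (h - mean_h) for y, h in zip(bottoms, heights))
    w1 = cov / var_y
    w0 = mean_h - w1 * mean_y
    return w0, w1


def relative_error(h, h_pred):
    """Assumed error form: abs(h - h') / h."""
    return abs(h - h_pred) / h


def reject_false_detections(boxes, model, image_height, threshold):
    """Keep boxes (x, y, w, h) whose height agrees with the road constraint."""
    w0, w1 = model
    kept = []
    for x, y, w, h in boxes:
        bottom = (y + h) / image_height  # normalized lower-edge position
        h_norm = h / image_height        # normalized detected height
        h_pred = w0 + w1 * bottom        # height predicted from the lower edge
        if relative_error(h_norm, h_pred) <= threshold:
            kept.append((x, y, w, h))
    return kept
```

In this sketch, one model would be fitted per class (vehicle and pedestrian) from the normalized calibration frames of the training set, and each class would use its own threshold, mirroring the first and second regression models and thresholds in the text.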
As shown in fig. 6, based on the above improved ACF-based vehicle and pedestrian multi-class detection method, the present invention further provides an improved ACF-based vehicle and pedestrian multi-class detection device, which may be a computing device such as a mobile terminal, a desktop computer, a notebook, a palmtop computer, or a server. The device includes a processor 10, a memory 20, and a display 30. Fig. 6 shows only some of the components of the device; it should be understood that not all of the shown components need to be implemented, and that more or fewer components may be implemented instead.
In some embodiments, the memory 20 may be an internal storage unit of the improved ACF-based vehicle and pedestrian multi-class detection device, for example a hard disk or internal memory of the device. In other embodiments, the memory 20 may be an external storage device of the device, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card, or a flash card with which the device is equipped. Further, the memory 20 may include both an internal storage unit and an external storage device of the device. The memory 20 is used for storing application software installed on the device and various types of data, such as the program code installed on the device. The memory 20 may also be used to temporarily store data that has been output or is to be output. In one embodiment, the memory 20 stores an improved ACF-based vehicle and pedestrian multi-class detection program 40, which can be executed by the processor 10 to implement the improved ACF-based vehicle and pedestrian multi-class detection method of the embodiments of the present application.
The processor 10 may be, in some embodiments, a Central Processing Unit (CPU), microprocessor or other data Processing chip for running program codes stored in the memory 20 or Processing data, such as executing the improved ACF-based vehicle and pedestrian multi-category detection method.
In some embodiments, the display 30 may be an LED display, a liquid crystal display, a touch-sensitive liquid crystal display, an OLED (Organic Light-Emitting Diode) touch panel, or the like. The display 30 is used for displaying information processed by the improved ACF-based vehicle and pedestrian multi-class detection device and for displaying a visual user interface. The components 10-30 of the device communicate with each other via a system bus.
In one embodiment, the improved ACF-based vehicle and pedestrian multi-class detection method according to the above embodiment is implemented when the processor 10 executes the improved ACF-based vehicle and pedestrian multi-class detection program 40 in the memory 20, and since the improved ACF-based vehicle and pedestrian multi-class detection method has been described in detail above, it will not be described herein again.
In summary, in the improved ACF-based vehicle and pedestrian multi-class detection method, device and storage medium provided by the invention, a multi-class detection framework is adopted to detect vehicles and pedestrians simultaneously, which addresses the problem that the Adaboost classifier in the ACF detection algorithm detects only a single class; to address low vehicle and pedestrian detection precision, a multi-view vehicle detector and a context pixel pedestrian detector are adopted, which effectively capture the viewing-angle differences among vehicle samples and the deformation caused by a pedestrian's posture while walking, thereby improving detection precision; and to overcome false detections in the vehicle and pedestrian detection process, road prior information is used to remove them effectively.
Of course, it will be understood by those skilled in the art that all or part of the processes of the methods of the above embodiments may be implemented by a computer program instructing relevant hardware (such as a processor, a controller, etc.), and the program may be stored in a computer readable storage medium, and when executed, the program may include the processes of the above method embodiments. The storage medium may be a memory, a magnetic disk, an optical disk, etc.
The above description is only for the preferred embodiment of the present invention, but the scope of the present invention is not limited thereto, and any changes or substitutions that can be easily conceived by those skilled in the art within the technical scope of the present invention are included in the scope of the present invention.

Claims (10)

1. A vehicle and pedestrian multi-class detection method based on an improved ACF is characterized by comprising the following steps:
obtaining a vehicle training sample and a pedestrian training sample, and preprocessing the vehicle training sample and the pedestrian training sample;
extracting the multi-view aggregation channel characteristics of the preprocessed vehicle training samples by using a vehicle and pedestrian detection framework, and establishing a vehicle detector according to the multi-view aggregation channel characteristics;
extracting context pixel aggregation channel characteristics of the preprocessed pedestrian training sample by using a vehicle and pedestrian detection framework, and establishing a pedestrian detector according to the context pixel aggregation channel characteristics;
acquiring a preprocessed image to be detected, and sharing the aggregation channel characteristics of the image to be detected into the vehicle detector and the pedestrian detector to obtain a vehicle detection result and a pedestrian detection result;
and adopting a false detection and elimination strategy based on road constraint to carry out false detection and elimination on the vehicle detection result and the pedestrian detection result.
2. The improved ACF-based vehicle and pedestrian multi-category detection method according to claim 1, wherein the method for preprocessing the vehicle training samples and the pedestrian training samples is specifically:
scaling the vehicle training sample and the pedestrian training sample in horizontal and vertical directions, and keeping the center position of each target in the vehicle training sample and the pedestrian training sample normalized.
3. The improved ACF-based vehicle and pedestrian multi-category detection method according to claim 1, wherein the step of extracting the multi-view aggregated channel features of the vehicle training samples after preprocessing by using a vehicle and pedestrian detection framework and building a vehicle detector according to the multi-view aggregated channel features specifically comprises:
calculating a similarity (affinity) matrix among the sample points in the preprocessed vehicle training sample by adopting a spectral clustering algorithm, obtaining eigenvectors of multiple dimensions through spectral decomposition of the matrix, clustering the eigenvectors of the multiple dimensions by adopting a K-means algorithm to extract the aggregation channel features of the multiple viewing angles, and then training the vehicle detector of each corresponding viewing angle by using the aggregation channel features of that viewing angle.
4. The method as claimed in claim 1, wherein the step of extracting the context pixel aggregation channel feature of the preprocessed pedestrian training sample by the vehicle and pedestrian detection framework and building the pedestrian detector according to the context pixel aggregation channel feature specifically comprises:
extracting 10 feature channels of the preprocessed pedestrian training sample; processing the ten channels with 2×2 average pooling to obtain the aggregation channel feature $F_{2\times 2}$ with n = 2; applying 2×2 average pooling twice to the aggregation channel feature $F_{2\times 2}$ to obtain the regional context pixel aggregation channel features $F_{4\times 4}$ and $F_{8\times 8}$; sampling the $F_{4\times 4}$ and $F_{8\times 8}$ features to the $F_{2\times 2}$ resolution; combining them to form 30 deformation-resistant context pixel aggregation channel features of the same size; and establishing the pedestrian detector according to the context pixel aggregation channel features.
5. The improved ACF-based vehicle and pedestrian multi-category detection method according to claim 1, wherein the step of acquiring the pre-processed image to be detected, sharing the aggregated channel features of the image to be detected into the vehicle detector and the pedestrian detector to obtain vehicle detection results and pedestrian detection results further comprises:
and carrying out confidence score calibration on the vehicle detection result by adopting a parameterized Logistic regression calibration method.
6. The improved ACF-based vehicle and pedestrian multi-category detection method according to claim 1, wherein the step of performing false detection and rejection on the vehicle detection result and the pedestrian detection result by using a false detection and rejection strategy based on road constraints includes:
normalizing the height of the vehicle calibration frame of the vehicle training sample and the position coordinate of the lower edge of the vehicle calibration frame, and the height of the pedestrian calibration frame of the pedestrian training sample and the position coordinate of the lower edge of the pedestrian calibration frame;
training a first regression model between the height of the vehicle calibration frame and the position coordinates of the lower edge of the vehicle calibration frame and a second regression model between the height of the pedestrian calibration frame and the position coordinates of the lower edge of the pedestrian calibration frame by using a support vector machine;
calculating the height of a predicted vehicle calibration frame corresponding to the lower edge position of the vehicle detection result by adopting a first regression model, and calculating the height of a predicted pedestrian calibration frame corresponding to the lower edge position of a pedestrian in a pedestrian detection result by adopting a second regression model;
calculating a first error value between the height of the predicted vehicle calibration frame and the height of an actual vehicle calibration frame in the vehicle detection result, and a second error value between the height of the predicted pedestrian calibration frame and the height of the actual pedestrian calibration frame in the pedestrian detection result;
judging the vehicle detection result to be a false detection when the first error value is larger than a first threshold value, and otherwise accepting the vehicle detection result; and judging the pedestrian detection result to be a false detection when the second error value is larger than a second threshold value, and otherwise accepting the pedestrian detection result.
7. The improved ACF-based vehicle and pedestrian multi-class detection method as claimed in claim 6, wherein the first regression model is:
$H_1 = f(Y_1)$,
where $H_1$ denotes the height of the vehicle calibration frame and $Y_1$ denotes the position coordinate of the lower edge of the vehicle calibration frame;
the first error value is calculated by:
Figure FDA0002704856740000031
where $E_1$ denotes the first error value, $h_1$ denotes the height of the actual vehicle calibration frame, $h'_1$ denotes the height of the predicted vehicle calibration frame, and abs denotes the absolute value.
8. The improved ACF-based vehicle and pedestrian multi-category detection method of claim 6, wherein the second regression model is:
$H_2 = f(Y_2)$,
where $H_2$ denotes the height of the pedestrian calibration frame and $Y_2$ denotes the position coordinate of the lower edge of the pedestrian calibration frame;
the second error value is calculated by:
Figure FDA0002704856740000041
where $E_2$ denotes the second error value, $h_2$ denotes the height of the actual pedestrian calibration frame, $h'_2$ denotes the height of the predicted pedestrian calibration frame, and abs denotes the absolute value.
9. A vehicle pedestrian multi-category detection device based on an improved ACF, comprising: a processor and a memory;
the memory has stored thereon a computer readable program executable by the processor;
the processor, when executing the computer readable program, implements the steps in the improved ACF-based vehicle pedestrian multi-class detection method according to any one of claims 1 to 8.
10. A computer readable storage medium, wherein the computer readable storage medium stores one or more programs which are executable by one or more processors to implement the steps in the improved ACF-based vehicle pedestrian multi-class detection method according to any one of claims 1 to 8.
CN202011034733.9A 2020-09-27 2020-09-27 Vehicle pedestrian multi-category detection method and device based on improved ACF Active CN112215103B (en)

Publications (2)

Publication Number Publication Date
CN112215103A true CN112215103A (en) 2021-01-12
CN112215103B CN112215103B (en) 2024-02-23

Family

ID=74050818

Country Status (1)

Country Link
CN (1) CN112215103B (en)



Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant