CN112215103A

CN112215103A - Vehicle and pedestrian multi-class detection method and device based on improved ACF

Info

Publication number: CN112215103A
Application number: CN202011034733.9A
Authority: CN
Inventors: 石英; 黄紫旗; 谢长君; 张晖; 华捷
Original assignee: Wuhan University of Technology WUT
Current assignee: Wuhan University of Technology WUT
Priority date: 2020-09-27
Filing date: 2020-09-27
Publication date: 2021-01-12
Anticipated expiration: 2040-09-27
Also published as: CN112215103B

Abstract

The invention relates to a vehicle and pedestrian multi-class detection method and device based on an improved ACF, wherein the method comprises the following steps: obtaining a vehicle training sample and a pedestrian training sample, and preprocessing the vehicle training sample and the pedestrian training sample; extracting the multi-view aggregation channel characteristics of the preprocessed vehicle training samples and the context pixel aggregation channel characteristics of the pedestrian training samples by using a vehicle and pedestrian detection framework, establishing a vehicle detector according to the multi-view aggregation channel characteristics, and establishing a pedestrian detector according to the context pixel aggregation channel characteristics; sharing the aggregation channel characteristics of the image to be detected into the vehicle detector and the pedestrian detector to obtain a vehicle detection result and a pedestrian detection result; and adopting a false detection and elimination strategy based on road constraint to carry out false detection and elimination on the vehicle detection result and the pedestrian detection result. The invention solves the problems of single detection target, low detection precision and easy occurrence of false detection at present.

Description

Vehicle and pedestrian multi-class detection method and device based on improved ACF

Technical Field

The invention relates to the technical field of unmanned visual analysis, in particular to a vehicle and pedestrian multi-class detection method and device based on an improved ACF and a storage medium.

Background

With advances in science and technology and rising living standards, the number of automobiles in use has increased greatly, and various factors cause frequent traffic accidents. Unmanned driving seeks to alleviate this problem, and vehicle and pedestrian detection is a crucial part of it. The accuracy and real-time performance of the vehicle and pedestrian detection algorithm directly affect the safety of the unmanned vehicle.

Current mainstream vehicle and pedestrian detection algorithms fall into deep learning detection algorithms and statistical feature detection algorithms. CNN features in deep learning take a long time to train and are computationally expensive. Depending on the detection strategy, statistical feature detection methods can be subdivided into the DPM method and the decision tree method, and they are rarely applied in unmanned driving systems because of their high complexity and low running speed. In the decision tree method, the design of the feature descriptor is the key to the detection algorithm and is currently the most studied topic; descriptors mainly cover gradient, texture, and color features and their fusion. Haar features mainly extract texture information of a target and are widely used in vehicle detection. HOG and similar features capture the contour and shape of a target, are representative gradient features, and are generally used for pedestrian detection. Color features such as grayscale, RGB, and LUV can also be used to represent a target. However, these features can generally detect only specific targets, and their expressive power is limited in complex road scenes. To address these problems, Integral Channel Features (ICF), which fuse gradient, color, texture, and other features, were proposed first, and Aggregated Channel Features (ACF) were then proposed to improve detection performance. Later, LDCF added a filtering operation on top of ACF to strengthen its expressive power. Although this improves detection accuracy considerably over ACF, it also adds a large amount of computation and greatly reduces real-time performance, which makes LDCF difficult to apply to lightweight vehicle and pedestrian detection.

Therefore, when conventional vehicle and pedestrian detection methods are applied to road scenes, they suffer from a single detection target, low detection accuracy, and frequent false detections.

Disclosure of Invention

In view of the above, it is desirable to provide a method, an apparatus and a storage medium for detecting multiple types of vehicles and pedestrians based on an improved ACF, so as to solve the problems of single detection target, low detection accuracy and easy occurrence of false detection when detecting vehicles and pedestrians.

In a first aspect, the invention provides a vehicle and pedestrian multi-class detection method based on an improved ACF, comprising the following steps:

obtaining a vehicle training sample and a pedestrian training sample, and preprocessing the vehicle training sample and the pedestrian training sample;

extracting the multi-view aggregation channel characteristics of the preprocessed vehicle training samples by using a vehicle and pedestrian detection framework, and establishing a vehicle detector according to the multi-view aggregation channel characteristics;

extracting context pixel aggregation channel characteristics of the preprocessed pedestrian training sample by using a vehicle and pedestrian detection framework, and establishing a pedestrian detector according to the context pixel aggregation channel characteristics;

acquiring a preprocessed image to be detected, and sharing the aggregation channel characteristics of the image to be detected into the vehicle detector and the pedestrian detector to obtain a vehicle detection result and a pedestrian detection result;

and adopting a false detection and elimination strategy based on road constraint to carry out false detection and elimination on the vehicle detection result and the pedestrian detection result.

Preferably, in the method for detecting multiple categories of vehicles and pedestrians based on the improved ACF, the method for preprocessing the vehicle training samples and the pedestrian training samples specifically includes:

scaling the vehicle training sample and the pedestrian training sample in horizontal and vertical directions, and keeping the center position of each target in the vehicle training sample and the pedestrian training sample normalized.

Preferably, in the method for detecting multiple classes of vehicles and pedestrians based on the improved ACF, the step of extracting the multi-view aggregation channel feature of the vehicle training sample after the preprocessing by using the vehicle and pedestrian detection framework and establishing the vehicle detector according to the multi-view aggregation channel feature specifically includes:

calculating a similarity (affinity) matrix among the sample points of the preprocessed vehicle training samples by using a spectral clustering algorithm, obtaining eigenvectors of multiple dimensions through spectral decomposition of the matrix, clustering the eigenvectors with the K-means algorithm to extract the aggregated channel features of the multiple viewing angles, and then training the vehicle detector of each viewing angle with the aggregated channel features of that viewing angle.

Preferably, in the method for detecting multiple classes of vehicles and pedestrians based on the improved ACF, the step of extracting the context pixel aggregation channel feature of the preprocessed pedestrian training sample by using the vehicle and pedestrian detection framework, and establishing the pedestrian detector according to the context pixel aggregation channel feature specifically includes:

extracting 10 feature channels from the preprocessed pedestrian training samples; processing the ten channels with 2 × 2 average pooling to obtain the aggregated channel feature F_2×2 (n = 2); performing 2 × 2 average pooling twice more on F_2×2 to obtain the regional context pixel aggregated channel features F_4×4 and F_8×8; up-sampling the F_4×4 and F_8×8 features to the resolution of F_2×2 and combining them with F_2×2 to form 30 deformation-resistant context pixel aggregated channel features of the same size; and establishing the pedestrian detector according to the context pixel aggregated channel features.

Preferably, in the method for detecting multiple categories of vehicles and pedestrians based on the improved ACF, the step of obtaining the preprocessed image to be detected and sharing the aggregate channel characteristics of the image to be detected to the vehicle detector and the pedestrian detector to obtain the vehicle detection result and the pedestrian detection result further includes:

and carrying out confidence score calibration on the vehicle detection result by adopting a parameterized Logistic regression calibration method.

Preferably, in the method for detecting multiple categories of vehicles and pedestrians based on the improved ACF, the step of performing false detection and rejection on the vehicle detection result and the pedestrian detection result by using a false detection and rejection strategy based on road constraints includes:

normalizing the height of the vehicle calibration frame of the vehicle training sample and the position coordinate of the lower edge of the vehicle calibration frame, and the height of the pedestrian calibration frame of the pedestrian training sample and the position coordinate of the lower edge of the pedestrian calibration frame;

training a first regression model between the height of the vehicle calibration frame and the position coordinates of the lower edge of the vehicle calibration frame and a second regression model between the height of the pedestrian calibration frame and the position coordinates of the lower edge of the pedestrian calibration frame by using a support vector machine;

calculating the height of a predicted vehicle calibration frame corresponding to the lower edge position of the vehicle detection result by adopting a first regression model, and calculating the height of a predicted pedestrian calibration frame corresponding to the lower edge position of a pedestrian in a pedestrian detection result by adopting a second regression model;

calculating a first error value between the height of the predicted vehicle calibration frame and the height of an actual vehicle calibration frame in the vehicle detection result, and a second error value between the height of the predicted pedestrian calibration frame and the height of the actual pedestrian calibration frame in the pedestrian detection result;

when the first error value is larger than a first threshold, judging the vehicle detection result to be a false detection, and otherwise accepting the vehicle detection result; and when the second error value is larger than a second threshold, judging the pedestrian detection result to be a false detection, and otherwise accepting the pedestrian detection result.

Preferably, in the method for detecting multiple categories of vehicles and pedestrians based on the improved ACF, the first regression model is:

H₁＝f(Y₁)，

wherein H₁ denotes the height of the vehicle calibration frame, and Y₁ denotes the position coordinate of the lower edge of the vehicle calibration frame;

the first error value is calculated by:

wherein E₁ denotes the first error value, h₁ denotes the height of the actual vehicle calibration frame, h′₁ denotes the height of the predicted vehicle calibration frame, and abs denotes the absolute value.

Preferably, in the vehicle and pedestrian multi-category detection method based on the improved ACF, the second regression model is:

H₂＝f(Y₂)，

wherein H₂ denotes the height of the pedestrian calibration frame, and Y₂ denotes the position coordinate of the lower edge of the pedestrian calibration frame;

the second error value is calculated by:

wherein E₂ denotes the second error value, h₂ denotes the height of the actual pedestrian calibration frame, h′₂ denotes the height of the predicted pedestrian calibration frame, and abs denotes the absolute value.

In a second aspect, the present invention further provides a vehicle and pedestrian multi-category detection apparatus based on the improved ACF, including: a processor and a memory;

the memory has stored thereon a computer readable program executable by the processor;

the processor, when executing the computer readable program, implements the steps in the improved ACF-based vehicle pedestrian multi-class detection method as described above.

In a third aspect, the present invention also provides a computer readable storage medium storing one or more programs, which are executable by one or more processors to implement the steps in the improved ACF-based vehicle pedestrian multi-class detection method as described above.

Beneficial Effects

In the vehicle and pedestrian multi-class detection method, device and storage medium based on the improved ACF, a multi-class detection framework is adopted to detect vehicles and pedestrians simultaneously, which addresses the single detection class of the Adaboost classifier in the ACF detection algorithm; to address the low accuracy of vehicle and pedestrian detection, a multi-view vehicle detector and a context pixel pedestrian detector are adopted, which effectively capture the view-angle differences among vehicle samples and the deformation caused by a pedestrian's posture while walking, thereby improving detection accuracy; and to overcome false detections during vehicle and pedestrian detection, road prior information is used to remove them effectively.

Drawings

FIG. 1 is a flow chart of a vehicle and pedestrian multi-class detection method based on an improved ACF according to a preferred embodiment of the present invention;

FIG. 2 is a flowchart of the operation of a preferred embodiment of the vehicle pedestrian detection framework of the present invention;

FIG. 3 is a schematic illustration of a training process for the vehicle detector of the present invention;

FIG. 4 is a schematic diagram of a training process for a pedestrian detector according to the present invention;

FIG. 5 is a statistical graph of the relationship between the target height of the calibration frame and the coordinates of its lower edge position according to the present invention;

FIG. 6 is a schematic diagram of an operating environment of a preferred embodiment of a vehicle and pedestrian multi-class detection program based on an improved ACF.

Detailed Description

The accompanying drawings, which are incorporated in and constitute a part of this application, illustrate preferred embodiments of the invention and together with the description, serve to explain the principles of the invention and not to limit the scope of the invention.

Referring to fig. 1, a vehicle and pedestrian multi-category detection method based on an improved ACF provided in an embodiment of the present invention includes the following steps:

S100, obtaining a vehicle training sample and a pedestrian training sample, and preprocessing the vehicle training sample and the pedestrian training sample.

In this embodiment, detecting vehicles and pedestrians first requires training on samples, and to ensure detection accuracy the sample data must first be augmented and preprocessed. In step S100, the samples are preprocessed as follows:

Specifically, the existing ACF detection algorithm augments the data set by horizontal flipping and ignores the influence of labeling errors in the data set. In addition, images are generally normalized to a standard size during training, and targets in the scaled images are easily misaligned, which seriously affects detection accuracy. Therefore, horizontal flipping is removed and multi-scale data augmentation is added: the original training samples are scaled by a factor of 1.1 in the horizontal direction, the vertical direction, and so on, while the target center position is kept normalized. Multi-scale augmentation reduces sensitivity to the background around the labeling frame and improves classification robustness.
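The multi-scale augmentation described above can be illustrated with a minimal sketch. It assumes that a sample is represented by its labeling frame `(x, y, w, h)` and that "keeping the target center normalized" means scaling the frame about its own center; the function names and the joint horizontal-plus-vertical variant are illustrative assumptions, not details given in the description.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x, y, w, h): top-left corner plus size


def scale_box_about_center(box: Box, sx: float, sy: float) -> Box:
    """Scale a box by sx horizontally and sy vertically, keeping its center fixed."""
    x, y, w, h = box
    cx, cy = x + w / 2.0, y + h / 2.0
    new_w, new_h = w * sx, h * sy
    return (cx - new_w / 2.0, cy - new_h / 2.0, new_w, new_h)


def multiscale_variants(box: Box, factor: float = 1.1) -> List[Box]:
    """Return the original box plus horizontal, vertical, and joint scalings.

    The joint (both directions) variant is an assumption; the description only
    names the horizontal and vertical directions explicitly.
    """
    return [
        box,
        scale_box_about_center(box, factor, 1.0),     # horizontal only
        scale_box_about_center(box, 1.0, factor),     # vertical only
        scale_box_about_center(box, factor, factor),  # both directions
    ]
```

Each variant would then be cropped from the image and resized to the detector's training window, so the target stays centered while the amount of surrounding background varies.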

S200, extracting multi-view aggregation channel characteristics of the preprocessed vehicle training samples by using a vehicle and pedestrian detection frame, and establishing a vehicle detector according to the multi-view aggregation channel characteristics;

S300, extracting context pixel aggregation channel characteristics of the preprocessed pedestrian training sample by using a vehicle and pedestrian detection framework, and establishing a pedestrian detector according to the context pixel aggregation channel characteristics.

In this embodiment, because the conventional ACF algorithm detects only a single target class, a multi-class vehicle and pedestrian detection framework is introduced: features are extracted from the vehicle samples and the pedestrian samples at the same time, and the ACF features are shared by the vehicle detector and the pedestrian detector for vehicle and pedestrian detection. Specifically, the invention provides a vehicle and pedestrian detection framework based on feature sharing, in which the vehicle detector and the pedestrian detector share the ACF features, improving detection efficiency while completing multi-class detection. In the training stage, the framework trains the pedestrian detector and the vehicle detector together and lets them share the ACF features, which can significantly improve training efficiency. The framework is also general: other detectors can be added, so it is easy to extend to other detection classes.

Further, the step S200 specifically includes:

Specifically, as shown in fig. 3, when training the Multi-view Aggregated Channel Features vehicle detector (Mv-ACF), the invention clusters the vehicle training samples, extracts the features of the samples of each view, and then trains the corresponding vehicle detector. Because the extracted ACF features are high-dimensional, the invention uses the unsupervised K-means algorithm. To address possible clustering degradation, a spectral clustering algorithm is adopted: the similarity (affinity) matrix between sample points is computed first, eigenvectors are obtained through spectral decomposition of the matrix to construct a new feature space, and K-means is then used for clustering in that space.

The effectiveness of the spectral clustering algorithm used by the invention was verified experimentally. As a baseline, K-means clustering was applied directly to the training samples, grouping them into 20 classes. The clustered samples were used to train and test an Mv-ACF detector, and the results were compared with those of the spectral clustering algorithm, as shown in table 1. The AP of the spectral clustering algorithm is higher than that of the K-means clustering algorithm at every level, so the spectral clustering algorithm achieves the expected effect.
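The clustering step (similarity matrix, spectral decomposition, then K-means in the new feature space) might be sketched as follows. This is a generic normalized spectral clustering written with NumPy, not the patent's implementation: the Gaussian similarity kernel, the `sigma` parameter, the normalized Laplacian, and the farthest-point K-means initialization are all assumptions made for illustration.

```python
import numpy as np


def kmeans(points: np.ndarray, k: int, iters: int = 50) -> np.ndarray:
    """Plain Lloyd's K-means with deterministic farthest-point initialization."""
    centers = [points[0]]
    for _ in range(1, k):
        # Distance of every point to its nearest already-chosen center.
        d = np.min([np.sum((points - c) ** 2, axis=1) for c in centers], axis=0)
        centers.append(points[int(np.argmax(d))])
    centers = np.array(centers, dtype=float)
    labels = np.zeros(len(points), dtype=int)
    for _ in range(iters):
        dist = ((points[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = points[labels == j].mean(axis=0)
    return labels


def spectral_cluster(features: np.ndarray, k: int, sigma: float = 1.0) -> np.ndarray:
    """Cluster row vectors of `features` into k groups via spectral embedding."""
    # Similarity (affinity) matrix between samples; Gaussian kernel is an assumption.
    sq = ((features[:, None, :] - features[None, :, :]) ** 2).sum(axis=2)
    affinity = np.exp(-sq / (2.0 * sigma ** 2))
    np.fill_diagonal(affinity, 0.0)
    # Symmetric normalized Laplacian: I - D^{-1/2} W D^{-1/2}.
    d_inv_sqrt = 1.0 / np.sqrt(affinity.sum(axis=1) + 1e-12)
    laplacian = np.eye(len(affinity)) - d_inv_sqrt[:, None] * affinity * d_inv_sqrt[None, :]
    # Spectral decomposition: eigh returns eigenvalues in ascending order,
    # so the first k columns span the new feature space.
    _, vecs = np.linalg.eigh(laplacian)
    embedding = vecs[:, :k]
    embedding = embedding / (np.linalg.norm(embedding, axis=1, keepdims=True) + 1e-12)
    return kmeans(embedding, k)
```

In the setting described above, each row of `features` would be the ACF feature vector of one vehicle training sample, and each resulting cluster would supply the training set for one view-specific subclass detector.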

In a preferred embodiment, the step S300 specifically includes:

Specifically, the posture of a walking pedestrian causes deformation, which makes pedestrian detection more difficult. To this end, the invention proposes a Context Pixel Aggregated Channel Feature (CP-ACF). As shown in fig. 4, 10 feature channels are first extracted as in the ACF algorithm, and the ten channels are processed with 2 × 2 average pooling to obtain the ACF feature F_2×2 (n = 2). On this basis, 2 × 2 average pooling is applied twice more to obtain the regional context pixel aggregation features F_4×4 and F_8×8. Finally, F_4×4 and F_8×8 are up-sampled to the resolution of F_2×2 and combined with it to form 30 deformation-resistant CP-ACF channel features of the same size, fusing local and context features. During soft-cascade Adaboost classification, the weak classifiers can adaptively select local and context features of different regions in the CP-ACF channels; compared with ACF, which can only select features from fixed regions, CP-ACF is more robust to deformation. The AP accuracies of CP-ACF and ACF at the different levels of the KITTI validation set are shown in the following table.
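The construction of the 30 CP-ACF channels can be sketched in plain Python as below. The sketch assumes the 10 base channels are already available as 2-D lists whose height and width are divisible by 8, and it uses nearest-neighbor up-sampling; the up-sampling method, the channel ordering, and the function names are assumptions, since the description does not specify them.

```python
from typing import List

Channel = List[List[float]]  # one 2-D feature channel, indexed [row][column]


def avg_pool_2x2(ch: Channel) -> Channel:
    """Average each non-overlapping 2x2 block (height and width must be even)."""
    return [
        [
            (ch[2 * i][2 * j] + ch[2 * i][2 * j + 1]
             + ch[2 * i + 1][2 * j] + ch[2 * i + 1][2 * j + 1]) / 4.0
            for j in range(len(ch[0]) // 2)
        ]
        for i in range(len(ch) // 2)
    ]


def upsample(ch: Channel, factor: int) -> Channel:
    """Nearest-neighbor up-sampling by an integer factor (an assumed choice)."""
    return [
        [ch[i // factor][j // factor] for j in range(len(ch[0]) * factor)]
        for i in range(len(ch) * factor)
    ]


def cp_acf(channels: List[Channel]) -> List[Channel]:
    """Build F_2x2, F_4x4 and F_8x8 for every base channel at F_2x2 resolution."""
    f2_all, f4_all, f8_all = [], [], []
    for ch in channels:
        f2 = avg_pool_2x2(ch)   # F_2x2: the ordinary ACF aggregation
        f4 = avg_pool_2x2(f2)   # F_4x4: context over 4x4 pixel regions
        f8 = avg_pool_2x2(f4)   # F_8x8: context over 8x8 pixel regions
        f2_all.append(f2)
        f4_all.append(upsample(f4, 2))
        f8_all.append(upsample(f8, 4))
    # Assumed ordering: 10 local channels first, then the two context scales.
    return f2_all + f4_all + f8_all
```

With 10 input channels the function returns 30 channels of identical size, so a weak classifier can pick a local feature or a context feature at any position from the same feature map. A practical implementation would more likely use an array library than nested lists.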

The invention designs the vehicle detector and the pedestrian detector in the frame respectively, integrates road information, aims to improve the detection precision, simultaneously realizes the feature sharing of the vehicle detector and the pedestrian detector and improves the algorithm real-time property.

S400, acquiring the preprocessed image to be detected, and sharing the aggregation channel characteristics of the image to be detected into the vehicle detector and the pedestrian detector to obtain a vehicle detection result and a pedestrian detection result.

Specifically, the most time-consuming part of a detection algorithm is usually image feature extraction. Conventional vehicle and pedestrian detection algorithms usually perform vehicle detection and pedestrian detection separately, that is, features are extracted from the image once for each detector, which takes a long time. The invention therefore proposes a vehicle and pedestrian detection framework based on feature sharing, as shown in fig. 2, in which the vehicle detector and the pedestrian detector share the ACF features, improving detection efficiency while completing multi-class detection.

Further, since the Mv-ACF and the CP-ACF use the same 10 original ACF feature channels, the former uses 2 × 2 average pooling to extract features, and the latter uses three types of average pooling, namely 2 × 2, 4 × 4 and 8 × 8 average pooling to extract features, it can be seen that the feature channels used by the latter contain the former, and the feature pyramid construction patterns of the two are the same, so that the vehicle detector and the pedestrian detector can share the feature pyramid of the latter.
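The sharing argument above can be made concrete with a small sketch: the CP-ACF features are computed once per image, the vehicle detector reads only the F_2×2 subset, and the pedestrian detector reads all channels. The function and parameter names are hypothetical, and the sketch assumes the 10 F_2×2 channels come first in the shared feature list.

```python
from typing import Any, Callable, List, Sequence, Tuple


def detect_with_shared_features(
    image: Any,
    extract_cp_acf: Callable[[Any], Sequence[Any]],
    vehicle_detector: Callable[[Sequence[Any]], Any],
    pedestrian_detector: Callable[[Sequence[Any]], Any],
    base_channels: int = 10,
) -> Tuple[Any, Any]:
    """Run both detectors on one shared feature extraction pass."""
    features = extract_cp_acf(image)  # the expensive step, done only once
    # Mv-ACF needs only the 2x2-pooled channels (assumed to be the first ones).
    vehicle_result = vehicle_detector(features[:base_channels])
    # CP-ACF uses all channels: F_2x2 plus the F_4x4 and F_8x8 context channels.
    pedestrian_result = pedestrian_detector(features)
    return vehicle_result, pedestrian_result
```

Passing the extractor and detectors as callables keeps the sketch independent of any particular detector implementation; adding another detection class would only require another callable that reads from the same `features`.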

In a preferred embodiment, after the step S400, the method further includes:

Specifically, during testing, each subclass detector of the Mv-ACF is a model trained on data from a different viewing angle, so the detection results contain detection frames whose confidence scores follow different distributions and whose geometric features (such as aspect ratio) are inconsistent. Merging them directly introduces noise, which destabilizes the subsequent NMS and reduces accuracy. The invention therefore introduces a parameterized Logistic regression calibration method to calibrate the confidence scores of the detection results, so that their distribution is more reasonable.

Specifically, let Det_i = {d_i1, d_i2, …, d_ij, …, d_ir} be the r detection results of the i-th subclass detector, where d_ij = {R_ij, c_ij} denotes the j-th detection result, consisting of a detection frame R_ij and a confidence score c_ij. Let mDet_i = {md_i1, md_i2, …, md_ij, …, md_ir} be the calibrated results, where md_ij = {R_ij, c′_ij}. The purpose of confidence score calibration is to find a calibration function g_i such that c′_ij = g_i(c_ij). A parameterized Logistic regression calibration method is introduced to normalize the scores, namely:

wherein the parameters A_i and B_i of the i-th subclass detector are obtained by solving a regularized maximum likelihood problem:

substituting formula (1) into formula (2) to obtain

Wherein,

wherein r is₊And r_-Respectively for the ith subclass for training parameter A_iAnd B_iPositive and negative sample numbers of (1). y is_jLabel representing the jth sample, y_jExpressed as target, y_j-1 represents background. Through the above process, the confidence score calibration of the vehicle subclass detector is completed.

And S500, adopting a false detection and elimination strategy based on road constraint to carry out false detection and elimination on the vehicle detection result and the pedestrian detection result.

In this embodiment, to reduce the false detection rate, the output results are further subjected to a false-detection rejection step: a false-detection rejection strategy based on road constraints is introduced, which lowers the false detection rate and completes the detection algorithm, making the method suitable for lightweight vehicle and pedestrian detection. Specifically, the step S500 includes:

Wherein the first regression model is:

H₁＝f(Y₁)，

wherein H₁ denotes the height of the vehicle calibration frame, and Y₁ denotes the position coordinate of the lower edge of the vehicle calibration frame;

the first error value is calculated by:

The second regression model is:

H₂＝f(Y₂)，

the second error value is calculated by:

In other words, to exploit road prior information and eliminate false detections during vehicle and pedestrian detection, the invention first normalizes and collects statistics on the heights H of the 12186 pedestrian and 15891 vehicle calibration frames in the Caltech and KITTI training data sets and the position coordinates Y of the lower edges of those frames. As shown in fig. 5, a certain statistical relationship exists between H and Y. Based on this relationship, the invention provides a simple and efficient road constraint (GPC) false-detection rejection strategy: a target that does not conform to the relationship is regarded as a false detection. The statistical relationship f can be determined with a regression model between H and Y: H and Y of the training samples are first normalized, and a regression model W between H and Y is then trained using an SVM. The vehicle and pedestrian calibration frames given in the detection result are then compared with the corresponding ground truth to obtain the calibration frames closest to the true values. After the model is trained, for a detection frame {x, y, w, h} obtained after NMS, whose lower edge position is y + h, the trained regression model W is used to calculate the corresponding predicted height h′, and finally the relative error between h and h′ is calculated.
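The rejection rule described above might be sketched as follows. Two substitutions are made for the sake of a short, dependency-free example and should be read as assumptions: an ordinary least-squares line replaces the SVM regression named in the description, and the relative error is taken as abs(h − h′) / h, since the error formula itself is not reproduced in this text. Function names and the threshold value are hypothetical.

```python
from typing import List, Sequence, Tuple

Box = Tuple[float, float, float, float]  # (x, y, w, h) in normalized coordinates


def fit_height_model(bottoms: Sequence[float],
                     heights: Sequence[float]) -> Tuple[float, float]:
    """Fit h = k * y_bottom + b by least squares (stand-in for the SVM regression)."""
    n = len(bottoms)
    mean_y = sum(bottoms) / n
    mean_h = sum(heights) / n
    var = sum((y - mean_y) ** 2 for y in bottoms)
    cov = sum((y - mean_y) * (h - mean_h) for y, h in zip(bottoms, heights))
    k = cov / var
    return k, mean_h - k * mean_y


def reject_false_detections(dets: Sequence[Box], model: Tuple[float, float],
                            threshold: float) -> List[Box]:
    """Keep detections whose height agrees with the height predicted from
    their lower-edge position; drop the rest as false detections."""
    k, b = model
    kept = []
    for x, y, w, h in dets:
        predicted_h = k * (y + h) + b       # lower edge of the frame is y + h
        error = abs(h - predicted_h) / h    # assumed form of the relative error
        if error <= threshold:
            kept.append((x, y, w, h))
    return kept
```

In use, one model would be fitted on the vehicle calibration frames and another on the pedestrian calibration frames, each with its own threshold, mirroring the first and second regression models and thresholds described earlier.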

As shown in fig. 6, based on the above-mentioned multi-class vehicle and pedestrian detection method based on the improved ACF, the present invention further provides a multi-class vehicle and pedestrian detection device based on the improved ACF, where the multi-class vehicle and pedestrian detection device based on the improved ACF may be a mobile terminal, a desktop computer, a notebook, a palmtop computer, a server, and other computing devices. The improved ACF-based vehicle pedestrian multi-category detection apparatus includes a processor 10, a memory 20, and a display 30. Fig. 6 shows only some of the components of the improved ACF-based pedestrian multi-category detection apparatus, but it should be understood that not all of the shown components need be implemented, and that more or fewer components may be implemented instead.

The memory 20 may be an internal storage unit of the ACF-based vehicular pedestrian multi-class detection apparatus in some embodiments, for example, a hard disk or a memory of the ACF-based vehicular pedestrian multi-class detection apparatus. The memory 20 may also be an external storage device of the ACF-based vehicle pedestrian multi-class detection device in other embodiments, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) Card, a Flash Card (Flash Card), and the like, which are equipped on the ACF-based vehicle pedestrian multi-class detection device. Further, the memory 20 may also include both an internal memory unit and an external memory device of the ACF-based vehicle and pedestrian multi-category detection device. The memory 20 is used for storing application software installed in the improved ACF-based multi-class vehicle and pedestrian detection device and various types of data, such as program codes of the improved ACF-based multi-class vehicle and pedestrian detection device. The memory 20 may also be used to temporarily store data that has been output or is to be output. In one embodiment, the memory 20 stores an improved ACF-based vehicle and pedestrian multi-class detection program 40, and the improved ACF-based vehicle and pedestrian multi-class detection program 40 can be executed by the processor 10, so as to implement the improved ACF-based vehicle and pedestrian multi-class detection method according to the embodiments of the present application.

The processor 10 may be, in some embodiments, a Central Processing Unit (CPU), microprocessor or other data Processing chip for running program codes stored in the memory 20 or Processing data, such as executing the improved ACF-based vehicle and pedestrian multi-category detection method.

The display 30 may be an LED display, a liquid crystal display, a touch-sensitive liquid crystal display, an OLED (Organic Light-Emitting Diode) touch panel, or the like in some embodiments. The display 30 is used for displaying information at the improved ACF-based vehicular pedestrian multi-category detection apparatus and for displaying a visual user interface. The components 10-30 of the ACF-based vehicular pedestrian multi-category detection apparatus communicate with each other via a system bus.

In one embodiment, the improved ACF-based vehicle and pedestrian multi-class detection method according to the above embodiment is implemented when the processor 10 executes the improved ACF-based vehicle and pedestrian multi-class detection program 40 in the memory 20, and since the improved ACF-based vehicle and pedestrian multi-class detection method has been described in detail above, it will not be described herein again.

In summary, in the vehicle and pedestrian multi-class detection method, device and storage medium based on the improved ACF provided by the invention, a multi-class detection framework is adopted to detect vehicles and pedestrians simultaneously, which addresses the single detection class of the Adaboost classifier in the ACF detection algorithm; to address the low accuracy of vehicle and pedestrian detection, a multi-view vehicle detector and a context pixel pedestrian detector are adopted, which effectively capture the view-angle differences among vehicle samples and the deformation caused by a pedestrian's posture while walking, thereby improving detection accuracy; and to overcome false detections during vehicle and pedestrian detection, road prior information is used to remove them effectively.

Of course, it will be understood by those skilled in the art that all or part of the processes of the methods of the above embodiments may be implemented by a computer program instructing relevant hardware (such as a processor, a controller, etc.), and the program may be stored in a computer readable storage medium, and when executed, the program may include the processes of the above method embodiments. The storage medium may be a memory, a magnetic disk, an optical disk, etc.

The above description is only for the preferred embodiment of the present invention, but the scope of the present invention is not limited thereto, and any changes or substitutions that can be easily conceived by those skilled in the art within the technical scope of the present invention are included in the scope of the present invention.

Claims

1. A vehicle and pedestrian multi-class detection method based on an improved ACF is characterized by comprising the following steps:

2. The improved ACF-based vehicle and pedestrian multi-category detection method according to claim 1, wherein the method for preprocessing the vehicle training samples and the pedestrian training samples is specifically:

3. The improved ACF-based vehicle and pedestrian multi-category detection method according to claim 1, wherein the step of extracting the multi-view aggregated channel features of the vehicle training samples after preprocessing by using a vehicle and pedestrian detection framework and building a vehicle detector according to the multi-view aggregated channel features specifically comprises:

4. The method as claimed in claim 1, wherein the step of extracting the context pixel aggregation channel feature of the preprocessed pedestrian training sample by the vehicle and pedestrian detection framework and building the pedestrian detector according to the context pixel aggregation channel feature specifically comprises:

extracting 10 feature channels of the preprocessed pedestrian training sample; processing the ten channels by 2 × 2 average pooling to obtain the aggregation channel feature F_2×2 with n = 2; applying 2 × 2 average pooling twice to the aggregation channel feature F_2×2 to obtain the regional context pixel aggregation channel features F_4×4 and F_8×8; upsampling the F_4×4 and F_8×8 features to the F_2×2 resolution and combining them to form 30 deformation-resistant context pixel aggregation channel features of the same size; and establishing the pedestrian detector according to the context pixel aggregation channel features.
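The pooling and recombination recited in claim 4 can be illustrated with a minimal NumPy sketch. It is not the patented implementation. It assumes that the two further poolings are applied successively (F_2×2 to F_4×4, then F_4×4 to F_8×8), that nearest-neighbour interpolation is used for upsampling (the claim does not name an interpolation), and that the image height and width are divisible by 8. The function names `avg_pool_2x2`, `upsample_nearest` and `context_pixel_acf` are hypothetical, and random data stands in for the 10 feature channels.

```python
import numpy as np


def avg_pool_2x2(x):
    """2x2 average pooling with stride 2 over a (C, H, W) array; H and W must be even."""
    c, h, w = x.shape
    return x.reshape(c, h // 2, 2, w // 2, 2).mean(axis=(2, 4))


def upsample_nearest(x, factor):
    """Nearest-neighbour upsampling by an integer factor (an assumed interpolation)."""
    return x.repeat(factor, axis=1).repeat(factor, axis=2)


def context_pixel_acf(channels):
    """Build 30 context pixel aggregation channels from 10 input channels.

    channels: (10, H, W) array with H and W divisible by 8.
    Returns a (30, H/2, W/2) array: F_2x2, upsampled F_4x4, upsampled F_8x8.
    """
    f2 = avg_pool_2x2(channels)  # F_2x2, resolution H/2 x W/2
    f4 = avg_pool_2x2(f2)        # F_4x4, resolution H/4 x W/4
    f8 = avg_pool_2x2(f4)        # F_8x8, resolution H/8 x W/8
    return np.concatenate(
        [f2, upsample_nearest(f4, 2), upsample_nearest(f8, 4)], axis=0
    )


# Random stand-in for the 10 feature channels of a 64 x 32 pedestrian sample.
rng = np.random.default_rng(0)
channels = rng.random((10, 64, 32))
features = context_pixel_acf(channels)
print(features.shape)  # (30, 32, 16)
```

Each of the three groups of 10 channels covers the same spatial grid, so a pixel in the F_4×4 and F_8×8 groups carries the average of a progressively larger neighbourhood around the corresponding F_2×2 pixel.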

5. The improved ACF-based vehicle and pedestrian multi-category detection method according to claim 1, wherein the step of acquiring the pre-processed image to be detected, sharing the aggregated channel features of the image to be detected into the vehicle detector and the pedestrian detector to obtain vehicle detection results and pedestrian detection results further comprises:

6. The improved ACF-based vehicle and pedestrian multi-category detection method according to claim 1, wherein the step of performing false detection and rejection on the vehicle detection result and the pedestrian detection result by using a false detection and rejection strategy based on road constraints includes:

7. The improved ACF-based vehicle and pedestrian multi-class detection method as claimed in claim 6, wherein the first regression model is:

H₁ = f(Y₁),

the first error value is calculated by:

8. The improved ACF-based vehicle and pedestrian multi-category detection method of claim 6, wherein the second regression model is:

H₂ = f(Y₂),

the second error value is calculated by:

9. A vehicle pedestrian multi-category detection device based on an improved ACF, comprising: a processor and a memory;

the processor, when executing the computer readable program, implements the steps in the improved ACF-based vehicle pedestrian multi-class detection method according to any one of claims 1 to 8.

10. A computer readable storage medium, wherein the computer readable storage medium stores one or more programs which are executable by one or more processors to implement the steps in the improved ACF-based vehicle pedestrian multi-class detection method according to any one of claims 1 to 8.