CN112528938A

CN112528938A - Vehicle detection model training and detection method, device and computer storage medium thereof

Info

Publication number: CN112528938A
Application number: CN202011533150.0A
Authority: CN
Inventors: 毛艺凡; 钟南昌
Original assignee: Sichuan Yuncong Tianfu Artificial Intelligence Technology Co Ltd
Current assignee: Sichuan Yuncong Tianfu Artificial Intelligence Technology Co Ltd
Priority date: 2020-12-22
Filing date: 2020-12-22
Publication date: 2021-03-19

Abstract

The application provides a vehicle detection model training method, a vehicle detection method, corresponding devices, and a computer storage medium. The training method mainly comprises: determining, according to training samples, the vehicle type classification labels of the training samples and the position labels and visibility category labels of the vehicle key points in the training samples; constructing a vehicle type predictor and a key point predictor of a vehicle detection model; and training the two predictors respectively, using the training samples as input and using the vehicle type classification labels, and the position labels and visibility category labels of the vehicle key points, as output. The lightweight vehicle detection model designed by the application therefore saves calculation cost, so that it can run under low computing power, and can also detect vehicles of different vehicle types.

Description

Vehicle detection model training and detection method, device and computer storage medium thereof

Technical Field

The embodiment of the application relates to the technical field of artificial intelligence recognition, in particular to vehicle detection model training, a vehicle detection method and device and a computer storage medium.

Background

Vehicle key points are a dense representation of vehicle information. They can be used to analyze information such as vehicle orientation and integrity, and are widely needed in scenarios such as video structuring and vehicle re-identification.

In a real scene, vehicle key point detection needs to solve the following main problems. Firstly, the positions of key parts differ between vehicle types (such as cars, buses, trucks and tricycles), and so does the number of key points, so different detection models need to be designed for different vehicle types, which makes the calculation amount excessive. Secondly, the number of key points needs to be appropriate: a layout that is too dense makes the labeling cost too high, while a layout that is too sparse loses information and reduces the accuracy of the detection result. Finally, key point detection serves as a preprocessing algorithm of a video recognition system, so its computing power requirement should be kept as low as possible while high precision is maintained.

Disclosure of Invention

In view of the above, the present application provides a vehicle detection model training method and device, a vehicle detection method and device, and a computer storage medium, which overcome the above problems or at least partially solve them.

The first aspect of the application provides a vehicle detection model training method, which includes determining vehicle type classification labels of training samples and position labels and visibility category labels of vehicle key points in the training samples according to the training samples; and constructing a vehicle type predictor and a key point predictor of a vehicle detection model, training the vehicle type predictor by using the training samples as input and the vehicle type classification labels of the training samples as output, and training the key point predictor by using the training samples as input and using the position labels and the visibility class labels of the key points of the vehicles in the training samples as output.

A second aspect of the present application provides a computer storage medium having stored therein instructions for performing the steps of the vehicle detection model training method of the first aspect.

A third aspect of the present application provides a vehicle detection method, including: obtaining a target sample; the vehicle detection model trained by the vehicle detection model training method according to the first aspect is used to detect a target vehicle in the target sample, and a vehicle type classification detection value of the target vehicle and each coordinate detection value and each visibility category detection value corresponding to each vehicle key point of the target vehicle are obtained.

A fourth aspect of the present application provides a computer storage medium having stored therein instructions for performing the steps of the vehicle detection method of the third aspect.

The fifth aspect of the present application provides a vehicle detection model training device, which includes a training sample obtaining module, configured to determine, according to a training sample, a vehicle type classification label of the training sample, and a position label and a visibility category label of each vehicle key point in the training sample; the model training module is used for constructing a vehicle type predictor and a key point predictor of a vehicle detection model, training the vehicle type predictor by using the training sample as input and using the vehicle type classification label of the training sample as output, and training the key point predictor by using the training sample as input and using the position label and the visibility class label of each vehicle key point in the training sample as output.

A sixth aspect of the present application provides a vehicle detection device, comprising: the target sample acquisition module is used for acquiring a target sample; a vehicle detection module, configured to detect a target vehicle in the target sample by using the vehicle detection model trained by the vehicle detection model training apparatus according to the fifth aspect, and obtain a vehicle type classification detection value of the target sample and each coordinate detection value and each visibility category detection value corresponding to each vehicle key point in the target sample.

According to the above technical solutions, the vehicle type predictor and the key point predictor in the vehicle detection model provided by the embodiments of the application each adopt two fully-connected layers. This lightweight network structure saves calculation cost, so that the model meets the computing power requirement of a preprocessing algorithm for vehicle recognition.

Moreover, the vehicle appearances of different vehicle types are summarized by adopting the key point marking mode of the same style, so that the same vehicle detection model can be used for detecting the key points of the vehicles aiming at different vehicle types. Moreover, by designing the number of the key points of the vehicles with reasonable number, the labeling cost of the key points of the vehicles can be reduced while summarizing the appearance information of the vehicles of different vehicle types, so that the balance between the precision and the calculation power is realized.

Drawings

In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, the drawings needed to be used in the description of the embodiments or the prior art will be briefly described below, it is obvious that the drawings in the following description are only some embodiments described in the embodiments of the present application, and other drawings can be obtained by those skilled in the art according to the drawings.

FIG. 1 is a schematic flow chart diagram illustrating a vehicle detection model training method according to a first embodiment of the present application;

FIG. 2 is a schematic flow chart illustrating a vehicle detection method according to a third embodiment of the present application;

FIG. 3 is a schematic structural diagram of a vehicle detection model training apparatus according to a fifth embodiment of the present application;

FIG. 4 is a schematic structural diagram showing a vehicle detection device according to a sixth embodiment of the present application.

Element number

300: a vehicle detection model training device; 310: a training sample acquisition module; 320: a model training module; 330: a vehicle detection model; 331: a vehicle type predictor; 332: a key point predictor; 400: a vehicle detection device; 410: a target sample acquisition module; 420: a vehicle detection module.

Detailed Description

In order to make those skilled in the art better understand the technical solutions in the embodiments of the present application, the technical solutions in the embodiments of the present application will be described clearly and completely below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only a part of the embodiments of the present application, but not all embodiments. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments in the present application shall fall within the scope of the protection of the embodiments in the present application.

According to the background art, the current vehicle detection model has the problems that the calculation cost is high, the vehicle detection of different vehicle types cannot be met, and the like. In view of the above, embodiments of the present application provide a light-weight vehicle detection model training method, a light-weight vehicle detection model training device, and a detection method, a detection device, and a computer storage medium, which can solve the above-mentioned various technical problems.

The following further describes specific implementations of embodiments of the present application with reference to the drawings of the embodiments of the present application.

First embodiment

Fig. 1 shows a schematic flow chart of a vehicle detection model training method according to a first embodiment of the present application. As shown in the figure, the vehicle detection model training method of the embodiment mainly includes the following steps:

Step S11, determining, according to the training samples, the vehicle type classification labels of the training samples and the position labels and visibility category labels of each vehicle key point in the training samples.

In the present embodiment, the training samples are vehicle picture samples.

Optionally, the target vehicle in an original sample may be detected based on a preset vehicle detection algorithm to obtain vehicle envelope frame information of the target vehicle, and picture preprocessing is then performed on the original sample based on the vehicle envelope frame information to obtain a training sample.

Optionally, the preset vehicle detection algorithm at least includes one of a YOLO algorithm, an SSD algorithm, and a fast-RCNN algorithm, but is not limited thereto, and other algorithms may also be used to detect the target vehicle in the original sample, which is not limited in this application.

Specifically, the original sample is, for example, a panoramic image, a vehicle detector may be used to detect a target vehicle from the panoramic image based on a preset vehicle detection algorithm, a vehicle envelope frame is generated based on the detected target vehicle, and then a picture processing technique, such as cropping and scaling, is used to obtain a training sample required by the present application.
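The cropping and scaling described above can be sketched as follows. This is only an illustrative sketch in plain Python: the box format (x_min, y_min, x_max, y_max), the nearest-neighbour resizing, and the helper names clamp_box, crop, resize_nearest and make_training_sample are assumptions, not details given in the application.

```python
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # assumed format: (x_min, y_min, x_max, y_max) in pixels
Image = List[List[int]]          # row-major grayscale image, one list per row


def clamp_box(box: Box, width: int, height: int) -> Box:
    """Clip a vehicle envelope box to the image bounds."""
    x0, y0, x1, y1 = box
    return (max(0, x0), max(0, y0), min(width, x1), min(height, y1))


def crop(image: Image, box: Box) -> Image:
    """Cut the box region out of the image."""
    x0, y0, x1, y1 = box
    return [row[x0:x1] for row in image[y0:y1]]


def resize_nearest(image: Image, out_w: int, out_h: int) -> Image:
    """Scale an image to (out_w, out_h) with nearest-neighbour sampling."""
    in_h, in_w = len(image), len(image[0])
    return [
        [image[(r * in_h) // out_h][(c * in_w) // out_w] for c in range(out_w)]
        for r in range(out_h)
    ]


def make_training_sample(image: Image, box: Box, out_size=(4, 4)) -> Image:
    """Crop the detected vehicle and scale it to a fixed (width, height)."""
    height, width = len(image), len(image[0])
    clipped = clamp_box(box, width, height)
    if clipped[0] >= clipped[2] or clipped[1] >= clipped[3]:
        raise ValueError("envelope box does not overlap the image")
    return resize_nearest(crop(image, clipped), out_size[0], out_size[1])
```

In practice an image library would replace the list-based helpers; the sketch only shows the order of operations: clip the box, crop, then scale to the network input size.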

In this embodiment, each training sample is a single vehicle picture sample, that is, each training sample includes only one target vehicle.

Optionally, the vehicle type classification tag may include a car, a truck, a bus, and a tricycle, but is not limited thereto, and may also be arbitrarily adjusted according to an actual detection requirement, and the present application is not limited thereto.

In this embodiment, each training sample may be labeled with 20 vehicle key points, each of which corresponds to a different vehicle appearance part. Across different training samples, vehicle key points that correspond to the same vehicle appearance part carry the same key point number; that is, appearance parts common to different vehicle types are assigned the same key point number.

In this embodiment, the definition of each vehicle appearance part corresponding to 20 vehicle key points is shown in table 1 below:

TABLE 1

Optionally, the visibility category label is used to identify a visibility category for each vehicle keypoint.

In this embodiment, the visibility category labels include visible and invisible. Specifically, when a vehicle appearance part is blocked or hidden, the visibility category label of the vehicle key point of that part is invisible.

It should be noted that the visibility category labels are not limited to the above two categories, and more than two category labels may be set according to the actual detection requirement, which is not limited in the present application.
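The labels determined in step S11 can be pictured as one record per training sample. The sketch below is a hypothetical data layout, not one defined by the application: the class names, the field names, and the integer encoding of vehicle types and visibility categories (0 = invisible, 1 = visible) are all assumptions.

```python
from dataclasses import dataclass
from typing import List, Tuple

NUM_KEYPOINTS = 20                                   # per this embodiment
VEHICLE_TYPES = ("car", "truck", "bus", "tricycle")  # assumed index order
VISIBILITY = ("invisible", "visible")                # assumed index order


@dataclass
class KeypointLabel:
    index: int                 # key point number, shared across vehicle types
    xy: Tuple[float, float]    # position label
    visibility: int            # index into VISIBILITY


@dataclass
class SampleLabel:
    vehicle_type: int               # index into VEHICLE_TYPES
    keypoints: List[KeypointLabel]  # exactly NUM_KEYPOINTS entries, ordered by index

    def validate(self) -> None:
        """Check that the label follows the fixed 20-key-point layout."""
        if not 0 <= self.vehicle_type < len(VEHICLE_TYPES):
            raise ValueError("unknown vehicle type")
        if [kp.index for kp in self.keypoints] != list(range(NUM_KEYPOINTS)):
            raise ValueError("expected key points numbered 0..19 in order")
        for kp in self.keypoints:
            if not 0 <= kp.visibility < len(VISIBILITY):
                raise ValueError("unknown visibility category")
```

Because every vehicle type uses the same 20 key point numbers, one record type serves all samples; a part that is blocked or hidden keeps its slot and is simply marked invisible.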

Step S12, constructing a vehicle type predictor and a key point predictor of the vehicle detection model, training the vehicle type predictor by taking the training sample as input and the vehicle type classification label of the training sample as output, and training the key point predictor by taking the training sample as input and the position label and the visibility category label of each vehicle key point in the training sample as output.

Optionally, the constructed vehicle detection model is the ResNet18 model.

Specifically, the training samples can be used as input for the vehicle type predictor to predict the vehicle type of the target vehicle in the training samples, the vehicle type prediction result of the training samples is obtained, and the vehicle type prediction result and the vehicle type classification labels of the training samples are compared to train the vehicle type predictor.

Similarly, the training samples can be used as input, so that the key point predictor can predict the coordinate position and the visibility category of each vehicle key point in the training samples, obtain each coordinate prediction result and each visibility category prediction result corresponding to each vehicle key point in the training samples, compare the coordinate prediction results and the position labels of the same vehicle key point, and compare the visibility category prediction results and the visibility category labels of the same vehicle key point, so as to train the key point predictor.

In this embodiment, the vehicle detection model is optimized with stochastic gradient descent at an initial learning rate of 0.0001 and a batch size (the number of samples used in one iteration) of 256. Training runs for 100 epochs, and the learning rate is divided by ten at the 50th epoch and again at the 80th epoch.
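The step schedule described above can be written as a small function. This is a sketch under one assumption that the text does not settle: epochs are counted from 0 and the reduction takes effect from epoch 50 and epoch 80 onward. The function name learning_rate is illustrative.

```python
def learning_rate(epoch: int,
                  base_lr: float = 1e-4,
                  milestones=(50, 80),
                  factor: float = 0.1) -> float:
    """Return the learning rate for a given epoch (0-based, assumed).

    The rate starts at base_lr and is multiplied by `factor` once for
    every milestone that has been reached.
    """
    drops = sum(1 for m in milestones if epoch >= m)
    return base_lr * factor ** drops


# Learning rate at a few points of a 100-epoch run.
schedule = {epoch: learning_rate(epoch) for epoch in (0, 49, 50, 79, 80, 99)}
```

In a framework such as PyTorch, the same behaviour is usually obtained by pairing an SGD optimizer with a multi-step learning-rate scheduler whose milestones are 50 and 80 and whose decay factor is 0.1.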

Optionally, the vehicle type predictor and the key point predictor each comprise a plurality of fully-connected layers.

In this embodiment, the vehicle type predictor may include two fully-connected layers, where the number of channels in the first layer of the fully-connected layers is 256, and the number of channels in the second layer of the fully-connected layers is the number of vehicle type classification tags, for example, when the vehicle type classification tags include 4 types (car, truck, bus, and tricycle), the number of channels in the second layer of the fully-connected layers is 4.

In this embodiment, the keypoint predictor may include two fully-connected layers, where the number of channels in the first layer of the fully-connected layers is 1024.

In this embodiment, the second fully-connected layer of the key point predictor includes a key point position prediction sublayer for predicting the coordinate position of each vehicle key point and a key point category prediction sublayer for predicting the visibility category of each vehicle key point, and each of the two sublayers has 40 channels.
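To make the size of the two predictors concrete, the sketch below counts the parameters of the fully-connected layers. Two things are assumptions rather than statements of the application: the input feature width of 512 (the pooled feature size of a standard ResNet18 backbone) and the interleaved layout x0, y0, x1, y1, ... used to split the 40 position outputs. The channel count of 40 is consistent with 20 key points times 2 coordinates, and with 20 key points times 2 visibility categories.

```python
FEATURE_DIM = 512    # assumption: pooled ResNet18 feature width
NUM_TYPES = 4        # car, truck, bus, tricycle
NUM_KEYPOINTS = 20


def linear_params(n_in: int, n_out: int) -> int:
    """Weights plus biases of one fully-connected layer."""
    return n_in * n_out + n_out


def type_head_params(feature_dim: int = FEATURE_DIM,
                     hidden: int = 256,
                     num_types: int = NUM_TYPES) -> int:
    """Vehicle type predictor: feature -> 256 -> number of vehicle types."""
    return linear_params(feature_dim, hidden) + linear_params(hidden, num_types)


def keypoint_head_params(feature_dim: int = FEATURE_DIM,
                         hidden: int = 1024,
                         sub_out: int = 2 * NUM_KEYPOINTS) -> int:
    """Key point predictor: feature -> 1024 -> two parallel 40-channel sublayers."""
    return linear_params(feature_dim, hidden) + 2 * linear_params(hidden, sub_out)


def split_pairs(flat):
    """Group a flat 40-value output into 20 pairs (assumed interleaved layout)."""
    return [(flat[2 * k], flat[2 * k + 1]) for k in range(len(flat) // 2)]
```

Under these assumptions the vehicle type predictor has 132,356 parameters and the key point predictor 607,312, which is small next to the ResNet18 backbone they sit on.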

By means of the light-weight vehicle detection model structure design, calculation cost can be effectively saved, and the low calculation force requirement is met.

Optionally, the vehicle type predictor and the key point predictor can be iteratively optimized by using a preset model training loss function until the preset model training loss function converges to a stable value.

In this embodiment, the predetermined model training loss function is expressed as:

wherein Loss represents the total loss value over the training samples; N represents the number of training samples; i represents the i-th training sample; Lv_i represents the vehicle type classification prediction loss of the i-th training sample (cross-entropy loss is adopted in this embodiment); α represents the weight value of Lv_i; M represents the number of vehicle key points (20 in this embodiment); Lc_ij represents the visibility category prediction loss of the j-th vehicle key point of the i-th training sample (cross-entropy loss is adopted in this embodiment); β_j represents the weight value of the visibility category prediction loss of the j-th vehicle key point; Lr_ij represents the coordinate regression prediction loss of the j-th vehicle key point of the i-th training sample (L1 loss is adopted in this embodiment); and γ_j represents the weight value of the coordinate regression prediction loss of the j-th vehicle key point.
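The formula itself is not reproduced in this text, so the sketch below uses one form that is consistent with the symbol definitions: Loss = (1/N) Σ_i [ α·Lv_i + Σ_j ( β_j·Lc_ij + γ_j·Lr_ij ) ], with i running over the N training samples and j over the M key points. The averaging over N, the dictionary field names, and the use of predicted probabilities as input to the cross-entropy are assumptions.

```python
import math


def cross_entropy(probs, target: int) -> float:
    """Cross-entropy of a predicted probability vector against a class index."""
    return -math.log(probs[target])


def l1_loss(pred_xy, true_xy) -> float:
    """L1 distance between predicted and labeled coordinates."""
    return abs(pred_xy[0] - true_xy[0]) + abs(pred_xy[1] - true_xy[1])


def total_loss(samples, alpha: float, beta, gamma) -> float:
    """Weighted sum of type, visibility and coordinate losses, averaged over samples.

    beta[j] and gamma[j] are the per-key-point weights β_j and γ_j.
    """
    total = 0.0
    for sample in samples:
        # α · Lv_i: vehicle type classification term
        loss_i = alpha * cross_entropy(sample["type_probs"], sample["type_label"])
        for j, kp in enumerate(sample["keypoints"]):
            lc_ij = cross_entropy(kp["vis_probs"], kp["vis_label"])  # Lc_ij
            lr_ij = l1_loss(kp["xy_pred"], kp["xy_label"])           # Lr_ij
            loss_i += beta[j] * lc_ij + gamma[j] * lr_ij
        total += loss_i
    return total / len(samples)
```

The per-key-point weights β_j and γ_j let individual key points contribute more or less to training. Whether the coordinate term is suppressed for invisible key points is not stated, so the sketch applies it to every key point.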

In conclusion, the number of the key points of the vehicle is designed appropriately, so that the appearance shape information of vehicles in different vehicle types can be effectively summarized, and the labeling cost of the key points of the vehicle can be reduced. In addition, the same key point serial numbers are adopted for the same appearance parts of different vehicle types in the embodiment, so that the number of the key point labels of all the vehicle types is fixed, the shapes of the different vehicle types are summarized by using a unified key point labeling mode, the vehicle detection model of the embodiment can be suitable for detecting the key points of the vehicles of the different vehicle types, and the balance between the precision and the calculation power is realized.

Moreover, the vehicle type predictor and the key point predictor in the vehicle detection model respectively adopt 2 layers of full connection layers, and by means of the lightweight network structure design, the calculation cost of vehicle detection can be saved, and the calculation requirement of a vehicle identification detection preprocessing algorithm is met.

Second embodiment

A second embodiment of the present application provides a computer storage medium, which stores therein instructions for executing the steps of the vehicle detection model training method according to the first embodiment.

Third embodiment

Fig. 2 shows a flow chart of a vehicle detection method according to a third embodiment of the present application. As shown in the figure, the vehicle detection method of the present embodiment mainly includes the following:

Step S21, obtaining a target sample.

In the present embodiment, the target sample is a vehicle picture sample.

Step S22, using the vehicle detection model trained by the vehicle detection model training method described in the first embodiment, detects the target vehicle in the target sample, and obtains the vehicle type classification detection value of the target vehicle and the coordinate detection values and the visibility category detection values corresponding to the vehicle key points of the target vehicle.

In this embodiment, the vehicle detection model trained by the vehicle detection model training method described in the first embodiment above may be deployed to a vehicle inference engine, and the vehicle inference engine is utilized to perform detection on a target vehicle in a target sample.

Optionally, the vehicle inference engine includes at least one of an MNN inference engine and an NCNN inference engine, but not limited thereto, and other vehicle inference engines may be used.

Optionally, data post-processing may also be performed on the vehicle type classification detection value of the target vehicle and on the coordinate detection values and visibility category detection values corresponding to the vehicle key points of the target vehicle.

For example, the vehicle type classification detection value may be converted into a corresponding vehicle type, and each coordinate detection value corresponding to each vehicle key point may be converted into coordinate information in an image coordinate system.
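Such post-processing might look like the following sketch. It assumes that the model outputs key point coordinates normalized to [0, 1] within the cropped vehicle picture, that the vehicle types are indexed in the order car, truck, bus, tricycle, and that visibility index 1 means visible; none of these conventions is specified in the application, and the function names are illustrative.

```python
VEHICLE_TYPES = ("car", "truck", "bus", "tricycle")  # assumed index order


def argmax(values) -> int:
    """Index of the largest score."""
    return max(range(len(values)), key=values.__getitem__)


def postprocess(type_scores, kp_xy, kp_vis_scores, box):
    """Convert raw detection values into a vehicle type name and image coordinates.

    box is the vehicle envelope box (x_min, y_min, x_max, y_max) in the
    original image; kp_xy holds (u, v) pairs normalized to the crop (assumed).
    """
    x0, y0, x1, y1 = box
    width, height = x1 - x0, y1 - y0
    keypoints = []
    for (u, v), vis_scores in zip(kp_xy, kp_vis_scores):
        keypoints.append({
            "x": x0 + u * width,                 # back to image coordinates
            "y": y0 + v * height,
            "visible": argmax(vis_scores) == 1,  # assumed: index 1 = visible
        })
    return {"vehicle_type": VEHICLE_TYPES[argmax(type_scores)],
            "keypoints": keypoints}
```

For instance, with the envelope box (100, 200, 300, 400), a normalized key point (0.5, 0.25) maps to the image position (200, 250).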

In summary, the vehicle detection method provided by the embodiment of the application can detect the vehicle type and the vehicle key point of the vehicle, so that the time consumed by inference is saved, and the balance of computing power and inference precision is realized.

Fourth embodiment

A fourth embodiment of the present application provides a computer storage medium having stored therein instructions for executing the steps of the vehicle detection method described in the third embodiment.

Fifth embodiment

Fig. 3 is a schematic structural diagram showing a vehicle detection model training apparatus according to a fifth embodiment of the present application. As shown in the figure, the vehicle detection model training apparatus 300 of the present embodiment mainly includes a training sample obtaining module 310 and a model training module 320.

The training sample obtaining module 310 is configured to determine, according to a training sample, a vehicle type classification label of the training sample, and a position label and a visibility category label of each vehicle key point in the training sample.

Optionally, the training sample is a vehicle picture sample, and the training sample obtaining module 310 is further configured to detect a target vehicle in the original sample based on a preset vehicle detection algorithm, and obtain vehicle envelope frame information of the target vehicle; and performing picture pre-processing on the original samples based on the vehicle envelope box information to obtain the training samples.

Optionally, the preset vehicle detection algorithm includes at least one of a YOLO algorithm, an SSD algorithm, a fast-RCNN algorithm.

Optionally, each training sample is labeled with 20 vehicle key points; each vehicle key point in the training sample corresponds to different vehicle appearance parts, and the serial numbers of the key points of the vehicle key points corresponding to the same vehicle appearance part in different training samples are the same.

The model training module 320 is configured to construct a vehicle type predictor 331 and a key point predictor 332 of a vehicle detection model 330, train the vehicle type predictor 331 by using the training samples as input and the vehicle type classification labels of the training samples as output, and train the key point predictor 332 by using the training samples as input and the position labels and visibility category labels of the vehicle key points in the training samples as output.

Optionally, the vehicle detection model 330 is a ResNet18 model.

Optionally, the vehicle type predictor 331 and the key point predictor 332 each include a plurality of fully connected layers.

Optionally, the vehicle type predictor 331 includes two fully-connected layers, where the number of channels in a first layer of the fully-connected layers is 256, and the number of channels in a second layer of the fully-connected layers is the number of the vehicle type classification labels.

Optionally, the key point predictor 332 includes two fully-connected layers, where the number of channels in the first fully-connected layer is 1024, and the second fully-connected layer includes a key point position prediction sublayer and a key point category prediction sublayer, each with 40 channels. The key point position prediction sublayer is configured to predict the coordinate position of each vehicle key point, and the key point category prediction sublayer is configured to predict the visibility category of each vehicle key point.

Optionally, the model training module 320 further iteratively optimizes the vehicle type predictor and the key point predictor by using a preset model training loss function until the preset model training loss function converges to a stable value.

In addition, the vehicle detection model training device 300 according to the embodiment of the present invention may also be used to implement other steps in the foregoing vehicle detection model training method embodiments, and has the beneficial effects of the corresponding method step embodiments, which are not described herein again.

To sum up, the vehicle detection model training device of the embodiment of the application can effectively save calculation overhead by designing a light vehicle detection model so as to meet the calculation force requirements of a vehicle detection preprocessing algorithm and can be suitable for the vehicle detection requirements of different vehicle types.

Sixth embodiment

Fig. 4 shows a schematic structural diagram of a vehicle detection device according to a sixth embodiment of the present application, and as shown in the figure, the vehicle detection device 400 of the present embodiment includes a target sample acquisition module 410 and a vehicle detection module 420.

The target sample acquiring module 410 is used for acquiring a target sample.

The vehicle detection module 420 is configured to detect a target vehicle in the target sample by using the vehicle detection model 330 trained by the vehicle detection model training apparatus 300 according to the fifth embodiment, and obtain a vehicle type classification detection value of the target sample and each coordinate detection value and each visibility category detection value corresponding to each vehicle key point in the target sample.

Specifically, the vehicle detection module 420 detects the vehicle type of the target vehicle in the target sample by using the vehicle type predictor 331 in the vehicle detection model 330 to obtain the vehicle type classification detection value of the target sample, and detects the coordinate position and visibility category of each key point of the target vehicle in the target sample by using the key point predictor 332 in the vehicle detection model 330 to obtain the coordinate detection value and visibility category detection value corresponding to each vehicle key point in the target sample.

Optionally, the vehicle detection module 420 is configured to deploy the vehicle detection model 330 trained by the vehicle detection model training apparatus 300 according to the fifth embodiment to a vehicle inference engine, and to detect the target vehicle in the target sample by using the vehicle inference engine.

Optionally, the vehicle inference engine comprises at least one of a MNN inference engine, a NCNN inference engine.

In addition, the vehicle detection apparatus 400 according to the embodiment of the present invention can also be used to implement other steps in the foregoing vehicle detection method embodiments, and has the beneficial effects of the corresponding method step embodiments, which are not described herein again.

In summary, in the vehicle detection model designed by the vehicle detection model training method and device, the vehicle detection method and device, and the computer storage medium of the embodiments of the present application, the vehicle type predictor and the key point predictor each adopt two fully-connected layers, and this lightweight network structure effectively saves calculation overhead.

Moreover, the embodiment of the application adopts a unified key point marking mode to summarize the appearances of vehicles of different vehicle types, so that the same vehicle detection model can be used for detecting the vehicles of different vehicle types, and the design quantity of the key points is reasonable, so that the appearance information of the vehicles can be effectively summarized, the marking cost of the key points is reduced, and the balance between calculation precision and calculation power is realized.

Finally, it should be noted that: the above embodiments are only used for illustrating the technical solutions of the embodiments of the present application, and are not limited thereto; although the present application has been described in detail with reference to the foregoing embodiments, it should be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions in the embodiments of the present application.

Claims

1. A vehicle detection model training method, the method comprising:

determining vehicle type classification labels of the training samples and position labels and visibility category labels of key points of all vehicles in the training samples according to the training samples; and

constructing a vehicle type predictor and a key point predictor of a vehicle detection model, training the vehicle type predictor by using the training sample as input and the vehicle type classification label of the training sample as output, and training the key point predictor by using the training sample as input and using the position label and the visibility category label of each vehicle key point in the training sample as output.

2. The vehicle detection model training method according to claim 1, wherein the training samples are vehicle picture samples, the method further comprising:

detecting a target vehicle in an original sample based on a preset vehicle detection algorithm to obtain vehicle envelope frame information of the target vehicle; and

performing picture pre-processing for the raw samples based on the vehicle envelope box information to obtain the training samples.

3. The vehicle detection model training method of claim 2, wherein the predetermined vehicle detection algorithm comprises at least one of a YOLO algorithm, an SSD algorithm, a fast-RCNN algorithm.

4. The vehicle detection model training method according to claim 1, wherein:

each training sample is marked with 20 vehicle key points;

each vehicle key point in the training sample corresponds to different vehicle appearance parts, and the serial numbers of the key points of the vehicle key points corresponding to the same vehicle appearance part in different training samples are the same.

5. The vehicle detection model training method of claim 1, wherein the vehicle detection model comprises a ResNet18 model.

6. The vehicle detection model training method of claim 1, wherein the vehicle type predictor and the keypoint predictor each comprise a plurality of fully-connected layers.

7. The vehicle detection model training method of claim 6, wherein the vehicle type predictor comprises two fully-connected layers, wherein the number of channels of a first layer of the fully-connected layers is 256, and the number of channels of a second layer of the fully-connected layers is the total number of the vehicle type classification labels.

8. The vehicle detection model training method according to claim 6, wherein the keypoint predictor comprises two fully-connected layers, wherein the number of channels of a first layer of the fully-connected layers is 1024, and a second layer of the fully-connected layers comprises a keypoint location prediction sublayer and a keypoint category prediction sublayer, wherein the number of channels of the keypoint location prediction sublayer is 40, the keypoint location prediction sublayer is used for predicting the coordinate location of each vehicle keypoint, and the keypoint category prediction sublayer is used for predicting the visibility category of each vehicle keypoint.
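The layer sizes recited in claims 7 and 8 can be tabulated in a short sketch. Several values are assumptions: the heads are taken to consume the 512-dimensional pooled feature of a standard ResNet18, the number of vehicle types and of visibility categories are hypothetical, and the visibility sublayer is assumed to emit one score per key point and category (the claims give no channel count for it).

```python
from typing import Dict, Tuple

FEATURE_DIM = 512            # pooled feature size of a standard ResNet18 (assumed head input)
NUM_KEYPOINTS = 20           # from claim 4
NUM_VEHICLE_TYPES = 6        # hypothetical
NUM_VISIBILITY_CLASSES = 3   # hypothetical

Layout = Dict[str, Tuple[int, int]]  # layer name -> (in_channels, out_channels)


def head_layout(feature_dim: int = FEATURE_DIM,
                num_types: int = NUM_VEHICLE_TYPES,
                num_vis: int = NUM_VISIBILITY_CLASSES) -> Layout:
    """Return the fully-connected layer shapes of both predictors."""
    return {
        # Vehicle type predictor: 256 channels, then one channel per type label.
        "type_fc1": (feature_dim, 256),
        "type_fc2": (256, num_types),
        # Key point predictor: 1024 channels, then two parallel sublayers.
        "kp_fc1": (feature_dim, 1024),
        "kp_position": (1024, 2 * NUM_KEYPOINTS),          # 40 channels: (x, y) per key point
        "kp_visibility": (1024, NUM_KEYPOINTS * num_vis),  # assumed layout
    }


def param_count(layout: Layout) -> int:
    """Count weights plus biases over all fully-connected layers."""
    return sum(i * o + o for i, o in layout.values())


layout = head_layout()
total_params = param_count(layout)
```

Under these assumptions both heads together stay well under a million parameters, which fits the lightweight design goal stated in the abstract.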

9. The vehicle detection model training method of claim 1, further comprising:

iteratively optimizing the vehicle type predictor and the key point predictor by utilizing a preset model training loss function until the preset model training loss function converges to a stable value;

the preset model training loss function is expressed as:

wherein Loss represents the total loss value of the training samples; N represents the number of the training samples; i denotes the i-th training sample; Lv_i represents the vehicle type classification prediction loss of the i-th training sample; α represents the weight value of the vehicle type classification prediction loss of the i-th training sample; M represents the number of the vehicle key points; Lc_ij represents the visibility category prediction loss of the j-th vehicle key point of the i-th training sample; β_j represents the weight value of the visibility category prediction loss for the j-th vehicle key point; Lr_ij represents the coordinate regression prediction loss of the j-th vehicle key point of the i-th training sample; and γ_j represents the weight value of the coordinate regression prediction loss for the j-th vehicle key point.
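The expression itself does not survive in this text. One form consistent with the symbol definitions above is sketched here as an assumption, not as the claimed formula; in particular, whether the sum is additionally averaged over N cannot be recovered from the definitions.

```latex
\mathrm{Loss} = \sum_{i=1}^{N} \left( \alpha \, Lv_{i}
  + \sum_{j=1}^{M} \left( \beta_{j} \, Lc_{ij} + \gamma_{j} \, Lr_{ij} \right) \right)
```

In this reading, each training sample contributes one weighted classification term and, for each of the M vehicle key points, a weighted visibility term and a weighted coordinate regression term.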

10. A vehicle detection method, characterized in that the method comprises:

obtaining a target sample; and

detecting a target vehicle in the target sample using the vehicle detection model trained by the vehicle detection model training method according to any one of claims 1 to 9, to obtain a vehicle type classification detection value of the target vehicle and the coordinate detection value and visibility category detection value corresponding to each vehicle key point of the target vehicle.
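How the three detection values of claim 10 might be read from raw model outputs is sketched below. The sketch assumes that the 40 position outputs interleave `(x, y)` per key point, that the visibility output is one score list per key point, and that the highest score selects the category; the names `decode_outputs` and `argmax` are hypothetical.

```python
from typing import List, Sequence, Tuple

NUM_KEYPOINTS = 20  # from claim 4


def argmax(values: Sequence[float]) -> int:
    """Index of the largest value (first one on ties)."""
    return max(range(len(values)), key=lambda i: values[i])


def decode_outputs(
    type_scores: Sequence[float],
    position_out: Sequence[float],
    visibility_scores: Sequence[Sequence[float]],
) -> Tuple[int, List[Tuple[float, float, int]]]:
    """Turn raw head outputs into (vehicle_type, [(x, y, visibility), ...])."""
    if len(position_out) != 2 * NUM_KEYPOINTS:
        raise ValueError("expected 40 position values")
    if len(visibility_scores) != NUM_KEYPOINTS:
        raise ValueError("expected one score list per key point")
    vehicle_type = argmax(type_scores)
    keypoints = []
    for k in range(NUM_KEYPOINTS):
        x, y = position_out[2 * k], position_out[2 * k + 1]
        keypoints.append((x, y, argmax(visibility_scores[k])))
    return vehicle_type, keypoints


# Usage with made-up scores for 3 vehicle types and 3 visibility categories.
type_scores = [0.1, 2.3, -0.5]
position_out = [float(v) for v in range(40)]
visibility_scores = [[0.2, 0.7, 0.1]] * NUM_KEYPOINTS
vehicle_type, keypoints = decode_outputs(type_scores, position_out, visibility_scores)
```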

11. The vehicle detection method according to claim 10, characterized in that the method further comprises:

deploying the vehicle detection model trained by the vehicle detection model training method according to any one of claims 1 to 9 to a vehicle inference engine, and performing detection for the target vehicle in the target sample with the vehicle inference engine.

12. The vehicle detection method of claim 11, wherein the vehicle inference engine comprises at least one of an MNN inference engine and an NCNN inference engine.

13. A computer storage medium having stored therein instructions for performing the steps of the vehicle detection model training method according to any one of claims 1 to 9; or a computer storage medium having stored therein instructions for carrying out each of the steps of the vehicle detection method according to any one of claims 10 to 12.

14. A vehicle detection model training apparatus, characterized in that the apparatus comprises:

the training sample acquisition module is used for determining vehicle type classification labels of the training samples and position labels and visibility category labels of each vehicle key point in the training samples according to the training samples; and

the model training module is used for constructing a vehicle type predictor and a key point predictor of a vehicle detection model, training the vehicle type predictor by using the training sample as input and using the vehicle type classification label of the training sample as output, and training the key point predictor by using the training sample as input and using the position label and the visibility class label of each vehicle key point in the training sample as output.

15. The vehicle detection model training apparatus of claim 14, wherein the vehicle type predictor and the keypoint predictor each comprise a plurality of fully-connected layers.

16. The vehicle detection model training apparatus according to claim 15, wherein:

the vehicle type predictor comprises two fully-connected layers, wherein the number of channels of the first layer in the fully-connected layers is 256, and the number of channels of the second layer in the fully-connected layers is the number of vehicle type classification labels.

17. The vehicle detection model training apparatus according to claim 15, wherein:

the key point predictor comprises two fully-connected layers, wherein the number of channels of a first layer in the fully-connected layers is 1024, a second layer in the fully-connected layers comprises a key point position prediction sub-layer and a key point category prediction sub-layer, the number of channels of the key point position prediction sub-layer is 40, the key point position prediction sub-layer is used for predicting the coordinate position of each vehicle key point, and the key point category prediction sub-layer is used for predicting the visibility category of each vehicle key point.

18. A vehicle detection apparatus, characterized in that the apparatus comprises:

the target sample acquisition module is used for acquiring a target sample; and

a vehicle detection module that detects a target vehicle in the target sample using the vehicle detection model trained by the vehicle detection model training apparatus according to any one of claims 14 to 17, and obtains a vehicle type classification detection value of the target sample and each coordinate detection value and each visibility category detection value corresponding to each vehicle key point in the target sample.