CN118397403A

CN118397403A - Training method, device, equipment and medium for low-illumination vehicle image detection model

Info

Publication number: CN118397403A
Application number: CN202410864682.4A
Authority: CN
Inventors: 王雪雁; 胡昌隆; 周平; 胡美玲; 陈瑞宁; 李美熙
Original assignee: Zenmorn Hefei Technology Co ltd
Current assignee: Zenmorn Hefei Technology Co ltd
Priority date: 2024-07-01
Filing date: 2024-07-01
Publication date: 2024-07-26
Anticipated expiration: 2044-07-01
Also published as: CN118397403B

Abstract

The invention provides a training method, a training device, training equipment and training media for a low-illumination vehicle image detection model, wherein the training method comprises the following steps: acquiring an image dataset of a vehicle in a preset environment; preprocessing the image dataset to generate a training dataset; constructing an initial vehicle image detection model, and configuring an optimizer and a loss function of the initial vehicle image detection model; and training the initial vehicle image detection model by taking the training data set as an input variable of the initial vehicle image detection model to generate a target vehicle image detection model. The invention improves the precision of vehicle detection in low-illumination environment and solves the problem of poor detection effect of the conventional vehicle detection algorithm applied to the specific scene image.

Description

Training method, device, equipment and medium for low-illumination vehicle image detection model

Technical Field

The invention relates to the technical field of image processing, in particular to a training method, device, equipment and medium of a low-illumination vehicle image detection model.

Background

With the continuous development of computer vision technology, technologies for capturing vehicle images of road sites by monitoring and capturing a series of detection and identification of vehicles have been widely used. However, in a low-illumination environment, such as at night or in an area with poor illumination, the background color in the snap-shot picture is darker, so that the contour of the vehicle is difficult to distinguish, the detection accuracy of the conventional deep learning model is greatly reduced, and the position of the vehicle is difficult to accurately identify by the existing detection method. Meanwhile, due to the problem of view angles of cameras, targets are easily shielded by other objects, uncertainty of boundary outlines of different targets is caused, and accurate detection of vehicles cannot be achieved. Therefore, there is a need for improvement.

Disclosure of Invention

In view of the above drawbacks of the prior art, the present invention provides a training method, apparatus, device and medium for a low-illumination vehicle image detection model, so as to solve the above technical problems.

The invention provides a training method of a lane line recognition model, which comprises the following steps:

Acquiring an image dataset of a vehicle in a preset environment;

Preprocessing the image dataset to generate a training dataset;

constructing an initial vehicle image detection model, and configuring an optimizer and a loss function of the initial vehicle image detection model;

And training the initial vehicle image detection model by taking the training data set as an input variable of the initial vehicle image detection model to generate a target vehicle image detection model.

In one embodiment of the present invention, the step of acquiring an image dataset of a vehicle in a preset environment includes:

Acquiring a vehicle image in a preset environment, wherein the environment is a night or an environment with poor lighting conditions;

performing label labeling processing on the vehicle image to generate the image data set, wherein the image data set comprises a plurality of image data;

dividing the image data set into a training image set, a verification image set and a test image set according to a preset proportion.

In one embodiment of the present invention, the step of training the initial vehicle image detection model using the training data set as an input variable of the initial vehicle image detection model, and generating a target vehicle image detection model includes:

Setting a model training strategy, and inputting the training data set into the initial vehicle image detection model;

performing feature extraction processing on the vehicle images in the training data set through the encoder of the initial vehicle image detection model to generate multi-scale feature images;

carrying out refinement processing on the multi-scale feature image to generate multi-scale refinement features, wherein the multi-scale refinement features represent the position information of the vehicle;

And carrying out fusion classification processing on the multi-scale refined features to generate a vehicle segmentation map.

In one embodiment of the present invention, the step of refining the multi-scale feature image to generate a multi-scale refined feature, the multi-scale refined feature characterizing location information of a vehicle includes:

the multi-scale characteristic images are fused through a decoder of the initial vehicle image detection model so as to generate boundary information of a vehicle;

And carrying out fusion processing on the multi-scale feature map and the boundary information to generate the multi-scale refined feature.

In one embodiment of the present invention, the step of performing fusion classification processing on the multi-scale refinement feature to generate a vehicle segmentation map includes:

fusing the multi-scale refinement features layer by layer according to a preset sequence through a context attention network to generate a fused feature map;

And classifying each pixel point of the fusion feature map through a convolution layer to generate the vehicle segmentation map.

In one embodiment of the invention, the step of configuring the optimizer and loss function of the initial vehicle image detection model comprises:

Setting an optimizer of the initial vehicle image detection model as an adaptive moment estimation algorithm optimizer;

And setting the loss function of the initial vehicle image detection model according to the binary cross entropy loss function, the cross ratio loss function and the boundary loss function.

In one embodiment of the invention, the initial vehicle image detection model has a loss functionThe following formula is satisfied:

Wherein, Representing the binary cross entropy loss function,Representing the cross-ratio loss function,Representing the function of the boundary loss in question,Representing the proportion of boundary loss.

The invention also provides a training system of the low-illumination vehicle image detection model, which comprises the following steps:

the image acquisition module is used for acquiring an image data set of the vehicle in a preset environment;

the data processing module is used for preprocessing the image data set to generate a training data set;

the model construction module is used for constructing an initial vehicle image detection model and configuring an optimizer and a loss function of the initial vehicle image detection model;

the model training module is used for training the initial vehicle image detection model by taking the training data set as an input variable of the initial vehicle image detection model to generate a target vehicle image detection model.

The invention also provides a computer device comprising a memory, a processor and a computer program stored in the memory and executable on the processor, characterized in that the processor, when executing the computer program, implements the steps of the training method of the low-illuminance vehicle image detection model as described in any one of the above.

The present invention also provides a computer readable storage medium storing a computer program, characterized in that the computer program when executed by a processor implements the steps of the training method of the low-illuminance vehicle image detection model according to any one of the above.

In summary, the training method, device, equipment and medium of the low-illumination vehicle image detection model have the following beneficial effects: the invention improves the precision of vehicle detection in low-illumination environment and solves the problem of poor detection effect of the conventional vehicle detection algorithm applied to the specific scene image. In addition, the invention also utilizes the position relation of the multiple camera devices to realize the enhancement operation during test, so as to perform more accurate prediction and achieve higher detection precision.

It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the invention as claimed.

Drawings

The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the invention and together with the description, serve to explain the principles of the invention. It is evident that the drawings in the following description are only some embodiments of the present invention and that a further understanding of the present invention may be obtained from these drawings to those of ordinary skill in the art without undue effort. In the drawings:

Fig. 1 is a schematic flow chart of a training method of a low-illuminance vehicle image detection model provided by the invention.

Fig. 2 is a flow chart of an embodiment of step S100 in fig. 1.

Fig. 3 is a flowchart illustrating an embodiment of step S300 in fig. 1.

Fig. 4 is a flowchart illustrating an embodiment of step S400 in fig. 1.

Fig. 5 is a flowchart illustrating an embodiment of step S430 in fig. 4.

FIG. 6 is a diagram illustrating the integration of a network encoder and features in an embodiment of the present invention.

FIG. 7 is a schematic diagram showing boundary information guiding feature fusion in an embodiment of the present invention.

Fig. 8 is a flowchart illustrating an embodiment of step S440 in fig. 4.

FIG. 9 is a schematic diagram of a context attention fusion module according to an embodiment of the present invention.

Fig. 10 is a schematic structural diagram of a training system for a low-illuminance vehicle image detection model according to the present invention.

FIG. 11 is a schematic diagram of a computer device of the present invention.

FIG. 12 is a schematic diagram of another computer device of the present invention.

Detailed Description

Further advantages and effects of the present invention will become readily apparent to those skilled in the art from the description herein, by referring to the accompanying drawings and the preferred embodiments. The invention may be practiced or carried out in other embodiments that depart from the specific details, and the details of the present description may be modified or varied from the spirit and scope of the present invention. It should be understood that the preferred embodiments are presented by way of illustration only and not by way of limitation.

The drawings provided in the following embodiments merely illustrate the basic idea of the present invention by way of illustration, and only the components related to the present invention are shown in the drawings, not according to the number, shape and size of the components in actual implementation, the form, number and proportion of each component in actual implementation may be arbitrarily changed, and the layout of the components may be more complicated.

In the following description, numerous details are set forth in order to provide a more thorough explanation of embodiments of the present invention, it will be apparent to one skilled in the art that embodiments of the present invention may be practiced without these specific details, in other embodiments, well-known structures and devices are shown in block diagram form, rather than in detail, in order to avoid obscuring the embodiments of the present invention.

Referring to fig. 1, fig. 1 is a flow chart of a training method of a low-illuminance vehicle image detection model according to the present invention. The invention provides a training method of a low-illumination vehicle image detection model, which utilizes the technical means of semantic segmentation, feature fusion, attention mechanism, enhancement during test and the like to realize accurate detection of road and bridge passing vehicles under the low-illumination condition. The invention improves the precision of vehicle detection under low illumination and solves the problem of poor detection effect of the conventional vehicle detection algorithm applied to the specific scene image. The training method of the low-illumination vehicle image detection model provided by the invention can comprise the following steps:

step S100, acquiring an image dataset of a vehicle in a preset environment;

Step 200, preprocessing the image data set to generate a training data set;

step S300, an initial vehicle image detection model is built, and an optimizer and a loss function of the initial vehicle image detection model are configured;

and step 400, training the initial vehicle image detection model by taking the training data set as an input variable of the initial vehicle image detection model to generate a target vehicle image detection model.

Referring to fig. 2, fig. 2 is a flow chart illustrating an embodiment of step S100 in fig. 1. In one embodiment of the present invention, when step S100 is performed, an image dataset of a vehicle in a preset environment is acquired. Specifically, the steps S110 to S130 may be included, and the following details are described below:

Step S110, acquiring a vehicle image under a preset environment, wherein the environment is a night or an environment with poor illumination;

Step S120, carrying out label labeling processing on the vehicle image to generate the image data set, wherein the image data set comprises a plurality of image data;

and step S130, dividing the image data set into a training image set, a verification image set and a test image set according to a preset proportion.

In one embodiment of the present invention, when steps S110 to S130 are performed. Specifically, firstly, a plurality of intelligent traffic camera devices are adopted to collect vehicle images, for example, cameras arranged on roads and bridges, traffic signal lamps or mobile monitoring vehicles can be used for respectively shooting vehicles passing by the roads and bridges, and then vehicle images under a preset environment are generated. In this embodiment, the preset environment is a low-illuminance environment such as night or poor lighting condition. Then, using a picture marking tool, performing label marking processing on the acquired vehicle image in the low-illumination environment to generate corresponding image data. In this embodiment, the image labeling tool may use an open source image labeling tool Labelme. The image data may include a vehicle image and a tag image corresponding to the vehicle image. The plurality of image data are integrated, and an image dataset can be generated.

Further, since the number of images in the image data is large, it is required to distribute the images into a training image set, a verification image set and a test image set according to a preset ratio. For example, the preset ratio may be 4:1:1, i.e., the ratio of the number of image data in the training image set, the number of image data in the verification image set, and the number of image data in the test image set may be 4:1:1. The specific size of the preset proportion can be met without limitation, and the vehicle image detection model can be trained. The m vehicle images in the training image set may be represented asThe corresponding m label images can be expressed asWherein, the method comprises the steps of, wherein,Representing training image setThe image of the vehicle is displayed on a display screen,Representing training image setA label image of the sheet of the vehicle,. The n vehicle images in the verification image set may be represented asThe corresponding n label images may be represented asWhereinRepresenting the verification image setThe image of the vehicle is displayed on a display screen,Representing the verification image setA label image of the sheet of the vehicle,. The test image set may include several images of the vehicle for testing. In this embodiment, the test image samples may be selected from an average random sampling of most scenes.

In one embodiment of the invention, when step S200 is performed, the image dataset is preprocessed to generate a training dataset. Specifically, the resolution processing and normalization processing may be performed on all the image data in the image data set, so that each vehicle image and the label image corresponding to each vehicle image are unified to a preset size. In this embodiment, the preset size may be set to 352352 So that it facilitates convergence of the model.

In one embodiment of the present invention, when step S300 is performed, i.e., based on a self-attention mechanism, an initial vehicle image detection model is constructed, and an optimizer and a loss function of the initial vehicle image detection model are configured. Specifically, an encoder-decoder is built to construct an initial vehicle image detection model, in this embodiment, a transducer encoder is used as the encoder. Then, an optimizer and a loss function of the initial vehicle image detection model are configured.

Referring to fig. 3, fig. 3 is a flow chart illustrating an embodiment of step S300 in fig. 1. In one embodiment of the present invention, step S300 may include steps S310 to S320, which are described in detail below:

step S310, setting an optimizer of the initial vehicle image detection model as an adaptive moment estimation algorithm optimizer;

and step 320, setting a loss function of the initial vehicle image detection model according to the binary cross entropy loss function, the cross ratio loss function and the boundary loss function.

In one embodiment of the present invention, when steps S310 to S320 are performed. Specifically, first, an optimizer of the initial vehicle image detection model may be set as an adaptive moment estimation algorithm (Adam, adaptive Moment Estimation) optimizer. The Adam optimizer combines the ideas of RMSProp and Momentum optimization algorithms, and performs normalization processing on the parameter update, so that each parameter update has a similar magnitude, the learning rate can be adjusted according to the historical gradient information, and the training effect is improved. Wherein the initial learning rate is set as. The final loss function may then be calculated and set based on the binary cross entropy loss function, the cross ratio loss function, and the boundary loss function. In this embodiment, the binary cross entropy loss functionThe following formula may be satisfied:

Wherein, Representing the first in the vehicle imageThe true value of each pixel point,Representing the first in the vehicle imagePredicted values for individual pixels.

In this embodiment, the cross-ratio loss functionThe following formula may be satisfied:

Wherein, Representing truth areas in a vehicle imageAnd prediction regionIs used for the intersection of (a) and (b),Representing truth areas in a vehicle imageAnd prediction regionIs a union of (a) and (b).

In this embodiment, the boundary loss functionThe following formula may be satisfied:

Wherein, AndRepresenting the true and predicted values of the boundary, respectively.

In this embodiment, the final loss functionThe following formula may be satisfied:

Wherein, Represents the number of all pixels in the input vehicle image,，AndRepresenting the width and height of the input vehicle image respectively,Representing the proportion of boundary loss, in this example,The value of (2) may be 3.

Referring to fig. 4, fig. 4 is a flow chart illustrating an embodiment of step S400 in fig. 1. In one embodiment of the present invention, when step S400 is performed, the training data set is used as an input variable of the initial vehicle image detection model, and the initial vehicle image detection model is trained to generate a target vehicle image detection model. Specifically, step S400 may include steps S410 to S440, which are described in detail below:

Step S410, setting a model training strategy, and inputting the training data set into the initial vehicle image detection model;

step S420, carrying out feature extraction processing on the vehicle images in the training data set through the encoder of the initial vehicle image detection model so as to generate multi-scale feature images;

step S430, carrying out refinement processing on the multi-scale feature image to generate multi-scale refinement features, wherein the multi-scale refinement features represent the position information of the vehicle;

And step S440, carrying out fusion classification processing on the multi-scale refined features to generate a vehicle segmentation map.

In one embodiment of the present invention, when step S410 is performed, a model training strategy is set and the training data set is input to the initial vehicle image detection model. Specifically, first, the preset ratio 4 of the training image set and the verification image set may be: and 1, formulating a model training strategy. In this embodiment, the setting starts with the 10 th epoch, and after each epoch is finished, the current model accuracy is calculated on the verification image set and the model accuracy is reserved. Wherein epoch characterizes the process of training once using all samples in the training image set. And then calculating the current model precision after each epoch and comparing the current model precision with the previous model precision, replacing the previous model if the precision of the next model exceeds the progress of the previous model, and otherwise, reserving the previous model. Model accuracy、、AndAs an evaluation criterion, in this embodiment, a total of 60 epochs are set for model training, and the learning rate attenuation is set at the 40 th epoch, and the attenuation factor is taken to be 5.0. And then, taking the training data set as an input variable of the initial vehicle image detection model, and training and optimizing the initial vehicle image detection model according to the model training strategy so that the model is suitable for detecting the vehicle image in the low-illumination environment. In this embodiment, model training is performed based on the training image set, model accuracy verification is performed based on the verification image set, meanwhile, the model stores optimal weights, the accuracy of the training image set and the verification set is recorded, parameter adjustment is facilitated, and finally, a target vehicle image detection model meeting the preset accuracy requirement is obtained.

In one embodiment of the present invention, when step S420 is performed, that is, the feature extraction process is performed on the vehicle image in the training data set by the encoder of the initial vehicle image detection model, so as to generate a multi-scale feature image. Specifically, a network encoder of the initial vehicle image detection model adopts a transducer encoder, and the network encoder performs feature extraction processing on images in the training data set. the transducer encoder extracts features based on the PVT v2 network, and can downsample the images in the training dataset 4 times in total, thereby obtaining a multi-scale feature image. The multi-scale feature image may be recorded asWherein, the method comprises the steps of, wherein,Is of the size ofIs 2 times as large as the above. In this embodiment, define，) For low-level characteristic diagram, define%，) Is a high-level feature map.

Referring to fig. 5, fig. 5 is a flow chart illustrating an embodiment of step S430 in fig. 4. In one embodiment of the present invention, when step S430 is performed, the multi-scale feature image is subjected to refinement processing to generate multi-scale refined features, where the multi-scale refined features characterize position information of the vehicle. Specifically, step S430 may include steps S431 to S432, which are described in detail below:

Step S431, the multi-scale feature map is fused through a decoder of the initial vehicle image detection model so as to generate boundary information of a vehicle;

And step 432, performing fusion processing on the multi-scale feature map and the boundary information to generate the multi-scale refined feature.

Referring to fig. 6, fig. 6 is a schematic diagram showing the integration of the network encoder and the features according to an embodiment of the invention. In one embodiment of the present invention, when step S431 is performed, fusion processing is performed on the multi-scale feature map to generate boundary information of the vehicle. Specifically, the multi-scale feature maps 101 with different sizes are fused, so that boundary information 102 of the vehicle can be obtained. First, for low-level feature map，) The channels are compressed to 64 channels by adopting two 1X 1 convolution layers, and then the advanced characteristic diagram is processed in the same way，) Is compressed to 256 channels. Thus, two 64-channel characteristic diagrams can be obtained,) And two 256-channel feature maps,). Then, all the characteristic diagrams are combined,,,) Upsampling toAnd for four characteristic diagrams,,,) Tandem operation is performed to integrate multi-scale features. In fig. 6, C denotes a channel connection operation, and U denotes an up-sampling operation. Finally, two 3×3 convolution layers and one 1×1 convolution layer are used to obtain the boundary information. In this embodiment, the fusion process of the multi-scale feature map may satisfy the following formula:

Wherein, For the multi-scale feature map extracted by the network encoder,A convolution layer is shown and is shown in the form of a layer,An up-sampling operation is indicated and,Representation ofThe boundary information of the vehicle can thus be obtained and can be used as a priori information for the subsequent detection of the vehicle.

Referring to fig. 7, fig. 7 is a schematic diagram illustrating boundary information guidance feature fusion according to an embodiment of the invention. In one embodiment of the present invention, when step S432 is performed, the multi-scale feature map 101 and the boundary information 102 are subjected to a fusion process to generate the multi-scale refined feature 103. Specifically, first, use is made ofThe manner in which (a jump connection) the multi-scale features of the encoding stage are combined with the vehicle boundary information of the decoding stage. And then adopt(Feature addition) operation to obtain fusion features. The acceptance field of the feature map is then enhanced by the convolution module. In this embodiment, the convolution module consists of a series of parallel hole convolutions. The convolution module may employ three different rates of hole convolution layers, e.g., 6, 12, 18, respectively. In addition, the largest pool layer may be used to obtain local separation characteristics. After all features in the channel dimension are connected, a 1x 1 convolution can be employed to obtain multi-scale refined features with clearer acceptance fields, i.e., to generate vehicle location information. In this embodiment, the multi-scale feature map and the vehicle boundary information are fused to generate a multi-scale refined feature, which can satisfy the following formula:

Wherein, The input characteristics are represented as such,A communication policy is represented as such,Is a convolution layer.To have different expansion ratesIs provided with a hole convolution layer of (1),The maximum pooling layer is represented, so that multi-scale refinement features with higher receptive fields can be output, and the position information of the vehicle is obtained.

Referring to fig. 8, fig. 8 is a flow chart illustrating an embodiment of step S440 in fig. 4. In one embodiment of the present invention, when step S440 is performed, the multi-scale refinement feature is subjected to fusion classification processing to generate a vehicle segmentation map. Specifically, step S440 may include steps S441 to S442, which are described in detail below:

Step S441, fusing multi-scale refinement features layer by layer according to a preset sequence through a context attention network to generate a fused feature map;

step S442, classifying each pixel point of the fusion feature map through a convolution layer to generate the vehicle segmentation map.

Referring to fig. 9, fig. 9 is a schematic diagram of a context attention fusion module according to an embodiment of the invention. In one embodiment of the present invention, when step S441 is performed, the multiscale refinement features are fused layer by layer in a preset order through the contextual awareness network to generate a fused feature map. In particular, a strategy of layer-by-layer fusion from high to low may be employed for multi-scale refinement features. In this embodiment, feature fusion is performed through the context awareness network, and a dual-stage fusion method is adopted. Wherein in the first stage, the high-level features are up-sampled to the same size as the low-level features, and then fused by a context attention module, the first stage outputsThe following formula may be satisfied:

Wherein, The contextual attention module is represented as such,AndIn order to input the characteristics of the feature,Two upsampling operations are shown and,AndRespectively, refers to addition of an element and multiplication of an element.

In the second stage, the output of the first stage is taken as input to the contextual attention module, and the same operations as the first stage are adopted first. The output features are then input into a 3x3 convolutional layer. Finally, batch normalization is carried out, and a ReLU (RECTIFIED LINEAR unit, rectifying linear unit) activation function is adopted at the same time, so that the cross-stage fusion characteristic of the second stage can be obtainedAnd generating a fusion characteristic diagram. Cross-level fusion feature of the second stageThe following formula may be satisfied:

Wherein, Representing convolution, batch normalization and ReLU activation functions,Is the output of the first stage.

In one embodiment of the present invention, when step S442 is performed, a classification process is performed on each pixel point of the fused feature map by a convolution layer, so as to generate the vehicle segmentation map. Specifically, mapping from a final convolution layer of the deep neural network to an original image is performed, and each pixel point of the fusion feature map is classified, so that a final vehicle segmentation map is obtained.

Further, in an embodiment of the present invention, a trained target vehicle image detection model may be used to determine whether a pixel point in an input image belongs to a target to be extracted. Specifically, the target vehicle image detection model is embedded into detection equipment to perform real-time vehicle detection, and meanwhile, test enhancement operation is performed on the overlapped area to obtain a more accurate detection result. In the present embodiment, first, setting is performed after the last layer of the decoder of the target vehicle image detection modelA function that calculates the confidence of the pixel point and is defined between 0 and 1:

Wherein, Representation of the first of the feature graphsIndividual pixel pointsAnd calculating to obtain a probability value that the target confidence is a positive sample, and setting a threshold to define a target and a non-target.

Then, through the relation between the camera position and the actual bearing area of the image, different real-time vehicle detection inferences are carried out, if an overlapping area is generatedEnhancement operation during test is performed simultaneously:

Wherein, for the overlapping region in the feature map Pixel points corresponding to each otherAnd calculating to obtain a probability value that the target confidence is a positive sample. Finally, through the overlapping areaThe average confidence coefficient is calculated for a plurality of times, the final detection precision is improved, and a more accurate detection result is obtained.

Referring to fig. 10, fig. 10 is a schematic structural diagram of a training system for a low-illumination vehicle image detection model according to the present invention. The invention also provides a training device of the low-illumination vehicle image detection model, which corresponds to the training method in the embodiment one by one. The training apparatus may include an image acquisition module 201, a data processing module 202, a model building module 203, and a model training module 204. The functional modules are described in detail as follows:

the image acquisition module 201 may be configured to acquire an image dataset of a vehicle in a preset environment. Further, the image obtaining module 201 may be specifically configured to obtain an image of the vehicle in a preset environment, where the environment is a night or an environment with poor illumination; performing label labeling processing on the vehicle image to generate the image data set, wherein the image data set comprises a plurality of image data; dividing the image data set into a training image set, a verification image set and a test image set according to a preset proportion.

The data processing module 202 may be used to pre-process the image data set to generate a training data set. Specifically, the data processing module 202 may perform the resolution process and the normalization process on all the image data in the image data set, so that each vehicle image and the label image corresponding to each vehicle image are unified to a preset size. In this embodiment, the preset size may be set to 352×352 so that it is advantageous for the convergence of the model.

The model construction module 203 may be used to construct an initial vehicle image detection model and configure an optimizer and a loss function of the initial vehicle image detection model. Further, the model building module 203 may be specifically configured to set an optimizer of the initial vehicle image detection model to be an adaptive moment estimation algorithm optimizer; and setting the loss function of the initial vehicle image detection model according to the binary cross entropy loss function, the cross ratio loss function and the boundary loss function.

The model training module 204 may be configured to train the initial vehicle image detection model using the training data set as an input variable to the initial vehicle image detection model to generate a target vehicle image detection model. Further, the model training module 204 may be specifically configured to set a model training strategy and input the training data set into the initial vehicle image detection model; performing feature extraction processing on the vehicle images in the training data set through the encoder of the initial vehicle image detection model to generate multi-scale feature images; carrying out refinement processing on the multi-scale feature image to generate multi-scale refinement features, wherein the multi-scale refinement features represent the position information of the vehicle; and carrying out fusion classification processing on the multi-scale refined features to generate a vehicle segmentation map.

For specific limitations of the training device, reference may be made to the limitations of the training method described above, and will not be repeated here. The various modules in the training device described above may be implemented in whole or in part by software, hardware, and combinations thereof. The above modules may be embedded in hardware or may be independent of a processor in the computer device, or may be stored in software in a memory in the computer device, so that the processor may call and execute operations corresponding to the above modules.

Referring to fig. 11, the present invention further provides a computer device, which may be a server. The computer device includes a processor, a memory, a network interface, and a database connected by a system bus. Wherein the processor of the computer device is configured to provide computing and control capabilities. The memory of the computer device includes non-volatile and/or volatile storage media and internal memory. The non-volatile storage medium stores an operating system, computer programs, and a database. The internal memory provides an environment for the operation of the operating system and computer programs in the non-volatile storage media. The network interface of the computer device is for communicating with an external client via a network connection. The computer program is executed by a processor to perform the functions or steps of a training method for a low-light vehicle image detection model.

Referring to fig. 12, the present invention also provides another computer device, which may be a client. The computer device includes a processor, a memory, a network interface, a display screen, and an input device connected by a system bus. Wherein the processor of the computer device is configured to provide computing and control capabilities. The memory of the computer device includes a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system and a computer program. The internal memory provides an environment for the operation of the operating system and computer programs in the non-volatile storage media. The network interface of the computer device is for communicating with an external server via a network connection. The computer program is executed by a processor to perform the functions or steps of a training method for a low-light vehicle image detection model.

In one embodiment of the invention, a computer device is provided comprising a memory, a processor and a computer program stored on the memory and executable on the processor, the processor implementing the following steps when executing the computer program:

Acquiring an image dataset of a vehicle in a preset environment;

Preprocessing the image dataset to generate a training dataset;

In one embodiment of the present invention, there is provided a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, performs the steps of:

Acquiring an image dataset of a vehicle in a preset environment;

Preprocessing the image dataset to generate a training dataset;

It should be noted that, the functions or steps that can be implemented by the computer readable storage medium or the computer device may correspond to those described in the foregoing method embodiments, and are not described herein for avoiding repetition.

Those skilled in the art will appreciate that implementing all or part of the above-described methods may be accomplished by way of a computer program, which may be stored on a non-transitory computer readable storage medium and which, when executed, may comprise the steps of the above-described embodiments of the methods. Any reference to memory, storage, database, or other medium used in embodiments provided herein may include non-volatile and/or volatile memory. The nonvolatile memory can include Read Only Memory (ROM), programmable ROM (PROM), electrically Programmable ROM (EPROM), electrically Erasable Programmable ROM (EEPROM), or flash memory. Volatile memory can include Random Access Memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in a variety of forms such as Static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double Data Rate SDRAM (DDRSDRAM), enhanced SDRAM (ESDRAM), synchronous link (SYNCHLINK) DRAM (SLDRAM), memory bus (Rambus) direct RAM (RDRAM), direct memory bus dynamic RAM (DRDRAM), and memory bus dynamic RAM (RDRAM), among others.

In summary, the invention provides a training method, device, equipment and medium for a low-illumination vehicle image detection model, which can be applied to the technical field of image processing. The invention realizes the detection of the passing of the road and bridge at low illumination by utilizing the technical means of semantic segmentation, feature fusion, attention mechanism, enhancement during test and the like, improves the precision of vehicle detection under low illumination, and solves the problem of poor detection effect of the conventional vehicle detection algorithm applied to the specific scene image. Based on the encoder/decoder model based on the deep learning, the category of each pixel is automatically acquired according to the image captured by the image capturing device. Through fusion of different depth features, more representative image features are obtained. The loss function is set by setting different weights on the binary cross entropy loss function, the cross ratio loss function and the boundary loss, so that the training effect of the model is more stable and the effect is better. Meanwhile, the position relation of the multiple camera devices is utilized to realize the enhancement operation during test, so that more accurate prediction is performed, and higher detection precision is achieved.

In the description of the present specification, the descriptions of the terms "present embodiment," "example," "specific example," and the like, mean that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the present invention. In this specification, schematic representations of the above terms do not necessarily refer to the same embodiments or examples. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples.

The embodiments of the invention disclosed above are intended only to help illustrate the invention. The examples are not intended to be exhaustive or to limit the invention to the precise forms disclosed. Obviously, many modifications and variations are possible in light of the above teaching. The embodiments were chosen and described in order to best explain the principles of the invention and the practical application, to thereby enable others skilled in the art to best understand and utilize the invention. The invention is limited only by the claims and the full scope and equivalents thereof.

Claims

1. A training method for a low-illuminance vehicle image detection model, comprising:

Acquiring an image dataset of a vehicle in a preset environment;

Preprocessing the image dataset to generate a training dataset;

2. The method for training the low-illuminance vehicle image detection model according to claim 1, wherein the step of acquiring the image dataset of the vehicle in the preset environment includes:

3. The method of training a low-illuminance vehicle image detection model according to claim 1, wherein the step of training the initial vehicle image detection model with the training data set as an input variable of the initial vehicle image detection model, and generating a target vehicle image detection model includes:

4. A method of training a low-light vehicle image detection model as claimed in claim 3, wherein the step of refining the multi-scale feature image to generate multi-scale refined features, the multi-scale refined features characterizing positional information of the vehicle comprises:

5. A method of training a low-light vehicle image detection model according to claim 3, wherein the step of performing fusion classification processing on the multi-scale refined features to generate a vehicle segmentation map comprises:

6. The method of training a low-light vehicle image detection model of claim 1, wherein the step of configuring an optimizer and a loss function of the initial vehicle image detection model comprises:

7. The method of training a low-light vehicle image detection model of claim 6, wherein the initial vehicle image detection model has a loss functionThe following formula is satisfied:

8. A training system for a low-light vehicle image detection model, comprising:

9. A computer device comprising a memory, a processor and a computer program stored in the memory and executable on the processor, characterized in that the processor, when executing the computer program, carries out the steps of the training method of the low-illuminance vehicle image detection model according to any one of claims 1 to 7.

10. A computer readable storage medium storing a computer program, characterized in that the computer program when executed by a processor implements the steps of the training method of the low-illuminance vehicle image detection model according to any one of claims 1 to 7.