CN116363462B

CN116363462B - Training method, system, equipment and medium for road and bridge passing detection model

Info

Publication number: CN116363462B
Application number: CN202310636094.0A
Authority: CN
Inventors: 王雪雁; 周平; 胡美玲; 江斌; 杨涛; 郑刚; 余超
Original assignee: Zenmorn Hefei Technology Co ltd
Current assignee: Zenmorn Hefei Technology Co ltd
Priority date: 2023-06-01
Filing date: 2023-06-01
Publication date: 2023-08-22
Anticipated expiration: 2043-06-01
Also published as: CN116363462A

Abstract

The invention provides a training method for a road and bridge passing vehicle detection model, comprising the following steps: acquiring an image dataset; marking the sample image dataset to generate a label image dataset; processing the label image dataset and the test image dataset to generate a training image dataset and a target test image dataset; optimizing an initial road bridge passing detection model to generate an intermediate road bridge passing detection model; inputting the training image dataset into the intermediate road bridge passing detection model and performing encoding and then decoding to generate a decoding result; and performing reverse optimization processing on the intermediate road bridge passing detection model according to the decoding result to generate a target road bridge passing detection model. The training method, system, device and medium for the road bridge passing detection model can improve the accuracy with which the model detects passing vehicles.

Description

Training method, system, equipment and medium for road and bridge passing detection model

Technical Field

The invention relates to the technical field of deep learning, in particular to a training method, a training system, training equipment and training media for a road and bridge passing detection model.

Background

With the continuous development of science and technology, vehicle ownership keeps increasing, and road and bridge passing detection is gaining importance as a key technology for smart cities and intelligent transportation, with a steadily expanding range of applications. However, existing road and bridge passing detection models have low vehicle detection accuracy: because the direction and distance of a target vehicle are uncertain, the target vehicle cannot be accurately identified.

Disclosure of Invention

In view of the above-mentioned drawbacks of the prior art, the present invention aims to provide a training method, system, device and medium for a road and bridge passing detection model, so as to solve the problem of lower detection accuracy of the road and bridge passing detection model.

In order to solve the technical problems, the invention is realized by the following technical scheme:

the invention provides a training method of a road and bridge passing vehicle detection model, which comprises the following steps:

obtaining an image dataset, wherein the image dataset comprises a sample image dataset and a test image dataset;

marking the sample image data set to generate a label image data set;

processing the label image dataset and the test image dataset to generate a training image dataset and a target test image dataset, respectively;

optimizing the initial road bridge passing detection model to generate an intermediate road bridge passing detection model;

inputting the training image data set into the intermediate road bridge passing detection model for encoding processing to generate feature image data to be decoded;

performing decoding processing on the feature image data to be decoded to generate a decoding result; and

carrying out reverse optimization processing on the intermediate road bridge passing detection model according to the decoding result to generate a target road bridge passing detection model.

In an embodiment of the present invention, the step of processing the label image dataset and the test image dataset to respectively generate a training image dataset and a target test image dataset includes:

performing data augmentation processing on the label image dataset to generate an augmented image dataset; and

scaling the augmented image dataset and the test image dataset respectively to correspondingly generate the training image dataset and the target test image dataset.

In an embodiment of the present invention, the step of optimizing the initial road bridge passing detection model to generate the intermediate road bridge passing detection model includes:

carrying out optimizer configuration on the initial road bridge passing detection model to generate an optimized road bridge passing detection model; and

carrying out loss optimization processing on the optimized road bridge passing detection model to generate an intermediate road bridge passing detection model.

In an embodiment of the present invention, the step of inputting the training image data set into the intermediate road bridge passing detection model to perform encoding processing, and generating the feature image data to be decoded includes:

inputting the training image data set into the intermediate road bridge passing detection model for feature extraction processing to generate a multi-scale feature alignment image data set; and

carrying out feature fusion processing on the multi-scale feature alignment image dataset to generate feature image data to be decoded.

In an embodiment of the present invention, the step of performing loss optimization processing on the optimized road bridge passing detection model to generate an intermediate road bridge passing detection model includes:

performing cross entropy processing on the classification loss function and the regression loss function in the optimized road bridge passing detection model to generate a binary cross entropy loss function;

performing intersection-over-union (IoU) processing on the classification loss function and the regression loss function in the optimized road bridge passing detection model to generate an IoU loss function; and

processing the binary cross entropy loss function and the IoU loss function to generate an intermediate road bridge passing detection model with a depth loss function.

In an embodiment of the present invention, the cross entropy processing of the classification loss function and the regression loss function in the optimized road bridge passing detection model may satisfy the following formula:

wherein $L_{CDH}$ denotes the binary cross entropy loss function, $L_C$ the classification loss function, $L_b$ the regression loss function, $N_C$ the number of detection frames containing detection targets in the dynamic hiding mechanism (CDH), $i$ the index of a detection frame, $c_i$ the confidence with which the detection frame predicts its target to be a detection target, $b_i$ the position parameters of the detection frame, $c_i^*$ the class label within the detection frame, and $b_i^*$ the position label of the detection frame.

In an embodiment of the present invention, the step of inputting the training image dataset into the intermediate road bridge passing detection model to perform feature extraction processing, and generating the multiscale feature alignment image dataset includes:

Performing initial feature extraction processing on the training image data set in the intermediate road bridge passing detection model to generate an initial feature image data set;

performing multi-scale feature sampling processing on the initial feature image dataset to generate a multi-scale image dataset; and

performing feature alignment processing on the multi-scale image dataset to generate a multi-scale feature alignment image dataset.

The invention also provides a training system for the road and bridge passing detection model, comprising:

a data acquisition module, used for acquiring an image dataset, wherein the image dataset comprises a sample image dataset and a test image dataset;

the marking processing module is used for marking the sample image data set to generate a label image data set;

the augmentation processing module is used for processing the label image data set and the test image data set to respectively and correspondingly generate a training image data set and a target test image data set;

the optimization processing module is used for carrying out optimization processing on the initial road bridge passing detection model to generate an intermediate road bridge passing detection model;

the coding processing module is used for inputting the training image data set into the intermediate road bridge passing detection model for coding processing to generate feature image data to be decoded;

The decoding processing module is used for decoding the characteristic image data to be decoded to generate a decoding result; and

the reverse optimization processing module is used for carrying out reverse optimization processing on the intermediate road bridge passing detection model according to the decoding result to generate a target road bridge passing detection model.

The invention also provides an electronic device, which is characterized in that the electronic device comprises:

one or more processors;

a storage system for storing one or more programs that, when executed by the one or more processors, cause the electronic device to implement a method of training a road bridge passing detection model as described in any of the above.

The present invention also provides a computer-readable storage medium, characterized in that a computer program is stored thereon, which, when executed by a processor of a computer, causes the computer to perform the training method of the road-bridge passing vehicle detection model as set forth in any one of the above.

As described above, in the training method, system, device and medium for the road bridge passing detection model provided by the invention, the initial road bridge passing detection model is optimized to obtain the intermediate road bridge passing detection model; the training image dataset is input into the intermediate road bridge passing detection model and encoded to obtain the feature image data to be decoded; a decoder decodes the feature image data to be decoded; and the intermediate road bridge passing detection model is optimized by back-propagation according to the decoding result to obtain the trained target road bridge passing detection model. This improves both the accuracy and the efficiency with which the road bridge passing detection model detects passing vehicles.

Drawings

In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings that are needed for the description of the embodiments will be briefly described below, and it is obvious that the drawings in the following description are only some embodiments of the present application, and that other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.

FIG. 1 is a flow chart of a training method of a road and bridge passing detection model according to an exemplary embodiment of the present application;

FIG. 2 is a flow chart of step S230 in the embodiment of FIG. 1 in an exemplary embodiment;

FIG. 3 is a flow chart of step S240 in the embodiment shown in FIG. 1 in an exemplary embodiment;

FIG. 4 is a flow chart of step S242 in the embodiment of FIG. 3 in an exemplary embodiment;

FIG. 5 is a flow chart of step S250 in the embodiment of FIG. 1 in an exemplary embodiment;

FIG. 6 is a flow chart of step S251 in the embodiment shown in FIG. 5 in an exemplary embodiment;

FIG. 7 is a schematic diagram of a partial system of the road bridge passing detection model in the embodiment shown in FIG. 6;

FIG. 8 is a schematic diagram of a training system for a road and bridge passing detection model according to an exemplary embodiment of the present application;

FIG. 9 is a schematic diagram of a computer system suitable for use in implementing an embodiment of the application.

Detailed Description

Further advantages and effects of the present application will become readily apparent to those skilled in the art from the disclosure herein, with reference to the accompanying drawings and the preferred embodiments. The application may also be practiced or applied in other embodiments, and the details of this description may be modified or varied without departing from the spirit and scope of the present application. It should be understood that the preferred embodiments are presented by way of illustration only and not by way of limitation.

It should be noted that the drawings provided with the following embodiments only illustrate the basic concept of the present application schematically. They show only the components related to the present application and are not drawn according to the number, shape and size of the components in an actual implementation; in an actual implementation the form, number and proportion of the components may be changed arbitrarily, and the component layout may be more complicated.

In the following description, numerous details are set forth to provide a more thorough explanation of the embodiments of the present application. It will be apparent to one skilled in the art, however, that the embodiments may be practiced without these specific details. In other embodiments, well-known structures and devices are shown in block diagram form rather than in detail, to avoid obscuring the embodiments of the present application.

First, with the arrival of the big data era, deep learning technology has advanced rapidly and is increasingly applied in fields such as image segmentation, object detection and speech segmentation. Deep Learning (DL) is a research direction within Machine Learning (ML) that was introduced to bring machine learning closer to its original goal, Artificial Intelligence (AI). Deep learning learns the inherent regularities and levels of representation of sample data, and the information obtained during such learning helps in interpreting data such as text, images and sounds. Its ultimate goal is to give machines a human-like ability to analyze and learn, so that they can recognize text, image and sound data. Road bridge passing detection models are widely used for road and bridge passing detection in urban road traffic. However, existing road bridge passing detection models process the feature map only through an attention mechanism and cannot finely identify the edges of the feature map, which limits their ability to express global features and prevents them from achieving good robustness. In other application scenarios, the training method of the road-bridge passing detection model may be set according to the actual situation, which is not limited by the embodiments of the present application.

Referring to fig. 1, fig. 1 is a flowchart illustrating a training method of a road-bridge passing vehicle detection model according to an exemplary embodiment of the present application, and it should be understood that the method may be applied to other exemplary implementation environments and be specifically executed by devices in other implementation environments, and the embodiment is not limited to the implementation environments to which the method is applied.

As shown in fig. 1, in an exemplary embodiment, the training method of the road-bridge passing detection model at least includes steps S210 to S270, which are described in detail as follows:

Step S210, acquiring an image dataset, wherein the image dataset comprises a sample image dataset and a test image dataset.

Step S220, marking the sample image dataset to generate a label image dataset.

Step S230, processing the label image dataset and the test image dataset to generate a training image dataset and a target test image dataset, respectively.

Step S240, performing optimization processing on the initial road bridge passing detection model to generate an intermediate road bridge passing detection model.

Step S250, inputting the training image dataset into the intermediate road bridge passing detection model for encoding processing to generate feature image data to be decoded.

Step S260, decoding the feature image data to be decoded to generate a decoding result.

Step S270, performing reverse optimization processing on the intermediate road bridge passing detection model according to the decoding result to generate a target road bridge passing detection model.
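By way of illustration only, the order of steps S210 to S270 can be sketched as a small driver function. The function name `run_training_pipeline`, the dictionary of callables, the `epochs` parameter and the toy stand-ins below are hypothetical and are not taken from the application; the sketch only shows how the outputs of one step feed the next.

```python
def run_training_pipeline(steps, initial_model, epochs=1):
    """Chain steps S210 to S270; `steps` maps a step id to a callable (hypothetical interface)."""
    samples, tests = steps["S210"]()                      # acquire sample and test images
    labels = steps["S220"](samples)                       # mark the sample images
    train_set, test_set = steps["S230"](labels, tests)    # augment and scale
    model = steps["S240"](initial_model)                  # optimizer and loss configuration
    for _ in range(epochs):
        features = steps["S250"](model, train_set)        # encoding
        result = steps["S260"](model, features)           # decoding
        model = steps["S270"](model, result)              # reverse (back-propagation) optimization
    return model, test_set


# Toy stand-ins so the sketch runs; a real system would plug in actual processing.
toy_steps = {
    "S210": lambda: (["img_a", "img_b"], ["img_c"]),
    "S220": lambda samples: [(s, "vehicle") for s in samples],
    "S230": lambda labels, tests: (list(labels), list(tests)),
    "S240": lambda model: dict(model, configured=True),
    "S250": lambda model, train: len(train),
    "S260": lambda model, feats: feats * 2,
    "S270": lambda model, result: dict(model, updates=model.get("updates", 0) + 1),
}
model, test_set = run_training_pipeline(toy_steps, {"name": "initial"}, epochs=3)
```

Repeating S250 to S270 over several epochs is an assumption of this sketch; the application only states that the three steps follow one another.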

As shown in fig. 1, in an exemplary embodiment, when step S210 is performed, an image dataset is acquired, wherein the image dataset includes a sample image dataset and a test image dataset. It should be noted that the image dataset may be acquired by, but not limited to, a camera or an infrared camera.

As shown in FIG. 2, in an exemplary embodiment, when step S230 is performed, the label image dataset and the test image dataset are processed to generate a training image dataset and a target test image dataset, respectively. Step S230 may include steps S231 to S232, which are described in detail as follows:

Step S231, performing data augmentation processing on the label image dataset to generate an augmented image dataset.

Step S232, scaling the augmented image dataset and the test image dataset respectively to correspondingly generate the training image dataset and the target test image dataset.

In an exemplary embodiment, the data augmentation processing applies a series of operations such as rotation, cropping, transformation and translation to the label image data to generate the augmented image dataset, which increases the diversity of the label image dataset, reduces the influence of over-fitting, and enhances the generalization performance of the model. The scaling processing scales the images of the augmented image dataset and the test image dataset to a size of 1000×1000. The invention is not limited thereto, however: the images may be scaled to other sizes, as long as this helps the road bridge passing detection model to identify the augmented image dataset and the test image dataset. The ratio of the number of images in the training image dataset to the number of images in the target test image dataset may be 7:3, but the invention is not limited thereto, and other ratios may be used.
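As a minimal sketch of the processing described above, the following pure-Python helpers show one simple augmentation (a horizontal flip), a nearest-neighbour scaling to a fixed size, and one way of obtaining a 7:3 ratio by splitting a pool of images. All function names are hypothetical, images are assumed to be stored as lists of pixel rows, and the application does not state that the 7:3 ratio is produced by such a split.

```python
import random


def flip_horizontal(image):
    """One simple augmentation: mirror every pixel row."""
    return [row[::-1] for row in image]


def resize_nearest(image, out_h=1000, out_w=1000):
    """Nearest-neighbour scaling of an image stored as a list of rows."""
    in_h, in_w = len(image), len(image[0])
    return [
        [image[r * in_h // out_h][c * in_w // out_w] for c in range(out_w)]
        for r in range(out_h)
    ]


def split_dataset(items, train_fraction=0.7, seed=0):
    """Shuffle a copy of `items` and split it into (train, test) parts."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


tiny = [[1, 2], [3, 4]]
enlarged = resize_nearest(flip_horizontal(tiny), out_h=4, out_w=4)
train_ids, test_ids = split_dataset(range(10))
```

In a detection setting, geometric augmentations such as flips also require the bounding-box labels to be transformed in the same way; that bookkeeping is omitted here.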

As shown in fig. 3, in an exemplary embodiment, when step S240 is performed, an optimization process is performed on the initial road bridge passing detection model, so as to generate an intermediate road bridge passing detection model. Specifically, step S240 may include steps S241 to S242, which are described in detail below:

Step S241, performing optimizer configuration on the initial road bridge passing detection model to generate an optimized road bridge passing detection model.

Step S242, performing loss optimization processing on the optimized road bridge passing detection model to generate an intermediate road bridge passing detection model.

In an exemplary embodiment, the initial road bridge passing detection model may be a deep residual network model (ResNet50), but is not limited thereto; it may be another type of convolutional neural network model, as long as it achieves a certain detection accuracy for vehicles. Performing optimizer configuration on the initial road bridge passing detection model means setting its optimization module to an adaptive moment estimation (Adam) optimization module. Adam is an adaptive learning rate optimization algorithm that can effectively optimize the training process of the model by dynamically adjusting the learning rate. When updating each parameter, the Adam algorithm module computes estimates of the first and second moments of that parameter's gradient and uses them to adapt the learning rate, so that the learning rate can be adjusted dynamically for different parameters during training. Each parameter update in the Adam algorithm module can be performed independently, so parallel computing technologies such as GPUs can be used to improve the computing efficiency of the network model. The optimized road bridge passing detection model is not limited to the adaptive moment estimation optimization module; it may further include at least one of a stochastic gradient descent (SGD) optimization module, an adaptive gradient (AdaGrad) optimization module, and a root mean square propagation (RMSProp) optimization module.
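The moment-based update described above can be illustrated with a minimal sketch of the standard Adam rule for a single scalar parameter. The function name `adam_step`, the hyperparameter values and the toy quadratic objective are assumptions chosen for illustration and are not specified in the application.

```python
import math


def adam_step(param, grad, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    """One standard Adam update for a scalar parameter; returns (param, m, v)."""
    m = beta1 * m + (1.0 - beta1) * grad          # first-moment (mean) estimate
    v = beta2 * v + (1.0 - beta2) * grad * grad   # second-moment estimate
    m_hat = m / (1.0 - beta1 ** t)                # bias correction, t starts at 1
    v_hat = v / (1.0 - beta2 ** t)
    param = param - lr * m_hat / (math.sqrt(v_hat) + eps)
    return param, m, v


# Toy objective f(x) = (x - 3)^2 with gradient 2 * (x - 3), starting from x = 0.
x, m, v = 0.0, 0.0, 0.0
for t in range(1, 2001):
    grad = 2.0 * (x - 3.0)
    x, m, v = adam_step(x, grad, m, v, t, lr=0.05)
```

Because the step is divided by the square root of the second-moment estimate, each parameter receives its own effective learning rate, which is the per-parameter adaptation the paragraph refers to.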

As shown in fig. 4, in an exemplary embodiment, when step S242 is performed, a loss optimization process is performed on the optimized road bridge passing detection model, and an intermediate road bridge passing detection model is generated. Specifically, step S242 may include steps S341 to S343, which are described in detail below:

Step S341, performing cross entropy processing on the classification loss function and the regression loss function in the optimized road bridge passing detection model to generate a binary cross entropy loss function.

Step S342, performing intersection-over-union (IoU) processing on the classification loss function and the regression loss function in the optimized road bridge passing detection model to generate an IoU loss function.

Step S343, processing the binary cross entropy loss function and the IoU loss function to generate an intermediate road bridge passing detection model with a depth loss function.

In an exemplary embodiment, the following formula may be satisfied by performing cross entropy processing on the classification loss function and the regression loss function in the optimized road bridge passing detection model:

wherein $L_{CDH}$ denotes the binary cross entropy loss function, $L_C$ the classification loss function, $L_b$ the regression loss function, $N_C$ the number of detection frames containing detection targets in the dynamic hiding mechanism (CDH), $i$ the index of a detection frame, $c_i$ the confidence with which the detection frame predicts its target to be a detection target, $b_i$ the position parameters of the detection frame, $c_i^*$ the class label within the detection frame, and $b_i^*$ the position label of the detection frame.

Performing intersection-over-union (IoU) processing on the classification loss function and the regression loss function in the optimized road bridge passing detection model may satisfy the following formula:

wherein $L_{IRH}$ denotes the IoU loss function, $L_C$ the classification loss function, $L_b$ the regression loss function, $N_I$ the number of detection frames containing detection targets in the circular hiding mechanism (IRH), $i$ the index of a detection frame, $c_i$ the confidence with which the detection frame predicts its target to be a detection target, $b_i$ the position parameters of the detection frame, $c_i^*$ the class label within the detection frame, and $b_i^*$ the position label of the detection frame.

Processing the binary cross entropy loss function and the intersection-over-union (IoU) loss function may satisfy the following formula:

wherein $L$ denotes the depth loss function, $L_{CDH}$ the binary cross entropy loss function, $L_{IRH}$ the IoU loss function, and $\lambda$ the loss optimization parameter.
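The formulas themselves are not reproduced in this text, so the following is only a hedged sketch built from the textbook definitions of binary cross entropy and of the cross-ratio (intersection-over-union, IoU) of two axis-aligned boxes. The weighted sum in `combined_loss` is an assumption suggested by the single loss optimization parameter λ, not a form confirmed by the application, and the sketch does not reproduce how the application's two loss functions each combine classification and regression terms. All function names and the `(x1, y1, x2, y2)` box format are hypothetical.

```python
import math


def binary_cross_entropy(confidences, labels, eps=1e-7):
    """Mean binary cross entropy between predicted confidences and 0/1 labels."""
    total = 0.0
    for c, y in zip(confidences, labels):
        c = min(max(c, eps), 1.0 - eps)  # clamp to avoid log(0)
        total += -(y * math.log(c) + (1 - y) * math.log(1.0 - c))
    return total / len(labels)


def iou(box_a, box_b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def iou_loss(pred_boxes, true_boxes):
    """Mean of (1 - IoU) over matched pairs of predicted and labelled boxes."""
    pairs = list(zip(pred_boxes, true_boxes))
    return sum(1.0 - iou(p, t) for p, t in pairs) / len(pairs)


def combined_loss(l_cdh, l_irh, lam=1.0):
    """Assumed weighted sum of the two losses; the exact form is not given in the text."""
    return l_cdh + lam * l_irh


l_cdh = binary_cross_entropy([0.9, 0.2], [1, 0])
l_irh = iou_loss([(0, 0, 2, 2)], [(1, 1, 3, 3)])
total = combined_loss(l_cdh, l_irh, lam=0.5)
```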

As shown in fig. 5, in an exemplary embodiment, when step S250 is performed, the training image data set is input into the intermediate road bridge passing detection model to be subjected to encoding processing, and feature image data to be decoded is generated. Specifically, step S250 may include steps S251 to S252, which are described in detail below:

Step S251, inputting the training image dataset into the intermediate road bridge passing detection model for feature extraction processing to generate a multi-scale feature alignment image dataset.

Step S252, performing feature fusion processing on the multi-scale feature alignment image dataset to generate feature image data to be decoded.

As shown in fig. 6 and 7, when step S251 is performed, a training image dataset is input into the intermediate road-bridge passing detection model to perform feature extraction processing, and a multi-scale feature alignment image dataset is generated. Specifically, step S251 may include steps S351 to S353, which are described in detail below:

Step S351, performing initial feature extraction processing on the training image dataset in the intermediate road bridge passing detection model to generate an initial feature image dataset.

Step S352, performing multi-scale feature sampling processing on the initial feature image dataset to generate a multi-scale image dataset.

Step S353, performing feature alignment processing on the multi-scale image dataset to generate a multi-scale feature alignment image dataset.

In an exemplary embodiment, the residual network module 361 in the intermediate road bridge passing detection model may be used to input the training image dataset into the convolution layer module 362 and the cross-level feature alignment module (Cross-Level Feature Alignment Module, CFAM) 363 for initial feature extraction, so as to obtain an initial feature image dataset. Specifically, the convolution layer module 362 may apply convolutional down-sampling to each training image five times in total to complete its initial feature extraction; the training image may also be down-sampled a different number of times, as long as the initial feature image dataset is obtained. The convolution layer module 362 then sends the initial feature image dataset to the regression branch 365 and the classification branch 366 in the coarse prediction module 364 for multi-scale feature sampling processing. The classification branch 366 may send the sampled features it obtains to the regression branch 365 through the feature filter module (Feature Fusion Block, FFB) 367, so that a regression result and a classification result are generated. The regression result and the classification result are sent to the cross-level feature alignment module 363 for feature alignment processing to generate the multi-scale feature alignment image dataset. The initial feature image dataset may satisfy the following formula:

wherein $C$ denotes the initial feature image dataset and $C_i$ denotes the $i$-th initial feature image in the initial feature image dataset, and the size of the initial feature image $C_i$ may be twice the size of the initial feature image $C_{i+1}$.
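The halving relationship between consecutive feature images can be illustrated with a short calculation. The sketch below assumes each down-sampling is a convolution with a 3×3 kernel, stride 2 and padding 1, and a 1000×1000 input; these values and the function names are assumptions for illustration, since the application only states that five convolutional down-samplings may be used.

```python
def conv_output_size(size, kernel=3, stride=2, padding=1):
    """Spatial output size of a convolution along one dimension."""
    return (size + 2 * padding - kernel) // stride + 1


def pyramid_sizes(input_size=1000, levels=5):
    """Side length of each feature image after successive stride-2 convolutions."""
    sizes = []
    size = input_size
    for _ in range(levels):
        size = conv_output_size(size)
        sizes.append(size)
    return sizes


sizes = pyramid_sizes(1000, 5)  # [500, 250, 125, 63, 32]
```

Under these assumptions the factor of two is exact while the side length is even and approximate once it becomes odd (125 to 63).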

Further, the multi-scale image dataset may satisfy the following formula:

wherein $P$ denotes the multi-scale image dataset and $P_i$ denotes the $i$-th multi-scale image in the multi-scale image dataset, and the size of the multi-scale image $P_i$ may be twice the size of the multi-scale image $P_{i+1}$.

Still further, feature alignment processing of the multi-scale image dataset may satisfy the following formula:

wherein the formula involves: the multi-scale feature alignment image data output after the alignment convolution of the multi-scale image $P_{i+1}$; the multi-scale feature alignment image data output after the alignment convolution of the multi-scale image $P_i$; $O_i$, the offset data calculated from the relative positional offset between the sampling region of the multi-scale image data and the prediction frame; $\mathrm{Conv}_{\mathrm{Deform}}$, a deformable convolution of the target region; $\mathrm{Concate}(\cdot)$, channel concatenation of the target region; $\mathrm{Conv}_{3\times 3}$, a standard 3×3 two-dimensional convolution of the target region; and the output image obtained by channel-concatenating the multi-scale feature alignment image data output for $P_{i+1}$ with that output for $P_i$ and then applying the standard 3×3 two-dimensional convolution.
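Since the formula is not reproduced in this text, one reading that is consistent with the term definitions above can be written as follows. The symbols $A_i$ (aligned data for the $i$-th multi-scale image) and $F_i$ (output image) are placeholder names introduced here for illustration, and the first relation is an inference from the description of the deformable convolution and the offset data rather than a statement taken from the application.

```latex
A_i = \mathrm{Conv}_{\mathrm{Deform}}\left(P_i,\; O_i\right), \qquad
F_i = \mathrm{Conv}_{3\times 3}\left(\mathrm{Concate}\left(A_{i+1},\; A_i\right)\right)
```

Because $P_i$ and $P_{i+1}$ differ in size, a practical implementation would also need to bring $A_{i+1}$ to the resolution of $A_i$ before concatenation; the text does not describe how this is done.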

As shown in FIG. 1, in an exemplary embodiment, when step S260 is performed, the feature image data to be decoded is decoded to generate a decoding result. Specifically, the feature image data to be decoded may be decoded by, but not limited to, an interactive refinement decoder in the intermediate road bridge passing detection model, and the decoding result may be the class of the detection target in the feature image to be decoded and the bounding box information for the feature image to be decoded. The interactive refinement decoder can improve both the detection accuracy and the detection speed of a single-stage detector.

As shown in FIG. 1 and FIG. 7, in an exemplary embodiment, when step S270 is performed, reverse optimization processing is performed on the intermediate road bridge passing detection model according to the decoding result to generate the target road bridge passing detection model. Specifically, back-propagation optimization may be performed on the intermediate road bridge passing detection model according to the decoding result until the target road bridge passing detection model is obtained. The back-propagation optimization may include adding interaction between the regression branch 365 and the classification branch 366 in the coarse prediction module 364 of the intermediate road bridge passing detection model, combining the regression features filtered by the feature filter module (Feature Fusion Block, FFB) 367 with the classification features, and further optimizing the regression branch 365 by back-propagation, so that the classification branch 366 and the regression branch 365 learn interactively. The target road bridge passing detection model may be, but is not limited to, a deep learning network model; it may also be another network model, as long as the target passing vehicles in the test image dataset can be accurately identified.

Fig. 8 is a schematic structural diagram of a training system of a road bridge passing detection model according to an exemplary embodiment of the present application. The system may be adapted to other exemplary implementation environments and may be specifically configured in other devices, and the present embodiment is not limited to the implementation environments to which the system is adapted.

The training system of the road bridge passing detection model may include a data acquisition module 410, a marking processing module 420, an augmentation processing module 430, an optimization processing module 440, an encoding processing module 450, a decoding processing module 460, and a reverse optimization processing module 470.

In an exemplary embodiment, the data acquisition module 410 may be configured to acquire an image dataset, wherein the image dataset includes a sample image dataset and a test image dataset. The image dataset may be acquired by, but is not limited to, a camera or an infrared camera.

In an exemplary embodiment, the marking processing module 420 may be configured to perform marking processing on the sample image dataset to generate a label image dataset.

In an exemplary embodiment, the augmentation processing module 430 may be configured to process the label image dataset and the test image dataset to generate a training image dataset and a target test image dataset, respectively. Specifically, processing the label image dataset and the test image dataset may include performing data augmentation processing on the label image dataset to generate an augmented image dataset, and scaling the augmented image dataset and the test image dataset, respectively, to correspondingly generate the training image dataset and the target test image dataset. The data augmentation processing applies a series of operations such as rotation, cropping, transformation and translation to the label image data to generate the augmented image dataset, so as to increase the diversity of the label image dataset, reduce the influence of overfitting and enhance the generalization performance of the model. Scaling the augmented image dataset and the test image dataset means scaling their images to a size of 1000×1000, but the application is not limited thereto; the images may be scaled to other sizes, so long as the road bridge passing detection model can readily identify the augmented image dataset and the test image dataset. The ratio of the number of images in the training image dataset to the number of images in the target test image dataset may be 7:3, but the application is not limited thereto, and other ratios may be used.
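The augmentation and scaling steps described above can be illustrated with a minimal sketch. It assumes single-channel images stored as nested Python lists and uses nearest-neighbour scaling; the function names (`rotate90`, `translate`, `crop`, `resize_nearest`, `build_training_set`) and the particular set of augmentations are hypothetical choices made for illustration and are not specified in the application.

```python
def rotate90(img):
    """Rotate a 2-D image (list of rows) by 90 degrees clockwise."""
    return [list(row) for row in zip(*img[::-1])]


def translate(img, dx, fill=0):
    """Shift every row right by dx pixels, padding the left side with `fill`."""
    width = len(img[0])
    dx = max(0, min(dx, width))
    return [[fill] * dx + row[: width - dx] for row in img]


def crop(img, top, left, height, width):
    """Cut out a height x width window whose top-left corner is (top, left)."""
    return [row[left:left + width] for row in img[top:top + height]]


def resize_nearest(img, out_h, out_w):
    """Scale an image to out_h x out_w with nearest-neighbour sampling."""
    in_h, in_w = len(img), len(img[0])
    return [
        [img[y * in_h // out_h][x * in_w // out_w] for x in range(out_w)]
        for y in range(out_h)
    ]


def augment(img):
    """Return the original image plus a few augmented variants (illustrative set)."""
    half_h = max(1, len(img) // 2)
    half_w = max(1, len(img[0]) // 2)
    return [img, rotate90(img), translate(img, 1), crop(img, 0, 0, half_h, half_w)]


def build_training_set(label_images, size=1000):
    """Augment every label image, then scale each variant to size x size."""
    return [
        resize_nearest(variant, size, size)
        for img in label_images
        for variant in augment(img)
    ]
```

The default `size=1000` mirrors the 1000×1000 target mentioned in the description; a real pipeline would more likely rely on an image library with interpolated resampling and would transform the bounding-box labels together with the pixels.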

In an exemplary embodiment, the optimization processing module 440 may be configured to perform optimization processing on the initial road bridge passing detection model to generate an intermediate road bridge passing detection model. Specifically, performing optimization processing on the initial road bridge passing detection model may include, but is not limited to, performing optimizer configuration on the initial road bridge passing detection model to generate an optimized road bridge passing detection model, and performing loss optimization processing on the optimized road bridge passing detection model to generate the intermediate road bridge passing detection model. The initial road bridge passing detection model may be a deep residual network model (ResNet50), but is not limited thereto; it may be another type of convolutional neural network model, so long as a certain detection accuracy for vehicles can be achieved. Configuring the optimizer of the initial road bridge passing detection model means setting its optimization module to an adaptive moment estimation (Adaptive Moment Estimation, Adam) optimization module. Adam is an adaptive learning rate optimization algorithm: when updating each parameter, it computes the first and second moments of that parameter's gradient and uses them to adjust the learning rate dynamically, so that the learning rate adapts to different parameters during training and the training process of the model is optimized effectively. Because each parameter update in the Adam algorithm module can be performed independently, parallel computing technologies such as GPUs can be used to improve the computing efficiency of the network model. However, the application is not limited thereto: the optimized road bridge passing detection model may include, but is not limited to, the adaptive moment estimation optimization module, and may further include at least one of a stochastic gradient descent (SGD) optimization module, an adaptive gradient (AdaGrad) optimization module, and a root mean square propagation (RMSProp) optimization module.
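The per-parameter moment estimates described above can be sketched as follows. This is the standard Adam update rule written in plain Python, not code taken from the application; the hyperparameter defaults (`lr`, `beta1`, `beta2`, `eps`) are the commonly used values and are assumptions here, and `minimize_quadratic` is a hypothetical toy objective used only to exercise the update.

```python
import math


def adam_step(params, grads, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    """Apply one Adam update; t is the 1-based step count.

    Returns the new parameters and the updated first/second moment estimates.
    Each parameter is updated independently of the others.
    """
    new_params, new_m, new_v = [], [], []
    for p, g, m_i, v_i in zip(params, grads, m, v):
        m_i = beta1 * m_i + (1 - beta1) * g          # first moment (mean of gradients)
        v_i = beta2 * v_i + (1 - beta2) * g * g      # second moment (mean of squared gradients)
        m_hat = m_i / (1 - beta1 ** t)               # bias correction for early steps
        v_hat = v_i / (1 - beta2 ** t)
        new_params.append(p - lr * m_hat / (math.sqrt(v_hat) + eps))
        new_m.append(m_i)
        new_v.append(v_i)
    return new_params, new_m, new_v


def minimize_quadratic(steps=2000, lr=0.05):
    """Toy example: minimise f(x, y) = (x - 3)^2 + (y + 1)^2 with Adam."""
    params, m, v = [0.0, 0.0], [0.0, 0.0], [0.0, 0.0]
    for t in range(1, steps + 1):
        grads = [2 * (params[0] - 3), 2 * (params[1] + 1)]
        params, m, v = adam_step(params, grads, m, v, t, lr=lr)
    return params
```

Dividing by the square root of the second-moment estimate is what gives each parameter its own effective step size, which is the dynamic learning rate adjustment the paragraph refers to.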

In an exemplary embodiment, the encoding processing module 450 may be configured to input the training image data set into the intermediate road bridge passing detection model for encoding processing, and generate the feature image data to be decoded. Specifically, inputting the training image dataset into the intermediate road bridge passing detection model for encoding processing may include, but is not limited to, sequentially performing cross entropy processing on the classification loss function and intersection-over-union (IoU) processing on the regression loss function in the optimized road bridge passing detection model to generate a binary cross entropy loss function and an IoU loss function, and processing the binary cross entropy loss function and the IoU loss function to obtain the intermediate road bridge passing detection model with a depth loss function.
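The two loss terms named above can be sketched in a few lines. The sketch assumes axis-aligned boxes given as `(x1, y1, x2, y2)` and an IoU loss defined as `1 - IoU`. The way the two terms are combined in `detection_loss` (summed and divided by the number of detection frames that contain a target) is an assumption made for illustration only, since the combining formula is not reproduced in this text.

```python
import math


def binary_cross_entropy(p, y, eps=1e-7):
    """Binary cross entropy for one predicted confidence p and label y in {0, 1}."""
    p = min(max(p, eps), 1 - eps)  # clamp to avoid log(0)
    return -(y * math.log(p) + (1 - y) * math.log(1 - p))


def iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes (x1, y1, x2, y2)."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def iou_loss(pred_box, target_box):
    """Regression loss: zero for a perfect overlap, one for no overlap."""
    return 1.0 - iou(pred_box, target_box)


def detection_loss(confidences, class_labels, pred_boxes, target_boxes):
    """Assumed combination: (sum of BCE + sum of IoU loss on positives) / N_C."""
    n_pos = max(1, sum(1 for y in class_labels if y == 1))  # frames containing a target
    cls = sum(binary_cross_entropy(c, y) for c, y in zip(confidences, class_labels))
    reg = sum(
        iou_loss(pb, tb)
        for y, pb, tb in zip(class_labels, pred_boxes, target_boxes)
        if y == 1
    )
    return (cls + reg) / n_pos
```

Only detection frames labelled as containing a target contribute to the regression term, because a box label is meaningful only where a target exists.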

In an exemplary embodiment, the decoding processing module 460 may be configured to perform decoding processing on the feature image data to be decoded, and generate a decoding result. Specifically, the feature image data to be decoded may be decoded by, but not limited to, an interactive refinement decoder in the intermediate road bridge passing detection model, and the decoding result may be a type of a detection target of the feature image to be decoded and bounding box information of the feature image to be decoded. The interactive refinement decoder can improve the detection precision of the single-stage detector and the detection speed of the single-stage detector.

In an exemplary embodiment, the reverse optimization processing module 470 may be configured to perform reverse optimization processing on the intermediate road bridge passing detection model according to the decoding result, so as to generate the target road bridge passing detection model. Specifically, the decoding result may be used to perform back-propagation optimization on the intermediate road bridge passing detection model until the target road bridge passing detection model is obtained. The back-propagation optimization may include adding interaction between the regression branch 365 and the classification branch 366 to the coarse prediction module 364 in the intermediate road bridge passing detection model, then combining the regression features filtered by the feature filtering module (Feature Fusion Block, FFB) 367 with the classification features, and further optimizing the regression branch 365 by back-propagation, so that the classification branch 366 and the regression branch 365 learn interactively. The target road bridge passing detection model may be, but is not limited to, a deep learning network model; it may also be another network model, so long as the target passing vehicle in the test image data set can be accurately identified.

It should be noted that, the training system of the road bridge passing vehicle detection model provided in the foregoing embodiment and the training method of the road bridge passing vehicle detection model provided in the foregoing embodiment belong to the same concept, and the specific manner in which each module and unit execute the operation has been described in detail in the method embodiment, which is not repeated herein. In practical application, the training system of the road-bridge passing detection model provided in the above embodiment can distribute the functions to be completed by different functional modules according to needs, that is, the internal structure of the system is divided into different functional modules to complete all or part of the functions described above, which is not limited herein.
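The module sequence of the training system can be pictured as a simple chain of stages. The sketch below is only a structural illustration: the stage names mirror reference numerals 410 to 470, while `make_stage`, `run_training_system` and the shared context dictionary are hypothetical constructs, and each placeholder stage merely records that it ran instead of doing real work.

```python
from typing import Callable, Dict, List, Optional

Stage = Callable[[Dict], Dict]

# Stage names follow the reference numerals used in the description.
MODULE_ORDER = [
    "data_acquisition_410",
    "marking_processing_420",
    "augmentation_processing_430",
    "optimization_processing_440",
    "encoding_processing_450",
    "decoding_processing_460",
    "reverse_optimization_processing_470",
]


def make_stage(name: str) -> Stage:
    """Build a placeholder stage that only appends its name to a trace."""
    def stage(ctx: Dict) -> Dict:
        ctx = dict(ctx)  # copy so the caller's dictionary is not mutated
        ctx["trace"] = ctx.get("trace", []) + [name]
        return ctx
    return stage


def run_training_system(stages: List[Stage], ctx: Optional[Dict] = None) -> Dict:
    """Run the stages in order, passing a shared context from one to the next."""
    ctx = dict(ctx or {})
    for stage in stages:
        ctx = stage(ctx)
    return ctx


def default_system() -> List[Stage]:
    """One placeholder stage per functional module, in the described order."""
    return [make_stage(name) for name in MODULE_ORDER]
```

Because every stage has the same signature, functions can be regrouped into fewer or more modules without changing the runner, which matches the remark that the division into functional modules is not limiting.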

The embodiment of the application also provides electronic equipment, which comprises: one or more processors; and the storage system is used for storing one or more programs, and when the one or more programs are executed by the one or more processors, the electronic equipment realizes the training method of the road bridge passing detection model provided in each embodiment.

Fig. 9 shows a schematic diagram of a computer system suitable for use in implementing an embodiment of the application. It should be noted that, the computer system 700 of the electronic device shown in fig. 9 is only an example, and should not impose any limitation on the functions and the application scope of the embodiments of the present application.

As shown in fig. 9, the computer system 700 includes a central processing unit (Central Processing Unit, CPU) 701 that can perform various appropriate actions and processes, such as performing the methods in the above-described embodiments, according to a program stored in a Read-Only Memory (ROM) 702 or a program loaded from a storage section 708 into a random access Memory (Random Access Memory, RAM) 703. In the RAM 703, various programs and data required for the system operation are also stored. The CPU 701, ROM 702, and RAM 703 are connected to each other through a bus 704. An Input/Output (I/O) interface 705 is also connected to bus 704.

The following components are connected to the I/O interface 705: an input section 706 including a keyboard, a mouse, and the like; an output section 707 including a Cathode Ray Tube (CRT), a liquid crystal display (Liquid Crystal Display, LCD), and the like, and a speaker; a storage section 708 including a hard disk and the like; and a communication section 709 including a network interface card such as a LAN (Local Area Network) card, a modem, or the like. The communication section 709 performs communication processing via a network such as the internet. A drive 710 is also connected to the I/O interface 705 as needed. A removable medium 711 such as a magnetic disk, an optical disk, a magneto-optical disk, a semiconductor memory, or the like is installed on the drive 710 as needed, so that a computer program read out therefrom is installed into the storage section 708 as needed.

It should be noted that the computer readable medium shown in the embodiments of the present application may be a computer readable signal medium or a computer readable storage medium, or any combination of the two. The computer readable storage medium may be, for example, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system or device, or a combination of any of the foregoing. More specific examples of the computer-readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a Read-Only Memory (ROM), an erasable programmable read-only memory (Erasable Programmable Read Only Memory, EPROM), flash memory, an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the present application, a computer-readable signal medium may comprise a data signal propagated in baseband or as part of a carrier wave, with a computer-readable computer program embodied therein. Such a propagated data signal may take any of a variety of forms, including, but not limited to, electromagnetic, optical, or any suitable combination of the foregoing. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system or device. A computer program embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: wireless, wired, etc., or any suitable combination of the foregoing.

The flowcharts and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present application. Each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams or flowchart illustration, and combinations of blocks in the block diagrams or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.

The units involved in the embodiments of the present application may be implemented by software or by hardware, and the described units may also be provided in a processor. In some cases, the names of the units do not constitute a limitation of the units themselves.

Another aspect of the application also provides a computer-readable storage medium having stored thereon a computer program which, when executed by a processor of a computer, causes the computer to perform the training method of the road bridge passing detection model as described above. The computer-readable storage medium may be included in the electronic device described in the above embodiment or may exist alone without being incorporated in the electronic device.

In summary, according to the training method, system, equipment and medium for the road bridge passing detection model provided by the application, the initial road bridge passing detection model is optimized to obtain the intermediate road bridge passing detection model; the training image data set is input into the intermediate road bridge passing detection model for encoding to obtain the feature image data to be decoded; the decoder decodes the feature image data to be decoded; and back-propagation optimization is performed on the intermediate road bridge passing detection model according to the decoding result to obtain the trained target road bridge passing detection model. This improves both the precision and the efficiency with which the road bridge passing detection model detects vehicles passing over roads and bridges.

In the description of the present specification, the descriptions of the terms "present embodiment," "example," "specific example," and the like, mean that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the present invention. In this specification, schematic representations of the above terms do not necessarily refer to the same embodiments or examples. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples.

The embodiments of the invention disclosed above are intended only to help illustrate the invention. The examples are not intended to be exhaustive or to limit the invention to the precise forms disclosed. Obviously, many modifications and variations are possible in light of the above teaching. The embodiments were chosen and described in order to best explain the principles of the invention and the practical application, to thereby enable others skilled in the art to best understand and utilize the invention. The invention is limited only by the claims and the full scope and equivalents thereof.

Claims

1. The training method of the road and bridge passing detection model is characterized by comprising the following steps of:

marking the sample image data set to generate a label image data set;

performing reverse optimization processing on the intermediate road bridge passing detection model according to the decoding result to generate a target road bridge passing detection model;

the optimization process satisfies the formula:

wherein $L_{CDH}$ denotes the binary cross entropy loss function, $L_C$ denotes the classification loss function, $L_b$ denotes the regression loss function, $N_C$ denotes the number of detection frames containing detection targets in the dynamic hiding mechanism, $i$ denotes the index number of the detection frame, $c_i$ denotes the confidence that the predicted target within the detection frame is the detection target, $b_i$ denotes the position parameter of the detection frame, $c_i^*$ denotes the class label within the detection frame, and $b_i^*$ denotes the position label of the detection frame.

2. The method of claim 1, wherein the step of processing the tag image dataset and the test image dataset to respectively generate a training image dataset and a target test image dataset comprises:

3. The method for training a road bridge passing detection model according to claim 1, wherein the step of optimizing the initial road bridge passing detection model to generate the intermediate road bridge passing detection model comprises:

4. The method for training a road-bridge passing detection model according to claim 1, wherein the step of inputting the training image dataset into the intermediate road-bridge passing detection model for encoding processing, and generating feature image data to be decoded comprises:

5. The method for training a road bridge passing detection model according to claim 3, wherein the step of performing a loss optimization process on the optimized road bridge passing detection model to generate an intermediate road bridge passing detection model comprises:

6. The method for training a road-bridge passing detection model according to claim 4, wherein the step of inputting the training image dataset into the intermediate road-bridge passing detection model for feature extraction processing, and generating a multi-scale feature alignment image dataset comprises:

7. A training system for a road bridge passing detection model, the system comprising:

an optimization processing module, configured to perform optimization processing on the initial road bridge passing detection model to generate an intermediate road bridge passing detection model, wherein the optimization processing satisfies the formula:

wherein $L_{CDH}$ denotes the binary cross entropy loss function, $L_C$ denotes the classification loss function, $L_b$ denotes the regression loss function, $N_C$ denotes the number of detection frames containing detection targets in the dynamic hiding mechanism, $i$ denotes the index number of the detection frame, $c_i$ denotes the confidence that the predicted target within the detection frame is the detection target, $b_i$ denotes the position parameter of the detection frame, $c_i^*$ denotes the class label within the detection frame, and $b_i^*$ denotes the position label of the detection frame;

8. An electronic device, the electronic device comprising:

one or more processors;

a storage system for storing one or more programs that, when executed by the one or more processors, cause the electronic device to implement the method of training a road bridge passing detection model as defined in any one of claims 1 to 6.

9. A computer-readable storage medium, having stored thereon a computer program which, when executed by a processor of a computer, causes the computer to perform the training method of the road-bridge passing detection model according to any one of claims 1 to 6.