CN112580408A

CN112580408A - Deep learning model training method and device and electronic equipment

Info

Publication number: CN112580408A
Application number: CN201910943965.7A
Authority: CN
Inventors: 章良君
Original assignee: Hangzhou Hikvision Digital Technology Co Ltd
Current assignee: Hangzhou Hikvision Digital Technology Co Ltd
Priority date: 2019-09-30
Filing date: 2019-09-30
Publication date: 2021-03-30
Anticipated expiration: 2039-09-30
Also published as: CN112580408B

Abstract

The embodiment of the application provides a deep learning model training method, a deep learning model training device and electronic equipment, which are applied to the technical field of computer vision, wherein the method comprises the following steps: acquiring a trained target detection model, wherein the target detection model comprises a foreground and background identification structure; initializing an initial deep learning model to be trained by using background identification parameters of a target detection model to obtain an initialized initial deep learning model, wherein the initial deep learning model comprises a foreground and background identification structure; and training the initialized initial deep learning model by using a preset sample picture to obtain a trained target deep learning model. The initial deep learning model is initialized by using the background recognition parameters of the trained target detection model, the initialized initial deep learning model has strong background recognition capability and robust target feature extraction capability, and is easy to train to converge, so that the accuracy of target detection of the deep learning model is improved.

Description

Deep learning model training method and device and electronic equipment

Technical Field

The application relates to the technical field of computer vision, in particular to a deep learning model training method and device and electronic equipment.

Background

With the development of artificial intelligence technology, and especially the emergence of neural networks, computer vision technology has developed rapidly. In computer vision, target detection refers to finding the position of a target in an image and identifying the category of the target. In the prior art, target detection is generally carried out with a deep learning model.

In the related art, the training process of the deep learning model generally follows a uniform paradigm, that is, the parameters of the classification model trained on a large image classification data set (e.g., ImageNet) are used to initialize the back-end structure of the new deep learning model, while the parameters of the structures such as positioning and classification of the front-end are initialized randomly, and training is performed on the basis of the initialization.

However, with this method a large number of randomly initialized parameters exist in the front-end network structure, which introduces many random factors during training and slows convergence. For training scenes with little sample data, factors such as overfitting also reduce the accuracy of target detection by the deep learning model.

Disclosure of Invention

The embodiment of the application aims to provide a deep learning model training method, a deep learning model training device and electronic equipment, so that the accuracy of deep learning model target detection is improved. The specific technical scheme is as follows:

in a first aspect, an embodiment of the present application provides a deep learning model training method, where the method includes:

acquiring a trained target detection model, wherein the target detection model is used for carrying out target detection on an image and comprises a foreground and background identification structure;

initializing an initial deep learning model to be trained by using the background identification parameters of the target detection model to obtain an initialized initial deep learning model, wherein the initial deep learning model comprises a foreground and background identification structure;

and training the initialized initial deep learning model by using a preset sample picture to obtain a trained target deep learning model.

Optionally, before the initializing the initial deep learning model to be trained by using the background identification parameter of the target detection model to obtain the initialized initial deep learning model, the method further includes:

judging whether the initial deep learning model to be trained comprises a foreground and background recognition structure;

and when the initial deep learning model does not comprise a foreground and background identification structure, adding the foreground and background identification structure in an output layer of the initial deep learning model, wherein the original classification structure in the output layer of the initial deep learning model is used for classifying the foreground target.

Optionally, the training of the initialized initial deep learning model by using a preset sample picture to obtain a trained target deep learning model includes:

inputting a preset sample picture into the initialized initial deep learning model for training, and adjusting learning parameters of the initialized initial deep learning model, wherein the learning parameters comprise the background identification parameters;

and when a preset ending condition is met, ending the training of the initialized initial deep learning model to obtain a trained target deep learning model.

Optionally, the preset sample picture includes a face picture marked with a face frame, or a human body picture marked with a human body frame, or a vehicle picture marked with a vehicle frame.

Optionally, the initializing the initial deep learning model by using the background identification parameter of the target detection model to obtain an initialized initial deep learning model includes:

assigning parameter values of parameters of each network layer in front of an output layer of the target detection model to corresponding parameters in the initial deep learning model;

assigning parameter values of a regressor channel of a positioning coordinate in an output layer of the target detection model and a background category identification channel in a classifier to corresponding parameters in the initial deep learning model;

initializing other parameters which are not assigned in the initial deep learning model to obtain the initialized initial deep learning model.

Optionally, the assigning the parameter values of the regressor channel of the positioning coordinate in the output layer of the target detection model and the background category identification channel in the classifier to the corresponding parameters in the initial deep learning model includes:

obtaining parameter values of a regressor channel of a positioning coordinate in an output layer of the target detection model and a background category identification channel in a classifier to obtain parameter values of all background related channels;

and according to the parameter values of the relevant channels of each background, carrying out parameter coverage on the weights and the tensors of the bias quantities of the parameters of the corresponding channels in the initial deep learning model along the dimensions of the characterization classes and the coordinates.

Optionally, the method further includes:

and inputting the image to be detected into the target deep learning model to obtain a target detection result of the image to be detected.

Optionally, the image to be detected is input into the target deep learning model, so as to obtain the target detection result of the image to be detected, including:

cleaning an original algorithm model in a designated image processor, and loading the target deep learning model by using the designated image processor;

and analyzing the image to be detected by using the target deep learning model through the designated image processor to obtain a target detection result of the image to be detected.

In a second aspect, an embodiment of the present application provides a deep learning model training apparatus, where the apparatus includes:

the model acquisition module is used for acquiring a trained target detection model, wherein the target detection model is used for carrying out target detection on an image and comprises a foreground and background identification structure;

the model initialization module is used for initializing an initial deep learning model to be trained by utilizing the background identification parameters of the target detection model to obtain an initialized initial deep learning model, wherein the initial deep learning model comprises a foreground and background identification structure;

and the sample training module is used for training the initialized initial deep learning model by using a preset sample picture to obtain a trained target deep learning model.

Optionally, the apparatus further comprises:

the recognition structure judging module is used for judging whether the initial deep learning model to be trained comprises a foreground and background recognition structure;

and the identification structure adding module is used for adding the foreground and background identification structures in the output layer of the initial deep learning model when the initial deep learning model does not comprise the foreground and background identification structures, wherein the original classification structure in the output layer of the initial deep learning model is used for classifying the foreground target.

Optionally, the sample training module is specifically configured to:

Optionally, the model initialization module includes:

the front-end parameter assignment submodule is used for assigning parameter values of network layer parameters in front of an output layer of the target detection model to corresponding parameters in the initial deep learning model;

the back-end parameter assignment submodule is used for assigning the parameter values of a regressor channel of the positioning coordinates in the output layer of the target detection model and a background category identification channel in the classifier to corresponding parameters in the initial deep learning model;

and the other parameter assignment submodule is used for initializing other parameters which are not assigned in the initial deep learning model to obtain the initialized initial deep learning model.

Optionally, the back-end parameter assignment submodule is specifically configured to:

Optionally, the apparatus further comprises:

and the image recognition module is used for inputting the image to be detected into the trained target deep learning model to obtain a target detection result of the image to be detected.

Optionally, the image recognition module is specifically configured to:

In a third aspect, an embodiment of the present application provides an electronic device, including a processor and a memory;

the memory is used for storing a computer program;

the processor is used for realizing the following method steps when executing the program stored in the memory:

Optionally, the processor is further configured to:

Optionally, the processor is further configured to: and inputting the image to be detected into the target deep learning model to obtain a target detection result of the image to be detected.

In a fourth aspect, an embodiment of the present application provides a computer-readable storage medium, in which a computer program is stored, and when being executed by a processor, the computer program implements the following method steps:

Optionally, the computer program is further configured to, when executed by the processor, implement the following method steps:

The deep learning model training method, the deep learning model training device and the electronic equipment, provided by the embodiment of the application, are used for acquiring a trained target detection model, wherein the target detection model is used for carrying out target detection on an image and comprises a foreground and background identification structure; initializing an initial deep learning model to be trained by using background identification parameters of a target detection model to obtain an initialized initial deep learning model, wherein the initial deep learning model comprises a foreground and background identification structure; and training the initialized initial deep learning model by using a preset sample picture to obtain a trained target deep learning model. The initial deep learning model is initialized by using the background recognition parameters of the trained target detection model, and although the types and the forms of the targets in the sample picture of the target detection model and the sample picture of the initial deep learning model are possibly greatly different, the background recognition has stronger generalization capability and migration capability because the richness degree of the background is far greater than that of the targets, and a better migration training effect can be achieved by using the background recognition parameters of the target detection model. The initialized initial deep learning model has strong background recognition capability and robust target feature extraction capability, is easy to train to be convergent, and improves the accuracy of target detection of the deep learning model. Meanwhile, the convergence rate of deep learning model training can be increased, especially for training scenes with few sample pictures, the overfitting condition can be reduced, the training stability is increased, and the accuracy of target detection is improved. 
Of course, not all advantages described above need to be achieved at the same time in the practice of any one product or method of the present application.

Drawings

In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, it is obvious that the drawings in the following description are only some embodiments of the present application, and for those skilled in the art, other drawings can be obtained according to the drawings without creative efforts.

FIG. 1 is a first schematic diagram of a deep learning model training method according to an embodiment of the present application;

FIG. 2a is a second schematic diagram of a deep learning model training method according to an embodiment of the present application;

FIG. 2b is a second schematic diagram of a deep learning model training method according to an embodiment of the present application;

FIG. 3a is a diagram illustrating the output layer parameter tensors of the classifier according to the embodiment of the present application;

FIG. 3b is a diagram illustrating parameter initialization according to an embodiment of the present application;

FIG. 4 is a schematic diagram of a deep learning model training apparatus according to an embodiment of the present application;

FIG. 5 is a schematic diagram of an electronic device according to an embodiment of the present application.

Detailed Description

The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only a part of the embodiments of the present application, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.

First, terms of art in the embodiments of the present application are explained:

Migration training (transfer learning): the process, in a deep learning algorithm, of training on new target data starting from an already trained model, with the aim of improving the precision and stability of training on the new target data.

In the related art, the training process of the deep learning model based on the migration training generally follows a uniform paradigm, i.e., the parameters of the classification model trained on the large-scale image classification data set are used to initialize the back-end structure of the new deep learning model, while the parameters of the structures such as the positioning and classification of the front end are initialized randomly, and training is performed on the basis of the initialization.

However, for individual users or other small-sample training scenarios, such as a cloud platform that provides a target detection service to individual users, little sample data is available for training the deep learning model. When the above method is used, the large number of randomly initialized parameters in the front-end network structure introduces many random factors during training, so that convergence is slow and problems such as overfitting can occur, which affects the accuracy of target detection by the deep learning model.

In view of this, an embodiment of the present application provides a deep learning model training method, and referring to fig. 1, the method includes:

S101, acquiring a trained target detection model, wherein the target detection model is used for carrying out target detection on an image and comprises a foreground and background identification structure.

The deep learning model training method can be realized through electronic equipment such as a server, and particularly can be a server in a cloud platform.

The target detection model is used to identify specified targets in an image, i.e. to perform target detection on the image. The target detection model is a model that contains a foreground and background identification structure, such as Fast R-CNN (Fast Region-based Convolutional Neural Network) or YOLO (You Only Look Once). For a deep learning model that does not include a foreground and background identification structure, such as SSD (Single Shot MultiBox Detector) or RetinaNet, the foreground and background identification structure may be added to the output layer of the deep learning model, and the original classification structure in the output layer is then used for classifying foreground targets. The foreground and background identification structure may be a binary classification structure for the foreground and background identification task, for example an RPN (Region Proposal Network).

The target detection model is a trained deep learning model for target detection. Specifically, the target detection model can be obtained by training on a large target detection data set. Although the categories and forms of the targets in that data set may differ greatly from those of the new targets, the background is far richer than the targets, so the learned background recognition has stronger generalization and migration capability and can achieve a better migration training effect on the target data set.

S102, initializing the initial deep learning model by using the background identification parameters of the target detection model to obtain an initialized initial deep learning model, wherein the initial deep learning model comprises a foreground and background identification structure.

The background identification parameters are parameters related to background recognition, and may include, for example, the parameters of each network layer before the output layer of the model, the parameters of the regressor channels for the positioning coordinates in the output layer, and the parameters of the background category identification channel in the classifier. The parameter values of the background identification parameters in the target detection model are assigned to the corresponding background identification parameters in the initial deep learning model; specifically, the background identification parameters of the target detection model can directly overwrite the corresponding parameters in the initial deep learning model. The other parameters in the initial deep learning model, apart from the background identification parameters, are assigned randomly.

S103, training the initialized initial deep learning model by using a preset sample picture to obtain a trained target deep learning model.

After the initialization of the initial deep learning model is completed, the initialized initial deep learning model can be trained by using a preset sample picture. In the embodiment of the application, any relevant method for training the deep learning model by using the sample picture can be adopted to train the initialized initial deep learning model, and details are not repeated here.

In the embodiment of the application, the initial deep learning model is initialized by using the background identification parameters of the trained target detection model, and although the type and the form of the target may be greatly different between the sample picture of the target detection model and the sample picture of the initial deep learning model, the abundance degree of the background is far greater than that of the target, so that the background identification has stronger generalization capability and migration capability, and a better migration training effect can be achieved by using the background identification parameters of the target detection model. The initialized initial deep learning model has strong background recognition capability and robust target feature extraction capability, is easy to train to be convergent, and improves the accuracy of target detection of the deep learning model. Meanwhile, the convergence rate of deep learning model training can be increased, especially for training scenes with few sample pictures, the overfitting condition can be reduced, the training stability is increased, and the accuracy of target detection is improved.

In order to ensure smooth migration of the background recognition parameters, the initial deep learning model needs to include a foreground and background recognition structure. In a possible implementation manner, referring to FIG. 2a, before the initializing the initial deep learning model to be trained by using the background recognition parameters of the target detection model to obtain an initialized initial deep learning model, the method further includes:

S201, judging whether the initial deep learning model to be trained comprises a foreground and background recognition structure.

When the initial deep learning model to be trained does not include a foreground and background recognition structure, S202 is executed; when the initial deep learning model to be trained includes a foreground and background recognition structure, S102 is executed.

S202, when the initial deep learning model does not comprise a foreground and background identification structure, adding the foreground and background identification structure to an output layer of the initial deep learning model, wherein the original classification structure in the output layer of the initial deep learning model is used for classifying the foreground target.

When the initial deep learning model has no foreground and background recognition structure, for example when it is a single-stage model such as SSD or RetinaNet, the foreground and background recognition structure may be added to the output layer of the initial deep learning model; specifically, it may be a binary classification structure for the foreground and background recognition task. The original classification structure in the output layer of the initial deep learning model is then used for classifying foreground targets, so that the task is decomposed into two hierarchical tasks. Correspondingly, when the initial deep learning model is subsequently trained with sample data, a term for foreground and background recognition needs to be added to the classification loss function, and the hyper-parameters may be adjusted appropriately during training to ensure the training effect.
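By way of a non-limiting example, the check of S201, the addition of S202, and the extended classification loss may be sketched in Python as follows. The `OutputLayer` container, the head name `fg_bg`, and the weighting factor `fg_bg_weight` are hypothetical names introduced only for illustration; the application does not prescribe a particular loss formula, so the cross-entropy terms below are an assumption.

```python
import math
from dataclasses import dataclass, field


@dataclass
class OutputLayer:
    """Hypothetical description of a detector output layer: head name -> channel count."""
    heads: dict = field(default_factory=dict)


def ensure_fg_bg_head(layer: OutputLayer) -> bool:
    """Add a binary foreground/background head if the layer lacks one.

    Returns True when a head was added (S202), False when one already existed (S201).
    """
    if "fg_bg" in layer.heads:
        return False
    layer.heads["fg_bg"] = 2  # two channels: foreground and background
    return True


def binary_cross_entropy(p_foreground: float, is_foreground: bool) -> float:
    """Cross-entropy of a predicted foreground probability against a binary label."""
    p = min(max(p_foreground, 1e-7), 1.0 - 1e-7)  # clamp to avoid log(0)
    return -math.log(p) if is_foreground else -math.log(1.0 - p)


def classification_loss(p_foreground, is_foreground, class_probs, class_index,
                        fg_bg_weight=1.0):
    """Foreground/background term plus a class term that applies to foreground samples only."""
    loss = fg_bg_weight * binary_cross_entropy(p_foreground, is_foreground)
    if is_foreground:
        loss += -math.log(max(class_probs[class_index], 1e-7))
    return loss
```

In this sketch a background sample contributes only the binary term, while a foreground sample contributes both terms, which mirrors the decomposition into two hierarchical tasks. The factor `fg_bg_weight` stands in for one of the hyper-parameters that may be adjusted during training.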

In the embodiment of the application, when the initial deep learning model does not include a foreground and background recognition structure, the structure is added to the output layer of the initial deep learning model, which ensures smooth migration of the background recognition parameters of the target detection model and widens the range of models to which the deep learning model training method applies.

In one possible embodiment, referring to fig. 2b, the initializing the initial deep learning model by using the background identification parameter of the target detection model to obtain an initialized initial deep learning model includes:

S1021, assigning the parameter values of the network layer parameters before the output layer of the target detection model to the corresponding parameters in the initial deep learning model.

Specifically, the parameters of all network layers before the output layer of the target detection model may be used to overwrite the corresponding parameters of the initial deep learning model.
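By way of a non-limiting example, S1021 may be sketched in Python as follows, under the assumption that each model exposes its parameters as a dictionary from parameter name to a flat list of values and that output layer parameters share a common name prefix. The function name `copy_pre_output_params` and the prefix `output.` are hypothetical.

```python
def copy_pre_output_params(source: dict, target: dict,
                           output_prefix: str = "output.") -> list:
    """Overwrite target parameters that lie before the output layer with source values.

    Parameters are matched by name. Names starting with `output_prefix` are skipped,
    because the output layer is handled separately. Returns the overwritten names.
    """
    copied = []
    for name, value in source.items():
        if name.startswith(output_prefix):
            continue
        # Only overwrite when the target has a parameter of the same name and size.
        if name in target and len(target[name]) == len(value):
            target[name] = list(value)  # copy so the two models do not share storage
            copied.append(name)
    return copied
```

The returned list of names can be kept so that a later step knows which parameters have already been assigned.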

S1022, assigning the parameter values of the regressor channels for the positioning coordinates in the output layer of the target detection model and of the background category identification channel in the classifier to the corresponding parameters in the initial deep learning model.

Specifically, the background identification parameters in the output layer of the target detection model, namely the parameters of the regressor channels for the positioning coordinates and of the background category identification channel in the classifier, may be used to overwrite the corresponding parameters in the output layer of the initial deep learning model. In a possible implementation manner, this assignment includes:

Step one, obtaining the parameter values of the regressor channels for the positioning coordinates in the output layer of the target detection model and of the background category identification channel in the classifier, to obtain the parameter values of each background-related channel.

Step two, according to the parameter values of each background-related channel, overwriting the weight and bias tensors of the corresponding channels in the initial deep learning model along the dimensions that represent the class and the coordinates.

For the output layer, the number of categories of the target detection model and of the initial deep learning model may differ, so the parameters cannot be overwritten directly. However, the regressor for the positioning coordinates (4 channels) and the background category identification channel in the target classifier (1 channel) are structures common to the target detection model and the initial deep learning model, so the parameters of these 5 channels can be retained. Specifically, according to the parameters of these 5 channels in the target detection model, the weight and bias tensors in the output layer of the initial deep learning model are overwritten for the 5 channels along the dimensions that represent the class and the coordinates. A schematic diagram of the output layer parameters of the classifier (including the weights and the biases) may be as shown in FIG. 3a.
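By way of a non-limiting example, the overwriting of the 5 channels may be sketched in Python as follows. The sketch assumes that each output head stores its weights as a list of rows, one row per output channel, with a matching list of biases, and that the background channel sits at a known index in each classifier. The dictionary keys and the indices `src_bg` and `tgt_bg` are hypothetical.

```python
import copy


def overwrite_background_channels(source_head: dict, target_head: dict,
                                  src_bg: int = 0, tgt_bg: int = 0) -> None:
    """Copy the 4 box-regressor channels and the 1 background classifier channel.

    Both weights and biases are copied. All other classifier channels in the
    target are left untouched, because the class counts of the models may differ.
    """
    # Regressor: the 4 coordinate channels exist in both models, so copy them all.
    target_head["reg_weight"] = copy.deepcopy(source_head["reg_weight"])
    target_head["reg_bias"] = list(source_head["reg_bias"])
    # Classifier: copy only the row and bias that belong to the background class.
    target_head["cls_weight"][tgt_bg] = list(source_head["cls_weight"][src_bg])
    target_head["cls_bias"][tgt_bg] = source_head["cls_bias"][src_bg]
```

For instance, a source classifier with 81 channels and a target classifier with 3 channels can both be handled, since only one classifier row is indexed in each model.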

S1023, initializing other parameters which are not assigned in the initial deep learning model to obtain the initialized initial deep learning model.

The other, unassigned parameters in the initial deep learning model may be initialized with any related initialization method, for example by random assignment.
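By way of a non-limiting example, S1023 may be sketched in Python as follows, assuming that the names of the already assigned parameters are tracked in a set. The zero-mean Gaussian with standard deviation `std` is only one possible choice of random assignment and is an assumption of this sketch; the function name `init_unassigned` is hypothetical.

```python
import random


def init_unassigned(params: dict, assigned: set, std: float = 0.01, seed=None) -> list:
    """Randomly initialise every parameter whose name is not in `assigned`.

    `params` maps a parameter name to a flat list of values. Returns the names
    of the parameters that were randomly initialised.
    """
    rng = random.Random(seed)  # private generator so results can be reproduced
    initialised = []
    for name, values in params.items():
        if name in assigned:
            continue  # keep the values migrated from the target detection model
        params[name] = [rng.gauss(0.0, std) for _ in values]
        initialised.append(name)
    return initialised
```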

Referring specifically to FIG. 3b, which shows an image input network, a back-end network, a region proposal network, and a head network: in the output layer, the four channels of the regressor for the positioning coordinates and the one background category identification channel in the classifier are overwritten with the parameters of the corresponding five channels in the target detection model. The parameters of the other channels in the classifier are initialized by random assignment.

The training method of the initialized initial deep learning model can be referred to as the training method of the deep learning model in the related art. In a possible implementation manner, the training the initialized initial deep learning model by using a preset sample picture to obtain a trained target deep learning model includes:

Step one, inputting a preset sample picture into the initialized initial deep learning model for training, and adjusting learning parameters of the initialized initial deep learning model, wherein the learning parameters comprise the background identification parameters.

Sample pictures are selected from the preset sample pictures that carry annotation information and are input in stages into the initialized initial deep learning model for training. After each stage of training is finished, the learning parameters of the initialized initial deep learning model are adjusted according to the training effect, for example the loss function value or the local loss. The learning parameters include the background identification parameters, the learning rate, the classification parameters, and the like.

Step two, when a preset ending condition is met, ending the training of the initialized initial deep learning model to obtain a trained target deep learning model.

The preset ending condition may be set according to an actual situation, for example, when the loss function converges, it is determined that the preset ending condition is satisfied; or when the training times reach a preset time threshold, judging that a preset ending condition is met; or when the training effect begins to decline, judging that a preset ending condition is met, and the like. And when the preset ending condition is met, ending the training of the initialized initial deep learning model to obtain the trained target deep learning model.
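By way of a non-limiting example, the three ending conditions mentioned above may be sketched in Python as follows. The callables `step_fn` and `eval_fn`, the tolerance `tol`, and the `patience` count are hypothetical; the application leaves the concrete thresholds to the actual situation.

```python
def train_until_done(step_fn, eval_fn, max_steps=1000, tol=1e-4, patience=3):
    """Run training steps until one of three ending conditions holds.

    step_fn() performs one training step and returns the loss value.
    eval_fn() returns a validation score where higher is better.
    Returns (number of steps run, reason for stopping).
    """
    prev_loss = None
    best_score = float("-inf")
    declines = 0
    for step in range(1, max_steps + 1):
        loss = step_fn()
        # Condition 1: the loss function has converged.
        if prev_loss is not None and abs(prev_loss - loss) < tol:
            return step, "loss converged"
        prev_loss = loss
        # Condition 3: the training effect has started to decline.
        score = eval_fn()
        if score > best_score:
            best_score, declines = score, 0
        else:
            declines += 1
            if declines >= patience:
                return step, "validation score declining"
    # Condition 2: the number of training steps reached the preset threshold.
    return max_steps, "step limit reached"
```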

The type of the preset sample picture can be set according to the actual target detection requirement, and in a possible implementation manner, the preset sample picture comprises a face picture marked with a face frame, a human body picture marked with a human body frame, or a vehicle picture marked with a vehicle frame.

The preset sample pictures may include a positive sample picture and a negative sample picture. For example, when the target detection needs to be performed on a human body, the positive sample picture may be a human body picture marked with a human body frame, and the negative sample may be a picture not including a human body; for example, when the target detection needs to be performed on the vehicle, the positive sample picture may be a picture of the vehicle marked with a vehicle frame, and the negative sample may be a picture that does not include the vehicle.
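By way of a non-limiting example, positive and negative sample pictures may be represented in Python as follows. The class `SamplePicture`, the corner-based frame format, and the file names in the usage note are hypothetical and serve only to illustrate how a marked frame distinguishes the two kinds of sample.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

# Assumed frame convention: (x_min, y_min, x_max, y_max) in pixels.
Box = Tuple[float, float, float, float]


@dataclass
class SamplePicture:
    path: str
    label: str  # e.g. "face", "human_body" or "vehicle"
    boxes: List[Box] = field(default_factory=list)

    @property
    def is_positive(self) -> bool:
        """A picture counts as positive when it carries at least one marked frame."""
        return len(self.boxes) > 0


def split_samples(samples):
    """Separate positive sample pictures from negative ones."""
    positives = [s for s in samples if s.is_positive]
    negatives = [s for s in samples if not s.is_positive]
    return positives, negatives
```

For instance, `SamplePicture("car_001.jpg", "vehicle", [(10, 20, 200, 180)])` would be a positive sample for vehicle detection, while `SamplePicture("street_empty.jpg", "vehicle")` would be a negative sample.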

In a possible embodiment, the method further includes:

Performing target detection on the image to be detected by using the trained target deep learning model.

In one possible embodiment, inputting the image to be detected into the target deep learning model to obtain the target detection result of the image to be detected includes:

Step one: clearing the original algorithm model from a designated image processor, and loading the target deep learning model by using the designated image processor.

The designated image processor is an image processor in a device such as a personal computer, a smart phone, a smart camera, or a server. Specifically, the designated image processor may be a GPU (Graphics Processing Unit) or, in some cases, another device having a computing function, such as a CPU (Central Processing Unit) or an FPGA (Field Programmable Gate Array). The designated image processor can be made to clear the original algorithm model and load the target deep learning model by sending instructions to it.

Step two: analyzing the image to be detected by using the target deep learning model through the designated image processor to obtain a target detection result of the image to be detected.

After the designated image processor has loaded the target deep learning model, it can run the target deep learning model to perform target detection on the image to be detected and obtain the target detection result of the image to be detected.
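The two steps above (clear the original model, load the target model, then analyze the image) can be sketched as follows. The class `DesignatedProcessor`, its methods, and the `Detection` tuple layout are hypothetical stand-ins introduced for illustration; they do not correspond to any real GPU, CPU, or FPGA driver interface, and the application does not specify one.

```python
from typing import Callable, List, Optional, Tuple

# Assumed result layout: (label, x_min, y_min, x_max, y_max).
Detection = Tuple[str, float, float, float, float]
Model = Callable[[object], List[Detection]]


class DesignatedProcessor:
    """Hypothetical stand-in for a processor that holds one model at a time."""

    def __init__(self, original_model: Optional[Model] = None) -> None:
        self.loaded_model: Optional[Model] = original_model

    def clear_model(self) -> None:
        # Step one, first half: discard the original algorithm model.
        self.loaded_model = None

    def load_model(self, model: Model) -> None:
        # Step one, second half: load the trained target deep learning model.
        self.loaded_model = model

    def detect(self, image: object) -> List[Detection]:
        # Step two: analyze the image with whichever model is loaded.
        if self.loaded_model is None:
            raise RuntimeError("no model is loaded on the processor")
        return self.loaded_model(image)


def run_detection(processor: DesignatedProcessor,
                  target_model: Model,
                  image: object) -> List[Detection]:
    """Clear the old model, load the target model, and detect on one image."""
    processor.clear_model()
    processor.load_model(target_model)
    return processor.detect(image)
```

In this sketch a model is any callable that maps an image to a list of detections, so a trained network wrapped in a function could be passed as `target_model`.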

The deep learning model training method can be applied in scenarios such as a cloud platform. Individual users or enterprise users served by the cloud platform can usually provide only a small number of training samples; with this method, the model to be trained can still converge quickly, overfitting is reduced, and training stability is improved. When the model obtained by the deep learning model training method of the embodiment of the application is used for target detection, the accuracy of the target detection can be improved.

The embodiment of the present application further provides a deep learning model training apparatus. Referring to fig. 4, the apparatus includes:

a model obtaining module 401, configured to obtain a trained target detection model, where the target detection model is used to perform target detection on an image, and the target detection model includes a foreground and background identification structure;

a model initialization module 402, configured to initialize an initial deep learning model to be trained by using the background identification parameters of the target detection model, so as to obtain an initialized initial deep learning model, where the initial deep learning model includes a foreground and background identification structure;

a sample training module 403, configured to train the initialized initial deep learning model by using a preset sample picture, so as to obtain a trained target deep learning model.

Optionally, the apparatus further comprises:

an identification structure adding module, configured to add a foreground and background identification structure to an output layer of the initial deep learning model when the initial deep learning model does not include a foreground and background identification structure, where the original classification structure in the output layer of the initial deep learning model is used for classifying the foreground target.

Optionally, the sample training module 403 is specifically configured to: input a preset sample picture into the initialized initial deep learning model for training, and adjust learning parameters of the initialized initial deep learning model, where the learning parameters include the background identification parameters; and, when a preset ending condition is met, end the training of the initialized initial deep learning model to obtain a trained target deep learning model.

Optionally, the model initialization module 402 includes:

a front-end parameter assignment submodule, configured to assign parameter values of network layer parameters in front of an output layer of the target detection model to corresponding parameters in the initial deep learning model;

a back-end parameter assignment submodule, configured to assign parameter values of a regressor channel of a positioning coordinate in an output layer of the target detection model and a background class identification channel in a classifier to corresponding parameters in the initial deep learning model;

and, according to the parameter values of the background-related channels, overwriting the weight tensors and bias tensors of the corresponding channel parameters in the initial deep learning model along the dimensions that represent the classes and the coordinates.
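The channel-wise parameter overwrite described above can be illustrated with a minimal sketch. It assumes that each output branch is stored as a weight matrix with one row per output channel plus a bias vector with one entry per channel, and that the background class occupies channel index 0. The layout, the index, and the names `overwrite_channels` and `init_output_layer` are assumptions for illustration, not the format used in the application.

```python
from typing import Dict, List

# Assumed layer layout: {"weight": rows per output channel, "bias": per channel}.
Layer = Dict[str, list]


def overwrite_channels(dst: Layer, src: Layer,
                       dst_channels: List[int],
                       src_channels: List[int]) -> None:
    """Copy selected output channels (weight rows, bias entries) from src to dst."""
    for d, s in zip(dst_channels, src_channels):
        if len(dst["weight"][d]) != len(src["weight"][s]):
            raise ValueError("channel shapes differ")
        dst["weight"][d] = list(src["weight"][s])
        dst["bias"][d] = src["bias"][s]


def init_output_layer(new_cls: Layer, new_reg: Layer,
                      old_cls: Layer, old_reg: Layer,
                      background_index: int = 0) -> None:
    # Background class channel: overwritten along the class dimension only,
    # so the foreground class channels of the new model are left untouched.
    overwrite_channels(new_cls, old_cls, [background_index], [background_index])
    # Coordinate regressor channels: overwritten along the coordinate dimension.
    coords = list(range(len(old_reg["bias"])))
    overwrite_channels(new_reg, old_reg, coords, coords)
```

Because only the selected rows are replaced, the trained detection model and the model to be trained may have different numbers of foreground classes, provided the per-channel input width matches.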

Optionally, the apparatus further comprises: an image recognition module, configured to input the image to be detected into the target deep learning model to obtain a target detection result of the image to be detected.

Optionally, the image recognition module is specifically configured to: clear the original algorithm model from a designated image processor, and load the target deep learning model by using the designated image processor; and analyze the image to be detected by using the target deep learning model through the designated image processor to obtain a target detection result of the image to be detected.

An embodiment of the present application further provides an electronic device, including: a processor and a memory;

the memory is used for storing computer programs;

the processor is configured to implement the following steps when executing the computer program stored in the memory:

Optionally, the processor is further configured to:

and when the initial deep learning model does not comprise a foreground and background identification structure, adding the foreground and background identification structure to an output layer of the initial deep learning model, wherein the original classification structure in the output layer of the initial deep learning model is used for classifying the foreground target.

Optionally, the training the initialized initial deep learning model by using a preset sample picture to obtain a trained target deep learning model includes:

assigning the parameter values of the network layer parameters before the output layer of the target detection model to the corresponding parameters in the initial deep learning model;

assigning the parameter values of a regressor channel of the positioning coordinates in the output layer of the target detection model and a background category identification channel in the classifier to corresponding parameters in the initial deep learning model;

Optionally, inputting the image to be detected into the target deep learning model to obtain the target detection result of the image to be detected includes:

and analyzing the image to be detected by using the target deep learning model through the designated image processor to obtain a target detection result of the image to be detected.

Optionally, when the processor is used to execute the computer program stored in the memory, any of the deep learning model training methods can be further implemented.

Optionally, referring to fig. 5, the electronic device according to the embodiment of the present application further includes a communication interface 502 and a communication bus 504, where the processor 501, the communication interface 502, and the memory 503 communicate with each other through the communication bus 504.

The communication bus mentioned in the electronic device may be a PCI (Peripheral Component Interconnect) bus, an EISA (Extended Industry Standard Architecture) bus, or the like. The communication bus may be divided into an address bus, a data bus, a control bus, and so on. For ease of illustration, only one thick line is shown in the figure, but this does not mean that there is only one bus or one type of bus.

The communication interface is used for communication between the electronic equipment and other equipment.

The Memory may include a RAM (Random Access Memory) or an NVM (Non-Volatile Memory), such as at least one disk Memory. Optionally, the memory may also be at least one memory device located remotely from the processor.

The processor may be a general-purpose processor, including a CPU (Central Processing Unit), an NP (Network Processor), and the like; it may also be a DSP (Digital Signal Processor), an ASIC (Application Specific Integrated Circuit), an FPGA (Field Programmable Gate Array) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.

An embodiment of the present application further provides a computer-readable storage medium, where a computer program is stored in the computer-readable storage medium, and when the computer program is executed by a processor, the computer program implements the following steps:

Optionally, the computer program, when executed by a processor, can further implement any of the deep learning model training methods described above.

It should be noted that, in this document, the technical features in the various alternatives can be combined to form a scheme as long as they are not contradictory, and such a scheme is within the scope of the disclosure of the present application. Relational terms such as first and second may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of other identical elements in a process, method, article, or apparatus that comprises the element.

All the embodiments in the present specification are described in a related manner, and the same and similar parts among the embodiments may be referred to each other, and each embodiment focuses on the differences from the other embodiments. In particular, for the embodiments of the apparatus, the electronic device, and the storage medium, since they are substantially similar to the method embodiments, the description is relatively simple, and for the relevant points, reference may be made to the partial description of the method embodiments.

The above description is only for the preferred embodiment of the present application, and is not intended to limit the scope of the present application. Any modification, equivalent replacement, improvement and the like made within the spirit and principle of the present application are included in the protection scope of the present application.

Claims

1. A deep learning model training method, the method comprising:

2. The method according to claim 1, wherein before the initializing the initial deep learning model to be trained by using the background identification parameters of the target detection model to obtain the initialized initial deep learning model, the method further comprises:

3. The method according to claim 1, wherein the training the initialized initial deep learning model by using a preset sample picture to obtain a trained target deep learning model comprises:

4. The method according to any one of claims 1 to 3, wherein the preset sample picture comprises a face picture marked with a face frame, a body picture marked with a body frame, or a vehicle picture marked with a vehicle frame.

5. The method according to claim 1, wherein initializing the initial deep learning model by using the background identification parameters of the target detection model to obtain an initialized initial deep learning model comprises:

6. The method of claim 5, wherein assigning parameter values of a regressor channel of the positioning coordinates in an output layer of the target detection model and a background class identification channel in a classifier to respective parameters in the initial deep learning model comprises:

7. The method of claim 1, further comprising:

8. The method according to claim 7, wherein the inputting the image to be detected into the target deep learning model to obtain the target detection result of the image to be detected comprises:

9. An apparatus for deep learning model training, the apparatus comprising:

10. The apparatus of claim 9, further comprising:

11. The apparatus of claim 9, wherein the sample training module is specifically configured to:

12. The apparatus according to any one of claims 9-11, wherein the preset sample picture comprises a face picture marked with a face frame, a body picture marked with a body frame, or a vehicle picture marked with a vehicle frame.

13. The apparatus of claim 9, wherein the model initialization module comprises:

14. The apparatus of claim 13, wherein the back-end parameter assignment submodule is specifically configured to:

15. The apparatus of claim 9, further comprising:

16. The apparatus of claim 15, wherein the image recognition module is specifically configured to: