CN112580408B - Deep learning model training method and device and electronic equipment - Google Patents

Deep learning model training method and device and electronic equipment

Info

Publication number
CN112580408B
Authority
CN
China
Prior art keywords
deep learning
learning model
initial deep
model
target detection
Prior art date
Legal status
Active
Application number
CN201910943965.7A
Other languages
Chinese (zh)
Other versions
CN112580408A (en)
Inventor
章良君
Current Assignee
Hangzhou Hikvision Digital Technology Co Ltd
Original Assignee
Hangzhou Hikvision Digital Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Hangzhou Hikvision Digital Technology Co Ltd
Priority to CN201910943965.7A
Publication of CN112580408A
Application granted
Publication of CN112580408B
Legal status: Active
Anticipated expiration


Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16Human faces, e.g. facial parts, sketches or expressions
    • G06V40/172Classification, e.g. identification
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V2201/00Indexing scheme relating to image or video recognition or understanding
    • G06V2201/08Detecting or categorising vehicles

Abstract

The embodiment of the application provides a deep learning model training method and device and an electronic device, applied to the technical field of computer vision. The method comprises the following steps: acquiring a trained target detection model, wherein the target detection model comprises a foreground and background recognition structure; initializing an initial deep learning model to be trained by using the background recognition parameters of the target detection model to obtain an initialized initial deep learning model, wherein the initial deep learning model comprises a foreground and background recognition structure; and training the initialized initial deep learning model with preset sample pictures to obtain a trained target deep learning model. Because the initial deep learning model is initialized with the background recognition parameters of the trained target detection model, the initialized initial deep learning model has stronger background recognition capability and a robust target feature extraction capability, so training converges easily and the target detection accuracy of the deep learning model is improved.

Description

Deep learning model training method and device and electronic equipment
Technical Field
The application relates to the technical field of computer vision, in particular to a deep learning model training method and device and electronic equipment.
Background
With the development of artificial intelligence technology, and in particular the emergence of neural networks, computer vision technology has developed rapidly. In computer vision, target detection aims to find the position of a target in an image and identify its category, and in the prior art it is generally performed by using a deep learning model.
In the related art, the training process of a deep learning model generally follows a unified paradigm: the parameters of a classification model trained on a large image classification dataset (such as ImageNet) are used to initialize the back-end structure of the new deep learning model, the parameters of front-end structures such as positioning and classification are initialized randomly, and training is performed on this basis after initialization is completed.
However, with this approach, the front-end network structure contains a large number of randomly initialized parameters; the many random factors introduced during training slow convergence, and for training scenarios with little sample data, factors such as overfitting affect the accuracy of target detection of the deep learning model.
Disclosure of Invention
The embodiment of the application aims to provide a deep learning model training method and device and an electronic device, so as to improve the accuracy of target detection of a deep learning model. The specific technical scheme is as follows:
In a first aspect, an embodiment of the present application provides a deep learning model training method, where the method includes:
obtaining a trained target detection model, wherein the target detection model is used for detecting targets of images and comprises a foreground and background identification structure;
initializing an initial deep learning model to be trained by utilizing the background recognition parameters of the target detection model to obtain an initialized initial deep learning model, wherein the initial deep learning model comprises a foreground and background recognition structure;
and training the initialized initial deep learning model by using a preset sample picture to obtain a trained target deep learning model.
Optionally, before initializing the initial deep learning model to be trained by using the background recognition parameter of the target detection model to obtain an initialized initial deep learning model, the method further includes:
judging whether an initial deep learning model to be trained comprises a foreground and background identification structure;
and when the initial deep learning model does not comprise the foreground and background recognition structures, adding the foreground and background recognition structures to an output layer of the initial deep learning model, wherein an original classification structure is used for classifying foreground targets in the output layer of the initial deep learning model.
Optionally, training the initialized initial deep learning model by using a preset sample picture to obtain a trained target deep learning model, including:
inputting a preset sample picture into the initialized initial deep learning model for training, and adjusting learning parameters of the initialized initial deep learning model, wherein the learning parameters comprise the background recognition parameters;
and when the preset ending condition is met, ending the training of the initialized initial deep learning model to obtain a trained target deep learning model.
Optionally, the preset sample picture includes a face picture marked with a face frame, or a human picture marked with a human frame, or a vehicle picture marked with a vehicle frame.
Optionally, initializing the initial deep learning model by using the background recognition parameter of the target detection model to obtain an initialized initial deep learning model, including:
assigning the parameter values of the parameters of each network layer before the output layer of the target detection model to each corresponding parameter in the initial deep learning model;
assigning the parameter values of the regressor channel of the positioning coordinates in the output layer of the target detection model and of the background category identification channel in the classifier to the corresponding parameters in the initial deep learning model;
initializing the other parameters that have not been assigned in the initial deep learning model, and obtaining the initialized initial deep learning model.
Optionally, the assigning parameter values of a regressor channel of the positioning coordinates in the output layer of the target detection model and a background category identification channel in the classifier to each corresponding parameter in the initial deep learning model includes:
obtaining parameter values of a regressor channel of a positioning coordinate in an output layer of the target detection model and a background category identification channel in a classifier, and obtaining parameter values of each background related channel;
and according to the parameter values of the background related channels, carrying out parameter coverage on the weight and bias tensors of the parameters of the corresponding channels in the initial deep learning model along the dimensions representing the category and the coordinates.
Optionally, the method further comprises:
and inputting the image to be detected into the target deep learning model to obtain a target detection result of the image to be detected.
Optionally, the inputting the image to be detected into the target deep learning model to obtain a target detection result of the image to be detected includes:
clearing the original algorithm model from a designated image processor, and loading the target deep learning model by using the designated image processor;
And analyzing the image to be detected by using the target deep learning model through the specified image processor to obtain a target detection result of the image to be detected.
In a second aspect, embodiments of the present application provide a deep learning model training apparatus, the apparatus including:
the model acquisition module is used for acquiring a trained target detection model, wherein the target detection model is used for carrying out target detection on an image and comprises a foreground and background identification structure;
the model initialization module is used for initializing an initial deep learning model to be trained by utilizing the background recognition parameters of the target detection model to obtain an initialized initial deep learning model, wherein the initial deep learning model comprises a foreground and background recognition structure;
and the sample training module is used for training the initialized initial deep learning model by using a preset sample picture to obtain a trained target deep learning model.
Optionally, the apparatus further includes:
the recognition structure judging module is used for judging whether the initial deep learning model to be trained comprises a foreground and background recognition structure or not;
And the recognition structure adding module is used for adding the foreground and background recognition structures to the output layer of the initial deep learning model when the initial deep learning model does not comprise the foreground and background recognition structures, wherein the original classification structure is used for classifying foreground targets in the output layer of the initial deep learning model.
Optionally, the sample training module is specifically configured to:
inputting a preset sample picture into the initialized initial deep learning model for training, and adjusting learning parameters of the initialized initial deep learning model, wherein the learning parameters comprise the background recognition parameters;
and when the preset ending condition is met, ending the training of the initialized initial deep learning model to obtain a trained target deep learning model.
Optionally, the preset sample picture includes a face picture marked with a face frame, or a human picture marked with a human frame, or a vehicle picture marked with a vehicle frame.
Optionally, the model initialization module includes:
a front-end parameter assignment sub-module, configured to assign parameter values of parameters of each network layer before an output layer of the target detection model to corresponding parameters in the initial deep learning model;
The back-end parameter assignment sub-module is used for assigning parameter values of a regressor channel of a positioning coordinate in an output layer of the target detection model and a background category identification channel in a classifier to each corresponding parameter in the initial deep learning model;
and the other parameter assignment sub-module is used for initializing other parameters which are not yet assigned in the initial deep learning model to obtain the initialized initial deep learning model.
Optionally, the back-end parameter assignment sub-module is specifically configured to:
obtaining parameter values of a regressor channel of a positioning coordinate in an output layer of the target detection model and a background category identification channel in a classifier, and obtaining parameter values of each background related channel;
and according to the parameter values of the background related channels, carrying out parameter coverage on the weight and bias tensors of the parameters of the corresponding channels in the initial deep learning model along the dimensions representing the category and the coordinates.
Optionally, the apparatus further includes:
and the image recognition module is used for inputting the image to be detected into the trained target deep learning model to obtain a target detection result of the image to be detected.
Optionally, the image recognition module is specifically configured to:
clearing the original algorithm model from a designated image processor, and loading the target deep learning model by using the designated image processor;
and analyzing the image to be detected by using the target deep learning model through the specified image processor to obtain a target detection result of the image to be detected.
In a third aspect, an embodiment of the present application provides an electronic device, including a processor and a memory;
the memory is used for storing a computer program;
the processor is configured to implement the following method steps when executing the program stored in the memory:
obtaining a trained target detection model, wherein the target detection model is used for detecting targets of images and comprises a foreground and background identification structure;
initializing an initial deep learning model to be trained by utilizing the background recognition parameters of the target detection model to obtain an initialized initial deep learning model, wherein the initial deep learning model comprises a foreground and background recognition structure;
and training the initialized initial deep learning model by using a preset sample picture to obtain a trained target deep learning model.
Optionally, the processor is further configured to:
judging whether an initial deep learning model to be trained comprises a foreground and background identification structure;
and when the initial deep learning model does not comprise the foreground and background recognition structures, adding the foreground and background recognition structures to an output layer of the initial deep learning model, wherein an original classification structure is used for classifying foreground targets in the output layer of the initial deep learning model.
Optionally, training the initialized initial deep learning model by using a preset sample picture to obtain a trained target deep learning model, including:
inputting a preset sample picture into the initialized initial deep learning model for training, and adjusting learning parameters of the initialized initial deep learning model, wherein the learning parameters comprise the background recognition parameters;
and when the preset ending condition is met, ending the training of the initialized initial deep learning model to obtain a trained target deep learning model.
Optionally, the preset sample picture includes a face picture marked with a face frame, or a human picture marked with a human frame, or a vehicle picture marked with a vehicle frame.
Optionally, initializing the initial deep learning model by using the background recognition parameter of the target detection model to obtain an initialized initial deep learning model, including:
assigning the parameter values of the parameters of each network layer before the output layer of the target detection model to each corresponding parameter in the initial deep learning model;
assigning the parameter values of the regressor channel of the positioning coordinates in the output layer of the target detection model and of the background category identification channel in the classifier to the corresponding parameters in the initial deep learning model;
initializing the other parameters that have not been assigned in the initial deep learning model, and obtaining the initialized initial deep learning model.
Optionally, the assigning parameter values of a regressor channel of the positioning coordinates in the output layer of the target detection model and a background category identification channel in the classifier to each corresponding parameter in the initial deep learning model includes:
obtaining parameter values of a regressor channel of a positioning coordinate in an output layer of the target detection model and a background category identification channel in a classifier, and obtaining parameter values of each background related channel;
and according to the parameter values of the background related channels, carrying out parameter coverage on the weight and bias tensors of the parameters of the corresponding channels in the initial deep learning model along the dimensions representing the category and the coordinates.
Optionally, the processor is further configured to: and inputting the image to be detected into the target deep learning model to obtain a target detection result of the image to be detected.
Optionally, the inputting the image to be detected into the target deep learning model to obtain a target detection result of the image to be detected includes:
clearing the original algorithm model from a designated image processor, and loading the target deep learning model by using the designated image processor;
and analyzing the image to be detected by using the target deep learning model through the specified image processor to obtain a target detection result of the image to be detected.
In a fourth aspect, embodiments of the present application provide a computer readable storage medium having a computer program stored therein, which when executed by a processor, performs the following method steps:
obtaining a trained target detection model, wherein the target detection model is used for detecting targets of images and comprises a foreground and background identification structure;
initializing an initial deep learning model to be trained by utilizing the background recognition parameters of the target detection model to obtain an initialized initial deep learning model, wherein the initial deep learning model comprises a foreground and background recognition structure;
And training the initialized initial deep learning model by using a preset sample picture to obtain a trained target deep learning model.
Optionally, the computer program is further configured to implement the following method steps when executed by a processor:
judging whether an initial deep learning model to be trained comprises a foreground and background identification structure;
and when the initial deep learning model does not comprise the foreground and background recognition structures, adding the foreground and background recognition structures to an output layer of the initial deep learning model, wherein an original classification structure is used for classifying foreground targets in the output layer of the initial deep learning model.
Optionally, training the initialized initial deep learning model by using a preset sample picture to obtain a trained target deep learning model, including:
inputting a preset sample picture into the initialized initial deep learning model for training, and adjusting learning parameters of the initialized initial deep learning model, wherein the learning parameters comprise the background recognition parameters;
and when the preset ending condition is met, ending the training of the initialized initial deep learning model to obtain a trained target deep learning model.
Optionally, the preset sample picture includes a face picture marked with a face frame, or a human picture marked with a human frame, or a vehicle picture marked with a vehicle frame.
Optionally, initializing the initial deep learning model by using the background recognition parameter of the target detection model to obtain an initialized initial deep learning model, including:
assigning the parameter values of the parameters of each network layer before the output layer of the target detection model to each corresponding parameter in the initial deep learning model;
assigning the parameter values of the regressor channel of the positioning coordinates in the output layer of the target detection model and of the background category identification channel in the classifier to the corresponding parameters in the initial deep learning model;
initializing the other parameters that have not been assigned in the initial deep learning model, and obtaining the initialized initial deep learning model.
Optionally, the assigning parameter values of a regressor channel of the positioning coordinates in the output layer of the target detection model and a background category identification channel in the classifier to each corresponding parameter in the initial deep learning model includes:
obtaining parameter values of a regressor channel of a positioning coordinate in an output layer of the target detection model and a background category identification channel in a classifier, and obtaining parameter values of each background related channel;
and according to the parameter values of the background related channels, carrying out parameter coverage on the weight and bias tensors of the parameters of the corresponding channels in the initial deep learning model along the dimensions representing the category and the coordinates.
Optionally, the computer program is further configured to implement the following method steps when executed by a processor:
and inputting the image to be detected into the target deep learning model to obtain a target detection result of the image to be detected.
Optionally, the inputting the image to be detected into the target deep learning model to obtain a target detection result of the image to be detected includes:
clearing the original algorithm model from a designated image processor, and loading the target deep learning model by using the designated image processor;
and analyzing the image to be detected by using the target deep learning model through the specified image processor to obtain a target detection result of the image to be detected.
According to the deep learning model training method and device and the electronic device provided by the embodiments of the application, a trained target detection model is obtained, wherein the target detection model is used for performing target detection on images and comprises a foreground and background recognition structure; an initial deep learning model to be trained is initialized by using the background recognition parameters of the target detection model to obtain an initialized initial deep learning model, wherein the initial deep learning model comprises a foreground and background recognition structure; and the initialized initial deep learning model is trained with preset sample pictures to obtain a trained target deep learning model. The initial deep learning model is initialized with the background recognition parameters of the trained target detection model. Although the sample pictures of the target detection model and those of the initial deep learning model may differ considerably in the category and form of the targets, the richness of the background far exceeds that of the targets themselves, so background recognition has strong generalization and migration capability, and a good migration training effect can be achieved by using the background recognition parameters of the target detection model. The initialized initial deep learning model has stronger background recognition capability and a robust target feature extraction capability, so training converges easily and the accuracy of target detection of the deep learning model is increased. Meanwhile, the convergence rate of deep learning model training can be increased; in particular, for training scenarios with fewer sample pictures, overfitting can be reduced, the stability of training is increased, and the accuracy of target detection is improved. Of course, not all of the above-described advantages need be achieved simultaneously in practicing any one of the products or methods of the present application.
Drawings
In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, the drawings required in the embodiments or in the description of the prior art are briefly described below. It is obvious that the drawings in the following description are only some embodiments of the present application, and that a person skilled in the art may obtain other drawings from these drawings without inventive effort.
FIG. 1 is a first schematic diagram of a deep learning model training method according to an embodiment of the present application;
FIG. 2a is a second schematic diagram of a deep learning model training method according to an embodiment of the present application;
FIG. 2b is a second schematic diagram of a deep learning model training method according to an embodiment of the present application;
FIG. 3a is a schematic diagram of a classifier output layer parameter tensor according to an embodiment of the present application;
FIG. 3b is a schematic diagram of parameter initialization according to an embodiment of the present application;
FIG. 4 is a schematic diagram of a deep learning model training device according to an embodiment of the present application;
fig. 5 is a schematic diagram of an electronic device according to an embodiment of the present application.
Detailed Description
The following description of the embodiments of the present application will be made clearly and fully with reference to the accompanying drawings, in which it is evident that the embodiments described are only some, but not all, of the embodiments of the present application. All other embodiments, which can be made by one of ordinary skill in the art without undue burden from the present disclosure, are within the scope of the present disclosure.
First, terms of art in the embodiments of the present application will be explained:
migration training: in the deep learning algorithm, a new target data training process is performed on the basis of a trained model, so that the accuracy and stability of the new target data training are improved.
In the related art, the training process of a deep learning model based on migration training generally follows a unified paradigm: the parameters of a classification model trained on a large-scale image classification dataset are used to initialize the back-end structure of the new deep learning model, while the parameters of front-end structures such as positioning and classification are randomly initialized, and training is performed on this basis after initialization is completed.
However, for individual users or some small-sample training scenarios, such as a cloud platform that provides target detection services for individual users, the sample data available for training a deep learning model is limited. When the above method is adopted for training, the front-end network structure contains a large number of randomly initialized parameters; the many random factors introduced during training slow convergence, and problems such as overfitting may occur, affecting the accuracy of target detection of the deep learning model.
In view of this, an embodiment of the present application provides a deep learning model training method, referring to fig. 1, the method includes:
s101, acquiring a trained target detection model, wherein the target detection model is used for detecting targets of images and comprises a foreground and background identification structure.
The deep learning model training method can be implemented by an electronic device such as a server, for example a server in a cloud platform.
The object detection model is used to identify a specified object in an image, i.e. to perform the object detection function on the image. The object detection model is a model that contains a foreground and background recognition structure, such as Faster R-CNN (Region-based Convolutional Neural Network) or YOLO (You Only Look Once). For a deep learning model that does not include a foreground and background recognition structure, such as SSD (Single Shot MultiBox Detector) or RetinaNet, a foreground and background recognition structure may be added to the output layer of the deep learning model, where the original classification structure is used to handle the classification of foreground objects. The foreground and background recognition structure may be a two-class structure for the foreground/background recognition task, for example an RPN (Region Proposal Network).
The target detection model is a trained deep learning model for target detection. Specifically, the target detection model can be obtained by training on a large target detection dataset. Although the categories and forms of the targets in that dataset may differ considerably, the background is far richer than the targets themselves, so the learned recognition of the background has strong generalization and migration capability and can achieve a good migration training effect on the target dataset.
S102, initializing the initial deep learning model by using the background recognition parameters of the target detection model to obtain an initialized initial deep learning model, wherein the initial deep learning model comprises a foreground and background recognition structure.
The background recognition parameters are parameters related to background recognition, and may include, for example, the parameters of each network layer before the output layer of the model, the parameters of the regressor channels of the positioning coordinates in the output layer, and the parameters of the background category recognition channel in the classifier. The parameter values of the background recognition parameters in the target detection model are respectively assigned to the corresponding background recognition parameters in the initial deep learning model; in particular, the corresponding background recognition parameters in the initial deep learning model can be directly overwritten with the background recognition parameters of the target detection model. The other parameters of the initial deep learning model, apart from the background recognition parameters, can be assigned randomly.
And S103, training the initialized initial deep learning model by using a preset sample picture to obtain a trained target deep learning model.
After the initialization of the initial deep learning model is completed, the initialized initial deep learning model can be trained by using a preset sample picture. In this embodiment of the present application, any relevant method for training a deep learning model by using a sample image may be used to train the initialized initial deep learning model, which is not described herein.
In the embodiment of the application, the initial deep learning model is initialized with the background recognition parameters of the trained target detection model. Although the sample pictures of the target detection model and those of the initial deep learning model may differ considerably in the category and form of the targets, the richness of the background far exceeds that of the targets themselves, so background recognition has strong generalization and migration capability, and a good migration training effect can be achieved by using the background recognition parameters of the target detection model. The initialized initial deep learning model has stronger background recognition capability and a robust target feature extraction capability, so training converges easily and the accuracy of target detection of the deep learning model is increased. Meanwhile, the convergence rate of deep learning model training can be increased; in particular, for training scenarios with fewer sample pictures, overfitting can be reduced, the stability of training is increased, and the accuracy of target detection is improved.
In order to ensure smooth migration of the background recognition parameters, the initial deep learning model needs to include a foreground and a background recognition structure, and in one possible implementation, referring to fig. 2a, before initializing the initial deep learning model to be trained by using the background recognition parameters of the target detection model to obtain an initialized initial deep learning model, the method further includes:
s201, judging whether an initial deep learning model to be trained comprises a foreground and background recognition structure.
When the initial deep learning model to be trained does not include a foreground and background recognition structure, S202 is executed; when the initial deep learning model to be trained includes a foreground and background recognition structure, S102 is executed.
S202, when the initial deep learning model does not comprise the foreground and background recognition structures, the foreground and background recognition structures are added to the output layer of the initial deep learning model, wherein the original classification structure is used for classifying foreground targets in the output layer of the initial deep learning model.
When the initial deep learning model is a deep learning model that does not include a foreground and background recognition structure, for example a single-stage model such as SSD or RetinaNet, a foreground and background recognition structure can be added to the output layer of the initial deep learning model; the added structure can be a two-class structure for the foreground/background recognition task. The original classification structure in the output layer of the initial deep learning model is used to handle the classification of foreground objects, so that classification is decomposed into two hierarchical tasks. Correspondingly, when the initial deep learning model is trained with sample data, a foreground/background recognition term needs to be added to the classification loss function, and the hyperparameters of training can be adjusted appropriately to ensure the training effect.
In the embodiment of the application, when the initial deep learning model does not include a foreground and background recognition structure, the foreground and background recognition structure is added to the output layer of the initial deep learning model, which ensures smooth migration of the background recognition parameters of the target detection model and widens the range of models to which the deep learning model training method can be applied.
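For illustration only, the following PyTorch-style sketch shows one way the added foreground and background recognition structure of S202 could sit beside the original classification and positioning structures of a single-stage head; the module name, channel layout and anchor handling are assumptions and not the patent's own implementation.

```python
import torch
import torch.nn as nn

class DetectionHeadWithObjectness(nn.Module):
    """Single-stage detection head extended with a foreground/background branch."""

    def __init__(self, in_channels: int, num_anchors: int, num_classes: int):
        super().__init__()
        # original structures: foreground-category classifier and box regressor
        self.cls_head = nn.Conv2d(in_channels, num_anchors * num_classes, 3, padding=1)
        self.reg_head = nn.Conv2d(in_channels, num_anchors * 4, 3, padding=1)
        # added foreground and background recognition structure (two-class task)
        self.obj_head = nn.Conv2d(in_channels, num_anchors * 2, 3, padding=1)

    def forward(self, feature_map: torch.Tensor) -> dict:
        return {
            "cls": self.cls_head(feature_map),  # foreground category scores
            "reg": self.reg_head(feature_map),  # positioning-coordinate regression
            "obj": self.obj_head(feature_map),  # foreground vs. background scores
        }
```

During training, the output of the "obj" branch would feed the additional foreground/background term of the classification loss mentioned above.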
In a possible implementation manner, referring to fig. 2b, initializing the initial deep learning model by using the background recognition parameter of the object detection model to obtain an initialized initial deep learning model includes:
and S1021, assigning the parameter values of the network layer parameters before the output layer of the target detection model to the corresponding parameters in the initial deep learning model.
Specifically, parameters of all network layers before the output layer of the target detection model can be respectively covered into corresponding parameters of the initial deep learning model.
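As one concrete, purely illustrative realization of S1021, the sketch below copies, by name, every parameter located before the output layer from the trained detector into the initial deep learning model; the assumption that output-layer parameter names start with the prefix "head." is an illustrative one, not the patent's.

```python
import torch

@torch.no_grad()
def copy_pre_output_layer_params(detector: torch.nn.Module,
                                 init_model: torch.nn.Module,
                                 output_layer_prefix: str = "head.") -> None:
    """Cover every parameter before the output layer with the detector's values (S1021)."""
    target_state = init_model.state_dict()
    for name, value in detector.state_dict().items():
        if name.startswith(output_layer_prefix):
            continue  # output-layer channels are handled separately in S1022
        if name in target_state and target_state[name].shape == value.shape:
            target_state[name].copy_(value)  # overwrite the corresponding parameter
    init_model.load_state_dict(target_state)
```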
S1022, assigning the parameter values of the regression channel of the positioning coordinates in the output layer of the target detection model and the background category identification channel in the classifier to the corresponding parameters in the initial deep learning model.
Specifically, the background recognition parameters in the output layer of the target detection model, including the regression channel of the positioning coordinates and the parameters of the background category recognition channel in the classifier, can be respectively covered into the corresponding parameters of the output layer of the initial deep learning model.
Optionally, the assigning the parameter values of the regressor channel of the positioning coordinates in the output layer of the target detection model and the background class identification channel in the classifier to each corresponding parameter in the initial deep learning model includes:
step one, obtaining parameter values of a regression channel of positioning coordinates in an output layer of the target detection model and a background category identification channel in a classifier, and obtaining parameter values of each background related channel.
And step two, according to the parameter values of the background related channels, carrying out parameter coverage on the weight and bias tensors of the parameters of the corresponding channels in the initial deep learning model along the dimensions representing the category and the coordinates.
For the output layer, the numbers of categories of the target detection model and the initial deep learning model may be inconsistent, so the parameters cannot be covered directly. However, the regressor of the positioning coordinates (4 channels) and the background category recognition channel in the classifier (1 channel) are structures shared by the target detection model and the initial deep learning model, so the parameters of these 5 channels can be retained. Specifically, the weight and bias tensors in the output layer of the initial deep learning model can be covered, along the dimensions representing the categories and the coordinates, with the parameters of the corresponding 5 channels in the target detection model; a schematic diagram of the classifier output layer parameters (including weights and biases) is shown in fig. 3a.
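A minimal sketch of the channel-level coverage just described, assuming the output-layer weight and bias tensors index categories and coordinates along their first dimension, one anchor per location, and the background category at index 0; none of these layout details come from the patent.

```python
import torch

@torch.no_grad()
def cover_shared_output_channels(det_cls_w, det_cls_b, new_cls_w, new_cls_b,
                                 det_reg_w, det_reg_b, new_reg_w, new_reg_b,
                                 background_index: int = 0) -> None:
    """Cover the 5 shared channels: 4 coordinate channels + 1 background channel."""
    # classifier: only the background-category channel is common to both models
    new_cls_w[background_index] = det_cls_w[background_index]
    new_cls_b[background_index] = det_cls_b[background_index]
    # regressor: the four positioning-coordinate channels are common to both models
    new_reg_w[:4] = det_reg_w[:4]
    new_reg_b[:4] = det_reg_b[:4]
```

The remaining classifier channels, whose categories differ between the two models, are left to the random initialization of S1023.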
S1023, initializing other parameters which are not assigned in the initial deep learning model, and obtaining an initialized initial deep learning model.
The other parameters in the initial deep learning model that have not yet been assigned can be initialized with any related initialization method, for example by an initialization operation that randomly assigns these remaining parameters.
Referring specifically to fig. 3b, the network includes an image input, a back-end network, a region proposal network and a head network. The four channels of the regressor of the positioning coordinates in the output layer and the one channel of background category recognition in the classifier are covered: the corresponding parameters of these five channels in the initial deep learning model are overwritten with the parameters of the five channels in the target detection model. The parameters of the other channels in the classifier can be initialized by random assignment.
In the embodiment of the application, the initial deep learning model is initialized with the background recognition parameters of the trained target detection model. Although the sample pictures of the target detection model and those of the initial deep learning model may differ considerably in the category and form of the targets, the richness of the background far exceeds that of the targets themselves, so background recognition has strong generalization and migration capability, and a good migration training effect can be achieved by using the background recognition parameters of the target detection model. The initialized initial deep learning model has stronger background recognition capability and a robust target feature extraction capability, so training converges easily and the accuracy of target detection of the deep learning model is increased. Meanwhile, the convergence rate of deep learning model training can be increased; in particular, for training scenarios with fewer sample pictures, overfitting can be reduced, the stability of training is increased, and the accuracy of target detection is improved.
For the training of the initialized initial deep learning model, reference may be made to deep learning model training methods in the related art. In a possible implementation manner, training the initialized initial deep learning model by using a preset sample picture to obtain a trained target deep learning model includes:
step one, inputting a preset sample picture into the initialized initial deep learning model for training, and adjusting learning parameters of the initialized initial deep learning model, wherein the learning parameters comprise the background recognition parameters.
Preset sample pictures are selected from all the preset sample pictures that are prepared in advance and include labeling information, and are input into the initialized initial deep learning model in stages for training; after each stage of training is completed, the learning parameters of the initialized initial deep learning model are adjusted according to the training effect, such as the loss function value or the local loss. The learning parameters include the background recognition parameters, the learning rate, classification parameters and the like.
And step two, finishing training of the initialized initial deep learning model when the preset end condition is met, and obtaining a trained target deep learning model.
The preset end condition may be set according to the actual situation; for example, the preset end condition is judged to be satisfied when the loss function converges, or when the number of training iterations reaches a preset threshold, or when the training effect starts to decline, and so on. When the preset end condition is satisfied, training of the initialized initial deep learning model is ended, and a trained target deep learning model is obtained.
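Purely as an illustration of steps one and two, the sketch below assumes a PyTorch model that returns a scalar detection loss (including the foreground/background term) when given images and labels, and uses the end conditions listed above as examples.

```python
import torch

def train_initialized_model(model, data_loader, optimizer,
                            max_epochs: int = 50, loss_delta: float = 1e-4):
    """Train in stages until a preset end condition is met."""
    previous_loss = float("inf")
    for epoch in range(max_epochs):                   # end condition: iteration budget
        epoch_loss = 0.0
        for images, targets in data_loader:           # preset sample pictures + labels
            loss = model(images, targets)             # detection loss incl. fg/bg term
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()                          # adjust the learning parameters
            epoch_loss += loss.item()
        if abs(previous_loss - epoch_loss) < loss_delta:
            break                                     # end condition: loss has converged
        previous_loss = epoch_loss
    return model
```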
The type of the preset sample picture may be set according to an actual target detection requirement, and in a possible implementation manner, the preset sample picture includes a face picture marked with a face frame, a human body picture marked with a human body frame, or a vehicle picture marked with a vehicle frame.
The preset sample pictures may include positive sample pictures and negative sample pictures. For example, when target detection is required for a human body, the positive sample picture may be a human body picture marked with a human body frame, and the negative sample may be a picture not containing a human body; for example, when target detection is required for a vehicle, the positive sample picture may be a vehicle picture marked with a vehicle frame, and the negative sample may be a picture not containing a vehicle.
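One possible way to organize such labeled sample pictures is sketched below; the field names, file paths and (x1, y1, x2, y2) pixel box format are illustrative assumptions.

```python
# A positive sample: a vehicle picture marked with a vehicle frame.
positive_sample = {
    "image_path": "samples/vehicle_0001.jpg",   # hypothetical path
    "boxes": [[120, 80, 360, 240]],             # labeled vehicle frame
    "labels": ["vehicle"],                      # foreground category
}

# A negative sample: a picture that does not contain the target.
negative_sample = {
    "image_path": "samples/street_0002.jpg",    # hypothetical path
    "boxes": [],                                # no frames on a negative sample
    "labels": [],
}
```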
In one possible embodiment, the method further comprises:
and inputting the image to be detected into the target deep learning model to obtain a target detection result of the image to be detected.
And performing target detection on the image to be detected by using the trained target deep learning model.
In one possible implementation manner, the inputting the image to be detected into the target deep learning model to obtain the target detection result of the image to be detected includes:
step one, cleaning an original algorithm model in a designated image processor, and loading the target deep learning model by using the designated image processor.
The designated image processor is an image processor in a personal computer, a smart phone, a smart camera or a server; specifically, the designated image processor may be a GPU (Graphics Processing Unit), or another device with a computing function such as a CPU or an FPGA (Field-Programmable Gate Array). An instruction can be sent to the designated image processor so that the designated image processor clears the original algorithm model and loads the target deep learning model.
And step two, analyzing the image to be detected by using the target deep learning model through the specified image processor to obtain a target detection result of the image to be detected.
After the designated image processor has loaded the target deep learning model, it can run the target deep learning model to perform target detection on the image to be detected and obtain the target detection result of the image to be detected.
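A minimal inference sketch of the two steps above, assuming the target deep learning model was saved as a complete PyTorch module and the designated image processor is a CUDA GPU; "clearing the original algorithm model" is represented here simply by dropping the previously loaded model and emptying the GPU cache.

```python
import torch

_loaded_model = None  # the algorithm model currently held on the image processor

def load_target_model(model_path: str, device: str = "cuda:0"):
    """Clear the original algorithm model and load the target deep learning model."""
    global _loaded_model
    if _loaded_model is not None:
        _loaded_model = None          # drop the original algorithm model
        torch.cuda.empty_cache()      # release its memory on the designated GPU
    _loaded_model = torch.load(model_path, map_location=device).eval()
    return _loaded_model

def detect(image: torch.Tensor, device: str = "cuda:0"):
    """Analyze the image to be detected and return the target detection result."""
    with torch.no_grad():
        return _loaded_model(image.to(device))
```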
The deep learning model training method can be applied to scenarios such as cloud platforms. Individual users or enterprise users of cloud platform services can usually provide only a small number of training samples; with this method, the model to be trained can converge quickly, overfitting is reduced, and the stability of training is improved. The model obtained by the deep learning model training method can be used for target detection and can improve the accuracy of target detection.
The embodiment of the application also provides a training device for the deep learning model, referring to fig. 4, the device includes:
the model acquisition module 401 is configured to acquire a trained target detection model, where the target detection model is used for performing target detection on an image, and the target detection model includes a foreground and background recognition structure;
The model initialization module 402 is configured to initialize an initial deep learning model to be trained by using the background recognition parameters of the target detection model to obtain an initialized initial deep learning model, where the initial deep learning model includes a foreground and a background recognition structure;
the sample training module 403 is configured to train the initialized initial deep learning model by using a preset sample picture, so as to obtain a trained target deep learning model.
Optionally, the apparatus further includes:
the recognition structure judging module is used for judging whether the initial deep learning model to be trained comprises a foreground and background recognition structure or not;
and the recognition structure adding module is used for adding the foreground and background recognition structures to the output layer of the initial deep learning model when the initial deep learning model does not comprise the foreground and background recognition structures, wherein the original classification structure is used for classifying foreground targets in the output layer of the initial deep learning model.
Optionally, the sample training module 403 is specifically configured to: inputting a preset sample picture into the initialized initial deep learning model for training, and adjusting learning parameters of the initialized initial deep learning model, wherein the learning parameters comprise the background recognition parameters; and when the preset ending condition is met, ending the training of the initialized initial deep learning model to obtain a trained target deep learning model.
Optionally, the preset sample picture includes a face picture marked with a face frame, or a human picture marked with a human frame, or a vehicle picture marked with a vehicle frame.
Optionally, the model initialization module 402 includes:
a front-end parameter assignment sub-module, configured to assign parameter values of parameters of each network layer before an output layer of the target detection model to corresponding parameters in the initial deep learning model;
the back-end parameter assignment sub-module is used for assigning parameter values of a regressor channel of a positioning coordinate in an output layer of the target detection model and a background category identification channel in a classifier to each corresponding parameter in the initial deep learning model;
and the other parameter assignment sub-module is used for initializing other parameters which are not assigned in the initial deep learning model and obtaining the initialized initial deep learning model.
Optionally, the above-mentioned back-end parameter assignment submodule is specifically configured to:
obtaining parameter values of a regression channel of a positioning coordinate in an output layer of the target detection model and a background category identification channel in a classifier, and obtaining parameter values of each background related channel;
and according to the parameter values of the background related channels, carrying out parameter coverage on the weight and bias tensors of the parameters of the corresponding channels in the initial deep learning model along the dimensions representing the category and the coordinates.
Optionally, the apparatus further includes: and the image recognition module is used for inputting the image to be detected into the target deep learning model to obtain a target detection result of the image to be detected.
Optionally, the image recognition module is specifically configured to: clearing the original algorithm model from a designated image processor, and loading the target deep learning model by using the designated image processor; and analyzing the image to be detected by using the target deep learning model through the designated image processor to obtain a target detection result of the image to be detected.
In the embodiment of the application, the initial deep learning model is initialized with the background recognition parameters of the trained target detection model. Although the sample pictures of the target detection model and those of the initial deep learning model may differ considerably in the category and form of the targets, the richness of the background far exceeds that of the targets themselves, so background recognition has strong generalization and migration capability, and a good migration training effect can be achieved by using the background recognition parameters of the target detection model. The initialized initial deep learning model has stronger background recognition capability and a robust target feature extraction capability, so training converges easily and the accuracy of target detection of the deep learning model is increased. Meanwhile, the convergence rate of deep learning model training can be increased; in particular, for training scenarios with fewer sample pictures, overfitting can be reduced, the stability of training is increased, and the accuracy of target detection is improved.
The embodiment of the application also provides electronic equipment, which comprises: a processor and a memory;
the memory is used for storing a computer program;
the processor is configured to execute the computer program stored in the memory, and implement the following steps:
obtaining a trained target detection model, wherein the target detection model is used for detecting targets of images and comprises a foreground and background identification structure;
initializing an initial deep learning model to be trained by using the background recognition parameters of the target detection model to obtain an initialized initial deep learning model, wherein the initial deep learning model comprises a foreground and background recognition structure;
and training the initialized initial deep learning model by using a preset sample picture to obtain a trained target deep learning model.
Optionally, the above processor is further configured to:
judging whether an initial deep learning model to be trained comprises a foreground and background identification structure;
and when the initial deep learning model does not comprise the foreground and background recognition structures, adding the foreground and background recognition structures to the output layer of the initial deep learning model, wherein the original classification structure is used for classifying foreground targets in the output layer of the initial deep learning model.
Optionally, training the initialized initial deep learning model by using a preset sample picture to obtain a trained target deep learning model, including:
inputting a preset sample picture into the initialized initial deep learning model for training, and adjusting learning parameters of the initialized initial deep learning model, wherein the learning parameters comprise the background recognition parameters;
and when the preset ending condition is met, ending the training of the initialized initial deep learning model to obtain a trained target deep learning model.
Optionally, the preset sample picture includes a face picture marked with a face frame, or a human picture marked with a human frame, or a vehicle picture marked with a vehicle frame.
Optionally, initializing the initial deep learning model by using the background recognition parameter of the target detection model to obtain an initialized initial deep learning model, including:
assigning the parameter values of the parameters of each network layer before the output layer of the target detection model to each corresponding parameter in the initial deep learning model;
assigning the parameter values of the regressor channel of the positioning coordinates in the output layer of the target detection model and of the background category identification channel in the classifier to the corresponding parameters in the initial deep learning model;
initializing the other parameters that have not been assigned in the initial deep learning model, and obtaining the initialized initial deep learning model.
Optionally, the assigning the parameter values of the regressor channel of the positioning coordinates in the output layer of the target detection model and the background class identification channel in the classifier to each corresponding parameter in the initial deep learning model includes:
obtaining the parameter values of the regressor channel of the positioning coordinates in the output layer of the target detection model and of the background category identification channel in the classifier, so as to obtain the parameter values of each background related channel;
and according to the parameter values of the background related channels, covering the weight and bias tensors of the parameters of the corresponding channels in the initial deep learning model along the dimension representing the category and the coordinate.
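A minimal, non-limiting sketch of this channel-wise parameter coverage, assuming 1x1 convolutional output heads in which the output-channel dimension represents the category or the coordinate and the background category occupies channel 0 (assumptions made only for the example), is given below:

    import torch
    import torch.nn as nn

    def cover_background_channels(trained_cls, trained_reg, init_cls, init_reg):
        """Cover the background related channels of the initial model's output
        layer with the trained detector's parameter values (channel 0 of the
        classifier is assumed to be the background category)."""
        with torch.no_grad():
            # regressor channels of the positioning coordinates: copy whole tensors
            init_reg.weight.copy_(trained_reg.weight)
            init_reg.bias.copy_(trained_reg.bias)
            # background category identification channel: cover the weight and bias
            # tensors along the output-channel (category) dimension
            init_cls.weight[0].copy_(trained_cls.weight[0])
            init_cls.bias[0].copy_(trained_cls.bias[0])
            # the remaining (foreground) channels keep their own initialization

    # Trained detector head: 5 foreground classes + background; initial model: 1 + background.
    trained_cls, trained_reg = nn.Conv2d(16, 6, 1), nn.Conv2d(16, 4, 1)
    init_cls, init_reg = nn.Conv2d(16, 2, 1), nn.Conv2d(16, 4, 1)
    cover_background_channels(trained_cls, trained_reg, init_cls, init_reg)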
Optionally, the above processor is further configured to: and inputting the image to be detected into the target deep learning model to obtain a target detection result of the image to be detected.
Optionally, the inputting the image to be detected into the target deep learning model to obtain a target detection result of the image to be detected includes:
clearing an original algorithm model from a designated image processor, and loading the target deep learning model by using the designated image processor;
and analyzing the image to be detected by using the target deep learning model through the designated image processor to obtain a target detection result of the image to be detected.
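For illustration only, and assuming a CUDA-capable image processor addressed through PyTorch (the device index, file handling, and toy models are assumptions of this sketch), this deployment step may look as follows:

    import torch
    import torch.nn as nn

    device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")

    # Stand-in for the original algorithm model currently loaded on the designated
    # image processor: release it and return its cached memory.
    original_model = nn.Conv2d(3, 8, 3).to(device)
    del original_model
    if device.type == "cuda":
        torch.cuda.empty_cache()

    # Stand-in for loading the trained target deep learning model and analyzing an
    # image to be detected with it on the same device.
    target_model = nn.Conv2d(3, 8, 3).to(device).eval()
    image_to_detect = torch.randn(1, 3, 64, 64, device=device)
    with torch.no_grad():
        detection_result = target_model(image_to_detect)
    print(detection_result.shape)                     # toy "target detection result"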
Optionally, the processor is configured to execute the computer program stored in the memory, so as to further implement any one of the deep learning model training methods described above.
Optionally, referring to fig. 5, the electronic device of the embodiment of the present application further includes a communication interface 502 and a communication bus 504, where the processor 501, the communication interface 502, and the memory 503 complete communication with each other through the communication bus 504.
The communication bus of the above electronic device may be a PCI (Peripheral Component Interconnect) bus, an EISA (Extended Industry Standard Architecture) bus, or the like. The communication bus may be divided into an address bus, a data bus, a control bus, and the like. For ease of illustration, only one bold line is shown in the figure, but this does not mean that there is only one bus or one type of bus.
The communication interface is used for communication between the electronic device and other devices.
The memory may include RAM (Random Access Memory) or NVM (Non-Volatile Memory), for example, at least one disk memory. Optionally, the memory may also be at least one storage device located away from the aforementioned processor.
The processor may be a general-purpose processor, including a CPU (Central Processing Unit), an NP (Network Processor), and the like; it may also be a DSP (Digital Signal Processor), an ASIC (Application Specific Integrated Circuit), an FPGA (Field-Programmable Gate Array) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.
The embodiment of the application also provides a computer readable storage medium, wherein the computer readable storage medium stores a computer program, and the computer program realizes the following steps when being executed by a processor:
obtaining a trained target detection model, wherein the target detection model is used for detecting targets of images and comprises a foreground and background identification structure;
initializing an initial deep learning model to be trained by using the background recognition parameters of the target detection model to obtain an initialized initial deep learning model, wherein the initial deep learning model comprises a foreground and background recognition structure;
and training the initialized initial deep learning model by using a preset sample picture to obtain a trained target deep learning model.
Optionally, when the computer program is executed by the processor, any one of the deep learning model training methods described above can also be implemented.
It should be noted that, in this document, the technical features of the alternatives may be combined with one another to form solutions, as long as they are not contradictory, and all such solutions fall within the scope of the disclosure of the present application. Relational terms such as first and second are used only to distinguish one entity or action from another, and do not necessarily require or imply any actual relationship or order between such entities or actions. Moreover, the terms "comprises", "comprising", or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a …" does not exclude the presence of other identical elements in the process, method, article, or apparatus that comprises the element.
In this specification, the embodiments are described in a related manner; identical or similar parts of the embodiments may be referred to one another, and each embodiment focuses on its differences from the other embodiments. In particular, for the embodiments of the apparatus, the electronic device, and the storage medium, the description is relatively brief because they are substantially similar to the method embodiments; for relevant details, reference may be made to the description of the method embodiments.
The foregoing description is only of the preferred embodiments of the present application and is not intended to limit the scope of the present application. Any modifications, equivalent substitutions, improvements, etc. that are within the spirit and principles of the present application are intended to be included within the scope of the present application.

Claims (12)

1. A method for training a deep learning model, the method comprising:
obtaining a trained target detection model, wherein the target detection model is used for detecting targets of images and comprises a foreground and background identification structure;
initializing an initial deep learning model to be trained by utilizing the background recognition parameters of the target detection model to obtain an initialized initial deep learning model, wherein the initial deep learning model comprises a foreground and background recognition structure;
training the initialized initial deep learning model by using a preset sample picture to obtain a trained target deep learning model;
initializing an initial deep learning model to be trained by using the background identification parameters of the target detection model to obtain an initialized initial deep learning model, wherein the method comprises the following steps:
assigning the parameter values of the parameters of each network layer before the output layer of the target detection model to the corresponding parameters in the initial deep learning model; obtaining the parameter values of the regressor channel of the positioning coordinates in the output layer of the target detection model and of the background category identification channel in the classifier, so as to obtain the parameter values of each background related channel; according to the parameter values of the background related channels, covering the weight and bias tensors of the parameters of the corresponding channels in the initial deep learning model along the dimension representing the category and the coordinate; and initializing the other parameters which have not yet been assigned in the initial deep learning model, to obtain the initialized initial deep learning model.
2. The method of claim 1, wherein before initializing the initial deep learning model to be trained using the background recognition parameters of the target detection model to obtain an initialized initial deep learning model, the method further comprises:
Judging whether an initial deep learning model to be trained comprises a foreground and background identification structure;
and when the initial deep learning model does not comprise the foreground and background recognition structure, adding the foreground and background recognition structure to the output layer of the initial deep learning model, wherein the original classification structure in the output layer of the initial deep learning model is used for classifying foreground targets.
3. The method of claim 1, wherein training the initialized initial deep learning model with the preset sample picture to obtain a trained target deep learning model comprises:
inputting a preset sample picture into the initialized initial deep learning model for training, and adjusting learning parameters of the initialized initial deep learning model, wherein the learning parameters comprise the background recognition parameters;
and when the preset ending condition is met, ending the training of the initialized initial deep learning model to obtain a trained target deep learning model.
4. A method according to any one of claims 1-3, wherein the pre-set sample picture comprises a face picture marked with a face frame, or a body picture marked with a body frame, or a vehicle picture marked with a vehicle frame.
5. The method according to claim 1, wherein the method further comprises:
and inputting the image to be detected into the target deep learning model to obtain a target detection result of the image to be detected.
6. The method according to claim 5, wherein inputting the image to be detected into the target deep learning model to obtain a target detection result of the image to be detected comprises:
clearing an original algorithm model from a designated image processor, and loading the target deep learning model by using the designated image processor;
and analyzing the image to be detected by using the target deep learning model through the designated image processor to obtain a target detection result of the image to be detected.
7. A deep learning model training apparatus, the apparatus comprising:
the model acquisition module is used for acquiring a trained target detection model, wherein the target detection model is used for carrying out target detection on an image and comprises a foreground and background identification structure;
the model initialization module is used for initializing an initial deep learning model to be trained by utilizing the background recognition parameters of the target detection model to obtain an initialized initial deep learning model, wherein the initial deep learning model comprises a foreground and background recognition structure;
The sample training module is used for training the initialized initial deep learning model by using a preset sample picture to obtain a trained target deep learning model;
the model initialization module comprises:
the front-end parameter assignment sub-module is used for assigning parameter values of parameters of each network layer before an output layer of the target detection model to each corresponding parameter in the initial deep learning model;
the back-end parameter assignment sub-module is used for assigning parameter values of a regressor channel of a positioning coordinate in an output layer of the target detection model and a background category identification channel in a classifier to each corresponding parameter in the initial deep learning model;
the other parameter assignment sub-module is used for initializing other parameters which are not yet assigned in the initial deep learning model to obtain an initialized initial deep learning model;
the back-end parameter assignment submodule is specifically configured to:
obtaining parameter values of a regressor channel of a positioning coordinate in an output layer of the target detection model and a background category identification channel in a classifier, and obtaining parameter values of each background related channel;
and according to the parameter values of the background related channels, covering the weight and bias tensors of the parameters of the corresponding channels in the initial deep learning model along the dimension representing the category and the coordinate.
8. The apparatus of claim 7, wherein the apparatus further comprises:
the recognition structure judging module is used for judging whether the initial deep learning model to be trained comprises a foreground and background recognition structure or not;
and the recognition structure adding module is used for adding the foreground and background recognition structures to the output layer of the initial deep learning model when the initial deep learning model does not comprise the foreground and background recognition structures, wherein the original classification structure is used for classifying foreground targets in the output layer of the initial deep learning model.
9. The apparatus of claim 7, wherein the sample training module is configured to:
inputting a preset sample picture into the initialized initial deep learning model for training, and adjusting learning parameters of the initialized initial deep learning model, wherein the learning parameters comprise the background recognition parameters;
and when the preset ending condition is met, ending the training of the initialized initial deep learning model to obtain a trained target deep learning model.
10. The apparatus according to any one of claims 7-9, wherein the preset sample picture comprises a face picture labeled with a face frame, or a body picture labeled with a body frame, or a vehicle picture labeled with a vehicle frame.
11. The apparatus of claim 7, wherein the apparatus further comprises:
and the image recognition module is used for inputting the image to be detected into the trained target deep learning model to obtain a target detection result of the image to be detected.
12. The apparatus according to claim 11, wherein the image recognition module is specifically configured to:
clearing an original algorithm model from a designated image processor, and loading the target deep learning model by using the designated image processor;
and analyzing the image to be detected by using the target deep learning model through the designated image processor to obtain a target detection result of the image to be detected.
CN201910943965.7A 2019-09-30 2019-09-30 Deep learning model training method and device and electronic equipment Active CN112580408B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910943965.7A CN112580408B (en) 2019-09-30 2019-09-30 Deep learning model training method and device and electronic equipment

Publications (2)

Publication Number Publication Date
CN112580408A CN112580408A (en) 2021-03-30
CN112580408B (en) 2024-03-12

Family

ID=75116578

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910943965.7A Active CN112580408B (en) 2019-09-30 2019-09-30 Deep learning model training method and device and electronic equipment

Country Status (1)

Country Link
CN (1) CN112580408B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113722488A (en) * 2021-09-01 2021-11-30 北京市律典通科技有限公司 Civil case information recognition training method and device and case extraction method

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102368301A (en) * 2011-09-07 2012-03-07 常州蓝城信息科技有限公司 Moving human body detection and tracking system based on video
DE202016004627U1 (en) * 2016-07-27 2016-09-23 Google Inc. Training a neural value network
CN106446854A (en) * 2016-10-06 2017-02-22 西北工业大学 High-resolution optical remote sensing image target detection method based on rotation invariant HOG feature
CN107506775A (en) * 2016-06-14 2017-12-22 北京陌上花科技有限公司 model training method and device
CN108304873A (en) * 2018-01-30 2018-07-20 深圳市国脉畅行科技股份有限公司 Object detection method based on high-resolution optical satellite remote-sensing image and its system
CN108446617A (en) * 2018-03-09 2018-08-24 华南理工大学 The human face quick detection method of anti-side face interference
CN109284704A (en) * 2018-09-07 2019-01-29 中国电子科技集团公司第三十八研究所 Complex background SAR vehicle target detection method based on CNN
CN110070072A (en) * 2019-05-05 2019-07-30 厦门美图之家科技有限公司 A method of generating object detection model
CN110097049A (en) * 2019-04-03 2019-08-06 中国科学院计算技术研究所 A kind of natural scene Method for text detection and system

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP3244343A1 (en) * 2016-05-12 2017-11-15 Bayer Cropscience AG Recognition of weed in a natural environment
CN107704857B (en) * 2017-09-25 2020-07-24 北京邮电大学 End-to-end lightweight license plate recognition method and device

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
Detection of movement-related cortical potentials associated with emergency and non-emergency tasks; Shengtao Li et al.; IEEE; entire document *
Improving SAR Automatic Target Recognition Models With Transfer Learning From Simulated Data; David Malmgren-Hansen et al.; IEEE; entire document *
An easily initialized convolutional-neural-network-like visual tracking algorithm (一种易于初始化的类卷积神经网络视觉跟踪算法); 李寰宇 et al.; 《电子与信息学报》; entire document *

Also Published As

Publication number Publication date
CN112580408A (en) 2021-03-30


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant