WO2022156331A1 - Knowledge distillation and image processing method and apparatus, electronic device, and storage medium

Knowledge distillation and image processing method and apparatus, electronic device, and storage medium

Info

Publication number
WO2022156331A1
Authority
WO
WIPO (PCT)
Prior art keywords
feature
output
output feature
feature map
model
Prior art date
Application number
PCT/CN2021/130895
Other languages
English (en)
French (fr)
Inventor
高梦雅
王宇杰
李全全
Original Assignee
北京市商汤科技开发有限公司
Priority date
Filing date
Publication date
Application filed by 北京市商汤科技开发有限公司
Publication of WO2022156331A1

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 - Arrangements for image or video recognition or understanding
    • G06V 10/70 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V 10/74 - Image or video pattern matching; Proximity measures in feature spaces
    • G06V 10/75 - Organisation of the matching processes, e.g. simultaneous or sequential comparisons of image or video features; Coarse-fine approaches, e.g. multi-scale approaches; Using context analysis; Selection of dictionaries
    • G06V 10/757 - Matching configurations of points or features
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 - Pattern recognition
    • G06F 18/20 - Analysing
    • G06F 18/21 - Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F 18/214 - Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 - Computing arrangements based on biological models
    • G06N 3/02 - Neural networks
    • G06N 3/08 - Learning methods
    • G06N 3/084 - Backpropagation, e.g. using gradient descent

Definitions

  • The present application relates to computer technology, and in particular to knowledge distillation and image processing methods, apparatuses, electronic devices, and storage media.
  • Neural network models have developed rapidly in recent years.
  • For example, in image processing tasks, deep convolutional neural network models such as RCNN (Region Convolutional Neural Networks) and FAST-RCNN (Fast Region Convolutional Neural Networks) can be used to implement operations such as image classification, object detection, and semantic segmentation.
  • However, as tasks become more complex and the requirements on processing results grow, the structure of a neural network model becomes more and more complex, and the space it occupies becomes larger and larger. This may consume substantial computing resources and storage space, and may even make the neural network model unusable on devices such as mobile phones.
  • A model compression method is therefore needed that lets a student model with a simple structure learn from a teacher model with a complex structure, making the results of the student model as close to the teacher model's as possible, so as to complete model compression.
  • The present application provides a knowledge distillation method. The method includes: processing a training sample set with a student model and a teacher model respectively, to obtain a first output feature and a second output feature; based on the first output feature and the second output feature, determining the feature map pairs matched between the feature maps included in the first output feature and the feature maps included in the second output feature, and, based on the feature map pairs, determining the correspondence between the channels in which the two feature maps included in each feature map pair are respectively located; and training the student model. In each round of training: the student model and the teacher model are used respectively to process sample data, to obtain a third output feature and a fourth output feature; the error between the third output feature and the real feature corresponding to the sample data is determined; a feature alignment operation is performed on the third output feature or the fourth output feature using the correspondence, so that, between the feature maps included in the third output feature and the feature maps included in the fourth output feature, feature maps on the same channel index are matched; the gap between the aligned third output feature and the fourth output feature is determined; and the model parameters of the student model are updated based on the error and the gap.
  • The present application also provides an image processing method. The method includes: acquiring a target image; and performing image processing on the target image using the student model trained by the knowledge distillation method of any of the foregoing embodiments, to obtain an image processing result.
  • The present application also provides a knowledge distillation apparatus. The apparatus includes: a sample processing module, configured to process a training sample set with a student model and a teacher model respectively, to obtain a first output feature and a second output feature; a correspondence determining module, configured to determine, based on the first output feature and the second output feature, the feature map pairs matched between the feature maps included in the first output feature and the feature maps included in the second output feature, and to determine, based on the feature map pairs, the correspondence between the channels in which the two feature maps included in each feature map pair are respectively located; and a training module, configured to train the student model. In each round of training: the student model and the teacher model are used respectively to process sample data, to obtain a third output feature and a fourth output feature; the error between the third output feature and the real feature corresponding to the sample data is determined; a feature alignment operation is performed on the third output feature or the fourth output feature using the correspondence, so that, between the feature maps included in the third output feature and the feature maps included in the fourth output feature, feature maps on the same channel index are matched; the gap between the aligned third output feature and the fourth output feature is determined; and the model parameters of the student model are updated based on the error and the gap.
  • The present application also provides an image processing apparatus. The apparatus includes: an acquisition module, configured to acquire a target image; and an image processing module, configured to perform image processing on the target image using the student model trained by the knowledge distillation method of any of the foregoing embodiments, to obtain an image processing result.
  • The present application also provides an electronic device. The device includes: a processor; and a memory for storing instructions executable by the processor; wherein the processor is configured to invoke the executable instructions stored in the memory to implement the aforementioned knowledge distillation method or image processing method.
  • The present application also provides a computer-readable storage medium. The storage medium stores a computer program, and the computer program is used to execute the aforementioned knowledge distillation method or image processing method.
  • The present application also provides a computer program product, comprising a computer program stored in a memory; the computer program instructions, when executed by a processor, implement the aforementioned knowledge distillation method or image processing method.
  • FIG. 1 is a schematic flowchart of a model training method shown in this application;
  • FIG. 2 is a method flowchart of a model training method shown in this application;
  • FIG. 3 is a schematic flowchart of model training shown in this application;
  • FIG. 4 is a schematic diagram of a transformation matrix shown in this application;
  • FIG. 5 is a schematic flowchart of a feature alignment method shown in this application;
  • FIG. 6 is a schematic structural diagram of a knowledge distillation apparatus shown in this application;
  • FIG. 7 is a schematic diagram of a hardware structure of an electronic device shown in this application.
  • FIG. 1 is a schematic flowchart of a model training method shown in this application. It should be noted that FIG. 1 is only a schematic description of the model training flow, which may be fine-tuned in practical applications.
  • A training sample set may typically be a collection of images annotated with the classification types of the objects that appear in them.
  • The original images can usually be labeled with ground truth manually or with machine-assisted annotation.
  • For example, image annotation software can be used to annotate the classification type of each object appearing in an original image (for example, whether the object is a person, a car, or a tree), so as to obtain a number of training samples.
  • When encoding the labels, one-hot encoding or other methods may be used; the present application does not limit the specific encoding method.
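For illustration only (the application does not limit the encoding method), a one-hot encoding of hypothetical class labels could look like this in PyTorch; the class indices are assumptions:

```python
import torch
import torch.nn.functional as F

# Hypothetical label indices for three training samples:
# 0 = person, 1 = car, 2 = tree
labels = torch.tensor([0, 2, 1])

# One-hot encode the annotated classification types.
one_hot = F.one_hot(labels, num_classes=3).float()
# tensor([[1., 0., 0.],
#         [0., 0., 1.],
#         [0., 1., 0.]])
```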
  • S104 may be executed first: the same training sample is input into the student model and the teacher model for forward propagation, to obtain the output features of the student model and the output features of the teacher model.
  • The model complexity of the student model may be smaller than the model complexity of the teacher model (also referred to as the second model).
  • The student model and the teacher model can be models of any type.
  • The purpose of model training is to enable the student model to learn from the teacher model, so that the output of the student model approaches that of the teacher model, thereby achieving model compression.
  • The teacher model can be a pre-trained model. It can be understood that the training sample set used by the teacher model in the pre-training stage may be the same sample set as, or a different sample set from, the one constructed in step S102, which is not limited here.
  • S106 may be executed to determine the gap between the output features of the student model and the output features of the teacher model, based on those output features.
  • The gap can be obtained using a predetermined gap function.
  • The structure of the gap function is not particularly limited.
  • The gap function can be determined with reference to a commonly used knowledge distillation function, that is, a loss function used in knowledge distillation algorithms.
  • The loss function can be a cross-entropy loss function, an exponential loss function, etc.
  • S108 may also be executed to determine the error based on the output features of the student model.
  • A preset loss function can be used to determine the error between the output feature of the student model and the real feature corresponding to the training sample.
  • The structure of the loss function is not particularly limited.
  • The loss function can be determined with reference to a commonly used knowledge distillation function.
  • S110 may then be performed: based on the result of a weighted summation of the error and the gap, the model parameters of the student model are updated, completing one round of model training.
  • For example, a gradient descent method may be used to determine the loss based on the weighted sum of the error and the gap.
  • The student model is then back-propagated according to the loss, thereby updating its model parameters.
  • The backpropagation may use Stochastic Gradient Descent (SGD), Batch Gradient Descent (BGD), or Mini-Batch Gradient Descent (MBGD), which is not particularly limited here; a sketch of such a training round follows.
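A minimal sketch of one training round, assuming a cross-entropy loss for the error, a mean-squared-error gap function, and hypothetical weights `alpha` and `beta`; none of these specific choices are prescribed by the application:

```python
import torch
import torch.nn.functional as F

def train_step(student, teacher, optimizer, images, labels, alpha=1.0, beta=0.5):
    # Forward-propagate the same training sample through both models.
    student_out = student(images)
    with torch.no_grad():                  # the teacher is pre-trained and frozen
        teacher_out = teacher(images)

    # Error between the student's output feature and the ground truth.
    error = F.cross_entropy(student_out, labels)
    # Gap between the student's and the teacher's output features.
    gap = F.mse_loss(student_out, teacher_out)

    # Weighted summation of the error and the gap, then backpropagation (e.g., SGD).
    loss = alpha * error + beta * gap
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```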
  • Steps S104-S110 may be performed repeatedly until the model converges.
  • The above embodiments illustrate methods for achieving model compression through model training.
  • However, the above methods still have problems such as slow model convergence, and it is difficult for the output features of the student model to get close enough to those of the teacher model.
  • To this end, the present application proposes a knowledge distillation method.
  • When determining the gap between the output features of the student model and the teacher model, the method first performs a feature alignment operation, so that, between the feature maps included in the student model's output feature and the feature maps included in the teacher model's output feature, feature maps on the same channel index are matched with each other and thus have the same or similar interpretation meaning. Therefore, when determining the gap, the error caused by mismatched feature maps can be reduced, making the determined gap more real and accurate, thereby reducing the difficulty of model convergence, making the output features of the student model approach those of the teacher model more easily, and improving the efficiency of model training.
  • FIG. 2 is a method flowchart of a model training method shown in this application.
  • the model training method shown in FIG. 2 can be applied to an electronic device.
  • The above-mentioned electronic device may execute the model training method by running a software system corresponding to the method.
  • The above electronic devices may be notebook computers, desktop computers, servers, mobile phones, tablet (PAD) terminals, etc., which are not particularly limited in this application.
  • The model training method can be executed by a terminal device or a server device alone, or by a terminal device and a server device in cooperation.
  • For example, the model training method can be integrated into a client. After receiving a model training request, a terminal device equipped with the client can provide computing power through its own hardware environment to execute the method.
  • Alternatively, the model training method can be integrated into a system platform. After receiving a model training request, a server device equipped with the system platform can provide computing power through its own hardware environment to execute the method.
  • the model training method may be divided into two tasks: constructing a training sample set and performing model training based on the training sample set.
  • the construction of the training sample set can be integrated in the client and carried on the terminal device.
  • the model training task can be integrated on the server and carried on the server device.
  • the terminal device can initiate a model training request to the server device after constructing the training sample set.
  • the server device may train the model based on the training sample set in response to the request.
  • For convenience of description, the following takes an electronic device (hereinafter referred to as the device) as the execution subject as an example.
  • the model training method may include steps S202 to S206.
  • the student model and the teacher model can be any type of models.
  • For example, the student model and the teacher model may be image processing models such as RCNN and FAST-RCNN.
  • For example, the student model and the teacher model may be MASK-RCNN (Mask Region-based Convolutional Neural Network) models.
  • the first output feature is an output feature obtained by processing the image data set through the student model.
  • the second output feature is an output feature obtained by processing the image data set by the teacher model.
  • the first output feature and the second output feature may include multi-channel feature maps.
  • the feature map of each channel can represent the feature meaning of the image from an interpretation dimension.
  • the feature maps of some channels can represent the texture features of the image.
  • the feature maps of some channels can represent the contour features of the image.
  • When performing S202, on the one hand, the student model may be used to perform image processing on some of the images in the image data set, to obtain the student-model output features corresponding to these partial images respectively. Then, averaging processing such as weighted summation is performed on the values at the same positions in these output features, to obtain the first output feature.
  • On the other hand, the teacher model can be used to perform image processing on the same partial images, to obtain the teacher-model output features corresponding to them respectively. Then, processing such as weighted summation is performed on the values at the same positions in these output features, to obtain the second output feature.
  • In other embodiments, the first output feature can also be obtained by, for example, selecting the maximum or minimum value from the student-model output features, which is not described in detail here.
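As a rough sketch of this step, assuming each model maps a batch of images to feature tensors of shape (N, C, H, W) and assuming equal weights in the weighted summation (i.e., a plain average); the function name and shapes are illustrative:

```python
import torch

@torch.no_grad()
def mean_output_feature(model, images):
    """Average a model's output features over a set of images.

    images: tensor of shape (N, 3, H, W); result has shape (C, H', W').
    """
    feats = model(images)          # (N, C, H', W') per-image output features
    return feats.mean(dim=0)       # position-wise average over the N images

# first_output  = mean_output_feature(student, partial_images)   # student side
# second_output = mean_output_feature(teacher, partial_images)   # teacher side
```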
  • Step S204 may then be executed to determine, based on the first output feature and the second output feature, the feature map pairs matched between the feature maps included in the first output feature and the feature maps included in the second output feature, and, based on the feature map pairs, to determine the correspondence between the channels in which the two feature maps included in each feature map pair are respectively located.
  • A feature map pair refers to a matched pair of feature maps. For example, if feature map A included in the first output feature matches feature map B included in the second output feature, then feature map A and feature map B constitute a feature map pair.
  • When determining the feature map pairs, for each feature map included in the first output feature, that feature map is taken as the current feature map and vectorized to obtain a first vector, and each to-be-matched feature map included in the second output feature is vectorized to obtain a second vector. The similarity score between the first vector and each second vector is then calculated.
  • The feature map corresponding to the second vector with the highest similarity score with respect to the first vector, together with the current feature map corresponding to the first vector, is determined as a feature map pair. It should be noted that methods such as Euclidean distance or cosine distance may be used to calculate the similarity, which is not limited here.
  • Similarly, when determining the feature map pairs, each feature map of each channel included in the second output feature may be taken as the current feature map, and a method similar to the foregoing steps performed; the specific process is not described in detail here.
  • The correspondence refers to the relationship between the channel of the first output feature and the channel of the second output feature in which the two feature maps of a feature map pair are respectively located. For example, if feature map A on the 5th channel of the first output feature matches feature map B on the 3rd channel of the second output feature, the correspondence may be recorded as 1-5 corresponding to 2-3, where 1-5 denotes the 5th channel of the first output feature and 2-3 denotes the 3rd channel of the second output feature. It can be understood that other manners may also be used to maintain the correspondence.
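A minimal sketch of the similarity scoring described above, assuming cosine similarity between vectorized (flattened) feature maps; shapes and names are illustrative:

```python
import torch
import torch.nn.functional as F

def channel_similarity(first_output, second_output):
    """Cosine similarity between every student channel and every teacher channel.

    first_output:  (C_s, H, W) feature maps of the first (student) output feature
    second_output: (C_t, H, W) feature maps of the second (teacher) output feature
    returns:       (C_s, C_t) matrix of similarity scores
    """
    a = F.normalize(first_output.flatten(1), dim=1)   # vectorize each feature map
    b = F.normalize(second_output.flatten(1), dim=1)
    return a @ b.T                                    # pairwise cosine similarities
```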
  • S206 can then be executed to train the student model. In each round of training: the student model and the teacher model are respectively used to perform image processing on a sample image, to obtain a third output feature and a fourth output feature; the error between the third output feature and the real feature corresponding to the sample image is determined; a feature alignment operation is performed on the third output feature or the fourth output feature using the correspondence, so that, between the feature maps included in the third output feature and the feature maps included in the fourth output feature, feature maps on the same channel index are matched; the gap between the aligned third output feature and the fourth output feature is determined; and the model parameters of the student model are updated based on the error and the gap.
  • The real feature is the feature used to determine the error.
  • The real features can be obtained from a pre-trained student model.
  • For example, the student model may be an image classification model; in this case, an initial student model can be pre-trained using the training samples to obtain the student model.
  • The sample images annotated with their true classifications can then be input into the pre-trained initial student model (i.e., the student model) for forward propagation, and the resulting output features can be used as the real features of the sample images.
  • Alternatively, the real features may be determined using known images preceding the sample image, e.g., features derived through algorithms based on spatial geometric constraints.
  • For example, the sample image may be an image in an image sequence. It can be understood that the sample images in an image sequence are usually consecutive images, and the objects appearing in consecutive images satisfy spatial geometric constraints. Therefore, the real features of the sample image can be deduced from the images preceding it.
  • the error may be the error between the third output feature and the real feature corresponding to the sample image.
  • the error may be determined using a pre-built loss function (eg, a cross-entropy loss function).
  • The purpose of the feature alignment operation is to match, between the feature maps included in the third output feature and the feature maps included in the fourth output feature, the feature maps on the same channel index.
  • A feature transformation may be performed on the third output feature or the fourth output feature based on the correspondence, to complete the feature alignment operation.
  • For example, the positions of the feature maps of the channels of the third output feature are adjusted (for example, the feature map of the first channel and the feature map of the second channel are exchanged, that is, the feature map of the first channel is moved to the second channel and the feature map of the second channel is moved to the first channel), so that the feature maps included in the adjusted third output feature and the feature maps included in the fourth output feature match on the same channel indices.
  • Alternatively, the positions of the feature maps of the channels of the fourth output feature are adjusted in the same manner, so that the feature maps included in the third output feature and the feature maps included in the adjusted fourth output feature match on the same channel indices.
  • The gap refers to the gap between the aligned third output feature and the fourth output feature.
  • The gap may be determined using a pre-built gap function (e.g., a cross-entropy loss function). It can be understood that, since the feature alignment operation is performed before the gap is determined, the error caused by mismatched feature maps can be reduced, so that the determined gap is more real and accurate, thereby reducing the difficulty of model convergence, making the output features of the student model approach those of the teacher model more easily, and improving the efficiency of model training.
  • FIG. 3 is a schematic flowchart of model training shown in this application. As shown in FIG. 3, in each round of training of the student model, S2062 can be executed first: the sample image is input into the student model and the teacher model, to obtain the third output feature output by the student model and the fourth output feature output by the teacher model.
  • S2064 may then be executed to determine, based on the preset loss function, the error between the third output feature and the real feature corresponding to the sample image.
  • Before determining the gap, S2066 may be performed: an alignment operation matches the feature maps included in the third output feature and the feature maps included in the fourth output feature on the same channel indices. S2068 may then be performed to determine the gap between the aligned third output feature and the fourth output feature.
  • Step S2070 is then executed: using backpropagation, the model parameters of the student model are updated based on the error and the gap. After one round of training, steps S2062-S2070 may be performed repeatedly until the model converges.
  • In the model training method of these embodiments, when determining the gap between the output features of the student model and the output features of the teacher model, a feature alignment operation is performed first, so that, between the feature maps included in the student model's output features and the feature maps included in the teacher model's output features, feature maps on the same channel index match each other and thus have the same or similar interpretation meaning. Therefore, when the gap is determined, the error caused by mismatched feature maps can be reduced, making the determined gap more real and accurate, thereby reducing the difficulty of model convergence, making the output features of the student model approach those of the teacher model more easily, and improving the efficiency of model training.
  • The student model may be a compressed model with a simple structure, and the teacher model may be a model with a complex structure before compression.
  • The student model and the teacher model may be obtained by pre-training an initial student model and an initial teacher model on the training sample set.
  • The pre-training process is not described in detail here.
  • The initialization parameters of the initial student model may be recorded prior to pre-training it.
  • The initialization parameters may include the model parameters of the initial student model before pre-training.
  • By recording the model parameters of the initial student model before pre-training, the recorded initialization parameters can be used in the subsequent model training to initialize the student model before training. This ensures that the trend of model change during the subsequent training (the learning process) matches the trend during pre-training, so that the information contained in the initialization parameters of the student model can be used effectively, improving its learning effect.
  • The aforementioned method can then be used: the pre-trained student model and teacher model perform image processing on the image data set, to obtain the first output feature and the second output feature.
  • A bipartite graph matching algorithm or a greedy matching algorithm may be used to determine, based on the first output feature and the second output feature, the feature map pairs matched between the feature maps included in the first output feature and the feature maps included in the second output feature, and, based on the feature map pairs, to determine the correspondence between the channels in which the two feature maps of each pair are respectively located.
  • Since the bipartite graph matching algorithm or the greedy algorithm can determine the matched feature map pairs between the feature maps included in the first output feature and those included in the second output feature, the correspondence can be determined more accurately through these algorithms.
  • In some embodiments, the correspondence is determined using a greedy matching algorithm: for each feature map included in the first output feature, the feature map is taken as the current feature map, and the first matching feature map, among the feature maps included in the second output feature, that matches the current feature map is determined; the first sub-correspondence between the channel where the current feature map is located and the channel where the first matching feature map is located is recorded; and the correspondence is determined based on the recorded first sub-correspondences. A sketch of such a greedy matching follows.
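A minimal sketch of the greedy matching over the similarity matrix from the earlier sketch; removing each matched teacher channel from later consideration follows the deletion step described next, and the function names are assumptions:

```python
import torch

def greedy_match(sim):
    """sim: (C_s, C_t) similarity scores. Returns {student_channel: teacher_channel}."""
    sim = sim.clone()
    correspondence = {}
    for s in range(sim.shape[0]):
        t = int(sim[s].argmax())           # best remaining teacher channel
        correspondence[s] = t              # record the sub-correspondence
        sim[:, t] = float('-inf')          # "delete" the matched feature map
    return correspondence
```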
  • In this way, the matched feature map pairs between the feature maps included in the student model's output feature and the feature maps included in the teacher model's output feature can be determined. Then, based on the determined feature map pairs, the correspondence between the channels in which the two feature maps of each pair are respectively located is determined.
  • In some embodiments, according to the maintained correspondence, the second matching feature map already matched with the first feature map included in the first output feature is deleted from the feature maps included in the second output feature. Then, for each feature map included in the first output feature other than the first feature map, the feature map is taken as the current feature map, and the matching feature map, among the feature maps of the remaining channels of the second output feature, that matches the current feature map is determined.
  • Through the bipartite graph matching algorithm, it is likewise possible to determine the matched feature map pairs between the feature maps included in the student model's output feature and the feature maps included in the teacher model's output feature. Then, based on the determined feature map pairs, the correspondence between the channels in which the two feature maps of each pair are respectively located is determined.
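The bipartite graph matching can be realized, for example, with the Hungarian algorithm; a sketch using SciPy's `linear_sum_assignment` (the application does not prescribe a particular implementation):

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def bipartite_match(sim):
    """sim: (C_s, C_t) similarity scores.

    Maximizes the total similarity over a one-to-one assignment of channels.
    """
    # linear_sum_assignment minimizes cost, so negate to maximize similarity.
    rows, cols = linear_sum_assignment(-np.asarray(sim))
    return dict(zip(rows.tolist(), cols.tolist()))   # student -> teacher channel
```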
  • a transformation matrix may be generated based on the correspondence.
  • the transformation matrix is used to represent the correspondence between the channel where the feature map included in the first output feature is located and the channel where the feature map included in the second output feature is located.
  • the transformation matrix may be a 0-1 matrix.
  • FIG. 4 is a schematic diagram of a transformation matrix shown in this application.
  • the transformation matrix shown in FIG. 4 is used to represent the correspondence between the channel where the feature map included in the first output feature is located and the channel where the feature map included in the second output feature is located.
  • The number of rows of the transformation matrix represents the number of channels of the feature maps included in the second output feature, and the number of columns represents the number of channels of the feature maps included in the first output feature.
  • Each element of the transformation matrix indicates whether the corresponding two feature maps match; for example, 0 indicates a mismatch and 1 indicates a match.
  • In FIG. 4, the third element in the first row is 1, which indicates that the feature map of the third channel in the first output feature matches the feature map of the first channel in the second output feature.
  • The second element in the second row is 1, which indicates that the feature map of the second channel in the first output feature matches the feature map of the second channel in the second output feature.
  • The correspondence can be conveniently recorded through the transformation matrix, and the subsequent feature alignment can be facilitated by it.
  • Alternatively, the number of rows of the transformation matrix can represent the number of channels of the feature maps included in the first output feature, with the number of columns representing the number of channels of the feature maps included in the second output feature.
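A sketch of generating such a 0-1 transformation matrix from a correspondence, following the FIG. 4 convention (rows index the second output feature's channels, columns the first's); all names are illustrative:

```python
import torch

def build_transform_matrix(correspondence, c_student, c_teacher):
    """correspondence: {student_channel: teacher_channel}.

    M[i, j] = 1 iff teacher channel i matches student channel j.
    """
    M = torch.zeros(c_teacher, c_student)
    for s_ch, t_ch in correspondence.items():
        M[t_ch, s_ch] = 1.0
    return M
```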
  • S206 may be continued to train the student model.
  • When training the student model, the student model may first be initialized using the initialization parameters recorded during the pre-training phase, and the initialized student model is then trained.
  • Initializing the student model with the recorded initialization parameters before model training ensures that the trend of model change during the subsequent training (the learning process) matches the trend during pre-training, so that the information contained in the initialization parameters of the student model can be used effectively, improving its learning effect.
  • In the case where the rows of the transformation matrix correspond to the channels of the second output feature, the fourth output feature is transformed by the transformation matrix, so that the feature maps included in the third output feature and the feature maps included in the fourth output feature match on the same channel indices.
  • For example, the feature maps included in the fourth output feature may be numbered in top-to-bottom order, and a column vector built from these numbers. The transformation matrix is then multiplied by this column vector to obtain a multiplication result.
  • The multiplication result represents the ordering of the feature maps included in the aligned fourth output feature.
  • The feature maps included in the fourth output feature may then be reordered according to the order indicated by the multiplication result, to obtain the feature-aligned fourth output feature.
  • At this point, the feature maps included in the third output feature and the feature maps included in the aligned fourth output feature match on the same channel indices; that is, the feature alignment of the fourth output feature with the third output feature is complete. Therefore, when determining the gap between the two, the error caused by mismatched feature maps can be reduced, making the determined gap more real and accurate, thereby reducing the difficulty of model convergence, making the output features of the student model approach those of the teacher model more easily, and improving the efficiency of model training.
  • Conversely, when the rows of the transformation matrix correspond to the channels of the first output feature, the third output feature is transformed by the transformation matrix, so that the feature maps included in the transformed third output feature and the feature maps included in the fourth output feature match on the same channel indices. A sketch of the alignment follows.
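A minimal sketch of the alignment under the FIG. 4 convention, reusing the matrix from the previous sketch; a one-to-one matching is assumed, so that each column of `M` contains exactly one 1:

```python
import torch

def align_fourth_output(fourth_output, M):
    """Reorder the teacher's (fourth) output feature so that channel c holds
    the teacher feature map matched with student channel c.

    fourth_output: (C_t, H, W); M: (C_t, C_s) 0-1 matrix from the sketch above.
    """
    c_t = fourth_output.shape[0]
    # Number the teacher feature maps 0..C_t-1 in top-to-bottom order as a
    # column vector, then read the new ordering off the transformation matrix.
    numbers = torch.arange(c_t, dtype=M.dtype)
    new_order = (M.T @ numbers).long()      # position c -> matched teacher channel
    return fourth_output[new_order]         # feature-aligned fourth output
```

The gap (e.g., an MSE over corresponding channels) would then be computed between the third output feature and this aligned fourth output feature.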
  • After determining the error between the third output feature and the real feature corresponding to the sample image, and the gap between the feature-aligned third output feature and the fourth output feature, the model parameters of the student model can be updated based on the error and the gap.
  • In this way, one round of parameter update for the student model can be realized.
  • In the method of these embodiments, the feature alignment operation is performed first when determining the gap between the output features of the student model and the output features of the teacher model, so that feature maps on the same channel index match each other and thus have the same or similar interpretation meaning.
  • Therefore, the error caused by mismatched feature maps can be reduced, so that the determined gap is more real and accurate, thereby reducing the difficulty of model convergence, making the output features of the student model approach those of the teacher model more easily, and improving both the efficiency of model training and the effect of model compression.
  • The model parameters of the student model may be updated based on a weighted sum of the error and the gap.
  • The weights of the weighted summation can be set according to the actual situation.
  • In this way, model training can comprehensively utilize the information represented by both the error and the gap, thereby ensuring that the output features of the trained student model are close to those of the teacher model.
  • In some embodiments, a correspondence can be determined separately for each classification type, according to the classification type of the image.
  • During training, the correspondence corresponding to the classification type of the input sample image can then be selected, and feature alignment performed using that correspondence, so as to improve the prediction effect of the student model for different classification types.
  • the partial images included in the image dataset used in S202 may include images of multiple classification types.
  • the above classification types can be set according to actual situations.
  • the above classification types can be people, walls, vehicles, etc.
  • the above classification types may include animals such as dogs, cats, and pigs.
  • For the student model, the output features corresponding to the images of each classification type can be averaged separately, to obtain the first output feature corresponding to each classification type.
  • Likewise, for the teacher model, the output features corresponding to the images of each classification type may be averaged, to obtain the second output feature corresponding to each classification type.
  • In this way, the first output feature of the student model and the second output feature of the teacher model can be determined separately for images of different classification types.
  • In these embodiments, determining the feature map pairs and the correspondence further includes: for each classification type among the multiple classification types, determining, based on the first output feature and the second output feature corresponding to that classification type, the feature map pairs matched between the feature maps included in the first output feature corresponding to the classification type and the feature maps included in the second output feature corresponding to the classification type, and, based on the feature map pairs corresponding to the classification type, determining the correspondence between the channels in which the two feature maps of each such pair are respectively located.
  • In this way, the correspondence between the channels of the student model's output feature and the channels of the teacher model's output feature can be determined separately for images of different classification types. Since a correspondence is determined per classification type, errors caused by differences between the output features of different classification types can be eliminated, improving the accuracy of the determined correspondences. A sketch follows.
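A sketch of maintaining one correspondence per classification type, reusing the hypothetical helpers from the earlier sketches:

```python
def per_class_correspondence(student, teacher, images_by_class):
    """images_by_class: {class_id: tensor of images of that classification type}.

    Returns {class_id: {student_channel: teacher_channel}}.
    """
    correspondences = {}
    for cls, images in images_by_class.items():
        first = mean_output_feature(student, images)    # class-wise first output feature
        second = mean_output_feature(teacher, images)   # class-wise second output feature
        sim = channel_similarity(first, second)
        correspondences[cls] = greedy_match(sim)        # or bipartite_match(sim)
    return correspondences
```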
  • FIG. 5 is a schematic flowchart of a feature alignment method shown in this application. As shown in FIG. 5, when performing the feature alignment operation, S502 may be performed first to determine the classification type corresponding to the sample image.
  • For example, the classification type can be determined from the annotation type of the sample image.
  • S504 may then be executed: a feature alignment operation is performed on the third output feature or the fourth output feature using the correspondence corresponding to that classification type, so that, between the feature maps included in the third output feature and the feature maps included in the fourth output feature, feature maps on the same channel index match.
  • In this way, the feature alignment operation can be performed according to the correspondence corresponding to the classification type of the input sample image. Therefore, the accuracy of the feature alignment operation can be improved, thereby improving the training effect of the student model and, further, its prediction effect.
  • the present application also proposes an image processing method.
  • the method can be applied to any type of electronic device.
  • In the method, image processing is performed using the image processing model (that is, the above-mentioned student model) trained by the model training method of any of the foregoing embodiments, so that an image processing model with lower complexity can achieve a better prediction effect, improving the image processing rate without reducing the prediction effect.
  • the above image processing method may include: acquiring a target image.
  • Image processing is performed on the target image by using the student model trained by the knowledge distillation method shown in any of the foregoing embodiments to obtain an image processing result.
  • the aforementioned student model can be any type of model.
  • For example, the student model may be an image classification model, an object detection model, an object tracking model, or the like. Since the student model can be trained by the knowledge distillation method of any of the foregoing embodiments, it combines a simple structure with a good prediction effect, improving the image processing rate without reducing the prediction effect. A minimal inference sketch follows.
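Inference with the distilled student model is then an ordinary forward pass; a minimal sketch, with preprocessing details omitted and names assumed:

```python
import torch

@torch.no_grad()
def process_image(student, target_image):
    """target_image: preprocessed tensor of shape (3, H, W)."""
    student.eval()
    result = student(target_image.unsqueeze(0))   # add the batch dimension
    return result                                 # the image processing result
```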
  • the present application also proposes a knowledge distillation device.
  • FIG. 6 is a schematic structural diagram of a knowledge distillation apparatus shown in this application.
  • As shown in FIG. 6, the apparatus 600 may include: a sample processing module 610, configured to process the training sample set with the student model and the teacher model respectively, to obtain the first output feature and the second output feature; a correspondence determining module 620, configured to determine, based on the first output feature and the second output feature, the feature map pairs matched between the feature maps included in the first output feature and the feature maps included in the second output feature, and, based on the feature map pairs, to determine the correspondence between the channels in which the two feature maps included in each feature map pair are respectively located; and a training module 630, configured to train the student model; wherein, in each round of training, the student model and the teacher model are respectively used to process the sample data, to obtain a third output feature and a fourth output feature; the error between the third output feature and the real feature corresponding to the sample data is determined; a feature alignment operation is performed on the third output feature or the fourth output feature using the correspondence, so that, between the feature maps included in the third output feature and the feature maps included in the fourth output feature, feature maps on the same channel index are matched; the gap between the aligned third output feature and the fourth output feature is determined; and the model parameters of the student model are updated based on the error and the gap.
  • In some embodiments, the sample processing module 610 is specifically configured to: process one or more samples in the training sample set using the student model, to obtain one or more student-model output features corresponding to the one or more samples respectively; perform weighted summation on the values at the same positions in those output features, to obtain the first output feature; process the one or more samples using the teacher model, to obtain one or more teacher-model output features corresponding to the one or more samples respectively; and perform weighted summation on the values at the same positions in those output features, to obtain the second output feature.
  • the corresponding relationship determining module 620 is configured to: determine the corresponding relationship by using a bipartite graph matching algorithm or a greedy matching algorithm.
  • In some embodiments, the correspondence determining module 620 is configured to: for each feature map included in the first output feature, take the feature map as the current feature map and determine, among the feature maps included in the second output feature, the first matching feature map that matches the current feature map; record the first sub-correspondence between the channel where the current feature map is located and the channel where the first matching feature map is located; and determine the correspondence based on the recorded first sub-correspondences.
  • In some embodiments, the correspondence determining module 620 is configured to: delete, according to the maintained correspondence, from the feature maps included in the second output feature, the determined second matching feature map matched with the first feature map included in the first output feature; for each feature map included in the first output feature other than the first feature map, take the feature map as the current feature map and determine, among the feature maps of the remaining channels of the second output feature, the third matching feature map that matches the current feature map; record the second sub-correspondence between the channel where the current feature map is located and the channel where the third matching feature map is located; and determine the correspondence based on the recorded second sub-correspondences.
  • In some embodiments, the apparatus further includes: a pre-training module, configured to pre-train an initial student model and an initial teacher model on an initial training sample set, to obtain the student model and the teacher model.
  • In some embodiments, the apparatus further includes: a recording module, configured to record the initialization parameters of the initial student model before the pre-training; the training module is configured to: initialize the student model using the initialization parameters, and train the initialized student model.
  • In some embodiments, the apparatus further includes: a generating module, configured to generate a transformation matrix based on the correspondence; wherein the transformation matrix is used to represent the correspondence between the channels where the feature maps included in the second output feature are located and the channels where the feature maps included in the first output feature are located.
  • In some embodiments, the training module is configured to: when the number of rows of the transformation matrix represents the channels where the feature maps included in the second output feature are located and the number of columns represents the channels where the feature maps included in the first output feature are located, transform the fourth output feature using the transformation matrix, so that the feature maps included in the third output feature and the feature maps included in the fourth output feature match on the same channel indices; or, when the number of rows of the transformation matrix represents the channels where the feature maps included in the first output feature are located and the number of columns represents the channels where the feature maps included in the second output feature are located, transform the third output feature using the transformation matrix, so that the feature maps included in the transformed third output feature and the feature maps included in the fourth output feature match on the same channel indices.
  • In some embodiments, the training module is configured to: determine a loss based on a weighted sum of the error and the gap; and backpropagate the student model based on the loss, to update the model parameters of the student model.
  • In some embodiments, the training sample set includes samples of multiple classification types; the correspondence determining module 620 is configured to: for each of the multiple classification types, determine, based on the first output feature and the second output feature corresponding to the classification type, the feature map pairs matched between the feature maps of the channels included in the first output feature corresponding to the classification type and the feature maps included in the second output feature corresponding to the classification type, and, based on the feature map pairs corresponding to the classification type, determine the correspondence between the channels in which the two feature maps of each such pair are respectively located.
  • In some embodiments, the training module 630 is configured to: determine the classification type corresponding to the sample data; and perform a feature alignment operation on the third output feature or the fourth output feature using the correspondence corresponding to that classification type, so that the feature maps included in the third output feature and the feature maps included in the fourth output feature match on the same channel indices.
  • The present application also proposes an image processing apparatus. The apparatus includes: an acquisition module, configured to acquire a target image; and an image processing module, configured to perform image processing on the target image using the student model trained by the knowledge distillation method of any of the foregoing embodiments, to obtain an image processing result.
  • the embodiments of the knowledge distillation apparatus or the image processing apparatus shown in this application can be applied to electronic equipment.
  • Correspondingly, the present application discloses an electronic device, which may include: a processor; and a memory for storing instructions executable by the processor; wherein the processor is configured to invoke the executable instructions stored in the memory to implement the aforementioned knowledge distillation method or image processing method.
  • FIG. 7 is a schematic diagram of a hardware structure of an electronic device shown in this application.
  • As shown in FIG. 7, the electronic device may include a processor for executing instructions, a network interface for making network connections, a memory for storing operational data for the processor, and a non-volatile memory for storing the instructions corresponding to the knowledge distillation apparatus or the image processing apparatus.
  • the embodiments of the apparatus may be implemented by software, or may be implemented by hardware or a combination of software and hardware.
  • Taking software implementation as an example, a device in a logical sense is formed by the processor of the electronic device where the device is located reading the corresponding computer program instructions from the non-volatile memory into memory for execution.
  • In terms of hardware, in addition to the processor, memory, network interface, and non-volatile memory shown in FIG. 7, the electronic device where the apparatus is located may also include other hardware, which is not detailed here.
  • the corresponding instructions of the knowledge distillation apparatus or the image processing apparatus may also be directly stored in the memory, which is not limited here.
  • the present application proposes a computer-readable storage medium, where a computer program is stored in the storage medium, and the computer program is used to execute the aforementioned knowledge distillation method or image processing method.
  • One or more embodiments of the present application may be provided as a method, a system, or a computer program product. Accordingly, one or more embodiments of the present application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, one or more embodiments of the present application may take the form of a computer program product implemented on one or more computer-usable storage media (which may include, but are not limited to, disk storage, CD-ROM, and optical storage) having computer-usable program code embodied therein.
  • Embodiments of the subject matter and functional operations described in this application can be implemented in digital electronic circuitry, in tangibly embodied computer software or firmware, in computer hardware that can include the structures disclosed in this application and their structural equivalents, or in a combination of one or more of them.
  • Embodiments of the subject matter described in this application may be implemented as one or more computer programs, i.e., one or more modules of computer program instructions encoded on a tangible, non-transitory program carrier, for execution by, or to control the operation of, a data processing apparatus.
  • Alternatively or additionally, the program instructions may be encoded on an artificially generated propagated signal, such as a machine-generated electrical, optical, or electromagnetic signal, which is generated to encode information and transmit it to a suitable receiver device for execution by a data processing apparatus.
  • the computer storage medium may be a machine-readable storage device, a machine-readable storage substrate, a random or serial access memory device, or a combination of one or more of these.
  • the processes and logic flows described in this application can be performed by one or more programmable computers executing one or more computer programs to perform corresponding functions by operating on input data and generating output.
  • the processes and logic flows can also be performed by, and apparatus can also be implemented as, special purpose logic circuitry, such as an FPGA (field programmable gate array) or an ASIC (application specific integrated circuit).
  • a computer suitable for the execution of a computer program may include, for example, a general and/or special purpose microprocessor, or any other type of central processing unit.
  • the central processing unit will receive instructions and data from read only memory and/or random access memory.
  • the basic components of a computer may include a central processing unit for implementing or executing instructions and one or more memory devices for storing instructions and data.
  • Generally, a computer will also include, or be operably coupled to, one or more mass storage devices for storing data, such as magnetic disks, magneto-optical disks, or optical disks, to receive data from them, send data to them, or both.
  • However, a computer does not have to have such a device.
  • Moreover, the computer may be embedded in another device, such as a mobile phone, a personal digital assistant (PDA), a mobile audio or video player, a game console, a global positioning system (GPS) receiver, or a portable storage device such as a universal serial bus (USB) flash drive, to name a few.
  • Computer-readable media suitable for storing computer program instructions and data include all forms of non-volatile memory, media, and memory devices, including, for example, semiconductor memory devices (e.g., EPROM, EEPROM, and flash memory devices), magnetic disks (e.g., internal hard disks or removable disks), magneto-optical disks, and CD-ROM and DVD-ROM discs.
  • semiconductor memory devices eg, EPROM, EEPROM, and flash memory devices
  • magnetic disks eg, internal hard disks
  • removable discs removable discs
  • magneto-optical discs e.g., CD-ROM and DVD-ROM discs.
  • the processor and memory may be supplemented by or incorporated in special purpose logic circuitry.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Software Systems (AREA)
  • Computational Linguistics (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Multimedia (AREA)
  • Databases & Information Systems (AREA)
  • Molecular Biology (AREA)
  • Medical Informatics (AREA)
  • Mathematical Physics (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Image Analysis (AREA)
  • Image Processing (AREA)

Abstract

The present application provides a knowledge distillation and image processing method and apparatus, an electronic device, and a storage medium. The method may include: processing a training sample set with a student model and a teacher model respectively to obtain a first output feature and a second output feature; determining the correspondence between the channel numbers at which matched feature map pairs, formed between the feature maps included in the first output feature and those included in the second output feature, are located; and training the student model. In each round of training, the correspondence is used to perform a feature alignment operation on the output features of the student model and the teacher model, and knowledge distillation is performed based on the aligned output features.

Description

Knowledge distillation and image processing method, apparatus, electronic device and storage medium
Cross-reference to related applications
The present application claims priority to Chinese Patent Application No. 202110090849.2, filed on January 22, 2021, the entire contents of which are incorporated herein by reference.
Technical field
The present application relates to computer technology, and in particular to knowledge distillation and image processing methods, apparatuses, electronic devices and storage media.
Background
At present, neural network models have developed rapidly. For example, in image processing tasks, deep convolutional neural network models such as RCNN (Region Convolutional Neural Networks) and FAST-RCNN (Fast Region Convolutional Neural Networks) can be used to implement operations such as image classification, object detection, and semantic segmentation.
However, as tasks become more complex and the requirements on processing results grow, the structure of a neural network model becomes increasingly complex and the space it occupies increasingly large. This may consume considerable computing resources and storage space, and may even make the neural network model unusable on devices such as mobile phones.
Therefore, a model compression method is needed that allows a student model with a simple structure to learn from a teacher model with a complex structure, so that the results of the student model are as close as possible to those of the teacher model, thereby completing model compression.
Summary of the invention
The present application provides a knowledge distillation method. The method includes: processing a training sample set with a student model and a teacher model respectively, to obtain a first output feature and a second output feature; based on the first output feature and the second output feature, determining feature map pairs matched between the feature maps included in the first output feature and the feature maps included in the second output feature, and, based on the feature map pairs, determining the correspondence between the channels in which the two feature maps included in each feature map pair are respectively located; and training the student model. In each round of training, the student model and the teacher model are respectively used to process sample data, obtaining a third output feature and a fourth output feature; an error between the third output feature and a true feature corresponding to the sample data is determined; a feature alignment operation is performed on the third output feature or the fourth output feature using the correspondence, so that among the feature maps included in the third output feature and those included in the fourth output feature, feature maps located at the same channel number match; a gap between the aligned third output feature and fourth output feature is determined; and the model parameters of the student model are updated based on the error and the gap.
The present application further provides an image processing method. The method includes: acquiring a target image; and performing image processing on the target image using the student model trained according to the knowledge distillation method shown in any of the foregoing embodiments, to obtain an image processing result.
The present application further provides a knowledge distillation apparatus. The apparatus includes: a sample processing module, configured to process a training sample set with a student model and a teacher model respectively, to obtain a first output feature and a second output feature; a correspondence determination module, configured to determine, based on the first output feature and the second output feature, feature map pairs matched between the feature maps included in the first output feature and those included in the second output feature, and to determine, based on the feature map pairs, the correspondence between the channels in which the two feature maps included in each feature map pair are respectively located; and a training module, configured to train the student model, wherein, in each round of training, the student model and the teacher model are respectively used to process sample data to obtain a third output feature and a fourth output feature; an error between the third output feature and the true feature corresponding to the sample data is determined; a feature alignment operation is performed on the third output feature or the fourth output feature using the correspondence, so that among the feature maps included in the third output feature and those included in the fourth output feature, feature maps at the same channel number match; the gap between the aligned third output feature and fourth output feature is determined; and the model parameters of the student model are updated based on the error and the gap.
The present application further provides an image processing apparatus. The apparatus includes: an acquisition module, configured to acquire a target image; and an image processing module, configured to perform image processing on the target image using the student model trained according to the knowledge distillation method shown in any of the foregoing embodiments, to obtain an image processing result.
The present application further provides an electronic device. The device includes: a processor; and a memory for storing instructions executable by the processor; wherein the processor is configured to invoke the executable instructions stored in the memory to implement the foregoing knowledge distillation method or image processing method.
The present application further provides a computer-readable storage medium storing a computer program, where the computer program is used to execute the foregoing knowledge distillation method or image processing method.
The present application further provides a computer program product, including a computer program stored in a memory, where the computer program instructions, when executed by a processor, implement the foregoing knowledge distillation method or image processing method.
It should be understood that the above general description and the following detailed description are merely exemplary and explanatory, and do not limit the present application.
Brief description of the drawings
In order to more clearly illustrate the technical solutions of one or more embodiments of the present application, the drawings required for describing the embodiments are briefly introduced below. Obviously, the drawings described below are merely some of the embodiments recorded in one or more embodiments of the present application; for those of ordinary skill in the art, other drawings can be obtained from these drawings without creative effort.
FIG. 1 is a schematic flowchart of a model training method shown in the present application;
FIG. 2 is a method flowchart of a model training method shown in the present application;
FIG. 3 is a schematic flowchart of model training shown in the present application;
FIG. 4 is a schematic diagram of a transformation matrix shown in the present application;
FIG. 5 is a schematic flowchart of a feature alignment method shown in the present application;
FIG. 6 is a schematic structural diagram of a knowledge distillation apparatus shown in the present application;
FIG. 7 is a schematic diagram of the hardware structure of an electronic device shown in the present application.
Detailed description of embodiments
Exemplary embodiments will now be described in detail, examples of which are illustrated in the accompanying drawings. Where the following description refers to the drawings, the same numerals in different drawings denote the same or similar elements unless otherwise indicated. The implementations described in the following exemplary embodiments do not represent all implementations consistent with the present application; rather, they are merely examples of apparatuses and methods consistent with some aspects of the present application as detailed in the appended claims.
The terms used in the present application are for the purpose of describing particular embodiments only and are not intended to limit the present application. The singular forms "a", "the above" and "the" used in the present application and the appended claims are also intended to include the plural forms, unless the context clearly indicates otherwise. It should also be understood that the term "and/or" as used herein refers to and includes any or all possible combinations of one or more of the associated listed items. It should also be understood that the word "if" as used herein may, depending on the context, be interpreted as "when" or "upon" or "in response to determining".
FIG. 1 is a schematic flowchart of a model training method shown in the present application. It should be noted that the flow shown in FIG. 1 is merely a schematic illustration of the model training method and can be fine-tuned in practical applications.
As shown in FIG. 1, when performing model training, S102 usually needs to be executed first to prepare a training sample set.
In the field of image classification, a training sample set can usually be a collection of images annotated with the classification types of the objects appearing in them. When preparing the training sample set, the original images can usually be annotated with ground truth manually or with machine assistance. For example, after an original image is acquired, image annotation software can be used to label the classification type of an object appearing in it (for example, whether the object is a person, a car, or a tree), thereby obtaining a number of training samples. It should be noted that when feature-encoding the training samples, one-hot encoding or similar methods can be used; the present application does not limit the specific encoding method.
After the training sample set is obtained, the student model is trained using the training sample set.
In each round of training, S104 can be executed first: the same training sample is input into the student model and the teacher model for forward propagation, obtaining the output feature of the student model and the output feature of the teacher model.
The model complexity of the student model (also called the first model) can be lower than that of the teacher model (also called the second model). The student model and the teacher model can be models of any type. The purpose of model training is to enable the student model to learn from the teacher model, so that the output of the student model approaches that of the teacher model, thereby achieving model compression.
The teacher model can be a pre-trained model. It can be understood that the training sample set used by the teacher model in the pre-training stage can be the same as or different from the sample set constructed in step S102, which is not limited here.
After the output features are obtained, S106 can be executed: based on the output feature of the student model and the output feature of the teacher model, the gap between the two is determined.
In some examples, the gap can be obtained using a preset gap function. The present application does not particularly limit the structure of the gap function. In some examples, the gap function can be determined with reference to commonly used knowledge distillation functions.
Knowledge distillation functions include, for example, the loss functions used in knowledge distillation algorithms, such as cross-entropy loss functions and exponential loss functions.
After the output features of the student model and the teacher model are obtained, S108 can also be executed: determining an error based on the output feature of the student model.
In some examples, a preset loss function can be used to determine the error between the output feature of the student model and the true feature corresponding to the training sample. The present application does not particularly limit the structure of the loss function. In some examples, the loss function can be determined with reference to commonly used knowledge distillation functions.
After the error and the gap are determined, S110 can be executed: updating the model parameters of the student model based on the result of a weighted summation of the error and the gap, thereby completing one round of model training.
In this step, gradient descent can be used: the loss is determined based on the result of the weighted summation of the error and the gap, and back-propagation is then performed on the student model according to the loss, thereby updating the model parameters of the student model.
The back-propagation can use stochastic gradient descent (SGD), batch gradient descent (BGD), or mini-batch gradient descent (MBGD), which is not particularly limited here.
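As a rough illustration only, a minimal PyTorch-style sketch of this weighted-sum update might look as follows; the choice of MSE for both terms, the weights `alpha` and `beta`, and the use of plain SGD are assumptions made for the example, not values fixed by this application:

```python
import torch
import torch.nn.functional as F

# Hypothetical setup: a student model and an SGD optimizer (one of the
# gradient descent variants mentioned above).
# optimizer = torch.optim.SGD(student_model.parameters(), lr=0.01)

def training_step(student_out, teacher_out, true_feat, optimizer,
                  alpha=0.5, beta=0.5):
    error = F.mse_loss(student_out, true_feat)           # S108: error term
    gap = F.mse_loss(student_out, teacher_out.detach())  # S106: gap term
    loss = alpha * error + beta * gap                    # S110: weighted sum
    optimizer.zero_grad()
    loss.backward()    # back-propagate through the student model only
    optimizer.step()
    return loss.item()
```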
After one round of training is completed, steps S104-S110 can be repeated until the model converges.
The above embodiment shows a method of achieving model compression through model training. In practical applications, this method still suffers from problems such as slow model convergence and the difficulty of making the output features of the student model sufficiently close to those of the teacher model.
In view of this, the present application proposes a knowledge distillation method. When determining the gap between the output features of the student model and the teacher model, this method first performs a feature alignment operation, so that among the feature maps included in the output feature of the student model and those included in the output feature of the teacher model, feature maps at the same channel number match each other, and thus feature maps at the same channel number carry the same or similar interpretive meaning. Therefore, when determining the gap, the error caused by mismatched feature maps can be reduced, making the determined gap more realistic and accurate. This in turn reduces the difficulty of model convergence, makes it easier for the output feature of the student model to approach that of the teacher model, and improves the efficiency of model training.
FIG. 2 is a method flowchart of a model training method shown in the present application.
The model training method shown in FIG. 2 can be applied to an electronic device, which can execute the method by running a corresponding software system. In the embodiments of the present application, the electronic device can be a notebook computer, a computer, a server, a mobile phone, a PAD terminal, or the like, which is not particularly limited in the present application.
It can be understood that the model training method can be executed by a terminal device or a server device alone, or by a terminal device and a server device in cooperation.
For example, the model training method can be integrated into a client. After receiving a model training request, a terminal device carrying the client can provide computing power through its own hardware environment to execute the method.
As another example, the model training method can be integrated into a system platform. After receiving a model training request, a server device carrying the system platform can provide computing power through its own hardware environment to execute the method.
As yet another example, the model training method can be divided into two tasks: constructing a training sample set, and training the model based on that set. The task of constructing the training sample set can be integrated into the client and carried on the terminal device, while the model training task can be integrated into the server side and carried on the server device. After constructing the training sample set, the terminal device can initiate a model training request to the server device, which, upon receiving the request, trains the model based on the training sample set in response.
The following description takes an electronic device (hereinafter referred to as the device) as the execution subject.
As shown in FIG. 2, the model training method can include steps S202 to S206.
S202: performing image processing on an image data set using the student model and the teacher model respectively, to obtain a first output feature and a second output feature.
The student model and the teacher model can be models of any type. For example, in an object detection task, they can be models such as RCNN or FAST-RCNN; in an instance segmentation task, they can be MASK-RCNN (mask-based region convolutional neural network) models. It should be noted that the present application describes the model training method using an image processing task as an example; in practice, the method can also be applied to tasks such as text processing and speech processing, which are not detailed in the present application.
The first output feature is the output feature obtained by processing the image data set with the student model, and the second output feature is the output feature obtained by processing the image data set with the teacher model.
In some examples, the first output feature and the second output feature can include multi-channel feature maps, where the feature map of each channel characterizes, from one interpretive dimension, a feature meaning of the image. For example, the feature maps of some channels can characterize the texture features of the image, while those of other channels can characterize its contour features.
In some examples, when executing S202, on the one hand, the student model can be used to perform image processing on a part of the images in the image data set, obtaining the output features of the student model respectively corresponding to those images. The values at the same position in these output features are then averaged, for example by weighted summation, to obtain the first output feature.
On the other hand, the teacher model can be used to perform image processing on the same part of the images, obtaining the output features of the teacher model respectively corresponding to them. The values at the same position in these output features are then processed, for example by weighted summation, to obtain the second output feature.
It can be understood that, after obtaining the output features of the student model respectively corresponding to the part of the images, the maximum or minimum value can alternatively be selected from those output features to obtain the first output feature, which is not detailed here.
By averaging the output features of the student model to obtain the first output feature, a relatively true and balanced result of the student model's processing of the images in the image data set can be obtained, thereby ensuring the effect of model training.
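A minimal sketch of this averaging step, assuming a PyTorch model whose output is an `(N, C, h, w)` feature tensor; the function name and the equal weighting are illustrative assumptions:

```python
import torch

@torch.no_grad()
def mean_output_feature(model, images):
    # images: (N, 3, H, W) tensor holding part of the image data set.
    feats = model(images)        # assumed output shape (N, C, h, w)
    # Average the values at the same position across the N images
    # (a weighted sum with equal weights 1/N).
    return feats.mean(dim=0)     # shape (C, h, w)

# first_output  = mean_output_feature(student_model, image_batch)
# second_output = mean_output_feature(teacher_model, image_batch)
```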
After the first output feature and the second output feature are determined, step S204 can be executed: based on the first output feature and the second output feature, determining feature map pairs matched between the feature maps included in the first output feature and those included in the second output feature, and, based on the feature map pairs, determining the correspondence between the channels in which the two feature maps included in each feature map pair are respectively located.
A feature map pair is a matched pair of feature maps. For example, if feature map A included in the first output feature matches feature map B included in the second output feature, feature maps A and B form a feature map pair.
When determining feature map pairs, for each feature map included in the first output feature, that feature map is taken as the current feature map; the current feature map is vectorized to obtain a first vector, and each candidate feature map included in the second output feature is vectorized to obtain second vectors. The similarity score between the first vector and each second vector is computed, and the feature map corresponding to the second vector with the highest similarity score, together with the current feature map, is determined as a feature map pair. It should be noted that similarity can be computed with methods such as Euclidean distance or cosine distance, which are not limited here.
In some examples, when determining feature map pairs, for each feature map of each channel included in the second output feature, that feature map is taken as the current feature map and a method similar to the foregoing steps is executed; the specific process is not detailed here.
The correspondence refers to the correspondence between the channel of the first output feature and the channel of the second output feature in which the two feature maps of a feature map pair are respectively located. For example, if feature map A at channel 5 of the first output feature matches feature map B at channel 3 of the second output feature, the correspondence can be that 1-5 corresponds to 2-3, where 1-5 denotes channel 5 of the first output feature and 2-3 denotes channel 3 of the second output feature. It can be understood that the correspondence can also be maintained in other ways in the present application.
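To make the vectorize-and-score procedure concrete, the sketch below matches each channel of the first output feature to its most similar channel of the second output feature by cosine similarity and returns the correspondence as an index array; the function name and the choice of cosine similarity are assumptions for illustration:

```python
import torch.nn.functional as F

def match_channels(first_out, second_out):
    # first_out: (C1, h, w) first output feature; second_out: (C2, h, w).
    a = first_out.flatten(1)     # vectorize each feature map -> (C1, h*w)
    b = second_out.flatten(1)    # -> (C2, h*w)
    # Similarity score between every pair of first/second vectors.
    scores = F.cosine_similarity(a.unsqueeze(1), b.unsqueeze(0), dim=-1)
    # correspondence[i] = channel of the second output feature whose
    # feature map best matches channel i of the first output feature.
    return scores.argmax(dim=1)  # shape (C1,)
```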
After the correspondence is determined, S206 can be executed: training the student model. In each round of training, the student model and the teacher model are respectively used to perform image processing on a sample image, obtaining a third output feature and a fourth output feature; the error between the third output feature and the true feature corresponding to the sample image is determined; a feature alignment operation is performed on the third output feature or the fourth output feature using the correspondence, so that among the feature maps included in the third output feature and those included in the fourth output feature, feature maps at the same channel number match; the gap between the aligned third output feature and fourth output feature is determined; and the model parameters of the student model are updated based on the error and the gap.
The true feature is the feature used to determine the error. In some examples, it can be obtained through a pre-trained student model. For example, in an image classification task, the student model can be an image classification model (the initial student model). The initial student model can be pre-trained with training samples to obtain the student model; after pre-training is completed, a sample image annotated with its true classification can be input into the pre-trained initial student model (i.e., the student model) for forward propagation, and its output feature is taken as the true feature of the sample image. In some examples, the true feature can also be determined from known images preceding the sample image, for example derived through algorithms such as spatial geometric constraints. For example, the sample image can be an image in an image sequence; since sample images in a sequence are usually consecutive and objects appearing in consecutive images satisfy spatial geometric constraints, the true feature of the sample image can be derived from the images preceding it.
The error can be the error between the third output feature and the true feature corresponding to the sample image. In some examples, the error can be determined using a pre-constructed loss function (for example, a cross-entropy loss function).
The purpose of the feature alignment operation is to make the feature maps included in the third output feature and those included in the fourth output feature match between feature maps at the same channel number.
In practical applications, the third output feature or the fourth output feature can be feature-transformed based on the correspondence to complete the feature alignment operation.
For example, according to the correspondence, the positions of the feature maps of the channels of the third output feature are adjusted (for example, swapping the feature maps of the first and second channels, i.e., moving the feature map of the first channel to the second channel and that of the second channel to the first channel), so that among the feature maps included in the adjusted third output feature and those included in the fourth output feature, feature maps at the same channel number match. As another example, the positions of the feature maps of the channels of the fourth output feature can be adjusted in the same way, so that among the feature maps included in the third output feature and those included in the adjusted fourth output feature, feature maps at the same channel number match.
The gap is the gap between the aligned third output feature and fourth output feature. In some examples, the gap can be determined using a pre-constructed gap function (for example, a cross-entropy loss function). It can be understood that, because the feature alignment operation is performed before determining the gap, the error caused by mismatched feature maps can be reduced, making the determined gap more realistic and accurate; this in turn reduces the difficulty of model convergence, makes it easier for the output feature of the student model to approach that of the teacher model, and improves the efficiency of model training.
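Putting the alignment and the gap together, a sketch might permute the channels of the fourth output feature according to the recorded correspondence and only then compare the two features; MSE stands in here for whatever gap function is actually chosen:

```python
import torch.nn.functional as F

def aligned_gap(third_out, fourth_out, correspondence):
    # correspondence[i]: channel of the fourth output feature matched
    # to channel i of the third output feature.
    aligned_fourth = fourth_out[correspondence]  # reorder the channels
    # Feature maps now match at the same channel number, so the gap
    # no longer penalizes mere channel permutations.
    return F.mse_loss(third_out, aligned_fourth)
```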
FIG. 3 is a schematic flowchart of model training shown in the present application. As shown in FIG. 3, in each round of training the student model, S2062 can be executed first: the sample image is input into the student model and the teacher model, obtaining the third output feature output by the student model and the fourth output feature output by the teacher model.
S2064 can then be executed: based on a preset loss function, the error between the third output feature and the true feature corresponding to the sample image is determined.
Before determining the gap, S2066 can be executed: the alignment operation is performed so that among the feature maps included in the third output feature and those included in the fourth output feature, feature maps at the same channel number match.
S2068 is executed: the gap between the feature-aligned third output feature and fourth output feature is determined.
S2070 is executed: using back-propagation, the model parameters of the student model are updated based on the error and the gap. After one round of training is completed, steps S2062-S2070 can be repeated until the model converges.
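One round of this loop (S2062-S2070) could be sketched as follows; `correspondence` comes from the matching step described above, and the loss functions and weights are again illustrative assumptions:

```python
import torch
import torch.nn.functional as F

def train_one_round(student, teacher, sample, true_feat, correspondence,
                    optimizer, alpha=0.5, beta=0.5):
    third_out = student(sample)                    # S2062: forward both models
    with torch.no_grad():
        fourth_out = teacher(sample)
    error = F.mse_loss(third_out, true_feat)       # S2064: error term
    aligned_fourth = fourth_out[:, correspondence] # S2066: align the channels
    gap = F.mse_loss(third_out, aligned_fourth)    # S2068: gap after alignment
    loss = alpha * error + beta * gap              # S2070: weighted sum ...
    optimizer.zero_grad()
    loss.backward()                                # ... and back-propagation
    optimizer.step()
    return loss.item()
```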
In the above solution, since the feature alignment operation is performed before determining the gap between the output features of the student model and the teacher model, feature maps at the same channel number in the two output features match each other and therefore carry the same or similar interpretive meaning. Thus, when determining the gap, the error caused by mismatched feature maps can be reduced, making the determined gap more realistic and accurate; this in turn reduces the difficulty of model convergence, makes it easier for the output feature of the student model to approach that of the teacher model, and improves the efficiency of model training.
The following describes an embodiment in a scenario of model compression using a knowledge distillation algorithm.
In this case, the student model can be the compressed model with a simple structure, and the teacher model can be the pre-compression model with a complex structure.
In some examples, before executing step S202, the initial student model and the initial teacher model can first be pre-trained with a training sample set to obtain the student model and the teacher model. The pre-training process is not described in detail here.
Here, the pre-trained student model and teacher model can be obtained.
In some examples, before pre-training the initial student model, the initialization parameters of the initial student model can be recorded. The initialization parameters can include the model parameters of the initial student model before pre-training.
Here, the model parameters of the initial student model before pre-training can be recorded. When the student model is subsequently trained, it can first be initialized with the recorded initialization parameters before model training, ensuring that the model change trend of the student model during subsequent training (the learning process) is the same as during pre-training, thereby effectively using the information contained in the initialization parameters to improve the learning effect of the student model.
After pre-training is completed, the foregoing method can be used to perform image processing on the image data set with the pre-trained student model and teacher model, obtaining the first output feature and the second output feature.
After the first output feature and the second output feature are obtained, a bipartite graph matching algorithm or a greedy matching algorithm can be used to determine, based on the two output features, the feature map pairs matched between the feature maps included in the first output feature and those included in the second output feature, and, based on the feature map pairs, to determine the correspondence between the channels in which the two feature maps included in each feature map pair are respectively located.
Since a bipartite graph matching algorithm or a greedy algorithm can determine the feature map pairs matched between the feature maps included in the first output feature and those included in the second output feature, the above correspondence can be determined relatively accurately through these algorithms.
In some examples, when determining the correspondence using the greedy matching algorithm: for each feature map included in the first output feature, that feature map is taken as the current feature map; among the feature maps included in the second output feature, the first matched feature map matching the current feature map is determined; the first sub-correspondence between the channel of the current feature map and the channel of the first matched feature map is recorded; and the correspondence is determined based on the recorded first sub-correspondences.
Here, through the greedy matching algorithm, the feature map pairs matched between the feature maps included in the output features of the student model and the teacher model can be determined. Based on the determined feature map pairs, the correspondence between the channels in which the two feature maps of each pair are located is then determined.
In some examples, when determining the correspondence using the bipartite graph matching algorithm: according to the correspondence already maintained, among the feature maps included in the second output feature, the second matched feature maps that have already been determined to match first feature maps included in the first output feature are removed. For each feature map included in the first output feature other than the first feature maps, that feature map is taken as the current feature map; among the remaining feature maps of the second output feature, the third matched feature map matching the current feature map is determined; the second sub-correspondence between the channel of the current feature map and the channel of the third matched feature map is recorded. The correspondence is determined based on the recorded second sub-correspondences.
Here, through the bipartite graph matching algorithm, the feature map pairs matched between the feature maps included in the output features of the student model and the teacher model can be determined. Based on the determined feature map pairs, the correspondence between the channels in which the two feature maps of each pair are located is then determined.
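The one-to-one matching that the bipartite variant describes can be realized with a standard assignment solver. The sketch below uses SciPy's `linear_sum_assignment` (the Hungarian algorithm) on negated cosine similarities; this is one conventional realization consistent with the steps above, not the only one:

```python
import torch.nn.functional as F
from scipy.optimize import linear_sum_assignment

def bipartite_match_channels(first_out, second_out):
    a = F.normalize(first_out.flatten(1), dim=1)   # (C1, h*w), unit norm
    b = F.normalize(second_out.flatten(1), dim=1)  # (C2, h*w), unit norm
    cost = -(a @ b.t()).cpu().numpy()  # maximize similarity = minimize cost
    rows, cols = linear_sum_assignment(cost)       # one-to-one assignment
    # Each (rows[k], cols[k]) pair: channel rows[k] of the first output
    # feature matches channel cols[k] of the second output feature.
    return list(zip(rows.tolist(), cols.tolist()))
```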
It should be noted that, in addition to the bipartite graph matching algorithm and the greedy algorithm, other algorithms can also be used to determine feature map pairs.
In some examples, to facilitate recording the above correspondence, a transformation matrix can be generated based on the correspondence.
The transformation matrix characterizes the correspondence between the channels in which the feature maps included in the first output feature are located and the channels in which the feature maps included in the second output feature are located.
In some examples, to facilitate the feature alignment operation, the transformation matrix can be a 0-1 matrix.
FIG. 4 is a schematic diagram of a transformation matrix shown in the present application. The transformation matrix shown in FIG. 4 characterizes the correspondence between the channels of the feature maps included in the first output feature and those included in the second output feature. The number of rows of the matrix represents the number of channels of the feature maps included in the second output feature, the number of columns represents the number of channels of the feature maps included in the first output feature, and each element indicates whether the corresponding two feature maps match; for example, 0 means no match and 1 means a match.
As shown in FIG. 4, the third element of the first row is 1, indicating that the feature map at channel 3 of the first output feature matches the feature map at channel 1 of the second output feature. Likewise, the second element of the second row is 1, indicating that the feature map at channel 2 of the first output feature matches the feature map at channel 2 of the second output feature. By analogy, if the letter M denotes the first output feature, N denotes the second output feature, and M1 denotes the feature map of channel 1 of the first output feature, then the transformation matrix shown in FIG. 4 characterizes that M3 matches N1, M2 matches N2, M4 matches N3, M5 matches N4, and M1 matches N5.
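A sketch of building such a 0-1 matrix from channel pairs, following the FIG. 4 convention (rows for the second output feature, columns for the first); the function name and the 0-based indexing are assumptions:

```python
import torch

def build_transformation_matrix(pairs, c_first, c_second):
    # pairs: (first_channel, second_channel) tuples from the matching step.
    T = torch.zeros(c_second, c_first)
    for s, t in pairs:
        T[t, s] = 1.0   # 1 marks a matched feature map pair, 0 otherwise
    return T

# The FIG. 4 example (M3-N1, M2-N2, M4-N3, M5-N4, M1-N5) in 0-based indices:
# T = build_transformation_matrix([(2, 0), (1, 1), (3, 2), (4, 3), (0, 4)], 5, 5)
```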
On the one hand, the transformation matrix makes it convenient to record the above correspondence; on the other hand, it facilitates subsequent feature alignment.
It can be understood that the number of rows of the transformation matrix can alternatively represent the number of channels of the feature maps included in the first output feature, with the number of columns representing the number of channels of the feature maps included in the second output feature.
After the above correspondence is determined, S206 can be continued to train the student model.
In some examples, when training the student model, the initialization parameters recorded in the pre-training stage can be used to initialize the student model, and the initialized student model is then trained.
Here, initializing the student model with the initialization parameters recorded in the pre-training stage before model training ensures that the model change trend of the student model during subsequent training (the learning process) is the same as during pre-training, thereby effectively using the information contained in the initialization parameters to improve the learning effect of the student model.
In some examples, when the number of rows of the transformation matrix represents the number of channels of the feature maps included in the second output feature and the number of columns represents the number of channels of the feature maps included in the first output feature, the transformation matrix is used to transform the fourth output feature during the feature alignment operation performed with the determined correspondence, so that among the feature maps included in the third output feature and those included in the fourth output feature, feature maps at the same channel number match.
For example, the feature maps included in the fourth output feature can be numbered from top to bottom, and a column vector is constructed from the numbers. The transformation matrix is then multiplied with the column vector to obtain a multiplication result, which characterizes the ordering of the feature maps of the aligned fourth output feature. Finally, the feature maps included in the fourth output feature are reordered according to the order indicated by the multiplication result, obtaining the feature-aligned fourth output feature.
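A sketch of applying the 0-1 matrix to reorder the fourth output feature: since the matrix is a permutation under the FIG. 4 convention, multiplying it (here, its transpose, given the row/column convention chosen above) against the stacked feature maps performs exactly the reordering described; the function name is an assumption:

```python
import torch

def align_fourth_output(fourth_out, T):
    # fourth_out: (C2, h, w); T: (C2, C1) 0-1 matrix, where T[t, s] = 1 when
    # channel t of the fourth output matches channel s of the third output.
    c2, h, w = fourth_out.shape
    aligned = T.t() @ fourth_out.reshape(c2, -1)  # permute the channels
    return aligned.reshape(-1, h, w)              # (C1, h, w), aligned
```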
Here, the feature maps included in the third output feature and those included in the aligned fourth output feature match at the same channel number; that is, the feature alignment of the fourth output feature with the third output feature is completed. When determining the gap between the two, the error caused by mismatched feature maps can therefore be reduced, making the determined gap more realistic and accurate; this reduces the difficulty of model convergence, makes it easier for the output feature of the student model to approach that of the teacher model, and improves the efficiency of model training.
In some examples, when the number of rows of the transformation matrix represents the number of channels of the feature maps included in the first output feature and the number of columns represents the number of channels of the feature maps included in the second output feature, the transformation matrix is used to transform the third output feature during the feature alignment operation, so that among the feature maps included in the transformed third output feature and those included in the fourth output feature, feature maps at the same channel number match.
After determining the error between the third output feature and the true feature corresponding to the sample image, as well as the gap between the feature-aligned third output feature and fourth output feature, the model parameters of the student model can be updated based on the error and the gap, completing one round of parameter updating. Since, during training, the feature alignment operation is performed before determining the gap between the output features of the student model and the teacher model, feature maps at the same channel number in the two output features match each other and carry the same or similar interpretive meaning. Therefore, when determining the gap, the error caused by mismatched feature maps can be reduced, making the determined gap more realistic and accurate; this reduces the difficulty of model convergence, makes it easier for the output feature of the student model to approach that of the teacher model, and improves both the efficiency of model training and the effect of model compression.
In some examples, the model parameters of the student model can be updated based on the result of a weighted summation of the error and the gap.
The weights of the weighted summation can be set according to the actual situation.
By updating the model parameters of the student model based on the result of the weighted summation of the error and the gap, the meanings characterized by both the error and the gap can be comprehensively utilized in model training, thereby ensuring that the output feature of the trained student model is close to that of the teacher model.
In some examples, to further improve the prediction effect of the student model, correspondences for different classification types can be determined according to the classification types of the images. When training the student model, the correspondence for the classification type of the input sample image can be selected for feature alignment, thereby improving the student model's prediction effect for different classification types.
In some examples, the part of the images in the image data set used in S202 can include images of multiple classification types.
The classification types can be set according to the actual situation. For example, in an autonomous driving scenario, the classification types can include persons, walls, vehicles, and the like; in an animal classification scenario, they can include animals such as dogs, cats, and pigs.
In this case, when averaging the output features of the student model to obtain the first output feature, the output features corresponding to the images of each classification type can be averaged separately, obtaining the first output feature corresponding to each classification type.
Likewise, when averaging the output features of the teacher model to obtain the second output feature, the output features corresponding to the images of each classification type can be averaged separately, obtaining the second output feature corresponding to each classification type.
Here, the first output feature of the student model and the second output feature of the teacher model can be determined for images of each classification type.
Determining the matched feature map pairs and the channel correspondence based on the first output feature and the second output feature further includes: for each of the multiple classification types, based on the first output feature and the second output feature corresponding to that classification type, determining the feature map pairs matched between the feature maps included in the first output feature corresponding to the classification type and those included in the second output feature corresponding to the classification type, and, based on those feature map pairs, determining the correspondence between the channels in which the two feature maps of each pair corresponding to that classification type are respectively located.
Here, for images of different classification types, the correspondence between the channels of the feature maps included in the student model's output feature and those included in the teacher model's output feature can be determined per type. Since a correspondence is determined for each classification type, the error caused by differences between the output features of images of different classification types can be eliminated, improving the accuracy of the determined correspondence. A per-class sketch of this computation is given below.
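A per-class version of the correspondence computation could be sketched as follows, reusing the matching function from above; `images_by_class` and the equal-weight averaging are assumptions for illustration:

```python
import torch

@torch.no_grad()
def per_class_correspondences(student, teacher, images_by_class, match_fn):
    # images_by_class: dict mapping a classification type to a tensor of
    # images of that type; match_fn is e.g. match_channels from above.
    table = {}
    for cls, imgs in images_by_class.items():
        first = student(imgs).mean(dim=0)    # class-specific first output
        second = teacher(imgs).mean(dim=0)   # class-specific second output
        table[cls] = match_fn(first, second)
    return table

# During training, pick the correspondence by the sample's classification
# type: correspondence = table[sample_class]
```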
The following method can be executed when performing the feature alignment operation.
FIG. 5 is a schematic flowchart of a feature alignment method shown in the present application. As shown in FIG. 5, when performing the feature alignment operation, S502 can be executed first: determining the classification type corresponding to the sample image.
In some examples, the corresponding classification type can be determined from the annotation type of the sample image.
After the classification type is determined, S504 can be executed: performing the feature alignment operation on the third output feature or the fourth output feature using the correspondence corresponding to that classification type, so that among the feature maps included in the third output feature and those included in the fourth output feature, feature maps at the same channel number match.
Here, the feature alignment operation is performed using the correspondence matching the classification type of the input sample image; the accuracy of the feature alignment operation can therefore be improved, which improves the effect of student model training and, in turn, the prediction effect of the student model.
The present application further proposes an image processing method, which can be applied to any type of electronic device. By performing image processing with the image processing model (i.e., the aforementioned student model) trained by the model training method shown in any of the foregoing embodiments, a good prediction effect can be achieved with an image processing model of lower complexity, thereby improving the image processing speed without reducing the prediction effect.
The image processing method can include: acquiring a target image.
Image processing is performed on the target image using the student model trained according to the knowledge distillation method shown in any of the foregoing embodiments, obtaining an image processing result.
The student model can be a model of any type, for example an image classification model, an object detection model, or an object tracking model. Since the student model can be trained through the knowledge distillation method shown in any of the foregoing embodiments, the model combines a simple structure with a good prediction effect, thereby improving the image processing speed without reducing the prediction effect.
Corresponding to any of the above embodiments, the present application further proposes a knowledge distillation apparatus.
FIG. 6 is a schematic structural diagram of a knowledge distillation apparatus shown in the present application. As shown in FIG. 6, the apparatus 600 can include: a sample processing module 610, configured to process a training sample set with a student model and a teacher model respectively, to obtain a first output feature and a second output feature; a correspondence determination module 620, configured to determine, based on the first output feature and the second output feature, feature map pairs matched between the feature maps included in the first output feature and those included in the second output feature, and to determine, based on the feature map pairs, the correspondence between the channels in which the two feature maps included in each feature map pair are respectively located; and a training module 630, configured to train the student model, wherein, in each round of training, the student model and the teacher model are respectively used to process sample data to obtain a third output feature and a fourth output feature; the error between the third output feature and the true feature corresponding to the sample data is determined; a feature alignment operation is performed on the third output feature or the fourth output feature using the correspondence, so that among the feature maps included in the third output feature and those included in the fourth output feature, feature maps at the same channel number match; the gap between the aligned third output feature and fourth output feature is determined; and the model parameters of the student model are updated based on the error and the gap.
In some illustrated embodiments, the sample processing module 610 is specifically configured to: process one or more samples in the training sample set with the student model, to obtain one or more student model output features respectively corresponding to the one or more samples; perform a weighted summation of the values at the same position in the one or more student model output features, to obtain the first output feature; process the one or more samples with the teacher model, to obtain one or more teacher model output features respectively corresponding to the one or more samples; and perform a weighted summation of the values at the same position in the one or more teacher model output features, to obtain the second output feature.
In some illustrated embodiments, the correspondence determination module 620 is configured to determine the correspondence using a bipartite graph matching algorithm or a greedy matching algorithm.
In some illustrated embodiments, the correspondence determination module 620 is configured to: for each feature map included in the first output feature, take that feature map as the current feature map; determine, among the feature maps included in the second output feature, the first matched feature map matching the current feature map; record the first sub-correspondence between the channel of the current feature map and the channel of the first matched feature map; and determine the correspondence based on the recorded first sub-correspondences.
In some illustrated embodiments, the correspondence determination module 620 is configured to: according to the correspondence already maintained, remove, among the feature maps included in the second output feature, the second matched feature maps that have already been determined to match first feature maps included in the first output feature; for each feature map included in the first output feature other than the first feature maps, take that feature map as the current feature map, and determine, among the remaining feature maps of the channels of the second output feature, the third matched feature map matching the current feature map; record the second sub-correspondence between the channel of the current feature map and the channel of the third matched feature map; and determine the correspondence based on the recorded second sub-correspondences.
In some illustrated embodiments, the apparatus further includes: a pre-training module, configured to pre-train an initial student model and an initial teacher model with an initial training sample set, to obtain the student model and the teacher model.
The apparatus further includes: a recording module, configured to record the initialization parameters corresponding to the initial student model before the pre-training of the initial student model. The training module is configured to: initialize the student model with the initialization parameters; and train the initialized student model.
In some illustrated embodiments, the apparatus further includes: a generation module, configured to generate a transformation matrix based on the correspondence, where the transformation matrix characterizes the correspondence between the channels of the feature maps included in the second output feature and the channels of the feature maps included in the first output feature.
In some illustrated embodiments, the training module is configured to: when the number of rows of the transformation matrix represents the number of channels of the feature maps included in the second output feature and the number of columns represents the number of channels of the feature maps included in the first output feature, transform the fourth output feature with the transformation matrix, so that among the feature maps included in the third output feature and those included in the fourth output feature, feature maps at the same channel number match; or, when the number of rows represents the number of channels of the feature maps included in the first output feature and the number of columns represents the number of channels of the feature maps included in the second output feature, transform the third output feature with the transformation matrix, so that among the feature maps included in the third output feature and those included in the fourth output feature, feature maps at the same channel number match.
In some illustrated embodiments, the training module is configured to: determine the loss according to the result of a weighted summation of the error and the gap; and perform back-propagation on the student model according to the loss, to update the model parameters of the student model.
In some illustrated embodiments, the training sample set includes samples of multiple classification types. The correspondence determination module 620 is configured to: for each of the multiple classification types, based on the first output feature and the second output feature corresponding to that classification type, determine the feature map pairs matched between the feature maps of the channels included in the first output feature corresponding to the classification type and the feature maps included in the second output feature corresponding to the classification type, and, based on those feature map pairs, determine the correspondence between the channels in which the two feature maps of each pair corresponding to that classification type are respectively located. The training module 630 is configured to: determine the classification type corresponding to the sample data; and perform the feature alignment operation on the third output feature or the fourth output feature using the correspondence corresponding to that classification type, so that among the feature maps included in the third output feature and those included in the fourth output feature, feature maps at the same channel number match.
The present application further proposes an image processing apparatus, including: an acquisition module, configured to acquire a target image; and an image processing module, configured to perform image processing on the target image using the student model trained according to the knowledge distillation method shown in any of the foregoing embodiments, to obtain an image processing result.
The embodiments of the knowledge distillation apparatus or the image processing apparatus shown in the present application can be applied to an electronic device. Correspondingly, the present application discloses an electronic device, which can include: a processor; and a memory for storing instructions executable by the processor, wherein the processor is configured to invoke the executable instructions stored in the memory to implement the foregoing knowledge distillation method or image processing method.
Referring to FIG. 7, FIG. 7 is a schematic diagram of the hardware structure of an electronic device shown in the present application.
As shown in FIG. 7, the electronic device can include a processor for executing instructions, a network interface for network connection, a memory for storing running data for the processor, and a non-volatile storage for storing the instructions corresponding to the knowledge distillation apparatus or the image processing apparatus.
The embodiments of the apparatus can be implemented by software, by hardware, or by a combination of software and hardware. Taking a software implementation as an example, an apparatus in the logical sense is formed by the processor of the electronic device in which it is located reading the corresponding computer program instructions from the non-volatile storage into the memory and running them. At the hardware level, in addition to the processor, memory, network interface, and non-volatile storage shown in FIG. 7, the electronic device in which the apparatus of an embodiment is located can usually include other hardware according to its actual functions, which is not repeated here.
It can be understood that, to improve the processing speed, the instructions corresponding to the knowledge distillation apparatus or the image processing apparatus can also be stored directly in the memory, which is not limited here.
The present application proposes a computer-readable storage medium storing a computer program, where the computer program is used to execute the foregoing knowledge distillation method or image processing method.
Those skilled in the art should understand that one or more embodiments of the present application can be provided as a method, a system, or a computer program product. Accordingly, one or more embodiments of the present application can take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, one or more embodiments of the present application can take the form of a computer program product implemented on one or more computer-usable storage media (which can include, but are not limited to, disk storage, CD-ROM, optical storage, etc.) containing computer-usable program code.
"And/or" in the present application means having at least one of the two; for example, "A and/or B" can cover three options: A, B, and "A and B".
The embodiments of the present application are described in a progressive manner; for the same or similar parts between the embodiments, reference can be made to one another, and each embodiment focuses on its differences from the others. In particular, the data processing device embodiment is described relatively simply since it is substantially similar to the method embodiments; for relevant parts, reference can be made to the description of the method embodiments.
The foregoing describes specific embodiments of the present application. Other embodiments are within the scope of the appended claims. In some cases, the actions or steps recited in the claims can be performed in an order different from that in the embodiments and still achieve the desired results. In addition, the processes depicted in the drawings do not necessarily require the particular order shown, or a sequential order, to achieve the desired results. In some implementations, multitasking and parallel processing are also possible or may be advantageous.
The embodiments of the subject matter and functional operations described in the present application can be implemented in digital electronic circuitry, in tangibly embodied computer software or firmware, in computer hardware that can include the structures disclosed in the present application and their structural equivalents, or in a combination of one or more of them. Embodiments of the subject matter described in the present application can be implemented as one or more computer programs, that is, one or more modules of computer program instructions encoded on a tangible non-transitory program carrier to be executed by a data processing apparatus or to control the operation of the data processing apparatus. Alternatively or additionally, the program instructions can be encoded on an artificially generated propagated signal, such as a machine-generated electrical, optical, or electromagnetic signal, which is generated to encode and transmit information to a suitable receiver apparatus for execution by the data processing apparatus. The computer storage medium can be a machine-readable storage device, a machine-readable storage substrate, a random or serial access memory device, or a combination of one or more of them.
The processes and logic flows described in the present application can be performed by one or more programmable computers executing one or more computer programs to perform corresponding functions by operating on input data and generating output. The processes and logic flows can also be performed by, and an apparatus can also be implemented as, special purpose logic circuitry, such as an FPGA (field programmable gate array) or an ASIC (application specific integrated circuit).
Computers suitable for executing a computer program can include, for example, general-purpose and/or special-purpose microprocessors, or any other type of central processing unit. Generally, the central processing unit will receive instructions and data from a read-only memory and/or a random access memory. The basic components of a computer can include a central processing unit for implementing or executing instructions and one or more memory devices for storing instructions and data. Generally, a computer will also include one or more mass storage devices for storing data, such as magnetic disks, magneto-optical disks, or optical disks, or the computer will be operatively coupled to such mass storage devices to receive data from them, transfer data to them, or both. However, a computer does not necessarily have such devices. Moreover, a computer can be embedded in another device, such as a mobile phone, a personal digital assistant (PDA), a mobile audio or video player, a game console, a global positioning system (GPS) receiver, or a portable storage device such as a universal serial bus (USB) flash drive, to name a few.
Computer-readable media suitable for storing computer program instructions and data can include all forms of non-volatile memory, media, and memory devices, including, for example, semiconductor memory devices (such as EPROM, EEPROM, and flash memory devices), magnetic disks (such as internal hard disks or removable disks), magneto-optical disks, and CD-ROM and DVD-ROM disks. The processor and the memory can be supplemented by, or incorporated in, special purpose logic circuitry.
Although the present application contains many specific implementation details, these should not be construed as limiting the scope of any disclosure or of what may be claimed, but are mainly used to describe the features of specific embodiments of a particular disclosure. Certain features described in multiple embodiments within the present application can also be implemented in combination in a single embodiment. Conversely, various features described in a single embodiment can also be implemented separately in multiple embodiments or in any suitable sub-combination. Moreover, although features may function in certain combinations as described above and even be initially claimed as such, one or more features from a claimed combination can in some cases be removed from the combination, and the claimed combination can be directed to a sub-combination or a variation of a sub-combination.
Similarly, although operations are depicted in the drawings in a particular order, this should not be understood as requiring that such operations be performed in the particular order shown or sequentially, or that all illustrated operations be performed, to achieve desirable results. In some cases, multitasking and parallel processing may be advantageous. Moreover, the separation of various system modules and components in the described embodiments should not be understood as requiring such separation in all embodiments, and it should be understood that the described program components and systems can generally be integrated together in a single software product or packaged into multiple software products.
Thus, specific embodiments of the subject matter have been described. Other embodiments are within the scope of the appended claims. In some cases, the actions recited in the claims can be performed in a different order and still achieve desirable results. In addition, the processes depicted in the drawings do not necessarily require the particular order shown, or a sequential order, to achieve desirable results. In some implementations, multitasking and parallel processing may be advantageous.
The above are merely preferred embodiments of one or more embodiments of the present application and are not intended to limit them. Any modification, equivalent replacement, improvement, and the like made within the spirit and principles of one or more embodiments of the present application shall be included within their protection scope.

Claims (17)

1. A knowledge distillation method, comprising:
    processing a training sample set with a student model and a teacher model respectively, to obtain a first output feature and a second output feature;
    based on the first output feature and the second output feature, determining feature map pairs matched between feature maps included in the first output feature and feature maps included in the second output feature, and, based on the feature map pairs, determining a correspondence between channels in which the two feature maps included in each of the feature map pairs are respectively located;
    training the student model; wherein, in each round of training,
    the student model and the teacher model are respectively used to process sample data, to obtain a third output feature and a fourth output feature;
    an error between the third output feature and a true feature corresponding to the sample data is determined;
    a feature alignment operation is performed on the third output feature or the fourth output feature using the correspondence, so that among the feature maps included in the third output feature and the feature maps included in the fourth output feature, feature maps at a same channel number match;
    a gap between the aligned third output feature and fourth output feature is determined;
    and model parameters of the student model are updated based on the error and the gap.
2. The method according to claim 1, wherein processing the training sample set with the student model and the teacher model respectively to obtain the first output feature and the second output feature comprises:
    processing one or more samples in the training sample set with the student model, to obtain one or more student model output features respectively corresponding to the one or more samples;
    performing a weighted summation of values at a same position in the one or more student model output features respectively corresponding to the one or more samples, to obtain the first output feature;
    processing the one or more samples with the teacher model, to obtain one or more teacher model output features respectively corresponding to the one or more samples;
    performing a weighted summation of values at a same position in the one or more teacher model output features respectively corresponding to the one or more samples, to obtain the second output feature.
3. The method according to claim 1 or 2, wherein determining, based on the first output feature and the second output feature, the feature map pairs matched between the feature maps included in the first output feature and the feature maps included in the second output feature, and determining, based on the feature map pairs, the correspondence between the channels in which the two feature maps included in each of the feature map pairs are respectively located, comprises:
    determining the correspondence using a bipartite graph matching algorithm or a greedy matching algorithm.
4. The method according to claim 3, wherein determining the correspondence using the greedy matching algorithm comprises:
    for each of the feature maps included in the first output feature,
    taking that feature map as a current feature map,
    determining, among the feature maps included in the second output feature, a first matched feature map matching the current feature map,
    recording a first sub-correspondence between the channel in which the current feature map is located and the channel in which the first matched feature map is located;
    determining the correspondence based on the recorded first sub-correspondences.
5. The method according to claim 3, wherein determining the correspondence using the bipartite graph matching algorithm comprises:
    removing, according to a maintained correspondence, from the feature maps included in the second output feature, a second matched feature map that has been determined to match a first feature map included in the first output feature;
    for each of the feature maps included in the first output feature other than the first feature map,
    taking that feature map as a current feature map,
    determining, among the remaining feature maps of the second output feature, a third matched feature map matching the current feature map,
    recording a second sub-correspondence between the channel in which the current feature map is located and the channel in which the third matched feature map is located;
    determining the correspondence based on the recorded second sub-correspondences.
6. The method according to any one of claims 1-5, further comprising:
    pre-training an initial student model and an initial teacher model with an initial training sample set, to obtain the student model and the teacher model;
    the method further comprising:
    before performing the pre-training on the initial student model, recording initialization parameters corresponding to the initial student model;
    wherein training the student model comprises:
    initializing the student model with the initialization parameters;
    training the initialized student model.
7. The method according to any one of claims 1-6, further comprising:
    generating a transformation matrix based on the correspondence; wherein the transformation matrix is used to characterize the correspondence between the channels in which the feature maps included in the second output feature are located and the channels in which the feature maps included in the first output feature are located.
8. The method according to claim 7, wherein performing the feature alignment operation on the third output feature or the fourth output feature using the correspondence, so that among the feature maps included in the third output feature and the feature maps included in the fourth output feature, feature maps at a same channel number match, comprises:
    when the number of rows of the transformation matrix represents the number of channels of the feature maps included in the second output feature and the number of columns of the transformation matrix represents the number of channels of the feature maps included in the first output feature, transforming the fourth output feature with the transformation matrix, so that among the feature maps included in the third output feature and the feature maps included in the fourth output feature, feature maps at a same channel number match.
9. The method according to claim 7, wherein performing the feature alignment operation on the third output feature or the fourth output feature using the correspondence, so that among the feature maps included in the third output feature and the feature maps included in the fourth output feature, feature maps at a same channel number match, comprises:
    when the number of rows of the transformation matrix represents the number of channels of the feature maps included in the first output feature and the number of columns of the transformation matrix represents the number of channels of the feature maps included in the second output feature, transforming the third output feature with the transformation matrix, so that among the feature maps included in the third output feature and the feature maps included in the fourth output feature, feature maps at a same channel number match.
10. The method according to any one of claims 1-9, wherein updating the model parameters of the student model based on the error and the gap comprises:
    determining a loss according to a result of a weighted summation of the error and the gap;
    performing back-propagation on the student model according to the loss, to update the model parameters of the student model.
11. The method according to claim 2, wherein the training sample set comprises samples of multiple classification types;
    wherein determining, based on the first output feature and the second output feature, the feature map pairs matched between the feature maps included in the first output feature and the feature maps included in the second output feature, and determining, based on the feature map pairs, the correspondence between the channels in which the two feature maps included in each of the feature map pairs are respectively located, comprises:
    for each of the multiple classification types,
    based on the first output feature and the second output feature corresponding to the classification type, determining feature map pairs matched between the feature maps included in the first output feature corresponding to the classification type and the feature maps included in the second output feature corresponding to the classification type,
    based on the feature map pairs corresponding to the classification type, determining a correspondence between the channels in which the two feature maps included in each of the feature map pairs corresponding to the classification type are respectively located;
    wherein performing the feature alignment operation on the third output feature or the fourth output feature using the correspondence, so that among the feature maps included in the third output feature and the feature maps included in the fourth output feature, feature maps at a same channel number match, comprises:
    determining the classification type corresponding to the sample data;
    performing the feature alignment operation on the third output feature or the fourth output feature using the correspondence corresponding to the classification type, so that among the feature maps included in the third output feature and the feature maps included in the fourth output feature, feature maps at a same channel number match.
12. An image processing method, comprising:
    acquiring a target image;
    performing image processing on the target image using the student model obtained through the knowledge distillation method according to any one of claims 1-11, to obtain an image processing result.
13. A knowledge distillation apparatus, comprising:
    a sample processing module, configured to process a training sample set with a student model and a teacher model respectively, to obtain a first output feature and a second output feature;
    a correspondence determination module, configured to determine, based on the first output feature and the second output feature, feature map pairs matched between feature maps included in the first output feature and feature maps included in the second output feature, and to determine, based on the feature map pairs, a correspondence between channels in which the two feature maps included in each of the feature map pairs are respectively located;
    a training module, configured to train the student model; wherein, in each round of training, the student model and the teacher model are respectively used to process sample data, to obtain a third output feature and a fourth output feature; an error between the third output feature and a true feature corresponding to the sample data is determined; a feature alignment operation is performed on the third output feature or the fourth output feature using the correspondence, so that among the feature maps included in the third output feature and the feature maps included in the fourth output feature, feature maps at a same channel number match; a gap between the aligned third output feature and fourth output feature is determined; and model parameters of the student model are updated based on the error and the gap.
14. An image processing apparatus, comprising:
    an acquisition module, configured to acquire a target image;
    an image processing module, configured to perform image processing on the target image using the student model obtained through the knowledge distillation method according to any one of claims 1-11, to obtain an image processing result.
15. An electronic device, comprising:
    a processor;
    a memory for storing instructions executable by the processor;
    wherein the processor is configured to invoke the executable instructions stored in the memory to implement the knowledge distillation method according to any one of claims 1-11 or the image processing method according to claim 12.
16. A computer-readable storage medium storing a computer program, wherein the computer program is used to execute the knowledge distillation method according to any one of claims 1-11 or the image processing method according to claim 12.
17. A computer program product, comprising a computer program stored in a memory, wherein, when the computer program instructions are executed by a processor, the knowledge distillation method according to any one of claims 1-11 or the image processing method according to claim 12 is implemented.
PCT/CN2021/130895 2021-01-22 2021-11-16 Knowledge distillation and image processing method, apparatus, electronic device and storage medium WO2022156331A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202110090849.2 2021-01-22
CN202110090849.2A CN112819050B (zh) 2021-01-22 Knowledge distillation and image processing method, apparatus, electronic device and storage medium

Publications (1)

Publication Number Publication Date
WO2022156331A1 true WO2022156331A1 (zh) 2022-07-28

Family

ID=75858950

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/130895 WO2022156331A1 (zh) 2021-11-16 Knowledge distillation and image processing method, apparatus, electronic device and storage medium

Country Status (2)

Country Link
CN (1) CN112819050B (zh)
WO (1) WO2022156331A1 (zh)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117726884A * 2024-02-09 2024-03-19 腾讯科技(深圳)有限公司 Training method for object category recognition model, object category recognition method and apparatus

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112819050B (zh) * 2021-01-22 2023-10-27 北京市商汤科技开发有限公司 Knowledge distillation and image processing method, apparatus, electronic device and storage medium
CN115565021A (zh) * 2022-09-28 2023-01-03 北京大学 Neural network knowledge distillation method based on learnable feature transformation

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111242297A (zh) * 2019-12-19 2020-06-05 北京迈格威科技有限公司 Knowledge distillation-based model training method, image processing method and apparatus
CN111598923A (zh) * 2020-05-08 2020-08-28 腾讯科技(深圳)有限公司 Target tracking method and apparatus, computer device and storage medium
CN111898735A (zh) * 2020-07-14 2020-11-06 上海眼控科技股份有限公司 Distillation learning method and apparatus, computer device and storage medium
CN112115783A (zh) * 2020-08-12 2020-12-22 中国科学院大学 Face landmark detection method, apparatus and device based on deep knowledge transfer
CN112819050A (zh) * 2021-01-22 2021-05-18 北京市商汤科技开发有限公司 Knowledge distillation and image processing method, apparatus, electronic device and storage medium

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107247989B (zh) * 2017-06-15 2020-11-24 北京图森智途科技有限公司 Real-time computer vision processing method and apparatus
CN108830288A (zh) * 2018-04-25 2018-11-16 北京市商汤科技开发有限公司 Image processing method, neural network training method, apparatus, device and medium
CN110263842B (zh) * 2019-06-17 2022-04-05 北京影谱科技股份有限公司 Neural network training method, apparatus, device and medium for object detection
CN111260056B (zh) * 2020-01-17 2024-03-12 北京爱笔科技有限公司 Network model distillation method and apparatus
KR102191351B1 (ko) * 2020-04-28 2020-12-15 아주대학교산학협력단 Semantic image segmentation method based on knowledge distillation

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111242297A (zh) * 2019-12-19 2020-06-05 北京迈格威科技有限公司 Knowledge distillation-based model training method, image processing method and apparatus
CN111598923A (zh) * 2020-05-08 2020-08-28 腾讯科技(深圳)有限公司 Target tracking method and apparatus, computer device and storage medium
CN111898735A (zh) * 2020-07-14 2020-11-06 上海眼控科技股份有限公司 Distillation learning method and apparatus, computer device and storage medium
CN112115783A (zh) * 2020-08-12 2020-12-22 中国科学院大学 Face landmark detection method, apparatus and device based on deep knowledge transfer
CN112819050A (zh) * 2021-01-22 2021-05-18 北京市商汤科技开发有限公司 Knowledge distillation and image processing method, apparatus, electronic device and storage medium

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117726884A (zh) * 2024-02-09 2024-03-19 腾讯科技(深圳)有限公司 Training method for object category recognition model, object category recognition method and apparatus
CN117726884B (zh) * 2024-02-09 2024-05-03 腾讯科技(深圳)有限公司 Training method for object category recognition model, object category recognition method and apparatus

Also Published As

Publication number Publication date
CN112819050A (zh) 2021-05-18
CN112819050B (zh) 2023-10-27

Similar Documents

Publication Publication Date Title
WO2022156331A1 (zh) Knowledge distillation and image processing method, apparatus, electronic device and storage medium
CN110366734B (zh) Optimizing neural network architectures
Iscen et al. Label propagation for deep semi-supervised learning
CN109643383B (zh) Domain separation neural networks
CN111797893B (zh) Neural network training method, image classification system and related device
US20210342643A1 (en) Method, apparatus, and electronic device for training place recognition model
US20200026986A1 (en) Neural network method and apparatus with parameter quantization
US11651214B2 (en) Multimodal data learning method and device
CN116261731A (zh) Relational learning method and system based on multi-hop attention graph neural network
US20200234119A1 (en) Systems and methods for obtaining an artificial intelligence model in a parallel configuration
WO2022174805A1 (zh) Model training and image processing method and apparatus, electronic device, and storage medium
WO2021253941A1 (zh) Neural network model training, image classification and text translation method and apparatus, and device
US20230073669A1 (en) Optimising a neural network
CN112446888A (zh) Processing method and processing apparatus for image segmentation model
CN111340057B (zh) Classification model training method and apparatus
WO2021012691A1 (zh) Method and apparatus for retrieving images
CN114155388B (zh) Image recognition method and apparatus, computer device and storage medium
CN114445692B (zh) Image recognition model construction method and apparatus, computer device and storage medium
CN113869366B (zh) Model training method, kinship classification method, retrieval method and related apparatus
CN112861474B (zh) Information annotation method, apparatus, device and computer-readable storage medium
CN114912540A (zh) Transfer learning method, apparatus, device and storage medium
CN114663714A (zh) Image classification and ground object classification method and apparatus
CN110175231B (zh) Visual question answering method, apparatus and device
CN112132175A (zh) Object classification method and apparatus, electronic device and storage medium
CN111738403B (zh) Neural network optimization method and related device

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21920707

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 21920707

Country of ref document: EP

Kind code of ref document: A1