CN115170917B

CN115170917B - Image processing method, electronic device and storage medium

Info

Publication number: CN115170917B
Application number: CN202210701707.XA
Authority: CN
Inventors: 刘宁; 唐剑; 蒯文啸; 张法朝; 奉飞飞
Original assignee: Midea Group Co Ltd; Midea Group Shanghai Co Ltd
Current assignee: Midea Group Co Ltd; Midea Group Shanghai Co Ltd
Priority date: 2022-06-20
Filing date: 2022-06-20
Publication date: 2023-11-07
Anticipated expiration: 2042-06-20
Also published as: WO2023246177A1; CN115170917A

Abstract

The application relates to the technical field of computers, and provides an image processing method, electronic equipment and a storage medium, wherein the method comprises the following steps: inputting an image to be processed into a target image processing model to obtain target image information output by the target image processing model, wherein the target image processing model is established based on a target mask tensor, and an image channel in the target image information corresponds to a channel with a component of 1 in the target mask tensor; the target mask tensor is determined based on the steps of: acquiring a first calculation map corresponding to a first image processing model; determining a second computational graph based on the first computational graph; and performing forward calculation and backward calculation of the first image processing model according to the second calculation map to determine the target mask tensor. According to the method, redundant channels are removed from a target image processing model for processing the image to be processed, the parameter number is reduced, the calculation speed of the model is improved on the premise that the image processing precision is not affected, and the hardware requirements of model deployment are reduced.

Description

Image processing method, electronic device and storage medium

Technical Field

The present application relates to the field of computer technologies, and in particular, to an image processing method, an electronic device, and a storage medium.

Background

Since 2012, the deep learning algorithm achieves the achievement of attractive attention in the image classification task, an image processing model is built based on a deep convolutional neural network to gradually replace the traditional statistical learning to become a mainstream frame and method of computer vision, and the deep learning algorithm is widely applied to aspects including face recognition, auxiliary driving and the like.

However, the high-precision image processing model is generally complex in design, huge in parameter quantity, needs to occupy high storage space and consume computing resources during deployment, and is difficult to directly deploy application at the equipment end with medium and low computational power.

Disclosure of Invention

The present application aims to solve at least one of the technical problems existing in the prior art. Therefore, the application provides an image processing method, which reduces the demand of an image processing task on equipment end computing force.

An image processing method according to an embodiment of the first aspect of the present application includes:

inputting an image to be processed into a target image processing model to obtain target image information output by the target image processing model, wherein the target image processing model is built based on a target mask tensor, and an image channel in the target image information corresponds to a channel with a component of 1 in the target mask tensor;

The target mask tensor is determined based on the steps of:

acquiring a first calculation map corresponding to a first image processing model, wherein the first image processing model is a model obtained by structural pruning, and the first calculation map comprises a plurality of first nodes which are in one-to-one correspondence with a plurality of operators in the first image processing model;

determining a second calculation graph based on the first calculation graph, wherein the second calculation graph comprises a plurality of second nodes, and the second nodes are obtained by packaging at least one first node;

and performing forward computation and backward computation of the first image processing model according to the second computation graph to determine the target mask tensor.

According to the image processing method provided by the embodiment of the application, the first image processing model subjected to model pruning is subjected to representation reconstruction to obtain the first calculation diagram, the operator repackaging is performed based on the first calculation diagram to obtain the second calculation diagram, the target mask tensor is further determined, the redundant channels are removed, the parameter quantity is reduced, the target image processing model is automatically generated, the compact model capable of being arranged at the middle-low computing power equipment end is automatically generated from the sparse model, the image to be processed is processed by utilizing the compact model, and the model compression is realized without influencing the model precision.

According to one embodiment of the present application, the performing forward computation and backward computation of the first image processing model according to the second computation graph, determining the target mask tensor includes:

performing forward computation and corresponding reverse computation on the first image processing model according to the second computation graph, and updating a mark mask corresponding to each operator according to the output result of the forward computation and the corresponding reverse computation;

and circularly executing the forward computation and the corresponding reverse computation until the mark mask corresponding to each operator is not updated any more, and obtaining the target mask tensor based on the mark mask corresponding to each operator.

According to one embodiment of the present application, the performing, according to the second computation graph, a forward computation and a corresponding backward computation on the first image processing model, updating, according to an output result of the forward computation and the corresponding backward computation, a marker mask corresponding to each of the operators includes:

and determining that the output result of forward calculation of any operator in the first image processing model is 0 or the solution gradient of reverse calculation is 0, and updating the mark mask corresponding to any operator to be 0.

and performing forward computation on the first image processing model according to the second computation graph, and performing reverse computation on the first image processing model based on the output result of the forward computation of the first image processing model to determine the target mask tensor.

According to one embodiment of the present application, the obtaining a first computation graph corresponding to a first image processing model includes:

traversing the operators of the first image processing model, and determining the association relation of the operators;

the first computational graph is determined based on the plurality of operators and the association relationship.

According to one embodiment of the present application, the determining a second calculation map based on the first calculation map includes:

and packaging the first nodes with the same scope in the first computational graph into the second nodes, and determining the second computational graph.

According to one embodiment of the application, before the performing of the forward computation and the backward computation of the first image processing model from the second computation graph, the method further comprises:

And randomly initializing the weight corresponding to the operator with the weight in the first image processing model to be a non-0 value.

An image processing apparatus according to an embodiment of a second aspect of the present application includes:

the processing module is used for inputting an image to be processed into a target image processing model to obtain target image information output by the target image processing model, the target image processing model is built based on a target mask tensor, and an image channel in the target image information corresponds to a channel with a component of 1 in the target mask tensor;

the target mask tensor is determined based on the steps of:

An electronic device according to an embodiment of the third aspect of the present application includes a memory, a processor, and a computer program stored on the memory and executable on the processor, the processor implementing the image processing method as any one of the above when executing the computer program.

A non-transitory computer readable storage medium according to an embodiment of the fourth aspect of the present application has stored thereon a computer program which, when executed by a processor, implements the image processing method as described in any of the above.

A computer program product according to an embodiment of the fifth aspect of the application comprises a computer program which, when executed by a processor, implements the image processing method as described in any of the above.

The above technical solutions in the embodiments of the present application have at least one of the following technical effects:

performing representation diagram reconstruction on a first image processing model subjected to model pruning to obtain a first calculation diagram, performing operator repackaging on the basis of the first calculation diagram to obtain a second calculation diagram, further determining a target mask tensor, directly generating a compact model to process an image to be processed through an automatic search reconstruction mode, directly arranging the compact model at a middle-low computational power equipment end, needing no professional knowledge, and reducing the time and labor cost of the deployment of a sparse model in landing.

Further, updating the mark masks corresponding to all operators in the first image processing model according to whether the output result of the forward computation is 0 and whether the solving gradient of the backward computation is 0, stopping the forward computation and the backward computation after the mark masks corresponding to all operators in the first image processing model are not updated any more, determining the target mask tensor, accurately judging the operators and parameters which need to be reserved when the compact model is built, and ensuring the precision of the target image processing model.

Furthermore, the new second calculation graph is obtained by repackaging the action domains corresponding to the first nodes according to the layer structure one by one, the second nodes in the second calculation graph correspond to the layer structure of the first image processing model one by one, and each operator in the first image processing model can determine the input and output, domain names, father nodes, child nodes and other attributes in the second calculation graph.

Additional aspects and advantages of the application will be set forth in part in the description which follows, and in part will be obvious from the description, or may be learned by practice of the application.

Drawings

In order to more clearly illustrate the embodiments of the application or the technical solutions in the prior art, the drawings that are required in the embodiments or the description of the prior art will be briefly described, it being obvious that the drawings in the following description are only some embodiments of the application, and that other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.

Fig. 1 is a schematic flow chart of an image processing method according to an embodiment of the present application;

FIG. 2 is a schematic diagram of a deployment flow of a target image processing model according to an embodiment of the present application;

fig. 3 is a schematic structural view of an image processing apparatus according to an embodiment of the present application;

fig. 4 is a schematic structural diagram of an electronic device according to an embodiment of the present application.

Detailed Description

Embodiments of the present application are described in further detail below with reference to the accompanying drawings and examples. The following examples are illustrative of the application but are not intended to limit the scope of the application.

In describing embodiments of the present application, it should be noted that the terms "first," "second," and "third" are used for descriptive purposes only and are not to be construed as indicating or implying relative importance.

In the description of the present specification, a description referring to terms "one embodiment," "some embodiments," "examples," "specific examples," or "some examples," etc., means that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the embodiments of the present application. In this specification, schematic representations of the above terms are not necessarily directed to the same embodiment or example. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples. Furthermore, the different embodiments or examples described in this specification and the features of the different embodiments or examples may be combined and combined by those skilled in the art without contradiction.

The image processing method according to the embodiment of the application is described below with reference to fig. 1 and 2, and the pruned sparse model is utilized to automatically generate a compact model to process the image to be processed, so that model compression is realized and model processing precision is ensured.

As shown in fig. 1, the image processing method according to the embodiment of the present application includes step 110, where the execution body of the method is a controller of a device terminal, or a cloud or an edge server.

And 110, inputting the image to be processed into a target image processing model to obtain target image information output by the target image processing model.

The target image processing model is built based on the target mask tensor, and the image channels in the target image information correspond to the channels with the component of 1 in the target mask tensor.

In this embodiment, the target image processing model may perform operations such as image transformation, image enhancement, image restoration, image compression and encoding, image reconstruction, and image segmentation and feature extraction on the image to be processed.

For example, the image to be processed is input to a target image processing model, the target image processing model performs an image enhancement operation on the image to be processed, and the output target image information is an image obtained by adding the image to be processed.

For another example, the image to be processed is input to a target image processing model, the target image processing model performs image segmentation and feature extraction operations on the image to be processed, classification processing is further performed, and the output target image information is the type of the object in the image to be processed.

The target image information corresponding to the image to be processed output by the target image processing model comprises information such as image width, image height, image channel and the like.

In this embodiment, a channel with a component of 1 in the target mask tensor is extracted and stored in an initializing operator, a target image processing model is built, and an image channel in the target image information corresponds to the channel with the component of 1 in the target mask tensor.

The target mask tensor is determined based on the steps of:

And performing forward calculation and backward calculation of the first image processing model according to the second calculation map to determine the target mask tensor.

It should be noted that, the target image information obtained by performing image processing on the image to be processed by the first image processing model through a plurality of operators is consistent with the target image information obtained by performing image processing on the image to be processed by the target image processing model, the first image processing model is a sparse model obtained by structured pruning, and the target image processing model is a compact model obtained by further removing redundant parameters.

It will be appreciated that the target mask tensor is to store the marker masks corresponding to the plurality of operators in the first image processing model in the form of tensors, and the number of channels of the operators is equal to the number of components of the target mask tensor.

In this embodiment, after the target mask tensor is determined, only the channel with the component equal to 1 in the target mask tensor needs to be extracted and stored into the initialized new operator, so that the redundant channel with the component value of 0 can be removed, and a compact target image processing model is built.

The following describes a procedure for determining a target mask tensor according to an embodiment of the present application:

the first step, a first calculation map corresponding to the first image processing model is obtained.

The first image processing model is an image processing model obtained by structured pruning, the structured pruning is pruning by taking a convolution kernel as a unit, an auxiliary layer masking layer is inserted, the masking layer only sets the weight of a corresponding channel to 0, but the weights of the corresponding channel are not actually deleted from the model, and the obtained first image processing model is a sparse model.

The operators in the first image processing model characterize the operations performed by the functions, which are performed on the image input to the first image processing model, e.g., convolution, feature fusion, etc.

The first computational graph is a mathematical image that characterizes the node-to-node relationship, and the node signature handles digital, symbolic, or signature inputs and outputs.

And determining a plurality of first nodes in one-to-one correspondence according to a plurality of operators in the first image processing model, and obtaining a first calculation map according to the plurality of first nodes.

It should be noted that, in this embodiment, the plurality of operators in the first image processing model refers to all operators in the first image processing model that characterize operations performed by the functions, and accordingly, the plurality of first nodes of the first computation graph characterize all processing numbers, symbols, or flag type inputs and outputs in the first image processing model.

For example, in the first image processing model, there are operator a, operator B, operator C, operator D, operator E, operator F, operator G, and operator H.

The first calculation map corresponding to the obtained first image processing model comprises a first node a, a first node b, a first node c, a first node d, a first node e, a first node f, a first node g and a first node h.

Wherein the first node a corresponds to operator a, the first node B corresponds to operator B, and so on.

In actual implementation, the first image processing model may be expanded by using a deep learning training framework such as PaddlePaddle, tensorflow, caffe, theano, MXNet, torch and PyTorch, to obtain a corresponding first calculation map.

Taking the first image processing model as ResNet18 as an example, the first computation graph developed by the PyTorch deep learning training framework comprises 372 first nodes corresponding to 372 operators of a plurality of types, the 372 first nodes can be divided into constant nodes for storing input and output, network module nodes and functional function nodes, wherein the network module nodes can correspond to operations such as convolution and activation functions, and the functional function nodes can correspond to operations such as splicing and transposition.

And a second step of determining a second calculation map based on the first calculation map.

And repackaging the plurality of first nodes in the first calculation graph to obtain a plurality of second nodes, and further obtaining a second calculation graph.

In this embodiment, the second node is encapsulated by at least one first node.

In actual implementation, encapsulation and fusion can be performed according to the operator characteristics corresponding to each first node in the first calculation graph.

For example, according to the layer structure corresponding to the operator corresponding to the first node in the first image processing model, the first node is encapsulated, and the second node is obtained.

And packaging one or more first nodes corresponding to operators belonging to a certain convolution layer in the first image processing model into a second node, wherein the second node characterizes processing digital, symbol or mark type input and output of the convolution layer.

In this embodiment, the first nodes in the first computational graph are repackaged according to the layer structure features of the first image processing model, so as to obtain a plurality of second nodes, and the second computational graph is determined according to the plurality of second nodes.

It can be understood that, according to the layer structure characteristics of the first image processing model, the second nodes obtained by packaging the first nodes correspond to the layer structure of the first image processing model, the number of layers of the first image processing model is high, the number of second nodes in the second calculation graph is high, and the connection relationship of the second nodes in the second calculation graph represents the interlayer data input and output relationship of the first image processing model.

Taking the first image processing model as ResNet18, the first calculation map developed by the PyTorch deep learning training frame comprises 372 first nodes corresponding to 372 operators of a plurality of types as an example.

A convolution layer torch.nn.Conv2d is split into 4 prim:: listConstruct function nodes and 1 aten:: -Convolumtion module nodes in the first computational graph, and the weights, inputs and outputs of the convolution layer are stored in constant nodes in the first computational graph.

The batch normalization layer Batch Normalization has inputs split into 4 constant nodes in the first computational graph, weight, bias, running_mean, and running_var.

And packaging a plurality of first nodes such as a function node, a module node, a constant node and the like of the convolution layer torch.nn.Conv2d as a second node, and packaging 4 first nodes of the batch normalization layer Batch Normalization as a second node to determine a second calculation graph.

The second nodes after encapsulation are in one-to-one correspondence with the layer structure of the first image processing model, and the number of the second nodes in the second computational graph of the ResNet18 obtained after encapsulation of the first nodes is reduced to 69.

And thirdly, performing forward computation and backward computation of the first image processing model according to the second computation graph, and determining the target mask tensor of the first image processing model.

The forward calculation refers to a process of inputting data into a model and obtaining a result through calculation of the model, the forward calculation of the model can be composed of a plurality of operation operations, and the forward calculation of the model can be completed by using a calculation graph corresponding to an operator of the model by a framework.

The reverse calculation is a calculation process of calculating the gradient of each layer in the model by taking the error between the result of the forward calculation and the expected result as input and reversely propagating the error according to an automatic differentiation mechanism.

In this embodiment, forward computation and backward computation of the first image processing model may be performed according to the second computation graph, and the input-output marker masks are created for all operators in the first image processing model, so as to obtain a target mask tensor of the first image processing model.

It will be appreciated that the forward and backward calculations of the first image processing model are performed according to the second computational graph, i.e. according to the layer structure in the first image processing model, and when the residual structure occurs in the first image processing model, the marker mask determined by the backbone network of the residual structure may be shared to the branch network of the residual structure, so that the channel dimensions of the pruned backbone network and the branch network remain consistent.

It should be noted that, the target mask tensor determined according to the mark masks corresponding to the plurality of operators in the first image processing model is a multiple linear function, where multiple linearity refers to that the tensor is linear for each parameter, and the component of the tensor is a value of the tensor acting on a corresponding set of basis vectors.

It may be appreciated that the target mask tensor includes a marker mask corresponding to a plurality of operators in the first image processing model, the marker mask corresponding to each operator characterizes the parameter importance of the operator, and whether the parameter is redundant or not is determined according to the marker mask corresponding to each operator, and whether the operator needs to be preserved or not is determined.

In this embodiment, compared with the first image processing model, the target image processing model is a compact model with less parameter and more simplified structure, and the target image processing model has smaller calculation resource requirements and memory requirements, so that the target image processing model can meet wider application requirements than a sparse model which only performs structured pruning operation.

In actual implementation, after the compact model of the target image processing model is built according to the target mask tensor, the target image processing model can be transplanted on hardware equipment, and the process of building the target image processing model is equivalent to the process of pruning the sparse model of the first image processing model.

For example, a convolutional neural network model for performing a computer vision recognition task is structured and pruned to obtain a first image processing model a, and by the image processing method, a compact model of a target image processing model B is obtained, where the target image processing model B may be disposed on an end-side device where an image acquisition device with middle-low computational power is located.

In the embodiment of the application, the first image processing model subjected to model pruning is subjected to representation graph reconstruction to obtain a first calculation graph, operator repackaging is performed based on the first calculation graph to obtain a second calculation graph, then the target mask tensor is determined, the compact model is directly generated in an automatic search reconstruction mode, the target image processing model can be directly applied to equipment end with middle and low calculation power without professional knowledge, and the time and labor cost for deploying the sparse model for landing are reduced.

According to the application method of the image processing model provided by the embodiment of the application, the target image processing model is a compact model generated by an automatic searching and reconstructing mode based on the first image processing model of the model pruning, and is used for processing the image to be processed, the target image processing model removes redundant channels, the parameter number is reduced, the calculation speed of the model is improved on the premise of not influencing the model precision, and the hardware requirements of hardware equipment deployed by the target image processing model are reduced.

In some embodiments, the third step of determining the target mask tensor in the image processing method may include:

performing forward computation and corresponding reverse computation on the first image processing model according to the second computation graph, and updating a mark mask corresponding to each operator in the plurality of operators according to output results of the forward computation and the corresponding reverse computation;

and circularly executing forward computation and corresponding reverse computation until determining that the mark mask corresponding to each operator in the plurality of operators of the first image processing model is not updated any more, and obtaining a target mask tensor based on the mark mask corresponding to each operator.

In this embodiment, forward computation and backward computation of the first image processing model are performed according to the second computation graph, the input-output mark masks are created for all operators in the first image processing model, and the mark masks corresponding to each operator are updated by multiple forward computation and multiple backward computation.

When the mark masks corresponding to all operators in the first image processing model are not updated any more, the mark mask corresponding to each operator in all operators is determined, and then the target mask operation of the first image processing model is generated according to the determined mark mask corresponding to each operator.

In this embodiment, performing forward computation and reverse computation on the first image processing model a plurality of times according to the second computation graph, and updating the marker mask corresponding to each of the plurality of operators according to the output results of the forward computation and the corresponding reverse computation may include:

In this embodiment, during the forward computation operation and the backward computation operation performed a plurality of times, when the forward computation is performed, it is checked whether the output result of the forward computation of the operator is 0, if so, the marker mask corresponding to the operator is updated to be 0, and when the backward computation is performed, it is checked whether the solution gradient of the backward computation of the operator is 0, and if so, the marker mask corresponding to the operator is updated to be 0.

For example, a forward computing operation is performed on a first operator in the first image processing model according to the second computation graph, and when an output result of the first forward computing is determined to be 0, a mark mask corresponding to the first operator is updated to be 0.

It can be understood that the marker mask corresponding to the first operator is updated to 0, which indicates that the first operator in the first image processing model belongs to the inactive position, and the weight and the marker mask are set to 0, so that in the process of establishing the target image processing model, the operation on the component of the target mask tensor corresponding to the marker mask of the first operator is not required.

For another example, when it is determined that the solution gradient calculated in the reverse direction of the second operator is 0, the marker mask corresponding to the second operator is updated to 0, which indicates that the second operator in the first image processing model belongs to an inactive position, and the weight and the marker mask are set to 0, in the process of establishing the target image processing model, there is no need to perform an operation on the component of the target mask tensor corresponding to the marker mask of the second operator.

In this embodiment, the marker masks corresponding to all operators in the first image processing model are updated according to whether the output result of the forward computation is 0 and whether the solution gradient of the backward computation is 0, and after the marker masks corresponding to all operators in the first image processing model are no longer updated, the forward computation and the backward computation are stopped, so as to determine the target mask tensor of the first image processing model.

A specific embodiment is described below.

And (3) carrying out structured pruning to obtain a first image processing model, obtaining an output mask of a BN layer of the first image processing model in the training process of the structured pruning, and after carrying out representation reconstruction and operator repackaging to obtain a second calculation map, distributing corresponding mark masks to all output channels and model weights of the first image processing model according to the second calculation map.

Because the output channels of the BN layer and the convolution layer are in one-to-one correspondence, when forward calculation is executed, the output channel mask of the convolution layer can be obtained by reasoning according to the output mask of the BN layer in the structured pruning process.

And in the third step of determining the target mask tensor, before performing forward computation and backward computation of the first image processing model according to the second computation graph, randomly initializing the weight corresponding to the operator with the weight in the first image processing model to be a non-0 value.

Before performing the forward computation and the backward computation, all operators with weights in the first image processing model need to be screened out, and the weights are randomly initialized to be non-0 value, so as to avoid removing operators which are not removed but have the weight value of 0 in the process of subsequently establishing the target image processing model, and prevent interference with the establishment of the subsequent compact model.

For example, the first image processing model comprises a layer structure of Conv1- > BN1- > Conv2- > BN2 and the second computational graph characterizes the layer structure relationship of Conv1- > BN1- > Conv2- > BN 2.

Pruning of Conv1 convolutional layers is described below.

Assuming that the input image to be processed is an RGB picture, the output shape of Conv1, BN1 is (B, C1, H, W), then the weight shape of Conv1 is (3, C1, K1), the output shape of Conv2, BN2 is (B, C2, H, W), and the weight shape of Conv2 is (C1, C2, K2).

In the sparse training of the previous structured pruning, of the C1 channels for which BN1 has been obtained, the mask of D channels is marked 0, that is to say the D dimensions are to be pruned.

The positions of the branches to be pruned are corresponding to the output channels of Conv1 when forward computation is performed, namely D channels of Conv1 are marked as 0, the rest are marked as 1, the shape of the output dimension of Conv1 is changed into (B, C1-D, H, W) after pruning, H represents the image height, W represents the image width, and C1-D represents the image channels.

Then, the weight of Conv1 needs to be pruned, the counter-propagation is performed according to the second calculation graph, the solving gradient size of the weight tensor is calculated, the solving gradient of a certain operator position is equal to 0, which means that the weight of the position is not updated, then the mask of the operator is marked as 0, and the weight shape of Conv1 is changed into (3, C1-D, K1, K1) after pruning.

and performing forward computation on the first image processing model according to the second computation graph, and performing reverse computation on the first image processing model based on the output result of the forward computation of the first image processing model to determine a target mask tensor.

In this embodiment, forward computation is performed on the first image processing model according to the second computation graph, and reverse computation is performed on the first image processing model according to the output result of the forward computation.

And performing inverse computation on the first image processing model, and taking the error between the output result of the forward computation and the expected result as input, computing the gradient of each layer in the first image processing model, namely, computing the solving gradient of the output in the inverse computation.

In some embodiments, the first step of determining the target mask tensor in the image processing method may include:

traversing a plurality of operators of the first image processing model, and determining association relations of the operators;

a first computational graph is determined based on the plurality of operators and the association relationship.

In this embodiment, the first image processing model obtained by structured pruning is traversed, a plurality of operators of the first image processing model are structured and ordered, association relations among the operators are determined, and a previous operator, a next operator and corresponding input and output information of each operator are determined.

And determining a plurality of first nodes corresponding to the operators according to the operators of the first image processing model, and determining the connection relation of the first nodes in the first calculation graph according to the association relation among the operators.

In actual execution, according to the previous-stage operator, the next-stage operator and the corresponding input/output information corresponding to each operator in the first image processing model, the father node, the child node and the corresponding input/output information corresponding to the first node are determined.

In some embodiments, the second step of determining the target mask tensor in the image processing method may include:

and packaging the first nodes with the same scope in the first computational graph as a second node, and determining the second computational graph.

It will be appreciated that the plurality of first nodes in the first computational graph correspond to a plurality of operators in the first image processing model that characterize the operation performed by the function, each operator having its own corresponding scope.

The first image processing model is provided with a plurality of layers, the first nodes of each layer after being split share the same acting domain, the first nodes of the same acting domain can be repackaged into new second nodes by searching the first nodes of the same acting domain, and the first calculation graph is packaged to obtain a second calculation graph.

In this embodiment, the scope corresponding to the first node is repackaged module by module according to the layer structure, so as to obtain a new second computation graph, the second node in the second computation graph corresponds to the layer structure of the first image processing model one by one, and each operator in the first image processing model can determine the input and output, domain name, father node, child node and other attributes in the second computation graph.

The process of compressing the sparse model obtained by unstructured pruning is described below.

And obtaining a second image processing model through unstructured pruning, setting the redundancy parameter of the second image processing model to be 0, and establishing a target sparse matrix based on the second image processing model with the redundancy parameter set to be 0, wherein the target sparse matrix is used for storing non-0 elements in the second image processing model.

Mining redundant parameters in the second image processing model, setting the mined redundant parameters to 0, and setting the 0-operation-based second image processing model to comprise 0 element and non-0 element.

In the embodiment, 0 element in the second image processing model is removed in a sparse matrix form, a target sparse matrix is established to store non-0 element in the second image processing model, the storage space of the memory occupied by the second image processing model is saved, and the image to be processed is processed through the model of the target sparse matrix.

For example, the numerical values and the line number start-stop points of all non-0 elements in the two-dimensional matrix of the second image processing model are stored by two one-dimensional sparse matrices, respectively.

In the embodiment, 0 element in the second image processing model is removed in a sparse matrix form, so that the storage space of the memory occupied by the second image processing model is saved, the time and labor cost for deploying the second image processing model as a sparse model are reduced, and model compression is realized without influencing model precision.

A specific embodiment is described below.

As shown in fig. 2, in step 200, the model is thinned, and the thinning operation of pruning is performed on the initial neural network model with a large number of redundant parameters, and pruning includes structured pruning and unstructured pruning.

For structured pruning:

and 210, structured pruning to obtain a first image processing model.

Step 211, representing graph reconstruction, using the deep learning training frame to develop the first image processing model, and obtaining a corresponding first calculation graph.

And 212, repackaging the operators, and repackaging the corresponding nodes of the operators according to the layer structure of the first image processing model to obtain a second calculation graph.

Step 213, creating and updating the mask, performing forward computation and backward computation of the first image processing model according to the second computation graph, and determining the target mask tensor of the first image processing model.

And 214, removing redundant channels, extracting channels with component equal to 1 in the target mask tensor, storing the channels in the initialized new operator, removing redundant channels with component value of 0, and generating a compact target image processing model.

And (3) entering step 230, deploying hardware, and deploying the compact target image processing model obtained in step 214 into hardware equipment for application.

For unstructured pruning:

and 220, unstructured pruning to obtain a second image processing model.

And 221, setting the weight to 0, mining the redundant parameters in the second image processing model, and setting the mined redundant parameters to 0.

Step 222, sparse matrix storage, namely establishing a target sparse matrix to store non-0 elements in the second image processing model, and saving the storage space of the memory occupied by the second image processing model.

And (3) entering a step 230 and hardware deployment, and deploying the target sparse matrix obtained in the step 222 into hardware equipment for application.

According to the embodiment of the application, the pruned sparse model is further compressed, the parameter quantity of the model is reduced, the obtained compact model and the sparse matrix can be directly deployed on low-calculation-force end side equipment, and the time cost and the labor cost for the neural network model to be deployed and landed are reduced.

The image processing apparatus provided in the embodiment of the present application will be described below, and the image processing apparatus described below and the image processing method described above may be referred to correspondingly to each other.

As shown in fig. 3, an image processing apparatus provided in an embodiment of the present application includes:

the processing module 310 is configured to input an image to be processed into a target image processing model, obtain target image information output by the target image processing model, and establish the target image processing model based on a target mask tensor, where an image channel in the target image information corresponds to a channel with a component of 1 in the target mask tensor;

The target mask tensor is determined based on the steps of:

According to the image processing device provided by the embodiment of the application, the target image processing model is a compact model generated by an automatic searching and reconstructing mode based on the first image processing model of model pruning, redundant channels are removed from the target image processing model for processing the image to be processed, the parameter number is reduced, the calculation speed of the model is improved on the premise that the model precision is not influenced, and the hardware requirements of hardware equipment deployed by the target image processing model are reduced.

In some embodiments, the processing module 310 is configured to perform a forward computation and a corresponding backward computation on the first image processing model according to the second computation graph, and update the marker mask corresponding to each operator according to an output result of the forward computation and the corresponding backward computation;

And performing forward computation and corresponding reverse computation circularly until the mark mask corresponding to each operator is not updated any more, and obtaining a target mask tensor based on the mark mask corresponding to each operator.

In some embodiments, the processing module 310 is configured to determine that an output result of forward computation of any operator in the first image processing model is 0 or a solution gradient of backward computation is 0, and update a marker mask corresponding to any operator to be 0.

In some embodiments, the processing module 310 is configured to perform a forward computation on the first image processing model according to the second computation graph, and perform a backward computation on the first image processing model based on an output result of the forward computation of the first image processing model, to determine the target mask tensor.

In some embodiments, the processing module 310 is configured to traverse a plurality of operators of the first image processing model, and determine an association relationship of the plurality of operators;

In some embodiments, the processing module 310 is configured to encapsulate a first node with the same scope in the first computation graph as a second node, and determine the second computation graph.

In some embodiments, the processing module 310 is configured to randomly initialize a weight corresponding to an operator with a weight in the first image processing model to a non-0 value before performing forward computation and backward computation of the first image processing model according to the second computation graph.

Fig. 4 illustrates a physical schematic diagram of an electronic device, as shown in fig. 4, which may include: processor 410, communication interface (Communications Interface) 420, memory 430 and communication bus 440, wherein processor 410, communication interface 420 and memory 430 communicate with each other via communication bus 440. Processor 410 may invoke logic instructions in memory 430 to perform an image processing method comprising: inputting an image to be processed into a target image processing model to obtain target image information output by the target image processing model, wherein the target image processing model is established based on a target mask tensor, and an image channel in the target image information corresponds to a channel with a component of 1 in the target mask tensor;

the target mask tensor is determined based on the steps of:

Further, the logic instructions in the memory 430 described above may be implemented in the form of software functional units and may be stored in a computer-readable storage medium when sold or used as a stand-alone product. Based on this understanding, the technical solution of the present application may be embodied essentially or in a part contributing to the prior art or in a part of the technical solution, in the form of a software product stored in a storage medium, comprising several instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) to perform all or part of the steps of the method according to the embodiments of the present application. And the aforementioned storage medium includes: a U-disk, a removable hard disk, a Read-Only Memory (ROM), a random access Memory (RAM, random Access Memory), a magnetic disk, or an optical disk, or other various media capable of storing program codes.

Further, the present application also provides a computer program product, the computer program product comprising a computer program, the computer program being storable on a non-transitory computer readable storage medium, the computer program, when executed by a processor, being capable of executing the image processing method provided by the above method embodiments, the method comprising: inputting an image to be processed into a target image processing model to obtain target image information output by the target image processing model, wherein the target image processing model is established based on a target mask tensor, and an image channel in the target image information corresponds to a channel with a component of 1 in the target mask tensor;

The target mask tensor is determined based on the steps of:

In another aspect, embodiments of the present application also provide a non-transitory computer readable storage medium having stored thereon a computer program which, when executed by a processor, is implemented to perform the image processing method provided in the above embodiments, the method including: inputting an image to be processed into a target image processing model to obtain target image information output by the target image processing model, wherein the target image processing model is established based on a target mask tensor, and an image channel in the target image information corresponds to a channel with a component of 1 in the target mask tensor;

The target mask tensor is determined based on the steps of:

The apparatus embodiments described above are merely illustrative, wherein the elements illustrated as separate elements may or may not be physically separate, and the elements shown as elements may or may not be physical elements, may be located in one place, or may be distributed over a plurality of network elements. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment. Those of ordinary skill in the art will understand and implement the present invention without undue burden.

From the above description of the embodiments, it will be apparent to those skilled in the art that the embodiments may be implemented by means of software plus necessary general hardware platforms, or of course may be implemented by means of hardware. Based on this understanding, the foregoing technical solution may be embodied essentially or in a part contributing to the prior art in the form of a software product, which may be stored in a computer readable storage medium, such as ROM/RAM, a magnetic disk, an optical disk, etc., including several instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to execute the method described in the respective embodiments or some parts of the embodiments.

Finally, it should be noted that: the above embodiments are only for illustrating the technical solution of the present application, and are not limiting; although the application has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical scheme described in the foregoing embodiments can be modified or some technical features thereof can be replaced by equivalents; such modifications and substitutions do not depart from the spirit and scope of the technical solutions of the embodiments of the present application.

The above embodiments are only for illustrating the present application, and are not limiting of the present application. While the application has been described in detail with reference to the embodiments, those skilled in the art will appreciate that various combinations, modifications, or equivalent substitutions can be made to the technical solutions of the present application without departing from the spirit and scope of the technical solutions of the present application, and it is intended to be covered by the scope of the claims of the present application.

Claims

1. An image processing method, comprising:

the target mask tensor is determined based on the steps of:

Performing forward computation and backward computation of the first image processing model according to the second computation graph to update a mark mask corresponding to each operator according to output results of the forward computation and the corresponding backward computation; and determining the target mask tensor according to the updated mark mask corresponding to each operator.

2. The image processing method according to claim 1, wherein the performing forward computation and backward computation of the first image processing model from the second computation graph, determining the target mask tensor, comprises:

3. The image processing method according to claim 2, wherein the performing forward computation and corresponding backward computation on the first image processing model according to the second computation graph, updating the mark mask corresponding to each of the operators according to the output result of the forward computation and corresponding backward computation, comprises:

4. The image processing method according to claim 1, wherein the performing forward computation and backward computation of the first image processing model from the second computation graph, determining the target mask tensor, comprises:

5. The method according to claim 1, wherein the acquiring a first calculation map corresponding to the first image processing model includes:

6. The image processing method according to any one of claims 1 to 5, wherein the determining a second calculation map based on the first calculation map includes:

7. The image processing method according to any one of claims 1 to 5, characterized in that before the performing of the forward computation and the backward computation of the first image processing model from the second computation graph, the method further comprises:

8. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor implements the image processing method of any of claims 1 to 7 when the program is executed by the processor.

9. A non-transitory computer readable storage medium, on which a computer program is stored, characterized in that the computer program, when executed by a processor, implements the image processing method according to any one of claims 1 to 7.

10. A computer program product comprising a computer program which, when executed by a processor, implements the image processing method according to any one of claims 1 to 7.