CN111144511A - Image processing method, system, medium and electronic terminal based on neural network - Google Patents
- Publication number
- CN111144511A CN111144511A CN201911420549.5A CN201911420549A CN111144511A CN 111144511 A CN111144511 A CN 111144511A CN 201911420549 A CN201911420549 A CN 201911420549A CN 111144511 A CN111144511 A CN 111144511A
- Authority
- CN
- China
- Prior art keywords
- image processing
- calibration data
- quantization
- neural network
- data set
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/23—Clustering techniques
- G06F18/232—Non-hierarchical techniques
- G06F18/2321—Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
- G06F18/23213—Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions with fixed number of clusters, e.g. K-means clustering
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
Abstract
The invention provides an image processing method, system, storage medium and electronic terminal based on a neural network. The method comprises: acquiring, through a calibration data set, a plurality of image processing parameters corresponding to the calibration data set, wherein each image processing parameter corresponds to a layer of the neural network; processing an input image through the image processing parameters corresponding to the current layer, acquiring the output result of the current layer, and judging, based on that output result, whether to output adjusted image processing parameters; if the judgment result is yes, processing the output result of the current layer through the adjusted image processing parameters and inputting it to the next layer, until all layers have been processed. The invention realizes dynamic adjustment of inference across the whole neural network, gives the image processing result good adaptability, and minimizes the precision loss of each layer of the neural network during image quantization while preserving image processing speed.
Description
Technical Field
The invention relates to the field of computer application, in particular to an image processing method, an image processing system, a storage medium and an electronic terminal based on a neural network.
Background
A deep neural network usually contains a large number of parameters and computing nodes, so its inference process is computationally expensive, while artificial intelligence applications place ever higher demands on inference speed, especially on end-side AI devices, which usually require real-time processing performance. To speed up network inference, besides various efficient computing platforms (AI accelerators), methods of model compression and computation acceleration are applied; among them, fixed-point processing, i.e., quantization during computation, is the most common. To fully utilize the computing power of the platform, quantization (of both the network weights and the feature maps) must be performed. Quantization inevitably introduces quantization errors, which cause some loss of precision (accuracy loss) in the computed results; the problem is more prominent when the quantized values use a low-precision representation such as INT8 or below.
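For reference, symmetric linear quantization of a tensor to INT8 can be sketched as follows (a minimal illustration of the general technique, not the patent's implementation; the clip range [-127, 127] and round-to-nearest are common conventions):

```python
import numpy as np

def quantize_symmetric(x, saturation_threshold, bit_width=8):
    # Clip to [-T, T] and scale into the signed integer range.
    qmax = 2 ** (bit_width - 1) - 1            # 127 for INT8
    scale = qmax / saturation_threshold        # the quantization factor
    q = np.clip(np.round(x * scale), -qmax, qmax).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    return q.astype(np.float32) / scale

x = np.array([0.5, -1.2, 3.0, 0.01])
q, scale = quantize_symmetric(x, saturation_threshold=2.0)
# values beyond the threshold saturate; tiny values lose resolution
```

Choosing the saturation threshold trades clipping error against resolution, which is exactly what the calibration procedures below aim to optimize.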
Quantization methods in the prior art focus almost entirely on quantizing the network model parameters and mostly adopt static (offline) quantization based on a calibration data set; they have no good adaptability during online inference of the network, i.e., the quantization of the feature maps cannot be adjusted or optimized according to the real network input.
Disclosure of Invention
In view of the above-mentioned shortcomings of the prior art, the present invention provides an image processing method, system, storage medium and electronic terminal based on neural network to solve the above technical problems.
The invention provides an image processing method based on a neural network, which comprises the following steps:
acquiring a plurality of image processing parameters corresponding to a calibration data set through the calibration data set, wherein each image processing parameter corresponds to each layer in a neural network; each image processing parameter comprises a group of saturation thresholds and quantization factors which correspond to each other;
processing an input image through an image processing parameter corresponding to a current layer, acquiring an output result of the current layer, and judging whether to output an adjusted image processing parameter or not based on the output result of the current layer;
if the judgment result is yes, the output result of the current layer is processed through the adjusted image processing parameters and then input to the next layer until the processing of all layers is completed. The image processing parameters are dynamically adjusted according to the image processing result of the current layer, quantization optimization is performed aiming at the characteristic diagram of each layer of the network, and the precision loss caused by quantization is reduced to the maximum extent while the image processing efficiency is ensured.
Optionally, if the determination result is negative, the output result of the current layer is directly input to the next layer until the processing of all layers is completed. The image processing efficiency is ensured.
Optionally, a quantization array corresponding to the calibration data set is obtained through the calibration data set, where the quantization array includes a plurality of image processing parameters, each image processing parameter includes a set of saturation thresholds and quantization factors corresponding to each other, and each image processing parameter in the quantization array corresponds to each layer in the neural network. By the method, the optimal saturation threshold and the quantization factor of the characteristic diagram of each layer of the network can be statically quantized, and the optimal alternative parameter selection is provided for dynamic adjustment.
Optionally, the calibration data set includes a plurality of sub-calibration data sets, where each sub-calibration data set corresponds to a quantization array.
Optionally, augmenting the images in the calibration data set;
and clustering the images in the expanded calibration data set to obtain a plurality of sub-calibration data sets. And acquiring the optimal saturation threshold and the quantization factor of the characteristic diagram of each layer of the network in a clustering and quantizing manner.
Optionally, features are gathered according to the number of channels of each picture in the calibration data set, so as to obtain a feature vector of each picture;
and clustering is performed according to the feature vectors, acquiring the plurality of sub-calibration data sets and recording the centroid of each sub-calibration data set.
Optionally, when the input image of the current layer is processed, the image processing parameters of the current layer are determined according to the comparison between the input image and the sub-calibration data set.
Optionally, calculating a contrast parameter of the input image, where the contrast parameter includes a mean and a variance of each channel of the input image;
obtaining a degree of difference by comparing the contrast parameter to the centroid of the sub-calibration data set;
and determining the image processing parameters corresponding to the current layer according to the degree of difference. In this way, the class with the smallest degree of difference is found, so as to obtain the saturation threshold and quantization factor of each network layer for the current input picture.
Optionally, a d-dimensional vector set formed by a plurality of d-dimensional vectors is obtained by respectively calculating the mean and variance of each channel of each picture in the expanded calibration data set, where d = 2n and n is the number of channels;
and clustering the d-dimensional vector set to obtain the sub-calibration data set.
Optionally, judging whether to output the adjusted image processing parameter according to the output result of the current layer, including:
obtaining the maximum value Max and the minimum value Min of all feature maps in the output result of the current layer, taking M = max{|Max|, |Min|}, and restoring M to a floating-point value M_f; if
M_f < alpha * T[m][l],
where alpha is a decision coefficient,
then M_f is selected as the adjusted saturation threshold.
Optionally, the adjusted quantization factor corresponding to the adjusted saturation threshold is obtained as follows:
SR' = SR[m][l] * T[m][l] / M_f,
where SR' is the adjusted quantization factor, SR[m][l] is the quantization factor of the current layer l, T[m][l] is the saturation threshold of the current layer l, and m is the index value with the smallest degree of difference.
Optionally, when performing clustering, the plurality of sub-calibration data sets are obtained by performing clustering for a plurality of times.
The invention also provides an image processing system based on the neural network, which comprises:
the quantization module is used for acquiring a plurality of image processing parameters corresponding to the calibration data set through the calibration data set, and each image processing parameter corresponds to each layer in the neural network; each image processing parameter comprises a group of saturation thresholds and quantization factors which correspond to each other;
the image processing module is used for processing the input image through the image processing parameters corresponding to the current layer, acquiring the output result of the current layer and judging whether to output the adjusted image processing parameters or not based on the output result of the current layer;
if the judgment result is yes, the output result of the current layer is processed through the adjusted image processing parameters and then input to the next layer until the processing of all layers is completed.
Optionally, the image processing module is further configured to, if the determination result is negative, directly input the output result of the current layer to a next layer until the processing of all layers is completed.
Optionally, the quantization module is further configured to acquire a quantization array corresponding to the calibration data set through the calibration data set, wherein the quantization array comprises a plurality of image processing parameters, each image processing parameter comprises a group of mutually corresponding saturation thresholds and quantization factors, and each image processing parameter in the quantization array corresponds to a layer in the neural network.
Optionally, the calibration data set includes a plurality of sub-calibration data sets, where each sub-calibration data set corresponds to a quantization array.
Optionally, the quantization module comprises
A pre-processing unit for augmenting images in the calibration data set;
and the clustering unit is used for clustering the images in the expanded calibration data set to obtain a plurality of sub-calibration data sets.
Optionally, the quantization module further comprises
the feature acquisition unit is used for gathering features according to the number of channels of each picture in the calibration data set to obtain a feature vector of each picture;
and the clustering unit performs clustering according to the feature vectors, acquires the plurality of sub-calibration data sets and records the centroid of each sub-calibration data set.
Optionally, the quantization module further includes:
the contrast parameter calculation unit is used for calculating contrast parameters of the input image, and the contrast parameters comprise the mean value and the variance of each channel of the input image;
an input image comparison unit for obtaining a degree of difference by comparing the contrast parameter with the centroid of the sub-calibration data set.
Optionally, the contrast parameter calculation unit obtains a d-dimensional vector set formed by a plurality of d-dimensional vectors by respectively calculating the mean and variance of each channel of each picture in the expanded calibration data set, where d = 2n and n is the number of channels;
and the clustering unit is used for clustering the d-dimensional vector set to obtain the sub-calibration data set.
Optionally, the image processing module is configured to determine whether to output the adjusted image processing parameter according to the output result of the current layer, and if the determination result is yes, obtain a process of the adjusted image processing parameter, which specifically includes:
obtaining the maximum value Max and the minimum value Min of all feature maps in the output result of the current layer, taking M = max{|Max|, |Min|}, and restoring M to a floating-point value M_f; if
M_f < alpha * T[m][l],
where alpha is a decision coefficient,
then M_f is selected as the adjusted saturation threshold.
Optionally, the image processing module obtains the adjusted quantization factor corresponding to the adjusted saturation threshold by:
SR' = SR[m][l] * T[m][l] / M_f,
where SR' is the adjusted quantization factor, SR[m][l] is the quantization factor of the current layer l, T[m][l] is the saturation threshold of the current layer l, and m is the index value with the smallest degree of difference.
The invention also provides a computer-readable storage medium, on which a computer program is stored, which program, when being executed by a processor, carries out the method of any one of the above.
The present invention also provides an electronic terminal, comprising: a processor and a memory;
the memory is adapted to store a computer program and the processor is adapted to execute the computer program stored by the memory to cause the terminal to perform the method as defined in any one of the above.
The invention has the following beneficial effects: in the image processing method, system, storage medium and electronic terminal based on a neural network, when the neural network performs image processing, the image processing parameters can be reselected at each layer, for input pictures of differing characteristics, according to the output result of the current layer. This realizes dynamic adjustment of inference across the whole neural network and gives the image processing result good adaptability; it overcomes the differences, and the uncontrollability of those differences, in the data distribution of each layer's feature maps during image processing, exploits the fact that input pictures with similar characteristics produce similar feature maps at each layer, achieves optimal quantization of each layer's feature maps during online inference, and minimizes the precision loss of each layer of the neural network during image quantization while preserving image processing speed.
Drawings
Fig. 1 is a schematic flow chart of an image processing method based on a neural network in an embodiment of the present invention.
Fig. 2 is a schematic flow chart of layer-by-layer image processing of the neural network-based image processing method in the embodiment of the present invention.
FIG. 3 is a schematic diagram of a quadratic quantization flow of an image processing method based on a neural network according to an embodiment of the present invention;
fig. 4 is a schematic diagram of a hardware structure of a terminal device according to an embodiment of the present invention;
fig. 5 is a schematic diagram of a hardware structure of another terminal device according to an embodiment of the present invention.
Detailed Description
The embodiments of the present invention are described below with reference to specific embodiments, and other advantages and effects of the present invention will be easily understood by those skilled in the art from the disclosure of the present specification. The invention is capable of other and different embodiments and of being practiced or of being carried out in various ways, and its several details are capable of modification in various respects, all without departing from the spirit and scope of the present invention. It is to be noted that the features in the following embodiments and examples may be combined with each other without conflict.
It should be noted that the drawings provided in the following embodiments are only for illustrating the basic idea of the present invention, and the components related to the present invention are only shown in the drawings rather than drawn according to the number, shape and size of the components in actual implementation, and the type, quantity and proportion of the components in actual implementation may be changed freely, and the layout of the components may be more complicated.
In the following description, numerous details are set forth to provide a more thorough explanation of embodiments of the present invention, however, it will be apparent to one skilled in the art that embodiments of the present invention may be practiced without these specific details, and in other embodiments, well-known structures and devices are shown in block diagram form, rather than in detail, in order to avoid obscuring embodiments of the present invention.
As shown in fig. 1, the image processing method based on the neural network in the embodiment includes:
s1, acquiring a plurality of image processing parameters corresponding to the calibration data set through the calibration data set, wherein each image processing parameter corresponds to each layer in the neural network; each image processing parameter comprises a group of saturation thresholds and quantization factors which correspond to each other;
s2, processing the input image through the image processing parameters corresponding to the current layer, obtaining the output result of the current layer and based on the output result of the current layer;
s3, judging whether the adjusted image processing parameters are output or not;
and S4, if the judgment result is yes, processing the output result of the current layer through the adjusted image processing parameters, and inputting the processed output result to the next layer until the processing of all layers is completed.
And if the judgment result is negative, directly inputting the output result of the current layer to the next layer until the processing of all layers is finished.
In this embodiment, a method of dynamically obtaining adjusted image processing parameters according to the output result of the current layer is adopted, and quantization optimization is performed mainly on the feature map of each layer of the network, so as to reduce the precision loss caused by quantization. In step S1, a small calibration data set may be used to perform floating-point inference and obtain histogram statistics of each layer's feature maps, from which the optimal saturation threshold of each layer and its quantization factor under symmetric linear quantization are derived. However, during online inference, the features of an input picture may differ considerably from the statistics of the calibration data set; a single set of saturation thresholds and quantization factors cannot be guaranteed to be optimal, or even good, for every picture, and is difficult to guarantee even in a statistical sense.
In this embodiment, a quantization array corresponding to the calibration data set is obtained through the calibration data set; the quantization array comprises a plurality of image processing parameters, each comprising a set of mutually corresponding saturation thresholds and quantization factors, and each image processing parameter in the array corresponds to a layer of the neural network. First, the quantization array is obtained from the calibration data set; the input image is then processed through the image processing parameters of the first layer and the first layer's output result is obtained. Based on that output result, it is judged whether to output adjusted image processing parameters: if a preset condition is satisfied, the judgment result is yes, and the output result of the current layer is processed through the adjusted image processing parameters before being input to the next layer. This is repeated layer by layer, judging for each layer, from the output result of the previous layer, whether adjusted parameters should be output; that is, whenever a better quantization of the layer's feature map is found, the adjusted image processing parameters, i.e., the adjusted saturation threshold and quantization factor, are obtained from that layer's output, and image processing proceeds with them, until all layers have been processed. A dynamic quantization is thus formed, so that the feature map of every layer of the neural network is optimally quantized.
In this embodiment, the calibration data set is first expanded; the images in the expanded calibration data set are then clustered into a plurality of sub-calibration data sets (sub-datasets). For each sub-calibration data set, a set of optimal saturation thresholds and quantization factors is obtained, yielding multiple candidate quantization selections; during image processing, for example during online inference, a clustering criterion is used to pick one of these as the optimal quantization selection for each image.
In this embodiment, features are gathered per channel for each picture in the expanded calibration data set to obtain a feature vector of each picture; clustering is then performed on the feature vectors and the centroid of each class is recorded, from which the optimal saturation thresholds and quantization factors are obtained. The mean and variance of each channel of the current picture are compared with each class centroid to obtain the degree of difference, and the saturation threshold and quantization factor of each layer are determined accordingly.
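The per-channel statistics and clustering just described might be sketched as follows (simplified and illustrative: the helper names are invented, and the deterministic mini k-means stands in for a proper clustering routine such as scikit-learn's):

```python
import numpy as np

def image_feature_vector(img):
    # Per-channel mean and variance -> a d = 2n dimensional vector,
    # n being the number of channels; img has shape (H, W, n).
    return np.concatenate([img.mean(axis=(0, 1)), img.var(axis=(0, 1))])

def kmeans(X, k, iters=20):
    # Minimal k-means with deterministic initialization (first k points),
    # good enough for a sketch; returns (centroids, labels).
    X = np.asarray(X, dtype=float)
    centroids = X[:k].copy()
    labels = np.zeros(len(X), dtype=int)
    for _ in range(iters):
        dists = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=-1)
        labels = dists.argmin(axis=1)
        for j in range(k):
            if (labels == j).any():
                centroids[j] = X[labels == j].mean(axis=0)
    return centroids, labels

# Two synthetic "pictures" per class: dark images cluster apart from bright ones.
dark = image_feature_vector(np.zeros((4, 4, 3)))
bright = image_feature_vector(np.ones((4, 4, 3)))
X = np.stack([dark, bright, dark, dark, bright, bright])
centroids, labels = kmeans(X, k=2)
```

At inference time, the same feature vector is computed for the incoming picture and compared against the stored centroids to pick the closest class.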
Specifically, in this embodiment, the calibration data set is first expanded by a factor of k, for example k = 10, so that the pictures (calibration images) in the calibration data set grow from the original 500 to 500k. For each picture, the mean and variance of each channel are computed, forming a d-dimensional vector (d = 2n, where n is the number of channels); the 500k d-dimensional vectors form a set X = {x1, x2, x3, …, x500k}. X is divided into k classes by a clustering method (such as K-means clustering), and the centroid of each class is recorded, which is equivalent to dividing the 500k pictures into k sub-calibration data sets. Then, for each sub-calibration data set, the optimal saturation threshold T and quantization factor SR of each feature-map layer of the network are obtained, as follows:
first, a network floating point (FP32) inference is performed on the sub-calibration data set;
second, for each layer in the neural network:
s121, collecting all feature map values of the layer, dividing the feature map values into a plurality of bins, performing histogram statistics on the distribution of the bins, and recording the distribution as distr _ ref;
s122, generating N quantized (symmetric linear quantization) distribution diagrams, which are denoted as distr _ quant (i), by using N different saturation thresholds, where i is 0,1,2, …, and N-1, and the number of bins in each distribution diagram is the same as that in step S121;
and S123, traversing all the i values in the step S122, respectively calculating KL divergence (Kullback-Leibler divergence) between the distr _ quant (i) and the distr _ ref, and finding the i value with the minimum KL divergence so as to determine the optimal saturation threshold of the layer.
And S124, adopting symmetric linear quantization to obtain the optimal quantization factor of the layer according to the quantized fixed point bit width and the saturation threshold in the step S123.
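Steps S121–S124 parallel the well-known KL-divergence calibration procedure; the sketch below is heavily simplified (the bin count, level count, and the way quantization is simulated by snapping clipped values to uniform steps are our assumptions — real implementations redistribute quantized counts more carefully):

```python
import numpy as np

def kl_divergence(p, q, eps=1e-10):
    # KL(p || q) between two histograms, normalized inside.
    p = p.astype(float) / p.sum()
    q = q.astype(float) / q.sum()
    return float(np.sum(p * np.log((p + eps) / (q + eps))))

def best_saturation_threshold(activations, candidates, bins=64, levels=16):
    # Pick the candidate threshold whose simulated-quantization histogram
    # is closest in KL divergence to the reference distribution distr_ref.
    a = np.abs(np.asarray(activations, dtype=float))
    hi = a.max()
    distr_ref, _ = np.histogram(a, bins=bins, range=(0.0, hi))
    best_t, best_kl = None, np.inf
    for t in candidates:
        step = t / levels
        a_q = np.round(np.clip(a, 0.0, t) / step) * step   # simulate quantization
        distr_quant, _ = np.histogram(a_q, bins=bins, range=(0.0, hi))
        kl = kl_divergence(distr_ref, distr_quant)
        if kl < best_kl:
            best_t, best_kl = t, kl
    return best_t

# Bulk of activations in [0, 1] plus a few large outliers: clipping the
# outliers preserves the bulk's distribution far better than covering them.
acts = np.concatenate([np.linspace(0.0, 1.0, 1000), np.full(5, 10.0)])
best = best_saturation_threshold(acts, candidates=[1.0, 10.0])
```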
The k groups of saturation thresholds and quantization factors obtained by the above method are stored as arrays T[k][L] and SR[k][L], where L is the number of network layers (or the number of layers after fusion, e.g., a convolution layer and an activation layer fused into one). This embodiment takes convolution + activation as the example, and the specific layer-by-layer fixed-point inference process is shown in fig. 2.
In this embodiment, when each layer of the neural network performs image processing, the mean and variance of each channel of the current picture are computed in the input preprocessing or in the first batch-norm layer of the network and compared with the centroids of the k classes of X; the class with the smallest difference (using the same metric as in the clustering above) is found, its index value m is recorded, and T[m], SR[m] are taken as the saturation thresholds and quantization factors of each network layer for the current input picture.
In this embodiment, the selected group of optimally quantized image processing parameters includes the saturation threshold and quantization factor of each layer's feature map, but it cannot be guaranteed during real-time inference that every layer's quantization remains good; if the quantization of some layer causes a large precision loss, the accuracy of the final result output by the network cannot be guaranteed. Therefore, on the basis of step S1, this embodiment introduces a dynamically adjusted quantization strategy: the input image is processed through the image processing parameters of the current layer, the output result of the current layer is obtained, and the adjusted image processing parameters are derived from that output result, so that the quantization of the current layer can be conditionally adjusted during online inference. That is, once a better quantization of the layer's feature map is found, an adjusted saturation threshold and quantization factor are obtained (used only temporarily, without updating the static quantization result). This embodiment completes the dynamic adjustment with minimal computational cost, so that the time cost of online inference is not increased; in this way, the precision loss caused by quantization can be minimized while the image processing efficiency is preserved.
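This conditional per-layer strategy can be sketched end to end as follows (an illustrative floating-point reconstruction; `run_layers`, `quantize` and the `(T, SR)` parameter layout are our own names, and the real pipeline operates on fixed-point values as detailed below). The condition M_f < alpha * T and the update SR' = SR * T / M_f follow the formulas of this disclosure:

```python
import numpy as np

def quantize(y, factor, qmax=127):
    # Symmetric linear quantization of a feature map with factor SR.
    return np.clip(np.round(y * factor), -qmax, qmax)

def run_layers(layers, image, params, alpha=0.75):
    # `layers` are float layer functions; `params[l] = (T, SR)` is the
    # statically calibrated saturation threshold and quantization factor.
    x = np.asarray(image, dtype=float)
    for l, layer in enumerate(layers):
        T, SR = params[l]
        y = layer(x)                                   # current layer output
        m_f = float(max(abs(y.max()), abs(y.min())))   # M restored to float
        if m_f < alpha * T:                            # adjustment condition met
            T, SR = m_f, SR * T / m_f                  # SR' = SR * T / M_f
        x = quantize(y, SR)                            # adjusted values used temporarily
    return x

# One toy layer whose output range (0.5) is much tighter than T = 1.0,
# so the threshold is tightened and the factor rescaled to 127 / 0.5 = 254.
out = run_layers([lambda v: v * 0.25], np.array([1.0, -2.0]), params=[(1.0, 127.0)])
```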
In this embodiment, taking convolution + activation layer as the example, the layer-by-layer fixed-point inference process is shown in fig. 2. The weights and the input feature map are quantized offline, i.e., statically; weight quantization and feature-map quantization are independent of each other, and a fixed-point value is the corresponding floating-point value multiplied by its quantization factor. Layer_n_input_Q denotes the quantized input of layer n.
The quantization factor of the current layer's weights is recorded as 2^A, and the quantization factor of the current layer's input feature map as SR_n = 2^(IN_Q); A and IN_Q may be floating-point values with fractional parts. After the convolution computation, if the output result overflows the 32-bit range, it is shifted right, with the number of shift bits recorded as B. The bias is quantized with bit number IN_Q + A + B, and 32-bit saturation is applied after quantization. Then output + bias → result, and 32-bit saturation is applied to result. The result is shifted right to the target quantization bit width (INT16, INT8, etc.), with the number of shift bits recorded as C. After activation and saturation, the quantization result of the current layer's output feature map is obtained. The quantization factor of Layer_n_output_Q is SF_n = 2^(IN_Q + A + B + C), and the quantization factor of Layer_n+1_input_Q is SR_n+1. The scaling conversion is:
Layer_n+1_input_Q=Layer_n_output_Q*SR_n+1/SF_n。
so far, accomplished the successive layer fixed point reasoning process of current layer, accomplished the image processing of current layer promptly, the successive layer is handled according to this kind of mode, if this moment, when the output result of a certain layer is input to next layer, when the quantization of a certain layer leads to great precision loss, if do not adjust image processing parameter, then can make neural network from this layer, the precision receives the influence, and then leads to the precision (accuracy) of the final result of whole neural network output also can't obtain guaranteeing.
As shown in fig. 3, in this embodiment, if it is detected after the Layer_n calculation finishes that the condition for dynamically adjusting the saturation threshold and the quantization factor is satisfied, the adjusted saturation threshold and quantization factor are determined, image processing is performed based on that quantization factor, and the inference calculation of the next layer is then performed, so that accuracy is not degraded by the quantization of that layer.
Specifically, after the current layer (index value l) has been calculated, the maximum value Max and the minimum value Min of the current layer's output feature maps are obtained, and M = Max{|Max|, |Min|}. M is restored to a floating-point value, denoted M_f. If M_f < alpha * T[m][l] (in this embodiment, 0.5 ≤ alpha ≤ 1), where T is the saturation threshold of the current layer determined in step S1, then M_f is temporarily taken as the optimal saturation threshold of the current layer. Symmetric linear quantization is again adopted, and the current optimal quantization factor of the layer becomes SR' = SR[m][l] * T[m][l] / M_f. The specific dynamic quantization adjustment process is as follows:
When the dynamic adjustment condition is satisfied, the adjusted image processing parameters are obtained according to the output result Layer_n_output_Q of the current layer, and image processing is then performed.
Therefore, the adjusted quantization factor is:
Layer_n+1_input_Q’=round(Layer_n_output_Q*SR_n+1’/SF_n)
wherein SF_n is the quantization factor of the current layer's output and SR_n+1' is the adjusted quantization factor.
Similarly,
and acquiring the adjusted image processing parameters according to the output of Layer _ n +1_ output _ Q', and then performing image processing.
Therefore, the adjusted quantization factor is:
Layer_n+2_input_Q=round(Layer_n+1_output_Q’*SR_n+2’/SF_n+1)
wherein SF_n+1 is the quantization factor of layer n+1 and SR_n+2' is the adjusted quantization factor.
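The dynamic-adjustment decision described above can be sketched as follows. This is a hedged illustration under the assumptions stated in the comments; the names (maybe_adjust, T_ml, SR_ml) are hypothetical, and alpha defaults to a value within the [0.5, 1] range given in the embodiment.

```python
import numpy as np

def maybe_adjust(output_fmap_q, sf_n, T_ml, SR_ml, alpha=0.75):
    """Conditionally recompute the saturation threshold and quantization
    factor after a layer finishes, per the dynamic-adjustment rule:
    if M_f < alpha * T[m][l], take M_f as the new threshold and
    SR' = SR[m][l] * T[m][l] / M_f as the new factor."""
    m = np.max(np.abs(output_fmap_q))   # M = Max{|Max|, |Min|}
    m_f = m / sf_n                      # restore M to a floating-point value
    if m_f < alpha * T_ml:              # dynamic-adjustment condition holds
        return m_f, SR_ml * T_ml / m_f
    return T_ml, SR_ml                  # otherwise keep the static result
```

For instance, with sf_n = 4 and an integer output whose extreme magnitude is 8, M_f = 2; if the static threshold is 4 and alpha = 0.75, the condition 2 < 3 holds and the factor doubles.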
When the dynamic adjustment condition is not satisfied, the static quantization result is kept unchanged: after Layer_n+1 has been calculated and it is detected that the image processing parameters need no adjustment, the quantization result is input directly to the next layer.
Image processing then proceeds layer by layer, and the above dynamic quantization is applied to the input of each layer until the real-time inference of all layers is completed.
Correspondingly, the present embodiment further provides a dynamic quantization system based on neural network inference, comprising:
the quantization module is used for acquiring a plurality of image processing parameters corresponding to the calibration data set through the calibration data set, and each image processing parameter corresponds to each layer in the neural network; each image processing parameter comprises a group of saturation thresholds and quantization factors which correspond to each other;
the image processing module is used for processing the input image through the image processing parameters corresponding to the current layer, acquiring the output result of the current layer, and judging whether to output adjusted image processing parameters based on the output result of the current layer;
if the judgment result is yes, the output result of the current layer is processed through the adjusted image processing parameters and then input to the next layer until the processing of all layers is completed.
In this embodiment, the quantization module is further configured to obtain, through a calibration data set, a quantization array corresponding to the calibration data set, where the quantization array includes a plurality of image processing parameters, each image processing parameter includes a set of saturation threshold and quantization factor corresponding to each other, and each image processing parameter in the quantization array corresponds to each layer in the neural network.
In this embodiment, the image processing module processes the output result according to the newly selected temporary image processing parameters of the next layer and obtains the output result of the next layer, until the processing of all layers is completed. The calibration data set includes a plurality of sub-calibration data sets, each of which corresponds to a quantization array.
The quantization module in this embodiment includes:
a pre-processing unit for expanding the image in the calibration data set;
a clustering unit for clustering the images in the expanded calibration data set to obtain the plurality of sub-calibration data sets;
The characteristic acquisition unit is used for counting the characteristics according to the number of channels of each picture in the calibration data set to acquire a characteristic vector of each picture;
and the clustering unit performs clustering according to the feature vectors, obtains the plurality of sub-calibration data sets, and records the centroid of each sub-calibration data set.
The contrast parameter calculation unit is used for calculating contrast parameters of the input image, and the contrast parameters comprise the mean value and the variance of each channel of the input image;
an input image comparison unit for obtaining a degree of difference by comparing the contrast parameter with the centroid of the sub-calibration data set.
In this embodiment, the clustering unit may calculate per-channel statistics (first-order and second-order moments) of the pictures in the calibration data set to form a feature vector for each picture, and then divide the feature vectors into K classes with a clustering method; optionally, K-means clustering may be used. When clustering, the clustering is run multiple times and the best result is taken to avoid falling into a local optimum, and the center of each class is recorded. Each class in the clustering result is then statically quantized, the statistically optimal saturation threshold and quantization factor for each layer of the network are determined, and k groups of quantization options are obtained in total.
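The multi-restart clustering step can be sketched as below. This is a minimal K-means sketch under stated assumptions, not the patented implementation: the feature vectors are assumed to already hold the per-channel moments, the restart count and iteration cap are illustrative, and a production system could instead use a library routine such as scikit-learn's KMeans (whose n_init parameter plays the same multi-restart role).

```python
import numpy as np

def cluster_calibration_set(features, k=2, restarts=5, seed=0):
    """Cluster per-picture feature vectors into k classes, restarting
    several times and keeping the run with the lowest inertia to avoid
    a poor local optimum; returns (class centers, per-picture labels)."""
    rng = np.random.default_rng(seed)
    best_inertia, best_centers, best_labels = np.inf, None, None
    for _ in range(restarts):
        centers = features[rng.choice(len(features), k, replace=False)]
        for _ in range(50):  # Lloyd iterations
            d = np.linalg.norm(features[:, None] - centers[None], axis=2)
            labels = d.argmin(axis=1)
            new_centers = np.array([
                features[labels == j].mean(axis=0) if np.any(labels == j)
                else centers[j] for j in range(k)])
            if np.allclose(new_centers, centers):
                break
            centers = new_centers
        inertia = ((features - centers[labels]) ** 2).sum()
        if inertia < best_inertia:
            best_inertia, best_centers, best_labels = inertia, centers, labels
    return best_centers, best_labels
```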
In this embodiment, during online network inference, each time the fixed-point inference of one layer completes, it may be detected whether the image processing parameter, namely the quantization factor, of the input feature map fed to the next layer needs to be adjusted. If the adjustment condition is satisfied according to the output result of the current layer, the image processing parameters, namely the saturation threshold and the quantization factor, are re-determined, image processing is performed based on the re-determined quantization factor, and the inference calculation of the next layer is then performed; this effectively simplifies calculation and greatly reduces time consumption.
In this embodiment, the selected group of optimally quantized image processing parameters includes the saturation thresholds and quantization factors of the feature maps of each layer of the network; it cannot be guaranteed that the quantization of every layer remains good during real-time inference, and if the quantization of some layer causes a large precision loss, the accuracy of the final result output by the network cannot be guaranteed. A dynamically adjusted quantization strategy is therefore introduced: the input image is processed with the image processing parameters of the current layer, the output result of the current layer is obtained, and the image processing parameters of the next layer are reselected according to that output, so that the quantization of the current layer is conditionally adjusted during online inference; that is, once it is found that the layer's feature map can be quantized better, the saturation threshold and quantization factor of the layer are recalculated. The present embodiment accomplishes this dynamic adjustment with minimal computational cost, so that the time cost of online network inference is not increased.
In this embodiment, when each layer of the neural network performs image processing, the mean and variance of each channel of the current picture are calculated during input preprocessing or in the first batch-norm layer of the network and compared with the centroids of the k classes of X; the class with the smallest difference (using the same metric as in the above clustering method) is found, its index value m is recorded, and T[m], SR[m] are taken as the saturation thresholds and quantization factors of each layer of the network for the current input picture.
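The nearest-centroid selection can be sketched as follows. This is an illustrative sketch: a channel-last (HWC) picture layout and Euclidean distance as the clustering metric are assumptions, and the function name select_quantization_group is hypothetical.

```python
import numpy as np

def select_quantization_group(picture, centroids):
    """Build the per-channel mean/variance feature vector of the input
    picture (d = 2n for n channels) and return the index m of the
    nearest class centroid; T[m], SR[m] are then used for every layer."""
    means = picture.mean(axis=(0, 1))       # one mean per channel
    variances = picture.var(axis=(0, 1))    # one variance per channel
    feat = np.concatenate([means, variances])
    dists = np.linalg.norm(centroids - feat, axis=1)
    return int(dists.argmin())
```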
After the current layer (index value l) has been calculated, the maximum value Max and the minimum value Min of the current layer's output feature maps are obtained, and M = Max{|Max|, |Min|}. M is restored to a floating-point value, denoted M_f. If M_f < alpha * T[m][l] (in this embodiment, 0.5 ≤ alpha ≤ 1), where T is the saturation threshold of the current layer determined in step S1, then M_f is temporarily taken as the optimal saturation threshold of the current layer; symmetric linear quantization is again adopted, and the current optimal quantization factor of the layer is SR' = SR[m][l] * T[m][l] / M_f.
When the dynamic adjustment condition is satisfied, the image processing is performed again according to the adjusted image processing parameter and the quantization factor SF _ n of the output result of the current layer.
When the dynamic adjustment condition is not satisfied, the static quantization result is kept unchanged: after the calculation of a layer completes and it is detected that the condition for dynamically adjusting the next layer's quantization factor does not hold, the original quantization calculation is used.
The present embodiment also provides a computer-readable storage medium on which a computer program is stored, which when executed by a processor implements any of the methods in the present embodiments.
The present embodiment further provides an electronic terminal, including: a processor and a memory;
the memory is used for storing computer programs, and the processor is used for executing the computer programs stored by the memory so as to enable the terminal to execute the method in the embodiment.
The computer-readable storage medium in the present embodiment can be understood by those skilled in the art as follows: all or part of the steps for implementing the above method embodiments may be performed by hardware associated with a computer program. The aforementioned computer program may be stored in a computer readable storage medium. When executed, the program performs steps comprising the method embodiments described above; and the aforementioned storage medium includes: various media that can store program codes, such as ROM, RAM, magnetic or optical disks.
The electronic terminal provided by the embodiment comprises a processor, a memory, a transceiver and a communication interface, wherein the memory and the communication interface are connected with the processor and the transceiver and are used for completing mutual communication, the memory is used for storing a computer program, the communication interface is used for carrying out communication, and the processor and the transceiver are used for operating the computer program so that the electronic terminal can execute the steps of the method.
In this embodiment, the Memory may include a Random Access Memory (RAM), and may also include a non-volatile Memory (non-volatile Memory), such as at least one disk Memory.
The Processor may be a general-purpose Processor, and includes a Central Processing Unit (CPU), a Network Processor (NP), and the like; the device can also be a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other Programmable logic device, a discrete Gate or transistor logic device, or a discrete hardware component.
In the above-described embodiments, reference in the specification to "the embodiment" or "an embodiment" means that a particular feature, structure, or characteristic described in connection with the embodiment is included in at least some embodiments, but not necessarily all embodiments. The various appearances of the phrase "the present embodiment," "one embodiment," or "another embodiment" are not necessarily all referring to the same embodiment. If the specification states a component, feature, structure, or characteristic "may", "might", or "could" be included, that particular component, feature, structure, or characteristic is not necessarily included. If the specification or claim refers to "a" or "an" element, that does not mean there is only one of the element. If the specification or claim refers to "a further" element, that does not preclude there being more than one of the further element.
In the embodiments described above, although the present invention has been described in conjunction with specific embodiments thereof, many alternatives, modifications, and variations of these embodiments will be apparent to those skilled in the art in light of the foregoing description. For example, other memory structures (e.g., dynamic ram (dram)) may use the discussed embodiments. The embodiments of the invention are intended to embrace all such alternatives, modifications and variances that fall within the broad scope of the appended claims.
The invention is operational with numerous general purpose or special purpose computing system environments or configurations. For example: personal computers, server computers, hand-held or portable devices, tablet-type devices, multiprocessor systems, microprocessor-based systems, set top boxes, programmable consumer electronics, network PCs, minicomputers, mainframe computers, distributed computing environments that include any of the above systems or devices, and the like.
The invention may be described in the general context of computer-executable instructions, such as program modules, being executed by a computer. Generally, program modules include routines, programs, objects, components, data structures, etc. that perform particular tasks or implement particular abstract data types. The invention may also be practiced in distributed computing environments where tasks are performed by remote processing devices that are linked through a communications network. In a distributed computing environment, program modules may be located in both local and remote computer storage media including memory storage devices.
Fig. 4 is a schematic diagram of a hardware structure of a terminal device according to an embodiment of the present invention. As shown, the terminal device may include: an input device 1100, a first processor 1101, an output device 1102, a first memory 1103, and at least one communication bus 1104. The communication bus 1104 is used to implement communication connections between the elements. The first memory 1103 may comprise a high-speed RAM memory and may also include a non-volatile storage NVM, such as at least one disk memory; various programs may be stored in the first memory 1103 for performing various processing functions and implementing the method steps of the embodiments of the present invention of figs. 1-3.
Alternatively, the first processor 1101 may be, for example, a Central Processing Unit (CPU), an Application Specific Integrated Circuit (ASIC), a Digital Signal Processor (DSP), a Digital Signal Processing Device (DSPD), a Programmable Logic Device (PLD), a Field Programmable Gate Array (FPGA), a controller, a microcontroller, a microprocessor, or other electronic components, and the first processor 1101 is coupled to the input device 1100 and the output device 1102 through a wired or wireless connection.
Optionally, the input device 1100 may include a variety of input devices, such as at least one of a user-oriented user interface, a device-oriented device interface, a software programmable interface, a camera, and a sensor. Optionally, the device interface facing the device may be a wired interface for data transmission between devices, or may be a hardware plug-in interface (e.g., a USB interface, a serial port, etc.) for data transmission between devices; optionally, the user-facing user interface may be, for example, a user-facing control key, a voice input device for receiving voice input, and a touch sensing device (e.g., a touch screen with a touch sensing function, a touch pad, etc.) for receiving user touch input; optionally, the programmable interface of the software may be, for example, an entry for a user to edit or modify a program, such as an input pin interface or an input interface of a chip; the output devices 1102 may include output devices such as a display, audio, and the like.
In this embodiment, the processor of the terminal device includes a function for executing each module in the image processing system based on the neural network, and specific functions and technical effects may refer to the above embodiments, which are not described herein again.
Fig. 5 is a schematic hardware structure diagram of a terminal device according to an embodiment of the present application. Fig. 5 is a specific embodiment of the implementation process of fig. 4. As shown, the terminal device of the present embodiment may include a second processor 1201 and a second memory 1202.
The second processor 1201 executes the computer program code stored in the second memory 1202 to implement the method described in fig. 4 in the above embodiment.
The second memory 1202 is configured to store various types of data to support operations at the terminal device. Examples of such data include instructions for any application or method operating on the terminal device, such as messages, pictures, videos, and so forth. The second memory 1202 may include a Random Access Memory (RAM) and may also include a non-volatile memory (non-volatile memory), such as at least one disk memory.
Optionally, a second processor 1201 is provided in the processing assembly 1200. The terminal device may further include: communication component 1203, power component 1204, multimedia component 1205, speech component 1206, input/output interfaces 1207, and/or sensor component 1208. The specific components included in the terminal device are set according to actual requirements, which is not limited in this embodiment.
The processing component 1200 generally controls the overall operation of the terminal device. The processing component 1200 may include one or more second processors 1201 to execute instructions to perform all or part of the steps of the neural network based image processing method described above. Further, the processing component 1200 can include one or more modules that facilitate interaction between the processing component 1200 and other components. For example, the processing component 1200 can include a multimedia module to facilitate interaction between the multimedia component 1205 and the processing component 1200.
The power supply component 1204 provides power to the various components of the terminal device. The power components 1204 may include a power management system, one or more power sources, and other components associated with generating, managing, and distributing power for the terminal device.
The multimedia components 1205 include a display screen that provides an output interface between the terminal device and the user. In some embodiments, the display screen may include a Liquid Crystal Display (LCD) and a Touch Panel (TP). If the display screen includes a touch panel, the display screen may be implemented as a touch screen to receive an input signal from a user. The touch panel includes one or more touch sensors to sense touch, slide, and gestures on the touch panel. The touch sensor may not only sense the boundary of a touch or slide action, but also detect the duration and pressure associated with the touch or slide operation.
The voice component 1206 is configured to output and/or input voice signals. For example, the voice component 1206 includes a Microphone (MIC) configured to receive external voice signals when the terminal device is in an operational mode, such as a voice recognition mode. The received speech signal may further be stored in the second memory 1202 or transmitted via the communication component 1203. In some embodiments, the speech component 1206 further comprises a speaker for outputting speech signals.
The input/output interface 1207 provides an interface between the processing component 1200 and peripheral interface modules, which may be click wheels, buttons, etc. These buttons may include, but are not limited to: a volume button, a start button, and a lock button.
The sensor component 1208 includes one or more sensors for providing various aspects of status assessment for the terminal device. For example, the sensor component 1208 may detect an open/closed state of the terminal device, relative positioning of the components, presence or absence of user contact with the terminal device. The sensor assembly 1208 may include a proximity sensor configured to detect the presence of nearby objects without any physical contact, including detecting the distance between the user and the terminal device. In some embodiments, the sensor assembly 1208 may also include a camera or the like.
The communication component 1203 is configured to facilitate communications between the terminal device and other devices in a wired or wireless manner. The terminal device may access a wireless network based on a communication standard, such as WiFi, 2G or 3G, or a combination thereof. In one embodiment, the terminal device may include a SIM card slot therein for inserting a SIM card therein, so that the terminal device may log onto a GPRS network to establish communication with the server via the internet.
As can be seen from the above, the communication component 1203, the voice component 1206, the input/output interface 1207 and the sensor component 1208 involved in the embodiment of fig. 5 can be implemented as the input device in the embodiment of fig. 4.
The foregoing embodiments are merely illustrative of the principles and utilities of the present invention and are not intended to limit the invention. Any person skilled in the art can modify or change the above-mentioned embodiments without departing from the spirit and scope of the present invention. Accordingly, it is intended that all equivalent modifications or changes which can be made by those skilled in the art without departing from the spirit and technical spirit of the present invention be covered by the claims of the present invention.
Claims (24)
1. An image processing method based on a neural network, comprising:
acquiring a plurality of image processing parameters corresponding to a calibration data set through the calibration data set, wherein each image processing parameter corresponds to each layer in a neural network; each image processing parameter comprises a group of saturation thresholds and quantization factors which correspond to each other;
processing an input image through an image processing parameter corresponding to a current layer, acquiring an output result of the current layer, and judging whether to output an adjusted image processing parameter or not based on the output result of the current layer;
if the judgment result is yes, the output result of the current layer is processed through the adjusted image processing parameters and then input to the next layer until the processing of all layers is completed.
2. The image processing method based on the neural network as claimed in claim 1, wherein if the determination result is negative, the output result of the current layer is directly input to the next layer until the processing of all layers is completed.
3. The neural network-based image processing method according to claim 1, comprising:
and acquiring a quantization array corresponding to the calibration data set through the calibration data set, wherein the quantization array comprises a plurality of image processing parameters, each image processing parameter comprises a group of saturation thresholds and quantization factors which are mutually corresponding, and each image processing parameter in the quantization array corresponds to each layer in the neural network.
4. The neural network-based image processing method of claim 3, wherein the calibration data set comprises a plurality of sub-calibration data sets, wherein each of the sub-calibration data sets corresponds to a quantization array.
5. The neural network-based image processing method according to claim 4,
augmenting the image in the calibration data set;
and clustering the images in the expanded calibration data set to obtain a plurality of sub-calibration data sets.
6. The neural network-based image processing method according to claim 5,
counting the features according to the number of channels of each picture in the calibration data set to obtain a feature vector of each picture;
and clustering according to the feature vectors, acquiring the plurality of sub-calibration data sets and recording the mass center of each sub-calibration data set.
7. The neural network-based image processing method of claim 6, wherein when processing the input image of the current layer, determining the image processing parameters of the current layer according to the comparison between the input image and the sub-calibration data set.
8. The neural network-based image processing method according to claim 7,
calculating contrast parameters of the input image, wherein the contrast parameters comprise the mean value and the variance of each channel of the input image;
obtaining a degree of difference by comparing the contrast parameter to the centroid of the sub-calibration data set;
and determining the image processing parameters corresponding to the current layer according to the difference.
9. The image processing method based on the neural network as claimed in claim 6, wherein the mean and the variance of each channel of each picture in the expanded calibration data set are calculated respectively to obtain a d-dimensional vector set formed by a plurality of d-dimensional vectors, wherein d = 2n and n is the number of channels;
and clustering the d-dimensional vector set to obtain the sub-calibration data set.
10. The method of claim 1, wherein determining whether to output the adjusted image processing parameter according to the output result of the current layer comprises:
obtaining the maximum value Max and the minimum value Min of all feature maps in the output result of the current layer, taking M = Max{|Max|, |Min|}, and reducing M to a floating-point value M_f; if
M_f < alpha * T[m][l],
wherein alpha is a determination coefficient,
then M_f is selected as the adjusted saturation threshold.
11. The neural network-based image processing method of claim 10, wherein the adjusted quantization factor corresponding to the adjusted saturation threshold is obtained by:
SR' = SR[m][l] * T[m][l] / M_f
wherein, SR' is the adjusted quantization factor, SR [ m ] [ l ] is the quantization factor of the current layer l, T [ m ] [ l ] is the saturation threshold of the current layer l, and m is the index value with the minimum difference.
12. The neural network-based image processing method according to claim 9, wherein the plurality of sub-calibration data sets are acquired by performing clustering processing a plurality of times when performing clustering processing.
13. An image processing system based on a neural network, comprising:
the quantization module is used for acquiring a plurality of image processing parameters corresponding to the calibration data set through the calibration data set, and each image processing parameter corresponds to each layer in the neural network; each image processing parameter comprises a group of saturation thresholds and quantization factors which correspond to each other;
the image processing module is used for processing the input image through the image processing parameters corresponding to the current layer, acquiring the output result of the current layer and judging whether to output the adjusted image processing parameters or not based on the output result of the current layer;
if the judgment result is yes, the output result of the current layer is processed through the adjusted image processing parameters and then input to the next layer until the processing of all layers is completed.
14. The image processing system of claim 13, wherein the image processing module is further configured to, if the determination result is negative, directly input the output result of the current layer to a next layer until the processing of all layers is completed.
15. The neural network-based image processing system of claim 13, wherein the quantization module is further configured to acquire, through the calibration data set, a quantization array corresponding to the calibration data set, wherein the quantization array comprises a plurality of image processing parameters, each image processing parameter comprises a group of mutually corresponding saturation thresholds and quantization factors, and each image processing parameter in the quantization array corresponds to each layer in the neural network.
16. The neural network-based image processing system of claim 15, wherein the calibration data set includes a plurality of sub-calibration data sets, wherein each of the sub-calibration data sets corresponds to a quantization array.
17. The neural network-based image processing system of claim 16, wherein the quantization module comprises
A pre-processing unit for augmenting images in the calibration data set;
and the clustering unit is used for clustering the images in the expanded calibration data set to obtain a plurality of sub-calibration data sets.
18. The neural network-based image processing system of claim 17, wherein the quantization module further comprises
The characteristic acquisition unit is used for counting the characteristics according to the number of channels of each picture in the calibration data set to acquire a characteristic vector of each picture;
and the clustering unit carries out clustering processing according to the feature vector, acquires the plurality of sub-calibration data sets and records the mass center of each sub-calibration data set.
19. The neural network-based image processing system of claim 18, wherein the quantization module further comprises:
the contrast parameter calculation unit is used for calculating contrast parameters of the input image, and the contrast parameters comprise the mean value and the variance of each channel of the input image;
an input image comparison unit for obtaining a degree of difference by comparing the contrast parameter with the centroid of the sub-calibration data set.
20. The neural network-based image processing system of claim 19, wherein the contrast parameter calculating unit obtains a d-dimensional vector set formed by a plurality of d-dimensional vectors by calculating the mean and the variance of each channel of each picture in the expanded calibration data set, wherein d = 2n and n is the number of channels;
and the clustering unit is used for clustering the d-dimensional vector set to obtain the sub-calibration data set.
21. The image processing system according to claim 13, wherein the image processing module determines, according to the output result of the current layer, whether to output an adjusted image processing parameter, and if so, obtains the adjusted image processing parameter, specifically comprising:
obtaining the maximum value Max and the minimum value Min of all feature maps in the output result of the current layer, taking M = max{|Max|, |Min|}, and restoring M to a floating-point value M_f; if
M_f < alpha * T[m][l],
wherein alpha is a determination coefficient,
then M_f is selected as the adjusted saturation threshold.
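The claim-21 decision rule can be sketched as below. The `quant_factor` argument (how the integer maximum M is restored to the floating-point M_f) and the default `alpha` are assumptions, since the claim does not fix them; a natural choice for the restore factor is the current layer's quantization factor.

```python
import numpy as np

def adjusted_threshold(feature_maps, T_ml, alpha=1.0, quant_factor=1.0):
    """Claim-21 sketch: take M = max{|Max|, |Min|} over all feature
    maps of the current layer, restore it to floating point as M_f,
    and adopt M_f as the new saturation threshold only if it falls
    below alpha * T[m][l]. Returns None when no adjustment is made."""
    mx = max(float(fm.max()) for fm in feature_maps)
    mn = min(float(fm.min()) for fm in feature_maps)
    M = max(abs(mx), abs(mn))
    M_f = M / quant_factor  # restore M to a floating-point value
    return M_f if M_f < alpha * T_ml else None
```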
22. The neural-network-based image processing system of claim 21, wherein the image processing module obtains the adjusted quantization factor corresponding to the adjusted saturation threshold by:
SR' = SR[m][l] * T[m][l] / M_f,
wherein SR' is the adjusted quantization factor, SR[m][l] is the quantization factor of the current layer l, T[m][l] is the saturation threshold of the current layer l, and m is the index value with the minimum degree of difference.
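The claim-22 rescaling is a one-liner; this sketch merely restates the formula in code, with illustrative names.

```python
def adjust_quant_factor(sr_ml, t_ml, m_f):
    # Claim 22: SR' = SR[m][l] * T[m][l] / M_f. When the saturation
    # threshold shrinks from T[m][l] to M_f, the quantization factor
    # grows by the same ratio, so the smaller dynamic range still
    # spans the full integer grid.
    return sr_ml * t_ml / m_f
```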
23. A computer-readable storage medium having a computer program stored thereon, wherein the program, when executed by a processor, implements the method of any one of claims 1 to 12.
24. An electronic terminal, comprising: a processor and a memory;
wherein the memory is configured to store a computer program, and the processor is configured to execute the computer program stored in the memory to cause the terminal to perform the method of any one of claims 1 to 12.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201911420549.5A CN111144511B (en) | 2019-12-31 | 2019-12-31 | Image processing method, system, medium and electronic terminal based on neural network |
Publications (2)
Publication Number | Publication Date |
---|---|
CN111144511A true CN111144511A (en) | 2020-05-12 |
CN111144511B CN111144511B (en) | 2020-10-20 |
Family
ID=70522886
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201911420549.5A Active CN111144511B (en) | 2019-12-31 | 2019-12-31 | Image processing method, system, medium and electronic terminal based on neural network |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111144511B (en) |
Citations (14)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102129074A (en) * | 2010-01-15 | 2011-07-20 | 中国科学院电子学研究所 | Satellite SAR original data anti-saturation vector compression coding and decoding method |
US20160328644A1 (en) * | 2015-05-08 | 2016-11-10 | Qualcomm Incorporated | Adaptive selection of artificial neural networks |
CN107688855A (en) * | 2016-08-12 | 2018-02-13 | 北京深鉴科技有限公司 | It is directed to the layered quantization method and apparatus of Complex Neural Network |
CN108881660A (en) * | 2018-05-02 | 2018-11-23 | 北京大学 | A method of computed hologram is compressed using the quantum nerve network of optimization initial weight |
CN109165743A (en) * | 2018-07-17 | 2019-01-08 | 东南大学 | A kind of semi-supervised network representation learning algorithm based on depth-compression self-encoding encoder |
CN109214515A (en) * | 2017-06-30 | 2019-01-15 | 华为技术有限公司 | A kind of deep neural network inference method and calculate equipment |
CN109523017A (en) * | 2018-11-27 | 2019-03-26 | 广州市百果园信息技术有限公司 | Compression method, device, equipment and the storage medium of deep neural network |
CN110210620A (en) * | 2019-06-04 | 2019-09-06 | 北京邮电大学 | A kind of channel pruning method for deep neural network |
CN110310164A (en) * | 2019-07-30 | 2019-10-08 | 广州云从信息科技有限公司 | Image processing system, method, platform, machine readable media and equipment |
CN110335591A (en) * | 2019-07-04 | 2019-10-15 | 广州云从信息科技有限公司 | A kind of parameter management method, device, machine readable media and equipment |
CN110363297A (en) * | 2019-07-05 | 2019-10-22 | 上海商汤临港智能科技有限公司 | Neural metwork training and image processing method, device, equipment and medium |
CN110363279A (en) * | 2018-03-26 | 2019-10-22 | 华为技术有限公司 | Image processing method and device based on convolutional neural networks model |
CN110443165A (en) * | 2019-07-23 | 2019-11-12 | 北京迈格威科技有限公司 | Neural network quantization method, image-recognizing method, device and computer equipment |
CN110610237A (en) * | 2019-09-17 | 2019-12-24 | 普联技术有限公司 | Quantitative training method and device of model and storage medium |
2019-12-31: CN CN201911420549.5A, patent CN111144511B, status: active
Non-Patent Citations (2)
Title |
---|
YUHUI XU et al.: "Deep neural network compression with single and multiple level quantization", arXiv:1803.03289v2 * |
BI Pengcheng et al.: "Research on lightweight convolutional neural network techniques", Computer Engineering and Applications * |
Cited By (13)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111860841A (en) * | 2020-07-28 | 2020-10-30 | Oppo广东移动通信有限公司 | Quantization model optimization method, device, terminal and storage medium |
CN111860841B (en) * | 2020-07-28 | 2023-11-14 | Oppo广东移动通信有限公司 | Optimization method, device, terminal and storage medium of quantization model |
CN111985635A (en) * | 2020-09-02 | 2020-11-24 | 北京小米松果电子有限公司 | Method, device and medium for accelerating neural network inference processing |
CN112287986B (en) * | 2020-10-16 | 2023-07-18 | 浪潮(北京)电子信息产业有限公司 | Image processing method, device, equipment and readable storage medium |
WO2022078002A1 (en) * | 2020-10-16 | 2022-04-21 | 浪潮(北京)电子信息产业有限公司 | Image processing method and apparatus, device, and readable storage medium |
CN112287986A (en) * | 2020-10-16 | 2021-01-29 | 浪潮(北京)电子信息产业有限公司 | Image processing method, device and equipment and readable storage medium |
CN113010469A (en) * | 2021-03-18 | 2021-06-22 | 恒睿(重庆)人工智能技术研究院有限公司 | Image feature extraction method, device and computer-readable storage medium |
CN113010469B (en) * | 2021-03-18 | 2023-05-26 | 恒睿(重庆)人工智能技术研究院有限公司 | Image feature extraction method, device and computer readable storage medium |
CN112990457A (en) * | 2021-03-26 | 2021-06-18 | 开放智能机器(上海)有限公司 | Offline quantitative tuning method, apparatus, device, medium, and program product |
CN112990457B (en) * | 2021-03-26 | 2024-05-03 | 开放智能机器(上海)有限公司 | Offline quantization optimization method, device, equipment, medium and program product |
WO2023003432A1 (en) * | 2021-07-22 | 2023-01-26 | 주식회사 사피온코리아 | Method and device for determining saturation ratio-based quantization range for quantization of neural network |
CN113705791A (en) * | 2021-08-31 | 2021-11-26 | 上海阵量智能科技有限公司 | Neural network inference quantification method and device, electronic equipment and storage medium |
CN113705791B (en) * | 2021-08-31 | 2023-12-19 | 上海阵量智能科技有限公司 | Neural network reasoning quantification method and device, electronic equipment and storage medium |
Also Published As
Publication number | Publication date |
---|---|
CN111144511B (en) | 2020-10-20 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN111144511B (en) | Image processing method, system, medium and electronic terminal based on neural network | |
US10929746B2 (en) | Low-power hardware acceleration method and system for convolution neural network computation | |
CN110363279B (en) | Image processing method and device based on convolutional neural network model | |
CN111950723A (en) | Neural network model training method, image processing method, device and terminal equipment | |
CN112200295B (en) | Ordering method, operation method, device and equipment of sparse convolutional neural network | |
CN113132723B (en) | Image compression method and device | |
CN109598250B (en) | Feature extraction method, device, electronic equipment and computer readable medium | |
CN114049530A (en) | Hybrid precision neural network quantization method, device and equipment | |
CN113505848A (en) | Model training method and device | |
CN114676825A (en) | Neural network model quantification method, system, device and medium | |
CN112748899A (en) | Data processing method and related equipment | |
CN110651273B (en) | Data processing method and equipment | |
CN112488297A (en) | Neural network pruning method, model generation method and device | |
CN116306987A (en) | Multitask learning method based on federal learning and related equipment | |
CN115456169A (en) | Model compression method, system, terminal and storage medium | |
CN113282535B (en) | Quantization processing method and device and quantization processing chip | |
CN109190757B (en) | Task processing method, device, equipment and computer readable storage medium | |
CN113011210A (en) | Video processing method and device | |
CN116992946B (en) | Model compression method, apparatus, storage medium, and program product | |
CN115759192A (en) | Neural network acceleration method, device, equipment, chip and storage medium | |
CN111915689B (en) | Method, apparatus, electronic device, and computer-readable medium for generating an objective function | |
CN114757348A (en) | Model quantitative training method and device, storage medium and electronic equipment | |
CN112561779B (en) | Image stylization processing method, device, equipment and storage medium | |
CN112672405A (en) | Power consumption calculation method and device, storage medium, electronic device and server | |
CN112418388A (en) | Method and device for realizing deep convolutional neural network processing |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
GR01 | Patent grant |