CN118413673A - Image feature compression method and device based on code rate and accuracy optimization - Google Patents
- Publication number
- CN118413673A (application number CN202410473001.1A)
- Authority
- CN
- China
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02T—CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
- Y02T10/00—Road transport of goods or passengers
- Y02T10/10—Internal combustion engine [ICE] based vehicles
- Y02T10/40—Engine management systems
Abstract
The invention relates to an image feature compression method and device based on code rate and accuracy optimization. The method comprises the following steps: extracting the features of an intermediate convolutional layer of a convolutional neural network at the edge device; partitioning the feature channels of the convolutional layer output according to feature importance; tiling the feature channels in order to obtain a feature map; applying a discrete cosine transform to the feature map; applying block-wise uniform quantization to the feature map according to the importance partition; arithmetic-coding the feature map to obtain a bitstream and computing the code rate; arithmetic-decoding, dequantizing and inverse-transforming the bitstream to recover feature vectors that can be input into the remaining layers of the convolutional neural network; and obtaining the classification accuracy at the network output, from which a rate and accuracy loss model is established and the optimal quantization parameters are obtained for data compression. Compared with the prior art, the method and device can select the optimal quantization parameters according to the code-rate requirement of the scene and further compress the amount of feature data.
Description
Technical Field
The invention relates to the technical field of image feature coding, and in particular to an image feature compression method and device based on code rate and accuracy optimization.
Background
As intelligent applications become more common in daily life, data transmission grows in importance: surveillance data in cities, fault detection in industrial settings, video data in intelligent driving, and so on. Because edge devices are limited in computing power and real-time transmission capacity, the entire neural network cannot be deployed on the device; instead, the deep neural network must be split, the extracted intermediate data compressed and transmitted to the back end over a wireless channel, so that most of the computation can be handled in the cloud.
Existing image feature coding methods do not consider the different importance of channels to the accuracy of the final task, so even after quantization and coding the image features still produce a large data volume. Moreover, they cannot quickly select suitable quantization parameters for the different code-rate requirements of different scenes.
Disclosure of Invention
The invention aims to overcome the defects of the prior art and provide an image feature compression method and device based on code rate and accuracy optimization that can select the optimal quantization parameters according to the code-rate requirement of a scene and further compress the amount of feature data.
The aim of the invention can be achieved by the following technical scheme:
An image feature compression method based on code rate and accuracy optimization, characterized by comprising the following steps:
inputting an image to be processed into a convolutional neural network for feature extraction, selecting a convolutional layer from the network, and extracting the features output by that layer;
partitioning the feature channels of the convolutional layer output according to feature importance;
tiling the feature channels in order to obtain a feature map, and applying a discrete cosine transform to the whole feature map;
applying block-wise uniform quantization to the transformed feature map according to the importance partition;
arithmetic-coding the quantized feature map to obtain a bitstream, and computing the code rate;
arithmetic-decoding, dequantizing and inverse-transforming the bitstream to recover feature vectors that can be input into the remaining layers of the convolutional neural network;
obtaining the classification accuracy at the output of the convolutional neural network, establishing a rate and accuracy loss model, optimizing the parameters according to this model, and obtaining the optimal quantization parameters for data compression.
Further, the selected convolutional layer is the convolutional layer with the fewest channels in the convolutional neural network.
Further, the partitioning process is specifically:
computing the variance of each feature channel, selecting the K feature channels with the smallest variance and defining them as the non-important feature region, with the remaining feature channels defined as the important feature region.
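The variance-based split can be sketched in NumPy as follows; the (C, H, W) channel layout and the helper name are illustrative assumptions, not taken from the patent text:

```python
import numpy as np

def partition_channels(features, k):
    """Split the channels of a (C, H, W) feature tensor into a non-important
    set (the k lowest-variance channels) and an important set (the rest)."""
    variances = features.reshape(features.shape[0], -1).var(axis=1)
    order = np.argsort(variances)            # channel indices, ascending variance
    return np.sort(order[:k]), np.sort(order[k:])
```

A nearly constant channel has close to zero variance, so it lands in the non-important set regardless of its position.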
Further, the computational expression of the discrete cosine transform is:
F(u,v) = c(u)·c(v)·Σ_{i=0}^{M−1} Σ_{j=0}^{N−1} f(i,j)·cos[(2i+1)uπ/(2M)]·cos[(2j+1)vπ/(2N)]
u = 0,1,...,M−1
v = 0,1,...,N−1
where c(u) = √(1/M) for u = 0 and √(2/M) otherwise, c(v) = √(1/N) for v = 0 and √(2/N) otherwise, M and N are the length and width of the feature map f(i,j), F(u,v) is the feature after the two-dimensional discrete cosine transform, u, v are its horizontal and vertical coordinates, c(u) and c(v) are compensation coefficients, f(i,j) is the feature map before the two-dimensional DCT, and i, j are its horizontal and vertical pixel coordinates.
Further, the expression of the block-wise uniform quantization is:
V̂ = round((V − min(V))/(max(V) − min(V))·(2^q − 1))
where V is the feature vector before quantization, V̂ is the quantized feature vector, min(V) and max(V) are the minimum and maximum values in V, and q is the uniform quantization coefficient;
the uniform quantization coefficient of the non-important feature region is the coarse quantization coefficient q2, that of the important feature region is the light quantization coefficient q1, and q1 is not less than q2.
Further, the light quantization coefficient q1 and the coarse quantization coefficient q2 satisfy the constraint:
q1 ≥ q2, with q1, q2 ∈ {3, 4, 5, 6, 7, 8}.
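A minimal sketch of the quantizer pair; treating q as a bit depth with 2^q − 1 levels is an assumption consistent with q ∈ {3,...,8} but not stated explicitly in the text:

```python
import numpy as np

def uniform_quantize(v, q):
    """Min-max uniform quantization. Assumption: the coefficient q in
    {3,...,8} acts as a bit depth, i.e. 2**q - 1 quantization levels."""
    vmin, vmax = float(v.min()), float(v.max())
    codes = np.round((v - vmin) / (vmax - vmin) * (2 ** q - 1)).astype(np.int32)
    return codes, (vmin, vmax)

def uniform_dequantize(codes, vrange, q):
    """Map integer codes back to the original value range."""
    vmin, vmax = vrange
    return codes / (2 ** q - 1) * (vmax - vmin) + vmin
```

With q1 = 8 for the important region and q2 = 3 for the non-important region, the coarse region carries only 8 levels per coefficient while the light region keeps 255.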
Further, the expression of the inverse discrete cosine transform is:
f′(i,j) = Σ_{u=0}^{M−1} Σ_{v=0}^{N−1} c(u)·c(v)·F′(u,v)·cos[(2i+1)uπ/(2M)]·cos[(2j+1)vπ/(2N)]
i = 0,1,...,M−1
j = 0,1,...,N−1
where f′(i,j) is the feature map obtained after the inverse discrete cosine transform, F′(u,v) is the feature obtained after dequantization, M and N are the length and width of the feature map, c(u) and c(v) are the compensation coefficients defined as in the forward transform, u, v are the horizontal and vertical coordinates of F′(u,v), and i, j are the horizontal and vertical pixel coordinates of f′(i,j).
Further, the expression of the rate and accuracy loss model is:
J = argmin_{(q1,q2)} (1 − V_AP(q1,q2)) + λ·R(q1,q2)
where V_AP(q1,q2) is the accuracy obtained from the features recovered after quantization with the current quantization parameters (q1,q2), λ is a proportionality coefficient, R(q1,q2) is the code rate corresponding to the current quantization parameters (q1,q2), and J is the output value of the rate and accuracy loss model.
Further, the code rate and accuracy loss model is optimized using lagrangian multipliers.
The invention also provides an image characteristic compression device based on code rate and accuracy optimization, which comprises a memory and a processor, wherein the memory stores a computer program, and the processor calls the computer program to execute the steps of the method.
Compared with the prior art, the invention has the following advantages:
In the image compression method, the image features output by the 18th convolutional layer, which has the smallest number of convolution channels, are selected first, compressing the feature quantity a first time. From the feature map partition, the K channels with the smallest variance, representing low information content, are defined as the non-important region, with the remaining areas as the important region. After the DCT, the important region uses light uniform quantization and the non-important region uses coarse uniform quantization, reducing the data size in a targeted manner. Finally, according to the code rate and accuracy obtained with different quantization parameters, a rate and accuracy loss model is established and optimized using Lagrangian multipliers, and suitable quantization parameters are selected according to the code-rate requirement of the scene.
In summary, in image feature compression the importance of different channels is fully considered, the feature data is greatly reduced through channel partitioning and block quantization, and a rate and accuracy loss model is built for the variable transmission code rate of the scene so as to select the optimal quantization parameters, effectively improving transmission efficiency.
Drawings
Fig. 1 is a flow chart of an image feature compression method based on code rate and accuracy optimization provided in an embodiment of the present invention;
fig. 2 is a schematic diagram of a split-region quantization code rate-accuracy optimization flow provided in an embodiment of the present invention;
Fig. 3 is a schematic diagram of an image feature coding framework with optimized partial area code rate-accuracy in an embodiment of the present invention.
Detailed Description
For the purpose of making the objects, technical solutions and advantages of the embodiments of the present invention more apparent, the technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present invention, and it is apparent that the described embodiments are some embodiments of the present invention, but not all embodiments of the present invention. The components of the embodiments of the present invention generally described and illustrated in the figures herein may be arranged and designed in a wide variety of different configurations.
Thus, the following detailed description of the embodiments of the invention, as presented in the figures, is not intended to limit the scope of the invention, as claimed, but is merely representative of selected embodiments of the invention. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
It should be noted that: like reference numerals and letters denote like items in the following figures, and thus once an item is defined in one figure, no further definition or explanation thereof is necessary in the following figures.
Example 1
As shown in fig. 1 and 3, the present embodiment provides an image feature compression method based on code rate and accuracy optimization, which includes the following steps:
S1: inputting an image to be processed into a convolutional neural network for feature extraction, selecting a convolutional layer from the network, and extracting the features output by that layer;
S2: partitioning the feature channels of the convolutional layer output according to feature importance;
S3: tiling the feature channels in order to obtain a feature map, and applying a discrete cosine transform to the whole feature map;
S4: applying block-wise uniform quantization to the transformed feature map according to the importance partition;
S5: arithmetic-coding the quantized feature map to obtain a bitstream, and computing the code rate;
S6: arithmetic-decoding, dequantizing and inverse-transforming the bitstream to recover feature vectors that can be input into the remaining layers of the convolutional neural network;
S7: obtaining the classification accuracy at the output of the convolutional neural network, establishing a rate and accuracy loss model, optimizing the parameters according to this model, and obtaining the optimal quantization parameters for data compression.
For step S1, in this embodiment the convolutional neural network is a ResNet, and the convolutional layer selection rule is to take the output of the convolutional layer with the fewest channels; in this example, the output of the 18th convolutional layer of the ResNet is selected.
In step S2, partitioning by feature importance means computing the variance of each feature channel and selecting the K channels with the smallest variance (K is determined by the number of intermediate channels of the network); these are defined as the non-important feature region, and the remaining channels as the important feature region. Since low-variance features generally contain less detail and have less impact on the task, small feature variance is used as the criterion for selecting non-important regions.
In step S3, tiling is first performed in channel order to obtain the feature map, and a two-dimensional DCT is then applied to the whole feature map f(i,j):
F(u,v) = c(u)·c(v)·Σ_{i=0}^{M−1} Σ_{j=0}^{N−1} f(i,j)·cos[(2i+1)uπ/(2M)]·cos[(2j+1)vπ/(2N)]
u = 0,1,...,M−1
v = 0,1,...,N−1
where c(u) = √(1/M) for u = 0 and √(2/M) otherwise, c(v) = √(1/N) for v = 0 and √(2/N) otherwise, M and N are the length and width of the feature map f(i,j), F(u,v) is the feature after the two-dimensional discrete cosine transform, u, v are its horizontal and vertical coordinates, c(u) and c(v) are compensation coefficients, f(i,j) is the feature map before the two-dimensional DCT, and i, j are its horizontal and vertical pixel coordinates.
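A direct NumPy transcription of this formula is shown below, for illustration only (it runs in O(M²N²); a production implementation would use a fast DCT such as `scipy.fft.dctn` with `norm='ortho'`):

```python
import numpy as np

def dct2(f):
    """Orthonormal 2-D DCT-II, transcribing the formula term by term."""
    M, N = f.shape
    i, j = np.arange(M), np.arange(N)
    F = np.empty((M, N))
    for u in range(M):
        cu = np.sqrt(1.0 / M) if u == 0 else np.sqrt(2.0 / M)  # c(u)
        for v in range(N):
            cv = np.sqrt(1.0 / N) if v == 0 else np.sqrt(2.0 / N)  # c(v)
            cos_i = np.cos((2 * i + 1) * u * np.pi / (2 * M))
            cos_j = np.cos((2 * j + 1) * v * np.pi / (2 * N))
            F[u, v] = cu * cv * np.sum(f * np.outer(cos_i, cos_j))
    return F
```

With these compensation coefficients the transform is orthonormal, so the total energy of the feature map is preserved.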
In step S4, on the DCT-transformed feature map, coarse quantization q2 is used for the selected non-important feature region and light quantization q1 for the remaining important feature region, under the constraint q1 ≥ q2, with q1, q2 ∈ {3,4,5,6,7,8}. The formula for block-wise uniform quantization is:
V̂ = round((V − min(V))/(max(V) − min(V))·(2^q − 1))
where V is the feature vector before quantization, V̂ is the quantized feature vector, min(V) and max(V) are the minimum and maximum values in V, and q is the uniform quantization coefficient.
In step S5, the quantized feature map is arithmetic-coded to obtain a compressed bitstream, and the code rate is calculated.
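The text does not spell out how the rate is computed; a common proxy, sketched below, bounds the arithmetic-coded size by the empirical zeroth-order entropy of the quantized symbols, which arithmetic coding approaches up to a small overhead:

```python
import numpy as np

def estimate_rate_bits(codes):
    """Empirical zeroth-order entropy of the symbol stream, in total bits;
    a lower-bound estimate of the arithmetic-coded bitstream size."""
    _, counts = np.unique(codes, return_counts=True)
    p = counts / codes.size
    return float(codes.size * -(p * np.log2(p)).sum())
```

Dividing by the number of feature elements gives the rate in bits per symbol, which is what the (q1, q2) search below compares against the code-rate budget.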
In step S6, the expression of the inverse DCT is:
f′(i,j) = Σ_{u=0}^{M−1} Σ_{v=0}^{N−1} c(u)·c(v)·F′(u,v)·cos[(2i+1)uπ/(2M)]·cos[(2j+1)vπ/(2N)]
i = 0,1,...,M−1
j = 0,1,...,N−1
where f′(i,j) is the feature map obtained after the inverse discrete cosine transform and F′(u,v) is the feature obtained after dequantization; the data is then restored to the multi-channel form that can be input into the remaining layers of the neural network.
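The inverse transform can likewise be transcribed directly (illustrative, O(M²N²); the compensation coefficients c(u), c(v) are the same as in the forward transform):

```python
import numpy as np

def idct2(F):
    """Inverse of the orthonormal 2-D DCT-II, term-by-term transcription."""
    M, N = F.shape
    c_u = np.where(np.arange(M) == 0, np.sqrt(1.0 / M), np.sqrt(2.0 / M))
    c_v = np.where(np.arange(N) == 0, np.sqrt(1.0 / N), np.sqrt(2.0 / N))
    f = np.empty((M, N))
    for i in range(M):
        cos_u = np.cos((2 * i + 1) * np.arange(M) * np.pi / (2 * M))
        for j in range(N):
            cos_v = np.cos((2 * j + 1) * np.arange(N) * np.pi / (2 * N))
            f[i, j] = np.sum(np.outer(c_u * cos_u, c_v * cos_v) * F)
    return f
```

Because the pair is orthonormal, applying `idct2` to unquantized coefficients recovers the feature map exactly; with quantized coefficients the error is bounded by the quantization step.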
In step S7, minimizing the classification accuracy loss L_AP caused by distortion can be described as:
min L_AP(q1,q2)  s.t.  R(q1,q2) ≤ R_c
where the classification accuracy loss L_AP takes a value between 0 and 1 and R_c denotes the code-rate limit. The higher the classification accuracy, the smaller the accuracy loss. The performance loss can be described as:
L_AP(q1,q2) = 1 − V_AP(q1,q2)
where V_AP(q1,q2) denotes the accuracy obtained from the features recovered after quantization with the current quantization parameters (q1,q2).
As shown in fig. 2, the code rate and accuracy obtained with the different quantization parameters are used to establish a model of the code rate (R) and the accuracy loss (L_AP), which is optimized using Lagrangian multipliers:
J = argmin_{(q1,q2)} (1 − V_AP(q1,q2)) + λ·R(q1,q2)
where L_AP = 1 − V_AP is the accuracy loss relative to operation without any quantization or coding, and R is the code rate.
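Since q1 and q2 range over a small discrete set, the optimization can be solved by exhaustive search over the constrained grid. A sketch follows; the callable `eval_fn` is an illustrative assumption standing in for the actual decode-and-classify measurement of accuracy and rate:

```python
import itertools

def select_quantizers(eval_fn, lam):
    """Exhaustive search over (q1, q2) with q1 >= q2, both in {3,...,8},
    minimizing J = (1 - V_AP(q1, q2)) + lam * R(q1, q2).
    eval_fn(q1, q2) -> (accuracy, rate) is hypothetical; in the method it
    would run dequantization, the remaining network, and rate measurement."""
    best_J, best_pair = float("inf"), None
    for q1, q2 in itertools.product(range(3, 9), repeat=2):
        if q1 < q2:
            continue                      # constraint q1 >= q2
        acc, rate = eval_fn(q1, q2)
        J = (1.0 - acc) + lam * rate
        if J < best_J:
            best_J, best_pair = J, (q1, q2)
    return best_pair, best_J
```

Sweeping λ traces out the rate-accuracy trade-off, so a suitable (q1, q2) can be picked for each scene's code-rate budget.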
In the image compression method, the image features output by the 18th convolutional layer, which has the smallest number of convolution channels, are selected first, compressing the feature quantity a first time. From the feature map partition, the K channels with the smallest variance, representing low information content, are defined as the non-important region, with the remaining areas as the important region. After the DCT, the important region uses light uniform quantization and the non-important region uses coarse uniform quantization, reducing the data size in a targeted manner. Finally, according to the code rate and accuracy obtained with different quantization parameters, a rate and accuracy loss model is established and optimized using Lagrangian multipliers, and suitable quantization parameters are selected according to the code-rate requirement of the scene. In summary, in image feature compression the importance of different channels is fully considered, the feature data is greatly reduced through channel partitioning and block quantization, and a rate and accuracy loss model is built for the variable transmission code rate of the scene so as to select the optimal quantization parameters, effectively improving transmission efficiency.
The embodiment also provides an image feature compression device based on code rate and accuracy optimization, which comprises a memory and a processor, wherein the memory stores a computer program, and the processor calls the computer program to execute the steps of the method.
The foregoing describes in detail preferred embodiments of the present invention. It should be understood that numerous modifications and variations can be made in accordance with the concepts of the invention by one of ordinary skill in the art without undue burden. Therefore, all technical solutions which can be obtained by logic analysis, reasoning or limited experiments based on the prior art by the person skilled in the art according to the inventive concept shall be within the scope of protection defined by the claims.
Claims (10)
1. An image feature compression method based on code rate and accuracy optimization, characterized by comprising the following steps:
inputting an image to be processed into a convolutional neural network for feature extraction, selecting a convolutional layer from the network, and extracting the features output by that layer;
partitioning the feature channels of the convolutional layer output according to feature importance;
tiling the feature channels in order to obtain a feature map, and applying a discrete cosine transform to the whole feature map;
applying block-wise uniform quantization to the transformed feature map according to the importance partition;
arithmetic-coding the quantized feature map to obtain a bitstream, and computing the code rate;
arithmetic-decoding, dequantizing and inverse-transforming the bitstream to recover feature vectors that can be input into the remaining layers of the convolutional neural network;
obtaining the classification accuracy at the output of the convolutional neural network, establishing a rate and accuracy loss model, optimizing the parameters according to this model, and obtaining the optimal quantization parameters for data compression.
2. The method for compressing image features based on code rate and accuracy optimization of claim 1, wherein the selected convolutional layer is a convolutional layer with the least number of convolutional layer channels in a convolutional neural network.
3. The image feature compression method based on code rate and accuracy optimization of claim 1, wherein the partitioning process specifically comprises:
The variance of each characteristic channel is calculated, the first K characteristic channels with the smallest variance are selected and defined as non-important characteristic areas, the rest characteristic channels are defined as important characteristic areas, and K is a positive integer smaller than the total number of the characteristic channels.
4. The image feature compression method based on code rate and accuracy optimization of claim 1, wherein the computational expression of the discrete cosine transform is:
F(u,v) = c(u)·c(v)·Σ_{i=0}^{M−1} Σ_{j=0}^{N−1} f(i,j)·cos[(2i+1)uπ/(2M)]·cos[(2j+1)vπ/(2N)]
u = 0,1,...,M−1
v = 0,1,...,N−1
where c(u) = √(1/M) for u = 0 and √(2/M) otherwise, c(v) = √(1/N) for v = 0 and √(2/N) otherwise, M and N are the length and width of the feature map f(i,j), F(u,v) is the feature after the two-dimensional discrete cosine transform, u, v are its horizontal and vertical coordinates, c(u) and c(v) are compensation coefficients, f(i,j) is the feature map before the two-dimensional DCT, and i, j are its horizontal and vertical pixel coordinates.
5. The image feature compression method based on code rate and accuracy optimization of claim 3, wherein the expression of the block-wise uniform quantization is:
V̂ = round((V − min(V))/(max(V) − min(V))·(2^q − 1))
where V is the feature vector before quantization, V̂ is the quantized feature vector, min(V) and max(V) are the minimum and maximum values in V, and q is the uniform quantization coefficient;
the uniform quantization coefficient of the non-important feature region is the coarse quantization coefficient q2, that of the important feature region is the light quantization coefficient q1, and q1 is not less than q2.
6. The image feature compression method based on code rate and accuracy optimization of claim 5, wherein the light quantization coefficient q1 and the coarse quantization coefficient q2 satisfy the constraint:
q1 ≥ q2, with q1, q2 ∈ {3, 4, 5, 6, 7, 8}.
7. The image feature compression method based on code rate and accuracy optimization of claim 1, wherein the expression of the inverse discrete cosine transform is:
f′(i,j) = Σ_{u=0}^{M−1} Σ_{v=0}^{N−1} c(u)·c(v)·F′(u,v)·cos[(2i+1)uπ/(2M)]·cos[(2j+1)vπ/(2N)]
i = 0,1,...,M−1
j = 0,1,...,N−1
where f′(i,j) is the feature map obtained after the inverse discrete cosine transform, F′(u,v) is the feature obtained after dequantization, M and N are the length and width of the feature map, c(u) and c(v) are the compensation coefficients defined as in the forward transform, u, v are the horizontal and vertical coordinates of F′(u,v), and i, j are the horizontal and vertical pixel coordinates of f′(i,j).
8. The image feature compression method based on code rate and accuracy optimization of claim 1, wherein the expression of the rate and accuracy loss model is:
J = argmin_{(q1,q2)} (1 − V_AP(q1,q2)) + λ·R(q1,q2)
where V_AP(q1,q2) is the accuracy obtained from the features recovered after quantization with the current quantization parameters (q1,q2), λ is a proportionality coefficient, R(q1,q2) is the code rate corresponding to the current quantization parameters (q1,q2), and J is the output value of the rate and accuracy loss model.
9. The method for compressing image features based on code rate and accuracy optimization of claim 1, wherein the code rate and accuracy loss model is optimized using lagrangian multipliers.
10. An image feature compression device based on code rate and accuracy optimization, comprising a memory and a processor, the memory storing a computer program, the processor invoking the computer program to perform the steps of the method of any of claims 1-9.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202410473001.1A CN118413673A (en) | 2024-04-19 | 2024-04-19 | Image feature compression method and device based on code rate and accuracy optimization |
Publications (1)
Publication Number | Publication Date |
---|---|
CN118413673A true CN118413673A (en) | 2024-07-30 |
Family
ID=92002316
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202410473001.1A Pending CN118413673A (en) | 2024-04-19 | 2024-04-19 | Image feature compression method and device based on code rate and accuracy optimization |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN118413673A (en) |
Legal Events

Date | Code | Title | Description
---|---|---|---
| PB01 | Publication |
| SE01 | Entry into force of request for substantive examination |