CN115225896A - Image compression method and device, terminal equipment and computer readable storage medium - Google Patents


Info

Publication number
CN115225896A
CN115225896A (application number CN202110405285.7A)
Authority
CN
China
Prior art keywords
image
feature map
network
block
reversible
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110405285.7A
Other languages
Chinese (zh)
Inventor
肖云雷
刘阳兴
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Wuhan TCL Group Industrial Research Institute Co Ltd
Original Assignee
Wuhan TCL Group Industrial Research Institute Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Wuhan TCL Group Industrial Research Institute Co Ltd filed Critical Wuhan TCL Group Industrial Research Institute Co Ltd
Priority to CN202110405285.7A priority Critical patent/CN115225896A/en
Publication of CN115225896A publication Critical patent/CN115225896A/en
Pending legal-status Critical Current

Classifications

    All classifications fall under H ELECTRICITY • H04 ELECTRIC COMMUNICATION TECHNIQUE • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals:
    • H04N19/124 Quantisation (adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding)
    • H04N19/122 Selection of transform size, e.g. 8x8 or 2x4x8 DCT; selection of sub-band transforms of varying structure or type
    • H04N19/13 Adaptive entropy coding, e.g. adaptive variable length coding [AVLC] or context adaptive binary arithmetic coding [CABAC]
    • H04N19/176 Adaptive coding characterised by the coding unit, the unit being an image region, the region being a block, e.g. a macroblock
    • H04N19/42 Characterised by implementation details or hardware specially adapted for video compression or decompression, e.g. dedicated software implementation
    • H04N19/587 Predictive coding involving temporal sub-sampling or interpolation, e.g. decimation or subsequent interpolation of pictures in a video sequence

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Physics & Mathematics (AREA)
  • Discrete Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

The application is applicable to the technical field of image processing and provides an image compression method and device, a terminal device, and a computer-readable storage medium. The method comprises the following steps: inputting an image to be processed into a reversible network for processing, and outputting a first feature map of a first frequency; inputting the first feature map into a variational encoder network for processing, and outputting a second feature map corresponding to the first feature map; and inputting the second feature map, together with a sampled feature map of a preset distribution corresponding to the second feature map, into the reversible network for inverse operation processing, and outputting a compressed image of the image to be processed. The method and device address the poor performance of traditional image compression, where compressed images are of low quality and poor visual effect: they reduce the traffic consumed when transmitting images or video while improving the compression effect and preserving the visual effect.

Description

Image compression method and device, terminal equipment and computer readable storage medium
Technical Field
The present application belongs to the field of image processing technologies, and in particular, to an image compression method, an image compression device, a terminal device, and a computer-readable storage medium.
Background
With the development of the information age, people in the Internet field obtain information by watching videos or images, and image transmission has become an important mode of communication. As data and information volumes increase, the physical space occupied by video and images grows ever larger; transmitting unprocessed video or images directly over a network occupies a large amount of network bandwidth and consumes a large amount of traffic.
At present, video or images are compressed before transmission to reduce their volume; however, traditional image compression methods perform poorly, and the low quality of the compressed images degrades the user's visual experience.
Disclosure of Invention
The embodiments of the application provide an image compression method and device, a terminal device, and a computer-readable storage medium, which can address the problems that traditional image compression performs poorly and produces low-quality compressed images that degrade the user's viewing experience.
In a first aspect, an embodiment of the present application provides an image compression method, including:
inputting an image to be processed into a reversible network for processing, and outputting a first feature map of a first frequency;
inputting the first feature map into a variational encoder network for processing, and outputting a second feature map corresponding to the first feature map;
and inputting the second feature map, together with a sampled feature map of a preset distribution corresponding to the second feature map, into the reversible network for inverse operation processing, and outputting a compressed image of the image to be processed.
In a second aspect, an embodiment of the present application provides an image compression apparatus, including:
a first processing unit, configured to input the image to be processed into the reversible network for processing and output a first feature map of a first frequency;
a second processing unit, configured to input the first feature map into the variational encoder network for processing and output a second feature map corresponding to the first feature map;
and a third processing unit, configured to input the second feature map, together with a sampled feature map of a preset distribution corresponding to the second feature map, into the reversible network for inverse operation processing, and output a compressed image of the image to be processed.
In a third aspect, an embodiment of the present application provides a terminal device comprising a memory, a processor, and a computer program stored in the memory and executable on the processor; the processor implements the image compression method when executing the computer program.
In a fourth aspect, embodiments of the present application provide a computer-readable storage medium storing a computer program which, when executed by a processor, implements the image compression method.
In a fifth aspect, the present application provides a computer program product which, when run on a terminal device, causes the terminal device to execute any one of the image compression methods of the first aspect.
It is understood that, for the beneficial effects of the second to fifth aspects, reference may be made to the description of the first aspect; they are not repeated here.
Compared with the prior art, the embodiments of the application are advantageous as follows. The terminal device inputs the image to be processed into the reversible network, which outputs a first feature map of a first frequency; the first feature map is input into the variational encoder network, which outputs a second feature map corresponding to it; the second feature map, together with a sampled feature map of a preset distribution corresponding to it, is input into the reversible network, whose inverse operation yields a compressed image of the image to be processed. By processing the image through the reversible network and the variational encoder network, and by using the reversible network's inverse operation, the traffic and bandwidth required to transmit images or video are reduced while the compression effect is improved and the quality of the compressed image is preserved. The method thus has strong usability and practicality.
Drawings
To illustrate the technical solutions in the embodiments of the present application more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. Obviously, the drawings in the following description show only some embodiments of the present application; those skilled in the art can obtain other drawings from them without inventive effort.
Fig. 1 is a schematic flowchart of an application scenario provided in an embodiment of the present application;
FIG. 2 is a flowchart illustrating an image compression method according to an embodiment of the present application;
FIG. 3 is a schematic diagram of a reversible network architecture provided by an embodiment of the present application;
FIG. 4 is a schematic diagram of a network architecture of a variational encoder according to an embodiment of the present application;
FIG. 5 is a schematic diagram of a post-processing network architecture according to an embodiment of the present application;
FIG. 6 is a schematic diagram illustrating a comparison of visual effects of compressed images provided by an embodiment of the present application;
FIG. 7 is a schematic structural diagram of an image compression apparatus according to an embodiment of the present application;
fig. 8 is a schematic structural diagram of a terminal device according to an embodiment of the present application.
Detailed Description
In the following description, for purposes of explanation and not limitation, specific details are set forth, such as particular system structures, techniques, etc. in order to provide a thorough understanding of the embodiments of the present application. It will be apparent, however, to one skilled in the art that the present application may be practiced in other embodiments that depart from these specific details. In other instances, detailed descriptions of well-known systems, devices, circuits, and methods are omitted so as not to obscure the description of the present application with unnecessary detail.
It will be understood that the terms "comprises" and/or "comprising," when used in this specification and the appended claims, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.
It should also be understood that the term "and/or" as used in this specification and the appended claims refers to any and all possible combinations of one or more of the associated listed items and includes such combinations.
As used in this specification and the appended claims, the term "if" may be interpreted contextually as "when", "upon", "in response to determining", or "in response to detecting". Similarly, the phrase "if it is determined" or "if [a described condition or event] is detected" may be interpreted contextually to mean "upon determining", "in response to determining", "upon detecting [the described condition or event]", or "in response to detecting [the described condition or event]".
Furthermore, in the description of the present application and the appended claims, the terms "first," "second," "third," and the like are used for distinguishing between descriptions and not necessarily for describing or implying relative importance.
Reference throughout this specification to "one embodiment" or "some embodiments," or the like, means that a particular feature, structure, or characteristic described in connection with the embodiment is included in one or more embodiments of the present application. Thus, appearances of the phrases "in one embodiment," "in some embodiments," "in other embodiments," or the like, in various places throughout this specification are not necessarily all referring to the same embodiment, but rather mean "one or more but not all embodiments" unless specifically stated otherwise. The terms "comprising," "including," "having," and variations thereof mean "including, but not limited to," unless expressly specified otherwise.
With the development of the Internet, more and more websites and applications are widely developed and used, and users obtain information through images or videos on the network. During network storage or transmission, images and videos occupy a large amount of physical space; storing or transmitting them directly consumes substantial storage space and network bandwidth. If an image can be compressed as much as possible while its main information is retained, a great deal of bandwidth occupancy and traffic consumption can be saved.
Conventional image compression algorithms such as BPG already exist; deep learning has also been applied to the field of image compression, for example by constructing a variational auto-encoding network and combining it with deep learning to compress images. The compression effect of these approaches still awaits further improvement.
Referring to fig. 1, a schematic flowchart of an application scenario provided in an embodiment of the present application shows the processing performed when an image is compressed. As shown in fig. 1, a variational encoder network is combined with a reversible network (other similar super-resolution networks may also be used), so that during compression the image quality is enhanced based on deep learning and the image compression effect is improved.
As shown in fig. 1, when compressing an image with the network architecture provided in this embodiment, the terminal device obtains an image x to be processed; the terminal device may capture the image x in real time through an integrated camera, or x may be a stored image to be uploaded or transmitted. During application, the terminal device performs forward propagation on the image x to be processed: it inputs x into the reversible network, which applies convolution processing and outputs a feature map y. The feature map y is a low-frequency feature map representing the overall information of the image to be processed. The feature map y is then input into the variational encoder network, which outputs a feature map y' after a series of operations such as down-sampling, convolution, normalization, and up-sampling.
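The variational encoder's processing of y into y' is lossy. A common pair of stages in such variational compression codecs, though reduced here to its simplest form, is uniform quantization of a feature map followed by entropy coding of the symbols; the sketch below illustrates this in miniature. The helpers `quantize` and `rate_bits` are hypothetical stand-ins, not the patent's modules: the real network learns a prior over the symbols, while this sketch uses the empirical histogram.

```python
import math
from collections import Counter

# Hedged sketch: uniform (rounding) quantization of a feature map, then the
# Shannon entropy of the symbol histogram as an estimate of the coded bit-rate.
def quantize(vals):
    """Round each feature value to the nearest integer symbol."""
    return [int(round(v)) for v in vals]

def rate_bits(symbols):
    """Total bits needed by an ideal entropy coder using the empirical histogram."""
    counts = Counter(symbols)
    n = len(symbols)
    return -sum(c * math.log2(c / n) for c in counts.values())

feature_map = [0.2, 0.1, -0.3, 1.9, 2.1, 0.05, -0.2, 2.0]
q = quantize(feature_map)
print(q, rate_bits(q))  # most values collapse to 0 or 2, so few bits are needed
```

Because quantization collapses many nearby values onto the same symbol, the histogram becomes peaked and the entropy estimate (and thus the coded size) drops; this is the rate side of the rate-distortion trade-off such codecs optimize.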
The reversible network and the variational encoder network shown in fig. 1 are obtained by training on sample images with a training set and a preset loss function. The preset loss function includes a cross-entropy loss function, which computes, according to a divergence, the loss between the high-frequency image and a sampled image of a preset distribution. The high-frequency image is output when a sample image is processed by the reversible network to be trained, and represents the detailed texture features of the sample image; the preset distribution may be a zero-mean Gaussian distribution corresponding to the high-frequency image.
The terminal device obtains a sampled feature map of the preset distribution for the feature map y'. The distribution of the sampled feature map is the same as that of the sampled image corresponding to the high-frequency image during network training; for example, it is a feature map drawn from a zero-mean Gaussian distribution.
The terminal device then inputs the feature map y' and its preset-distribution sampled feature map into the reversible network again; the reversible network performs the inverse operation on the feature map y' and the sampled feature map and, through backward propagation, outputs the compressed image of the image x to be processed.
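The forward/inverse flow just described can be mimicked on a 1-D toy signal. In the sketch below, `split`/`merge` stand in for the reversible network's forward and inverse passes (a Haar-like averaging/difference pair), rounding stands in for the entire variational encoder step, and the sampled feature map is drawn from a zero-mean Gaussian as in the text. Every name here is an illustrative assumption, not the patent's networks.

```python
import random

def split(x):
    """Forward "reversible network": average (low) and difference (high) bands."""
    low  = [(x[2*i] + x[2*i+1]) / 2.0 for i in range(len(x) // 2)]
    high = [(x[2*i] - x[2*i+1]) / 2.0 for i in range(len(x) // 2)]
    return low, high

def merge(low, high):
    """Exact inverse of split: reconstructs the signal from the two bands."""
    out = []
    for l, h in zip(low, high):
        out += [l + h, l - h]
    return out

x = [10.0, 12.0, 50.0, 52.0, 30.0, 28.0]
y, detail = split(x)                        # y: low-frequency feature map
y_prime = [float(round(v)) for v in y]      # toy stand-in for the variational encoder
random.seed(0)
sampled = [random.gauss(0.0, 0.01) for _ in y_prime]  # preset-distribution sample
compressed = merge(y_prime, sampled)        # inverse pass yields the output image
print(compressed)
```

The output has the same size as the input but its detail band came from the sampled distribution rather than the original, which is why the scheme transmits only the compact representation y' yet still reconstructs a full-resolution image.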
Based on deep learning, the embodiments of the application use the variational encoder network in combination with the reversible network to compress images or video, saving the bandwidth resources and traffic consumed during transmission, improving the compression effect of the image or video, and preserving the quality and display effect of the compressed image or video.
The specific image compression procedure and the training of the reversible network and the variational encoder network are further described below through specific embodiments.
Referring to fig. 2, a schematic flowchart of an image compression method provided in an embodiment of the present application. The image compression method is based on deep learning, combines a variational encoder network with a reversible network, and introduces a post-processing network structure into the variational encoder network to compress images or video. The variational encoder network comprises a nonlinear encoding network, a quantization module, a prior encoding network, an entropy encoding module, a prior decoding network, and a nonlinear decoding network. The reversible network comprises two branches, a low-frequency branch and a high-frequency branch; based on the two branches, the reversible network comprises at least one compression block (downscaling module), and each compression block comprises at least one reversible block (InvBlock) and a conversion block (Haar Transformation). Based on this network architecture, the image compression method comprises the following steps:
step S201, the terminal device inputs the image to be processed into the reversible network for processing, and outputs a first feature map of the first frequency.
In some embodiments, the image to be processed may be an image or video frame captured in real time by the terminal device's shooting device, or an image or video frame stored on the terminal device awaiting upload or transmission. The reversible network may be one compression block or a plurality of compression blocks in series.
The reversible network comprises at least one compression block connected in series; each compression block comprises a conversion block and at least one reversible block connected in series, where the conversion block performs a transformation operation and the reversible blocks perform convolution operations. The conversion block down-samples and filters the input image through a scaling function; for example, a Haar wavelet transform performs low-pass and high-pass filtering in both the horizontal and vertical directions, producing feature maps of different frequencies whose size is halved relative to the input image. Each reversible block comprises a plurality of dense convolutional networks associated with one another through a preset logical relationship. The feature maps are passed through the reversible blocks in turn; after the convolution processing of the dense convolutional networks and the operations of the preset logical relationship, the first feature map of the first frequency is finally output. The first feature map of the first frequency is a low-frequency feature map representing the global features of the input image.
In some embodiments, the terminal device inputting the image to be processed into the reversible network for processing and outputting the first feature map of the first frequency includes:
Step S2011, the terminal device inputs the image to be processed into the reversible network; during processing by one compression block, the terminal device obtains a first conversion feature map and a second conversion feature map of the input image through the conversion block in the current compression block.
The first conversion feature map corresponds to the first frequency and the second conversion feature map corresponds to a second frequency; the input image is the image to be processed or the output image of the previous compression block, where the previous compression block is connected in series with, and adjacent to, the current compression block; and the first frequency is lower than the second frequency.
In some embodiments, the reversible network includes one or more compression blocks connected in series. Each compression block has the same structure, comprising a conversion block and at least one reversible block connected in series, and processes its input image in the same way; the processing of an arbitrary compression block is described here as an example.
Referring to fig. 3, a schematic diagram of a reversible network architecture according to an embodiment of the present application. The reversible network shown in fig. 3 comprises two compression blocks (downscaling modules) in series; in practical applications, the number of compression blocks in series can be increased or decreased as required. Taking the first compression block in the forward propagation of fig. 3 as an example, the input image of this compression block is the image x to be processed, and the filtering processing of the conversion block yields the first conversion feature map x_1 of the first frequency and the second conversion feature map x_2 of the second frequency.
It should be noted that the conversion block may be a Haar wavelet transform module that performs low-pass and high-pass filtering on the input image to obtain the first conversion feature map of the first frequency and the second conversion feature map of the second frequency; the first conversion feature map of the first frequency is a low-frequency feature map, and the second conversion feature map of the second frequency is a high-frequency feature map.
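The Haar wavelet step performed by the conversion block can be written out directly. The helper below is an illustrative pure-Python single-level 2-D Haar analysis (function and subband names are hypothetical, not the patent's): it low-pass and high-pass filters in both the horizontal and vertical directions, producing one low-frequency subband (LL) and three high-frequency subbands (LH, HL, HH), each half the input size, as described above.

```python
def haar2d(img):
    """Split a 2-D array (even height/width) into LL, LH, HL, HH subbands,
    each half the input size: LL is the low-frequency map, the rest high-frequency."""
    h, w = len(img), len(img[0])
    LL = [[0.0] * (w // 2) for _ in range(h // 2)]
    LH = [[0.0] * (w // 2) for _ in range(h // 2)]
    HL = [[0.0] * (w // 2) for _ in range(h // 2)]
    HH = [[0.0] * (w // 2) for _ in range(h // 2)]
    for i in range(0, h, 2):
        for j in range(0, w, 2):
            a, b = img[i][j], img[i][j + 1]
            c, d = img[i + 1][j], img[i + 1][j + 1]
            LL[i // 2][j // 2] = (a + b + c + d) / 4.0  # low-pass in both directions
            LH[i // 2][j // 2] = (a - b + c - d) / 4.0  # high-pass horizontally
            HL[i // 2][j // 2] = (a + b - c - d) / 4.0  # high-pass vertically
            HH[i // 2][j // 2] = (a - b - c + d) / 4.0  # high-pass in both directions
    return LL, LH, HL, HH

img = [[10, 10, 20, 20],
       [10, 10, 20, 20],
       [30, 30, 40, 40],
       [30, 30, 40, 40]]
LL, LH, HL, HH = haar2d(img)
print(LL)  # each subband is 2x2: half the size of the 4x4 input, as stated above
```

On this piecewise-constant input all detail lands in LL while the high-frequency subbands are zero, matching the description of LL as the global (low-frequency) information and the other bands as detail texture.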
In step S2012, the terminal device performs convolution operations on the first conversion feature map and the second conversion feature map through the at least one reversible block connected in series in the current compression block, obtaining a first output image and a second output image.
In some embodiments, each reversible block includes a plurality of dense convolutional networks associated through a preset logical relationship. As shown in fig. 3, phi, rho, and eta in the reversible block are each dense convolutional networks; based on these three networks, the first conversion feature map and the second conversion feature map are processed through the preset logical relationship, and the first output image and the second output image are output.
Understandably, within the reversible block, the convolution processing with the three dense convolutional networks is divided into two branches: a low-frequency branch and a high-frequency branch. The low-frequency branch corresponds to the input first conversion feature map and the output first output image, and the high-frequency branch corresponds to the input second conversion feature map and the output second output image.
In addition, the three dense convolutional networks phi, rho, and eta in each reversible block may share the same layer structure or differ. Each dense convolutional network connects every layer to every other layer in a feed-forward manner: a dense convolutional network with L layers has L(L+1)/2 direct connections, with a connection between every two adjacent layers and between each layer and each of its subsequent layers. Every layer takes the feature maps of all preceding layers as input, and its own feature map serves as input to all subsequent layers. For example, each dense convolutional network may consist of five densely connected layers with a growth rate of k = 4, each layer taking all previous feature maps as input. The convolution operations of the reversible block alleviate the vanishing-gradient problem, strengthen feature propagation, and greatly reduce the number of network parameters.
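The L(L+1)/2 connection count can be verified by enumeration. The sketch below (a hypothetical helper, not part of the patent) lists every direct connection in a densely connected block where each layer receives the block input and all earlier feature maps.

```python
def dense_connections(num_layers):
    """Enumerate (source, destination) pairs in a dense block: layer dst
    (1..L) receives the block input (index 0) and every earlier layer."""
    conns = []
    for dst in range(1, num_layers + 1):
        for src in range(dst):   # input and all layers before dst
            conns.append((src, dst))
    return conns

L = 5  # e.g. the five-layer dense block with growth rate k = 4 mentioned above
conns = dense_connections(L)
print(len(conns))  # 15, i.e. L * (L + 1) / 2 = 5 * 6 / 2
```

Each destination layer dst contributes dst connections, so the total is 1 + 2 + ... + L = L(L+1)/2, matching the text.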
In some embodiments, the terminal device performing convolution operations on the first conversion feature map and the second conversion feature map through the at least one reversible block connected in series in the current compression block to obtain the first output image and the second output image includes:
A1, the terminal device performs a convolution operation on the input feature maps through the current reversible block to obtain a first output feature map and a second output feature map.
The input feature maps are either the first conversion feature map and the second conversion feature map, or the output feature maps of the previous reversible block; the previous reversible block is connected in series with, and adjacent to, the current reversible block.
In some embodiments, as shown in fig. 3, each compression block includes a plurality of reversible blocks in series; for example, each compression block may include one conversion block and eight reversible blocks in series. If the current reversible block is the reversible block connected in series with, and adjacent to, the conversion block, its input feature maps are the first conversion feature map and the second conversion feature map. If the current reversible block is at any other position in the series, including the final position, its input feature maps are the output feature maps of the previous reversible block, which is connected in series with, and adjacent to, the current reversible block.
Illustratively, the terminal device performs convolution operations on the input feature maps of the two branches through the reversible block: as shown in fig. 3, based on deep learning, the three dense convolutional networks phi, rho and eta operate together with the preset logical relationship to produce the first output feature map and the second output feature map of the reversible block. The two output feature maps correspond to the two branches respectively: the first output feature map to the low-frequency branch, and the second output feature map to the high-frequency branch.
In some embodiments, the terminal device performing a convolution operation on the input feature map through the current reversible block to obtain the first output feature map and the second output feature map includes:
the terminal device performs the convolution operation on the input feature map through a preset logical operation formula to obtain the first output feature map and the second output feature map;
the preset logical operation formula is expressed as follows:
$x_1^{l+1} = x_1^{l} + \phi(x_2^{l})$ (1)
$x_2^{l+1} = x_2^{l} \otimes \exp(\rho(x_1^{l+1})) + \eta(x_1^{l+1})$ (2)
where $x_1^{l}$ is the input feature map of the first frequency, $x_2^{l}$ is the input feature map of the second frequency, $x_1^{l+1}$ is the first output feature map, $x_2^{l+1}$ is the second output feature map, φ, ρ and η are the dense convolutional networks, and l and l + 1 denote the states before and after one convolution operation of the reversible block, respectively. The first output feature map corresponds to the first frequency and the second output feature map to the second frequency; the first frequency corresponds to low frequency and the second frequency to high frequency.
In some embodiments, formula (1) describes the low-frequency branch after one logical operation based on the dense convolutional network: the input feature map of the second frequency $x_2^{l}$ passes through the convolution operation of the dense convolutional network φ, and the result is added to $x_1^{l}$ to obtain the first output feature map $x_1^{l+1}$ of the current reversible block. Formula (2) describes the high-frequency branch after one logical operation based on the dense convolutional network: the dense convolutional network ρ performs a convolution operation on the first output feature map $x_1^{l+1}$, the result is used as the exponent of the natural constant e to form an exponential function, and the value of this exponential function is multiplied (tensor product) with the input feature map of the second frequency; the dense convolutional network η then performs a convolution operation on $x_1^{l+1}$, and its result is added to the tensor product to obtain the second output feature map $x_2^{l+1}$.
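The two branch updates just described can be sketched as follows; the simple scalar functions are hypothetical placeholders for the dense convolutional networks φ, ρ and η (not the trained networks), and `np.exp` supplies the exponential scaling of the high-frequency branch:

```python
import numpy as np

# Placeholder stand-ins for the three dense convolutional networks.
def phi(x): return 0.1 * x
def rho(x): return 0.05 * x
def eta(x): return 0.2 * x

def reversible_block_forward(x1, x2):
    """One forward pass of the reversible (coupling) block: an additive
    update on the low-frequency branch, then an exponential-scaled
    affine update on the high-frequency branch."""
    y1 = x1 + phi(x2)                    # low-frequency branch
    y2 = x2 * np.exp(rho(y1)) + eta(y1)  # high-frequency branch
    return y1, y2

x1 = np.array([1.0, 2.0])
x2 = np.array([0.5, -0.5])
y1, y2 = reversible_block_forward(x1, x2)
```

The additive and affine forms are what make the block exactly invertible later, since each update can be undone from the other branch's output.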
A2. The terminal device takes the first output feature map and the second output feature map as the input feature maps of the next reversible block, which is connected in series with and immediately follows the current reversible block and performs a convolution operation on them. Alternatively, if the current reversible block is the last of the at least one series-connected reversible block in the current compression block, the first output feature map is taken as the first output image and the second output feature map as the second output image; and if, in addition, the current compression block is the last of the at least one series-connected compression block, the first output feature map is taken as the first feature map.
In some embodiments, the first output image and the second output image are the output images of each compression block, i.e. the output feature maps of the last reversible block in that compression block; the first output feature map and the second output feature map are the output feature maps of each reversible block within the compression block.
Understandably, every reversible block follows the same processing procedure, as does every compression block: the input image is processed by the series of reversible blocks and compression blocks in the reversible network, the last compression block outputs a first output image at the first frequency and a second output image at the second frequency, and the first output image at the first frequency is taken as the low-frequency first feature map.
The reversible network may include one or more compression blocks; after the input image is processed by each compression block, the feature map output by that block is down-sampled by a factor of two relative to its input. In practical applications, the numbers of reversible blocks and compression blocks in the reversible network may be chosen according to the application scenario.
Step S2013: the terminal device takes the first output image and the second output image as the input images of the next compression block, which is connected in series with and immediately follows the current compression block and processes them; or, if the current compression block is the last of the at least one series-connected compression block, the first output image is taken as the first feature map of the first frequency.
In some embodiments, if the reversible network includes two or more compression blocks in series, the input image is processed by each compression block in turn; for example, the first output image and the second output image of the current compression block are taken as the input images of the next, series-connected and adjacent compression block, which converts and convolves them.
If the current compression block is the last of the series-connected compression blocks in the reversible network, the first output image and the second output image are the final outputs of the reversible network: the first output image at the first frequency is taken as the first feature map y output by the reversible network, and the second output image at the second frequency as the high-frequency feature map z output by the reversible network.
Step S202: the terminal device inputs the first feature map into the variational encoder network for processing and outputs a second feature map corresponding to the first feature map.
In some embodiments, compressing an image or video involves a variational encoder network and a reversible network for each image or video frame. The first feature map is the low-frequency feature map output by the reversible network, representing the global features of the original image; the second feature map is the feature map output after the variational encoder network encodes and decodes the first feature map.
Illustratively, the terminal device inputs the low-frequency feature map output by the reversible network into the variational encoder network, and each module of the variational encoder network encodes and decodes it to output the second feature map. The encoding and decoding comprise the encoding of a nonlinear encoding network and a prior encoding network and the decoding of a prior decoding network and a nonlinear decoding network.
Referring to fig. 4, a schematic diagram of a network architecture of a variational encoder according to an embodiment of the present application is shown. As shown in fig. 4, the variational encoder network includes a non-linear encoding network, a quantization module, an a priori encoding network, an entropy encoding module, an a priori decoding network, and a non-linear decoding network.
In some embodiments, the terminal device inputting the first feature map into the variational encoder network for processing and outputting the second feature map corresponding to the first feature map includes:
Step S2021: the terminal device inputs the first feature map into the variational encoder network; processes it through the nonlinear encoding network in the variational encoder network and outputs a third feature map corresponding to the first feature map; and performs quantization rounding on the third feature map to obtain a fourth feature map.
As shown in fig. 4, in some embodiments the nonlinear encoding network in the variational encoder network includes convolution layers and normalization layers. Each convolution layer has 192 channels and a 5 × 5 convolution kernel, together with a down-sampling by a factor of 2. The normalization layer is a generalized divisive normalization (GDN) layer suited to image reconstruction. The normalization layer described here is not a single computation layer but a normalization processing unit comprising several layers of computation. The calculation of the GDN normalization layer can be expressed as:
$y_i = x_i / \sqrt{\beta_i + \sum_j \gamma_{ij} x_j^2}$ (3)
where $x_i$ is the input feature map of the i-th layer in the normalization processing unit, $\beta_i$ and $\gamma_{ij}$ are trained parameters (for example, $\beta_i$ may be initialized to $10^{-6}$ and $\gamma_{ij}$ to 0.1), and $y_i$ is the output feature map of the i-th layer in the normalization processing unit.
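A minimal numpy sketch of this divisive normalization, assuming the standard GDN form in which each channel is divided by a learned norm of all channels' squared responses (the β and γ values here are illustrative constants, not trained parameters):

```python
import numpy as np

def gdn(x, beta, gamma):
    """Sketch of generalized divisive normalization: each channel
    response is divided by the square root of beta plus a weighted sum
    of all channels' squared responses.
    Shapes: x (C, H, W), beta (C,), gamma (C, C)."""
    denom = np.sqrt(beta[:, None, None] +
                    np.einsum('ij,jhw->ihw', gamma, x ** 2))
    return x / denom

# Illustrative constants in place of trained parameters.
x = np.ones((2, 2, 2))
beta = np.full(2, 1e-6)
gamma = np.full((2, 2), 0.1)
y = gdn(x, beta, gamma)
```

For an all-ones input with these constants, every output equals 1/sqrt(1e-6 + 0.2); real GDN layers learn β and γ jointly with the convolutions.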
Illustratively, the terminal device inputs the first feature map into the nonlinear encoding network, which outputs a third feature map Y after the convolution and normalization layers; the quantization module then performs quantization rounding on the third feature map Y to obtain a fourth feature map YQ.
The quantization module adds random noise to the pixels of the third feature map Y and performs a rounding operation directly on them; the added random noise may be uniform noise with a value range of -0.5 to +0.5. A random number of the corresponding type is generated according to the type of uniform noise and added to each pixel value of the third feature map Y, the resulting pixel values are rounded, and the rounded values are clipped to the range [0, 255], yielding the fourth feature map YQ.
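The noise-then-round-then-clip procedure can be sketched as follows (a simplified stand-in for the quantization module; the fixed `rng` seed is only for reproducibility):

```python
import numpy as np

def quantize(feature_map, rng=None):
    """Sketch of the quantization rounding described above: add uniform
    noise in [-0.5, 0.5), round to the nearest integer, and clip the
    result to the range [0, 255]."""
    rng = rng or np.random.default_rng(0)
    noise = rng.uniform(-0.5, 0.5, size=np.shape(feature_map))
    return np.clip(np.rint(feature_map + noise), 0, 255).astype(np.int64)

q = quantize(np.array([10.2, 300.0, -4.0]))
```

Values above 255 or below 0 are clipped to the ends of the range; in-range values land on one of their two neighboring integers depending on the drawn noise.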
Step S2022: the terminal device inputs the third feature map into the prior encoding network in the variational encoder network for processing and outputs a fifth feature map; and performs quantization rounding on the fifth feature map to obtain a sixth feature map.
In some embodiments, the prior encoding network in the variational encoder network includes convolution, normalization and activation layers. As shown in fig. 4, the third feature map Y is input into the prior encoding network while quantization rounding is applied to it. The prior encoding network includes a normalization layer Abs, a convolution layer Conv 128 × 3 × 3, an activation layer ReLU, and a convolution layer Conv 128 × 5 × 5/2↓. A fifth feature map Z is output through the layers of the prior encoding network, and the quantization module applies the same quantization rounding as above to the fifth feature map Z to obtain a sixth feature map ZQ.
Step S2023: the terminal device processes the sixth feature map through the entropy coding module to obtain a probability map of the sixth feature map; performs arithmetic encoding on the sixth feature map and its probability map to obtain a first binary file; and performs arithmetic decoding on the first binary file to recover the sixth feature map.
In some embodiments, the terminal device processes the sixth feature map ZQ through the entropy coding module to obtain a probability map ZP of the sixth feature map ZQ. The probability map ZP is calculated by the entropy coding model of the entropy coding module, whose formula is:
$p(y \mid z) = \prod_i \mathcal{N}(y_i; 0, \sigma_i^2)$ (4)
where p is the matrix or vector representing the probability map ZP, $y_i$ is the i-th feature point of the sixth feature map ZQ, $\mathcal{N}$ is the Gaussian distribution function, σ is the variance (a corresponding variance can be generated from the autocorrelation of each feature point of the sixth feature map ZQ), y denotes the sixth feature map ZQ, and z denotes the probability map ZP.
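The per-feature-point Gaussian factor of this entropy model can be sketched as follows; the variance floor is an added safeguard for illustration, not part of the described model:

```python
import numpy as np

def gaussian_density(y, sigma):
    """Zero-mean Gaussian density for one feature point, as used by a
    factorized Gaussian entropy model: each feature point gets its own
    standard deviation, and the product of these densities over all
    points gives the likelihood the arithmetic coder works with."""
    sigma = np.maximum(sigma, 1e-9)  # guard against zero variance
    return np.exp(-0.5 * (y / sigma) ** 2) / (sigma * np.sqrt(2.0 * np.pi))

p0 = gaussian_density(0.0, 1.0)  # peak of the standard Gaussian
```

Smaller σ concentrates probability near zero, so near-zero feature points cost fewer bits under the arithmetic coder.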
In some embodiments, the sixth feature map ZQ and the probability map ZP are fed to the arithmetic encoding module AE and encoded by it. Arithmetic coding is a form of entropy coding that encodes different characters differently based on their probability of occurrence in the image data of the sixth feature map ZQ and the probability map ZP, yielding the first binary file z_bit. For example, the interval [0, 1) is repeatedly divided into subintervals, each representing one character of the image data, with the size of each subinterval proportional to the probability of the corresponding character appearing in the image data; the greater the probability, the larger the subinterval, and all subintervals together make up exactly the interval [0, 1). Arithmetic decoding is the inverse of arithmetic encoding: the first binary file z_bit is input into the arithmetic decoding module AD, which, combined with the probability map ZP, decodes and outputs the feature map ZQ.
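The interval-subdivision idea can be illustrated with a toy coder; this is a conceptual sketch only, since real coders such as the AE/AD modules use renormalized integer arithmetic and emit a bit string rather than a floating-point interval:

```python
def arithmetic_interval(message, probs):
    """Toy illustration of arithmetic coding: [0, 1) is repeatedly
    split in proportion to character probabilities, so the whole
    message maps to one final subinterval whose width equals the
    product of the characters' probabilities."""
    # Cumulative probability boundaries for each character.
    bounds, c = {}, 0.0
    for ch, p in probs.items():
        bounds[ch] = (c, c + p)
        c += p
    low, high = 0.0, 1.0
    for ch in message:
        span = high - low
        lo_f, hi_f = bounds[ch]
        low, high = low + span * lo_f, low + span * hi_f
    return low, high

# 'a' and 'b' equally likely: "ab" narrows [0, 1) to [0, 0.5), then [0.25, 0.5)
low, high = arithmetic_interval("ab", {"a": 0.5, "b": 0.5})
```

Any number inside the final subinterval identifies the message, which is why more probable messages (wider intervals) need fewer bits.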
Step S2024: the terminal device inputs the sixth feature map into the prior decoding network in the variational encoder network for processing and outputs a first variance parameter.
In some embodiments, the sixth feature map ZQ is input into the prior decoding network and processed by it to obtain the first variance parameter. As shown in fig. 4, the prior decoding network includes activation layers and convolution layers; the activation layers use the softplus function, and the convolution layers comprise a convolution operation with 128 channels and a 3 × 3 kernel, a convolution operation with 128 channels and a 5 × 5 kernel, and an up-sampling operation by a factor of 2.
Step S2025: the terminal device inputs the fourth feature map and the first variance parameter into the entropy coding module to obtain a probability map of the fourth feature map; performs arithmetic encoding on the fourth feature map and its probability map to obtain a second binary file; and performs arithmetic decoding on the second binary file to recover the fourth feature map.
In some embodiments, the terminal device inputs the fourth feature map YQ and the first variance parameter into the entropy coding module and obtains the probability map YP of the fourth feature map via formula (4). The fourth feature map YQ and the probability map YP are input into the arithmetic encoding module AE, which, based on the same encoding principle as above, produces a second binary file y_bit.
In some embodiments, the second binary file y_bit is input into the arithmetic decoding module AD, which decodes it in combination with the probability map YP and outputs the feature map YQ.
It should be noted that during image transmission the terminal device transmits the binary files of the image rather than the image itself; since the binary files are much smaller than the image, this saves traffic and bandwidth. When restoring the image content, decoding is likewise performed from the binary files.
Step S2026: the terminal device inputs the fourth feature map into the nonlinear decoding network and the post-processing network in the variational encoder network for processing and outputs the second feature map corresponding to the first feature map.
In some embodiments, the terminal device inputs the fourth feature map YQ into the nonlinear decoding network and the post-processing network in the variational encoder network, which output the second feature map after convolution processing.
In some embodiments, the nonlinear decoding network includes convolution layers and inverse normalization layers. As shown in fig. 4, the convolution layer comprises a convolution operation with 192 channels and a 5 × 5 kernel and an up-sampling operation by a factor of 2. The inverse normalization layer performs the inverse of the GDN normalization layer in the nonlinear encoding network.
Illustratively, as shown in fig. 5, the post-processing network provided in this embodiment of the present application includes convolution layers, a residual module, and activation layers. The convolution layers have 32 channels and 3 × 3 convolution kernels; the residual module (Res Block) improves the performance of the whole network, raising its accuracy and precision; and the activation layers use the Leaky ReLU activation function, improving the stability of the whole network.
Step S203: the terminal device inputs the second feature map and a sampling feature map of a preset distribution corresponding to the second feature map into the reversible network for inverse operation processing and outputs the compressed image of the image to be processed.
In some embodiments, the terminal device performs inverse operation processing on the image processed by the variational encoder network through the reversible network, proceeding step by step in the reverse of the original reversible network structure. For example, suppose the reversible network comprises two compression blocks in series, each comprising one conversion block and eight reversible blocks in series. The second feature map is input into the reversible network; within each compression block, each reversible block performs its operation with the logic opposite to the preset logic of the forward-propagation calculation, and its output feature map serves as the input feature map of the next series-connected, adjacent reversible block (here "next" refers to the order of the back-propagation calculation); the output feature map of the last reversible block serves as the input feature map of the conversion block, which performs the inverse conversion to produce the output feature map of the current compression block. The terminal device takes the output feature map of the current compression block as the input feature map of the next series-connected, adjacent compression block (again, "next" in the back-propagation order); or, for the last compression block, takes its output feature map as the output of the reversible network's inverse operation processing, i.e. the compressed image.
The sampling feature map of the preset distribution corresponding to the second feature map may be a sampled image drawn for the second feature map according to a Gaussian distribution.
In some embodiments, the terminal device inputting the second feature map and the corresponding sampling feature map of the preset distribution into the reversible network for inverse operation processing and outputting the compressed image of the image to be processed includes:
Step S2031: the terminal device performs inverse operation processing on the second feature map and the corresponding sampling feature map of the preset distribution through the reversible blocks in the current compression block to obtain an output feature map of the first frequency and an output feature map of the second frequency, the first frequency being lower than the second frequency;
Step S2032: the terminal device inputs the output feature maps of the first and second frequencies into the conversion block in the current compression block for inverse operation processing to obtain an inverse operation image;
Step S2033: the terminal device takes the inverse operation image as the input image of the next compression block, the next and current compression blocks being two series-connected, adjacent compression blocks in the inverse operation process; or, if the current compression block is the last one processed in the inverse operation, takes the inverse operation image as the compressed image of the image to be processed.
The formulas of the inverse operation of the reversible block are expressed as follows:
$x_2^{j} = (x_2^{j+1} - \eta(x_1^{j+1})) \otimes \exp(-\rho(x_1^{j+1}))$ (5)
$x_1^{j} = x_1^{j+1} - \phi(x_2^{j})$ (6)
where $x_1^{j+1}$ is the second feature map or the input feature map of the first frequency of the reversible block during the inverse operation process, $x_2^{j+1}$ is the sampling feature map of the preset distribution corresponding to the second feature map or the input feature map of the second frequency of the reversible block during the inverse operation process, $x_1^{j}$ is the output feature map of the first frequency of the reversible block during the inverse operation process, $x_2^{j}$ is the output feature map of the second frequency of the reversible block during the inverse operation process, φ, ρ and η are the dense convolutional networks, and j + 1 and j denote the states before and after one inverse operation of the reversible block, respectively.
It should be noted that the terminal device performs inverse operation processing through the reversible network: within a reversible block, the operations are carried out with the logic opposite to the preset logic of the forward-propagation calculation, as in formulas (5) and (6), in which normal convolution operations are still performed by the dense convolutional networks.
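A minimal numpy sketch of this inverse logic, with simple placeholder functions standing in for the trained dense networks φ, ρ and η: the affine high-frequency update is undone first by subtracting η and dividing out the exponential of ρ (formula (5)), then the additive low-frequency update is undone (formula (6)), so the forward pass is recovered exactly:

```python
import numpy as np

# Placeholder stand-ins for the dense convolutional networks.
def phi(x): return 0.1 * x
def rho(x): return 0.05 * x
def eta(x): return 0.2 * x

def forward(x1, x2):
    """Forward coupling of the reversible block."""
    y1 = x1 + phi(x2)
    y2 = x2 * np.exp(rho(y1)) + eta(y1)
    return y1, y2

def inverse(y1, y2):
    """Inverse operation, formulas (5)-(6): undo the affine
    high-frequency update first, then the additive update."""
    x2 = (y2 - eta(y1)) * np.exp(-rho(y1))   # formula (5)
    x1 = y1 - phi(x2)                         # formula (6)
    return x1, x2

x1 = np.array([1.0, 2.0])
x2 = np.array([0.5, -0.5])
r1, r2 = inverse(*forward(x1, x2))
assert np.allclose(r1, x1) and np.allclose(r2, x2)  # exact invertibility
```

No information is lost in the forward pass, which is what lets the same network reconstruct the image during decompression.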
According to the embodiments of the present application, the terminal device inputs the image to be processed into the reversible network, which processes it and outputs the first feature map of the first frequency; inputs the first feature map into the variational encoder network, which processes it and outputs the corresponding second feature map; and inputs the second feature map and the corresponding sampling feature map of the preset distribution into the reversible network, whose inverse operation processing yields the compressed image of the image to be processed. The reversible network comprises at least one compression block in series, each compression block comprising a conversion block and at least one reversible block in series, the conversion block performing conversion operations and the reversible blocks performing convolution operations; the variational encoder network performs encoding and decoding operations. By processing the image to be processed through the reversible network and the variational encoder network and applying the inverse operation processing of the reversible network, the traffic and bandwidth required for image or video transmission are reduced while the image compression effect is improved and the quality of the compressed image is ensured.
In some embodiments, each sub-network or module in the reversible network and the variational encoder network used in practical application is obtained by training on the sample images of a training set. The training process of the reversible network and the variational encoder network is further described below through specific embodiments. Before the image to be processed is input into the reversible network for processing and the first feature map of the first frequency is output, the training process of the image compression method includes:
1. The terminal device obtains a sample image from the training set, inputs it into the reversible network to be trained for processing, and outputs a first image of a first frequency and a second image of a second frequency.
In some embodiments, the first frequency is lower than the second frequency; the first image of the first frequency corresponds to a low-frequency feature map and the second image of the second frequency to a high-frequency feature map. The low-frequency feature map represents the global features of the original sample image, and the high-frequency feature map its detail and texture features.
2. The terminal device inputs the first image into the nonlinear encoding network in the variational encoder network to be trained for processing and outputs a third image.
In some embodiments, following the same processing principle as in practical application above, the first image is processed by the nonlinear encoding network in the variational encoder network to be trained to obtain the third image.
3. The terminal device performs quantization rounding on the third image to obtain a fourth image; following the same processing principle as in practical application, the quantization module performs quantization rounding on the third image to obtain the fourth image.
4. The terminal device inputs the third image into the prior encoding network in the variational encoder network to be trained for processing and quantization rounding, and outputs a fifth image; following the same processing principle as in practical application, the third image is processed by the prior encoding network and the quantization module in the variational encoder network to be trained to obtain the fifth image.
5. The terminal device inputs the fifth image into the prior decoding network in the variational encoder network to be trained for processing and outputs a second variance parameter; following the same processing principle as in practical application, the fifth image is processed by the prior decoding network in the variational encoder network to be trained to output the second variance parameter.
6. The terminal device inputs the fourth image into the nonlinear decoding network and the post-processing network in the variational encoder network to be trained for processing and outputs a sixth image; following the same processing principle as in practical application, the fourth image is processed by the nonlinear decoding network and the post-processing network in the variational encoder network to be trained to obtain the sixth image.
7. The terminal device inputs the sixth image and the second image into the reversible network to be trained for inverse operation processing and outputs a compressed image of the sample image; following the same inverse operation principle as in practical application, the reversible network to be trained performs inverse operation processing on the sixth image and the second image to obtain the compressed image of the sample image.
8. The terminal device calculates a first loss value through a first loss function from the sample image and the compressed image of the sample image.
In some embodiments, the first loss function may be a norm loss function that minimizes the squared differences between the sample image and the compressed image; for example, the first loss function is expressed as follows:
$loss_{recon} = \frac{1}{m} \sum_{i=1}^{m} (x_i - \hat{x}_i)^2$ (7)
where $loss_{recon}$ is the first loss value, x is the sample image, $\hat{x}$ is the compressed image corresponding to the sample image, i indexes the feature points, and m is the total number of feature points.
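A direct sketch of this loss, taken here as the squared differences averaged over the m feature points (one common normalization of the sum-of-squares loss):

```python
import numpy as np

def recon_loss(x, x_hat):
    """First loss function: average of squared differences between the
    feature points of the sample image and its compressed image."""
    x = np.asarray(x, dtype=float)
    x_hat = np.asarray(x_hat, dtype=float)
    return float(np.mean((x - x_hat) ** 2))

loss = recon_loss([1.0, 2.0, 3.0], [1.0, 2.0, 5.0])
```

Identical images give a loss of zero; here only the third feature point differs (by 2), so the loss is 4/3.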
9. The terminal device calculates a second loss value through the first loss function from the sixth image and the down-sampled image of the preset multiple corresponding to the sample image.
In some embodiments, the preset multiple of the down-sampled image is determined by the number of compression blocks in the reversible network to be trained, each compression block corresponding to a two-times down-sampling; for example, if the reversible network in fig. 3 includes two compression blocks, it corresponds to a four-times down-sampled image (i.e. the size of the sampled image is one fourth of the size of the original sample image).
Accordingly, the second loss value $loss_{guide}$ is calculated using the same first loss function as above, e.g. formula (7).
10. The terminal device calculates a third loss value through the first loss function from the first image and the sixth image.
Following the same calculation principle as above, the third loss value $loss_{comp}$ is calculated through the first loss function, as in formula (7).
11. The terminal device calculates a fourth loss value through a second loss function from the second image and the sampling map of the preset distribution corresponding to the second image.
In some embodiments, the preset distribution may be a zero-mean Gaussian distribution; the second loss function may be a cross-entropy loss function; and the divergence may be the JS (Jensen-Shannon) divergence, which measures the similarity of two probability distributions. The fourth loss value $loss_{distr}$ between the second image and the sampled image of the preset distribution is calculated from the divergence through the cross-entropy loss function.
12. And the terminal equipment inputs the fifth image into an entropy coding module and calculates to obtain a fifth loss value.
Calculating a fifth loss value loss _ z by an entropy coding module through the autocorrelation of the feature points of the fifth image by using the formula (4) entropy
13. And the terminal equipment inputs the fourth image and the second variance parameter into an entropy coding module, and calculates to obtain a sixth loss value.
The sixth loss value loss_y_entropy is calculated by the entropy coding module from the feature points of the fourth image and the second variance parameter, using equation (4).
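Equation (4) is not reproduced in this text; as a hedged sketch of the general idea behind these two entropy losses, the expected code length of quantized feature points can be estimated as the negative log-probability that a zero-mean Gaussian with a given scale assigns to each quantization bin (the function names and the exact probability model are assumptions):

```python
import math

def gaussian_cdf(x, sigma):
    # CDF of a zero-mean Gaussian with standard deviation sigma.
    return 0.5 * (1.0 + math.erf(x / (sigma * math.sqrt(2.0))))

def rate_bits(values, sigma):
    # Estimated code length in bits: -log2 of the probability mass that
    # the Gaussian assigns to each integer quantization bin [v-0.5, v+0.5].
    total = 0.0
    for v in values:
        p = gaussian_cdf(v + 0.5, sigma) - gaussian_cdf(v - 0.5, sigma)
        total += -math.log2(max(p, 1e-9))
    return total

# Feature points far from the mean are less probable and cost more bits.
print(rate_bits([0.0], 0.5), rate_bits([1.0], 0.5))
```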
14. And the terminal equipment performs joint training on the reversible network to be trained and the variational encoder network to be trained through the first loss value, the second loss value, the third loss value, the fourth loss value, the fifth loss value and the sixth loss value, and adjusts the network parameters of the reversible network to be trained and the network parameters of the variational encoder network to be trained to obtain the reversible network and the variational encoder network.
In some embodiments, the total loss value loss in the training process is calculated according to the total loss function, and the network parameters of the reversible network and the variational encoder network are jointly trained and adjusted according to the total loss value loss. Wherein the total loss function is expressed as follows:
loss = λ·(loss_recon + loss_guide + loss_distr + loss_comp) + loss_y_entropy + loss_z_entropy    (8)
λ is a parameter balancing compression ratio and image quality: the larger λ is, the lower the compression ratio and the better the quality of the recovered image. For example, λ may take the value 1.0/2560.
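The total loss of equation (8) can be transcribed directly; the individual loss values are assumed to be precomputed scalars:

```python
def total_loss(loss_recon, loss_guide, loss_distr, loss_comp,
               loss_y_entropy, loss_z_entropy, lam=1.0 / 2560):
    # Equation (8): lam (lambda) trades compression ratio against the
    # quality of the recovered image; entropy terms are unweighted.
    return (lam * (loss_recon + loss_guide + loss_distr + loss_comp)
            + loss_y_entropy + loss_z_entropy)
```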
In addition, training stops when the number of training iterations reaches a preset number or when the total loss value no longer decreases, yielding the trained reversible network and the trained variational encoder network. After training of the reversible network and the variational encoder network is completed, the trained networks are tested on a test set; the Kodak dataset is selected as the test set for comparison.
It should be understood that, the sequence numbers of the steps in the foregoing embodiments do not imply an execution sequence, and the execution sequence of each process should be determined by its function and inherent logic, and should not constitute any limitation to the implementation process of the embodiments of the present application.
In the test process, the results are averaged over the compression of the 24 Kodak images. The reversible structure and the variational encoder are jointly trained to obtain one set of image compression test results; for comparison, the variational encoder alone is also trained separately to obtain a second set. These two sets of results are then compared with each other and with the results of the BPG image compression algorithm. As shown in Table 1, the comparison covers Peak Signal-to-Noise Ratio (PSNR), Multi-Scale Structural Similarity (MS-SSIM) and bits per pixel (Bpp). For the joint-training results, even at a lower Bpp than the separate-training results, the PSNR is relatively higher, and also higher than that of BPG. Although the MS-SSIM of the joint-training results is lower than that of the separate-training results, subjective visual comparison, shown in fig. 6(a) for separate training and fig. 6(b) for joint training, indicates that the visual quality of image compression with joint training is significantly better than that with separate training.
[Table 1: comparison of PSNR, MS-SSIM and Bpp for joint training, separate training and BPG; table image not reproduced in this text]
TABLE 1
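For reference, two of the scalar metrics compared in Table 1 can be computed with standard formulas (this is a generic sketch, not code from the patent; MS-SSIM is omitted because it requires a full multi-scale implementation):

```python
import math

def psnr(orig, recon, max_val=255.0):
    # Peak Signal-to-Noise Ratio in dB between two equal-length pixel lists.
    mse = sum((a - b) ** 2 for a, b in zip(orig, recon)) / len(orig)
    return float("inf") if mse == 0 else 10.0 * math.log10(max_val ** 2 / mse)

def bpp(num_bits, height, width):
    # Bits per pixel of the compressed representation.
    return num_bits / (height * width)
```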
According to the embodiments of the present application, the terminal device inputs the image to be processed into the reversible network, which processes it and outputs a first feature map of a first frequency; the first feature map is input into the variational encoder network, which processes it and outputs a corresponding second feature map; the second feature map, together with a sampling feature map of a preset distribution corresponding to it, is then input into the reversible network for inverse operation processing, yielding the compressed image of the image to be processed. The reversible network comprises at least one compression block in series, each compression block comprising a conversion block and at least one reversible block in series, the conversion block performing a conversion operation and the reversible block performing a convolution operation; the variational encoder network performs encoding and decoding operations. By processing the image through the reversible network and the variational encoder network, and by exploiting the inverse operation of the reversible network, the traffic and bandwidth required for image or video transmission are reduced while the image compression effect is improved and the quality of the compressed image is preserved.
Fig. 7 shows a block diagram of an image compression apparatus provided in an embodiment of the present application, corresponding to the image compression method described in the above embodiment, and only the relevant parts of the embodiment of the present application are shown for convenience of description.
Referring to fig. 7, the apparatus includes:
a first processing unit 71, configured to input an image to be processed into a reversible network for processing, and output a first feature map of a first frequency;
a second processing unit 72, configured to input the first feature map into the variational encoder network for processing, and output a second feature map corresponding to the first feature map;
and the third processing unit 73 is configured to input the second feature map and the sampling feature map of the preset distribution corresponding to the second feature map into the reversible network for inverse operation processing, and to output a compressed image of the image to be processed.
In some embodiments, the reversible network comprises at least one compression block in series, each compression block comprising a conversion block and at least one reversible block in series, the conversion block for performing a conversion operation and the reversible block for performing a convolution operation; the variational encoder network is used to perform codec operations.
In some embodiments, the first processing unit 71 includes:
the conversion module is used for inputting the image to be processed into the reversible network; in the process of processing through one compression block, a first conversion characteristic diagram and a second conversion characteristic diagram of an input image are obtained through a conversion block in a current compression block, the first conversion characteristic diagram corresponds to a first frequency, the second conversion characteristic diagram corresponds to a second frequency, the input image is an image to be processed or an output image of a previous compression block, the previous compression block is connected with and adjacent to the current compression block in series, and the first frequency is smaller than the second frequency;
the reversible module is used for performing convolution operation on the first conversion characteristic diagram and the second conversion characteristic diagram through at least one reversible block connected in series in the current compression block to obtain a first output image and a second output image;
the first output sub-module is used for taking the first output image and the second output image as input images of a next compression block, and the next compression block processes the first output image and the second output image, and the current compression block and the next compression block are connected in series and are adjacent; or if the current compression block is the last compression block in the at least one compression block in the series connection, the first output image is used as the first feature map of the first frequency.
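The patent does not specify the conversion operation inside the conversion block; a common invertible choice for splitting an input into a low-frequency (first-frequency) part and a high-frequency (second-frequency) part is a Haar-style transform, sketched here in 1-D purely as an assumed illustration:

```python
def haar_split(x):
    # Hypothetical conversion block: pairwise Haar transform of a 1-D signal.
    # low (first frequency) holds pair averages, high (second frequency)
    # holds pair half-differences.
    low = [(x[i] + x[i + 1]) / 2 for i in range(0, len(x), 2)]
    high = [(x[i] - x[i + 1]) / 2 for i in range(0, len(x), 2)]
    return low, high

def haar_merge(low, high):
    # Exact inverse, showing why the conversion operation is itself
    # invertible and can be run backwards during inverse operation processing.
    x = []
    for l, h in zip(low, high):
        x.extend([l + h, l - h])
    return x

signal = [1.0, 3.0, 2.0, 6.0]
low, high = haar_split(signal)
assert haar_merge(low, high) == signal
```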
In some embodiments, the conversion module comprises:
the convolution processing submodule is used for performing convolution operation on the input feature map through the current reversible block to obtain a first output feature map and a second output feature map, wherein the input feature map comprises a first conversion feature map and a second conversion feature map or an output feature map of a previous reversible block, and the previous reversible block is connected with the current reversible block in series and is adjacent to the current reversible block;
the second output submodule is used for taking the first output characteristic diagram and the second output characteristic diagram as input characteristic diagrams of a next reversible block, and performing convolution operation on the first output characteristic diagram and the second output characteristic diagram by the next reversible block, wherein the current reversible block is connected with and adjacent to the next reversible block in series; or if the current reversible block is the last reversible block in at least one reversible block connected in series in the current compression block, the first output feature map is used as the first output image, and the second output feature map is used as the second output image.
In some embodiments, the convolution processing sub-module is further configured to:
performing a convolution operation on the input feature map through a preset logical operation formula to obtain a first output feature map and a second output feature map; the preset logical operation formula is expressed as follows:

x1^(l+1) = x1^l + φ(x2^l)

x2^(l+1) = x2^l ⊙ exp(ρ(x1^(l+1))) + η(x1^(l+1))

wherein x1^l is the input feature map of the first frequency, x2^l is the input feature map of the second frequency, x1^(l+1) is the first output feature map, x2^(l+1) is the second output feature map, φ, ρ and η are each dense convolution networks, l and l+1 represent the order before and after one convolution operation of the reversible block, the first output feature map corresponds to the first frequency, and the second output feature map corresponds to the second frequency.
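A toy sketch of an affine-coupling reversible block of the kind described above, with φ, ρ and η replaced by fixed hypothetical maps rather than learned dense convolution networks (the learned form is not reproduced in this text); the round trip shows why the block is exactly invertible:

```python
import math

# Fixed stand-ins for the dense convolution networks phi, rho and eta.
phi = lambda x: [0.1 * v for v in x]
rho = lambda x: [0.05 * v for v in x]
eta = lambda x: [0.2 * v for v in x]

def reversible_block_forward(x1, x2):
    # Layer l -> l+1: additive update of the first-frequency branch,
    # then affine (scale-and-shift) update of the second-frequency branch.
    y1 = [a + p for a, p in zip(x1, phi(x2))]
    y2 = [a * math.exp(r) + e for a, r, e in zip(x2, rho(y1), eta(y1))]
    return y1, y2

def reversible_block_inverse(y1, y2):
    # Layer j+1 -> j: undo the affine step first, then the additive step.
    x2 = [(a - e) * math.exp(-r) for a, r, e in zip(y2, rho(y1), eta(y1))]
    x1 = [a - p for a, p in zip(y1, phi(x2))]
    return x1, x2

x1, x2 = [1.0, -2.0], [0.5, 3.0]
y1, y2 = reversible_block_forward(x1, x2)
rx1, rx2 = reversible_block_inverse(y1, y2)
assert all(abs(a - b) < 1e-9 for a, b in zip(rx1, x1))
assert all(abs(a - b) < 1e-9 for a, b in zip(rx2, x2))
```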
In some embodiments, the second processing unit 72 includes:
the first processing submodule is used for inputting the first characteristic diagram into the variation encoder network; processing the first characteristic diagram through a nonlinear coding network in a variation coder network, and outputting a third characteristic diagram corresponding to the first characteristic diagram;
the second processing submodule is used for carrying out quantization rounding processing on the third characteristic diagram to obtain a fourth characteristic diagram;
the third processing submodule is used for inputting the third feature map into a prior coding network in the variation coder network for processing and outputting a fifth feature map;
the fourth processing submodule is used for carrying out quantization rounding processing on the fifth characteristic diagram to obtain a sixth characteristic diagram;
the fifth processing submodule is used for processing the sixth feature map through the entropy coding module to obtain a probability map of the sixth feature map;
the sixth processing submodule is used for carrying out arithmetic coding processing on the sixth feature map and the probability map of the sixth feature map to obtain a first binary file;
the seventh processing submodule is used for performing arithmetic decoding processing on the first binary file to obtain a sixth characteristic diagram;
the eighth processing submodule is used for inputting the sixth feature map into a priori decoding network in the variational encoder network for processing and outputting a first variance parameter;
the ninth processing sub-module is used for inputting the fourth feature map and the first variance parameter into the entropy coding module to obtain a probability map of the fourth feature map;
the tenth processing submodule is used for carrying out arithmetic coding processing on the fourth feature map and the probability map of the fourth feature map to obtain a second binary file;
the eleventh processing submodule is used for performing arithmetic decoding processing on the second binary file to obtain a fourth characteristic diagram;
and the twelfth processing submodule is used for inputting the fourth feature diagram into a nonlinear decoding network and a post-processing network in the variational encoder network for processing and outputting a second feature diagram corresponding to the first feature diagram.
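The twelve sub-modules above can be summarized in a control-flow sketch. Here `nets` is a hypothetical dict of callables standing in for the learned sub-networks, and arithmetic coding is modeled as an identity round trip, since lossless coding decodes back to the exact quantized tensors:

```python
def variational_encode_decode(first_map, nets):
    """Minimal control-flow sketch of the second processing unit 72.

    Assumes feature maps are flat lists of floats; real tensors, the
    entropy coding module and the arithmetic coder are all elided.
    """
    y = nets["nonlinear_encoder"](first_map)   # third feature map
    y_hat = [round(v) for v in y]              # fourth feature map (quantized)
    z = nets["prior_encoder"](y)               # fifth feature map
    z_hat = [round(v) for v in z]              # sixth feature map (quantized)
    # z_hat would be arithmetic-coded with its probability map into the
    # first binary file, then decoded back losslessly.
    sigma = nets["prior_decoder"](z_hat)       # first variance parameter
    # y_hat would be arithmetic-coded using sigma into the second binary
    # file, then decoded back losslessly.
    second_map = nets["postprocess"](nets["nonlinear_decoder"](y_hat))
    return second_map, sigma
```

With identity stand-ins for every sub-network, the output is simply the rounded input, which makes the data flow between the sub-modules easy to trace.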
In some embodiments, the third processing unit 73 includes:
the inverse operation module is used for carrying out inverse operation processing on the second characteristic diagram and a sampling characteristic diagram corresponding to the second characteristic diagram and in preset distribution through a reversible block in the current compression block to obtain an output characteristic diagram of a first frequency and an output characteristic diagram of a second frequency, wherein the first frequency is smaller than the second frequency;
the third output submodule is used for inputting the output characteristic diagram of the first frequency and the output characteristic diagram of the second frequency into the conversion block in the current compression block for inverse operation processing to obtain an inverse operation image; taking the inverse operation image as an input image of a next compression block, wherein the next compression block and the current compression block are two compression blocks which are connected in series and adjacent in the inverse operation process; or if the current compression block is the compression block processed by the last inverse operation, taking the inverse operation image as the compression image of the image to be processed;
the formula of the inverse operation processing of the reversible block is as follows:

x2^j = (x2^(j+1) − η(x1^(j+1))) ⊙ exp(−ρ(x1^(j+1)))

x1^j = x1^(j+1) − φ(x2^j)

wherein x1^(j+1) is the second feature map, or the input feature map of the first frequency of the reversible block during the inverse operation process; x2^(j+1) is the sampling feature map of the preset distribution corresponding to the second feature map, or the input feature map of the second frequency of the reversible block during the inverse operation process; x1^j is the output feature map of the first frequency of the reversible block during the inverse operation process; x2^j is the output feature map of the second frequency of the reversible block during the inverse operation process; φ, ρ and η are each dense convolution networks; and j+1 and j represent the order before and after one inverse operation of the reversible block.
In some embodiments, the apparatus further comprises a training unit for performing the steps of:
acquiring a sample image of a training set, inputting the sample image into a reversible network to be trained for processing, and outputting a first image with a first frequency and a second image with a second frequency;
inputting the first image into a nonlinear coding network in a variational coder network to be trained for processing, and outputting a third image;
carrying out quantization rounding processing on the third image to obtain a fourth image;
inputting the third image into a prior coding network in the variational encoder network to be trained for processing and quantization rounding, and outputting a fifth image;
inputting the fifth image into a priori decoding network in a variational encoder network to be trained for processing, and outputting a second variance parameter;
inputting the fourth image into a nonlinear decoding network and a post-processing network in a variational encoder network to be trained for processing, and outputting a sixth image;
inputting the sixth image and the second image into a reversible network to be trained for inverse operation processing, and outputting a compressed image of the sample image;
calculating a first loss value through a first loss function according to the sample image and the compressed image of the sample image;
calculating a second loss value through a first loss function according to the sampling image and the sixth image which correspond to the sample image and are in preset multiples;
calculating a third loss value through a first loss function according to the first image and the sixth image;
calculating a fourth loss value through a second loss function according to the second image and a sampling graph of preset distribution corresponding to the second image;
inputting the fifth image into an entropy coding module, and calculating to obtain a fifth loss value;
inputting the fourth image and the second variance parameter into an entropy coding module, and calculating to obtain a sixth loss value;
and performing joint training on the reversible network to be trained and the variational encoder network to be trained through the first loss value, the second loss value, the third loss value, the fourth loss value, the fifth loss value and the sixth loss value, and adjusting the network parameters of the reversible network to be trained and the network parameters of the variational encoder network to be trained to obtain the reversible network and the variational encoder network.
It should be noted that, for the information interaction, execution process, and other contents between the above devices/units, the specific functions and technical effects thereof based on the same concept as those of the method embodiment of the present application can be specifically referred to the method embodiment portion, and are not described herein again.
It will be apparent to those skilled in the art that, for convenience and brevity of description, only the above-mentioned division of the functional units and modules is illustrated, and in practical applications, the above-mentioned function distribution may be performed by different functional units and modules according to needs, that is, the internal structure of the apparatus is divided into different functional units or modules to perform all or part of the above-mentioned functions. Each functional unit and module in the embodiments may be integrated in one processing unit, or each unit may exist alone physically, or two or more units are integrated in one unit, and the integrated unit may be implemented in a form of hardware, or in a form of software functional unit. In addition, specific names of the functional units and modules are only for convenience of distinguishing from each other, and are not used for limiting the protection scope of the present application. The specific working processes of the units and modules in the system may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.
The embodiments of the present application further provide a computer-readable storage medium, where a computer program is stored, and when the computer program is executed by a processor, the computer program implements the steps that can be implemented in the foregoing method embodiments.
The embodiments of the present application also provide a computer program product which, when run on a mobile terminal, causes the mobile terminal to implement the steps in the above method embodiments.
Fig. 8 is a schematic structural diagram of a terminal device 8 according to an embodiment of the present application. As shown in fig. 8, the terminal device 8 of this embodiment includes: at least one processor 80 (only one shown in fig. 8), a memory 81, and a computer program 82 stored in the memory 81 and executable on the at least one processor 80, the steps of any of the various method embodiments described above being implemented when the computer program 82 is executed by the processor 80.
The terminal device 8 may be a desktop computer, a notebook, a palm computer, a cloud server, or other computing device. The terminal device 8 may include, but is not limited to, a processor 80, a memory 81. Those skilled in the art will appreciate that fig. 8 is merely an example of the terminal device 8, and does not constitute a limitation of the terminal device 8, and may include more or less components than those shown, or combine some components, or different components, such as an input-output device, a network access device, and the like.
The Processor 80 may be a Central Processing Unit (CPU); it may also be another general-purpose processor, a Digital Signal Processor (DSP), an Application-Specific Integrated Circuit (ASIC), a Field-Programmable Gate Array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, discrete hardware components, etc. A general-purpose processor may be a microprocessor, or the processor may be any conventional processor, etc.
The storage 81 may in some embodiments be an internal storage unit of the terminal device 8, such as a hard disk or a memory of the terminal device 8. The memory 81 may also be an external storage device of the terminal device 8 in other embodiments, such as a plug-in hard disk provided on the terminal device 8, a Smart Media Card (SMC), a Secure Digital (SD) Card, a Flash memory Card (Flash Card), and the like. Further, the memory 81 may also include both an internal storage unit of the terminal device 8 and an external storage device. The memory 81 is used for storing an operating system, an application program, a BootLoader (BootLoader), data, and other programs, such as program codes of a computer program. The memory 81 may also be used to temporarily store data that has been output or is to be output.
The integrated unit, if implemented in the form of a software functional unit and sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on such understanding, all or part of the processes in the methods of the above embodiments may be implemented by a computer program, which may be stored in a computer-readable storage medium and, when executed by a processor, implements the steps of the above method embodiments. The computer program comprises computer program code, which may be in the form of source code, object code, an executable file, some intermediate form, etc. The computer-readable medium may include at least: any entity or apparatus capable of carrying the computer program code to the terminal device, a recording medium, a computer memory, a Read-Only Memory (ROM), a Random-Access Memory (RAM), an electrical carrier signal, a telecommunication signal, and a software distribution medium, for example a USB flash disk, a removable hard disk, a magnetic disk or an optical disk. In certain jurisdictions, in accordance with legislation and patent practice, the computer-readable medium may not be an electrical carrier signal or a telecommunication signal.
In the above embodiments, the descriptions of the respective embodiments have respective emphasis, and reference may be made to the related descriptions of other embodiments for parts that are not described or illustrated in a certain embodiment.
Those of ordinary skill in the art will appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware or combinations of computer software and electronic hardware. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the implementation. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present application.
In the embodiments provided in the present application, it should be understood that the disclosed apparatus/terminal device and method may be implemented in other ways. For example, the above-described embodiments of the apparatus/terminal device are merely illustrative, and for example, the division of the modules or units is only one logical division, and there may be other divisions when actually implemented, for example, a plurality of units or components may be combined or integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, devices or units, and may be in an electrical, mechanical or other form.
Units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
The above-mentioned embodiments are only used for illustrating the technical solutions of the present application, and not for limiting the same; although the present application has been described in detail with reference to the foregoing embodiments, it should be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; such modifications and substitutions do not depart from the spirit and scope of the embodiments of the present application, and they should be construed as being included in the present application.

Claims (11)

1. An image compression method, comprising:
inputting an image to be processed into a reversible network for processing, and outputting a first characteristic diagram of a first frequency;
inputting the first characteristic diagram into a variation encoder network for processing, and outputting a second characteristic diagram corresponding to the first characteristic diagram;
and inputting the second characteristic diagram and the sampling characteristic diagram which is distributed in a preset mode and corresponds to the second characteristic diagram into the reversible network for inverse operation processing, and outputting a compressed image of the image to be processed.
2. The method of claim 1, wherein the reversible network comprises at least one compression block in series, each of the compression blocks comprising a conversion block and at least one reversible block in series, the conversion block to perform a conversion operation and the reversible block to perform a convolution operation; the variational encoder network is used for performing encoding and decoding operations.
3. The method as claimed in claim 2, wherein said inputting the image to be processed into the reversible network for processing and outputting the first feature map of the first frequency comprises:
inputting an image to be processed into a reversible network;
in the process of processing through one compression block, a first conversion feature map and a second conversion feature map of an input image are obtained through a conversion block in a current compression block, the first conversion feature map corresponds to a first frequency, the second conversion feature map corresponds to a second frequency, the input image is an output image of the image to be processed or a previous compression block, the previous compression block is connected with and adjacent to the current compression block in series, and the first frequency is smaller than the second frequency;
performing convolution operation on the first conversion feature map and the second conversion feature map through at least one reversible block connected in series in the current compression block to obtain a first output image and a second output image;
taking the first output image and the second output image as input images of a next compression block, and processing the first output image and the second output image by the next compression block, wherein the current compression block is connected with and adjacent to the next compression block in series; or,
and if the current compressed block is the last compressed block in the at least one compressed block in the series connection, taking the first output image as a first feature map of the first frequency.
4. The method of claim 3, wherein said performing a convolution operation on said first transformed feature map and said second transformed feature map by at least one invertible block concatenated in said current compressed block to obtain a first output image and a second output image comprises:
performing convolution operation on an input feature map through a current reversible block to obtain a first output feature map and a second output feature map, wherein the input feature map comprises the first conversion feature map and the second conversion feature map, or an output feature map of a previous reversible block, and the previous reversible block is connected with and adjacent to the current reversible block in series;
taking the first output feature map and the second output feature map as input feature maps of a next reversible block, and performing a convolution operation on the first output feature map and the second output feature map by the next reversible block, wherein the current reversible block is connected in series and adjacent to the next reversible block; or,
and if the current reversible block is the last reversible block in at least one reversible block connected in series in the current compression block, taking the first output feature map as a first output image, and taking the second output feature map as a second output image.
5. The method of claim 4, wherein performing a convolution operation on the input feature map through the current reversible block to obtain a first output feature map and a second output feature map comprises:
performing a convolution operation on the input feature map through a preset logical operation formula to obtain the first output feature map and the second output feature map;
wherein the preset logical operation formula is expressed as follows:

x_1^{l+1} = x_1^{l} + \phi(x_2^{l})

x_2^{l+1} = x_2^{l} \odot \exp(\rho(x_1^{l+1})) + \eta(x_1^{l+1})

where x_1^{l} is the input feature map of the first frequency, x_2^{l} is the input feature map of the second frequency, x_1^{l+1} is the first output feature map, x_2^{l+1} is the second output feature map, \phi, \rho and \eta are dense convolution networks respectively, and l and l+1 denote the states before and after one convolution operation of the reversible block; the first output feature map corresponds to the first frequency, and the second output feature map corresponds to the second frequency.
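A minimal numerical sketch of one coupling step of a reversible block, assuming the standard additive/affine coupling form x1 ← x1 + φ(x2), x2 ← x2 ⊙ exp(ρ(x1)) + η(x1). The toy element-wise lambdas are hypothetical stand-ins for the dense convolution networks φ, ρ and η, so the arithmetic can be checked by hand.

```python
import numpy as np

phi = lambda x: 0.5 * x   # stand-in for dense convolution network phi
rho = lambda x: 0.1 * x   # stand-in for dense convolution network rho
eta = lambda x: 2.0 * x   # stand-in for dense convolution network eta

def coupling_forward(x1_l, x2_l):
    """One (assumed) coupling step:
    x1^{l+1} = x1^l + phi(x2^l)
    x2^{l+1} = x2^l * exp(rho(x1^{l+1})) + eta(x1^{l+1})"""
    x1_next = x1_l + phi(x2_l)
    x2_next = x2_l * np.exp(rho(x1_next)) + eta(x1_next)
    return x1_next, x2_next

x1, x2 = np.array([1.0, 2.0]), np.array([0.5, -0.5])
y1, y2 = coupling_forward(x1, x2)
```

Note that x2's update depends on the already-updated x1, which is what makes the step invertible without inverting the networks themselves.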
6. The method according to any one of claims 1 to 5, wherein the inputting the first feature map into a variational encoder network for processing and outputting a second feature map corresponding to the first feature map comprises:
inputting the first feature map into the variational encoder network;
processing the first feature map through a nonlinear coding network in the variational encoder network, and outputting a third feature map corresponding to the first feature map;
performing quantization rounding processing on the third feature map to obtain a fourth feature map;
inputting the third feature map into a prior coding network in the variational encoder network for processing, and outputting a fifth feature map;
performing quantization rounding processing on the fifth feature map to obtain a sixth feature map;
processing the sixth feature map through an entropy coding module to obtain a probability map of the sixth feature map;
performing arithmetic coding processing on the sixth feature map and the probability map of the sixth feature map to obtain a first binary file;
performing arithmetic decoding processing on the first binary file to obtain the sixth feature map;
inputting the sixth feature map into a prior decoding network in the variational encoder network for processing, and outputting a first variance parameter;
inputting the fourth feature map and the first variance parameter into the entropy coding module to obtain a probability map of the fourth feature map;
performing arithmetic coding processing on the fourth feature map and the probability map of the fourth feature map to obtain a second binary file;
performing arithmetic decoding processing on the second binary file to obtain the fourth feature map;
and inputting the fourth feature map into a nonlinear decoding network and a post-processing network in the variational encoder network for processing, and outputting the second feature map corresponding to the first feature map.
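Two steps of the pipeline above can be illustrated numerically: quantization rounding of a feature value, and the entropy model's probability for the rounded symbol. A zero-mean Gaussian with scale `sigma` stands in here for the entropy coding module's model (the variance parameter output by the prior decoding network); this is a hedged sketch, not the patent's exact module.

```python
import math

def gaussian_cdf(x, sigma):
    """CDF of a zero-mean Gaussian with standard deviation sigma."""
    return 0.5 * (1.0 + math.erf(x / (sigma * math.sqrt(2.0))))

def symbol_bits(value, sigma):
    """Quantization-round a feature value, then return the rounded symbol and
    its ideal code length (-log2 p) under the assumed Gaussian entropy model,
    integrating the density over the unit-width quantization bin."""
    q = round(value)  # quantization rounding processing
    p = gaussian_cdf(q + 0.5, sigma) - gaussian_cdf(q - 0.5, sigma)
    return q, -math.log2(p)

q, bits = symbol_bits(1.3, sigma=2.0)
```

An arithmetic coder driven by such per-symbol probabilities approaches this ideal code length, which is why the probability map is computed before arithmetic coding.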
7. The method according to claim 2, wherein inputting the second feature map and the sampling feature map of the preset distribution corresponding to the second feature map into the reversible network for inverse operation processing and outputting the compressed image of the image to be processed comprises:
performing inverse operation processing on the second feature map and the sampling feature map of the preset distribution corresponding to the second feature map through the reversible block in the current compression block to obtain an output feature map of a first frequency and an output feature map of a second frequency, wherein the first frequency is lower than the second frequency;
inputting the output feature map of the first frequency and the output feature map of the second frequency into the conversion block in the current compression block for inverse operation processing to obtain an inverse operation image;
taking the inverse operation image as an input image of a next compression block, wherein the next compression block and the current compression block are two serially connected and adjacent compression blocks in the inverse operation process; or,
if the current compression block is the last compression block processed in the inverse operation, taking the inverse operation image as the compressed image of the image to be processed;
wherein the formula of the inverse operation processing of the reversible block is expressed as follows:

x_2^{j} = (x_2^{j+1} - \eta(x_1^{j+1})) \odot \exp(-\rho(x_1^{j+1}))

x_1^{j} = x_1^{j+1} - \phi(x_2^{j})

where x_1^{j+1} is the second feature map, or the input feature map of the first frequency of the reversible block during inverse operation processing; x_2^{j+1} is the sampling feature map of the preset distribution corresponding to the second feature map, or the input feature map of the second frequency of the reversible block during inverse operation processing; x_1^{j} is the output feature map of the first frequency of the reversible block during inverse operation processing; x_2^{j} is the output feature map of the second frequency of the reversible block during inverse operation processing; \phi, \rho and \eta are dense convolution networks respectively, and j+1 and j denote the states before and after one inverse operation of the reversible block.
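The inverse step can be checked numerically against the forward coupling. Under the same assumed coupling form as before (toy lambdas standing in for the dense convolution networks φ, ρ, η), the inverse recovers the inputs exactly, with no need to invert the networks themselves.

```python
import numpy as np

phi = lambda x: 0.5 * x   # stand-in for dense convolution network phi
rho = lambda x: 0.1 * x   # stand-in for dense convolution network rho
eta = lambda x: 2.0 * x   # stand-in for dense convolution network eta

def forward(x1, x2):
    """Assumed forward coupling of one reversible block."""
    x1n = x1 + phi(x2)
    x2n = x2 * np.exp(rho(x1n)) + eta(x1n)
    return x1n, x2n

def inverse(x1n, x2n):
    """Exact algebraic inverse of forward():
    x2^j = (x2^{j+1} - eta(x1^{j+1})) * exp(-rho(x1^{j+1}))
    x1^j = x1^{j+1} - phi(x2^j)"""
    x2 = (x2n - eta(x1n)) * np.exp(-rho(x1n))
    x1 = x1n - phi(x2)
    return x1, x2

x1, x2 = np.array([1.0, 2.0]), np.array([0.5, -0.5])
r1, r2 = inverse(*forward(x1, x2))   # round trip
```

The round trip returning the original feature maps is the property that lets the same network serve as both the forward transform and the decoder-side inverse.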
8. The method according to any one of claims 1 to 5, wherein before inputting the image to be processed into the reversible network for processing and outputting the first feature map of the first frequency, the method further comprises:
acquiring a sample image of a training set, inputting the sample image into a reversible network to be trained for processing, and outputting a first image of the first frequency and a second image of the second frequency;
inputting the first image into a nonlinear coding network in a variational encoder network to be trained for processing, and outputting a third image;
performing quantization rounding processing on the third image to obtain a fourth image;
inputting the third image into a prior coding network in the variational encoder network to be trained for processing and quantization rounding, and outputting a fifth image;
inputting the fifth image into a prior decoding network in the variational encoder network to be trained for processing, and outputting a second variance parameter;
inputting the fourth image into a nonlinear decoding network and a post-processing network in the variational encoder network to be trained for processing, and outputting a sixth image;
inputting the sixth image and the second image into the reversible network to be trained for inverse operation processing, and outputting a compressed image of the sample image;
calculating a first loss value through a first loss function according to the sample image and the compressed image of the sample image;
calculating a second loss value through the first loss function according to the sixth image and a sampled image of the sample image at a preset multiple;
calculating a third loss value through the first loss function according to the first image and the sixth image;
calculating a fourth loss value through a second loss function according to the second image and a sampling map of the preset distribution corresponding to the second image;
inputting the fifth image into an entropy coding module, and calculating a fifth loss value;
inputting the fourth image and the second variance parameter into the entropy coding module, and calculating a sixth loss value;
and jointly training the reversible network to be trained and the variational encoder network to be trained through the first, second, third, fourth, fifth and sixth loss values, adjusting the network parameters of the reversible network to be trained and of the variational encoder network to be trained, to obtain the reversible network and the variational encoder network.
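One plausible way to combine the six loss values for joint training is a weighted sum, in the usual rate-distortion style. The weights below are illustrative assumptions only; the patent does not specify how the six values are combined.

```python
def total_loss(l1, l2, l3, l4, l5, l6,
               w=(1.0, 1.0, 1.0, 1.0, 0.01, 0.01)):
    """Weighted sum of the six loss values.
    l1-l3: reconstruction terms (first loss function),
    l4: distribution-matching term (second loss function),
    l5-l6: rate terms from the entropy coding module.
    The weights w are hypothetical trade-off coefficients."""
    losses = (l1, l2, l3, l4, l5, l6)
    return sum(wi * li for wi, li in zip(w, losses))

loss = total_loss(1.0, 1.0, 1.0, 1.0, 100.0, 100.0)
```

A single scalar like this lets one optimizer step adjust the parameters of the reversible network and the variational encoder network together, which is what "joint training" amounts to in practice.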
9. An image compression apparatus, comprising:
the first processing unit is used for inputting an image to be processed into a reversible network for processing and outputting a first feature map of a first frequency;
the second processing unit is used for inputting the first feature map into a variational encoder network for processing and outputting a second feature map corresponding to the first feature map;
and the third processing unit is used for inputting the second feature map and a sampling feature map of a preset distribution corresponding to the second feature map into the reversible network for inverse operation processing, and outputting a compressed image of the image to be processed.
10. A terminal device, characterized in that the terminal device comprises a memory, a processor and a computer program stored in the memory and executable on the processor, the processor implementing the method according to any of claims 1 to 8 when executing the computer program.
11. A computer-readable storage medium, characterized in that the computer-readable storage medium stores a computer program which, when executed by a processor, implements the method of any one of claims 1 to 8.
CN202110405285.7A 2021-04-15 2021-04-15 Image compression method and device, terminal equipment and computer readable storage medium Pending CN115225896A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110405285.7A CN115225896A (en) 2021-04-15 2021-04-15 Image compression method and device, terminal equipment and computer readable storage medium


Publications (1)

Publication Number Publication Date
CN115225896A true CN115225896A (en) 2022-10-21

Family

ID=83605382




Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination