CN116167920A - Image compression and reconstruction method based on super-resolution and priori knowledge - Google Patents
Image compression and reconstruction method based on super-resolution and priori knowledge
- Publication number: CN116167920A (application CN202310325666.3A)
- Authority: CN (China)
- Legal status: Pending (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T3/00—Geometric image transformations in the plane of the image
- G06T3/40—Scaling of whole images or parts thereof, e.g. expanding or contracting
- G06T3/4053—Scaling of whole images or parts thereof, e.g. expanding or contracting based on super-resolution, i.e. the output image resolution being higher than the sensor resolution
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/082—Learning methods modifying the architecture, e.g. adding, deleting or silencing nodes or connections
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T11/00—2D [Two Dimensional] image generation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/40—Extraction of image or video features
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/764—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/77—Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
- G06V10/80—Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level
- G06V10/806—Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level of extracted features
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/82—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D10/00—Energy efficient computing, e.g. low power processors, power management or thermal management
Abstract
The invention discloses an image compression and reconstruction method based on super-resolution and prior knowledge, which comprises the following steps: acquiring a teaching image from a first terminal; inputting the teaching image into a preset two-way prior type acquirer to obtain a preliminary prior feature map; compressing the teaching image based on semantic information by utilizing pyramid convolution to obtain a compressed image; embedding prior features in the preliminary prior feature map into the super-resolution depth reconstruction process, wherein the prior features are features in the teaching image matched with preset prior conditions; and performing shallow feature extraction on the compressed image and deep feature learning with the embedded image prior features using attention-mechanism-based multi-scale salient feature extraction super-resolution, and reconstructing to obtain a reconstructed image. The invention improves image compression and reconstruction based on dual text and face priors, and can be widely applied in the field of image compression and reconstruction.
Description
Technical Field
The invention relates to the technical field of image compression and reconstruction, in particular to an image compression and reconstruction method based on super-resolution and priori knowledge.
Background
In the digital era, the synchronous classroom is a teaching form that provides synchronous teaching services to an assisted school, breaking through the limitations of place and space. Image compression and reconstruction are important steps in synchronous-classroom image transmission. However, although existing image compression models alleviate the problems caused by image transmission, they still have defects such as low compression efficiency and poor terminal display quality for images or video frames compressed at high ratios.
The image compression and reconstruction process essentially re-adds high-frequency details to the compressed image after compression to generate a high-definition image, while image super-resolution aims to reconstruct a high-definition image from a low-definition one. However, most existing super-resolution research treats all images uniformly as natural-scene images; although related works such as RCAN and HAN achieve performance far exceeding bicubic interpolation, they ignore the prior information contained in the image content. Considering the specificity of the synchronous classroom scene, the transmitted pictures usually feature teacher and student portraits and text. Although researchers have studied the prior information of text super-resolution and face super-resolution, for example TSRN and SISN, problems such as a single application scenario and inaccurate prior estimation remain. Image compression and reconstruction based on dual text and face priors in the synchronous classroom setting therefore needs to be considered.
Disclosure of Invention
In view of this, the embodiments of the invention provide an image compression and reconstruction method based on super-resolution and prior knowledge, so as to achieve improved image compression and reconstruction based on dual text and face priors.
An aspect of an embodiment of the present invention provides an image compression and reconstruction method based on super resolution and a priori knowledge, including:
acquiring a teaching image from a first terminal;
inputting the teaching image into a preset two-way prior type acquirer to obtain a preliminary prior feature map, wherein the two-way prior type acquirer is used for classifying the input image to obtain a first image mainly comprising text and a second image mainly comprising a teacher and a student, and converting the feature distribution of the first image or the second image into feature distribution matching with the input image;
compressing the teaching image by utilizing pyramid convolution and based on semantic information to obtain a compressed image;
embedding prior features in the preliminary prior feature map into a super-resolution depth reconstruction process, wherein the prior features are features matched with preset prior conditions in the teaching image;
and performing shallow feature extraction on the compressed image and deep feature learning with the embedded image prior features using attention-mechanism-based multi-scale salient feature extraction super-resolution, and reconstructing to obtain a reconstructed image.
Preferably, the inputting the teaching image into a preset two-way prior type acquirer to obtain a preliminary prior feature map includes:
inputting the teaching image into the preset two-way prior type acquirer, where a text prior device in the two-way prior type acquirer recognizes the teaching image as a first image; extracting a probability vector sequence from the first image through the text prior device; and converting the probability vector sequence into a feature distribution matched to the teaching image to obtain a preliminary text prior feature map;
or alternatively,
inputting the teaching image into the preset two-way prior type acquirer, where a face prior device in the two-way prior type acquirer recognizes the teaching image as a second image; extracting face distribution codes from the second image through the face prior device; and converting the face distribution codes into a feature distribution matched to the teaching image to obtain a preliminary portrait prior feature map.
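The two-branch routing above can be sketched in plain Python. The classifier and both prior extractors below are stand-in stubs (the patent's actual text prior is CRNN-based and the face prior VGGFace-based); the `text_ratio` field and all function names are illustrative assumptions, not the patent's modules.

```python
def classify_image(image_meta):
    """Stub classifier: decide whether an image is text- or face-dominant.
    The patent's classifier works on image features; a metadata ratio
    stands in for that here."""
    return "text" if image_meta.get("text_ratio", 0.0) >= 0.5 else "face"

def text_prior(image_meta):
    # Stand-in for the text prior device (CRNN-based in the patent),
    # which would return a sequence of |T|-dimensional probability vectors.
    return {"kind": "text_prior", "source": image_meta["name"]}

def face_prior(image_meta):
    # Stand-in for the face prior device (VGGFace-based in the patent),
    # which would return the closest matching face distribution code.
    return {"kind": "face_prior", "source": image_meta["name"]}

def two_way_prior_acquirer(image_meta):
    """Route the image through exactly one of the two prior streams."""
    if classify_image(image_meta) == "text":
        return text_prior(image_meta)
    return face_prior(image_meta)
```

A text-heavy slide would thus flow through the text branch, a portrait frame through the face branch.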
Preferably, compressing the teaching image based on semantic information by using pyramid convolution to obtain a compressed image, including:
the teaching feature map is subjected to shallow convolution to obtain a convolution feature map;
inputting the convolution feature map into a first branch of pyramid convolution, and fusing high-layer and low-layer information of the convolution feature map to generate a first feature map containing multiple levels of semantic information;
inputting the convolution feature map into a second branch of pyramid convolution to obtain a second feature map according to a low channel capacity space;
and aggregating the first characteristic diagram and the second characteristic diagram, and downsampling the aggregated characteristic diagram to obtain a compressed image.
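The four compression steps above can be illustrated with a toy numpy sketch: a stand-in shallow convolution, two simplified pyramid branches, aggregation, and 2x2 average-pool downsampling. All kernels and branch definitions here are illustrative assumptions, not the patent's trained modules.

```python
import numpy as np

def shallow_conv(img):
    """3x3 mean filter as a stand-in shallow convolution (same spatial size)."""
    padded = np.pad(img, 1, mode="edge")
    out = np.zeros_like(img, dtype=float)
    h, w = img.shape
    for i in range(h):
        for j in range(w):
            out[i, j] = padded[i:i + 3, j:j + 3].mean()
    return out

def branch_a(feat):
    """First branch: mix the feature with a blurred copy of itself, a toy
    stand-in for fusing high-layer and low-layer information."""
    return 0.5 * feat + 0.5 * shallow_conv(feat)

def branch_b(feat):
    """Second branch: a cheap pass standing in for the low-channel-capacity
    path of the pyramid convolution."""
    return feat

def downsample(feat):
    """2x2 average pooling as the downsampling step."""
    h, w = feat.shape
    return feat[:h // 2 * 2, :w // 2 * 2].reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def compress(img):
    feat = shallow_conv(img.astype(float))      # shallow convolution
    fused = branch_a(feat) + branch_b(feat)     # aggregate the two branches
    return downsample(fused)                    # downsample to compress
```

The sketch halves each spatial dimension, mirroring how the aggregated feature map is downsampled into the compressed image.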
Preferably, the embedding the prior feature in the preliminary prior feature map into the compressed image includes:
the compressed image passes through a super-resolution module to generate a first multi-scale depth feature map;
inputting the preliminary prior feature map into a two-way prior feature converter to convert and align the features of the preliminary prior feature map and generate a second multi-scale depth feature map;
and fusing the first multi-scale depth feature map and the second multi-scale depth feature map, and embedding the prior features in the preliminary prior feature map into a super-resolution depth reconstruction process.
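A minimal sketch of the fusion step above, assuming the prior feature map has already been converted and aligned to the same shape; the weighted-sum fusion and the `alpha` weight are illustrative choices, since the patent does not specify the fusion operator.

```python
import numpy as np

def embed_prior(sr_feat, prior_feat, alpha=0.5):
    """Fuse the super-resolution depth feature map with the aligned prior
    feature map by a weighted sum. `alpha` is an illustrative fusion weight,
    not a parameter named in the patent."""
    assert sr_feat.shape == prior_feat.shape, "prior features must be aligned first"
    return alpha * sr_feat + (1.0 - alpha) * prior_feat
```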
Preferably, the multi-scale salient feature extraction super-resolution based on the attention mechanism performs shallow feature extraction and deep feature learning on the compressed image embedded with the prior feature, and reconstructs the reconstructed image, including:
Extracting shallow features from the compressed image;
extracting depth residual multi-scale features from the compressed image, and aggregating the depth residual multi-scale features through skip connections;
and in the depth residual error multi-scale feature extraction process, fusing prior features in the preliminary prior feature map to carry out image depth reconstruction.
Preferably, the method further comprises:
training a teacher network according to the teaching image and the reconstructed image by adopting a prior knowledge propagation scheme based on knowledge distillation, wherein the teacher network is used for generating a teaching reconstructed image according to an input teaching image;
transferring prior information from the teacher network to the student network by means of knowledge distillation;
and regenerating a new reconstructed image corresponding to the teaching image according to a first loss between the teaching image and the reconstructed image of the teacher network, and a second loss between the teaching image and the reconstructed image of the student network.
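The loss structure of the distillation step can be sketched as follows; the use of L1 distance and the loss weights are illustrative assumptions, as the patent does not fix the loss function.

```python
import numpy as np

def l1(a, b):
    """Mean absolute error between two images."""
    return float(np.abs(a - b).mean())

def distillation_objective(gt, teacher_out, student_out,
                           w_teacher=1.0, w_student=1.0):
    """Combine a loss between the ground-truth teaching image and the
    teacher's reconstruction with a loss between the teaching image and the
    student's reconstruction. Weights are illustrative."""
    loss_teacher = l1(gt, teacher_out)   # first loss
    loss_student = l1(gt, student_out)   # second loss
    return w_teacher * loss_teacher + w_student * loss_student
```

A third term between the teacher's and student's reconstructions, as in the detailed description later, could be added in the same way.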
Preferably, the method further comprises:
and transmitting the reconstructed image to a second terminal so that the second terminal can display the reconstructed image.
Another aspect of the embodiments of the present invention further provides an image compression and reconstruction apparatus based on super resolution and a priori knowledge, including:
An image acquisition unit for acquiring a teaching image from a first terminal;
the prior feature acquisition unit is used for inputting the teaching image into a preset two-way prior type acquirer to obtain a preliminary prior feature image, and the two-way prior type acquirer is used for classifying the input image to obtain a first image mainly containing text and a second image mainly containing a teacher and students, and converting the feature distribution of the first image or the second image into feature distribution matching with the input image;
the image compression unit is used for compressing the teaching image by utilizing pyramid convolution and based on semantic information to obtain a compressed image;
the prior feature embedding unit is used for embedding prior features in the preliminary prior feature map into the super-resolution depth reconstruction process, wherein the prior features are features matched with preset prior conditions in the teaching image;
and the image reconstruction unit is used for performing shallow feature extraction on the compressed image and deep feature learning with the embedded image prior features based on attention-mechanism multi-scale salient feature extraction super-resolution, and reconstructing to obtain the reconstructed image.
Another aspect of the embodiment of the invention also provides an electronic device, which includes a processor and a memory;
The memory is used for storing programs;
the processor executes the program to implement the method described above.
Another aspect of the embodiments of the present invention also provides a computer-readable storage medium storing a program that is executed by a processor to implement the above-described method.
Embodiments of the present invention also disclose a computer program product or computer program comprising computer instructions stored in a computer readable storage medium. The computer instructions may be read from a computer-readable storage medium by a processor of a computer device, and executed by the processor, cause the computer device to perform the method described above.
Compared with existing image compression schemes, which adopt the same compression and reconstruction mode for different types of images and ignore the differences between them, the image compression and reconstruction method based on super-resolution and prior knowledge of the present invention solves the compression and reconstruction problem for different image types by adopting a compression framework that effectively combines image priors with super-resolution, and the prior-knowledge-guided reconstruction of compressed images can effectively utilize the detail features of various images. In addition, super-resolution-based image reconstruction can recover and supplement the high-frequency details of compressed images, effectively improving the visual quality of the reconstructed image.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings required for the description of the embodiments will be briefly described below, and it is apparent that the drawings in the following description are only some embodiments of the present invention, and other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
Fig. 1 is a schematic flow chart of an image compression and reconstruction method based on super-resolution and priori knowledge according to an embodiment of the present invention;
FIG. 2 is an exemplary flowchart of an image compression and reconstruction method based on super-resolution and prior knowledge according to an embodiment of the present invention;
FIG. 3 is an explanatory diagram of various variables and their meanings of the embodiment provided by the embodiment of the present invention;
FIG. 4 is a diagram of a method architecture in a synchronous classroom area according to an embodiment of the present invention;
FIG. 5 is a diagram of an overall network model structure of an embodiment provided by an embodiment of the present invention;
FIG. 6 is a diagram illustrating an exemplary architecture of a two-way a priori type acquirer according to an embodiment of the present invention;
FIG. 7 is a flowchart illustrating an exemplary process for embedding image prior knowledge according to an embodiment of the present invention;
FIG. 8 is an exemplary diagram of image compression based on semantic information extraction according to an embodiment of the present invention;
FIG. 9 is an exemplary diagram of super-resolution based image reconstruction according to an embodiment of the present invention;
fig. 10 is a block diagram of an image compression and reconstruction device based on super-resolution and a priori knowledge according to an embodiment of the present invention.
Detailed Description
The present invention will be described in further detail with reference to the drawings and examples, in order to make the objects, technical solutions and advantages of the present invention more apparent. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the invention.
Referring to fig. 1, an embodiment of the present invention provides an image compression and reconstruction method based on super resolution and a priori knowledge, which specifically includes the following steps:
step S100: and acquiring a teaching image from the first terminal.
Step S110: inputting the teaching image into a preset two-way prior type acquirer to obtain a preliminary prior feature map, wherein the two-way prior type acquirer is used for classifying the input image to obtain a first image mainly comprising text and a second image mainly comprising a teacher and a student, and converting the feature distribution of the first image or the second image into feature distribution matching with the input image.
Specifically, the method comprises the following steps:
inputting the teaching image into the preset two-way prior type acquirer, where a text prior device in the two-way prior type acquirer recognizes the teaching image as a first image; extracting a probability vector sequence from the first image through the text prior device; and converting the probability vector sequence into a feature distribution matched to the teaching image to obtain a preliminary prior feature map.
Or the method may comprise the following steps:
inputting the teaching image into the preset two-way prior type acquirer, where a face prior device in the two-way prior type acquirer recognizes the teaching image as a second image; extracting face distribution codes from the second image through the face prior device; and converting the face distribution codes into a feature distribution matched to the teaching image to obtain a preliminary prior feature map.
Step S120: and compressing the teaching image based on semantic information by utilizing pyramid convolution to obtain a compressed image.
Specifically, the method comprises the following steps:
S1, carrying out shallow convolution on the teaching feature map to obtain a convolution feature map.
S2, inputting the convolution feature map into a first branch of pyramid convolution, and fusing high-low layer information of the convolution feature map to generate a first feature map containing a plurality of semantic layer information.
S3, inputting the convolution feature map into a second branch of pyramid convolution to obtain a second feature map according to the low-channel capacity space.
S4, aggregating the first feature map and the second feature map, and downsampling the aggregated feature map to obtain a compressed image.
Step S130: and embedding the prior features in the preliminary prior feature map into the compressed image, wherein the prior features are features matched with preset prior conditions in the teaching image.
Specifically, the method comprises the following steps:
S1, the compressed image passes through a super-resolution module to generate a first multi-scale depth feature map.
S2, inputting the preliminary prior feature map into a two-way prior feature converter so as to convert and align the features of the preliminary prior feature map and generate a second multi-scale depth feature map.
S3, fusing the first multi-scale depth feature map and the second multi-scale depth feature map, and embedding prior features in the preliminary prior feature map into a super-resolution depth reconstruction process.
Step S140: performing shallow feature extraction on the compressed image and deep feature learning with the embedded image prior features using attention-mechanism-based multi-scale salient feature extraction super-resolution, and reconstructing to obtain a reconstructed image.
Specifically, the method comprises the following steps:
S1, extracting shallow features from the compressed image.
S2, extracting depth residual multi-scale features from the compressed image, and aggregating the depth residual multi-scale features through skip connections.
And S3, fusing prior features in the preliminary prior feature map to carry out image depth reconstruction in the depth residual error multi-scale feature extraction process.
And S4, reconstructing a reconstructed image corresponding to the compressed image from the shallow features and the fused depth residual multi-scale features through the up-sampling layer and the image reconstruction convolution layer.
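One common way to realize attention over feature channels, as used in the reconstruction steps above, is a squeeze-and-excitation style gate; this numpy sketch is an illustrative stand-in, not the patent's exact attention module.

```python
import numpy as np

def channel_attention(feats):
    """feats: array of shape (C, H, W). Re-weight the C channels by a
    softmax over their global-average activations, so that channels with
    stronger average response are emphasized."""
    c = feats.shape[0]
    squeezed = feats.reshape(c, -1).mean(axis=1)   # global average pool per channel
    weights = np.exp(squeezed - squeezed.max())
    weights /= weights.sum()                       # softmax gate over channels
    return feats * weights[:, None, None]
```

A learned two-layer gate (as in SE networks) would replace the plain softmax in a trained model; the fixed softmax keeps the sketch self-contained.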
In addition, after the reconstructed image is obtained, the method can also transmit the reconstructed image to the second terminal so as to display the reconstructed image by the second terminal.
Further, the invention may further comprise the steps of:
S1, training a teacher network according to the teaching image and the reconstructed image, wherein the teacher network is used for generating a teaching reconstructed image according to the input teaching image.
S2, regenerating a new reconstructed image corresponding to the teaching image according to a first loss between the teaching image and the reconstructed image of the teacher network, a second loss between the teaching image and the reconstructed image of the student network, and a third loss between the reconstructed image of the teacher network and the reconstructed image of the student network.
In order to describe the present invention in more detail, practical application of the present invention will be described in the following with specific examples.
Referring to fig. 2, an exemplary flowchart of an image compression and reconstruction method based on super resolution and a priori knowledge is provided in an embodiment of the present invention. Referring to FIG. 3, an illustrative diagram of the variables and their meanings used in the following process is provided in accordance with an embodiment of the present invention.
1. Main assumptions.
In the synchronous classroom learning environment, viewed from the overall architecture, there are a supporting-school synchronous classroom, which contains a teacher and local students, and a supported-school synchronous classroom, which contains remote students, as shown in fig. 4. Images or video frames of the supporting-school synchronous classroom are compressed and reconstructed through the image compression model (IC Model) and the image reconstruction model (SR Model), and transmitted to the supported-school synchronous classroom. Likewise, images or video frames of the supported school may be transmitted to the supporting school through the same process.
From a network architecture perspective, the initial teaching image or video frame generated in the learning environment is defined as I_h. The network is divided into a teacher network and a student network, the abstract roles of knowledge distillation, as shown in fig. 5. The teacher network labels the image category according to the salient features of I_h to obtain the image prior type IP, inputs the image into the image compression module IC to complete the compression of the data amount, and then inputs it into the super-resolution-based image reconstruction module IR. Meanwhile, the text or face recognition model corresponding to the IP is invoked to extract the prior knowledge of the image. Finally, the prior knowledge is embedded into the depth reconstruction process of the IR to complete two-way adaptively prior-guided super-resolution image reconstruction. The student network receives prior information transferred from the teacher network by knowledge distillation, which effectively avoids the influence of insufficient prior estimation on image reconstruction.
Specifically, as a prerequisite for extracting the prior knowledge of an image, the IP module extracts the key features of the image and performs classification between text images and face images; after classification it determines the prior information of the different types, P = {P_t, P_f}, which is embedded in the subsequent reconstruction module to guide the reconstruction of a high-definition image.
The IC module compresses the image and reduces the data size of the transmitted image to obtain a low-resolution image. The key component responsible for image compression in the IC is an efficient image compression module EIC, which consists of a traditional convolution module, a pyramid feature fusion module PFM and a downsampling module DM; this module retains the effective features of the image to the greatest extent during compression. Two paths are provided in the IC: Path A combines the traditional compression mode with the EIC to achieve further compression and improve the compression rate; Path B uses the EIC directly to compress the image, so as to optimize the visual quality of the reconstruction.
However, the IC process inevitably causes the image to lose part of its information, which poses a great challenge for decompression. The image decompression process is essentially similar to image super-resolution, in that high-frequency details are re-added to the compressed image to generate a high-definition image. Therefore, the invention reconstructs the compressed image in combination with a super-resolution algorithm to obtain a reconstructed image I_S. In addition, an attention mechanism is adopted in the network to model the dependencies between feature channels and focus on the more important ones. Meanwhile, based on the classification information in the IP, the prior models for the different image types are adaptively selected, and the corresponding prior model information P_t and P_f provides guidance for image reconstruction. The reconstructed I_S is in turn used to refine the priors P_t and P_f; this continuous updating of the prior information retains more of the effective and key features of the image and enables better reconstruction of the compressed image.
2. Constructing a teaching image compression and reconstruction model based on a super-resolution algorithm and prior knowledge.
2.1 Construction of the model.
2.2 Workflow of the model.
2.2.1 image data acquisition.
The image or video frame data is derived from teaching image data I_h; the teaching images comprise images dominated by text and images dominated by teacher and student portrait information. After the image data are acquired, they are first uniformly preprocessed and standardized; feature selection is then performed with a model ordering method, producing two types of image data, text images and face images, which form the training set.
2.2.2 Image prior type acquisition and prior knowledge embedding.
1. Image a priori type acquisition.
In order to fully utilize prior information, such as text information and face texture information, of each type of image data, the invention designs a two-way prior type acquirer, as shown in fig. 6.
First, the features of the input image are judged by an image prior feature classifier: the input image I_h is divided into a face image and a text image, where I_t denotes an image of text type and I_f an image of face type, and the two are sent to different prior streams.
the rich information contained in the pre-trained model is then utilized in the two-way prior stream to generate prior information. Specifically, the invention designs a priori information generator PG comprising a text priori device TP and a human face priori device FP, and deep priori features P of the priori information generator PG are extracted from a text image and a human face image respectively t And P f . By utilizing the network guided by the deep priori knowledge P, the self-adaptive reconstruction of the compressed image can be realized in the process of reconstructing the image with the depth super-resolution later, and the detail information of the image can be recovered. TP is composed of a text recognition model CRNN, probability prediction can be carried out on classification of text characters, and a |T| dimensional probability vector sequence is generated in TP, wherein |T| represents the number of characters learned by a priori model. The FP consists of a face recognition model VGGFace, and the pre-training model is packaged with abundant and various priori information, so that the closest potential face distribution codes can be matched according to the input image.
Finally, since the pre-trained prior models are directed to the image recognition task rather than the image reconstruction task, integrating the generated prior information P_t and P_f into the image reconstruction process is challenging. To resolve the distribution difference and the feature-mapping mismatch between the prior features and the actual image features, the invention designs a two-way prior feature converter DPT, in which DPT-T and DPT-F convert P_t and P_f into usable features F'_t and F'_f that can be effectively embedded in the image reconstruction process, expressed as F'_t = α_t·P_t + β_t and F'_f = α_f·P_f + β_f.
where (α_t, β_t) and (α_f, β_f) are the conversion coefficients. With the two-way prior feature converter, the output prior features are converted into the same feature distribution as the image reconstruction features, thereby promoting image reconstruction.
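As an illustrative sketch only (the element-wise affine form and all names below are assumptions; the patent does not disclose the exact DPT implementation), the conversion step can be pictured as modulating the prior feature by the learned coefficients:

```python
import numpy as np

def dpt_transform(prior_feat, alpha, beta):
    # Two-way prior feature converter (DPT) sketch: an affine
    # modulation mapping a prior feature map toward the distribution
    # of the image-reconstruction features. alpha and beta stand in
    # for the conversion coefficients (alpha_t, beta_t)/(alpha_f, beta_f).
    return alpha * prior_feat + beta

# Toy example: a 4x4 text-prior feature map of ones.
P_t = np.ones((4, 4))
F_t = dpt_transform(P_t, alpha=0.5, beta=0.1)  # every entry becomes 0.6
```

In practice the coefficients would themselves be predicted by DPT-T/DPT-F from the prior features rather than being fixed scalars.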
2. Image prior knowledge embedding.
In the image reconstruction module, the invention proposes that the super-resolution reconstruction branch SR and the image prior branch jointly complete the image reconstruction operation, and designs an integration module to combine the prior knowledge with the original information so as to improve the recovery performance and generalization capability of image reconstruction. As shown in fig. 7, the SR branch reproduces the high-definition image HR from the input low-definition image LR and the guidance features of the prior information generator PG.
First, the compressed image LR passes through the super-resolution module SRB of the SR branch, which generates and outputs a multi-scale depth feature map F_s. The high-definition image HR generates a preliminary prior feature map through the prior information generator PG, whose features are then converted and aligned by the two-way prior feature converter DPT to generate a depth feature map F_p.
Second, to reconstruct a high-fidelity and trustworthy restored image, the invention uses F_s, which carries the original image features, together with F_p, which carries the prior knowledge, to adjust the generative model. After the two depth feature maps are mapped through an inconsistency function f_i, a fused feature map F_fuse = f_i(F_s, F_p) that combines prior knowledge with the original image information is obtained to guide image reconstruction.
finally, the feature map is fusedAnd the method is combined with the original SR branch characteristics, and the abundant and various details provided in priori knowledge are utilized to guide image reconstruction, so that the difficulty of image recovery is reduced.
2.2.3 image compression based on semantic information extraction.
Image compression aims to reduce the image data volume while keeping the key information of the image; compression that preserves the effective high- and low-level semantic information of the image benefits its reconstruction. The acquisition of image semantic information usually depends on convolution-kernel extraction; in a CNN, the receptive field is typically grown by stacking convolution layers and downsampling, but using the same kernel at every position captures only fixed-size context, limiting the feature-extraction capability of the convolution kernel. Therefore, pyramid convolution is introduced into the compression module of the network to retain more effective image features: kernels with smaller receptive fields focus on finer image details, kernels with larger receptive fields focus on broader context information, and the two kinds of information are complementary at small computational cost.
The image compression module EIC can compress an image directly and efficiently, or be used in combination with a conventional compression scheme. As shown in fig. 8, in the EIC an input high-definition image I_h passes through a shallow convolution to obtain a feature map F_0, which is then fed into two branches: a pyramid feature fusion module PFM branch, which fuses high- and low-level information to generate a feature F_PFM structurally containing information from multiple semantic levels; and a shallow branch of low channel capacity, whose feature is F_1. The above features are subsequently aggregated to obtain a feature F_3 containing rich context information. The formula is as follows:
F_3 = f_concat(F_1, F_PFM)
In the pyramid convolution inside the PFM, the feature extraction flow is as follows:
1. The pyramid convolution levels are set as L_1 to L_4, with the kernel size of each level increasing from K_1 to K_4 while the kernel depth decreases.
2. The input feature maps are grouped in a grouped-convolution manner so that kernels of different depths can be applied, generating feature maps FM_i, where MF_i(·) denotes the size of the input feature map, F_p(·) denotes the pyramid convolution, K_n² denotes the spatial kernel size of each level, and the kernel depth of each level decreases accordingly.
3. To extract image features more effectively, the invention introduces a channel attention module SE on top of the pyramid convolution. According to the importance of each input feature channel, the SE module further promotes useful features such as edge textures while enlarging the receptive field and suppressing unimportant features. The SE module first performs a squeeze operation on each feature map FM_g; after averaging, the global feature vector Z_g is obtained, where N is the number of levels and F_sq(·) denotes the squeeze operation, formulated as Z_g = F_sq(FM_g).
4. From Z_g, the feature weight of each channel of FM_g is learned: two FC layers yield a weight vector Z_w with values between 0 and 1, a process denoted F_fc(·), i.e. Z_w = F_fc(Z_g).
5. Each of the original output channels of FM_g is weighted by its corresponding weight in Z_w to obtain a new weighted feature FM'_g, where F_w(·) denotes the weight-assignment operation. The formula is FM'_g = F_w(Z_w, FM_g).
6. The feature maps with reassigned weights are concatenated to obtain the feature map output by one pass of pyramid convolution, FM = f_concat[FM'_{i_0}, FM'_{i_1}, …, FM'_{i_{j-1}}], where j is the number of pyramid convolution groups and [i_0, i_1, …, i_{j-1}] index the features generated by the different groups.
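Steps 3-5 above (squeeze, excitation, per-channel reweighting) can be sketched as follows; the layer sizes and random weights are placeholders, not values from the patent:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def se_reweight(fm, w1, w2):
    # fm: feature map of shape (C, H, W).
    # Squeeze: global average pooling gives the vector Z_g of length C.
    z_g = fm.mean(axis=(1, 2))
    # Excite: two FC layers (a reduction then an expansion) followed
    # by a sigmoid give weights Z_w strictly between 0 and 1.
    z_w = sigmoid(w2 @ np.maximum(w1 @ z_g, 0.0))
    # Reweight: scale each channel of fm by its learned weight.
    return fm * z_w[:, None, None]

rng = np.random.default_rng(0)
fm = rng.standard_normal((8, 16, 16))
w1 = rng.standard_normal((2, 8))   # reduction FC (C -> C/4)
w2 = rng.standard_normal((8, 2))   # expansion FC (C/4 -> C)
out = se_reweight(fm, w1, w2)      # same shape, channels rescaled
```

Because the sigmoid gate lies in (0, 1), the reweighted map never amplifies a channel, only attenuates the less important ones.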
2.2.4 super resolution based image reconstruction.
Designing a reconstruction network matched to the compression process, so that images are reconstructed with high quality by a deep-learning algorithm rather than by increasing hardware cost, is the main direction of current image compression and reconstruction. The invention designs a multi-scale salient feature extraction super-resolution network MCAN based on an attention mechanism to reconstruct the compressed image; its structure, shown in fig. 9, can extract multi-scale features of the image, restore the compressed image to the greatest extent, and effectively improve the visual quality of the reconstructed image. MCAN consists of three parts: shallow feature extraction, deep feature learning and reconstruction.
First, shallow feature extraction: for the input low-resolution image I_l, the shallow feature of the image is extracted as F_0 = f_shallow(I_l), where f_shallow(·) denotes the shallow feature extraction layer.
Second, deep feature extraction: the network comprises several multi-scale feature extraction residual groups MG and a deep feature fusion module IFM, connected by skip connections so as to aggregate more effective features. Depth residual multi-scale feature extraction consists of the following steps:
1. An initial MG layer M_0(·) extracts features from the shallow feature F_0 to obtain the initial feature F_1 = M_0(F_0).
2. The MG layers are iterated continuously to complete the deep extraction of image features, where M_{n-1}(·) denotes the (n−1)-th iterated MG layer, F_1 is the feature extracted by the initial MG layer, and F_n is the feature extracted after n MG iterations. The formula is F_n = M_{n-1}(F_{n-1}).
3. Meanwhile, the invention introduces adaptive weights on the skip paths: the multi-scale features enter the IFM-MG through skip connections, where w_i denotes the adaptive weight of the multi-scale feature F_i of the i-th residual layer, learned to emphasize salient features. f_concat(·) realizes feature fusion, formulated as:
F_IFM = f_concat[w_1·F_1, w_2·F_2, …, w_{n-1}·F_{n-1}, F_n]
The IFM-MG fuses the main-path feature F_n with the multi-hop features to obtain F_IFM; F_df denotes the depth feature output from F_IFM and F_0:
F_df = f_concat(F_IFM, F_0)
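The iteration-and-fusion flow above can be sketched as follows, with a trivial residual stand-in for the MG layer (the real MG contains MCAB attention blocks; everything here is illustrative):

```python
import numpy as np

def mg_layer(feat):
    # Trivial residual stand-in for one multi-scale residual group MG.
    return feat + np.tanh(feat)

def deep_feature_extraction(f0, n=4, weights=None):
    # Iterate n MG layers from the shallow feature F_0, keep every
    # intermediate F_i, fuse them with adaptive weights w_i (f_concat
    # is modelled as stacking along the channel axis), and finally
    # concatenate with F_0 to give F_df.
    feats, f = [], f0
    for _ in range(n):
        f = mg_layer(f)
        feats.append(f)
    if weights is None:
        weights = np.ones(n)  # the w_i are learned in practice
    f_ifm = np.concatenate([w * fi for w, fi in zip(weights, feats)])
    return np.concatenate([f_ifm, f0])  # F_df = f_concat(F_IFM, F_0)

f0 = np.zeros((2, 3))               # hypothetical shallow feature
f_df = deep_feature_extraction(f0)  # shape (4*2 + 2, 3) = (10, 3)
```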
The MG is internally composed of several multi-channel attention modules MCAB and an IFM, and residual learning improves the representation capability of the network. Inside the MCAB, multiple features are acquired through group convolution and multi-connection mechanisms, and a dual attention module EAB is introduced to acquire salient feature representations. The EAB comprises a CA channel attention module and an ESA spatial attention module connected in series, which adaptively readjust the spatial and channel features. Feature extraction in the MCAB consists of 3 steps:
1. Features are generated from the input features via convolution layers of different scales. Custom weights [λ_1, λ_2, λ_3, λ_4] are introduced to fuse the multiple features generated by the different-scale convolution layers, obtaining the feature F_c.
2. The CA module is adopted to obtain the attention weight of each channel, and the weights are linearly combined with the initialized feature map X to obtain a channel-weighted feature map.
3. The channel-weighted feature map is input into the SA module to obtain the spatial attention weights, and the input feature map is linearly combined with the spatial attention weights to obtain the final output feature map F_β. θ_c and θ_s denote the channel and spatial attention weights, and f(·) denotes the modulation function used to compute the final modulated feature F_β, expressed as:
F_β = f(θ_c, θ_s, X)
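The serial channel-then-spatial modulation of steps 2-3 can be sketched with sigmoid gates (the gate forms below are assumptions; the text only fixes that the two attentions are applied in series):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def dual_attention(x):
    # x: feature map X of shape (C, H, W).
    # Channel attention theta_c from global average pooling.
    theta_c = sigmoid(x.mean(axis=(1, 2)))[:, None, None]
    xc = theta_c * x
    # Spatial attention theta_s from a per-pixel channel mean,
    # applied to the channel-weighted map: F_beta = theta_s * xc.
    theta_s = sigmoid(xc.mean(axis=0))[None, :, :]
    return theta_s * xc

x = np.random.default_rng(1).standard_normal((4, 8, 8))
f_beta = dual_attention(x)  # same shape; gates are positive, so signs are kept
```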
Finally, the reconstruction module: the final feature F_ff is extracted by an upsampling layer and an image reconstruction convolution layer, where F_SR(·) denotes the super-resolution reconstruction of the image, formulated as:
F_ff = F_SR(I_l)
The image output by the network achieves a good visual reconstruction effect. Based on the above functions, the super-resolution network can restore the compressed image to a high-quality image to a great extent.
2.2.5 Prior knowledge propagation based on knowledge distillation.
To avoid inaccurate prior knowledge estimation and to let the network exploit image prior knowledge without performing prior estimation, a prior knowledge propagation scheme based on knowledge distillation is designed. Specifically, the network is divided into a teacher network and a student network, and prior knowledge is propagated from teacher to student through knowledge distillation. The teacher network has the same backbone architecture as the student network, except that the teacher network embeds image prior information to effectively guide image reconstruction. Prior knowledge propagation based on knowledge distillation consists of the following steps:
1. The input image I_HR first enters the teacher network, which uses the prior information P of the image. I'_T denotes the reconstructed image generated by the teacher network, and f_T(·) denotes the teacher network's image compression and reconstruction process. The teacher network is denoted by I'_T = f_T(I_HR, P).
2. After the teacher network is pre-trained, its features and outputs constrain and guide the training of the student network. I'_S denotes the reconstructed image generated by the student network, D the distillation information of the image, and f_S(·) the student network's image compression and reconstruction process. The student network may be expressed as I'_S = f_S(I_HR, D).
3. Image reconstruction is optimized using three losses: the loss between the reconstructed images of the teacher and student networks, the loss between the teacher network's reconstructed image and the real scene image, and the loss between the student network's reconstructed image and the real scene image.
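A sketch of the resulting three-term objective (MSE and equal weights are assumptions; the patent does not fix the loss form or the weighting):

```python
import numpy as np

def mse(a, b):
    return float(np.mean((a - b) ** 2))

def distillation_losses(i_teacher, i_student, i_gt, lambdas=(1.0, 1.0, 1.0)):
    # Teacher-vs-student, teacher-vs-ground-truth and
    # student-vs-ground-truth reconstruction losses, summed with
    # (assumed) weights lambdas.
    l_ts = mse(i_teacher, i_student)
    l_tg = mse(i_teacher, i_gt)
    l_sg = mse(i_student, i_gt)
    return lambdas[0] * l_ts + lambdas[1] * l_tg + lambdas[2] * l_sg

# Toy check: identical teacher/student outputs, ground truth all ones.
total = distillation_losses(np.zeros(4), np.zeros(4), np.ones(4))  # 0 + 1 + 1
```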
Referring to fig. 10, an embodiment of the present invention provides an image compression and reconstruction apparatus based on super resolution and a priori knowledge, including:
an image acquisition unit for acquiring a teaching image from a first terminal;
the prior feature acquisition unit is used for inputting the teaching image into a preset two-way prior type acquirer to obtain a preliminary prior feature image, and the two-way prior type acquirer is used for classifying the input image to obtain a first image mainly containing text and a second image mainly containing a teacher and students, and converting the feature distribution of the first image or the second image into feature distribution matching with the input image;
the image compression unit is used for compressing the teaching image by utilizing pyramid convolution and based on semantic information to obtain a compressed image;
the prior feature embedding unit is used for embedding prior features in the preliminary prior feature map into the super-resolution depth reconstruction process, wherein the prior features are features matched with preset prior conditions in the teaching image;
And the image reconstruction unit is used for carrying out, based on the multi-scale salient feature extraction super-resolution of the attention mechanism, shallow feature extraction on the compressed image and deep feature learning embedding the image prior features, and reconstructing to obtain the reconstructed image.
Embodiments of the present invention also disclose a computer program product or computer program comprising computer instructions stored in a computer readable storage medium. The computer instructions may be read from a computer-readable storage medium by a processor of a computer device, and executed by the processor, to cause the computer device to perform the method shown in fig. 1.
In some alternative embodiments, the functions/acts noted in the block diagrams may occur out of the order noted in the operational illustrations. For example, two blocks shown in succession may in fact be executed substantially concurrently or the blocks may sometimes be executed in the reverse order, depending upon the functionality/acts involved. Furthermore, the embodiments presented and described in the flowcharts of the present invention are provided by way of example in order to provide a more thorough understanding of the technology. The disclosed methods are not limited to the operations and logic flows presented herein. Alternative embodiments are contemplated in which the order of various operations is changed, and in which sub-operations described as part of a larger operation are performed independently.
Furthermore, while the invention is described in the context of functional modules, it should be appreciated that, unless otherwise indicated, one or more of the described functions and/or features may be integrated in a single physical device and/or software module or one or more functions and/or features may be implemented in separate physical devices or software modules. It will also be appreciated that a detailed discussion of the actual implementation of each module is not necessary to an understanding of the present invention. Rather, the actual implementation of the various functional modules in the apparatus disclosed herein will be apparent to those skilled in the art from consideration of their attributes, functions and internal relationships. Accordingly, one of ordinary skill in the art can implement the invention as set forth in the claims without undue experimentation. It is also to be understood that the specific concepts disclosed are merely illustrative and are not intended to be limiting upon the scope of the invention, which is to be defined in the appended claims and their full scope of equivalents.
The functions, if implemented in the form of software functional units and sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present invention may be embodied essentially or in a part contributing to the prior art or in a part of the technical solution, in the form of a software product stored in a storage medium, comprising several instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) to perform all or part of the steps of the method according to the embodiments of the present invention. And the aforementioned storage medium includes: a U-disk, a removable hard disk, a Read-Only Memory (ROM), a random access Memory (RAM, random Access Memory), a magnetic disk, or an optical disk, or other various media capable of storing program codes.
Logic and/or steps represented in the flowcharts or otherwise described herein, e.g., an ordered listing of executable instructions for implementing logical functions, can be embodied in any computer-readable medium for use by or in connection with an instruction execution system, apparatus, or device, such as a computer-based system, processor-containing system, or other system that can fetch the instructions from the instruction execution system, apparatus, or device and execute the instructions. For the purposes of this description, a "computer-readable medium" can be any means that can contain, store, communicate, propagate, or transport the program for use by or in connection with the instruction execution system, apparatus, or device.
More specific examples (a non-exhaustive list) of the computer-readable medium would include the following: an electrical connection (electronic device) having one or more wires, a portable computer diskette (magnetic device), a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber device, and a portable compact disc read-only memory (CDROM). In addition, the computer readable medium may even be paper or other suitable medium on which the program is printed, as the program may be electronically captured, via, for instance, optical scanning of the paper or other medium, then compiled, interpreted or otherwise processed in a suitable manner, if necessary, and then stored in a computer memory.
It is to be understood that portions of the present invention may be implemented in hardware, software, firmware, or a combination thereof. In the above-described embodiments, the various steps or methods may be implemented in software or firmware stored in a memory and executed by a suitable instruction execution system. For example, if implemented in hardware, as in another embodiment, may be implemented using any one or combination of the following techniques, as is well known in the art: discrete logic circuits having logic gates for implementing logic functions on data signals, application specific integrated circuits having suitable combinational logic gates, programmable Gate Arrays (PGAs), field Programmable Gate Arrays (FPGAs), and the like.
In the description of the present specification, a description referring to terms "one embodiment," "some embodiments," "examples," "specific examples," or "some examples," etc., means that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the present invention. In this specification, schematic representations of the above terms do not necessarily refer to the same embodiments or examples. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples.
While embodiments of the present invention have been shown and described, it will be understood by those of ordinary skill in the art that: many changes, modifications, substitutions and variations may be made to the embodiments without departing from the spirit and principles of the invention, the scope of which is defined by the claims and their equivalents.
While the preferred embodiment of the present invention has been described in detail, the present invention is not limited to the embodiments described above, and those skilled in the art can make various equivalent modifications or substitutions without departing from the spirit of the present invention, and these equivalent modifications or substitutions are included in the scope of the present invention as defined in the appended claims.
Claims (10)
1. An image compression and reconstruction method based on super resolution and priori knowledge is characterized by comprising the following steps:
acquiring a teaching image from a first terminal;
inputting the teaching image into a preset two-way prior type acquirer to obtain a preliminary prior feature map, wherein the two-way prior type acquirer is used for classifying the input image to obtain a first image mainly comprising text and a second image mainly comprising a teacher and a student, and converting the feature distribution of the first image or the second image into feature distribution matching with the input image;
Compressing the teaching image by utilizing pyramid convolution and based on semantic information to obtain a compressed image;
embedding prior features in the preliminary prior feature map into a super-resolution depth reconstruction process, wherein the prior features are features matched with preset prior conditions in the teaching image;
and carrying out shallow feature extraction and deep feature learning of the prior features of the embedded image on the compressed image by using the multi-scale significant feature extraction super-resolution based on an attention mechanism, and reconstructing to obtain a reconstructed image.
2. The method for compressing and reconstructing an image based on super-resolution and priori knowledge according to claim 1, wherein said inputting the teaching image into a preset two-way a priori type acquirer to obtain a preliminary a priori feature map comprises:
inputting the teaching image into a preset two-way priori type acquirer, and recognizing by a text priori device in the two-way priori type acquirer to obtain the teaching image as a first image; extracting a probability vector sequence in the first image through the text prior device; converting the probability vector sequence into a feature distribution matching with the teaching image to obtain a preliminary text priori feature map;
or, alternatively,
inputting the teaching image into a preset two-way priori type acquirer, and recognizing the teaching image into a second image by a face priori device in the two-way priori type acquirer; extracting face distribution codes in the second image through the face priori device; and converting the face distribution codes into feature distribution matching with the teaching image to obtain a preliminary face priori feature map.
3. The method for compressing and reconstructing an image based on super-resolution and a priori knowledge according to claim 1, wherein compressing the teaching image by using pyramid convolution and based on semantic information to obtain a compressed image comprises:
performing shallow convolution on the teaching image to obtain a convolution feature map;
inputting the convolution feature map into a first branch of pyramid convolution, and fusing high-low layer information of the convolution feature map to generate a first feature map containing a plurality of semantic layer information;
inputting the convolution feature map into a second branch of pyramid convolution to obtain a second feature map according to a low channel capacity space;
and aggregating the first feature map and the second feature map, and downsampling the aggregated feature map to obtain a compressed image.
4. The method for image compression and reconstruction based on super-resolution and prior knowledge according to claim 1, wherein the embedding the prior feature in the preliminary prior feature map into the super-resolution depth reconstruction process comprises:
the compressed image passes through a super-resolution module to generate a first multi-scale depth feature map;
inputting the preliminary prior feature map into a two-way prior feature converter to convert and align the features of the preliminary prior feature map and generate a second multi-scale depth feature map;
and fusing the first multi-scale depth feature map and the second multi-scale depth feature map, and embedding the prior features in the preliminary prior feature map into a super-resolution depth reconstruction process.
5. The method for compressing and reconstructing an image based on super-resolution and priori knowledge according to claim 1, wherein the multi-scale salient feature extraction super-resolution based on the attention mechanism performs shallow feature extraction on the compressed image and deep feature learning of embedding the prior feature of the image, and reconstructing the reconstructed image, comprising:
extracting shallow features from the compressed image;
Extracting depth residual multi-scale characteristics from the compressed image, and aggregating the depth residual multi-scale characteristics through jump connection;
and in the depth residual error multi-scale feature extraction process, fusing prior features in the preliminary prior feature map to carry out image depth reconstruction.
6. The method for image compression and reconstruction based on super resolution and a priori knowledge according to claim 1, further comprising:
training a teacher network according to the teaching image and the reconstructed image by adopting a priori knowledge propagation scheme based on knowledge distillation, wherein the teacher network is used for generating a teaching reconstructed image according to an input teaching image;
according to the teacher network, adopting a knowledge distillation mode to transmit priori information from the teacher to the student network;
and regenerating a new reconstructed image corresponding to the teaching image according to a first loss between the teaching image and the reconstructed image of the teacher network, and a second loss between the teaching image and the reconstructed image of the student network.
7. The method for image compression and reconstruction based on super resolution and a priori knowledge according to claim 1, further comprising:
And transmitting the reconstructed image to a second terminal so that the second terminal can display the reconstructed image.
8. An image compression and reconstruction device based on super resolution and priori knowledge, comprising:
an image acquisition unit for acquiring a teaching image from a first terminal;
the prior feature acquisition unit is used for inputting the teaching image into a preset two-way prior type acquirer to obtain a preliminary prior feature image, and the two-way prior type acquirer is used for classifying the input image to obtain a first image mainly containing text and a second image mainly containing a teacher and students, and converting the feature distribution of the first image or the second image into feature distribution matching with the input image;
the image compression unit is used for compressing the teaching image by utilizing pyramid convolution and based on semantic information to obtain a compressed image;
the prior feature embedding unit is used for embedding prior features in the preliminary prior feature map into the super-resolution depth reconstruction process, wherein the prior features are features matched with preset prior conditions in the teaching image;
and the image reconstruction unit is used for carrying out shallow feature extraction on the compressed image and deep feature learning of the prior feature of the embedded image based on the multi-scale significant feature extraction super-resolution of the attention mechanism, and reconstructing the reconstructed image.
9. An electronic device comprising a processor and a memory;
the memory is used for storing programs;
the processor executing the program implements the method of any one of claims 1 to 7.
10. A computer-readable storage medium, characterized in that the storage medium stores a program that is executed by a processor to implement the method of any one of claims 1 to 7.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310325666.3A CN116167920A (en) | 2023-03-24 | 2023-03-24 | Image compression and reconstruction method based on super-resolution and priori knowledge |
Publications (1)
Publication Number | Publication Date |
---|---|
CN116167920A true CN116167920A (en) | 2023-05-26 |
Family
ID=86422062
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202310325666.3A Pending CN116167920A (en) | 2023-03-24 | 2023-03-24 | Image compression and reconstruction method based on super-resolution and priori knowledge |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN116167920A (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116403064A (en) * | 2023-06-07 | 2023-07-07 | 苏州浪潮智能科技有限公司 | Picture processing method, model, basic block structure, device and medium |
CN116403064B (en) * | 2023-06-07 | 2023-08-25 | 苏州浪潮智能科技有限公司 | Picture processing method, system, equipment and medium |
Similar Documents
Publication | Title |
---|---|
CN111833246B | Single-frame image super-resolution method based on attention cascade network |
CN113362223B | Image super-resolution reconstruction method based on attention mechanism and two-channel network |
CN111179167B | Image super-resolution method based on multi-stage attention enhancement network |
CN111754438B | Underwater image restoration model based on multi-branch gating fusion and restoration method thereof |
CN113205096B | Attention-based combined image and feature self-adaptive semantic segmentation method |
WO2023280064A1 | Audiovisual secondary haptic signal reconstruction method based on cloud-edge collaboration |
CN114820341A | Image blind denoising method and system based on enhanced transform |
CN113140023B | Text-to-image generation method and system based on spatial attention |
CN113486890A | Text detection method based on attention feature fusion and cavity residual error feature enhancement |
CN114049261A | Image super-resolution reconstruction method focusing on foreground information |
CN111833261A | Image super-resolution restoration method for generating countermeasure network based on attention |
CN116167920A | Image compression and reconstruction method based on super-resolution and priori knowledge |
CN115829876A | Real degraded image blind restoration method based on cross attention mechanism |
CN114757828A | Transformer-based video space-time super-resolution method |
Liu et al. | GridDehazeNet+: An enhanced multi-scale network with intra-task knowledge transfer for single image dehazing |
Chen et al. | Image denoising via deep network based on edge enhancement |
Wang et al. | Adaptive image compression using GAN based semantic-perceptual residual compensation |
CN111489405B | Face sketch synthesis system for generating confrontation network based on condition enhancement |
CN113379606A | Face super-resolution method based on pre-training generation model |
CN117078539A | CNN-transducer-based local global interactive image restoration method |
CN113920171B | Bimodal target tracking method based on feature level and decision level fusion |
CN116939320A | Method for generating multimode mutually-friendly enhanced video semantic communication |
CN116703719A | Face super-resolution reconstruction device and method based on face 3D priori information |
CN113487481B | Circular video super-resolution method based on information construction and multi-density residual block |
Tang et al. | Context module based multi-patch hierarchical network for motion deblurring |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||