CN114022357A

CN114022357A - Image reconstruction method, training method, device and equipment of image reconstruction model

Info

Publication number: CN114022357A
Application number: CN202111275994.4A
Authority: CN
Inventors: 袁苇航
Original assignee: Beijing Baidu Netcom Science and Technology Co Ltd
Current assignee: Beijing Baidu Netcom Science and Technology Co Ltd
Priority date: 2021-10-29
Filing date: 2021-10-29
Publication date: 2022-02-08

Abstract

The disclosure provides an image reconstruction method, apparatus, device and storage medium, and relates to the field of artificial intelligence, in particular to computer vision and deep learning. The specific implementation is as follows: at least two pieces of feature information of a target image are determined based on at least two feature lookup tables; the at least two feature lookup tables are respectively associated with at least two feature extraction modules in an image reconstruction model and represent the mapping relationship between the input pixel information of the associated feature extraction module and the extracted feature pixel information; the convolution kernels of the first convolution layers in the at least two feature extraction modules are the same in size and different in expansion rate; the at least two pieces of feature information are fused by a feature fusion module in the image reconstruction model to obtain fused feature information; and a reconstructed image of the target image is determined according to the fused feature information. The disclosure can improve the efficiency and quality of image reconstruction.

Description

Image reconstruction method, training method, device and equipment of image reconstruction model

Technical Field

The present disclosure relates to the field of artificial intelligence, in particular to computer vision and deep learning technologies, and more specifically to an image reconstruction method, a training method of an image reconstruction model, corresponding apparatuses, an electronic device, and a computer-readable storage medium.

Background

Super-resolution (SR) refers to reconstructing a corresponding high-resolution image from an observed low-resolution image. With the rapid development of the video and live-streaming industries, new requirements are being placed on super-resolution technology.

Disclosure of Invention

The disclosure provides an image reconstruction method, a training method of an image reconstruction model, and corresponding apparatuses and devices.

According to an aspect of the present disclosure, there is provided an image reconstruction method including:

determining at least two pieces of feature information of a target image based on at least two feature lookup tables; wherein the at least two feature lookup tables are respectively associated with at least two feature extraction modules in an image reconstruction model and are used to represent the mapping relationship between input pixel information of the associated feature extraction module and extracted feature pixel information; the convolution kernels of the first convolution layers in the at least two feature extraction modules are the same in size and different in expansion rate;

fusing the at least two pieces of feature information through a feature fusion module in the image reconstruction model to obtain fused feature information;

and determining a reconstructed image of the target image according to the fused feature information.

According to another aspect of the present disclosure, there is provided a training method of an image reconstruction model, including:

extracting features of a sample image through at least two feature extraction modules in an image reconstruction model to be trained, respectively, to obtain at least two pieces of feature information; the convolution kernels of the first convolution layers in the at least two feature extraction modules are the same in size and different in expansion rate;

fusing the at least two pieces of feature information through a feature fusion module in the image reconstruction model to obtain fused feature information;

determining a reconstructed image of the sample image according to the fused feature information;

performing iterative training on the image reconstruction model by using a loss function; the loss function is used for characterizing the difference between the reconstructed image and a real image of the sample image;

wherein the input image of the feature extraction module and the extracted feature information are used to construct a feature lookup table for the feature extraction module; the feature lookup table is used to represent the mapping relationship between input pixel information and extracted feature pixel information.

According to still another aspect of the present disclosure, there is provided an image reconstruction apparatus including:

a feature lookup module, configured to determine at least two pieces of feature information of a target image based on at least two feature lookup tables; wherein the at least two feature lookup tables are respectively associated with at least two feature extraction modules in an image reconstruction model and are used to represent the mapping relationship between input pixel information of the associated feature extraction module and extracted feature pixel information; the convolution kernels of the first convolution layers in the at least two feature extraction modules are the same in size and different in expansion rate;

a fused feature determining module, configured to fuse the at least two pieces of feature information through a feature fusion module in the image reconstruction model to obtain fused feature information;

and an image reconstruction module, configured to determine a reconstructed image of the target image according to the fused feature information.

According to still another aspect of the present disclosure, there is provided a training apparatus of an image reconstruction model, including:

a feature information determining module, configured to extract features of a sample image through at least two feature extraction modules in an image reconstruction model to be trained, respectively, to obtain at least two pieces of feature information; the convolution kernels of the first convolution layers in the at least two feature extraction modules are the same in size and different in expansion rate;

an image reconstruction module, configured to determine a reconstructed image of the sample image according to the fused feature information;

an iterative training module, configured to perform iterative training on the image reconstruction model by using a loss function; the loss function is used for characterizing the difference between the reconstructed image and a real image of the sample image.

According to still another aspect of the present disclosure, there is provided an electronic device including:

at least one processor; and

a memory communicatively coupled to the at least one processor; wherein

the memory stores instructions executable by the at least one processor to cause the at least one processor to perform a method provided by any embodiment of the disclosure.

According to yet another aspect of the present disclosure, there is provided a non-transitory computer readable storage medium having stored thereon computer instructions for causing a computer to perform a method provided by any of the embodiments of the present disclosure.

According to yet another aspect of the present disclosure, a computer program product is provided, comprising a computer program which, when executed by a processor, implements the method provided by any of the embodiments of the present disclosure.

According to the technology of the present disclosure, the efficiency and quality of image reconstruction can be improved.

It should be understood that the statements in this section do not necessarily identify key or critical features of the embodiments of the present disclosure, nor do they limit the scope of the present disclosure. Other features of the present disclosure will become apparent from the following description.

Drawings

The drawings are included to provide a better understanding of the present solution and are not to be construed as limiting the present disclosure. In the drawings:

FIG. 1a is a schematic diagram of an image reconstruction method provided in accordance with an embodiment of the present disclosure;

FIG. 1b is a schematic diagram of a prediction of an image reconstruction model provided according to an embodiment of the present disclosure;

FIG. 1c is a schematic diagram of training an image reconstruction model provided in accordance with an embodiment of the present disclosure;

FIG. 1d is a schematic structural diagram of an image reconstruction model provided in accordance with an embodiment of the present disclosure;

FIG. 1e is a schematic structural diagram of a feature extraction module provided according to an embodiment of the present disclosure;

FIG. 2 is a schematic diagram of another image reconstruction method provided in accordance with an embodiment of the present disclosure;

FIG. 3 is a schematic diagram of another image reconstruction method provided in accordance with an embodiment of the present disclosure;

FIG. 4 is a schematic diagram of a training method of an image reconstruction model according to an embodiment of the present disclosure;

FIG. 5 is a schematic diagram of another training method for an image reconstruction model provided in accordance with an embodiment of the present disclosure;

FIG. 6 is a schematic diagram of an image reconstruction apparatus provided in accordance with an embodiment of the present disclosure;

FIG. 7 is a schematic diagram of a training apparatus for an image reconstruction model according to an embodiment of the present disclosure;

FIG. 8 is a block diagram of an electronic device for implementing an image reconstruction method or a training method of an image reconstruction model according to an embodiment of the present disclosure.

Detailed Description

Exemplary embodiments of the present disclosure are described below with reference to the accompanying drawings, in which various details of the embodiments of the disclosure are included to assist understanding, and which are to be considered as merely exemplary. Accordingly, those of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope and spirit of the present disclosure. Also, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness.

The scheme provided by the embodiment of the disclosure is described in detail below with reference to the accompanying drawings.

Fig. 1a is a schematic diagram of an image reconstruction method provided according to an embodiment of the present disclosure, which is applicable to a case of performing enhanced reconstruction on an image. The method may be performed by an image reconstruction apparatus, which may be implemented in hardware and/or software and may be configured in an electronic device. Referring to fig. 1a, the method specifically includes the following:

S110, determining at least two pieces of feature information of a target image based on at least two feature lookup tables;

wherein the at least two feature lookup tables are respectively associated with at least two feature extraction modules in an image reconstruction model and are used to represent the mapping relationship between input pixel information of the associated feature extraction module and extracted feature pixel information; the convolution kernels of the first convolution layers in the at least two feature extraction modules are the same in size and different in expansion rate;

S120, fusing the at least two pieces of feature information through a feature fusion module in the image reconstruction model to obtain fused feature information;

S130, determining a reconstructed image of the target image according to the fused feature information.

In the embodiment of the present disclosure, the image reconstruction model is used to perform enhanced reconstruction on an image. The target image may be a low-resolution image and the reconstructed image may be a high-resolution image; the super-resolution multiple may be S (for example, 2), that is, the resolution of the reconstructed image may be S times the resolution of the target image. Taking an 8-bit target image as an example (pixel values in the target image range from 0 to 255), the reconstructed image may be 8 × S bits, that is, pixel values in the reconstructed image range from 0 to 2^(8×S) − 1.

The image reconstruction model may include at least two feature extraction modules and a feature fusion module. Each feature extraction module extracts features of a different scale from the target image, yielding at least two pieces of feature information, and the feature fusion module fuses the at least two pieces of feature information to obtain fused feature information. The image reconstruction model may also include a 1 × 1 convolution layer and a pixel recombination (pixel shuffle) module.

Fig. 1b is a schematic diagram of prediction with an image reconstruction model according to an embodiment of the present disclosure. Referring to fig. 1b, in the prediction stage, the target image may be input into at least two feature lookup tables (Look-Up Tables, LUTs) to obtain at least two pieces of feature information of the target image respectively; the at least two pieces of feature information are fused by the fusion module in the image reconstruction model to obtain fused feature information; and the reconstructed image of the target image is determined according to the fused feature information, for example by passing the fused feature information through a 1 × 1 convolution layer and a pixel recombination module. Each feature lookup table is associated with one feature extraction module used by the image reconstruction model in the training stage, and represents the mapping relationship between the input pixel information of the associated feature extraction module and the extracted feature pixel information.
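The prediction flow described above can be sketched as follows. This is only an illustration under stated assumptions, not the disclosed implementation: the names `lookup_features` and `fuse` are hypothetical, the table is modelled as a Python dict keyed by four compressed pixel values, compression is assumed to keep the i most significant bits, image borders are clamped, and an element-wise mean stands in for the (unspecified) feature fusion module.

```python
def lookup_features(image, lut, k, bits=8, i=4):
    """Return, for every pixel, the S*S feature values stored in `lut`.

    image : H x W nested list of integers in [0, 2**bits)
    lut   : dict mapping a 4-tuple of compressed pixel values to a list of features
    k     : expansion rate of the first convolution layer associated with `lut`
    """
    h, w = len(image), len(image[0])
    shift = bits - i  # assumed compression: keep the i most significant bits
    feats = []
    for y in range(h):
        row = []
        for x in range(w):
            # 2 x 2 taps spaced k pixels apart; borders are clamped (an assumption).
            taps = [image[min(y + dy * k, h - 1)][min(x + dx * k, w - 1)]
                    for dy in (0, 1) for dx in (0, 1)]
            key = tuple(t >> shift for t in taps)
            row.append(lut[key])
        feats.append(row)
    return feats


def fuse(feature_maps):
    """Stand-in fusion: element-wise mean over the per-table feature maps."""
    n = len(feature_maps)
    h, w = len(feature_maps[0]), len(feature_maps[0][0])
    return [[[sum(fm[y][x][c] for fm in feature_maps) / n
              for c in range(len(feature_maps[0][y][x]))]
             for x in range(w)]
            for y in range(h)]
```

In this sketch one table would be queried per expansion rate (for example k = 1 and k = 2) and the resulting maps passed to `fuse`; a real system would then apply the 1 × 1 convolution and pixel recombination steps to the fused result.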

Fig. 1c is a schematic diagram of training an image reconstruction model according to an embodiment of the present disclosure. Referring to fig. 1c, in the training stage, at least two feature extraction modules in the image reconstruction model respectively extract features from a sample image to obtain at least two pieces of feature information, and the fusion module in the image reconstruction model fuses the at least two pieces of feature information to obtain fused feature information; a reconstructed image of the sample image may then be determined from the fused feature information. After training of the image reconstruction model is completed, a feature lookup table can be constructed for each feature extraction module.

Each feature extraction module includes a first convolution layer; the convolution kernels of the first convolution layers are the same in size but differ in expansion rate. Because the kernel size is the same, that is, it does not need to be adjusted, different hole (dilated) convolution kernels are constructed purely through the expansion rates, and these kernels extract feature maps of different scales from the target image. This enlarges the receptive field of the first convolution layers and improves the ability of the image reconstruction model to perceive complex textures. In addition, since the kernel size of the first convolution layer is unchanged, the size of the feature lookup table of the feature extraction module is not affected, and the lookup efficiency of the feature lookup table is maintained.
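A small sketch may make the point concrete: changing the expansion rate moves the sampled pixels further apart but leaves the number of sampled pixels, and hence the number of lookup-table index dimensions, unchanged. The helper `dilated_taps` is hypothetical, and the sketch assumes the standard dilated-convolution convention in which adjacent taps are k pixels apart, so a 2 × 2 kernel with rate k spans (k + 1) × (k + 1) pixels; the document's own sizing convention may differ.

```python
def dilated_taps(n, k):
    """Offsets (row, col) sampled by an n x n kernel with expansion rate k.

    Assumes the standard convention: adjacent taps are k pixels apart.
    """
    return [(r * k, c * k) for r in range(n) for c in range(n)]


for k in (1, 2, 3):
    taps = dilated_taps(2, k)
    span = max(r for r, _ in taps) + 1  # side length of the covered window
    # The number of taps stays at 4, so the table keeps four index dimensions.
    print("rate", k, "taps", taps, "span", span, "inputs", len(taps))
```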

According to the technical solution of the embodiment of the present disclosure, the convolution kernels of the first convolution layers in the feature extraction modules are the same in size, and different hole (dilated) convolution kernels are constructed through the respective expansion rates. Without sacrificing feature lookup efficiency, this enlarges the receptive field of the first convolution layers and improves the ability of the image reconstruction model to perceive complex textures.

In an optional embodiment, the expansion rate of the first convolution layer in the kth feature extraction module is k, and the feature extraction module further comprises at least two residual blocks and a second convolution layer connected in series with the first convolution layer.

Fig. 1d is a schematic structural diagram of an image reconstruction model according to an embodiment of the present disclosure. Referring to fig. 1d, a target image is processed by K feature extraction modules 11 (K being the number of feature extraction modules), a feature fusion module 21, a 1 × 1 convolution layer 31 and a pixel recombination module 41 to obtain a reconstructed image, i.e., a super-resolution image. Fig. 1e is a schematic structural diagram of the k-th feature extraction module provided according to an embodiment of the present disclosure. Referring to fig. 1e, the k-th feature extraction module may include a first convolution layer 111, at least two residual blocks 112 and a second convolution layer 113 connected in series, where the convolution kernel size of the first convolution layer 111 may be n × n, for example 2 × 2, and the expansion rate may be k, that is, the expanded convolution kernel of the first convolution layer 111 is nk × nk in size; the convolution kernel size of the second convolution layer 113 may be 1 × 1. Extracting features of the target image with a feature extraction module that includes the first convolution layer 111, the residual blocks 112 and the second convolution layer 113 connected in series improves the feature extraction capability of the module, i.e., the expressive power of the feature information, and thus further improves the quality of the reconstructed image.
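The pixel recombination step at the end of the pipeline can be illustrated with a plain-Python sketch. It assumes the channel layout commonly used for pixel shuffle, in which channel c·S² + i·S + j of the input supplies the sub-pixel at offset (i, j) of output channel c; the source does not spell out the layout, and the function name `pixel_shuffle` is chosen here for illustration.

```python
def pixel_shuffle(x, s):
    """Rearrange C*s*s channels of size H x W into C channels of size (H*s) x (W*s).

    x is a list of channels, each an H x W nested list.
    """
    channels = len(x) // (s * s)
    h, w = len(x[0]), len(x[0][0])
    out = [[[0] * (w * s) for _ in range(h * s)] for _ in range(channels)]
    for c in range(channels):
        for i in range(s):
            for j in range(s):
                src = x[c * s * s + i * s + j]
                for y in range(h):
                    for z in range(w):
                        # Each input pixel expands into an s x s block of the output.
                        out[c][y * s + i][z * s + j] = src[y][z]
    return out
```

With S = 2, four feature values per input pixel become one 2 × 2 block of the output, which is how a single low-resolution position yields S² high-resolution pixels.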

In an optional implementation, the method is executed by an image playing terminal, which may be a video player or a live-streaming client. In the image reconstruction model, the feature extraction modules are computationally expensive while the fusion module is computationally cheap. In the prediction stage, the feature lookup tables replace the feature extraction modules and only the fusion module is retained, which greatly reduces the computation required for image reconstruction and improves image reconstruction efficiency. This addresses the problem in the related art that the super-resolution effect of an end-to-end network is degraded by the limited computing power of the image playing terminal.

Fig. 2 is a schematic diagram of another image reconstruction method provided in accordance with an embodiment of the present disclosure. The present embodiment is an alternative proposed on the basis of the above-described embodiments. In the embodiment of the present disclosure, the feature lookup table of the feature extraction module is constructed according to the precision compression factor. Referring to fig. 2, the image reconstruction method provided in this embodiment includes:

S210, compressing original pixel information in the target image by using a precision compression coefficient to obtain first compressed pixel information and second compressed pixel information respectively;

S220, determining a pixel error between the first compressed pixel information and the original pixel information;

S230, looking up, based on each feature lookup table, first feature information associated with the first compressed pixel information and second feature information associated with the second compressed pixel information;

S240, performing interpolation according to the first feature information, the second feature information and the pixel error to obtain interpolation feature information;

S250, fusing at least two pieces of feature information through a feature fusion module in the image reconstruction model to obtain fused feature information;

wherein the at least two pieces of feature information include the first feature information, the second feature information and the interpolation feature information;

S260, determining a reconstructed image of the target image according to the fused feature information.

The at least two feature lookup tables are respectively associated with the at least two feature extraction modules in the image reconstruction model, that is, each feature extraction module has its own associated feature lookup table, which represents the mapping relationship between the input pixel information of the associated feature extraction module and the extracted feature pixel information; the convolution kernels of the first convolution layers in the at least two feature extraction modules are the same in size and different in expansion rate.

The precision compression coefficient is used to reduce the scale of the feature lookup table, and the scale of the feature lookup table is negatively correlated with the value of the precision compression coefficient. The value of the precision compression coefficient is related to the size of the target image.

A feature lookup table is constructed for each feature extraction module. For convenience of description, the construction of the k-th feature lookup table for the k-th feature extraction module is taken as an example. The feature lookup table may be denoted as L_k[2ⁱ][2ⁱ][2ⁱ][2ⁱ][S²], where i is the precision compression coefficient; for example, for an 8-bit image, whose pixel values range from 0 to 2⁸ − 1, i may take the value 8, 4 or 2. The first four dimensions [2ⁱ][2ⁱ][2ⁱ][2ⁱ] respectively index the values of the 4 input pixel points, and S is the super-resolution multiple. Taking S = 2 as an example, the last dimension S² takes the values 0, 1, 2 and 3, which respectively denote the feature pixel points at positions 0, 1, 2 and 3 in the output feature information. Taking a first convolution layer with a 2 × 2 convolution kernel as an example, let the 4 input original pixel values be A, B, C and D, and let the 4 feature pixel values output by the k-th feature extraction module be a, b, c and d; the mapping relationship in the k-th feature lookup table is then:

L_k[A/2⁴][B/2⁴][C/2⁴][D/2⁴][0] = a

L_k[A/2⁴][B/2⁴][C/2⁴][D/2⁴][1] = b

L_k[A/2⁴][B/2⁴][C/2⁴][D/2⁴][2] = c

L_k[A/2⁴][B/2⁴][C/2⁴][D/2⁴][3] = d

Precision compression further reduces the scale of the feature lookup table and improves the efficiency of looking up feature information, thereby improving image reconstruction efficiency.
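To give a sense of scale, the number of stored values can be computed directly from the table shape. The sketch below takes the shape exactly as written, L_k[2ⁱ][2ⁱ][2ⁱ][2ⁱ][S²], with four input taps; the function name `lut_size` is hypothetical and the figures are plain arithmetic rather than measurements from the disclosure.

```python
def lut_size(i, s, taps=4):
    """Number of stored values in a table shaped [2**i] * taps + [s * s]."""
    return (2 ** i) ** taps * s * s


for i in (8, 4, 2):
    # S = 2 gives four output feature values per table entry.
    print("i =", i, "->", lut_size(i, 2), "values")
```

For S = 2 this gives 17,179,869,184 values at i = 8, 262,144 at i = 4 and 1,024 at i = 2, which shows why an uncompressed table is impractical on a playback terminal.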

In the prediction stage, the original pixel information in the target image is compressed using the precision compression coefficient to obtain the first compressed pixel information and the second compressed pixel information, and the pixel error between the first compressed pixel information and the original pixel information is determined. For each feature lookup table, the first feature information associated with the first compressed pixel information and the second feature information associated with the second compressed pixel information are looked up separately, and interpolation is performed according to the first feature information, the second feature information and the pixel error to obtain the interpolation feature information. The first feature information, the interpolation feature information and the second feature information are taken as the feature information of that feature lookup table. The interpolation algorithm is not particularly limited in the embodiments of the present disclosure; for example, a tetrahedral interpolation algorithm may be used. Still taking the k-th feature lookup table as an example, the feature information of the target image determined based on the k-th feature lookup table is obtained by looking up the first feature information and the second feature information in the k-th feature lookup table and interpolating according to the first feature information, the second feature information and the pixel error. The feature information of the target image determined from each feature lookup table is then fused to obtain the fused feature information.
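As a rough illustration of how the pixel error can steer the interpolation, the sketch below blends the two looked-up feature vectors linearly. This is a deliberate simplification: the text names tetrahedral interpolation as one option, which interpolates over all four input pixels jointly, whereas this sketch treats a single scalar error. The function name `interpolate` and the parameter `step` (the assumed spacing between neighbouring table entries in original pixel units) are hypothetical.

```python
def interpolate(first_feat, second_feat, error, step):
    """Linear blend of two looked-up feature vectors.

    error / step is the fractional position of the original pixel between the
    two compressed values (0 selects first_feat, 1 selects second_feat).
    """
    t = error / step
    return [(1 - t) * a + t * b for a, b in zip(first_feat, second_feat)]


# A pixel one quarter of the way between two table entries.
print(interpolate([0.0, 10.0], [16.0, 26.0], error=4, step=16))
```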

In an optional implementation manner, the compressing original pixel information in a target image by using the precision compression coefficient to obtain first compressed pixel information and second compressed pixel information includes:

respectively compressing original pixel information in the target image by the following formula:

where I is the original pixel information in the target image, I^r and I^c are respectively the first compressed pixel information and the second compressed pixel information, round() denotes rounding down, ceil() denotes rounding up, i is the precision compression coefficient, and the convolution kernel of the first convolution layer is 2 × 2. I − I^r is the pixel error. By compressing the original pixel information in the target image, the number of lookups in the feature lookup table can be reduced and image reconstruction efficiency improved. It should be noted that the same precision compression coefficient is used in both the construction and the lookup of the feature lookup table.
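Since the formula itself is not reproduced in the text, the following sketch shows only one plausible reading of the description. It assumes the quantisation step is 2^(bits − i), so that an 8-bit pixel with i = 4 is divided by 2⁴ as in the mapping example; the first value is rounded down, the second is rounded up, and the error is expressed in original pixel units as the distance from the pixel to the lower table entry. The function name `compress` and the `bits` parameter are hypothetical.

```python
def compress(pixel, i, bits=8):
    """Return (first, second, error) for one pixel value.

    Assumption: the quantisation step is 2**(bits - i); the source formula
    is not shown, so this is an interpretation rather than the disclosed one.
    """
    step = 1 << (bits - i)
    first = pixel // step          # rounded down
    second = -(-pixel // step)     # rounded up (ceiling division on integers)
    error = pixel - first * step   # residual in original pixel units
    return first, second, error


print(compress(37, 4))   # a pixel lying between two table entries
print(compress(32, 4))   # a pixel lying exactly on a table entry
```

Under this reading the rounded-up index can reach 2ⁱ for the largest pixel values (for example `compress(255, 4)` yields 16), so a practical table would need either one extra slot per dimension or clamping; the source does not say which.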

According to the technical solution of the embodiment of the present disclosure, pixel compression reduces the scale of the feature lookup table during its construction; in addition, during lookup, pixel compression reduces the number of lookups in the feature lookup table and improves image reconstruction efficiency.

Fig. 3 is a schematic diagram of another image reconstruction method provided in accordance with an embodiment of the present disclosure. The present embodiment is an alternative proposed on the basis of the above-described embodiments. Referring to fig. 3, the image reconstruction method provided in this embodiment includes:

S310, rotating the target image by at least two angles to obtain at least two rotated images;

S320, determining at least two pieces of feature information of each rotated image based on at least two feature lookup tables;

S330, reversely rotating the at least two pieces of feature information of each rotated image, and weighting the reverse rotation results to obtain at least two pieces of feature information of the target image;

S340, fusing the at least two pieces of feature information through a feature fusion module in the image reconstruction model to obtain fused feature information;

S350, determining a reconstructed image of the target image according to the fused feature information.

The at least two feature lookup tables are respectively associated with at least two feature extraction modules in the image reconstruction model and are used to represent the mapping relationship between input pixel information of the associated feature extraction module and extracted feature pixel information; the convolution kernels of the first convolution layers in the at least two feature extraction modules are the same in size and different in expansion rate.

In the embodiment of the present disclosure, the sum of the first angle (of rotation) and the second angle (of reverse rotation) may be an integer multiple of 360°. For example, the first angles may be 0°, 90°, 180° and 270°, and the corresponding second angles may be 0°, 270°, 180° and 90°.

Specifically, the input target image may be rotated by 0°, 90°, 180° and 270° to obtain four rotated images; at least two pieces of feature information are determined for each of the four rotated images based on the at least two feature lookup tables; the feature information is then reversely rotated, and the reverse rotation results are weighted. That is, for each feature lookup table, the four reverse rotation results are weighted and averaged to obtain the feature information associated with that feature lookup table, and the feature information associated with the feature lookup tables is fused to obtain the fused feature information. Rotating the target image enlarges the area covered by the feature lookup, that is, further enlarges the receptive field, and thus further improves the quality of the reconstructed image.
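The rotate, look up, rotate back and average procedure can be sketched in plain Python. The sketch assumes equal weights of 1/4 for the four orientations (the text only says the results are weighted), and `extract` is a hypothetical stand-in for the per-table feature lookup that returns one value per input pixel.

```python
def rot90(img, times=1):
    """Rotate an H x W nested list counter-clockwise by 90 degrees, `times` times."""
    for _ in range(times % 4):
        img = [list(row) for row in zip(*img)][::-1]
    return img


def rotation_ensemble(img, extract):
    """Average extract() over four rotations, undoing each rotation first."""
    h, w = len(img), len(img[0])
    acc = [[0.0] * w for _ in range(h)]
    for r in range(4):
        feat = extract(rot90(img, r))   # rotate by r * 90 degrees, then look up
        back = rot90(feat, 4 - r)       # reverse rotation: the two angles sum to 360
        for y in range(h):
            for x in range(w):
                acc[y][x] += back[y][x] / 4   # equal weights (an assumption)
    return acc
```

Because the sampled 2 × 2 window always extends in the same direction relative to the rotated image, the four orientations together cover neighbours on all sides of each original pixel, which is the coverage gain the paragraph describes.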

According to the technical solution of the embodiment of the present disclosure, rotating the target image before performing the feature lookup based on the feature lookup tables increases the coverage of the original pixel points in the target image by the feature lookup, thereby further improving the quality of the reconstructed image.

Fig. 4 is a schematic diagram of a training method for an image reconstruction model according to an embodiment of the present disclosure, which may be applied to a case of training the image reconstruction model. The method can be executed by a training device of an image reconstruction model, which can be implemented in hardware and/or software and can be configured in an electronic device. Referring to fig. 4, the method specifically includes the following steps:

S410, extracting features of a sample image through at least two feature extraction modules in an image reconstruction model to be trained, respectively, to obtain at least two pieces of feature information; the convolution kernels of the first convolution layers in the at least two feature extraction modules are the same in size and different in expansion rate;

S420, fusing the at least two pieces of feature information through a feature fusion module in the image reconstruction model to obtain fused feature information;

S430, determining a reconstructed image of the sample image according to the fused feature information;

S440, performing iterative training on the image reconstruction model by using a loss function; the loss function is used for characterizing the difference between the reconstructed image and a real image of the sample image.

The sample image may be a low-resolution image, the real image may be a high-resolution image associated with the sample image, and the reconstructed image may be a high-resolution image determined by the image reconstruction model. The image reconstruction model may comprise at least two feature extraction modules and a feature fusion module, where each feature extraction module comprises a first convolution layer, and the convolution kernels of the first convolution layers have the same size and different expansion rates. Expanding the same-size convolution kernel of the first convolution layer by each expansion rate yields different dilated (atrous) convolution kernels; that is, the convolution kernels of the first convolution layers in the feature extraction modules are the same in size, but the dilated convolution kernels constructed through expansion differ in size. Because the feature extraction modules therefore have different receptive fields, they can extract feature maps of different scales, which improves the ability of the image reconstruction model to perceive complex textures.
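To make the relationship between kernel size, expansion rate and receptive field concrete, the following sketch lists the pixel offsets sampled by a square kernel at a given expansion rate and the side length of the region it spans. The function names are hypothetical; the span formula k + (k - 1)(r - 1) is the standard one for dilated convolution and is not quoted from the disclosure.

```python
from typing import List, Tuple


def dilated_offsets(kernel_size: int, rate: int) -> List[Tuple[int, int]]:
    """(row, col) offsets sampled by a square kernel with the given expansion rate."""
    return [(r * rate, c * rate) for r in range(kernel_size) for c in range(kernel_size)]


def effective_size(kernel_size: int, rate: int) -> int:
    """Side length of the image region spanned by the dilated kernel."""
    return kernel_size + (kernel_size - 1) * (rate - 1)


# A 2 x 2 kernel at expansion rates 1, 2 and 3.
for rate in (1, 2, 3):
    print(rate, dilated_offsets(2, rate), effective_size(2, rate))
```

For a 2 × 2 kernel the spanned region grows from 2 to 3 to 4 pixels per side as the rate goes from 1 to 3, while the number of sampled pixels stays at four, so a table indexed by those four pixels keeps the same number of index dimensions.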

After the iterative training of the image reconstruction model is completed, each feature extraction module can be used to construct the feature lookup table associated with it. Because the convolution kernel size of the first convolution layer in each feature extraction module is kept unchanged, the ability to perceive complex textures is improved without affecting the size of the feature lookup table of the feature extraction module, so the lookup efficiency of the feature lookup table is also maintained.

In an embodiment of the present disclosure, the image reconstruction model may include at least two feature extraction modules and a fusion module; each feature extraction module may include a first convolution layer, at least two residual blocks, and a second convolution layer connected in series; the expansion rate of the first convolution layer in the kth feature extraction module may be k. The feature extraction module comprising the first convolution layer, the residual block and the second convolution layer which are connected in series is used for extracting the features of the target image, so that the feature extraction capability of the feature extraction module can be improved, namely the expression capability of feature information can be improved, and the quality of a reconstructed image is further improved.

Fig. 5 is a schematic diagram of another training method for an image reconstruction model provided in an embodiment of the present disclosure, and with reference to fig. 5, the method specifically includes the following steps:

S510, performing feature extraction on the sample image through at least two feature extraction modules in the image reconstruction model to be trained, respectively, to obtain at least two pieces of feature information; the convolution kernels of the first convolution layers in the at least two feature extraction modules are the same in size and different in expansion rate;

S520, fusing the at least two pieces of feature information through a feature fusion module in the image reconstruction model to obtain fused feature information;

S530, determining a reconstructed image of the sample image according to the fused feature information;

S540, performing iterative training on the image reconstruction model by using a loss function; the loss function is used to characterize the difference between the reconstructed image and a real image of the sample image;

S550, compressing original pixel information in the sample image by using a precision compression coefficient to obtain compressed pixel information;

S560, constructing a feature lookup table for the feature extraction module according to the compressed pixel information and the feature pixel information extracted by the feature extraction module.

The feature lookup table is used to represent the mapping relationship between the input pixel information and the extracted feature pixel information.

The precision compression coefficient is used to reduce the scale of the feature lookup table, and the scale of the feature lookup table is negatively correlated with the value of the precision compression coefficient. The value of the precision compression coefficient is related to the size of the target image. A kth feature lookup table L_k may be constructed for the kth feature extraction module. The mapping relationship in the feature lookup table L_k can be expressed as L_k[2ⁱ][2ⁱ][2ⁱ][2ⁱ][S²], where i may be the precision compression coefficient; for example, for an 8-bit image, i may take the value 8, 4, or 2. The first four dimensions of the feature lookup table L_k respectively represent the values of the 4 input pixels, and S is the super-resolution multiple. As an example, take the convolution kernel size of the first convolution layer to be 2 × 2, let the values of the 4 input original pixels be A, B, C and D, and let the values of the 4 feature pixels output by the kth feature extraction module be a, b, c and d; the mapping relationship in the kth feature lookup table is then as follows:

L_k[A/2⁴][B/2⁴][C/2⁴][D/2⁴][0] = a

L_k[A/2⁴][B/2⁴][C/2⁴][D/2⁴][1] = b

L_k[A/2⁴][B/2⁴][C/2⁴][D/2⁴][2] = c

L_k[A/2⁴][B/2⁴][C/2⁴][D/2⁴][3] = d.

Precision compression further reduces the scale of the feature lookup table, which improves the efficiency of subsequent feature information lookup and therefore the efficiency of image reconstruction.
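The table construction in the example above can be sketched as follows, under stated assumptions: pixels are 8-bit, each pixel is integer-divided by 2⁴ as in the example indices A/2⁴, and `toy_extract` is a hypothetical stand-in for a trained feature extraction module that returns S² = 4 values for S = 2. The names `build_lut`, `lookup` and the `shift` parameter are illustrative and do not come from the disclosure.

```python
from itertools import product
from typing import Callable, Dict, Sequence, Tuple

Key = Tuple[int, int, int, int]


def toy_extract(pixels: Sequence[int]) -> Tuple[float, ...]:
    """Hypothetical extractor: maps 4 input pixels to 4 feature values."""
    a, b, c, d = pixels
    return ((a + b + c + d) / 4.0, float(a - b), float(a - c), float(a - d))


def build_lut(extract: Callable[[Sequence[int]], Sequence[float]],
              bit_depth: int = 8, shift: int = 4) -> Dict[Key, Tuple[float, ...]]:
    """Tabulate `extract` on a coarse grid of 4-pixel inputs."""
    levels = 1 << (bit_depth - shift)      # entries per index dimension
    lut: Dict[Key, Tuple[float, ...]] = {}
    for key in product(range(levels), repeat=4):
        # Representative input for this bin: the compressed value scaled back up.
        pixels = [q << shift for q in key]
        lut[key] = tuple(extract(pixels))
    return lut


def lookup(lut: Dict[Key, Tuple[float, ...]], pixels: Sequence[int],
           shift: int = 4) -> Tuple[float, ...]:
    """Index the table with the compressed pixels [A/2^shift][B/2^shift]..."""
    key = tuple(p >> shift for p in pixels)
    return lut[key]


lut = build_lut(toy_extract)
features = lookup(lut, [200, 100, 50, 25])
```

With these assumptions the table has 16⁴ = 65,536 keys of four values each, compared with 256⁴ keys if the 8-bit pixels were used uncompressed.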

In an optional implementation, performing feature extraction on the sample image through at least two feature extraction modules in the image reconstruction model to be trained, respectively, to obtain at least two pieces of feature information includes: rotating the sample image by at least two angles to obtain at least two rotated images; performing feature extraction on the rotated images through the at least two feature extraction modules in the image reconstruction model to be trained, respectively, to obtain rotated feature information; and performing reverse rotation on the rotated feature information and weighting the reverse rotation results to obtain the at least two pieces of feature information of the sample image.

To further enlarge the receptive field, the input sample image is rotated by 0°, 90°, 180° and 270° respectively to obtain different rotated images. For each feature extraction module in the image reconstruction model, feature extraction is performed on each rotated image through the feature extraction module to obtain the respective rotated feature information; each piece of rotated feature information is rotated back to obtain the respective reverse rotation results; and the reverse rotation results are weighted to obtain the feature information extracted by the feature extraction module.

According to the technical scheme of the embodiment of the disclosure, in the construction process of the characteristic lookup table, the scale of the characteristic lookup table can be reduced through precision compression; in the model training stage, the receptive field of feature extraction can be further increased through sample image rotation.

Fig. 6 is a schematic diagram of an image reconstruction apparatus according to an embodiment of the present disclosure, where the embodiment is applicable to enhanced reconstruction of an image, and the apparatus is configured in an electronic device and can implement an image reconstruction method according to any embodiment of the present disclosure. Referring to fig. 6, the image reconstruction apparatus 600 specifically includes the following:

a feature lookup module 610, configured to determine at least two pieces of feature information of the target image based on at least two feature lookup tables; the at least two feature lookup tables are respectively associated with at least two feature extraction modules in the image reconstruction model and are used to represent the mapping relationship between the input pixel information of the associated feature extraction module and the extracted feature pixel information; the convolution kernels of the first convolution layers in the at least two feature extraction modules are the same in size and different in expansion rate;

a fusion feature determining module 620, configured to fuse the at least two pieces of feature information through a feature fusion module in the image reconstruction model to obtain fusion feature information;

an image reconstructing module 630, configured to determine a reconstructed image of the target image according to the fusion feature information.

In an alternative embodiment, the feature lookup table of the feature extraction module is constructed according to precision compression coefficients;

the feature lookup module 610 includes:

a pixel compression unit, configured to compress original pixel information in the target image by using the precision compression coefficient to obtain first compressed pixel information and second compressed pixel information, respectively;

a pixel error unit, configured to determine a pixel error between the first compressed pixel information and the original pixel information;

a feature lookup unit, configured to look up, based on each feature lookup table, first feature information associated with the first compressed pixel information and second feature information associated with the second compressed pixel information, respectively;

and a feature interpolation unit, configured to perform interpolation according to the first feature information, the second feature information and the pixel error to obtain interpolated feature information.
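The interplay of the four units can be illustrated with a deliberately simplified sketch. It assumes a table indexed by a single pixel rather than four, a compression step of 2⁴, compressed values obtained by rounding the quotient down and up, and plain linear interpolation weighted by the pixel error; none of these specifics is stated in this passage, and `interpolated_lookup`, `shift` and the quadratic toy table are hypothetical.

```python
import math
from typing import Sequence


def interpolated_lookup(table: Sequence[float], pixel: int, shift: int = 4) -> float:
    """Look up two neighboring table entries and blend them by the pixel error."""
    step = 1 << shift
    low = pixel // step               # first compressed value (rounded down)
    high = math.ceil(pixel / step)    # second compressed value (rounded up)
    error = pixel - low * step        # pixel error relative to the first compressed value
    weight = error / step             # fraction of the way from `low` to `high`
    return (1.0 - weight) * table[low] + weight * table[high]


# Toy table: entry q stores a hypothetical feature value for the input q * 16.
# It has 17 entries so that the rounded-up index for pixel 255 is still valid.
table = [float((q * 16) ** 2) for q in range(17)]

value = interpolated_lookup(table, 24)
```

For the pixel value 24 the sketch blends the entries for 16 and 32 with equal weight, whereas a lookup that used only the rounded-down index would return the entry for 16 unchanged.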

In an optional implementation manner, the pixel compression unit is specifically configured to:

where I is the original pixel information in the target image, I^r and I^c are respectively the first compressed pixel information and the second compressed pixel information, round() denotes rounding down, ceil() denotes rounding up, i is the precision compression coefficient, and the convolution kernel of the first convolution layer is 2 × 2.

In an alternative embodiment, the feature lookup module 610 includes:

a rotation unit, configured to rotate the target image by at least two angles to obtain at least two rotated images;

an information lookup unit, configured to determine at least two pieces of feature information of each rotated image based on the at least two feature lookup tables, respectively;

and a reverse rotation unit, configured to perform reverse rotation on the at least two pieces of feature information of each rotated image and weight the reverse rotation results to obtain the at least two pieces of feature information of the target image.

In an alternative embodiment, the image reconstruction apparatus 600 is disposed at an image playing end.

According to the technical solution of this embodiment, the reconstruction of high-frequency information from low resolution to high resolution can be learned effectively based on deep learning. In the prediction stage, the feature extraction modules are converted into feature lookup tables and only the fusion module is retained, which greatly reduces the neural network computation in the prediction process and effectively improves the prediction speed; this addresses the problem that the super-resolution quality of existing end-to-end neural network methods is degraded by the limited computing power of mobile terminals. Image features of different scales are extracted by connecting first convolution layers with different expansion rates in parallel, and the image features of different scales are fused efficiently.

Fig. 7 is a schematic diagram of a training apparatus for an image reconstruction model according to an embodiment of the present disclosure. The embodiment is applicable to the case of training an image reconstruction model; the apparatus is configured in an electronic device and can implement the training method for an image reconstruction model according to any embodiment of the present disclosure. Referring to fig. 7, the training apparatus 700 for the image reconstruction model specifically includes the following:

a feature information determining module 710, configured to perform feature extraction on the sample image through at least two feature extraction modules in the image reconstruction model to be trained, respectively, to obtain at least two pieces of feature information; the convolution kernels of the first convolution layers in the at least two feature extraction modules are the same in size and different in expansion rate;

a fusion feature determining module 720, configured to fuse the at least two pieces of feature information through a feature fusion module in the image reconstruction model to obtain fused feature information;

an image reconstruction module 730, configured to determine a reconstructed image of the sample image according to the fused feature information;

an iterative training module 740, configured to perform iterative training on the image reconstruction model by using a loss function; the loss function is used to characterize the difference between the reconstructed image and a real image of the sample image.

In an alternative embodiment, the training apparatus 700 for image reconstruction model further includes a lookup table constructing module, where the lookup table constructing module includes:

a pixel compression unit, configured to compress original pixel information in the sample image by using the precision compression coefficient to obtain compressed pixel information;

and a lookup table construction unit, configured to construct a feature lookup table for the feature extraction module according to the compressed pixel information and the feature pixel information extracted by the feature extraction module.

In an optional implementation, the feature information determining module 710 includes:

an image rotation unit, configured to rotate the sample image by at least two angles to obtain at least two rotated images;

a feature extraction unit, configured to perform feature extraction on the rotated images through the at least two feature extraction modules in the image reconstruction model to be trained, respectively, to obtain rotated feature information;

and a feature reverse rotation unit, configured to perform reverse rotation on the rotated feature information and weight the reverse rotation results to obtain the at least two pieces of feature information of the sample image.

According to the technical solution of this embodiment, in the training stage, effective multi-scale image features are extracted by using a residual network with large depth and width; the fusion of features of different scales is placed at the end of the network and is implemented only through feature fusion and a 1 × 1 convolution operation, so that the feature extraction modules carry most of the computation while the fusion module requires little. Moreover, in the prediction stage the feature extraction modules are converted into lookup tables and only the fusion module is retained, which greatly reduces the neural network computation in the prediction process and effectively improves the prediction speed; this addresses the problem that the super-resolution quality of existing end-to-end neural network methods is degraded by the limited computing power of mobile terminals.
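The low cost of the fusion step can be seen in a small sketch. It assumes that fusion stacks the per-module feature maps as channels and applies a 1 × 1 convolution with a single output channel, which amounts to a per-pixel weighted sum; the function name `fuse_1x1` and the weights are hypothetical and would be learned in practice.

```python
from typing import List, Sequence

FeatureMap = List[List[float]]


def fuse_1x1(features: Sequence[FeatureMap], weights: Sequence[float],
             bias: float = 0.0) -> FeatureMap:
    """Stack feature maps as channels and apply a single-output 1 x 1 convolution."""
    if len(features) != len(weights):
        raise ValueError("one weight per feature map is required")
    h, w = len(features[0]), len(features[0][0])
    # A 1 x 1 convolution mixes channels only: each output pixel depends on the
    # same pixel position in every input feature map.
    return [[bias + sum(wt * f[y][x] for wt, f in zip(weights, features))
             for x in range(w)]
            for y in range(h)]


feat_rate_1 = [[1.0, 2.0], [3.0, 4.0]]        # e.g. looked up from the first table
feat_rate_2 = [[10.0, 20.0], [30.0, 40.0]]    # e.g. looked up from the second table
fused = fuse_1x1([feat_rate_1, feat_rate_2], weights=[0.5, 0.25])
```

Each output pixel costs one multiply-add per feature map, which is why retaining only this step at prediction time keeps the neural network computation small.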

In the technical solution of the present disclosure, the acquisition, storage, application and other processing of the personal information of the users involved all comply with the relevant laws and regulations and do not violate public order and good customs.

The present disclosure also provides an electronic device, a readable storage medium, and a computer program product according to embodiments of the present disclosure.

FIG. 8 illustrates a schematic block diagram of an example electronic device 800 that can be used to implement embodiments of the present disclosure. Electronic devices are intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other appropriate computers. The electronic device may also represent various forms of mobile devices, such as personal digital processing devices, cellular phones, smart phones, wearable devices, and other similar computing devices. The components shown herein, their connections and relationships, and their functions are meant to be examples only, and are not meant to limit implementations of the disclosure described and/or claimed herein.

As shown in fig. 8, the device 800 includes a computing unit 801 that can perform various appropriate actions and processes according to a computer program stored in a Read Only Memory (ROM) 802 or a computer program loaded from a storage unit 808 into a Random Access Memory (RAM) 803. In the RAM 803, various programs and data required for the operation of the device 800 can also be stored. The computing unit 801, the ROM 802, and the RAM 803 are connected to each other by a bus 804. An input/output (I/O) interface 805 is also connected to the bus 804.

A number of components in the device 800 are connected to the I/O interface 805, including: an input unit 806, such as a keyboard, a mouse, or the like; an output unit 807 such as various types of displays, speakers, and the like; a storage unit 808, such as a magnetic disk, optical disk, or the like; and a communication unit 809 such as a network card, modem, wireless communication transceiver, etc. The communication unit 809 allows the device 800 to exchange information/data with other devices via a computer network such as the internet and/or various telecommunication networks.

The computing unit 801 may be any of a variety of general-purpose and/or special-purpose processing components with processing and computing capabilities. Some examples of the computing unit 801 include, but are not limited to, a Central Processing Unit (CPU), a Graphics Processing Unit (GPU), various dedicated Artificial Intelligence (AI) computing chips, various computing units that run machine learning model algorithms, a digital signal processor (DSP), and any suitable processor, controller, microcontroller, and the like. The computing unit 801 executes the respective methods and processes described above, such as the image reconstruction method or the training method of the image reconstruction model. For example, in some embodiments, the image reconstruction method or the training method of the image reconstruction model may be implemented as a computer software program tangibly embodied in a machine-readable medium, such as the storage unit 808. In some embodiments, part or all of the computer program can be loaded and/or installed onto the device 800 via the ROM 802 and/or the communication unit 809. When the computer program is loaded into the RAM 803 and executed by the computing unit 801, one or more steps of the image reconstruction method or the training method of the image reconstruction model described above may be performed. Alternatively, in other embodiments, the computing unit 801 may be configured by any other suitable means (e.g., by means of firmware) to perform the image reconstruction method or the training method of the image reconstruction model.

Various implementations of the systems and techniques described above may be realized in digital electronic circuitry, integrated circuitry, Field Programmable Gate Arrays (FPGAs), Application Specific Integrated Circuits (ASICs), Application Specific Standard Products (ASSPs), systems on a chip (SOCs), complex programmable logic devices (CPLDs), computer hardware, firmware, software, and/or combinations thereof. These various implementations may include implementation in one or more computer programs that are executable and/or interpretable on a programmable system including at least one programmable processor, which may be special or general purpose, receiving data and instructions from, and transmitting data and instructions to, a storage system, at least one input device, and at least one output device.

Program code for implementing the methods of the present disclosure may be written in any combination of one or more programming languages. The program code may be provided to a processor or controller of a general purpose computer, special purpose computer, or other programmable data processing apparatus, such that the program code, when executed by the processor or controller, causes the functions/operations specified in the flowcharts and/or block diagrams to be performed. The program code may execute entirely on the machine, partly on the machine, as a stand-alone software package partly on the machine and partly on a remote machine, or entirely on the remote machine or server.

In the context of this disclosure, a machine-readable medium may be a tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. A machine-readable medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.

To provide for interaction with a user, the systems and techniques described here can be implemented on a computer having: a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to a user; and a keyboard and a pointing device (e.g., a mouse or a trackball) by which a user can provide input to the computer. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user can be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user may be received in any form, including acoustic, speech, or tactile input.

The systems and techniques described here can be implemented in a computing system that includes a back-end component (e.g., as a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include Local Area Networks (LANs), Wide Area Networks (WANs), blockchain networks, and the internet.

The computer system may include clients and servers. A client and server are generally remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs executing on the respective computers and having a client-server relationship to each other. The server can be a cloud server, also called a cloud computing server or a cloud host, and is a host product in a cloud computing service system, so that the defects of high management difficulty and weak service expansibility in the traditional physical host and VPS service are overcome.

It should be understood that various forms of the flows shown above may be used, with steps reordered, added, or deleted. For example, the steps described in the present disclosure may be executed in parallel, sequentially, or in different orders, as long as the desired results of the technical solutions disclosed in the present disclosure can be achieved, and the present disclosure is not limited herein.

The above detailed description should not be construed as limiting the scope of the disclosure. It should be understood by those skilled in the art that various modifications, combinations, sub-combinations and substitutions may be made in accordance with design requirements and other factors. Any modification, equivalent replacement, and improvement made within the spirit and principle of the present disclosure should be included in the scope of protection of the present disclosure.

Claims

1. An image reconstruction method, comprising:

2. The method of claim 1, wherein a first convolution layer in a kth feature extraction module has an expansion rate of k, the feature extraction module further comprising at least two residual blocks and a second convolution layer in series with the first convolution layer.

3. The method of claim 1, wherein the feature lookup table of the feature extraction module is constructed from precision compression coefficients;

the determining at least two feature information of the target image based on the at least two feature lookup tables includes:

compressing original pixel information in the target image by using the precision compression coefficient to respectively obtain first compressed pixel information and second compressed pixel information;

determining a pixel error between the first compressed pixel information and the original pixel information;

based on each feature lookup table, respectively looking up first feature information associated with the first compressed pixel information and second feature information associated with the second compressed pixel information;

and carrying out interpolation according to the first characteristic information, the second characteristic information and the pixel error to obtain interpolation characteristic information.

4. The method of claim 3, wherein the compressing original pixel information in the target image by using the precision compression coefficient to obtain first compressed pixel information and second compressed pixel information comprises:

5. The method of claim 1, wherein the determining at least two feature information of the target image based on at least two feature lookup tables comprises:

rotating the target image by at least two angles to obtain at least two rotated images;

determining at least two pieces of feature information of each rotated image based on the at least two feature lookup tables, respectively;

and performing reverse rotation on the at least two pieces of feature information of each rotated image, and weighting the reverse rotation results to obtain the at least two pieces of feature information of the target image.

6. The method of claim 1, wherein the execution subject of the method is an image player.

7. A training method for an image reconstruction model, comprising:

8. The method of claim 7, wherein a first convolution layer in a kth feature extraction module has an expansion rate of k, the feature extraction module further comprising at least two residual blocks and a second convolution layer in series with the first convolution layer.

9. The method of claim 7, after iteratively training the image reconstruction model using the loss function, further comprising:

compressing original pixel information in the sample image by adopting a precision compression coefficient to obtain compressed pixel information;

and constructing a feature lookup table for the feature extraction module according to the compressed pixel information and the feature pixel information extracted by the feature extraction module.

10. The method according to claim 7, wherein the performing feature extraction on the sample image by at least two feature extraction modules in the image reconstruction model to be trained respectively to obtain at least two pieces of feature information comprises:

rotating the sample image by at least two angles to obtain at least two rotated images;

performing feature extraction on the rotated images through the at least two feature extraction modules in the image reconstruction model to be trained, respectively, to obtain rotated feature information;

and performing reverse rotation on the rotated feature information, and weighting the reverse rotation results to obtain the at least two pieces of feature information of the sample image.

11. An image reconstruction apparatus comprising:

12. The apparatus of claim 11, wherein a first convolutional layer in a kth feature extraction module has an expansion rate of k, the feature extraction module further comprising at least two residual blocks and a second convolutional layer in series with the first convolutional layer.

13. The apparatus of claim 11, wherein the feature lookup table of the feature extraction module is constructed from precision compression coefficients;

the feature lookup module comprises:

14. The apparatus of claim 13, wherein the pixel compression unit is specifically configured to:

15. The apparatus of claim 11, wherein the feature lookup module comprises:

16. The apparatus of claim 11, wherein the apparatus is configured at an image playback end.

17. An apparatus for training an image reconstruction model, comprising:

18. The apparatus of claim 17, wherein a first convolutional layer in a kth feature extraction module has an expansion rate of k, the feature extraction module further comprising at least two residual blocks and a second convolutional layer in series with the first convolutional layer.

19. The apparatus of claim 17, further comprising a lookup table construction module comprising:

20. The apparatus of claim 17, wherein the feature information determination module comprises:

21. An electronic device, comprising:

at least one processor; and

the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method of any one of claims 1-10.

22. A non-transitory computer readable storage medium having stored thereon computer instructions for causing the computer to perform the method of any one of claims 1-10.

23. A computer program product comprising a computer program which, when executed by a processor, implements the method according to any one of claims 1-10.