CN118710501A - Training method of image super-resolution reconstruction system, image reconstruction method and system
- Publication number
- CN118710501A (CN202411186704.2A)
- Authority
- CN
- China
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Landscapes
- Image Analysis (AREA)
- Image Processing (AREA)
Abstract
The invention provides a training method of an image super-resolution reconstruction system, an image reconstruction method and a system. The training method comprises, for any degradation task: acquiring a plurality of images to be reconstructed corresponding to the degradation task; and, based on a backbone network, training a prompt mapping generation module according to each image to be reconstructed and the predetermined task identifier of the degradation task to obtain a prompt mapping function corresponding to the degradation task, thereby obtaining a trained image super-resolution reconstruction system. The trained system dynamically constructs adaptive prompt information and adaptively provides customized context information for different degradation tasks. It significantly reduces the number of trainable model parameters while keeping feature diversity unchanged, improves the flexibility, adaptability and operating efficiency of the model, effectively alleviates the knowledge forgetting that is common in cross-domain applications, and realizes continuous learning.
Description
Technical Field
The invention relates to the technical field of computer vision, in particular to a training method of an image super-resolution reconstruction system, an image reconstruction method and an image reconstruction system.
Background
Image super-resolution technology, a key means of improving image quality, plays a significant role in many fields such as natural image analysis and medical imaging. Its fundamental purpose is to recover high-resolution details from low-resolution images, enhancing visual experience and analytical accuracy. While numerous approaches have brought significant advances to the field, most are tailored to specific scenes and generalize poorly across scenes and image types. Furthermore, ever-changing data requirements force models to learn new scenarios continuously, which places a heavy burden on computing resources.
In recent years, prompt-based continuous learning strategies have achieved some success in tasks such as image classification, where a multi-layer perceptron serves as an adaptive prompt generator that balances retaining knowledge of historical tasks against learning new tasks. However, this approach has limited effect on super-resolution tasks, which demand pixel-level accuracy and involve complex degradation modes. Traditional prompt generation strategies struggle to maintain the balance between existing knowledge and new-task learning while also covering the multi-dimensional detail requirements of degradation tasks. Significant differences between image data in continuous image super-resolution, particularly inconsistencies in low-level features such as texture and color, further increase the difficulty of cross-task transfer learning. In addition, image degradation mechanisms are diverse and complex, ranging from simple interpolation to the specialized degradation found in medical imaging, which further widens the domain gap between tasks.
Disclosure of Invention
In order to solve the above problems in the prior art, the present invention provides a training method of an image super-resolution reconstruction system, an image reconstruction method and a system, which specifically include:
In a first aspect, the invention provides a training method of an image super-resolution reconstruction system, which is applied to the image super-resolution reconstruction system, wherein the image super-resolution reconstruction system comprises a backbone network and a prompt mapping generation module;
The method comprises the following steps:
For any degradation task:
acquiring a plurality of images to be reconstructed corresponding to the degradation task;
Based on the backbone network, training a prompt mapping generation module according to each image to be reconstructed and task identification of a predetermined degradation task to obtain a prompt mapping function corresponding to the degradation task, and further obtaining a trained image super-resolution reconstruction system.
In a second aspect, the present invention provides an image reconstruction method, comprising:
Acquiring an image to be reconstructed;
extracting query characteristics of an image to be reconstructed;
Determining a task corresponding to the image to be reconstructed according to the query characteristics of the image to be reconstructed and a query characteristic group corresponding to a plurality of predetermined degradation tasks;
Inputting the query features of the image to be reconstructed and the task corresponding to the image to be reconstructed into the prompt mapping generation module of an image super-resolution reconstruction system trained according to any training method provided in the first aspect, so that the prompt mapping generation module obtains prompt information corresponding to the image to be reconstructed according to the query features of the image to be reconstructed, the task corresponding to the image to be reconstructed, and the pre-trained prompt mapping functions corresponding to the various degradation tasks;
Inputting the query features of the image to be reconstructed and the prompt information corresponding to the image to be reconstructed into the backbone network of the image super-resolution reconstruction system trained according to any training method provided in the first aspect, so that the backbone network reconstructs the image to be reconstructed according to the query features of the image to be reconstructed and the prompt information corresponding to the image to be reconstructed.
In a third aspect, the present invention further provides an image super-resolution reconstruction system, including:
The system comprises an image preprocessing module, a task matching module, a prompt mapping module and a backbone network;
The image preprocessing module is used for acquiring an image to be reconstructed and extracting query characteristics of the image to be reconstructed;
The task matching module is used for determining a task corresponding to the image to be reconstructed according to the query characteristics of the image to be reconstructed extracted by the image preprocessing module and the stored query characteristic group corresponding to the predetermined multiple degradation tasks;
The prompt mapping module is used for obtaining prompt information corresponding to the image to be reconstructed according to the query features of the image to be reconstructed extracted by the image preprocessing module, the task corresponding to the image to be reconstructed determined by the task matching module, and the pre-trained prompt mapping functions corresponding to the various degradation tasks;
And the backbone network is used for reconstructing the image to be reconstructed according to the query characteristics of the image to be reconstructed extracted by the image preprocessing module and the prompt information corresponding to the image to be reconstructed obtained by the prompt mapping module.
The invention has the beneficial effects that:
In the training method of the image super-resolution reconstruction system, the image reconstruction method and the system provided by the invention, the training method, for any degradation task, acquires a plurality of images to be reconstructed corresponding to the degradation task and, based on a backbone network, trains a prompt mapping generation module according to each image to be reconstructed and the predetermined task identifier of the degradation task, obtaining a prompt mapping function corresponding to the degradation task and thereby a trained image super-resolution reconstruction system.
The present invention will be described in further detail with reference to the accompanying drawings and examples.
Drawings
FIG. 1 is a schematic flow chart of a training method of an image super-resolution reconstruction system provided by the invention;
FIG. 2 is a schematic diagram of a window-based Transformer module according to the present invention;
FIG. 3 is a schematic flow chart of an image reconstruction method according to the present invention;
Fig. 4 is a schematic diagram of an image reconstruction system according to the present invention.
Detailed Description
The present invention will be described in further detail with reference to specific examples, but embodiments of the present invention are not limited thereto.
Fig. 1 is a flow chart of the training method of an image super-resolution reconstruction system provided by the invention. The method is applied to an image super-resolution reconstruction system that comprises a backbone network and a prompt mapping generation module.
As shown in fig. 1, for any degradation task, the method includes:
S101, acquiring a plurality of images to be reconstructed corresponding to the degradation task.
The degradation tasks correspond to different image degradation situations.
Image degradation refers to degradation of the quality of an acquired image during the formation, recording, processing, and transmission of the image due to imperfections in the imaging system, recording apparatus, transmission medium, processing methods, and the like. This phenomenon is manifested as blurring, distortion, noise-containing, or the like of an image. The causes of image degradation are various, and include motion blur caused by movement of an object or a photographing device, blur caused by long-time exposure, out-of-focus, blur caused by light angle, blur caused by atmospheric disturbance, and the like. Furthermore, too short an exposure time results in too few photons captured by the camera and also causes image degradation. Image degradation not only affects the sharpness and detail of the image, but may also mask important information in the image, thereby affecting the analysis and application of the image.
S102, based on the backbone network, training the prompt mapping generation module according to each image to be reconstructed and the predetermined task identifier of the degradation task to obtain a prompt mapping function corresponding to the degradation task, thereby obtaining a trained image super-resolution reconstruction system.
The method can train and obtain the prompt mapping function corresponding to each degradation task, and establish the corresponding relation between the degradation task and the prompt mapping function.
Illustratively, before S102, the query feature groups corresponding to multiple degradation tasks and the task identifiers corresponding to the query feature groups are determined and stored, so as to establish a correspondence among the degradation tasks, the query feature groups, and the task identifiers. After the query features of an image to be reconstructed are obtained, they are matched against the query feature groups corresponding to the degradation tasks; if the matching succeeds, the image to be reconstructed is determined to belong to the degradation task whose query feature group was matched. The task identifier corresponding to the image to be reconstructed is then determined according to the correspondence between degradation tasks and task identifiers.
When the query feature groups and task identifications corresponding to the multiple degradation tasks are determined, the query feature groups and task identifications corresponding to each degradation task can be sequentially determined, or the query feature groups and task identifications of different degradation tasks can be synchronously determined.
In one possible implementation, determining the task identifier of the degradation task includes the following steps A1-A3:
A1, respectively acquiring a plurality of training images corresponding to a plurality of degradation tasks.
A2, respectively extracting the query characteristics of each training image to obtain a query characteristic group corresponding to each degradation task.
Specifically, query features of a plurality of training images corresponding to each degradation task are extracted respectively, and query features of all training images corresponding to each degradation task are query feature groups corresponding to the task.
A3, clustering the query feature group of each degradation task respectively, and determining each cluster center as a task identifier of the corresponding degradation task.
Optionally, the K-Means clustering algorithm is used to cluster the query feature group of each task, and each cluster center represents a generalization of one type of image features.
Illustratively, the query feature of the $i$-th training image $x_i^t$ from the $t$-th degradation task is expressed as $q_i^t \in \mathbb{R}^{D}$, where $D$ represents the dimension of the feature space and $\mathbb{R}^{D}$ represents the feature space of dimension $D$. Subsequently, the K-Means clustering algorithm is applied to divide the query features $\{q_i^t\}_{i=1}^{N_t}$ of all $N_t$ training images from the $t$-th degradation task into $K$ clusters. The task identifiers are then formed by the $K$ cluster centers, where $N_t$ represents the total number of training images corresponding to the $t$-th degradation task.
The task identification determined by the method not only effectively distinguishes the characteristics of different degraded tasks, but also can be used for accurately matching the input image to be reconstructed to the corresponding task category, thereby ensuring the high efficiency and the accuracy of task identification.
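Illustratively, steps A1-A3 can be sketched as follows. This is a minimal sketch under stated assumptions rather than the implementation of the invention: the function name `kmeans_task_identifiers`, the toy two-dimensional query features, the initialization from the first K features, and the fixed number of iterations are all hypothetical choices made only for the example.

```python
from typing import List

Vector = List[float]


def squared_distance(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean_vector(vectors: List[Vector]) -> Vector:
    """Component-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def kmeans_task_identifiers(features: List[Vector], k: int, iterations: int = 20) -> List[Vector]:
    """Cluster one task's query feature group and return the K cluster centers.

    Assumption: the centers start from the first k features so that the
    example is deterministic; this is not prescribed by the text.
    """
    centers = [list(f) for f in features[:k]]
    for _ in range(iterations):
        groups: List[List[Vector]] = [[] for _ in range(k)]
        for f in features:
            nearest = min(range(k), key=lambda j: squared_distance(f, centers[j]))
            groups[nearest].append(f)
        # Keep the previous center when a cluster receives no features.
        centers = [mean_vector(g) if g else centers[j] for j, g in enumerate(groups)]
    return centers


# Toy query feature group of one degradation task (hypothetical values).
query_features = [[0.0, 0.1], [5.0, 5.1], [0.2, 0.0], [5.2, 4.9], [0.1, 0.2], [4.9, 5.0]]
task_identifiers = kmeans_task_identifiers(query_features, k=2)
```

Each returned center plays the role of one task identifier of the degradation task; the number of clusters is a free parameter of the sketch.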
Further, in one possible implementation manner, based on the backbone network, according to each image to be reconstructed and a task identifier of a predetermined degradation task, a prompt map generating module is trained to obtain a prompt map function corresponding to the degradation task, which includes the following steps B1-B5:
B1, in the nth round of training, extracting the query features of the nth image to be reconstructed, wherein the nth image to be reconstructed is any one of the images to be reconstructed.
Wherein n is a positive integer greater than or equal to 1.
Optionally, extracting query features of the image to be reconstructed includes: extracting the query features of the image to be reconstructed with a Contrastive Language-Image Pre-training (CLIP) model, i.e., the CLIP image encoder.
Wherein the normalized output of the last layer linear transformation of the CLIP image encoder is defined as the query feature of a single image.
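Illustratively, the normalization of the encoder output can be sketched as follows. The text does not state which normalization is used; the sketch assumes L2 normalization, and the helper name `l2_normalize` and the two-dimensional stand-in for the encoder output are hypothetical.

```python
import math
from typing import List


def l2_normalize(feature: List[float], eps: float = 1e-12) -> List[float]:
    """Scale a feature vector to unit Euclidean length (assumed normalization)."""
    norm = math.sqrt(sum(v * v for v in feature))
    return [v / max(norm, eps) for v in feature]


# Stand-in for the output of the encoder's last linear layer (hypothetical values).
raw_feature = [3.0, 4.0]
query_feature = l2_normalize(raw_feature)
```

With unit-length query features, an inner product between two features equals their cosine similarity, which keeps later matching steps scale-independent.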
B2, constructing the nth prompt mapping function according to the query features of the nth image to be reconstructed, the predetermined task identifier of the degradation task, and the nth prompt mapping base.
The nth prompt mapping base is obtained through n-1 rounds of training.
Specifically, the prompt mapping base is a linear transformation represented by a two-dimensional matrix of size $d_{in} \times d_{out}$, where $d_{in}$ represents the input dimension of the prompt mapping base and $d_{out}$ represents the output dimension of the prompt mapping base.
In one possible implementation, the prompt mapping base is a linear transformation, and the prompt mapping base satisfies:
$$W_k^t \in \mathbb{R}^{L \times d_{in} \times d_{out}},$$
wherein $W_k^t$ represents the prompt mapping base corresponding to the $k$-th task identifier of the $t$-th degradation task, $\mathbb{R}^{L \times d_{in} \times d_{out}}$ represents a three-dimensional matrix over the real number domain with dimensions $L$, $d_{in}$ and $d_{out}$, $L$ represents the total number of layers of the window-based Transformer modules, $d_{in}$ represents the input dimension of the prompt mapping base, and $d_{out}$ represents the output dimension of the prompt mapping base.
To limit the growth of trainable parameters, the prompt mapping base in the above implementation may be further replaced by two low-rank matrices. Specifically, in another possible implementation, the prompt mapping base is expressed as:
$$W_k^t = A_k^t B^t,$$
wherein $A_k^t \in \mathbb{R}^{L \times d_{in} \times R}$, $B^t \in \mathbb{R}^{L \times R \times d_{out}}$, $W_k^t$ represents the prompt mapping base corresponding to the $k$-th task identifier of the $t$-th degradation task, $A_k^t$ represents the weight matrix corresponding to the $k$-th task identifier of the $t$-th degradation task, $B^t$ represents the shared matrix corresponding to all task identifiers of the $t$-th degradation task, $R$ represents the intermediate dimension, $R$ is far smaller than $d_{in}$ and $d_{out}$, $L$ represents the total number of layers of the window-based Transformer modules, $d_{in}$ represents the input dimension of the prompt mapping base, $d_{out}$ represents the output dimension of the prompt mapping base, $\mathbb{R}^{L \times d_{in} \times R}$ represents a three-dimensional matrix over the real number domain with dimensions $L$, $d_{in}$ and $R$, $\mathbb{R}^{L \times R \times d_{out}}$ represents a three-dimensional matrix over the real number domain with dimensions $L$, $R$ and $d_{out}$, $k$ represents the index of the task identifier corresponding to each degradation task, and $t$ represents the index of the degradation task.
In this way, the number of trainable parameters is greatly reduced, and the matrix decomposition gives the parameters stronger expressive power. In particular, $A_k^t$, as the weight matrix specific to each task identifier, focuses on capturing multi-level detailed knowledge within each degradation task, while the shared matrix $B^t$ aggregates the generalized information among all clusters of a degradation task, forming an efficient task-level knowledge framework. Setting the intermediate dimension $R$ much smaller than the input dimension $d_{in}$ and the output dimension $d_{out}$ ensures that a prompt mapping base can be generated from only a small amount of training data, which further makes the model more lightweight and improves learning and training efficiency.
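Illustratively, the parameter saving of the low-rank form can be checked with simple arithmetic. The sketch below assumes that one degradation task has a given number of task identifiers, that each identifier owns one weight matrix of size layers x d_in x rank, and that all identifiers of the task share one matrix of size layers x rank x d_out. The concrete sizes (4 identifiers, 36 layers, width 180, rank 8) and the function names are hypothetical.

```python
def full_base_params(num_identifiers: int, layers: int, d_in: int, d_out: int) -> int:
    """Parameters when every task identifier owns a full layers x d_in x d_out base."""
    return num_identifiers * layers * d_in * d_out


def low_rank_base_params(num_identifiers: int, layers: int, d_in: int, d_out: int, rank: int) -> int:
    """Parameters when each identifier owns a layers x d_in x rank weight matrix
    and all identifiers of the task share one layers x rank x d_out matrix."""
    per_identifier = num_identifiers * layers * d_in * rank
    shared = layers * rank * d_out
    return per_identifier + shared


# Hypothetical sizes chosen only for illustration.
full = full_base_params(num_identifiers=4, layers=36, d_in=180, d_out=180)
low_rank = low_rank_base_params(num_identifiers=4, layers=36, d_in=180, d_out=180, rank=8)
reduction = full / low_rank
```

Under these assumed sizes the low-rank form needs 259,200 parameters instead of 4,665,600, an 18-fold reduction; the saving grows as the rank shrinks relative to the input and output dimensions.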
Further, in one possible implementation, the prompt mapping function is expressed as:
$$P_i^t = \sum_{k=1}^{K} \langle q_i^t, c_k^t \rangle \, W_k^t,$$
wherein $P_i^t$ represents the prompt mapping function corresponding to the $i$-th image of the $t$-th degradation task, $\langle q_i^t, c_k^t \rangle$ represents the inner product of $q_i^t$ and $c_k^t$, $q_i^t$ represents the query feature corresponding to the $i$-th image of the $t$-th degradation task, $c_k^t$ represents the $k$-th task identifier of the $t$-th degradation task, $W_k^t$ represents the prompt mapping base corresponding to the $k$-th task identifier of the $t$-th degradation task, $i$ represents the index of the image corresponding to each degradation task, $t$ represents the index of the degradation task, $k$ represents the index of the task identifier corresponding to each degradation task, and $k = 1, \ldots, K$.
This method builds a strong mechanism against catastrophic forgetting by combining multi-granularity task identifiers with low-rank prompt mapping bases and a shared low-rank matrix. The design not only promotes efficient absorption of new knowledge but also effectively retains what was learned on previous tasks, so that the model maintains overall task adaptability and long-term performance stability as it continues to evolve.
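Illustratively, one way to build a prompt mapping function from a query feature, the task identifiers and low-rank prompt mapping bases is sketched below. It assumes that the inner product between the query feature and each task identifier weights that identifier's base and that the weighted bases are summed; this combination rule, the layer-by-layer matrix product, the function name `prompt_mapping` and all toy values are assumptions of the sketch.

```python
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def dot(a: Vector, b: Vector) -> float:
    """Inner product of two vectors."""
    return sum(x * y for x, y in zip(a, b))


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Plain matrix product of a (n x r) and b (r x m)."""
    return [[sum(a[i][r] * b[r][j] for r in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]


def prompt_mapping(query: Vector, identifiers: List[Vector],
                   weights: List[List[Matrix]], shared: List[Matrix]) -> List[Matrix]:
    """Return one d_in x d_out prompt matrix per layer.

    weights[k][l] is the d_in x rank matrix of identifier k at layer l;
    shared[l] is the rank x d_out matrix shared by all identifiers at layer l.
    """
    num_layers = len(shared)
    d_in, d_out = len(weights[0][0]), len(shared[0][0])
    prompt = [[[0.0] * d_out for _ in range(d_in)] for _ in range(num_layers)]
    for identifier, weight in zip(identifiers, weights):
        similarity = dot(query, identifier)  # inner product used as the mixing weight
        for layer in range(num_layers):
            base = matmul(weight[layer], shared[layer])  # low-rank base of this identifier
            for i in range(d_in):
                for j in range(d_out):
                    prompt[layer][i][j] += similarity * base[i][j]
    return prompt


# Toy setup: 2 identifiers, 1 layer, d_in = 2, rank = 1, d_out = 3 (hypothetical values).
query = [1.0, 0.0]
identifiers = [[1.0, 0.0], [0.0, 1.0]]
weights = [[[[1.0], [2.0]]], [[[3.0], [4.0]]]]
shared = [[[1.0, 1.0, 1.0]]]
prompt = prompt_mapping(query, identifiers, weights, shared)
```

In the toy setup the query is aligned with the first identifier and orthogonal to the second, so only the first identifier's base contributes to the result.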
B3, reconstructing the nth image to be reconstructed based on the backbone network according to the query features of the nth image to be reconstructed and the nth prompt mapping function.
Specifically, the query features of the nth image to be reconstructed and the nth prompt mapping function are input into the backbone network to obtain the reconstructed nth image.
The SwinIR backbone model adapts flexibly to super-resolution processing of various input images. It consists of a head convolution, a feature enhancement module with a long-range skip residual connection, and an up-sampling module based on the pixel shuffle operation. The feature enhancement module includes six feature enhancement stages, each containing six window-based Transformer modules. In each Transformer module, the linear layers extract the query, key and value parameters used in the self-attention mechanism.
Based on the above characteristics, the SwinIR backbone model can be used as the backbone network.
In general, continuous learning methods are designed for classification tasks and cannot be directly transferred to image tasks. To adapt specifically to degradation tasks, the invention adjusts the key and value variables through the prompt mapping function, so that continuous learning can be applied to the image domain.
As shown in fig. 2, in one possible implementation, the backbone network adopts the SwinIR backbone model, and the feature enhancement module of the backbone network includes multiple feature enhancement stages, each feature enhancement stage including multiple layers of window-based Transformer modules; the parameters output by the linear layers in any layer of window-based Transformer module are expressed as:
$$K_l^{t,i} = X_l^{t,i} \left( W_l^{K} + P_i^t[l] \right),$$
$$V_l^{t,i} = X_l^{t,i} \left( W_l^{V} + P_i^t[l] \right),$$
wherein $K_l^{t,i}$ represents the key output by the $l$-th layer window-based Transformer module for the $i$-th image of the $t$-th degradation task, $X_l^{t,i}$ represents the feature information of the $i$-th image of the $t$-th degradation task that is input to the $l$-th layer window-based Transformer module (i.e., the feature information output by the $(l-1)$-th layer window-based Transformer module), $W_l^{K}$ represents the key weight matrix corresponding to the $l$-th layer window-based Transformer module, $P_i^t$ represents the prompt mapping function corresponding to the $i$-th image of the $t$-th degradation task, $V_l^{t,i}$ represents the value output by the $l$-th layer window-based Transformer module for the $i$-th image of the $t$-th degradation task, $W_l^{V}$ represents the value weight matrix corresponding to the $l$-th layer window-based Transformer module, $[\cdot]$ represents an index operation, $l$ represents the index of the window-based Transformer module, $t$ represents the index of the degradation task, and $i$ represents the index of the image corresponding to each degradation task.
As above, for the $l$-th layer window-based Transformer module, given the input feature $X_l^{t,i} \in \mathbb{R}^{M \times d_{in}}$ and the weight matrices $W_l^{K}, W_l^{V} \in \mathbb{R}^{d_{in} \times d_{out}}$, where $M$ represents one dimension of the input feature whose physical meaning can be set according to actual conditions, the key and value variables of the window-based Transformer module can be adaptively adjusted according to the prompt mapping functions corresponding to different images to be reconstructed, thereby realizing continuous learning.
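Illustratively, the adjustment of keys and values in one window-based Transformer layer can be sketched as follows. The sketch assumes that the layer's slice of the prompt mapping function is added to the key and value weight matrices before the input features are projected; this additive form, the function name `prompted_projection` and the toy matrices are assumptions, not a statement of the exact formula of the invention.

```python
from typing import List

Matrix = List[List[float]]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Plain matrix product of a (n x r) and b (r x m)."""
    return [[sum(a[i][r] * b[r][j] for r in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]


def matadd(a: Matrix, b: Matrix) -> Matrix:
    """Element-wise sum of two matrices of equal shape."""
    return [[x + y for x, y in zip(row_a, row_b)] for row_a, row_b in zip(a, b)]


def prompted_projection(features: Matrix, weight: Matrix, prompt_layer: Matrix) -> Matrix:
    """Project M x d_in features with a d_in x d_out weight shifted by the prompt slice."""
    return matmul(features, matadd(weight, prompt_layer))


# Toy layer: M = 1 token, d_in = d_out = 2 (hypothetical values).
features = [[1.0, 2.0]]
key_weight = [[1.0, 0.0], [0.0, 1.0]]
value_weight = [[2.0, 0.0], [0.0, 2.0]]
prompt_layer = [[0.5, 0.0], [0.0, 0.5]]  # slice of the prompt mapping function for this layer

keys = prompted_projection(features, key_weight, prompt_layer)
values = prompted_projection(features, value_weight, prompt_layer)
```

In this sketch the key and value weight matrices themselves stay unchanged, so only the prompt slice differs between images and tasks.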
B4, when the reconstruction effect of the nth image to be reconstructed does not meet the preset condition, updating the current prompt mapping base according to the reconstruction effect of the nth image to be reconstructed to obtain the (n+1)th prompt mapping base, and executing the (n+1)th round of training, until the reconstruction effect obtained in the x-th round of training meets the preset condition.
Wherein x is a positive integer greater than or equal to n+1.
It should be noted that the reconstruction effects and preset conditions differ for images to be reconstructed that correspond to different degradation tasks. For example, for an overexposed image, the reconstruction effect may be the degree to which image content is completed, and the preset condition may be a preset degree of image content completion; for an image blurred by a low pixel count, the reconstruction effect may be the resolution improvement, and the preset condition may be a preset resolution.
B5, determining the prompt mapping function corresponding to the x-th round of training as the prompt mapping function corresponding to the degradation task.
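Illustratively, the control flow of steps B1-B5 can be sketched with a deliberately small toy model. Everything concrete here is an assumption: the backbone is replaced by a fixed scalar gain, the prompt mapping base is a single scalar, the reconstruction effect is a squared error, the update is a gradient step, and the names `reconstruct` and `train_prompt_base` are hypothetical.

```python
from typing import List, Tuple


def reconstruct(degraded: float, base: float) -> float:
    """Toy stand-in for the backbone: a fixed gain of 1 plus a learnable prompt offset."""
    return degraded * (1.0 + base)


def train_prompt_base(pairs: List[Tuple[float, float]], learning_rate: float = 0.1,
                      threshold: float = 1e-6, max_rounds: int = 1000) -> Tuple[float, int]:
    """Update only the prompt base until the reconstruction error meets the preset condition."""
    base = 0.0
    for round_index in range(1, max_rounds + 1):
        degraded, target = pairs[(round_index - 1) % len(pairs)]  # B1: pick the nth image
        error = reconstruct(degraded, base) - target              # B2-B3: build prompt, reconstruct
        if error ** 2 <= threshold:                                # B4: preset condition met
            return base, round_index                               # B5: keep the current base
        base -= learning_rate * 2.0 * error * degraded             # B4: update the base
    return base, max_rounds


# Toy degradation: every target is twice its degraded value (hypothetical data).
training_pairs = [(1.0, 2.0), (2.0, 4.0), (0.5, 1.0)]
learned_base, rounds_used = train_prompt_base(training_pairs)
```

The toy backbone gain is never modified inside the loop, which mirrors training the prompt mapping generation module on top of a given backbone network; whether the backbone weights are frozen in practice is not stated in the text.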
The training method provided by the invention is applied to an image super-resolution reconstruction system comprising a backbone network and a prompt mapping generation module. For any degradation task, the method acquires a plurality of images to be reconstructed corresponding to the degradation task and, based on the backbone network, trains the prompt mapping generation module according to each image to be reconstructed and the predetermined task identifier of the degradation task, obtaining a prompt mapping function corresponding to the degradation task and thereby a trained image super-resolution reconstruction system that can continuously learn different degradation tasks. The method dynamically constructs adaptive prompt information and adaptively provides customized context information for different degradation tasks. It significantly reduces the number of trainable model parameters while keeping feature diversity unchanged, improves the flexibility, adaptability and operating efficiency of the model, effectively alleviates the knowledge forgetting that is common in cross-domain applications, and realizes continuous learning.
Fig. 3 is a schematic flow chart of an image reconstruction method provided by the present invention, as shown in fig. 3, the method includes:
S301, acquiring an image to be reconstructed.
The image to be reconstructed is the image on which super-resolution reconstruction is to be performed.
The image to be reconstructed may be acquired by any type of image acquisition device; for example, it may be a photograph taken by a camera, or a medical image such as a nuclear magnetic resonance image or a B-mode ultrasound image acquired by a medical device.
S302, extracting query characteristics of the image to be reconstructed.
Optionally, extracting query features of the image to be reconstructed includes: extracting the query features of the image to be reconstructed through the CLIP image encoder.
Wherein the normalized output of the last layer linear transformation of the CLIP image encoder is defined as the query feature of a single image.
S303, determining a degradation task corresponding to the image to be reconstructed according to the query feature of the image to be reconstructed and a query feature set corresponding to a plurality of degradation tasks determined in advance.
For how the query feature groups corresponding to the multiple degradation tasks are determined, refer to the description of determining the task identifier of a degradation task in the above method embodiment; details are not repeated here.
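Illustratively, the matching in S303 can be sketched as a nearest-center search. The text does not specify the matching criterion; the sketch assumes that each degradation task stores its cluster centers and that the task whose center has the highest cosine similarity to the query feature is selected. The task names, the stored centers and the function name `match_task` are hypothetical.

```python
import math
from typing import Dict, List

Vector = List[float]


def cosine_similarity(a: Vector, b: Vector) -> float:
    """Cosine similarity; returns 0.0 when either vector has zero length."""
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return sum(x * y for x, y in zip(a, b)) / (norm_a * norm_b)


def match_task(query: Vector, stored_centers: Dict[str, List[Vector]]) -> str:
    """Return the task whose best-matching center is most similar to the query."""
    if not stored_centers:
        raise ValueError("no degradation tasks have been stored")
    best_task, best_score = "", -math.inf
    for task, centers in stored_centers.items():
        score = max(cosine_similarity(query, center) for center in centers)
        if score > best_score:
            best_task, best_score = task, score
    return best_task


# Hypothetical stored cluster centers for two degradation tasks.
stored_centers = {
    "blur": [[1.0, 0.0], [0.9, 0.1]],
    "noise": [[0.0, 1.0]],
}
matched_task = match_task([0.8, 0.2], stored_centers)
```

A deployed system might also reject a match whose best score falls below a threshold, which corresponds to an unsuccessful matching.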
S304, inputting the query features of the image to be reconstructed and the degradation task corresponding to the image to be reconstructed into the prompt mapping generation module of an image super-resolution reconstruction system trained according to any training method provided by the invention, so that the prompt mapping generation module obtains prompt information corresponding to the image to be reconstructed according to the query features of the image to be reconstructed, the task corresponding to the image to be reconstructed, and the pre-trained prompt mapping functions corresponding to the degradation tasks.
It should be noted that, as described in the embodiment of the training method, a mapping relationship exists among the degradation tasks, the query feature groups, the task identifiers, and the prompt mapping functions. The task identifier corresponding to the image to be reconstructed may therefore also be determined according to the query features of the image to be reconstructed and the task identifiers corresponding to the plurality of predetermined degradation tasks. In that case, the query features of the image to be reconstructed and the task identifier corresponding to the image to be reconstructed are input into the prompt mapping generation module of an image super-resolution reconstruction system trained according to any training method provided by the invention, so that the prompt mapping generation module obtains prompt information corresponding to the image to be reconstructed according to the query features of the image to be reconstructed, the task identifier corresponding to the image to be reconstructed, and the pre-trained prompt mapping functions corresponding to the various degradation tasks.
S305, inputting query characteristics of the image to be reconstructed and prompt information corresponding to the image to be reconstructed into a backbone network in the image super-resolution reconstruction system obtained by training according to the training method of any image super-resolution reconstruction system provided by the invention, so that the backbone network reconstructs the image to be reconstructed according to the query characteristics of the image to be reconstructed and the prompt information corresponding to the image to be reconstructed.
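The task-matching step that precedes S304 can be sketched as follows. This is a minimal illustration only, not the method of the invention: it assumes that matching is done by cosine similarity between the query feature and the stored task identifiers (the text does not state the similarity measure), and the names `cosine_similarity`, `match_task`, `task_a` and `task_b` are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def match_task(query_feature, task_identifiers):
    """Return (task name, score) for the best-matching task identifier.

    task_identifiers maps a task name to a list of identifier vectors,
    for example the cluster centers of that task's query features.
    Using cosine similarity here is an assumption of this sketch.
    """
    best_task, best_score = None, -math.inf
    for task, identifiers in task_identifiers.items():
        for identifier in identifiers:
            score = cosine_similarity(query_feature, identifier)
            if score > best_score:
                best_task, best_score = task, score
    return best_task, best_score


# Toy 3-dimensional features; real query features would be much longer.
task_identifiers = {
    "task_a": [[1.0, 0.0, 0.0], [0.9, 0.1, 0.0]],
    "task_b": [[0.0, 1.0, 0.0]],
}
task, score = match_task([0.1, 0.8, 0.1], task_identifiers)
print(task, round(score, 3))
```

The matched task (or the matched task identifier itself) is what S304 passes to the prompt mapping generation module together with the query feature.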
The invention also provides an image super-resolution reconstruction system which, referring to FIG. 4, comprises:
an image preprocessing module 41, a task matching module 42, a prompt mapping module 43 and a backbone network 44.
The image preprocessing module 41 is configured to acquire an image to be reconstructed and extract query features of the image to be reconstructed. FIG. 4 labels the image to be reconstructed and its query features with symbols that are not reproduced in this text.
The task matching module 42 is configured to determine the task corresponding to the image to be reconstructed according to the query features of the image to be reconstructed extracted by the image preprocessing module and the stored query feature groups corresponding to the predetermined multiple degradation tasks.
The prompt mapping module 43 is configured to obtain prompt information corresponding to the image to be reconstructed according to the query features of the image to be reconstructed extracted by the image preprocessing module, the task corresponding to the image to be reconstructed determined by the task matching module, and the prompt mapping function corresponding to each degradation task obtained by pre-training.
The backbone network 44 is configured to reconstruct the image to be reconstructed according to the query features of the image to be reconstructed extracted by the image preprocessing module and the prompt information corresponding to the image to be reconstructed obtained by the prompt mapping module. In FIG. 4, CONV denotes a convolutional layer and WTB denotes a window-based Transformer module; the symbol used for the super-resolution reconstructed image is not reproduced in this text.
It should be noted that the modules of the prompt-based image super-resolution reconstruction system may be deployed in the same electronic device or in different electronic devices. When they are deployed in different electronic devices, those devices may be communicatively connected in any manner.
For the system embodiment, since it is substantially similar to the method embodiment, the description is relatively simple, and details, advantages, and the like may be found in the part of the description of the method embodiment.
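The data flow among modules 41 to 44 can be sketched as follows. This is a hedged illustration of the wiring only: the class name `SuperResolutionSystem`, its field names, and the stand-in lambdas are hypothetical and do not implement CLIP feature extraction, task matching, or a SwinIR backbone.

```python
from dataclasses import dataclass
from typing import Any, Callable, Sequence

Feature = Sequence[float]


@dataclass
class SuperResolutionSystem:
    """Wires modules 41-44 together; each module is an injected callable."""

    preprocess: Callable[[Any], Feature]           # module 41: image -> query feature
    match_task: Callable[[Feature], str]           # module 42: query feature -> task
    prompt_mapping: Callable[[Feature, str], Any]  # module 43: (feature, task) -> prompt information
    backbone: Callable[[Feature, Any], Any]        # module 44: (feature, prompt information) -> output

    def reconstruct(self, image: Any) -> Any:
        query = self.preprocess(image)
        task = self.match_task(query)
        prompt = self.prompt_mapping(query, task)
        return self.backbone(query, prompt)


# Stand-in modules that only demonstrate the order of the calls.
system = SuperResolutionSystem(
    preprocess=lambda image: [float(v) for v in image],
    match_task=lambda query: "task_a" if sum(query) < 1.0 else "task_b",
    prompt_mapping=lambda query, task: {"task": task, "scale": 2.0},
    backbone=lambda query, prompt: [v * prompt["scale"] for v in query],
)
result = system.reconstruct([0.1, 0.2, 0.3])
print(result)
```

Because each module is passed in as a callable, the same wiring applies whether the modules run in one electronic device or are reached over a connection between several devices.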
In order to further prove the beneficial effects of the invention, the invention also provides simulation data, which are specifically as follows:
1. Simulation conditions
The simulation experiment was carried out with the open-source PyTorch 1.13.13 from Facebook (USA) on a machine with an Intel XEON E5-2680V4 209 CPU, an NVIDIA RTX 3090 GPU and the Ubuntu 16.04 operating system. The databases used were the NYU indoor image database created in 2012 by Nathan Silberman and Rob Fergus, the RealSR real-image database proposed by Cai et al., the DIV2K database released for the NTIRE single-image super-resolution challenge, and the IXI database of magnetic resonance scans from three different hospitals in London.
The low-resolution images in the four databases are generated using four different degradation modes, respectively: bilateral interpolation, real-world degradation, complex degradation formed by random combinations, and frequency-domain downsampling degradation.
2. Simulation content
The simulation experiment compares the invention in depth with CODA-P, an existing prompt strategy for continual learning. CODA-P learns a set of prompt components through an attention-guided key-query mechanism and combines them dynamically, with weights conditioned on the input, to form adaptive prompts. As shown in Table 1, after continual training on the four databases, the prompt-based image reconstruction method provided by the invention achieves a lower forgetting rate (reduced by 5.5%) and improves the average peak signal-to-noise ratio and the average structural similarity by 3.48% and 17.7%, respectively.
Table 1. Experimental results of continual training on four datasets
The experimental results in Table 1 show that, by means of the adaptive prompt mapping function, the prompt-based image reconstruction method provided by the invention copes flexibly with diverse image degradation modes while maintaining a lower forgetting rate, which verifies the technical advantages of the invention.
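For reference, the peak signal-to-noise ratio reported above follows the standard definition PSNR = 10·log10(MAX² / MSE). The sketch below computes it for flat lists of pixel values. The text does not define the forgetting rate, so `average_forgetting` is only one common continual-learning definition (the mean drop from each task's best score to its score after the final task) and is an assumption, as are the function names and the toy numbers.

```python
import math


def psnr(reference, reconstructed, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equal-sized pixel lists."""
    if len(reference) != len(reconstructed) or not reference:
        raise ValueError("images must be non-empty and the same size")
    mse = sum((r - x) ** 2 for r, x in zip(reference, reconstructed)) / len(reference)
    if mse == 0.0:
        return math.inf
    return 10.0 * math.log10(max_value ** 2 / mse)


def average_forgetting(best_scores, final_scores):
    """Mean drop from each task's best score to its score after the last task.

    Assumed definition; the source does not say how forgetting is measured.
    """
    drops = [best - final for best, final in zip(best_scores, final_scores)]
    return sum(drops) / len(drops)


# Toy 4-pixel "images" and toy per-task PSNR scores.
reference = [52, 55, 61, 59]
reconstructed = [54, 55, 60, 58]
value = psnr(reference, reconstructed)
forgetting = average_forgetting([30.0, 28.0, 32.0], [29.0, 27.5, 31.5])
print(round(value, 2), round(forgetting, 3))
```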
The terms "first," "second," and the like, are used for descriptive purposes only and are not to be construed as indicating or implying relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defining "a first" or "a second" may explicitly or implicitly include one or more such feature. In the description of the present invention, the meaning of "a plurality" is two or more, unless explicitly defined otherwise.
The foregoing is a further detailed description of the invention in connection with the preferred embodiments, and it is not intended that the invention be limited to the specific embodiments described. It will be apparent to those skilled in the art that several simple deductions or substitutions may be made without departing from the spirit of the invention, and these should be considered to be within the scope of the invention.
Claims (10)
1. The training method of the image super-resolution reconstruction system is characterized by being applied to the image super-resolution reconstruction system, wherein the image super-resolution reconstruction system comprises a backbone network and a prompt mapping generation module;
the method comprises the following steps:
For any degradation task:
Acquiring a plurality of images to be reconstructed corresponding to the degradation task;
Based on the backbone network, training the prompt map generating module according to each image to be reconstructed and the task identification of the degradation task, and obtaining a prompt map function corresponding to the degradation task, thereby obtaining a trained image super-resolution reconstruction system.
2. The method according to claim 1, wherein training the prompt mapping generation module, based on the backbone network, according to each image to be reconstructed and the predetermined task identifier of the degradation task to obtain the prompt mapping function corresponding to the degradation task comprises:
In the nth training, extracting query characteristics of an nth image to be reconstructed, wherein the nth image to be reconstructed is any image to be reconstructed, and n is a positive integer greater than or equal to 1;
Constructing an nth prompt mapping function according to the query characteristics of the nth image to be reconstructed, the task identification of the degradation task and the nth prompt mapping base, wherein the nth prompt mapping base is a prompt mapping base obtained by n-1 round training;
Reconstructing the nth image to be reconstructed according to the query characteristics of the nth image to be reconstructed and the nth prompt mapping function based on the backbone network;
When the reconstruction effect of the nth image to be reconstructed does not meet the preset condition, updating the current prompt mapping base according to the reconstruction effect of the nth image to be reconstructed to obtain an (n+1)th prompt mapping base, and executing the (n+1)th training, until the reconstruction effect obtained by the xth training meets the preset condition, wherein x is a positive integer greater than or equal to n+1;
and determining a prompt mapping function corresponding to the x-th training as the prompt mapping function corresponding to the degradation task.
3. The method according to claim 1 or 2, wherein determining a task identity of the degraded task comprises:
Respectively acquiring a plurality of training images corresponding to a plurality of degradation tasks;
Respectively extracting query characteristics of each training image to obtain a query characteristic group corresponding to each degradation task;
and clustering each query feature group separately, and determining each cluster center as a task identifier of the corresponding degradation task.
4. The method of claim 1 or 2, wherein the backbone network employs a SwinIR backbone model, the feature enhancement module of the backbone network comprises a plurality of feature enhancement stages, and each feature enhancement stage comprises multiple layers of window-based Transformer modules;
The parameters output by the linear layers in the window-based Transformer module at any layer are expressed by a key formula and a value formula (the two formulas are not reproduced in this text),
wherein $K_{t,i}^{l}$ denotes the key output by the $l$-th layer window-based Transformer module for the $i$-th image of the $t$-th degradation task, $h_{t,i}^{l}$ denotes the feature information of the $i$-th image of the $t$-th degradation task input to the $l$-th layer window-based Transformer module, $W_{K}^{l}$ denotes the key weight matrix of the $l$-th layer window-based Transformer module, $P_{t,i}$ denotes the prompt mapping function corresponding to the $i$-th image of the $t$-th degradation task, $V_{t,i}^{l}$ denotes the value output by the $l$-th layer window-based Transformer module for the $i$-th image of the $t$-th degradation task, $W_{V}^{l}$ denotes the value weight matrix of the $l$-th layer window-based Transformer module, $\mathrm{Index}(\cdot)$ denotes an indexing operation, $l$ denotes the index of the window-based Transformer module, $t$ denotes the index of the degradation task, and $i$ denotes the index of the image corresponding to each degradation task.
5. The method of claim 2, wherein the prompt mapping base is a linear transformer,
The prompt mapping base satisfies:
$$M_{t,j} \in \mathbb{R}^{L \times d_{in} \times d_{out}},$$
wherein $M_{t,j}$ denotes the prompt mapping base corresponding to the $j$-th task identifier of the $t$-th degradation task, $\mathbb{R}^{L \times d_{in} \times d_{out}}$ denotes a three-dimensional matrix over the real numbers with dimensions $L$, $d_{in}$ and $d_{out}$, $L$ denotes the total number of layers of the window-based Transformer modules, $d_{in}$ denotes the input dimension of the prompt mapping base, and $d_{out}$ denotes the output dimension of the prompt mapping base.
6. The method of claim 2, wherein the prompt mapping base is a linear transformer,
The prompt mapping base is expressed as:
$$M_{t,j} = A_{t,j} B_{t},$$
wherein $A_{t,j} \in \mathbb{R}^{L \times d_{in} \times r}$, $B_{t} \in \mathbb{R}^{L \times r \times d_{out}}$, $M_{t,j}$ denotes the prompt mapping base corresponding to the $j$-th task identifier of the $t$-th degradation task, $A_{t,j}$ denotes the weight matrix corresponding to the $j$-th task identifier of the $t$-th degradation task, $B_{t}$ denotes the shared weight matrix corresponding to every task identifier of the $t$-th degradation task, $r$ denotes an intermediate variable that is far smaller than $d_{in}$ and $d_{out}$, $L$ denotes the total number of layers of the window-based Transformer modules, $d_{in}$ denotes the input dimension of the prompt mapping base, $d_{out}$ denotes the output dimension of the prompt mapping base, $\mathbb{R}^{L \times d_{in} \times r}$ denotes a three-dimensional matrix over the real numbers with dimensions $L$, $d_{in}$ and $r$, $\mathbb{R}^{L \times r \times d_{out}}$ denotes a three-dimensional matrix over the real numbers with dimensions $L$, $r$ and $d_{out}$, $j$ denotes the index of the task identifiers corresponding to each degradation task, and $t$ denotes the index of the degradation task.
7. The method according to claim 1 or 2, characterized in that the prompt mapping function is expressed as:
$$P_{t,i} = \sum_{j} \langle q_{t,i}, k_{t,j} \rangle \, M_{t,j},$$
wherein $P_{t,i}$ denotes the prompt mapping function corresponding to the $i$-th image of the $t$-th degradation task, $\langle q_{t,i}, k_{t,j} \rangle$ denotes the inner product of $q_{t,i}$ and $k_{t,j}$, $q_{t,i}$ denotes the query feature of the $i$-th image of the $t$-th degradation task, $k_{t,j}$ denotes the $j$-th task identifier of the $t$-th degradation task, $M_{t,j}$ denotes the prompt mapping base corresponding to the $j$-th task identifier of the $t$-th degradation task, $i$ denotes the index of the image corresponding to each degradation task, $t$ denotes the index of the degradation task, and $j$ denotes the index of the task identifiers corresponding to each degradation task.
8. The method of claim 2, wherein the extracting query features of the image to be reconstructed comprises:
And extracting the query characteristics of the image to be reconstructed through a CLIP image encoder.
9. An image reconstruction method, comprising:
Acquiring an image to be reconstructed;
extracting query characteristics of the image to be reconstructed;
Determining a task corresponding to the image to be reconstructed according to the query characteristics of the image to be reconstructed and a query characteristic group corresponding to a plurality of degradation tasks which are determined in advance;
Inputting the query characteristics of the image to be reconstructed and the task corresponding to the image to be reconstructed into the prompt mapping generation module of an image super-resolution reconstruction system trained according to the training method of the image super-resolution reconstruction system as claimed in any one of claims 1 to 8, so that the prompt mapping generation module obtains the prompt information corresponding to the image to be reconstructed according to the query characteristics of the image to be reconstructed, the task corresponding to the image to be reconstructed, and the prompt mapping functions, obtained by pre-training, that correspond to the various degradation tasks;
Inputting the query characteristics of the image to be reconstructed and the prompt information corresponding to the image to be reconstructed into a backbone network in the image super-resolution reconstruction system obtained by training according to the training method of the image super-resolution reconstruction system as claimed in any one of claims 1 to 8, so that the backbone network reconstructs the image to be reconstructed according to the query characteristics of the image to be reconstructed and the prompt information corresponding to the image to be reconstructed.
10. An image super-resolution reconstruction system, comprising:
The system comprises an image preprocessing module, a task matching module, a prompt mapping module and a backbone network;
The image preprocessing module is used for acquiring an image to be reconstructed and extracting query characteristics of the image to be reconstructed;
The task matching module is used for determining a task corresponding to the image to be reconstructed according to the query characteristics of the image to be reconstructed extracted by the image preprocessing module and a stored query characteristic group corresponding to a plurality of degradation tasks which are determined in advance;
The prompt mapping module is used for obtaining the prompt information corresponding to the image to be reconstructed according to the query characteristics of the image to be reconstructed extracted by the image preprocessing module, the task corresponding to the image to be reconstructed determined by the task matching module, and the prompt mapping functions, obtained by pre-training, that correspond to the various degradation tasks;
the backbone network is used for reconstructing the image to be reconstructed according to the query characteristics of the image to be reconstructed extracted by the image preprocessing module and the prompt information corresponding to the image to be reconstructed obtained by the prompt mapping module.
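The low-rank prompt mapping base of claim 6 and the prompt mapping function of claim 7 can be illustrated for a single layer as follows. This is a minimal sketch under stated assumptions, not the claimed implementation: it assumes that the mapping function is the sum of the mapping bases weighted by the inner product of the query feature with each task identifier, and that each base is the product of an identifier-specific weight matrix and a weight matrix shared by all identifiers of the task. The function names and toy values are hypothetical.

```python
def matmul(a, b):
    """Multiply two matrices given as nested lists (a: m x r, b: r x n)."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def inner(u, v):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(u, v))


def prompt_mapping(query, identifiers, specific_weights, shared_weight):
    """Similarity-weighted sum of low-rank mapping bases for one layer.

    Each base is specific_weights[j] (d_in x r) times shared_weight (r x d_out).
    The weighted-sum form is an assumption of this sketch.
    """
    d_in, d_out = len(specific_weights[0]), len(shared_weight[0])
    result = [[0.0] * d_out for _ in range(d_in)]
    for identifier, specific in zip(identifiers, specific_weights):
        weight = inner(query, identifier)
        base = matmul(specific, shared_weight)
        for i in range(d_in):
            for j in range(d_out):
                result[i][j] += weight * base[i][j]
    return result


# Toy sizes: two task identifiers, d_in = 2, r = 1, d_out = 3.
query = [1.0, 0.0]
identifiers = [[1.0, 0.0], [0.0, 1.0]]
specific_weights = [[[1.0], [2.0]], [[3.0], [4.0]]]
shared_weight = [[1.0, 0.5, 0.0]]
mapping = prompt_mapping(query, identifiers, specific_weights, shared_weight)
print(mapping)
```

With the intermediate dimension far smaller than the input and output dimensions, each additional task identifier stores only a d_in x r matrix per layer instead of a full d_in x d_out matrix, which is the saving that the factorization in claim 6 provides.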
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202411186704.2A CN118710501A (en) | 2024-08-28 | 2024-08-28 | Training method of image super-resolution reconstruction system, image reconstruction method and system |
Publications (1)
Publication Number | Publication Date |
---|---|
CN118710501A true CN118710501A (en) | 2024-09-27 |
Family
ID=92806044
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202411186704.2A Pending CN118710501A (en) | 2024-08-28 | 2024-08-28 | Training method of image super-resolution reconstruction system, image reconstruction method and system |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN118710501A (en) |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN115482573A (en) * | 2022-09-29 | 2022-12-16 | 歌尔科技有限公司 | Facial expression recognition method, device and equipment and readable storage medium |
WO2023098688A1 (en) * | 2021-12-03 | 2023-06-08 | 华为技术有限公司 | Image encoding and decoding method and device |
Non-Patent Citations (3)
Title |
---|
JIANG, A et al.: "DaLPSR: Leverage Degradation-Aligned Language Prompt for Real-World Image Super-Resolution", 15 August 2024 (2024-08-15) * |
XU Wenbo; SUN Guangling; LU Xiaofeng: "Face image super-resolution reconstruction guided by a pre-trained network", Industrial Control Computer, no. 06, 25 June 2020 (2020-06-25) * |
WEN Yuanbo et al.: "Weather-degraded image restoration based on visual prompt learning", Chinese Journal of Computers, 27 June 2024 (2024-06-27) * |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination |