CN116740501A - Training method and application of image blurring region restoration compensation model

Training method and application of image blurring region restoration compensation model

Info

Publication number
CN116740501A
CN116740501A
Authority
CN
China
Prior art keywords
model
image
training
loss
glm
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310706302.XA
Other languages
Chinese (zh)
Inventor
郁强
韩致远
王国梁
来佳飞
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
CCI China Co Ltd
Original Assignee
CCI China Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by CCI China Co Ltd filed Critical CCI China Co Ltd
Priority to CN202310706302.XA priority Critical patent/CN116740501A/en
Publication of CN116740501A publication Critical patent/CN116740501A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/774Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/0464Convolutional networks [CNN, ConvNet]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • G06N3/09Supervised learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N5/00Computing arrangements using knowledge-based models
    • G06N5/02Knowledge representation; Symbolic representation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/10Segmentation; Edge detection
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/20Image preprocessing
    • G06V10/26Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/10Image acquisition modality
    • G06T2207/10024Color image
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20081Training; Learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20084Artificial neural networks [ANN]
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02TCLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00Road transport of goods or passengers
    • Y02T10/10Internal combustion engine [ICE] based vehicles
    • Y02T10/40Engine management systems


Abstract

The application provides a training method and application of an image blur area restoration compensation model, comprising three steps: training sample selection, GLM model parameter initialization, and Mask training. Taking the GLM-6B model as the basis, a target detection module and a target segmentation module are added to the GLM-6B model and trained with a knowledge distillation method, so that the GLM-6B model can handle image tasks; then, after an FFN module is added to the GLM-6B model, the GLM-6B model is trained in stages using the reconstruction loss and the perceptual loss, so that it can restore and compensate blurred image areas with high accuracy and generalization.

Description

Training method and application of image blurring region restoration compensation model
Technical Field
The application relates to the field of image processing, in particular to a training method and application of an image blurring region restoration compensation model.
Background
A camera is a device that converts image signals into electrical signals and is widely used in mobile phones, computers, surveillance, medical treatment, and other fields. The imaging quality of a camera depends not only on internal factors such as the sensor, circuitry, and algorithms, but is also affected by external factors such as the lens and glass cover. The lens is one of the most important components of a camera: it determines parameters such as the viewing angle, aperture, and focal length, and it is in direct contact with the external environment, so it is easily contaminated by dust, water droplets, fingerprints, and other stains. Stains on the lens degrade the imaging quality of the camera, mainly manifesting as whole or partial blurring, darkening, and distortion of the image; these phenomena reduce indicators such as sharpness, contrast, and color reproduction, and thus affect the acquisition and use of image information by the user. Therefore, how to effectively remove or reduce image blurring caused by stains on the lens is a difficult problem to be solved in the field of image processing.
Currently, solutions to the problem of image blurring caused by stains on the lens mainly fall into the following categories:
(1) Physical cleaning, i.e., physically removing the stains on the lens by wiping, blowing, scraping, or other means. This method is straightforward but has drawbacks: it requires manual operation by the user, which is time-consuming and labor-intensive; it may scratch or damage the lens surface; stains on the lens cannot be found and treated in time; stains on the inner lens or under the glass cover cannot be treated; and it only applies when stains are found before an image is taken.
(2) Optical compensation, i.e., adjusting optical elements or parameters of the lens or sensor, such as the aperture, focal length, and exposure time. This approach can alleviate the image blur problem to some extent, but also has drawbacks: the user needs a certain level of photographic expertise to adjust the lens manually or automatically, which increases operational complexity; it may introduce other optical problems such as vignetting, distortion, and noise; it cannot completely eliminate image blurring; and it cannot adapt to image requirements under different scenes and conditions.
(3) Digital processing, i.e., applying digital signal processing, such as filtering, enhancement, and restoration algorithms, to the image signal acquired by the camera to improve image sharpness. This approach allows the image to be optimized and restored after capture, but also has drawbacks: it consumes significant computing resources and time; it may introduce other image problems such as aliasing, distortion, and false colors; it cannot completely restore the original image information; and it cannot adapt to image requirements of different types and styles.
In summary, the prior art has clear limitations with respect to image blurring caused by stains on the lens and cannot achieve the ideal effect.
Disclosure of Invention
The embodiments of the application provide a training method and application of an image blur area restoration compensation model, which exploit the strong generative and understanding capability of the GLM-6B model to effectively compensate the sharpness of image areas blurred by camera-lens stains, so as to obtain a high-definition image.
In a first aspect, an embodiment of the present application provides a training method for an image blur area restoration compensation model, including the following steps:
Training sample selection: selecting images with marked repair areas as training images, and performing image vectorization on the training images to obtain image vectors;
GLM model parameter initialization: constructing a segmentation model as the teacher model and a first model as the student model, where the first model is based on the GLM-6B model with a parallel target detection module and target segmentation module connected to the feature exit layer of the GLM-6B model; inputting the training images into the teacher model to obtain the teacher output, and inputting the image vectors into the student model to obtain the student output; calculating the knowledge distillation loss between the teacher output and the student output, calculating the supervised loss between the student output and the marked blurred areas, and weighting the knowledge distillation loss and the supervised loss to obtain the first model loss; and training the parameters of the GLM-6B model with minimization of the first model loss as the objective;
Mask training: connecting an FFN module to the feature exit layer of the parameter-initialized GLM-6B model to obtain a second model; randomly generating masks of the same scale as the repair areas, overlaying them, and vectorizing to obtain mask vectors; inputting the mask vectors and the image vectors into the second model to calculate the perceptual loss and the reconstruction loss, and weighting the perceptual loss and the reconstruction loss to obtain the second model loss; and training the GLM-6B model in stages according to the second model loss to obtain the trained image blur area restoration compensation model.
In a second aspect, an embodiment of the present application provides an application method of the image blur area restoration compensation model, including: inputting an image to be repaired into the image blur area restoration compensation model trained by any one of the above training methods, to obtain an image with the blurred area repaired.
In a third aspect, an embodiment of the present application provides an electronic device, including a memory and a processor, where the memory stores a computer program, and the processor is configured to run the computer program to perform the training method of the image blur area restoration compensation model.
In a fourth aspect, embodiments of the present application provide a readable storage medium storing a computer program, the computer program comprising program code for controlling a process to execute a process comprising any one of the above training methods of the image blur area restoration compensation model.
The main contributions and innovation points of the application are as follows:
the embodiment of the application provides a training method and application of an image blurring region restoration compensation model, which are based on a GLM-6B model, and the GLM-6B model can process image tasks by adding a target detection module and a target segmentation module to the GLM-6B model and training by using a knowledge distillation method; and then, after the FFN module is added to the GLM-6B model, the GLB-6B model is trained by stages by utilizing reconstruction loss and perception loss, so that the GLM-6B model can be used for repairing and supplementing the image fuzzy area and has higher accuracy and generalization.
The details of one or more embodiments of the application are set forth in the accompanying drawings and the description below; other features, objects, and advantages of the application will become more apparent from them.
Drawings
The accompanying drawings, which are included to provide a further understanding of the application and are incorporated in and constitute a part of this specification, illustrate embodiments of the application and together with the description serve to explain the application and do not constitute a limitation on the application. In the drawings:
FIG. 1 is a flow chart of a training method of an image blur area restoration compensation model according to an embodiment of the present application;
FIG. 2 is a schematic structural view of a first model according to the present application;
FIG. 3 is a schematic illustration of image vectors corresponding to a first model;
FIG. 4 is a schematic structural view of a second model according to the present application;
FIG. 5 is a schematic illustration of image vectors corresponding to a second model;
FIG. 6 is a schematic diagram of a hardware structure of an electronic device according to an embodiment of the present application.
Detailed Description
Reference will now be made in detail to exemplary embodiments, examples of which are illustrated in the accompanying drawings. When the following description refers to the accompanying drawings, the same numbers in different drawings refer to the same or similar elements, unless otherwise indicated. The implementations described in the following exemplary embodiments do not represent all implementations consistent with one or more embodiments of the present specification. Rather, they are merely examples of apparatus and methods consistent with aspects of one or more embodiments of the present description as detailed in the accompanying claims.
It should be noted that: in other embodiments, the steps of the corresponding method are not necessarily performed in the order shown and described in this specification. In some other embodiments, the method may include more or fewer steps than described in this specification. Furthermore, individual steps described in this specification, in other embodiments, may be described as being split into multiple steps; while various steps described in this specification may be combined into a single step in other embodiments.
Example 1
The image blur area restoration compensation model trained by this scheme can restore image areas blurred by lens stains, with high accuracy and generalization. To facilitate understanding of the improvements in this scheme, the GLM-6B model is first described below:
the GLM-6B model is a Chinese-English language model with a trillion parameter scale, and is optimized for Chinese, and has 62 billion parameters based on a General Language Model (GLM) architecture. By combining with a model quantization technology, a user can perform local deployment on a consumer-level display card, and the minimum only needs 6GB of video memory under the INT4 quantization level, and the model can realize multiple functions of question-answering, literature creation, mail assistant, role playing and the like, and support longer sequence length and human intention alignment training.
However, the pre-trained parameters of the GLM-series models currently on the market are all extracted from the characteristics of text information. If the features of image information were used directly to train a GLM-series model on the blur area restoration compensation task, the training would converge extremely poorly. This scheme therefore optimizes the training procedure of the GLM-6B model so that it can repair blurred image areas well and produce images that meet user requirements.
Specifically, as shown in FIG. 1, this scheme provides a training method for an image blur area restoration compensation model, which includes the following steps:
Training sample selection: selecting images with marked repair areas as training images, and performing image vectorization on the training images to obtain image vectors;
GLM model parameter initialization: constructing a segmentation model as the teacher model and a first model as the student model, where the first model is based on the GLM-6B model with a parallel target detection module and target segmentation module connected to the feature exit layer of the GLM-6B model; inputting the training images into the teacher model to obtain the teacher output, and inputting the image vectors into the student model to obtain the student output; calculating the knowledge distillation loss between the teacher output and the student output, calculating the supervised loss between the student output and the marked blurred areas, and weighting the knowledge distillation loss and the supervised loss to obtain the first model loss; and training the parameters of the GLM-6B model with minimization of the first model loss as the objective;
Mask training: connecting an FFN module to the feature exit layer of the parameter-initialized GLM-6B model to obtain a second model; randomly generating masks of the same scale as the repair areas, overlaying them, and vectorizing to obtain mask vectors; inputting the mask vectors and the image vectors into the second model to calculate the perceptual loss and the reconstruction loss, and weighting the perceptual loss and the reconstruction loss to obtain the second model loss; and training the GLM-6B model in stages according to the second model loss to obtain the trained image blur area restoration compensation model.
It should be noted that the image blur area restoration compensation model provided by this scheme has the same model structure as the GLM-6B model. The existing GLM-6B model is suited to long text sequences and training for alignment with human intent; the pre-trained parameters of GLM-series models are extracted for text-information features and cannot be used directly for the image blur area restoration task. To address this deficiency, this scheme provides a training method for the GLM-6B model, and the trained GLM-6B model can serve as an image blur area restoration compensation model for restoring and compensating blurred areas in images.
Because the GLM-series pre-trained parameters are extracted for text-information features, using them directly for the image blur area restoration compensation task converges poorly on image tasks. To solve this problem, this scheme adopts knowledge distillation with detection and segmentation dual-task outlets, pre-training the GLM-6B model on images with marked repair areas taken from public datasets to initialize its parameters.
Specifically, in the GLM model parameter initialization stage, a pre-trained segmentation model capable of detecting and segmenting objects is selected as the teacher model; in some embodiments, the Mask R-CNN model may be chosen as the segmentation model. The segmentation model of this scheme is pre-trained with images whose repair areas are marked, and the pre-trained segmentation model can detect and segment the repair areas in images, so that it outputs target detection and target segmentation indicators during GLM model parameter initialization.
As shown in FIG. 2, the publicly available GLM-6B model is selected as the base model, and a parallel target detection module and target segmentation module are connected to its feature exit layer, so that training in the image modality can be realized. Specifically, the feature exit layer of GLM-6B outputs a feature matrix; the feature matrix is convolved by multiple convolution layers and then input into the target detection module and the target segmentation module respectively. The features input into the target detection module pass through a fully connected layer to obtain a classification result and a regression target box, which are then input into the target segmentation module and processed to obtain the segmentation matrix.
It should be noted that the target detection module and the target segmentation module are given small parameter counts and shallow convolution stacks, because the GLM model parameter initialization stage is mainly used to initialize the parameters of the GLM-6B model, and the two modules are no longer used in subsequent training.
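For concreteness, the following is a minimal PyTorch sketch of the detection and segmentation heads described above. The hidden size of 4096, the reshaping of GLM features into a 2-D map, the channel widths, and all module names are illustrative assumptions rather than the patent's actual configuration.

```python
import torch
import torch.nn as nn

class DetectionSegmentationHeads(nn.Module):
    """Small heads attached to the GLM-6B feature exit layer; they are
    deliberately lightweight because they are discarded after the
    parameter-initialization stage."""
    def __init__(self, feat_dim=4096, hidden=256, num_classes=2):
        super().__init__()
        # Shallow shared convolution stack over the (reshaped) feature matrix.
        self.shared_conv = nn.Sequential(
            nn.Conv2d(feat_dim, hidden, 3, padding=1), nn.ReLU(),
            nn.Conv2d(hidden, hidden, 3, padding=1), nn.ReLU(),
        )
        # Detection branch: fully connected layer -> class scores + box.
        self.det_fc = nn.Sequential(nn.Flatten(), nn.LazyLinear(256), nn.ReLU())
        self.cls_out = nn.Linear(256, num_classes)  # classification result
        self.box_out = nn.Linear(256, 4)            # regression target box
        # Segmentation branch: here driven by the shared features alone, a
        # simplification of the text, where the detection outputs also feed
        # the segmentation module.
        self.seg_conv = nn.Conv2d(hidden, 1, 1)

    def forward(self, feats):  # feats: (B, feat_dim, H, W)
        h = self.shared_conv(feats)
        d = self.det_fc(h)
        return self.cls_out(d), self.box_out(d), torch.sigmoid(self.seg_conv(h))
```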
In addition, since the GLM model is suited to training on one-dimensional text vectors, this scheme requires image vectorization of the training images fed to the GLM model. As shown in FIG. 3, in the step of "performing image vectorization on a training image to obtain an image vector", the training image is decomposed into a pixel-level vector matrix in units of pixels, the vector matrix is split according to a required specification to obtain a number of split units, and a module serial number is inserted at the starting position of each split unit to obtain the image vector.
Each pixel represents a point in the image with a specific color or gray value, so each image can be decomposed into a vector matrix in units of pixels, each element of which holds the gray value of the corresponding pixel. The required specification of this scheme is a matrix specification, which may be 1×1, 3×3, or 5×5 and can be adjusted according to actual requirements. In general, the smaller the required specification, the finer the edge accuracy of the trained model on image tasks, and the greater the computational cost.
Specifically, the vector matrix is split in a splitting order to obtain split units, and the corresponding module serial number is inserted at the starting position of each split unit to obtain a one-dimensional image vector, where the module serial number identifies the matrix of the splitting specification corresponding to the current split unit. As shown in FIG. 3, if the vector matrix is split with the 1×1 specification, each pixel is ordered in turn as an element of the image vector, yielding a one-dimensional image vector.
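As a sketch of this vectorization, the following illustrates the 1×1 required specification on a grayscale image: each pixel becomes one split unit, and a module serial number is inserted at the start of each unit. The function name and the use of consecutive integers as serial numbers are assumptions for illustration.

```python
import numpy as np

def image_to_vector(img: np.ndarray, unit: int = 1) -> np.ndarray:
    """Flatten an HxW grayscale image into a 1-D vector of
    [serial, pixel, pixel, ...] blocks, one block per unit x unit tile."""
    h, w = img.shape
    assert h % unit == 0 and w % unit == 0, "image must tile evenly"
    blocks, serial = [], 0
    for r in range(0, h, unit):
        for c in range(0, w, unit):
            tile = img[r:r + unit, c:c + unit].reshape(-1)
            blocks.append(np.concatenate(([serial], tile)))
            serial += 1
    return np.concatenate(blocks)

# Example: a 4x4 image at the 1x1 specification yields 16 blocks of
# [serial_number, gray_value].
vec = image_to_vector(np.arange(16, dtype=np.int64).reshape(4, 4))
```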
In the step of "calculating the knowledge distillation loss between the teacher output and the student output, calculating the supervised loss between the student output and the marked blurred areas, and weighting the knowledge distillation loss and the supervised loss to obtain the first model loss", the knowledge distillation loss and the supervised loss are each weighted by a hyperparameter, and the two losses are added to obtain the first model loss.
The specific first model loss is formulated as follows:

$L_{student} = \alpha L_{distill} + \beta L_{supervise}$

where $L_{distill}$ denotes the knowledge distillation loss, $L_{supervise}$ denotes the supervised loss, and $\alpha$ and $\beta$ are the relative weights of the knowledge distillation loss and the supervised loss, respectively.
In the knowledge distillation loss, $y_s$ denotes the student model output and $y_t$ denotes the teacher model output; in the supervised loss, $y_s$ denotes the student model output and $y$ denotes the truly marked blurred area.
The parameters of the GLM-6B model are trained with minimization of the total loss as the objective, completing the parameter initialization of GLM-6B for image analysis tasks.
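A minimal sketch of this combined loss follows; since the patent does not spell out the individual loss forms, mean-squared error stands in for both terms, and the default weights are placeholders.

```python
import torch.nn.functional as F

def first_model_loss(y_s, y_t, y_true, alpha=0.5, beta=0.5):
    """alpha * L_distill + beta * L_supervise, assuming y_s, y_t, and
    y_true are tensors of matching shape."""
    l_distill = F.mse_loss(y_s, y_t)       # student output vs. teacher output
    l_supervise = F.mse_loss(y_s, y_true)  # student output vs. marked blur area
    return alpha * l_distill + beta * l_supervise
```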
After the parameters of the GLM-6B model are initialized, the scheme relies on the characteristics of the mask module in the Transformer decoder: the blurred area to be compensated is fed into the model as a two-dimensional mask. Given a two-dimensional mask over the image data, the model can infer the true pixel values inside the mask from the pixel features near the mask in the image, so that the final model can repair and compensate the repair area.
As shown in FIG. 4, in the Mask training stage, an FFN module replaces the target detection module and target segmentation module used during GLM model parameter initialization, where the FFN module is used to extract the reconstruction loss.
In the step of "randomly generating a mask of the same scale as the repair area and overlaying it to obtain the mask vector", a mask of the same scale as the repair area in the training image is randomly generated and overlaid on the training image, and the masked training image is then vectorized to obtain the mask vector. Specifically, the masked training image is decomposed into a pixel-level vector matrix in units of pixels, the vector matrix is split according to the required specification to obtain a number of split units, and a module serial number is inserted at the starting position of each split unit to obtain the mask vector.
It should be noted that the mask size must be identical to the repair area size, to avoid the model becoming difficult to converge when the mask covers individual pixels of other blocks in the training image. In some embodiments, the training image may first be decomposed into a pixel-level vector matrix in units of pixels, a mask identical to the repair area is overlaid on the vector matrix to obtain a masked vector matrix, and the masked vector matrix is then split according to the required specification to obtain split units, with a module serial number inserted at the starting position of each split unit to obtain the mask vector. As shown in FIG. 5, the masked vector matrix is split at the 1×1 specification into split units of single pixels, yielding a mask vector in which the split units corresponding to module serial numbers 3, 7, 8, 9, and 13 are masked.
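A sketch of the mask generation step, under the assumption of a rectangular repair area, follows; the mask is placed at a random location but keeps exactly the repair area's height and width, as the convergence note above requires.

```python
import numpy as np

def random_mask_like_repair(img: np.ndarray, repair_hw: tuple) -> np.ndarray:
    """Overlay a randomly positioned mask whose size equals the repair
    area (h, w); zero stands in for the mask value."""
    h, w = repair_hw
    H, W = img.shape[:2]
    top = np.random.randint(0, H - h + 1)
    left = np.random.randint(0, W - w + 1)
    masked = img.copy()
    masked[top:top + h, left:left + w] = 0
    return masked  # vectorize afterwards, exactly as for the training image
```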
In the step of "inputting the mask vectors and image vectors into the second model to calculate the perceptual loss and reconstruction loss", several groups of mask vectors and image vectors are input into the second model and passed through the linear convolution layer of the GLM-6B model to obtain the corresponding mask feature vector and image feature vector of each group, and the perceptual loss is calculated based on the mask feature vectors and image feature vectors of each group; the mask vector and the image vector are also input into the second model and passed through the FFN module to obtain a repaired image, and the pixel-level difference between the repaired image and the original image corresponding to the image vector is calculated to obtain the reconstruction loss.
That is, the Mask training stage of this scheme is divided into two phases. Because the GLM-6B model contains a very large number of parameters, and the image data of a specific scene is often insufficient for full training, only part of the data is selected to fine-tune the GLM-6B model; in the embodiments of this scheme, the perceptual loss of the GLM-6B model is selected for this fine-tuning.
The perceptual loss is calculated, in standard squared feature-distance form, as:

$L_{perc} = \frac{1}{M}\sum_{j=1}^{M}\left\|F_{x}^{(j)} - F_{y}^{(j)}\right\|_{2}^{2}$

where $M$ denotes a batch of input mask vectors and image vectors, $F_x$ denotes the image feature vector obtained by passing the image vector through the linear convolution layer of the GLM module, and $F_y$ denotes the mask feature vector obtained by passing the mask vector through the linear convolution layer of the GLM module.
The reconstruction loss is calculated, in standard mean-squared form, as:

$L_{recon} = \frac{1}{N}\sum_{i=1}^{N}\left(\hat{y}_{i} - y_{i}\right)^{2}$

where $N$ denotes the number of pixels represented by the input image vector, and $\hat{y}_i$ and $y_i$ denote the $i$-th pixel values of the output repaired image and the original image, respectively.
In the step of "weighting the perceptual loss and the reconstruction loss to obtain the second model loss", hyperparameters serve as the weighting proportions of the perceptual loss and the reconstruction loss, and the second model loss is obtained by summing the perceptual loss and the reconstruction loss weighted by their corresponding proportions.
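Putting the two terms together, a hedged sketch of the second model loss is shown below; it assumes the standard-form perceptual and reconstruction losses given above, with lambda_p and lambda_r as the hyperparameter weights.

```python
import torch

def second_model_loss(f_mask, f_img, y_hat, y, lambda_p=1.0, lambda_r=1.0):
    """Weighted sum of feature-space (perceptual) and pixel-space
    (reconstruction) mean-squared differences."""
    l_perc = torch.mean((f_mask - f_img) ** 2)  # mask vs. image features
    l_recon = torch.mean((y_hat - y) ** 2)      # repaired vs. original pixels
    return lambda_p * l_perc + lambda_r * l_recon
```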
In the "training the GLM-6B model in stages according to the second model loss to obtain a trained image blur area restoration compensation model", the first stage training is performed after the weight of the perceived loss is increased in the first stage, then the parameters of the GLM-6B model are frozen, the weight of the perceived loss is set to 0, the weight of the reconstructed loss is set to 1, and the second model is trained to obtain the trained image blur area restoration compensation model.
To verify that the GLM-6B model trained with the above training method performs well on the image blur area restoration compensation task, the following experiments were carried out for this scheme:
1. A sharpness compensation dataset of image areas blurred by camera-lens stains (hereinafter, the stain dataset) was selected. The dataset contains 1000 images under different scenes and conditions; each image has one or more blurred areas caused by camera-lens stains, together with a corresponding sharp image as the original image;
2. A sharpness compensation dataset of image areas blurred by other causes (hereinafter, the other dataset) was selected. The dataset contains 1000 images under different scenes and conditions; each image has one or more blurred areas caused by raindrop blur, defocus blur, atmospheric blur, and the like, together with a corresponding sharp image as the original image.
The inventors compared the method of the present application with the following methods: (1) no processing, i.e., directly using the original image as output; (2) bilateral filtering, i.e., denoising and smoothing the image with a bilateral filtering algorithm; (3) super resolution, i.e., enlarging and enhancing the image with a super-resolution algorithm; and (4) deblurring, i.e., restoring and sharpening the image with a deblurring algorithm. The following indicators were used to evaluate the performance of each method: (1) peak signal-to-noise ratio (PSNR), which measures the similarity between the output image and the original image; (2) structural similarity index (SSIM), which measures the degree of structural preservation between the output image and the original image; (3) perceived quality index (PQI), which measures the perceived quality difference between the output image and the original image; and (4) visual information fidelity (VIF), which measures the fidelity of visual information between the output image and the original image. The experimental results on the other dataset are shown in Table 1 below, and the results on the stain dataset are shown in Table 2 below:
Table 1: Average performance of each method after processing images on the other dataset
Method PSNR SSIM PQI VIF
No processing 18.23 0.56 0.32 0.21
Bilateral filtering 19.45 0.59 0.35 0.24
Super resolution 20.67 0.62 0.38 0.27
Deblurring 21.89 0.65 0.41 0.30
The present application 23.12 0.68 0.44 0.33
Table 2: Average performance of each method after processing images on the stain dataset
Method PSNR SSIM PQI VIF
No processing 17.34 0.54 0.31 0.20
Bilateral filtering 18.56 0.57 0.34 0.23
Super resolution 19.78 0.60 0.37 0.26
Deblurring 21.00 0.63 0.40 0.29
The present application 22.23 0.66 0.43 0.32
As can be seen from the tables, the method of the present application outperforms the other methods on all indicators, which shows that it can effectively compensate the sharpness of blurred areas in images and has good adaptability and flexibility.
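For reference, two of the four reported indicators can be computed with scikit-image as sketched below; PQI and VIF have no scikit-image implementation and would need separate code. The data_range of 255 assumes 8-bit RGB images.

```python
import numpy as np
from skimage.metrics import peak_signal_noise_ratio, structural_similarity

def evaluate_pair(output: np.ndarray, original: np.ndarray):
    """PSNR and SSIM between an output image and its original, shape (H, W, 3)."""
    psnr = peak_signal_noise_ratio(original, output, data_range=255)
    ssim = structural_similarity(original, output, data_range=255,
                                 channel_axis=-1)
    return psnr, ssim
```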
Example 2
Based on the same conception, the application also provides an application method of the image blur area restoration compensation model, comprising the following steps:
inputting an image to be repaired into the image blur area restoration compensation model trained with the training method of the image blur area restoration compensation model according to Example 1, to obtain an image with the blurred area repaired.
Technical matters identical to those of Example 1 are not described again here.
Example 3
This embodiment also provides an electronic device. Referring to FIG. 6, it comprises a memory 304 and a processor 302, where the memory 304 stores a computer program and the processor 302 is configured to run the computer program to perform the steps of any of the above embodiments of the training method of the image blur area restoration compensation model.
In particular, the processor 302 may include a Central Processing Unit (CPU), or an Application Specific Integrated Circuit (ASIC), or may be configured as one or more integrated circuits that implement embodiments of the present application.
Memory 304 may include mass storage for data or instructions. By way of example and not limitation, memory 304 may comprise a hard disk drive (HDD), a floppy disk drive, a solid state drive (SSD), flash memory, an optical disk, a magneto-optical disk, magnetic tape, a Universal Serial Bus (USB) drive, or a combination of two or more of these. Memory 304 may include removable or non-removable (or fixed) media, where appropriate, and may be internal or external to the data processing apparatus, where appropriate. In a particular embodiment, memory 304 is non-volatile memory. In particular embodiments, memory 304 includes read-only memory (ROM) and random access memory (RAM). Where appropriate, the ROM may be mask-programmed ROM, programmable ROM (PROM), erasable PROM (EPROM), electrically erasable PROM (EEPROM), electrically alterable ROM (EAROM), or flash memory (FLASH), or a combination of two or more of these. Where appropriate, the RAM may be static random access memory (SRAM) or dynamic random access memory (DRAM), and the DRAM may be fast page mode DRAM (FPMDRAM), extended data output DRAM (EDODRAM), synchronous DRAM (SDRAM), or the like.
Memory 304 may be used to store or cache various data files that need to be processed and/or communicated, as well as possible computer program instructions for execution by processor 302.
The processor 302 implements the training method of the image blur area restoration compensation model of any one of the above embodiments by reading and executing the computer program instructions stored in the memory 304.
Optionally, the electronic apparatus may further include a transmission device 306 and an input/output device 308, where the transmission device 306 is connected to the processor 302, and the input/output device 308 is connected to the processor 302.
The transmission device 306 may be used to receive or transmit data via a network. Specific examples of the network described above may include a wired or wireless network provided by a communication provider of the electronic device. In one example, the transmission device includes a network adapter (Network Interface Controller, simply referred to as NIC) that can connect to other network devices through the base station to communicate with the internet. In one example, the transmission device 306 may be a Radio Frequency (RF) module, which is used to communicate with the internet wirelessly.
The input-output device 308 is used to input or output information. In this embodiment, the input information may be an image with a blurred region or the like, and the output information may be a repaired image or the like.
Optionally, in this embodiment, the above processor 302 may be configured to execute the following steps by means of a computer program:
Training sample selection: selecting images with marked repair areas as training images, and performing image vectorization on the training images to obtain image vectors;
GLM model parameter initialization: constructing a segmentation model as the teacher model and a first model as the student model, where the first model is based on the GLM-6B model with a parallel target detection module and target segmentation module connected to the feature exit layer of the GLM-6B model; inputting the training images into the teacher model to obtain the teacher output, and inputting the image vectors into the student model to obtain the student output; calculating the knowledge distillation loss between the teacher output and the student output, calculating the supervised loss between the student output and the marked blurred areas, and weighting the knowledge distillation loss and the supervised loss to obtain the first model loss; and training the parameters of the GLM-6B model with minimization of the first model loss as the objective;
Mask training: connecting an FFN module to the feature exit layer of the parameter-initialized GLM-6B model to obtain a second model; randomly generating masks of the same scale as the repair areas, overlaying them, and vectorizing to obtain mask vectors; inputting the mask vectors and the image vectors into the second model to calculate the perceptual loss and the reconstruction loss, and weighting the perceptual loss and the reconstruction loss to obtain the second model loss; and training the GLM-6B model in stages according to the second model loss to obtain the trained image blur area restoration compensation model.
It should be noted that specific examples in this embodiment may refer to the examples described in the foregoing embodiments and optional implementations, which are not repeated here.
In general, the various embodiments may be implemented in hardware or special purpose circuits, software, logic or any combination thereof. Some aspects of the application may be implemented in hardware, while other aspects may be implemented in firmware or software which may be executed by a controller, microprocessor or other computing device, although the application is not limited thereto. While various aspects of the application may be illustrated and described as block diagrams, flow charts, or using some other pictorial representation, it is well understood that these blocks, apparatus, systems, techniques or methods described herein may be implemented in, as non-limiting examples, hardware, software, firmware, special purpose circuits or logic, general purpose hardware or controller or other computing devices, or some combination thereof.
Embodiments of the application may be implemented by computer software executable by a data processor of a mobile device, such as in a processor entity, or by hardware, or by a combination of software and hardware. Computer software or programs (also referred to as program products) including software routines, applets, and/or macros can be stored in any apparatus-readable data storage medium and they include program instructions for performing particular tasks. The computer program product may include one or more computer-executable components configured to perform embodiments when the program is run. The one or more computer-executable components may be at least one software code or a portion thereof. In addition, in this regard, it should be noted that any blocks of the logic flows as illustrated may represent program steps, or interconnected logic circuits, blocks and functions, or a combination of program steps and logic circuits, blocks and functions. The software may be stored on a physical medium such as a memory chip or memory block implemented within a processor, a magnetic medium such as a hard disk or floppy disk, and an optical medium such as, for example, a DVD and its data variants, a CD, etc. The physical medium is a non-transitory medium.
Those skilled in the art should understand that the technical features of the above embodiments may be combined arbitrarily. For brevity, not all possible combinations of these technical features are described; nevertheless, as long as a combination of these technical features involves no contradiction, it should be considered within the scope of this description.
The above examples merely illustrate several embodiments of the application; their description is relatively specific and detailed, but they should not be construed as limiting the scope of the application. It should be noted that those skilled in the art can make several variations and improvements without departing from the spirit of the application, all of which fall within the protection scope of the application. Accordingly, the protection scope of the application shall be subject to the appended claims.

Claims (10)

1. A training method of an image blur area restoration compensation model, characterized by comprising the following steps:
training sample selection: selecting images with marked repair areas as training images, and performing image vectorization on the training images to obtain image vectors;
GLM model parameter initialization: constructing a segmentation model as the teacher model and a first model as the student model, where the first model is based on the GLM-6B model with a parallel target detection module and target segmentation module connected to the feature exit layer of the GLM-6B model; inputting the training images into the teacher model to obtain the teacher output, and inputting the image vectors into the student model to obtain the student output; calculating the knowledge distillation loss between the teacher output and the student output, calculating the supervised loss between the student output and the marked blurred areas, and weighting the knowledge distillation loss and the supervised loss to obtain the first model loss; and training the parameters of the GLM-6B model with minimization of the first model loss as the objective;
Mask training: connecting an FFN module to the feature exit layer of the parameter-initialized GLM-6B model to obtain a second model; randomly generating masks of the same scale as the repair areas, overlaying them, and vectorizing to obtain mask vectors; inputting the mask vectors and the image vectors into the second model to calculate the perceptual loss and the reconstruction loss, and weighting the perceptual loss and the reconstruction loss to obtain the second model loss; and training the GLM-6B model in stages according to the second model loss to obtain the trained image blur area restoration compensation model.
2. The training method of the image blur area restoration compensation model according to claim 1, wherein the segmentation model is pre-trained using images with marked repair areas, and the pre-trained segmentation model detects and segments the repair areas in the images.
3. The training method of the image blur area restoration compensation model according to claim 1, wherein the feature exit layer of GLM-6B outputs a feature matrix; the feature matrix is convolved by multiple convolution layers and then input into the target detection module and the target segmentation module respectively; and the features input into the target detection module pass through a fully connected layer to obtain a classification result and a regression target box, which are input into the target segmentation module and processed to obtain the segmentation matrix.
4. The training method of the image blur area restoration compensation model according to claim 1, wherein the training image is decomposed into a pixel-level vector matrix in units of pixels, the vector matrix is split according to a required specification to obtain a number of split units, and a module serial number is inserted at the starting position of each split unit to obtain the image vector.
5. The training method of the image blur area restoration compensation model according to claim 1, wherein a mask of the same size as the repair area in the training image is randomly generated, the mask is overlaid on the training image, and the masked training image is then vectorized to obtain the mask vector.
6. The training method of the image blur area restoration compensation model according to claim 1, wherein several groups of mask vectors and image vectors are input into the second model and passed through the linear convolution layer of the GLM-6B model to obtain the mask feature vector and image feature vector of each group, and the perceptual loss is calculated based on the mask feature vectors and image feature vectors of each group; and the mask vector and the image vector are input into the second model and passed through the FFN module to obtain a repaired image, and the pixel-level difference between the repaired image and the original image corresponding to the image vector is calculated to obtain the reconstruction loss.
7. The training method of the image blur area restoration compensation model according to claim 1, wherein in the first stage the weight of the perceptual loss is increased and first-stage training is performed; the parameters of the GLM-6B model are then frozen, the weight of the perceptual loss is set to 0, the weight of the reconstruction loss is set to 1, and the second model is trained to obtain the trained image blur area restoration compensation model.
8. An application method of an image blur area restoration compensation model, characterized by comprising:
inputting an image to be repaired into the image blur area restoration compensation model trained by the training method of the image blur area restoration compensation model according to any one of claims 1 to 7, to obtain an image with the blurred area repaired.
9. An electronic device comprising a memory and a processor, wherein the memory has stored therein a computer program, the processor being arranged to run the computer program to perform the training method of the image blur area restoration compensation model according to any one of claims 1 to 7.
10. A readable storage medium, wherein the readable storage medium has a computer program stored therein, the computer program comprising program code for controlling a process to execute a process, the process comprising the training method of the image blur area restoration compensation model according to any one of claims 1 to 7.
CN202310706302.XA 2023-06-14 2023-06-14 Training method and application of image blurring region restoration compensation model Pending CN116740501A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310706302.XA CN116740501A (en) 2023-06-14 2023-06-14 Training method and application of image blurring region restoration compensation model

Publications (1)

Publication Number Publication Date
CN116740501A true CN116740501A (en) 2023-09-12

Family

ID=87912941

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310706302.XA Pending CN116740501A (en) 2023-06-14 2023-06-14 Training method and application of image blurring region restoration compensation model

Country Status (1)

Country Link
CN (1) CN116740501A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117475091A (en) * 2023-12-27 2024-01-30 浙江时光坐标科技股份有限公司 High-precision 3D model generation method and system
CN117475091B (en) * 2023-12-27 2024-03-22 浙江时光坐标科技股份有限公司 High-precision 3D model generation method and system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination