CN117495734A - Image restoration enhancement and definition method and system based on deep learning - Google Patents

Image restoration enhancement and definition method and system based on deep learning

Info

Publication number
CN117495734A
Authority
CN
China
Prior art keywords
image
deep learning
component
enhancement
edge
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202311591630.6A
Other languages
Chinese (zh)
Inventor
孙秋腾
贺忠堂
赵军锋
李凯
胡长明
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Xuzhou Hanfeng Digital City Development Co ltd
Original Assignee
Xuzhou Hanfeng Digital City Development Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Xuzhou Hanfeng Digital City Development Co ltd filed Critical Xuzhou Hanfeng Digital City Development Co ltd
Priority to CN202311591630.6A priority Critical patent/CN117495734A/en
Publication of CN117495734A publication Critical patent/CN117495734A/en
Legal status: Pending

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00 - Image enhancement or restoration
    • G06T5/40 - Image enhancement or restoration using histogram techniques
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 - Arrangements for image or video recognition or understanding
    • G06V10/20 - Image preprocessing
    • G06V10/25 - Determination of region of interest [ROI] or a volume of interest [VOI]
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 - Arrangements for image or video recognition or understanding
    • G06V10/70 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/764 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 - Arrangements for image or video recognition or understanding
    • G06V10/70 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/766 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning using regression, e.g. by projecting features on hyperplanes
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 - Arrangements for image or video recognition or understanding
    • G06V10/70 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V2201/00 - Indexing scheme relating to image or video recognition or understanding
    • G06V2201/07 - Target detection

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • Multimedia (AREA)
  • Artificial Intelligence (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Computing Systems (AREA)
  • Databases & Information Systems (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Medical Informatics (AREA)
  • Software Systems (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses an image restoration enhancement and definition method based on deep learning, which comprises the following steps: S1, sampling the image to be repaired; S2, predicting the edge mapping of the masked area on the superimposed image with an edge generator; S3, verifying with the Faster-RCNN algorithm the images from which the unreal edge mappings have been removed; S5, performing sharpening enhancement using the neighborhood blur data and image texture data of the Y component, adjusting the overall contrast of the sharpened Y component, calculating the RGB data of the current pixel from the adjusted Y component and the UV components of the current pixel, and outputting the RGB data of the current pixel to complete the image processing. The invention adopts Faster-RCNN, which supports a variety of detection frameworks, including single-class and multi-class target detection as well as bounding-box and mask detection, and also supports different loss functions, including classification loss and bounding-box regression loss, so that image restoration enhancement and definition can still be achieved effectively even when the missing part of the image is very large.

Description

Image restoration enhancement and definition method and system based on deep learning
Technical Field
The invention belongs to the field of deep learning, and particularly relates to an image restoration enhancement and definition method based on deep learning. Meanwhile, the invention also relates to an image restoration enhancement and definition system based on deep learning.
Background
Conventional image restoration enhancement and definition are based mainly on mathematical and physical methods. However, given the remarkable results deep learning has achieved in the vision field in recent years, the research frontier of that field is now essentially occupied by deep learning, and against this background more and more graphics researchers have turned their attention to it. Conventional image restoration may be handled with diffusion-based methods, which propagate local structure into the missing portion, or with exemplar-based methods, which construct the missing portion one pixel at a time while maintaining consistency with the surrounding pixels.
The existing approaches fail when the missing part of the image is large, so an additional component is needed to supply a plausible "imagined" completion. On this basis, an image restoration enhancement and definition method and system based on deep learning are provided.
Disclosure of Invention
The invention aims to remedy the defects of the prior art, and provides an image restoration enhancement and definition method and system based on deep learning in which Faster-RCNN is used to support a variety of detection frameworks, including single-class and multi-class target detection as well as bounding-box and mask detection. The method also supports different loss functions, including classification loss and bounding-box regression loss, and can effectively achieve image restoration enhancement and definition even when the missing part of the image is large.
In order to achieve the above purpose, the present invention provides the following technical solutions:
an image restoration enhancement and definition method based on deep learning comprises the following steps:
S1, sampling the image to be repaired using multi-angle sampling, then performing histogram equalization, denoising and target cropping on all samples in sequence to obtain background crops, and superimposing the background crops in sequence;
S2, predicting the edge mapping of the masked area on the superimposed image with an edge generator, while a completion network performs image restoration and synthesis on the real edge mappings and removes the unreal edge mappings;
S3, verifying with the Faster-RCNN algorithm whether the images from which the unreal edge mappings have been removed are real;
S4, generating the real image, then acquiring its YUV data, normalizing the YUV data, and calculating the neighborhood blur data and image texture data of the normalized Y component;
S5, performing sharpening enhancement using the neighborhood blur data and image texture data of the Y component, adjusting the overall contrast of the sharpened Y component, calculating the RGB data of the current pixel from the adjusted Y component and the UV components of the current pixel, and outputting the RGB data of the current pixel to complete the image processing.
Preferably, in step S5, the sharpening enhancement is calculated as:
Y_e = Y_blur + Y_t × F_r, which yields the sharpened Y component; where Y_e is the sharpened Y component, Y_blur is the neighborhood blur data of the Y component, Y_t is the image texture data of the Y component, and F_r is the enhancement factor with value range [0.0, 1.0].
Preferably, in step S2, the phantom edges of the original image predicted by the edge generator are: S_pred = G(I_gray, S_gt), where S_pred denotes the phantom edges of the image, I_gray denotes the gray-value matrix of the original image, and S_gt denotes the edge map of the original image.
Preferably, the Faster-RCNN algorithm uses the SelectiveSearch method to pre-extract candidate regions for the objects in a series of images, and then applies CNN feature extraction only on those candidate regions for judgment;
the RCNN algorithm includes the following steps:
candidate region generation: generating 1K-2K candidate regions per image using the SelectiveSearch method;
feature extraction: extracting features from each candidate region with a deep convolutional network;
category judgment: feeding the features to one SVM classifier per category to judge whether they belong to that category;
position refinement: finely correcting the positions of the candidate boxes with a regressor.
Preferably, the Faster-RCNN algorithm adopts supervised pre-training, that is, parameters trained on one task are carried over to another task as the initial parameter values of the neural network; compared with direct random initialization, this greatly improves accuracy.
Preferably, the Faster-RCNN algorithm uses non-maximum suppression to find the n rectangular boxes in the image that may contain objects, and then assigns each rectangular box a class classification probability;
the non-maximum suppression method comprises the following steps:
first, assume 6 rectangular boxes sorted by the classifier's class probability, labelled A, B, C, D, E, F in increasing probability of containing a vehicle;
1) starting from the maximum-probability rectangular box F, judge for each of A-E whether its overlap (IoU) with F exceeds a set threshold;
2) assuming the overlap of B and D with F exceeds the threshold, discard B and D, and mark F as the first rectangular box to keep;
3) from the remaining boxes A, C, E, select E, which has the highest probability, judge the overlap of E with A and C, discard those above the threshold, and mark E as the second rectangular box to keep.
An image restoration enhancement and definition system based on deep learning, characterized in that the system uses the above method to enhance and sharpen image restoration.
Preferably, the system comprises: an edge detection module for performing edge detection on the image to be repaired to obtain the edge detection image;
an image restoration module comprising the deep learning model, the deep learning model being used to complete the edge detection image and acquire the restored image;
the image restoration module carries the Faster-RCNN algorithm to verify the images from which the unreal edge mappings have been removed.
Preferably, the system further comprises: an image sampling module that adopts multi-angle sampling and then performs histogram equalization, denoising and target cropping on all samples in sequence to obtain background crops, which are then superimposed in sequence.
The technical effects and advantages of the invention are as follows: compared with the conventional technology, the image restoration enhancement and definition method based on deep learning provided by the invention adopts Faster-RCNN to support a variety of detection frameworks, including single-class and multi-class target detection as well as bounding-box and mask detection, and also supports different loss functions, including classification loss and bounding-box regression loss, so that image restoration enhancement and definition can still be achieved effectively when the missing part of the image is large;
secondly, to better fill the missing part of the image, the edge generator "hallucinates" edges for the missing region, the missing region is filled with these hallucinated edges, the edge information of the missing part is obtained with a heuristic generative model, and this edge information is then fed, as a prior for the missing image content, together with the image into a restoration network for image reconstruction, reproducing the filling of finer details.
Drawings
FIG. 1 is a flow chart of the image restoration enhancement and definition method based on deep learning of the invention;
FIG. 2 is a flow chart of the Faster-RCNN algorithm in an embodiment of the invention.
Detailed Description
The present invention will be described in further detail with reference to specific embodiments in order to make the objects, technical solutions and advantages of the present invention more apparent. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the invention. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
The invention provides an image restoration enhancement and definition method based on deep learning, as shown in fig. 1, comprising the following steps:
S1, sampling the image to be repaired using multi-angle sampling, then performing histogram equalization, denoising and target cropping on all samples in sequence to obtain background crops, and superimposing the background crops in sequence;
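By way of illustration of step S1, a minimal OpenCV preprocessing sketch is given below. The step only names the operations, so the concrete operators (equalizeHist for histogram equalization, fastNlMeansDenoising for denoising) are assumptions; multi-angle sampling and target cropping depend on the capture setup and detector and are omitted.

```python
import cv2

def preprocess_sample(img_bgr):
    """One S1-style preprocessing pass: histogram equalization, then denoising.

    The equalization and denoising operators are assumptions standing in
    for the operations named in step S1.
    """
    gray = cv2.cvtColor(img_bgr, cv2.COLOR_BGR2GRAY)
    eq = cv2.equalizeHist(gray)                 # histogram equalization
    den = cv2.fastNlMeansDenoising(eq, h=10)    # non-local means denoising
    return den
```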
S2, predicting the edge mapping of the masked area on the superimposed image with an edge generator, while a completion network performs image restoration and synthesis on the real edge mappings and removes the unreal edge mappings; in step S2, the phantom edges of the original image predicted by the edge generator are: S_pred = G(I_gray, S_gt), where S_pred denotes the phantom edges of the image, I_gray denotes the gray-value matrix of the original image, and S_gt denotes the edge map of the original image;
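The architecture of the generator G is not specified here; purely as an interface illustration, the PyTorch stand-in below shows one way S_pred = G(I_gray, S_gt) could be wired, concatenating the two inputs channel-wise. Every layer choice in this sketch is an assumption.

```python
import torch
import torch.nn as nn

class EdgeGenerator(nn.Module):
    """Illustrative stand-in for the generator G in S_pred = G(I_gray, S_gt)."""

    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(2, 64, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(64, 64, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(64, 1, 3, padding=1), nn.Sigmoid(),  # edge probability map
        )

    def forward(self, i_gray, s_gt):
        # i_gray: (N, 1, H, W) gray-value matrix; s_gt: (N, 1, H, W) edge map
        return self.net(torch.cat([i_gray, s_gt], dim=1))
```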
S3, verifying with the Faster-RCNN algorithm whether the images from which the unreal edge mappings have been removed are real;
As further shown in fig. 2, the Faster-RCNN algorithm uses the SelectiveSearch method to pre-extract candidate regions for the objects in a series of images, and then applies CNN feature extraction only on those candidate regions for judgment (a minimal sketch of the candidate-region step is given after the list below);
the RCNN algorithm includes the following steps:
candidate region generation: generating 1K-2K candidate regions per image using the SelectiveSearch method;
feature extraction: extracting features from each candidate region with a deep convolutional network;
category judgment: feeding the features to one SVM classifier per category to judge whether they belong to that category;
position refinement: finely correcting the positions of the candidate boxes with a regressor;
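OpenCV's contrib module ships a Selective Search implementation, so the candidate-region step can be sketched as below (requires opencv-contrib-python; the input path and the 2000-proposal cutoff are assumptions chosen to match the 1K-2K range above).

```python
import cv2

img = cv2.imread("input.jpg")  # hypothetical input path
ss = cv2.ximgproc.segmentation.createSelectiveSearchSegmentation()
ss.setBaseImage(img)
ss.switchToSelectiveSearchFast()   # fast mode trades some recall for speed
rects = ss.process()               # candidate boxes as (x, y, w, h)
proposals = rects[:2000]           # keep on the order of 1K-2K regions
print(f"{len(rects)} raw proposals, keeping {len(proposals)}")
```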
It should be noted that the Faster-RCNN algorithm adopts supervised pre-training, that is, parameters trained on one task are carried over to another task as the initial parameter values of the neural network; compared with direct random initialization, this greatly improves accuracy.
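In PyTorch, this supervised pre-training amounts to initializing the detector from weights trained on another task instead of random values. A minimal torchvision sketch follows; the model choice and the COCO weight enum are assumptions (API names as in recent torchvision releases), not the patent's own training setup.

```python
from torchvision.models.detection import (
    fasterrcnn_resnet50_fpn,
    FasterRCNN_ResNet50_FPN_Weights,
)

# Backbone and detection heads start from COCO-trained parameters
# rather than random initialization.
model = fasterrcnn_resnet50_fpn(weights=FasterRCNN_ResNet50_FPN_Weights.DEFAULT)
model.eval()
```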
S4, generating the real image, then acquiring its YUV data, normalizing the YUV data, and calculating the neighborhood blur data and image texture data of the normalized Y component;
S5, performing sharpening enhancement using the neighborhood blur data and image texture data of the Y component, adjusting the overall contrast of the sharpened Y component, calculating the RGB data of the current pixel from the adjusted Y component and the UV components of the current pixel, and outputting the RGB data of the current pixel to complete the image processing. In step S5, the sharpening enhancement is calculated as:
Y_e = Y_blur + Y_t × F_r, which yields the sharpened Y component; where Y_e is the sharpened Y component, Y_blur is the neighborhood blur data of the Y component, Y_t is the image texture data of the Y component, and F_r is the enhancement factor with value range [0.0, 1.0].
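A runnable sketch of step S5 follows. Only the formula Y_e = Y_blur + Y_t × F_r is taken from the text above; how Y_blur and Y_t are computed, the form of the contrast adjustment, and the YUV-to-RGB matrix (full-range BT.601 assumed) are not fixed by the patent, so those choices are illustrative.

```python
import cv2
import numpy as np

def sharpen_and_convert(y, u, v, f_r=0.5, contrast=1.1):
    """S5 sketch: sharpen Y, adjust overall contrast, convert YUV -> RGB.

    y, u, v: float32 arrays normalized to [0, 1]; f_r in [0.0, 1.0].
    Gaussian blur for Y_blur and the high-frequency residual for Y_t
    are assumptions; the patent only names those quantities.
    """
    y_blur = cv2.GaussianBlur(y, (5, 5), 0)            # neighborhood blur data
    y_t = y - y_blur                                   # image texture data
    y_e = y_blur + y_t * f_r                           # Y_e = Y_blur + Y_t x F_r
    y_e = np.clip((y_e - 0.5) * contrast + 0.5, 0, 1)  # overall contrast adjustment

    # Assumed full-range BT.601 YUV -> RGB conversion
    r = y_e + 1.402 * (v - 0.5)
    g = y_e - 0.344136 * (u - 0.5) - 0.714136 * (v - 0.5)
    b = y_e + 1.772 * (u - 0.5)
    return np.clip(np.stack([r, g, b], axis=-1), 0, 1)
```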
It should be noted that the Faster-RCNN algorithm uses non-maximum suppression to find the n rectangular boxes in the image that may contain objects, and then assigns each rectangular box a class classification probability;
the non-maximum suppression method comprises the following steps (a NumPy sketch is given after this list):
first, assume 6 rectangular boxes sorted by the classifier's class probability, labelled A, B, C, D, E, F in increasing probability of containing a vehicle;
1) starting from the maximum-probability rectangular box F, judge for each of A-E whether its overlap (IoU) with F exceeds a set threshold;
2) assuming the overlap of B and D with F exceeds the threshold, discard B and D, and mark F as the first rectangular box to keep;
3) from the remaining boxes A, C, E, select E, which has the highest probability, judge the overlap of E with A and C, discard those above the threshold, and mark E as the second rectangular box to keep.
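The suppression procedure above generalizes directly to n boxes. A minimal NumPy sketch is given below; the threshold value is an assumption, and boxes are taken as (x1, y1, x2, y2) corner coordinates.

```python
import numpy as np

def nms(boxes, scores, iou_thresh=0.5):
    """Keep the highest-scoring boxes, discarding any box whose IoU with
    an already-kept box exceeds iou_thresh. boxes: (n, 4) as x1, y1, x2, y2."""
    x1, y1, x2, y2 = boxes.T
    areas = (x2 - x1) * (y2 - y1)
    order = scores.argsort()[::-1]   # highest class probability first
    keep = []
    while order.size > 0:
        i = order[0]
        keep.append(i)
        # IoU of the current best box with all remaining boxes
        xx1 = np.maximum(x1[i], x1[order[1:]])
        yy1 = np.maximum(y1[i], y1[order[1:]])
        xx2 = np.minimum(x2[i], x2[order[1:]])
        yy2 = np.minimum(y2[i], y2[order[1:]])
        inter = np.maximum(0, xx2 - xx1) * np.maximum(0, yy2 - yy1)
        iou = inter / (areas[i] + areas[order[1:]] - inter)
        order = order[1:][iou <= iou_thresh]  # discard overlapping boxes
    return keep
```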
For a detection task in this embodiment, in addition to the classification result, positioning information, that is, a target bounding box, is also output. The detection output is usually the information [x, y, w, h, c], representing the offsets of the target's center coordinates, the width and height offsets, and the probability of belonging to the category. Two kinds of detection algorithms are currently mainstream: two-stage detection (Two-stage Detection) and one-stage detection (One-stage Detection), the latter represented by the YOLO series.
The one-stage and two-stage detection algorithms differ in whether there is a preliminary positioning step. Two-stage detection must output both a positioning result and a classification result: positioning is usually performed first, then the positioned targets are classified. Candidate boxes of different sizes are obtained through preliminary positioning, the built detection network model outputs target offset information to position the candidate boxes accurately, and the highest-scoring regression results are then classified by the designed classification network. Faster-RCNN is a classical two-stage detection algorithm and one of the target detection algorithms in common use at present; its training and testing phases are less time-consuming than contemporaneous algorithms.
The whole network structure consists of 4 parts: a CNN feature extraction network, an RPN, ROI-Pooling and a learning model; Faster-RCNN proposes the RPN network structure for extracting candidate regions.
The RPN takes the feature map output by the CNN feature extraction network introduced above as input, applies a 3x3 convolution to this feature map, whose size is 1/16 of the original image, and feeds the convolved result into a regression branch and a classification branch.
In the regression branch, a 1x1 convolution changes the number of channels to obtain a 36-channel output; in the classification branch, a 1x1 convolution changes the number of channels to 18, the result is split into 2 feature maps of 9 channels each, foreground/background classification is performed with the logistic-regression model Softmax, and the maps are then merged back into 18 channels.
Combining the two branches yields a (4+2)x9xMxN feature map, whose information comprises, for the 9 anchors generated by mapping each pixel back to the original image, four regressed coordinate offsets and the foreground/background classification probabilities.
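The two branches described above map naturally onto a small module: a shared 3x3 convolution, a 1x1 regression branch with 36 output channels (4 offsets x 9 anchors) and a 1x1 classification branch with 18 channels (2 classes x 9 anchors) passed through Softmax. In the sketch below, the input channel count of 512 (a VGG-style backbone) is an assumption.

```python
import torch
import torch.nn as nn

class RPNHead(nn.Module):
    def __init__(self, in_channels=512, num_anchors=9):
        super().__init__()
        self.shared = nn.Conv2d(in_channels, in_channels, 3, padding=1)  # 3x3 conv
        self.reg = nn.Conv2d(in_channels, num_anchors * 4, 1)  # 36-ch regression branch
        self.cls = nn.Conv2d(in_channels, num_anchors * 2, 1)  # 18-ch classification branch

    def forward(self, feat):
        t = torch.relu(self.shared(feat))
        deltas = self.reg(t)    # (B, 36, M, N) coordinate offsets
        logits = self.cls(t)    # (B, 18, M, N)
        b, _, m, n = logits.shape
        # split into 2 x 9 channels, softmax over foreground/background, re-merge
        scores = logits.view(b, 2, 9, m, n).softmax(dim=1).view(b, 18, m, n)
        return deltas, scores
```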
On this basis, anchors default to three sizes (128x128, 256x256, 512x512) and three aspect ratios (1:1, 1:2, 2:1). The RPN finally generates more than twenty thousand candidate region samples in total. To obtain fewer samples participating in the cost function calculation during training, 512 samples are screened from those generated by the RPN at a positive-negative ratio of 1:3; within these 512 samples, the samples are sorted by their intersection-over-union (IoU) with the ground-truth labels, 32 samples with IoU greater than 0.5 are selected as positive samples, 96 samples with IoU less than 0.5 are selected in reverse order as negative samples, and these are fed into the model for accurate regression training. When there are not enough positive samples, candidate boxes with IoU below 0.5 are taken from the sorted result as supplements.
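The 9 base anchors follow directly from the three sizes and three aspect ratios. A short sketch is given below; the exact width/height parameterization under a given ratio varies between implementations, so the area-preserving convention used here is an assumption.

```python
import itertools
import numpy as np

def base_anchors(sizes=(128, 256, 512), ratios=(1.0, 0.5, 2.0)):
    """3 sizes x 3 aspect ratios -> 9 (width, height) anchor templates."""
    anchors = []
    for size, ratio in itertools.product(sizes, ratios):
        w = size * np.sqrt(ratio)   # ratio = w / h, area kept at size^2
        h = size / np.sqrt(ratio)
        anchors.append((w, h))
    return np.array(anchors)        # shape (9, 2)

print(base_anchors().round(1))
```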
In addition, because Faster-RCNN is a two-stage regional convolutional neural network, after the original image passes through the convolutional network described above to obtain a feature map, a number of target candidate boxes are obtained through Selective Search or the RPN algorithm; Faster-RCNN generates its ROI candidate boxes with the RPN. Because the 9-anchor arrangement produces candidate region samples of different sizes, Faster-RCNN proposes that their sizes be unified through ROI-Pooling before they are fed into the training model. Each ROI has its original coordinates and size;
for example, an ROI of size 10x14 must be unified to 7x7 before being fed into the model to participate in training. First, the feature map extracted by the RPN is segmented into 7x7 block regions, the remainder that cannot be divided evenly is folded into the last row and last column of blocks, and max pooling (MaxPooling) is applied to each region to achieve downsampling, finally yielding normalized 7x7 maps of unified size. Here, BatchSize = 2 and ROI = 64 are taken; positive and negative samples are chosen by their IoU with the ground truth, with the ratio at the 0.5 IoU threshold set to 1:3; 2 samples are trained per batch and 64 ROIs are generated per sample, i.e. 128 ROIs are trained per batch.
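torchvision exposes this normalization step directly as roi_pool; the sketch below uses assumed tensor shapes (a 1/16-scale feature map and one ROI given in original-image coordinates) purely to show the interface.

```python
import torch
from torchvision.ops import roi_pool

feat = torch.randn(1, 512, 38, 50)   # 1/16-scale feature map (assumed shape)
# ROIs as (batch_index, x1, y1, x2, y2) in original-image coordinates
rois = torch.tensor([[0.0, 32.0, 48.0, 192.0, 272.0]])
pooled = roi_pool(feat, rois, output_size=(7, 7), spatial_scale=1.0 / 16)
print(pooled.shape)                  # torch.Size([1, 512, 7, 7])
```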
The use of Faster-RCNN described above supports a variety of detection frameworks, including single-class and multi-class target detection as well as bounding-box and mask detection. The method also supports different loss functions, including classification loss and bounding-box regression loss, and can effectively achieve image restoration enhancement and definition when the missing part of the image is quite large. At the same time, to better fill the missing part of the image, the edge generator "hallucinates" edges for the missing region, the missing region is filled with these hallucinated edges, the edge information of the missing part is obtained with a heuristic generative model, and this edge information is then fed, as a prior for the missing image content, together with the image into a restoration network for image reconstruction, reproducing the filling of finer details.
On the basis of the above, this embodiment further provides an image restoration enhancement and definition system based on deep learning, which uses the above method to enhance and sharpen image restoration. The system comprises:
an edge detection module for performing edge detection on the image to be repaired to obtain the edge detection image;
an image restoration module comprising the deep learning model, the deep learning model being used to complete the edge detection image and acquire the restored image;
the image restoration module carries the Faster-RCNN algorithm to verify the images from which the unreal edge mappings have been removed;
the system further comprises an image sampling module that adopts multi-angle sampling and then performs histogram equalization, denoising and target cropping on all samples in sequence to obtain background crops, which are then superimposed in sequence.
Comparative example 1
A prior-art comparative example is provided for the image restoration enhancement and definition method and system based on deep learning; the specific steps are as follows:
A method for sharpness enhancement of a dynamic video image, the method comprising the steps of: acquiring the YUV data of the current pixel and normalizing the YUV data of the current pixel;
calculating the neighborhood blur data and image texture data of the normalized Y component;
performing sharpening enhancement with the neighborhood blur data and image texture data of the Y component, and adjusting the overall contrast of the sharpened Y component;
calculating the RGB data of the current pixel from the adjusted Y component and the UV components of the current pixel, and outputting the RGB data of the current pixel.
This scheme achieves efficient definition enhancement by rapidly improving the color difference between different objects in the image and can meet the continuous-processing requirements of dynamic video images, but it fails when the missing part of the image is very large; compared with the present application, it therefore lacks the effect of still achieving image restoration enhancement and definition when the missing part of the image is very large.
Comparative example 2
A fingerprint image restoration method, comprising the steps of:
s100: performing edge detection on the fingerprint image to be repaired to obtain an edge detection image;
s200: inputting the edge detection image into a deep learning model, wherein the deep learning model is used for complementing the edge detection image;
s300: and acquiring an image output by the deep learning model as a repaired fingerprint image.
In some embodiments, the fingerprint image to be repaired includes at least one fingerprint line, and performing edge detection on the fingerprint image to be repaired in step S100 to obtain an edge detection image includes: detecting the edge of the fingerprint line, and representing the detected edge of the fingerprint line in the edge detection image by using a line. In particular, the edges of each fingerprint line of the fingerprint image to be repaired may be detected, where each fingerprint line is considered as a pattern having an area, the pattern of each fingerprint line comprising its edges (i.e. outline) and a portion located within the area defined by its edges. Only the edges of the fingerprint lines are included in the edge detection image, and the portions within the area defined by the edges of the fingerprint lines are not included, so the edge detection image can be understood as a "skeleton" of the fingerprint image to be repaired.
In some embodiments, the step S200 of the deep learning model for complementing the edge detection image includes: the deep learning model is used for carrying out edge complementation and content complementation on the edge detection image.
In some embodiments, the deep learning model includes a generative adversarial network (GAN) model. GAN has become a popular research direction in the artificial intelligence community; it consists mainly of a generator and a discriminator, and is trained through adversarial learning to estimate the potential distribution of data samples and generate new data samples.
According to this scheme, the fingerprint image restoration method uses a generative adversarial network model to restore the fingerprint image, which can restore not only fingerprint images with a small area of missing information but also those with a larger missing area, improving the restoration precision of the fingerprint image and the definition of the restored image.
In addition, an embodiment of the invention further provides a terminal device; the deep-learning-based image restoration enhancement and definition method of the above embodiments is mainly applied to this terminal device, which may be a PC, a portable computer, a mobile terminal or another device with display and processing functions.
In particular, the terminal device may include a processor (e.g., CPU), a communication bus, a user interface, a network interface, and a memory. Wherein the communication bus is used for realizing connection communication among the components; the user interface may include a Display screen (Display), an input unit such as a Keyboard (Keyboard); the network interface may optionally include a standard wired interface, a wireless interface (e.g., WI-FI interface); the memory may be a high-speed RAM memory or a stable memory (non-volatile memory), such as a disk memory, or alternatively may be a storage device independent of the aforementioned processor.
The processor can call the image restoration enhancement and definition program stored in the memory and execute the image restoration enhancement and definition method based on deep learning.
It will be appreciated that the readable storage medium may be a tangible device that can hold and store instructions for use by the instruction execution device. The computer readable storage medium may be, for example, but not limited to, an electronic storage device, a magnetic storage device, an optical storage device, an electromagnetic storage device, a semiconductor storage device, or any suitable combination of the foregoing. More specific examples (a non-exhaustive list) of the computer-readable storage medium would include the following: portable computer disks, hard disks, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), static random access memory (SRAM), portable compact disc read-only memory (CD-ROM), digital versatile discs (DVD), memory sticks, floppy disks, mechanical coding devices such as punch cards or in-groove protrusion structures having instructions stored thereon, and any suitable combination of the foregoing. Computer-readable storage media, as used herein, are not to be construed as transitory signals per se, such as radio waves or other freely propagating electromagnetic waves, electromagnetic waves propagating through waveguides or other transmission media (e.g., optical pulses through fiber optic cables), or electrical signals transmitted through wires.
The computer readable program instructions described herein may be downloaded from a computer readable storage medium to a respective computing/processing device or to an external computer or external storage device over a network, such as the internet, a local area network, a wide area network, and/or a wireless network. The network may include copper transmission cables, fiber optic transmissions, wireless transmissions, routers, firewalls, switches, gateway computers and/or edge servers. The network interface card or network interface in each computing/processing device receives computer readable program instructions from the network and forwards the computer readable program instructions for storage in a computer readable storage medium in the respective computing/processing device.
Computer program instructions for performing the operations of the present disclosure may be assembly instructions, instruction set architecture (ISA) instructions, machine-related instructions, microcode, firmware instructions, state setting data, or source or object code written in any combination of one or more programming languages, including object oriented programming languages such as Smalltalk or C++, and conventional procedural programming languages such as the "C" language or similar programming languages. The computer readable program instructions may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server. In the remote-computer case, the remote computer may be connected to the user's computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or may be connected to an external computer (for example, through the Internet using an Internet service provider). In some embodiments, aspects of the present disclosure are implemented by personalizing electronic circuitry, such as programmable logic circuitry, field programmable gate arrays (FPGAs), or programmable logic arrays (PLAs), with state information of the computer readable program instructions, which circuitry can execute the computer readable program instructions.
Finally, it should be noted that: the foregoing description is only illustrative of the preferred embodiments of the present invention, and although the present invention has been described in detail with reference to the foregoing embodiments, it will be apparent to those skilled in the art that modifications may be made to the embodiments described, or equivalents may be substituted for elements thereof, and any modifications, equivalents, improvements or changes may be made without departing from the spirit and principles of the present invention.

Claims (9)

1. An image restoration enhancement and definition method based on deep learning, characterized by comprising the following steps:
S1, sampling the image to be repaired using multi-angle sampling, then performing histogram equalization, denoising and target cropping on all samples in sequence to obtain background crops, and superimposing the background crops in sequence;
S2, predicting the edge mapping of the masked area on the superimposed image with an edge generator, while a completion network performs image restoration and synthesis on the real edge mappings and removes the unreal edge mappings;
S3, verifying with the Faster-RCNN algorithm whether the images from which the unreal edge mappings have been removed are real;
S4, generating the real image, then acquiring its YUV data, normalizing the YUV data, and calculating the neighborhood blur data and image texture data of the normalized Y component;
S5, performing sharpening enhancement using the neighborhood blur data and image texture data of the Y component, adjusting the overall contrast of the sharpened Y component, calculating the RGB data of the current pixel from the adjusted Y component and the UV components of the current pixel, and outputting the RGB data of the current pixel to complete the image processing.
2. The image restoration enhancement and definition method based on deep learning according to claim 1, characterized in that in step S5 the sharpening enhancement is calculated as:
Y_e = Y_blur + Y_t × F_r, which yields the sharpened Y component; where Y_e is the sharpened Y component, Y_blur is the neighborhood blur data of the Y component, Y_t is the image texture data of the Y component, and F_r is the enhancement factor with value range [0.0, 1.0].
3. The image restoration enhancement and definition method based on deep learning according to claim 1, characterized in that in step S2 the phantom edges of the original image predicted by the edge generator are: S_pred = G(I_gray, S_gt), where S_pred denotes the phantom edges of the image, I_gray denotes the gray-value matrix of the original image, and S_gt denotes the edge map of the original image.
4. The image restoration enhancement and definition method based on deep learning according to claim 1, characterized in that: the Faster-RCNN algorithm uses the SelectiveSearch method to pre-extract candidate regions for the objects in a series of images, and then applies CNN feature extraction only on those candidate regions for judgment;
the RCNN algorithm includes the following steps:
candidate region generation: generating 1K-2K candidate regions per image using the SelectiveSearch method;
feature extraction: extracting features from each candidate region with a deep convolutional network;
category judgment: feeding the features to one SVM classifier per category to judge whether they belong to that category;
position refinement: finely correcting the positions of the candidate boxes with a regressor.
5. The image restoration enhancement and definition method based on deep learning according to claim 4, characterized in that: the Faster-RCNN algorithm adopts supervised pre-training, that is, parameters trained on one task are carried over to another task as the initial parameter values of the neural network; compared with direct random initialization, this greatly improves accuracy.
6. The image restoration enhancement and definition method based on deep learning according to claim 4, characterized in that: the Faster-RCNN algorithm uses non-maximum suppression to find the n rectangular boxes in the image that may contain objects, and then assigns each rectangular box a class classification probability;
the non-maximum suppression method comprises the following steps:
first, assume 6 rectangular boxes sorted by the classifier's class probability, labelled A, B, C, D, E, F in increasing probability of containing a vehicle;
1) starting from the maximum-probability rectangular box F, judge for each of A-E whether its overlap (IoU) with F exceeds a set threshold;
2) assuming the overlap of B and D with F exceeds the threshold, discard B and D, and mark F as the first rectangular box to keep;
3) from the remaining boxes A, C, E, select E, which has the highest probability, judge the overlap of E with A and C, discard those above the threshold, and mark E as the second rectangular box to keep.
7. An image restoration enhancement and definition system based on deep learning is characterized in that: the system enhances and sharpens image restoration using the method of any one of claims 1-6.
8. The deep learning based image restoration enhancement and sharpness system of claim 7, wherein the system includes:
an edge detection module for performing edge detection on the image to be repaired to obtain the edge detection image;
an image restoration module comprising the deep learning model, the deep learning model being used to complete the edge detection image and acquire the restored image;
the image restoration module carries the Faster-RCNN algorithm to verify the images from which the unreal edge mappings have been removed.
9. The deep learning based image restoration enhancement and definition system of claim 8, further comprising: an image sampling module that adopts multi-angle sampling and then performs histogram equalization, denoising and target cropping on all samples in sequence to obtain background crops, which are then superimposed in sequence.
CN202311591630.6A 2023-11-27 2023-11-27 Image restoration enhancement and definition method and system based on deep learning Pending CN117495734A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311591630.6A CN117495734A (en) 2023-11-27 2023-11-27 Image restoration enhancement and definition method and system based on deep learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202311591630.6A CN117495734A (en) 2023-11-27 2023-11-27 Image restoration enhancement and definition method and system based on deep learning

Publications (1)

Publication Number Publication Date
CN117495734A true CN117495734A (en) 2024-02-02

Family

ID=89674473

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311591630.6A Pending CN117495734A (en) 2023-11-27 2023-11-27 Image restoration enhancement and definition method and system based on deep learning

Country Status (1)

Country Link
CN (1) CN117495734A (en)


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
CB02 Change of applicant information
Country or region after: China
Address after: Building 1-102, Office Building 6, Greenland Mansion, Yunlong District, Xuzhou City, Jiangsu Province, 221000
Applicant after: Xuzhou Big Data Group Co.,Ltd.
Address before: Building 1-102, Office Building 6, Greenland Mansion, Yunlong District, Xuzhou City, Jiangsu Province, 221000
Applicant before: Xuzhou Hanfeng Digital City Development Co.,Ltd.
Country or region before: China