CN113628130B - Deep learning-based vision barrier-assisted image enhancement method, equipment and medium - Google Patents
- Publication number
- CN113628130B, CN202110829947.3A, CN202110829947A
- Authority
- CN
- China
- Prior art keywords: image, vision, simulated, image enhancement, enhancement
- Prior art date
- Legal status
- Active
Classifications
- G06T5/00: Image enhancement or restoration (G: Physics; G06: Computing, calculating or counting; G06T: Image data processing or generation, in general)
- G06N3/045: Combinations of networks (G06N: Computing arrangements based on specific computational models; G06N3/00: Computing arrangements based on biological models; G06N3/02: Neural networks; G06N3/04: Architecture, e.g. interconnection topology)
- G06N3/08: Learning methods (G06N3/02: Neural networks)
- G06T2207/20081: Training; Learning (G06T2207/00: Indexing scheme for image analysis or image enhancement; G06T2207/20: Special algorithmic details)
Abstract
The invention provides a deep-learning-based vision-impairment-assisted image enhancement method, device, and medium, comprising the following steps: connecting the output of a convolutional neural network to the input of a vision system that simulates a visually impaired patient, obtaining a cascade system; and training the convolutional neural network to obtain an image enhancement network, wherein: an original image is input into the convolutional neural network for enhancement, the enhancement result is input into the simulated vision system of the visually impaired patient for simulation, and the cascade system outputs a simulated perceived image exhibiting the vision-impairment symptoms; the loss between the cascade system output and the original image is calculated, and, with the objective of minimizing the difference between the input and output images of the cascade system, the original image is enhanced so as to compensate for the distortion introduced by the simulated vision system. The image enhancement network obtained by the invention effectively realizes image enhancement for vision assistance, and experiments show that image enhancement targeting reduced central vision effectively improves patients' visual function and subjective perceptual quality.
Description
Technical Field
The invention relates to the field of multimedia image enhancement and vision-impairment assistance, and in particular to a deep-learning-based vision-impairment-assisted image enhancement method, device, and medium.
Background
Vision impairment is a serious social and public-health problem worldwide. Data published by the World Health Organization in 2019 show that at least 2.2 billion people worldwide live with vision impairment or blindness. Most visually impaired patients are from developing countries, and most are over 50 years old. China, the largest developing country in the world and one with a rapidly aging population, is severely affected. Visually impaired patients face great difficulties in work, daily life, and other respects; this not only puts great mental stress on the patients and greatly reduces their well-being, but the medical and daily care they require also places a heavy burden on families and society.
The main causes of vision impairment are uncorrected refractive error, cataract, age-related macular degeneration, glaucoma, diabetic retinopathy, and so on. Different diseases lead to different vision-impairment symptoms; obviously, to achieve the best assistive effect, different image enhancement algorithms must be designed for different symptoms. However, visually impaired patients often have more than one symptom. Among these, the most common is reduced central vision, caused by decreased visual acuity and decreased contrast sensitivity, and almost all visually impaired patients suffer reduced central vision to some degree. Thus, an image enhancement algorithm aimed at compensating for central vision loss can help most patients.
Image enhancement algorithms for reduced central vision can be broadly divided into two categories: algorithms dedicated to compensating for a patient's central vision loss, and methods that achieve compensation using general-purpose image processing algorithms. In recent years, many academic and industrial teams have begun developing vision-assistive devices based on image enhancement algorithms, but, limited by theory and technology, the effect of such devices falls short of the actual needs of visually impaired patients, so they have not been put into large-scale commercial use.
Image enhancement algorithms dedicated to compensating for a patient's reduced central vision can be traced back to the Adaptive Enhancement method proposed by Peli: by amplifying the high-frequency content of the image and compressing the low-frequency content toward intermediate gray values, it compensates for the patient's reduced high-frequency contrast sensitivity while avoiding gray-value saturation. The Wideband Enhancement method was proposed by Peli et al. in Eli Peli, Jeonghoon Kim, Yitzhak Yitzhaky, Robert B. Goldstein, and Russell L. Woods, "Wideband enhancement of television images for people with visual impairments," Journal of the Optical Society of America A, vol. 21, no. 6, pp. 937-950, 2004: bipolar features such as edges and corners are first extracted with a feature detection algorithm based on the visual characteristics of the human eye, and the scaled features are then superimposed on the original image to achieve enhancement. For JPEG-compressed images and MPEG-compressed video respectively, Tang et al. and Kim et al. achieve image or video enhancement for vision assistance by correcting, during the decompression stage, the elements of the quantization matrix in the region corresponding to the critical frequency range (3-7 cycles/degree).
General-purpose image processing algorithms, such as image binarization, unsharp masking, or extracting image edges with a generic high-pass filter and superimposing them back onto the original image, can enhance the contrast or high-frequency content of an image. They therefore have a certain compensating effect on the visually impaired patient's reduced contrast sensitivity at high spatial frequencies, and they represent another line of image enhancement algorithms for reduced central vision.
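As an illustration of the unsharp-masking idea mentioned above (a generic technique, not the patent's own algorithm), a minimal NumPy sketch is given below; the sigma and gain parameters are illustrative assumptions:

```python
import numpy as np

def gaussian_kernel1d(sigma, radius):
    """Discrete 1-D Gaussian kernel, normalized to sum to 1."""
    x = np.arange(-radius, radius + 1)
    k = np.exp(-0.5 * (x / sigma) ** 2)
    return k / k.sum()

def blur(img, sigma):
    """Separable Gaussian blur with edge padding (keeps the input shape)."""
    k = gaussian_kernel1d(sigma, radius=int(3 * sigma))
    pad = len(k) // 2
    padded = np.pad(img, pad, mode="edge")
    rows = np.apply_along_axis(lambda r: np.convolve(r, k, mode="valid"), 1, padded)
    return np.apply_along_axis(lambda c: np.convolve(c, k, mode="valid"), 0, rows)

def unsharp_mask(img, sigma=2.0, gain=1.5):
    """img + gain * (img - low-pass): boosts high-frequency content,
    then clips back to the valid intensity range [0, 1]."""
    return np.clip(img + gain * (img - blur(img, sigma)), 0.0, 1.0)

# toy example: a step edge gains overshoot/undershoot, i.e. a sharper look
step = np.zeros((16, 16))
step[:, :8], step[:, 8:] = 0.2, 0.8
sharp = unsharp_mask(step, sigma=1.5, gain=1.0)
```

Because the low-pass image is subtracted, only content above the blur cutoff is amplified, which is exactly why such generic filters partially compensate high-frequency contrast loss.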
For reduced central vision, the most common assistive device is enhanced-display glasses. Hwang et al. achieved enhancement of real scenes by overlaying image edges extracted with a Laplacian filter onto the corresponding real scene through the see-through display of Google Glass (Alex D. Hwang and Eli Peli, "Augmented edge enhancement on Google Glass for vision-impaired users," Information Display, vol. 30, no. 3, pp. 16-19, 2014). The American company eSight has designed vision-assistive glasses, now in production, that capture the real scene with a high-speed, high-resolution camera, process it with an image enhancement algorithm, and display it on an OLED screen; the user can also adjust the enhancement parameters according to personal preference.
However, these existing image enhancement algorithms for reduced central vision share two problems that make them difficult to apply in practical vision-assistive devices. First, they are intuitive proposals for compensating a particular frequency band, based on the general manifestation of the patient's reduced contrast sensitivity; but patients typically show contrast-sensitivity reductions of different degrees across the whole spatial-frequency range, so such intuitive methods cannot fully compensate the visual distortion caused by the patient's reduced contrast sensitivity, and the experimental results are unsatisfactory. Second, different types of images have different spectral distributions; existing methods mainly expose several adjustable parameters for the user to tune manually when different images need to be enhanced, or are proposed for a specific type of image (such as faces or text) and cannot automatically and effectively enhance images of various types.
Disclosure of Invention
In view of the defects in the prior art, the invention aims to provide a deep-learning-based vision-impairment-assisted image enhancement method, device, and medium, which can automatically and effectively enhance various types of images, and in particular can effectively compensate for a patient's reduced central vision.
In a first aspect of the present invention, there is provided a visual-barrier-assisted image enhancement method based on deep learning, including:
designing a convolutional neural network for image enhancement, and connecting the output end of the convolutional neural network with the input end of a vision system for simulating a visually impaired patient to obtain a cascade system;
training the convolutional neural network to obtain an image enhancement network, wherein the image enhancement network can realize image enhancement aiming at the vision barrier symptoms;
wherein: training the convolutional neural network to obtain an image enhancement network, comprising:
inputting an original image into the convolutional neural network for enhancement, and inputting the enhancement result into the simulated vision system of the visually impaired patient for simulation, wherein the simulated vision system outputs a simulated perceived image of the vision-impairment symptoms, which is also the output of the cascade system;
calculating the loss between the output of the simulated vision system and the original image, and, with the objective of minimizing the difference between the input and output images of the cascade system, enhancing the original image to compensate for the distortion introduced by the simulated vision system.
Optionally, inputting the enhancement result into the simulated vision system of the visually impaired patient for simulation means: processing the input image with image processing operations to obtain a simulated perceived image, i.e., the image as seen through the eyes of the simulated visually impaired patient.
Optionally, the convolutional neural network designed for image enhancement adopts a UNet-based structure;
on top of the standard UNet structure, Batch Normalization (BN) is added between the convolution and the ReLU of each layer, and a Sigmoid activation function is added to the last layer of the convolutional neural network, yielding the image enhancement network.
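The per-layer pattern just described (convolution, then Batch Normalization, then ReLU, with a Sigmoid on the last layer) can be sketched in NumPy; the convolution itself and the UNet skip connections are stubbed out here, and all tensor shapes are illustrative assumptions:

```python
import numpy as np

def batch_norm(x, gamma=1.0, beta=0.0, eps=1e-5):
    """Normalize per channel over batch and spatial axes of an (N, C, H, W)
    tensor, then apply the learnable scale/shift gamma, beta."""
    mean = x.mean(axis=(0, 2, 3), keepdims=True)
    var = x.var(axis=(0, 2, 3), keepdims=True)
    return gamma * (x - mean) / np.sqrt(var + eps) + beta

def relu(x):
    return np.maximum(x, 0.0)

def sigmoid(x):
    """Final-layer activation: squashes outputs into (0, 1), a valid
    intensity range for the enhanced image."""
    return 1.0 / (1.0 + np.exp(-x))

# one conv-BN-ReLU stage; the conv output is stubbed with random features x
x = np.random.default_rng(0).normal(size=(4, 8, 16, 16))  # (N, C, H, W)
h = relu(batch_norm(x))
y = sigmoid(h)  # in the real network this follows the last convolution
```

Placing BN before ReLU keeps each channel's pre-activation statistics normalized, and the final Sigmoid guarantees the network emits a displayable image regardless of the internal feature scale.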
Optionally, the objective of minimizing the difference between the input and output images of the cascade system means:
denoting the input image of the cascade system as I, the training of the cascade system aims to train the image enhancement network such that the result of passing the CNN-enhanced image through the simulated vision system is as close as possible to the input original image; that is, the image seen through the eyes of the simulated visually impaired patient is as close as possible to the original image.
Optionally, when the vision impaired patient is a patient with reduced central vision, the image enhancement method includes:
s1: simulating the vision system of the patient with reduced central vision, namely processing the input image to obtain a simulated perceived image of the patient with reduced central vision;
s2: designing a convolutional neural network capable of realizing image enhancement, wherein the convolutional neural network uses a network structure based on UNet;
s3: connecting the convolutional neural network in the step S2 and the simulated vision system in the step S1 to obtain a cascade system, and then training the convolutional neural network by using the same high-definition image as the input and output of the cascade system, wherein the trained convolutional neural network is used as an image enhancement network;
s4: and inputting the image to be enhanced into the trained image enhancement network in S3, so as to realize the image enhancement aiming at the central vision reduction.
Optionally, in S1, simulating the central vision degradation using an approximate contrast sensitivity function based on the clinical measurement indices Pelli-Robson score and logMAR visual acuity, together with the multiband decomposition characteristics of the human eye, includes:
first decomposing the image to be simulated onto each spatial frequency band; then computing the local band-limited contrast of the image on each band and comparing it with the contrast detection threshold of the patient with reduced central vision to obtain the visible content on each band; and finally merging the visible contents to obtain the simulated perceived image.
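A minimal NumPy sketch of this decompose-threshold-merge pipeline follows, using differences of Gaussian low-passes as the band decomposition and a simple band/low-pass ratio as the local band-limited contrast; the filter bank, contrast definition, and threshold values are illustrative assumptions rather than the patent's exact construction:

```python
import numpy as np

def gaussian_blur_fft(img, sigma):
    """Gaussian low-pass via the FFT (circular boundary; fine for a sketch).
    The frequency response of a Gaussian is exp(-2*pi^2*sigma^2*f^2)."""
    h, w = img.shape
    fy = np.fft.fftfreq(h)[:, None]
    fx = np.fft.fftfreq(w)[None, :]
    g = np.exp(-2.0 * np.pi ** 2 * sigma ** 2 * (fx ** 2 + fy ** 2))
    return np.real(np.fft.ifft2(np.fft.fft2(img) * g))

def simulate_low_vision(img, sigmas=(1.0, 2.0, 4.0), thresholds=(0.1, 0.05, 0.02)):
    """Decompose into bands (differences of Gaussian low-passes), keep only
    band content whose local band-limited contrast exceeds the patient's
    detection threshold, then merge the visible contents."""
    lows = [img] + [gaussian_blur_fft(img, s) for s in sigmas]
    out = lows[-1]                        # coarsest low-pass is always visible
    for i in range(len(sigmas)):
        band = lows[i] - lows[i + 1]      # band-pass content b_i
        local_mean = np.maximum(lows[i + 1], 1e-3)
        contrast = np.abs(band) / local_mean   # local band-limited contrast
        mask = (contrast >= thresholds[i]).astype(float)
        mask = gaussian_blur_fft(mask, 1.0)    # soften mask to avoid artifacts
        out = out + band * mask           # merge visible content v_i
    return out
```

With all thresholds set to zero the telescoping band sum reconstructs the input exactly, which is a useful sanity check on the decomposition; raising the thresholds progressively erases low-contrast detail, mimicking reduced contrast sensitivity.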
Optionally, the visible content v_i(x, y) on each spatial frequency band is calculated as follows:

v_i(x, y) = b_i(x, y) · [g * m_i](x, y)

where g * m_i denotes the visible-image mask m_i(x, y) smoothed by Gaussian filtering, the filtering serving to reduce artificial effects during simulation; the visible-image mask is defined as:

m_i(x, y) = 1 if c_i(x, y) ≥ CT_i, and m_i(x, y) = 0 otherwise

where a mask value of 1 at a point indicates that the pixel is visible and 0 indicates that it is invisible; c_i(x, y) is the local band-limited contrast of the image on the ith band; CT_i denotes the contrast detection threshold corresponding to the center frequency of the ith band, equal to the reciprocal of the contrast sensitivity at that frequency, i.e., CT_i = 1/CS_i = 1/CS(f_0/α), where CS(f) denotes the contrast sensitivity function with f in cycles/degree; f_0 is the center frequency of the ith band in cycles/image; α denotes the viewing angle, which is related to the size of the image and the distance of the observer from it and can be roughly estimated by α = arctan(w/2d), where w is the width of the image and d is the distance between the observer and the image; and b_i(x, y) is the band-pass filtering result of the image on the ith band.
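The viewing-angle estimate α = arctan(w/2d) and the threshold relation CT = 1/CS can be checked numerically; the conversion below reads the band center frequency as f_0/α (cycles/image divided by the viewing angle in degrees gives cycles/degree), and the CSF, image width, and viewing distance are hypothetical stand-ins:

```python
import math

def viewing_angle_deg(w, d):
    """Simple estimate alpha = arctan(w / 2d), returned in degrees."""
    return math.degrees(math.atan(w / (2.0 * d)))

def contrast_threshold(f0_cycles_per_image, alpha_deg, cs):
    """CT = 1 / CS(f), with the band center frequency converted from
    cycles/image to cycles/degree via the viewing angle."""
    return 1.0 / cs(f0_cycles_per_image / alpha_deg)

# hypothetical stand-in CSF and illustrative geometry (0.3 m wide image at 0.6 m)
cs = lambda f: max(1.0, 100.0 * math.exp(-0.3 * f))
alpha = viewing_angle_deg(0.3, 0.6)
ct = contrast_threshold(16.0, alpha, cs)
```

Any measured or fitted contrast sensitivity function can be substituted for the `cs` lambda without changing the threshold logic.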
Optionally, in S2, the image enhancement network is obtained by adding batch normalization between the convolution and the ReLU of each layer of a standard UNet structure and adding a Sigmoid activation function to the last layer of the enhancement network;
the UNet-based image enhancement network is cascaded with the vision system of the person with reduced central vision simulated in S1 to obtain a system; the system output is expected to be as close as possible to the input, and the loss function is constructed with this objective to train the neural network, wherein,
let the image enhancement network be F(Θ); the loss function used in training the system is:

L(Θ) = (1/n) Σ_{i=1}^{n} ||Ψ(F(I_i; Θ)) − I_i||²

where n is the number of training samples; I_i denotes the ith input image; and Ψ denotes the simulated vision system of S1, whose input is an image and whose output is the effect of viewing that image through the simulated eye with reduced central vision.
In a second aspect of the present invention, a computer device is provided, comprising at least one processor and at least one memory, wherein the memory stores a computer program which, when executed by the processor, causes the processor to perform the deep-learning-based vision-impairment-assisted image enhancement method.
In a third aspect of the invention, a computer-readable storage medium is provided, storing a computer program which, when executed by a processor within a device, causes the device to perform the deep-learning-based vision-impairment-assisted image enhancement method.
Compared with the prior art, the embodiment of the invention has at least one of the following beneficial effects:
according to the image enhancement method, the device and the medium based on the deep learning vision barrier assistance, the convolutional neural network is utilized to realize image enhancement, and the neural network has strong learning ability and is nonlinear, so that simultaneous enhancement of people in different spatial frequencies of the image can be realized; the invention has a plurality of various types of images in the data set used in training the image enhancement network, so the network has universality for various images.
The deep-learning-based vision-impairment-assisted image enhancement method, device, and medium can effectively realize image enhancement for vision assistance, and experiments show that image enhancement targeting reduced central vision can effectively improve patients' visual function and subjective perceptual quality.
Drawings
Other features, objects and advantages of the present invention will become more apparent upon reading of the detailed description of non-limiting embodiments, given with reference to the accompanying drawings in which:
fig. 1 is a block diagram of an image enhancement method according to an embodiment of the present invention.
FIG. 2 is a diagram showing simulation-verified results according to an embodiment of the present invention, wherein (a) is the original image; (b), (c) and (d) are the results of enhancing the original image with an adaptive enhancement method, a DCT-domain enhancement method, and the method of the present invention, respectively; and (e), (f), (g) and (h) are the simulated perceived images of a person with severe central vision impairment viewing (a), (b), (c) and (d), respectively.
Detailed Description
The following describes embodiments of the present invention in detail. The embodiments are implemented on the premise of the technical scheme of the present invention, and detailed implementation modes and specific operation processes are given. It should be noted that those skilled in the art can make variations and modifications without departing from the spirit of the invention, and all of these fall within the protection scope of the invention.
Existing vision-assistive image enhancement techniques mainly have two problems. First, they are intuitive proposals for compensating a particular frequency band, based on the general manifestation of the patient's reduced contrast sensitivity; but patients typically show contrast-sensitivity reductions of different degrees across the whole spatial-frequency range, so such intuitive methods cannot fully compensate the visual distortion caused by the patient's reduced contrast sensitivity, and the experimental results are unsatisfactory. Second, different types of images have different spectral distributions; the prior art mainly exposes several adjustable parameters for the user to tune manually when different images need to be enhanced, or is proposed for a specific type of image (such as faces or text) and cannot automatically and effectively enhance images of various types.
In view of the above problems, the invention provides a deep-learning-based method for vision-impairment-assisted image enhancement, which can be used to realize image enhancement for any vision-impairment symptom that can be simulated, so as to compensate for the image distortion caused by the patient's visual system; the method is further applied to realize image enhancement for reduced central vision. The deep-learning-based image enhancement framework for vision assistance is shown in Fig. 1. The main idea is to enhance the image with a convolutional neural network (Convolutional Neural Network, CNN) so as to compensate for the distortion introduced by the visually impaired patient's visual system. Specifically, a high-definition image input to the system first passes through the CNN enhancement network and then through the simulated vision system before being output; the system output is thus the appearance of the enhanced image in the eyes of the simulated visually impaired patient, and it is expected to be as close as possible to the original image (i.e., the system input). This is the basis of the system's training.
Specifically, in an embodiment, the image enhancement method for vision barrier assistance based on deep learning includes:
s100, designing a convolutional neural network for image enhancement, and connecting the output end of the convolutional neural network with the input end of a vision system of a simulated vision impairment patient to obtain a cascade system;
s200, training the convolutional neural network to obtain an image enhancement network, wherein the image enhancement network can realize image enhancement aiming at the vision barrier symptoms; wherein: training the convolutional neural network to obtain an image enhancement network, comprising:
the original image is input into the convolutional neural network for enhancement, and the enhancement result is input into the simulated vision system of the visually impaired patient for simulation; the simulated vision system outputs a simulated perceived image of the vision-impairment symptoms, which is also the output of the cascade system;
the loss between the output of the simulated vision system and the original image is calculated, and, with the objective of minimizing the difference between the input and output images of the cascade system, the original image is enhanced to compensate for the distortion introduced by the simulated vision system.
The embodiments of the invention can effectively realize image enhancement for vision assistance; experiments show that image enhancement targeting reduced central vision can effectively improve patients' visual function and subjective perceptual quality, content at different spatial frequencies of the image can be enhanced simultaneously, and the method generalizes to various types of images.
Before the above-described embodiments of the invention are carried out, the visual system of the visually impaired patient may be modeled, i.e., the vision-impairment symptoms to be compensated may be simulated using a series of image processing operations. A convolutional neural network usable for image enhancement is then designed; finally, the image enhancement network is combined with the simulated vision system of the visually impaired patient and trained end to end using the same high-definition image as both the input and the output of the system, and the trained enhancement network can be used to realize image enhancement for the vision-impairment symptom.
In the above embodiments of the invention, inputting the enhancement result into the simulated vision system of the visually impaired patient for simulation means: the input image is processed using image processing operations to obtain a simulated perceived image, i.e., the image as seen through the eyes of the simulated visually impaired patient.
In the above embodiments of the invention, the convolutional neural network for image enhancement may adopt a UNet-based structure; on top of the standard UNet structure, Batch Normalization (BN) is added between the convolution and the ReLU of each layer, and a Sigmoid activation function is added to the last layer of the convolutional neural network, yielding the image enhancement network.
In the above embodiments of the invention, minimizing the difference between the input and output images of the cascade system may specifically mean: denoting the input image of the cascade system as I, the goal of the cascade-system training is to train the image enhancement network such that the image obtained by passing the CNN-enhanced image through the simulated vision system is as close as possible to the input original image, i.e., the image seen through the eyes of the simulated visually impaired patient is as close as possible to the original image.
In another embodiment of the invention, image enhancement for reduced central vision is realized according to the method above, specifically including: the vision-impairment symptom of reduced central vision is first simulated using an approximate contrast sensitivity function (Contrast Sensitivity Function, CSF) based on the clinical measurement indices Pelli-Robson score and logMAR visual acuity and the multiband decomposition characteristics of the human eye; a UNet-based image enhancement network is then designed and trained. The invention can effectively realize image enhancement for vision assistance, and experiments show that image enhancement targeting reduced central vision can effectively improve patients' visual function and subjective perceptual quality.
In order to better illustrate the above technical solution of the present invention, as shown in fig. 1 and 2, an embodiment of the present invention for implementing image enhancement for central vision degradation specifically includes the following steps:
the first step is to simulate the vision system of a patient with reduced central vision, i.e., to process the input image through a series of image processing operations to obtain the image seen by the eyes of a simulated patient with reduced central vision:
specifically, central vision degradation was simulated using an approximate Contrast Sensitivity Function (CSF) based on clinical measurement metrics Pelli-Robson score and logMAR vision and multiband decomposition characteristics of the human eye: firstly decomposing an image to be simulated onto each spatial frequency band, then respectively solving the local band limit contrast of the image on each spatial frequency band, comparing the local band limit contrast with the contrast detection threshold of a patient with reduced central vision to obtain visible contents on each spatial frequency band, and finally merging the visible contents to obtain the simulated perceived image.
The approximate CSF used in the present invention can be expressed as:
wherein f is the spatial frequency in cycles/degree; CS denotes contrast sensitivity; PCS denotes the peak contrast sensitivity; PSF denotes the spatial frequency at which the peak contrast sensitivity occurs; the subscript l denotes the base-10 logarithm of the corresponding variable; and w_L and w_H are fitting parameters. Fitting the CSF measurements of a large number of normal-vision subjects yields the best fitting parameters: PCS=166, PSF=2.5, w_L=0.68, w_H=1.28. The CSF of a person with central vision degradation can be regarded as a translation of the normal human CSF toward the lower left, whose position is determined by PCS and PSF; these two parameters can in turn be determined from the patient's Pelli-Robson score and logMAR acuity:
wherein PR denotes the Pelli-Robson score; logMAR denotes the logMAR visual acuity; and COF_N is the cut-off frequency of the normal human CSF, which can be derived by setting PCS=166, PSF=2.5 and CS_l=0 in the CSF expression above.
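The CSF expression itself appears only as a figure in the original document. As a hedged illustration, the parameters named above can be combined in a log-parabola form; this functional form and the names below are assumptions, not the patent's exact formula:

```python
import math

# Hedged sketch: a log-parabola CSF consistent with the parameters named in
# the text (PCS, PSF, w_L, w_H; subscript l = base-10 logarithm). The exact
# expression in the patent figure is not reproduced here, so this form is an
# assumption.
PCS, PSF, W_L, W_H = 166.0, 2.5, 0.68, 1.28

def log_contrast_sensitivity(f):
    """Return log10 contrast sensitivity at spatial frequency f (cycles/degree)."""
    f_l, pcs_l, psf_l = math.log10(f), math.log10(PCS), math.log10(PSF)
    w = W_L if f_l < psf_l else W_H              # asymmetric widths about the peak
    return pcs_l - (w * (f_l - psf_l)) ** 2      # parabola in log-log coordinates

# Sensitivity peaks at f = PSF with value PCS:
assert abs(10 ** log_contrast_sensitivity(PSF) - PCS) < 1e-9

# The normal cut-off frequency COF_N solves CS_l = 0 on the high-frequency side:
cof_n = 10 ** (math.log10(PSF) + math.sqrt(math.log10(PCS)) / W_H)
assert abs(log_contrast_sensitivity(cof_n)) < 1e-9
```

Under this assumed form the cut-off works out to roughly 36 cycles/degree, in the usual range for normal vision; the patent's figure should be consulted for the authoritative expression.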
In this embodiment, preferably, the simulation process of the perceived image is as follows:
(1) First decompose the image onto individual frequency bands:
wherein F(r, θ) is the polar-coordinate form of the DFT of the image f(x, y); H_n(r, θ) denotes the ultra-high-frequency content of the image, which is omitted in the following discussion; and L_0(r, θ) and B_i(r, θ) are the low-pass and band-pass components obtained by filtering with the following cosine-log filter bank:
G_i(r) = 0.5[1 + cos(π log_2 r − πi)], i = 0, 1, …, n−1
The filters are all given in polar coordinates, where r is the polar radius, corresponding to the distance from a point in the frequency plane to the center of the spectrum. Performing an IDFT on the decomposed image spectrum yields the decomposition in the image spatial domain:
wherein l_0(x, y) denotes the low-pass filtering result of the image and b_i(x, y) denotes the band-pass filtering result of the image in the i-th band. The local band-limited contrast of the image in each band is then calculated as follows:
wherein i=1, 2, …,7; l (L) i (x, y) represents the total energy of the image below band i; c i (x, y) represents a local band-limited contrast image of the image at the ith band.
(2) Comparing the local band-limited contrast of each band with the contrast detection threshold at the corresponding spatial frequency determines which parts of each band image are visible and which are not, and a visible-image mask is defined:
wherein a mask value of 1 at a point indicates that the pixel is visible and 0 indicates that it is invisible; CT_i is the contrast detection threshold corresponding to the center frequency of the i-th band, equal to the reciprocal of the contrast sensitivity at that frequency, i.e. CT_i = 1/CS_i = 1/CS(f_0/α), wherein CS(f) denotes the Contrast Sensitivity Function (CSF), with f in cycles/degree; f_0 is the center frequency of the i-th band in cycles/image; and α denotes the viewing angle, which is related to the size of the image and the distance of the observer from the image and can be estimated simply as α = arctan(w/2d), where w is the width of the image and d is the distance between the observer and the image. To reduce artifacts during simulation, the mask image is Gaussian-filtered; the filtering result is denoted m̃_i(x, y). With the aid of the smoothed image mask, the visible portion of each band image can be represented as:
(3) The visible images in all bands are simply summed to obtain the simulated perceived image s(x, y):
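Steps (2) and (3) above can be sketched as follows; a normalized box kernel stands in for the Gaussian filtering of the text, and all names are illustrative:

```python
import numpy as np

# Sketch of the visibility-mask and merge steps: a band's content counts as
# visible where its local contrast |c_i| exceeds the detection threshold
# CT_i = 1/CS_i; the binary mask is smoothed (box kernel standing in for the
# Gaussian of the text) and weights the band image; the visible band images
# are then summed into the simulated perceived image.
def visible_band(c_i, b_i, ct_i, k=3):
    mask = (np.abs(c_i) >= ct_i).astype(float)    # 1 = visible, 0 = invisible
    kernel = np.ones(k) / k                       # stand-in for Gaussian blur
    smooth = np.convolve(mask, kernel, mode="same")
    return smooth * b_i                           # visible part of this band

def simulated_percept(contrasts, bands, thresholds):
    return sum(visible_band(c, b, t) for c, b, t in zip(contrasts, bands, thresholds))

c = np.array([0.05, 0.5, 0.5, 0.5, 0.05])
b = np.ones_like(c)
v = visible_band(c, b, ct_i=0.1)
assert np.isclose(v[2], 1.0)     # well inside the visible region: fully kept
assert v[0] < 0.5                # below threshold: strongly attenuated
```

Smoothing the binary mask before weighting is what prevents the hard visible/invisible boundary from introducing ringing-like artifacts into the simulated percept.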
secondly, designing a convolutional neural network based on a UNet structure for realizing image enhancement:
On the basis of the standard UNet structure, Batch Normalization (BN) is added between the convolution and the ReLU of each layer, and a Sigmoid activation function is added after the last layer of the enhancement network to obtain the image enhancement network.
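The per-layer ordering just described (convolution → batch normalization → ReLU, with a final Sigmoid) can be illustrated with a minimal numpy sketch; a real implementation would build the full UNet in a deep-learning framework, and the 1-D convolution here is only a stand-in:

```python
import numpy as np

# Minimal illustration of the layer ordering the text describes and of why
# the final Sigmoid is added: it maps the network output into [0, 1], i.e.,
# valid normalized image intensities.
def batch_norm(x, eps=1e-5):
    return (x - x.mean()) / np.sqrt(x.var() + eps)

def relu(x):
    return np.maximum(x, 0.0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def conv_bn_relu(x, kernel):
    conv = np.convolve(x, kernel, mode="same")  # 1-D stand-in for 2-D convolution
    return relu(batch_norm(conv))

x = np.random.default_rng(0).normal(size=64)
y = sigmoid(conv_bn_relu(x, np.array([0.25, 0.5, 0.25])))
assert 0.0 <= y.min() and y.max() <= 1.0        # Sigmoid keeps outputs in range
```
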
The third step is to connect the image enhancement network of the second step and the simulated vision system of the first step according to the framework shown in fig. 1 to obtain a cascade system, and then to train the neural network using the same high-definition image as both the input and the expected output of the cascade system:
Specifically, the UNet-based image enhancement network (neural network) and the simulated vision system of the person with central vision degradation are cascaded according to the framework shown in fig. 1. The output of the cascade system is expected to match its input as closely as possible, so a loss function is constructed with this as the target to train the neural network. Let the input image of the cascade system be I; the goal of training is to learn an image enhancement network F(Θ) such that the image Ψ(F(I; Θ)), obtained by passing the enhanced image F(I; Θ) through the simulated vision system, is as close as possible to the input of the cascade system; that is, the image seen by the eyes of the simulated person with central vision degradation is as close as possible to the original high-definition image. The similarity between two images is measured by the Mean Square Error (MSE), so the loss function used to train the network is:
wherein n is the number of training samples, and Ψ denotes the simulated vision system of the first step, whose input is an image and whose output is the effect of viewing that image through the simulated eye with central vision degradation.
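The cascade training objective can be sketched as follows; the blur function below is a toy stand-in for the simulated vision system Ψ, and all names are illustrative:

```python
import numpy as np

# Sketch of the cascade loss: pass the enhanced image through the simulated
# vision system psi and take the mean square error against the original
# input; training the enhancer minimizes this, averaged over the samples.
def cascade_loss(images, enhance, psi):
    return sum(float(np.mean((psi(enhance(i)) - i) ** 2)) for i in images) / len(images)

blur = lambda img: 0.5 * img + 0.25 * np.roll(img, 1) + 0.25 * np.roll(img, -1)
identity = lambda img: img

imgs = [np.random.default_rng(7).random(32)]
# With no enhancement, the simulated degradation leaves a residual error:
assert cascade_loss(imgs, identity, blur) > 0.0
# If the enhancer exactly pre-compensated psi, the loss would be zero:
assert cascade_loss(imgs, identity, identity) == 0.0
```

Because Ψ is fixed, gradient descent on this loss pushes F(Θ) toward a pre-compensation of the degradation, which is exactly the stated training goal.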
The fourth step is to take out the trained image enhancement network (neural network) and use it to realize image enhancement for central vision degradation:
The trained image enhancement network F(Θ) is taken out of the cascade system shown in FIG. 1; an image I is input to the network, and the network output F(I; Θ) is the image enhancement result for the simulated visual impairment (here, central vision degradation of different degrees).
According to the steps of the above embodiment, 590 high-definition images collected from the internet were resized to 1280×720 and then divided into a training set and a test set at a ratio of 19:1. The model was trained on two NVIDIA GTX 2080Ti GPUs with a batch size of 2 for 150 epochs, using the Adam optimizer with a learning rate of 0.001.
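The training setup above amounts to the following split and hyperparameters (a sketch; the variable names are illustrative):

```python
# Dataset split and hyperparameters described in the text: 590 images,
# resized to 1280x720, split 19:1 into training and test sets; batch size 2,
# 150 epochs, Adam with learning rate 0.001.
n_images = 590
n_test = n_images // 20          # 1 part in 20 at a 19:1 ratio
n_train = n_images - n_test

config = {
    "image_size": (1280, 720),
    "batch_size": 2,
    "epochs": 150,
    "optimizer": "Adam",
    "learning_rate": 0.001,
}

assert n_train + n_test == n_images
assert n_train / n_test >= 19    # at least 19 training images per test image
```
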
The implementation effect is as follows:
In order to verify the effectiveness of the image enhancement method for central vision degradation provided in the above embodiment of the present invention, and further the effectiveness of the deep-learning-based image enhancement framework for visual impairment assistance, simulation verification and patient experiments may be performed. The experimental results were compared with the two most classical methods: the adaptive enhancement method (E. Peli, R. B. Goldstein, G. M. Young, C. L. Trempe, and S. M. Buzney, "Image enhancement for the visually impaired: Simulations and experimental results," Investigative Ophthalmology & Visual Science, vol. 32, no. 8, pp. 2337, 1991.) and the DCT-domain enhancement method (G. Luo, P. Satgunam, and E. Peli, "Visual search performance of patients with vision impairment: effect of JPEG image enhancement," Ophthalmic and Physiological Optics, vol. 32, no. 5, pp. 421-428, 2012.). In the experiment, the viewing angle was fixed at α = 14°, and central vision degradation was classified into three grades: mild (PR < 1.7, logMAR > 0.3), moderate (PR < 1.5, logMAR > 0.477), and severe (PR < 1.0, logMAR > 1.0), corresponding to three CSFs.
As previously described, the output of the simulated vision system may be regarded as the image seen by a visually impaired patient, so the performance of the various enhancement methods can be verified by comparing the similarity between the enhanced simulated perceived image and the original image. Using the images in the test set, the original image and the images enhanced by the three methods (the adaptive enhancement method, the DCT-domain enhancement method, and the method of the present invention) were each input into the simulated vision systems of persons with mild, moderate and severe central vision degradation, and the peak signal-to-noise ratio (PSNR), Structural Similarity Index (SSIM) and Mean Square Error (MSE) between the system output and the original image were computed; the results are shown in table 1. In all test cases, the image enhanced by the method of the present invention is, after passing through the simulated vision system, closest to the original image; that is, the enhancement method of the present invention performs best. To show the enhancement results clearly, fig. 2 presents the images enhanced by the different methods together with the simulated perception of a person with severe central vision degradation viewing them; the simulated percept of the image enhanced by the method of the present invention is closest to the original image, illustrating the superiority of the method.
TABLE 1
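The objective similarity metrics reported above can be computed as sketched below for PSNR (the helper name is illustrative; SSIM would typically come from an image-quality library):

```python
import numpy as np

# PSNR for 8-bit images: 10*log10(255^2 / MSE). Higher PSNR means the
# simulated perceived image is closer to the original.
def psnr(original, degraded, peak=255.0):
    mse = np.mean((np.asarray(original, float) - np.asarray(degraded, float)) ** 2)
    return float("inf") if mse == 0 else 10.0 * np.log10(peak ** 2 / mse)

a = np.zeros((4, 4))
b = np.full((4, 4), 16.0)                 # uniform error of 16 -> MSE = 256
assert abs(psnr(a, b) - 10.0 * np.log10(255.0 ** 2 / 256.0)) < 1e-9
assert psnr(a, a) == float("inf")         # identical images: infinite PSNR
```
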
To further verify the effect of the image enhancement method of the present invention on persons with reduced central vision, patient experiments may be performed. Myopic persons exhibit central vision degradation symptoms caused by reduced visual acuity and contrast sensitivity, so myopic subjects are convenient to recruit for the experiment. During verification, 15 myopic subjects were recruited in total; they first underwent refraction so that their uncorrected vision was between 0.1 and 0.8, and were then divided according to their vision into experimental groups of different central-vision-degradation grades. The patient experiments are divided into objective and subjective experiments: the objective experiments evaluate, through search tasks, how much image enhancement improves the visual function of visually impaired patients; the subjective experiments evaluate, through comparison-and-selection tasks, how much image enhancement improves the patients' subjective perceptual quality.
Four classes of images were involved in the objective experiments: the original unenhanced images, and the images enhanced by the adaptive enhancement method, the DCT-domain enhancement method, and the method of the present invention, with 15 images per class. The subject had to find the item announced by voice broadcast in the image and click on it. The mean and standard deviation of the search accuracy of the 15 subjects are shown in table 2. The results show that the image enhancement method of the present invention can effectively improve the patients' visual search function, and that the improvement is better than that of the two compared methods.
TABLE 2
In the subjective experiments, the original image, the image enhanced by the adaptive enhancement method, and the image enhanced by the DCT-domain enhancement method were each displayed on the screen in a pair with the image enhanced by the method of the present invention, 20 pairs per class, and the subject had to select the image that looked clearer and of better quality. The preference ratio of a class of image pairs is defined as the proportion of trials in which the subject selected the image enhanced by the method of the present invention. The preference ratios of the three classes of image pairs involved in the experiment are shown in table 3; all are greater than 0.5, indicating that the method of the present invention improves the patients' subjective perceptual quality and that the improvement is superior to the two compared methods.
TABLE 3
According to the image enhancement method for central vision degradation provided by the above embodiment of the present invention, based on the proposed image enhancement framework for visual impairment assistance, the vision system of a person with central vision degradation is simulated by a series of image processing operations, a UNet-based image enhancement network is designed, and the image enhancement network and the simulated vision system are placed into the proposed framework for end-to-end training. In this way, image enhancement for central vision degradation is realized, and the visual function and subjective perceptual quality of patients with central vision degradation can be effectively improved.
In another embodiment, the present invention also provides a computer device comprising at least one processor, and at least one memory, wherein the memory stores a computer program that, when executed by the processor, enables the processor to perform the deep learning based vision-impairment aided image enhancement method.
In another embodiment, the invention also provides a computer readable storage medium storing a computer program which, when executed by a processor within a device, causes the device to perform the deep learning based vision barrier assisted image enhancement method.
It will be appreciated by those skilled in the art that embodiments of the present invention may be provided as a method, or as a computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present invention is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
While preferred embodiments of the present invention have been described, additional variations and modifications in those embodiments may occur to those skilled in the art once they learn of the basic inventive concepts. It is therefore intended that the following claims be interpreted as including the preferred embodiments and all such alterations and modifications as fall within the scope of the invention.
It will be apparent to those skilled in the art that various modifications and variations can be made to the present invention without departing from the spirit or scope of the invention. Thus, it is intended that the present invention also include such modifications and alterations insofar as they come within the scope of the appended claims or the equivalents thereof.
Claims (10)
1. A deep learning-based vision barrier-assisted image enhancement method, comprising:
designing a convolutional neural network for image enhancement, and connecting the output end of the convolutional neural network with the input end of a vision system for simulating a visually impaired patient to obtain a cascade system;
training the convolutional neural network to obtain an image enhancement network, wherein the image enhancement network can realize image enhancement aiming at the vision barrier symptoms;
wherein: training the convolutional neural network to obtain an image enhancement network, comprising:
inputting an original image into the convolutional neural network for enhancement, and inputting the enhancement result into the vision system of the simulated vision-impaired patient for simulation, wherein the vision system of the simulated vision-impaired patient outputs a simulated perceived image of the vision impairment symptom, which is also the output of the cascade system;
and calculating the loss between the output of the vision system of the simulated vision-impaired patient and the original image, with the objective of minimizing the difference between the input and output images of the cascade system, so that the original image is enhanced to compensate for the distortion introduced by the vision system of the simulated vision-impaired patient.
2. The deep learning based vision impairment aiding image enhancement method of claim 1, wherein inputting the enhancement result into the vision system of the simulated vision impairment patient for simulation comprises: and processing the input original image by utilizing image processing operation to obtain a simulated perceived image which is an image seen in eyes of the patient with the simulated visual impairment.
3. The deep learning based vision barrier assisted image enhancement method of claim 1, wherein a convolutional neural network for image enhancement is designed, wherein a UNet based architecture is used;
on the basis of a standard UNet structure, batch Normalization (BN) is added between convolution and ReLU of each layer, and Sigmoid activation function is added to the last layer of the convolution neural network to obtain the image enhancement network.
4. The deep learning based vision-impaired vision-aided image enhancement method of claim 1, wherein the objective of minimizing the difference between the input and output images of the cascade system is:
wherein the input image of the cascade system is set as I, and the training of the cascade system aims at training the image enhancement network so that the image enhanced by the convolutional neural network, after passing through the simulated vision system, is close to the input original image, i.e., the image seen by the eyes of the simulated vision-impaired patient is close to the original image.
5. The deep learning based vision impairment aiding image enhancement method of claim 1, wherein when the vision impaired patient is a patient with reduced central vision, the image enhancement method comprises:
s1: simulating the vision system of the patient with reduced central vision, namely processing the input image to obtain a simulated perceived image of the patient with reduced central vision;
s2: designing a convolutional neural network capable of realizing image enhancement, wherein the convolutional neural network uses a network structure based on UNet;
s3: connecting the convolutional neural network in the step S2 and the simulated vision system in the step S1 to obtain a cascade system, and then training the convolutional neural network by using the same high-definition image as the input and output of the cascade system, wherein the trained convolutional neural network is used as an image enhancement network;
s4: and inputting the image to be enhanced into the trained image enhancement network in S3, so as to realize the image enhancement aiming at the central vision reduction.
6. The deep learning based vision impairment aiding image enhancement method of claim 5, wherein in S1, central vision degradation is simulated using an approximate contrast sensitivity function based on clinical measurement metrics Pelli-Robson score and logMAR vision and multiband decomposition characteristics of the human eye, comprising:
firstly decomposing an image to be simulated onto each spatial frequency band, then respectively solving the local band limit contrast of the image on each spatial frequency band, comparing the local band limit contrast with the contrast detection threshold of a patient with reduced central vision to obtain visible contents on each spatial frequency band, and finally merging the visible contents to obtain the simulated perceived image.
7. The deep learning based visual barrier assisted image enhancement method of claim 6, wherein the visible content v_i(x, y) on each spatial frequency band is calculated as follows:
wherein m̃_i(x, y) is obtained by Gaussian-filtering the visible image mask m_i(x, y) in order to reduce artifacts during simulation, and the visible image mask is defined as:
wherein a mask value of 1 at a point indicates that the pixel is visible and 0 indicates that it is invisible; CT_i is the contrast detection threshold corresponding to the center frequency of the i-th band, equal to the reciprocal of the contrast sensitivity at that frequency, i.e., CT_i = 1/CS_i = 1/CS(f_0/α), wherein CS(f) represents the contrast sensitivity function, with f in cycles/degree; f_0 is the center frequency of the i-th band in cycles/image; α represents the viewing angle, which is related to the size of the image and the distance of the observer from the image and is estimated simply as α = arctan(w/2d), wherein w represents the width of the image and d represents the distance between the observer and the image; and b_i(x, y) represents the band-pass filtering result of the image in the i-th band.
8. The deep learning based visual barrier assisted image enhancement method according to claim 5, wherein in S2, the image enhancement network is obtained by adding batch normalization between convolution and ReLU of each layer and adding Sigmoid activation function at the last layer of the enhancement network on the basis of a standard UNet structure;
the UNet-based image enhancement network and the vision system of the person with central vision degradation simulated in S1 are cascaded to obtain a system whose output is expected to be the same as its input, and the neural network is trained by constructing a loss function with this as the target, wherein,
let the image enhancement network be F (Θ), the loss function used in the system training is:
wherein n is the number of training samples; i i Representing an i-th input image; psi denotes the simulated vision system in S1, which inputs as one image and outputs as the effect of looking at the image in the simulated central vision-degrading eye.
9. A computer device comprising at least one processor, and at least one memory, wherein the memory stores a computer program that, when executed by the processor, enables the processor to perform the deep learning based vision barrier assisted image enhancement method of any one of claims 1-8.
10. A computer readable storage medium storing a computer program which, when executed by a processor within a device, causes the device to perform the deep learning based vision barrier assisted image enhancement method of any one of claims 1-8.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110829947.3A CN113628130B (en) | 2021-07-22 | 2021-07-22 | Deep learning-based vision barrier-assisted image enhancement method, equipment and medium |
Publications (2)
Publication Number | Publication Date |
---|---|
CN113628130A CN113628130A (en) | 2021-11-09 |
CN113628130B true CN113628130B (en) | 2023-10-27 |