CN113628130B

CN113628130B - Deep learning-based image enhancement method, device and medium for visual impairment assistance

Info

Publication number: CN113628130B
Application number: CN202110829947.3A
Authority: CN
Inventors: 翟广涛; 吴思婧; 段慧煜; 闵雄阔; 高艺璇; 曹于勤
Original assignee: Shanghai Jiao Tong University
Current assignee: Shanghai Jiao Tong University
Priority date: 2021-07-22
Filing date: 2021-07-22
Publication date: 2023-10-27
Anticipated expiration: 2041-07-22
Also published as: CN113628130A

Abstract

The present invention provides a deep learning-based image enhancement method, device and medium for visual impairment assistance. The method comprises: connecting the output of a convolutional neural network to the input of a simulated visual system of a visually impaired patient to obtain a cascade system; and training the convolutional neural network to obtain an image enhancement network. During training, an original image is input into the convolutional neural network for enhancement, and the enhancement result is input into the simulated visual system of the visually impaired patient, so that the output of the cascade system is a simulated perceived image exhibiting the symptoms of the visual impairment. The loss between the output of the cascade system and the original image is calculated, and the network is trained to minimize the difference between the input and output images of the cascade system, so that the enhancement of the original image compensates for the distortion introduced by the simulated visual system. The image enhancement network obtained by the present invention can effectively realize image enhancement for visual impairment assistance. Experiments show that image enhancement for central vision loss can effectively improve the patient's visual function and subjective perceived quality.

Description

Deep learning-based image enhancement method, device and medium for visual impairment assistance

Technical Field

The invention relates to the fields of multimedia image enhancement and visual impairment assistance, and in particular to a deep learning-based image enhancement method, device and medium for visual impairment assistance.

Background

Visual impairment is a serious social and public health problem worldwide. Data published by the World Health Organization in 2019 show that at least 2.2 billion people worldwide have a vision impairment or blindness. Most visually impaired patients live in developing countries, and most are over 50 years old. China, the world's largest developing country and one of the most rapidly aging, is seriously affected. Visual impairment causes patients great difficulty in work, daily life and other respects; it brings them heavy mental stress and greatly reduces their sense of well-being, and the medical and daily care they require also places a heavy burden on families and society.

The main causes of visual impairment are uncorrected refractive error, cataract, age-related macular degeneration, glaucoma, diabetic retinopathy, and so on. Different diseases lead to different symptoms, so achieving the best assistive effect requires a different image enhancement algorithm for each symptom. However, visually impaired patients often have more than one symptom. Among these, the most common is central vision loss caused by reduced visual acuity and reduced contrast sensitivity, and almost all visually impaired patients have central vision loss to some degree. An image enhancement algorithm aimed at compensating for central vision loss can therefore help most patients.

Image enhancement algorithms for central vision loss can be broadly divided into two categories: algorithms designed specifically to compensate for a patient's central vision loss, and general-purpose image processing algorithms applied to the same end. In recent years, many teams in academia and industry have begun developing assistive devices for the visually impaired based on image enhancement algorithms. However, limited by theory, technology and other factors, these devices struggle to meet the actual needs of visually impaired patients and have not been commercialized on a large scale.

Image enhancement algorithms designed specifically to compensate for central vision loss can be traced back to the adaptive enhancement method (Adaptive Enhancement) first proposed by Peli and Peli. By amplifying the high-frequency content of the image and compressing the low-frequency content toward the middle gray value, it compensates for the patient's reduced high-frequency contrast sensitivity while avoiding gray-value saturation. The wideband enhancement method (Wideband Enhancement) was proposed by Peli et al. in Eli Peli, Jeonghoon Kim, Yitzhak Yitzhaky, Robert B. Goldstein, and Russell L. Woods, "Wideband enhancement of television images for people with visual impairments," Journal of the Optical Society of America A, vol.21, no.6, pp.937-50, 2004. It first extracts bipolar features such as edges and corners using a feature detection algorithm based on the visual characteristics of the human eye, and then scales these features and superimposes them on the original image to achieve enhancement. Tang et al. and Kim et al. achieve image and video enhancement for visual impairment assistance by modifying, during the decompression phase, the elements of the quantization matrix that correspond to the critical frequency range (3-7 cycles/degree), for JPEG-compressed images and MPEG-compressed video respectively.

General-purpose image processing algorithms, such as image binarization, unsharp masking, and extracting image edges with a general high-pass filter and superimposing them back onto the original image, can enhance the contrast or high-frequency content of an image. They therefore partly compensate for the reduced contrast sensitivity of visually impaired patients at high spatial frequencies, and are another approach to image enhancement for central vision loss.

For central vision loss, the most common assistive device is enhanced-display glasses. Hwang et al. enhance real scenes by superimposing image edges extracted with a Laplacian filter onto the corresponding real scene through the see-through display of Google Glass (Alex D. Hwang and Eli Peli, "Augmented edge enhancement on Google Glass for vision-impaired users," Information Display, vol.30, no.3, pp.16-19, 2014). The American company eSight has designed assistive glasses for the visually impaired that are already in production. The glasses capture the real scene with a high-speed, high-resolution camera, process it with an image enhancement algorithm, and display the result on an OLED screen; the user can also adjust the enhancement parameters to personal preference.

However, these existing image enhancement algorithms for central vision loss share two general problems that make them difficult to apply in practical assistive devices. First, they were proposed intuitively, from the general pattern of reduced contrast sensitivity in patients, to compensate a particular frequency band. Patients usually have reduced contrast sensitivity to varying degrees across the whole range of spatial frequencies, so such intuitive methods cannot fully compensate for the resulting visual distortion, and their experimental results are unsatisfactory. Second, different types of images have different spectral distributions. Existing methods either expose several adjustable parameters that the user must tune manually for each image, or are designed for a specific type of image (such as faces or text), and so cannot automatically and effectively enhance images of all types.

Disclosure of Invention

In view of the deficiencies of the prior art, the object of the invention is to provide a deep learning-based image enhancement method, device and medium for visual impairment assistance that can automatically and effectively enhance images of all types, and in particular can effectively compensate for a patient's central vision loss.

In a first aspect of the present invention, there is provided a deep learning-based image enhancement method for visual impairment assistance, including:

designing a convolutional neural network for image enhancement, and connecting the output of the convolutional neural network to the input of a simulated visual system of a visually impaired patient to obtain a cascade system;

training the convolutional neural network to obtain an image enhancement network, wherein the image enhancement network realizes image enhancement for the visual impairment symptom;

wherein training the convolutional neural network to obtain the image enhancement network comprises:

inputting an original image into the convolutional neural network for enhancement, and inputting the enhancement result into the simulated visual system of the visually impaired patient, wherein the simulated visual system outputs a simulated perceived image exhibiting the visual impairment symptom, which is also the output of the cascade system;

and calculating the loss between the output of the simulated visual system and the original image, with the goal of minimizing the difference between the input and output images of the cascade system, so that the enhancement of the original image compensates for the distortion introduced by the simulated visual system of the visually impaired patient.

Optionally, inputting the enhancement result into the simulated visual system of the visually impaired patient means: processing the image input to the simulated visual system with image processing operations to obtain a simulated perceived image, that is, the image as seen by the simulated visually impaired patient.

Optionally, the convolutional neural network designed for image enhancement is a convolutional neural network based on the UNet structure;

on the basis of the standard UNet structure, Batch Normalization (BN) is added between the convolution and the ReLU of each layer, and a Sigmoid activation function is added to the last layer of the convolutional neural network, to obtain the image enhancement network.

Optionally, the goal of minimizing the difference between the input and output images of the cascade system means:

letting the input image of the cascade system be I, the cascade system is trained in order to train the image enhancement network, such that the image obtained by passing the network-enhanced image through the simulated visual system is as close as possible to the input original image; that is, the image seen by the simulated visually impaired patient is as close as possible to the original image.

Optionally, when the visually impaired patient is a patient with central vision loss, the image enhancement method includes:

S1: simulating the visual system of the patient with central vision loss, namely processing the input image to obtain a simulated perceived image of the patient with central vision loss;

S2: designing a convolutional neural network capable of image enhancement, wherein the convolutional neural network uses a UNet-based network structure;

S3: connecting the convolutional neural network of S2 and the simulated visual system of S1 to obtain a cascade system, and then training the convolutional neural network using the same high-definition image as both the input and the target output of the cascade system, wherein the trained convolutional neural network serves as the image enhancement network;

S4: inputting the image to be enhanced into the image enhancement network trained in S3, thereby realizing image enhancement for central vision loss.

Optionally, in S1, central vision loss is simulated using an approximate contrast sensitivity function based on the clinical measures Pelli-Robson score and logMAR visual acuity, together with the multiband decomposition property of the human eye, comprising:

first decomposing the image to be simulated into spatial frequency bands, then computing the local band-limited contrast of the image on each band and comparing it with the contrast detection threshold of the patient with central vision loss to obtain the visible content on each band, and finally merging the visible content to obtain the simulated perceived image.

Optionally, the visible content $v_i(x, y)$ on each spatial frequency band is calculated as follows:

$$v_i(x, y) = b_i(x, y)\,\tilde{m}_i(x, y)$$

wherein $\tilde{m}_i(x, y)$ is the result of Gaussian filtering the visible image mask $m_i(x, y)$, which is done to reduce artifacts in the simulation; the visible image mask is defined as:

$$m_i(x, y) = \begin{cases} 1, & |c_i(x, y)| \ge CT_i \\ 0, & \text{otherwise} \end{cases}$$

wherein $c_i(x, y)$ is the local band-limited contrast on the i-th band; a mask value of 1 at a point indicates that the pixel is visible, and 0 indicates that it is invisible; $CT_i$ is the contrast detection threshold at the center frequency of the i-th band, equal to the reciprocal of the contrast sensitivity at that frequency, i.e. $CT_i = 1/CS_i = 1/CS(f_0/\alpha)$, wherein $CS(f)$ is the contrast sensitivity function with $f$ in cycles/degree; $f_0$ is the center frequency of the i-th band in cycles/image; $\alpha$ is the viewing angle, which depends on the size of the image and the distance of the observer from the image and is simply estimated by $\alpha = \arctan\left(\frac{w}{2d}\right)$, where $w$ is the width of the image and $d$ is the distance between the observer and the image; and $b_i(x, y)$ is the band-pass filtering result of the image on the i-th band.

Optionally, in S2, the image enhancement network is obtained by adding batch normalization between the convolution and the ReLU of each layer of a standard UNet structure, and adding a Sigmoid activation function to the last layer of the enhancement network;

the UNet-based image enhancement network and the visual system of the person with central vision loss simulated in S1 are cascaded into one system, whose output is expected to match its input as closely as possible, and the loss function used to train the neural network is constructed with this goal, wherein,

letting the image enhancement network be $F(\Theta)$, the loss function used in training the system is:

$$L(\Theta) = \frac{1}{n}\sum_{i=1}^{n}\left\lVert \psi\big(F(I_i;\Theta)\big) - I_i \right\rVert_2^2$$

wherein $n$ is the number of training samples; $I_i$ is the i-th input image; and $\psi$ denotes the simulated visual system of S1, whose input is an image and whose output is that image as seen by the simulated person with central vision loss.

In a second aspect of the present invention, a computer device is provided, comprising at least one processor and at least one memory, wherein the memory stores a computer program which, when executed by the processor, enables the processor to perform the deep learning-based image enhancement method for visual impairment assistance.

In a third aspect of the invention, a computer-readable storage medium is provided, storing instructions which, when executed by a processor of a device, cause the device to perform the deep learning-based image enhancement method for visual impairment assistance.

Compared with the prior art, the embodiments of the invention have at least one of the following beneficial effects:

The deep learning-based image enhancement method, device and medium for visual impairment assistance realize image enhancement with a convolutional neural network. Because the neural network has strong learning capacity and is nonlinear, it can enhance the image at different spatial frequencies simultaneously. The data set used to train the image enhancement network contains many images of various types, so the network generalizes to images of all kinds.

The deep learning-based image enhancement method, device and medium for visual impairment assistance can effectively realize image enhancement for visual impairment assistance. Experiments show that image enhancement for central vision loss can effectively improve the patient's visual function and subjective perceived quality.

Drawings

Other features, objects and advantages of the present invention will become more apparent upon reading the detailed description of non-limiting embodiments, given with reference to the accompanying drawings, in which:

FIG. 1 is a block diagram of an image enhancement method according to an embodiment of the present invention.

FIG. 2 shows simulation verification results according to an embodiment of the present invention, wherein (a) is the original image; (b), (c) and (d) are the results of enhancing the original image with the adaptive enhancement method, the DCT-domain enhancement method and the method of the invention, respectively; and (e), (f), (g) and (h) are the simulated perceived images of a person with severe central vision loss viewing (a), (b), (c) and (d), respectively.

Detailed Description

The embodiments of the present invention are described in detail below. The embodiments are implemented on the premise of the technical solution of the invention, and detailed implementations and specific operating procedures are given. It should be noted that those skilled in the art can make variations and modifications without departing from the spirit of the invention, all of which fall within the scope of the invention.

Existing image enhancement techniques for visual impairment assistance have two main problems. First, they were proposed intuitively, from the general pattern of reduced contrast sensitivity in patients, to compensate a particular frequency band. Patients usually have reduced contrast sensitivity to varying degrees across the whole range of spatial frequencies, so such intuitive methods cannot fully compensate for the resulting visual distortion, and their experimental results are unsatisfactory. Second, different types of images have different spectral distributions. The prior art either exposes several adjustable parameters that the user must tune manually for each image, or is designed for a specific type of image (such as faces or text), and so cannot automatically and effectively enhance images of all types.

In view of these problems, the present invention provides a deep learning-based image enhancement method for visual impairment assistance, which can be used to realize image enhancement for any visual impairment symptom that can be simulated, so as to compensate for the image distortion introduced by the patient's visual system; the method is further applied to realize image enhancement for central vision loss. The deep learning-based image enhancement framework for visual impairment assistance is shown in FIG. 1. The main idea is to enhance the image with a convolutional neural network (CNN) so as to compensate for the distortion introduced by the visual system of the visually impaired patient. Specifically, the high-definition image input to the system is first enhanced by the CNN enhancement network and then passed through the simulated visual system before being output. The system output is the enhanced image as seen by the simulated visually impaired patient, and it is expected to be as close as possible to the original image (that is, the system input); this is the basis on which the system is trained.

Specifically, in an embodiment, the deep learning-based image enhancement method for visual impairment assistance includes:

S100, designing a convolutional neural network for image enhancement, and connecting the output of the convolutional neural network to the input of a simulated visual system of a visually impaired patient to obtain a cascade system;

S200, training the convolutional neural network to obtain an image enhancement network, wherein the image enhancement network realizes image enhancement for the visual impairment symptom; wherein training the convolutional neural network to obtain the image enhancement network comprises:

inputting the original image into the convolutional neural network for enhancement, and inputting the enhancement result into the simulated visual system of the visually impaired patient, wherein the simulated visual system outputs a simulated perceived image exhibiting the visual impairment symptom, which is also the output of the cascade system;

calculating the loss between the output of the simulated visual system and the original image, with the goal of minimizing the difference between the input and output images of the cascade system, so that the enhancement of the original image compensates for the distortion introduced by the simulated visual system of the visually impaired patient.

This embodiment of the invention can effectively realize image enhancement for visual impairment assistance. Experiments show that image enhancement for central vision loss can effectively improve the patient's visual function and subjective perceived quality; the image can be enhanced at different spatial frequencies simultaneously, and the method generalizes to images of all kinds.

Before the above embodiment is carried out, the visual system of the visually impaired patient is modeled, that is, the visual impairment symptom to be compensated is simulated using a series of differentiable operations. A convolutional neural network suitable for image enhancement is then designed. Finally, the image enhancement network is combined with the simulated visual system of the visually impaired patient and trained end to end, using the same high-definition image as both the input and the target output of the system. The trained enhancement network can then be used to realize image enhancement for that visual impairment symptom.

In the above embodiment of the present invention, inputting the enhancement result into the simulated visual system of the visually impaired patient means: processing the image input to the simulated visual system with image processing operations to obtain a simulated perceived image, that is, the image as seen by the simulated visually impaired patient.

In the above embodiment of the present invention, the convolutional neural network for image enhancement may be a convolutional neural network based on the UNet structure; on the basis of the standard UNet structure, Batch Normalization (BN) is added between the convolution and the ReLU of each layer, and a Sigmoid activation function is added to the last layer of the convolutional neural network, to obtain the image enhancement network.

In the above embodiment of the present invention, the goal of minimizing the difference between the input and output images of the cascade system may specifically be as follows: let the input image of the cascade system be I; the cascade system is trained in order to train the image enhancement network, such that the image obtained by passing the network-enhanced image through the simulated visual system is as close as possible to the input original image, that is, the image seen by the simulated visually impaired patient is as close as possible to the original image.

In another embodiment of the present invention, image enhancement for central vision loss is realized according to the method described above. Specifically, the symptom of central vision loss is first simulated using an approximate contrast sensitivity function (CSF) based on the clinical measures Pelli-Robson score and logMAR visual acuity, together with the multiband decomposition property of the human eye; a UNet-based image enhancement network is then designed and trained. The invention can effectively realize image enhancement for visual impairment assistance, and experiments show that image enhancement for central vision loss can effectively improve the patient's visual function and subjective perceived quality.

To better illustrate the above technical solution, as shown in FIG. 1 and FIG. 2, an embodiment of the present invention that realizes image enhancement for central vision loss includes the following steps:

The first step is to simulate the visual system of a patient with central vision loss, namely to process the input image with a series of image processing operations to obtain the image as seen by the simulated patient with central vision loss:

Specifically, central vision loss is simulated using an approximate contrast sensitivity function (CSF) based on the clinical measures Pelli-Robson score and logMAR visual acuity, together with the multiband decomposition property of the human eye. The image to be simulated is first decomposed into spatial frequency bands; the local band-limited contrast of the image is then computed on each band and compared with the contrast detection threshold of the patient with central vision loss to obtain the visible content on each band; finally, the visible content is merged to obtain the simulated perceived image.

The approximate CSF used in the present invention can be expressed as:

wherein $f$ is the spatial frequency in cycles/degree; CS denotes contrast sensitivity; PCS denotes the peak contrast sensitivity; PSF denotes the spatial frequency at which the contrast sensitivity peak occurs; the subscript $l$ denotes the base-10 logarithm of the variable; and $w_L$ and $w_H$ are fitting parameters. Fitting the CSF measurements of a large number of subjects with normal vision gives the best-fit parameters PCS = 166, PSF = 2.5, $w_L$ = 0.68, $w_H$ = 1.28. The CSF of a person with central vision loss can be regarded as the normal CSF translated leftward and downward; its position is fixed by PCS and PSF, both of which can in turn be determined from the patient's Pelli-Robson score and logMAR visual acuity:

wherein PR denotes the Pelli-Robson score; logMAR denotes logMAR visual acuity; and $COF_N$ is the cut-off frequency of the normal CSF, which can be derived by setting PCS = 166, PSF = 2.5 and $CS_l$ = 0 in the CSF expression above.
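The symbols defined above are consistent with an asymmetric parabola in log-log coordinates, $CS_l = PCS_l - (f_l - PSF_l)^2 w^2$ with $w = w_L$ below the peak and $w = w_H$ above it. The sketch below assumes that form (it is an assumption, not a formula taken from this text) and uses hypothetical function names; it evaluates the normal-vision template and the cut-off frequency obtained by setting $CS_l = 0$. The mapping from PR and logMAR to PCS and PSF is not sketched, since its expression is not given here.

```python
import math

# Normal-vision template values quoted in the text.
PCS_N, PSF_N, W_L, W_H = 166.0, 2.5, 0.68, 1.28


def log_cs(f, pcs=PCS_N, psf=PSF_N, w_l=W_L, w_h=W_H):
    """log10 contrast sensitivity at spatial frequency f (cycles/degree).

    ASSUMED form: CS_l = PCS_l - (f_l - PSF_l)^2 * w^2, where the subscript l
    is log10, w = w_l for f below the peak frequency and w = w_h otherwise.
    """
    f_l, psf_l = math.log10(f), math.log10(psf)
    w = w_l if f < psf else w_h
    return math.log10(pcs) - ((f_l - psf_l) * w) ** 2


def contrast_sensitivity(f, **params):
    """Contrast sensitivity CS(f) in linear units."""
    return 10.0 ** log_cs(f, **params)


def cutoff_frequency(pcs=PCS_N, psf=PSF_N, w_h=W_H):
    """High-frequency root of CS_l = 0 under the assumed form (needs pcs >= 1)."""
    return 10.0 ** (math.log10(psf) + math.sqrt(math.log10(pcs)) / w_h)


if __name__ == "__main__":
    print(contrast_sensitivity(2.5))   # peak sensitivity of the template
    print(cutoff_frequency())          # cut-off frequency of the template
```

Under this assumed form, a patient CSF would be obtained by calling the same functions with smaller `pcs` and `psf` values, which shifts the curve down and to the left as described above.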

In this embodiment, preferably, the simulation process of the perceived image is as follows:

(1) First, the image is decomposed into individual frequency bands:

$$F(u, v) \equiv F(r, \theta) = L_0(r, \theta) + \sum_{i} B_i(r, \theta) + H_n(r, \theta)$$

wherein $F(u, v)$ is the DFT of the image $f(x, y)$ and $F(r, \theta)$ is its polar form; $H_n(r, \theta)$ is the ultra-high-frequency content of the image, which is omitted in the following discussion; and $L_0(r, \theta)$ and $B_i(r, \theta)$ are the low-pass and band-pass images obtained by filtering with the following cosine log filter bank:

$$G_i(r) = 0.5\,[1 + \cos(\pi \log_2 r - \pi i)], \quad i = 0, 1, \dots, n-1$$

The filters are all given in polar coordinates, where $r$ is the radial coordinate, corresponding to the distance from a given pixel to the center of the image. Applying the IDFT to the decomposition of the image spectrum gives the decomposition in the image spatial domain (with the ultra-high-frequency term omitted):

$$f(x, y) = l_0(x, y) + \sum_{i} b_i(x, y)$$

wherein $l_0(x, y)$ is the low-pass filtering result of the image, and $b_i(x, y)$ is the band-pass filtering result of the image on the i-th band. The local band-limited contrast of the image on each band is calculated as follows:

$$c_i(x, y) = \frac{b_i(x, y)}{l_i(x, y)}, \qquad l_i(x, y) = l_0(x, y) + \sum_{j=1}^{i-1} b_j(x, y)$$

wherein $i = 1, 2, \dots, 7$; $l_i(x, y)$ is the total energy of the image below band $i$; and $c_i(x, y)$ is the local band-limited contrast image of the image on the i-th band.
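The decomposition and contrast computation of step (1) can be sketched with NumPy as follows. This is a minimal illustration under several assumptions that the text does not state: each $G_i$ is windowed to one octave either side of its center $2^i$ cycles/image (the cosine expression alone is periodic in $\log_2 r$), the low-pass residual is taken as the complement of $G_1$ below 2 cycles/image, the denominator is clamped by a small `eps`, and the image is square and single-channel. All function names are hypothetical.

```python
import numpy as np


def cosine_log_filter(r, i):
    """G_i(r) = 0.5*[1 + cos(pi*log2(r) - pi*i)], windowed to 2**(i-1) < r < 2**(i+1).

    The one-octave window on each side is an assumption of this sketch.
    """
    g = np.zeros_like(r)
    band = (r > 2.0 ** (i - 1)) & (r < 2.0 ** (i + 1))
    g[band] = 0.5 * (1.0 + np.cos(np.pi * np.log2(r[band]) - np.pi * i))
    return g


def decompose(img, n_bands=7):
    """Split a 2-D image into a low-pass residual l0 and band-pass images b_1..b_n."""
    h, w = img.shape
    fy = np.fft.fftfreq(h) * h            # vertical frequency, cycles/image
    fx = np.fft.fftfreq(w) * w            # horizontal frequency, cycles/image
    r = np.hypot(fy[:, None], fx[None, :])
    spectrum = np.fft.fft2(img)

    def back(filt):
        # Filters are real and radially symmetric, so the result is real.
        return np.real(np.fft.ifft2(spectrum * filt))

    # Assumed low-pass residual: everything below band 1 not covered by G_1.
    low = np.where(r < 2.0, 1.0 - cosine_log_filter(r, 1), 0.0)
    l0 = back(low)
    bands = [back(cosine_log_filter(r, i)) for i in range(1, n_bands + 1)]
    return l0, bands


def local_band_contrast(l0, bands, eps=1e-6):
    """c_i = b_i / l_i, where l_i is l0 plus all bands below band i."""
    contrasts, lower = [], l0.copy()
    for b in bands:
        contrasts.append(b / np.maximum(lower, eps))  # eps clamp is an assumption
        lower = lower + b
    return contrasts


if __name__ == "__main__":
    image = np.random.default_rng(0).random((64, 64))
    l0, bands = decompose(image)
    print(np.allclose(l0 + sum(bands), image))
```

Because adjacent filters overlap with complementary cosine lobes, the low-pass residual and the bands sum back to the input for every frequency below the center of the highest band, which is a convenient sanity check on the filter bank.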

(2) Comparing the local band-limited contrast on each band with the contrast detection threshold at the corresponding spatial frequency shows which parts of each band image are visible and which are not. A visible image mask is defined as:

$$m_i(x, y) = \begin{cases} 1, & |c_i(x, y)| \ge CT_i \\ 0, & \text{otherwise} \end{cases}$$

wherein a mask value of 1 at a point indicates that the pixel is visible, and 0 indicates that it is invisible; $CT_i$ is the contrast detection threshold at the center frequency of the i-th band, equal to the reciprocal of the contrast sensitivity at that frequency, i.e. $CT_i = 1/CS_i = 1/CS(f_0/\alpha)$, wherein $CS(f)$ is the contrast sensitivity function (CSF) with $f$ in cycles/degree; $f_0$ is the center frequency of the i-th band in cycles/image; and $\alpha$ is the viewing angle, which depends on the size of the image and the distance of the observer from the image and can be simply estimated by $\alpha = \arctan\left(\frac{w}{2d}\right)$, where $w$ is the width of the image and $d$ is the distance between the observer and the image. To reduce artifacts in the simulation, the mask image is Gaussian filtered, and the filtering result is denoted $\tilde{m}_i(x, y)$. With the image mask, the visible part of each band image can be expressed as:

$$v_i(x, y) = b_i(x, y)\,\tilde{m}_i(x, y)$$

(3) The visible images on all bands are simply added to obtain the simulated perceived image $s(x, y)$:

$$s(x, y) = l_0(x, y) + \sum_{i} v_i(x, y)$$
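Steps (2) and (3) can be sketched as follows, taking the low-pass residual, band images and contrasts as given. This is an illustration under assumptions: the mask uses the absolute contrast with a greater-or-equal comparison, the Gaussian width `sigma` is an arbitrary value, the low-pass residual is always treated as visible, and the threshold helper converts cycles/image to cycles/degree by dividing by the viewing angle. The function names are hypothetical.

```python
import math

import numpy as np


def viewing_angle_deg(width, distance):
    """alpha = arctan(w / (2 d)) in degrees, as estimated in the text."""
    return math.degrees(math.atan(width / (2.0 * distance)))


def band_threshold(cs, f0_cycles_per_image, alpha_deg):
    """CT_i = 1 / CS(f0 / alpha); cs is any CSF taking cycles/degree."""
    return 1.0 / cs(f0_cycles_per_image / alpha_deg)


def gaussian_blur(img, sigma=1.0):
    """Separable Gaussian filter with edge padding (sigma is an assumed value)."""
    radius = max(1, int(3 * sigma))
    x = np.arange(-radius, radius + 1)
    kernel = np.exp(-0.5 * (x / sigma) ** 2)
    kernel /= kernel.sum()
    padded = np.pad(img, radius, mode="edge")
    rows = np.apply_along_axis(lambda v: np.convolve(v, kernel, mode="valid"), 1, padded)
    return np.apply_along_axis(lambda v: np.convolve(v, kernel, mode="valid"), 0, rows)


def perceived_image(l0, bands, contrasts, thresholds, sigma=1.0):
    """s = l0 + sum_i b_i * blur(m_i), with m_i = 1 where |c_i| >= CT_i."""
    s = np.array(l0, dtype=float)
    for b, c, ct in zip(bands, contrasts, thresholds):
        mask = (np.abs(c) >= ct).astype(float)     # hard visibility mask
        s += b * gaussian_blur(mask, sigma)        # smoothed mask limits artifacts
    return s


if __name__ == "__main__":
    l0 = np.full((8, 8), 0.5)
    bands = [np.full((8, 8), 0.1), np.full((8, 8), -0.05)]
    contrasts = [b / l0 for b in bands]
    # Only the first band (contrast 0.2) exceeds a threshold of 0.15.
    print(perceived_image(l0, bands, contrasts, [0.15, 0.15])[0, 0])
```

A higher threshold (lower contrast sensitivity) removes more band content, so the simulated image loses detail progressively as the CSF shifts downward.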

The second step is to design a convolutional neural network based on the UNet structure for image enhancement:

On the basis of the standard UNet structure, Batch Normalization (BN) is added between the convolution and the ReLU of each layer, and a Sigmoid activation function is added to the last layer of the enhancement network, to obtain the image enhancement network.
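The two modifications described above (BN between each convolution and its ReLU, and a Sigmoid on the last layer) can be illustrated with a deliberately small PyTorch sketch. It is not the network of the embodiment: the depth (one down/up level), the channel width, the padded 3×3 convolutions and the class name `TinyUNet` are all assumptions chosen to keep the example short.

```python
import torch
from torch import nn


def conv_bn_relu(c_in, c_out):
    """Convolution -> Batch Normalization -> ReLU, the ordering described in the text."""
    return nn.Sequential(
        nn.Conv2d(c_in, c_out, kernel_size=3, padding=1),  # padding=1 is an assumption
        nn.BatchNorm2d(c_out),
        nn.ReLU(inplace=True),
    )


class TinyUNet(nn.Module):
    """One-level UNet-style encoder/decoder with a skip connection (illustrative only)."""

    def __init__(self, channels=3, width=16):
        super().__init__()
        self.enc = nn.Sequential(conv_bn_relu(channels, width), conv_bn_relu(width, width))
        self.down = nn.MaxPool2d(2)
        self.mid = nn.Sequential(conv_bn_relu(width, 2 * width), conv_bn_relu(2 * width, 2 * width))
        self.up = nn.ConvTranspose2d(2 * width, width, kernel_size=2, stride=2)
        self.dec = nn.Sequential(conv_bn_relu(2 * width, width), conv_bn_relu(width, width))
        # Final 1x1 convolution followed by Sigmoid keeps the output in [0, 1].
        self.head = nn.Sequential(nn.Conv2d(width, channels, kernel_size=1), nn.Sigmoid())

    def forward(self, x):
        e = self.enc(x)
        m = self.mid(self.down(e))
        d = self.dec(torch.cat([self.up(m), e], dim=1))  # skip connection
        return self.head(d)


if __name__ == "__main__":
    out = TinyUNet()(torch.rand(2, 3, 32, 32))
    print(out.shape)
```

The Sigmoid bounds the enhanced image to the displayable range, so the network has to redistribute contrast within that range rather than amplify it without limit.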

The third step is to connect the image enhancement network of the second step and the simulated visual system of the first step according to the framework shown in FIG. 1 to obtain a cascade system, and then to train the neural network using the same high-definition image as both the input and the target output of the cascade system:

Specifically, the UNet-based image enhancement network (the neural network) and the simulated visual system of a person with central vision loss are cascaded according to the framework shown in FIG. 1. The output of the cascade system is expected to match its input as closely as possible, and the loss function used to train the neural network is constructed with this goal. Let the input image of the cascade system be $I$. The goal of training is to obtain an image enhancement network $F(\Theta)$ such that the image $\psi(F(I;\Theta))$, obtained by passing the enhanced image $F(I;\Theta)$ through the simulated visual system, is as close as possible to the input of the cascade system; that is, the image seen by the simulated person with central vision loss is as close as possible to the original high-definition image. The similarity between two images is measured by the mean squared error (MSE), so the loss function used to train the network is:

$$L(\Theta) = \frac{1}{n}\sum_{i=1}^{n}\left\lVert \psi\big(F(I_i;\Theta)\big) - I_i \right\rVert_2^2$$

wherein $n$ is the number of training samples; $I_i$ is the i-th input image; and $\psi$ denotes the simulated visual system of S1, whose input is an image and whose output is that image as seen by the simulated person with central vision loss.
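The training objective can be illustrated with a toy cascade in NumPy. This is a stand-in, not the embodiment: the enhancer is a learnable per-band gain instead of a CNN, the simulator $\psi$ is a fixed, hypothetical per-band attenuation instead of the CSF-based simulation, and the gradient is written by hand. It only shows that minimizing the MSE between the cascade output and its input drives the enhancer toward the inverse of the simulated distortion.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical per-band gains of the fixed simulator psi (1.0 = no loss).
ATTENUATION = np.array([1.0, 0.8, 0.5, 0.25])


def psi(x):
    """Fixed simulated visual system: attenuates each band."""
    return x * ATTENUATION


def enhance(x, theta):
    """Toy enhancer F(x; theta): one learnable gain per band."""
    return x * theta


def mse_loss(theta, batch):
    """(1/n) * sum_i || psi(F(I_i; theta)) - I_i ||^2 over a batch of n samples."""
    residual = psi(enhance(batch, theta)) - batch
    return np.mean(np.sum(residual ** 2, axis=1))


def train(steps=2000, lr=0.1, batch_size=64):
    """Gradient descent on theta only; psi stays fixed throughout."""
    theta = np.ones_like(ATTENUATION)
    for _ in range(steps):
        batch = rng.normal(size=(batch_size, ATTENUATION.size))
        residual = psi(enhance(batch, theta)) - batch
        # d/d(theta_k) of the loss: mean over samples of 2 * residual * a_k * x_k.
        grad = np.mean(2.0 * residual * ATTENUATION * batch, axis=0)
        theta -= lr * grad
    return theta


if __name__ == "__main__":
    theta = train()
    print(theta)                 # learned gains
    print(1.0 / ATTENUATION)     # exact inverse of the simulated attenuation
```

In this unconstrained toy the gains can grow freely; with a Sigmoid-bounded network output, as in the embodiment, full inversion is not always reachable and the network learns the best compensation available within the display range.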

Fourthly, the trained image enhancement network (the neural network) is taken out and used to perform image enhancement for central vision degradation:

The trained image enhancement network F(Θ) is taken out of the cascade system shown in Fig. 1. An image I is input to the network, and the network output F(I; Θ) is the image enhancement result for the simulated visual impairment (here, central vision degradation of different degrees).

Following the steps of this embodiment, 590 high-definition images collected from the Internet are resized to 1280×720 and divided into a training set and a test set at a ratio of 19:1. The model is trained on two NVIDIA GTX 2080Ti GPUs with a batch size of 2 for 150 epochs, using the Adam optimizer with a learning rate of 0.001.
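The training setup can be written down as a small configuration plus a dataset split, as sketched below. The dictionary keys, the `split_dataset` helper, the file names, and the random seed are all hypothetical. Since 590 is not divisible by 20, a 19:1 split cannot be exact; the floor-based rounding used here is an assumption, as the exact set sizes are not stated.

```python
import random

# Hyperparameters as stated in the text.
config = {
    "image_size": (1280, 720),
    "batch_size": 2,
    "epochs": 150,
    "optimizer": "Adam",
    "learning_rate": 1e-3,
}

def split_dataset(items, train_parts=19, test_parts=1, seed=0):
    """Shuffle items and split them in a train_parts:test_parts ratio."""
    items = list(items)
    random.Random(seed).shuffle(items)
    n_test = len(items) * test_parts // (train_parts + test_parts)
    return items[n_test:], items[:n_test]

# Hypothetical file names standing in for the 590 collected images.
paths = [f"img_{i:03d}.png" for i in range(590)]
train_set, test_set = split_dataset(paths)
```

With this rounding the split yields 561 training images and 29 test images.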

The implementation effect is as follows:

To verify the effectiveness of the image enhancement method for central vision degradation provided in the above embodiment, and thereby the effectiveness of the deep learning-based image enhancement framework for visual impairment assistance, simulation verification and patient experiments may be performed, and the results compared with two classical methods: the adaptive enhancement method (E. Peli, R. B. Goldstein, G. M. Young, C. L. Trempe, and S. M. Buzney, "Image enhancement for the visually impaired. Simulations and experimental results," Investigative Ophthalmology & Visual Science, vol. 32, no. 8, pp. 2337, 1991) and the DCT-domain enhancement method (G. Luo, PremNandhini Satgunam, and Eli Peli, "Visual search performance of patients with vision impairment: effect of JPEG image enhancement," Ophthalmic and Physiological Optics, vol. 32, no. 5, pp. 421-428, 2012). In the experiments, the viewing angle is fixed at α = 14°, and central vision degradation is divided into three grades by Pelli-Robson score (PR) and logMAR visual acuity: mild (PR < 1.7, logMAR > 0.3), moderate (PR < 1.5, logMAR > 0.477), and severe (PR < 1.0, logMAR > 1.0), each corresponding to its own contrast sensitivity function (CSF).

As described above, the output of the simulated visual system can be regarded as the image seen by a visually impaired patient, so the performance of an enhancement method can be assessed by how similar the simulated perceived image of the enhanced image is to the original image. For each image in the test set, the original image and the images enhanced by the three methods (the adaptive enhancement method, the DCT-domain enhancement method, and the method of the invention) are each fed to the simulated visual systems of persons with mild, moderate, and severe central vision degradation, and the peak signal-to-noise ratio (PSNR), structural similarity index (SSIM), and mean squared error (MSE) between each system output and the original image are computed. The results are shown in Table 1. In all test cases, the image enhanced by the method of the invention yields, after passing through the simulated visual system, the output closest to the original image; that is, the method of the invention performs best. To illustrate the enhancement results, Fig. 2 shows the images produced by the different enhancement methods together with their simulated perception by a person with severe central vision degradation. The simulated perception of the image enhanced by the method of the invention is closest to the original image, which illustrates the advantage of the method.
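Two of the three similarity metrics are simple enough to sketch directly. The helper names `mse` and `psnr` and the assumption that images are scaled to [0, 1] (peak value 1.0) are illustrative choices; SSIM is omitted because it requires a windowed computation, for which an existing image-processing library would normally be used.

```python
import numpy as np

def mse(reference, test):
    """Mean squared error between two images of the same shape."""
    diff = np.asarray(reference, dtype=float) - np.asarray(test, dtype=float)
    return float(np.mean(diff ** 2))

def psnr(reference, test, peak=1.0):
    """Peak signal-to-noise ratio in dB; peak is the maximum pixel value."""
    err = mse(reference, test)
    if err == 0.0:
        return float("inf")
    return float(10.0 * np.log10(peak ** 2 / err))

# Toy check: a constant offset of 0.1 on a [0, 1] image (values are made up).
original = np.full((4, 4), 0.5)
perceived = original + 0.1
error = mse(original, perceived)     # approximately 0.01
quality = psnr(original, perceived)  # approximately 20 dB
```

A lower MSE and a higher PSNR both indicate that the simulated perceived image is closer to the original.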

TABLE 1

To further verify the effect of the image enhancement method on persons with central vision degradation, patient experiments may be performed. Myopic persons exhibit symptoms of central vision degradation caused by reduced visual acuity and contrast sensitivity, and are easy to recruit for experiments. A total of 15 myopic subjects were recruited for verification. Each subject first underwent optometry to ensure that their uncorrected visual acuity lay between 0.1 and 0.8, and the subjects were then assigned to experimental groups of different central vision degradation grades according to their vision. The patient experiments are divided into an objective experiment and a subjective experiment: the objective experiment uses a search task to evaluate how much image enhancement improves the visual function of visually impaired patients, and the subjective experiment uses a comparison-and-selection task to evaluate how much image enhancement improves the subjective perceived quality.

Four classes of images are involved in the objective experiment: the original unenhanced images, and images enhanced by the adaptive enhancement method, the DCT-domain enhancement method, and the method of the invention, with 15 images per class. For each image, the subject is required to find the item named in a voice prompt and click on it. The mean and standard deviation of the search accuracy over the 15 subjects are shown in Table 2. The results show that the image enhancement method of the invention effectively improves the visual search performance of patients, and that the improvement is greater than that of the two methods compared in the experiment.
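The search-accuracy statistics of the kind reported in Table 2 could be computed along the following lines. The function names and the trial outcomes are made up for illustration and are not data from the experiments. The use of the sample standard deviation is an assumption, since the text does not say which estimator was used.

```python
import statistics

def search_accuracy(hits):
    """Fraction of search trials in which the subject clicked the right item."""
    return sum(hits) / len(hits)

def summarize(accuracies):
    """Mean and sample standard deviation of accuracy across subjects."""
    return statistics.mean(accuracies), statistics.stdev(accuracies)

# Made-up outcomes for three subjects (True = correct click).
subjects = [
    [True, True, False, False],
    [True, True, True, False],
    [True, True, True, True],
]
mean_acc, std_acc = summarize([search_accuracy(s) for s in subjects])
```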

TABLE 2

In the subjective experiment, the image enhanced by the method of the invention is displayed on a screen side by side with, in turn, the original image, the image enhanced by the adaptive enhancement method, and the image enhanced by the DCT-domain enhancement method, with 20 image pairs for each of the three pairings. For each pair, the subject selects the image that appears clearer and of better quality. The preference ratio for a given pairing is defined as the number of pairs in which the subject selected the image enhanced by the method of the invention, divided by the total number of pairs of that pairing. The preference ratios for the three pairings are shown in Table 3. All of them are greater than 0.5, which indicates that the method of the invention improves the subjective perceived quality for patients and outperforms the two other methods.
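The preference ratio reduces to a simple proportion, sketched below. The label strings and the recorded choices are hypothetical and only illustrate the arithmetic for one pairing of 20 image pairs.

```python
def preference_ratio(choices, proposed_label="proposed"):
    """Share of image pairs in which the proposed method's image was chosen."""
    if not choices:
        raise ValueError("no choices recorded")
    return sum(1 for c in choices if c == proposed_label) / len(choices)

# Made-up record for one pairing: 14 of 20 pairs favour the proposed method.
choices = ["proposed"] * 14 + ["other"] * 6
ratio = preference_ratio(choices)
```

A ratio above 0.5 means the enhanced image was chosen more often than the alternative it was paired with.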

TABLE 3

In the image enhancement method for central vision degradation provided by this embodiment of the invention, which builds on the proposed image enhancement framework for visual impairment assistance, a series of image processing operations is used to simulate the visual system of a person with central vision degradation, a UNet-based image enhancement network is designed, and the network and the simulated visual system are placed in the proposed framework for end-to-end training. The resulting network achieves image enhancement for central vision degradation and effectively improves the visual function and subjective perceived quality of patients with central vision degradation.

In another embodiment, the present invention also provides a computer device comprising at least one processor, and at least one memory, wherein the memory stores a computer program that, when executed by the processor, enables the processor to perform the deep learning based vision-impairment aided image enhancement method.

In another embodiment, the invention also provides a computer-readable storage medium storing instructions which, when executed by a processor within a device, cause the device to perform the deep learning-based visual impairment-assisted image enhancement method.

It will be appreciated by those skilled in the art that embodiments of the present invention may be provided as a method, or as a computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.

The present invention is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.

These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.

These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.

While preferred embodiments of the present invention have been described, additional variations and modifications in those embodiments may occur to those skilled in the art once they learn of the basic inventive concepts. It is therefore intended that the following claims be interpreted as including the preferred embodiments and all such alterations and modifications as fall within the scope of the invention.

It will be apparent to those skilled in the art that various modifications and variations can be made to the present invention without departing from the spirit or scope of the invention. Thus, it is intended that the present invention also include such modifications and alterations insofar as they come within the scope of the appended claims or the equivalents thereof.

Claims

1. A visually impaired assisted image enhancement method based on deep learning, which is characterized by including:

Design a convolutional neural network for image enhancement, and connect the output end of the convolutional neural network with the input end of the visual system of a simulated visually impaired patient to obtain a cascade system;

The convolutional neural network is trained to obtain an image enhancement network, and the image enhancement network can realize image enhancement for the visual impairment symptoms;

Wherein: training the convolutional neural network to obtain an image enhancement network, including:

The original image is input into the convolutional neural network for enhancement, the enhancement result is input into the visual system of the simulated visually impaired patient for simulation, and the visual system of the simulated visually impaired patient outputs a simulated perceptual image of the visual impairment symptom, which is also the output of the cascade system;

Calculate the loss between the output of the visual system of the simulated visually impaired patient and the original image, with the goal of minimizing the difference between the input and output images of the cascade system, and enhance the original image to compensate for the distortion introduced by the visual system of the simulated visually impaired patient.

2. The visually impaired assisted image enhancement method based on deep learning according to claim 1, characterized in that inputting the enhancement result into the visual system of the simulated visually impaired patient for simulation means: using image processing operations to process the input image to obtain the image seen by the simulated visually impaired patient, that is, the simulated perceptual image.

3. The visually impaired assisted image enhancement method based on deep learning according to claim 1, characterized in that, in designing a convolutional neural network for image enhancement, a convolutional neural network based on the UNet structure is used;

Based on the standard UNet structure, batch normalization (BN) is added between the convolution and ReLU of each layer, and the Sigmoid activation function is added to the last layer of the convolutional neural network to obtain the image enhancement network.

4. The visually impaired assisted image enhancement method based on deep learning according to claim 1, characterized in that the goal of minimizing the difference between the input and output images of the cascade system refers to:

Suppose the input image of the cascade system is I. The goal of training the cascade system is to train the image enhancement network so that the image obtained after the image enhanced by the convolutional neural network passes through the simulated visual system is close to the input original image, that is, the image seen by the simulated visually impaired patient is close to the original image.

5. The visually impaired assisted image enhancement method based on deep learning according to claim 1, characterized in that when the visually impaired patient is a patient with reduced central vision, the image enhancement method includes:

S1: Simulate the visual system of patients with reduced central vision, that is, process the input image to obtain the simulated perceived image of the patient with reduced central vision;

S2: Design a convolutional neural network that can achieve image enhancement. The convolutional neural network uses a network structure based on UNet;

S3: Connect the convolutional neural network in S2 and the visual system simulated in S1 to obtain a cascade system, and then use the same high-definition image as the input and output of the cascade system to train the convolutional neural network, and the trained convolutional neural network serves as the image enhancement network;

S4: Input the image to be enhanced into the image enhancement network trained in S3 to achieve image enhancement for reduced central vision.

6. The visually impaired assisted image enhancement method based on deep learning according to claim 5, characterized in that, in S1, central vision loss is modeled using an approximate contrast sensitivity function, based on the clinical measures Pelli-Robson score and logMAR visual acuity, together with the multi-band decomposition property of the human eye, including:

First, the image to be simulated is decomposed into spatial frequency bands; then the local band-limited contrast of the image on each spatial frequency band is obtained and compared with the contrast detection threshold of patients with reduced central vision, which gives the visible content on each spatial frequency band; finally, these visible contents are combined to form the simulated perceptual image.

7. The visually impaired assisted image enhancement method based on deep learning according to claim 6, characterized in that the visible content v_i(x, y) on each spatial frequency band is calculated as follows:

where the smoothed mask is the result of Gaussian filtering the visible image mask m_i(x, y), applied in order to reduce artifacts during simulation. The visible image mask is defined as:

where a mask value of 1 at a point means that the pixel is visible and 0 means that the pixel is invisible; CT_i represents the contrast detection threshold corresponding to the center frequency of the i-th frequency band, which is equal to the reciprocal of the contrast sensitivity at that frequency, that is, CT_i = 1/CS_i = 1/CS(f₀/α), where CS(f) represents the contrast sensitivity function and the unit of f is cycles/degree; f₀ represents the center frequency of the i-th frequency band, in cycles/image; α represents the viewing angle, which depends on the size of the image and the distance between the observer and the image and is simply estimated by α = arctan(w/2d), where w represents the width of the image and d represents the distance between the observer and the image; b_i(x, y) represents the band-pass filtering result of the image on the i-th frequency band.

8. The visually impaired assisted image enhancement method based on deep learning according to claim 5, characterized in that, in S2, the image enhancement network is obtained from the standard UNet structure by adding batch normalization between the convolution and the ReLU of each layer and adding a Sigmoid activation function to the last layer of the enhancement network;

The UNet-based image enhancement network is cascaded with the visual system of the person with reduced central vision simulated in S1 to obtain the cascade system. The output of the system is expected to be the same as its input, and a loss function is constructed with this as the goal to train the neural network, wherein,

Assuming that the image enhancement network is F(Θ), the loss function used in system training is:

$$L(\Theta) = \frac{1}{n} \sum_{i=1}^{n} \left\| \psi\big(F(I_i; \Theta)\big) - I_i \right\|^2$$

where n is the number of training samples; I_i represents the i-th input image; ψ represents the simulated visual system in S1, whose input is an image and whose output is the simulated appearance of that image to the eyes of a person with reduced central vision.

9. A computer device, comprising at least one processor and at least one memory, wherein the memory stores a computer program that, when executed by the processor, enables the processor to execute the visually impaired assisted image enhancement method based on deep learning according to any one of claims 1 to 8.

10. A computer-readable storage medium, wherein instructions in the storage medium, when executed by a processor in a device, enable the device to perform the visually impaired assisted image enhancement method based on deep learning according to any one of claims 1 to 8.