CN109978804B - Human eye sight line correction method and system based on deep learning - Google Patents
- Publication number: CN109978804B
- Application number: CN201910175164.0A
- Authority: CN (China)
- Prior art keywords: image, network, human eye, correction, block
- Legal status: Active (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Classifications
- G06N3/044: Recurrent networks, e.g. Hopfield networks
- G06N3/045: Combinations of networks
- G06N3/08: Learning methods
- G06T5/00: Image enhancement or restoration
- G06T5/50: Image enhancement or restoration using two or more images, e.g. averaging or subtraction
Abstract
The invention discloses a human eye sight line correction method and system based on deep learning. The method comprises the following steps: acquiring a human eye picture; processing the human eye picture through a coarse warping network to obtain a coarse-stage human eye generated image; and detecting a defect area in the human eye generated image through a fine correction network and correcting the defect area. For the input human eye picture, the method obtains a coarse-stage generated image using a warping-based method, and then detects defect regions in the coarse-stage output using a loop policy network based on deep reinforcement learning. This effectively reduces the error between the generated image and the real image, eliminates visual defects and unrealistic appearance in the image, and recovers image details such as reflective bright spots.
Description
Technical Field
The invention relates to the technical field of digital image processing, in particular to a human eye sight line correction method and system based on deep learning.
Background
Gaze correction is the processing of a picture of a person's eyes to change the direction in which the eyes appear to look. Gaze correction has practical value and broad prospects in communication scenarios such as video calls. However, since images or videos of human eyes may vary greatly in size, resolution, viewing angle, illumination, texture, and occlusion during acquisition, gaze correction in the real world remains a challenging problem.
Currently, existing gaze correction methods fall into two main categories: graphics-based gaze correction and gaze correction based on pixel warping. In the first category, graphics-based methods mainly use 3D eye models with artificial textures to simulate continuous motion of the eyes and head, rendering eye images by geometric mass rendering using a dynamic and controllable eye region model. However, the human eye images synthesized in this way differ considerably from real human eye images. In addition, a 3D model of the human eye is required in application, and such a model is costly to construct, which greatly limits the method in practice. In the second category, warping-based gaze correction methods learn a warping function that predicts a warping flow field, and thereby generate the gaze-corrected image directly from the original human eye image. For example, Ganin et al. propose a deep feed-forward system that combines the principles of coarse-to-fine processing, image warping, and intensity correction. Kononenko et al. propose an eye warping field method that is implemented with a random forest predictor and can run on a CPU (Central Processing Unit) in real time. Since the warping function is pose-specific, a more realistic image can be synthesized using human eye images with different gaze directions and head poses, which addresses the variation in head pose and gaze angle encountered in practical applications. However, human eye images usually have complicated textures, lighting, occlusion and the like, and the influence of these specific factors is difficult to handle with an overall correction operation. As shown in fig. 1, images generated using only warping methods can have significant defects and look unrealistic.
In recent years, deep learning has been significantly successful in various visual applications, such as object detection, object tracking, object search, and motion recognition. Current deep reinforcement learning methods can be divided into two categories: deep Q-learning and policy gradients. In the first category, a Q value is fitted to capture the expected return of taking a particular action in a particular state. For example, a collaborative deep reinforcement learning method proposed by Kong et al. jointly localizes objects over several iterations. In the second category, the distribution of the policy is represented explicitly and the policy is optimized by updating its parameters along the gradient direction. Liu et al. apply policy gradient methods to optimize the headline metric and generative adversarial networks, respectively. Recently, deep reinforcement learning has played an important role in face recognition and synthesis.
Disclosure of Invention
The present invention is directed to solving, at least to some extent, one of the technical problems in the related art.
Therefore, an object of the present invention is to provide a method for correcting a line of sight of a human eye based on deep learning, which effectively reduces an error between a generated image and a real image, eliminates a visual defect and a non-real sense in the image, and can recover image details such as reflective bright spots.
Another object of the present invention is to provide a system for correcting the visual line of human eyes based on deep learning.
In order to achieve the above object, an embodiment of an aspect of the present invention provides a method for correcting a line of sight of a human eye based on deep learning, including: acquiring a human eye picture; processing the human eye picture through a coarse distortion network to obtain a human eye generated image in a coarse stage; and detecting a defect area in the human eye generated image through a fine correction network, and correcting the defect area.
According to the human eye sight line correction method based on deep learning, for the input human eye picture, a coarse-stage generated image is obtained using a warping-based method, and defect areas in the coarse-stage output are then detected using a loop policy network based on deep reinforcement learning. The detected defect areas are refined by a local correction network that takes global visual features into account, which largely eliminates the visual defects and unrealistic appearance caused by specific factors such as illumination, texture, and occlusion.
In addition, the method for correcting the sight line of the human eye based on the deep learning according to the above embodiment of the present invention may further have the following additional technical features:
Further, in an embodiment of the present invention, before the processing the human eye picture through the coarse warping network to obtain the coarse-stage human eye generated image, the method further includes: training a convolutional neural network with a coarse-to-fine structure to generate a two-dimensional compensation map; performing a pixel-level substitution operation on the original image with the compensation map to generate a training image; and training the coarse warping network using the mean square error between the generated training image and the original image as the loss function.
Further, in an embodiment of the present invention, the fine correction network includes: a loop policy network and a local correction network;
the detecting the defect area in the human eye generated image through the fine correction network and correcting the defect area comprises the following steps: the loop strategy network detects a defect area in the human eye generated image; the local correction network corrects the defect area through a convolution layer.
Further, in one embodiment of the present invention, the loop policy network detects a defect region in the human eye-generated image, including:
given the image $I_{t-1}$ at step $t$, the loop policy network selects the coordinate position $l_t$ of a local block and crops the block $g(I_{t-1}, l_t)$ from the image:
$l_t = f_r(s_{t-1})$
where $s_{t-1}$ is the encoded state feature of the loop policy network, constructed jointly from the encoding of the input image $I_{t-1}$, the historical hidden state $h_{t-1}$, and the position $l_{t-1}$ of the block selected in the previous step; $g$ denotes the crop operation, whose result is the block of image $I_{t-1}$ at the given position $l_t$.
Further, in an embodiment of the present invention, the local correction network correcting the defect area through a convolutional layer includes: the block to be corrected selected at each step is corrected by the local correction network $f_e$ to obtain a corrected block, and the corrected block directly replaces the block before correction to give the corrected image $I_t$ of that step; after $T$ steps, the final image $I_T$ is obtained.
In order to achieve the above object, another embodiment of the present invention provides a system for correcting a line of sight of a human eye based on deep learning, including: the acquisition module is used for acquiring a human eye picture; the processing module is used for processing the human eye picture through a coarse adjustment distortion network to obtain a human eye generated image in a coarse stage; and the correction module is used for detecting a defect area in the human eye generated image through a fine correction network and correcting the defect area.
According to the human eye sight line correction system based on deep learning, for the input human eye picture, a coarse-stage generated image is obtained using a warping-based method, and defect areas in the coarse-stage output are then detected using a loop policy network based on deep reinforcement learning. The detected defect areas are refined by a local correction network that takes global visual features into account, which largely eliminates the visual defects and unrealistic appearance caused by specific factors such as illumination, texture, and occlusion.
In addition, the human eye sight line correction system based on deep learning according to the above embodiment of the invention may also have the following additional technical features:
Further, in an embodiment of the present invention, the system further includes a generating module, which is used for training a convolutional neural network with a coarse-to-fine structure to generate a two-dimensional compensation map; the compensation map performs a pixel-level substitution operation on the original image to generate a training image, and the coarse warping network is trained using the mean square error between the generated training image and the original image as the loss function.
Further, in an embodiment of the present invention, the fine correction network includes: a loop policy network and a local correction network;
the correction module comprises: a detection unit and a correction unit;
the detection unit is used for detecting a defect area in the human eye generated image by the loop strategy network; the correcting unit is used for correcting the defect area through a convolution layer by the local correcting network.
Further, in an embodiment of the present invention, the detecting unit is specifically configured to:
given the image $I_{t-1}$ at step $t$, the loop policy network selects the coordinate position $l_t$ of a local block and crops the block $g(I_{t-1}, l_t)$ from the image:
$l_t = f_r(s_{t-1})$
where $s_{t-1}$ is the encoded state feature of the loop policy network, constructed jointly from the encoding of the input image $I_{t-1}$, the historical hidden state $h_{t-1}$, and the position $l_{t-1}$ of the block selected in the previous step; $g$ denotes the crop operation, whose result is the block of image $I_{t-1}$ at the given position $l_t$.
Further, in an embodiment of the present invention, the modifying unit is specifically configured to:
correct the block to be corrected selected at each step with the local correction network $f_e$ to obtain a corrected block, and directly replace the block before correction with the corrected block to give the corrected image $I_t$ of that step; after $T$ steps, the final image $I_T$ is obtained.
Additional aspects and advantages of the invention will be set forth in part in the description which follows and, in part, will be obvious from the description, or may be learned by practice of the invention.
Drawings
The foregoing and/or additional aspects and advantages of the present invention will become apparent and readily appreciated from the following description of the embodiments, taken in conjunction with the accompanying drawings of which:
FIG. 1 is a schematic diagram of an image generated using a warping method according to an embodiment of the present invention;
FIG. 2 is a flowchart of a method for correcting a human eye's vision based on deep learning according to an embodiment of the present invention;
FIG. 3 is a flowchart of a method for correcting a line of sight of a human eye based on deep learning according to an embodiment of the invention;
FIG. 4 is a schematic diagram of a coarse warping network according to an embodiment of the present invention;
FIG. 5 is a block diagram of a loop policy network according to one embodiment of the invention;
FIG. 6 is a flow diagram of a local correction network in accordance with one embodiment of the present invention;
fig. 7 is a schematic structural diagram of a human eye vision correcting system based on deep learning according to an embodiment of the invention.
Detailed Description
Reference will now be made in detail to embodiments of the present invention, examples of which are illustrated in the accompanying drawings, wherein like or similar reference numerals refer to the same or similar elements or elements having the same or similar function throughout. The embodiments described below with reference to the drawings are illustrative and intended to be illustrative of the invention and are not to be construed as limiting the invention.
The following describes a human eye sight line correction method and system based on deep learning according to an embodiment of the present invention with reference to the accompanying drawings.
First, a method for correcting a line of sight of a human eye based on deep learning according to an embodiment of the present invention will be described with reference to the accompanying drawings.
Fig. 2 is a flowchart of a method for correcting a line of sight of a human eye based on deep learning according to an embodiment of the invention.
As shown in fig. 2, the method for correcting the sight line of the human eye based on the deep learning comprises the following steps:
in step S101, a human eye picture is acquired.
In step S102, the eye image is processed through the coarse distortion network to obtain a coarse-stage eye-generated image.
Further, in an embodiment of the present invention, before step S102, the method further includes: training a convolutional neural network with a coarse-to-fine structure to generate a two-dimensional compensation map; performing a pixel-level substitution operation on the original image with the compensation map to generate a training image; and training the coarse warping network using the mean square error between the generated training image and the original image as the loss function.
Specifically, a coarse distortion network is generated through the above steps to perform a preliminary processing on the human eye image obtained in step S101, so as to obtain a human eye generated image in a coarse stage.
In step S103, a defect area in the human eye generated image is detected by the fine correction network, and the defect area is corrected.
Further, in one embodiment of the present invention, the fine correction network comprises: a loop policy network and a local correction network.
Detecting a defect area in the human eye generated image through a fine correction network, and correcting the defect area, wherein the method comprises the following steps: detecting a defect area in an image generated by human eyes by a loop strategy network; the local correction network corrects the defective area by the convolution layer.
The method for detecting the defect area in the image generated by the human eyes by the loop strategy network comprises the following steps:
given the image $I_{t-1}$ at step $t$, the loop policy network selects the coordinate position $l_t$ of a local block and crops the block $g(I_{t-1}, l_t)$ from the image:
$l_t = f_r(s_{t-1})$
where $s_{t-1}$ is the encoded state feature of the loop policy network, constructed jointly from the encoding of the input image $I_{t-1}$, the historical hidden state $h_{t-1}$, and the position $l_{t-1}$ of the block selected in the previous step; $g$ denotes the crop operation, whose result is the block of image $I_{t-1}$ at the given position $l_t$.
Further, in an embodiment of the present invention, the local correction network correcting the defect area through the convolution layer includes: the block to be corrected selected at each step is corrected by the local correction network $f_e$ to obtain a corrected block, and the corrected block directly replaces the block before correction to give the corrected image $I_t$ of that step; after $T$ steps, the final image $I_T$ is obtained.
The method of the embodiment of the invention differs from purely warping-based methods and from methods that correct the generated image as a whole, and clearly improves the overall result. A loop policy network based on deep reinforcement learning dynamically assigns a new region of interest at each step, so that visually defective blocks (patches) of the coarse-stage generated image can be detected. The detected blocks are corrected by a local correction network that takes global visual features into account, so that the error between the generated image and the real image is effectively reduced, the visual defects and unrealistic appearance in the image are eliminated, and image details such as reflective bright spots are recovered.
The embodiment of the invention performs gaze correction with a two-stage, coarse-to-fine method. As shown in fig. 3, for a given human eye picture I and a sight line change angle α, the method of the embodiment of the invention is divided into two parts: a Coarse Warping Network (CWN) and a Fine Correction Network (FCN). The CWN is used in the first step to modify the image as a whole through pixel replacement operations. The FCN is used in the second step to refine the image output by the CWN and increase the realism of the generated image.
The following describes a method for correcting a line of sight of a human eye based on deep learning according to an embodiment of the present invention.
1. Coarse Warping Network (CWN)
The task of the coarse warping network is to generate a warped flow field for warping the original image. To achieve this, a coarse-to-fine structured convolutional neural network is trained to generate a two-dimensional compensation map. The map has a compensation vector (u (x, y), v (x, y)) for each pixel (x, y). This compensation map is used to perform pixel-level substitution operations on the original image. The calculation method of the distorted image is as follows:
O(x,y)=I(x+u(x,y),y+v(x,y))
Therefore, each pixel of the generated image is taken from a pixel of the original image, and the position of that source pixel is determined by the compensation vector.
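The pixel-level substitution can be illustrated with a minimal NumPy sketch. The function name `warp_nearest`, the rounding of source coordinates to the nearest pixel, and the clamping at the image border are assumptions made for illustration; the text above only states the substitution formula.

```python
import numpy as np

def warp_nearest(image, u, v):
    """Apply O(x, y) = I(x + u(x, y), y + v(x, y)).

    Assumption: source coordinates are rounded to the nearest pixel and
    clamped to the image border.
    """
    h, w = image.shape[:2]
    ys, xs = np.mgrid[0:h, 0:w]                      # ys[y, x] = y, xs[y, x] = x
    src_x = np.clip(np.rint(xs + u).astype(int), 0, w - 1)
    src_y = np.clip(np.rint(ys + v).astype(int), 0, h - 1)
    return image[src_y, src_x]                        # gather one source pixel per output pixel

image = np.arange(12, dtype=float).reshape(3, 4)
u = np.ones((3, 4))    # every output pixel reads from one column to the right
v = np.zeros((3, 4))   # no vertical displacement
warped = warp_nearest(image, u, v)
```

In this toy case each row is shifted left by one pixel and the last column is repeated because of the border clamp.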
The original image, the sight line change angle, and the detected positions of the human eye feature points are used as the input of the coarse warping network, and the network generates a two-channel map $D_C$. The coarse warped image $O_C$ is generated by warping the original image $I$ as follows:
$O_C(x, y) = I\{(x, y) + D_C(x, y)\}$
where the braces denote bilinear interpolation.
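A sketch of warping with bilinear sampling is given below, under stated assumptions: the flow array layout `(H, W, 2)` holding `(u, v)` per pixel, the border clamping, and the name `warp_bilinear` are illustrative choices, not details taken from the patent.

```python
import numpy as np

def warp_bilinear(image, flow):
    """Sample I at (x, y) + D_C(x, y) using bilinear interpolation.

    image: (H, W) array. flow: (H, W, 2) array with (u, v) per pixel
    (assumed layout). Sample positions are clamped to the image border.
    """
    h, w = image.shape
    ys, xs = np.mgrid[0:h, 0:w].astype(float)
    sx = np.clip(xs + flow[..., 0], 0, w - 1)
    sy = np.clip(ys + flow[..., 1], 0, h - 1)
    x0 = np.floor(sx).astype(int)
    y0 = np.floor(sy).astype(int)
    x1 = np.minimum(x0 + 1, w - 1)
    y1 = np.minimum(y0 + 1, h - 1)
    wx = sx - x0                                      # horizontal blend weight
    wy = sy - y0                                      # vertical blend weight
    top = (1 - wx) * image[y0, x0] + wx * image[y0, x1]
    bottom = (1 - wx) * image[y1, x0] + wx * image[y1, x1]
    return (1 - wy) * top + wy * bottom

image = np.arange(12, dtype=float).reshape(3, 4)
half_right = np.zeros((3, 4, 2))
half_right[..., 0] = 0.5                              # shift sampling half a pixel to the right
coarse = warp_bilinear(image, half_right)
```

Bilinear sampling keeps the warp differentiable with respect to the flow, which is the usual reason for choosing it when the flow is predicted by a network.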
The CWN is trained using Mean Squared Error (MSE) between the generated image and the actual image as a loss function. The concrete network structure of the CWN is shown in fig. 4.
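For reference, the mean squared error used as the loss can be written in a few lines. This is only the standard definition; the function name `mse_loss` is an illustrative choice.

```python
import numpy as np

def mse_loss(generated, target):
    """Mean of squared per-pixel differences between two images."""
    generated = np.asarray(generated, dtype=float)
    target = np.asarray(target, dtype=float)
    return float(np.mean((generated - target) ** 2))

loss = mse_loss([[0, 2], [4, 6]], [[0, 0], [0, 0]])
```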
2. Fine Correction Network (FCN)
The result generated by the coarse warping network usually contains local defects, which seriously affect the realism of the picture. To address this problem, the fine correction network is used to finely correct the image generated by the coarse warping network.
The fine correction network is mainly divided into two parts: (1) a loop policy network that selects a block to be corrected at each step, and (2) a local correction network that corrects the defective block through convolution layers. The loop body of the FCN is as follows:
Given the image $I_{t-1}$ at step $t$, the loop policy network selects the coordinate position $l_t$ of a local block and crops the block $g(I_{t-1}, l_t)$ from the image:
$l_t = f_r(s_{t-1})$
where $s_{t-1}$ is the encoded state feature of the loop policy network, constructed jointly from the encoding of the input image $I_{t-1}$, the historical hidden state $h_{t-1}$, and the position $l_{t-1}$ of the block selected in the previous step; $g$ denotes the crop operation, whose result is the block of image $I_{t-1}$ at the given position $l_t$.
The block to be corrected selected at each step is then corrected by the local correction network $f_e$ to obtain a corrected block, and the corrected block directly replaces the block before correction to give the corrected image $I_t$ of that step. After $T$ steps, we obtain the final image $I_T$.
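The loop body described above can be sketched as a select, crop, correct, paste cycle. In this sketch the policy and the corrector are passed in as plain callables and replaced by toy stand-ins; the block size, the convention that a position is the top-left corner of the block, and all function names are assumptions made for illustration.

```python
import numpy as np

def crop(image, loc, size):
    """g(I, l): cut the size x size block whose top-left corner is loc = (row, col)."""
    r, c = loc
    return image[r:r + size, c:c + size]

def fine_correct(image, select_location, correct_block, steps, size):
    """Run T = steps rounds of select -> crop -> correct -> paste."""
    current = image.copy()
    prev_loc, hidden = (0, 0), None
    for _ in range(steps):
        loc, hidden = select_location(current, hidden, prev_loc)  # l_t from the state
        block = crop(current, loc, size)                          # block to be corrected
        fixed = correct_block(block, loc)                         # corrected block
        r, c = loc
        nxt = current.copy()
        nxt[r:r + size, c:c + size] = fixed                       # replace the old block
        current, prev_loc = nxt, loc
    return current

# Toy stand-ins (hypothetical): visit two fixed positions and zero each block.
def toy_policy(image, hidden, prev_loc):
    step = 0 if hidden is None else hidden
    return [(0, 0), (2, 2)][step % 2], step + 1

def toy_corrector(block, loc):
    return np.zeros_like(block)

image = np.ones((4, 4))
result = fine_correct(image, toy_policy, toy_corrector, steps=2, size=2)
```

Passing the networks as callables keeps the control flow separate from the learned components, so either part can be swapped without touching the loop.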
The specific implementation manner of the loop policy network is as follows: this process is considered a markov decision problem at discrete time intervals. At each step, the decision network encodes the current state characteristics and decides which part of the image of the human eye needs to be modified. Until the maximum number of steps is reached, the blocks of the human eye image are gradually modified and the state features are updated.
At the end of the correction sequence, a delayed global reward is taken to guide the training of the policy network. The policy network iteratively explores an optimal search path so that each individual eye image can achieve the maximum global reward, and the structural details of the network are shown in fig. 5.
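One decision step of such a policy can be sketched as follows. This is a minimal illustration under several assumptions that are not stated in the text: a standard GRU cell update, block positions normalized to [-1, 1], a Gaussian with fixed standard deviation for sampling the position, and hypothetical weight names (`We`, `Wz`, `Uz`, `Wr`, `Ur`, `Wh`, `Uh`, `Wl`).

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_step(x, h, p):
    """Standard GRU update producing the new hidden state from input x and state h."""
    z = sigmoid(p["Wz"] @ x + p["Uz"] @ h)            # update gate
    r = sigmoid(p["Wr"] @ x + p["Ur"] @ h)            # reset gate
    cand = np.tanh(p["Wh"] @ x + p["Uh"] @ (r * h))   # candidate state
    return (1.0 - z) * h + z * cand

def policy_step(features, prev_loc, h, p, rng, sigma=0.1):
    """Encode (image features, previous position), update h, sample a position."""
    x = np.tanh(p["We"] @ np.concatenate([features, prev_loc]))  # fully connected encoding
    h_new = gru_step(x, h, p)
    mu = np.tanh(p["Wl"] @ h_new)                     # mean position in [-1, 1]^2 (assumed)
    loc = np.clip(rng.normal(mu, sigma), -1.0, 1.0)   # sampled position (assumed Gaussian)
    return loc, h_new

rng = np.random.default_rng(0)
feat_dim, enc_dim, hid_dim = 8, 6, 5
shapes = {"We": (enc_dim, feat_dim + 2), "Wz": (hid_dim, enc_dim), "Uz": (hid_dim, hid_dim),
          "Wr": (hid_dim, enc_dim), "Ur": (hid_dim, hid_dim), "Wh": (hid_dim, enc_dim),
          "Uh": (hid_dim, hid_dim), "Wl": (2, hid_dim)}
params = {name: rng.normal(0.0, 0.5, size=shape) for name, shape in shapes.items()}

loc, h_new = policy_step(rng.normal(size=feat_dim), np.zeros(2), np.zeros(hid_dim), params, rng)
```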
The specific settings of the state, behavior and reward of the policy network are as follows:
State: the state $s_t$ is extracted from the input image of the current step and the history of past behaviors, and comprises three parts: (1) the feature map extracted from the human eye image $I_t$ of the current step, using the same convolutional network structure as that in the local correction network (the specific structure is described later); (2) the location $l_{t-1}$ of the block selected in the previous step; (3) the hidden unit $h_t$ of the LSTM layer, where the LSTM adopts a GRU network structure.
Behavior: in the policy network, the action is to select, from all possible locations, the location of the block to be corrected at this step. The network first encodes the feature map of the current image and the block position selected in the previous step through a fully connected layer, then combines the encoded vector with the historical hidden vector $h_{t-1}$ to generate a new hidden unit $h_t$. Finally, the policy network $\pi_\theta$ obtains the position $l_t$ of this step from $h_t$.
Reward: rewards guide the network in learning how to select a series of actions that achieve an optimal final output. The mean square error loss between the final output image and the real image is used as the reward for the network. The final delayed reward is generated only at the last step, and the errors of the intermediate steps are not counted in training. The reward $r_t$ at step $t$ is therefore as follows:
where $I_{gt}$ denotes the real image. In the method of this embodiment, the discount factor $\gamma$ is set to 1, i.e. the correction at each step is equally important for the evaluation of the final result.
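A delayed reward of this kind can be sketched as below. The sign convention (the negative mean squared error, so that a smaller error gives a larger reward) is an assumption, as are the function names; the text only states that the reward is based on the mean square error of the final image and is issued at the last step.

```python
import numpy as np

def episode_rewards(final_image, real_image, steps):
    """Zero reward at every intermediate step, one delayed reward at step T."""
    rewards = np.zeros(steps)
    rewards[-1] = -np.mean((final_image - real_image) ** 2)   # assumed sign: negative MSE
    return rewards

def discounted_returns(rewards, gamma=1.0):
    """Return-to-go for each step; with gamma = 1 every step sees the full final reward."""
    out = np.zeros_like(rewards)
    running = 0.0
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        out[t] = running
    return out

rewards = episode_rewards(np.full((2, 2), 0.5), np.zeros((2, 2)), steps=3)
returns = discounted_returns(rewards, gamma=1.0)
```

With the discount factor set to 1, all three steps in this toy episode receive the same return, which matches the statement that each correction step counts equally toward the final result.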
The local correction network is specifically as follows:
According to the location $l_t$ obtained from the loop policy network, the image $I_{t-1}$ is cropped to obtain the block to be corrected. The position $l_t$ is encoded and merged with the block as the input of the network. A residual map $\Delta$ is obtained through a deep convolutional network containing a series of convolution layers; its values are added directly to the block before correction, and the result replaces the original block as the corrected block. The specific flow is shown in fig. 6.
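The residual update can be sketched as follows, with the convolutional network replaced by a hypothetical stand-in `toy_residual` that simply pulls the block toward its mean. The top-left position convention and the function names are assumptions; a real network would also consume the encoded position, which the stand-in ignores.

```python
import numpy as np

def apply_residual(image, loc, size, residual_fn):
    """Crop the block at loc, add the predicted residual map, paste the result back."""
    r, c = loc
    block = image[r:r + size, c:c + size]
    delta = residual_fn(block, loc)                   # residual map, same shape as the block
    out = image.copy()
    out[r:r + size, c:c + size] = block + delta       # corrected block replaces the original
    return out

def toy_residual(block, loc):
    # Hypothetical stand-in for the convolutional network; loc is unused here.
    return 0.5 * (block.mean() - block)

image = np.arange(16, dtype=float).reshape(4, 4)
corrected = apply_residual(image, loc=(1, 1), size=2, residual_fn=toy_residual)
```

Predicting a residual rather than the block itself means that a zero output leaves the image unchanged, which is a common motivation for this design.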
The optimization method comprises the following steps:
The loop policy network and the local correction network are trained jointly within a reinforcement learning framework. The overall formula for the optimization problem is as follows:
First, the loop policy network is optimized using the following formula, where $\{\mu, \Sigma\}$ is given by $\pi_\theta(s_t)$:
Second, the local correction network is optimized:
The local convolutional network updates and optimizes its parameters at every step. The mean square error is still used as the loss function for the back-propagated error. The optimization of the local convolutional network does not affect the parameters of the loop policy network.
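For orientation, a policy network trained from a single delayed reward is commonly optimized with the REINFORCE gradient estimator. The block below is that textbook form written in the notation of this description; it is an assumption offered as a sketch, not the patent's own formula, and the symbols $J$ and $R$ are introduced here for illustration.

```latex
% Assumed notation: J is the expected return, R is the return of one correction sequence.
\nabla_\theta J(\theta) \approx \sum_{t=1}^{T} \nabla_\theta \log \pi_\theta(l_t \mid s_{t-1}) \, R,
\qquad
R = \sum_{t=1}^{T} \gamma^{\,t-1} r_t = r_T
\quad (\gamma = 1,\ r_t = 0 \text{ for } t < T)
```

Because the reward is issued only at the last step and the discount factor is 1, the return seen by every step reduces to the final reward $r_T$.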
According to the human eye sight line correction method based on deep learning provided by the embodiment of the invention, for the input human eye picture, a coarse-stage generated image is obtained using a warping-based method, and defect areas in the coarse-stage output are then detected using a loop policy network based on deep reinforcement learning. The detected defect areas are refined by a local correction network that takes global visual features into account, which largely eliminates the visual defects and unrealistic appearance caused by specific factors such as illumination, texture, and occlusion.
Next, a system for correcting a line of sight of a human eye based on deep learning according to an embodiment of the present invention will be described with reference to the accompanying drawings.
Fig. 7 is a schematic structural diagram of a human eye vision correcting system based on deep learning according to an embodiment of the invention.
As shown in fig. 7, the system for correcting the line of sight of the human eye includes: an acquisition module 100, a processing module 200 and a correction module 300.
The acquisition module 100 is configured to acquire a human eye picture. The processing module 200 is configured to process the human eye picture through the coarse warping network to obtain a coarse-stage human eye generated image. The correction module 300 is configured to detect a defect area in the human eye generated image through the fine correction network and correct the defect area. The system effectively reduces the error between the generated image and the real image, eliminates the visual defects and unrealistic appearance in the image, and can recover image details such as reflective bright spots.
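The three-module structure can be sketched as a small container of callables. The class name `GazeCorrectionSystem`, the field names, and the toy lambdas are hypothetical; they only illustrate how the acquisition, processing, and correction modules are chained.

```python
from dataclasses import dataclass
from typing import Callable

import numpy as np

@dataclass
class GazeCorrectionSystem:
    acquire: Callable[[], np.ndarray]                  # acquisition module 100
    coarse_warp: Callable[[np.ndarray], np.ndarray]    # processing module 200
    fine_correct: Callable[[np.ndarray], np.ndarray]   # correction module 300

    def run(self) -> np.ndarray:
        picture = self.acquire()
        coarse = self.coarse_warp(picture)             # coarse-stage generated image
        return self.fine_correct(coarse)               # defect areas detected and corrected

# Toy stand-ins so the pipeline can be exercised end to end.
system = GazeCorrectionSystem(
    acquire=lambda: np.zeros((2, 2)),
    coarse_warp=lambda img: img + 1.0,
    fine_correct=lambda img: img * 2.0,
)
output = system.run()
```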
Further, in an embodiment of the present invention, the system further includes a generation module configured to train a convolutional neural network with a coarse-fine structure to generate a two-dimensional compensation map; the compensation map performs a pixel-level substitution operation on the original image to generate a training image, and the coarse distortion network is trained using the mean square error between the generated training image and the original image as the loss function.
Further, in one embodiment of the present invention, the fine correction network comprises: a cyclic strategy network and a local correction network;
the correction module comprises: a detection unit and a correction unit;
the detection unit is configured to detect a defect area in the human eye generated image through the cyclic strategy network;
and the correction unit is configured to correct the defect area through a convolution layer by means of the local correction network.
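One plausible reading of the pixel-level substitution is that the two-dimensional compensation map stores, for every output pixel, an offset to the input pixel whose value it should take. The sketch below follows that reading; the offset format `(dy, dx)`, the border clamping, and the name `apply_compensation_map` are assumptions, since the text does not define the map's layout.

```python
def apply_compensation_map(image, offsets):
    """Build an output image where pixel (y, x) copies input pixel (y + dy, x + dx).

    `offsets[y][x]` is an integer pair (dy, dx); source coordinates are clamped
    to the image border (an assumed boundary policy).
    """
    height, width = len(image), len(image[0])
    output = []
    for y in range(height):
        row = []
        for x in range(width):
            dy, dx = offsets[y][x]
            src_y = min(max(y + dy, 0), height - 1)
            src_x = min(max(x + dx, 0), width - 1)
            row.append(image[src_y][src_x])
        output.append(row)
    return output


# Every pixel looks one column to the right; the last column is clamped to itself.
original = [[1, 2], [3, 4]]
shift_right = [[(0, 1), (0, 1)], [(0, 1), (0, 1)]]
warped = apply_compensation_map(original, shift_right)
```

A trained network would typically predict fractional offsets and sample with interpolation; integer offsets are used here only to keep the example short.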
Further, in an embodiment of the present invention, the detection unit is specifically configured to:
given the image $I_{t-1}$ at step $t$, the cyclic strategy network selects the coordinate position $l_t$ of a local block in order to select a block:
$l_t = f_r(s_{t-1})$
where $s_{t-1}$ is the encoded state feature of the cyclic strategy network, constructed jointly from the input image $I_{t-1}$, the encoded historical hidden state $h_{t-1}$, and the position $l_{t-1}$ of the block selected in the previous step; $g$ denotes a crop operation whose result is the block of image $I_{t-1}$ clipped at the given position $l_t$.
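The crop operation g can be illustrated with a few lines of Python. In this sketch the position is taken to be the top-left corner of the block and the block size is fixed in advance; both conventions, along with the name `crop_block`, are assumptions not stated in the text.

```python
def crop_block(image, position, block_size):
    """Return the sub-image of `image` whose top-left corner is `position`."""
    top, left = position
    block_h, block_w = block_size
    if top < 0 or left < 0 or top + block_h > len(image) or left + block_w > len(image[0]):
        raise ValueError("block does not fit inside the image")
    return [row[left:left + block_w] for row in image[top:top + block_h]]


image = [
    [0, 1, 2],
    [3, 4, 5],
    [6, 7, 8],
]
# Crop the 2x2 block in the lower-right corner.
block = crop_block(image, position=(1, 1), block_size=(2, 2))
```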
Further, in an embodiment of the present invention, the correction unit is specifically configured to:
correct the block selected at each step with the local correction network $f_e$ to obtain a corrected block, and then directly replace the block before correction with the corrected block to obtain the corrected image $I_t$ for this step; after $T$ steps, the final image $I_T$ is obtained.
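The select, crop, correct and replace cycle described above can be sketched as a plain loop. Here `select_position` stands in for the cyclic strategy network and `correct_block` for the local correction network; both are hypothetical callables, the hidden state is omitted, and the top-left position convention is an assumption.

```python
def refine_image(image, select_position, correct_block, block_size, num_steps):
    """Repeat for `num_steps`: pick a block position, crop it, correct it, paste it back."""
    block_h, block_w = block_size
    current = [row[:] for row in image]  # copy so the input image is left untouched
    previous_position = None
    for _ in range(num_steps):
        top, left = select_position(current, previous_position)
        block = [row[left:left + block_w] for row in current[top:top + block_h]]
        corrected = correct_block(block)
        # Directly replace the block before correction with the corrected block.
        for offset, new_row in enumerate(corrected):
            current[top + offset][left:left + block_w] = new_row
        previous_position = (top, left)
    return current


def walk_right(image, previous_position):
    """Toy policy: start at the top-left pixel, then move one column right each step."""
    if previous_position is None:
        return (0, 0)
    return (previous_position[0], previous_position[1] + 1)


def set_to_one(block):
    """Toy correction: overwrite every pixel of the block with 1."""
    return [[1 for _ in row] for row in block]


coarse = [[0, 0, 0], [0, 0, 0]]
final = refine_image(coarse, walk_right, set_to_one, block_size=(1, 1), num_steps=2)
```

With the toy policy and correction above, two steps touch only the first two pixels of the top row, and the coarse input is not modified.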
It should be noted that the foregoing explanation of the embodiment of the method for correcting the line of sight of the human eye based on deep learning is also applicable to the system of the embodiment, and is not repeated here.
According to the deep-learning-based human eye sight line correction system provided by the embodiment of the invention, a coarse-stage generated image is obtained from the input human eye picture by a warping-based method, and defect areas in the coarse-stage output are detected by a cyclic strategy network based on deep reinforcement learning. The detected defect areas are refined by a local correction network that takes global visual features into account, which largely eliminates the visual defects and lack of realism caused by factors such as illumination, texture and occlusion.
Furthermore, the terms "first" and "second" are used for descriptive purposes only and are not to be construed as indicating or implying relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defined as "first" or "second" may explicitly or implicitly include at least one such feature. In the description of the present invention, "a plurality" means at least two, e.g., two, three, etc., unless specifically limited otherwise.
In the description herein, references to the terms "one embodiment," "some embodiments," "an example," "a specific example," or "some examples," etc., mean that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the invention. In this specification, the schematic representations of the terms used above are not necessarily intended to refer to the same embodiment or example. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples. In addition, those skilled in the art can combine the various embodiments or examples described in this specification, and the features of different embodiments or examples, provided they do not contradict one another.
Although embodiments of the present invention have been shown and described above, it is understood that the above embodiments are exemplary and should not be construed as limiting the present invention, and that variations, modifications, substitutions and alterations can be made to the above embodiments by those of ordinary skill in the art within the scope of the present invention.
Claims (8)
1. A human eye sight line correction method based on deep learning is characterized by comprising the following steps:
acquiring a human eye picture;
processing the human eye picture through a coarse distortion network to obtain a human eye generated image in a coarse stage;
detecting a defect area in the human eye generated image through a fine correction network, and correcting the defect area;
wherein the fine correction network comprises: a cyclic strategy network and a local correction network;
the detecting the defect area in the human eye generated image through the fine correction network and correcting the defect area comprises the following steps:
the cyclic strategy network detects a defect area in the human eye generated image;
the local correction network corrects the defect area through a convolution layer.
2. The method for correcting the sight line of the human eye based on deep learning according to claim 1, wherein before the processing of the human eye picture through the coarse distortion network to obtain the human eye generated image in the coarse stage, the method further comprises:
training a convolutional neural network with a coarse-fine structure to generate a two-dimensional compensation map;
the compensation map carries out pixel-level substitution operation on the original image to generate a training image;
training the coarse distortion network using a mean square error between the generated training image and the original image as a loss function.
3. The method for correcting the sight line of the human eye based on deep learning according to claim 1, wherein the cyclic strategy network detecting a defect area in the human eye generated image comprises the following steps:
given the image $I_{t-1}$ at step $t$, the cyclic strategy network selects the coordinate position $l_t$ of a local block in order to select a block:
$l_t = f_r(s_{t-1})$
where $s_{t-1}$ is the encoded state feature of the cyclic strategy network, constructed jointly from the input image $I_{t-1}$, the encoded historical hidden state $h_{t-1}$, and the position $l_{t-1}$ of the block selected in the previous step; $g$ denotes a crop operation whose result is the block of image $I_{t-1}$ clipped at the given position $l_t$.
4. The deep learning based human eye vision correction method according to claim 3,
the local correction network corrects the defect area through a convolution layer, and comprises:
5. A system for correcting a line of sight of a human eye based on deep learning, comprising:
the acquisition module is used for acquiring a human eye picture;
the processing module is used for processing the human eye picture through a coarse distortion network to obtain a human eye generated image in a coarse stage;
the correction module is used for detecting a defect area in the human eye generated image through a fine correction network and correcting the defect area; wherein the fine correction network comprises: a cyclic strategy network and a local correction network;
the correction module comprises: a detection unit and a correction unit;
the detection unit is used for detecting a defect area in the human eye generated image through the cyclic strategy network;
the correction unit is used for correcting the defect area through a convolution layer by means of the local correction network.
6. The deep learning based human eye gaze correction system of claim 5, further comprising: a generation module;
the generation module is used for training a convolutional neural network with a coarse-fine structure to generate a two-dimensional compensation map, the compensation map performs a pixel-level substitution operation on an original image to generate a training image, and the coarse distortion network is trained by using the mean square error between the generated training image and the original image as a loss function.
7. The deep learning based human eye gaze correction system of claim 6,
the detection unit is specifically configured to:
given the image $I_{t-1}$ at step $t$, the cyclic strategy network selects the coordinate position $l_t$ of a local block in order to select a block:
$l_t = f_r(s_{t-1})$
where $s_{t-1}$ is the encoded state feature of the cyclic strategy network, constructed jointly from the input image $I_{t-1}$, the encoded historical hidden state $h_{t-1}$, and the position $l_{t-1}$ of the block selected in the previous step; $g$ denotes a crop operation whose result is the block of image $I_{t-1}$ clipped at the given position $l_t$.
8. The deep learning based human eye gaze correction system of claim 7,
the correction unit is specifically configured to:
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910175164.0A CN109978804B (en) | 2019-03-08 | 2019-03-08 | Human eye sight line correction method and system based on deep learning |
Publications (2)
Publication Number | Publication Date |
---|---|
CN109978804A (en) | 2019-07-05 |
CN109978804B (en) | 2021-02-26 |
Family
ID=67078291
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910175164.0A Active CN109978804B (en) | 2019-03-08 | 2019-03-08 | Human eye sight line correction method and system based on deep learning |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109978804B (en) |
Families Citing this family (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111008929B (en) * | 2019-12-19 | 2023-09-26 | 维沃移动通信(杭州)有限公司 | Image correction method and electronic equipment |
CN111339928B (en) * | 2020-02-25 | 2022-06-28 | 苏州科达科技股份有限公司 | Eye spirit adjusting method and device and storage medium |
CN113343931B (en) * | 2021-07-05 | 2024-07-26 | Oppo广东移动通信有限公司 | Training method for generating countermeasure network, image vision correction method and device |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107944415A (en) * | 2017-12-06 | 2018-04-20 | 董伟 | A kind of human eye notice detection method based on deep learning algorithm |
CN108022213A (en) * | 2017-11-29 | 2018-05-11 | 天津大学 | Video super-resolution algorithm for reconstructing based on generation confrontation network |
CN108492273A (en) * | 2018-03-28 | 2018-09-04 | 深圳市唯特视科技有限公司 | A kind of image generating method based on from attention model |
CN108765340A (en) * | 2018-05-29 | 2018-11-06 | Oppo(重庆)智能科技有限公司 | Fuzzy image processing method, apparatus and terminal device |
CN108885784A (en) * | 2016-04-22 | 2018-11-23 | 英特尔公司 | It is corrected using the real-time eye contact of machine learning neural network based |
CN109102532A (en) * | 2017-06-20 | 2018-12-28 | 西门子保健有限责任公司 | The metaplasia of deep learning for medical imaging |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US8599238B2 (en) * | 2009-10-16 | 2013-12-03 | Apple Inc. | Facial pose improvement with perspective distortion correction |
Non-Patent Citations (3)
Title |
---|
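"Eye Gaze Correction with Stereovision for Video-Teleconferencing"; Ruigang Yang et al.; IEEE Transactions on Pattern Analysis and Machine Intelligence; 2004-07-31; Vol. 26, No. 7; pp. 956-960 * |
"Human eye gaze direction estimation based on Hough transform and gradient information"; Sun Xinghua et al.; Journal of Chinese Computer Systems; 2007-06-30; Vol. 28, No. 6; pp. 1123-1128 * |
"Virtual-viewpoint-adaptive gaze correction method"; Yin Linglin et al.; Journal of Computer-Aided Design & Computer Graphics; 2013-12-31; Vol. 25, No. 12; pp. 1834-1841 * |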
Also Published As
Publication number | Publication date |
---|---|
CN109978804A (en) | 2019-07-05 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||