CN109978804B - Human eye sight line correction method and system based on deep learning - Google Patents


Publication number
CN109978804B
Authority
CN
China
Prior art keywords
image
network
human eye
correction
block
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910175164.0A
Other languages
Chinese (zh)
Other versions
CN109978804A (en
Inventor
鲁继文
周杰
任亮亮
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tsinghua University
Original Assignee
Tsinghua University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tsinghua University
Priority to CN201910175164.0A
Publication of CN109978804A
Application granted
Publication of CN109978804B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/044 Recurrent networks, e.g. Hopfield networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00 Image enhancement or restoration
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00 Image enhancement or restoration
    • G06T5/50 Image enhancement or restoration using two or more images, e.g. averaging or subtraction

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Biomedical Technology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Image Analysis (AREA)
  • Image Processing (AREA)

Abstract

The invention discloses a human eye sight line correction method and system based on deep learning. The method comprises the following steps: acquiring a human eye image; processing the human eye image through a coarse warping network to obtain a coarse-stage generated eye image; and detecting defective regions in the generated eye image through a fine correction network and correcting them. For an input human eye image, the method first obtains a coarse-stage generated image using a warping-based approach, and then detects defective regions in the coarse-stage output using a recurrent policy network based on deep reinforcement learning. This effectively reduces the error between the generated image and the real image, eliminates visual defects and unrealistic appearance in the image, and recovers image details such as reflective bright spots.

Description

Human eye sight line correction method and system based on deep learning
Technical Field
The invention relates to the technical field of digital image processing, in particular to a human eye sight line correction method and system based on deep learning.
Background
Gaze correction is the processing of an image of a person's eyes to change the direction in which the person appears to be looking. It has practical value and broad prospects in communication scenarios such as video calls. However, because images or videos of human eyes can vary greatly in size, resolution, viewing angle, illumination, texture, and occlusion during acquisition, gaze correction in the real world remains a challenging problem.
Existing gaze correction methods fall mainly into two categories: graphics-based gaze correction and gaze correction based on pixel warping. Methods in the first category use 3D eye models with artificial textures to simulate continuous motion of the eyes and head, rendering eye images through large-scale geometric rendering with a dynamic and controllable eye-region model. However, the eye images synthesized in this way differ considerably from real eye images. These methods also require a 3D model of the eyes, which is costly to construct, so they are of limited use in practice. Methods in the second category learn a warping function that predicts a warping flow field, and generate the gaze-corrected image directly from the original eye image. For example, Ganin et al. propose a deep feed-forward system that combines coarse-to-fine processing, image warping, and intensity correction. Kononenko et al. propose an eye warping-field method that uses random forests as predictors and can run on a CPU (Central Processing Unit) in real time. Because the warping function is pose-specific, eye images with different gaze directions and head poses can be used to synthesize more realistic images, which addresses the variation in head pose and gaze angle encountered in practice. However, human eye images usually have complicated texture, lighting, and occlusion, and the influence of these factors is difficult to handle with a single global correction operation. As shown in Fig. 1, images generated using warping alone can show obvious defects and look unrealistic.
In recent years, deep learning has achieved notable success in various vision applications, such as object detection, object tracking, object search, and action recognition. Current deep reinforcement learning methods can be divided into two categories: deep Q-learning and policy gradients. In the first category, Q-values are fitted to capture the expected return of taking a particular action in a particular state. For example, the collaborative deep reinforcement learning method proposed by Kong et al. jointly localizes objects over several iterations. In the second category, the policy distribution is represented explicitly and the policy is optimized by updating its parameters along the gradient direction. Liu et al. apply a policy gradient method to optimize captioning metrics and generative adversarial networks. Recently, deep reinforcement learning has also played an important role in face recognition and synthesis.
Disclosure of Invention
The present invention is directed to solving, at least to some extent, one of the technical problems in the related art.
Therefore, an object of the present invention is to provide a method for correcting the line of sight of a human eye based on deep learning, which effectively reduces the error between the generated image and the real image, eliminates visual defects and unrealistic appearance in the image, and can recover image details such as reflective bright spots.
Another object of the present invention is to provide a system for correcting the visual line of human eyes based on deep learning.
In order to achieve the above object, an embodiment of one aspect of the present invention provides a method for correcting the line of sight of a human eye based on deep learning, including: acquiring a human eye image; processing the human eye image through a coarse warping network to obtain a coarse-stage generated eye image; and detecting a defective region in the generated eye image through a fine correction network, and correcting the defective region.
According to the human eye sight line correction method based on deep learning of the embodiment of the present invention, for an input human eye image, a coarse-stage generated image is obtained using a warping-based method, and defective regions in the coarse-stage output are then detected using a recurrent policy network based on deep reinforcement learning. The detected defective regions are refined by a local correction network that takes global visual features into account, which largely eliminates the visual defects and unrealistic appearance caused by factors such as illumination, texture, and occlusion.
In addition, the method for correcting the sight line of the human eye based on the deep learning according to the above embodiment of the present invention may further have the following additional technical features:
Further, in an embodiment of the present invention, before processing the human eye image through the coarse warping network to obtain the coarse-stage generated eye image, the method further includes: training a convolutional neural network with a coarse-to-fine structure to generate a two-dimensional compensation map; performing a pixel-level replacement operation on the original image using the compensation map to generate a training image; and training the coarse warping network using the mean square error between the generated training image and the original image as the loss function.
Further, in an embodiment of the present invention, the fine correction network includes a recurrent policy network and a local correction network.
Detecting a defective region in the generated eye image through the fine correction network and correcting the defective region includes: detecting, by the recurrent policy network, a defective region in the generated eye image; and correcting, by the local correction network, the defective region through convolutional layers.
Further, in one embodiment of the present invention, detecting a defective region in the generated eye image by the recurrent policy network includes:
Given the image $I_{t-1}$ at step $t$, the recurrent policy network selects the coordinate position $l_t$ of a local block in order to select a block $P_t$:
$$l_t = f_r(s_{t-1})$$
$$P_t = g(I_{t-1}, l_t)$$
where $s_{t-1}$ is the encoded state feature of the recurrent policy network, constructed jointly from the input image $I_{t-1}$, the encoded historical hidden state $h_{t-1}$, and the position $l_{t-1}$ of the block selected in the previous step; $g$ denotes the crop operation, whose result is the block of image $I_{t-1}$ cropped at the given position $l_t$.
Further, in an embodiment of the present invention, correcting the defective region by the local correction network through convolutional layers includes: the block $P_t$ to be corrected that is selected at each step is corrected by the local correction network $f_e$ to obtain a corrected block, and the corrected block directly replaces the block before correction to give the corrected image $I_t$ of that step. After $T$ steps, the final image $I_T$ is obtained.
In order to achieve the above object, an embodiment of another aspect of the present invention provides a system for correcting the line of sight of a human eye based on deep learning, including: an acquisition module for acquiring a human eye image; a processing module for processing the human eye image through a coarse warping network to obtain a coarse-stage generated eye image; and a correction module for detecting a defective region in the generated eye image through a fine correction network and correcting the defective region.
According to the human eye sight line correction system based on deep learning of the embodiment of the present invention, for an input human eye image, a coarse-stage generated image is obtained using a warping-based method, and defective regions in the coarse-stage output are then detected using a recurrent policy network based on deep reinforcement learning. The detected defective regions are refined by a local correction network that takes global visual features into account, which largely eliminates the visual defects and unrealistic appearance caused by factors such as illumination, texture, and occlusion.
In addition, the human eye sight line correction system based on deep learning according to the above embodiment of the invention may also have the following additional technical features:
Further, in an embodiment of the present invention, the system further includes a generating module, which is used for training a convolutional neural network with a coarse-to-fine structure to generate a two-dimensional compensation map, performing a pixel-level replacement operation on the original image using the compensation map to generate a training image, and training the coarse warping network using the mean square error between the generated training image and the original image as the loss function.
Further, in an embodiment of the present invention, the fine correction network includes a recurrent policy network and a local correction network.
The correction module includes a detection unit and a correction unit.
The detection unit is used for detecting, by the recurrent policy network, a defective region in the generated eye image; the correction unit is used for correcting, by the local correction network, the defective region through convolutional layers.
Further, in an embodiment of the present invention, the detection unit is specifically configured as follows:
Given the image $I_{t-1}$ at step $t$, the recurrent policy network selects the coordinate position $l_t$ of a local block in order to select a block $P_t$:
$$l_t = f_r(s_{t-1})$$
$$P_t = g(I_{t-1}, l_t)$$
where $s_{t-1}$ is the encoded state feature of the recurrent policy network, constructed jointly from the input image $I_{t-1}$, the encoded historical hidden state $h_{t-1}$, and the position $l_{t-1}$ of the block selected in the previous step; $g$ denotes the crop operation, whose result is the block of image $I_{t-1}$ cropped at the given position $l_t$.
Further, in an embodiment of the present invention, the correction unit is specifically configured as follows:
The block $P_t$ to be corrected that is selected at each step is corrected by the local correction network $f_e$ to obtain a corrected block, and the corrected block directly replaces the block before correction to give the corrected image $I_t$ of that step. After $T$ steps, the final image $I_T$ is obtained.
Additional aspects and advantages of the invention will be set forth in part in the description which follows and, in part, will be obvious from the description, or may be learned by practice of the invention.
Drawings
The foregoing and/or additional aspects and advantages of the present invention will become apparent and readily appreciated from the following description of the embodiments, taken in conjunction with the accompanying drawings of which:
FIG. 1 is a schematic diagram of an image generated using a warping method according to an embodiment of the present invention;
FIG. 2 is a flowchart of a human eye sight line correction method based on deep learning according to an embodiment of the present invention;
FIG. 3 is a flowchart of a human eye sight line correction method based on deep learning according to an embodiment of the present invention;
FIG. 4 is a schematic diagram of a coarse warping network according to an embodiment of the present invention;
FIG. 5 is a block diagram of a recurrent policy network according to one embodiment of the present invention;
FIG. 6 is a flow diagram of a local correction network in accordance with one embodiment of the present invention;
FIG. 7 is a schematic structural diagram of a human eye sight line correction system based on deep learning according to an embodiment of the present invention.
Detailed Description
Reference will now be made in detail to embodiments of the present invention, examples of which are illustrated in the accompanying drawings, wherein like or similar reference numerals refer to the same or similar elements or elements having the same or similar function throughout. The embodiments described below with reference to the drawings are illustrative, are intended to explain the present invention, and are not to be construed as limiting it.
The following describes a human eye sight line correction method and system based on deep learning according to an embodiment of the present invention with reference to the accompanying drawings.
First, a method for correcting a line of sight of a human eye based on deep learning according to an embodiment of the present invention will be described with reference to the accompanying drawings.
Fig. 2 is a flowchart of a method for correcting a line of sight of a human eye based on deep learning according to an embodiment of the invention.
As shown in fig. 2, the method for correcting the sight line of the human eye based on the deep learning comprises the following steps:
In step S101, a human eye image is acquired.
In step S102, the human eye image is processed through the coarse warping network to obtain a coarse-stage generated eye image.
Further, in an embodiment of the present invention, before step S102, the method further includes: training a convolutional neural network with a coarse-to-fine structure to generate a two-dimensional compensation map; performing a pixel-level replacement operation on the original image using the compensation map to generate a training image; and training the coarse warping network using the mean square error between the generated training image and the original image as the loss function.
Specifically, the coarse warping network obtained through the above steps performs a preliminary processing of the human eye image acquired in step S101, yielding the coarse-stage generated eye image.
In step S103, a defective region in the generated eye image is detected by the fine correction network, and the defective region is corrected.
Further, in one embodiment of the present invention, the fine correction network comprises a recurrent policy network and a local correction network.
Detecting a defective region in the generated eye image through the fine correction network and correcting the defective region includes: detecting, by the recurrent policy network, a defective region in the generated eye image; and correcting, by the local correction network, the defective region through convolutional layers.
Detecting a defective region in the generated eye image by the recurrent policy network includes the following:
Given the image $I_{t-1}$ at step $t$, the recurrent policy network selects the coordinate position $l_t$ of a local block in order to select a block $P_t$:
$$l_t = f_r(s_{t-1})$$
$$P_t = g(I_{t-1}, l_t)$$
where $s_{t-1}$ is the encoded state feature of the recurrent policy network, constructed jointly from the input image $I_{t-1}$, the encoded historical hidden state $h_{t-1}$, and the position $l_{t-1}$ of the block selected in the previous step; $g$ denotes the crop operation, whose result is the block of image $I_{t-1}$ cropped at the given position $l_t$.
Further, in an embodiment of the present invention, correcting the defective region by the local correction network through convolutional layers includes: the block $P_t$ to be corrected that is selected at each step is corrected by the local correction network $f_e$ to obtain a corrected block, and the corrected block directly replaces the block before correction to give the corrected image $I_t$ of that step. After $T$ steps, the final image $I_T$ is obtained.
The method of the embodiment of the present invention differs from warping-only methods and from methods that correct the generated image as a whole, and noticeably improves the overall result. By using a recurrent policy network based on deep reinforcement learning to assign new regions of interest dynamically, step by step, visually defective blocks (patches) of the coarse-stage generated image can be detected. The detected blocks are corrected by a local correction network that takes global visual features into account, so that the error between the generated image and the real image is effectively reduced, visual defects and unrealistic appearance in the image are eliminated, and image details such as reflective bright spots are recovered.
The embodiment of the present invention performs gaze correction with a two-stage, coarse-to-fine method. As shown in Fig. 3, for a given human eye image $I$ and a gaze change angle $\alpha$, the method is divided into two parts: a Coarse Warping Network (CWN) and a Fine Correction Network (FCN). The CWN is used in the first step to modify the image as a whole through pixel replacement operations. The FCN is used in the second step to refine the image output by the CWN and increase the realism of the generated image.
The following describes a method for correcting a line of sight of a human eye based on deep learning according to an embodiment of the present invention.
1. Coarse Warping Network (CWN)
The task of the coarse warping network is to generate a warping flow field for warping the original image. To achieve this, a convolutional neural network with a coarse-to-fine structure is trained to generate a two-dimensional compensation map. The map has a compensation vector $(u(x, y), v(x, y))$ for each pixel $(x, y)$. This compensation map is used to perform a pixel-level replacement operation on the original image. The warped image is computed as follows:
$$O(x, y) = I(x + u(x, y),\ y + v(x, y))$$
Thus, each pixel of the generated image is replaced by a pixel of the original image, and the position of the replacing pixel is determined by the compensation vector.
The original image, the gaze change angle, and the detected positions of the eye feature points are used as the input of the coarse warping network, which generates a two-channel flow map $D_C$. The coarse warped image $O_C$ is generated by warping the original image $I$ as follows:
$$O_C(x, y) = I\{(x, y) + D_C(x, y)\}$$
where the braces denote bilinear interpolation.
The CWN is trained using Mean Squared Error (MSE) between the generated image and the actual image as a loss function. The concrete network structure of the CWN is shown in fig. 4.
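The warping and loss computation described above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the patent's implementation: the helper names `bilinear_sample`, `warp`, and `mse` are hypothetical, the image is a single-channel list of rows, and clamping out-of-range coordinates to the image border is a choice the text does not specify.

```python
import math


def bilinear_sample(image, x, y):
    """Sample a single-channel image (list of rows) at real-valued (x, y).

    Coordinates outside the image are clamped to the border (an assumption).
    """
    height, width = len(image), len(image[0])
    x = min(max(x, 0.0), width - 1.0)
    y = min(max(y, 0.0), height - 1.0)
    x0, y0 = int(math.floor(x)), int(math.floor(y))
    x1, y1 = min(x0 + 1, width - 1), min(y0 + 1, height - 1)
    ax, ay = x - x0, y - y0
    top = (1 - ax) * image[y0][x0] + ax * image[y0][x1]
    bottom = (1 - ax) * image[y1][x0] + ax * image[y1][x1]
    return (1 - ay) * top + ay * bottom


def warp(image, flow_u, flow_v):
    """Compute O(x, y) = I(x + u(x, y), y + v(x, y)) with bilinear interpolation."""
    height, width = len(image), len(image[0])
    return [
        [bilinear_sample(image, x + flow_u[y][x], y + flow_v[y][x])
         for x in range(width)]
        for y in range(height)
    ]


def mse(a, b):
    """Mean squared error between two images of the same size."""
    count = len(a) * len(a[0])
    total = sum((pa - pb) ** 2
                for row_a, row_b in zip(a, b)
                for pa, pb in zip(row_a, row_b))
    return total / count
```

With a zero flow field, `warp` returns the input unchanged, so `mse(warp(I, 0, 0), I)` is zero; a trained network would instead supply the two flow channels.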
2. Fine Correction Network (FCN)
The result generated by the coarse warping network usually contains local defects, which seriously affect the realism of the image. To address this problem, the fine correction network is used to finely correct the image generated by the coarse warping network.
The fine correction network is mainly divided into two parts: (1) a recurrent policy network, which selects a block to be corrected at each step; and (2) a local correction network, which corrects the defective block through convolutional layers. The main loop of the FCN is as follows:
Given the image $I_{t-1}$ at step $t$, the recurrent policy network selects the coordinate position $l_t$ of a local block in order to select a block $P_t$:
$$l_t = f_r(s_{t-1})$$
$$P_t = g(I_{t-1}, l_t)$$
where $s_{t-1}$ is the encoded state feature of the recurrent policy network, constructed jointly from the input image $I_{t-1}$, the encoded historical hidden state $h_{t-1}$, and the position $l_{t-1}$ of the block selected in the previous step; $g$ denotes the crop operation, whose result is the block of image $I_{t-1}$ cropped at the given position $l_t$.
The block $P_t$ to be corrected that is selected at each step is then corrected by the local correction network $f_e$ to obtain a corrected block, and the corrected block directly replaces the block before correction to give the corrected image $I_t$ of that step. After $T$ steps, the final image $I_T$ is obtained.
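The select, crop, correct, and replace loop can be sketched as follows. This is only an illustration of the control flow: `select_location` stands in for the policy $f_r$ and `correct_block` for the local correction network $f_e$, both passed in as plain callables, and the names `crop`, `paste`, and `refine`, the `(top, left)` location format, and the fixed square block size are all assumptions not taken from the patent.

```python
def crop(image, loc, size):
    """g(I, l): cut a size x size block whose top-left corner is at loc."""
    top, left = loc
    return [row[left:left + size] for row in image[top:top + size]]


def paste(image, block, loc):
    """Return a copy of image with block written back at loc."""
    top, left = loc
    out = [row[:] for row in image]
    for dy, block_row in enumerate(block):
        out[top + dy][left:left + len(block_row)] = block_row
    return out


def refine(image, select_location, correct_block, steps, size):
    """Run T = steps rounds of select, crop, correct, and replace.

    select_location(image, state) -> (loc, new_state) plays the role of f_r.
    correct_block(block, loc) -> corrected block plays the role of f_e.
    """
    state = {"prev_loc": None, "hidden": None}  # placeholder for s_{t-1}
    current = image
    for _ in range(steps):
        loc, state = select_location(current, state)  # l_t = f_r(s_{t-1})
        block = crop(current, loc, size)              # block cut from I_{t-1}
        fixed = correct_block(block, loc)             # corrected block
        current = paste(current, fixed, loc)          # I_t
    return current                                    # I_T
```

Passing the two networks in as callables keeps the loop independent of any particular framework; in a real system they would be trained models rather than hand-written functions.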
The recurrent policy network is implemented as follows. The process is treated as a Markov decision process over discrete time steps. At each step, the policy network encodes the current state features and decides which part of the human eye image needs to be corrected. The blocks of the human eye image are corrected step by step, and the state features are updated, until the maximum number of steps is reached.
At the end of the correction sequence, a delayed global reward is taken to guide the training of the policy network. The policy network iteratively explores an optimal search path so that each individual eye image can achieve the maximum global reward, and the structural details of the network are shown in fig. 5.
The specific settings of the state, behavior and reward of the policy network are as follows:
State: The state $s_t$ is extracted from the input image of the current step and the history of past actions, and comprises three parts: (1) the feature map extracted from the human eye image $I_t$ of the current step, using the same convolutional network structure as in the local correction network, whose specific structure is described later; (2) the location $l_{t-1}$ of the block selected in the previous step; (3) the hidden unit $h_t$ of the LSTM layer, where the LSTM adopts a GRU network structure.
Action: In the policy network, the action is to select, from all possible locations, the location of the block to be corrected at this step. The network first encodes the feature map of the current image and the block location selected in the previous step through a fully connected layer, and combines the resulting vector with the historical hidden vector $h_{t-1}$ to generate a new hidden unit $h_t$. Finally, the policy network $\pi_\theta$ obtains the location $l_t$ of this step from $h_t$:
Figure BDA0001989331620000071
Reward: The reward guides the network in learning how to select a sequence of actions that achieves the best final output. The mean square error between the final output image and the real image is used as the reward for the network. The reward is delayed: it is generated only at the last step, and the errors of the intermediate steps are not counted in training. The reward $r_t$ at step $t$ is therefore as follows:
$$r_t = \begin{cases} -\lVert I_T - I_{gt} \rVert_2^2, & t = T \\ 0, & \text{otherwise} \end{cases}$$
where $I_{gt}$ denotes the real image. In the method of this embodiment, the discount factor $\gamma$ is set to 1, i.e., the correction at each step is equally important for the evaluation of the final result.
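The delayed reward and the effect of a discount factor of 1 can be illustrated with a short sketch. The function names are hypothetical, and negating the mean square error so that a smaller error gives a larger reward is an assumption; the text only states that the error between the final image and the real image serves as the reward and that the reward is maximized.

```python
def mse(a, b):
    """Mean squared error between two single-channel images (lists of rows)."""
    count = len(a) * len(a[0])
    total = sum((pa - pb) ** 2
                for row_a, row_b in zip(a, b)
                for pa, pb in zip(row_a, row_b))
    return total / count


def delayed_rewards(final_image, real_image, steps):
    """Reward is zero at every step except the last one.

    The sign (negative MSE) is an assumption made for this sketch.
    """
    return [0.0] * (steps - 1) + [-mse(final_image, real_image)]


def discounted_returns(rewards, gamma=1.0):
    """Return-to-go for each step: R_t = r_t + gamma * R_{t+1}."""
    returns = []
    running = 0.0
    for reward in reversed(rewards):
        running = reward + gamma * running
        returns.append(running)
    returns.reverse()
    return returns
```

With `gamma=1.0` every step receives the same return as the final step, which matches the statement that each correction step counts equally toward the final result.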
The local correction network is specifically as follows:
Using the location $l_t$ obtained from the recurrent policy network, the image $I_{t-1}$ is cropped to obtain the block $P_t$ to be corrected. The location $l_t$ is encoded and merged with $P_t$ as the input of the network. A deep convolutional network containing a series of convolutional layers produces a residual map $\Delta$, whose values are added directly to the block before correction; the result is taken as the corrected block and replaces the original block. The specific flow is shown in Fig. 6.
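The input construction and residual update described above can be sketched as follows. The patent does not say how the location is encoded, so the two constant planes used here are purely an assumed encoding, and the names `encode_location`, `build_input`, and `apply_residual` are hypothetical. The convolutional network that would produce the residual map is omitted.

```python
def encode_location(loc, image_size, block_size):
    """Encode (top, left) as two constant planes with values in [0, 1].

    This encoding is an assumption made for illustration only.
    """
    top, left = loc
    height, width = image_size
    row_value = top / max(height - 1, 1)
    col_value = left / max(width - 1, 1)
    row_plane = [[row_value] * block_size for _ in range(block_size)]
    col_plane = [[col_value] * block_size for _ in range(block_size)]
    return row_plane, col_plane


def build_input(block, loc, image_size):
    """Stack the square block with the location planes as network input channels."""
    row_plane, col_plane = encode_location(loc, image_size, len(block))
    return [block, row_plane, col_plane]


def apply_residual(block, residual):
    """Corrected block = block before correction + residual map."""
    return [
        [pixel + delta for pixel, delta in zip(block_row, residual_row)]
        for block_row, residual_row in zip(block, residual)
    ]
```

Predicting a residual rather than the block itself means that an all-zero network output leaves the block unchanged, which is a common reason for choosing this formulation.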
The optimization method comprises the following steps:
The recurrent policy network and the local correction network are trained jointly using a reinforcement learning framework. The overall formulation of the optimization problem is as follows:
Figure BDA0001989331620000081
First, the recurrent policy network is optimized using the following formula, where $\{\mu, \Sigma\} = \pi_\theta(s_t)$:
Figure BDA0001989331620000082
A normal (Gaussian) distribution is used as the probability distribution for action selection:
Figure BDA0001989331620000083
Second, the local correction network is optimized:
Figure BDA0001989331620000084
The local correction network updates and optimizes its parameters at every step. The mean square error is still used as the loss function for backpropagation. Optimizing the local correction network does not affect the parameters of the recurrent policy network.
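Because the optimization formulas are not reproduced in the text, the following sketch only illustrates the general idea of a Gaussian location policy trained with a policy gradient. It uses the standard REINFORCE surrogate loss, which is a substitution made for illustration and not necessarily the patent's formula. The diagonal covariance, the function names, and the use of a single final reward for all steps are assumptions.

```python
import math
import random


def sample_location(mu, sigma, rng):
    """Draw a 2-D location from a Gaussian with mean mu and per-axis std sigma."""
    return tuple(rng.gauss(m, s) for m, s in zip(mu, sigma))


def log_prob(loc, mu, sigma):
    """Log-density of loc under a diagonal Gaussian (assumed covariance form)."""
    return sum(
        -0.5 * ((x - m) / s) ** 2 - math.log(s) - 0.5 * math.log(2 * math.pi)
        for x, m, s in zip(loc, mu, sigma)
    )


def reinforce_loss(log_probs, final_reward):
    """REINFORCE surrogate: minimizing this raises the probability of
    action sequences that led to a high final reward."""
    return -final_reward * sum(log_probs)


# Example: three steps sampled from a fixed policy output, one delayed reward.
rng = random.Random(0)
mu, sigma = (8.0, 12.0), (2.0, 2.0)
locations = [sample_location(mu, sigma, rng) for _ in range(3)]
loss = reinforce_loss([log_prob(l, mu, sigma) for l in locations], final_reward=-0.25)
```

In a real training loop `mu` and `sigma` would come from the policy network at each step and the loss would be differentiated with respect to its parameters; here they are fixed constants so that the snippet stays self-contained.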
According to the human eye sight line correction method based on deep learning provided by the embodiment of the present invention, for an input human eye image, a coarse-stage generated image is obtained using a warping-based method, and defective regions in the coarse-stage output are then detected using a recurrent policy network based on deep reinforcement learning. The detected defective regions are refined by a local correction network that takes global visual features into account, which largely eliminates the visual defects and unrealistic appearance caused by factors such as illumination, texture, and occlusion.
Next, a system for correcting a line of sight of a human eye based on deep learning according to an embodiment of the present invention will be described with reference to the accompanying drawings.
Fig. 7 is a schematic structural diagram of a human eye sight line correction system based on deep learning according to an embodiment of the present invention.
As shown in Fig. 7, the human eye sight line correction system includes: an acquisition module 100, a processing module 200, and a correction module 300.
The acquisition module 100 is configured to acquire a human eye image. The processing module 200 is configured to process the human eye image through the coarse warping network to obtain a coarse-stage generated eye image. The correction module 300 is configured to detect a defective region in the generated eye image through the fine correction network and correct the defective region. The system effectively reduces the error between the generated image and the real image, eliminates visual defects and unrealistic appearance in the image, and can recover image details such as reflective bright spots.
Further, in an embodiment of the present invention, the system further includes a generating module, which is used for training a convolutional neural network with a coarse-to-fine structure to generate a two-dimensional compensation map, performing a pixel-level replacement operation on the original image using the compensation map to generate a training image, and training the coarse warping network using the mean square error between the generated training image and the original image as the loss function.
Further, in one embodiment of the present invention, the fine correction network comprises a recurrent policy network and a local correction network.
The correction module includes a detection unit and a correction unit.
The detection unit is configured to detect, by the recurrent policy network, a defective region in the generated eye image.
The correction unit is configured to correct, by the local correction network, the defective region through convolutional layers.
Further, in an embodiment of the present invention, the detection unit is specifically configured to:
given the image $I_{t-1}$ at step $t$, the loop policy network selects the coordinate position $l_t$ of a local block, so as to select the block
Figure BDA0001989331620000085
$l_t = f_r(s_{t-1})$
Figure BDA0001989331620000091
where $s_{t-1}$ is the encoded state feature of the loop policy network, jointly formed from the input image $I_{t-1}$, the encoded historical hidden state $h_{t-1}$, and the position $l_{t-1}$ of the block selected in the previous step; $g$ denotes a crop operation, whose result is the block of image $I_{t-1}$ cropped at the given position $l_t$.
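The selection step can be sketched as follows. The `crop` function plays the role of $g$, and `toy_policy` is a hand-written stand-in for $f_r$: it simply picks the block with the largest pixel sum, whereas the embodiment uses a learned policy network. The state is assumed to be a tuple of image, hidden state and previous location, and the block size and top-left-corner convention for $l_t$ are also assumptions.

```python
def crop(image, location, block_size):
    """Role of g: return the block whose top-left corner is `location`."""
    top, left = location
    return [row[left:left + block_size] for row in image[top:top + block_size]]

def toy_policy(state, block_size):
    """Hypothetical stand-in for f_r: choose the block with the largest sum.

    `state` is assumed to be (image, hidden_state, previous_location); this
    toy version ignores the hidden state and the previous location.
    """
    image, _hidden, _prev_location = state
    h, w = len(image), len(image[0])
    candidates = [(y, x)
                  for y in range(h - block_size + 1)
                  for x in range(w - block_size + 1)]
    return max(candidates,
               key=lambda loc: sum(map(sum, crop(image, loc, block_size))))

image = [[0, 0, 0],
         [0, 5, 1],
         [0, 1, 1]]
state = (image, None, None)
location = toy_policy(state, block_size=2)
block = crop(image, location, block_size=2)
```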
Further, in an embodiment of the present invention, the correction unit is specifically configured to:
correct the block to be corrected that is selected in each step,
Figure BDA0001989331620000092
by the local correction network $f_e$ to obtain a corrected block, and then directly replace the block before correction with the corrected block to give the corrected image $I_t$ of this step; after $T$ steps, the final image $I_T$ is obtained.
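The iterative replace-and-repeat procedure can be sketched as a short loop. The callables `select_location` and `correct_block` are hypothetical placeholders for the loop policy network and the local correction network $f_e$; the hidden state of the policy is omitted for brevity, and the top-left-corner convention for block positions is an assumption.

```python
def paste(image, block, location):
    """Return a copy of `image` with `block` written at `location` (top-left)."""
    top, left = location
    out = [row[:] for row in image]
    for dy, block_row in enumerate(block):
        out[top + dy][left:left + len(block_row)] = block_row
    return out

def refine(image, select_location, correct_block, block_size, steps):
    """Run `steps` rounds of: select a block, correct it, paste it back."""
    prev_location = None
    for _ in range(steps):
        top, left = location = select_location(image, prev_location)
        block = [row[left:left + block_size]
                 for row in image[top:top + block_size]]
        image = paste(image, correct_block(block), location)  # image I_t
        prev_location = location
    return image  # final image I_T

# Toy stand-ins: visit (0, 0) first and (1, 1) afterwards, halving each block.
def toy_select(image, prev_location):
    return (0, 0) if prev_location is None else (1, 1)

def toy_correct(block):
    return [[p / 2 for p in row] for row in block]

start = [[4.0] * 3 for _ in range(3)]
final = refine(start, toy_select, toy_correct, block_size=2, steps=2)
```

Because each step pastes the corrected block into the running image, a later block that overlaps an earlier one sees the already corrected pixels, as the value 1.0 at the centre of `final` shows.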
It should be noted that the foregoing explanation of the embodiment of the method for correcting the line of sight of the human eye based on deep learning is also applicable to the system of the embodiment, and is not repeated here.
According to the human eye sight line correction system based on deep learning provided by the embodiment of the invention, for an input human eye picture, a generated image in the coarse stage is obtained using a warping-based method, and defect areas in the image output by the coarse stage are detected using a loop policy network based on deep reinforcement learning. The detected defect areas are refined by a local correction network that takes global visual features into account, which largely eliminates the visual defects and unrealistic appearance caused by factors such as illumination, texture and occlusion.
Furthermore, the terms "first" and "second" are used for descriptive purposes only and are not to be construed as indicating or implying relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defined as "first" or "second" may explicitly or implicitly include at least one such feature. In the description of the present invention, "a plurality" means at least two, e.g., two, three, etc., unless specifically limited otherwise.
In the description herein, references to the terms "one embodiment," "some embodiments," "an example," "a specific example," or "some examples," etc., mean that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the invention. In this specification, the schematic representations of the terms used above are not necessarily intended to refer to the same embodiment or example. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples. Furthermore, various embodiments or examples and features of different embodiments or examples described in this specification can be joined and combined by one skilled in the art without contradiction.
Although embodiments of the present invention have been shown and described above, it is understood that the above embodiments are exemplary and should not be construed as limiting the present invention, and that variations, modifications, substitutions and alterations can be made to the above embodiments by those of ordinary skill in the art within the scope of the present invention.

Claims (8)

1. A human eye sight line correction method based on deep learning is characterized by comprising the following steps:
acquiring a human eye picture;
processing the human eye picture through a coarse distortion network to obtain a human eye generated image in a coarse stage;
detecting a defect area in the human eye generated image through a fine correction network, and correcting the defect area;
wherein the fine correction network comprises: a loop policy network and a local correction network;
the detecting the defect area in the human eye generated image through the fine correction network and correcting the defect area comprises the following steps:
the loop policy network detects a defect area in the human eye generated image;
the local correction network corrects the defect area through a convolution layer.
2. The method for correcting the sight line of the human eye based on deep learning of claim 1, wherein before the processing the human eye picture through the coarse distortion network to obtain the human eye generated image in the coarse stage, the method further comprises:
training a convolutional neural network with a coarse-to-fine structure to generate a two-dimensional compensation map;
the compensation map carries out a pixel-level substitution operation on the original image to generate a training image;
training the coarse distortion network using a mean square error between the generated training image and the original image as a loss function.
3. The method for correcting the sight line of the human eye based on deep learning as claimed in claim 1, wherein the loop policy network detecting a defect area in the human eye generated image comprises:
given the image $I_{t-1}$ at step $t$, the loop policy network selects the coordinate position $l_t$ of a local block, so as to select the block
Figure FDA0002754047570000011
$l_t = f_r(s_{t-1})$
Figure FDA0002754047570000012
where $s_{t-1}$ is the encoded state feature of the loop policy network, jointly formed from the input image $I_{t-1}$, the encoded historical hidden state $h_{t-1}$, and the position $l_{t-1}$ of the block selected in the previous step; $g$ denotes a crop operation, whose result is the block of image $I_{t-1}$ cropped at the given position $l_t$.
4. The deep learning based human eye sight line correction method according to claim 3,
wherein the local correction network correcting the defect area through a convolution layer comprises:
the block to be corrected that is selected in each step,
Figure FDA0002754047570000021
is corrected by said local correction network $f_e$ to obtain a corrected block, and the corrected block directly replaces the block before correction to give the corrected image $I_t$ of this step; after $T$ steps, the final image $I_T$ is obtained.
5. A system for correcting a line of sight of a human eye based on deep learning, comprising:
the acquisition module is used for acquiring a human eye picture;
the processing module is used for processing the human eye picture through a coarse distortion network to obtain a human eye generated image in a coarse stage;
the correction module is used for detecting a defect area in the human eye generated image through a fine correction network and correcting the defect area; wherein the fine correction network comprises: a loop policy network and a local correction network;
the correction module comprises: a detection unit and a correction unit;
the detection unit is used for detecting a defect area in the human eye generated image through the loop policy network;
the correction unit is used for correcting the defect area with the local correction network through a convolution layer.
6. The deep learning based human eye gaze correction system of claim 5, further comprising: a generation module,
the generation module is used for training a convolutional neural network with a coarse-to-fine structure to generate a two-dimensional compensation map, the compensation map performs a pixel-level substitution operation on an original image to generate a training image, and the coarse distortion network is trained by using the mean square error between the generated training image and the original image as a loss function.
7. The deep learning based human eye gaze correction system of claim 6,
the detection unit is specifically configured to:
given the image $I_{t-1}$ at step $t$, the loop policy network selects the coordinate position $l_t$ of a local block, so as to select the block
Figure FDA0002754047570000022
$l_t = f_r(s_{t-1})$
Figure FDA0002754047570000023
where $s_{t-1}$ is the encoded state feature of the loop policy network, jointly formed from the input image $I_{t-1}$, the encoded historical hidden state $h_{t-1}$, and the position $l_{t-1}$ of the block selected in the previous step; $g$ denotes a crop operation, whose result is the block of image $I_{t-1}$ cropped at the given position $l_t$.
8. The deep learning based human eye gaze correction system of claim 7,
the correction unit is specifically configured to:
correct the block to be corrected that is selected in each step,
Figure FDA0002754047570000031
with said local correction network $f_e$ to obtain a corrected block, and use the corrected block to directly replace the block before correction to give the corrected image $I_t$ of this step; after $T$ steps, the final image $I_T$ is obtained.
CN201910175164.0A 2019-03-08 2019-03-08 Human eye sight line correction method and system based on deep learning Active CN109978804B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910175164.0A CN109978804B (en) 2019-03-08 2019-03-08 Human eye sight line correction method and system based on deep learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910175164.0A CN109978804B (en) 2019-03-08 2019-03-08 Human eye sight line correction method and system based on deep learning

Publications (2)

Publication Number Publication Date
CN109978804A CN109978804A (en) 2019-07-05
CN109978804B true CN109978804B (en) 2021-02-26

Family

ID=67078291

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910175164.0A Active CN109978804B (en) 2019-03-08 2019-03-08 Human eye sight line correction method and system based on deep learning

Country Status (1)

Country Link
CN (1) CN109978804B (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111008929B (en) * 2019-12-19 2023-09-26 维沃移动通信(杭州)有限公司 Image correction method and electronic equipment
CN111339928B (en) * 2020-02-25 2022-06-28 苏州科达科技股份有限公司 Eye spirit adjusting method and device and storage medium
CN113343931B (en) * 2021-07-05 2024-07-26 Oppo广东移动通信有限公司 Training method for generating countermeasure network, image vision correction method and device

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107944415A (en) * 2017-12-06 2018-04-20 董伟 A kind of human eye notice detection method based on deep learning algorithm
CN108022213A (en) * 2017-11-29 2018-05-11 天津大学 Video super-resolution algorithm for reconstructing based on generation confrontation network
CN108492273A (en) * 2018-03-28 2018-09-04 深圳市唯特视科技有限公司 A kind of image generating method based on from attention model
CN108765340A (en) * 2018-05-29 2018-11-06 Oppo(重庆)智能科技有限公司 Fuzzy image processing method, apparatus and terminal device
CN108885784A (en) * 2016-04-22 2018-11-23 英特尔公司 It is corrected using the real-time eye contact of machine learning neural network based
CN109102532A (en) * 2017-06-20 2018-12-28 西门子保健有限责任公司 The metaplasia of deep learning for medical imaging

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8599238B2 (en) * 2009-10-16 2013-12-03 Apple Inc. Facial pose improvement with perspective distortion correction

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108885784A (en) * 2016-04-22 2018-11-23 英特尔公司 It is corrected using the real-time eye contact of machine learning neural network based
CN109102532A (en) * 2017-06-20 2018-12-28 西门子保健有限责任公司 The metaplasia of deep learning for medical imaging
CN108022213A (en) * 2017-11-29 2018-05-11 天津大学 Video super-resolution algorithm for reconstructing based on generation confrontation network
CN107944415A (en) * 2017-12-06 2018-04-20 董伟 A kind of human eye notice detection method based on deep learning algorithm
CN108492273A (en) * 2018-03-28 2018-09-04 深圳市唯特视科技有限公司 A kind of image generating method based on from attention model
CN108765340A (en) * 2018-05-29 2018-11-06 Oppo(重庆)智能科技有限公司 Fuzzy image processing method, apparatus and terminal device

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
《Eye Gaze Correction with Stereovision for Video-Teleconferencing》;Ruigang Yang等;《IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE》;20040731;第26卷(第7期);第956-960页 *
《基于Hough变换和梯度信息的人眼视线方向估计》;孙兴华等;《小型微型计算机系统》;20070630;第28卷(第6期);第1123-1128页 *
《虚拟视角自适应的视线校正方法》;尹苓琳等;《计算机辅助设计与图形学学报》;20131231;第25卷(第12期);第1834-1841页 *

Also Published As

Publication number Publication date
CN109978804A (en) 2019-07-05


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant