CN114926352A - Image reflection removing method, system, device and storage medium - Google Patents


Info

Publication number: CN114926352A (granted as CN114926352B)
Authority: CN (China)
Application number: CN202210387641.1A
Original language: Chinese (zh)
Prior art keywords: image, reflection, filter, layer image, background layer
Inventors: 韦志宇, 黄艳, 全宇晖, 何旭怡, 陈腾跃
Applicant and assignee: South China University of Technology (SCUT)
Legal status: Active (granted; published as CN114926352A, grant published as CN114926352B)

Classifications

    • G06T 5/77 — Retouching; inpainting; scratch removal (under G06T 5/00, image enhancement or restoration)
    • G06N 3/045 — Combinations of networks (neural network architectures)
    • G06N 3/084 — Backpropagation, e.g. using gradient descent (neural network learning methods)
    • G06T 2207/20024 — Filtering details
    • G06T 2207/20081 — Training; learning
    • G06T 2207/20084 — Artificial neural networks [ANN]


Abstract

The invention discloses an image reflection removal method, system, device, and storage medium. The method comprises the following steps: constructing a synthetic dataset of reflection images with corresponding background layer and reflection layer images; determining a base filter and constructing a variety of morphological filters from it; inputting a reflection image from the synthetic dataset into a deep reinforcement learning network, which outputs a value V and a policy π for the image; selecting a morphological filter from the action set A according to the policy π and de-reflecting the input reflection image to obtain the corresponding background layer and reflection layer images; training the deep reinforcement learning network on the resulting background layer and reflection layer images; and inputting a reflection image into the trained network and outputting the processing result. By learning the selection of morphological filters through deep reinforcement learning, the method makes the reflection removal process more interpretable and improves image reflection removal performance; it can be widely applied in the field of digital image restoration.

Description

Image reflection removing method, system, device and storage medium
Technical Field
The present invention relates to the field of digital image restoration, and in particular to an image reflection removal method, system, device, and storage medium.
Background
Digital images play an extremely important role in modern production and daily life. In signal processing and computer vision, reflections caused by illumination during capture can seriously degrade the color and contrast of images taken by outdoor imaging systems and greatly hinder subsequent image analysis and understanding. Reflection is a very common physical phenomenon that arises when an object is photographed through a specular surface such as glass. To improve image quality and the effectiveness and reliability of downstream image processing, reflection suppression or reflection removal techniques are needed to filter reflections out of the image.
Research on reflection separation falls roughly into two categories: the first separates specular reflections from multiple input images, and the second separates reflections from a single input image.
Schechner et al. and Kong et al. estimate the reflection layer from multiple images taken at different polarization angles, based on a physical reflection model. Sirinukulwattana et al. use the change of the reflection across multiple images taken from slightly different viewpoints to impose parallax constraints, smoothing specific regions of the reflection layer while keeping the transmission sharp. Other methods rely on video sequences to decorrelate motion between the transmission and reflection layers, exploiting motion differences to decompose the input into initial transmission and reflection layers; extracting the motion field from the initial layers, updating the transmission and reflection layers, and re-estimating the motion field is repeated until convergence. However, multi-image and video-based methods require so much data that acquisition difficulty and economic cost rise substantially.
Levin proposed a solution for accurately separating an image formed by superimposing two images and attempted to apply it to image reflections. Levin and Weiss et al. proposed a method that separates reflection images using a minimum prior on edges and corners, attempting to solve single-image reflection separation. Li and Brown automatically extract the two layers by optimizing an objective function that imposes a smooth gradient on the reflection layer and a sparse gradient on the transmission layer. This gradient prior stems from the observation that reflections are generally less in focus than the transmitted image, so the reflection layer has weaker gradients.
Applying the deep convolutional neural network (CNN), first proposed by LeCun et al., to the image reflection problem has also developed rapidly and achieved excellent results. Zhang et al. analyze the polarization characteristics of reflected and transmitted light using Fresnel's equations and combine polarization information with deep learning to effectively separate the reflected and transmitted components. Nevertheless, many current methods remain deficient: because a single image is affected by many factors during acquisition, such as lighting, the target scene, and noise, and because single-image reflection removal is ill-posed, it is still difficult to completely separate the reflection from a single image.
Disclosure of Invention
To solve at least some of the problems in the prior art, it is an object of the present invention to provide an image reflection removal method, system, device, and storage medium.
The technical scheme adopted by the invention is as follows:
an image reflection removal method, comprising the steps of:
synthesizing reflection images and constructing from them a synthetic dataset containing background layer images and reflection layer images;
determining a base filter and constructing a variety of morphological filters from it to serve as the action set A in reinforcement-learning filtering;
inputting a reflection image from the synthetic dataset into a deep reinforcement learning network and outputting a value V and a policy π for the image;
selecting a morphological filter from the action set A according to the policy π and de-reflecting the input reflection image to obtain the corresponding background layer image and reflection layer image;
training the deep reinforcement learning network on the obtained background layer and reflection layer images;
and inputting the reflection images of the test dataset into the trained deep reinforcement learning network and outputting the corresponding reflection layer and background layer images.
Further, the base filter comprises a Gaussian filter, a bilateral filter, or a median filter;
determining the base filter and constructing a variety of morphological filters from it to serve as the action set A in reinforcement-learning filtering comprises the steps of:
selecting a base filter;
dividing a square base filter through its center into a plurality of parts and designing 8 different regions, denoted A1 to A8;
rotating these regions around the filter's center point by different angles to obtain a series of regions, where the weights inside a region equal the corresponding weights of the base filter and the remaining weights are set to zero;
and constraining the weight sum of each newly constructed filter to 1, thereby obtaining an action set A composed of a series of different morphological filters.
Further, the deep reinforcement learning network comprises a policy network and a value network, and inputting the reflection image from the synthetic dataset into the deep reinforcement learning network and outputting its value V and policy π comprises:
inputting the reflection image into the deep reinforcement learning network for feature extraction to obtain a feature image, which is fed to both the policy network and the value network;
the policy network has four layers: the first three are convolution layers of 64 filters of size 3 × 3 × 64, each followed by a ReLU activation function; the fourth layer's filters are of size 3 × 3 × 64 with output dimension equal to the size of the action set, representing the policy π over the actions;
the value network has three layers: the first two are convolution layers of 64 filters of size 3 × 3 × 64, each followed by a ReLU activation function; the third layer's filters are of size 3 × 3 × 64 with output dimension 1, representing the current value V.
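The layer sizes described above can be sanity-checked with a short parameter-count sketch in plain Python. This is an illustration, not the patent's code; the action-set size of 9 (eight region filters plus a no-update action) and c = 3 input channels are assumptions.

```python
# Layer specs taken from the description: a shared 4-layer feature
# extractor (3x3 convs, 64 filters, ReLU), a policy head whose last
# layer outputs one channel per action, and a value head outputting 1.

def conv_params(k, c_in, c_out):
    # weights plus biases of one k x k convolution layer
    return k * k * c_in * c_out + c_out

def network_param_count(c=3, n_actions=9):
    # shared feature extractor: first layer c -> 64, then three 64 -> 64
    shared = conv_params(3, c, 64) + 3 * conv_params(3, 64, 64)
    # policy head: three 64 -> 64 layers, then 64 -> n_actions
    policy = 3 * conv_params(3, 64, 64) + conv_params(3, 64, n_actions)
    # value head: two 64 -> 64 layers, then 64 -> 1
    value = 2 * conv_params(3, 64, 64) + conv_params(3, 64, 1)
    return shared + policy + value
```

For a 3-channel input and 9 actions this gives roughly 0.3M parameters, i.e. a very light network compared with typical reflection removal CNNs.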
Further, selecting a morphological filter from the action set A according to the policy π and de-reflecting the input reflection image to obtain the corresponding background layer and reflection layer images comprises:
selecting from the action set A the filter corresponding to the optimal policy, given the obtained per-action policy π, and convolving the corresponding region with the selected filter to obtain the current background layer image T' and reflection layer image R'.
Further, training the deep reinforcement learning network on the obtained background layer and reflection layer images comprises:
computing gradients from the obtained background layer image $\hat{T}^{(t)}$ and reflection layer image $\hat{R}^{(t)}$, and training the parameters of the deep reinforcement learning network with the back-propagation algorithm;
the gradient of the policy network is calculated in the following manner:
Figure BDA0003595543790000033
in the formula (I), the compound is shown in the specification,
Figure BDA0003595543790000034
state of the background layer image representing the t-th iteration, a (t) The action of the t-th iteration is represented,
Figure BDA0003595543790000035
Figure BDA0003595543790000036
Figure BDA0003595543790000037
I target the original background layer image is represented and,
Figure BDA0003595543790000038
representing the value of the background layer image for the t + n iterations;
the method for calculating the gradient of the value network comprises the following steps:
Figure BDA0003595543790000039
in the formula (I), the compound is shown in the specification,
Figure BDA00035955437900000310
representing the state of the background layer image for the t-th iteration,
Figure BDA00035955437900000311
representing the value, R, of the background layer image of the t-th iteration (t) Representing the reward for the t-th iteration.
Further, during training, the loss function is calculated as follows:

$$\mathcal{L}^{(t)} = \left\|I_{target} - s^{(t)}\right\|^{2} - \left\|I_{target} - s^{(t-1)}\right\|^{2}$$

where $I_{target}$ denotes the original background layer image, $s^{(t)}$ the state of the background layer image at the t-th iteration, and $s^{(t-1)}$ the state of the background layer image at the (t−1)-th iteration.
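A minimal sketch of this computation, assuming the per-step reward is simply the decrease in mean squared error against the clean image between consecutive iterations (the negative of the loss above):

```python
import numpy as np

def mse(a, b):
    # mean squared error between two images
    return float(np.mean((a - b) ** 2))

def step_reward(i_target, t_prev, t_cur):
    # Positive when the current background estimate is closer to the
    # clean image i_target than the previous estimate was.
    return mse(i_target, t_prev) - mse(i_target, t_cur)
```

For example, moving the estimate from all-ones to all-halves against an all-zeros target yields a reward of 1.0 − 0.25 = 0.75.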
Further, synthesizing reflection images and constructing from them a synthetic dataset containing background layer images and reflection layer images comprises:
randomly selecting two images from a preset real-image dataset to serve as the reflection layer and the background layer respectively, thereby synthesizing a dataset of reflection images with corresponding background layer and reflection layer images;
and, within the synthetic dataset, randomly flipping along the horizontal or vertical axis, then rotating by a random angle, and randomly cropping a reflection-image patch of size r × r.
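The augmentation step above can be sketched as follows. This is a sketch, not the patent's exact pipeline: the random rotation is restricted to multiples of 90° here to avoid interpolation, whereas the patent allows an arbitrary angle.

```python
import numpy as np

def augment(img, r=70, rng=None):
    # Random horizontal/vertical flips, a random right-angle rotation,
    # then a random r x r crop (r = 70 in the described embodiment).
    rng = rng if rng is not None else np.random.default_rng()
    if rng.random() < 0.5:
        img = img[:, ::-1]              # horizontal flip
    if rng.random() < 0.5:
        img = img[::-1, :]              # vertical flip
    img = np.rot90(img, k=int(rng.integers(0, 4)))
    H, W = img.shape[:2]
    y = int(rng.integers(0, H - r + 1))
    x = int(rng.integers(0, W - r + 1))
    return img[y:y + r, x:x + r]
```

The crop is taken after the rotation so that the output patch is always exactly r × r regardless of the input aspect ratio.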
The other technical scheme adopted by the invention is as follows:
an image reflection removal system, comprising:
a data acquisition module for synthesizing reflection images and constructing from them a synthetic dataset containing background layer images and reflection layer images;
an action acquisition module for determining a base filter and constructing a variety of morphological filters from it to serve as the action set A in reinforcement-learning filtering;
an image input module for inputting a reflection image from the synthetic dataset into a deep reinforcement learning network and outputting a value V and a policy π for the image;
a reflection removal module for selecting a morphological filter from the action set A according to the policy π and de-reflecting the input reflection image to obtain the corresponding background layer and reflection layer images;
a model training module for training the deep reinforcement learning network on the obtained background layer and reflection layer images;
and a model testing module for inputting the reflection images of the test dataset into the trained deep reinforcement learning network and outputting the corresponding reflection layer and background layer images.
The other technical scheme adopted by the invention is as follows:
an image reflection removal device, comprising:
at least one processor;
at least one memory for storing at least one program;
the at least one program, when executed by the at least one processor, causing the at least one processor to implement the method described above.
The invention adopts another technical scheme that:
a computer-readable storage medium storing a processor-executable program which, when executed by a processor, performs the method described above.
The beneficial effects of the invention are: by learning the selection of morphological filters through deep reinforcement learning, the method makes the reflection removal process more interpretable, improves the iterative filtering policy, and improves image reflection removal performance.
Drawings
To illustrate the embodiments of the present invention or the related prior-art solutions more clearly, the drawings used in describing them are introduced below. It should be understood that the drawings described below cover only some embodiments of the technical solutions of the invention; those skilled in the art can derive other drawings from them without creative effort.
FIG. 1 is a flowchart illustrating steps of an image dereflection method based on morphological filtering reinforcement learning according to an embodiment of the present invention;
FIG. 2 is a diagram illustrating a deep reinforcement learning network according to an embodiment of the present invention;
FIG. 3 is a representation of the 8 action-set regions in an embodiment of the present invention; in each region, the weights of the shaded part are non-zero and take the values of the base filter, the weights of the white part are zero, and the weights of each resulting filter are constrained to sum to 1.
Detailed Description
Reference will now be made in detail to embodiments of the present invention, examples of which are illustrated in the accompanying drawings, wherein like or similar reference numerals refer to the same or similar elements or elements having the same or similar function throughout. The embodiments described below with reference to the accompanying drawings are illustrative only for the purpose of explaining the present invention, and are not to be construed as limiting the present invention. The step numbers in the following embodiments are provided only for convenience of illustration, the order between the steps is not limited at all, and the execution order of each step in the embodiments can be adapted according to the understanding of those skilled in the art.
In the description of the present invention, it should be understood that the orientation or positional relationship referred to in the description of the orientation, such as upper, lower, front, rear, left, right, etc., is based on the orientation or positional relationship shown in the drawings only for the convenience of description of the present invention and simplification of the description, and does not indicate or imply that the device or element referred to must have a specific orientation, be constructed and operated in a specific orientation, and thus, should not be construed as limiting the present invention.
In the description of the present invention, "several" means one or more and "a plurality of" means two or more; "greater than", "less than", "exceeding", and the like are understood to exclude the stated number, while "above", "below", "within", and the like are understood to include it. Where "first" and "second" are used to distinguish technical features, they are not to be understood as indicating or implying relative importance, the number of such features, or their precedence.
In the description of the present invention, unless otherwise explicitly limited, terms such as arrangement, installation, connection and the like should be understood in a broad sense, and those skilled in the art can reasonably determine the specific meanings of the above terms in the present invention in combination with the specific contents of the technical solutions.
As shown in FIG. 1, the present embodiment provides an image reflection removal method based on morphological-filter reinforcement learning. To address the overly simple structure of conventional filters, it offers a more flexible filter construction, referred to as the morphological filter; to address the overly simple selection of filters, it offers a deep-reinforcement-learning method that iteratively selects an appropriate morphological filter, ultimately achieving reflection removal. The method specifically comprises the following steps:
and S1, synthesizing the reflection image, and constructing a synthesized data set containing the background layer image and the reflection layer image according to the reflection image.
Reflection images are first synthesized to establish a synthetic dataset of corresponding background layer and reflection layer images, on which the deep reinforcement learning network is trained. To synthesize the reflection-image dataset, two images are randomly selected from an existing real-image dataset to serve as the reflection layer and the background layer respectively, giving a set of reflection images with corresponding background and reflection layer images; within the synthetic dataset, images are randomly flipped along the horizontal or vertical axis, then rotated by a random angle, and a reflection-image patch of size r × r is randomly cropped.
Specifically, the image datasets are the ImageNet dataset and the BSD500 dataset; two images are randomly selected from them as the reflection layer and the background layer, yielding 1000 synthetic reflection images with corresponding background and reflection layer images. Random flips along the horizontal or vertical axis, a random-angle rotation, and a random r × r crop are applied; in this embodiment, r = 70.
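A common linear superposition model can illustrate how a reflection image is formed from the two selected layers. The patent does not specify the exact blending, so the mixing weight `alpha` and the box blur standing in for the defocus that makes reflections smoother than the background are assumptions.

```python
import numpy as np

def box_blur(img, k=5):
    # simple k x k box blur with edge padding, standing in for the
    # defocus typically applied to the reflection layer
    pad = k // 2
    p = np.pad(img, pad, mode='edge')
    H, W = img.shape
    out = np.zeros((H, W), dtype=float)
    for i in range(H):
        for j in range(W):
            out[i, j] = p[i:i + k, j:j + k].mean()
    return out

def synthesize(background, reflection, alpha=0.8):
    # linear superposition model I = alpha*T + (1-alpha)*blur(R);
    # alpha is an illustrative assumption, not a value from the patent
    return alpha * background + (1 - alpha) * box_blur(reflection)
```

The background layer T and reflection layer R used as inputs here are the two randomly selected real images described above.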
s2, determining a basic filter, and constructing various morphological filters according to the basic filter as an action set A in the reinforced filtering.
A base filter, such as a Gaussian filter, a bilateral filter, or a median filter, is determined, and a variety of morphological filters are constructed on this basis to serve as the action set A in reinforcement-learning filtering. Step S2 specifically comprises steps S21 to S23:
s21, selecting a proper basic filter, wherein the proper basic filter comprises a Gaussian filter:
$$G(x, y) = \frac{1}{2\pi\sigma^{2}} \exp\left(-\frac{x^{2}+y^{2}}{2\sigma^{2}}\right)$$

among others;
s22, divide the over-center of a square basic filter into multiple parts, and design 8 different areas, which are denoted as a1 to a8, respectively, as shown in fig. 3. And rotating the regions around the central point of the filter by different angles to respectively obtain a series of regions, wherein the weight values in the regions are equal to the weight values of the corresponding basic filter, and the weight values of the rest parts are set to be zero.
And S23, restraining the sum of the weights of each newly constructed filter to be 1, thereby obtaining an action set formed by combining a series of different actions (form filters). For the sake of simplicity, denote by A {x,y} Is represented by area A x And A y The set of different actions obtained by rotation can be more or less according to the needsFewer action sets. In addition to the above-described operations, there is a special operation (morphological filter): the pixel point is not updated (the central value is 1, and the rest is 0), which occurs when the network makes a judgment that the pixel does not need to be further updated, and therefore, the judgment of no-update operation is adopted.
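Steps S21–S23 can be sketched as follows. This is a sketch under assumptions: a 5 × 5 Gaussian base filter and eight 45° wedge regions standing in for the regions A1–A8, which FIG. 3 may define differently.

```python
import numpy as np

def gaussian_kernel(size=5, sigma=1.0):
    # square Gaussian base filter, normalized to sum 1
    ax = np.arange(size) - size // 2
    xx, yy = np.meshgrid(ax, ax)
    k = np.exp(-(xx**2 + yy**2) / (2 * sigma**2))
    return k / k.sum()

def wedge_masks(size=5, n_regions=8):
    # eight wedge-shaped regions through the kernel center,
    # obtained by rotating one 45-degree sector
    ax = np.arange(size) - size // 2
    xx, yy = np.meshgrid(ax, ax)
    ang = np.mod(np.arctan2(yy, xx), 2 * np.pi)
    masks = []
    for i in range(n_regions):
        lo, hi = i * np.pi / 4, (i + 1) * np.pi / 4
        m = ((ang >= lo) & (ang < hi)).astype(float)
        m[size // 2, size // 2] = 1.0   # center pixel always included
        masks.append(m)
    return masks

def build_action_set(size=5, sigma=1.0):
    base = gaussian_kernel(size, sigma)
    actions = []
    for m in wedge_masks(size):
        k = base * m                 # base weights inside the region, zero elsewhere
        actions.append(k / k.sum())  # constrain the weight sum to 1
    # special no-update action: identity filter (center 1, rest 0)
    ident = np.zeros((size, size))
    ident[size // 2, size // 2] = 1.0
    actions.append(ident)
    return actions
```

Each returned kernel is a valid morphological filter in the patent's sense: it carries the base filter's weights inside one region, zeros elsewhere, and sums to 1.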
And S3, inputting the reflection image in the synthetic data set into a depth reinforcement learning network, and outputting the value V and the strategy pi of the reflection image.
The deep reinforcement learning network structure is constructed as shown in FIG. 2. Its modules comprise convolution layers (Conv), ReLU activation functions, and the like; the input is an image (patch) containing reflections, and after a series of linear and nonlinear operations in the network, such as convolution and activation, the value V and policy π of the image (patch) are output. The specific structure is as follows:
first, a W × H × c image is input (c = 1 for a grayscale image, c = 3 for a color image) and features are extracted using convolution layers of 64 filters of size 3 × 3 × c with ReLU activation functions, four layers in total; the features are fed to both the policy network and the value network, i.e., the two sub-networks share the same input;
the policy network has three convolution layers of 64 filters of size 3 × 3 × 64 with ReLU activation functions; the fourth layer's filters are of size 3 × 3 × 64, but the output dimension equals the size of the action set, representing the policy π over the actions;
the value network has two convolution layers of 64 filters of size 3 × 3 × 64 with ReLU activation functions; the third layer's filters are of size 3 × 3 × 64, and the output dimension is 1, representing the current value V.
And S4, selecting a morphological filter from the action set A according to the strategy pi, and performing reflection removal on the input reflection image to obtain a corresponding background layer image and a corresponding reflection layer image.
According to the policy π obtained in step S3, the optimal morphological filter is selected from the action set A obtained in step S2, and the corresponding region is convolved with the selected morphological filter to obtain the corresponding current background layer image T' and reflection layer image R'.
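A per-pixel version of this selection-and-filtering step might look like the following sketch. The per-pixel argmax over the policy map and the edge padding are illustrative assumptions; the operation is written as a plain cross-correlation, which matches a convolution for symmetric kernels.

```python
import numpy as np

def apply_policy(image, policy_logits, action_filters):
    # policy_logits: (n_actions, H, W). For each pixel, pick the
    # argmax action and replace the pixel with the response of that
    # action's filter at that location.
    H, W = image.shape
    a_map = policy_logits.argmax(axis=0)
    pad = action_filters[0].shape[0] // 2
    padded = np.pad(image, pad, mode='edge')
    out = np.zeros((H, W), dtype=float)
    for a, k in enumerate(action_filters):
        resp = np.zeros((H, W))
        for i in range(H):
            for j in range(W):
                resp[i, j] = (padded[i:i + 2*pad + 1, j:j + 2*pad + 1] * k).sum()
        out[a_map == a] = resp[a_map == a]
    return out
```

With the no-update (identity) filter selected everywhere, the output reproduces the input exactly, which is the behavior the special action is meant to provide.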
And S5, training the depth reinforcement learning network according to the obtained background layer image and the reflection layer image.
Using the background layer image $\hat{T}^{(t)}$ and reflection layer image $\hat{R}^{(t)}$ obtained at the t-th iteration, the reinforcement filtering network is trained with the error back-propagation algorithm; the background layer image is then fed back to step S2 as the network input until the number of iterations reaches $t_{max}$. The specific steps are as follows:
s51, calculating the gradient of the strategy network as follows:
$$\mathrm{d}\theta_p = -\nabla_{\theta_p} \log \pi\left(a^{(t)} \mid s^{(t)}\right)\left(R^{(t)} - V\left(s^{(t)}\right)\right), \qquad R^{(t)} = \sum_{k=0}^{n-1} \gamma^{k} r^{(t+k)} + \gamma^{n} V\left(s^{(t+n)}\right)$$

where $s^{(t)}$ denotes the state of the background layer image at the t-th iteration, $a^{(t)}$ the action of the t-th iteration, $r^{(t)}$ the reward, and $\gamma$ the discount factor.
s52, calculating the gradient of the value network as follows:
$$\mathrm{d}\theta_v = \nabla_{\theta_v}\left(R^{(t)} - V\left(s^{(t)}\right)\right)^{2}$$

where $s^{(t)}$ denotes the state of the background layer image at the t-th iteration and $V(s^{(t)})$ the value of the background layer image at the t-th iteration.
S53, using the background layer image obtained in step S4
$\hat{T}^{(t)}$ and the reflection layer image $\hat{R}^{(t)}$, the difference between the mean squared error of the current background layer image against the clean image and that of the previous background layer image against the clean image is taken as the loss function:

$$\mathcal{L}^{(t)} = \left\|I_{target} - s^{(t)}\right\|^{2} - \left\|I_{target} - s^{(t-1)}\right\|^{2}$$

Training then proceeds with the SGD back-propagation algorithm using the gradients computed in steps S51 and S52, training the parameters of each network layer.
If the number of iterations has not reached the maximum $t_{max}$, the process returns to step S2 with the current background layer image $\hat{T}^{(t)}$ as the input to step S2; once the maximum number of iterations $t_{max}$ has been reached, the process proceeds to step S6. During de-reflection, the most appropriate morphological filter is selected at each iteration, yielding a series of background layer images from which the reflection is gradually removed.
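The iterate-until-$t_{max}$ loop described above can be sketched generically; the `choose_action` and `apply_action` callables are placeholders for the policy network and the morphological filtering step, not the patent's implementation.

```python
import numpy as np

def iterate_dereflection(image, choose_action, apply_action, t_max=5):
    # Iteratively refine the background estimate: at each step the
    # agent picks an action (a morphological filter) for the current
    # state and applies it, feeding the result back as the next state.
    state = image.astype(float)
    history = [state]
    for _ in range(t_max):
        a = choose_action(state)
        state = apply_action(state, a)
        history.append(state)
    return state, history
```

The returned history is the series of progressively de-reflected background layer images mentioned above.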
And S6, inputting the reflection images in the test data set into a trained deep reinforcement learning network, and outputting corresponding reflection layer images and background layer images.
In this embodiment, the test datasets are real-world images with known background layer and reflection images (the Real Images and SIR2 Postcard datasets), together with reflection images newly synthesized from the ImageNet and BSD500 datasets; the reflection images are input into the trained reinforcement filtering network model, and the corresponding reflection layer and background layer images are output.
In summary, compared with the prior art, the method of this embodiment has the following advantages and effects:
(1) The invention targets a synthetic reflection dataset distinct from real-world reflection images, since obtaining valid real-world training data is time- and resource-consuming.
(2) The invention constructs flexible morphological filters and improves reflection removal performance by enlarging the reinforcement learning action set.
(3) The invention provides a better morphological filter selection policy: the selection is obtained through a deep reinforcement learning model, making the morphological filter more interpretable during reflection removal.
(4) The method produces a good restoration effect on most reflection images of natural scenes.
The present embodiment also provides an image antireflection system, including:
the data acquisition module is used for synthesizing the reflection image and constructing a synthetic data set containing the background layer image and the reflection layer image according to the reflection image;
the action acquisition module is used for determining a basic filter, and constructing various morphological filters according to the basic filter to be used as an action set A in the enhanced filtering;
the image input module is used for inputting the reflection image in the synthetic data set into a depth reinforcement learning network and outputting a value V and a strategy pi of the reflection image;
the reflection removing module is used for selecting a morphological filter from the action set A according to the strategy pi and removing reflection of the input reflection image to obtain a corresponding background layer image and a corresponding reflection layer image;
the model training module is used for training the depth reinforcement learning network according to the obtained background layer image and the reflection layer image;
and the model testing module is used for inputting the reflection images in the test data set into the trained deep reinforcement learning network and outputting the corresponding reflection layer images and background layer images.
The image reflection removing system of the embodiment can execute the image reflection removing method provided by the embodiment of the method of the invention, can execute any combination of the implementation steps of the embodiment of the method, and has corresponding functions and beneficial effects of the method.
The present embodiment also provides an image antireflection device, including:
at least one processor;
at least one memory for storing at least one program;
the at least one program, when executed by the at least one processor, causes the at least one processor to implement the method of fig. 1.
The image antireflection device of the embodiment can execute the image antireflection method provided by the method embodiment of the invention, can execute any combination of the implementation steps of the method embodiment, and has corresponding functions and beneficial effects of the method.
The embodiment of the application also discloses a computer program product or a computer program, which comprises computer instructions, and the computer instructions are stored in a computer readable storage medium. The computer instructions may be read by a processor of a computer device from a computer-readable storage medium, and executed by the processor, causing the computer device to perform the method illustrated in fig. 1.
The present embodiment further provides a storage medium, which stores an instruction or a program for executing the image reflection removing method provided in the embodiment of the method of the present invention, and when the instruction or the program is executed, the method may perform any combination of the implementation steps, and has corresponding functions and advantages of the method.
In alternative embodiments, the functions/acts noted in the block diagrams may occur out of the order noted in the operational illustrations. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality/acts involved. Furthermore, the embodiments presented and described in the flow charts of the present invention are provided by way of example in order to provide a more thorough understanding of the technology. The disclosed methods are not limited to the operations and logic flows presented herein. Alternative embodiments are contemplated in which the order of various operations is changed and in which sub-operations described as part of larger operations are performed independently.
Furthermore, although the present invention is described in the context of functional modules, it should be understood that, unless otherwise stated to the contrary, one or more of the described functions and/or features may be integrated in a single physical device and/or software module, or one or more functions and/or features may be implemented in a separate physical device or software module. It will also be appreciated that a detailed discussion of the actual implementation of each module is not necessary for an understanding of the present invention. Rather, the actual implementation of the various functional modules in the apparatus disclosed herein will be understood within the ordinary skill of an engineer given the nature, function, and interrelationships of the modules. Accordingly, those skilled in the art can, using ordinary skill, practice the invention as set forth in the claims without undue experimentation. It is also to be understood that the specific concepts disclosed are merely illustrative of and not intended to limit the scope of the invention, which is to be determined from the appended claims along with their full scope of equivalents.
The functions may be stored in a computer-readable storage medium if they are implemented in the form of software functional units and sold or used as independent products. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present invention. The aforementioned storage medium includes: a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, an optical disk, and other media capable of storing program code.
The logic and/or steps represented in the flowcharts or otherwise described herein, e.g., an ordered listing of executable instructions that can be considered to implement logical functions, can be embodied in any computer-readable medium for use by or in connection with an instruction execution system, apparatus, or device, such as a computer-based system, processor-containing system, or other system that can fetch the instructions from the instruction execution system, apparatus, or device and execute the instructions. For the purposes of this description, a "computer-readable medium" can be any means that can contain, store, communicate, propagate, or transport the program for use by or in connection with the instruction execution system, apparatus, or device.
More specific examples (a non-exhaustive list) of the computer-readable medium would include the following: an electrical connection (electronic device) having one or more wires, a portable computer diskette (magnetic device), a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber device, and a portable compact disc read-only memory (CDROM). Additionally, the computer-readable medium could even be paper or another suitable medium upon which the program is printed, as the program can be electronically captured, via for instance optical scanning of the paper or other medium, then compiled, interpreted or otherwise processed in a suitable manner if necessary, and then stored in a computer memory.
It should be understood that portions of the present invention may be implemented in hardware, software, firmware, or a combination thereof. In the above embodiments, the various steps or methods may be implemented in software or firmware stored in memory and executed by a suitable instruction execution system. For example, if implemented in hardware, as in another embodiment, any one or combination of the following techniques, which are known in the art, may be used: a discrete logic circuit having a logic gate circuit for implementing a logic function on a data signal, an application specific integrated circuit having an appropriate combinational logic gate circuit, a Programmable Gate Array (PGA), a Field Programmable Gate Array (FPGA), or the like.
In the foregoing description of the specification, reference to the description of "one embodiment/example," "another embodiment/example," or "certain embodiments/examples," etc., means that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the invention. In this specification, schematic representations of the above terms do not necessarily refer to the same embodiment or example. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples.
While embodiments of the present invention have been shown and described, it will be understood by those of ordinary skill in the art that: various changes, modifications, substitutions and alterations can be made to the embodiments without departing from the principles and spirit of the invention, the scope of which is defined by the claims and their equivalents.
While the preferred embodiments of the present invention have been illustrated and described, it will be understood by those skilled in the art that various changes in form and details may be made therein without departing from the spirit and scope of the invention as defined by the appended claims.

Claims (10)

1. An image antireflection method, comprising the steps of:
synthesizing the reflection image, and constructing a synthetic data set containing a background layer image and a reflection layer image according to the reflection image;
determining a basic filter, and constructing various morphological filters according to the basic filter to be used as an action set A in the enhanced filtering;
inputting the reflection image in the synthetic data set into a depth reinforcement learning network, and outputting a value V and a strategy pi of the reflection image;
selecting a morphological filter from the action set A according to the strategy pi, and performing reflection removal on the input reflection image to obtain a corresponding background layer image and a corresponding reflection layer image;
training a depth reinforcement learning network according to the obtained background layer image and the reflection layer image;
and inputting the reflection images in the test data set into a trained deep reinforcement learning network, and outputting corresponding reflection layer images and background layer images.
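One common way to realize the synthesis step of claim 1 is an alpha blend of a background layer with a blurred reflection layer. The claim does not fix the mixing model; the blend weight and box blur below are assumptions for illustration:

```python
import numpy as np

def box_blur(img, k=3):
    """Naive k x k mean filter with edge padding (stands in for the
    defocus blur a real reflection layer typically exhibits)."""
    p = k // 2
    padded = np.pad(img, p, mode='edge')
    out = np.zeros_like(img, dtype=float)
    for i in range(img.shape[0]):
        for j in range(img.shape[1]):
            out[i, j] = padded[i:i + k, j:j + k].mean()
    return out

def synthesize_reflection(background, reflection, alpha=0.8, k=3):
    """I = alpha * T + (1 - alpha) * blur(R): blend a background layer T
    with a blurred reflection layer R into a synthetic reflection image."""
    return np.clip(alpha * background + (1 - alpha) * box_blur(reflection, k), 0.0, 1.0)
```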
2. The method of claim 1, wherein the base filter comprises a gaussian filter, a bilateral filter, a median filter;
the determining a base filter, constructing a plurality of morphological filters according to the base filter, and using the morphological filters as an action set A in the enhanced filtering, includes:
selecting a base filter;
dividing the square basic filter about its center into a plurality of equal parts, and designing 8 different regions, denoted A1 to A8 respectively;
rotating the regions around the center point of the filter by different angles to obtain a series of regions; within each region the weights equal the corresponding weights of the basic filter, and the remaining weights are set to zero;
and constraining the weight sum of each newly constructed filter to be 1 so as to obtain an action set A formed by combining a series of different morphological filters.
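The construction in claim 2 can be sketched as follows. The exact shapes of regions A1 to A8 are not specified in this claim; the angular sectors around the kernel center used here are one plausible reading:

```python
import numpy as np

def gaussian_kernel(size=5, sigma=1.0):
    """A square Gaussian basic filter, normalized to sum 1."""
    ax = np.arange(size) - size // 2
    xx, yy = np.meshgrid(ax, ax)
    k = np.exp(-(xx ** 2 + yy ** 2) / (2 * sigma ** 2))
    return k / k.sum()

def sector_masks(size=5, n_regions=8):
    """Split the square kernel into n_regions angular sectors around its
    center (an assumed realization of regions A1..A8)."""
    ax = np.arange(size) - size // 2
    xx, yy = np.meshgrid(ax, ax)
    angle = np.arctan2(yy, xx) % (2 * np.pi)
    sector = (angle / (2 * np.pi / n_regions)).astype(int) % n_regions
    return [sector == i for i in range(n_regions)]

def build_action_set(base, masks):
    """For each region, keep the base filter's weights inside the region,
    zero the rest, and renormalize so every new filter sums to 1."""
    actions = []
    for m in masks:
        f = np.where(m, base, 0.0)
        if f.sum() > 0:
            actions.append(f / f.sum())
    return actions
```

Each resulting morphological filter smooths only along its own direction, which is what lets the policy pick a direction-appropriate action per step.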
3. The image dereflection method as recited in claim 1, wherein the deep reinforcement learning network comprises a policy network and a value network, and the inputting the reflection image in the synthesized data set into the deep reinforcement learning network and outputting the value V and the policy pi of the reflection image comprises:
inputting the reflection image into a depth reinforcement learning network for feature extraction to obtain a feature image, and inputting the feature image into a strategy network and a value network;
the policy network has four layers of filters, wherein the first three layers are convolutional layers of 64 filters of size 3 × 3 × 64, each followed by a ReLU activation; the filters of the fourth layer are of size 3 × 3 × 64, and the output dimension equals the size of the action set, representing the policy pi over the actions;
the value network has three layers of filters, wherein the first two layers are convolutional layers of 64 filters of size 3 × 3 × 64, each followed by a ReLU activation; the filters of the third layer are of size 3 × 3 × 64, and the output dimension is 1, representing the current value V.
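Reading the claim's filter sizes as 3 × 3 kernels over 64 channels, the parameter counts of the two heads can be checked with a few lines; the action-set size |A| = 8 below is an assumed example:

```python
def conv_params(k_h, k_w, c_in, c_out, bias=True):
    """Number of learnable parameters in one convolutional layer."""
    return k_h * k_w * c_in * c_out + (c_out if bias else 0)

def head_param_counts(n_actions):
    """Policy head: three 3x3x64 -> 64 conv+ReLU layers plus a
    3x3x64 -> |A| output layer. Value head: two 3x3x64 -> 64 conv+ReLU
    layers plus a 3x3x64 -> 1 output layer."""
    policy = 3 * conv_params(3, 3, 64, 64) + conv_params(3, 3, 64, n_actions)
    value = 2 * conv_params(3, 3, 64, 64) + conv_params(3, 3, 64, 1)
    return policy, value
```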
4. The method according to claim 1, wherein the selecting a morphological filter from the action set A according to the policy pi to dereflect the input reflection image to obtain a corresponding background layer image and reflection layer image comprises:
and selecting a filter corresponding to the optimal strategy from the action set A according to the obtained strategy pi about each action, and performing convolution operation on the corresponding region by adopting the selected filter to obtain a current background layer image T 'and a current reflection layer image R'.
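A minimal sketch of this selection-and-filtering step follows. For simplicity it picks one filter for the whole image from an image-level policy; a per-pixel policy, as in pixel-wise reinforcement learning, would instead select an action at every pixel:

```python
import numpy as np

def convolve2d(img, kernel):
    """Naive 'same' cross-correlation with zero padding (what deep-learning
    'conv' layers compute) — sufficient for a sketch."""
    kh, kw = kernel.shape
    ph, pw = kh // 2, kw // 2
    padded = np.pad(img, ((ph, ph), (pw, pw)))
    out = np.zeros_like(img, dtype=float)
    for i in range(img.shape[0]):
        for j in range(img.shape[1]):
            out[i, j] = np.sum(padded[i:i + kh, j:j + kw] * kernel)
    return out

def dereflect_step(image, policy, action_set):
    """Select the filter with the highest policy score and apply it:
    T' is the filtered background estimate and R' = I - T' the residual
    reflection layer."""
    a = int(np.argmax(policy))
    background = convolve2d(image, action_set[a])
    reflection = image - background
    return background, reflection
```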
5. The image dereflection method according to claim 3, wherein the training of the depth-enhanced learning network according to the obtained background layer image and reflection layer image comprises:
performing gradient calculation according to the obtained background layer image and the reflection layer image, and training parameters of the depth reinforcement learning network by adopting a back propagation algorithm;
the gradient of the policy network is calculated as:
dθ_p = −∇_θp log π(a^(t) | s^(t)) · A^(t)
in the formula, s^(t) represents the state of the background layer image at the t-th iteration, a^(t) represents the action of the t-th iteration, and the advantage is
A^(t) = Σ_{i=0}^{n−1} γ^i r^(t+i) + γ^n V(s^(t+n)) − V(s^(t))
where γ is the discount factor, r^(t) is the reward of the t-th iteration, measured as the decrease in squared distance of the background layer image to the original background layer image I_target, and V(s^(t+n)) represents the value of the background layer image at the (t+n)-th iteration;
the gradient of the value network is calculated as:
dθ_v = ∇_θv (R^(t) − V(s^(t)))²
in the formula, s^(t) represents the state of the background layer image at the t-th iteration, V(s^(t)) represents the value of the background layer image at the t-th iteration, and R^(t) represents the reward of the t-th iteration.
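The quantities named in claim 5 (an n-step reward accumulated from a bootstrap value, and the resulting advantage) follow standard A3C-style bookkeeping; a minimal sketch, with the discount factor γ and the n-step form as assumptions:

```python
def n_step_return(rewards, bootstrap_value, gamma=0.95):
    """R = r_0 + gamma*r_1 + ... + gamma^(n-1)*r_{n-1} + gamma^n * V(s_{t+n}),
    accumulated backwards starting from the bootstrap value V(s_{t+n})."""
    R = bootstrap_value
    for r in reversed(rewards):
        R = r + gamma * R
    return R

def advantage(rewards, v_future, v_now, gamma=0.95):
    """A^(t) = R^(t) - V(s^(t)), the weight applied to the policy gradient."""
    return n_step_return(rewards, v_future, gamma) - v_now
```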
6. An image dereflection method as claimed in claim 5, characterized in that in the training process, the loss function is calculated as follows:
loss = ‖I_target − s^(t−1)‖² − ‖I_target − s^(t)‖²
wherein I_target represents the original background layer image, s^(t) represents the state of the background layer image at the t-th iteration, and s^(t−1) represents the state of the background layer image at the (t−1)-th iteration.
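Given the variables claim 6 names (I_target and the states at iterations t−1 and t), one natural reading is the decrease in squared error between consecutive background estimates; a sketch, with per-pixel mean squared error as an assumption about the norm's normalization:

```python
import numpy as np

def error_decrease(target, prev_state, cur_state):
    """||I_target - s^(t-1)||^2 - ||I_target - s^(t)||^2 (per-pixel mean):
    positive when the current estimate is closer to the clean target."""
    return np.mean((target - prev_state) ** 2) - np.mean((target - cur_state) ** 2)
```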
7. The method of claim 1, wherein the step of constructing a synthetic dataset comprising the background layer image and the reflection layer image from the synthetic reflection image comprises:
randomly selecting two images from a preset real image dataset as the reflection layer and the background layer respectively, thereby synthesizing reflection images and constructing a synthetic dataset containing the reflection images together with their background layer and reflection layer images;
randomly flipping images in the synthetic dataset along the horizontal or vertical axis, then rotating them by a random angle, and randomly cropping a reflection image region block of size r × r.
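The augmentation in claim 7 can be sketched as below. The claim does not fix the rotation angles; multiples of 90 degrees are used here because they keep the array rectangular without resampling:

```python
import numpy as np

def augment(image, r, rng):
    """Random flip about the vertical or horizontal axis, a random
    90-degree-multiple rotation, then a random r x r crop."""
    if rng.random() < 0.5:
        image = image[:, ::-1]          # flip about the vertical axis
    if rng.random() < 0.5:
        image = image[::-1, :]          # flip about the horizontal axis
    image = np.rot90(image, k=int(rng.integers(4)))
    h, w = image.shape[:2]
    i = int(rng.integers(h - r + 1))
    j = int(rng.integers(w - r + 1))
    return image[i:i + r, j:j + r]
```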
8. An image antireflection system, comprising:
the data acquisition module is used for synthesizing the reflection image and constructing a synthetic data set containing the background layer image and the reflection layer image according to the reflection image;
the action acquisition module is used for determining a basic filter, and constructing various morphological filters according to the basic filter to be used as an action set A in the enhanced filtering;
the image input module is used for inputting the reflection image in the synthetic data set into a depth reinforcement learning network and outputting a value V and a strategy pi of the reflection image;
the reflection removing module is used for selecting a morphological filter from the action set A according to the strategy pi, and removing reflection from the input reflection image to obtain a corresponding background layer image and a corresponding reflection layer image;
the model training module is used for training the depth reinforcement learning network according to the obtained background layer image and the reflection layer image;
and the model testing module is used for inputting the reflection images in the test data set into the trained depth reinforcement learning network and outputting corresponding reflection layer images and background layer images.
9. An image antireflection device, comprising:
at least one processor;
at least one memory for storing at least one program;
the at least one program, when executed by the at least one processor, causes the at least one processor to implement the method of any one of claims 1-7.
10. A computer-readable storage medium, in which a program executable by a processor is stored, wherein the program executable by the processor is adapted to perform the method according to any one of claims 1 to 7 when executed by the processor.
CN202210387641.1A 2022-04-14 Image antireflection method, system, device and storage medium Active CN114926352B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210387641.1A CN114926352B (en) 2022-04-14 Image antireflection method, system, device and storage medium


Publications (2)

Publication Number Publication Date
CN114926352A true CN114926352A (en) 2022-08-19
CN114926352B CN114926352B (en) 2024-05-28


Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115601536A (en) * 2022-12-02 2023-01-13 Honor Device Co., Ltd. (CN) Image processing method and electronic equipment

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110827207A (en) * 2019-09-17 2020-02-21 北京大学 Reflection elimination method based on collaborative separation and combination learning mechanism
CN112102182A (en) * 2020-08-31 2020-12-18 华南理工大学 Single image reflection removing method based on deep learning
US20210133938A1 (en) * 2019-11-04 2021-05-06 Lg Electronics Inc. Method and apparatus for enhancing illumination intensity of image
CN112802076A (en) * 2021-03-23 2021-05-14 苏州科达科技股份有限公司 Reflection image generation model and training method of reflection removal model
CN112991222A (en) * 2021-04-06 2021-06-18 西京学院 Image haze removal processing method and system, computer equipment, terminal and application


Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
WANG Xiaohong et al.: "Image deblurring method based on deep reinforcement learning", Packaging Engineering, no. 15, 10 August 2020 (2020-08-10), pages 254-261 *


Similar Documents

Publication Publication Date Title
Zhang et al. Cross-scale cost aggregation for stereo matching
CN113658051B (en) Image defogging method and system based on cyclic generation countermeasure network
US8385630B2 (en) System and method of processing stereo images
Gerrits et al. Local stereo matching with segmentation-based outlier rejection
CN111179189B (en) Image processing method and device based on generation of countermeasure network GAN, electronic equipment and storage medium
CN109214319A (en) A kind of underwater picture object detection method and system
Xia et al. Identifying recurring patterns with deep neural networks for natural image denoising
CN110490814B (en) Mixed noise removing method and system based on smooth rank constraint and storage medium
CN111681198A (en) Morphological attribute filtering multimode fusion imaging method, system and medium
Li et al. Coarse-to-fine PatchMatch for dense correspondence
Spencer et al. Deconstructing self-supervised monocular reconstruction: The design decisions that matter
Saleem et al. A non-reference evaluation of underwater image enhancement methods using a new underwater image dataset
CN115810112A (en) Image processing method, image processing device, storage medium and electronic equipment
Lin et al. Matching cost filtering for dense stereo correspondence
CN117274072A (en) Point cloud denoising method and device based on two-dimensional multi-modal range image
CN114926352A (en) Image reflection removing method, system, device and storage medium
CN114926352B (en) Image antireflection method, system, device and storage medium
Zhao et al. NormalNet: Learning-based normal filtering for mesh denoising
Mandal et al. Neural architecture search for image dehazing
Poms et al. Learning patch reconstructability for accelerating multi-view stereo
CN115375579A (en) Offshore image defogging method based on multi-branch pyramid large-kernel convolutional network
CN112102208B (en) Underwater image processing system, method, apparatus, and medium with edge preservation
Bertalmio et al. Movie denoising by average of warped lines
Hu et al. Zero-shot multi-focus image fusion
CN117058049B (en) New view image synthesis method, synthesis model training method and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant