WO2024066654A1 - Polarization image dehazing method and device based on an unsupervised weighted deep model - Google Patents

Polarization image dehazing method and device based on an unsupervised weighted deep model

Info

Publication number
WO2024066654A1
WO2024066654A1 PCT/CN2023/106584 CN2023106584W WO2024066654A1 WO 2024066654 A1 WO2024066654 A1 WO 2024066654A1 CN 2023106584 W CN2023106584 W CN 2023106584W WO 2024066654 A1 WO2024066654 A1 WO 2024066654A1
Authority
WO
WIPO (PCT)
Prior art keywords
image
polarization
layer
feature
unsupervised
Prior art date
Application number
PCT/CN2023/106584
Other languages
English (en)
French (fr)
Inventor
孙波
马铜伟
叶壮
李道胜
Original Assignee
泉州装备制造研究所
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 泉州装备制造研究所
Publication of WO2024066654A1


Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00Image enhancement or restoration
    • G06T5/73Deblurring; Sharpening
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/40Extraction of image or video features
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/774Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/80Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level
    • G06V10/806Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level of extracted features
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20081Training; Learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20084Artificial neural networks [ANN]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20212Image combination
    • G06T2207/20221Image fusion; Image merging

Definitions

  • the present invention relates to the field of image defogging, and in particular to a polarization image defogging method and device based on an unsupervised weighted depth model.
  • Galdran introduced an artificial multi-exposure image fusion strategy to restore hazy images: multiple over-exposed images are first obtained from the original hazy image and then merged into a haze-free result using Laplacian pyramid decomposition.
  • Gao et al. proposed a self-constructed image fusion method based on scale-invariant feature transform (SIFT) flow to restore a single hazy image.
  • Li et al. decomposed hazy and haze-free images into multiple scales through a Laplacian pyramid and the transmission map into multiple scales through a Gaussian pyramid, establishing a new multi-scale hazy image model: the global atmospheric light is estimated by a hierarchical search in the image, and, by estimating the transmission map, different denoising methods are used to restore the scene radiance at the different levels of the pyramid.
  • the generated pyramid is finally collapsed to recover the dehazed image.
  • however, none of the fused images reflects the depth information of the scene well, resulting in a poor dehazing effect in the presence of heavy haze.
  • Cai et al. first proposed estimating the transmission map through a convolutional neural network and then estimating the atmospheric light through a prior-based method, so as to restore a clear image using the atmospheric degradation model. Li et al. directly restored the image using a cGAN or a network based on an encoder-decoder structure.
  • learning-based methods, however, are data-driven: their training data almost always rely on synthetic foggy images, and the dehazing effect often becomes distorted when the scene changes, especially for real foggy images.
  • the purpose of the present invention is to propose a polarization image defogging method and device based on an unsupervised weighted depth model to solve the technical problems mentioned in the above background technology section.
  • the present invention provides a polarization image defogging method based on an unsupervised weighted depth model, comprising the following steps:
  • An unsupervised weighted deep model is constructed and trained to obtain a trained unsupervised weighted deep model, wherein the unsupervised weighted deep model includes a coding layer, a fusion layer, a decoding layer, and a weight measurement layer, wherein the coding layer, the fusion layer, and the decoding layer are connected in sequence, wherein the coding layer includes a convolutional layer and a DenseNet network, wherein the first image and the second image are respectively input into the coding layer to obtain a first feature and a second feature, wherein the first feature and the second feature are input into the fusion layer for feature splicing to obtain a spliced feature, wherein the spliced feature is input into the decoding layer to output a defogged polarization image, wherein during the training process, the weight measurement layer performs unsupervised training according to the defogged polarization image and the first image and the second image;
  • the first image and the second image are input into the trained unsupervised weighted deep model, and after the encoding layer, the fusion layer and the decoding layer, a defogged polarization image is obtained.
  • the weight measurement layer includes a feature extraction part, an information measurement part, an information preservation part and a loss function part.
  • the first image and the second image are respectively input into the feature extraction part to obtain a first feature map and a second feature map.
  • the first feature map and the second feature map are respectively input into the information measurement part.
  • the gradients of the first feature map and the second feature map are evaluated to obtain a first evaluation value and a second evaluation value.
  • the first evaluation value and the second evaluation value are input into the information preservation part to obtain a first weight and a second weight.
  • in the loss function part, the first weight and the second weight are combined with the structural similarity between the dehazed polarization image and the first image and the second image respectively to calculate the similarity loss.
  • the first weight and the second weight are combined with the mean square error between the dehazed polarization image and the first image and the second image respectively to calculate the mean square error loss.
  • the feature extraction part uses a VGG19 network for feature extraction, the first image and the second image are respectively input into the VGG19 network, and the output of the convolutional layer before the maximum pooling layer of the VGG19 network is used as the first feature map and the second feature map.
  • the formula for evaluating the gradients of the first feature map and the second feature map is as follows: gS = (1/5)·Σj=1..5 (1/(Hj·Wj·Dj))·Σk=1..Dj ‖∇ΦSj,k‖F², where ΦSj,k is the k-th of the Dj channels of the feature map of the convolutional layer before the j-th max-pooling layer, ‖·‖F is the Frobenius norm, ∇ is the Laplacian operator, S is the first or second feature map, and Hj and Wj are the height and width of that feature map; this yields the first evaluation value gI1 and the second evaluation value gI2.
  • the softmax function is used to map the evaluation values to the first weight ω1 and the second weight ω2.
  • the formula is as follows: [ω1, ω2] = softmax([gI1/c, gI2/c]).
  • c represents a predefined positive constant.
  • θ represents the parameters of the network body.
  • D represents the training data set.
  • E represents the average of the output inside the brackets; SIf,I1 and SIf,I2 respectively represent the structural similarity between the dehazed polarization image If and the first image I1, and between If and the second image I2, giving the similarity loss Lsim(θ, D) = E[ω1·(1 − SIf,I1) + ω2·(1 − SIf,I2)].
  • the mean square error is used as the intensity distribution constraint between the dehazed polarization image and the first image and the second image respectively, with the formula Lmse(θ, D) = E[ω1·MSEIf,I1 + ω2·MSEIf,I2].
  • the fusion layer uses a residual structure to concatenate the first feature and the second feature.
  • the decoding layer includes 4 convolutional layers. Except for the last layer, the other layers use the ReLU activation function, and the kernel of each convolutional layer is 3×3.
  • obtaining the first image and the second image according to the micro-polarization array image specifically includes:
  • the polarization image information of the four different polarization directions 0°, 45°, 90° and 135° is obtained respectively, as follows:
  • I0(x, y) = Iorig(2x, 2y);
  • I45(x, y) = Iorig(2x, 2y+1);
  • I90(x, y) = Iorig(2x+1, 2y+1);
  • I135(x, y) = Iorig(2x+1, 2y);
  • the degree-of-polarization image DoP is taken as the first image I1;
  • the atmosphere image S0 is taken as the second image I2.
  • the present invention provides a polarization image defogging device based on an unsupervised weighted depth model, comprising:
  • An image acquisition module is configured to acquire a micro-polarization array image, and obtain a first image and a second image according to the micro-polarization array image;
  • a model construction training module is configured to construct an unsupervised weighted deep model and perform training to obtain a trained unsupervised weighted deep model, wherein the unsupervised weighted deep model includes a coding layer, a fusion layer, a decoding layer, and a weight measurement layer, wherein the coding layer, the fusion layer, and the decoding layer are sequentially connected, wherein the coding layer includes a convolutional layer and a DenseNet network, wherein the first image and the second image are respectively input into the coding layer to obtain a first feature and a second feature, wherein the first feature and the second feature are input into the fusion layer for feature splicing to obtain a spliced feature, wherein the spliced feature is input into the decoding layer to output a defogged polarization image, wherein during the training process, the weight measurement layer performs unsupervised training according to the defogged polarization image and the first image and the second image;
  • the application module is configured to input the first image and the second image into a trained unsupervised weighted depth model in an application, and obtain a defogged polarization image through a coding layer, a fusion layer and a decoding layer.
  • the present invention provides an electronic device comprising one or more processors; a storage device for storing one or more programs, wherein when the one or more programs are executed by the one or more processors, the one or more processors implement the method described in any implementation manner in the first aspect.
  • the present invention provides a computer-readable storage medium having a computer program stored thereon, which, when executed by a processor, implements the method described in any one of the implementation modes of the first aspect. method.
  • the present invention has the following beneficial effects:
  • the present invention adopts a focal plane polarization imaging system to collect polarized haze images and haze-free images under different haze conditions, which can provide ground truth for the unsupervised weighted depth model.
  • the present invention adds a weight measurement layer to the unsupervised weighted deep model, takes into account the retention methods of different modal information, fully utilizes the complementary information of different modal information, does not rely on the constraint that the image must contain the sky area, greatly expands the application scenarios, and can be used for defogging tasks under complex weather conditions such as rain, fog, dust and snow.
  • FIG. 1 is a schematic flowchart of a polarization image defogging method based on an unsupervised weighted depth model according to an embodiment of the present invention;
  • FIG. 3 is a schematic diagram of a micro-polarization array image of the polarization image defogging method based on an unsupervised weighted depth model according to an embodiment of the present invention;
  • FIG. 4 is a framework diagram of the unsupervised weighted depth model of the polarization image defogging method based on an unsupervised weighted depth model according to an embodiment of the present invention;
  • FIG. 5 is a schematic diagram of the DenseNet network of the unsupervised weighted depth model of the polarization image defogging method based on an unsupervised weighted depth model according to an embodiment of the present invention;
  • FIG. 6 is a schematic diagram of the fusion layer of the unsupervised weighted depth model of the polarization image defogging method based on an unsupervised weighted depth model according to an embodiment of the present invention;
  • FIG. 7 is an atmosphere image of the polarization image defogging method based on an unsupervised weighted depth model according to an embodiment of the present invention;
  • FIG. 8 is a defogged polarization image of the polarization image defogging method based on an unsupervised weighted depth model according to an embodiment of the present invention;
  • FIG. 9 is a schematic diagram of a polarization image defogging device based on an unsupervised weighted depth model according to an embodiment of the present invention;
  • FIG. 10 is a schematic structural diagram of a computer device suitable for implementing an electronic device according to an embodiment of the present application.
  • FIG. 1 shows an exemplary device architecture 100 to which the polarization image defogging method based on an unsupervised weighted deep model or the polarization image defogging device based on an unsupervised weighted deep model according to an embodiment of the present application can be applied.
  • the device architecture 100 may include terminal devices 101, 102, 103, a network 104 and a server 105.
  • the network 104 is used to provide a medium for communication links between the terminal devices 101, 102, 103 and the server 105.
  • the network 104 may include various connection types, such as wired, wireless communication links or optical fiber cables, etc.
  • Users can use the terminal devices 101, 102, 103 to interact with the server 105 through the network 104 to receive or send messages, etc.
  • Various applications, such as data processing applications and file processing applications, can be installed on the terminal devices 101, 102, 103.
  • Terminal devices 101, 102, 103 can be hardware or software.
  • When the terminal devices 101, 102, 103 are hardware, they can be various electronic devices, including but not limited to smart phones, tablet computers, laptop computers, desktop computers, etc.
  • When the terminal devices 101, 102, 103 are software, they can be installed in the electronic devices listed above; they can be implemented as multiple software or software modules (for example, software or software modules used to provide distributed services), or as a single software or software module. No specific limitation is made here.
  • the server 105 may be a server that provides various services, such as a background data processing server that processes files or data uploaded by the terminal devices 101, 102, 103.
  • the background data processing server can process the acquired files or data and generate processing results.
  • the polarization image defogging method based on the unsupervised weighted deep model provided in the embodiment of the present application can be executed by the server 105, or by the terminal devices 101, 102, and 103. Accordingly, the polarization image defogging device based on the unsupervised weighted deep model can be set in the server 105, or in the terminal devices 101, 102, and 103.
  • the number of terminal devices, networks and servers in FIG1 is merely illustrative. Any number of terminal devices, networks and servers may be provided as required.
  • the above device architecture may not include a network, but only a server or a terminal device.
  • FIG2 shows a polarization image defogging method based on an unsupervised weighted depth model provided by an embodiment of the present application, comprising the following steps:
  • T1 acquiring a micro-polarization array image, and obtaining a first image and a second image according to the micro-polarization array image.
  • the micro-polarization array image is obtained by a focal plane polarization imaging system.
  • the micro-polarization array image is composed of polarization image information having four different polarization directions of 0°, 45°, 90°, and 135°.
  • the first image and the second image are obtained according to the micro-polarization array image, specifically including:
  • the polarization image information of four different polarization directions of 0°, 45°, 90° and 135° is obtained respectively.
  • the degree-of-polarization image DoP is taken as the first image I1;
  • the atmosphere image S0 is taken as the second image I2.
  • the unsupervised weighted deep model includes a coding layer, a fusion layer, a decoding layer and a weight measurement layer.
  • the coding layer, the fusion layer and the decoding layer are connected in sequence.
  • the coding layer includes a convolutional layer and a DenseNet network.
  • the first image and the second image are respectively input into the coding layer to obtain the first feature and the second feature.
  • the first feature and the second feature are input into the fusion layer for feature splicing to obtain the spliced feature.
  • the spliced feature is input into the decoding layer, and the defogged polarization image is output.
  • the weight measurement layer performs unsupervised training based on the defogged polarization image and the first image and the second image.
  • the convolution layer in the coding layer is used to extract coarse features, and the DenseNet network then extracts the deep, high-dimensional features corresponding to the first image and the second image, namely the first feature and the second feature, which preserves more feature information.
  • the size of all convolution kernels in FIG. 4 is 3×3, and the first image I1 and the second image I2 are respectively input into a convolution layer C1 with 16 output channels to extract overall contour features.
  • the activation function is the ReLU function, and the outputs are denoted CI1 and CI2; CI1 and CI2 are then input into DenseNet, as shown in FIG. 5. The DenseNet network includes three convolution layers with 3×3 kernels and 16 output channels; each layer uses the ReLU activation function, the output of each layer is cascaded as the input of the next layer, and the outputs obtained after DenseNet are denoted DCI1 and DCI2, with a feature map of 56×56×64.
  • the fusion layer adopts a residual structure.
  • the residual structure consists of three convolution layers with 3×3 kernels and output channels of 2, 2 and 1, respectively.
  • the resulting feature map is denoted Fusion and has size 56×56×64; the fusion layer uses the residual structure to splice the first feature and the second feature, which can accelerate convergence in the network training stage.
  • Fusion is input into the decoding layer Decode.
  • the decoding layer consists of 4 convolution layers with 3×3 kernels and output channels of 64, 32, 16 and 1, respectively; except for the last layer, the other layers use the ReLU activation function.
  • the resulting feature map is denoted If and has size 56×56×1, which is the final output.
  • the decoding layer ensures that the resolution of the final output defogged polarization image is consistent with the input image.
  • the weight measurement layer includes a feature extraction part, an information measurement part, an information preservation part and a loss function part.
  • the first image and the second image are respectively input into the feature extraction part to obtain a first feature map and a second feature map.
  • the first feature map and the second feature map are respectively input into the information measurement part.
  • the gradients of the first feature map and the second feature map are evaluated to obtain a first evaluation value and a second evaluation value.
  • the first evaluation value and the second evaluation value are input into the information preservation part to obtain a first weight and a second weight.
  • in the loss function part, the first weight and the second weight are combined with the structural similarity between the dehazed polarization image and the first image and the second image respectively to calculate the similarity loss.
  • the first weight and the second weight are combined with the mean square error between the dehazed polarization image and the first image and the second image respectively to calculate the mean square error loss.
  • the feature extraction part uses the VGG19 network to extract features, the first image and the second image are respectively input into the VGG19 network, and the output of the convolution layer before the maximum pooling layer of the VGG19 network is used as the first feature map and the second feature map.
  • the formula for evaluating the gradients of the first feature map and the second feature map is as follows: gS = (1/5)·Σj=1..5 (1/(Hj·Wj·Dj))·Σk=1..Dj ‖∇ΦSj,k‖F².
  • in the weight mapping [ω1, ω2] = softmax([gI1/c, gI2/c]), c represents a predefined positive constant.
  • the embodiment of the present application uses the softmax function to map the weights to real numbers between 0 and 1 while ensuring that the two weights sum to 1; the weight values are then used to control the degree to which the information of each input image is preserved.
  • the loss function is mainly used to preserve important information and to train the deep model; the structural similarity index is used as the similarity loss between the dehazed polarization image If and the first image I1 and the second image I2 respectively, calculated as Lsim(θ, D) = E[ω1·(1 − SIf,I1) + ω2·(1 − SIf,I2)].
  • θ represents the parameters of the network body.
  • D represents the training data set.
  • E represents the average of the output inside the brackets; SIf,I1 and SIf,I2 respectively represent the structural similarity between the dehazed polarization image If and the first image I1, and between If and the second image I2.
  • the mean square error is used as the intensity distribution constraint between the dehazed polarization image and the first image and the second image respectively, with the formula Lmse(θ, D) = E[ω1·MSEIf,I1 + ω2·MSEIf,I2].
  • Polarized haze images and fog-free images under different haze conditions are collected by the focal plane polarization imaging system to form a training set for the unsupervised weighted depth model, so as to realize the training of the unsupervised weighted depth model using the ground truth, and further improve the performance of the unsupervised weighted depth model in real scene applications.
  • the specific training process is the same as the conventional training process and will not be repeated here.
  • the first image and the second image are input into the trained unsupervised weighted deep model, and after the encoding layer, the fusion layer and the decoding layer, a defogged polarization image is obtained.
  • Figure 7 is the second image
  • Figure 8 is the polarization image after defogging.
  • the polarization image can be effectively defogged.
  • the embodiment of the present application takes the idea of disentangled representation learning as its starting point, directly solves the polarization information of the light wave according to Stokes theory, directly inputs the second image and the first image into the coding layer, and designs two weight factors in the model training method, thereby considering how information of different modalities is preserved; the weight factors are incorporated into the loss function to train the unsupervised weighted deep model, finally realizing long-distance scene defogging.
  • the polarization images under different haze conditions are collected by the focal plane polarization imaging system, and polarization defogging is performed by the unsupervised weighted deep model; compared with current mainstream defogging algorithms, it performs well under evaluation metrics such as peak signal-to-noise ratio (PSNR), structural similarity (SSIM), root mean square error (RMSE) and information entropy, and it does not rely on the constraint that the image must contain a sky region, which greatly expands the application scenarios.
  • the present application provides an embodiment of a polarization image defogging device based on an unsupervised weighted deep model.
  • the device embodiment corresponds to the method embodiment shown in Figure 2, and the device can be specifically applied to various electronic devices.
  • the embodiment of the present application provides a polarization image defogging device based on an unsupervised weighted depth model, comprising:
  • An image acquisition module 1 is configured to acquire a micro-polarization array image, and obtain a first image and a second image according to the micro-polarization array image;
  • the model construction training module 2 is configured to construct an unsupervised weighted deep model and perform training to obtain a trained unsupervised weighted deep model, wherein the unsupervised weighted deep model includes a coding layer, a fusion layer, a decoding layer and a weight measurement layer, wherein the coding layer, the fusion layer and the decoding layer are connected in sequence, wherein the coding layer includes a convolutional layer and a DenseNet network, wherein the first image and the second image are respectively input into the coding layer to obtain a first feature and a second feature, wherein the first feature and the second feature are input into the fusion layer for feature splicing to obtain a spliced feature, wherein the spliced feature is input into the decoding layer to output a defogged polarized image, wherein during the training process, the weight measurement layer performs unsupervised training according to the defogged polarized image and the first image and the second image;
  • the application module 3 is configured to input the first image and the second image into a trained unsupervised weighted depth model in an application, and obtain a defogged polarization image through a coding layer, a fusion layer and a decoding layer.
  • for a further description of the functions of each module in this embodiment, refer to the foregoing embodiment of the polarization image defogging method based on an unsupervised weighted depth model.
  • FIG10 shows a schematic diagram of the structure of a computer device 1000 suitable for implementing an electronic device (such as a server or terminal device shown in FIG1 ) according to an embodiment of the present application.
  • the electronic device shown in FIG10 is only an example and should not limit the functions and scope of use of the embodiments of the present application.
  • a computer device 1000 includes a central processing unit (CPU) 1001 and a graphics processing unit (GPU) 1002, which can perform various appropriate actions and processes according to a program stored in a read-only memory (ROM) 1003 or a program loaded from a storage section 1009 into a random access memory (RAM) 1004; the RAM 1004 also stores various programs and data required for the operation of the device 1000.
  • the CPU 1001, the GPU 1002, the ROM 1003, and the RAM 1004 are connected to each other via a bus 1005.
  • An input/output (I/O) interface 1006 is also connected to the bus 1005.
  • the following components are connected to the I/O interface 1006: an input section 1007 including a keyboard, a mouse, etc.; an output section 1008 including a display such as a liquid crystal display (LCD), etc., and a speaker, etc.; a storage section 1009 including a hard disk, etc.; and a communication section 1010 including a network interface card such as a LAN card, a modem, etc.
  • the communication section 1010 performs communication processing via a network such as the Internet.
  • a drive 1011 may also be connected to the I/O interface 1006 as needed.
  • a removable medium 1012 such as a magnetic disk, an optical disk, a magneto-optical disk, a semiconductor memory, etc., is installed on the drive 1011 as needed, so that a computer program read therefrom is installed into the storage section 1009 as needed.
  • an embodiment of the present disclosure includes a computer program product, which includes a computer program carried on a computer-readable medium, and the computer program includes a program code for executing the method shown in the flowchart.
  • the computer program can be downloaded and installed from a network through the communication part 1010, and/or installed from a removable medium 1012.
  • when the computer program is executed by the central processing unit (CPU) 1001 and the graphics processing unit (GPU) 1002, the above-mentioned functions defined in the method of the present application are performed.
  • the computer-readable medium described in the present application may be a computer-readable signal medium or a computer-readable medium, or any combination of the two.
  • the computer-readable medium may be, for example, but not limited to, an electrical, magnetic, optical, electromagnetic, infrared or semiconductor system, apparatus or device, or any combination of the above.
  • Computer-readable media may include, but are not limited to: an electrical connection with one or more wires, a portable computer disk, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above.
  • a computer-readable medium may be any tangible medium containing or storing a program that can be used by, or in conjunction with, an instruction execution system, apparatus or device.
  • a computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, which carries computer-readable program code. Such propagated data signals may take a variety of forms, including but not limited to electromagnetic signals, optical signals, or any suitable combination of the above.
  • a computer-readable signal medium may also be any computer-readable medium other than a computer-readable medium that can send, propagate, or transmit a program for use by, or in conjunction with, an instruction execution system, apparatus or device.
  • the program code contained on the computer-readable medium may be transmitted using any appropriate medium, including but not limited to: wireless, wire, optical cable, RF, etc., or any suitable combination of the above.
  • Computer program code for performing the operations of the present application may be written in one or more programming languages or a combination thereof, including object-oriented programming languages, such as Java, Smalltalk, C++, and conventional procedural programming languages, such as "C" or similar programming languages.
  • the program code may be executed entirely on the user's computer, partially on the user's computer, as a separate software package, partially on the user's computer and partially on a remote computer, or entirely on a remote computer or server.
  • the remote computer may be connected to the user's computer via any type of network, including a local area network (LAN) or a wide area network (WAN), or may be connected to an external computer (e.g., via the Internet using an Internet service provider).
  • each box in the flowchart or block diagram can represent a module, a program segment or a part of a code, and the module, a program segment or a part of the code contains one or more executable instructions for realizing the specified logical function.
  • the functions marked in the box can also occur in a different order from the order marked in the accompanying drawings. For example, two boxes represented in succession can actually be executed substantially in parallel, and they can sometimes be executed in the opposite order, depending on the functions involved.
  • each box in the block diagram and/or flowchart and the combination of boxes in the block diagram and/or flowchart can be implemented with a dedicated hardware-based device that performs a specified function or operation, or can be implemented with a combination of dedicated hardware and computer instructions.
  • modules involved in the embodiments described in this application may be implemented by software or hardware, and the modules described may also be set in a processor.
  • the present application further provides a computer-readable medium, which may be included in the electronic device described in the above embodiment, or may exist independently without being assembled into the electronic device.
  • the computer-readable medium carries one or more programs.
  • when the one or more programs are executed by the electronic device, the electronic device obtains a micro-polarization array image, obtains a first image and a second image according to the micro-polarization array image; constructs an unsupervised weighted depth model and performs training to obtain a trained unsupervised weighted depth model, the unsupervised weighted depth model including a coding layer, a fusion layer, a decoding layer and a weight measurement layer, the coding layer, the fusion layer and the decoding layer connected in sequence, the coding layer including a convolutional layer and a DenseNet network, the first image and the second image respectively input into the coding layer to obtain a first feature and a second feature, the first feature and the second feature input into the fusion layer for feature splicing to obtain a spliced feature, the spliced feature input into the decoding layer to output a defogged polarization image, and, during training, the weight measurement layer performing unsupervised training according to the defogged polarization image and the first image and the second image; and, in application, inputs the first image and the second image into the trained unsupervised weighted depth model and obtains the defogged polarization image through the coding layer, the fusion layer and the decoding layer.
  • the present invention discloses a polarization image defogging method and device based on an unsupervised weighted deep model.
  • a division-of-focal-plane polarization imaging system is used to collect polarization images under different haze conditions.
  • the unsupervised weighted deep model is used to perform polarization defogging, solving the problem that current image defogging performs poorly on long-distance scenes.
  • the method and device have a wide range of applications and good industrial practicability.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Computing Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • Multimedia (AREA)
  • Software Systems (AREA)
  • Databases & Information Systems (AREA)
  • Medical Informatics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Molecular Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Image Analysis (AREA)

Abstract

The present invention discloses a polarization image dehazing method and device based on an unsupervised weighted deep model, relating to the field of image dehazing. Taking the idea of disentangled representation learning as the starting point, the polarization information of the light wave is solved directly according to Stokes theory; the first image and the second image are input directly into the unsupervised weighted deep model and pass through the encoding layer, the fusion layer and the decoding layer in turn to output the dehazed polarization image. In the model training method, two weights are designed inside the weight measurement layer, thereby taking into account how information of different modalities is preserved; the two weights are incorporated into the loss function to train the unsupervised weighted deep model, finally realizing dehazing of long-distance scenes. A division-of-focal-plane polarization imaging system is used to collect polarization images under different haze conditions, and polarization dehazing is performed with the unsupervised weighted deep model, so as to solve the problem that current image dehazing performs poorly on long-distance scenes.

Description

Polarization image dehazing method and device based on an unsupervised weighted deep model

Technical Field

The present invention relates to the field of image dehazing, and in particular to a polarization image dehazing method and device based on an unsupervised weighted deep model.
Background Art

Owing to haze, atmospheric pollution and similar factors, image dehazing is a challenging topic in computer vision, underlying tasks such as object detection and recognition, visual navigation and autonomous driving. Scholars have proposed many methods to overcome the image degradation caused by haze. Schechner et al. were the first to demonstrate systematically that polarization imaging can improve the quality of images captured in low-visibility weather. The atmospheric degradation model in particular is widely used to describe the formation of hazy images. The methods developed for the dehazing problem fall mainly into prior-based methods, fusion-based methods and learning-based methods.

1) Prior-based methods:

These methods, also known as hand-crafted dehazing methods, include the dark channel prior, the color attenuation prior, intrinsic boundary constraints, the local smoothness prior, the haze-line prior and so on, all of which are based on the atmospheric degradation model. Although these methods achieve remarkable dehazing performance, their assumptions are fixed in advance, so good performance is difficult to achieve in some cases. For example, the dark channel prior assumes that, apart from sky regions, at least one color channel of a clear image has some pixels whose intensity is very low or even close to zero. When scene objects resemble the atmospheric light (for example, the sky or white buildings), the method over-estimates the transmission function.

2) Fusion-based methods:

These methods are characterized by restoring hazy images without using the atmospheric degradation model. Galdran introduced an artificial multi-exposure image fusion strategy to restore hazy images: multiple over-exposed images are first obtained from the original hazy image and then merged into a haze-free result using Laplacian pyramid decomposition. Gao et al. proposed a self-constructed image fusion method based on scale-invariant feature transform (SIFT) flow to restore a single hazy image. Li et al. decomposed hazy and haze-free images into multiple scales through a Laplacian pyramid and the transmission map into multiple scales through a Gaussian pyramid, establishing a new multi-scale hazy image model: the global atmospheric light is estimated by a hierarchical search in the image, and, by estimating the transmission map, different denoising methods are used to restore the scene radiance at the different levels of the pyramid; the resulting pyramid is finally collapsed to recover the dehazed image. However, none of the fused images reflects the depth information of the scene well, so the dehazing effect is poor in the presence of heavy haze.

3) Learning-based methods:

Recently, learning-based dehazing methods have received wide attention. Cai et al. first proposed estimating the transmission map with a convolutional neural network and then estimating the atmospheric light with a prior-based method, so as to restore the clear image with the atmospheric degradation model. Li et al. used a cGAN, or a network based on an encoder-decoder structure, to restore the image directly. Learning-based methods, however, are data-driven: their training data almost always rely on synthetic hazy images, and the dehazing result tends to be distorted when the scene changes, especially for real hazy images.

Existing deep image-dehazing models are trained on pairs of haze-free images and synthetic hazy images. At the same time, deep networks rarely consider how information of different modalities should be preserved, usually extracting image features with identical network structures; moreover, intensity images depend strongly on the environment, and smoke-dust environments or low-visibility conditions (e.g. hazy weather or water environments) make optical observation difficult at low signal-to-noise ratios, so the applicable scenarios are limited.
Summary of the Invention

In view of the above-mentioned problem that current image dehazing performs poorly on long-distance scenes, the purpose of the present invention is to propose a polarization image dehazing method and device based on an unsupervised weighted deep model to solve the technical problems mentioned in the background art section.

In a first aspect, the present invention provides a polarization image dehazing method based on an unsupervised weighted deep model, comprising the following steps:

acquiring a micro-polarization array image, and obtaining a first image and a second image from the micro-polarization array image;

constructing an unsupervised weighted deep model and training it to obtain the trained unsupervised weighted deep model, wherein the unsupervised weighted deep model comprises an encoding layer, a fusion layer, a decoding layer and a weight measurement layer, the encoding layer, fusion layer and decoding layer are connected in sequence, and the encoding layer comprises a convolutional layer and a DenseNet network; the first image and the second image are respectively input into the encoding layer to obtain a first feature and a second feature, the first feature and the second feature are input into the fusion layer for feature splicing to obtain a spliced feature, and the spliced feature is input into the decoding layer to output the dehazed polarization image; during training, the weight measurement layer performs unsupervised training according to the dehazed polarization image and the first and second images;

in application, inputting the first image and the second image into the trained unsupervised weighted deep model and obtaining the dehazed polarization image through the encoding layer, the fusion layer and the decoding layer.
Preferably, the weight measurement layer comprises a feature extraction part, an information measurement part, an information preservation part and a loss function part: the first image and the second image are respectively input into the feature extraction part to obtain a first feature map and a second feature map; the first feature map and the second feature map are respectively input into the information measurement part, where their gradients are evaluated to obtain a first evaluation value and a second evaluation value; the first evaluation value and the second evaluation value are input into the information preservation part to obtain a first weight and a second weight; in the loss function part, the first weight and the second weight are combined with the structural similarity between the dehazed polarization image and the first image and the second image respectively to calculate the similarity loss, and combined with the mean square error between the dehazed polarization image and the first image and the second image respectively to calculate the mean square error loss.

Preferably, the feature extraction part uses a VGG19 network for feature extraction: the first image and the second image are respectively input into the VGG19 network, and the outputs of the convolutional layers before the max-pooling layers of the VGG19 network are taken as the first feature map and the second feature map.

Preferably, in the information measurement part, the formula for evaluating the gradients of the first feature map and the second feature map is as follows:

gS = (1/5) · Σj=1..5 (1/(Hj·Wj·Dj)) · Σk=1..Dj ‖∇ΦSj,k‖F²

where ΦSj,k is the feature map of the k-th of the Dj channels of the convolutional layer before the j-th max-pooling layer of the VGG19 network, ‖·‖F denotes the Frobenius norm, ∇ denotes the Laplacian operator, S is the first feature map SD or the second feature map S0, Hj denotes the height of the feature map of the convolutional layer before the j-th max-pooling layer, and Wj denotes its width; this yields the first evaluation value gI1 and the second evaluation value gI2.

In the information preservation part, the softmax function is used to map the weights, giving the first weight ω1 and the second weight ω2, with the following formula:

[ω1, ω2] = softmax([gI1/c, gI2/c])

where c denotes a predefined positive constant.

Preferably, in the loss function part,

the similarity loss is calculated with the following formula:

Lsim(θ, D) = E[ω1·(1 − SIf,I1) + ω2·(1 − SIf,I2)]

where θ denotes the parameters of the network body, D denotes the training data set, E denotes averaging the output inside the brackets, and SIf,I1 and SIf,I2 respectively denote the structural similarity between the dehazed polarization image If and the first image I1 and between If and the second image I2;

at the same time, the mean square error is used as the intensity distribution constraint between the dehazed polarization image and the first image and the second image respectively, with the following formula:

Lmse(θ, D) = E[ω1·MSEIf,I1 + ω2·MSEIf,I2]

where MSEIf,I1 and MSEIf,I2 respectively denote the mean square error between the dehazed polarization image If and the first image I1 and between If and the second image I2.
Preferably, the fusion layer uses a residual structure to splice the first feature and the second feature; the decoding layer comprises 4 convolutional layers, of which all but the last use the ReLU activation function, and the kernel of each convolutional layer is 3×3.

Preferably, obtaining the first image and the second image from the micro-polarization array image specifically comprises:

obtaining from the micro-polarization array image the polarization image information of the four different polarization directions 0°, 45°, 90° and 135°, with the following formulas:

I0(x, y) = Iorig(2x, 2y);

I45(x, y) = Iorig(2x, 2y+1);

I90(x, y) = Iorig(2x+1, 2y+1);

I135(x, y) = Iorig(2x+1, 2y);

where x, y are pixel position indices and Iorig is the micro-polarization array image;

obtaining the Stokes parameters S0, S1 and S2 according to Stokes theory, with the following formulas:

I(0°) = (S0 + S1)/2

I(45°) = (S0 + S2)/2

I(90°) = (S0 − S1)/2

I(135°) = (S0 − S2)/2

that is, I(θ) = (S0 + S1·cos2θ + S2·sin2θ)/2, where θ is the polarization angle 0°, 45°, 90° or 135°; substituting the four directions then gives S0, S1, S2, the degree of polarization DoP and the angle of polarization AoP, with the following formulas:

S0 = I(0) + I(90)

S1 = I(0) − I(90)

S2 = I(45) − I(135)

DoP = √(S1² + S2²)/S0

AoP = (1/2)·arctan(S2/S1)

The degree-of-polarization image DoP is taken as the first image I1, and the atmosphere image S0 is taken as the second image I2.
In a second aspect, the present invention provides a polarization image dehazing device based on an unsupervised weighted deep model, comprising:

an image acquisition module configured to acquire a micro-polarization array image and obtain a first image and a second image from the micro-polarization array image;

a model construction and training module configured to construct an unsupervised weighted deep model and train it to obtain the trained unsupervised weighted deep model, wherein the unsupervised weighted deep model comprises an encoding layer, a fusion layer, a decoding layer and a weight measurement layer, the encoding layer, fusion layer and decoding layer are connected in sequence, and the encoding layer comprises a convolutional layer and a DenseNet network; the first image and the second image are respectively input into the encoding layer to obtain a first feature and a second feature, the first feature and the second feature are input into the fusion layer for feature splicing to obtain a spliced feature, and the spliced feature is input into the decoding layer to output the dehazed polarization image; during training, the weight measurement layer performs unsupervised training according to the dehazed polarization image and the first and second images;

an application module configured to, in application, input the first image and the second image into the trained unsupervised weighted deep model and obtain the dehazed polarization image through the encoding layer, the fusion layer and the decoding layer.

In a third aspect, the present invention provides an electronic device comprising one or more processors and a storage device storing one or more programs; when the one or more programs are executed by the one or more processors, the one or more processors implement the method described in any implementation of the first aspect.

In a fourth aspect, the present invention provides a computer-readable storage medium having a computer program stored thereon; when executed by a processor, the computer program implements the method described in any implementation of the first aspect.

Compared with the prior art, the present invention has the following beneficial effects:

(1) The present invention uses a division-of-focal-plane polarization imaging system to collect polarized hazy images and haze-free images under different haze conditions, which can provide ground truth for the unsupervised weighted deep model; at the same time, by exploiting the third-dimension polarization information of light waves together with the characteristics of atmospheric light, it can effectively solve the problem that current mainstream dehazing methods cannot dehaze long-distance scenes.

(2) The present invention adds a weight measurement layer to the unsupervised weighted deep model, which takes into account how information of different modalities is preserved and makes full use of the complementary information of the different modalities; it does not depend on the constraint that the image must contain a sky region, which greatly expands the application scenarios, and it can be used for dehazing tasks under complex weather conditions such as rain, fog, dust and snow.
Brief Description of the Drawings

To explain the technical solutions in the embodiments of the present invention more clearly, the drawings needed in the description of the embodiments are briefly introduced below. Obviously, the drawings described below are only some embodiments of the present invention, and a person of ordinary skill in the art can obtain other drawings from them without creative effort.

FIG. 1 is a schematic flowchart of a polarization image dehazing method based on an unsupervised weighted deep model according to an embodiment of the present invention;

FIG. 3 is a schematic diagram of a micro-polarization array image of the polarization image dehazing method based on an unsupervised weighted deep model according to an embodiment of the present invention;

FIG. 4 is a framework diagram of the unsupervised weighted deep model of the polarization image dehazing method based on an unsupervised weighted deep model according to an embodiment of the present invention;

FIG. 5 is a schematic diagram of the DenseNet network of the unsupervised weighted deep model of the polarization image dehazing method based on an unsupervised weighted deep model according to an embodiment of the present invention;

FIG. 6 is a schematic diagram of the fusion layer of the unsupervised weighted deep model of the polarization image dehazing method based on an unsupervised weighted deep model according to an embodiment of the present invention;

FIG. 7 is an atmosphere image of the polarization image dehazing method based on an unsupervised weighted deep model according to an embodiment of the present invention;

FIG. 8 is a dehazed polarization image of the polarization image dehazing method based on an unsupervised weighted deep model according to an embodiment of the present invention;

FIG. 9 is a schematic diagram of a polarization image dehazing device based on an unsupervised weighted deep model according to an embodiment of the present invention;

FIG. 10 is a schematic structural diagram of a computer device suitable for implementing an electronic device according to an embodiment of the present application.
Detailed Description of the Embodiments

To make the objectives, technical solutions and advantages of the present invention clearer, the present invention is described in further detail below with reference to the drawings. Obviously, the described embodiments are only some, not all, of the embodiments of the present invention. Based on the embodiments of the present invention, all other embodiments obtained by a person of ordinary skill in the art without creative effort fall within the protection scope of the present invention.

FIG. 1 shows an exemplary device architecture 100 to which the polarization image dehazing method or device based on an unsupervised weighted deep model according to embodiments of the present application can be applied.

As shown in FIG. 1, the device architecture 100 may include terminal devices 101, 102, 103, a network 104 and a server 105. The network 104 provides the medium for communication links between the terminal devices 101, 102, 103 and the server 105, and may include various connection types, such as wired or wireless communication links or optical fiber cables.

A user may use the terminal devices 101, 102, 103 to interact with the server 105 through the network 104 to receive or send messages and the like. Various applications, such as data processing applications and file processing applications, may be installed on the terminal devices 101, 102, 103.

The terminal devices 101, 102, 103 may be hardware or software. When they are hardware, they may be various electronic devices, including but not limited to smartphones, tablet computers, laptop computers, desktop computers and the like. When they are software, they may be installed in the electronic devices listed above and implemented as multiple pieces of software or software modules (for example, software or software modules for providing distributed services) or as a single piece of software or software module, which is not specifically limited here.

The server 105 may be a server providing various services, for example a background data processing server that processes files or data uploaded by the terminal devices 101, 102, 103. The background data processing server can process the acquired files or data and generate processing results.

It should be noted that the polarization image dehazing method based on an unsupervised weighted deep model provided in the embodiments of the present application may be executed by the server 105 or by the terminal devices 101, 102, 103; accordingly, the polarization image dehazing device based on an unsupervised weighted deep model may be arranged in the server 105 or in the terminal devices 101, 102, 103.

It should be understood that the numbers of terminal devices, networks and servers in FIG. 1 are merely illustrative; any number of terminal devices, networks and servers may be provided as required. When the processed data does not need to be acquired remotely, the above device architecture may include no network but only a server or a terminal device.
FIG. 2 shows a polarization image dehazing method based on an unsupervised weighted deep model provided by an embodiment of the present application, comprising the following steps:

T1: acquiring a micro-polarization array image, and obtaining a first image and a second image from the micro-polarization array image.

In a specific embodiment, the micro-polarization array image is obtained by a division-of-focal-plane polarization imaging system. Referring to FIG. 3, the micro-polarization array image is composed of polarization image information with the four different polarization directions 0°, 45°, 90° and 135°. Obtaining the first image and the second image from the micro-polarization array image specifically comprises:

obtaining from the micro-polarization array image the polarization image information of the four different polarization directions 0°, 45°, 90° and 135°, with the following formulas:

I0(x, y) = Iorig(2x, 2y);

I45(x, y) = Iorig(2x, 2y+1);

I90(x, y) = Iorig(2x+1, 2y+1);

I135(x, y) = Iorig(2x+1, 2y);

where x, y are pixel position indices and Iorig is the micro-polarization array image;

obtaining the Stokes parameters S0, S1 and S2 according to Stokes theory, with the following formulas:

I(0°) = (S0 + S1)/2

I(45°) = (S0 + S2)/2

I(90°) = (S0 − S1)/2

I(135°) = (S0 − S2)/2

that is, I(θ) = (S0 + S1·cos2θ + S2·sin2θ)/2, where θ is the polarization angle 0°, 45°, 90° or 135°; substituting the four directions then gives S0, S1, S2, the degree-of-polarization image DoP and the angle-of-polarization image AoP, with the following formulas:

S0 = I(0) + I(90)

S1 = I(0) − I(90)

S2 = I(45) − I(135)

DoP = √(S1² + S2²)/S0

AoP = (1/2)·arctan(S2/S1)

The degree-of-polarization image DoP is taken as the first image I1, and the atmosphere image S0 is taken as the second image I2.
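As a concrete illustration of the channel extraction and Stokes computation just described, the following is a minimal NumPy sketch. It assumes the 2×2 super-pixel layout implied by the index formulas above; the function names, the random stand-in frame and the epsilon guard against division by zero are illustrative assumptions, not part of the patent.

```python
import numpy as np

def split_dofp(raw: np.ndarray):
    """Split a division-of-focal-plane mosaic into its four polarization channels.

    Assumes the 2x2 super-pixel layout implied by the index formulas above:
    I0 at (even, even), I45 at (even, odd), I90 at (odd, odd), I135 at (odd, even).
    """
    i0   = raw[0::2, 0::2].astype(np.float32)
    i45  = raw[0::2, 1::2].astype(np.float32)
    i90  = raw[1::2, 1::2].astype(np.float32)
    i135 = raw[1::2, 0::2].astype(np.float32)
    return i0, i45, i90, i135

def stokes_images(i0, i45, i90, i135, eps=1e-6):
    """Compute the Stokes parameters, DoP and AoP from the four channels."""
    s0 = i0 + i90                                # total intensity (atmosphere image)
    s1 = i0 - i90
    s2 = i45 - i135
    dop = np.sqrt(s1**2 + s2**2) / (s0 + eps)    # degree of polarization
    aop = 0.5 * np.arctan2(s2, s1)               # angle of polarization
    return s0, s1, s2, dop, aop

raw = np.random.rand(112, 112).astype(np.float32)  # stand-in for a real mosaic frame
s0, s1, s2, dop, aop = stokes_images(*split_dofp(raw))
I1, I2 = dop, s0   # first image: DoP image; second image: atmosphere image S0
```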
T2: constructing an unsupervised weighted deep model and training it to obtain the trained unsupervised weighted deep model. The unsupervised weighted deep model comprises an encoding layer, a fusion layer, a decoding layer and a weight measurement layer; the encoding layer, fusion layer and decoding layer are connected in sequence, and the encoding layer comprises a convolutional layer and a DenseNet network. The first image and the second image are respectively input into the encoding layer to obtain the first feature and the second feature; the first feature and the second feature are input into the fusion layer for feature splicing to obtain the spliced feature; the spliced feature is input into the decoding layer, which outputs the dehazed polarization image. During training, the weight measurement layer performs unsupervised training according to the dehazed polarization image and the first and second images.

In a specific embodiment, referring to FIGS. 4-6, the convolutional layer in the encoding layer extracts coarse features, and the DenseNet network then extracts the deep, high-dimensional features corresponding to the first and second images, i.e. the first feature and the second feature, which preserves more feature information. Specifically, all convolution kernels in FIG. 4 are 3×3. The first image I1 and the second image I2 are each input into a convolutional layer C1 with 16 output channels to extract overall contour features, with ReLU as the activation function; the outputs are denoted CI1 and CI2. CI1 and CI2 are then input into DenseNet, as shown in FIG. 5: the DenseNet network contains three convolutional layers with 3×3 kernels and 16 output channels, each layer uses the ReLU activation function, and the outputs of the layers are cascaded as the input of the next layer. The outputs after DenseNet are denoted DCI1 and DCI2, with feature maps of size 56×56×64. Referring to FIG. 6, the fusion layer adopts a residual structure into which the corresponding features are input: the residual structure consists of three convolutional layers with 3×3 kernels and output channels of 2, 2 and 1 respectively, and the resulting feature map, denoted Fusion, has size 56×56×64. The fusion layer uses the residual structure to splice the first feature and the second feature, which accelerates convergence in the network training stage. Fusion is then input into the decoding layer Decode, which consists of 4 convolutional layers with 3×3 kernels and output channels of 64, 32, 16 and 1 respectively; except for the last layer, the layers use the ReLU activation function. The resulting feature map, denoted If, has size 56×56×1 and is the final output. The decoding layer ensures that the resolution of the final dehazed polarization image is consistent with that of the input image.
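The encoder-fusion-decoder pipeline described above can be sketched in PyTorch roughly as follows. This is a non-authoritative reading of the text: the two inputs are assumed to share one encoder, and since the stated fusion channel counts (2, 2 and 1) do not by themselves yield a 56×56×64 map, the fusion below is simplified to a plain residual convolution stack; all class and variable names are illustrative.

```python
import torch
import torch.nn as nn

class DenseBlock(nn.Module):
    """Three 3x3 conv layers (16 channels each, ReLU); the input of each layer is
    the concatenation of all previous outputs, mirroring the DenseNet description."""
    def __init__(self, in_ch: int = 16, growth: int = 16):
        super().__init__()
        self.convs = nn.ModuleList(
            nn.Conv2d(in_ch + i * growth, growth, 3, padding=1) for i in range(3)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        feats = [x]
        for conv in self.convs:
            feats.append(torch.relu(conv(torch.cat(feats, dim=1))))
        return torch.cat(feats, dim=1)            # 16 + 3*16 = 64 channels

class DehazeNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.c1 = nn.Conv2d(1, 16, 3, padding=1)  # C1: overall contour features
        self.dense = DenseBlock()
        # Fusion: a simplified residual stack over the concatenated features.
        self.fuse = nn.Sequential(
            nn.Conv2d(128, 64, 3, padding=1), nn.ReLU(),
            nn.Conv2d(64, 64, 3, padding=1),
        )
        # Decode: 4 conv layers, channels 64-32-16-1, ReLU except the last.
        self.decode = nn.Sequential(
            nn.Conv2d(64, 64, 3, padding=1), nn.ReLU(),
            nn.Conv2d(64, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 16, 3, padding=1), nn.ReLU(),
            nn.Conv2d(16, 1, 3, padding=1),
        )

    def forward(self, i1: torch.Tensor, i2: torch.Tensor) -> torch.Tensor:
        f1 = self.dense(torch.relu(self.c1(i1)))  # DC_I1: 64 channels
        f2 = self.dense(torch.relu(self.c1(i2)))  # DC_I2: 64 channels
        cat = torch.cat([f1, f2], dim=1)
        fusion = torch.relu(self.fuse(cat) + cat[:, :64] + cat[:, 64:])  # residual
        return self.decode(fusion)                # I_f, same resolution as inputs
```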
In a specific embodiment, the weight measurement layer comprises a feature extraction part, an information measurement part, an information preservation part and a loss function part. The first image and the second image are respectively input into the feature extraction part to obtain a first feature map and a second feature map; the first feature map and the second feature map are respectively input into the information measurement part, where their gradients are evaluated to obtain a first evaluation value and a second evaluation value; the first evaluation value and the second evaluation value are input into the information preservation part to obtain a first weight and a second weight; in the loss function part, the first weight and the second weight are combined with the structural similarity between the dehazed polarization image and the first and second images respectively to calculate the similarity loss, and with the mean square error between the dehazed polarization image and the first and second images respectively to calculate the mean square error loss.

Specifically, the feature extraction part uses the VGG19 network for feature extraction: the first image and the second image are respectively input into the VGG19 network, and the outputs of the convolutional layers before the max-pooling layers of the VGG19 network are taken as the first feature map and the second feature map.

Specifically, in the information measurement part, in order to measure the information contained in the extracted feature maps, the gradients of the first feature map and the second feature map are evaluated with the following formula:

gS = (1/5) · Σj=1..5 (1/(Hj·Wj·Dj)) · Σk=1..Dj ‖∇ΦSj,k‖F²

where ΦSj,k is the feature map of the k-th of the Dj channels of the convolutional layer before the j-th max-pooling layer of the VGG19 network, ‖·‖F denotes the Frobenius norm, ∇ denotes the Laplacian operator, S is the first feature map SD or the second feature map S0, Hj denotes the height of the feature map of the convolutional layer before the j-th max-pooling layer, and Wj denotes its width; this yields the first evaluation value gI1 and the second evaluation value gI2.

In the information preservation part, which is also efficient in computation and storage, two adaptive weights are defined by the degree of information preservation in order to retain as much of the effective information of the input images as possible. The softmax function is used to map the weights, giving the first weight ω1 and the second weight ω2, with the following formula:

[ω1, ω2] = softmax([gI1/c, gI2/c])

where c denotes a predefined positive constant. The embodiment of the present application uses the softmax mapping to convert the weights into real numbers between 0 and 1 while ensuring that the two weights sum to 1; the weight values are then used to control the degree to which the information of each input image is preserved.
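A sketch of the information measurement and information preservation parts, under the formulas above, might look as follows: VGG19 feature maps are tapped at the layer before each of the five max-pooling layers, the squared Frobenius norm of their Laplacian is normalized and averaged, and the two evaluation values are mapped to weights with the predefined constant c. The torchvision layer indices, the replication of the single-channel input to three channels, and the value of c are assumptions of this sketch.

```python
import torch
import torch.nn.functional as F
from torchvision.models import vgg19

# Feature maps are tapped at the layer immediately before each of the five
# max-pooling layers of torchvision's vgg19().features (indices are assumptions).
_VGG = vgg19(weights="IMAGENET1K_V1").features.eval()
_TAPS = {3, 8, 17, 26, 35}

_LAPLACE = torch.tensor([[0., 1., 0.],
                         [1., -4., 1.],
                         [0., 1., 0.]]).view(1, 1, 3, 3)

@torch.no_grad()
def information_measure(img: torch.Tensor) -> torch.Tensor:
    """g_S: squared Frobenius norm of the Laplacian of the tapped VGG19 feature
    maps, normalized by H_j * W_j * D_j and averaged over the five taps."""
    x = img.repeat(1, 3, 1, 1)                   # VGG19 expects 3-channel input
    g = img.new_zeros(())
    for i, layer in enumerate(_VGG):
        x = layer(x)
        if i in _TAPS:
            b, d, h, w = x.shape
            lap = F.conv2d(x.reshape(b * d, 1, h, w), _LAPLACE, padding=1)
            g = g + lap.pow(2).sum() / (h * w * d)
    return g / len(_TAPS)

def preservation_weights(i1: torch.Tensor, i2: torch.Tensor, c: float = 1e3):
    """Map the two evaluation values to weights in (0, 1) that sum to 1."""
    g = torch.stack([information_measure(i1), information_measure(i2)])
    return torch.softmax(g / c, dim=0)           # [w1, w2]
```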
Specifically, in the loss function part, the loss function is mainly used to preserve important information and to train the deep model. The structural similarity index is used as the similarity loss between the dehazed polarization image If and the first image I1 and the second image I2 respectively.

The similarity loss is calculated with the following formula:

Lsim(θ, D) = E[ω1·(1 − SIf,I1) + ω2·(1 − SIf,I2)]

where θ denotes the parameters of the network body, D denotes the training data set, E denotes averaging the output inside the brackets, and SIf,I1 and SIf,I2 respectively denote the structural similarity between the dehazed polarization image If and the first image I1 and between If and the second image I2;

at the same time, the mean square error is used as the intensity distribution constraint between the dehazed polarization image and the first image and the second image respectively, with the following formula:

Lmse(θ, D) = E[ω1·MSEIf,I1 + ω2·MSEIf,I2]

where MSEIf,I1 and MSEIf,I2 respectively denote the mean square error between the dehazed polarization image If and the first image I1 and between If and the second image I2.
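The two loss terms can be combined as in the following sketch, assuming the usual 1 − SSIM form for the similarity loss; the global (non-windowed) SSIM and the balancing coefficient alpha are simplifications introduced here, since the text only fixes the weighted structural-similarity and weighted mean-square-error structure.

```python
import torch

def ssim(x: torch.Tensor, y: torch.Tensor, c1: float = 0.01**2, c2: float = 0.03**2):
    """Global (non-windowed) SSIM, a simplification of the usual windowed index."""
    mx, my = x.mean(), y.mean()
    vx, vy = x.var(), y.var()
    cov = ((x - mx) * (y - my)).mean()
    return ((2*mx*my + c1) * (2*cov + c2)) / ((mx**2 + my**2 + c1) * (vx + vy + c2))

def dehaze_loss(i_f, i1, i2, w, alpha: float = 20.0):
    """Total loss L = L_sim + alpha * L_mse with adaptive weights w = [w1, w2]."""
    l_sim = w[0] * (1 - ssim(i_f, i1)) + w[1] * (1 - ssim(i_f, i2))
    l_mse = w[0] * torch.mean((i_f - i1)**2) + w[1] * torch.mean((i_f - i2)**2)
    return l_sim + alpha * l_mse
```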
Polarized hazy images and haze-free images under different haze conditions are collected by the division-of-focal-plane polarization imaging system to form the training set of the unsupervised weighted deep model, so that the model can be trained with ground truth, further improving its performance in real-scene applications. The specific training process is the same as a conventional training process and is not repeated here.
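Tying the sketches above together, one conventional unsupervised training step might look like the following; the optimizer, learning rate and batch shapes are assumptions, and DehazeNet, preservation_weights and dehaze_loss are the illustrative helpers defined in the earlier sketches.

```python
import torch

model = DehazeNet()
opt = torch.optim.Adam(model.parameters(), lr=1e-4)

def train_step(i1: torch.Tensor, i2: torch.Tensor) -> float:
    """One unsupervised step: the weights come from the inputs themselves, so the
    loss needs no haze-free target."""
    w = preservation_weights(i1, i2)   # weight measurement layer (no gradients)
    i_f = model(i1, i2)                # encode -> fuse -> decode
    loss = dehaze_loss(i_f, i1, i2, w)
    opt.zero_grad()
    loss.backward()
    opt.step()
    return loss.item()

# i1: DoP image, i2: atmosphere image S0, as 1-channel tensors.
print(train_step(torch.rand(4, 1, 56, 56), torch.rand(4, 1, 56, 56)))
```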
T3: in application, the first image and the second image are input into the trained unsupervised weighted deep model, and the dehazed polarization image is obtained through the encoding layer, the fusion layer and the decoding layer.

Specifically, referring to FIGS. 7 and 8, FIG. 7 is the second image and FIG. 8 is the dehazed polarization image; with the unsupervised weighted deep model of the embodiment of the present application, the polarization image is dehazed effectively.

The embodiment of the present application takes the idea of disentangled representation learning as its starting point, solves the polarization information of the light wave directly according to Stokes theory, inputs the second image and the first image directly into the encoding layer, and designs two weight factors in the model training method, thereby taking into account how information of different modalities is preserved; the weight factors are incorporated into the loss function to train the unsupervised weighted deep model, finally realizing dehazing of long-distance scenes. A division-of-focal-plane polarization imaging system collects polarization images under different haze conditions, and polarization dehazing is performed with the unsupervised weighted deep model; compared with current mainstream dehazing algorithms, it performs well under evaluation metrics such as peak signal-to-noise ratio (PSNR), structural similarity (SSIM), root mean square error (RMSE) and information entropy, and it does not rely on the constraint that the image must contain a sky region, which greatly expands the application scenarios.
With further reference to FIG. 9, as an implementation of the methods shown in the above figures, the present application provides an embodiment of a polarization image dehazing device based on an unsupervised weighted deep model; this device embodiment corresponds to the method embodiment shown in FIG. 2, and the device can be applied to various electronic devices.

An embodiment of the present application provides a polarization image dehazing device based on an unsupervised weighted deep model, comprising:

an image acquisition module 1 configured to acquire a micro-polarization array image and obtain a first image and a second image from the micro-polarization array image;

a model construction and training module 2 configured to construct an unsupervised weighted deep model and train it to obtain the trained unsupervised weighted deep model, wherein the unsupervised weighted deep model comprises an encoding layer, a fusion layer, a decoding layer and a weight measurement layer, the encoding layer, fusion layer and decoding layer are connected in sequence, and the encoding layer comprises a convolutional layer and a DenseNet network; the first image and the second image are respectively input into the encoding layer to obtain a first feature and a second feature, the first feature and the second feature are input into the fusion layer for feature splicing to obtain a spliced feature, and the spliced feature is input into the decoding layer to output the dehazed polarization image; during training, the weight measurement layer performs unsupervised training according to the dehazed polarization image and the first and second images;

an application module 3 configured to, in application, input the first image and the second image into the trained unsupervised weighted deep model and obtain the dehazed polarization image through the encoding layer, the fusion layer and the decoding layer.

For further description of the functions of each module in this embodiment, refer to the foregoing embodiment of the polarization image dehazing method based on an unsupervised weighted deep model.
Referring now to FIG. 10, it shows a schematic structural diagram of a computer device 1000 suitable for implementing an electronic device (for example, the server or terminal devices shown in FIG. 1) according to an embodiment of the present application. The electronic device shown in FIG. 10 is only an example and should not impose any limitation on the functions and scope of use of the embodiments of the present application.

As shown in FIG. 10, the computer device 1000 includes a central processing unit (CPU) 1001 and a graphics processing unit (GPU) 1002, which can perform various appropriate actions and processes according to a program stored in a read-only memory (ROM) 1003 or a program loaded from a storage section 1009 into a random access memory (RAM) 1004. The RAM 1004 also stores various programs and data required for the operation of the device 1000. The CPU 1001, GPU 1002, ROM 1003 and RAM 1004 are connected to one another via a bus 1005. An input/output (I/O) interface 1006 is also connected to the bus 1005.

The following components are connected to the I/O interface 1006: an input section 1007 including a keyboard, a mouse and the like; an output section 1008 including a display such as a liquid crystal display (LCD) and a speaker and the like; a storage section 1009 including a hard disk and the like; and a communication section 1010 including a network interface card such as a LAN card or a modem. The communication section 1010 performs communication processing via a network such as the Internet. A drive 1011 may also be connected to the I/O interface 1006 as needed. A removable medium 1012, such as a magnetic disk, an optical disk, a magneto-optical disk or a semiconductor memory, is mounted on the drive 1011 as needed, so that a computer program read from it can be installed into the storage section 1009 as needed.

In particular, according to embodiments of the present disclosure, the process described above with reference to the flowchart may be implemented as a computer software program. For example, an embodiment of the present disclosure includes a computer program product comprising a computer program carried on a computer-readable medium, the computer program containing program code for executing the method shown in the flowchart. In such an embodiment, the computer program may be downloaded and installed from a network through the communication section 1010 and/or installed from the removable medium 1012. When the computer program is executed by the central processing unit (CPU) 1001 and the graphics processing unit (GPU) 1002, the above-mentioned functions defined in the method of the present application are performed.

It should be noted that the computer-readable medium described in the present application may be a computer-readable signal medium or a computer-readable medium, or any combination of the two. The computer-readable medium may be, for example, but not limited to, an electrical, magnetic, optical, electromagnetic, infrared or semiconductor system, apparatus or device, or any combination of the above. More specific examples of computer-readable media may include, but are not limited to: an electrical connection with one or more wires, a portable computer disk, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above. In the present application, a computer-readable medium may be any tangible medium containing or storing a program that can be used by, or in combination with, an instruction execution system, apparatus or device. In the present application, a computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, which carries computer-readable program code. Such a propagated data signal may take many forms, including but not limited to an electromagnetic signal, an optical signal, or any suitable combination of the above. A computer-readable signal medium may also be any computer-readable medium other than a computer-readable medium that can send, propagate or transmit a program for use by, or in combination with, an instruction execution system, apparatus or device. The program code contained on a computer-readable medium may be transmitted over any appropriate medium, including but not limited to wireless, wire, optical cable, RF and the like, or any suitable combination of the above.

Computer program code for performing the operations of the present application may be written in one or more programming languages or a combination thereof, including object-oriented programming languages such as Java, Smalltalk and C++, as well as conventional procedural programming languages such as the "C" language or similar programming languages. The program code may be executed entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on a remote computer or server. Where a remote computer is involved, the remote computer may be connected to the user's computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or may be connected to an external computer (for example, through the Internet using an Internet service provider).

The flowcharts and block diagrams in the drawings illustrate the possible architectures, functions and operations of devices, methods and computer program products according to various embodiments of the present application. In this regard, each block in a flowchart or block diagram may represent a module, a program segment or part of code, which contains one or more executable instructions for implementing the specified logical function. It should also be noted that in some alternative implementations the functions marked in the blocks may occur in an order different from that marked in the drawings; for example, two blocks shown in succession may in fact be executed substantially in parallel, or sometimes in the reverse order, depending on the functions involved. It should also be noted that each block of the block diagrams and/or flowcharts, and combinations of blocks in the block diagrams and/or flowcharts, can be implemented by a dedicated hardware-based system that performs the specified functions or operations, or by a combination of dedicated hardware and computer instructions.

The modules involved in the embodiments described in the present application may be implemented by software or by hardware, and the described modules may also be arranged in a processor.

As another aspect, the present application also provides a computer-readable medium, which may be included in the electronic device described in the above embodiments or may exist independently without being assembled into the electronic device. The computer-readable medium carries one or more programs; when the one or more programs are executed by the electronic device, the electronic device: acquires a micro-polarization array image and obtains a first image and a second image from the micro-polarization array image; constructs an unsupervised weighted deep model and trains it to obtain the trained unsupervised weighted deep model, wherein the unsupervised weighted deep model comprises an encoding layer, a fusion layer, a decoding layer and a weight measurement layer, the encoding layer, fusion layer and decoding layer are connected in sequence, the encoding layer comprises a convolutional layer and a DenseNet network, the first image and the second image are respectively input into the encoding layer to obtain a first feature and a second feature, the first feature and the second feature are input into the fusion layer for feature splicing to obtain a spliced feature, the spliced feature is input into the decoding layer to output the dehazed polarization image, and during training the weight measurement layer performs unsupervised training according to the dehazed polarization image and the first and second images; and, in application, inputs the first image and the second image into the trained unsupervised weighted deep model and obtains the dehazed polarization image through the encoding layer, the fusion layer and the decoding layer.
The above description is only a preferred embodiment of the present application and an explanation of the technical principles employed. A person skilled in the art should understand that the scope of the invention involved in the present application is not limited to technical solutions formed by the specific combination of the above technical features, but also covers other technical solutions formed by any combination of the above technical features or their equivalents without departing from the above inventive concept, for example technical solutions formed by replacing the above features with technical features of similar function disclosed in (but not limited to) the present application.

Industrial Applicability

The polarization image dehazing method and device based on an unsupervised weighted deep model of the present invention use a division-of-focal-plane polarization imaging system to collect polarization images under different haze conditions and perform polarization dehazing with the unsupervised weighted deep model, so as to solve the problem that current image dehazing performs poorly on long-distance scenes; the method and device have a wide range of applications and good industrial applicability.

Claims (11)

  1. A polarization image dehazing method based on an unsupervised weighted deep model, characterized by comprising the following steps:
    acquiring a micro-polarization array image, and obtaining a first image and a second image from the micro-polarization array image;
    constructing an unsupervised weighted deep model and training it to obtain the trained unsupervised weighted deep model, wherein the unsupervised weighted deep model comprises an encoding layer, a fusion layer, a decoding layer and a weight measurement layer; the encoding layer, the fusion layer and the decoding layer are connected in sequence; the encoding layer comprises a convolutional layer and a DenseNet network; the first image and the second image are respectively input into the encoding layer to obtain a first feature and a second feature; the first feature and the second feature are input into the fusion layer for feature splicing to obtain a spliced feature; the spliced feature is input into the decoding layer, which outputs a dehazed polarization image; and during training, the weight measurement layer performs unsupervised training according to the dehazed polarization image and the first image and the second image;
    in application, inputting the first image and the second image into the trained unsupervised weighted deep model, and obtaining the dehazed polarization image through the encoding layer, the fusion layer and the decoding layer.
  2. The polarization image dehazing method based on an unsupervised weighted deep model according to claim 1, characterized in that the weight measurement layer comprises a feature extraction part, an information measurement part, an information preservation part and a loss function part: the first image and the second image are respectively input into the feature extraction part to obtain a first feature map and a second feature map; the first feature map and the second feature map are respectively input into the information measurement part, and the gradients of the first feature map and the second feature map are evaluated to obtain a first evaluation value and a second evaluation value; the first evaluation value and the second evaluation value are input into the information preservation part to obtain a first weight and a second weight; in the loss function part, the first weight and the second weight are combined with the structural similarity between the dehazed polarization image and the first image and the second image respectively to calculate a similarity loss, and the first weight and the second weight are combined with the mean square error between the dehazed polarization image and the first image and the second image respectively to calculate a mean square error loss.
  3. The polarization image dehazing method based on an unsupervised weighted deep model according to claim 2, characterized in that the feature extraction part uses a VGG19 network for feature extraction: the first image and the second image are respectively input into the VGG19 network, and the outputs of the convolutional layers before the max-pooling layers of the VGG19 network are taken as the first feature map and the second feature map.
  4. The polarization image dehazing method based on an unsupervised weighted deep model according to claim 3, characterized in that in the information measurement part, the formula for evaluating the gradients of the first feature map and the second feature map is as follows:
    gS = (1/5) · Σj=1..5 (1/(Hj·Wj·Dj)) · Σk=1..Dj ‖∇ΦSj,k‖F²
    where ΦSj,k is the feature map of the k-th of the Dj channels of the convolutional layer before the j-th max-pooling layer of the VGG19 network, ‖·‖F denotes the Frobenius norm, ∇ denotes the Laplacian operator, S is the first feature map SD or the second feature map S0, Hj denotes the height of the feature map of the convolutional layer before the j-th max-pooling layer, and Wj denotes its width, yielding the first evaluation value gI1 and the second evaluation value gI2;
    in the information preservation part, the softmax function is used to map the weights, giving the first weight ω1 and the second weight ω2, with the following formula:
    [ω1, ω2] = softmax([gI1/c, gI2/c])
    where c denotes a predefined positive constant.
  5. The polarization image dehazing method based on an unsupervised weighted deep model according to claim 2, characterized in that in the loss function part,
    the similarity loss is calculated with the following formula:
    Lsim(θ, D) = E[ω1·(1 − SIf,I1) + ω2·(1 − SIf,I2)]
    where θ denotes the parameters of the network body, D denotes the training data set, E denotes averaging the output inside the brackets, and SIf,I1 and SIf,I2 respectively denote the structural similarity between the dehazed polarization image If and the first image I1 and between If and the second image I2;
    at the same time, the mean square error is used as the intensity distribution constraint between the dehazed polarization image and the first image and the second image respectively, with the following formula:
    Lmse(θ, D) = E[ω1·MSEIf,I1 + ω2·MSEIf,I2]
    where MSEIf,I1 and MSEIf,I2 respectively denote the mean square error between the dehazed polarization image If and the first image I1 and between If and the second image I2.
  6. The polarization image dehazing method based on an unsupervised weighted deep model according to claim 1, characterized in that the fusion layer uses a residual structure to splice the first feature and the second feature, and the decoding layer comprises 4 convolutional layers, of which all but the last use the ReLU activation function, the kernel of each convolutional layer being 3×3.
  7. The polarization image dehazing method based on an unsupervised weighted deep model according to claim 1, characterized in that obtaining the first image and the second image from the micro-polarization array image specifically comprises:
    obtaining from the micro-polarization array image the polarization image information of the four different polarization directions 0°, 45°, 90° and 135°, with the following formulas:
    I0(x, y) = Iorig(2x, 2y);
    I45(x, y) = Iorig(2x, 2y+1);
    I90(x, y) = Iorig(2x+1, 2y+1);
    I135(x, y) = Iorig(2x+1, 2y);
    where x, y are pixel position indices and Iorig is the micro-polarization array image;
    obtaining the Stokes parameters S0, S1 and S2 according to Stokes theory, with the following formulas:
    I(0°) = (S0 + S1)/2
    I(45°) = (S0 + S2)/2
    I(90°) = (S0 − S1)/2
    I(135°) = (S0 − S2)/2
    that is, I(θ) = (S0 + S1·cos2θ + S2·sin2θ)/2, where θ is the polarization angle 0°, 45°, 90° or 135°; substituting the four directions then gives S0, S1, S2, the degree of polarization DoP and the angle of polarization AoP, with the following formulas:
    S0 = I(0) + I(90)
    S1 = I(0) − I(90)
    S2 = I(45) − I(135)
    DoP = √(S1² + S2²)/S0
    AoP = (1/2)·arctan(S2/S1)
    the degree-of-polarization image DoP is taken as the first image I1, and the atmosphere image S0 is taken as the second image I2.
  8. A polarization image dehazing device based on an unsupervised weighted deep model, characterized by comprising:
    an image acquisition module configured to acquire a micro-polarization array image and obtain a first image and a second image from the micro-polarization array image;
    a model construction and training module configured to construct an unsupervised weighted deep model and train it to obtain the trained unsupervised weighted deep model, wherein the unsupervised weighted deep model comprises an encoding layer, a fusion layer, a decoding layer and a weight measurement layer; the encoding layer, the fusion layer and the decoding layer are connected in sequence; the encoding layer comprises a convolutional layer and a DenseNet network; the first image and the second image are respectively input into the encoding layer to obtain a first feature and a second feature; the first feature and the second feature are input into the fusion layer for feature splicing to obtain a spliced feature; the spliced feature is input into the decoding layer, which outputs a dehazed polarization image; and during training, the weight measurement layer performs unsupervised training according to the dehazed polarization image and the first image and the second image;
    an application module configured to, in application, input the first image and the second image into the trained unsupervised weighted deep model and obtain the dehazed polarization image through the encoding layer, the fusion layer and the decoding layer.
  9. The polarization image dehazing device based on an unsupervised weighted deep model according to claim 8, characterized in that the weight measurement layer comprises a feature extraction part, an information measurement part, an information preservation part and a loss function part: the first image and the second image are respectively input into the feature extraction part to obtain a first feature map and a second feature map; these are respectively input into the information measurement part, and their gradients are evaluated to obtain a first evaluation value and a second evaluation value; the evaluation values are respectively input into the information preservation part to obtain a first weight and a second weight; in the loss function part, the first weight and the second weight are combined with the structural similarity between the dehazed polarization image and the first image and the second image respectively to calculate a similarity loss, and the first weight and the second weight are combined with the mean square error between the dehazed polarization image and the first image and the second image respectively to calculate a mean square error loss.
  10. An electronic device, comprising:
    one or more processors;
    a storage device for storing one or more programs,
    wherein, when the one or more programs are executed by the one or more processors, the one or more processors implement the method according to any one of claims 1-7.
  11. A computer-readable storage medium having a computer program stored thereon, characterized in that, when executed by a processor, the program implements the method according to any one of claims 1-7.
PCT/CN2023/106584 2022-09-28 2023-07-10 Polarization image dehazing method and device based on an unsupervised weighted deep model WO2024066654A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202211186406.4A CN115293992B (zh) 2022-09-28 2022-09-28 Polarization image dehazing method and device based on an unsupervised weighted deep model
CN202211186406.4 2022-09-28

Publications (1)

Publication Number Publication Date
WO2024066654A1 true WO2024066654A1 (zh) 2024-04-04

Family

ID=83834174

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2023/106584 WO2024066654A1 (zh) 2022-09-28 2023-07-10 Polarization image dehazing method and device based on an unsupervised weighted deep model

Country Status (2)

Country Link
CN (1) CN115293992B (zh)
WO (1) WO2024066654A1 (zh)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115293992B (zh) 2022-09-28 2022-12-30 泉州装备制造研究所 Polarization image dehazing method and device based on an unsupervised weighted deep model
CN117911282B (zh) 2024-03-19 2024-05-28 华中科技大学 Construction method and application of an image dehazing model

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20110043603A1 (en) * 2006-01-18 2011-02-24 Technion Research & Development Foundation Ltd. System And Method For Dehazing
CN110544213A (zh) * 2019-08-06 2019-12-06 天津大学 Image dehazing method based on fusion of global and local features
CN114004760A (zh) * 2021-10-22 2022-02-01 北京工业大学 Image dehazing method, electronic device, storage medium and computer program product
CN114841885A (zh) * 2022-05-10 2022-08-02 中国矿业大学(北京) Dehazing fusion processing method based on polarization image data
CN115293992A (zh) * 2022-09-28 2022-11-04 泉州装备制造研究所 Polarization image dehazing method and device based on an unsupervised weighted deep model

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112419163B (zh) * 2019-08-21 2023-06-30 中国人民解放军火箭军工程大学 Weakly supervised single-image dehazing method based on prior knowledge and deep learning
CN110570371B (zh) * 2019-08-28 2023-08-29 天津大学 Image dehazing method based on multi-scale residual learning
AU2020100274A4 (en) * 2020-02-25 2020-03-26 Huang, Shuying DR A Multi-Scale Feature Fusion Network based on GANs for Haze Removal
CN111738942A (zh) * 2020-06-10 2020-10-02 南京邮电大学 Generative adversarial network image dehazing method incorporating a feature pyramid
CN113674190B (zh) * 2021-08-20 2022-09-16 中国人民解放军国防科技大学 Image fusion method and device based on a densely connected generative adversarial network


Also Published As

Publication number Publication date
CN115293992B (zh) 2022-12-30
CN115293992A (zh) 2022-11-04

Similar Documents

Publication Publication Date Title
WO2024066654A1 (zh) Polarization image dehazing method and device based on an unsupervised weighted deep model
Zhang et al. Multi-scale single image dehazing using perceptual pyramid deep network
Sun et al. Deep pixel‐to‐pixel network for underwater image enhancement and restoration
CN110570371A (zh) Image dehazing method based on multi-scale residual learning
Zhao et al. Pyramid global context network for image dehazing
CN111079764B (zh) Low-illumination license plate image recognition method and device based on deep learning
WO2023082453A1 (zh) Image processing method and device
CN113313832B (zh) Semantic generation method and device for a three-dimensional model, storage medium and electronic device
Shi et al. Polarization-based haze removal using self-supervised network
Yeh et al. Single image dehazing via deep learning-based image restoration
He et al. Unsupervised haze removal for aerial imagery based on asymmetric contrastive CycleGAN
Wang et al. Multi‐scale network for remote sensing segmentation
Cimtay Smart and real-time image dehazing on mobile devices
CN114444653A (zh) Method and system for evaluating the impact of data augmentation on deep learning model performance
Chen et al. Attentive generative adversarial network for removing thin cloud from a single remote sensing image
Liu et al. Image dehazing method of transmission line for unmanned aerial vehicle inspection based on densely connection pyramid network
Han Texture image compression algorithm based on self-organizing neural network
Yi et al. MSNet: A novel end‐to‐end single image dehazing network with multiple inter‐scale dense skip‐connections
CN112052863B (zh) Image detection method and device, computer storage medium, and electronic device
CN115953312A (zh) Joint dehazing and detection method, device and storage medium based on a single image
CN113744152A (zh) Tidal water image denoising method, terminal, and computer-readable storage medium
Gao et al. CP-Net: Channel attention and pixel attention network for single image dehazing
CN113763259A (zh) Image dehazing method and device
Siddiqua et al. MACGAN: an all-in-one image restoration under adverse conditions using multidomain attention-based conditional GAN
Du et al. Dehazing Network: Asymmetric Unet Based on Physical Model

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 23869901

Country of ref document: EP

Kind code of ref document: A1