CN114782297B - Image fusion method based on motion-friendly multi-focus fusion network - Google Patents


Info

Publication number
CN114782297B
Authority
CN
China
Prior art keywords
image
focus
input
motion
representing
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202210396277.5A
Other languages
Chinese (zh)
Other versions
CN114782297A (en)
Inventor
刘帅成 (Liu Shuaicheng)
郑梓楠 (Zheng Zinan)
陈才 (Chen Cai)
章程 (Zhang Cheng)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
University of Electronic Science and Technology of China
Original Assignee
University of Electronic Science and Technology of China
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by University of Electronic Science and Technology of China filed Critical University of Electronic Science and Technology of China
Priority to CN202210396277.5A priority Critical patent/CN114782297B/en
Publication of CN114782297A publication Critical patent/CN114782297A/en
Application granted granted Critical
Publication of CN114782297B publication Critical patent/CN114782297B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G06T5/50 Image enhancement or restoration using two or more images, e.g. averaging or subtraction (G06T5/00 Image enhancement or restoration)
    • G06N3/02 Neural networks; G06N3/08 Learning methods (G06N3/00 Computing arrangements based on biological models)
    • G06T2207/10024 Color image (G06T2207/10 Image acquisition modality)
    • G06T2207/20081 Training; Learning (G06T2207/20 Special algorithmic details)
    • G06T2207/20084 Artificial neural networks [ANN]
    • G06T2207/20221 Image fusion; Image merging (G06T2207/20212 Image combination)


Abstract

The invention relates to the technical field of image enhancement and computer vision, and discloses an image fusion method based on a motion-friendly multi-focus fusion network, comprising the following steps. Step S1: shoot two partially focused images with a camera, one with the foreground in focus and the background blurred, the other with the foreground blurred and the background in focus. Step S2: judge whether the camera moved between the two images; if so, enter step S3, and if not, enter step S3 or step S4; likewise judge whether an object in the two images moved; if so, enter step S3, and if not, enter step S3 or step S4. Step S3: fuse the two input photos with the motion-friendly multi-focus fusion network, which directly outputs two fused photos in one-to-one correspondence with the input images. Step S4: fuse the two input photos with a fusion network and output two fused photos in one-to-one correspondence with the input images.

Description

Image fusion method based on motion-friendly multi-focus fusion network
Technical Field
The invention relates to the technical field of image enhancement and computer vision, and in particular to an image fusion method based on a motion-friendly multi-focus fusion network, which achieves effective multi-focus image fusion by specially processing multi-focus images that contain motion.
Background
Due to hardware limitations, the depth of field of an optical lens is finite: only objects within the depth of field appear sharp in a photograph, so at most one of the near scene and the far scene can be rendered clearly. However, a sharp image is more easily observed and perceived by human vision than a blurred one, and a sharp all-in-focus image provides more content and detail. Multi-focus image fusion is a technique that generates an everywhere-in-focus picture from a set of partially focused pictures taken of the same scene. It is an effective way to extend the depth of field of an optical lens, is of great significance in digital photography, optical microscopy, integral imaging and other fields, and is an important area of image processing.
The two source images in multi-focus fusion must be identical in every respect except for which region is in focus, which places very high demands on the shooting scene, shooting conditions and shooting equipment. In real life, however, most photos are taken with handheld devices: the photographed object may move, and the camera may shake in the hand. At present, when the two captured images are not perfectly "matched", i.e., when such motion is present, no corresponding technique exists to fuse the two source images effectively, and existing focused-image fusion assumes a static scene. An image fusion method is therefore needed that uses deep learning to perform multi-focus fusion on images containing motion.
Disclosure of Invention
The invention aims to provide an image fusion method based on a motion-friendly multi-focus fusion network, which achieves effective multi-focus image fusion by specially processing multi-focus images that contain motion.
The invention is realized by the following technical scheme: an image fusion method based on a motion-friendly multi-focus fusion network comprises the following steps:
step S1: shooting two partially focused images by using a camera, wherein one image has the foreground in focus and the background blurred, and the other has the foreground blurred and the background in focus;
step S2: aligning the intermediate features of the images by using a feature alignment module; judging whether the camera moved between the two images, if so, entering step S3, and if not, entering step S3 or step S4; judging whether an object moved between the two images, if so, entering step S3, and if not, entering step S3 or step S4;
step S3: fusing the two input photos by using the motion-friendly multi-focus fusion network MTMFNet, which directly outputs two fused photos in one-to-one correspondence with the input images;
step S4: fusing the two input photos by using a fusion network and outputting two fused photos in one-to-one correspondence with the input images.
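The routing logic of steps S2-S4 can be sketched as follows. The function and parameter names here (`route_fusion`, `has_camera_motion`, `has_object_motion`, the two fusion callables) are hypothetical placeholders, since the patent does not specify how motion is detected:

```python
def route_fusion(img_a, img_b, has_camera_motion, has_object_motion,
                 fuse_motion_friendly, fuse_plain):
    """Route a multi-focus image pair to the appropriate fusion network.

    If either the camera or the photographed object moved between the two
    shots, the motion-friendly network (step S3) must be used; otherwise
    either network is acceptable, and this sketch defaults to the plain
    one (step S4).
    """
    if has_camera_motion or has_object_motion:
        return fuse_motion_friendly(img_a, img_b)   # step S3
    return fuse_plain(img_a, img_b)                 # step S4


# Minimal demonstration with stub fusion functions.
fused = route_fusion("near-focus", "far-focus",
                     has_camera_motion=True, has_object_motion=False,
                     fuse_motion_friendly=lambda a, b: ("MTMFNet", a, b),
                     fuse_plain=lambda a, b: ("plain", a, b))
```

Since motion was flagged, the pair is routed to the motion-friendly branch.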
To better implement the present invention, further, the feature alignment module includes a deconvolution layer.
To better implement the present invention, further, the motion-friendly multi-focus fusion network MTMFNet in step S3 includes:
the motion-friendly multi-focus fusion network MTMFNet comprises a dataset;
the dataset includes a DAVIS video dataset and a Cityscapes street view image dataset.
To better implement the invention, further, the dataset comprises:
the data set selects a moving object in the DAVIS video data set as the image foreground, and selects a street-view image in Cityscapes as the image background;
the image foreground is a non-rigid moving object with random direction and random amplitude, and the image background is likewise subjected to a non-rigid transformation with random direction and random amplitude.
In order to better implement the present invention, further, the calculation method for fusing the two input photos by the motion-friendly multi-focus fusion network MTMFNet in step S3 includes:
in training the motion-friendly multi-focus fusion network MTMFNet, the loss function used is:

L = L_spa + λ·L_freq

wherein

L_spa = (1/(3·H·W)) Σ_c Σ_{x,y} ( |O1(x,y,c) − G1(x,y,c)| + |O2(x,y,c) − G2(x,y,c)| )

L_freq = (1/(3·H·W)) Σ_c Σ_{u,v} ( |F(O1)(u,v,c) − F(G1)(u,v,c)| + |F(O2)(u,v,c) − F(G2)(u,v,c)| )

wherein L_spa represents the spatial-domain loss and L_freq represents the frequency-domain loss; O1 and O2 represent the output images corresponding to input image 1 and input image 2, and G1 and G2 represent the ground-truth images corresponding to input image 1 and input image 2; F(O1) and F(G1) represent the Fourier transforms of the output image and the ground-truth image corresponding to input image 1, and F(O2) and F(G2) those corresponding to input image 2; H and W represent the height and width of the input image, c represents the channel index of the RGB image, and λ is a weight balancing the two terms.
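A minimal NumPy sketch of this two-term loss. The balancing weight `lam` and the exact normalization are assumptions, as the published text does not preserve them:

```python
import numpy as np

def mtmf_loss(out1, gt1, out2, gt2, lam=0.1):
    """Spatial-domain L1 loss plus frequency-domain L1 loss on the 2-D
    Fourier transforms, summed over both output/ground-truth pairs.

    All images are (H, W, 3) arrays; lam is an assumed balancing weight.
    """
    def l1(a, b):
        return np.mean(np.abs(a - b))

    def freq_l1(a, b):
        # FFT over the spatial axes, channel-wise; L1 on the complex
        # magnitude of the difference.
        fa = np.fft.fft2(a, axes=(0, 1))
        fb = np.fft.fft2(b, axes=(0, 1))
        return np.mean(np.abs(fa - fb))

    l_spa = l1(out1, gt1) + l1(out2, gt2)
    l_freq = freq_l1(out1, gt1) + freq_l1(out2, gt2)
    return l_spa + lam * l_freq
```

The loss is zero when both outputs exactly match their ground truths, and grows with both pixel-space and spectrum-space deviations.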
In order to better implement the present invention, further, the fusion network in step S4 includes a multi-focus image fusion network.
Compared with the prior art, the invention has the following advantages:
(1) Existing focused-image fusion assumes a static scene; when the two captured images are not perfectly matched, i.e., when motion is present, no corresponding technique existed to fuse the two source images effectively in a multi-focus manner. After obtaining the images, the invention judges whether an object in them has moved and whether the camera has shaken: if so, the motion-friendly multi-focus fusion network must be used for processing; if not, images of static objects without camera shake can be processed by either the motion-friendly multi-focus fusion network or another focus-fusion network.
Drawings
The invention is further described below with reference to the drawings and embodiments, and all inventive concepts of the invention are to be regarded as disclosed and protected.
Fig. 1 is a schematic diagram of the structure of the motion-friendly multi-focus fusion network MTMFNet in the image fusion method based on the motion-friendly multi-focus fusion network.
Fig. 2 is a schematic diagram comparing multi-focus image fusion results of the image fusion method based on the motion-friendly multi-focus fusion network.
Fig. 3 is a schematic flow chart of the image fusion method based on the motion-friendly multi-focus fusion network.
Detailed Description
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, they are described below clearly and completely with reference to the accompanying drawings. It should be understood that the described embodiments are only some, not all, embodiments of the present invention and should not be regarded as limiting the scope of protection. All other embodiments obtained by a person of ordinary skill in the art without creative effort, based on the embodiments of the present invention, fall within the protection scope of the present invention.
In the description of the present invention, it should be noted that, unless explicitly stated and limited otherwise, the terms "disposed," "connected," and "coupled" are to be construed broadly: a connection may be fixed, detachable, or integral; mechanical or electrical; direct, indirect through an intermediate medium, or a communication link between two elements. The specific meaning of the above terms in the present invention will be understood in specific cases by a person of ordinary skill in the art.
Example 1:
the image fusion method based on the motion-friendly multi-focus fusion network of this embodiment, as shown in figs. 1-3, comprises the following steps:
step S1: shooting two partially focused images by using a camera, wherein one image has the foreground in focus and the background blurred, and the other has the foreground blurred and the background in focus;
step S2: judging whether the camera moved between the two images, if so, entering step S3, and if not, entering step S3 or step S4; judging whether an object moved between the two images, if so, entering step S3, and if not, entering step S3 or step S4;
step S3: fusing the two input photos by using the motion-friendly multi-focus fusion network MTMFNet, which directly outputs two fused photos in one-to-one correspondence with the input images;
step S4: fusing the two input photos by using a fusion network and outputting two fused photos in one-to-one correspondence with the input images.
In this embodiment, as shown in fig. 1, the input is a pair of multi-focus images containing motion. The outputs are two all-in-focus images, in one-to-one correspondence with the inputs. Effective multi-focus image fusion is achieved by specially processing the multi-focus images that contain motion; the aligned image features are used to fuse the multi-focus image features, thereby achieving image fusion. When a user holds a photographing apparatus to capture images, multi-focus images containing motion may be obtained. Through processing by the deep-learning network, clear all-in-focus images can be obtained.
Two partially focused photographs are taken with a camera: one has the foreground in focus and the background blurred, the other has the foreground blurred and the background in focus. Slight camera motion between the two shots is permitted.
The two input photos are fused by using the motion-friendly multi-focus fusion network MTMFNet (Motion Tolerant Multi-focus Fusion Net); the network directly outputs two fused photos in one-to-one correspondence with the input images.
The motion-friendly multi-focus fusion network MTMFNet of the invention performs multi-focus fusion on two input multi-focus images that contain motion. MTMFNet consists of four modules: feature extraction, feature alignment, feature fusion and image restoration. Because the photographed object moves between the two input photos, the network would otherwise struggle to learn; the invention therefore uses the feature alignment module to align the intermediate features of the images, making it easier for the network to grasp the key points.
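The four-stage structure described above can be sketched as a simple forward chain; the stage callables here are placeholders standing in for the convolutional modules of the actual network, and the function name `mtmfnet_forward` is an illustrative assumption:

```python
def mtmfnet_forward(img1, img2, extract, align, fuse, restore):
    """Skeleton of the MTMFNet pipeline: extract features from each input,
    align the two feature sets, fuse them, and restore two all-in-focus
    images (one per input). The four callables are placeholder stages."""
    f1, f2 = extract(img1), extract(img2)
    f1a, f2a = align(f1, f2)          # feature alignment absorbs the motion
    fused = fuse(f1a, f2a)
    return restore(fused, img1), restore(fused, img2)

# Trivial stand-in stages to show the data flow.
out1, out2 = mtmfnet_forward(
    "A", "B",
    extract=lambda x: ("feat", x),
    align=lambda a, b: (a, b),
    fuse=lambda a, b: (a, b),
    restore=lambda fused, ref: ("allfocus", ref),
)
```

The key design point is that alignment happens on intermediate features rather than raw pixels, so the later fusion stage sees a motion-compensated pair.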
Real object motion is random, flexible and of indefinite range, and the information the input images provide is too rich and complex for a neural network to learn its characteristics easily. The invention therefore simplifies the motion to reduce this complexity, approximating the flexible motion of foreground and background as rigid motion so as to lower the learning difficulty of the multi-focus image-fusion network. Existing focused-picture fusion methods select images of objects that do not move and do not tolerate camera shake; by contrast, the method based on the motion-friendly multi-focus fusion network proposed here tolerates such motion.
Example 2:
This embodiment is further optimized based on embodiment 1. Fig. 2 shows a comparison of multi-focus image fusion results. The data set for training the MTMFNet network is a multi-focus image data set with motion, built from the DAVIS video data set and the Cityscapes street-view image data set. A moving object from the DAVIS video data set serves as the image foreground and a street-view image from Cityscapes as the image background; to simulate moving images under real conditions, non-rigid moving objects with random direction and random amplitude are selected, and the background is likewise transformed with non-rigid motion of random direction and random amplitude. The training set has 17,915 image pairs and the test set has 2,083 image pairs. The names DAVIS and Cityscapes are proper nouns with no Chinese equivalent.
Other portions of this embodiment are the same as those of embodiment 1, and thus will not be described in detail.
Example 3:
This embodiment is further optimized based on embodiment 1 or 2. The loss function used in training the neural network MTMFNet is:

L = L_spa + λ·L_freq

wherein

L_spa = (1/(3·H·W)) Σ_c Σ_{x,y} ( |O1(x,y,c) − G1(x,y,c)| + |O2(x,y,c) − G2(x,y,c)| )

L_freq = (1/(3·H·W)) Σ_c Σ_{u,v} ( |F(O1)(u,v,c) − F(G1)(u,v,c)| + |F(O2)(u,v,c) − F(G2)(u,v,c)| )

wherein L_spa represents the spatial-domain loss and L_freq represents the frequency-domain loss; O1 and O2 represent the output images corresponding to input image 1 and input image 2, and G1 and G2 represent the ground-truth images corresponding to input image 1 and input image 2; F(O1) and F(G1) represent the Fourier transforms of the output image and the ground-truth image corresponding to input image 1, and F(O2) and F(G2) those corresponding to input image 2; H and W represent the height and width of the input image, c represents the channel index of the RGB image, and λ is a weight balancing the two terms.
In the feature alignment module, the two input images are passed through convolution layers to obtain pyramid features; the features of each level are then recovered step by step through deconvolution and up-sampling, and alignment is finally achieved by concatenation. Among the fusion networks considered, only the motion-friendly multi-focus fusion network achieves this alignment.
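A rough NumPy illustration of the pyramid construction and multi-level concatenation described above, with average pooling standing in for the strided convolutions and nearest-neighbour upsampling standing in for deconvolution (both simplifications, and all function names, are assumptions):

```python
import numpy as np

def build_pyramid(feat, levels=3):
    """Build a feature pyramid by repeated 2x2 average pooling."""
    pyr = [feat]
    for _ in range(levels - 1):
        f = pyr[-1]
        h, w = f.shape[0] // 2, f.shape[1] // 2
        pooled = f[:2*h, :2*w].reshape(h, 2, w, 2, -1).mean(axis=(1, 3))
        pyr.append(pooled)
    return pyr

def upsample(f, target_hw):
    """Nearest-neighbour upsampling, a stand-in for deconvolution."""
    h, w = target_hw
    ry, rx = h // f.shape[0], w // f.shape[1]
    return np.repeat(np.repeat(f, ry, axis=0), rx, axis=1)

def align_by_concat(feat1, feat2, levels=3):
    """Bring every pyramid level of both images back to full resolution and
    concatenate along the channel axis, as a crude proxy for the cascaded
    alignment the patent describes."""
    p1, p2 = build_pyramid(feat1, levels), build_pyramid(feat2, levels)
    full = feat1.shape[:2]
    ups = [upsample(f, full) for f in p1 + p2]
    return np.concatenate(ups, axis=-1)

f1 = np.random.default_rng(1).random((32, 32, 4))
f2 = np.random.default_rng(2).random((32, 32, 4))
aligned = align_by_concat(f1, f2)
```

With 3 levels and 4-channel features per image, the concatenated result stacks 6 feature maps, giving 24 channels at full resolution.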
The foregoing description is only a preferred embodiment of the present invention, and is not intended to limit the present invention in any way, and any simple modification and equivalent variation of the above embodiment according to the technical matter of the present invention falls within the scope of the present invention.

Claims (4)

1. An image fusion method based on a motion-friendly multi-focus fusion network, characterized by comprising the following steps: step S1: shooting two partially focused images by using a camera, wherein one image has the foreground in focus and the background blurred, and the other has the foreground blurred and the background in focus; step S2: aligning the intermediate features of the images by using a feature alignment module; judging whether the camera moved between the two images, if so, entering step S3, and if not, entering step S3 or step S4; judging whether an object moved between the two images, if so, entering step S3, and if not, entering step S3 or step S4; step S3: fusing the two input photos by using the motion-friendly multi-focus fusion network MTMFNet, which directly outputs two fused photos in one-to-one correspondence with the input images;
the motion-friendly multi-focus fusion network MTMFNet comprises a feature extraction module, a feature alignment module, a feature fusion module and an image recovery module which are connected in sequence;
the feature extraction module includes a feature pyramid block pyramid feature block;
the characteristic alignment Module comprises a deformable alignment Module PCD Module and a deconvolution lamination conv which are connected in sequence;
the feature fusion module comprises an expanded convolution residual error intensive block DRDB and a self-adaptive convolution layer adaptive conv which are connected in sequence;
the image restoration module includes an image restoration module image reconstruction;
step S4: fusing the two input photos by using a fusion network, and outputting the two fused photos, wherein the two fused photos correspond to the input images one by one;
the step S3 includes: the motion-friendly multi-focus fusion network MTMFNet comprises a dataset;
the data set comprises a DAVIS video data set and a Cityscapes street view image data set;
the calculation method for fusing the two input photos by the motion-friendly multi-focus fusion network MTMFNet in the step S3 comprises the following steps:
in training the motion-friendly multi-focus fusion network MTMFNet, the loss function used is L = L_spa + λ·L_freq; wherein L_spa = (1/(3·H·W)) Σ_c Σ_{x,y} ( |O1(x,y,c) − G1(x,y,c)| + |O2(x,y,c) − G2(x,y,c)| ) and L_freq = (1/(3·H·W)) Σ_c Σ_{u,v} ( |F(O1)(u,v,c) − F(G1)(u,v,c)| + |F(O2)(u,v,c) − F(G2)(u,v,c)| ); wherein L_spa represents the spatial-domain loss and L_freq represents the frequency-domain loss; O1 and O2 represent the output images corresponding to input image 1 and input image 2; G1 and G2 represent the corresponding ground-truth images; F(·) denotes the Fourier transform; H and W represent the height and width of the input image; c represents the channel index of the RGB image; and λ is a weight balancing the two terms.
2. The method of claim 1, wherein the feature alignment module comprises a deconvolution layer.
3. The image fusion method based on a motion-friendly multi-focus fusion network of claim 2, wherein the dataset comprises: the data set selects a moving object in the DAVIS video data set as the image foreground, and selects a street-view image in Cityscapes as the image background; the image foreground is a non-rigid moving object with random direction and random amplitude, and the image background is likewise subjected to a non-rigid transformation with random direction and random amplitude.
4. The method of claim 1, wherein the fusion network in step S4 comprises a multi-focus image fusion network.
CN202210396277.5A 2022-04-15 2022-04-15 Image fusion method based on motion-friendly multi-focus fusion network Active CN114782297B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210396277.5A CN114782297B (en) 2022-04-15 2022-04-15 Image fusion method based on motion-friendly multi-focus fusion network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210396277.5A CN114782297B (en) 2022-04-15 2022-04-15 Image fusion method based on motion-friendly multi-focus fusion network

Publications (2)

Publication Number Publication Date
CN114782297A CN114782297A (en) 2022-07-22
CN114782297B true CN114782297B (en) 2023-12-26

Family

ID=82428279

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210396277.5A Active CN114782297B (en) 2022-04-15 2022-04-15 Image fusion method based on motion-friendly multi-focus fusion network

Country Status (1)

Country Link
CN (1) CN114782297B (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110533623A (Lanzhou Jiaotong University) * 2019-09-06 2019-12-03 兰州交通大学 A full convolutional neural network multi-focus image fusion method based on supervised learning
CN110569832A (en) * 2018-11-14 2019-12-13 安徽艾睿思智能科技有限公司 text real-time positioning and identifying method based on deep learning attention mechanism
CN112215788A (en) * 2020-09-15 2021-01-12 湖北工业大学 Multi-focus image fusion algorithm based on improved generation countermeasure network
CN112379231A (en) * 2020-11-12 2021-02-19 国网浙江省电力有限公司信息通信分公司 Equipment detection method and device based on multispectral image


Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
占哲琦 (Zhan Zheqi); 陈鹏 (Chen Peng); 桑永胜 (Sang Yongsheng); 彭德中 (Peng Dezhong). Application of a deep neural network fusing dual attention in UAV target detection. 现代计算机 (Modern Computer), 2020, (11): 30-35. *
郑梓楠 (Zheng Zinan). Multi-focus image fusion based on deep learning. 中国优秀硕士学位论文全文数据库 信息科技辑 (China Master's Theses Full-text Database, Information Science and Technology), 2023, (1): I138-2300. *

Also Published As

Publication number Publication date
CN114782297A (en) 2022-07-22

Similar Documents

Publication Publication Date Title
Agrawal et al. Coded exposure deblurring: Optimized codes for PSF estimation and invertibility
Tai et al. Image/video deblurring using a hybrid camera
JP5468404B2 (en) Imaging apparatus and imaging method, and image processing method for the imaging apparatus
US20220148297A1 (en) Image fusion method based on fourier spectrum extraction
CN102369722A (en) Imaging device and method, and image processing method for imaging device
Cao et al. Ntire 2023 challenge on 360deg omnidirectional image and video super-resolution: Datasets, methods and results
CN108024058A (en) Image virtualization processing method, device, mobile terminal and storage medium
CN112651911A (en) High dynamic range imaging generation method based on polarization image
CN115115516A (en) Real-world video super-resolution algorithm based on Raw domain
CN110278366B (en) Panoramic image blurring method, terminal and computer readable storage medium
CN111489300B (en) Screen image Moire removing method based on unsupervised learning
CN114202472A (en) High-precision underwater imaging method and device
CN112150363B (en) Convolutional neural network-based image night scene processing method, computing module for operating method and readable storage medium
CN107295261B (en) Image defogging method and device, storage medium and mobile terminal
CN114782297B (en) Image fusion method based on motion-friendly multi-focus fusion network
Xue Blind image deblurring: a review
CN109257540A (en) Take the photograph photography bearing calibration and the camera of lens group more
CN107392986A (en) A kind of image depth rendering intent based on gaussian pyramid and anisotropic filtering
Kim et al. Light field angular super-resolution using convolutional neural network with residual network
Wong A new method for creating a depth map for camera auto focus using an all in focus picture and 2D scale space matching
CN112532856B (en) Shooting method, device and system
CN113935910A (en) Image fuzzy length measuring method based on deep learning
Nagahara et al. Programmable aperture camera using LCoS
CN113379624A (en) Image generation method, training method, device and equipment of image generation model
JP6006506B2 (en) Image processing apparatus, image processing method, program, and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant