WO2021208600A1 - Image processing method, smart device and computer-readable storage medium - Google Patents

Image processing method, smart device and computer-readable storage medium

Info

Publication number
WO2021208600A1
Authority
WO
WIPO (PCT)
Prior art keywords
image
feature map
moiré
processing
model
Prior art date
Application number
PCT/CN2021/077544
Other languages
English (en)
French (fr)
Inventor
柯戈扬
黄飞
熊唯
Original Assignee
腾讯科技(深圳)有限公司 (Tencent Technology (Shenzhen) Co., Ltd.)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 腾讯科技(深圳)有限公司 filed Critical 腾讯科技(深圳)有限公司
Priority to EP21789264.5A priority Critical patent/EP4030379A4/en
Priority to JP2022533195A priority patent/JP7357998B2/ja
Publication of WO2021208600A1 publication Critical patent/WO2021208600A1/zh
Priority to US17/711,852 priority patent/US20220222786A1/en

Classifications

    • G06T5/77Retouching; Inpainting; Scratch removal
    • G06T5/70Denoising; Smoothing
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T3/00Geometric image transformation in the plane of the image
    • G06T3/40Scaling the whole image or part thereof
    • G06T3/4038Scaling the whole image or part thereof for image mosaicing, i.e. plane images composed of plane sub-images
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T3/00Geometric image transformation in the plane of the image
    • G06T3/40Scaling the whole image or part thereof
    • G06T3/4053Super resolution, i.e. output image resolution higher than sensor resolution
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00Image enhancement or restoration
    • G06T5/50Image enhancement or restoration by the use of more than one image, e.g. averaging, subtraction
    • G06T5/60Image enhancement or restoration using machine learning, e.g. neural networks
    • G06T5/73Deblurring; Sharpening
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/90Determination of colour characteristics
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/40Extraction of image or video features
    • G06V10/56Extraction of image or video features relating to colour
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/74Image or video pattern matching; Proximity measures in feature spaces
    • G06V10/75Organisation of the matching processes, e.g. simultaneous or sequential comparisons of image or video features; Coarse-fine approaches, e.g. multi-scale approaches; using context analysis; Selection of dictionaries
    • G06V10/751Comparing pixel values or logical combinations thereof, or feature values having positional relevance, e.g. template matching
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/7715Feature extraction, e.g. by transforming the feature space, e.g. multi-dimensional scaling [MDS]; Mappings, e.g. subspace methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/774Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
    • G06V10/7747Organisation of the process, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/776Validation; Performance evaluation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/047Probabilistic or stochastic networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2200/00Indexing scheme for image data processing or generation, in general
    • G06T2200/32Indexing scheme for image data processing or generation, in general involving image mosaicing
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/10Image acquisition modality
    • G06T2207/10004Still image; Photographic image
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/10Image acquisition modality
    • G06T2207/10024Color image
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20016Hierarchical, coarse-to-fine, multiscale or multiresolution image processing; Pyramid transform
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20081Training; Learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20084Artificial neural networks [ANN]

Definitions

  • This application relates to the field of image processing technology, and in particular to an image processing method, a smart device, and a computer-readable storage medium.
  • In the related art, an edge extraction algorithm is used to determine the moiré-affected parts of an image in order to remove the moiré.
  • However, such an algorithm is complicated to implement, and its moiré removal effect is poor.
  • the present application provides an image processing method, smart device, and computer-readable storage medium, which can remove moiré in an image more conveniently and comprehensively.
  • the present application provides an image processing method, which is executed by a smart device, and the method includes:
  • the image processing model is a network model trained in advance based on a moiré training data set
  • The image processing model includes a multi-band module for processing the original image to obtain an N-layer Laplacian pyramid of the original image, and for obtaining a first processing result feature map based on the feature maps corresponding to the N spatial frequency bands of the Laplacian pyramid; the target image is obtained according to the first processing result feature map, and N is a positive integer greater than or equal to 2.
  • this application also provides an image processing device, which includes:
  • the acquisition module is used to acquire the original image
  • a processing module configured to run an image processing model to perform moiré removal processing on the original image to obtain a target image
  • the image processing model is a network model trained in advance based on a moiré training data set
  • The image processing model includes a multi-band module for processing the original image to obtain an N-layer Laplacian pyramid of the original image, and for obtaining a first processing result feature map based on the feature maps corresponding to the N spatial frequency bands of the Laplacian pyramid; the target image is obtained according to the first processing result feature map, and N is a positive integer greater than or equal to 2.
  • Correspondingly, the present application also provides a smart device, which includes a storage device and a processor; the storage device stores program instructions for image processing, and the processor invokes the program instructions to implement the above-mentioned image processing method.
  • Correspondingly, the present application also provides a computer-readable storage medium in which program instructions for image processing are stored; when the program instructions are executed by a processor, the above-mentioned image processing method is implemented.
  • the present application also provides a computer program product, the computer program product including instructions, when run on a computer, causes the computer to execute the above-mentioned image processing method.
  • Targeting the different manifestations of moiré at different scales and in different spatial frequency bands, this application designs a multi-scale model based on the Laplacian pyramid.
  • Using the feature maps in the multiple frequency bands of the Laplacian pyramid, an image processing model can be trained that removes moiré more comprehensively across scales and frequency bands, easily achieving a better moiré removal effect.
  • FIG. 1 is a schematic diagram of a flow of using the moiré removal function according to an embodiment of the present application
  • FIG. 2 is a schematic diagram of an application interface of an image processing application according to an embodiment of the present application.
  • FIG. 3 is a schematic diagram of another process of using the moiré removal function according to an embodiment of the present application.
  • Fig. 4 is a schematic diagram of moiré patterns at different scales according to an embodiment of the present application.
  • Fig. 5 is a schematic diagram of moiré patterns in different spatial frequency bands according to an embodiment of the present application.
  • Fig. 6 is a schematic flowchart of an image processing method according to an embodiment of the present application.
  • FIG. 7 is a schematic structural diagram of an image processing model according to an embodiment of the present application.
  • FIG. 8 is a schematic flowchart of another image processing method according to an embodiment of the present application.
  • FIG. 9 is a schematic structural diagram of another image processing model according to an embodiment of the present application.
  • FIG. 10 is a schematic diagram of a process of training an image processing model according to an embodiment of the present application.
  • FIG. 11 is a schematic structural diagram of training an image processing model according to an embodiment of the present application.
  • FIG. 12 is a schematic structural diagram of an image processing apparatus according to an embodiment of the present application.
  • FIG. 13 is a schematic structural diagram of a smart device according to an embodiment of the present application.
  • Moiré is a fence-like texture caused by interference. When the spatial frequency of the pixels of the photosensitive element (such as a charge-coupled device (CCD) sensor or a complementary metal-oxide-semiconductor (CMOS) sensor) of a digital camera, mobile phone, or other smart device with a shooting function is close to the spatial frequency of the stripes in the image displayed on an electronic display screen, moiré is produced in the captured image.
  • This application comprehensively analyzes the characteristics of moiré. Based on the way moiré manifests differently across spatial frequency bands and at multiple scales, the application designs corresponding image processing models and adopts a series of deep learning techniques to eliminate the moiré in images that smart phones, digital cameras, and other smart devices with shooting functions capture when photographing an electronic display screen, and to restore the image quality as far as possible.
  • an image with moiré is input, and the image processing model performs moiré elimination processing to output a desired target image.
  • The model design uses techniques such as per-spatial-frequency-band supervision and multi-task learning, and uses an attention mechanism to restore the color and brightness distribution, achieving a better moiré removal effect.
  • In addition, this application also proposes a practical method of manufacturing training data, so that the model can be trained on the manufactured data, and uses self-supervised learning, multi-scale learning, and generative adversarial network (GAN) training to optimize the training process.
  • a target application can be designed and installed in a smart device.
  • On one hand, the target application can call the camera of the smart device to capture images; on the other hand, it can read locally stored images or download images from the network.
  • An image with moiré, whether captured, read locally, or downloaded, can be displayed to the user directly.
  • Moiré removal processing can then be triggered according to the user's operation, and the image with the moiré removed is displayed to the user.
  • In scenarios such as video communication, the amount of video encoding data generated by moiré can be reduced and the quality of image data transmission guaranteed; user B can see the captured image information more clearly, and can also directly store the clear images transmitted in the video communication, or convert them into editable documents and other data.
  • FIG. 1 is a schematic diagram of a process of using the moiré removal function according to an embodiment of the present application, which includes the following steps:
  • S101 The smart device obtains the original image.
  • The user can operate the smart device so that it obtains an image with moiré.
  • For example, the user can directly photograph an electronic display screen to obtain an image with moiré; such an image with moiré may be called the original image.
  • The smart device can also read an image with moiré locally, or download one from the Internet, so as to obtain the original image.
  • S102 The smart device detects a user operation on the original image.
  • After the smart device obtains the original image with moiré, it can display, on the interface showing the original image, a button for turning moiré removal on or off, so as to detect user operations on the original image. When images with moiré obtained in other ways are displayed on the interface, the same button may also be displayed. In this way, one-click moiré removal is provided for all original images obtained in different ways.
  • S103 The smart device previews the image after removing the moiré.
  • The smart device can preview the image after moiré removal; the image after removing the moiré is called the target image.
  • Other function buttons can also be displayed on the preview interface showing the target image, such as buttons for sharing, saving, and editing again.
  • S104 The smart device executes corresponding processing, such as sharing, storing, or re-editing, based on the received user operations on these buttons.
  • FIG. 2 shows an application interface of an image processing application according to an embodiment of this application.
  • The acquired original image can be displayed on the application interface 201. If moiré is present, a removal button 204 can be displayed. When the user clicks the removal button 204, the target application performs image processing to obtain the target image with the moiré removed, and displays that target image on the application interface 201.
  • In one embodiment, the smart device can analyze the acquired image displayed on the application interface; if the analysis determines that moiré is present in the image, the removal button 204 is displayed. Alternatively, the removal button 204 may be displayed on the application interface 201 as soon as the original image is acquired. FIG. 2 also shows other buttons, such as a button 205 for triggering reacquisition of an image and a button 206 for saving an image.
  • the image processing method of the present application can also be used to remove moiré in the scene of converting an image into an editable document.
  • FIG. 3 is a schematic diagram of another process of using the moiré removal function according to an embodiment of the application, and the process may include the following steps:
  • S301 The smart device obtains the original image.
  • The smart device is deployed with an image-text conversion application that has the moiré removal function.
  • The image-text conversion application refers to an application that converts an image (for example, a photographed slide image) into an editable document.
  • The smart device may start the image-text conversion application in response to the user clicking the application's icon, and then obtain the original image according to the user's operation, for example by photographing a slide presentation displayed on a computer screen, or by extracting an image with moiré from local storage.
  • S302 The smart device receives a user operation generated by a button used to trigger document recognition and reconstruction.
  • S303 The smart device executes the image processing method of this application to eliminate moiré and restore image quality.
  • S304 The smart device determines the text content and typesetting form of the original image based on image recognition.
  • the smart device may specifically determine the text content and typesetting form of the original image based on optical character recognition (OCR).
  • S305 The smart device displays the image content data after text recognition and typesetting, and detects whether a user confirmation operation is received.
  • The smart device can import the text-recognized and typeset image content data into a shared document based on the user's confirmation operation, or store it locally in the form of a document.
  • In one embodiment, the smart device may also receive user editing operations to edit the displayed text-recognized and typeset image content data, such as supplementing or modifying text that was not correctly recognized, adding drawings that were not correctly recognized, and so on.
  • Moiré has multi-scale and multi-spatial-frequency-band characteristics.
  • the same image has different forms of moiré at different scales.
  • the same image with moiré has different forms of expression at 256x256, 384x384, 512x512 and other scales.
  • Figure 4 shows partial views of the same image zoomed to different scales.
  • At each scale the shape of the moiré is different, and the data distribution of the moiré at each scale has its own characteristics.
  • An image processing model that eliminates moiré should therefore be able to handle moiré at multiple scales and in various forms.
  • the resolution of the image on the right is greater than the resolution of the image on the left.
  • the same image has different patterns of moiré in different spatial frequency bands.
  • the shape of the moiré on each spatial frequency band is different.
  • The shape of the moiré differs from band to band; in fact, not every spatial frequency band contains traces of moiré. As can be seen from Figure 5, the data distribution of the moiré also differs across spatial frequency bands: low frequencies tend to show thick color-cast stripes, while high frequencies tend to show fine gray lines.
  • Figures 4 and 5 are only used to illustrate these two characteristics of moiré, i.e., the shapes that the moiré takes; the image content other than the moiré in Figures 4 and 5 is irrelevant.
  • FIG. 6 is a schematic flowchart of an image processing method according to an embodiment of the present application.
  • The method in the embodiment of the present application can be executed by smart devices such as smart phones, tablets, personal computers, and even servers. It uses the moiré characteristics identified in the above analysis to construct and train an optimized image processing model, and this dedicated image processing model is used to remove the moiré.
  • the method of the embodiment of the present application includes the following steps:
  • S601 The smart device obtains the original image.
  • The original image can be an image with moiré obtained when photographing content displayed on an electronic display, or an image with moiré obtained locally or from the Internet.
  • the user can obtain it by running the relevant application.
  • S602 The smart device runs the image processing model to perform moiré removal processing on the original image to obtain a target image.
  • The image processing model is a network model trained in advance based on a moiré training data set. The image processing model includes a multi-band module, which processes the original image to obtain an N-layer Laplacian pyramid of the original image and obtains a first processing result feature map based on the feature maps corresponding to the N spatial frequency bands of the N-layer Laplacian pyramid; the target image is obtained according to the first processing result feature map, where N is a positive integer greater than or equal to 2.
  • the structure of the image processing model of the embodiment of the present application is shown in Figure 7.
  • The Laplacian pyramid is introduced to obtain the feature maps of the multiple spatial frequency bands corresponding to the original image.
  • The feature maps of these frequency bands also reflect the original image at different scales.
  • Moiré patterns at different spatial frequency bands and different scales are therefore fully considered. That is to say, compared with the prior art, the image processing model of this embodiment introduces the Laplacian pyramid and is then trained in a supervised or self-supervised manner to obtain a model suited to removing moiré; compared with traditional models based on edge extraction, this model can remove the moiré in an image more conveniently and comprehensively.
  • Generating the 1/2-scale initial feature map 703 is an optional step.
  • The aforementioned initial feature maps are each subjected to a further convolution to obtain the corresponding intermediate feature maps at the N scales. Then, starting from the intermediate feature map corresponding to the 1/32-scale initial feature map 707, upsampling and convolution are performed layer by layer; each upsampled convolution result is concatenated with the intermediate feature map of the same scale, and the upsampling and convolution of the concatenation result continues, repeating the process until a Laplacian pyramid containing 5 feature maps of different spatial frequency bands and different scales is generated. A minimal code sketch of this construction is given below.
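  • The following is a minimal PyTorch sketch of such a multi-band construction (the module name, channel counts, activation choices, and bilinear upsampling are assumptions for illustration, not the patent's exact architecture):

        import torch
        import torch.nn as nn
        import torch.nn.functional as F

        class MultiBandModule(nn.Module):
            """Builds N initial/intermediate feature maps at scales 1/2 ... 1/2^N,
            then decodes them coarse-to-fine into N band feature maps."""
            def __init__(self, n_levels=5, ch=32):
                super().__init__()
                self.downs = nn.ModuleList(
                    [nn.Conv2d(3 if i == 0 else ch, ch, 3, stride=2, padding=1)
                     for i in range(n_levels)])   # downscaling convolutions
                self.mids = nn.ModuleList(
                    [nn.Conv2d(ch, ch, 3, padding=1) for _ in range(n_levels)])
                self.fuse = nn.ModuleList(
                    [nn.Conv2d(2 * ch, ch, 3, padding=1) for _ in range(n_levels - 1)])

            def forward(self, x):
                feats = []
                for down in self.downs:                       # initial feature maps
                    x = F.relu(down(x))
                    feats.append(x)
                mids = [F.relu(m(f)) for m, f in zip(self.mids, feats)]
                bands = [mids[-1]]                            # top band: smallest scale
                cur = mids[-1]
                for i in range(len(mids) - 2, -1, -1):
                    cur = F.interpolate(cur, scale_factor=2, mode='bilinear',
                                        align_corners=False)  # upsample layer by layer
                    cur = F.relu(self.fuse[i](torch.cat([cur, mids[i]], dim=1)))
                    bands.append(cur)                         # next band, double the scale
                return bands                                  # N band feature maps, coarse to fine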
  • The feature maps corresponding to the N spatial frequency bands included in the Laplacian pyramid are: the feature map 710 corresponding to the fifth spatial frequency band (band4 in Figure 7), the feature map 711 corresponding to the fourth spatial frequency band (band3 in Figure 7), the feature map 712 corresponding to the third spatial frequency band (band2 in Figure 7), and the feature map 713 corresponding to the second spatial frequency band (band1 in Figure 7).
  • The fifth spatial frequency band is the top layer of the Laplacian pyramid.
  • The concatenation (concat) mentioned in this application is an operation in the network structure design of the image processing model that is used to combine features. Specifically, it can refer to fusing the features extracted by multiple convolutional feature extraction branches, or fusing the information of output layers.
  • The concatenation involved in this application can also be replaced with a superposition operation (add), which may simply be a direct element-wise addition of the information. A small example of the two options follows.
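  • For instance, in PyTorch, given two feature maps a and b with the same spatial size:

        import torch
        fused_concat = torch.cat([a, b], dim=1)  # concat: stack along channels; a following conv mixes them
        fused_add = a + b                        # add: direct superposition; channel counts must also match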
  • In one embodiment, the feature maps corresponding to the N spatial frequency bands may be resized and concatenated to obtain the first processing result feature map 716.
  • In another embodiment, the feature maps corresponding to the N spatial frequency bands, together with the predicted feature map 715, may be resized and concatenated to obtain the first processing result feature map 716.
  • During training, a training image is input into the image processing model shown in Figure 7 to obtain the model-predicted feature maps of the N (for example, 5) spatial frequency bands corresponding to the N (for example, 5) layers of the Laplacian pyramid.
  • Laplacian pyramid decomposition is performed directly on the supervision image to obtain the supervision feature maps of the N spatial frequency bands, and the loss function values between the N model-predicted feature maps and the N supervision feature maps are calculated.
  • Based on the calculated N loss function values, the model parameters are optimized, and the resulting image processing model can remove the moiré in an image more conveniently and comprehensively.
  • In one embodiment, the feature maps of each layer of the Laplacian pyramid can be restored to obtain corresponding images.
  • In the image processing model shown in Figure 7, the feature maps corresponding to the spatial frequency bands of the Laplacian pyramid of the training image (i.e., the feature maps of the 5 spatial frequency bands), or those feature maps together with the predicted feature map of the training image, are resized and concatenated to obtain the target image with the moiré removed.
  • In one embodiment, the loss function value between this target image and the supervision image corresponding to the training image is calculated directly, and the relevant convolution parameters in each convolution step shown in Figure 7 are adjusted based on it. A sketch of the per-band supervision follows.
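  • The following is a minimal sketch of the per-band supervision described above (it assumes each model-predicted band is a 3-channel image at its own scale; build_laplacian_pyramid is a helper written here for illustration):

        import torch
        import torch.nn.functional as F

        def build_laplacian_pyramid(img, n_levels):
            # Each level holds the detail lost by one downsample/upsample round;
            # the last level is the low-frequency residual.
            pyramid, cur = [], img
            for _ in range(n_levels - 1):
                down = F.avg_pool2d(cur, 2)
                up = F.interpolate(down, size=cur.shape[-2:], mode='bilinear',
                                   align_corners=False)
                pyramid.append(cur - up)          # band-pass detail at this scale
                cur = down
            pyramid.append(cur)                   # top layer: low-frequency residual
            return pyramid                        # fine to coarse

        def band_supervision_loss(pred_bands, supervision):
            # pred_bands run coarse-to-fine, the pyramid runs fine-to-coarse
            target_bands = build_laplacian_pyramid(supervision, len(pred_bands))
            loss = 0.0
            for pred, tgt in zip(pred_bands, reversed(target_bands)):
                tgt = F.interpolate(tgt, size=pred.shape[-2:], mode='bilinear',
                                    align_corners=False)
                loss = loss + F.mse_loss(pred, tgt)   # one L2 loss per frequency band
            return loss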
  • When the image processing model is actually used to remove moiré, the original image to be processed is directly used as the input of the image processing model, and the final output is the target image after moiré removal across multiple spatial frequency bands and multiple scales.
  • In the embodiment of the present application, a multi-band module based on the Laplacian pyramid is designed; using the feature maps in each frequency band, an image processing model can be trained that removes moiré at different scales and in different frequency bands, easily achieving a better moiré removal effect.
  • The moiré in the image can thus be removed well. Further research found that if the original image is blurred, some of the moiré in flat areas is removed, so these flat areas can be used directly to construct the final result without requiring the network to learn this ability. Moreover, images with moiré are often shot under non-ideal lighting, so the brightness of the image is unbalanced; if the lighting can be expressed explicitly, it helps the model focus on eliminating moiré. Based on this, the image processing method of the embodiment of the present application is further described below in conjunction with FIG. 8 and FIG. 9.
  • FIG. 8 shows a schematic flowchart of another image processing method according to an embodiment of the present application.
  • the method of the embodiment of the present application can be implemented by smart devices such as a smart phone, a tablet computer, or a personal computer.
  • The method includes the following steps:
  • the smart device After acquiring the original image, the smart device can run the image processing model to perform moiré removal processing on the original image to obtain the target image.
  • the specific process may include the description of the following steps.
  • The image processing model includes: a multi-band module 901, a prediction module 902, and a super-resolution module 903.
  • During training, the relevant model parameters in the multi-band module 901, the prediction module 902, and the super-resolution module 903 are optimized so as to minimize a comprehensively calculated loss function value.
  • the smart device runs the image processing model, and obtains a first processing result feature map through the multi-band module.
  • the multi-band module 901 is constructed based on the Laplacian pyramid.
  • the S802 may include the following steps:
  • S8021 The smart device runs the image processing model and performs M rounds of initial analysis processing on the original image through the multi-band module 901 to obtain initial feature maps at N scales.
  • The initial analysis processing includes: first downsampling the original image and then performing convolution, or performing a downscaling (strided) convolution on the original image.
  • M is greater than or equal to N, and both M and N are positive integers greater than or equal to 2.
  • The initial feature maps at the N scales are the content of area 9001 shown in FIG. 9.
  • S8022 The smart device performs convolution processing on the initial feature maps in the N scales to obtain intermediate feature maps in the N scales.
  • The intermediate feature maps at the N scales are the five feature maps pointed to by the downward conv arrows leading from the initial feature maps in FIG. 9.
  • S8023 The smart device obtains the feature maps corresponding to the N spatial frequency bands of the N-layer Laplacian pyramid according to the intermediate feature maps at the N scales.
  • Among the feature maps corresponding to the N spatial frequency bands, the feature map of the N-th spatial frequency band is obtained from the N-th intermediate feature map, i.e., the intermediate feature map with the smallest scale.
  • The feature map of the (N-i)-th spatial frequency band is obtained from the N-th intermediate feature map, the (N-i)-th intermediate feature map, and all intermediate feature maps between the N-th and the (N-i)-th intermediate feature maps, where i is a positive integer greater than or equal to 1 and less than N.
  • For example, with N=5, the intermediate feature map with the smallest scale is taken as the feature map corresponding to the fifth spatial frequency band.
  • Each subsequent band feature map is obtained through concatenation: the 5th intermediate feature map is upsampled and convolved and then concatenated with the 4th intermediate feature map, which yields the feature map corresponding to the fourth spatial frequency band; that result is in turn upsampled, convolved, and concatenated with the 3rd intermediate feature map to obtain the feature map corresponding to the third spatial frequency band, and so on.
  • S8024 The smart device obtains the first processing result feature map according to the feature maps corresponding to the N spatial frequency bands. After obtaining the feature maps corresponding to the N spatial frequency bands, the smart device can rescale the feature maps of these bands and concatenate them into one large feature map, which is recorded as the first processing result feature map.
  • In one embodiment, S8024 may further include: the smart device acquires a predicted feature map, which is obtained from the N-th intermediate feature map, the 1st intermediate feature map, and all intermediate feature maps between them.
  • The predicted feature map is obtained after the two feature maps in region 9004 of Figure 9 are concatenated: one of the two is the intermediate feature map obtained by convolving the initial feature map 9002, and the other is the feature map obtained by starting from the 5th intermediate feature map and repeatedly upsampling, convolving, and concatenating.
  • The predicted feature map and the rescaled feature maps corresponding to the N spatial frequency bands are then concatenated to obtain one large feature map, recorded as the first processing result feature map, shown as feature map 9003 in Figure 9.
  • The outputs of the 5 frequency bands are reused because, in theory, these outputs have a linear relationship with the final output and are high-level features with abundant information.
  • After the smart device obtains the feature maps corresponding to the N spatial frequency bands, it can obtain the target image according to those feature maps, or according to the first processing result feature map.
  • In this embodiment, the smart device obtains the target image directly from the first processing result feature map, through the following steps.
  • S803 The smart device obtains a second processing result feature map according to the first processing result feature map through the prediction module.
  • the prediction module is constructed based on the attention mechanism, and the S803 may specifically include the following steps:
  • S8031 The smart device obtains a blurred feature map produced by blurring the original image.
  • S8032 The smart device obtains a second processing result feature map through the prediction module according to the first processing result feature map and the blurred feature map.
  • The blurring may be Gaussian blur.
  • Specifically, multiple RGB three-channel feature maps are predicted through the network.
  • These include a pixel-wise weighting parameter feature map attention0 (feature map 9005 in Figure 9), a pixel-wise weighting parameter feature map attention1 (feature map 9006 in Figure 9), an RGB output feature map (feature map 9007 in Figure 9), an illumination multiplication coefficient feature map alpha (feature map 9008 in Figure 9), and an illumination addition coefficient feature map beta (feature map 9009 in Figure 9).
  • The preliminary result feature map is computed as RGB_weighted = attention0 * blurred + attention1 * RGB.
  • Here, blurred is the blurred feature map 9010 obtained by blurring the original image, as mentioned above.
  • The blurred feature map 9010 is the original image after blurring and resizing, and its size is the same as the size of the first processing result feature map.
  • The pixel-wise weighting parameter feature maps attention0 and attention1 and the RGB output feature map are each obtained by convolution from the first processing result feature map.
  • The illumination coefficients are used to eliminate unreasonable light and dark variation: the final feature map is result = RGB_weighted * alpha + beta.
  • Here, alpha and beta are also obtained by convolution from the first processing result feature map.
  • The feature map result 9011 is the second processing result feature map, from which the de-moiré result at 1/2 scale is obtained, with the illumination changes also restored. A minimal sketch of this prediction step follows.
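  • The following is a minimal PyTorch sketch of the attention-based prediction module (the single 1x1 convolution head and the sigmoid gating are assumptions for illustration; the patent only states that the five maps are predicted by convolution from the first processing result feature map):

        import torch
        import torch.nn as nn

        class PredictionModule(nn.Module):
            def __init__(self, in_ch):
                super().__init__()
                # five 3-channel heads: attention0, attention1, RGB, alpha, beta
                self.head = nn.Conv2d(in_ch, 15, 1)

            def forward(self, first_result, blurred):
                # blurred: the blurred, resized original image (3 channels,
                # same spatial size as first_result)
                a0, a1, rgb, alpha, beta = torch.chunk(self.head(first_result), 5, dim=1)
                a0, a1 = torch.sigmoid(a0), torch.sigmoid(a1)   # pixel-wise weights
                rgb_weighted = a0 * blurred + a1 * rgb          # reuse flat areas of the blur
                return rgb_weighted * alpha + beta              # explicit illumination correction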
  • After the second processing result feature map is obtained, the target image can be obtained from it. Specifically, the following super-resolution processing can be performed to obtain the final target image. In one embodiment, the target image can also be obtained directly from the second processing result feature map, without passing through the super-resolution module below.
  • S804 The smart device obtains the target image according to the second processing result feature map through the super-resolution module.
  • S804 specifically includes: the smart device obtains a reference feature map according to the first processing result feature map, and processes the second processing result feature map to obtain an intermediate result feature map.
  • The scale of the reference feature map is the same as the scale of the original image, and the scale of the intermediate result feature map is also the same as the scale of the original image.
  • The smart device then obtains the target image according to the reference feature map and the intermediate result feature map, through the super-resolution module 903.
  • After obtaining the second processing result feature map, the smart device performs a simple super-resolution upsampling on the 1/2-scale second processing result feature map to obtain the final result.
  • The 1/2-scale result is first enlarged by a factor of two to become an RGB image at the original resolution, denoted result1.0 (9012 in Figure 9); result1.0 represents the feature map obtained by enlarging the second processing result feature map twofold.
  • Starting from the first processing result feature map, an up-convolution is used to predict the residual value final_residual (9013 in Figure 9); final_residual also represents a feature map.
  • The final result, i.e., the output target image feature map 9014, is obtained by adding final_residual to result1.0, as sketched below.
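  • The following is a minimal PyTorch sketch of this residual super-resolution step (the transposed-convolution shapes and bilinear enlargement are assumptions for illustration):

        import torch.nn as nn
        import torch.nn.functional as F

        class SuperResolutionModule(nn.Module):
            def __init__(self, in_ch):
                super().__init__()
                self.residual = nn.Sequential(
                    nn.ConvTranspose2d(in_ch, in_ch, 4, stride=2, padding=1),  # up-convolution
                    nn.ReLU(),
                    nn.Conv2d(in_ch, 3, 3, padding=1))

            def forward(self, first_result, second_result):
                # enlarge the 1/2-scale de-moiré result to full resolution ("result1.0")
                result_full = F.interpolate(second_result, scale_factor=2,
                                            mode='bilinear', align_corners=False)
                # predict high-frequency residual detail from the first processing
                # result feature map ("final_residual")
                final_residual = self.residual(first_result)
                return result_full + final_residual   # feature map of the target image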
  • S805 The smart device outputs the target image.
  • The smart device can display the target image on the user display interface, so that the user can store, share, edit, or even reprocess it.
  • the smart device can also display the target image and the original image on the user display interface at the same time, thereby allowing the user to know the difference between the original image and the target image.
  • In the embodiment of the present application, a multi-band module based on the Laplacian pyramid is designed.
  • Using the feature maps in each frequency band, an image processing model can be trained that removes moiré at different scales and in different frequency bands.
  • A prediction module based on the attention mechanism is also introduced, exploiting the fact that blurring the image directly removes part of the moiré, as well as the brightness characteristics of the picture; a super-resolution module is used as well, so a better moiré removal effect can easily be achieved.
  • FIG. 10 is a schematic diagram of a process for training an image processing model according to an embodiment of the present application.
  • The training of the image processing model in the embodiment of the present application is mainly executed by smart devices with strong computing power, such as servers and personal computers.
  • the image processing model mentioned in the foregoing embodiment can be obtained through the training process of the embodiment of the present application.
  • the process of training the image processing model includes the following steps:
  • S1001 The smart device obtains the moiré training data set.
  • the moiré training data set includes: a matching image pair, the matching image pair includes: a training image and a supervision image, the training image has moiré, and the supervision image does not have the moiré.
  • S1002 The smart device trains the initial model according to each matching image pair in the moiré training data set, so as to obtain the image processing model.
  • the training of the image processing model is supervised training.
  • During supervised training, the output of the image processing model for a training image is compared with the known result (the supervision image). If a condition is met (for example, the L2 loss function value is minimized), the image processing model is considered effective for the training image; otherwise, the model parameters in the image processing model are adjusted until the condition is met.
  • By using a large number of training images and corresponding supervision images to train the image processing model, an image processing model that can remove moiré from most images can be obtained.
  • the matching image pair is obtained based on processing the original image data.
  • This application designs ways to determine matching image pairs through simulation and preprocessing of the original image data. Matching image pairs obtained in this way allow the image processing model to be trained better, and the entire process can be fully automated, improving the efficiency of image processing model training.
  • In one embodiment, obtaining the moiré training data set includes: obtaining a supervision image according to the original image data, and obtaining a training image with moiré added according to the original image data.
  • These two steps can be performed on the smart device that trains or uses the image processing model, or the simulation of the original image data can be executed on another dedicated device.
  • Obtaining the training image with moiré added according to the original image data involves simulation processing of the original image data, which specifically includes the following steps:
  • the smart device disassembles each pixel in the original image data into three side-by-side sub-pixels to obtain a sub-pixel image.
  • Each sub-pixel corresponds to a color.
  • The colors of the original image data consist of red, green, and blue (RGB) components.
  • The smart device splits each pixel of the original image data into 3 side-by-side sub-pixels, and the color of each sub-pixel is determined by the RGB value of the original pixel.
  • S12 The smart device adjusts the size of the sub-pixel image, resizing it back to the original resolution to obtain a first intermediate image with the same size as the original image data.
  • S13 The smart device sets the gray value of pixels in the first intermediate image whose gray value is lower than a first threshold to a second threshold, to obtain a second intermediate image.
  • That is, the smart device sets pixels whose gray value is close to 0 to a certain threshold greater than 0; for example, pixels with gray values less than 5 or 10 are all set to 10. This simulates the fact that even a pure black image on a display still emits a small amount of light.
  • In one embodiment, the gamma value is further adjusted to make the output image closer to the visual effect of a display; the adjusted gamma value may be random, for example any value between 0.8 and 1.5. A sketch of steps S11 to S13 follows.
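  • The following is a minimal NumPy/OpenCV sketch of steps S11 to S13 (the threshold and gamma range follow the examples above; the 1x3 horizontal sub-pixel layout is one simple reading of the three side-by-side sub-pixels):

        import cv2
        import numpy as np

        def simulate_screen(img):                       # img: HxWx3 uint8, RGB
            h, w, _ = img.shape
            sub = np.zeros((h, w * 3, 3), np.uint8)     # S11: 3 side-by-side sub-pixels
            sub[:, 0::3, 0] = img[:, :, 0]              # R sub-pixel
            sub[:, 1::3, 1] = img[:, :, 1]              # G sub-pixel
            sub[:, 2::3, 2] = img[:, :, 2]              # B sub-pixel
            first = cv2.resize(sub, (w, h), interpolation=cv2.INTER_AREA)   # S12
            second = np.maximum(first, 10)              # S13: pure black still emits light
            gamma = np.random.uniform(0.8, 1.5)         # random display-like gamma
            return np.uint8(255 * (second / 255.0) ** gamma)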
  • S14 The smart device adds radial distortion to the second intermediate image to obtain a first distorted image, thereby simulating the image distortion caused by the curvature of the display screen.
  • S15 The smart device performs camera imaging simulation optimization on the first distorted image, and obtains a training image with moiré added.
  • S15 may specifically include the following steps, so that the obtained training image with added moiré is closer to a real camera's shooting effect.
  • S151 The smart device performs projection transformation on the first distorted image according to a first projection transformation parameter to obtain a tilt-simulated image; this perspective transform simulates the fact that when a camera shoots a screen, the sensor cannot be perfectly parallel to the screen, so the resulting image is tilted.
  • S152 The smart device processes the tilt-simulated image according to the brightness distribution characteristics of camera imaging to obtain a brightness-simulated image.
  • Specifically, the smart device uses a Bayer-array color filter array (CFA) sampling algorithm to resample and re-interpolate the tilt-simulated image output by the previous step.
  • In one embodiment, the gamma value of the re-interpolated image can be adjusted again to simulate the image brightness distribution characteristics of the camera during imaging.
  • S153 The smart device adds image noise to the brightness-simulated image to obtain a noise-simulated image.
  • For example, Gaussian noise may be added to simulate the noise generated by the imaging sensor of a real camera.
  • S154 The smart device processes the noise-simulated image according to preset light coefficients to obtain a light-and-dark-simulated image.
  • That is, the smart device multiplies different areas of the image by different light coefficients to simulate the uneven brightness that occurs when shooting a screen.
  • The final light-and-dark-simulated image can be used as the final training image with moiré added. A sketch of steps S151 to S154 follows.
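  • The following is a minimal sketch of the camera-side simulation S151 to S154 (the homography magnitude, noise level, and light-coefficient field are arbitrary illustrative choices; Bayer CFA resampling is approximated with OpenCV's built-in demosaicing):

        import cv2
        import numpy as np

        def simulate_camera(img):                        # img: HxWx3 uint8, RGB
            h, w, _ = img.shape
            # S151: slight perspective transform -- the sensor is not parallel to the screen
            src = np.float32([[0, 0], [w, 0], [0, h], [w, h]])
            dst = (src + np.random.uniform(-0.02, 0.02, (4, 2)) * [w, h]).astype(np.float32)
            tilted = cv2.warpPerspective(img, cv2.getPerspectiveTransform(src, dst), (w, h))
            # S152: Bayer CFA sampling, demosaicing, and another random gamma
            bayer = np.zeros((h, w), np.uint8)
            bayer[0::2, 0::2] = tilted[0::2, 0::2, 0]    # R
            bayer[0::2, 1::2] = tilted[0::2, 1::2, 1]    # G
            bayer[1::2, 0::2] = tilted[1::2, 0::2, 1]    # G
            bayer[1::2, 1::2] = tilted[1::2, 1::2, 2]    # B
            rgb = cv2.cvtColor(bayer, cv2.COLOR_BayerBG2RGB)
            rgb = np.uint8(255 * (rgb / 255.0) ** np.random.uniform(0.8, 1.5))
            # S153: Gaussian sensor noise
            noisy = np.clip(rgb + np.random.normal(0, 3, rgb.shape), 0, 255)
            # S154: spatially varying light coefficients -- uneven screen brightness
            yy, xx = np.mgrid[0:h, 0:w]
            light = 0.8 + 0.4 * np.sin(xx / w * np.pi) * np.sin(yy / h * np.pi)
            return np.uint8(np.clip(noisy * light[..., None], 0, 255))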
  • the obtaining the supervised image according to the original image data includes:
  • S21 The smart device adds radial distortion to the original image data to obtain a second distorted image.
  • The radial distortion here is applied in the same way as in S14 above.
  • S22 The smart device performs projection transformation processing on the second distorted image according to the first projection transformation parameter to obtain a supervised image.
  • The projection transformation here uses the same projection transformation parameters as in S151, so the smart device can directly form a matching image pair from the supervision image and the training image obtained above.
  • In another embodiment, matching image pairs are obtained by actually shooting a screen. The image alignment scheme adopted in this embodiment is a two-stage alignment: first, feature points are matched between the two images (the original image and the image obtained by shooting the original image displayed on the display screen) and a projection transformation is computed; then an optical flow field is calculated for the transformed image to perform finer alignment, compensating for image distortion that the projection transformation cannot represent.
  • the obtaining the moiré training data set may include:
  • S31 The smart device displays the original image data on the electronic display screen, and photographs the original image data displayed on the electronic display screen to obtain training images;
  • S32 The smart device performs feature point matching and optical flow alignment processing on the original image data according to the training image to obtain a supervised image;
  • S33 The smart device constructs and obtains a matching image pair based on the training image and the corresponding supervised image.
  • the S32 may include:
  • S321 The smart device performs feature point matching between the training image and the original image data, and calculates a second projection transformation parameter according to the matching result; it then performs projection transformation on the original image data according to the second projection transformation parameter to obtain projected original image data.
  • The aforementioned S321 is a feature matching process.
  • Before detecting and computing the feature points of the training image and the original image data, the images can first be denoised, for example with the non-local means algorithm.
  • Feature points can be detected with the FAST (features from accelerated segment test) algorithm, and feature descriptors can then be computed with algorithms such as SIFT (scale-invariant feature transform), SURF (speeded-up robust features), AKAZE, ORB (a feature point extraction and description algorithm), or BRISK (binary robust invariant scalable keypoints), followed by brute-force matching. A sketch of this stage follows.
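  • The following is a minimal OpenCV sketch of this matching-plus-projection stage (ORB is used here as the descriptor; any of the algorithms listed above could be substituted):

        import cv2
        import numpy as np

        def align_by_homography(original, captured):     # both HxWx3 uint8, RGB
            orb = cv2.ORB_create(4000)
            k1, d1 = orb.detectAndCompute(cv2.cvtColor(original, cv2.COLOR_RGB2GRAY), None)
            k2, d2 = orb.detectAndCompute(cv2.cvtColor(captured, cv2.COLOR_RGB2GRAY), None)
            matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)   # brute-force matching
            matches = sorted(matcher.match(d1, d2), key=lambda m: m.distance)[:500]
            src = np.float32([k1[m.queryIdx].pt for m in matches]).reshape(-1, 1, 2)
            dst = np.float32([k2[m.trainIdx].pt for m in matches]).reshape(-1, 1, 2)
            # the "second projection transformation parameter", robust to outliers
            H, _ = cv2.findHomography(src, dst, cv2.RANSAC, 5.0)
            h, w = captured.shape[:2]
            return cv2.warpPerspective(original, H, (w, h))  # projected original image data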
  • the smart device calculates a pixel coordinate correspondence table between the training image and the projected original image data based on the optical flow method.
  • This step corresponds to the optical-flow alignment, and may specifically be image warping based on optical flow.
  • The registration based on the second projection transformation parameter is not yet fully pixel-level: experimental statistics show an average remaining difference of about 10 pixels, mainly caused by screen curvature and lens distortion.
  • The optical flow method is therefore further used to eliminate this final error. Specifically, a dense optical flow field is calculated to obtain the pixel coordinate correspondence table between the training image and the projected original image data, according to which the projected image is then warped. Through experimental analysis, a variational optical flow algorithm was finally selected, as it best balances the smoothness and accuracy of the estimated optical flow.
  • In one embodiment, using the image gradient as the data term of the variational optical flow method effectively avoids the brightness differences between the captured image and the original image. In addition, the training image and the projected original image data should first be Gaussian-blurred: on the one hand this suppresses the influence of moiré noise, and on the other hand it makes the images smoother and better suited to the smoothness assumptions of the method.
  • S323 The smart device optimizes the projection of the projected original image data according to the pixel coordinate correspondence table to obtain a supervised image corresponding to the training image.
  • After this alignment, the training image and the supervision image are basically aligned at the pixel level. Training the initial model with such pixel-level-aligned training and supervision images ensures that, after the image processing model removes the moiré from a moiré image, the resulting target image has normal scale and normal content, consistent with the original moiré image. A sketch of the optical-flow refinement follows.
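  • The following is a minimal sketch of the optical-flow refinement in S322 and S323 (OpenCV's DIS optical flow stands in for the variational algorithm selected in the patent; cv2.remap applies the pixel coordinate correspondence table):

        import cv2
        import numpy as np

        def refine_by_optical_flow(captured, projected):
            g1 = cv2.GaussianBlur(cv2.cvtColor(captured, cv2.COLOR_RGB2GRAY), (9, 9), 0)
            g2 = cv2.GaussianBlur(cv2.cvtColor(projected, cv2.COLOR_RGB2GRAY), (9, 9), 0)
            flow = cv2.DISOpticalFlow_create(
                cv2.DISOPTICAL_FLOW_PRESET_MEDIUM).calc(g1, g2, None)  # dense flow field
            h, w = flow.shape[:2]
            grid = np.mgrid[0:h, 0:w][::-1].transpose(1, 2, 0).astype(np.float32)
            remap = grid + flow                 # pixel coordinate correspondence table
            # warp the projected original so it aligns pixel-level with the captured image
            return cv2.remap(projected, remap, None, cv2.INTER_LINEAR)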
  • the initial model can be trained.
  • The structure of the initial model can also refer to the structure shown in FIG. 9. Training the initial model can use any one, or a combination, of the following model training methods.
  • The first model training method:
  • In one embodiment, the initial model or image processing model may include: a multi-band module, a prediction module based on the attention mechanism, and a super-resolution module.
  • The output of the super-resolution module is the model output, which can be used in the training process.
  • During training, the respective loss function values are determined through the L2 loss function, and the model parameters are adjusted and optimized according to these loss function values.
  • FIG. 11 shows a schematic diagram of supervised training of the initial model based on the three output results of the model.
  • the training of the initial model according to each matching image pair in the moiré training data set includes the following steps:
  • The training image in a matching image pair of the moiré training data set is used as the input of the initial model, yielding the first result 1101 output by the multi-band module, the second result 1102 output by the prediction module, and the third result 1103 output by the super-resolution module.
  • N-layer Laplacian pyramid processing is performed on the supervision image in the matching image pair of the moiré training data set to obtain the feature maps of the N spatial frequency bands corresponding to the supervision image.
  • For the supervision image, the feature maps of the N frequency bands are the five feature maps in feature map set 1104 shown in FIG. 11.
  • The loss values can be calculated with the L2 loss function between the feature maps in feature map set 1104 and the feature maps of the same scale in the first result 1101; in FIG. 11, five basic loss function values are obtained in this way.
  • The first loss function value, between the second result 1102 and the image obtained by resolution adjustment of the supervision image, is acquired; image 1105 is the supervision image after this resolution adjustment, and the loss value can be calculated with the L2 loss function between the second result 1102 and image 1105.
  • The second loss function value, between the supervision image 1106 and the third result 1103, is acquired; it can be calculated with the L2 loss function between the supervision image 1106 and the feature map of the third result 1103.
  • Based on the five basic loss function values, the first loss function value, and the second loss function value, the model parameters of the initial model are optimized, so as to obtain the image processing model. A sketch of this combined supervision follows.
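  • The following is a minimal sketch of this combined supervision (all three outputs are assumed to be RGB images at their respective scales; band_supervision_loss is the per-band helper sketched earlier):

        import torch.nn.functional as F

        def total_loss(first_bands, second_result, third_result, supervision):
            loss_bands = band_supervision_loss(first_bands, supervision)  # 5 basic losses
            target_half = F.interpolate(supervision, scale_factor=0.5,
                                        mode='bilinear', align_corners=False)
            loss_pred = F.mse_loss(second_result, target_half)  # first loss function value
            loss_sr = F.mse_loss(third_result, supervision)     # second loss function value
            return loss_bands + loss_pred + loss_sr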
  • the initial model or image processing model may include: a multi-band module, and the output result of the multi-band module is the model output result.
  • the loss function value can be determined by the L2 loss function, and the model parameters can be adjusted and optimized according to the loss function value.
In another embodiment, the initial model may include a multi-band module and a prediction module based on the attention mechanism, with the prediction module's output as the model output. During training, the respective loss function values can be determined through the L2 loss function from the outputs of the two modules, and the model parameters are adjusted and optimized accordingly.
The second model training method:
In one embodiment, to improve the moiré removal capability of the image processing model, it is further considered that a moiré image exhibits very different moiré patterns at different scales; therefore, multi-scale learning is performed when training the initial model. Specifically, training the initial model according to each matching image pair in the moiré training data set includes: performing size adjustment on the images in a target matching image pair of the moiré training data set to obtain P deformed matching image pairs of different sizes; and training the initial model according to the P deformed matching image pairs to obtain P loss function values, then optimizing the model parameters of the initial model according to the P loss function values so as to obtain the image processing model.
For each training sample, the images of the matching image pair are scaled to resolutions such as 256x256, 384x384, 512x512 and 768x768, giving several deformed matching image pairs of different scales. Each deformed matching image pair is used as a separate piece of training data and input into the initial model, which is trained according to, for example, the first model training method above; the loss function values are computed separately and then added together for training, i.e. the total loss function value is the sum of all the individual loss functions, and the optimized model is the one that minimizes this summed value.
In this way, with every training sample the initial model simultaneously learns the moiré patterns at different resolutions, so the de-moiré capability it acquires is more robust. A general phenomenon has also been observed in practice: the larger the resolution, the better the moiré removal.
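A minimal sketch of one multi-scale training step is shown below, assuming a `model` and a `compute_loss` callable standing in for the initial model and the first training method's objective; the scale list follows the resolutions named above, everything else is illustrative.

```python
# Sketch of the multi-scale training step: one matching image pair is
# resized to P scales, each deformed pair contributes one loss, and the
# summed loss is minimized. PyTorch; `model` and `compute_loss` are
# stand-ins, not the patent's code.
import torch
import torch.nn.functional as F

SCALES = [256, 384, 512, 768]

def multi_scale_step(model, optimizer, train_img, supervised_img, compute_loss):
    optimizer.zero_grad()
    total = 0.0
    for s in SCALES:
        t = F.interpolate(train_img, size=(s, s), mode='bilinear',
                          align_corners=False)
        g = F.interpolate(supervised_img, size=(s, s), mode='bilinear',
                          align_corners=False)
        total = total + compute_loss(model(t), g)   # one loss per deformed pair
    total.backward()                                # optimize the summed loss
    optimizer.step()
    return float(total)
```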
The third model training method:
To exploit more moiré images for model training, this application can also train the model directly on a large number of moiré images. In one embodiment, the training of the image processing model in the image processing method may further include the following steps. The training images in a moiré image set are enlarged to a target size, yielding enlarged moiré images. The moiré image set contains no matching image pairs: it includes only images bearing moiré, with no corresponding supervised images. Compared with producing matching pairs by simulation, or by photographing a screen and then performing feature-point matching and optical-flow alignment, collecting bare moiré images is much easier; for example, once the camera position is fixed, a computer monitor can switch the displayed content at fixed intervals while the camera shoots at the same intervals, so a large number of moiré images can be gathered with little effort.
During training, the first image processing model is run to perform moiré removal on the enlarged moiré image, producing a moiré-supervised image. A model trained according to each matching image pair in the above moiré training data set can serve as the first image processing model; for example, the image processing model obtained after training the initial model with the first and/or second model training method can be used. The first image processing model may also be a moiré removal model trained by some other method. On the basis of this first image processing model, self-supervised training is then carried out on the large set of moiré images to obtain an image processing model with better performance.
The moiré-supervised image and the enlarged moiré image are scaled to obtain multiple moiré image pairs, and according to these pairs the model parameters of the first image processing model are optimized so as to obtain the image processing model. Multi-scale learning showed that if a moiré image is enlarged to a sufficiently large resolution, its moiré is removed easily by the model. Based on this observation, the moiré image is first enlarged to 1664x1664 (or another larger size) and the first image processing model's output on it is taken as the ground truth; each resulting moiré matching image pair is then scaled to 256x256, 384x384, 512x512, 768x768 and other scales, and the first image processing model is trained again in a multi-scale fashion. After training to convergence, an image processing model with better performance is obtained. The newly trained model can in turn be used as a new first image processing model to remove the moiré of the 1664x1664 (or other larger size) moiré images, and self-supervised training is performed once more. Iterating in this way, one or more image processing models capable of removing moiré from images are finally obtained.
In one embodiment, the model obtained by optimizing the parameters of the first image processing model with the multiple moiré image pairs produced by the above enlargement and de-moiré processing is taken as the first version of the image processing model. The first version is then used as a new first image processing model, which is trained according to the moiré image set to obtain a second version of the image processing model. Further, the second version can again serve as the new first image processing model, and so on iteratively, producing multiple versions of the image processing model.
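One self-supervised round can be sketched as follows; `first_model` and `finetune_on_pairs` are illustrative stand-ins for the current model version and for whatever multi-scale fine-tuning routine is used, and the sizes mirror the ones quoted above.

```python
# Sketch of one self-supervised round: enlarge an unlabeled moiré image,
# pseudo-label it with the current first image processing model, rescale
# the resulting pair to several sizes, and fine-tune on those pairs.
# PyTorch; `first_model` and `finetune_on_pairs` are assumptions.
import torch
import torch.nn.functional as F

def self_supervised_round(first_model, moire_images, finetune_on_pairs,
                          big=1664, scales=(256, 384, 512, 768)):
    pairs = []
    with torch.no_grad():
        for img in moire_images:                        # img: 1xCxHxW tensor
            enlarged = F.interpolate(img, size=(big, big), mode='bilinear',
                                     align_corners=False)
            pseudo_gt = first_model(enlarged)           # moiré-supervised image
            for s in scales:
                pairs.append((
                    F.interpolate(enlarged, size=(s, s), mode='bilinear',
                                  align_corners=False),
                    F.interpolate(pseudo_gt, size=(s, s), mode='bilinear',
                                  align_corners=False)))
    return finetune_on_pairs(first_model, pairs)        # next model version
```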
After two or more versions of the image processing model are obtained, the process of running the image processing model to perform moiré removal on the original image and obtain the target image may include the following steps: the first-version image processing model and the second-version image processing model are run on the original image to obtain a first image and a second image (when more versions exist, the other versions can be run as well), and the target image is then obtained from the first image and the second image.
When determining the target image from the images output by the different versions after moiré removal, the outputs of all versions can be compared by local image region, or even pixel by pixel, and the target image is constructed from the local image regions or pixels with the best moiré removal effect. In other words, repeated training yields many versions of the image processing model whose capabilities each have their own emphasis; using these models to remove the moiré of the original image separately and then merging the results gives a better overall moiré elimination effect.
In one embodiment, the merging method is: among the local image regions and/or pixels of the image output by each version of the image processing model, determine the local image regions and/or pixels with the smallest gradient, select them, and construct the target image from the selected regions and/or pixels. For example, divide the image into an upper and a lower local region: if the gradient of the upper half of the first image is smaller than that of the upper half of the second image, while the gradient of the lower half of the second image is smaller than that of the lower half of the first image, then the upper half of the first image can be merged with the lower half of the second image to obtain the final target image.
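The smallest-gradient rule can be sketched with fixed tiles as below; residual moiré tends to raise local gradient magnitude, which is why the lowest-gradient tile is preferred. The tile size and the Sobel-based gradient measure are our own illustrative choices.

```python
# Sketch of the gradient-based merge: for each local region, keep the
# version whose output has the smallest mean gradient magnitude.
# NumPy/OpenCV; tile size and gradient measure are illustrative.
import cv2
import numpy as np

def merge_by_gradient(outputs: list, tile: int = 64) -> np.ndarray:
    h, w = outputs[0].shape[:2]
    merged = np.zeros_like(outputs[0])
    for y in range(0, h, tile):
        for x in range(0, w, tile):
            best, best_grad = None, None
            for out in outputs:
                patch = out[y:y + tile, x:x + tile]
                gray = cv2.cvtColor(patch.astype(np.float32),
                                    cv2.COLOR_BGR2GRAY)
                gx = cv2.Sobel(gray, cv2.CV_32F, 1, 0)
                gy = cv2.Sobel(gray, cv2.CV_32F, 0, 1)
                g = float(np.mean(np.abs(gx)) + np.mean(np.abs(gy)))
                if best_grad is None or g < best_grad:
                    best, best_grad = patch, g
            merged[y:y + tile, x:x + tile] = best
    return merged
```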
After the image processing model has been obtained in the above ways, a generative adversarial network (GAN) can be used to further fine-tune its model parameters so that its outputs look more realistic. The role of the GAN's discriminator is to distinguish the network's output image from real moiré-free images; its loss function is the classification cross-entropy. The loss function of the GAN's generator is the sum of two parts: the negative discriminator loss, and the L2 loss of the difference between the generated image and the image generated by the old model. The design goal of the GAN loss is twofold: the image processing model should generate more realistic moiré-free images, while its moiré removal ability should not drift too far from that of the original image processing model.
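These two objectives can be sketched as below; the discriminator outputs a single logit, the weight `l2_weight` on the keep-close term is our own assumption (the text only says the two parts are added), and all names are illustrative.

```python
# Sketch of the fine-tuning objectives: cross-entropy for the
# discriminator; negative discriminator loss plus an L2 term tying the
# generator's output to the old model's output for the generator.
# PyTorch; module names and l2_weight are illustrative assumptions.
import torch
import torch.nn.functional as F

def discriminator_loss(disc, fake_img, real_img):
    fake_logit = disc(fake_img.detach())
    real_logit = disc(real_img)
    return (F.binary_cross_entropy_with_logits(fake_logit,
                                               torch.zeros_like(fake_logit))
            + F.binary_cross_entropy_with_logits(real_logit,
                                                 torch.ones_like(real_logit)))

def generator_loss(disc, fake_img, old_model_img, l2_weight=1.0):
    fake_logit = disc(fake_img)
    # negative of the discriminator's loss on the generated image
    adv = -F.binary_cross_entropy_with_logits(fake_logit,
                                              torch.zeros_like(fake_logit))
    keep = F.mse_loss(fake_img, old_model_img)  # stay close to the old model
    return adv + l2_weight * keep
```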
FIG. 12 is a schematic structural diagram of an image processing apparatus according to an embodiment of this application. The apparatus may be deployed in a smart device, such as a smartphone, tablet computer or personal computer. The apparatus includes the following modules.
The obtaining module 1201 is used to obtain the original image.

The processing module 1202 is configured to run an image processing model to perform moiré removal processing on the original image to obtain a target image. The image processing model is a network model trained in advance according to the moiré training data set, and it includes a multi-band module for processing the original image to obtain an N-layer Laplacian pyramid of the original image and deriving a first processing-result feature map from the feature maps corresponding to the pyramid's N spatial frequency bands; the target image is obtained according to the first processing-result feature map, and N is a positive integer greater than or equal to 2.
In one embodiment, the processing module 1202 is specifically configured to: run the image processing model, performing M initial analysis passes on the original image through the multi-band module to obtain initial feature maps at N scales, where M is a positive integer greater than or equal to 2 and M is greater than or equal to N, and each initial analysis pass consists of either first down-sampling the original image and then applying convolution, or applying a down-scaling convolution to the original image; perform convolution on the initial feature maps at the N scales to obtain intermediate feature maps at N scales; obtain, from the intermediate feature maps at the N scales, the feature maps corresponding to the N spatial frequency bands of the N-layer Laplacian pyramid; obtain the first processing-result feature map from the feature maps of the N spatial frequency bands; and obtain the target image from the first processing-result feature map. Among the feature maps of the N spatial frequency bands, the feature map of the N-th spatial frequency band is obtained from the N-th intermediate feature map, the one with the smallest scale; the feature map of the (N-i)-th spatial frequency band is obtained from the N-th intermediate feature map, the (N-i)-th intermediate feature map, and all intermediate feature maps between them, where i is a positive integer greater than or equal to 1 and smaller than N.
In one embodiment, the processing module 1202 is specifically configured to: obtain a prediction feature map, which is derived from the N-th intermediate feature map, the first intermediate feature map, and all intermediate feature maps between them; resize those of the N spatial-frequency-band feature maps whose scale is smaller than the scale of the prediction feature map, so that the adjusted feature maps match the prediction feature map's scale; and obtain the first processing-result feature map from the prediction feature map and the adjusted frequency-band feature maps.
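The resize-and-fuse step can be sketched as below, assuming the band maps and prediction map are feature tensors; bilinear resizing and channel concatenation are our illustrative choices for the size adjustment and fusion.

```python
# Sketch of assembling the first processing-result feature map: resize the
# smaller band feature maps up to the prediction feature map's scale, then
# concatenate everything along the channel dimension. PyTorch; illustrative.
import torch
import torch.nn.functional as F

def fuse_first_result(band_maps: list, prediction_map: torch.Tensor):
    target_hw = prediction_map.shape[-2:]
    resized = [F.interpolate(b, size=target_hw, mode='bilinear',
                             align_corners=False)
               if b.shape[-2:] != target_hw else b
               for b in band_maps]
    return torch.cat(resized + [prediction_map], dim=1)  # channel concat
```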
In one embodiment, the image processing model further includes a prediction module based on the attention mechanism, and the processing module 1202 is specifically configured to: obtain a blurred feature map produced by blurring the original image; obtain a second processing-result feature map through the prediction module from the first processing-result feature map and the blurred feature map; and obtain the target image according to the second processing-result feature map.
In one embodiment, the image processing model further includes a super-resolution module, and the processing module 1202 is specifically configured to: obtain a reference feature map from the first processing-result feature map, and process the second processing-result feature map to obtain an intermediate-result feature map, where both the reference feature map and the intermediate-result feature map have the same scale as the original image; and obtain the target image from the reference feature map and the intermediate-result feature map through the super-resolution module.
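Putting the two heads together, the sketch below follows the combination described earlier in this application (RGBweighted = attention0·blurred + attention1·RGB, then result = RGBweighted·alpha + beta, then 2x upsampling plus a predicted residual). The `heads` dictionary of convolution layers, the sigmoid on the attention maps, and predicting the residual at half scale before interpolation are all our own simplifying assumptions, not the trained model's exact layout.

```python
# Sketch of the attention-based prediction and the final super-resolution
# step. PyTorch; `heads` holds conv layers standing in for the trained
# prediction heads, and details are simplified for illustration.
import torch
import torch.nn.functional as F

def predict_and_superresolve(first_result, blurred_half, heads):
    att0 = torch.sigmoid(heads['attention0'](first_result))
    att1 = torch.sigmoid(heads['attention1'](first_result))
    rgb = heads['rgb'](first_result)
    alpha = heads['alpha'](first_result)
    beta = heads['beta'](first_result)

    weighted = att0 * blurred_half + att1 * rgb      # RGBweighted
    result_half = weighted * alpha + beta            # second processing result

    up = F.interpolate(result_half, scale_factor=2, mode='bilinear',
                       align_corners=False)          # result at full scale
    residual = F.interpolate(heads['residual'](first_result), scale_factor=2,
                             mode='bilinear', align_corners=False)
    return up + residual                             # final result
```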
In one embodiment, the apparatus further includes a training module 1203. The training module 1203 is configured to obtain the moiré training data set, which includes matching image pairs, each consisting of a training image and a supervised image, where the training image has moiré and the supervised image does not; and to train the initial model according to each matching image pair in the moiré training data set so as to obtain the image processing model.
In one embodiment, the training module 1203 is specifically configured to: disassemble each pixel of the original image data into three side-by-side sub-pixels to obtain a sub-pixel image, each sub-pixel corresponding to one color, the colors of the original image data being red, green and blue (RGB); resize the sub-pixel image to obtain a first intermediate image with the same size as the original image data; set the gray value of every pixel in the first intermediate image whose gray value is below a first threshold to a second threshold, obtaining a second intermediate image; add radial distortion to the second intermediate image to obtain a first distorted image; and perform camera-imaging simulation optimization on the first distorted image to obtain a training image with moiré added.
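A minimal sketch of these screen-simulation steps is shown below; the thresholds (5 and 10) mirror the example values given elsewhere in this application, while the distortion coefficient and remap-based distortion are our own illustrative choices.

```python
# Sketch of the screen simulation: split each pixel into three side-by-side
# RGB sub-pixels, resize back, clamp near-black pixels to a small floor,
# and add radial (barrel) distortion. NumPy/OpenCV; values illustrative.
import cv2
import numpy as np

def simulate_screen(original: np.ndarray) -> np.ndarray:
    h, w = original.shape[:2]
    sub = np.zeros((h, w * 3, 3), dtype=original.dtype)
    sub[:, 0::3, 2] = original[:, :, 2]          # R sub-pixel column
    sub[:, 1::3, 1] = original[:, :, 1]          # G sub-pixel column
    sub[:, 2::3, 0] = original[:, :, 0]          # B sub-pixel column (BGR order)
    first = cv2.resize(sub, (w, h), interpolation=cv2.INTER_AREA)

    second = first.copy()
    second[second < 5] = 10                      # faint glow of "black" pixels

    k1 = 1e-7                                    # illustrative coefficient
    ys, xs = np.indices((h, w), dtype=np.float32)
    cx, cy = w / 2.0, h / 2.0
    r2 = (xs - cx) ** 2 + (ys - cy) ** 2
    map_x = cx + (xs - cx) * (1 + k1 * r2)
    map_y = cy + (ys - cy) * (1 + k1 * r2)
    return cv2.remap(second, map_x, map_y, interpolation=cv2.INTER_LINEAR)
```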
In one embodiment, the training module 1203 is specifically configured to: perform projection transformation on the first distorted image according to first projection transformation parameters to obtain a tilt-simulation image; process the tilt-simulation image according to the image brightness distribution characteristics of camera imaging to obtain a brightness-simulation image; add image noise to the brightness-simulation image to obtain a noise-simulation image containing noise; and process the noise-simulation image according to preset lighting coefficients to obtain a light-and-dark simulation image.
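The camera-imaging simulation can be sketched as below, assuming the gamma range quoted elsewhere in this application (0.8 to 1.5); the jitter magnitude, noise level and lighting-grid approach are illustrative, and the Bayer color-filter-array resampling step is omitted for brevity.

```python
# Sketch of the camera-imaging simulation: a perspective transform for
# sensor tilt, a gamma adjustment for camera brightness response,
# Gaussian sensor noise, and spatially varying lighting coefficients.
# NumPy/OpenCV; all parameter values are illustrative.
import cv2
import numpy as np

def simulate_camera(distorted: np.ndarray, rng=np.random.default_rng()):
    h, w = distorted.shape[:2]
    src = np.float32([[0, 0], [w, 0], [w, h], [0, h]])
    jitter = rng.uniform(-0.03, 0.03, (4, 2)) * (w, h)
    dst = (src + jitter).astype(np.float32)
    H = cv2.getPerspectiveTransform(src, dst)          # first projection params
    tilted = cv2.warpPerspective(distorted, H, (w, h))

    gamma = rng.uniform(0.8, 1.5)                      # brightness distribution
    bright = np.clip((tilted / 255.0) ** gamma * 255.0, 0, 255)

    noisy = bright + rng.normal(0.0, 2.0, bright.shape)  # Gaussian sensor noise

    # Smooth per-region lighting coefficients for uneven illumination.
    light = cv2.resize(rng.uniform(0.85, 1.15, (4, 4)), (w, h),
                       interpolation=cv2.INTER_CUBIC)
    out = noisy * light[..., None]
    return np.clip(out, 0, 255).astype(np.uint8)
```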
In one embodiment, the training module 1203 is specifically configured to: add radial distortion to the original image data to obtain a second distorted image, and perform projection transformation on the second distorted image according to the same first projection transformation parameters to obtain the supervised image.
In one embodiment, the training module 1203 is specifically configured to: display the original image data on an electronic display screen and photograph the displayed original image data to obtain a training image; perform feature-point matching and optical-flow alignment on the original image data according to the training image to obtain a supervised image; and construct a matching image pair from the training image and the corresponding supervised image.
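A sketch of the two-stage alignment for real captures follows: SIFT matching with a RANSAC homography for the coarse projection, then dense optical flow on blurred images for the fine per-pixel correction. The variational optical flow named in this application is replaced here by OpenCV's Farneback flow as a readily available stand-in, and all parameters are illustrative.

```python
# Sketch of two-stage alignment: coarse homography registration, then
# dense flow (on blurred images, to suppress moiré noise) for the fine
# pixel-level warp. OpenCV; Farneback flow is a stand-in for variational
# optical flow, parameters illustrative.
import cv2
import numpy as np

def align_original_to_capture(original: np.ndarray, capture: np.ndarray):
    g_orig = cv2.cvtColor(original, cv2.COLOR_BGR2GRAY)
    g_cap = cv2.cvtColor(capture, cv2.COLOR_BGR2GRAY)

    sift = cv2.SIFT_create()
    k1, d1 = sift.detectAndCompute(g_orig, None)
    k2, d2 = sift.detectAndCompute(g_cap, None)
    matches = cv2.BFMatcher(cv2.NORM_L2, crossCheck=True).match(d1, d2)

    src = np.float32([k1[m.queryIdx].pt for m in matches]).reshape(-1, 1, 2)
    dst = np.float32([k2[m.trainIdx].pt for m in matches]).reshape(-1, 1, 2)
    H, _ = cv2.findHomography(src, dst, cv2.RANSAC, 5.0)
    h, w = capture.shape[:2]
    projected = cv2.warpPerspective(original, H, (w, h))  # coarse registration

    b_cap = cv2.GaussianBlur(g_cap, (5, 5), 0)
    b_proj = cv2.GaussianBlur(cv2.cvtColor(projected, cv2.COLOR_BGR2GRAY),
                              (5, 5), 0)
    flow = cv2.calcOpticalFlowFarneback(b_cap, b_proj, None,
                                        0.5, 3, 15, 3, 5, 1.2, 0)
    xs, ys = np.meshgrid(np.arange(w, dtype=np.float32),
                         np.arange(h, dtype=np.float32))
    supervised = cv2.remap(projected, xs + flow[..., 0], ys + flow[..., 1],
                           cv2.INTER_LINEAR)              # pixel-level aligned
    return supervised
```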
In one embodiment, the initial model includes a multi-band module to be trained, an attention-based prediction module to be trained, and a super-resolution module to be trained. The training module 1203 is specifically configured to: use the training image of a matching image pair of the moiré training data set as the input of the initial model and, after processing by the initial model, obtain the first result output by the multi-band module, the second result output by the prediction module, and the third result output by the super-resolution module, where the first result includes the feature maps of the N spatial frequency bands corresponding to the N-layer Laplacian pyramid of the training image, the second result includes the second-result feature map corresponding to the training image, and the third result includes the target image corresponding to the training image; perform N-layer Laplacian pyramid processing on the supervised image of the matching image pair to obtain the feature maps of its N spatial frequency bands; obtain the N basic loss function values between the supervised image's N frequency-band feature maps and the N frequency-band feature maps of the first result, the first loss function value between the resolution-processed supervised image and the second result, and the second loss function value between the supervised image and the third result; and optimize the model parameters of the initial model according to the N basic loss function values, the first loss function value and the second loss function value, so as to obtain the image processing model.
In one embodiment, the training module 1203 is specifically configured to: resize the images of a target matching image pair of the moiré training data set into P deformed matching image pairs of different sizes; train the initial model on the P deformed matching image pairs to obtain P loss function values; and optimize the model parameters of the initial model according to the P loss function values, so as to obtain the image processing model.
In one embodiment, a model trained according to each matching image pair in the moiré training data set is used as the first image processing model, and the training module 1203 is specifically configured to: enlarge the training images of the moiré image set to a target size to obtain enlarged moiré images; run the first image processing model on the enlarged moiré images to remove moiré and obtain moiré-supervised images; scale the moiré-supervised images together with the enlarged moiré images to obtain multiple moiré image pairs; and optimize the model parameters of the first image processing model according to the multiple moiré image pairs, so as to obtain the image processing model.
In one embodiment, the model obtained by optimizing the first image processing model's parameters with the multiple moiré image pairs is used as the first version of the image processing model; the first version is then used as a new first image processing model, which is trained according to the moiré image set to obtain a second version of the image processing model. The image processing model then includes the first version and the second version, and the processing module 1202 is further configured to run both versions to perform moiré removal on the original image, obtaining a first image and a second image, and to obtain the target image according to the first image and the second image.
Aiming at the differences moiré exhibits at different scales and in different spatial frequency bands, the embodiments of this application design, inside the moiré removal image processing model, a multi-scale model built on the Laplacian pyramid. Using the feature maps of the pyramid's multiple frequency bands, an image processing model can be trained that removes moiré fairly comprehensively across scales and frequency bands, conveniently achieving a better moiré removal effect.
FIG. 13 is a schematic structural diagram of a smart device according to an embodiment of this application. The smart device may be, for example, a smartphone, tablet computer, personal computer or server, and can implement functions such as data transmission, storage, data analysis and editing. The smart device also includes the required housing structures, a power supply, communication interfaces, and the like.
The smart device may further include a processor 1301, a storage device 1302, an input interface 1303 and an output interface 1304. The input interface 1303 may be a user interface, a data interface or a communication interface capable of acquiring data. The output interface 1304 may be a network interface capable of sending data out; it may also output processed data to a display, so that the display can show the de-moiréd image and other data output through the output interface 1304.
The storage device 1302 may include volatile memory, such as random-access memory (RAM); it may also include non-volatile memory, such as flash memory or a solid-state drive (SSD); and it may include a combination of the above types of memory.
The processor 1301 may be a central processing unit (CPU). The processor 1301 may further include a hardware chip, which may be an application-specific integrated circuit (ASIC), a programmable logic device (PLD) or the like; the PLD may be a field-programmable gate array (FPGA), generic array logic (GAL) or the like.
In this embodiment, the storage device 1302 stores program instructions, and the processor 1301 calls these program instructions to execute the relevant methods and steps mentioned in the foregoing embodiments. In one embodiment, the processor 1301 is configured to execute the following steps:
obtain an original image, and run an image processing model to perform moiré removal processing on the original image to obtain a target image; where the image processing model is a network model trained in advance based on a moiré training data set, and includes a multi-band module for processing the original image to obtain an N-layer Laplacian pyramid of the original image and deriving a first processing-result feature map from the feature maps corresponding to the pyramid's N spatial frequency bands; the target image is obtained according to the first processing-result feature map, and N is a positive integer greater than or equal to 2.
In one embodiment, the processor 1301 is specifically configured to execute the following steps: run the image processing model and perform M initial analysis passes on the original image through the multi-band module to obtain initial feature maps at N scales, where M is a positive integer greater than or equal to 2 and M is greater than or equal to N, and the initial analysis consists of first down-sampling the original image and then performing convolution, or performing a down-scaling convolution on the original image; apply convolution to the initial feature maps at the N scales to obtain intermediate feature maps at N scales; obtain, from these, the feature maps corresponding to the N spatial frequency bands of the N-layer Laplacian pyramid; obtain the first processing-result feature map from the N frequency-band feature maps; and obtain the target image from the first processing-result feature map.
In one embodiment, among the feature maps corresponding to the N spatial frequency bands, the feature map of the N-th spatial frequency band is obtained from the N-th intermediate feature map, the one with the smallest scale; the feature map of the (N-i)-th spatial frequency band is obtained from the N-th intermediate feature map, the (N-i)-th intermediate feature map, and all intermediate feature maps between them, where i is a positive integer greater than or equal to 1 and smaller than N.
In one embodiment, the processor 1301 is specifically configured to execute the following steps: obtain a prediction feature map derived from the N-th intermediate feature map, the first intermediate feature map, and all intermediate feature maps between them; resize those of the N frequency-band feature maps whose scale is smaller than the scale of the prediction feature map, so that the adjusted maps match its scale; and obtain the first processing-result feature map from the prediction feature map and the adjusted frequency-band feature maps.
In one embodiment, the image processing model further includes a prediction module based on the attention mechanism, and the processor 1301 is configured to execute the following steps: obtain a blurred feature map produced by blurring the original image; obtain a second processing-result feature map through the prediction module from the first processing-result feature map and the blurred feature map; and obtain the target image according to the second processing-result feature map.
In one embodiment, the image processing model further includes a super-resolution module, and the processor 1301 is configured to execute the following steps: obtain a reference feature map from the first processing-result feature map, and process the second processing-result feature map into an intermediate-result feature map, where both the reference feature map and the intermediate-result feature map have the same scale as the original image; and obtain the target image from the reference feature map and the intermediate-result feature map through the super-resolution module.
In one embodiment, the processor 1301 is further configured to execute the following steps: obtain the moiré training data set, which includes matching image pairs, each consisting of a training image and a supervised image, where the training image has moiré and the supervised image does not; and train the initial model according to each matching image pair in the moiré training data set so as to obtain the image processing model.
In one embodiment, the processor 1301 is specifically configured to execute the following steps: disassemble each pixel of the original image data into three side-by-side sub-pixels to obtain a sub-pixel image, each sub-pixel corresponding to one color, the colors of the original image data being red, green and blue (RGB); resize the sub-pixel image to obtain a first intermediate image with the same size as the original image data; set the gray values of pixels in the first intermediate image that fall below a first threshold to a second threshold, obtaining a second intermediate image; add radial distortion to the second intermediate image to obtain a first distorted image; and perform camera-imaging simulation optimization on the first distorted image to obtain a training image with moiré added.
In one embodiment, the processor 1301 is specifically configured to execute the following steps: perform projection transformation on the first distorted image according to first projection transformation parameters to obtain a tilt-simulation image; process the tilt-simulation image according to the brightness distribution characteristics of camera imaging to obtain a brightness-simulation image; add image noise to obtain a noise-simulation image; and process the noise-simulation image according to the preset lighting coefficients to obtain a light-and-dark simulation image.
In one embodiment, the processor 1301 is specifically configured to execute the following steps: add radial distortion to the original image data to obtain a second distorted image, and perform projection transformation on the second distorted image according to the first projection transformation parameters to obtain the supervised image.
In one embodiment, the processor 1301 is specifically configured to execute the following steps: display the original image data on an electronic display screen and photograph it to obtain a training image; perform feature-point matching and optical-flow alignment on the original image data according to the training image to obtain a supervised image; and construct a matching image pair based on the training image and the corresponding supervised image.
In one embodiment, the initial model includes a multi-band module to be trained, an attention-based prediction module to be trained, and a super-resolution module to be trained, and the processor 1301 is configured to execute the following steps: use the training image of a matching image pair of the moiré training data set as the input of the initial model and, after processing by the initial model, obtain the first result output by the multi-band module, the second result output by the prediction module, and the third result output by the super-resolution module, where the first result includes the feature maps of the N spatial frequency bands corresponding to the N-layer Laplacian pyramid of the training image, the second result includes the second-result feature map corresponding to the training image, and the third result includes the target image corresponding to the training image; perform N-layer Laplacian pyramid processing on the supervised image to obtain its N frequency-band feature maps; obtain the N basic loss function values between these and the first result's frequency-band feature maps, the first loss function value between the resolution-processed supervised image and the second result, and the second loss function value between the supervised image and the third result; and optimize the model parameters of the initial model according to the N basic loss function values, the first loss function value and the second loss function value, so as to obtain the image processing model.
In one embodiment, the processor 1301 is specifically configured to execute the following steps: resize the images of a target matching image pair of the moiré training data set into P deformed matching image pairs of different sizes; train the initial model on the P deformed pairs to obtain P loss function values; and optimize the model parameters of the initial model according to the P loss function values, so as to obtain the image processing model.
In one embodiment, a model trained on each matching image pair in the moiré training data set is used as the first image processing model, and the processor 1301 is further configured to execute the following steps: enlarge the training images of the moiré image set to a target size to obtain enlarged moiré images; run the first image processing model on them to remove moiré and obtain moiré-supervised images; scale the moiré-supervised images together with the enlarged moiré images to obtain multiple moiré image pairs; and optimize the model parameters of the first image processing model according to these pairs, so as to obtain the image processing model.
In one embodiment, the processor 1301 is specifically configured to execute the following steps: take the model obtained by optimizing the first image processing model's parameters with the multiple moiré image pairs as the first version of the image processing model; and use the first version as a new first image processing model, training it according to the moiré image set to obtain a second version of the image processing model. Where the image processing model includes the first version and the second version, the processor 1301 is configured to execute the following steps: run the first-version and second-version image processing models to perform moiré removal on the original image, obtaining a first image and a second image; and obtain the target image according to the first image and the second image.
Aiming at the differences moiré exhibits at different scales and in different spatial frequency bands, the embodiments of this application design, inside the moiré removal image processing model, a multi-scale model built on the Laplacian pyramid. Using the feature maps of the pyramid's multiple frequency bands, an image processing model can be trained that removes moiré fairly comprehensively across scales and frequency bands, conveniently achieving a better moiré removal effect.
Those of ordinary skill in the art will understand that all or part of the processes of the above method embodiments can be implemented by a computer program instructing the relevant hardware; the program can be stored in a computer-readable storage medium and, when executed, may include the processes of the above method embodiments. The storage medium may be a magnetic disk, an optical disc, a read-only memory (ROM), a random access memory (RAM), or the like.
In other embodiments, the present application further provides a computer program product, which includes instructions that, when run on a computer, cause the computer to execute the image processing method of any of the above embodiments.

Abstract

This application discloses an image processing method, a smart device and a computer-readable storage medium. The method includes: obtaining an original image; and running an image processing model to perform moiré removal processing on the original image to obtain a target image. The image processing model is a network model trained in advance according to a moiré training data set, and it includes a multi-band module for processing the original image to obtain an N-layer Laplacian pyramid of the original image and deriving a first processing-result feature map from the feature maps corresponding to the pyramid's N spatial frequency bands; the target image is obtained according to the first processing-result feature map, and N is a positive integer greater than or equal to 2. With this image processing method, a better moiré removal effect can be obtained.


Claims (17)

  1. 一种图像处理方法,由智能设备执行,所述方法包括:
    获取原始图像;
    运行图像处理模型对所述原始图像进行摩尔纹去除处理,得到目标图像;
    其中,所述图像处理模型是预先根据摩尔纹训练数据集合训练得到的网络模型;
    并且,所述图像处理模型包括:多频带模块,所述多频带模块用于对所述原始图像进行处理得到关于所述原始图像的N层拉普拉斯金字塔,并基于所述N层拉普拉斯金字塔的N个空间频带所对应的特征图得到第一处理结果特征图,所述目标图像根据所述第一处理结果特征图获得,所述N为大于等于2的正整数。
  2. 如权利要求1所述的方法,所述运行图像处理模型对所述原始图像进行摩尔纹去除处理,包括:
    运行图像处理模型,通过所述多频带模块对所述原始图像进行M次初始分析处理,得到N个尺度下的初始特征图,所述M为大于等于2的正整数,所述M大于或等于所述N,所述初始分析处理包括:对所述原始图像先进行下采样、再进行卷积处理,或者,对所述原始图像进行降尺度的卷积处理;
    对所述N个尺度下的初始特征图进行卷积处理,得到N个尺度下的中间特征图;
    根据N个尺度下的中间特征图,得到N层拉普拉斯金字塔的N个空间频带所对应的特征图;
    根据N个空间频带所对应的特征图得到第一处理结果特征图;
    根据所述第一处理结果特征图得到所述目标图像。
  3. 如权利要求2所述的方法,
    所述N个空间频带所对应的特征图中,第N个空间频带的特征图根据所述中间特征图中尺度最小的第N个中间特征图获得;
    所述N个空间频带所对应的特征图中,第N-i个空间频带的特征图根据第N个中间特征图、第N-i个中间特征图及第N个中间特征图到第N-i个中间特征图之间的所有中间特征图获得,所述i为大于等于1且小于N的正整数。
  4. 如权利要求2所述的方法,所述根据N个空间频带所对应的特征图得到第一处理结果特征图,包括:
    获取预测特征图,所述预测特征图是根据第N个中间特征图、第一个中间特征图及第N个中间特征图到第一个中间特征图之间的所有中间特征图得到的;
    将N个空间频带所对应的特征图中尺度小于所述预测特征图的尺度的特征图进行尺寸调整,以使调整后的空间频带所对应的特征图的尺度等于所述预测特征图的尺度;
    根据所述预测特征图和调整后的空间频带所对应的特征图,得到第一处理结果特征图。
  5. The method according to claim 2, wherein the image processing model further includes a prediction module, and obtaining the target image according to the first processing result feature map comprises:
    acquiring a blurred feature map obtained by performing blurring processing on the original image;
    obtaining a second processing result feature map through the prediction module according to the first processing result feature map and the blurred feature map;
    obtaining the target image according to the second processing result feature map.
  6. The method according to claim 5, wherein the image processing model further includes a super-resolution module, and obtaining the target image according to the second processing result feature map comprises:
    obtaining a reference feature map according to the first processing result feature map, and processing the second processing result feature map to obtain an intermediate result feature map, the scale of the reference feature map being the same as the scale of the original image and the scale of the intermediate result feature map being the same as the scale of the original image;
    obtaining the target image according to the reference feature map and the intermediate result feature map, by means of the super-resolution module.
  7. The method according to any one of claims 1 to 6, further comprising:
    acquiring the moiré training data set, the moiré training data set including matched image pairs, a matched image pair including a training image and a supervision image, the training image containing moiré patterns and the supervision image containing no moiré patterns;
    training an initial model according to each matched image pair in the moiré training data set, to obtain the image processing model.
  8. The method according to claim 7, wherein acquiring the moiré training data set comprises:
    decomposing each pixel in original image data into three side-by-side sub-pixels to obtain a sub-pixel image, each sub-pixel corresponding to one color, the colors of the original image data including red, green and blue (RGB);
    resizing the sub-pixel image to obtain a first intermediate image whose size is the same as the image size of the original image data;
    setting the gray value of pixels in the first intermediate image whose gray value is lower than a first threshold to a second threshold, to obtain a second intermediate image;
    adding radial distortion to the second intermediate image to obtain a first distorted image;
    performing camera imaging simulation optimization on the first distorted image to obtain a training image with moiré patterns added.
  9. The method according to claim 8, wherein performing camera imaging simulation optimization on the first distorted image comprises:
    performing projection transform processing on the first distorted image according to first projection transform parameters, to obtain a tilt simulation image;
    processing the tilt simulation image according to the image brightness distribution characteristics of camera imaging, to obtain a brightness simulation image;
    adding image noise to the brightness simulation image, to obtain a noise simulation image containing noise;
    processing the noise simulation image according to a preset illumination coefficient, to obtain a light-and-shade simulation image.
  10. The method according to claim 7, wherein acquiring the moiré training data set comprises:
    displaying original image data on an electronic display screen, and photographing the original image data displayed on the electronic display screen, to obtain a training image;
    performing feature point matching and optical-flow alignment processing on the original image data according to the training image, to obtain a supervision image;
    constructing a matched image pair based on the training image and the corresponding supervision image.
  11. The method according to claim 7, wherein the initial model includes: a multi-band module to be trained, a prediction module to be trained, and a super-resolution module to be trained;
    and training the initial model according to each matched image pair in the moiré training data set comprises:
    taking a training image in a matched image pair of the moiré training data set as the input of the initial model and, after processing by the initial model, obtaining a first result output by the multi-band module, a second result output by the prediction module and a third result output by the super-resolution module, the first result including feature maps of the N spatial frequency bands corresponding to an N-layer Laplacian pyramid of the training image, the second result including a second result feature map corresponding to the training image, and the third result including a target image corresponding to the training image;
    performing N-layer Laplacian pyramid processing on the supervision image in the matched image pair of the moiré training data set, to obtain feature maps of N spatial frequency bands corresponding to the supervision image;
    obtaining N basic loss function values between the feature maps of the N spatial frequency bands corresponding to the supervision image and the feature maps of the N spatial frequency bands in the first result, obtaining a first loss function value between the second result and an image obtained by performing resolution processing on the supervision image, and obtaining a second loss function value between the supervision image and the third result;
    optimizing the model parameters of the initial model according to the N basic loss function values, the first loss function value and the second loss function value, to obtain the image processing model.
  12. The method according to claim 7, wherein training the initial model according to each matched image pair in the moiré training data set comprises:
    resizing the images in a target matched image pair of the moiré training data set, to obtain P deformed matched image pairs of different sizes;
    training the initial model according to the P deformed matched image pairs to obtain P loss function values, and optimizing the model parameters in the initial model according to the P loss function values, to obtain the image processing model.
  13. The method according to claim 7, wherein the model trained according to each matched image pair in the moiré training data set is taken as a first image processing model, and the method further comprises:
    enlarging the training images in the moiré image set, to obtain moiré enlarged images of a target size;
    running the first image processing model to perform moiré removal processing on the moiré enlarged images, to obtain moiré supervision images;
    scaling the moiré supervision images and the moiré enlarged images, to obtain multiple moiré image pairs;
    optimizing the model parameters in the first image processing model according to the multiple moiré image pairs, to obtain the image processing model.
  14. The method according to claim 13, wherein,
    the model obtained by optimizing the model parameters in the first image processing model according to the multiple moiré image pairs is taken as a first-version image processing model;
    the first-version image processing model is taken as a new first image processing model, so that the new first image processing model is trained according to the moiré image set to obtain a second-version image processing model;
    the image processing model includes: the first-version image processing model and the second-version image processing model;
    and running the image processing model to perform moiré removal processing on the original image to obtain the target image comprises:
    running the first-version image processing model and the second-version image processing model to perform moiré removal processing on the original image, to obtain a first image and a second image;
    obtaining the target image according to the first image and the second image.
  15. A smart device, the smart device comprising: a storage apparatus and a processor;
    the storage apparatus storing program instructions for performing image processing;
    the processor invoking the program instructions to implement the image processing method according to any one of claims 1 to 14.
  16. A computer-readable storage medium, the computer-readable storage medium storing a computer program which, when executed by a processor, implements the steps of the image processing method according to any one of claims 1 to 14.
  17. A computer program product, comprising instructions which, when run on a computer, cause the computer to execute the image processing method according to any one of claims 1 to 14.
PCT/CN2021/077544 2020-04-15 2021-02-24 Image processing method, smart device, and computer-readable storage medium WO2021208600A1 (zh)

Priority Applications (3)

Application Number Priority Date Filing Date Title
EP21789264.5A EP4030379A4 (en) 2020-04-15 2021-02-24 IMAGE PROCESSING METHOD, INTELLIGENT DEVICE, AND COMPUTER READABLE STORAGE MEDIA
JP2022533195A JP7357998B2 (ja) 2020-04-15 2021-02-24 画像処理方法、スマート機器及びコンピュータプログラム
US17/711,852 US20220222786A1 (en) 2020-04-15 2022-04-01 Image processing method, smart device, and computer readable storage medium

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202010295743.1 2020-04-15
CN202010295743.1A CN111476737B (zh) 2020-04-15 2020-04-15 Image processing method, smart device, and computer-readable storage medium

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US17/711,852 Continuation US20220222786A1 (en) 2020-04-15 2022-04-01 Image processing method, smart device, and computer readable storage medium

Publications (1)

Publication Number Publication Date
WO2021208600A1 true WO2021208600A1 (zh) 2021-10-21

Family

ID=71753473

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/077544 WO2021208600A1 (zh) 2020-04-15 2021-02-24 Image processing method, smart device, and computer-readable storage medium

Country Status (5)

Country Link
US (1) US20220222786A1 (zh)
EP (1) EP4030379A4 (zh)
JP (1) JP7357998B2 (zh)
CN (1) CN111476737B (zh)
WO (1) WO2021208600A1 (zh)

Families Citing this family (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111476737B (zh) * 2020-04-15 2022-02-11 Tencent Technology (Shenzhen) Company Limited Image processing method, smart device, and computer-readable storage medium
CN111709890B (zh) * 2020-06-12 2023-11-24 Beijing Xiaomi Pinecone Electronics Co., Ltd. Training method and apparatus for an image enhancement model, and storage medium
CN113744140A (zh) * 2020-10-16 2021-12-03 Beijing Wodong Tianjun Information Technology Co., Ltd. Image processing method, device, and computer-readable storage medium
CN112508801A (zh) * 2020-10-21 2021-03-16 Huawei Technologies Co., Ltd. Image processing method and computing device
CN112712467B (zh) * 2021-01-11 2022-11-11 Zhengzhou University of Science and Technology Image processing method based on computer vision and color filter array
CN112884666B (zh) * 2021-02-02 2024-03-19 Hangzhou Haikang Huiying Technology Co., Ltd. Image processing method, apparatus, and computer storage medium
CN112906710A (zh) * 2021-03-26 2021-06-04 Beijing University of Posts and Telecommunications Visual image feature extraction method based on bakaze-magsac
CN113486861A (zh) * 2021-08-03 2021-10-08 Beijing Baidu Netcom Science and Technology Co., Ltd. Moiré image generation method and apparatus
EP4152248A1 (en) * 2021-09-20 2023-03-22 Koninklijke Philips N.V. Medical image analysis system
CA3233549A1 (en) * 2021-09-30 2023-04-06 Peking University Systems and methods for image processing
CN115205738B (zh) * 2022-07-05 2023-08-01 Guangzhou Heda Water Technology Co., Ltd. Emergency drainage method and system applied to urban waterlogging
CN115311145A (zh) * 2022-08-12 2022-11-08 China Telecom Corporation Limited Image processing method and apparatus, electronic device, and storage medium
CN115631115B (zh) * 2022-12-08 2023-03-28 Institute of Automation, Chinese Academy of Sciences Dynamic image restoration method based on recursive Transformer

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8107764B2 (en) * 2005-05-11 2012-01-31 Fujifilm Corporation Image processing apparatus, image processing method, and image processing program
CN108154487A (zh) * 2017-12-25 2018-06-12 Tianjin University Method for eliminating moiré in screen-shot images based on multi-channel decomposition
CN110287969A (zh) * 2019-06-14 2019-09-27 Dalian University of Technology Moiré text image binarization system based on graph residual attention network
CN111476737A (zh) * 2020-04-15 2020-07-31 Tencent Technology (Shenzhen) Company Limited Image processing method, smart device, and computer-readable storage medium

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2002171410A (ja) * 2000-12-01 2002-06-14 Minolta Co Ltd Image processing apparatus
CN106875346A (zh) * 2016-12-26 2017-06-20 Qiku Internet Network Scientific (Shenzhen) Co., Ltd. Image processing method, apparatus, and terminal device
CN107424123B (zh) * 2017-03-29 2020-06-23 Beijing Yuanli Education Technology Co., Ltd. Moiré removal method and apparatus
CN108389164B (zh) * 2018-01-31 2021-05-11 Shenzhen Shangju Vision Technology Co., Ltd. Method for removing image moiré based on frequency-domain analysis
CN108846818B (zh) * 2018-06-25 2021-03-02 OPPO (Chongqing) Intelligent Technology Co., Ltd. Method, apparatus, terminal, and computer-readable storage medium for removing moiré
CN110738609B (zh) 2019-09-11 2022-05-06 Peking University Method and apparatus for removing image moiré

Also Published As

Publication number Publication date
EP4030379A4 (en) 2023-01-11
CN111476737B (zh) 2022-02-11
EP4030379A1 (en) 2022-07-20
US20220222786A1 (en) 2022-07-14
JP2023504669A (ja) 2023-02-06
CN111476737A (zh) 2020-07-31
JP7357998B2 (ja) 2023-10-10

Similar Documents

Publication Publication Date Title
WO2021208600A1 (zh) 2021-10-21 Image processing method, smart device, and computer-readable storage medium
Yue et al. Supervised raw video denoising with a benchmark dataset on dynamic scenes
CN108898567B (zh) 图像降噪方法、装置及系统
Ignatov et al. Dslr-quality photos on mobile devices with deep convolutional networks
US20220014684A1 (en) Image display method and device
US10708525B2 (en) Systems and methods for processing low light images
CN113508416B (zh) 图像融合处理模块
Mousavi et al. Sparsity-based color image super resolution via exploiting cross channel constraints
Afifi et al. Cie xyz net: Unprocessing images for low-level computer vision tasks
WO2023151511A1 (zh) 模型训练方法、图像去摩尔纹方法、装置及电子设备
Guan et al. Srdgan: learning the noise prior for super resolution with dual generative adversarial networks
Lv et al. Low-light image enhancement via deep Retinex decomposition and bilateral learning
Xu et al. Deep joint demosaicing and high dynamic range imaging within a single shot
Zhang et al. Deep motion blur removal using noisy/blurry image pairs
CN111353965A (zh) 图像修复方法、装置、终端及存储介质
Weng et al. Boosting event stream super-resolution with a recurrent neural network
Soh et al. Joint high dynamic range imaging and super-resolution from a single image
US20230016350A1 (en) Configurable keypoint descriptor generation
US20220398704A1 (en) Intelligent Portrait Photography Enhancement System
Deng et al. Selective kernel and motion-emphasized loss based attention-guided network for HDR imaging of dynamic scenes
US11810266B2 (en) Pattern radius adjustment for keypoint descriptor generation
US11968471B2 (en) Sliding window for image keypoint detection and descriptor generation
CN113724153A (zh) Machine-learning-based method for eliminating redundant persons in an image
CN114387443A (zh) Image processing method, storage medium, and terminal device
Jin et al. Boosting single image super-resolution learnt from implicit multi-image prior

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application
    Ref document number: 21789264
    Country of ref document: EP
    Kind code of ref document: A1
ENP Entry into the national phase
    Ref document number: 2021789264
    Country of ref document: EP
    Effective date: 20220414
ENP Entry into the national phase
    Ref document number: 2022533195
    Country of ref document: JP
    Kind code of ref document: A
NENP Non-entry into the national phase
    Ref country code: DE