WO2023151511A1 - Model training method, image moiré removal method, apparatus, and electronic device - Google Patents

Model training method, image moiré removal method, apparatus, and electronic device

Info

Publication number
WO2023151511A1
Authority
WO
WIPO (PCT)
Prior art keywords
moiré
feature extraction
image
extraction layer
model
Prior art date
Application number
PCT/CN2023/074325
Other languages
English (en)
French (fr)
Inventor
刘晗 (Liu Han)
Original Assignee
Vivo Mobile Communication Co., Ltd. (维沃移动通信有限公司)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Vivo Mobile Communication Co., Ltd. (维沃移动通信有限公司)
Publication of WO2023151511A1 publication Critical patent/WO2023151511A1/zh

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00Image enhancement or restoration
    • G06T5/77Retouching; Inpainting; Scratch removal
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/0464Convolutional networks [CNN, ConvNet]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/048Activation functions
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T3/00Geometric image transformations in the plane of the image
    • G06T3/40Scaling of whole images or parts thereof, e.g. expanding or contracting
    • G06T3/4038Image mosaicing, e.g. composing plane images from plane sub-images
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00Image enhancement or restoration
    • G06T5/50Image enhancement or restoration using two or more images, e.g. averaging or subtraction
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00Image enhancement or restoration
    • G06T5/60Image enhancement or restoration using machine learning, e.g. neural networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/10Segmentation; Edge detection
    • G06T7/11Region-based segmentation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20021Dividing image into blocks, subimages or windows
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20081Training; Learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20084Artificial neural networks [ANN]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20212Image combination
    • G06T2207/20216Image averaging

Definitions

  • the application belongs to the technical field of artificial intelligence, and specifically relates to a model training method, an image moiré removal method, a device, and an electronic device.
  • Moiré is a type of high-frequency interference fringe produced on the photosensitive element of a digital camera or scanner; it appears as irregular colored stripes in the image.
  • existing moiré removal methods fall into two categories: one uses traditional techniques that process the moiré image on the YUV channels by exploiting the spatial and frequency characteristics of moiré.
  • the density and color of moiré patterns vary widely, so traditional methods are not robust to moiré pattern removal.
  • the other is to use deep learning: a network learns the mapping from moiré images to moiré-free images, and the trained network model is then used to remove the moiré in an image.
  • the purpose of the embodiments of the present application is to provide a model training method, an image moiré removal method, an apparatus and an electronic device, which can solve the problem of low moiré removal efficiency in the prior art.
  • in a first aspect, the embodiment of the present application provides a model training method, the method comprising:
  • acquiring a plurality of moiré sample images and corresponding moiré-free sample images; constructing a model to be trained, wherein the model to be trained is constructed based on a lightweight network, and the lightweight network includes a plurality of feature extraction layers of different scales;
  • inputting the plurality of moiré sample images into the model to be trained respectively, and obtaining a first loss according to the predicted image output by the smallest-scale feature extraction layer in the model to be trained and the downsampled moiré-free sample image of the same scale; updating the parameters of the smallest-scale feature extraction layer according to the first loss until a preset training condition is met; and after the training of the smallest-scale feature extraction layer is completed, applying the same training process to the feature extraction layer of the next larger scale in the model to be trained, until training is completed on the largest-scale feature extraction layer, to obtain the target model.
  • in a second aspect, the embodiment of the present application provides an image moiré removal method for performing moiré removal processing based on the target model of the first aspect, the method comprising:
  • receiving a second moiré image to be processed; when the size of the second moiré image exceeds the maximum size recognizable by the target model, dividing the second moiré image into N moiré sub-images, wherein each sub-image in the N moiré sub-images overlaps its adjacent sub-images, and N is an integer greater than 1;
  • inputting the N moiré sub-images into the target model respectively for processing to obtain N moiré-free sub-images; and splicing the N moiré-free sub-images, performing a pixel-weighted average operation on the overlapping regions during splicing, to obtain a second moiré-free image corresponding to the second moiré image.
  • in a third aspect, the embodiment of the present application provides a model training device, the device comprising:
  • an acquisition module, configured to acquire a plurality of moiré sample images and corresponding moiré-free sample images;
  • a construction module, configured to construct a model to be trained, wherein the model to be trained is constructed based on a lightweight network, and the lightweight network includes a plurality of feature extraction layers of different scales;
  • a training module, configured to input the plurality of moiré sample images into the model to be trained respectively, and obtain a first loss according to the predicted image output by the smallest-scale feature extraction layer in the model to be trained and the downsampled moiré-free sample image of the same scale; update the parameters of the smallest-scale feature extraction layer according to the first loss until a preset training condition is met; and after the training of the smallest-scale feature extraction layer is completed, apply the same training process to the feature extraction layer of the next larger scale in the model to be trained, until training is completed on the largest-scale feature extraction layer, to obtain the target model.
  • in a fourth aspect, the embodiment of the present application provides an image moiré removal device for performing moiré removal based on the target model of the third aspect, the device comprising:
  • a receiving module configured to receive the second moiré image to be processed
  • a segmentation module, configured to segment the second moiré image into N moiré sub-images when the size of the second moiré image exceeds the maximum size recognizable by the target model, wherein each sub-image in the N moiré sub-images overlaps its adjacent sub-images, and N is an integer greater than 1;
  • the first processing module is configured to respectively input the N moiré sub-images into the target model for processing to obtain N moiré-free sub-images;
  • the second processing module is configured to perform splicing processing on the N moiré-free sub-images, and perform a pixel weighted average operation on overlapping regions in the splicing process to obtain a second moiré-free image corresponding to the second moiré pattern image.
  • in a fifth aspect, the embodiment of the present application provides an electronic device, the electronic device comprising a processor and a memory, wherein the memory stores programs or instructions that can run on the processor, and when the programs or instructions are executed by the processor, the steps of the method described in the first aspect or the second aspect are implemented.
  • in a sixth aspect, the embodiment of the present application provides a readable storage medium, on which programs or instructions are stored, and when the programs or instructions are executed by a processor, the steps of the method described in the first aspect or the second aspect are implemented.
  • in a seventh aspect, the embodiment of the present application provides a chip, the chip comprising a processor and a communication interface, wherein the communication interface is coupled to the processor, and the processor is configured to run programs or instructions so as to implement the method described in the first aspect or the second aspect.
  • in an eighth aspect, an embodiment of the present application provides a computer program product, wherein the program product is stored in a storage medium and executed by at least one processor to implement the method described in the first aspect or the second aspect.
  • in the embodiments of the present application, multiple moiré sample images and corresponding moiré-free sample images can be obtained as training data; a lightweight moiré-removal network can be constructed as the model to be trained; and the model to be trained can be trained with the training data to obtain a target model for removing moiré from an input image.
  • in the embodiment of the present application, the existing deep learning network can be compressed and quantized to obtain a lightweight network, and model training is performed on the lightweight network, reducing the computing power required by the model without loss of accuracy. This allows the moiré-removal network to be deployed on the electronic device side, so that the moiré-removal function is triggered automatically when a user captures images with an electronic device, quickly producing a moiré-free high-definition image that faithfully restores the captured scene and improving the efficiency of moiré removal.
  • when the target model is used for moiré removal on a large image, the image to be processed can be divided into multiple parts with overlapping areas; each part is input into the model separately to obtain a corresponding moiré-free high-definition part, the parts are then stitched together, and a pixel-level weighted average is computed over the overlapping areas of adjacent parts to obtain the final complete moiré-free image.
  • FIG. 1 is a flow chart of a model training method provided in an embodiment of the present application.
  • FIG. 2 is a flow chart of a method for generating training data provided in an embodiment of the present application.
  • FIG. 3 is a flow chart of a lightweight network generation process provided by an embodiment of the present application.
  • FIG. 4 is an example diagram of the PyNET network provided by an embodiment of the present application.
  • FIG. 5 is an example diagram of a lightweight network provided by an embodiment of the present application.
  • FIG. 6 is a flow chart of the model training process based on the lightweight network provided by an embodiment of the present application.
  • FIG. 7 is a flow chart of an image moiré removal method provided by an embodiment of the present application.
  • FIG. 8 is a structural block diagram of a model training device provided by an embodiment of the present application.
  • FIG. 9 is a structural block diagram of an image moiré removal device provided in an embodiment of the present application.
  • FIG. 10 is a schematic structural diagram of an electronic device provided by an embodiment of the present application.
  • FIG. 11 is a schematic diagram of a hardware structure of an electronic device implementing various embodiments of the present application.
  • Embodiments of the present application provide a model training method, an image moiré removal method, device, and electronic equipment.
  • Model compression: simplifying a trained deep model to obtain a lightweight network of comparable accuracy.
  • the compressed network has a smaller structure and fewer parameters, which can effectively reduce computing and storage overhead and facilitate deployment in a constrained hardware environment.
  • AF (Automatic Focus): the camera uses an electronic rangefinder to adjust focus automatically, locking onto the distance and movement of the target; the electronic rangefinder drives the lens back and forth to the corresponding position. The camera needs to be aimed at the subject when shooting, otherwise the picture may come out blurred because the focus is not sharp.
  • Lens distortion: a general term for the inherent perspective distortion of optical lenses, including pincushion distortion, barrel distortion, linear distortion, and so on.
  • FLOPS (Floating-point Operations Per Second): the number of floating-point operations performed per second, often used to estimate the computing power required by a deep learning model; the larger the value, the greater the amount of computation the model requires.
  • FIG. 1 is a flow chart of a model training method provided by the embodiment of the present application. As shown in FIG. 1, the method may include the following steps: step 101, step 102 and step 103, wherein,
  • step 101: a plurality of moiré sample images and corresponding moiré-free sample images are acquired.
  • multiple moiré pattern sample images and corresponding non-moiré pattern sample images are used as training data.
  • step 102: a model to be trained is constructed, wherein the model to be trained is constructed based on a lightweight network, and the lightweight network includes multiple feature extraction layers of different scales.
  • feature extraction layers of different scales are used to extract features of different scales of the input image.
  • the existing deep learning network can be compressed and quantized to obtain a lightweight network.
  • the generation process of the lightweight network may include: obtaining the PyNET network, deleting the feature extraction layers of specific scales in the PyNET network, reducing the number of convolution kernel channels of the retained feature extraction layers to preset values, and modifying the activation function and normalization function in the retained feature extraction layers, to obtain the lightweight network, in which a feature extraction layer of a specific scale is used to extract features of that scale from the input image.
  • step 103: the plurality of moiré sample images are respectively input into the model to be trained, and a first loss is obtained according to the predicted image output by the smallest-scale feature extraction layer in the model to be trained and the downsampled moiré-free sample image of the same scale; the parameters of the smallest-scale feature extraction layer are updated according to the first loss until a preset training condition is met; after the training of the smallest-scale feature extraction layer is completed, the same training process is applied to the feature extraction layer of the next larger scale in the model to be trained, until training is completed on the largest-scale feature extraction layer, to obtain the target model.
  • the feature extraction layers are trained sequentially from the smallest scale upward: after the smallest-scale feature extraction layer is pre-trained, the same process is applied to the feature extraction layer of the adjacent larger scale, until training is completed on the largest-scale feature extraction layer.
  • the existing deep learning network can be compressed and quantized to obtain a lightweight network, and model training can be performed on the lightweight network, so that the computing power required by the model is reduced without loss of accuracy. The moiré-removal network can thus be deployed on the electronic device side, the moiré-removal function can be triggered automatically when users capture images with electronic devices, and a moiré-free high-definition image that faithfully restores the captured scene is obtained quickly, improving the efficiency of moiré removal.
  • FIG. 2 is a flow chart of a method for generating training data provided by the embodiment of the present application, including the following steps: Step 201, Step 202 and Step 203,
  • step 201 a screenshot from a display device is obtained.
  • the screenshot is an image obtained by performing a screenshot operation on an image displayed on a screen of a display device.
  • Screenshots are high-resolution images without moiré.
  • step 202: when the camera is in focus, the white image displayed on the display device is captured to obtain a first moiré image, and a moiré sample image is generated based on the screenshot, the white image, and the first moiré image.
  • the white image is a pure white image, where the pixel value of each pixel is 255.
  • the display device can be a computer. Since the moiré pattern is mainly the combined result of the refresh frequency of the display screen and the sampling frequency of the camera, it is essentially independent of the picture displayed on the screen. Therefore, in the embodiment of this application, a pure white background image is first used as the material for moiré shooting to obtain the first moiré image.
  • the moiré pattern captured by the camera can be regarded as a complex additive noise
  • this noise is related to the shooting angle and lens parameters and has nothing to do with the background image displayed on the screen of the display device; therefore, in the embodiment of the present application, the first moiré image and the screenshot can be combined under this model to synthesize a moiré sample image.
  • step 202 may specifically include the following steps (not shown in the figure): step 2021, step 2022 and step 2023, wherein,
  • step 2021: the RGB value I_bg of each pixel in the screenshot, the RGB value I_0 of each pixel in the white image, and the RGB value I_moire1 of each pixel in the first moiré image are acquired.
  • step 2022: the moiré noise I_moire-feature is calculated according to I_0 and I_moire1.
  • step 2023: the RGB value I_moire2 of each pixel in the moiré sample image is calculated according to I_moire-feature and I_bg, and the moiré sample image is generated according to I_moire2.
  • step 203: when the camera is out of focus, the white image displayed on the display device is captured to obtain a first moiré-free image, and a moiré-free sample image corresponding to the moiré sample image is generated according to the screenshot, the white image, and the first moiré-free image.
  • step 203 may specifically include the following steps (not shown in the figure): step 2031, step 2032 and step 2033, wherein,
  • step 2031: the RGB value I_clean1 of each pixel in the first moiré-free image is obtained.
  • step 2032: the moiré-free noise I_clean-feature is calculated according to I_clean1 and I_0.
  • step 2033: the RGB value I_clean2 of each pixel in the moiré-free sample image is calculated according to I_clean-feature and I_bg, and the moiré-free sample image is generated according to I_clean2. A sketch of this synthesis follows.
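  • A minimal sketch of this synthesis under the additive-noise model described above; extracting the noise by channel-wise subtraction from the pure white reference is an assumption, since the patent states the model is additive but does not give the exact formulas.

```python
import numpy as np

def synthesize_pair(i_bg, i_0, i_moire1, i_clean1):
    """Synthesize a (moire, moire-free) sample pair from raw captures.

    i_bg:     screenshot from the display device, (H, W, 3) uint8 RGB
    i_0:      pure white image (every pixel 255)
    i_moire1: in-focus capture of the white screen (first moire image)
    i_clean1: out-of-focus capture of the white screen (first moire-free image)
    """
    i_bg = i_bg.astype(np.float32)
    i_0 = i_0.astype(np.float32)
    # Assumed extraction: moire noise relative to the pure white reference.
    i_moire_feature = i_moire1.astype(np.float32) - i_0
    # Illumination/shading of the same shot without moire.
    i_clean_feature = i_clean1.astype(np.float32) - i_0
    # Additive model: apply each noise field to the screenshot, so both
    # images of the pair keep the real illumination of the capture.
    i_moire2 = np.clip(i_bg + i_moire_feature, 0, 255).astype(np.uint8)
    i_clean2 = np.clip(i_bg + i_clean_feature, 0, 255).astype(np.uint8)
    return i_moire2, i_clean2
```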
  • in this way, the moiré in the synthesized image is real and very close to real scenes, so there is no problem of good performance on the training set but poor performance in real tests.
  • specifically, a pure white background image can be used as the material for moiré shooting, and the captured pure-white-background moiré image and the screenshot can be combined through the noise model to synthesize a moiré sample image. The camera is then kept in the same position and defocused, and the pure white background is shot again; since no moiré appears in the out-of-focus state, this yields a pure-white-background picture without moiré.
  • the illumination and shading of this moiré-free pure white image are essentially the same as those of the first moiré image.
  • both the synthesized moiré sample image and the moiré-free sample image retain the illumination information of the original image
  • the illumination information of the image is also retained in the trained network model.
  • when the network model is used for moiré removal, the colors of the original image can be faithfully restored, and the de-moiréd image restores the real appearance seen by the human eye at shooting time, so that the image matches human perception and the moiré removal effect looks more natural.
  • the lightweight network is a network obtained by transforming the PyNET network
  • the generation of the lightweight network may include the following steps: step 301, step 302 and step 303, wherein,
  • step 301: the PyNET network is obtained, wherein the PyNET network includes: an input layer, a first feature extraction layer, a second feature extraction layer, a third feature extraction layer, a fourth feature extraction layer and a fifth feature extraction layer; the first to fifth feature extraction layers are respectively used to extract features of five different scales from the input image, and the scale of the features extracted by the i-th feature extraction layer is larger than that extracted by the (i+1)-th feature extraction layer, 1 ≤ i < 5.
  • the existing deep learning network can be transformed to obtain a lightweight network; for example, the PyNET network from the paper "Replacing Mobile Camera ISP with a Single Deep Learning Model" is shown in Figure 4.
  • the five Level layers in Figure 4 (Level1, Level2, Level3, Level4 and Level5) correspond to the first, second, third, fourth and fifth feature extraction layers, respectively. Among them, the Level1 layer has the largest amount of computation and the Level5 layer the smallest.
  • step 302: the first feature extraction layer and the second feature extraction layer in the PyNET network are deleted, and the third feature extraction layer, the fourth feature extraction layer and the fifth feature extraction layer are retained; the number of convolution kernel channels of the third feature extraction layer is adjusted from a first value to a second value, that of the fourth feature extraction layer from a third value to a fourth value, and that of the fifth feature extraction layer from a fifth value to a sixth value, where the first value is greater than the second value, the third value is greater than the fourth value, and the fifth value is greater than the sixth value.
  • the Level1 and Level2 layers of the PyNET network are removed, and only the Level3, Level4, and Level5 layers are retained.
  • the network structure changes from a five-layer pyramid to a three-layer pyramid.
  • the network input size is 512*512
  • the PyNET network sends the input image to the Level5 layer after 4 downsamplings, so the feature map output by Level5 is 32*32; the feature map sent to the Level4 layer after 3 downsamplings is 64*64, and so on up the pyramid, until the Level1 layer finally produces the 512*512 output.
  • the structure of the modified lightweight network contains only 3 Level layers. The network input is sent to the Level5 layer after 2 downsamplings, where the output feature map is 128*128. After the Level5 features are obtained and upsampled, they are concatenated with the input features of the Level4 layer (the feature map delivered to Level4 after one downsampling is 256*256).
  • similarly, after the Level4 features are obtained and upsampled, they are concatenated with the input features of the Level3 layer (feature map size 512*512).
  • the output of the final Level3 layer is the de-moiré image predicted by the final network model.
  • the number of convolution kernel channels used in the Level5 layer, Level4 layer and Level3 layer in the PyNET network is 512, 256, and 128, respectively.
  • regarding the number of channels of a convolution kernel (i.e., a filter): if the kernel size is K x K and the input has C channels, one convolution kernel has dimension K x K x C, and convolving it with the input produces one channel of the output; with P such kernels, the output has P channels.
  • the improved lightweight network reduces the number of convolution kernel channels of the Level5, Level4 and Level3 layers to 128, 64 and 32, respectively. Since both the number of kernels and the dimension of each kernel are reduced, every multiplication between the convolution kernels and the input matrix becomes much cheaper, and the number of output channels decreases accordingly.
  • the output of the previous layer is used as the input of the next layer, so this operation can reduce the calculation amount of each subsequent layer exponentially.
  • step 303: the first normalization function in the third feature extraction layer, the fourth feature extraction layer and the fifth feature extraction layer is deleted, a second normalization function is added in the input layer, and the activation functions in the third, fourth and fifth feature extraction layers are changed to the hyperbolic tangent function, to obtain the lightweight network; the second normalization function is used to normalize the pixel values of the input image from the range (0,255) to the range (-1,1).
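  • As an illustration of the resulting architecture, the following PyTorch sketch collapses each retained Level into a single convolution block. This is a minimal sketch, not the actual multi-branch PyNET blocks: the level structure (three scales, 128/64/32 channels, tanh activations, no per-layer normalization, input normalized from (0,255) to (-1,1)) follows the description above, while the 3x3 kernel size, the pooling/upsampling operators, and the per-level output heads used for progressive supervision are assumptions added for illustration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def conv_block(in_ch, out_ch):
    # Per step 303: plain convolutions with tanh activations and no
    # per-layer normalization inside the feature extraction layers.
    return nn.Sequential(nn.Conv2d(in_ch, out_ch, 3, padding=1), nn.Tanh())

class LightDemoireNet(nn.Module):
    """Three-level pyramid: Level5/4/3 kept, 128/64/32 kernel channels."""

    def __init__(self):
        super().__init__()
        self.level5 = conv_block(3, 128)        # coarsest scale (1/4 size)
        self.level4 = conv_block(128 + 3, 64)   # 1/2 size, fused with skip
        self.level3 = conv_block(64 + 3, 32)    # full resolution
        # Assumed per-level heads so each scale can be supervised alone.
        self.head5 = nn.Conv2d(128, 3, 3, padding=1)
        self.head4 = nn.Conv2d(64, 3, 3, padding=1)
        self.head3 = nn.Conv2d(32, 3, 3, padding=1)

    def forward(self, x):
        x = x / 127.5 - 1.0              # input layer: (0,255) -> (-1,1)
        x2 = F.avg_pool2d(x, 2)          # one downsampling  -> Level4 input
        x4 = F.avg_pool2d(x2, 2)         # two downsamplings -> Level5 input
        f5 = self.level5(x4)             # e.g. 128*128 for a 512*512 input
        f4 = self.level4(torch.cat([F.interpolate(f5, scale_factor=2.0), x2], 1))
        f3 = self.level3(torch.cat([F.interpolate(f4, scale_factor=2.0), x], 1))
        return {"level5": self.head5(f5),   # 1/4-scale prediction
                "level4": self.head4(f4),   # 1/2-scale prediction
                "level3": self.head3(f3)}   # full-scale de-moire prediction
```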
  • the training process of the target model may specifically include the following steps: step 601, step 602 and step 603, wherein,
  • step 601: the plurality of moiré sample images are respectively input into the model to be trained, a first loss is obtained according to the predicted image output by the fifth feature extraction layer in the model to be trained and the 4x-downsampled moiré-free sample image, and the parameters of the fifth feature extraction layer are updated according to the first loss until convergence, to obtain a first intermediate model; the first loss indicates the difference between the predicted image output by the fifth feature extraction layer and the 4x-downsampled moiré-free sample image.
  • step 602: the plurality of moiré sample images are respectively input into the first intermediate model, a second loss is obtained according to the predicted image output by the fourth feature extraction layer in the first intermediate model and the 2x-downsampled moiré-free sample image, and the parameters of the first intermediate model are updated according to the second loss until convergence, to obtain a second intermediate model; the second loss indicates the difference between the predicted image output by the fourth feature extraction layer and the 2x-downsampled moiré-free sample image.
  • step 603: the plurality of moiré sample images are respectively input into the second intermediate model, a third loss is obtained according to the predicted image output by the third feature extraction layer in the second intermediate model and the corresponding moiré-free sample image, and the model parameters of the second intermediate model are updated according to the third loss until convergence, to obtain the target model; the third loss indicates the difference between the predicted image output by the third feature extraction layer and the corresponding moiré-free sample image.
  • that is, the Level5 layer is trained first: the moiré sample images are input into the model to be trained, the first loss is obtained from the image output by the Level5 layer and the 4x-downsampled clean image, and the model parameters are updated according to the first loss until the model training condition is met. After the Level5 training is completed, its parameters are imported and the Level4 layer is trained, with the second loss obtained from the image output by the Level4 layer and the 2x-downsampled clean image. By analogy, after the Level4 training is completed, the parameters are imported and the Level3 layer is trained, finally producing a predicted image with the same resolution as the input. A sketch of this progressive schedule follows.
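  • A minimal sketch of this progressive schedule, reusing the LightDemoireNet sketch above; the L1 loss, the Adam optimizer, the learning rate, and the stand-in `loader` of (moiré, clean) tensor pairs are assumptions, since the patent only specifies a loss measuring the difference between each level's prediction and a matching-scale clean image.

```python
import torch
import torch.nn.functional as F

def train_level(model, loader, level, opt, epochs=10):
    """Train one pyramid level against a matching-scale clean target."""
    factor = {"level5": 4, "level4": 2, "level3": 1}[level]
    for _ in range(epochs):
        for moire, clean in loader:   # (B,3,H,W) float tensors in 0..255
            target = F.avg_pool2d(clean, factor) if factor > 1 else clean
            pred = model(moire)[level]
            # Compare in the network's (-1,1) output space; L1 is assumed.
            loss = F.l1_loss(pred, target / 127.5 - 1.0)
            opt.zero_grad()
            loss.backward()
            opt.step()

model = LightDemoireNet()
opt = torch.optim.Adam(model.parameters(), lr=1e-4)
# Stand-in batch so the sketch runs; a real pipeline would load the
# synthesized (moire, clean) sample pairs described above.
loader = [(torch.rand(1, 3, 512, 512) * 255, torch.rand(1, 3, 512, 512) * 255)]
# Level5 first, then Level4 (inheriting the learned weights), then Level3.
for level in ("level5", "level4", "level3"):
    train_level(model, loader, level, opt)
```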
  • in the embodiment of the present application, model compression is used to transform the network structure of the original moiré removal model without loss of accuracy, reducing the computing power the model requires: with a 512*512 input, the PyNET network requires 1695 GFLOPS, while the modified lightweight network requires only 51.6 GFLOPS.
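  • The scale of that reduction can be sanity-checked with the standard cost formula for a convolution layer, H x W x K x K x C_in x C_out multiply-accumulates: cutting both the input and output channel counts by 4x (for example 512 -> 128) shrinks a layer roughly 16x, and removing the Level1/Level2 layers accounts for much of the remaining gap to the reported ~33x overall drop. The numbers below are illustrative, not a reproduction of the reported totals.

```python
def conv_macs(h, w, k, c_in, c_out):
    """Multiply-accumulate count of a single KxK convolution layer."""
    return h * w * k * k * c_in * c_out

# Channel reduction alone, at a fixed 128x128 feature map with 3x3 kernels:
before = conv_macs(128, 128, 3, 512, 512)   # original 512-channel layer
after = conv_macs(128, 128, 3, 128, 128)    # reduced to 128 channels
print(before / after)                       # 16.0: a ~16x per-layer saving
```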
  • in this way, the moiré-removal network can be deployed on the electronic device side: the moiré-removal function is triggered automatically when the user photographs an electronic screen with the camera, producing a moiré-free high-definition image that faithfully restores the captured scene. Compressing and quantizing the original PyNET network greatly reduces the computing power required by the model without losing accuracy.
  • the model training method provided in the embodiment of the present application may be executed by a model training device.
  • the model training method executed by the model training device is taken as an example to describe the model training device provided in the embodiment of the present application.
  • FIG. 7 is a flow chart of a method for removing moiré in an image provided by an embodiment of the present application, in which moiré removal is performed based on the target model trained in any of the above embodiments. As shown in FIG. 7, the method may include the following steps: step 701, step 702, step 703 and step 704, wherein,
  • step 701: a second moiré image to be processed is received.
  • the user opens the camera application, and the camera preview interface opens.
  • the system obtains the YUV image data previewed by the camera and passes it to the subject detection module.
  • the subject detection module judges whether the YUV image contains moiré.
  • specifically, an existing image classification algorithm is used to process the input image, and the output indicates whether the image contains moiré. If the input image contains no moiré, the flow jumps directly to the preview interface; if it does contain moiré, the moiré-removal algorithm is invoked.
  • in this way, the image classification algorithm detects moiré on the camera preview image, so moiré is removed automatically without any manual adjustment by the user and without an abrupt feeling; a dispatch sketch follows.
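  • A dispatch sketch of this preview flow; `to_rgb`, `has_moire`, and `demoire` are hypothetical callables standing in for the YUV conversion, the image classification algorithm, and the trained target model, none of which are specified in the source.

```python
def on_preview_frame(yuv_frame, to_rgb, has_moire, demoire):
    """Run de-moireing only when the classifier flags moire in the preview.

    to_rgb, has_moire, and demoire are hypothetical callables: YUV->RGB
    conversion, the image classification algorithm, and the target model.
    """
    rgb = to_rgb(yuv_frame)
    if has_moire(rgb):        # classifier output: moire present or not
        return demoire(rgb)   # invoke the moire-removal algorithm
    return rgb                # otherwise jump straight to the preview
```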
  • step 702: when the size of the second moiré image exceeds the maximum size recognizable by the target model, the second moiré image is divided into N moiré sub-images, wherein each sub-image in the N moiré sub-images overlaps its adjacent sub-images, and N is an integer greater than 1.
  • step 703: the N moiré sub-images are respectively input into the target model for processing to obtain N moiré-free sub-images.
  • the image needs to be divided into multiple patches, which are sequentially sent to the network for prediction.
  • step 704: the N moiré-free sub-images are stitched together, and a pixel-weighted average operation is performed on the overlapping regions during stitching, to obtain a second moiré-free image corresponding to the second moiré image.
  • in general, the input image cannot be divided evenly; for example, sliding a 1120*1120 window with a step size of 940 yields 9 patches of size 1120*1120 with overlapping areas between adjacent patches, and the weighted pixel averaging over the overlapping areas eliminates the stitching lines.
  • in the embodiment of the present application, when the target model is used for moiré removal on a large image, the image to be processed can be divided into multiple overlapping parts; each part is input into the model separately to obtain a corresponding moiré-free high-definition part, the parts are then stitched together, and a pixel-level weighted average is computed over the overlapping areas of adjacent parts, yielding a complete high-definition image without stitching lines and with better moiré removal. A sketch of this tiling scheme follows.
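  • A minimal numpy sketch of this tiling scheme, using the 1120-pixel window and 940-pixel stride from the example (for instance, a 3000*3000 input yields exactly nine patches at offsets 0, 940, and 1880 in each dimension); uniform blend weights and the edge handling are assumptions, since the patent only requires a pixel-weighted average over the overlapping regions.

```python
import numpy as np

def demoire_tiled(img, model_fn, win=1120, stride=940):
    """Split into overlapping patches, predict each, blend by averaging.

    model_fn is the de-moireing model applied to one (win, win, 3) patch;
    uniform blend weights over the overlaps are an assumption.
    """
    h, w, _ = img.shape
    out = np.zeros((h, w, 3), dtype=np.float64)
    weight = np.zeros((h, w, 1), dtype=np.float64)
    ys = list(range(0, h - win + 1, stride))
    xs = list(range(0, w - win + 1, stride))
    if ys[-1] != h - win:        # cover the bottom/right margins when the
        ys.append(h - win)       # image size is not an exact multiple
    if xs[-1] != w - win:
        xs.append(w - win)
    for y in ys:
        for x in xs:
            out[y:y + win, x:x + win] += model_fn(img[y:y + win, x:x + win])
            weight[y:y + win, x:x + win] += 1.0
    # Averaging the overlapping pixels suppresses visible stitching lines.
    return (out / weight).astype(img.dtype)
```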
  • the method for removing moiré in an image provided in the embodiment of the present application may be executed by an apparatus for removing moiré in an image.
  • the method for removing moiré in an image performed by the device for removing moiré is taken as an example to describe the device for removing moiré in an image provided in the embodiment of the present application.
  • the embodiment of the present application trains on synthesized data, so that the colors of the original image are faithfully restored when the model predicts, and uses model compression to greatly reduce the computing power required by the model without loss of accuracy. With the help of an image classification algorithm, moiré detection is performed on the camera preview image, so moiré can be removed automatically without any manual adjustment by the user and without an abrupt feeling.
  • the image after moiré removal can restore the real state seen by the human eye when shooting, so that the photo conforms to the perception of the human eye.
  • FIG. 8 is a structural block diagram of a model training device provided by an embodiment of the present application.
  • the model training device 800 may include: an acquisition module 801, a construction module 802 and a training module 803,
  • an acquisition module 801, configured to acquire a plurality of moiré sample images and corresponding moiré-free sample images;
  • a construction module 802 configured to construct a model to be trained, wherein the model to be trained is a model constructed based on a lightweight network, and the lightweight network includes a plurality of feature extraction layers of different scales;
  • the training module 803 is configured to input the plurality of moiré sample images into the model to be trained respectively, and obtain a first loss according to the predicted image output by the smallest-scale feature extraction layer in the model to be trained and the downsampled moiré-free sample image of the same scale; update the parameters of the smallest-scale feature extraction layer according to the first loss until a preset training condition is met; and after the training of the smallest-scale feature extraction layer is completed, apply the same training process to the feature extraction layer of the next larger scale in the model to be trained, until training is completed on the largest-scale feature extraction layer, to obtain the target model.
  • the existing deep learning network can be compressed and quantized to obtain a lightweight network, and model training can be performed on the lightweight network, so that the computing power required by the model is reduced without loss of accuracy. The moiré-removal network can thus be deployed on the electronic device side, the moiré-removal function can be triggered automatically when users capture images with electronic devices, and a moiré-free high-definition image that faithfully restores the captured scene is obtained quickly, improving the efficiency of moiré removal.
  • the model training device 800 may also include:
  • the generation module is used to obtain the PyNET network, delete the feature extraction layers of specific scales in the PyNET network, reduce the number of convolution kernel channels of the retained feature extraction layers to preset values, and modify the activation function and normalization function in the retained feature extraction layers, to obtain the lightweight network, wherein a feature extraction layer of a specific scale is used to extract features of that scale from the input image.
  • the generating module may include:
  • the first acquisition submodule is used to acquire the PyNET network, wherein the PyNET network includes: an input layer, a first feature extraction layer, a second feature extraction layer, a third feature extraction layer, a fourth feature extraction layer and a fifth feature extraction layer, the first to fifth feature extraction layers being used to extract features of five different scales; the scale of the features extracted by the i-th feature extraction layer is greater than that extracted by the (i+1)-th feature extraction layer, 1 ≤ i < 5;
  • the first modification submodule is used to delete the first feature extraction layer and the second feature extraction layer in the PyNET network, retain the third feature extraction layer, the fourth feature extraction layer and the fifth feature extraction layer, adjust the number of convolution kernel channels of the third feature extraction layer from a first value to a second value, adjust the number of convolution kernel channels of the fourth feature extraction layer from a third value to a fourth value, and adjust the number of convolution kernel channels of the fifth feature extraction layer from a fifth value to a sixth value, wherein the first value is greater than the second value, the third value is greater than the fourth value, and the fifth value is greater than the sixth value;
  • the second modification submodule is used to delete the first normalization function in the third feature extraction layer, the fourth feature extraction layer and the fifth feature extraction layer, add a second normalization function in the input layer, and change the activation function in the third feature extraction layer, the fourth feature extraction layer and the fifth feature extraction layer to a hyperbolic tangent function, to obtain the lightweight network, wherein the second normalization function is used to normalize the pixel values of the input image from the range (0,255) to the range (-1,1).
  • the training module 803 may include:
  • the first training sub-module is used to respectively input the plurality of moiré sample images into the model to be trained, obtain the first loss according to the predicted image output by the fifth feature extraction layer in the model to be trained and the 4x-downsampled moiré-free sample image, and update the parameters of the fifth feature extraction layer according to the first loss until convergence, to obtain the first intermediate model, wherein the first loss indicates the difference between the predicted image output by the fifth feature extraction layer and the 4x-downsampled moiré-free sample image;
  • the second training sub-module is used to respectively input the plurality of moiré sample images into the first intermediate model, obtain the second loss according to the predicted image output by the fourth feature extraction layer in the first intermediate model and the 2x-downsampled moiré-free sample image, and update the parameters of the first intermediate model according to the second loss until convergence, to obtain the second intermediate model, wherein the second loss indicates the difference between the predicted image output by the fourth feature extraction layer and the 2x-downsampled moiré-free sample image;
  • the third training sub-module is configured to respectively input the plurality of moiré sample images into the second intermediate model, obtain a third loss according to the predicted image output by the third feature extraction layer in the second intermediate model and the corresponding moiré-free sample image, and update the model parameters of the second intermediate model according to the third loss until convergence, to obtain the target model, wherein the third loss indicates the difference between the predicted image output by the third feature extraction layer and the corresponding moiré-free sample image.
  • the obtaining module 801 may include:
  • the second obtaining submodule is used to obtain screenshots from the display device
  • the first generation sub-module is used to capture the white image displayed on the display device when the camera is in focus to obtain a first moiré image, and generate a moiré sample image according to the screenshot, the white image and the first moiré image;
  • the second generation sub-module is used to capture the white image displayed on the display device when the camera is out of focus to obtain a first moiré-free image, and generate, according to the screenshot, the white image and the first moiré-free image, a moiré-free sample image corresponding to the moiré sample image.
  • the first generating submodule may include:
  • a first acquisition unit, configured to acquire the RGB value I_bg of each pixel in the screenshot, the RGB value I_0 of each pixel in the white image, and the RGB value I_moire1 of each pixel in the first moiré image;
  • a first calculation unit, used to calculate the moiré noise I_moire-feature according to I_0 and I_moire1;
  • a first generation unit, configured to calculate the RGB value I_moire2 of each pixel in the moiré sample image according to I_moire-feature and I_bg, and generate the moiré sample image according to I_moire2;
  • the second generating submodule may include:
  • a second acquisition unit, used to acquire the RGB value I_clean1 of each pixel in the first moiré-free image;
  • a second calculation unit, used to calculate the moiré-free noise I_clean-feature according to I_clean1 and I_0;
  • a second generation unit, configured to calculate the RGB value I_clean2 of each pixel in the moiré-free sample image corresponding to the moiré sample image according to I_clean-feature and I_bg, and generate the moiré-free sample image according to I_clean2.
  • FIG. 9 is a structural block diagram of an image moiré removal device provided by an embodiment of the present application.
  • the image moiré removal device 900 may include: a receiving module 901, a segmentation module 902, a first processing module 903 and a second processing module 904, wherein,
  • the receiving module 901 is configured to receive the second moiré image to be processed;
  • the segmentation module 902 is configured to segment the second moiré image into N moiré sub-images when the size of the second moiré image exceeds the maximum size recognizable by the target model, wherein each sub-image in the N moiré sub-images overlaps its adjacent sub-images, and N is an integer greater than 1;
  • the first processing module 903 is configured to respectively input the N moiré sub-images into the target model for processing to obtain N moiré-free sub-images;
  • the second processing module 904 is configured to perform splicing processing on the N moiré-free sub-images, and perform a pixel weighted average operation on the overlapping regions in the splicing process, to obtain a second moiré-free image corresponding to the second moiré image.
  • in the embodiment of the present application, when the target model is used for moiré removal on a large image, the image to be processed can be divided into multiple overlapping parts; each part is input into the model to obtain a corresponding moiré-free high-definition part, the parts are stitched together, and a pixel-level weighted average is computed over the overlapping areas of adjacent parts, yielding a complete high-definition image without stitching lines and with better moiré removal.
  • the model training device and the image removing moiré device in the embodiments of the present application may be electronic equipment, or components in electronic equipment, such as integrated circuits or chips.
  • the electronic device may be a terminal, or other devices other than the terminal.
  • the electronic device can be a mobile phone, a tablet computer, a notebook computer, a handheld computer, a vehicle-mounted electronic device, a mobile Internet device (MID), an augmented reality (AR)/virtual reality (VR) device, a robot, a wearable device, an ultra-mobile personal computer (UMPC), a netbook, or a personal digital assistant (PDA), etc.; it can also be a server, a network attached storage (NAS), a personal computer (PC), a television (TV), a teller machine, or a self-service machine, etc., which is not specifically limited in the embodiments of the present application.
  • the model training device and the image moiré removal device in the embodiments of the present application may be devices with an operating system.
  • the operating system may be an Android operating system, an iOS operating system, or other possible operating systems, which are not specifically limited in this embodiment of the present application.
  • the model training device and the image moiré removal device provided in the embodiments of the present application can implement the processes of the above model training method and image moiré removal method embodiments; to avoid repetition, details are not repeated here.
  • the embodiment of the present application also provides an electronic device 1000, including a processor 1001 and a memory 1002, wherein the memory 1002 stores programs or instructions that can run on the processor 1001; when the programs or instructions are executed by the processor 1001, the steps of the above model training method or image moiré removal method embodiments are implemented, with the same technical effects; to avoid repetition, details are not repeated here.
  • the electronic devices in the embodiments of the present application include the above-mentioned mobile electronic devices and non-mobile electronic devices.
  • FIG. 11 is a schematic diagram of a hardware structure of an electronic device implementing various embodiments of the present application.
  • the electronic device 1100 includes, but is not limited to: a radio frequency unit 1101, a network module 1102, an audio output unit 1103, an input unit 1104, a sensor 1105, a display unit 1106, a user input unit 1107, an interface unit 1108, a memory 1109, and a processor 1110, etc. part.
  • the electronic device 1100 can also include a power supply (such as a battery) for supplying power to the various components, and the power supply can be logically connected to the processor 1110 through a power management system, so that functions such as charging management, discharging management, and power consumption management are implemented through the power management system.
  • the structure of the electronic device shown in FIG. 11 does not constitute a limitation to the electronic device.
  • the electronic device may include more or fewer components than shown in the figure, combine certain components, or use a different arrangement of components; details are not repeated here.
  • when the electronic device executes the model training method of the embodiment shown in FIG. 1, the processor 1110 is configured to: acquire a plurality of moiré sample images and corresponding moiré-free sample images; construct a model to be trained, wherein the model to be trained is based on a lightweight network that includes multiple feature extraction layers of different scales; respectively input the moiré sample images into the model to be trained, and obtain a first loss according to the predicted image output by the smallest-scale feature extraction layer in the model to be trained and the downsampled moiré-free sample image of the same scale; update the parameters of the smallest-scale feature extraction layer according to the first loss until the preset training condition is met; and after completing the training of the smallest-scale feature extraction layer, apply the same training process to the feature extraction layer of the next larger scale in the model to be trained, until training is completed on the largest-scale feature extraction layer, to obtain the target model.
  • in the embodiment of the present application, the existing deep learning network can be compressed and quantized to obtain a lightweight network, and model training can be performed on the lightweight network, so that the computing power required by the model is reduced without loss of accuracy. The moiré-removal network can thus be deployed on the electronic device side, the moiré-removal function is triggered automatically when users capture images with electronic equipment, and a moiré-free high-definition image that faithfully restores the captured scene is obtained quickly, improving the efficiency of moiré removal.
  • the processor 1110 is also used to obtain the PyNET network, delete the feature extraction layers of specific scales in the PyNET network, reduce the number of convolution kernel channels of the retained feature extraction layers to preset values, and modify the activation function and normalization function in the retained feature extraction layers, to obtain the lightweight network, wherein a feature extraction layer of a specific scale is used to extract features of that scale from the input image.
  • the processor 1110 is also configured to obtain the PyNET network, wherein the PyNET network includes: an input layer, a first feature extraction layer, a second feature extraction layer, a third feature extraction layer, a fourth feature extraction layer and a fifth feature extraction layer, the first to fifth feature extraction layers being respectively used to extract features of five different scales from the input image, where the scale of the features extracted by the i-th feature extraction layer is larger than that extracted by the (i+1)-th feature extraction layer, 1 ≤ i < 5; delete the first feature extraction layer and the second feature extraction layer in the PyNET network, retaining the third feature extraction layer, the fourth feature extraction layer and the fifth feature extraction layer; and adjust the number of convolution kernel channels of the third feature extraction layer from the first value to the second value, of the fourth feature extraction layer from the third value to the fourth value, and of the fifth feature extraction layer from the fifth value to the sixth value, wherein the first value is greater than the second value, the third value is greater than the fourth value, and the fifth value is greater than the sixth value.
  • the processor 1110 is also configured to respectively input the plurality of moiré sample images into the model to be trained, obtain the first loss according to the predicted image output by the fifth feature extraction layer in the model to be trained and the 4x-downsampled moiré-free sample image, and update the parameters of the fifth feature extraction layer according to the first loss until convergence, to obtain the first intermediate model; respectively input the moiré sample images into the first intermediate model, obtain the second loss according to the predicted image output by the fourth feature extraction layer in the first intermediate model and the 2x-downsampled moiré-free sample image, and update the parameters of the first intermediate model according to the second loss until convergence, to obtain the second intermediate model; and respectively input the moiré sample images into the second intermediate model, obtain the third loss according to the predicted image output by the third feature extraction layer in the second intermediate model and the corresponding moiré-free sample image, and update the model parameters of the second intermediate model according to the third loss until convergence, to obtain the target model.
  • the processor 1110 is also configured to obtain a screenshot from the display device; when the camera is in focus, capture the white image displayed on the display device to obtain the first moiré image, and generate a moiré sample image according to the screenshot, the white image and the first moiré image; and when the camera is out of focus, capture the white image displayed on the display device to obtain the first moiré-free image, and generate a moiré-free sample image according to the screenshot, the white image and the first moiré-free image.
  • the processor 1110 is further configured to obtain the RGB value I_bg of each pixel in the screenshot, the RGB value I_0 of each pixel in the white image, and the RGB value I_moire1 of each pixel in the first moiré image; calculate the moiré noise I_moire-feature according to I_0 and I_moire1; and calculate the RGB value I_moire2 of each pixel in the moiré sample image according to I_moire-feature and I_bg, generating the moiré sample image according to I_moire2.
  • when the electronic device executes the image moiré removal method of the embodiment shown in FIG. 7, the processor 1110 is configured to: receive a second moiré image to be processed; when the size of the second moiré image exceeds the maximum size recognizable by the target model, divide the second moiré image into N moiré sub-images, wherein each sub-image overlaps its adjacent sub-images and N is an integer greater than 1; respectively input the N moiré sub-images into the target model for processing to obtain N moiré-free sub-images; and stitch the N moiré-free sub-images, performing a pixel-weighted average operation on the overlapping regions during stitching, to obtain a second moiré-free image corresponding to the second moiré image.
  • in the embodiment of the present application, when the target model is used for moiré removal on a large image to be processed, the image can be divided into multiple overlapping parts; each part is input into the model separately to obtain a corresponding moiré-free high-definition part, the parts are then stitched together, and a pixel-level weighted average is computed over the overlapping areas of adjacent parts, yielding a complete high-definition image without stitching lines and with better moiré removal.
  • the input unit 1104 may include a graphics processing unit (GPU) 11041 and a microphone 11042; the graphics processor 11041 processes image data of still images or video obtained by an image capture device (such as a camera) in video capture mode or image capture mode.
  • the display unit 1106 may include a display panel 11061, which may be configured in the form of a liquid crystal display, an organic light-emitting diode display, or the like.
  • the user input unit 1107 includes at least one of a touch panel 11071 and other input devices 11072.
  • the touch panel 11071 is also called a touch screen.
  • the touch panel 11071 may include two parts: a touch detection device and a touch controller.
  • other input devices 11072 may include, but are not limited to, a physical keyboard, function keys (such as volume control keys and switch keys), a trackball, a mouse, and a joystick, which are not described further here.
  • the memory 1109 can be used to store software programs as well as various data.
  • the memory 1109 may mainly include a first storage area for storing programs or instructions and a second storage area for storing data, where the first storage area may store an operating system and the applications or instructions required by at least one function (such as a sound playback function or an image playback function).
  • the memory 1109 may include volatile memory or non-volatile memory, or both.
  • the non-volatile memory can be read-only memory (ROM), programmable read-only memory (PROM), erasable programmable read-only memory (EPROM), electrically erasable programmable read-only memory (EEPROM), or flash memory.
  • the volatile memory can be random access memory (RAM), static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), synch-link DRAM (SLDRAM), or direct Rambus RAM (DRRAM).
  • the processor 1110 may include one or more processing units; optionally, the processor 1110 integrates an application processor and a modem processor, where the application processor mainly handles operations involving the operating system, user interface, and applications, and the modem processor mainly handles wireless communication signals, such as a baseband processor. It can be understood that the modem processor may alternatively not be integrated into the processor 1110.
  • the embodiment of the present application also provides a readable storage medium on which a program or instructions are stored; when the program or instructions are executed by a processor, each process of the above model training method or image de-moiré method embodiments is implemented, with the same technical effects, which are not repeated here to avoid repetition.
  • the processor is the processor in the electronic device described in the above embodiments.
  • the readable storage medium includes a computer-readable storage medium, such as a computer read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.
  • the embodiment of the present application also provides a chip that includes a processor and a communication interface coupled to the processor; the processor is used to run a program or instructions to implement each process of the above model training method or image de-moiré method embodiments, with the same technical effects, which are not repeated here.
  • the chip mentioned in the embodiments of the present application may also be called a system-level chip, a system chip, a chip system, or a system-on-chip.
  • the embodiment of the present application also provides a computer program product stored in a storage medium; the program product is executed by at least one processor to implement each process of the above model training method or image de-moiré method embodiments, with the same technical effects, which are not repeated here to avoid repetition.
  • the terms "comprise", "include", and any variants thereof are intended to cover a non-exclusive inclusion, so that a process, method, article, or apparatus comprising a set of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a process, method, article, or apparatus. Without further limitation, an element preceded by "comprising a ..." does not preclude the presence of additional identical elements in the process, method, article, or apparatus that comprises the element.
  • the scope of the methods and apparatuses in the embodiments of the present application is not limited to performing functions in the order shown or discussed; depending on the functions involved, functions may also be performed in a substantially simultaneous manner or in the reverse order. For example, the described methods may be performed in an order different from that described, and various steps may also be added, omitted, or combined. In addition, features described with reference to certain examples may be combined in other examples.
  • the methods of the above embodiments can be implemented by means of software plus a necessary general-purpose hardware platform, or of course by hardware, but in many cases the former is the better implementation.
  • the technical solution of the present application can be embodied in the form of a computer software product stored in a storage medium (such as a ROM/RAM, magnetic disk, or optical disc), including several instructions that cause a terminal (which may be a mobile phone, computer, server, network device, or the like) to execute the methods described in the embodiments of the present application.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Image Analysis (AREA)

Abstract

Disclosed in this application, which belongs to the field of artificial intelligence, are a model training method, an image de-moiré method, an apparatus, and an electronic device. The model training method includes: acquiring a plurality of moiré sample images and corresponding moiré-free sample images; constructing a model to be trained, the model being built on a lightweight network; inputting the plurality of moiré sample images into the model to be trained, and obtaining a first loss according to the predicted image output by the smallest-scale feature extraction layer of the model and the moiré-free sample image downsampled to the same scale; updating the parameters of the smallest-scale feature extraction layer according to the first loss until a preset training condition is met; and, after completing the training of the smallest-scale feature extraction layer, applying the same training process to the feature extraction layer of the next larger scale in the model, until training is completed on the largest-scale feature extraction layer, to obtain a target model for removing moiré patterns from images.

Description

Model training method, image de-moiré method, apparatus, and electronic device
CROSS-REFERENCE TO RELATED APPLICATIONS
This application claims priority to Chinese Patent Application No. 202210118889.8, entitled "Model training method, image de-moiré method, apparatus, and electronic device", filed with the China National Intellectual Property Administration on February 8, 2022, the entire contents of which are incorporated herein by reference.
TECHNICAL FIELD
This application belongs to the field of artificial intelligence, and specifically relates to a model training method, an image de-moiré method, an apparatus, and an electronic device.
BACKGROUND
Moiré patterns are high-frequency interference fringes that arise on the photosensitive element of devices such as digital cameras and scanners; they appear in images as colored, high-frequency, irregular stripes. Current de-moiré methods fall into two main categories. The first uses traditional techniques that exploit the spatial and frequency characteristics of moiré patterns to process the moiré image in the YUV channels; because moiré patterns span a wide frequency range and vary greatly in density and color, traditional methods are not robust in removing them. The second uses deep learning: through training, a network learns the mapping from moiré images to moiré-free images, and the trained network model is then used to remove moiré from images.
Compared with traditional methods, existing deep learning methods are robust in de-moiré effect, but because the trained network models cannot be deployed on electronic devices, de-moiré processing takes a long time when a user captures images with an electronic device, resulting in low de-moiré efficiency.
SUMMARY
The purpose of the embodiments of this application is to provide a model training method, an image de-moiré method, an apparatus, and an electronic device that can solve the problem of low de-moiré efficiency in the prior art.
In a first aspect, an embodiment of this application provides a model training method, the method including:
acquiring a plurality of moiré sample images and corresponding moiré-free sample images;
constructing a model to be trained, where the model to be trained is built on a lightweight network that includes feature extraction layers of multiple different scales;
inputting the plurality of moiré sample images into the model to be trained, and obtaining a first loss according to the predicted image output by the smallest-scale feature extraction layer of the model to be trained and the moiré-free sample image downsampled to the same scale; updating the parameters of the smallest-scale feature extraction layer according to the first loss until a preset training condition is met; and, after completing the training of the smallest-scale feature extraction layer, applying the same training process to the feature extraction layer of the next larger scale in the model to be trained, until training is completed on the largest-scale feature extraction layer, to obtain a target model.
In a second aspect, an embodiment of this application provides an image de-moiré method for performing de-moiré processing based on the target model of the first aspect, the method including:
receiving a second moiré image to be processed;
when the size of the second moiré image exceeds the maximum size recognizable by the target model, splitting the second moiré image into N moiré sub-images, where each of the N moiré sub-images overlaps its adjacent sub-images and N is an integer greater than 1;
inputting the N moiré sub-images into the target model for processing, to obtain N moiré-free sub-images;
stitching the N moiré-free sub-images, and performing a pixel-weighted-average operation on the overlapping regions during stitching, to obtain a second moiré-free image corresponding to the second moiré image.
In a third aspect, an embodiment of this application provides a model training apparatus, the apparatus including:
an acquisition module configured to acquire a plurality of moiré sample images and corresponding moiré-free sample images;
a construction module configured to construct a model to be trained, where the model to be trained is built on a lightweight network that includes feature extraction layers of multiple different scales;
a training module configured to input the plurality of moiré sample images into the model to be trained, obtain a first loss according to the predicted image output by the smallest-scale feature extraction layer of the model to be trained and the moiré-free sample image downsampled to the same scale, update the parameters of the smallest-scale feature extraction layer according to the first loss until a preset training condition is met, and, after completing the training of the smallest-scale feature extraction layer, apply the same training process to the feature extraction layer of the next larger scale in the model to be trained, until training is completed on the largest-scale feature extraction layer, to obtain a target model.
In a fourth aspect, an embodiment of this application provides an image de-moiré apparatus for performing de-moiré processing based on the target model of the third aspect, the apparatus including:
a receiving module configured to receive a second moiré image to be processed;
a splitting module configured to split the second moiré image into N moiré sub-images when the size of the second moiré image exceeds the maximum size recognizable by the target model, where each of the N moiré sub-images overlaps its adjacent sub-images and N is an integer greater than 1;
a first processing module configured to input the N moiré sub-images into the target model for processing, to obtain N moiré-free sub-images;
a second processing module configured to stitch the N moiré-free sub-images and perform a pixel-weighted-average operation on the overlapping regions during stitching, to obtain a second moiré-free image corresponding to the second moiré image.
In a fifth aspect, an embodiment of this application provides an electronic device including a processor and a memory, where the memory stores a program or instructions executable on the processor, and the program or instructions, when executed by the processor, implement the steps of the method according to the first or second aspect.
In a sixth aspect, an embodiment of this application provides a readable storage medium storing a program or instructions which, when executed by a processor, implement the steps of the method according to the first or second aspect.
In a seventh aspect, an embodiment of this application provides a chip including a processor and a communication interface coupled to the processor, where the processor is configured to run a program or instructions to implement the method according to the first or second aspect.
In an eighth aspect, an embodiment of this application provides a computer program product stored in a storage medium, where the program product is executed by at least one processor to implement the method according to the first or second aspect.
In the embodiments of this application, a plurality of moiré sample images and corresponding moiré-free sample images can be acquired as training data; a lightweight de-moiré network is constructed as the model to be trained; and the training data are used to train the model, yielding a target model for removing moiré from input images. Compared with the prior art, the embodiments of this application compress and quantize an existing deep learning network into a lightweight network and train the model on it, reducing the model's computational cost without loss of accuracy. This allows the de-moiré network to be deployed on electronic devices, so that the de-moiré function is triggered automatically when a user captures images with an electronic device, quickly producing a moiré-free high-definition image that faithfully reproduces the captured scene and improving de-moiré efficiency.
In the embodiments of this application, when the target model is used for de-moiré processing, a large image to be processed can be split into multiple overlapping parts; each part is input into the model separately to obtain a corresponding moiré-free high-definition image, the per-part images are then stitched, and a pixel-level weighted-average operation is performed on the overlapping regions of each pair of images, yielding a final seamless complete high-definition image with good de-moiré quality.
BRIEF DESCRIPTION OF THE DRAWINGS
FIG. 1 is a flowchart of a model training method provided by an embodiment of this application;
FIG. 2 is a flowchart of a training data generation method provided by an embodiment of this application;
FIG. 3 is a flowchart of a lightweight network generation process provided by an embodiment of this application;
FIG. 4 is an illustration of the PyNET network provided by an embodiment of this application;
FIG. 5 is an illustration of the lightweight network provided by an embodiment of this application;
FIG. 6 is a flowchart of a model training process based on the lightweight network provided by an embodiment of this application;
FIG. 7 is a flowchart of an image de-moiré method provided by an embodiment of this application;
FIG. 8 is a structural block diagram of a model training apparatus provided by an embodiment of this application;
FIG. 9 is a structural block diagram of an image de-moiré apparatus provided by an embodiment of this application;
FIG. 10 is a schematic structural diagram of an electronic device provided by an embodiment of this application;
FIG. 11 is a schematic diagram of the hardware structure of an electronic device implementing the embodiments of this application.
DETAILED DESCRIPTION
The technical solutions in the embodiments of this application are described clearly below with reference to the accompanying drawings. Obviously, the described embodiments are only some, not all, of the embodiments of this application. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of this application fall within the protection scope of this application.
The terms "first", "second", and the like in the specification and claims of this application are used to distinguish similar objects, not to describe a specific order or sequence. It should be understood that data used in this way are interchangeable where appropriate, so that the embodiments of this application can be implemented in orders other than those illustrated or described here. Objects distinguished by "first", "second", and the like are usually of one class, and the number of such objects is not limited; for example, there may be one or more first objects. In addition, "and/or" in the specification and claims denotes at least one of the connected objects, and the character "/" generally indicates an "or" relationship between the associated objects.
Embodiments of this application provide a model training method, an image de-moiré method, an apparatus, and an electronic device.
For ease of understanding, some concepts involved in the embodiments of this application are first introduced below.
Model compression: simplifying an already trained deep model to obtain a lightweight network of comparable accuracy. The compressed network has a smaller structure and fewer parameters, effectively reducing computation and storage overhead and facilitating deployment in constrained hardware environments.
AF (Automatic Focus): generally adjusts a photo to its sharpest state automatically. The camera adjusts itself using an electronic rangefinder, locking on to the target distance and motion; the rangefinder moves the lens back and forth to the appropriate position. When shooting, the camera must be aimed at the subject, otherwise the image may be blurred because of poor focus.
Lens distortion: a general term for the perspective distortion inherent in optical lenses, including pincushion distortion, barrel distortion, linear distortion, and so on.
FLOPS (Floating-point Operations Per Second): commonly used to estimate the computational cost of a deep learning model; the larger the value, the more computation the model requires.
The methods provided by the embodiments of this application are described in detail below through specific embodiments and their application scenarios, with reference to the accompanying drawings.
FIG. 1 is a flowchart of a model training method provided by an embodiment of this application. As shown in FIG. 1, the method may include the following steps: step 101, step 102, and step 103.
In step 101, a plurality of moiré sample images and corresponding moiré-free sample images are acquired.
In the embodiments of this application, the plurality of moiré sample images and the corresponding moiré-free sample images serve as training data.
In step 102, a model to be trained is constructed, where the model to be trained is built on a lightweight network that includes feature extraction layers of multiple different scales.
In the embodiments of this application, feature extraction layers of different scales extract features of the input image at different scales. An existing deep learning network can be compressed and quantized to obtain the lightweight network.
In some embodiments, generating the lightweight network may include: obtaining a PyNET network, deleting the feature extraction layers of specific scales from the PyNET network, reducing the number of convolution kernel channels of the retained feature extraction layers to preset values, and modifying the activation functions and normalization functions in the retained feature extraction layers, to obtain the lightweight network, where a feature extraction layer of a specific scale extracts features of the input image at that scale.
In step 103, the plurality of moiré sample images are input into the model to be trained; a first loss is obtained according to the predicted image output by the smallest-scale feature extraction layer of the model and the moiré-free sample image downsampled to the same scale; the parameters of the smallest-scale feature extraction layer are updated according to the first loss until a preset training condition is met; after the training of the smallest-scale feature extraction layer is completed, the same training process is applied to the feature extraction layer of the next larger scale in the model, until training is completed on the largest-scale feature extraction layer, yielding the target model.
In the embodiments of this application, training proceeds in order starting from the smallest-scale feature extraction layer: after the smallest-scale layer is pre-trained, the same process is applied to the adjacent next-larger-scale layer, until training is completed on the largest-scale layer.
As can be seen from the above, in this embodiment an existing deep learning network can be compressed and quantized into a lightweight network on which the model is trained, reducing the model's computational cost without loss of accuracy. This allows the de-moiré network to be deployed on electronic devices, so that the de-moiré function is triggered automatically when a user captures images with an electronic device, quickly producing a moiré-free high-definition image that faithfully reproduces the captured scene and improving de-moiré efficiency.
In the prior art, screenshots are used as moiré-free sample images, and photographs of the screen taken with a mobile phone are used as moiré sample images. However, when trained with screenshots as target images, the network model cannot learn the lighting information of the original scene, so the trained network model removes moiré poorly. To solve this problem, FIG. 2 shows a flowchart of a training data generation method provided by an embodiment of this application, including the following steps: step 201, step 202, and step 203.
In step 201, a screenshot from a display device is obtained.
In the embodiments of this application, the screenshot is an image obtained by taking a screenshot of what is displayed on the screen of the display device; it is a moiré-free high-definition image.
In step 202, with the camera in focus, the white image displayed on the display device is photographed to obtain a first moiré image, and a moiré sample image is generated according to the screenshot, the white image, and the first moiré image.
In the embodiments of this application, the white image is a pure-white image in which every pixel has the value 255, and the display device may be a computer. Since the appearance of moiré is mainly the joint result of the frequency of the display screen and the frequency of the camera, and is largely unrelated to the content shown on the screen, a pure-white image is first used as the subject for moiré capture, yielding the first moiré image.
Furthermore, since the moiré captured by the camera can be regarded as a complex additive noise that depends on the shooting angle and lens parameters but not on the background image shown on the display screen, the first moiré image and the screenshot can be modeled to synthesize the moiré sample image.
Accordingly, in some embodiments, step 202 may specifically include the following steps (not shown in the figures): step 2021, step 2022, and step 2023.
In step 2021, the RGB value I_bg of each pixel in the screenshot, the RGB value I_0 of each pixel in the white image, and the RGB value I_moire1 of each pixel in the first moiré image are obtained.
In step 2022, the moiré noise I_moire-feature is computed from I_0 and I_moire1: from the formula I_moire1 = I_moire-feature + I_0, we get I_moire-feature = I_moire1 - I_0.
In step 2023, the RGB value I_moire2 of each pixel in the moiré sample image is computed from I_moire-feature and I_bg, and the moiré sample image is generated from I_moire2: from the formula I_moire2 = I_moire-feature + I_bg, we get I_moire2 = I_moire1 - I_0 + I_bg.
In step 203, with the camera out of focus, the white image displayed on the display device is photographed to obtain a first moiré-free image, and the moiré-free sample image corresponding to the moiré sample image is generated according to the screenshot, the white image, and the first moiré-free image.
In the embodiments of this application, the camera is kept in the same position and its AF is adjusted so that it goes out of focus. Since there is no moiré in the defocused state, this yields a first moiré-free image without moiré but with essentially the same lighting and shadows as the first moiré image. The first moiré-free image and the screenshot are then modeled to synthesize the moiré-free sample image.
In some embodiments, step 203 may specifically include the following steps (not shown in the figures): step 2031, step 2032, and step 2033.
In step 2031, the RGB value I_clean1 of each pixel in the first moiré-free image is obtained.
In step 2032, the moiré-free noise I_clean-feature is computed from I_clean1 and I_0: from the formula I_clean1 = I_clean-feature + I_0, we get I_clean-feature = I_clean1 - I_0.
In step 2033, the RGB value I_clean2 of each pixel in the moiré-free sample image is computed from I_clean-feature and I_bg, and the moiré-free sample image is generated from I_clean2: from the formula I_clean2 = I_clean-feature + I_bg, we get I_clean2 = I_clean1 - I_0 + I_bg.
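To make the additive-noise modeling above concrete, the following is a minimal sketch of the sample-pair synthesis in Python with NumPy. It is an illustration rather than the patented procedure: the function name, the float conversion, and the clipping back to [0, 255] are assumptions; only the two formulas come from the disclosure.

```python
import numpy as np

def synthesize_pair(screenshot, white, moire_shot, clean_shot):
    """Synthesize one (moiré, moiré-free) training pair.

    screenshot : H x W x 3 screenshot from the display device (I_bg)
    white      : H x W x 3 pure-white image, every pixel 255 (I_0)
    moire_shot : H x W x 3 in-focus photo of the white image (I_moire1)
    clean_shot : H x W x 3 defocused photo of the white image (I_clean1)
    """
    bg = screenshot.astype(np.float32)
    i0 = white.astype(np.float32)

    # I_moire2 = I_moire1 - I_0 + I_bg: additive moiré noise applied to the screenshot
    moire_sample = moire_shot.astype(np.float32) - i0 + bg
    # I_clean2 = I_clean1 - I_0 + I_bg: same lighting and shadows, no moiré
    clean_sample = clean_shot.astype(np.float32) - i0 + bg

    clip = lambda x: np.clip(x, 0, 255).astype(np.uint8)  # assumed range handling
    return clip(moire_sample), clip(clean_sample)
```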
In the embodiments of this application, the moiré in the synthesized images is real and closely approximates real scenes, so the model does not suffer from performing well on the training set but poorly in real tests.
As can be seen from the above, in this embodiment a pure-white image can be photographed to capture moiré, and the captured white-background moiré image is modeled together with the screenshot to synthesize a moiré sample image. The camera is then kept in the same position and defocused, and the pure-white image is photographed again; since there is no moiré in the defocused state, this yields a white-background moiré-free image whose lighting and shadows are essentially the same as those of the white-background moiré image. That image is modeled together with the screenshot to synthesize a moiré-free sample image, so that both synthesized sample images preserve the lighting information of the original image. Finally, the synthesized moiré and moiré-free sample images serve as training data for subsequent model training. Compared with the prior art, because both synthesized sample images preserve the lighting information of the original image, the trained network model also preserves it: when the model is used for de-moiré processing, it faithfully restores the colors of the original image, the de-moiréd image restores what the human eye saw at capture time, the image matches human visual perception, and the de-moiré result looks more natural.
When the lightweight network is obtained by modifying the PyNET network, in yet another embodiment of this application, based on the embodiment shown in FIG. 1 and as shown in FIG. 3, generating the lightweight network may include the following steps: step 301, step 302, and step 303.
In step 301, a PyNET network is obtained, where the PyNET network includes an input layer and first through fifth feature extraction layers, which respectively extract features of the input image at 5 different scales; the scale of the features extracted by the i-th feature extraction layer is larger than that extracted by the (i+1)-th, 1≤i≤5.
In the embodiments of this application, an existing deep learning network can be modified to obtain the lightweight network, for example the PyNET network from the paper "Replacing Mobile Camera ISP with a Single Deep Learning Model", shown in FIG. 4. The five Level layers in FIG. 4 (Level 1, Level 2, Level 3, Level 4, and Level 5) correspond to the first through fifth feature extraction layers, respectively; Level 1 has the largest computational cost and Level 5 the smallest.
In step 302, the first and second feature extraction layers are deleted from the PyNET network, the third, fourth, and fifth feature extraction layers are retained, and the convolution kernel channel count of the third feature extraction layer is adjusted from a first value to a second value, that of the fourth from a third value to a fourth value, and that of the fifth from a fifth value to a sixth value; the first value is greater than the second, the third greater than the fourth, and the fifth greater than the sixth.
In the embodiments of this application, the Level 1 and Level 2 layers of the PyNET network are removed, and only the Level 3, Level 4, and Level 5 layers are kept. After this modification, as shown in FIG. 5, the network structure changes from a five-level pyramid to a three-level pyramid. Assuming a network input size of 512*512, the PyNET network delivers the input image to the Level 5 layer after 4 downsampling steps, so the feature map output by Level 5 is 32*32. The Level 5 features are then upsampled and concatenated with the Level 4 input features (a 64*64 feature map delivered to Level 4 after 3 downsampling steps). The Level 4 features are then upsampled and concatenated with the Level 3 input features (a 128*128 feature map delivered to Level 3 after 2 downsampling steps), and so on, until the 512*512 output of Level 1 is finally obtained. The modified lightweight network, by contrast, contains only 3 Level layers: the network input reaches Level 5 after 2 downsampling steps, producing a 128*128 feature map; the Level 5 features are upsampled and concatenated with the Level 4 input features (a 256*256 feature map delivered to Level 4 after 1 downsampling step); the Level 4 features are upsampled and concatenated with the Level 3 input features (a 512*512 feature map); and the output of Level 3 is the de-moiréd image finally predicted by the network model.
In the embodiments of this application, the convolution kernel channel counts of the Level 5, Level 4, and Level 3 layers of the PyNET network are 512, 256, and 128, respectively. In a convolution layer, if the input is H x W x C, where C is the input depth (the number of channels), the convolution kernel (the filter) must have the same number of channels as the input, namely C. With a kernel size of K x K, one kernel has dimensions K x K x C, and convolving one such kernel with the input yields one output channel; P kernels of size K x K x C yield P output channels. The modified lightweight network reduces the kernel channel counts of the Level 5, Level 4, and Level 3 layers to 128, 64, and 32, respectively. Because the kernel channel count, and hence the kernel dimensionality, decreases, the computation of each matrix multiplication between kernel and input drops dramatically, and the number of output channels falls with it. Since in a convolutional neural network the output of one layer is the input of the next, this change reduces the computation of every subsequent layer exponentially.
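As a rough illustration of why reducing kernel channels shrinks the computation, the short sketch below counts the multiply-accumulate operations of a single convolution layer at the original and reduced channel counts. It is an estimate under assumptions (3x3 kernels, a 128*128 feature map, equal input and output channel counts), not a figure from the disclosure.

```python
def conv_macs(h, w, c_in, c_out, k=3):
    # each output pixel of each output channel costs k * k * c_in multiply-accumulates
    return h * w * c_out * (k * k * c_in)

# assumed 128x128 feature map at the Level 5 scale of the lightweight network
orig = conv_macs(128, 128, c_in=512, c_out=512)  # original channel count
lite = conv_macs(128, 128, c_in=128, c_out=128)  # reduced channel count
print(orig / lite)  # 16.0: a 16x reduction for this one layer alone
```

Because each layer's output channels feed the next layer's input channels, the same factor compounds through the network, which is the exponential effect described above.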
In step 303, the first normalization function is deleted from the third, fourth, and fifth feature extraction layers, a second normalization function is added to the input layer, and the activation functions in the third, fourth, and fifth feature extraction layers are changed to hyperbolic tangent functions, yielding the lightweight network. The second normalization function normalizes the pixel values of the input image from the range (0, 255) to the range (-1, 1).
In the embodiments of this application, stitching the patch-wise inference results of the PyNET network produces severe color shifts. Analysis of the network structure shows that this is caused by the normalization scheme: the normalization statistics were computed per sample, i.e., over the image information of a single patch only, ignoring global image information. In the modified lightweight network, therefore, the original normalization functions are removed and normalization is instead performed at the input layer during training, which solves the per-patch color-shift problem. Concretely, the input image is normalized from the range (0, 255) to (-1, 1). Since the activation function of the PyNET network is the sigmoid, whose range (0, 1) does not match the input range, the activation function is changed to the hyperbolic tangent.
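For concreteness, the following PyTorch sketch shows a three-level pyramid in the spirit described above: tanh activations, input normalization from (0, 255) to (-1, 1), no normalization layers inside the levels, and the 128/64/32 channel widths. The exact composition of each level's blocks, the pooling-based downsampling, and the per-level output heads are assumptions for illustration; this is not the patented architecture itself.

```python
import torch
import torch.nn as nn

def block(c_in, c_out):
    # plain convolution blocks with tanh activations and no normalization layers
    return nn.Sequential(
        nn.Conv2d(c_in, c_out, 3, padding=1), nn.Tanh(),
        nn.Conv2d(c_out, c_out, 3, padding=1), nn.Tanh(),
    )

class LiteDemoire(nn.Module):
    """Illustrative three-level pyramid (Level 5 / Level 4 / Level 3)."""

    def __init__(self):
        super().__init__()
        self.down = nn.MaxPool2d(2)
        self.up = nn.Upsample(scale_factor=2, mode="bilinear", align_corners=False)
        self.level5 = block(3, 128)        # smallest scale, input downsampled twice
        self.level4 = block(3 + 128, 64)   # concatenated with upsampled Level-5 features
        self.level3 = block(3 + 64, 32)    # largest scale, full resolution
        self.out5 = nn.Conv2d(128, 3, 3, padding=1)
        self.out4 = nn.Conv2d(64, 3, 3, padding=1)
        self.out3 = nn.Conv2d(32, 3, 3, padding=1)

    def forward(self, x):
        x = x / 127.5 - 1.0                # normalize (0, 255) -> (-1, 1) at the input
        x4 = self.down(x)                  # 2x downsampled input
        x5 = self.down(x4)                 # 4x downsampled input
        f5 = self.level5(x5)
        f4 = self.level4(torch.cat([x4, self.up(f5)], dim=1))
        f3 = self.level3(torch.cat([x, self.up(f4)], dim=1))
        # per-level predictions: 1/4, 1/2, and full resolution
        return self.out5(f5), self.out4(f4), self.out3(f3)
```

For a 512*512 input this reproduces the sizes above: the Level 5 feature map is 128*128, Level 4 works at 256*256, and the Level 3 output is the full-resolution prediction.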
In the case of the embodiment shown in FIG. 3, as shown in FIG. 6, the training process of the target model may specifically include the following steps: step 601, step 602, and step 603.
In step 601, the plurality of moiré sample images are input into the model to be trained; a first loss is obtained according to the predicted image output by the fifth feature extraction layer of the model and the moiré-free sample image downsampled by a factor of 4; and the parameters of the fifth feature extraction layer are updated according to the first loss until convergence, yielding a first intermediate model. The first loss indicates the difference between the predicted image output by the fifth feature extraction layer and the 4x-downsampled moiré-free sample image.
In step 602, the plurality of moiré sample images are input into the first intermediate model; a second loss is obtained according to the predicted image output by the fourth feature extraction layer of the first intermediate model and the moiré-free sample image downsampled by a factor of 2; and the parameters of the first intermediate model are updated according to the second loss until convergence, yielding a second intermediate model. The second loss indicates the difference between the predicted image output by the fourth feature extraction layer and the 2x-downsampled moiré-free sample image.
In step 603, the plurality of moiré sample images are input into the second intermediate model; a third loss is obtained according to the predicted image output by the third feature extraction layer of the second intermediate model and the corresponding moiré-free sample image; and the model parameters of the second intermediate model are updated according to the third loss until convergence, yielding the target model. The third loss indicates the difference between the predicted image output by the third feature extraction layer and the corresponding moiré-free sample image.
That is, the Level 5 layer is trained first: the moiré sample images are input into the model to be trained, the first loss is obtained from the image output by Level 5 and the clean image downsampled by a factor of 4, and the model parameters are updated according to the first loss until the training condition is met. After Level 5 training finishes, its parameters are loaded and the Level 4 layer is trained next, with the second loss obtained from the image output by Level 4 and the clean image downsampled by a factor of 2. By analogy, after Level 4 training finishes, the parameters are loaded and the Level 3 layer is trained, finally yielding a predicted image at the same resolution as the input.
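The level-by-level schedule just described could be sketched as follows, reusing the illustrative `LiteDemoire` model above. The L1 loss, the Adam optimizer, the learning rate, and the epoch count are all assumptions; the disclosure specifies only that each loss measures the difference between the level's prediction and the clean image downsampled to that level's scale. For simplicity the sketch optimizes all parameters at every stage, whereas the disclosure trains the current level and, at later stages, the whole intermediate model.

```python
import torch.nn.functional as F
from torch.optim import Adam

def train_level(model, loader, level, epochs=10, lr=1e-4):
    """Train one pyramid level against the clean image at its scale."""
    scale = {5: 0.25, 4: 0.5, 3: 1.0}[level]  # Level 5: 4x down, Level 4: 2x, Level 3: full
    opt = Adam(model.parameters(), lr=lr)
    for _ in range(epochs):
        for moire, clean in loader:           # (moiré sample, moiré-free sample) batches
            pred5, pred4, pred3 = model(moire)
            pred = {5: pred5, 4: pred4, 3: pred3}[level]
            target = clean / 127.5 - 1.0      # same normalization as the model input
            if scale != 1.0:
                target = F.interpolate(target, scale_factor=scale,
                                       mode="bilinear", align_corners=False)
            loss = F.l1_loss(pred, target)    # assumed loss; the source says "difference"
            opt.zero_grad()
            loss.backward()
            opt.step()
    return model

# Level 5 first, then Level 4 with the parameters carried over, then Level 3
model = LiteDemoire()
for level in (5, 4, 3):
    model = train_level(model, loader, level)  # `loader` assumed to yield image batches
```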
In the embodiments of this application, model compression is used to restructure the original large de-moiré model without loss of accuracy and reduce its computational cost: for example, the PyNET network requires 1695 GFLOPs for a 512*512 input, while the modified lightweight network requires only 51.6 GFLOPs for the same input.
It can be seen that, in the embodiments of this application, in order to deploy the de-moiré network on electronic devices so that the de-moiré function is triggered automatically when a user photographs an electronic screen with a camera, yielding a moiré-free high-definition image that faithfully reproduces the captured scene, the original PyNET network can be compressed and quantized, greatly reducing the model's computational cost without loss of accuracy.
The model training method provided by the embodiments of this application may be executed by a model training apparatus. In the embodiments of this application, a model training apparatus executing the model training method is taken as an example to describe the model training apparatus provided by the embodiments of this application.
FIG. 7 is a flowchart of an image de-moiré method provided by an embodiment of this application, which performs de-moiré processing based on the target model trained in any of the above embodiments. As shown in FIG. 7, the method may include the following steps: step 701, step 702, step 703, and step 704.
In step 701, a second moiré image to be processed is received.
In one example, the user opens the camera application and the camera preview interface starts. The system obtains the YUV image data of the camera preview and passes it to a subject detection module, which determines whether the YUV image contains moiré. Specifically, an existing image classification algorithm processes the input image and outputs whether moiré is present. If the input image contains no moiré, the flow jumps directly to the preview interface; if it does, the de-moiré algorithm is invoked. Detecting moiré in the camera preview with an image classification algorithm removes moiré automatically, without any manual adjustment by the user and without any jarring transition.
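That dispatch might be sketched as a small function; the callables stand in for the classifier and the target model, since the disclosure does not fix a particular image classification algorithm:

```python
from typing import Callable
import numpy as np

def on_preview_frame(frame: np.ndarray,
                     has_moire: Callable[[np.ndarray], bool],
                     demoire: Callable[[np.ndarray], np.ndarray]) -> np.ndarray:
    """Dispatch one camera preview frame: de-moiré only when moiré is detected."""
    if has_moire(frame):       # image-classification check on the preview frame
        return demoire(frame)  # invoke the de-moiré algorithm automatically
    return frame               # no moiré: show the preview unchanged
```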
In step 702, when the size of the second moiré image exceeds the maximum size recognizable by the target model, the second moiré image is split into N moiré sub-images, where each of the N moiré sub-images overlaps its adjacent sub-images and N is an integer greater than 1.
In step 703, the N moiré sub-images are input into the target model for processing, yielding N moiré-free sub-images.
In the embodiments of this application, because of the memory constraints on the mobile phone, a large input image cannot be processed directly; it must be split into multiple patches that are fed into the network one by one for prediction.
In step 704, the N moiré-free sub-images are stitched, and a pixel-weighted-average operation is performed on the overlapping regions during stitching, yielding the second moiré-free image corresponding to the second moiré image.
In the embodiments of this application, to eliminate stitching seams, the input image must not be divided into equal non-overlapping parts. For example, a 3000*3000 input image can be traversed with a 1120*1120 sliding window at a stride of 940, producing nine 1120*1120 patches with overlapping regions between adjacent patches; a weighted average of the pixels in the overlapping regions eliminates the stitching seams.
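The sliding-window split and the weighted blend of the overlaps could be sketched as below. The linear feathering weights are an assumption; the disclosure requires only a pixel-weighted average over the overlapping regions. With the numbers above (a 3000*3000 image, a 1120*1120 window, stride 940), this produces exactly the nine overlapping patches described.

```python
import numpy as np

def demoire_tiled(image, model_fn, patch=1120, stride=940):
    """Split `image` into overlapping patches, run `model_fn` on each patch,
    and blend the outputs with a per-pixel weighted average."""
    h, w, c = image.shape
    assert h >= patch and w >= patch  # only images beyond the model's max size are split
    out = np.zeros((h, w, c), np.float32)
    weight = np.zeros((h, w, 1), np.float32)

    # feathering mask: heaviest at the patch center, tapering toward the edges,
    # so overlapping predictions blend smoothly and no stitching seam remains
    ramp = np.minimum(np.arange(1, patch + 1), np.arange(patch, 0, -1)).astype(np.float32)
    mask = (ramp[:, None] * ramp[None, :])[..., None]

    ys = list(range(0, h - patch + 1, stride))
    xs = list(range(0, w - patch + 1, stride))
    if ys[-1] != h - patch:
        ys.append(h - patch)  # cover the bottom border
    if xs[-1] != w - patch:
        xs.append(w - patch)  # cover the right border

    for y in ys:
        for x in xs:
            pred = model_fn(image[y:y + patch, x:x + patch]).astype(np.float32)
            out[y:y + patch, x:x + patch] += pred * mask
            weight[y:y + patch, x:x + patch] += mask
    return (out / weight).astype(image.dtype)
```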
As can be seen from the above, in this embodiment, when the target model is used for de-moiré processing, a large image to be processed can be split into multiple overlapping parts; each part is input into the model separately to obtain a corresponding moiré-free high-definition image, the per-part images are stitched, and a pixel-level weighted-average operation is performed on the overlapping regions of each pair of images, yielding a final seamless complete high-definition image with good de-moiré quality.
The image de-moiré method provided by the embodiments of this application may be executed by an image de-moiré apparatus. In the embodiments of this application, an image de-moiré apparatus executing the image de-moiré method is taken as an example to describe the image de-moiré apparatus provided by the embodiments of this application.
It can be seen that the embodiments of this application train with synthesized data, so that the model's predictions faithfully restore the colors of the original image, and use model compression to greatly reduce the model's computational cost without loss of accuracy. Detecting moiré in the camera preview with an image classification algorithm removes moiré automatically, without any manual adjustment by the user and without any jarring transition. The de-moiréd image restores what the human eye saw at capture time, so the photo matches human visual perception. A before-and-after comparison of the lightweight de-moiré model trained on the synthesized data illustrates this de-moiré effect.
FIG. 8 is a structural block diagram of a model training apparatus provided by an embodiment of this application. As shown in FIG. 8, the model training apparatus 800 may include: an acquisition module 801, a construction module 802, and a training module 803.
The acquisition module 801 is configured to acquire a plurality of moiré sample images and corresponding moiré-free sample images.
The construction module 802 is configured to construct a model to be trained, where the model to be trained is built on a lightweight network that includes feature extraction layers of multiple different scales.
The training module 803 is configured to input the plurality of moiré sample images into the model to be trained, obtain a first loss according to the predicted image output by the smallest-scale feature extraction layer of the model and the moiré-free sample image downsampled to the same scale, update the parameters of the smallest-scale feature extraction layer according to the first loss until a preset training condition is met, and, after completing the training of the smallest-scale feature extraction layer, apply the same training process to the feature extraction layer of the next larger scale in the model, until training is completed on the largest-scale feature extraction layer, to obtain the target model.
As can be seen from the above, in this embodiment an existing deep learning network can be compressed and quantized into a lightweight network on which the model is trained, reducing the model's computational cost without loss of accuracy. This allows the de-moiré network to be deployed on electronic devices, so that the de-moiré function is triggered automatically when a user captures images with an electronic device, quickly producing a moiré-free high-definition image that faithfully reproduces the captured scene and improving de-moiré efficiency.
Optionally, as an embodiment, the model training apparatus 800 may further include:
a generation module configured to obtain a PyNET network, delete the feature extraction layers of specific scales from the PyNET network, reduce the number of convolution kernel channels of the retained feature extraction layers to preset values, and modify the activation functions and normalization functions in the retained feature extraction layers, to obtain the lightweight network, where a feature extraction layer of a specific scale extracts features of the input image at that scale.
Optionally, as an embodiment, the generation module may include:
a first acquisition submodule configured to obtain a PyNET network, where the PyNET network includes an input layer and first through fifth feature extraction layers that respectively extract features of the input image at 5 different scales, the scale of the features extracted by the i-th feature extraction layer being larger than that extracted by the (i+1)-th, 1≤i≤5;
a first modification submodule configured to delete the first and second feature extraction layers from the PyNET network, retain the third, fourth, and fifth feature extraction layers, and adjust the convolution kernel channel count of the third feature extraction layer from a first value to a second value, that of the fourth from a third value to a fourth value, and that of the fifth from a fifth value to a sixth value, where the first value is greater than the second, the third greater than the fourth, and the fifth greater than the sixth;
a second modification submodule configured to delete the first normalization function from the third, fourth, and fifth feature extraction layers, add a second normalization function to the input layer, and change the activation functions in the third, fourth, and fifth feature extraction layers to hyperbolic tangent functions, to obtain the lightweight network, where the second normalization function normalizes the pixel values of the input image from the range (0, 255) to the range (-1, 1).
Optionally, as an embodiment, the training module 803 may include:
a first training submodule configured to input the plurality of moiré sample images into the model to be trained, obtain a first loss according to the predicted image output by the fifth feature extraction layer of the model and the moiré-free sample image downsampled by a factor of 4, and update the parameters of the fifth feature extraction layer according to the first loss until convergence, to obtain a first intermediate model, where the first loss indicates the difference between the predicted image output by the fifth feature extraction layer and the 4x-downsampled moiré-free sample image;
a second training submodule configured to input the plurality of moiré sample images into the first intermediate model, obtain a second loss according to the predicted image output by the fourth feature extraction layer of the first intermediate model and the moiré-free sample image downsampled by a factor of 2, and update the parameters of the first intermediate model according to the second loss until convergence, to obtain a second intermediate model, where the second loss indicates the difference between the predicted image output by the fourth feature extraction layer and the 2x-downsampled moiré-free sample image;
a third training submodule configured to input the plurality of moiré sample images into the second intermediate model, obtain a third loss according to the predicted image output by the third feature extraction layer of the second intermediate model and the corresponding moiré-free sample image, and update the model parameters of the second intermediate model according to the third loss until convergence, to obtain the target model, where the third loss indicates the difference between the predicted image output by the third feature extraction layer and the corresponding moiré-free sample image.
Optionally, as an embodiment, the acquisition module 801 may include:
a second acquisition submodule configured to obtain a screenshot from a display device;
a first generation submodule configured to photograph, with the camera in focus, the white image displayed on the display device to obtain a first moiré image, and generate a moiré sample image according to the screenshot, the white image, and the first moiré image;
a second generation submodule configured to photograph, with the camera out of focus, the white image displayed on the display device to obtain a first moiré-free image, and generate, according to the screenshot, the white image, and the first moiré-free image, a moiré-free sample image corresponding to the moiré sample image.
Optionally, as an embodiment, the first generation submodule may include:
a first acquisition unit configured to obtain the RGB value I_bg of each pixel in the screenshot, the RGB value I_0 of each pixel in the white image, and the RGB value I_moire1 of each pixel in the first moiré image;
a first computation unit configured to compute the moiré noise I_moire-feature according to I_0 and I_moire1;
a first generation unit configured to compute the RGB value I_moire2 of each pixel in the moiré sample image according to I_moire-feature and I_bg, and generate the moiré sample image according to I_moire2;
and the second generation submodule may include:
a second acquisition unit configured to obtain the RGB value I_clean1 of each pixel in the first moiré-free image;
a second computation unit configured to compute the moiré-free noise I_clean-feature according to I_clean1 and I_0;
a second generation unit configured to compute the RGB value I_clean2 of each pixel in the moiré-free sample image corresponding to the moiré sample image according to I_clean-feature and I_bg, and generate the moiré-free sample image according to I_clean2.
FIG. 9 is a structural block diagram of an image de-moiré apparatus provided by an embodiment of this application. As shown in FIG. 9, the image de-moiré apparatus 900 may include: a receiving module 901, a splitting module 902, a first processing module 903, and a second processing module 904.
The receiving module 901 is configured to receive a second moiré image to be processed.
The splitting module 902 is configured to split the second moiré image into N moiré sub-images when the size of the second moiré image exceeds the maximum size recognizable by the target model, where each of the N moiré sub-images overlaps its adjacent sub-images and N is an integer greater than 1.
The first processing module 903 is configured to input the N moiré sub-images into the target model for processing, to obtain N moiré-free sub-images.
The second processing module 904 is configured to stitch the N moiré-free sub-images and perform a pixel-weighted-average operation on the overlapping regions during stitching, to obtain a second moiré-free image corresponding to the second moiré image.
As can be seen from the above, in this embodiment, when the target model is used for de-moiré processing, a large image to be processed can be split into multiple overlapping parts; each part is input into the model separately to obtain a corresponding moiré-free high-definition image, the per-part images are stitched, and a pixel-level weighted-average operation is performed on the overlapping regions of each pair of images, yielding a final seamless complete high-definition image with good de-moiré quality.
The model training apparatus and the image de-moiré apparatus in the embodiments of this application may be an electronic device or a component of an electronic device, such as an integrated circuit or a chip. The electronic device may be a terminal or a device other than a terminal. By way of example, the electronic device may be a mobile phone, tablet computer, laptop computer, handheld computer, in-vehicle electronic device, mobile internet device (MID), augmented reality (AR)/virtual reality (VR) device, robot, wearable device, ultra-mobile personal computer (UMPC), netbook, or personal digital assistant (PDA), or may be a server, network attached storage (NAS), personal computer (PC), television (TV), teller machine, self-service machine, or the like; this is not specifically limited in the embodiments of this application.
The model training apparatus and the image de-moiré apparatus in the embodiments of this application may be devices with an operating system, which may be Android, iOS, or another possible operating system; this is not specifically limited in the embodiments of this application.
The model training apparatus and image de-moiré apparatus provided by the embodiments of this application can implement each process implemented by the above model training method and image de-moiré method embodiments; to avoid repetition, details are not repeated here.
Optionally, as shown in FIG. 10, an embodiment of this application further provides an electronic device 1000, including a processor 1001 and a memory 1002 storing a program or instructions executable on the processor 1001; when executed by the processor 1001, the program or instructions implement each step of the above model training method or image de-moiré method embodiments with the same technical effects, which are not repeated here to avoid repetition.
It should be noted that the electronic devices in the embodiments of this application include the mobile and non-mobile electronic devices described above.
FIG. 11 is a schematic diagram of the hardware structure of an electronic device implementing the embodiments of this application. The electronic device 1100 includes, but is not limited to: a radio frequency unit 1101, a network module 1102, an audio output unit 1103, an input unit 1104, a sensor 1105, a display unit 1106, a user input unit 1107, an interface unit 1108, a memory 1109, and a processor 1110.
Those skilled in the art will understand that the electronic device 1100 may also include a power supply (such as a battery) for powering the components, which may be logically connected to the processor 1110 through a power management system to manage charging, discharging, and power consumption. The structure shown in FIG. 11 does not limit the electronic device, which may include more or fewer components than shown, combine certain components, or arrange components differently; details are not repeated here.
In an embodiment provided by this application, when the electronic device executes the model training method of the embodiment shown in FIG. 1, the processor 1110 is configured to: acquire a plurality of moiré sample images and corresponding moiré-free sample images; construct a model to be trained, where the model to be trained is built on a lightweight network that includes feature extraction layers of multiple different scales; input the plurality of moiré sample images into the model to be trained, and obtain a first loss according to the predicted image output by the smallest-scale feature extraction layer of the model and the moiré-free sample image downsampled to the same scale; update the parameters of the smallest-scale feature extraction layer according to the first loss until a preset training condition is met; and, after completing the training of the smallest-scale feature extraction layer, apply the same training process to the feature extraction layer of the next larger scale in the model, until training is completed on the largest-scale feature extraction layer, to obtain the target model.
It can be seen that, in the embodiments of this application, an existing deep learning network can be compressed and quantized into a lightweight network on which the model is trained, reducing the model's computational cost without loss of accuracy. This allows the de-moiré network to be deployed on electronic devices, so that the de-moiré function is triggered automatically when a user captures images with an electronic device, quickly producing a moiré-free high-definition image that faithfully reproduces the captured scene and improving de-moiré efficiency.
Optionally, as an embodiment, the processor 1110 is further configured to obtain a PyNET network, delete the feature extraction layers of specific scales from the PyNET network, reduce the number of convolution kernel channels of the retained feature extraction layers to preset values, and modify the activation functions and normalization functions in the retained feature extraction layers, to obtain the lightweight network, where a feature extraction layer of a specific scale extracts features of the input image at that scale.
Optionally, as an embodiment, the processor 1110 is further configured to: obtain a PyNET network, where the PyNET network includes an input layer and first through fifth feature extraction layers that respectively extract features of the input image at 5 different scales, the scale of the features extracted by the i-th feature extraction layer being larger than that extracted by the (i+1)-th, 1≤i≤5; delete the first and second feature extraction layers from the PyNET network, retain the third, fourth, and fifth feature extraction layers, and adjust the convolution kernel channel count of the third feature extraction layer from a first value to a second value, that of the fourth from a third value to a fourth value, and that of the fifth from a fifth value to a sixth value, where the first value is greater than the second, the third greater than the fourth, and the fifth greater than the sixth; and delete the first normalization function from the third, fourth, and fifth feature extraction layers, add a second normalization function to the input layer, and change the activation functions in the third, fourth, and fifth feature extraction layers to hyperbolic tangent functions, to obtain the lightweight network, where the second normalization function normalizes the pixel values of the input image from the range (0, 255) to the range (-1, 1).
Optionally, as an embodiment, the processor 1110 is further configured to: input the plurality of moiré sample images into the model to be trained, obtain a first loss according to the predicted image output by the fifth feature extraction layer of the model and the moiré-free sample image downsampled by a factor of 4, and update the parameters of the fifth feature extraction layer according to the first loss until convergence, to obtain a first intermediate model; input the plurality of moiré sample images into the first intermediate model, obtain a second loss according to the predicted image output by the fourth feature extraction layer of the first intermediate model and the moiré-free sample image downsampled by a factor of 2, and update the parameters of the first intermediate model according to the second loss until convergence, to obtain a second intermediate model; and input the plurality of moiré sample images into the second intermediate model, obtain a third loss according to the predicted image output by the third feature extraction layer of the second intermediate model and the corresponding moiré-free sample image, and update the model parameters of the second intermediate model according to the third loss until convergence, to obtain the target model.
Optionally, as an embodiment, the processor 1110 is further configured to: obtain a screenshot from a display device; with the camera in focus, photograph the white image displayed on the display device to obtain a first moiré image, and generate a moiré sample image according to the screenshot, the white image, and the first moiré image; and, with the camera out of focus, photograph the white image displayed on the display device to obtain a first moiré-free image, and generate a moiré-free sample image according to the screenshot, the white image, and the first moiré-free image.
Optionally, as an embodiment, the processor 1110 is further configured to: obtain the RGB value I_bg of each pixel in the screenshot, the RGB value I_0 of each pixel in the white image, and the RGB value I_moire1 of each pixel in the first moiré image; compute the moiré noise I_moire-feature according to I_0 and I_moire1; compute the RGB value I_moire2 of each pixel in the moiré sample image according to I_moire-feature and I_bg, and generate the moiré sample image according to I_moire2; obtain the RGB value I_clean1 of each pixel in the first moiré-free image; compute the moiré-free noise I_clean-feature according to I_clean1 and I_0; and compute the RGB value I_clean2 of each pixel in the moiré-free sample image corresponding to the moiré sample image according to I_clean-feature and I_bg, and generate the moiré-free sample image according to I_clean2.
In another embodiment provided by this application, when the electronic device executes the image de-moiré method of the embodiment shown in FIG. 7, the processor 1110 is configured to: receive a second moiré image to be processed; when the size of the second moiré image exceeds the maximum size recognizable by the target model, split the second moiré image into N moiré sub-images, where each of the N moiré sub-images overlaps its adjacent sub-images and N is an integer greater than 1; input the N moiré sub-images into the target model for processing, to obtain N moiré-free sub-images; and stitch the N moiré-free sub-images, performing a pixel-weighted-average operation on the overlapping regions during stitching, to obtain a second moiré-free image corresponding to the second moiré image.
It can be seen that, in the embodiments of this application, when the target model is used for de-moiré processing, a large image to be processed can be split into multiple overlapping parts; each part is input into the model separately to obtain a corresponding moiré-free high-definition image, the per-part images are stitched, and a pixel-level weighted-average operation is performed on the overlapping regions of each pair of images, yielding a final seamless complete high-definition image with good de-moiré quality.
It should be understood that, in the embodiments of this application, the input unit 1104 may include a graphics processing unit (GPU) 11041 and a microphone 11042; the graphics processor 11041 processes image data of still images or video obtained by an image capture device (such as a camera) in video capture mode or image capture mode. The display unit 1106 may include a display panel 11061, which may be configured in the form of a liquid crystal display, an organic light-emitting diode display, or the like. The user input unit 1107 includes at least one of a touch panel 11071 and other input devices 11072. The touch panel 11071, also called a touch screen, may include two parts: a touch detection device and a touch controller. Other input devices 11072 may include, but are not limited to, a physical keyboard, function keys (such as volume control keys and switch keys), a trackball, a mouse, and a joystick; details are not repeated here.
The memory 1109 can be used to store software programs as well as various data. It may mainly include a first storage area for storing programs or instructions and a second storage area for storing data, where the first storage area may store an operating system and the applications or instructions required by at least one function (such as a sound playback function or an image playback function). In addition, the memory 1109 may include volatile memory or non-volatile memory, or both. The non-volatile memory may be read-only memory (ROM), programmable ROM (PROM), erasable PROM (EPROM), electrically erasable PROM (EEPROM), or flash memory. The volatile memory may be random access memory (RAM), static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), synch-link DRAM (SLDRAM), or direct Rambus RAM (DRRAM). The memory 1109 in the embodiments of this application includes, but is not limited to, these and any other suitable types of memory.
The processor 1110 may include one or more processing units; optionally, the processor 1110 integrates an application processor and a modem processor, where the application processor mainly handles operations involving the operating system, user interface, and applications, and the modem processor mainly handles wireless communication signals, such as a baseband processor. It can be understood that the modem processor may alternatively not be integrated into the processor 1110.
An embodiment of this application also provides a readable storage medium on which a program or instructions are stored; when the program or instructions are executed by a processor, each process of the above model training method or image de-moiré method embodiments is implemented, with the same technical effects, which are not repeated here to avoid repetition.
The processor is the processor in the electronic device described in the above embodiments. The readable storage medium includes a computer-readable storage medium, such as a computer read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.
An embodiment of this application also provides a chip that includes a processor and a communication interface coupled to the processor; the processor is used to run a program or instructions to implement each process of the above model training method or image de-moiré method embodiments, with the same technical effects, which are not repeated here to avoid repetition.
It should be understood that the chip mentioned in the embodiments of this application may also be called a system-level chip, a system chip, a chip system, or a system-on-chip.
An embodiment of this application also provides a computer program product stored in a storage medium; the program product is executed by at least one processor to implement each process of the above model training method or image de-moiré method embodiments, with the same technical effects, which are not repeated here to avoid repetition.
It should be noted that, as used herein, the terms "comprise", "include", and any variants thereof are intended to cover a non-exclusive inclusion, so that a process, method, article, or apparatus comprising a set of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a process, method, article, or apparatus. Without further limitation, an element preceded by "comprising a ..." does not preclude the presence of additional identical elements in the process, method, article, or apparatus that comprises the element. In addition, it should be pointed out that the scope of the methods and apparatuses in the embodiments of this application is not limited to performing functions in the order shown or discussed; depending on the functions involved, functions may also be performed in a substantially simultaneous manner or in the reverse order. For example, the described methods may be performed in an order different from that described, and various steps may also be added, omitted, or combined. Moreover, features described with reference to certain examples may be combined in other examples.
From the description of the above implementations, those skilled in the art will clearly understand that the methods of the above embodiments can be implemented by means of software plus a necessary general-purpose hardware platform, or of course by hardware, but in many cases the former is the better implementation. Based on this understanding, the technical solution of this application, in essence or in the part contributing to the prior art, can be embodied in the form of a computer software product stored in a storage medium (such as a ROM/RAM, magnetic disk, or optical disc), including several instructions that cause a terminal (which may be a mobile phone, computer, server, network device, or the like) to execute the methods described in the embodiments of this application.
The embodiments of this application have been described above with reference to the accompanying drawings, but this application is not limited to the specific implementations described above, which are merely illustrative rather than restrictive. Inspired by this application, a person of ordinary skill in the art may devise many other forms without departing from the spirit of this application and the scope protected by the claims, all of which fall within the protection of this application.

Claims (19)

  1. A model training method, wherein the method comprises:
    acquiring a plurality of moiré sample images and corresponding moiré-free sample images;
    constructing a model to be trained, wherein the model to be trained is built on a lightweight network, and the lightweight network comprises feature extraction layers of multiple different scales;
    inputting the plurality of moiré sample images into the model to be trained, and obtaining a first loss according to the predicted image output by the smallest-scale feature extraction layer of the model to be trained and the moiré-free sample image downsampled to the same scale;
    updating parameters of the smallest-scale feature extraction layer according to the first loss until a preset training condition is met; and
    after completing the training of the smallest-scale feature extraction layer, applying the same training process to the feature extraction layer of the next larger scale in the model to be trained, until training is completed on the largest-scale feature extraction layer, to obtain a target model.
  2. The method according to claim 1, wherein before the constructing a model to be trained, the method further comprises:
    obtaining a PyNET network, deleting feature extraction layers of specific scales from the PyNET network, reducing the number of convolution kernel channels of the retained feature extraction layers to preset values, and modifying the activation functions and normalization functions in the retained feature extraction layers, to obtain the lightweight network, wherein a feature extraction layer of a specific scale is used to extract features of the specific scale from an input image.
  3. The method according to claim 2, wherein the obtaining a PyNET network, deleting feature extraction layers of specific scales from the PyNET network, reducing the number of convolution kernel channels of the retained feature extraction layers to preset values, and modifying the activation functions and normalization functions in the retained feature extraction layers, to obtain the lightweight network comprises:
    obtaining a PyNET network, wherein the PyNET network comprises an input layer, a first feature extraction layer, a second feature extraction layer, a third feature extraction layer, a fourth feature extraction layer, and a fifth feature extraction layer, the first to fifth feature extraction layers being respectively used to extract features of an input image at 5 different scales, and the scale of the features extracted by the i-th feature extraction layer being larger than the scale of the features extracted by the (i+1)-th feature extraction layer, 1≤i≤5;
    deleting the first feature extraction layer and the second feature extraction layer from the PyNET network, retaining the third feature extraction layer, the fourth feature extraction layer, and the fifth feature extraction layer, adjusting the number of convolution kernel channels of the third feature extraction layer from a first value to a second value, adjusting the number of convolution kernel channels of the fourth feature extraction layer from a third value to a fourth value, and adjusting the number of convolution kernel channels of the fifth feature extraction layer from a fifth value to a sixth value, wherein the first value is greater than the second value, the third value is greater than the fourth value, and the fifth value is greater than the sixth value; and
    deleting the first normalization function from the third feature extraction layer, the fourth feature extraction layer, and the fifth feature extraction layer, adding a second normalization function to the input layer, and changing the activation functions in the third feature extraction layer, the fourth feature extraction layer, and the fifth feature extraction layer to hyperbolic tangent functions, to obtain the lightweight network, wherein the second normalization function is used to normalize pixel values of the input image from the range (0, 255) to the range (-1, 1).
  4. The method according to claim 3, wherein the inputting the plurality of moiré sample images into the model to be trained, obtaining a first loss according to the predicted image output by the smallest-scale feature extraction layer of the model to be trained and the moiré-free sample image downsampled to the same scale, updating parameters of the smallest-scale feature extraction layer according to the first loss until a preset training condition is met, and, after completing the training of the smallest-scale feature extraction layer, applying the same training process to the feature extraction layer of the next larger scale in the model to be trained, until training is completed on the largest-scale feature extraction layer, to obtain a target model comprises:
    inputting the plurality of moiré sample images into the model to be trained, obtaining a first loss according to the predicted image output by the fifth feature extraction layer of the model to be trained and the moiré-free sample image downsampled by a factor of 4, and updating the parameters of the fifth feature extraction layer according to the first loss until convergence, to obtain a first intermediate model, wherein the first loss indicates the difference between the predicted image output by the fifth feature extraction layer and the 4x-downsampled moiré-free sample image;
    inputting the plurality of moiré sample images into the first intermediate model, obtaining a second loss according to the predicted image output by the fourth feature extraction layer of the first intermediate model and the moiré-free sample image downsampled by a factor of 2, and updating the parameters of the first intermediate model according to the second loss until convergence, to obtain a second intermediate model, wherein the second loss indicates the difference between the predicted image output by the fourth feature extraction layer and the 2x-downsampled moiré-free sample image; and
    inputting the plurality of moiré sample images into the second intermediate model, obtaining a third loss according to the predicted image output by the third feature extraction layer of the second intermediate model and the corresponding moiré-free sample image, and updating the model parameters of the second intermediate model according to the third loss until convergence, to obtain the target model, wherein the third loss indicates the difference between the predicted image output by the third feature extraction layer and the corresponding moiré-free sample image.
  5. The method according to claim 1, wherein the acquiring a plurality of moiré sample images and corresponding moiré-free sample images comprises:
    obtaining a screenshot from a display device;
    with a camera in focus, photographing a white image displayed on the display device to obtain a first moiré image, and generating a moiré sample image according to the screenshot, the white image, and the first moiré image; and
    with the camera out of focus, photographing the white image displayed on the display device to obtain a first moiré-free image, and generating, according to the screenshot, the white image, and the first moiré-free image, a moiré-free sample image corresponding to the moiré sample image.
  6. The method according to claim 5, wherein the generating a moiré sample image according to the screenshot, the white image, and the first moiré image comprises:
    obtaining the RGB value I_bg of each pixel in the screenshot, the RGB value I_0 of each pixel in the white image, and the RGB value I_moire1 of each pixel in the first moiré image;
    computing the moiré noise I_moire-feature according to I_0 and I_moire1; and
    computing the RGB value I_moire2 of each pixel in the moiré sample image according to I_moire-feature and I_bg, and generating the moiré sample image according to I_moire2;
    and the generating, according to the screenshot, the white image, and the first moiré-free image, a moiré-free sample image corresponding to the moiré sample image comprises:
    obtaining the RGB value I_clean1 of each pixel in the first moiré-free image;
    computing the moiré-free noise I_clean-feature according to I_clean1 and I_0; and
    computing the RGB value I_clean2 of each pixel in the moiré-free sample image corresponding to the moiré sample image according to I_clean-feature and I_bg, and generating the moiré-free sample image according to I_clean2.
  7. An image de-moiré method for performing de-moiré processing based on the target model generated in any one of claims 1 to 6, wherein the method comprises:
    receiving a second moiré image to be processed;
    when the size of the second moiré image exceeds the maximum size recognizable by the target model, splitting the second moiré image into N moiré sub-images, wherein each of the N moiré sub-images overlaps its adjacent sub-images, and N is an integer greater than 1;
    inputting the N moiré sub-images into the target model for processing, to obtain N moiré-free sub-images; and
    stitching the N moiré-free sub-images, and performing a pixel-weighted-average operation on the overlapping regions during stitching, to obtain a second moiré-free image corresponding to the second moiré image.
  8. A model training apparatus, wherein the apparatus comprises:
    an acquisition module configured to acquire a plurality of moiré sample images and corresponding moiré-free sample images;
    a construction module configured to construct a model to be trained, wherein the model to be trained is built on a lightweight network, and the lightweight network comprises feature extraction layers of multiple different scales; and
    a training module configured to input the plurality of moiré sample images into the model to be trained, and obtain a first loss according to the predicted image output by the smallest-scale feature extraction layer of the model to be trained and the moiré-free sample image downsampled to the same scale,
    update parameters of the smallest-scale feature extraction layer according to the first loss until a preset training condition is met, and,
    after completing the training of the smallest-scale feature extraction layer, apply the same training process to the feature extraction layer of the next larger scale in the model to be trained, until training is completed on the largest-scale feature extraction layer, to obtain a target model.
  9. The apparatus according to claim 8, wherein the apparatus further comprises:
    a generation module configured to obtain a PyNET network, delete feature extraction layers of specific scales from the PyNET network, reduce the number of convolution kernel channels of the retained feature extraction layers to preset values, and modify the activation functions and normalization functions in the retained feature extraction layers, to obtain the lightweight network, wherein a feature extraction layer of a specific scale is used to extract features of the specific scale from an input image.
  10. The apparatus according to claim 9, wherein the generation module comprises:
    a first acquisition submodule configured to obtain a PyNET network, wherein the PyNET network comprises an input layer, a first feature extraction layer, a second feature extraction layer, a third feature extraction layer, a fourth feature extraction layer, and a fifth feature extraction layer, the first to fifth feature extraction layers being respectively used to extract features of an input image at 5 different scales, and the scale of the features extracted by the i-th feature extraction layer being larger than the scale of the features extracted by the (i+1)-th feature extraction layer, 1≤i≤5;
    a first modification submodule configured to delete the first feature extraction layer and the second feature extraction layer from the PyNET network, retain the third, fourth, and fifth feature extraction layers, adjust the number of convolution kernel channels of the third feature extraction layer from a first value to a second value, adjust the number of convolution kernel channels of the fourth feature extraction layer from a third value to a fourth value, and adjust the number of convolution kernel channels of the fifth feature extraction layer from a fifth value to a sixth value, wherein the first value is greater than the second value, the third value is greater than the fourth value, and the fifth value is greater than the sixth value; and
    a second modification submodule configured to delete the first normalization function from the third, fourth, and fifth feature extraction layers, add a second normalization function to the input layer, and change the activation functions in the third, fourth, and fifth feature extraction layers to hyperbolic tangent functions, to obtain the lightweight network, wherein the second normalization function is used to normalize pixel values of the input image from the range (0, 255) to the range (-1, 1).
  11. The apparatus according to claim 10, wherein the training module comprises:
    a first training submodule configured to input the plurality of moiré sample images into the model to be trained, obtain a first loss according to the predicted image output by the fifth feature extraction layer of the model to be trained and the moiré-free sample image downsampled by a factor of 4, and update the parameters of the fifth feature extraction layer according to the first loss until convergence, to obtain a first intermediate model, wherein the first loss indicates the difference between the predicted image output by the fifth feature extraction layer and the 4x-downsampled moiré-free sample image;
    a second training submodule configured to input the plurality of moiré sample images into the first intermediate model, obtain a second loss according to the predicted image output by the fourth feature extraction layer of the first intermediate model and the moiré-free sample image downsampled by a factor of 2, and update the parameters of the first intermediate model according to the second loss until convergence, to obtain a second intermediate model, wherein the second loss indicates the difference between the predicted image output by the fourth feature extraction layer and the 2x-downsampled moiré-free sample image; and
    a third training submodule configured to input the plurality of moiré sample images into the second intermediate model, obtain a third loss according to the predicted image output by the third feature extraction layer of the second intermediate model and the corresponding moiré-free sample image, and update the model parameters of the second intermediate model according to the third loss until convergence, to obtain the target model, wherein the third loss indicates the difference between the predicted image output by the third feature extraction layer and the corresponding moiré-free sample image.
  12. The apparatus according to claim 8, wherein the acquisition module comprises:
    a second acquisition submodule configured to obtain a screenshot from a display device;
    a first generation submodule configured to photograph, with a camera in focus, a white image displayed on the display device to obtain a first moiré image, and generate a moiré sample image according to the screenshot, the white image, and the first moiré image; and
    a second generation submodule configured to photograph, with the camera out of focus, the white image displayed on the display device to obtain a first moiré-free image, and generate, according to the screenshot, the white image, and the first moiré-free image, a moiré-free sample image corresponding to the moiré sample image.
  13. The apparatus according to claim 12, wherein the first generation submodule comprises:
    a first acquisition unit configured to obtain the RGB value I_bg of each pixel in the screenshot, the RGB value I_0 of each pixel in the white image, and the RGB value I_moire1 of each pixel in the first moiré image;
    a first computation unit configured to compute the moiré noise I_moire-feature according to I_0 and I_moire1; and
    a first generation unit configured to compute the RGB value I_moire2 of each pixel in the moiré sample image according to I_moire-feature and I_bg, and generate the moiré sample image according to I_moire2;
    and the second generation submodule comprises:
    a second acquisition unit configured to obtain the RGB value I_clean1 of each pixel in the first moiré-free image;
    a second computation unit configured to compute the moiré-free noise I_clean-feature according to I_clean1 and I_0; and
    a second generation unit configured to compute the RGB value I_clean2 of each pixel in the moiré-free sample image corresponding to the moiré sample image according to I_clean-feature and I_bg, and generate the moiré-free sample image according to I_clean2.
  14. An image de-moiré apparatus for performing de-moiré processing based on the target model generated in any one of claims 8 to 13, wherein the apparatus comprises:
    a receiving module configured to receive a second moiré image to be processed;
    a splitting module configured to split the second moiré image into N moiré sub-images when the size of the second moiré image exceeds the maximum size recognizable by the target model, wherein each of the N moiré sub-images overlaps its adjacent sub-images, and N is an integer greater than 1;
    a first processing module configured to input the N moiré sub-images into the target model for processing, to obtain N moiré-free sub-images; and
    a second processing module configured to stitch the N moiré-free sub-images and perform a pixel-weighted-average operation on the overlapping regions during stitching, to obtain a second moiré-free image corresponding to the second moiré image.
  15. An electronic device, wherein the electronic device comprises a processor and a memory, the memory stores a program or instructions executable on the processor, and the program or instructions, when executed by the processor, implement the steps of the model training method according to any one of claims 1 to 6, or implement the steps of the image de-moiré method according to claim 7.
  16. A readable storage medium, wherein a program or instructions are stored on the readable storage medium, and the program or instructions, when executed by a processor, implement the steps of the model training method according to any one of claims 1 to 6, or implement the steps of the image de-moiré method according to claim 7.
  17. A chip, wherein the chip comprises a processor and a communication interface, the communication interface is coupled to the processor, and the processor is configured to run a program or instructions to implement the steps of the model training method according to any one of claims 1 to 6, or to implement the steps of the image de-moiré method according to claim 7.
  18. A computer program product, wherein the program product is stored in a non-volatile storage medium, and the program product is executed by at least one processor to implement the steps of the model training method according to any one of claims 1 to 6, or to implement the steps of the image de-moiré method according to claim 7.
  19. An electronic device, wherein the electronic device is configured to implement the steps of the model training method according to any one of claims 1 to 6, or to implement the steps of the image de-moiré method according to claim 7.
PCT/CN2023/074325 2022-02-08 2023-02-03 Model training method, image de-moiré method, apparatus, and electronic device WO2023151511A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202210118889.8A CN116612015A (zh) 2022-02-08 2022-02-08 Model training method, image de-moiré method, apparatus, and electronic device
CN202210118889.8 2022-02-08

Publications (1)

Publication Number Publication Date
WO2023151511A1 true WO2023151511A1 (zh) 2023-08-17

Family

ID=87563605

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2023/074325 WO2023151511A1 (zh) 2022-02-08 2023-02-03 Model training method, image de-moiré method, apparatus, and electronic device

Country Status (2)

Country Link
CN (1) CN116612015A (zh)
WO (1) WO2023151511A1 (zh)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117333399A (zh) * 2023-10-27 2024-01-02 Tianjin University Raw-domain image and video de-moiré method based on channel and spatial modulation
CN117611422A (zh) * 2024-01-23 2024-02-27 Jinan University Image steganography method based on moiré pattern generation

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117291857B (zh) * 2023-11-27 2024-03-22 Wuhan Jingli Electronic Technology Co., Ltd. Image processing method, moiré elimination method, device, and apparatus

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20030112474A1 (en) * 2001-12-13 2003-06-19 International Business Machines Corporation System and method for anti-moire imaging in a one dimensional sensor array
CN111598796A (zh) * 2020-04-27 2020-08-28 Guangdong OPPO Mobile Telecommunications Corp., Ltd. Image processing method and apparatus, electronic device, and storage medium
CN113592742A (zh) * 2021-08-09 2021-11-02 Tianjin University Method for removing moiré patterns from an image

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20030112474A1 (en) * 2001-12-13 2003-06-19 International Business Machines Corporation System and method for anti-moire imaging in a one dimensional sensor array
CN111598796A (zh) * 2020-04-27 2020-08-28 Guangdong OPPO Mobile Telecommunications Corp., Ltd. Image processing method and apparatus, electronic device, and storage medium
CN113592742A (zh) * 2021-08-09 2021-11-02 Tianjin University Method for removing moiré patterns from an image

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
GAO, TIANYU: "Design and Implementation of Moiré Removal System for Mobile Terminals", MASTER'S THESIS, no. 01, 15 April 2021 (2021-04-15), CN, pages 1 - 78, XP009548261, DOI: 10.26991/d.cnki.gdllu.2021.002330 *
IGNATOV ANDREY; VAN GOOL LUC; TIMOFTE RADU: "Replacing Mobile Camera ISP with a Single Deep Learning Model", 2020 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION WORKSHOPS (CVPRW), IEEE, 14 June 2020 (2020-06-14), pages 2275 - 2285, XP033799224, DOI: 10.1109/CVPRW50498.2020.00276 *

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117333399A (zh) * 2023-10-27 2024-01-02 Tianjin University Raw-domain image and video de-moiré method based on channel and spatial modulation
CN117333399B (zh) * 2023-10-27 2024-04-23 Tianjin University Raw-domain image and video de-moiré method based on channel and spatial modulation
CN117611422A (zh) * 2024-01-23 2024-02-27 Jinan University Image steganography method based on moiré pattern generation
CN117611422B (zh) * 2024-01-23 2024-05-07 Jinan University Image steganography method based on moiré pattern generation

Also Published As

Publication number Publication date
CN116612015A (zh) 2023-08-18

Similar Documents

Publication Publication Date Title
WO2020192483A1 (zh) Image display method and device
WO2023151511A1 (zh) Model training method, image de-moiré method, apparatus, and electronic device
WO2021164234A1 (zh) Image processing method and image processing apparatus
EP4030379A1 (en) Image processing method, smart device, and computer-readable storage medium
Nam et al. Modelling the scene dependent imaging in cameras with a deep neural network
WO2024027583A1 (zh) Image processing method and apparatus, electronic device, and readable storage medium
Lv et al. Low-light image enhancement via deep Retinex decomposition and bilateral learning
CN110956063A (zh) Image processing method, apparatus, device, and storage medium
Song et al. Multi-scale joint network based on Retinex theory for low-light enhancement
CN113989387A (zh) Camera shooting parameter adjustment method and apparatus, and electronic device
CN116055895B (zh) Image processing method and apparatus, chip system, and storage medium
Huang et al. Learning image-adaptive lookup tables with spatial awareness for image harmonization
CN114697530B (zh) Photographing method and apparatus with intelligent framing recommendation
CN114782280A (zh) Image processing method and apparatus
CN114693538A (zh) Image processing method and apparatus
Wang et al. Near-infrared fusion for deep lightness enhancement
WO2024012227A1 (zh) Image display method and encoding method applied to an electronic device, and related apparatus
CN112367470B (zh) Image processing method and apparatus, and electronic device
WO2023028866A1 (zh) Image processing method and apparatus, and vehicle
CN116740777B (zh) Training method for a face quality detection model and related device
WO2023186417A1 (en) Enhancing images from a mobile device to give a professional camera effect
CN116309130A (zh) Image processing method and apparatus
CN116797886A (zh) Model training method, image processing method, apparatus, and electronic device
CN116862801A (zh) Image processing method and apparatus, electronic device, and storage medium
Gan et al. Ghost-free multi-exposure high dynamic range imaging based on feedback network

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 23752291

Country of ref document: EP

Kind code of ref document: A1