WO2022215375A1 - Image processing method, machine learning model manufacturing method, image processing device, image processing system, and program - Google Patents

Image processing method, machine learning model manufacturing method, image processing device, image processing system, and program

Info

Publication number
WO2022215375A1
Authority
WO
WIPO (PCT)
Prior art keywords
image
component
residual map
blur
noise
Prior art date
Application number
PCT/JP2022/007248
Other languages
English (en)
Japanese (ja)
Inventor
義明 井田
法人 日浅
Original Assignee
キヤノン株式会社
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by キヤノン株式会社
Publication of WO2022215375A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00 Image enhancement or restoration
    • G06T5/70 Denoising; Smoothing
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00 Image enhancement or restoration
    • G06T5/60 Image enhancement or restoration using machine learning, e.g. neural networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T9/00 Image coding
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N1/00 Scanning, transmission or reproduction of documents or the like, e.g. facsimile transmission; Details thereof
    • H04N1/40 Picture signal circuits
    • H04N1/409 Edge or detail enhancement; Noise or error suppression
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N5/00 Details of television systems
    • H04N5/14 Picture signal circuitry for video frequency region
    • H04N5/20 Circuitry for controlling amplitude response
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20081 Training; Learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20084 Artificial neural networks [ANN]

Definitions

  • The present invention relates to an image processing method for correcting multiple components using machine learning.
  • Patent Literature 1 discloses a method of performing noise reduction processing after correcting blur due to aberration and diffraction from a captured image using a neural network without changing noise components.
  • In the method of Patent Literature 1, noise reduction processing is performed on the captured image after the intensity of blur correction has been adjusted. Therefore, when the intensity of blur correction is weakened to adjust sharpness or undershoot, performing noise reduction processing without deep learning alters the aberration components: because noise reduction processing generally acts as a blurring process, the blur components spread, and chromatic aberration also spreads and becomes conspicuous. In addition, since the blur component is not acquired as a residual map, the intensity of blur correction cannot be adjusted after noise reduction, and noise reduction must be repeated each time the intensity of blur correction is changed. Because the estimation processing of deep learning has a large computational load, performing noise reduction with deep learning at every adjustment of the blur correction intensity significantly increases the computational load.
  • Moreover, the noise component does not follow the change in the luminance value in regions where the blur component has changed, resulting in an unnatural appearance.
  • Furthermore, since the noise component is not acquired as a residual map, the noise component cannot be corrected in consideration of the blur component.
  • An object of the present invention is to provide an image processing method capable of acquiring a natural corrected image by performing image correction corresponding to a plurality of correlated components with appropriate correction strength.
  • An image processing method according to one aspect of the present invention includes the steps of: inputting an input image to at least one machine learning model to acquire a first residual map corresponding to a first component and a second residual map corresponding to a second component; modifying the second residual map into a third residual map having a different distribution, based on the input image and the first residual map; and acquiring an output image based on the input image, the first residual map, and the third residual map.
  • This makes it possible to provide an image processing method capable of acquiring a natural corrected image by performing image correction corresponding to a plurality of correlated components with appropriate correction strength.
  • FIG. 1 is a block diagram of an image processing system of Example 1.
  • FIG. 2 is an external view of the image processing system of Example 1.
  • FIG. 3 is a flow chart relating to weight learning in Example 1.
  • FIG. 4 is a diagram showing the flow of weight learning in the neural network of Example 1.
  • FIG. 5 is a flow chart regarding generation of a corrected image in Example 1.
  • FIG. 6 is a block diagram of an image processing flow of Example 1.
  • FIG. 7 is a block diagram of an image processing flow of a modification of Example 1.
  • FIG. 8 is a block diagram of an image processing system of Example 2.
  • FIG. 9 is an external view of the image processing system of Example 2.
  • FIG. 10 is a block diagram of an image processing system of Example 3.
  • FIG. 11 is a flowchart regarding image processing in Example 3.
  • The present invention relates to image processing for obtaining a natural corrected image by performing image correction corresponding to a plurality of correlated components with appropriate correction strength.
  • Each component can be expressed as a two-dimensional map corresponding to the amount of correction at each pixel position in the image.
  • Each component is, for example, a noise component, a blur component due to an imaging system, a blur component due to defocus/camera shake/subject blur, a scattering component due to fog, a relighting component, a background blur component for correcting background blur, and the like.
  • The noise component is random noise that is generated in a captured image with a predetermined distribution and that arises with a different distribution each time an image is captured.
  • Noise components include dark current noise, shot noise, and readout noise.
  • Blur components caused by the imaging system include aberration, diffraction, the low-pass filter, pixel aperture effects, and the like.
  • Relighting is virtually changing the light source environment of an image after capture, and includes light source addition and light source change.
  • A luminance change component due to relighting is hereinafter referred to as a relighting component.
  • Image correction corresponding to each component means changing each component, such as reducing or adding noise and correcting or adding blur, to obtain a preferable image or an image that reflects the user's editing intention.
  • A residual map corresponding to each component is acquired.
  • A residual map is a two-dimensional map obtained by extracting the correction amount (or the component value itself) of a specific component included in an image; correction corresponding to each component is performed by adding or subtracting the map.
  • Noise components have different variances depending on the imaging device and shooting conditions such as ISO sensitivity; when shot noise is included, the variance also depends on the luminance value.
  • If the luminance value of a pixel changes due to correction of a blur component or a fog scattering component, the variance of the noise that would originally occur at that pixel in the absence of blur or scattering is different. If these corrections are performed without considering this correlation, an image with unnatural noise remains: for example, large-variance noise corresponding to the luminance value before blur removal is superimposed on pixels or image regions whose luminance has decreased due to removal of diffraction blur.
  • The sharpness and the degree of noise are preferably corrected in such a way that the degree of correction can be varied according to the image-creation intention of the imaging device manufacturer and the editing intention of the user.
  • It is also preferable that the correction strength be adjustable, because artifacts may occur in the correction processing depending on the scene. Artifacts include blurring due to noise reduction processing, and undershoot and ringing due to sharpening processing; they may occur depending on the scene even in highly accurate correction processing using machine learning, for example in scenes where a light source is reflected or the subject has high contrast.
  • In such cases, the correction strength is adjusted so that some blur component remains. Even then, the subsequently executed noise reduction processing can, by using machine learning, perform highly accurate correction without being affected by the remaining blur component.
  • In the present invention, each component is acquired as a corresponding residual map, so each component can be corrected individually.
  • By adjusting the correction intensity based on the residual maps, a corrected image in which each component is corrected with appropriate intensity can be acquired. Since each component must be acquired separately, a machine learning model is used for each.
  • Furthermore, one residual map is modified based on the other residual map.
  • The residual maps before and after modification have different distributions.
  • Here, the fact that the two residual maps have different distributions means that they are not related by applying a uniform value to every pixel or by adding an offset. This is because the modification is not a process applied uniformly over the screen but a process applied per pixel or per image area, in which the value of one component at each pixel or image area is used to modify the other component.
  • In this embodiment, a multi-layered neural network (machine learning model) is used in an image processing system to learn and execute correction of the blur component (first component) derived from the optical characteristics of the imaging system and of the noise component (second component).
  • The blur component and the noise component correspond to different tasks in image correction.
  • However, the present invention is not limited to image processing that combines blur component correction and noise component correction, and can be applied to other image processing.
  • FIG. 1 is a block diagram of the image processing system 100 of this embodiment.
  • FIG. 2 is an external view of the image processing system 100.
  • The image processing system 100 has a learning device 101, an imaging device 102, an image estimation device (image processing device) 103, a display device 104, a recording medium 105, an output device 106, and a network 107.
  • The learning device 101 has a storage unit 101a, an acquisition unit 101b, a generation unit 101c, and an update unit 101d.
  • The imaging device 102 has an optical system 102a and an image sensor 102b.
  • The optical system 102a condenses light incident on the imaging device 102 from the object space.
  • The image sensor 102b receives (photoelectrically converts) an optical image (object image) formed via the optical system 102a to obtain a captured image.
  • The image sensor 102b is, for example, a CCD (Charge Coupled Device) sensor or a CMOS (Complementary Metal-Oxide-Semiconductor) sensor.
  • A captured image acquired by the imaging device 102 contains blur due to aberration and diffraction of the optical system 102a and noise due to the image sensor 102b.
  • The image estimation device 103 has a storage unit 103a, an acquisition unit (first acquisition unit) 103b, and a correction unit (correction unit, second acquisition unit) 103c.
  • The image estimation device 103 acquires the captured image from the imaging device 102 and, using at least one machine learning model, acquires a blur residual map (first residual map) corresponding to the blur component and a noise residual map (second residual map) corresponding to the noise component.
  • The noise residual map is then modified based on the captured image (input image) and the blur residual map to obtain a modified noise residual map (third residual map).
  • Finally, an estimated image (corrected image, output image) in which blur and noise are corrected with appropriate intensity is generated based on the captured image, the blur residual map, and the modified noise residual map.
  • The image estimation device 103 also has a function of performing development processing and other image processing as necessary.
  • A multilayer neural network is used to acquire the residual maps, and its weight information (machine learning model weights) is read from the storage unit 103a.
  • The weight information is learned by the learning device 101; the image estimation device 103 reads it from the storage unit 101a via the network 107 in advance and stores it in the storage unit 103a.
  • The weight information may be the numerical values of the weights themselves or information in an encoded format. Details of weight learning and of acquiring each residual map using the weights are described later.
  • The corrected image is output to at least one of the display device 104, the recording medium 105, and the output device 106.
  • The display device 104 is, for example, a liquid crystal display or a projector.
  • The user can perform editing work or the like while checking the image being processed on the display device 104.
  • The recording medium 105 is, for example, a semiconductor memory, a hard disk, or a server on a network.
  • The output device 106 is, for example, a printer.
  • FIG. 3 is a flow chart regarding weight learning. Processing of each step in FIG. 3 is executed by the learning device 101.
  • FIG. 4 is a diagram showing the flow of learning weights in a neural network.
  • In step S101, the acquisition unit 101b acquires an original image (subject image).
  • The original image is a high-resolution (high-quality) image with little blur due to aberration and diffraction of the optical system 102a.
  • Multiple original images are acquired, containing different subjects, that is, edges with different strengths and directions, textures, gradations, flat regions, and the like.
  • The original image may be a photographed image or an image generated by CG (Computer Graphics).
  • The original image preferably has signal values higher than the luminance saturation value of the image sensor 102b, because even among actual subjects there are those whose luminance does not fall within the saturation value when photographed by the imaging apparatus 102 under specific exposure conditions.
  • When a photographed image is used as the original image, it can be acquired by HDR photography or by photography with an imaging device having a higher dynamic range than the imaging device 102.
  • When an image captured by an imaging device with a dynamic range equivalent to that of the imaging device 102 is used as the original image, the signal values can be increased, for example, by proportional multiplication. In that case, it is preferable that the reduction in gradation due to the multiplication not affect the learning result.
  • The original image may contain a noise component. In this case, the noise does not pose a particular problem because the subject can be regarded as including the noise contained in the original image.
  • In step S102, the acquisition unit 101b acquires the blur used for the imaging simulation described later.
  • Specifically, the acquisition unit 101b acquires shooting conditions corresponding to the lens state (zoom, aperture, and focus states) of the optical system 102a.
  • The acquisition unit 101b then acquires the blur determined by the shooting conditions and the position in the screen.
  • The blur is the PSF (point spread function) or OTF (optical transfer function) of the optical system 102a.
  • Blur can be obtained by optical simulation or by measurement of the optical system 102a. Note that blur due to aberration and diffraction with different lens states, image heights, and azimuths is acquired for each original image.
  • A component such as an optical low-pass filter included in the imaging device 102 may be added to the applied blur.
  • In step S103, the generation unit 101c generates learning data comprising correct data consisting of correct patches (correct images) and training data consisting of training patches (training images).
  • The correct patches and training patches are changed according to the function or effect to be learned, and images corresponding to that function or effect are used.
  • Images that differ in the first and second components are generated for use as the correct patches and training patches.
  • Specifically, a component-free patch generated from the original image and a blur patch obtained by adding the blur acquired in step S102 to the component-free patch are generated.
  • Further, a noise patch and a noise blur patch are generated by adding a noise component to the component-free patch and the blur patch, respectively.
  • A plurality of component-free patches and blur patches are generated, with one or more patches generated from each original image.
  • A component-free patch and the corresponding blur patch are images of the same subject.
  • A component-free patch and a blur patch are used as the correct patch and the training patch, respectively, and a plurality of such pairs is used as learning data to train a machine learning model that acquires a blur residual map for correcting the aberration and diffraction of the optical system 102a. Accordingly, the component-free patch should be an image with less blur than the blur patch.
  • Note that the learning data may include cases where the component-free patch and the blur patch are the same image.
  • Here, a patch refers to an image having a predetermined number of pixels (e.g., 64 × 64 pixels). The numbers of pixels of the correct patch and the training patch need not match.
  • In this embodiment, mini-batch learning is used to learn the weights of the multi-layered neural network, so multiple sets of correct patches and training patches are generated in step S103.
  • However, the present invention is not limited to this; online learning or batch learning may be used.
  • Specifically, a plurality of pairs of a component-free image and a blurred image, differing relatively in the effect of blur due to aberration and diffraction, are generated.
  • The component-free image and the blurred image have a number of pixels equal to or larger than that of the patches used as training data.
  • A plurality of component-free patches and blur patches are then obtained by extracting partial regions of the prescribed pixel size at the same positions from the plurality of pairs of component-free images and blurred images.
  • In this embodiment, the original image is an undeveloped RAW image, and the component-free patches and blur patches are also RAW images; however, the present invention is not limited to this, and developed images may be used.
  • The position of a partial region refers to the center of the partial region.
  • Component-free patches and blur patches are obtained by the above method, but the present invention is not limited to this; a sketch of such paired extraction is shown below.
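  • The following is a minimal illustrative sketch of this paired patch extraction, not code from the patent; the function name, patch size, and array shapes are assumptions.

```python
import numpy as np

def extract_paired_patches(clean_img, blurred_img, patch=64, n=8, rng=None):
    """Cut training patches from a component-free image and its blurred
    counterpart at identical positions, as described for step S103."""
    rng = rng or np.random.default_rng()
    h, w = clean_img.shape[:2]
    pairs = []
    for _ in range(n):
        # The same top-left corner is used for both images so that each pair
        # shows the same subject region and differs only in the blur component.
        y = int(rng.integers(0, h - patch + 1))
        x = int(rng.integers(0, w - patch + 1))
        pairs.append((clean_img[y:y + patch, x:x + patch],
                      blurred_img[y:y + patch, x:x + patch]))
    return pairs
```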
  • Next, noise is added to each blurred image and component-free image to generate a noise-blurred image (noise blur patch) and a noise image (noise patch). The added noise is σ(x, y) r(x, y), with the variance given by Equation (1):
  • σ²(x, y) = [k1 (S1(x, y) − OB) + k0] × ISO/100 (1)
  • Here, (x, y) are two-dimensional spatial coordinates,
  • S1(x, y) is the signal value of the pixel at coordinates (x, y) of the blur patch before noise is added,
  • r(x, y) is the value at coordinates (x, y) of a random-number map with a standard deviation of 1,
  • σ(x, y) is the standard deviation of the noise (σ²(x, y) is the variance),
  • OB is the optical black (black level) signal value,
  • ISO is the ISO sensitivity, and
  • k1 and k0 are the proportional coefficient and constant term for the signal value at an ISO sensitivity of 100.
  • The proportional coefficient k1 represents the effect of shot noise, and the constant k0 represents the effect of dark current and readout noise.
  • The values of k1 and k0 are determined by the noise characteristics of the image sensor 102b.
  • Common noise is applied to corresponding pixels of the component-free patch and the blur patch. A corresponding pixel is a pixel that captures the same position in the object space, that is, a pixel at the same position in the component-free patch and the blur patch. A sketch of this noise addition is shown below.
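  • The following sketch applies the noise model of Equation (1) with a random map shared between the two patches; all numerical values (k1, k0, OB, ISO, and the patch contents) are illustrative assumptions.

```python
import numpy as np

def add_sensor_noise(s1, r, iso, k1, k0, ob):
    """Add noise per Equation (1): sigma^2(x,y) = [k1*(S1(x,y)-OB)+k0]*ISO/100.
    `r` is a random-number map with a standard deviation of 1."""
    var = (k1 * (s1 - ob) + k0) * iso / 100.0
    sigma = np.sqrt(np.clip(var, 0.0, None))  # guard against negative variance
    return s1 + sigma * r

rng = np.random.default_rng(0)
r = rng.standard_normal((64, 64))              # shared map: the "common noise"
blur_patch = rng.uniform(100, 4000, (64, 64))  # placeholder signal values
clean_patch = rng.uniform(100, 4000, (64, 64))
noise_blur_patch = add_sensor_noise(blur_patch, r, iso=800, k1=0.4, k0=25.0, ob=64.0)
noise_patch = add_sensor_noise(clean_patch, r, iso=800, k1=0.4, k0=25.0, ob=64.0)
```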
  • Note that the blur residual map used in this embodiment does not include distortion.
  • Distortion is corrected separately after blur correction using bilinear interpolation, bicubic interpolation, or the like.
  • In step S104, the generation unit 101c inputs the noise blur patch to the neural network as the input data 212 (training patch, training image) in FIG. 4.
  • FIG. 4 shows the flow from step S104 to step S105.
  • Noise patches are used as the correct data 211.
  • The estimated patch 213 is the noise blur patch with its blur corrected, and ideally matches the correct data (correct patch, correct image) 211.
  • The neural network outputs an estimated residual map 214 corresponding to the difference between the input data 212 (noise blur patches in this embodiment) and the correct data 211 (noise patches in this embodiment).
  • The estimated residual map 214 is an estimated blur residual map.
  • In this embodiment, the configuration of the neural network shown in FIG. 4 is used, but the present invention is not limited to this.
  • In FIG. 4, CN represents a convolution layer and DC represents a deconvolution layer.
  • In each layer, the convolution of the input with a filter plus a bias is computed, and the result is nonlinearly transformed by an activation function.
  • The initial values of the filter components and biases are arbitrary; in this embodiment they are determined by random numbers.
  • As the activation function, a ReLU (Rectified Linear Unit), a sigmoid function, or the like can be used.
  • The output of each layer except the final layer is called a feature map.
  • Skip connections 222 and 223 combine feature maps output from non-consecutive layers. Feature maps may be combined by element-wise summation or by concatenation in the channel direction; in this embodiment, element-wise summation is used.
  • Skip connection 221 sums the input data 212 with the estimated residual map 214 estimated from it to generate the estimated patch 213.
  • An estimated patch 213 is generated for each of the plurality of input data 212. A minimal sketch of such a network follows.
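  • The following is a minimal sketch of a residual-estimating network with element-wise-sum skip connections, assuming PyTorch; the layer counts, channel widths, and class name are assumptions, not the patent's actual configuration.

```python
import torch
import torch.nn as nn

class ResidualMapNet(nn.Module):
    """CN = convolution layers, DC = a deconvolution layer; skip connections
    sum feature maps element-wise, and a final skip connection adds the input
    to the estimated residual map (cf. skip connection 221)."""
    def __init__(self, ch=4, feat=32):
        super().__init__()
        self.cn1 = nn.Conv2d(ch, feat, 3, padding=1)
        self.cn2 = nn.Conv2d(feat, feat, 3, stride=2, padding=1)          # downsample
        self.cn3 = nn.Conv2d(feat, feat, 3, padding=1)
        self.dc = nn.ConvTranspose2d(feat, feat, 4, stride=2, padding=1)  # upsample
        self.out = nn.Conv2d(feat, ch, 3, padding=1)
        self.act = nn.ReLU()

    def forward(self, x):
        f1 = self.act(self.cn1(x))
        f2 = self.act(self.cn2(f1))
        f3 = self.act(self.cn3(f2))
        f4 = self.act(self.dc(f3 + f2))   # skip connection: element-wise sum
        residual = self.out(f4 + f1)      # second skip connection
        return x + residual, residual     # estimated patch 213, residual map 214
```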
  • In step S105, the updating unit 101d updates the weights of the neural network based on the error between the estimated patch 213 and the correct data 211.
  • The weights comprise the filter components and biases of each layer. Back-propagation is used to update the weights, but the present invention is not limited to this.
  • In mini-batch learning, the errors between a plurality of noise patches input as the correct data 211 and the corresponding estimated patches 213 are obtained, and the weights are updated.
  • As the error function, for example, the L2 norm or the L1 norm may be used.
  • In step S106, the updating unit 101d determines whether weight learning is complete. Completion can be determined by whether the number of learning iterations (weight updates) has reached a specified value, or whether the change in the weights during an update is smaller than a specified value. If learning is judged incomplete, the process returns to step S104 and a plurality of new noise patches and noise blur patches are acquired. If learning is judged complete, the learning device 101 (updating unit 101d) ends learning and saves the weight information in the storage unit 101a. A sketch of this training loop follows.
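  • A minimal training-loop sketch for steps S104 to S106, reusing the `ResidualMapNet` sketch above; the optimizer, learning rate, batch contents, and iteration count are assumptions (the patent only specifies back-propagation and an L2/L1 error function).

```python
import torch

model = ResidualMapNet()
opt = torch.optim.Adam(model.parameters(), lr=1e-4)

for step in range(10000):                         # until the specified iteration count
    inp = torch.rand(16, 4, 64, 64)               # noise blur patches (input data 212), placeholders
    target = torch.rand(16, 4, 64, 64)            # noise patches (correct data 211), placeholders
    est_patch, _ = model(inp)                     # estimated patch 213
    loss = torch.mean((est_patch - target) ** 2)  # L2-norm error function
    opt.zero_grad()
    loss.backward()                               # back-propagation
    opt.step()
```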
  • By learning with patches to which common noise has been applied and which differ only in the blur component, the neural network can learn to separate the subject, the blur component, and the noise component. It is therefore possible to obtain a blur residual map corresponding to the blur component of the subject alone while suppressing noise fluctuation.
  • In this embodiment, the estimation error is obtained from the noise patch used as the correct data 211 and the estimated patch 213; however, the estimation error may instead be obtained from the estimated residual map 214 and the correct blur residual map, without outputting the estimated patch 213.
  • The correct blur residual map is the difference between the noise patch and the noise blur patch, which is equal to the difference between the component-free patch and the blur patch.
  • In this way, a machine learning model that directly outputs the estimated blur residual map as the estimated residual map 214 can be trained.
  • To learn the machine learning model that acquires the noise residual map, the input data 212 is the noise blur patch, and the correct data 211 is the blur patch instead of the noise patch.
  • In this case, by learning with patches that have the same blur component and differ only in the noise component, the neural network can learn to separate the subject, the blur component, and the noise component. It is therefore possible to obtain a noise residual map corresponding to the noise component while suppressing changes in the blur component.
  • Similarly, the estimation error may be obtained from the estimated residual map 214 and the correct noise residual map without outputting the estimated patch 213.
  • The correct noise residual map is the difference between the noise blur patch and the blur patch, which is equal to the difference between the noise patch and the component-free patch.
  • In this way, a machine learning model that directly outputs the estimated noise residual map as the estimated residual map 214 can be trained.
  • When the correct noise residual map is computed from the noise patch and the component-free patch, the noise blur patch is not used, so it is not necessary to generate one.
  • Note that the machine learning model that acquires the blur residual map and the machine learning model that acquires the noise residual map are trained separately, so different network configurations may be used for each.
  • Alternatively, since the neural network can learn to separate the subject, the blur component, and the noise component, both the blur residual map and the noise residual map can be output by a common network (one network). In this case, learning and estimation by the machine learning model each need to be performed only once, so the computational load can be reduced.
  • FIG. 5 is a flow chart regarding generation of a corrected image.
  • FIG. 6 is a block diagram of the image processing flow. Processing of each step in FIG. 5 is executed by the image estimation device 103.
  • In step S201, the acquisition unit 103b acquires the input image 401 and the weight information.
  • The input image 401 is an undeveloped RAW image, transmitted from the imaging apparatus 102 in this embodiment.
  • The weight information comprises the weights of the machine learning model that acquires the blur residual map and of the machine learning model that acquires the noise residual map, transmitted from the learning device 101 and stored in the storage unit 103a.
  • In step S202, the correction unit 103c acquires a blur residual map (first residual map) 402 by inputting the input image 401 to the machine learning model for acquiring the blur residual map, using the weights acquired in step S201. Likewise, the correction unit 103c acquires a noise residual map (second residual map) 403 by inputting the input image 401 to the machine learning model for acquiring the noise residual map.
  • In step S203, the correction unit 103c acquires the correction strength for the blur residual map and the correction strength for the noise residual map.
  • As the correction strength, the correction ratio of each component may be used. In this embodiment, predetermined values are used, but user-designated values may be acquired instead.
  • In step S204, the correction unit 103c modifies the noise residual map 403 based on the input image 401 and the blur residual map 402 to acquire a modified noise residual map (third residual map) 404.
  • Specifically, the correction unit 103c multiplies the blur residual map 402 by the blur component correction ratio obtained in step S203 and adds it to the input image 401. For example, if the correction ratio is 0.5, an intermediate corrected image 410 in which the blur component is corrected by 50% is obtained.
  • Next, the standard deviations σ1(x, y) and σ2(x, y) of the noise are obtained by taking the signal value S1(x, y) in Equation (1) to be the input image 401 and the intermediate corrected image 410, respectively.
  • The noise residual map 403 corresponds to the noise components contained in the input image 401.
  • The modified noise residual map 404 is therefore obtained by multiplying the noise residual map 403 by the ratio σ2(x, y)/σ1(x, y) pixel by pixel.
  • The modified noise residual map 404 corresponds to the noise components contained in the intermediate corrected image 410. Since the luminance values of the input image 401 and the values of the blur residual map generally differ with screen position, the applied ratio differs for each pixel. The modified noise residual map 404 is thus obtained by multiplying each pixel of the noise residual map 403 by a different coefficient, so the two maps have different distributions. As a result, a residual map enabling natural correction that reflects the per-pixel correlation of the components is obtained.
  • In step S205, the correction unit 103c multiplies the modified noise residual map 404 by the noise component correction ratio obtained in step S203 and adds the result to the intermediate corrected image 410 to obtain the estimated image 405. Since the acquired image is a RAW image, development processing is performed as necessary. A sketch of steps S204 and S205 is shown below.
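  • The following sketch illustrates steps S204 and S205; the function names and the epsilon guard are assumptions, and `noise_sigma` re-implements the Equation (1) model with caller-supplied parameters.

```python
import numpy as np

def noise_sigma(s, iso, k1, k0, ob):
    """Noise standard deviation of Equation (1) for signal values s."""
    return np.sqrt(np.clip((k1 * (s - ob) + k0) * iso / 100.0, 0.0, None))

def correct(input_img, blur_res, noise_res, blur_ratio, noise_ratio,
            iso, k1, k0, ob, eps=1e-8):
    # Step S204: apply the blur residual map at the chosen correction ratio.
    intermediate = input_img + blur_ratio * blur_res           # intermediate corrected image 410
    # Modify the noise residual map by the pixel-wise ratio of the noise
    # standard deviations after and before the blur correction.
    s1 = noise_sigma(input_img, iso, k1, k0, ob)
    s2 = noise_sigma(intermediate, iso, k1, k0, ob)
    modified_noise_res = noise_res * s2 / np.maximum(s1, eps)  # modified map 404
    # Step S205: apply the modified noise residual map at its correction ratio.
    return intermediate + noise_ratio * modified_noise_res    # estimated image 405
```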
  • Note that the correction strength of each component may be made adjustable while viewing the estimated image.
  • The configuration of this embodiment is not limited to the above.
  • For example, the acquisition of the correction strength may be performed at any time before the correction strength of each component is used.
  • In this embodiment, the estimated image 405 is obtained by adding the modified noise residual map 404, multiplied by the correction ratio, to the intermediate corrected image 410, but the present invention is not limited to this.
  • The estimated image 405 may instead be obtained by adding the blur residual map 402 and the modified noise residual map 404 to the input image 401 according to the correction ratio of each component.
  • Also, both the blur residual map 402 and the noise residual map 403 may be output by a common machine learning model (one machine learning model).
  • In this embodiment, the signal value S1(x, y) used to obtain the standard deviation σ1(x, y) is the input image 401, but an image after correcting the noise component may be used instead; that is, an image obtained by adding the noise residual map at a correction ratio of 1 to the input image 401 may be used. This method can improve accuracy, especially when the noise component is large.
  • Furthermore, in this embodiment each component is corrected (removed) by adding the residual map, but the sign may be defined so that each component is removed by subtracting the residual map.
  • FIG. 7 is a block diagram of the image processing flow of the modification.
  • In the modification, the input image 401 is input to a machine learning model (second learning model) that has learned noise components, and a noise-corrected image 411 and the noise residual map 403 are acquired.
  • The noise-corrected image 411 is an image obtained by subtracting 100% of the acquired noise component from the input image 401.
  • Next, the noise-corrected image 411 is input to a machine learning model (first learning model) that has learned blur components, and a noise-blur-corrected image 412 and the blur residual map 402 are acquired.
  • The noise-blur-corrected image 412 is an image obtained by subtracting 100% of the acquired blur component from the noise-corrected image 411.
  • Then, the blur residual map 402 is multiplied by (1 − blur component correction ratio) and subtracted from the noise-blur-corrected image 412 to weaken the blur correction, yielding an intermediate corrected image 413.
  • The intermediate corrected image 413 is thus a corrected image reflecting the correction ratio of the blur component.
  • The noise residual map 403 corresponds to the noise components contained in the input image 401. Therefore, as in the main embodiment, the modified noise residual map 404 is obtained by multiplying by the ratio of the noise standard deviations corresponding to the luminance values of the noise-corrected image 411 and the intermediate corrected image 413.
  • Finally, the modified noise residual map 404 is multiplied by (1 − noise component correction ratio) and subtracted from the intermediate corrected image 413 to weaken the noise correction, and the estimated image 405 is acquired, completing the process. A sketch of this flow is shown below.
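  • A sketch of this modified flow follows; the two model callables and the `sigma` helper are assumptions standing in for the trained networks and the Equation (1) noise model.

```python
import numpy as np

def modified_flow(input_img, noise_model, blur_model, blur_ratio, noise_ratio,
                  sigma, eps=1e-8):
    noise_corrected, noise_res = noise_model(input_img)           # image 411, map 403
    noise_blur_corrected, blur_res = blur_model(noise_corrected)  # image 412, map 402
    # Weaken the blur correction: subtract (1 - ratio) of the blur residual.
    intermediate = noise_blur_corrected - (1.0 - blur_ratio) * blur_res   # image 413
    # Rescale the noise residual by the standard-deviation ratio between the
    # intermediate corrected image and the noise-corrected image.
    mod_noise_res = noise_res * sigma(intermediate) / np.maximum(sigma(noise_corrected), eps)
    # Weaken the noise correction the same way to obtain the estimated image 405.
    return intermediate - (1.0 - noise_ratio) * mod_noise_res
```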
  • In the present embodiment and its modification, there is no need to re-run the machine learning models even when the correction strength of each component is adjusted; only the steps from acquiring the modified noise residual map, based on the luminance values before and after blur correction, onward need to be executed again. The computational load is therefore greatly reduced, and the correction strength can be adjusted while viewing the image.
  • The present invention is also effective for blur due to other factors (defocus, camera shake, and the like). By changing the blur applied to the blur patch to defocus blur, camera-shake blur, or the like, the blur caused by these factors can be separated from the noise component and obtained as a residual map.
  • Defocus blur conversion is a process of converting defocus blur in a captured image into a shape and distribution desired by the user.
  • Defocus blur in a captured image includes shapes chipped by vignetting, double-line blur, annular patterns due to the cutting marks of an aspherical lens, and central shielding due to a catadioptric optical system.
  • These defocus blurs are converted by a neural network into a shape and distribution desired by the user (for example, a flat circular shape or a normal distribution function).
  • A neural network that realizes defocus blur conversion can be trained by the following method.
  • First, a captured-equivalent image to which the defocus blur occurring in the captured image is added and an ideal-equivalent image to which the defocus blur desired by the user is added are generated for a plurality of defocus amounts.
  • A captured-equivalent image with a defocus amount of zero and the corresponding ideal-equivalent image are also generated.
  • A plurality of first defocus patches and second defocus patches are extracted from the generated captured-equivalent images and ideal-equivalent images, respectively, and noise first defocus patches and noise second defocus patches are obtained by adding noise to them. By performing the same learning as in this embodiment, the correction component of the defocus blur conversion can be separated from the noise component and obtained as a residual map.
  • The present invention can also be applied to lighting conversion (relighting) instead of blur component correction.
  • A neural network that realizes lighting conversion can be trained by the following method.
  • First, a captured-equivalent image is generated by rendering the original image, given as a normal map, under the light source environment assumed for the captured image.
  • Similarly, an ideal-equivalent image is generated by rendering the same normal map under the light source environment desired by the user.
  • A plurality of first lighting patches and second lighting patches are extracted from the captured-equivalent image and the ideal-equivalent image, respectively, and noise first lighting patches and noise second lighting patches are obtained by adding noise to these.
  • The first lighting corresponds to the lighting before relighting, and the second lighting corresponds to the lighting after relighting.
  • By performing the same learning as in this embodiment, the relighting component can be separated from the noise component and obtained as a residual map.
  • The present invention can also be applied with relighting in place of the blur component correction and with blur component correction in place of the noise component correction.
  • Blur caused by the optical characteristics of the imaging system, defocus, and camera shake spreads to surrounding pixels according to the brightness of the subject. Therefore, when the brightness of the subject changes due to relighting, the blur changes at the same time.
  • In the embodiment above, the change in the noise standard deviation is obtained based on the luminance change before and after blur correction, and the modified noise residual map is obtained by multiplying by the ratio of the standard deviations.
  • Here, by contrast, the blur residual map is modified for each pixel based on the ratio of the luminance values before and after the light source correction by the relighting component.
  • A neural network that obtains the blur component and the relighting component separately can be trained by the following method.
  • First, a captured-equivalent image is generated by rendering the original image, given as a normal map, under the light source environment assumed for the captured image.
  • Similarly, an ideal-equivalent image is generated by rendering under the light source environment desired by the user.
  • A plurality of first lighting patches and second lighting patches are extracted from the captured-equivalent image and the ideal-equivalent image, respectively, and blur first lighting patches and blur second lighting patches are obtained by adding blur to these.
  • Here, the second component is the blur component.
  • The blur first lighting patch and the blur second lighting patch correspond to the noise blur patch and the noise patch in this embodiment.
  • By performing the same learning as in this embodiment, the relighting component can be separated from the blur component and obtained as a residual map.
  • The residual map of the relighting component may then be modified based on the ratio of the luminance values before and after correction of the blur component.
  • The present invention is also applicable to three or more components. Even with three or more components, each component can be separated and obtained as a residual map, so the residual map of each component can be modified while considering the correlations between the components, as in the two-component case. For example, when performing blur component correction, relighting, and noise component correction, the blur residual map is first modified for each pixel based on the ratio of the luminance values before and after the light source correction by the relighting component. The noise residual map may then be modified for each pixel based on the ratio of the luminance value with neither the relighting light source correction nor the blur component correction applied to the luminance value with both applied. A sketch of this chaining is shown below.
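  • The following sketch shows one plausible reading of the three-component chaining, with pixel-wise luminance-ratio scaling for the blur residual and Equation (1)-style standard-deviation scaling for the noise residual; the function names, the `sigma` callable, and the additive sign convention are all assumptions.

```python
import numpy as np

def chain_three(input_img, relight_res, blur_res, noise_res, sigma, eps=1e-8):
    relit = input_img + relight_res                  # light source correction applied
    # Modify the blur residual map pixel by pixel by the luminance ratio
    # after/before the relighting correction.
    blur_res_mod = blur_res * relit / np.maximum(input_img, eps)
    deblurred = relit + blur_res_mod                 # blur correction also applied
    # Modify the noise residual map by the noise standard-deviation ratio of
    # the image with both corrections applied to the image with neither.
    noise_res_mod = noise_res * sigma(deblurred) / np.maximum(sigma(input_img), eps)
    return deblurred + noise_res_mod                 # all three components corrected
```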
  • In this embodiment, the learning device 101 and the image estimation device 103 are separate units, but the present invention is not limited to this.
  • The learning device 101 and the image estimation device 103 may be integrated; that is, learning (the processing shown in FIG. 3) and estimation (the processing shown in FIG. 5) may be performed within a single integrated device.
  • As described above, according to this embodiment, a natural corrected image can be obtained by performing image correction while adjusting the correction strength corresponding to each of a plurality of correlated components.
  • The image processing system of this embodiment differs from that of Embodiment 1 in that the corrected image is generated by an image estimation unit within the imaging device.
  • FIG. 8 is a block diagram of the image processing system 300 of this embodiment.
  • FIG. 9 is an external view of the image processing system 300.
  • The image processing system 300 has a learning device 301 and an imaging device 302.
  • The learning device 301 and the imaging device 302 are connected via a network 303.
  • The learning device 301 has a storage unit 311, an acquisition unit 312, a generation unit 313, and an update unit 314, and learns weights (weight information) for acquiring the residual maps with a neural network.
  • The imaging device 302 has an optical system 321, an image sensor 322, an image estimation unit (image processing device) 323, a storage unit 324, a recording medium 325, a display unit 326, and a system controller 327.
  • The imaging device 302 acquires a captured image by photographing the object space, and acquires a blur residual map and a noise residual map from the captured image using the read weight information.
  • The imaging device 302 then generates a corrected image using a modified noise residual map obtained by modifying the noise residual map.
  • The image estimation unit 323 has an acquisition unit (first acquisition unit) 323a and a correction unit (correction unit, second acquisition unit) 323b, and uses the weight information stored in the storage unit 324 to acquire each residual map and perform correction based on each residual map.
  • The weight information is learned in advance by the learning device 301 and stored in the storage unit 311.
  • The imaging device 302 reads the weight information from the storage unit 311 via the network 303 and stores it in the storage unit 324.
  • The corrected image is saved in the recording medium 325.
  • The saved corrected image can be read and displayed on the display unit 326.
  • Alternatively, a captured image already stored in the recording medium 325 may be read, and the image estimation unit 323 may perform the correction processing on it.
  • The above series of controls is performed by the system controller 327.
  • The learning of the machine learning model executed by the learning device 301 is the same as that described in Embodiment 1.
  • As described above, according to this embodiment, a natural corrected image can be obtained by performing image correction while adjusting the correction strength corresponding to each of a plurality of correlated components.
  • The image processing system of this embodiment differs from those of Embodiments 1 and 2 in that it includes a processing device (computer) that transmits a captured image to be processed to an image estimation device and receives the processed output image (corrected image, estimated image) from the image estimation device.
  • FIG. 10 is a block diagram of the image processing system 600 of this embodiment.
  • The image processing system 600 has a learning device 601, an imaging device 602, an image estimation device (image processing device) 603, and a processing device (computer) 604.
  • The learning device 601 and the image estimation device 603 are, for example, servers.
  • The processing device 604 is, for example, a user terminal (a personal computer or a smartphone) and is connected to the image estimation device 603 via a network 605; that is, the image estimation device 603 and the processing device 604 can communicate with each other.
  • The image estimation device 603 is connected to the learning device 601 via a network 606; that is, the learning device 601 and the image estimation device 603 can communicate with each other.
  • The configuration of the learning device 601 is the same as that of the learning device 101 of Embodiment 1, so its description is omitted. Likewise, the configuration of the imaging device 602 is the same as that of the imaging device 102 of Embodiment 1, so its description is omitted.
  • The image estimation device 603 has a storage unit 603a, an acquisition unit (first acquisition unit) 603b, a correction unit (correction unit, second acquisition unit) 603c, and a communication unit (reception unit) 603d.
  • The storage unit 603a, the acquisition unit 603b, and the correction unit 603c are the same as the storage unit 103a, the acquisition unit 103b, and the correction unit 103c of the image estimation device 103 of Embodiment 1, respectively.
  • The communication unit 603d has a function of receiving requests transmitted from the processing device 604 and a function of transmitting output images generated by the image estimation device 603 to the processing device 604.
  • The processing device 604 has a communication unit (transmission unit) 604a, a display unit 604b, an image processing unit 604c, and a recording unit 604d.
  • The communication unit 604a has a function of transmitting a request for the image estimation device 603 to process a captured image, and a function of receiving the output image processed by the image estimation device 603.
  • The display unit 604b has a function of displaying various information, for example, captured images to be transmitted to the image estimation device 603 and output images received from it.
  • The image processing unit 604c has a function of applying further image processing to the output image received from the image estimation device 603.
  • The recording unit 604d records captured images acquired from the imaging device 602, output images received from the image estimation device 603, and the like.
  • FIG. 11 is a flowchart regarding image processing.
  • The image processing in FIG. 11 is started when the user issues an instruction to start image processing via the processing device 604.
  • First, the operation of the processing device 604 will be described.
  • In step S701, the processing device 604 transmits a request to process the captured image to the image estimation device 603.
  • The captured image may be uploaded to the image estimation device 603 at the same time as the processing of step S701, or may be uploaded to the image estimation device 603 before the processing of step S701.
  • The captured image may also be an image stored on a server different from the image estimation device 603.
  • In step S701, the processing device 604 may transmit ID information or the like for authenticating the user together with the request to process the captured image.
  • The processing device 604 then receives the output image generated within the image estimation device 603.
  • The output image is an image in which each component of the captured image has been corrected using the residual maps, as in Embodiment 1.
  • Next, the operation of the image estimation device 603 will be described. In step S801, the image estimation device 603 receives the request to process the captured image transmitted from the processing device 604.
  • The image estimation device 603 thereby determines that processing of the captured image has been instructed, and executes the processing from step S802 onward.
  • In step S802, the image estimation device 603 acquires the weight information.
  • The weight information is information (a trained model) learned by the same method as in Embodiment 1.
  • The image estimation device 603 may acquire the weight information from the learning device 601, or may use weight information previously acquired from the learning device 601 and stored in the storage unit 603a.
  • The processing from step S803 to step S806 is the same as the processing from step S202 to step S205 in FIG. 5.
  • In step S807, the image estimation device 603 transmits the output image to the processing device 604.
  • With this configuration, the processing load of the correction processing is borne by the image estimation device 603, so the processing capacity required of the processing device 604 can be reduced.
  • Note that the image estimation device 603 may be configured to be controlled from the processing device 604 communicably connected to it.
  • The present invention can also be realized by supplying a program that implements one or more functions of the above-described embodiments to a system or apparatus via a network or a storage medium, and having one or more processors in the computer of the system or apparatus read and execute the program. It can also be realized by a circuit (for example, an ASIC) that implements one or more functions.

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Signal Processing (AREA)
  • Image Processing (AREA)
  • Image Analysis (AREA)

Abstract

An object of the invention is to provide an image processing method in which image corrections corresponding to a plurality of mutually correlated components are performed with appropriate correction intensities, thereby making it possible to obtain a natural corrected image. To this end, the present invention provides an image processing method comprising: a step of inputting an input image to at least one machine learning model to acquire a first residual map corresponding to a first component and a second residual map corresponding to a second component; a step of modifying, based on the input image and the first residual map, the second residual map into a third residual map having a different distribution; and a step of acquiring an output image based on the input image, the first residual map, and the third residual map.
PCT/JP2022/007248 2021-04-09 2022-02-22 Image processing method, machine learning model manufacturing method, image processing device, image processing system, and program WO2022215375A1 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
JP2021066380A JP2022161503A (ja) 2021-04-09 2021-04-09 Image processing method, machine learning model manufacturing method, image processing device, image processing system, and program
JP2021-066380 2021-04-09

Publications (1)

Publication Number Publication Date
WO2022215375A1 (fr)

Family

ID=83546334

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2022/007248 WO2022215375A1 (fr) 2021-04-09 2022-02-22 Image processing method, machine learning model manufacturing method, image processing device, image processing system, and program

Country Status (2)

Country Link
JP (1) JP2022161503A (fr)
WO (1) WO2022215375A1 (fr)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2024161572A1 (fr) * 2023-02-02 2024-08-08 三菱電機株式会社 Imaging device

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2020144489A * 2019-03-05 2020-09-10 キヤノン株式会社 Image processing method, image processing apparatus, program, method for manufacturing trained model, and image processing system
JP2020166628A * 2019-03-29 2020-10-08 キヤノン株式会社 Image processing method, image processing apparatus, program, image processing system, and method for manufacturing trained model

Also Published As

Publication number Publication date
JP2022161503A (ja) 2022-10-21

Similar Documents

Publication Publication Date Title
JP7258604B2 (ja) Image processing method, image processing apparatus, program, and method for manufacturing trained model
JP7242185B2 (ja) Image processing method, image processing apparatus, image processing program, and storage medium
US9036032B2 (en) Image pickup device changing the size of a blur kernel according to the exposure time
US20060239549A1 (en) Method and apparatus for correcting a channel dependent color aberration in a digital image
US11508038B2 (en) Image processing method, storage medium, image processing apparatus, learned model manufacturing method, and image processing system
JP7297470B2 (ja) Image processing method, image processing apparatus, program, image processing system, and method for manufacturing trained model
WO2011121760A9 (fr) Image processing apparatus and image capturing apparatus comprising same
WO2019124289A1 (fr) Device, control method, and storage medium
US20220405892A1 (en) Image processing method, image processing apparatus, image processing system, and memory medium
WO2022215375A1 (fr) Image processing method, machine learning model manufacturing method, image processing device, image processing system, and program
JP2021140663A (ja) Image processing method, image processing apparatus, image processing program, and storage medium
JP2021189929A (ja) Image processing method, program, image processing apparatus, and image processing system
JP2021140758A (ja) Method for manufacturing learning data, learning method, learning data manufacturing apparatus, learning apparatus, and program
JP2013127804A (ja) Image processing program, image processing apparatus, and image processing method
JP7225316B2 (ja) Image processing method, image processing apparatus, image processing system, and program
JP2021190814A (ja) Learning method, image processing method, and program
US20240070826A1 (en) Image processing method, image processing apparatus, and storage medium
JP2021086272A (ja) Image processing method, program, method for manufacturing trained model, image processing apparatus, and image processing system
JP2023179838A (ja) Image processing method, image processing apparatus, image processing system, and image processing program
JP2024137693A (ja) Image processing method, image processing apparatus, method for manufacturing trained model, learning apparatus, image processing system, program, and storage medium
JP2023104667A (ja) Image processing method, image processing apparatus, image processing system, and program
JP2023088349A (ja) Image processing method, image processing apparatus, image processing system, and program

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application (Ref document number: 22784360; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: DE)
122 Ep: pct application non-entry in european phase (Ref document number: 22784360; Country of ref document: EP; Kind code of ref document: A1)