WO2023112172A1 - 学習装置、画像処理装置、学習方法、画像処理方法及びプログラム - Google Patents

学習装置、画像処理装置、学習方法、画像処理方法及びプログラム (Learning device, image processing device, learning method, image processing method, and program)

Info

Publication number: WO2023112172A1
Authority: WO (WIPO (PCT))
Prior art keywords: image data, learning, image, data, unit
Prior art date: 2021-12-14
Application number: PCT/JP2021/046132
Other languages: English (en), French (fr), Japanese (ja)
Inventors: 泳青 孫, 小萌 武, 幸浩 坂東, 正樹 北原
Original Assignee: 日本電信電話株式会社
Priority date: 2021-12-14 (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date: 2021-12-14
Publication date: 2023-06-22
Application filed by: 日本電信電話株式会社
Priority to: PCT/JP2021/046132 (WO2023112172A1)
Priority to: JP2023567359A (JPWO2023112172A1)
Publication of: WO2023112172A1

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T3/00Geometric image transformations in the plane of the image
    • G06T3/40Scaling of whole images or parts thereof, e.g. expanding or contracting

Definitions

  • The present invention relates to a learning device and a learning method for constructing a model that performs super-resolution, an image processing device and an image processing method that perform super-resolution, and a program.
  • When a high-resolution camera is applied to a security camera that must shoot continuously, the high-resolution images are large, so a storage device capable of holding the enormous volume of images is required, which is costly.
  • When security cameras are installed in many places, the installation cost is also high because high-resolution cameras are expensive. Therefore, security cameras generally use inexpensive cameras that capture low-resolution still images and moving images. Due to these circumstances, when, for example, an image taken by a security camera is used to enlarge the face of a person included in the image, the resolution is insufficient and the face becomes blurry, and it can be difficult to determine the features of the person's face accurately.
  • As described above, there is a technology called super-resolution that obtains a high-resolution image from a low-resolution image in case an image with the desired resolution cannot be obtained.
  • Super-resolution technology restores, from a low-resolution image, information of frequency components higher than those contained in the low-resolution image, and obtains a high-resolution image by applying the restored information between the pixels of the low-resolution image.
  • In the following, a convolutional neural network is referred to as a "CNN".
  • Non-Patent Documents 1 and 2 disclose techniques for performing super-resolution on hyperspectral images (hereinafter "HS" (Hyper-Spectral) images). More specifically, in Non-Patent Document 1, an image generated by upsampling a low-resolution image of an HS image (hereinafter an "LR" (Low Resolution) image) and an auxiliary image corresponding to that image are provided to a neural network including a CNN to predict a high-resolution image of the HS image (hereinafter an "HR" (High Resolution) image).
  • Non-Patent Document 2 discloses a technique of predicting an HR image of an HS image from an LR image of an HS image using a neural network including a CNN.
  • Non-Patent Documents 1 and 2 are both techniques that perform supervised learning. Therefore, in Non-Patent Documents 1 and 2, the ground-truth data required for supervised learning, so-called true value data, must be prepared in advance for each piece of input data.
  • FIG. 9 is a block diagram schematically showing a configuration for performing supervised learning applied in Non-Patent Documents 1 and 2.
  • Supervised learning is performed to update the coefficients of the deep neural network 201 so as to minimize a loss value indicating the magnitude of the difference between the HR image data 102, which is the output data output from the deep neural network 201, and the true value data 103 prepared in advance corresponding to the input data.
  • In the following, the LR image data 101 is denoted I_LR, the calculation by the deep neural network 201 is denoted f_θ(·), and the true value data 103 is denoted I_HR. The HR image data 102, which is the output data of the deep neural network 201, is then expressed as f_θ(I_LR). The supervised learning described above, which minimizes the loss value, is represented by the following formula (1), which minimizes, for example, the magnitude of the difference between the true value data I_HR and f_θ(I_LR) as measured by the L1 norm.
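  • A plausible reconstruction of formula (1), assuming the standard form of an L1-minimizing objective (the exact notation of the original is not preserved by this extraction):

```latex
\theta^{*} \;=\; \arg\min_{\theta} \sum_{i} \bigl\| I_{HR}^{\,i} - f_{\theta}\bigl(I_{LR}^{\,i}\bigr) \bigr\|_{1} \tag{1}
```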
  • The subscript i is an identifier specifying each of the plurality of LR image data 101 given to the deep neural network 201 and each of the true value data 103 corresponding to each LR image data 101.
  • In such supervised learning, the accuracy of super-resolution depends on the number of data sets of LR image data 101 and true value data 103 prepared in advance as training data. Therefore, to increase the accuracy of super-resolution, a large number of data sets matching that accuracy must be prepared, but in practice it is difficult to prepare such a large number of data sets.
  • Moreover, the technique of Non-Patent Document 1 requires auxiliary image data, and the accuracy of super-resolution depends on the quality of the auxiliary image data. Therefore, if an LR image and the auxiliary image data corresponding to it do not correspond correctly, a deep neural network 201 capable of performing super-resolution with the desired accuracy cannot be constructed.
  • The present invention aims to provide technology that generates high-resolution image data from low-resolution image data without using true value data or auxiliary image data that requires an accurate correspondence relationship with the original image data.
  • One aspect of the present invention is a learning device having a learning image generation unit that selects two pieces of image data from arbitrary positions of original image data, reduces one of the selected pieces of image data at a predetermined reduction ratio to obtain first learning image data, and uses the other selected piece of image data as second learning image data; and a learning processing unit that generates a super-resolution model that takes each of the first learning image data and the second learning image data as input data and outputs, as output data, predicted image data that is image data obtained by enlarging the input data at the enlargement ratio indicated by the reciprocal of the reduction ratio and that has a resolution higher than that of the input data. The learning processing unit generates the super-resolution model so as to minimize the sum of the magnitude of the difference between the predicted image data output when the first learning image data is input and the image data before reduction of the first learning image data, and the magnitude of the difference between image data obtained by reducing, at the reduction ratio, the predicted image data output when the second learning image data is input and the second learning image data.
  • One aspect of the present invention is an image processing device having an image capturing unit that captures image data, and a prediction processing unit that holds a trained super-resolution model generated by the learning device and outputs predicted image data obtained as output data by giving the image data captured by the image capturing unit to the trained super-resolution model as input data.
  • One aspect of the present invention is a learning method that selects two pieces of image data from arbitrary positions of original image data, reduces one of the selected pieces of image data at a predetermined reduction ratio to obtain first learning image data, and uses the other selected piece of image data as second learning image data; and that generates a super-resolution model that takes each of the first learning image data and the second learning image data as input data and outputs, as output data, predicted image data that is image data obtained by enlarging the input data at the enlargement ratio indicated by the reciprocal of the reduction ratio and that has a resolution higher than that of the input data, the super-resolution model being generated so as to minimize the sum of the magnitude of the difference between the predicted image data output when the first learning image data is input and the image data before reduction of the first learning image data, and the magnitude of the difference between image data obtained by reducing, at the reduction ratio, the predicted image data output when the second learning image data is input and the second learning image data.
  • One aspect of the present invention is an image processing method that captures image data and outputs predicted image data obtained as output data by supplying the captured image data as input data to a trained super-resolution model generated by the above learning method.
  • One aspect of the present invention is a program for causing a computer to function as the above learning device or the above image processing device.
  • FIG. 1 is a block diagram showing the configuration of a learning device according to one embodiment.
  • FIG. 2 is a diagram showing the procedure for generating original image data in one embodiment.
  • FIG. 3 is a flow chart showing the flow of processing of a learning image generation unit provided in the learning device of one embodiment.
  • FIG. 4 is a flow chart showing the flow of processing of a learning processing unit provided in the learning device of one embodiment.
  • FIG. 7 is a block diagram showing the configuration of an image processing device according to one embodiment.
  • FIG. 8 is a flow chart showing the flow of processing of the image processing device of one embodiment.
  • FIG. 9 is a diagram showing an outline of supervised learning in the techniques disclosed in Non-Patent Documents 1 and 2.
  • FIG. 1 is a block diagram showing the configuration of a learning device 1 according to one embodiment of the present invention.
  • the learning device 1 includes an LR basic image storage unit 11 , an original image generation unit 12 , a learning image generation unit 13 , a learning processing unit 14 and a super-resolution model data storage unit 15 .
  • the LR basic image storage unit 11 pre-stores, for example, low-resolution image data captured by a low-resolution HS camera.
  • the low-resolution image data stored in the LR basic image storage unit 11 is hereinafter referred to as "LR basic image data".
  • The LR basic image data is, for example, image data of a 31-channel HS image captured by a camera that captures the wavelength band from 400 nm to 700 nm in increments of 10 nm, and has 128 pixels both vertically and horizontally.
  • "Resolution" in this embodiment means the ability to express the details of the image represented by image data; the greater the number of pixels of the imaging device of the camera used for shooting, the higher the resolution of the image data that can be obtained.
  • Image data obtained by reducing the vertical and horizontal lengths of captured image data at a fixed reduction ratio has fewer pixels than the original image data and loses high-frequency components, resulting in lower resolution.
  • Conversely, when super-resolution technology is applied to captured image data, information of frequency components higher than those in the original image data is restored from the original image data and applied between the pixels of the original image data, so the number of pixels increases as the resolution increases.
  • However, the resolution is not determined only by the number of pixels of the image data. For example, image data made by simply tiling copies of the same image vertically, horizontally, and diagonally has a larger number of pixels, but its resolution is the same as the resolution of the original image data.
  • In the following, the two-dimensional size of image data is written as "number of vertical pixels × number of horizontal pixels", i.e., the number of vertical pixels multiplied by the number of horizontal pixels, and the three-dimensional size including the number of channels is written as "number of vertical pixels × number of horizontal pixels × number of channels".
  • For the LR basic image data described above, for example, the two-dimensional size is written "128 × 128" and the three-dimensional size "128 × 128 × 31".
  • Unless the number of channels is explicitly mentioned, the number of channels of image data is assumed below to be 31.
  • The original image generation unit 12 generates, from the LR basic image data stored in the LR basic image storage unit 11, original image data that serves as the basis for generating the learning image data used in the learning processing performed by the learning processing unit 14.
  • Specifically, from the LR basic image data 60 of three-dimensional size 128 × 128 × 31 shown in FIG. 2(a), the original image generation unit 12 generates LR basic image data 61 with its left and right inverted, LR basic image data 62 with its top and bottom inverted, and LR basic image data 63 with both its left-right and top-bottom inverted.
  • The original image generation unit 12 then combines the LR basic image data 60, 61, 62, and 63 in the positional relationship shown in FIG. 2 to generate original image data 70.
  • The numbers "60", "61", "62", and "63" shown in the 16 square cells of the original image data 70 in FIG. 2 indicate that the LR basic image data corresponding to each number is placed at the position of the cell bearing that number.
  • The original image generation unit 12 generates the original image data 70 having the positional relationship shown in FIG. 2 by the following procedure.
  • "Procedure 1": Combine the LR basic image data 61 to the right of the LR basic image data 60, combine the LR basic image data 62 below the LR basic image data 60, and combine the LR basic image data 63 below the LR basic image data 61.
  • "Procedure 2": Combine the LR basic image data 60 to the right of the LR basic image data 61, and combine the LR basic image data 60 below the LR basic image data 62.
  • "Procedure 3": Apply "Procedure 1" to the LR basic image data 60 newly combined in "Procedure 2", and then apply "Procedure 2".
  • As a result, the original image data 70 shown in FIG. 2(c) is generated. Since the original image data 70 is image data generated by combining only the four LR basic image data 60, 61, 62, and 63, it has the same resolution as the LR basic image data 60.
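  • A minimal NumPy sketch of this mosaic construction, assuming the LR basic image data is an (H, W, C) array; the function name is illustrative:

```python
import numpy as np

def make_original_image(lr_base: np.ndarray) -> np.ndarray:
    """Tile the LR basic image data 60 and its inverted copies 61-63
    into the 4x4 mosaic of FIG. 2 (array axes: height, width, channels)."""
    img60 = lr_base                     # LR basic image data 60
    img61 = lr_base[:, ::-1, :]         # 61: left-right inverted
    img62 = lr_base[::-1, :, :]         # 62: top-bottom inverted
    img63 = lr_base[::-1, ::-1, :]      # 63: left-right and top-bottom inverted
    row_a = np.concatenate([img60, img61, img60, img61], axis=1)
    row_b = np.concatenate([img62, img63, img62, img63], axis=1)
    return np.concatenate([row_a, row_b, row_a, row_b], axis=0)

original_70 = make_original_image(np.zeros((128, 128, 31)))
print(original_70.shape)  # (512, 512, 31)
```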
  • the learning image generation unit 13 includes an image patch selection unit 21, image conversion units 22-1 and 22-2, and a downsampling unit 23.
  • the image patch selection unit 21 selects two pieces of image data, the first image data and the second image data, from arbitrary positions of the original image data 70 generated by the original image generation unit 12 .
  • the first image data is image data having a three-dimensional size of 128 ⁇ 128 ⁇ 31, which is the same three-dimensional size as the LR basic image data 60 .
  • the second image data is image data having a three-dimensional size of 256 ⁇ 256 ⁇ 31, which is double the number of vertical and horizontal pixels of the LR basic image data 60 .
  • the image conversion unit 22-1 performs predetermined image conversion on the first image data selected by the image patch selection unit 21.
  • the image conversion section 22-2 performs predetermined image conversion on the second image data selected by the image patch selection section 21.
  • the predetermined image conversion performed by the image conversion units 22-1 and 22-2 is image conversion obtained by arbitrarily combining image rotation conversion and image inversion conversion described below.
  • Image rotation transformation is image transformation that rotates an image by any one angle randomly selected from four angles of 0 degrees, 90 degrees, 180 degrees, and 270 degrees.
  • the image reversal conversion is an image conversion in which an image is reversed by any one reversal conversion randomly selected from four reversal operations of no reversal, horizontal reversal, vertical reversal, and horizontal/vertical reversal.
  • When "0 degrees" is selected as the angle and "no inversion" as the inversion operation, the image conversion units 22-1 and 22-2 output the given image data as-is without performing any image conversion. Since the image conversion unit 22-1 and the image conversion unit 22-2 independently and randomly select the type of image conversion to perform, the image conversions they perform will in many cases differ, but the same image conversion may also be performed.
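  • A minimal NumPy sketch of the predetermined image conversion, assuming (H, W, C) arrays; each call draws its rotation and inversion independently, as the image conversion units 22-1 and 22-2 do:

```python
import random
import numpy as np

def random_conversion(img: np.ndarray) -> np.ndarray:
    """One randomly selected rotation (0/90/180/270 degrees) combined with
    one randomly selected inversion (none/horizontal/vertical/both)."""
    img = np.rot90(img, k=random.randrange(4), axes=(0, 1))  # image rotation
    flip = random.choice(("none", "h", "v", "hv"))
    if flip in ("h", "hv"):
        img = img[:, ::-1, :]   # horizontal inversion
    if flip in ("v", "hv"):
        img = img[::-1, :, :]   # vertical inversion
    return img
```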
  • As described above, the original image data 70 has the same resolution as the LR basic image data 60. Since the image patch selection unit 21 and the image conversion units 22-1 and 22-2 do not change the resolution of the original image data 70, the resolutions of the two pieces of image data output by the image conversion units 22-1 and 22-2 are also the same as the resolution of the LR basic image data 60.
  • The image data that the image conversion unit 22-1 outputs after taking in the first image data is denoted I_LR_1 (the symbol shown in formula (2)). Since its three-dimensional size is the same as that of the first image data, it is 128 × 128 × 31.
  • The image data that the image conversion unit 22-2 outputs after taking in the second image data is denoted I_LR_2 (the symbol shown in formula (3)). Since its three-dimensional size is the same as that of the second image data, it is 256 × 256 × 31.
  • The downsampling unit 23 downsamples the image data I_LR_1 output by the image conversion unit 22-1, reducing its two-dimensional size, that is, its vertical and horizontal lengths, at a predetermined reduction ratio.
  • The predetermined reduction ratio is a value expressed as 1/s, where s is a number greater than 1, and is determined in advance according to the resolution desired from super-resolution.
  • The image data generated by the downsampling unit 23 reducing the image data I_LR_1 is denoted I_LR_son (the symbol shown in formula (4)). Since the image data I_LR_1 is reduced at the reduction ratio 1/s, the resolution of the image data I_LR_son is lower than that of the already low-resolution image data I_LR_1.
  • the learning processing unit 14 includes an output selection unit 31 , a super-resolution model 32 , an output switching unit 33 , a downsampling unit 34 and a loss calculation unit 35 .
  • the output selection unit 31 takes in the image data I LR _ 2 output by the image conversion unit 22 - 2 and the image data I LR _son output by the downsampling unit 23 .
  • The output selection unit 31 determines whether the captured image data is the image data I_LR_son or the image data I_LR_2 based on, for example, the two-dimensional size of the captured image data.
  • When the output selection unit 31 selects and outputs the captured image data I_LR_son, it outputs the first switching instruction signal before the output.
  • When the output selection unit 31 selects and outputs the captured image data I_LR_2, it outputs the second switching instruction signal before the output.
  • the super-resolution model 32 is a model constructed by each of the function approximator 41 and the upsampling unit 42 reading the corresponding coefficients.
  • The super-resolution model 32 performs an operation that predicts and restores information of frequency components higher than the frequency components contained in the input image data, applies the restored information between the pixels of the input image, and thereby generates image data whose vertical and horizontal pixel counts are enlarged by the enlargement ratio s, the reciprocal of the reduction ratio 1/s of the downsampling unit 23. The model is constructed by the learning process.
  • The coefficients applied to each of the function approximator 41 and the upsampling unit 42 are repeatedly updated by the learning processing performed by the learning processing unit 14, whereby a super-resolution model 32 is constructed that predicts image data having a desired resolution higher than the resolution of the input data.
  • The predicted image data that the super-resolution model 32 calculates when the image data I_LR_son is taken in as input data is denoted Î_LR_1 (the symbol shown in formula (5)).
  • The predicted image data that the super-resolution model 32 calculates when the image data I_LR_2 is taken in as input data is denoted Î_HR_2 (the symbol shown in formula (6)).
  • the function approximator 41 and the upsampling unit 42 are deep neural networks including CNN.
  • the structure of the deep neural network of the function approximator 41 is a structure in which a plurality of convolution layers provided with 3 ⁇ 3 filters each having 3 pixels vertically and horizontally are superimposed.
  • The input data and the output data output from the function approximator 41, i.e., the so-called feature map data, have the same three-dimensional size.
  • The number of 3 × 3 filters provided in one convolution layer is determined appropriately according to the number of channels of the input data; for example, when the input data has 31 channels, 128 filters are provided.
  • the structure of the deep neural network of the upsampling unit 42 is such that the two-dimensional size of the feature map data output by the function approximator 41, that is, the number of vertical and horizontal pixels, is the reciprocal of the reduction ratio 1/s of the downsampling unit 23. It is a structure for generating predicted image data by enlarging at the indicated enlargement factor s.
  • a deep neural network that performs transposed convolution is applied as the structure of the deep neural network of the upsampling unit 42.
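  • A schematic PyTorch sketch of the function approximator 41 and the upsampling unit 42 under the stated assumptions (31 input channels, 128 filters per 3 × 3 convolution layer, transposed-convolution upsampling by s = 4); the layer count and the exact transposed-convolution parameters are illustrative assumptions, not the patent's specification:

```python
import torch.nn as nn

class FunctionApproximator(nn.Module):
    """Sketch of unit 41: stacked 3x3 convolutions with padding=1, so the
    output feature map keeps the input's three-dimensional size."""
    def __init__(self, channels: int = 31, features: int = 128, depth: int = 4):
        super().__init__()
        layers = [nn.Conv2d(channels, features, 3, padding=1), nn.ReLU(inplace=True)]
        for _ in range(depth - 2):
            layers += [nn.Conv2d(features, features, 3, padding=1), nn.ReLU(inplace=True)]
        layers += [nn.Conv2d(features, channels, 3, padding=1)]
        self.body = nn.Sequential(*layers)

    def forward(self, x):
        return self.body(x)

class Upsampler(nn.Module):
    """Sketch of unit 42: a transposed convolution that enlarges height and
    width by the factor s (s = 4 turns 32x32 into 128x128)."""
    def __init__(self, channels: int = 31, s: int = 4):
        super().__init__()
        self.up = nn.ConvTranspose2d(channels, channels, kernel_size=s, stride=s)

    def forward(self, x):
        return self.up(x)
```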
  • the coefficients read by the function approximator 41 and the upsampling unit 42 are weights and biases applied to neurons included in the function approximator 41 and the upsampling unit 42 .
  • the output switching section 33 switches the output destination to the loss calculating section 35 when receiving the first switching instruction signal from the output selecting section 31 .
  • the output switching unit 33 switches the output destination to the downsampling unit 34 when receiving the second switching instruction signal from the output selecting unit 31 .
  • the output switching unit 33 outputs the captured predicted image data to the output destination, and then outputs an output completion notification signal to the output selection unit 31 .
  • The downsampling unit 34 downsamples the predicted image data Î_HR_2 output by the output switching unit 33, reducing its two-dimensional size, that is, its vertical and horizontal lengths, at a predetermined reduction ratio.
  • the reduction rate predetermined in the down-sampling section 34 is the same value as 1/s, which is the reduction rate predetermined in the down-sampling section 23 .
  • The predicted image data generated by the downsampling unit 34 reducing the predicted image data Î_HR_2 is denoted Î_LR_2 (the symbol shown in formula (7)).
  • Since the predicted image data Î_HR_2 is reduced at the reduction ratio 1/s, the resolution of the predicted image data Î_LR_2 is lower than that of the predicted image data Î_HR_2.
  • The loss calculation unit 35 calculates the loss value (Loss) using the loss function represented by the following equation (8).
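  • A plausible reconstruction of equation (8) from the description below, where Î denotes predicted image data; the exact notation of the original is not preserved by this extraction:

```latex
\mathrm{Loss} \;=\; \bigl\| I_{LR\_1} - \hat{I}_{LR\_1} \bigr\|_{1} \;+\; \bigl\| I_{LR\_2} - \hat{I}_{LR\_2} \bigr\|_{1} \tag{8}
```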
  • The loss value calculated by the loss calculation unit 35 is the sum of two L1 norms: the L1 norm indicating the magnitude of the difference between the image data I_LR_1 and the predicted image data Î_LR_1, and the L1 norm indicating the magnitude of the difference between the image data I_LR_2 and the predicted image data Î_LR_2.
  • The three-dimensional size of the predicted image data Î_LR_1, obtained as output data when the image data I_LR_son is given to the super-resolution model 32 as input data, is the same as that of the image data I_LR_1 before the downsampling unit 23 reduced it to I_LR_son. Likewise, the three-dimensional size of the predicted image data Î_LR_2 is the same as that of the image data I_LR_2.
  • The first term on the right side of equation (8) can therefore be calculated by summing the absolute values of the differences between the pixel values of pixels at corresponding positions in the image data I_LR_1 and the predicted image data Î_LR_1, and the second term by summing the absolute values of the differences between the pixel values of pixels at corresponding positions in the image data I_LR_2 and the predicted image data Î_LR_2.
  • The loss calculation unit 35 determines whether to continue the learning process based on the calculated loss value and a predetermined threshold. When determining to continue, the loss calculation unit 35 calculates new coefficients to be applied to the function approximator 41 and the upsampling unit 42 so as to minimize the calculated loss value, and outputs a processing continuation instruction signal to the image patch selection unit 21, the function approximator 41, and the upsampling unit 42, as indicated by the dotted-arrow connecting lines in FIG. 1.
  • the super-resolution model data storage unit 15 stores super-resolution model data, that is, coefficients applied to the function approximator 41 and the upsampling unit 42 .
  • the super-resolution model data storage unit 15 stores in advance super-resolution model data of initial values, that is, initial values of coefficients applied to the function approximator 41 and the upsampling unit 42 .
  • FIG. 3 is a flowchart showing the flow of processing by the learning image generation unit 13 of the learning device 1.
  • FIG. 4 is a flow chart showing the flow of processing by the learning processing unit 14.
  • The processing of the flowchart of FIG. 3 and the processing of the flowchart of FIG. 4 are continuous: as indicated by the symbol "A", the processing of step Sb1 in the flowchart of FIG. 4 is performed after the processing of step Sa5 in the flowchart of FIG. 3.
  • In the following description, the downsampling unit 23 and the downsampling unit 34 are configured in advance to reduce the number of vertical and horizontal pixels of the input data to 1/4 (i.e., s = 4), and the upsampling unit 42 is configured in advance to enlarge the number of vertical and horizontal pixels of the input data four times.
  • Each of the function approximator 41 and the upsampling unit 42 reads the initial values of the corresponding coefficients from the super-resolution model data storage unit 15 when the learning device 1 is activated.
  • Each of the function approximator 41 and the upsampling unit 42 applies each read initial value of the coefficient to the corresponding neuron.
  • When the processing starts, the original image generation unit 12 generates, as described with reference to FIG. 2, the original image data 70 having a three-dimensional size of 512 × 512 × 31 from the LR basic image data 60 stored in the LR basic image storage unit 11, and outputs the generated original image data 70 to the image patch selection unit 21.
  • the image patch selection unit 21 takes in the original image data 70 output by the original image generation unit 12 .
  • The image patch selection unit 21 extracts, from arbitrary positions of the original image data 70, first image data of three-dimensional size 128 × 128 × 31, the same as the LR basic image data 60, and second image data of three-dimensional size 256 × 256 × 31, which has twice the number of vertical and horizontal pixels of the LR basic image data 60.
  • Specifically, the image patch selection unit 21 randomly determines a square selection area 71 of 128 × 128 pixels within the range of the original image data 70.
  • The image patch selection unit 21 also randomly determines a square selection area 72 of 256 × 256 pixels within the range of the original image data 70.
  • The selection area 71 and the selection area 72 determined by the image patch selection unit 21 may or may not overlap each other.
  • the image patch selection unit 21 extracts and selects the first image data having a three-dimensional size of 128 ⁇ 128 ⁇ 31 based on the selection area 71 defined in the original image data 70 .
  • the image patch selection unit 21 extracts and selects second image data having a three-dimensional size of 256 ⁇ 256 ⁇ 31 based on the selection area 72 defined in the original image data 70 .
  • the image patch selection unit 21 outputs the selected first image data to the image conversion unit 22-1, and outputs the selected second image data to the image conversion unit 22-2 (step Sa1).
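  • A minimal NumPy sketch of the patch selection in step Sa1, assuming the 512 × 512 × 31 original image data 70 as an (H, W, C) array; function and variable names are illustrative:

```python
import numpy as np

rng = np.random.default_rng()

def select_patches(original: np.ndarray):
    """Randomly crop the 128x128x31 first image data (selection area 71) and
    the 256x256x31 second image data (selection area 72). The two areas are
    chosen independently and may overlap."""
    h, w, _ = original.shape
    y1 = rng.integers(0, h - 128 + 1); x1 = rng.integers(0, w - 128 + 1)
    y2 = rng.integers(0, h - 256 + 1); x2 = rng.integers(0, w - 256 + 1)
    first = original[y1:y1 + 128, x1:x1 + 128, :]
    second = original[y2:y2 + 256, x2:x2 + 256, :]
    return first, second
```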
  • When the image conversion unit 22-1 captures the first image data output from the image patch selection unit 21, it performs the predetermined image conversion described above, randomly determined at the timing of capture, to generate image data 81 as shown in FIG. 5(b). This image data 81 is the image data I_LR_1 described above.
  • Similarly, when the image conversion unit 22-2 captures the second image data output from the image patch selection unit 21, it performs the predetermined image conversion randomly determined at the timing of capture to generate image data 82 as shown in FIG. 5(c).
  • This image data 82 is the image data I_LR_2 described above (step Sa2).
  • The image conversion unit 22-1 outputs the image data I_LR_1 to the loss calculation unit 35 of the learning processing unit 14.
  • The image conversion unit 22-2 outputs the image data I_LR_2 to the output selection unit 31 and the loss calculation unit 35 of the learning processing unit 14 (step Sa3).
  • The image conversion unit 22-1 also outputs the image data I_LR_1 to the downsampling unit 23.
  • The downsampling unit 23 reduces the number of vertical and horizontal pixels of the 128 × 128 × 31 three-dimensional image data I_LR_1 output by the image conversion unit 22-1 to 1/4, generating image data 91 having a three-dimensional size of 32 × 32 × 31.
  • This image data 91 is the image data I_LR_son described above (step Sa4).
  • The downsampling unit 23 outputs the generated image data I_LR_son to the output selection unit 31 of the learning processing unit 14 (step Sa5). After that, the processing proceeds to step Sb1 in FIG. 4, as indicated by the symbol "A".
  • The output selection unit 31 takes in the image data I_LR_son output by the downsampling unit 23 and the image data I_LR_2 output by the image conversion unit 22-2.
  • The loss calculation unit 35 takes in the image data I_LR_1 output by the image conversion unit 22-1 and the image data I_LR_2 output by the image conversion unit 22-2 (step Sb1).
  • The output selection unit 31 first selects the image data I_LR_son as the image data to be output, and outputs the first switching instruction signal to the output switching unit 33 (step Sb2). Upon receiving the first switching instruction signal from the output selection unit 31, the output switching unit 33 switches the output destination to the loss calculation unit 35.
  • The output selection unit 31 outputs the selected image data I_LR_son to the function approximator 41 (step Sb3).
  • The function approximator 41 takes in the 32 × 32 × 31 three-dimensional size image data I_LR_son as input data, performs its calculation, and generates feature map data of three-dimensional size 32 × 32 × 31.
  • The function approximator 41 outputs the generated feature map data to the upsampling unit 42.
  • The upsampling unit 42 takes in the feature map data output by the function approximator 41.
  • The upsampling unit 42 performs an operation that quadruples the number of vertical and horizontal pixels of the captured feature map data, generating predicted image data Î_LR_1 of three-dimensional size 128 × 128 × 31.
  • The upsampling unit 42 outputs the generated predicted image data Î_LR_1 to the output switching unit 33 (step Sb4).
  • The output switching unit 33 takes in the predicted image data Î_LR_1 output by the upsampling unit 42 and outputs it to the loss calculation unit 35, the current output destination. After outputting the predicted image data Î_LR_1 to the loss calculation unit 35, the output switching unit 33 outputs an output completion notification signal to the output selection unit 31.
  • The loss calculation unit 35 takes in the predicted image data Î_LR_1 output by the output switching unit 33 (step Sb5).
  • Upon receiving the output completion notification signal from the output switching unit 33, the output selection unit 31 next selects the image data I_LR_2 as the image data to be output, and outputs the second switching instruction signal to the output switching unit 33 (step Sb6). Upon receiving the second switching instruction signal from the output selection unit 31, the output switching unit 33 switches the output destination to the downsampling unit 34.
  • The output selection unit 31 outputs the selected image data I_LR_2 to the function approximator 41 (step Sb7).
  • The function approximator 41 takes in the 256 × 256 × 31 three-dimensional size image data I_LR_2 as input data, performs its calculation, and generates feature map data of three-dimensional size 256 × 256 × 31.
  • The function approximator 41 outputs the generated feature map data to the upsampling unit 42.
  • The upsampling unit 42 takes in the feature map data output by the function approximator 41.
  • The upsampling unit 42 performs an operation that quadruples the number of vertical and horizontal pixels of the captured feature map data, generating predicted image data Î_HR_2 of three-dimensional size 1024 × 1024 × 31.
  • The upsampling unit 42 outputs the generated predicted image data Î_HR_2 to the output switching unit 33 (step Sb8).
  • The output switching unit 33 takes in the predicted image data Î_HR_2 output from the upsampling unit 42 and outputs it to the downsampling unit 34, the current output destination. After outputting the predicted image data Î_HR_2 to the downsampling unit 34, the output switching unit 33 outputs an output completion notification signal to the output selection unit 31. Even if the output selection unit 31 receives the output completion notification signal at this point, however, there is no image data to be selected next, so no further selection is performed. The downsampling unit 34 takes in the predicted image data Î_HR_2 output by the output switching unit 33 (step Sb9).
  • The downsampling unit 34 reduces the number of vertical and horizontal pixels of the captured 1024 × 1024 × 31 three-dimensional size predicted image data Î_HR_2 to 1/4, generating predicted image data Î_LR_2 of three-dimensional size 256 × 256 × 31.
  • The downsampling unit 34 outputs the generated predicted image data Î_LR_2 to the loss calculation unit 35.
  • The loss calculation unit 35 takes in the predicted image data Î_LR_2 output from the downsampling unit 34 (step Sb10).
  • The loss calculation unit 35 calculates the loss value by equation (8) based on the image data I_LR_1 and I_LR_2 captured in step Sb1, the predicted image data Î_LR_1 captured in step Sb5, and the predicted image data Î_LR_2 captured in step Sb10 (step Sb11).
  • Specifically, the loss calculation unit 35 calculates the value of the first term on the right side of equation (8) by summing the absolute values of the differences between the pixel values of pixels at corresponding positions in the image data I_LR_1 and the predicted image data Î_LR_1.
  • The loss calculation unit 35 calculates the value of the second term on the right side of equation (8) by summing the absolute values of the differences between the pixel values of pixels at corresponding positions in the image data I_LR_2 and the predicted image data Î_LR_2.
  • The loss calculation unit 35 then calculates the sum of the first and second terms on the right side of equation (8), and takes the calculated total as the loss value.
  • The loss calculation unit 35 determines whether the calculated loss value is equal to or greater than a predetermined threshold (step Sb12). When it determines that the loss value is equal to or greater than the threshold (step Sb12, Yes), it calculates, based on the loss value, new super-resolution model data, that is, new coefficients to be applied to the function approximator 41 and the upsampling unit 42, for example by error backpropagation based on the loss function of equation (8). After calculating the new super-resolution model data, the loss calculation unit 35 overwrites the super-resolution model data stored in the super-resolution model data storage unit 15 with the newly calculated data (step Sb13).
  • the loss calculator 35 outputs a processing continuation instruction signal to the image patch selector 21, the function approximator 41, and the upsampling unit 42 (step Sb14).
  • When each of the function approximator 41 and the upsampling unit 42 receives the processing continuation instruction signal from the loss calculation unit 35, it reads its corresponding coefficients from the super-resolution model data storage unit 15 and applies each of the read coefficients to its corresponding neurons (step Sb15).
  • After step Sb15 in FIG. 4, as indicated by the symbol "B", the processing returns to step Sa1 in FIG. 3, and the processing from step Sa1 onward is performed again.
  • Through the processing of steps Sa1 to Sa5 performed again, the learning image generation unit 13 generates new image data I_LR_1, I_LR_2, and I_LR_son. Based on the newly generated image data I_LR_1, I_LR_2, and I_LR_son, the learning processing unit 14 performs steps Sb1 to Sb11 again.
  • The loss calculation unit 35 then performs the determination process of step Sb12 again. When the calculated loss value is equal to or greater than the predetermined threshold, steps Sb13 to Sb15 are performed again, and the super-resolution model data stored in the super-resolution model data storage unit 15 is updated to the new super-resolution model data.
  • As the learning process is repeated, the resolution of the predicted image data Î_LR_2 and the resolution of the image data I_LR_2 gradually become closer to each other.
  • As a result, the super-resolution model 32 is gradually constructed as a model that predicts output data with a resolution higher than that of the input data while enlarging the number of vertical and horizontal pixels of the input data four times.
  • When the loss value is determined to be less than the predetermined threshold (step Sb12, No), the processing ends. When the processing of FIGS. 3 and 4 is completed, the super-resolution model data storage unit 15 stores fully trained super-resolution model data, that is, data constructing a model that enlarges the number of vertical and horizontal pixels of the input data four times and predicts output data with a desired resolution higher than that of the input data.
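  • The whole procedure of FIGS. 3 and 4 can be summarized in a compact PyTorch sketch. It assumes the FunctionApproximator and Upsampler modules sketched earlier, a hypothetical patch_loader yielding image-converted pairs (I_LR_1, I_LR_2) as (N, 31, 128, 128) and (N, 31, 256, 256) tensors, and illustrative choices (Adam, bicubic interpolation, threshold value) where the patent leaves details open:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def downsample(x: torch.Tensor, s: int = 4) -> torch.Tensor:
    # Stand-in for the 1/s reduction performed by downsampling units 23 and 34.
    return F.interpolate(x, scale_factor=1.0 / s, mode="bicubic", align_corners=False)

model = nn.Sequential(FunctionApproximator(), Upsampler())  # super-resolution model 32
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)   # stands in for steps Sb13-Sb15
threshold = 1e-3                                            # predetermined threshold (assumed value)

for i_lr_1, i_lr_2 in patch_loader:        # steps Sa1-Sa3: converted patch pairs
    i_lr_son = downsample(i_lr_1)          # step Sa4: 128x128 -> 32x32
    pred_lr_1 = model(i_lr_son)            # steps Sb3-Sb5: 32x32 -> 128x128
    pred_lr_2 = downsample(model(i_lr_2))  # steps Sb7-Sb10: 256 -> 1024 -> 256
    loss = (F.l1_loss(pred_lr_1, i_lr_1, reduction="sum")
            + F.l1_loss(pred_lr_2, i_lr_2, reduction="sum"))  # equation (8), step Sb11
    if loss.item() < threshold:            # step Sb12, No: learning complete
        break
    optimizer.zero_grad()
    loss.backward()                        # error backpropagation, step Sb13
    optimizer.step()
```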
  • As described above, in the learning device 1 of the present embodiment, the learning image generation unit 13 selects two pieces of image data, the first image data and the second image data, from arbitrary positions of the original image data 70.
  • It applies a predetermined image conversion to the selected first image data and reduces it at the predetermined reduction ratio 1/s to generate the first learning image data, that is, the image data I_LR_son, and applies a predetermined image conversion to the other selected second image data to generate the second learning image data, that is, the image data I_LR_2.
  • The learning processing unit 14 takes each of the image data I_LR_son and the image data I_LR_2 as input data for the super-resolution model 32, which outputs, as output data, predicted image data with a resolution higher than that of the input data, obtained by enlarging the input data at the enlargement ratio s indicated by the reciprocal of the reduction ratio 1/s.
  • The learning processing unit 14 generates a super-resolution model 32 that minimizes the sum of the magnitude of the difference between the predicted image data Î_LR_1, which is the output when the image data I_LR_son is input, and the image data I_LR_1 before reduction, and the magnitude of the difference between the image data Î_LR_2, obtained by reducing at the reduction ratio 1/s the predicted image data Î_HR_2 output when the image data I_LR_2 is input, and the image data I_LR_2.
  • In this way, the learning device 1 can construct a super-resolution model that generates high-resolution image data from low-resolution image data by performing unsupervised learning using only the LR basic image data 60, without using true value data or auxiliary image data that requires an accurate correspondence with the LR basic image data 60. Therefore, compared with the supervised-learning methods shown in Non-Patent Documents 1 and 2, the cost of creating the data required for supervised learning is greatly reduced, and high-accuracy super-resolution can be achieved.
  • FIG. 7 is a block diagram showing the configuration of the image processing device 2. In the image processing device 2, the same components as those of the learning device 1 shown in FIG. 1 are denoted by the same reference numerals.
  • The image processing device 2 includes an image capturing unit 51, a prediction processing unit 52, and a trained super-resolution model data storage unit 15a.
  • The image capturing unit 51 captures the LR image data I_LR, which is image data of a low-resolution HS image supplied from the outside, and outputs the captured LR image data I_LR to the function approximator 41.
  • The trained super-resolution model data storage unit 15a stores in advance the trained super-resolution model data that is stored in the super-resolution model data storage unit 15 when the learning device 1 completes the processing shown in FIGS. 3 and 4.
  • the prediction processing unit 52 includes a function approximator 41 and an upsampling unit 42 .
  • FIG. 8 is a flowchart showing the flow of processing by the image processing device 2.
  • When the image processing device 2 is started, that is, before the processing of the flowchart shown in FIG. 8 begins, each of the function approximator 41 and the upsampling unit 42 reads the corresponding trained coefficients from the trained super-resolution model data storage unit 15a.
  • Each of the function approximator 41 and the upsampling unit 42 applies the read trained coefficients to the corresponding neurons.
  • The trained super-resolution model 32a is thereby constructed.
  • The image capturing unit 51 captures the LR image data I_LR supplied from the outside (step Sc1).
  • The three-dimensional size of the LR image data I_LR captured by the image capturing unit 51 is assumed here to be the same as that of the LR basic image data 60 used in the learning device 1, that is, 128 × 128 × 31, and the resolution of the image data I_LR is assumed to be the same as that of the LR basic image data 60.
  • The function approximator 41 takes in the image data I_LR as input data, performs its calculation, and outputs the resulting feature map data to the upsampling unit 42.
  • The upsampling unit 42 performs the upsampling operation, quadrupling the number of vertical and horizontal pixels of the feature map data output by the function approximator 41 to generate predicted image data of a high-resolution HS image with a three-dimensional size of 512 × 512 × 31 (step Sc2).
  • The predicted image data generated by the upsampling unit 42 is denoted Î_HR (the symbol shown in formula (9)).
  • The upsampling unit 42 outputs the generated predicted image data Î_HR of the high-resolution HS image having a three-dimensional size of 512 × 512 × 31 (step Sc3).
  • As described above, the image processing device 2 captures the LR image data I_LR with the image capturing unit 51, holds the trained super-resolution model 32a generated by the learning processing unit 14 of the learning device 1, and outputs the predicted image data Î_HR obtained as output data by giving the LR image data I_LR captured by the image capturing unit 51 to the trained super-resolution model 32a as input data.
  • As described above, the image data I_LR captured by the image capturing unit 51 has the same three-dimensional size of 128 × 128 × 31 as the LR basic image data 60 stored in the LR basic image storage unit 11 of the learning device 1.
  • The predicted image data Î_HR is therefore high-resolution image data in which information of frequency components higher than the frequency components contained in the image data I_LR has been restored and applied between the pixels of the image data I_LR, and it is 512 × 512 × 31 three-dimensional image data whose vertical and horizontal pixel counts are enlarged to four times those of the image data I_LR.
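  • A minimal inference sketch for the image processing device 2, assuming the modules sketched earlier; the file name and variable names are illustrative:

```python
import torch
import torch.nn as nn

model = nn.Sequential(FunctionApproximator(), Upsampler())      # trained super-resolution model 32a
model.load_state_dict(torch.load("super_resolution_model.pt"))  # coefficients from storage unit 15a
model.eval()

i_lr = torch.rand(1, 31, 128, 128)        # LR image data I_LR (step Sc1)
with torch.no_grad():
    i_hr_pred = model(i_lr)               # predicted image data Î_HR (steps Sc2-Sc3)
print(i_hr_pred.shape)                    # torch.Size([1, 31, 512, 512])
```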
  • Regarding the image data I_LR_1, the learning device 1 of the above embodiment can be said to perform, as the first learning process, learning in which the downsampling unit 23 and the super-resolution model 32 are combined so that, when image data I_LR_1 of three-dimensional size 128 × 128 × 31 is given as input data, the same image data I_LR_1 as the input data is obtained as output data. The first learning process can therefore be said to perform unsupervised learning with the combination of the downsampling unit 23 and the super-resolution model 32 acting as an autoencoder.
  • Regarding the image data I_LR_2, as the second learning process, the super-resolution model 32 and the downsampling unit 34 are combined so that, when image data I_LR_2 of three-dimensional size 256 × 256 × 31 is given as input data, the same image data I_LR_2 as the input data is obtained as output data. The second learning process can therefore be said to perform unsupervised learning with the combination of the super-resolution model 32 and the downsampling unit 34 acting as an autoencoder.
  • Note that the downsampling units 23 and 34 are not targets of the first and second learning processes; they are merely configured to reduce given image data at the reduction ratio 1/s.
  • This means that the super-resolution model 32, which is the component common to the two autoencoders apart from the downsampling units 23 and 34, is constructed as a model capable of performing highly accurate super-resolution on image data in the size range from the image data I_LR_son to the image data I_LR_2.
  • The resolution of the 256 × 256 × 31 image data I_LR_2 is the same as that of the LR basic image data 60. Therefore, even if 31-channel image data of arbitrary two-dimensional size, with a resolution between that of the image data I_LR_son and that of the LR basic image data 60, is given to the trained super-resolution model 32a as the image data I_LR, predicted image data is obtained in which the number of vertical and horizontal pixels is enlarged four times and the resolution is higher than that of the image data I_LR.
  • In the above embodiment, the image patch selection unit 21 selects, as the first image data, image data of three-dimensional size 128 × 128 × 31, the same as the LR basic image data 60, and, as the second image data, image data of three-dimensional size 256 × 256 × 31, which has twice the number of vertical and horizontal pixels of the LR basic image data 60.
  • However, the two-dimensional size of 128 × 128 for the first image data and the two-dimensional size of 256 × 256 for the second image data are examples; any square two-dimensional sizes may be used as long as the number of channels is maintained at 31.
  • To improve the accuracy of super-resolution, it is desirable to make the two-dimensional size of the second image data as large as possible.
  • However, so that the super-resolution model 32 constructed by the learning process is not overly biased toward the features of the second image data, the two-dimensional size of the second image data is preferably about twice that of the LR basic image data 60, as shown in the above embodiment.
  • In the above embodiment, the number of vertical and horizontal pixels of the LR basic image data 60 is 128, but a number of pixels other than 128 may be used.
  • The shape of the LR basic image data 60 is not limited to a square; rectangular image data with different numbers of vertical and horizontal pixels may be used. However, when rectangular image data with different numbers of vertical and horizontal pixels is used as the LR basic image data 60, the shape of the image data given to the function approximator 41 of the learning processing unit 14 must also be similar to the shape of the LR basic image data 60. Therefore, when the image patch selection unit 21 selects the first image data and the second image data, it must select them so that their shapes are similar to the shape of the LR basic image data 60.
  • In this case, the shape of the image data supplied to the image processing device 2 as the target of super-resolution also needs to be similar to the shape of the LR basic image data 60.
  • The LR basic image data 60 is not limited to an HS image; image data with an arbitrary number of channels other than 31, for example three-channel RGB image data, may be used. However, when image data with a number of channels other than 31 is used as the LR basic image data 60, the number of channels of the image data supplied to the image processing device 2 as the target of super-resolution must match the number of channels of the LR basic image data 60.
  • In the above embodiment, the original image generation unit 12 generates the original image data 70 from the LR basic image data 60 according to the procedure shown in FIG. 2. At the joints of the combined LR basic image data, boundaries arise that do not exist in ordinary image data, and the presence of these boundaries works advantageously when constructing the super-resolution model 32 by the learning process. Therefore, a procedure other than that shown in FIG. 2 may be used, as long as it generates original image data 70 in which such boundary portions, not present in ordinary image data, occur.
  • An external device connected to the learning device 1 may include the LR basic image storage unit 11 and the original image generation unit 12, and the external device may output the generated original image data 70 to the learning device 1.
  • In the above embodiment, the image conversion units 22-1 and 22-2 perform a predetermined image conversion, randomly determined at the timing of capture, on the captured image data.
  • However, the learning image generation unit 13 may omit the image conversion units 22-1 and 22-2. In that case, the image patch selection unit 21 outputs the first image data, which would otherwise be output to the image conversion unit 22-1, directly to the downsampling unit 23 and the loss calculation unit 35, and outputs the second image data, which would otherwise be output to the image conversion unit 22-2, directly to the output selection unit 31 and the loss calculation unit 35.
  • In the above embodiment, an example is shown in which a deep neural network that performs transposed convolution is applied as the upsampling unit 42.
  • Instead, the known method called PixelShuffle may be applied.
  • When the PixelShuffle method is applied to the upsampling unit 42, the coefficients of the upsampling unit 42 need not be updated, which has the advantage of shortening the convergence time of the learning process.
  • In that case, since the coefficients of the upsampling unit 42 are no longer updated, the super-resolution model data storage unit 15 stores only the coefficients applied to the neurons of the function approximator 41, and the loss calculation unit 35 calculates, based on the loss value, only the new coefficients to be applied to the neurons of the function approximator 41.
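  • For reference, a minimal PyTorch illustration of the PixelShuffle rearrangement; note the assumption, stated in the comments, about the channel count of the feature map:

```python
import torch
import torch.nn as nn

# nn.PixelShuffle(s) rearranges a (N, C*s*s, H, W) tensor into (N, C, s*H, s*W)
# and has no learnable coefficients, matching the remark that the upsampling
# unit 42 no longer needs coefficient updates. The function approximator 41
# would then need to output C*s*s channels (an assumption of this sketch).
shuffle = nn.PixelShuffle(4)
feature_map = torch.rand(1, 31 * 4 * 4, 32, 32)
print(shuffle(feature_map).shape)  # torch.Size([1, 31, 128, 128])
```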
  • However, also when the PixelShuffle method is used, the configuration is set in advance so that enlargement is performed at the enlargement ratio s indicated by the reciprocal of the reduction ratio 1/s.
  • For example, bicubic downsampling can be applied as the downsampling method of the downsampling unit 23.
  • As the downsampling method of the downsampling unit 34, for example, a method that approximates bicubic downsampling by performing a convolution operation with a kernel called a Lanczos kernel can be applied.
  • However, the techniques applied to the downsampling unit 23 and the downsampling unit 34 are not limited to these, and other techniques for reducing image data may be applied.
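  • A sketch of a Lanczos-kernel downsampler under assumed parameters (a = 2, s = 4); this is one plausible reading of the convolution-based approximation described above, not the patent's exact method:

```python
import torch
import torch.nn.functional as F

def lanczos_kernel(s: int = 4, a: int = 2) -> torch.Tensor:
    # 1-D Lanczos-a kernel sampled for an s-fold reduction
    # (a = 2 is an assumed parameter; the text only names the Lanczos kernel).
    t = torch.arange(-a * s + 0.5, a * s, 1.0) / s
    k = torch.sinc(t) * torch.sinc(t / a)
    return k / k.sum()

def lanczos_downsample(x: torch.Tensor, s: int = 4) -> torch.Tensor:
    """Reduce height and width to 1/s by strided separable convolution with a
    Lanczos kernel, approximating bicubic downsampling as described for the
    downsampling unit 34."""
    c = x.shape[1]
    k = lanczos_kernel(s)
    pad = (k.numel() - s) // 2
    kh = k.view(1, 1, 1, -1).repeat(c, 1, 1, 1)   # horizontal pass, one filter per channel
    kv = k.view(1, 1, -1, 1).repeat(c, 1, 1, 1)   # vertical pass
    x = F.conv2d(x, kh, stride=(1, s), padding=(0, pad), groups=c)
    x = F.conv2d(x, kv, stride=(s, 1), padding=(pad, 0), groups=c)
    return x

x = torch.rand(1, 31, 128, 128)
print(lanczos_downsample(x).shape)  # torch.Size([1, 31, 32, 32])
```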
  • In the above embodiment, a deep neural network with a structure in which multiple convolution layers with 3 × 3 filters are stacked is applied as the function approximator 41.
  • However, the function approximator 41 may be realized by a method other than a neural network, as long as it can extract feature map data of the same three-dimensional size as the input data from the input data.
  • In the above embodiment, the loss calculation unit 35 determines whether the calculated loss value is equal to or greater than a predetermined threshold.
  • However, the present invention is not limited to this embodiment; this determination process may be replaced with another criterion for deciding whether to continue the learning process.
  • the learning device 1 and the image processing device 2 in the above-described embodiment may be realized by a computer.
  • a program for realizing this function may be recorded in a computer-readable recording medium, and the program recorded in this recording medium may be read into a computer system and executed.
  • the term "computer system” as used herein includes an OS (Operating System) and hardware such as peripheral devices.
  • "computer-readable recording medium” refers to portable media such as flexible disks, magneto-optical disks, ROM (Read Only Memory), CD-ROMs, and storage devices such as hard disks built into computer systems. say.
  • The "computer-readable recording medium" may also include something that holds the program dynamically for a short time, such as a communication line when the program is transmitted via a network such as the Internet or a communication line such as a telephone line, and something that holds the program for a certain period, such as volatile memory inside a computer system serving as a server or client in that case. Further, the program may realize part of the functions described above, may realize the functions described above in combination with a program already recorded in the computer system, or may be implemented using a programmable logic device such as an FPGA (Field Programmable Gate Array).

PCT/JP2021/046132 2021-12-14 2021-12-14 学習装置、画像処理装置、学習方法、画像処理方法及びプログラム WO2023112172A1 (ja)

Priority Applications (2)

Application Number Priority Date Filing Date Title
PCT/JP2021/046132 WO2023112172A1 (ja) 2021-12-14 2021-12-14 学習装置、画像処理装置、学習方法、画像処理方法及びプログラム
JP2023567359A JPWO2023112172A1 (zh) 2021-12-14 2021-12-14

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/JP2021/046132 WO2023112172A1 (ja) 2021-12-14 2021-12-14 学習装置、画像処理装置、学習方法、画像処理方法及びプログラム

Publications (1)

Publication Number Publication Date
WO2023112172A1 true WO2023112172A1 (ja) 2023-06-22

Family

ID=86773804

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2021/046132 WO2023112172A1 (ja) 2021-12-14 2021-12-14 学習装置、画像処理装置、学習方法、画像処理方法及びプログラム

Country Status (2)

Country Link
JP (1) JPWO2023112172A1 (zh)
WO (1) WO2023112172A1 (zh)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2020027557A (ja) * 2018-08-17 2020-02-20 日本放送協会 超解像装置およびそのプログラム
JP2021502644A (ja) * 2017-11-09 2021-01-28 京東方科技集團股▲ふん▼有限公司Boe Technology Group Co.,Ltd. 画像処理方法、処理装置及び処理デバイス
US20210076066A1 (en) * 2019-09-11 2021-03-11 Samsung Electronics Co., Ltd. Electronic apparatus and controlling method thereof

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2021502644A (ja) * 2017-11-09 2021-01-28 京東方科技集團股▲ふん▼有限公司Boe Technology Group Co.,Ltd. 画像処理方法、処理装置及び処理デバイス
JP2020027557A (ja) * 2018-08-17 2020-02-20 日本放送協会 超解像装置およびそのプログラム
US20210076066A1 (en) * 2019-09-11 2021-03-11 Samsung Electronics Co., Ltd. Electronic apparatus and controlling method thereof

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
KASHIHARA, KOJI: "Super resolution of vein images using convolutional neural networks", IEICE TECHNICAL REPORT., vol. 116, no. 196, 18 August 2016 (2016-08-18), pages 39 - 44, XP009546527 *
SAKURAI, AYUMU ET AL.: "Single-Image Super-Resolution Based on Sparse Coding Using Adaptive Basis Learning", IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, vol. J96-D, no. 8, 1 August 2013 (2013-08-01), pages 1801 - 1810, XP009546526 *

Also Published As

Publication number Publication date
JPWO2023112172A1 (zh) 2023-06-22


Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21968089

Country of ref document: EP

Kind code of ref document: A1

ENP Entry into the national phase

Ref document number: 2023567359

Country of ref document: JP

Kind code of ref document: A