WO2022249661A1 - Image processing apparatus and image processing method
- Publication number: WO2022249661A1
- Application number: PCT/JP2022/011503
- Authority: WO (WIPO, PCT)
- Prior art keywords
- frame
- target frame
- processing
- resolution
- processing target
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T3/00—Geometric image transformations in the plane of the image
- G06T3/40—Scaling of whole images or parts thereof, e.g. expanding or contracting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N7/00—Television systems
- H04N7/01—Conversion of standards, e.g. involving analogue television standards or digital television standards processed at pixel level
Definitions
- The present invention relates to an image processing apparatus and an image processing method for performing resolution enhancement on a moving image sequence.
- In Non-Patent Documents 1 and 2, resolution enhancement methods have been proposed that increase the resolution of video (increase the number of pixels) using a machine learning model, in particular a neural network such as a convolutional neural network (hereinafter referred to as "CNN").
- An object of the present invention is to provide an image processing apparatus and an image processing method capable of increasing the resolution of a moving image sequence with higher performance in the recurrent method.
- An image processing method is provided for performing resolution enhancement on a moving image sequence composed of a plurality of frames, the method including determining whether a processing target frame among the plurality of frames is a key frame, and then proceeding as follows.
- (i) When the processing target frame is determined to be a key frame: estimating the resolution enhancement difficulty of the processing target frame; determining a machine learning model according to the estimated difficulty; and determining whether the processing target frame is a key frame corresponding to a scene change. If it is not a key frame corresponding to a scene change, a resolution enhancement processing unit using the determined machine learning model generates a high-resolution frame of the processing target frame using the processing target frame, frames before and/or after the processing target frame, and the feature map generated when the frame preceding the processing target frame was enhanced. If it is a key frame corresponding to a scene change, an alternative feature map is generated using the processing target frame and frames after the processing target frame, and the resolution enhancement processing unit using the determined machine learning model generates the high-resolution frame using the processing target frame, frames before and/or after the processing target frame, and the alternative feature map.
- (ii) When the processing target frame is determined not to be a key frame: the resolution enhancement processing unit using the determined machine learning model generates a high-resolution frame of the processing target frame using the processing target frame, frames before and/or after the processing target frame, and the feature map generated when the frame preceding the processing target frame was enhanced.
- An image processing apparatus is provided that performs resolution enhancement on a moving image sequence composed of a plurality of frames, the apparatus including: a reset determination unit that determines whether a processing target frame among the plurality of frames is a reset target frame; and a resolution enhancement processing unit that has a first machine learning model and that (a) when the processing target frame is not a reset target frame, generates a high-resolution frame of the processing target frame and a feature map using the processing target frame, frames before and/or after the processing target frame, and the feature map generated when the frame preceding the processing target frame was enhanced, and (b) when the processing target frame is a reset target frame, generates the high-resolution frame and the feature map using the processing target frame and the frames before and/or after it, without using the feature map generated when the preceding frame was enhanced.
- the reset determination unit may determine whether the processing target frame is a reset target frame based on key frames included in the plurality of frames.
- The reset determination unit may determine that the processing target frame is the reset target frame when the processing target frame is a key frame.
- the reset determination unit may determine that the processing target frame is the reset target frame when the processing target frame is a frame corresponding to a scene change.
- the reset determination unit may determine that the processing target frame is the reset target frame when the processing target frame is a key frame corresponding to a scene change.
- the reset determination unit may determine that the processing target frame is the reset target frame when the processing target frame is the leading frame of the plurality of frames.
- the reset determination unit may determine that the processing target frame is the reset target frame for each predetermined frame.
- The apparatus may further include a feature map generation unit that has a second machine learning model different from the first machine learning model and generates an alternative feature map using the processing target frame and a frame after the processing target frame.
- the high-resolution processing unit may generate the high-resolution frame and the feature map using the alternative feature map when the processing target frame is a reset target frame.
- An image processing apparatus is provided that performs resolution enhancement on a moving image sequence composed of a plurality of frames, the apparatus including: a difficulty level estimation unit that estimates the resolution enhancement difficulty of a processing target frame among the plurality of frames; and a plurality of resolution enhancement processing units having machine learning models of different computational cost, wherein the resolution enhancement processing unit whose machine learning model corresponds to the estimated difficulty of the processing target frame generates a high-resolution frame of the processing target frame using the processing target frame and frames before and/or after it.
- The higher the resolution enhancement difficulty, the larger the computational cost of the machine learning model of the resolution enhancement processing unit that generates the high-resolution frame may be.
- The difficulty level estimation unit may estimate the resolution enhancement difficulty of the processing target frame when the processing target frame is a key frame.
- In that case, when the processing target frame is a key frame, the resolution enhancement processing unit having the machine learning model corresponding to the difficulty estimated by the difficulty level estimation unit generates the high-resolution frame, and
- when the processing target frame is not a key frame, the resolution enhancement processing unit that enhanced the frame one frame before the processing target frame may also enhance the processing target frame.
- An image processing method is provided for performing resolution enhancement on a moving image sequence composed of a plurality of frames, the method including: determining whether a processing target frame among the plurality of frames is a reset target frame; when the processing target frame is not a reset target frame, generating, by a resolution enhancement processing unit using a machine learning model, a high-resolution frame of the processing target frame and a feature map using the processing target frame, frames before and/or after the processing target frame, and the feature map generated when the frame preceding the processing target frame was enhanced; and when the processing target frame is a reset target frame, generating the high-resolution frame and the feature map using the processing target frame and the frames before and/or after it, without using the feature map generated when the preceding frame was enhanced.
- An image processing method is provided for performing resolution enhancement on a moving image sequence composed of a plurality of frames, the method including: estimating the resolution enhancement difficulty of a processing target frame among the plurality of frames; and generating, by a resolution enhancement processing unit using the machine learning model that corresponds to the estimated difficulty from among a plurality of machine learning models of different computational cost, a high-resolution frame of the processing target frame using the processing target frame and frames before and/or after it.
- FIG. 1A is a diagram schematically showing recurrent resolution enhancement processing using a CNN.
- FIG. 1B is a block diagram schematically showing the configuration of an image processing apparatus that performs the recurrent resolution enhancement processing using the CNN shown in FIG. 1A.
- FIG. 2 is a block diagram schematically showing the configuration of an image processing apparatus according to a first embodiment.
- FIG. 3 is a flowchart showing an example of processing operations of the image processing apparatus of FIG. 2.
- FIG. 4 is a block diagram schematically showing the configuration of an image processing apparatus according to a second embodiment.
- FIG. 5 is a flowchart showing an example of processing operations of the image processing apparatus of FIG. 4.
- FIG. 6 is a block diagram schematically showing the configuration of an image processing apparatus according to a third embodiment.
- FIG. 7 is a flowchart showing an example of processing operations of the image processing apparatus of FIG. 6.
- FIG. 1A is a diagram schematically showing a recurrent-method high-resolution processing using CNN, and shows how a low-resolution video sequence X is increased in resolution to generate a high-resolution video sequence Y.
- Note that "high resolution" in this specification is relative to "low resolution" and does not mean that the number of pixels is greater than or equal to a specific number.
- A processing module for registration, feature extraction, and reconstruction is configured using a CNN. The previous frame x(t-1) and/or the subsequent frame x(t+1) are used to enhance the resolution of the low-resolution frame x(t). Below, the low-resolution frame x(t) being enhanced may be referred to as the "processing target frame x(t)". Note that the processing module may use only one of the preceding and subsequent frames, or may use two or more preceding and/or subsequent frames.
- In the recurrent method, the feature map h(t-1) generated when the previous frame x(t-1) was enhanced is also used. Using the feature map h(t-1) improves the accuracy of resolution enhancement. Further, the high-resolution frame y(t-1) obtained by enhancing the previous frame x(t-1) may also be used.
- The processing module applies the CNN and outputs a high-resolution frame y(t), obtained by enhancing the processing target frame x(t), together with a feature map h(t).
- The feature map h(t) is used when the resolution of the subsequent frame x(t+1) is increased.
- The processing module recursively and sequentially applies the above processing to each of the low-resolution frames x(1) to x(n) constituting the low-resolution video sequence X, whereby the high-resolution video sequence Y is obtained.
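The recurrence described above can be illustrated with a minimal Python sketch. It is not the patent's implementation: `toy_model` is a hypothetical stand-in for the CNN (plain 2x pixel repetition plus a running-mean state), frames are flat lists of floats, and the names `recurrent_upscale` and `h0` are assumptions made for illustration.

```python
from typing import Callable, List, Optional, Sequence, Tuple

Frame = List[float]       # stand-in for a low- or high-resolution frame
FeatureMap = List[float]  # stand-in for the feature map h(t)
Model = Callable[[Optional[Frame], Frame, Optional[Frame], FeatureMap],
                 Tuple[Frame, FeatureMap]]


def toy_model(x_prev: Optional[Frame], x_t: Frame, x_next: Optional[Frame],
              h_prev: FeatureMap) -> Tuple[Frame, FeatureMap]:
    """Hypothetical stand-in for the CNN: 2x pixel repetition and a running-mean state."""
    y_t = [v for v in x_t for _ in range(2)]
    h_t = [0.5 * h + 0.5 * v for h, v in zip(h_prev, x_t)]
    return y_t, h_t


def recurrent_upscale(frames: Sequence[Frame], model: Model,
                      h0: FeatureMap) -> Tuple[List[Frame], FeatureMap]:
    """Apply the model to x(1)..x(n) in order, feeding h(t-1) into step t."""
    outputs: List[Frame] = []
    h = h0  # initialized feature map h(0), e.g. all zeros
    for t, x_t in enumerate(frames):
        x_prev = frames[t - 1] if t > 0 else None
        x_next = frames[t + 1] if t + 1 < len(frames) else None
        y_t, h = model(x_prev, x_t, x_next, h)  # h(t) replaces h(t-1)
        outputs.append(y_t)
    return outputs, h
```

Because each step consumes the state produced by the previous one, any error in h(t-1) is carried into every later frame, which is the property the embodiments below address.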
- FIG. 1B is a block diagram schematically showing the schematic configuration of an image processing apparatus that performs a recurrent method resolution enhancement process using the CNN shown in FIG. 1A.
- This image processing apparatus includes a resolution enhancement processing unit 100 having a CNN 10.
- The CNN 10 performs resolution enhancement on each of the low-resolution frames x(1) to x(n) forming the moving image sequence X and outputs high-resolution frames y(1) to y(n). The CNN 10 includes a plurality of weighting parameters whose values, suitable for resolution enhancement, are learned and set in advance.
- When enhancing the processing target frame x(t), the CNN 10 uses the feature map h(t-1) output from an intermediate layer of the CNN 10 during the enhancement of the previous frame x(t-1).
- The method of Non-Patent Document 3 has the following problems.
- For the first frame, an initialized feature map h(0) (for example, the constant 0) must be used. As a result, the accuracy of resolution enhancement may be lowered for the first frame and several subsequent frames.
- An object of the present invention is to solve at least part of these problems.
- The first embodiment mainly addresses problems (1) and (2), the second embodiment mainly addresses problem (3), and the third embodiment addresses problems (1) to (3).
- FIG. 2 is a block diagram showing a schematic configuration of the image processing apparatus according to the first embodiment.
- This image processing apparatus performs resolution enhancement on a moving image sequence composed of a plurality of frames, and includes a resolution enhancement processing unit 101 having a CNN 11, a reset determination unit 1, and an alternative feature map generation unit 2. Some or all of these may be realized by a processor executing a predetermined program.
- The resolution enhancement processing unit 101 uses the processing target frame x(t) and its previous frame x(t-1) and/or subsequent frame x(t+1) to generate a high-resolution frame y(t) by increasing the resolution of the processing target frame x(t). Any number of preceding and subsequent frames may be used. One or more previously generated high-resolution frames y may also be used.
- During resolution enhancement, if the processing target frame x(t) is not a reset target frame (described later), the resolution enhancement processing unit 101 uses the feature map h(t-1) generated when the previous frame x(t-1) was enhanced. On the other hand, when the processing target frame x(t) is a reset target frame, the resolution enhancement processing unit 101 does not use the feature map h(t-1), but uses an alternative feature map h' (described later).
- The reset determination unit 1 determines whether the processing target frame x(t) is a reset target frame. The determination result is notified to the resolution enhancement processing unit 101.
- The reset target frame will now be described.
- A moving image sequence contains key frames (for example, I-pictures in MPEG-compressed moving images).
- Key frames are inserted at scene changes, and are also inserted at regular intervals when no scene change occurs for a long period of time.
- The reset determination unit 1 may determine whether the processing target frame x(t) is a reset target frame based on the key frames. Information on whether each of the low-resolution frames x(1) to x(n) is a key frame is included in the video sequence X (or attached to each of the low-resolution frames x(1) to x(n)).
- For example, the reset determination unit 1 may determine that the processing target frame x(t) is a reset target frame when it is a key frame, and that it is not a reset target frame otherwise.
- Alternatively, the reset determination unit 1 may determine that the processing target frame x(t) is a reset target frame when it is one or several frames before or after a key frame.
- The reset determination unit 1 may also determine that the processing target frame x(t) is a reset target frame when it corresponds to a scene change, and that it is not a reset target frame when it does not.
- The reset determination unit 1 may determine whether a frame corresponds to a scene change using a known method.
- For example, the reset determination unit 1 may have another CNN (not shown) and determine whether the processing target frame x(t) is a scene change using several frames before and after the processing target frame x(t).
- Alternatively, the video sequence X may include information as to whether each of the low-resolution frames x(1) to x(n) corresponds to a scene change (or this information may be attached to each of the low-resolution frames x(1) to x(n)).
- The reset determination unit 1 may also determine that the processing target frame x(t) is a reset target frame when it is a key frame corresponding to a scene change, and that it is not a reset target frame otherwise (that is, when it is not a key frame or does not correspond to a scene change). Even if the processing target frame x(t) is a key frame, if it does not correspond to a scene change, using the feature map h(t-1) maintains the continuity of the resolution enhancement processing.
- The reset determination unit 1 may determine that the processing target frame x(t) is a reset target frame when it is the leading frame x(0). Further, the reset determination unit 1 may determine that a frame is a reset target frame at every predetermined number of frames (for example, at a constant interval such as the 1st frame, the 11th frame, the 21st frame, and so on). Note that a constant interval in this specification covers both a strictly constant interval and an interval that varies only slightly, to an extent that still solves the above problem.
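The alternative reset rules listed above can be summarized in a short Python sketch. The `FrameInfo` type, the policy names, and the choice to always reset on the leading frame are assumptions made for illustration; the text only says that each rule "may" be used, and frames are numbered from 1 here.

```python
from dataclasses import dataclass


@dataclass
class FrameInfo:
    index: int                     # 1-based position t in the sequence
    is_key_frame: bool = False     # e.g. an I-picture
    is_scene_change: bool = False  # from metadata or a separate detector


def is_reset_target(info: FrameInfo, policy: str = "key_scene_change",
                    interval: int = 10) -> bool:
    """Return True if the frame should be treated as a reset target frame."""
    if info.index == 1:
        return True  # assumption: the leading frame is always reset
    if policy == "key_frame":
        return info.is_key_frame
    if policy == "scene_change":
        return info.is_scene_change
    if policy == "key_scene_change":
        return info.is_key_frame and info.is_scene_change
    if policy == "periodic":
        # 1st, 11th, 21st, ... frame when interval == 10
        return (info.index - 1) % interval == 0
    raise ValueError(f"unknown policy: {policy}")
```

With the default policy, a key frame that does not coincide with a scene change keeps using h(t-1), which matches the continuity argument made above.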
- the alternative feature map generation unit 2 generates an alternative feature map h' that is used when the processing target frame x(t) is the reset target frame.
- the generated alternative feature map h′ is input to the resolution enhancement processing unit 101 .
- the alternative feature map h' may be a predetermined constant. However, it is desirable that the alternative feature map generator 2 has a CNN 21 separate from the CNN 11 and generates the alternative feature map h' from the low resolution frames. That is, the CNN 21 includes a plurality of weighting parameters, and weighting parameter values suitable for generating a suitable alternative feature map h' are learned and set in advance.
- the alternative feature map generation unit 2 may generate the alternative feature map h' from the processing target frame x(t) and/or the subsequent frame.
- The number of subsequent frames to be used is not particularly limited and is, for example, about 5 frames.
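As a rough illustration of building h' from the processing target frame and a few subsequent frames, the sketch below replaces the CNN 21 with a per-pixel mean over the window. That substitution, the function name `alternative_feature_map`, and the 0-based index `t` are assumptions for illustration only; a learned model would produce a richer feature map.

```python
from typing import List, Sequence

Frame = List[float]


def alternative_feature_map(frames: Sequence[Frame], t: int,
                            n_future: int = 5) -> List[float]:
    """Build h' from x(t) and up to n_future subsequent frames (0-based t).

    The per-pixel mean is a toy stand-in for the learned CNN 21.
    """
    window = frames[t:t + 1 + n_future]  # shorter near the end of the sequence
    n = len(window)
    return [sum(pixels) / n for pixels in zip(*window)]
```

Only the current and later frames are read, so the result does not depend on anything computed before the reset.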
- FIG. 3 is a flowchart showing an example of the processing operation of the image processing apparatus of FIG. 2.
- The resolution enhancement processing unit 101 sets one of the low-resolution frames x(1) to x(n) as the processing target frame x(t) (step S1).
- the reset determination unit 1 determines whether or not the processing target frame x(t) is the reset target frame (step S2).
- If the processing target frame x(t) is not a reset target frame (NO in step S2), the resolution enhancement processing unit 101 uses the processing target frame x(t), its previous frame x(t-1) and/or subsequent frame x(t+1), and the feature map h(t-1) generated when the previous frame x(t-1) was enhanced, to generate a high-resolution frame y(t) by increasing the resolution of the processing target frame x(t) (step S3a). At this time, a feature map h(t) is also generated.
- If the processing target frame x(t) is a reset target frame (YES in step S2), the resolution enhancement processing unit 101 uses the alternative feature map h' instead of the feature map h(t-1), and a high-resolution frame y(t) is generated by increasing the resolution of the processing target frame x(t) (step S3b). At this time, a feature map h(t) is also generated.
- the generated high resolution frame y(t) may be displayed on a display.
- Once the end condition checked in step S4 is satisfied, the processing operation of the image processing apparatus ends (YES in step S4).
- the resolution enhancement processing unit 101 may generate a high resolution video sequence Y from the high resolution frames y(1) to y(n).
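The flow of FIG. 3 as described above can be sketched as a loop that chooses between h(t-1) and h' at each step. This is a simplified illustration under stated assumptions: `enhance` is a toy stand-in for the CNN 11, neighbouring frames are omitted for brevity, `copy_frame_as_alt` is a hypothetical placeholder for the alternative feature map generation unit 2, and the first frame is treated as a reset target because no h(t-1) exists yet.

```python
from typing import Callable, List, Optional, Sequence, Tuple

Frame = List[float]
FeatureMap = List[float]


def enhance(x_t: Frame, h_in: FeatureMap) -> Tuple[Frame, FeatureMap]:
    """Toy stand-in for the CNN 11: 2x pixel repetition and a running-mean state."""
    y_t = [v for v in x_t for _ in range(2)]
    h_t = [0.5 * h + 0.5 * v for h, v in zip(h_in, x_t)]
    return y_t, h_t


def copy_frame_as_alt(frames: Sequence[Frame], t: int) -> FeatureMap:
    """Hypothetical placeholder for the alternative feature map generator."""
    return list(frames[t])


def run_with_reset(frames: Sequence[Frame], reset_flags: Sequence[bool],
                   make_alt: Callable[[Sequence[Frame], int], FeatureMap]
                   ) -> Tuple[List[Frame], List[FeatureMap]]:
    outputs: List[Frame] = []
    feature_maps: List[FeatureMap] = []
    h_prev: Optional[FeatureMap] = None
    for t, x_t in enumerate(frames):          # step S1: pick x(t)
        if reset_flags[t] or h_prev is None:  # step S2: reset target?
            h_in = make_alt(frames, t)        # step S3b: use h', discard h(t-1)
        else:
            h_in = h_prev                     # step S3a: use h(t-1)
        y_t, h_prev = enhance(x_t, h_in)
        outputs.append(y_t)
        feature_maps.append(h_prev)
    return outputs, feature_maps
```

In this sketch, a reset at frame t makes h(t) independent of every frame before t, so an error accumulated in the state cannot propagate past the reset point.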
- As described above, the first embodiment provides the reset determination unit 1.
- When the processing target frame x(t) is a reset target frame, the feature map h(t-1) generated for the previous frame is not used. As a result, it is possible to prevent errors from accumulating and propagating in the feature map, and to prevent a decrease in accuracy when enhancing the low-resolution frames in the latter half of the sequence.
- By setting key frames as reset target frames, the feature map can be reset periodically. Also, by setting the key frame corresponding to a scene change as the reset target frame, the continuity of the high-resolution frames y(t) can be maintained.
- The accuracy of resolution enhancement can thereby be improved.
- Furthermore, by setting the first frame as a reset target frame and using the alternative feature map h', it is possible to improve the accuracy of resolution enhancement for the first frame and several subsequent frames.
- FIG. 4 is a block diagram schematically showing the schematic configuration of an image processing apparatus according to the second embodiment.
- This image processing apparatus includes a resolution enhancement processing unit 102 and a difficulty estimation unit 3 . A part or all of these may be realized by a processor executing a predetermined program. The following description will focus on differences from the first embodiment.
- The difficulty estimation unit 3 estimates the difficulty of increasing the resolution of the processing target frame x(t). The difficulty level estimation unit 3 of the present embodiment is assumed to make a binary determination as to whether the difficulty of the processing target frame x(t) is high or low. A known technique may be used for the determination.
- The difficulty level estimation unit 3 may determine the difficulty based on the magnitude of motion in the time direction in the processing target frame x(t). More specifically, the processing target frame x(t) is compared with one or more previous and/or subsequent frames x, and if the difference between the frames is large, the difficulty estimation unit 3 may estimate that the resolution enhancement difficulty of the processing target frame x(t) is high.
- the difficulty level estimation unit 3 may determine the difficulty level based on the frequency components included in the processing target frame x(t). More specifically, when the processing target frame x(t) contains many high-frequency components, the difficulty level estimation unit 3 may estimate that the resolution enhancement difficulty of the processing target frame x(t) is high.
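The two heuristics above (inter-frame difference and high-frequency content) could be approximated as in the following Python sketch. The scores, the use of horizontal pixel gradients as a proxy for high-frequency components, and both threshold values are assumptions for illustration and are not taken from the text. Frames are lists of rows and must be at least two pixels wide.

```python
from typing import List

Frame = List[List[float]]  # rows of pixel intensities


def motion_score(x_t: Frame, x_ref: Frame) -> float:
    """Mean absolute difference between x(t) and a neighbouring frame."""
    diffs = [abs(a - b)
             for row_t, row_r in zip(x_t, x_ref)
             for a, b in zip(row_t, row_r)]
    return sum(diffs) / len(diffs)


def detail_score(x_t: Frame) -> float:
    """Mean absolute horizontal gradient, a crude proxy for high-frequency content."""
    grads = [abs(row[i + 1] - row[i])
             for row in x_t
             for i in range(len(row) - 1)]
    return sum(grads) / len(grads)


def is_high_difficulty(x_t: Frame, x_ref: Frame,
                       motion_thresh: float = 10.0,
                       detail_thresh: float = 10.0) -> bool:
    """Binary decision: high difficulty if either score exceeds its threshold."""
    return (motion_score(x_t, x_ref) > motion_thresh
            or detail_score(x_t) > detail_thresh)
```

A Fourier or wavelet transform would measure frequency content more faithfully; the gradient is used here only to keep the sketch dependency-free.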
- the difficulty estimation unit 3 has a CNN 31 and determines the difficulty of increasing the resolution of the processing target frame x(t). That is, the CNN 31 includes a plurality of weighting parameters, and the values of the weighting parameters are learned and set in advance so that the degree of difficulty in increasing the resolution can be appropriately determined.
- the high-resolution processing unit 102 has a high-difficulty high-resolution processing unit 103 having a CNN 13 and a low-difficulty high-resolution processing unit 104 having a CNN 14 as a plurality of high-resolution processing units.
- Each of the high-difficulty resolution enhancement processing unit 103 and the low-difficulty resolution enhancement processing unit 104 enhances the resolution of the processing target frame x(t) to produce a high-resolution frame y(t) and a feature map h(t).
- The CNN 13 and the CNN 14 differ in size. Specifically, the CNN 13 is larger than the CNN 14; that is, the CNN 13 is computationally more expensive than the CNN 14 (e.g., it has more layers or more weight parameters). In other words, the CNN 13 enhances resolution more accurately than the CNN 14 but processes more slowly.
- It is therefore preferable to apply the CNN 13 to a processing target frame x(t) whose resolution enhancement difficulty is high, even if this takes some processing time.
- Accordingly, when the difficulty level estimation unit 3 estimates that the resolution enhancement difficulty of the processing target frame x(t) is high, the high-difficulty resolution enhancement processing unit 103 performs the resolution enhancement, and when the difficulty is estimated to be low, the low-difficulty resolution enhancement processing unit 104 performs it. That is, the CNNs 13 and 14 of different sizes are applied adaptively according to the resolution enhancement difficulty of the processing target frame x(t).
- The CNNs 13 and 14 may be switched in units of frames by determining the difficulty for each processing target frame x(t), or they may be switched less frequently. This is because the resolution enhancement difficulty is not expected to change significantly between adjacent frames.
- FIG. 5 is a flowchart showing an example of the processing operation of the image processing apparatus of FIG. 4.
- the resolution enhancement processing unit 102 sets one of the low resolution frames x(1) to x(n) as the processing target frame x(t) (step S11).
- the difficulty level estimation unit 3 determines whether or not the processing target frame x(t) is a switching target frame (step S12).
- the difficulty level estimation unit 3 may set the processing target frame x(t), which is a key frame, as the switching target frame.
- the difficulty estimation unit 3 may set the processing target frame x(t), which is the key frame corresponding to the scene change, as the switching target frame, or may set all the frames as the switching target frame.
- If the processing target frame x(t) is a switching target frame (YES in step S12), the difficulty level estimation unit 3 estimates the resolution enhancement difficulty of the processing target frame x(t) (step S13). In this embodiment, the difficulty level estimation unit 3 determines whether the resolution enhancement difficulty of the processing target frame x(t) is high or low.
- The specific estimation method may be any of those described above. When key frames are used as switching target frames, the average resolution enhancement difficulty may be estimated over some or all of the frames between the current key frame and the next key frame, and the result may be used as the resolution enhancement difficulty of the processing target frame x(t).
- According to the estimated difficulty, the resolution enhancement processing unit 102 determines which of the high-difficulty resolution enhancement processing unit 103 and the low-difficulty resolution enhancement processing unit 104 is to be applied (step S14). Specifically, the high-difficulty resolution enhancement processing unit 103 is applied to a processing target frame x(t) whose difficulty is determined to be high, and the low-difficulty resolution enhancement processing unit 104 is applied to a processing target frame x(t) whose difficulty is determined to be low.
- the high-difficulty level high-resolution processing unit 103 or the low-difficulty level high-resolution processing unit 104 that has been determined performs the high-resolution processing for the processing target frame x(t) (step S15).
- The generated high-resolution frame y(t) may be displayed on a display. Note that the resolution enhancement processing unit applied to the processing target frame x(t) may differ from the one applied to the previous frame x(t-1). If, in that case, the feature map h(t-1) generated when the previous frame x(t-1) was enhanced cannot be used for enhancing the processing target frame x(t), a substitute such as a constant feature map or an alternative feature map may be used.
- If the processing target frame x(t) is not a switching target frame (NO in step S12), whichever of the high-difficulty resolution enhancement processing unit 103 and the low-difficulty resolution enhancement processing unit 104 processed the previous frame x(t-1) performs the resolution enhancement of the processing target frame x(t) (step S15).
- In other words, the unit that was selected according to the resolution enhancement difficulty of the most recent switching target frame before the processing target frame x(t) performs the resolution enhancement of the processing target frame x(t).
- the generated high resolution frame y(t) may be displayed on a display.
- The resolution enhancement processing unit 102 may generate a high-resolution video sequence Y from the high-resolution frames y(1) to y(n).
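The selection logic of FIG. 5 (steps S11 to S15) can be sketched as follows. Only the choice of unit is modelled, not the enhancement itself. The labels `"unit_103"` and `"unit_104"`, the `is_hard` callback standing in for the difficulty estimation unit 3, and the rule that the very first frame always triggers a selection are assumptions for illustration.

```python
from typing import Callable, List, Optional, Sequence


def choose_units(is_switch_target: Sequence[bool],
                 is_hard: Callable[[int], bool]) -> List[str]:
    """Return, per frame, which processing unit would enhance it."""
    chosen: List[str] = []
    current: Optional[str] = None
    for t, switch in enumerate(is_switch_target):  # step S11: pick x(t)
        if switch or current is None:              # step S12: switching target?
            # steps S13 and S14: estimate difficulty and select a unit
            current = "unit_103" if is_hard(t) else "unit_104"
        chosen.append(current)                     # step S15: selected unit runs
    return chosen
```

For example, with key frames at positions 0 and 3, the choice made at frame 0 is reused for frames 1 and 2, and difficulty is only re-estimated at frame 3.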
- When key frames are used as switching target frames, the same one of the resolution enhancement processing unit 103 and the resolution enhancement processing unit 104 is applied from one key frame up to the frame before the next key frame.
- the difficulty level estimation unit 3 estimates the difficulty level of high resolution in two stages (high or low difficulty level), but it may be estimated in three or more stages.
- In that case, the resolution enhancement processing unit 102 has three or more resolution enhancement processing units having CNNs of different sizes, and the higher the resolution enhancement difficulty, the larger the CNN of the resolution enhancement processing unit that increases the resolution of the processing target frame x(t).
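Extending the binary choice to three or more levels amounts to mapping a difficulty score onto an ordered list of models. The sketch below assumes a scalar difficulty score and ascending threshold values, both hypothetical, and uses the standard-library `bisect_right` to find the level.

```python
from bisect import bisect_right
from typing import Sequence


def select_unit(difficulty: float, thresholds: Sequence[float]) -> int:
    """Map a difficulty score to a unit index.

    thresholds must be sorted in ascending order. Index 0 is the unit with
    the smallest CNN; index len(thresholds) is the unit with the largest.
    """
    return bisect_right(thresholds, difficulty)
```

With thresholds `[0.3, 0.7]`, scores below 0.3 go to the smallest model, scores from 0.3 up to but not including 0.7 go to the middle one, and scores of 0.7 or more go to the largest.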
- As described above, in the second embodiment a plurality of resolution enhancement processing units having CNNs of different sizes are provided, and resolution enhancement is performed using a CNN whose size corresponds to the resolution enhancement difficulty of the processing target frame x(t). As a result, both accuracy and processing speed can be achieved, and the efficiency of resolution enhancement processing is improved.
- This embodiment is not limited to recurrent resolution enhancement processing, and can also be applied to cases in which information generated when a previous frame is enhanced is not used for enhancing the next frame.
- a third embodiment described below is a combination of the first and second embodiments.
- FIG. 6 is a block diagram schematically showing the schematic configuration of an image processing apparatus according to the third embodiment. The function of each part is as described in the first and second embodiments.
- FIG. 7 is a flow chart showing an example of the processing operation of the image processing apparatus of FIG.
- the difficulty estimation unit 3 determines that the processing target frame is the switching target frame.
- the reset determination unit 1 determines that the processing target frame is the reset target frame.
- the high-resolution processing unit 102 sets one of the low-resolution frames x(1) to x(n) as the processing target frame x(t) (step S21). Then, the difficulty estimation unit 3 determines whether or not the processing target frame x(t) is a switching target frame (in this example, whether or not it is a key frame) (step S22).
- If the processing target frame x(t) is not the switching target frame (NO in step S22), the process proceeds to step S26a.
- If the processing target frame x(t) is the switching target frame, the difficulty estimation unit 3 estimates the resolution enhancement difficulty of the processing target frame x(t) (step S23). In this embodiment, the difficulty estimation unit 3 determines whether the resolution enhancement difficulty of the processing target frame x(t) is high or low.
- The resolution enhancement processing unit 102 determines which of the high-difficulty resolution enhancement processing unit 103 and the low-difficulty resolution enhancement processing unit 104 is to be applied (step S24). Specifically, the high-difficulty resolution enhancement processing unit 103 is applied to a processing target frame x(t) whose resolution enhancement difficulty is determined to be high, and the low-difficulty resolution enhancement processing unit 104 is applied to a processing target frame x(t) whose resolution enhancement difficulty is determined to be low.
- The reset determination unit 1 determines whether or not the processing target frame x(t) is a reset target frame (in this example, whether or not it is a key frame corresponding to a scene change) (step S25).
- If the processing target frame x(t) is not the reset target frame (NO in step S25), the resolution of the processing target frame x(t) is enhanced using the processing target frame x(t), the previous frame x(t-1) and/or the subsequent frame x(t+1) of the processing target frame, and the feature map h(t-1) generated when the previous frame x(t-1) was subjected to resolution enhancement processing, thereby generating a high-resolution frame y(t) and a feature map h(t) (step S26a).
- the generated high resolution frame y(t) may be displayed on a display.
- If the processing target frame x(t) is the reset target frame, whichever of the high-difficulty resolution enhancement processing unit 103 and the low-difficulty resolution enhancement processing unit 104 was determined in step S24 uses the processing target frame x(t), the previous frame x(t-1) and/or the subsequent frame x(t+1) of the processing target frame, and the alternative feature map h′ generated by the alternative feature map generation unit 2 (without using the feature map h(t-1)) to generate a high-resolution frame y(t) and a feature map h(t) by enhancing the resolution of the processing target frame x(t) (step S26b).
- the generated high resolution frame y(t) may be displayed on a display.
- The resolution enhancement processing unit 102 may generate a high resolution video sequence Y from the high resolution frames y(1) to y(n) (step S27).
- the reset determination unit 1 is provided to suppress the accumulation and propagation of errors
- the difficulty level estimation unit 3 is provided to increase the efficiency of the resolution enhancement process. Note that the items described in the first and second embodiments may also be applied to the third embodiment as appropriate.
- Each resolution enhancement processing unit may perform processing using an arbitrary machine learning model, for example a neural network, and more specifically a CNN.
- The program referred to in this specification may be recorded non-transitorily on a computer-readable recording medium and distributed, may be distributed via a communication line (including wireless communication) such as the Internet, or may be distributed in a state of being installed on any terminal.
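The flow of steps S21 to S27 described above can be illustrated with a minimal Python sketch. It is not the implementation of the application: frames and feature maps are plain lists, the names `process_sequence` and `toy_unit` and the callback signatures are hypothetical, and the sketch assumes that the sequence starts with a key frame and that an alternative feature map is also used whenever no h(t-1) exists yet. It also ignores the case in which h(t-1) produced by one unit cannot be consumed by the other unit.

```python
from typing import Callable, Dict, List, Optional, Sequence, Tuple

Frame = List[float]        # stand-in for a low-resolution frame x(t)
FeatureMap = List[float]   # stand-in for a feature map h(t)
Unit = Callable[[Optional[Frame], Frame, Optional[Frame], FeatureMap],
                Tuple[Frame, FeatureMap]]


def process_sequence(
    frames: Sequence[Frame],
    is_key_frame: Callable[[int], bool],          # hypothetical key-frame test
    is_scene_change: Callable[[int], bool],       # hypothetical scene-change test
    estimate_difficulty: Callable[[Frame], str],  # returns a key of `units`
    units: Dict[str, Unit],                       # e.g. {"high": ..., "low": ...}
    alt_feature_map: Callable[[Frame, Optional[Frame]], FeatureMap],
) -> Tuple[List[Frame], List[Tuple[str, str]]]:
    """Return the high-resolution frames and a (unit, step) trace per frame."""
    outputs: List[Frame] = []
    trace: List[Tuple[str, str]] = []
    unit_name: Optional[str] = None
    h_prev: Optional[FeatureMap] = None
    for t, x_t in enumerate(frames):                      # S21: pick x(t)
        x_prev = frames[t - 1] if t > 0 else None
        x_next = frames[t + 1] if t + 1 < len(frames) else None
        reset = False
        if is_key_frame(t):                               # S22: switching target?
            unit_name = estimate_difficulty(x_t)          # S23 and S24
            reset = is_scene_change(t)                    # S25: reset target?
        if unit_name is None:
            # Assumption of this sketch, not stated in the source.
            raise ValueError("the sequence must start with a key frame")
        if reset or h_prev is None:
            h_in, step = alt_feature_map(x_t, x_next), "S26b"
        else:
            h_in, step = h_prev, "S26a"
        y_t, h_prev = units[unit_name](x_prev, x_t, x_next, h_in)
        outputs.append(y_t)
        trace.append((unit_name, step))
    return outputs, trace                                 # S27: sequence Y


def toy_unit(x_prev, x_t, x_next, h_in):
    """Toy unit: repeats each sample twice and passes the feature map through."""
    return [v for v in x_t for _ in range(2)], h_in
```

In practice the two entries of `units` would wrap models of different size; here the same toy function stands in for both so that only the control flow is shown.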
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Multimedia (AREA)
- Signal Processing (AREA)
- Image Analysis (AREA)
- Television Systems (AREA)
- Image Processing (AREA)
Abstract
Description
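First, the recurrent resolution enhancement processing using a CNN (Non-Patent Document 3), on which the embodiments are premised, will be described.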
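FIG. 2 is a block diagram showing the schematic configuration of an image processing apparatus according to the first embodiment. This image processing apparatus performs resolution enhancement processing on a video sequence composed of a plurality of frames and includes, in addition to a resolution enhancement processing unit 101 having a CNN 11, a reset determination unit 1 and an alternative feature map generation unit 2. Some or all of these may be realized by a processor executing a predetermined program.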
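As a rough illustration of how the three components of the first embodiment could interact, the following Python sketch runs a recurrent loop in which the feature map h(t-1) is replaced by an alternative feature map whenever the frame is a reset target. The function names, the callback signatures and the list-based frames are assumptions made for this sketch. It further assumes that the first frame is always treated as a reset target because no h(t-1) exists yet.

```python
from typing import Callable, List, Optional, Sequence, Tuple

Frame = List[float]        # stand-in for a low-resolution frame x(t)
FeatureMap = List[float]   # stand-in for a feature map h(t)


def enhance_sequence(
    frames: Sequence[Frame],
    is_reset_target: Callable[[int], bool],
    sr_model: Callable[[Optional[Frame], Frame, Optional[Frame], FeatureMap],
                       Tuple[Frame, FeatureMap]],
    alt_feature_model: Callable[[Frame, Optional[Frame]], FeatureMap],
) -> List[Frame]:
    """Recurrent resolution enhancement with feature-map reset (sketch)."""
    outputs: List[Frame] = []
    h_prev: Optional[FeatureMap] = None
    for t, x_t in enumerate(frames):
        x_prev = frames[t - 1] if t > 0 else None
        x_next = frames[t + 1] if t + 1 < len(frames) else None
        if h_prev is None or is_reset_target(t):
            # Reset: do not use h(t-1); build an alternative feature map
            # from x(t) and the subsequent frame instead.
            h_in = alt_feature_model(x_t, x_next)
        else:
            h_in = h_prev
        y_t, h_prev = sr_model(x_prev, x_t, x_next, h_in)
        outputs.append(y_t)
    return outputs


def toy_sr_model(x_prev, x_t, x_next, h_in):
    """Toy model: repeats each sample twice; the new feature map is h_in + x(t)."""
    y = [v for v in x_t for _ in range(2)]
    h = [a + b for a, b in zip(h_in, x_t)]
    return y, h


def toy_alt_feature_model(x_t, x_next):
    """Toy alternative feature map: all zeros, same length as x(t)."""
    return [0.0] * len(x_t)
```

Resetting cuts the chain through which an error in one feature map would otherwise be carried into every later frame, which is the stated purpose of the reset determination unit.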
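… in that case, the processing target frame x(t) may be determined to be a reset target frame, and if it is not a key frame, it may be determined not to be a reset target frame.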
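The claims list several alternative criteria for the reset determination: key frame, scene change, key frame corresponding to a scene change, first frame, and every predetermined number of frames. A small Python sketch of these criteria as interchangeable predicates is given below. The `FrameInfo` record and all function names are hypothetical, and how the key-frame and scene-change flags are obtained is outside the scope of the sketch.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class FrameInfo:
    """Hypothetical per-frame metadata assumed to be available to the reset determination."""
    index: int                    # 0-based position in the video sequence
    is_key_frame: bool = False
    is_scene_change: bool = False


def reset_on_key_frame(f: FrameInfo) -> bool:
    return f.is_key_frame


def reset_on_scene_change(f: FrameInfo) -> bool:
    return f.is_scene_change


def reset_on_scene_change_key_frame(f: FrameInfo) -> bool:
    return f.is_key_frame and f.is_scene_change


def reset_on_first_frame(f: FrameInfo) -> bool:
    return f.index == 0


def make_periodic_reset(period: int) -> Callable[[FrameInfo], bool]:
    """Reset once every `period` frames (one reading of 'every predetermined number of frames')."""
    if period <= 0:
        raise ValueError("period must be positive")
    return lambda f: f.index % period == 0
```

Because each criterion has the same signature, the recurrent loop can take any of them, or a combination such as `lambda f: reset_on_first_frame(f) or reset_on_scene_change(f)`, without other changes.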
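FIG. 4 is a block diagram schematically showing the configuration of an image processing apparatus according to the second embodiment. This image processing apparatus includes a resolution enhancement processing unit 102 and a difficulty estimation unit 3. Some or all of these may be realized by a processor executing a predetermined program. The following description focuses on the differences from the first embodiment.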
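How the difficulty is estimated is not specified in this excerpt. Purely as an assumption, the sketch below supposes that the difficulty estimation unit produces a numeric score and that the score is quantized into two or more levels by thresholds, with a higher level mapped to a unit with a larger model. The function names, the unit labels and the threshold values are hypothetical.

```python
from bisect import bisect_right
from typing import Sequence


def difficulty_level(score: float, thresholds: Sequence[float]) -> int:
    """Quantize a score into len(thresholds) + 1 levels, where 0 is the easiest."""
    return bisect_right(sorted(thresholds), score)


def select_unit(level: int, units_by_size: Sequence[str]) -> str:
    """Pick a unit for a level; `units_by_size` is ordered from smallest to largest model."""
    if not units_by_size:
        raise ValueError("at least one unit is required")
    return units_by_size[min(level, len(units_by_size) - 1)]


# Two-level case: one threshold, low-difficulty unit 104 and high-difficulty unit 103.
TWO_LEVEL_UNITS = ["unit_104_low_difficulty", "unit_103_high_difficulty"]
```

With three or more levels, only the threshold list and the unit list grow; the selection logic stays the same.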
… or the resolution enhancement processing unit 104 is applied.
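The behavior in which one unit stays in effect from a key frame until the frame before the next key frame can be sketched in Python as follows. The function `assign_units`, its callbacks and the unit labels are hypothetical. The sketch also assumes that the first frame triggers a difficulty estimate even if it is not flagged as a key frame.

```python
from typing import Callable, List, Optional, Sequence, TypeVar

F = TypeVar("F")


def assign_units(
    frames: Sequence[F],
    is_key_frame: Callable[[int], bool],
    estimate_difficulty: Callable[[F], str],   # returns "high" or "low"
) -> List[str]:
    """Return, per frame, which unit would process it (sketch)."""
    assigned: List[str] = []
    current: Optional[str] = None
    for t, frame in enumerate(frames):
        if current is None or is_key_frame(t):
            # Difficulty is estimated only at switching target frames.
            level = estimate_difficulty(frame)
            current = "unit_103" if level == "high" else "unit_104"
        # Non-key frames reuse the unit that processed the previous frame.
        assigned.append(current)
    return assigned
```

Estimating the difficulty only at key frames keeps the cost of the estimation itself low and avoids switching models in the middle of a run of similar frames.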
The third embodiment described next is a combination of the first embodiment and the second embodiment.
10, 11, 13, 14, 21, 31 CNN
1 Reset determination unit
2 Alternative feature map generation unit
3 Difficulty estimation unit
Claims (13)
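- An image processing method for performing resolution enhancement processing on a video sequence composed of a plurality of frames, the method comprising:
a step of determining whether or not a processing target frame among the plurality of frames is a key frame;
(i) when the processing target frame is determined to be a key frame,
a step of estimating a resolution enhancement difficulty of the processing target frame;
a step of determining, from among a plurality of machine learning models that differ from one another in amount of computation, a machine learning model corresponding to the estimated resolution enhancement difficulty;
a step of determining whether or not the processing target frame is a key frame corresponding to a scene change;
a step in which, when the processing target frame is not a key frame corresponding to a scene change, an image quality enhancement processing unit, using the determined machine learning model, generates a high-resolution frame obtained by enhancing the resolution of the processing target frame, using the processing target frame, a frame before and/or after the processing target frame, and a feature map generated when the frame before the processing target frame was subjected to resolution enhancement processing; and
a step in which, when the processing target frame is a key frame corresponding to a scene change,
an alternative feature map is generated using the processing target frame and a frame after the processing target frame, and
the image quality enhancement processing unit, using the determined machine learning model, generates a high-resolution frame obtained by enhancing the resolution of the processing target frame, using the processing target frame, a frame before and/or after the processing target frame, and the alternative feature map; and
(ii) when the processing target frame is determined not to be a key frame, a step in which the image quality enhancement processing unit, using the determined machine learning model, generates a high-resolution frame obtained by enhancing the resolution of the processing target frame, using the processing target frame, a frame before and/or after the processing target frame, and a feature map generated when the frame before the processing target frame was subjected to resolution enhancement processing.
- An image processing apparatus that performs resolution enhancement processing on a video sequence composed of a plurality of frames, the apparatus comprising: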
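a reset determination unit that determines whether or not a processing target frame among the plurality of frames is a reset target frame; and
a resolution enhancement processing unit that has a first machine learning model and that,
when the processing target frame is not a reset target frame, generates a high-resolution frame obtained by enhancing the resolution of the processing target frame and a feature map, using the processing target frame, a frame before and/or after the processing target frame, and a feature map generated when the frame before the processing target frame was subjected to resolution enhancement processing, and,
when the processing target frame is a reset target frame, generates the high-resolution frame and the feature map using the processing target frame and a frame before and/or after the processing target frame, but without using the feature map generated when the frame before the processing target frame was subjected to resolution enhancement processing.
- The image processing apparatus according to claim 2, wherein the reset determination unit determines whether or not the processing target frame is a reset target frame based on key frames included in the plurality of frames.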
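- The image processing apparatus according to claim 2 or 3, wherein the reset determination unit determines that the processing target frame is a reset target frame when the processing target frame is a key frame.
- The image processing apparatus according to claim 2, wherein the reset determination unit determines that the processing target frame is a reset target frame when the processing target frame is a frame corresponding to a scene change.
- The image processing apparatus according to claim 2, wherein the reset determination unit determines that the processing target frame is a reset target frame when the processing target frame is a key frame corresponding to a scene change.
- The image processing apparatus according to claim 2, wherein the reset determination unit determines that the processing target frame is a reset target frame when the processing target frame is the first frame of the plurality of frames.
- The image processing apparatus according to claim 2, wherein the reset determination unit determines that the processing target frame is a reset target frame every predetermined number of frames.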
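- The image processing apparatus according to any one of claims 2 to 8, further comprising a feature map generation unit that has a second machine learning model different from the first machine learning model and that generates an alternative feature map using the processing target frame and a frame after the processing target frame,
wherein the resolution enhancement processing unit generates the high-resolution frame and the feature map using the alternative feature map when the processing target frame is a reset target frame.
- An image processing apparatus that performs resolution enhancement processing on a video sequence composed of a plurality of frames, the apparatus comprising: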
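a difficulty estimation unit that estimates a resolution enhancement difficulty of a processing target frame among the plurality of frames when the processing target frame is a key frame; and
a plurality of resolution enhancement processing units having machine learning models that differ from one another in amount of computation,
wherein, when the processing target frame is a key frame, the resolution enhancement processing unit that, among the plurality of resolution enhancement processing units, has the machine learning model corresponding to the resolution enhancement difficulty of the processing target frame generates a high-resolution frame obtained by enhancing the resolution of the processing target frame, using the processing target frame and a frame before and/or after the processing target frame, and
when the processing target frame is not a key frame, the resolution enhancement processing unit that performed resolution enhancement processing on the frame immediately before the processing target frame performs resolution enhancement processing on the processing target frame.
- The image processing apparatus according to claim 10, wherein the higher the resolution enhancement difficulty, the larger the amount of computation of the machine learning model of the resolution enhancement processing unit that generates the high-resolution frame.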
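- An image processing method for performing resolution enhancement processing on a video sequence composed of a plurality of frames, the method including:
a step of determining whether or not a processing target frame among the plurality of frames is a reset target frame; and
a step in which an image quality enhancement processing unit, using a machine learning model,
when the processing target frame is not a reset target frame, generates a high-resolution frame obtained by enhancing the resolution of the processing target frame and a feature map, using the processing target frame, a frame before and/or after the processing target frame, and a feature map generated when the frame before the processing target frame was subjected to resolution enhancement processing, and,
when the processing target frame is a reset target frame, generates the high-resolution frame and the feature map using the processing target frame and a frame before and/or after the processing target frame, but without using the feature map generated when the frame before the processing target frame was subjected to resolution enhancement processing.
- An image processing method for performing resolution enhancement processing on a video sequence composed of a plurality of frames, the method including: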
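a step of estimating a resolution enhancement difficulty of a processing target frame among the plurality of frames when the processing target frame is a key frame;
a step in which, when the processing target frame is a key frame, an image quality enhancement processing unit, using the machine learning model that, among a plurality of machine learning models that differ from one another in amount of computation, corresponds to the resolution enhancement difficulty of the processing target frame, generates a high-resolution frame obtained by enhancing the resolution of the processing target frame, using the processing target frame and a frame before and/or after the processing target frame; and
a step in which, when the processing target frame is not a key frame, the resolution enhancement processing unit that performed resolution enhancement processing on the frame immediately before the processing target frame performs resolution enhancement processing on the processing target frame.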
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202280004929.4A CN115836519A (zh) | 2021-05-24 | 2022-03-15 | 图像处理装置以及图像处理方法 |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP2021-087069 | 2021-05-24 | ||
JP2021087069A JP7007000B1 (ja) | 2021-05-24 | 2021-05-24 | 画像処理装置および画像処理方法 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2022249661A1 (ja) | 2022-12-01 |
Family
ID=80624149
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/JP2022/011503 WO2022249661A1 (ja) | 2021-05-24 | 2022-03-15 | 画像処理装置および画像処理方法 |
Country Status (3)
Country | Link |
---|---|
JP (1) | JP7007000B1 (ja) |
CN (1) | CN115836519A (ja) |
WO (1) | WO2022249661A1 (ja) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP7508525B2 (ja) | 2022-10-21 | 2024-07-01 | キヤノン株式会社 | 情報処理装置、情報処理方法及びプログラム |
Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20210097646A1 (en) * | 2019-09-26 | 2021-04-01 | Lg Electronics Inc. | Method and apparatus for enhancing video frame resolution |
2021
- 2021-05-24 JP JP2021087069A patent/JP7007000B1/ja active Active
2022
- 2022-03-15 WO PCT/JP2022/011503 patent/WO2022249661A1/ja active Application Filing
- 2022-03-15 CN CN202280004929.4A patent/CN115836519A/zh active Pending
Also Published As
Publication number | Publication date |
---|---|
JP2022180137A (ja) | 2022-12-06 |
JP7007000B1 (ja) | 2022-01-24 |
CN115836519A (zh) | 2023-03-21 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN106127688B (zh) | 一种超分辨率图像重建方法及其系统 | |
CN103761710B (zh) | 基于边缘自适应的高效图像盲去模糊方法 | |
US8554019B2 (en) | Apparatus and method for image processing | |
CN102576454A (zh) | 利用空间图像先验的图像去模糊法 | |
CN104103052B (zh) | 一种基于稀疏表示的图像超分辨率重建方法 | |
KR101987079B1 (ko) | 머신러닝 기반의 동적 파라미터에 의한 업스케일된 동영상의 노이즈 제거방법 | |
WO2022249661A1 (ja) | 画像処理装置および画像処理方法 | |
CN113724136B (zh) | 一种视频修复方法、设备及介质 | |
CN112288632A (zh) | 基于精简esrgan的单图像超分辨率方法及系统 | |
JP2009081574A (ja) | 画像処理装置、方法およびプログラム | |
CN115731132A (zh) | 图像修复方法、装置、设备及介质 | |
US12056793B2 (en) | Image processing method and apparatus and computer program product for the same | |
JP2015197818A (ja) | 画像処理装置およびその方法 | |
JP4563982B2 (ja) | 動き推定方法,装置,そのプログラムおよびその記録媒体 | |
JP2011070283A (ja) | 顔画像高解像度化装置、及びプログラム | |
JP6652052B2 (ja) | 画像処理装置および画像処理方法 | |
CN116523743A (zh) | 一种基于循环神经网络的游戏超分辨率方法 | |
JP4232831B2 (ja) | 画像処理装置および画像処理方法並びに画像処理プログラム | |
CN110598547A (zh) | 快速运动人体姿态估计方法及装置 | |
JP6854629B2 (ja) | 画像処理装置、画像処理方法 | |
CN114429203B (zh) | 一种卷积计算方法、卷积计算装置及其应用 | |
WO2014175484A1 (ko) | 흔들림 영상 안정화 방법 및 이를 적용한 영상 처리 장치 | |
Kim et al. | NERDS: A General Framework to Train Camera Denoisers from Raw-RGB Noisy Image Pairs | |
JP7003342B2 (ja) | 動画分離装置、プログラム及び動画分離方法 | |
JP2005316985A (ja) | 画像拡大装置及び画像拡大方法 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
 | 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 22810939; Country of ref document: EP; Kind code of ref document: A1 |
 | NENP | Non-entry into the national phase | Ref country code: DE |
 | WWE | Wipo information: entry into national phase | Ref document number: 2022810939; Country of ref document: EP |
 | ENP | Entry into the national phase | Ref document number: 2022810939; Country of ref document: EP; Effective date: 20240102 |
 | 122 | Ep: pct application non-entry in european phase | Ref document number: 22810939; Country of ref document: EP; Kind code of ref document: A1 |