WO2021045599A1 - Method and recording medium for applying a bokeh effect to a video image - Google Patents
Method and recording medium for applying a bokeh effect to a video image
- Publication number
- WO2021045599A1 (PCT/KR2020/012058)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- image
- effect
- bokeh
- bokeh effect
- applying
- Prior art date
Classifications
- G06T11/00—2D [Two Dimensional] image generation
- G06T5/70—Denoising; Smoothing
- G06T3/40—Scaling of whole images or parts thereof, e.g. expanding or contracting
- G06T5/50—Image enhancement or restoration using two or more images, e.g. averaging or subtraction
- G06T7/11—Region-based segmentation
- G06T7/194—Segmentation; Edge detection involving foreground-background segmentation
- G06T7/536—Depth or shape recovery from perspective effects, e.g. by using vanishing points
- G06T7/60—Analysis of geometric attributes
- G06T7/70—Determining position or orientation of objects or cameras
- G06T2207/10016—Video; Image sequence
- G06T2207/20084—Artificial neural networks [ANN]
- G06T2207/20212—Image combination
- G06T2207/30196—Human being; Person
- G06T2207/30201—Face
Definitions
- The present disclosure relates to a method of applying a bokeh effect to a video image and a recording medium, and more specifically, to a method and a recording medium that use computer vision technology to apply focusing, out-focusing, and bokeh effects to real-time video images, effects that a single-lens reflex (SLR) camera or a digital single-lens reflex (DSLR) camera having a large aperture diameter can realize.
- In order to apply the bokeh effect in real time, the algorithm must run fast enough, so image processing at a resolution lower than the original is recommended; however, it is aesthetically important that the part in focus, or the part determined to be sharp, retains the same level of sharpness as the original.
- the present disclosure provides a method and a recording medium for applying a bokeh effect to a video image to solve the above problems.
- the bokeh effect can be implemented in various ways to create a natural image.
- A method of applying a bokeh effect to a video image includes extracting characteristic information of an image from an image included in the video image, analyzing the extracted characteristic information of the image, determining a bokeh effect to be applied to the image based on the analyzed characteristic information, and applying the determined bokeh effect to the image.
- Analyzing the extracted characteristic information of the image includes detecting an object in the image, determining at least one of the location, size, and direction of an area corresponding to the object in the image, and analyzing a characteristic of the image based on information on at least one of the location, size, and direction of the area corresponding to the object.
- The object in the image may include at least one of a person object, a face object, and a landmark object included in the image. Determining at least one of the position, size, and direction of the object in the image includes determining the ratio of the size of the area corresponding to the object to the size of the image, and analyzing the characteristics of the image based on information on at least one of the position, size, and direction of the object includes classifying a pose of the included object.
- Analyzing the extracted characteristic information of the image includes detecting at least one of the height of the horizon line and the vanishing point included in the image, and analyzing depth characteristics of the image based on at least one of the detected horizon height and vanishing point.
- determining the bokeh effect to be applied to the image includes determining a type and application method of the bokeh effect applied to at least a portion of the image based on the analyzed characteristic information of the image.
- The method of applying a bokeh effect to a video image further comprises receiving input information on the intensity of the bokeh effect for the video image, and applying the bokeh effect to the image comprises determining the intensity of the bokeh effect based on the received input information and applying it to the image.
- Applying the determined bokeh effect to an image includes generating sub-images corresponding to the regions to which a blur effect is to be applied, applying the blur effect to the sub-images, and mixing the sub-images to which the blur effect has been applied.
- Applying the determined bokeh effect to the image further includes downsampling the image to generate an image with a resolution lower than that of the original, and generating the sub-images corresponding to the areas to which the blur effect is applied includes applying the blur effect to the corresponding regions of the low-resolution image.
- Mixing the sub-images to which the blur effect is applied includes mixing the low-resolution image with the sub-images corresponding to the regions to which the blur effect is applied, upsampling the mixed low-resolution image to the resolution of the original image, and correcting the sharpness of the upsampled image by mixing it with the original image.
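- The claims above describe a downsample, blur, mix, upsample, sharpen pipeline. The following is a minimal sketch of that flow for a single frame, assuming OpenCV and a precomputed foreground mask; the function name, parameters, and the choice of Gaussian blur are illustrative assumptions, not the disclosed implementation.
```python
import cv2
import numpy as np

def apply_flat_bokeh(image, fg_mask, ksize=15, scale=0.5):
    """Sketch: downsample, blur, upsample, then mix with the sharp original.

    image:   H x W x 3 uint8 frame
    fg_mask: H x W float32 in [0, 1]; 1 = keep sharp (person), 0 = blur
    """
    h, w = image.shape[:2]
    # 1. Downsample to cut the cost of the blur.
    low = cv2.resize(image, None, fx=scale, fy=scale)
    # 2. Blur the low-resolution image (the sub-image carrying the effect).
    blurred = cv2.GaussianBlur(low, (ksize, ksize), 0)
    # 3. Upsample the blurred background back to the original resolution.
    blurred_up = cv2.resize(blurred, (w, h))
    # 4. Mix: sharp original in the foreground, blurred image elsewhere.
    m = fg_mask[..., None]
    out = m * image.astype(np.float32) + (1 - m) * blurred_up.astype(np.float32)
    return out.astype(np.uint8)
```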
- step (d) includes correcting the extracted depth map using the generated segmentation mask, and applying a depth effect to a plurality of image frames based on the corrected depth map.
- each of steps (a) to (d) is executed by any one of a plurality of heterogeneous processors.
- a computer-readable medium in which a computer program for executing a method of applying a bokeh effect to a video image according to an embodiment of the present disclosure on a computer is recorded.
- According to an embodiment of the present disclosure, a method and a recording medium can be provided that apply a bokeh effect to a video image in which the degree, range, and method of the effect are automatically adjusted, using a segmentation mask, according to the characteristics of the input image and the intensity set by the user.
- Since the bokeh effect is applied using information on the object to be focused in the image and the depth map of the image, the depth effect can be applied differentially to the background area of the image.
- the depth effect can be applied differentially to the object area in the image.
- FIG. 1 is an exemplary diagram of a method of applying a bokeh effect to a video image according to an embodiment of the present disclosure.
- FIG. 2 is a flowchart of a method of applying a bokeh effect to a video image according to an embodiment of the present disclosure.
- FIG. 3 is a block diagram of a system for applying a bokeh effect to a video image according to an embodiment of the present disclosure.
- FIG. 4 is a flowchart of an image processing method of a processing unit of a system for applying a bokeh effect to a video image according to an embodiment of the present disclosure.
- FIG. 5 is an exemplary diagram illustrating a process of analyzing extracted characteristic information of an image according to an embodiment of the present disclosure.
- FIG. 6 is an exemplary view illustrating a process of classifying a pose of an object in a process of analyzing extracted characteristic information of an image according to an embodiment of the present disclosure.
- FIG. 7 is an exemplary diagram for explaining a process of detecting at least one of the horizon line and the height of the vanishing point included in an image and analyzing depth characteristics of the image, in the process of analyzing extracted characteristic information of an image according to an embodiment of the present disclosure.
- FIG. 8 is an exemplary diagram for describing a process of determining the type and application method of a bokeh effect to be applied to an image based on analyzed characteristic information of an image according to an embodiment of the present disclosure.
- FIG. 9 is an exemplary view illustrating a process of determining the type and application method of a bokeh effect to be applied to an image based on analyzed characteristic information of an image according to an embodiment of the present disclosure.
- FIG. 10 is an exemplary diagram illustrating a process of receiving input information on the intensity of a bokeh effect on a bokeh video image according to an embodiment of the present disclosure.
- FIG. 11 is a flowchart illustrating a step of applying a bokeh effect to an image according to an embodiment of the present disclosure.
- FIG. 12 is an exemplary diagram for describing a process of correcting a depth map using a segmentation mask according to an embodiment of the present disclosure.
- FIG. 13 is a schematic diagram illustrating a data flow of a video bokeh solution according to an embodiment of the present disclosure.
- FIG. 14 is a schematic diagram illustrating a data flow of a video bokeh solution according to an embodiment of the present disclosure.
- FIG. 15 is a schematic diagram illustrating a data flow of a video bokeh solution according to an embodiment of the present disclosure.
- FIG. 16 is a block diagram illustrating a data flow of a video bokeh solution according to an embodiment of the present disclosure.
- FIG. 17 is an exemplary diagram illustrating an artificial neural network model according to an embodiment of the present disclosure.
- A 'module' refers to a software or hardware component, and a 'module' performs certain roles.
- A 'module' is not limited to software or hardware.
- A 'module' may be configured to reside in an addressable storage medium, or may be configured to run on one or more processors.
- Accordingly, a 'module' includes components such as software components, object-oriented software components, class components, and task components, as well as processes, functions, properties, procedures, subroutines, segments of program code, drivers, firmware, microcode, circuitry, data, databases, data structures, tables, arrays, and variables. Components and 'modules' may be combined into a smaller number of components and 'modules', or further separated into additional components and 'modules'.
- The 'system' may refer to at least one of a server device and a cloud server device, but is not limited thereto.
- The 'user terminal' is any electronic device that is provided with a communication module to enable a network connection and that can output content by accessing a website, an application, etc. (for example, a smartphone, a PC, a tablet PC, or a laptop PC). Through input via the user terminal interface (e.g., a touch display, keyboard, mouse, touch pen or stylus, microphone, or motion recognition sensor), users are provided with arbitrary content accessible through the network.
- FIG. 1 is an exemplary diagram of a method of applying a bokeh effect to a video image according to an embodiment of the present disclosure.
- The user terminal 100 is shown as a smartphone, but is not limited thereto; it may be any electronic device that is provided with a camera for taking pictures or video, is equipped with a computing system such as a CPU (Central Processing Unit), GPU (Graphics Processing Unit), or NPU (Neural Processing Unit) capable of executing program operations, and can output video image content (for example, a PC, a tablet PC, or a laptop PC).
- the user can control the intensity of the bokeh effect to be applied to the video image by input through the interface of the user terminal 100 (for example, a touch display, a keyboard, a mouse, a touch pen or a stylus, a microphone, a motion recognition sensor).
- the user terminal 100 may receive a service for applying a bokeh effect to a video image through an application provided by a server.
- the user terminal 100 may apply a bokeh effect to a video image.
- A blur effect is applied to the background area 110, while the foreground area, which is the person object area 120, is left unblurred, and the bokeh effect processed in real time can be confirmed on the screen at the same time as recording.
- A method of applying a bokeh effect to a video image in a user terminal may include extracting characteristic information of an image from an image included in the video image, analyzing the extracted characteristic information of the image, determining a bokeh effect to be applied to the image based on the analyzed characteristic information, and applying the determined bokeh effect to the image.
- the bokeh effect application system may extract characteristic information of the image from the image included in the video image.
- the characteristic information may mean information that can be extracted from an image, such as an RGB value of a pixel in the image, but is not limited thereto.
- the bokeh effect application system may analyze the extracted characteristic information of the image. For example, by receiving characteristic information extracted from an input image, characteristic information for determining the type and intensity of a bokeh effect to be applied to the input image may be analyzed.
- the bokeh effect application system may determine a bokeh effect to be applied to the image based on the analyzed characteristic information of the image.
- the determined bokeh effect may include Flat Bokeh or Gradient Bokeh, but is not limited thereto.
- the bokeh effect application system may apply the determined bokeh effect to the image.
- the detailed configuration will be described with reference to FIGS. 3 and 4 below.
- The bokeh effect application system 300 may include an imaging unit 310, an input unit 320, an output unit 330, a processing unit 340, a storage unit 350, and a communication unit 360.
- the imaging unit 310 may capture an input image for applying a bokeh effect and transmit it to the storage unit 350.
- the imaging unit 310 may be equipped with a camera or the like to take a picture or an image.
- the camera may be configured as a monocular camera with one lens and one sensor or a camera having two or more lenses and sensors, but is not limited thereto.
- The input unit 320 may receive an input intensity from the user in order to determine the type, intensity, and intensity distribution of the bokeh effect to be applied when the bokeh effect application unit 440 applies the bokeh effect to the input image.
- the output unit 330 may receive and output an image to which a bokeh effect is applied to the input image from the storage unit 350.
- the output unit 330 may output an image to which a bokeh effect is applied to the input image and check it in real time.
- The processing unit 340 may extract characteristic information from the input image, analyze the extracted characteristic information, and determine a bokeh effect based on the analyzed characteristic information. In addition, based on the determined bokeh effect and the input information on the intensity of the bokeh effect received from the user, it may determine the intensity of the applied blur effect or the distribution of that intensity.
- The detailed configuration of the processing unit 340 is described with reference to FIGS. 4 and 11 below.
- The storage unit 350 may store images captured by the imaging unit 310, images generated in the series of processes in which the processing unit 340 applies a bokeh effect to an input image (for example, sub-images, mixed images, and down-sampled images), and the final output image. An external input image received from the communication unit 360 may also be stored. The storage unit 350 may output a stored image through the output unit 330, or may transmit the images used by the processing unit 340 to apply a bokeh effect to the input image through the communication unit 360.
- the communication unit 360 may exchange data inside the bokeh effect application system 300 or transmit and receive data such as an image by communicating with an external server. In another embodiment, the communication unit 360 may receive a service for applying a bokeh effect to a video image through an application provided by a server.
- The processing unit 340 may receive, as an input image, the image being captured by the imaging unit 310 or an image stored in the user terminal 100 received from the storage unit 350, apply a bokeh effect, and output the result.
- the processing unit 340 may include a characteristic information extracting unit 410, a characteristic information analyzing unit 420, a bokeh effect determining unit 430, and a bokeh effect applying unit 440.
- The processing unit 340 may be implemented with artificial intelligence technology ranging from rule-based algorithms to machine learning, such as a convolutional neural network (CNN), a recurrent neural network (RNN), or a deep neural network (DNN).
- The characteristic information extraction unit 410 may extract information from the image, such as the RGB values of pixels, that the characteristic information analysis unit 420 requires to analyze characteristic information of the input image, but the information is not limited thereto.
- the characteristic information analysis unit 420 may receive characteristic information extracted from the characteristic information extraction unit 410 and analyze characteristic information for determining the type and intensity of a bokeh effect to be applied to the input image.
- The analyzed image characteristic information generated by the characteristic information analysis unit 420 may include an object within the image, a bounding box corresponding to the object within the image, and a segmentation mask corresponding to the edge of the object within the image.
- the bokeh effect determiner 430 may determine a bokeh effect to be applied to the input image based on the characteristic information of the image analyzed by the characteristic information analyzer 420.
- the bokeh effect determined by the bokeh effect determiner 430 may include a flat bokeh or a gradient bokeh, but is not limited thereto.
- the characteristic information analysis unit 420 may be configured through a conventionally well-known artificial intelligence technology such as a rule-based artificial intelligence or a simple artificial neural network that performs a classification task, but is not limited thereto. Further, the detailed configuration of the bokeh effect determination unit 430 will be described with reference to FIGS. 8 and 9 below.
- The bokeh effect application unit 440 may determine the blur intensity of a flat bokeh or the intensity distribution of a gradient bokeh based on input information about the intensity of the bokeh effect received from the user. As shown in FIG. 11 described later, the bokeh effect application unit 440 further includes a sub-image generation module 1110, which downsamples the input image to a resolution lower than the input image; the sub-images to which the blur effect is applied may be stored in the storage unit 350.
- When applying a gradient bokeh to an input image, the sub-image mixing module 1120 of the bokeh effect application unit 440 mixes the blurred sub-images generated by the sub-image generation module 1110, and the mixed image may be stored in the storage unit 350.
- the mixed image up-sampling module 1130 of the bokeh effect applying unit 440 may up-sample an image of a low resolution mixed by the sub-image mixing module 1120.
- the sharpness correction module 1140 of the bokeh effect applying unit 440 may correct the sharpness of the bokeh effect applied image by using the original input image before applying the blur effect.
- The characteristic information analysis unit 420 may receive the characteristic information extracted by the characteristic information extraction unit 410 and analyze characteristic information for determining the type and intensity of the bokeh effect to be applied to the input image.
- The characteristic information analysis unit 420 may detect an object in the images 510 and 530 based on the characteristic information extracted by the characteristic information extraction unit, and may generate bounding boxes 515 and 535 corresponding to the object in the image and segmentation masks 525 and 545 corresponding to the object in the image. These can be generated by object detection using various widely known artificial intelligence technologies, such as a convolutional neural network (CNN) or a deep neural network (DNN).
- the characteristic information analyzer 420 may determine at least one of a location, a size, and a direction of an area corresponding to an object in the image.
- The characteristic information analysis unit 420 may analyze the characteristics of the image based on information on at least one of the location, size, and direction of the area corresponding to the object, and may transmit the analyzed characteristic information to the bokeh effect determination unit 430.
- If the size of the area 515 corresponding to the object in the image is more than half (50%) of the entire image, and the area 515 is attached to the edge of the image 510, the characteristic information analysis unit 420 may determine the image characteristic to be a selfie.
- If the size of the area 535 corresponding to the object in the image is less than half (50%) of the entire image, or the area 535 is not attached to the edge of the image 530, the characteristic information analysis unit 420 may determine the image characteristic to be a full body shot.
- whether the image characteristic is a selfie or a full body shot may be used by the bokeh effect determiner 430 to determine the type of bokeh effect to be applied to the image.
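- A rough sketch of this heuristic follows; the 50% threshold and the edge-attachment test come from the passage above, while the tuple-based bounding box representation is an assumption for illustration.
```python
def classify_shot(image_w, image_h, box):
    """Classify an image as 'selfie' or 'full_body' from the person box.

    box: (x, y, w, h) of the area corresponding to the person object.
    """
    x, y, w, h = box
    area_ratio = (w * h) / (image_w * image_h)
    # "Attached to the edge": the box touches any border of the frame.
    touches_edge = x <= 0 or y <= 0 or x + w >= image_w or y + h >= image_h
    if area_ratio >= 0.5 and touches_edge:
        return "selfie"      # large subject touching the frame edge
    return "full_body"       # small subject, or not attached to the edge
```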
- the object in the image may include at least one of a person object, a face object, and a landmark object included in the image.
- determining at least one of the position, size, and direction of the object in the image may include determining a ratio of the size of the image and the size of the area corresponding to the object.
- analyzing the characteristics of the image based on information on at least one of the location, size, and direction of the object may include classifying a pose of the object included in the image.
- The object in the image 610 may include at least one of a person object 612, a face object 614, and a landmark object included in the image.
- the landmark object may mean facial feature points such as eyes, nose, and mouth of a human face in the face object 614 in the image.
- The characteristic information analysis unit 420 may classify the pose of an object included in the image based on information on at least one of the position, size, direction, and ratio of the person object 612 and the face object 614 in the image. For example, it is possible to determine whether a person in the image is standing or sitting based on the position, size, direction, and ratio information of the face object 614 within the person object 612.
- the ground area 616 exists on the opposite side of the face object 614 within the object 612 in the image.
- the information on the ground area 616 may be used by the bokeh effect determiner 430 to determine the intensity of a bokeh effect to be applied.
- The ground area 616 may be inferred to be the closest area in distance to the person object 612 among the background areas. Accordingly, the bokeh effect determiner 430 may determine it to be the background area that receives the least blur effect.
- FIG. 7 is an exemplary diagram for explaining a process of detecting at least one of the horizon line and the height of the vanishing point included in an image and analyzing depth characteristics of the image, in the process of analyzing extracted characteristic information of an image according to an embodiment of the present disclosure.
- Analyzing the extracted characteristic information of the image may include detecting at least one of the height of the horizon line and the vanishing point included in the image, and analyzing the depth characteristics of the image based on at least one of the detected horizon height and vanishing point.
- The characteristic information analysis unit 420 may detect the vanishing point 715 in an image 710 having a background element from which a vanishing point can be detected, and may transmit it to the bokeh effect determiner 430.
- The vanishing point 715 may mean a point where edge components in the image intersect within a certain range. According to perspective, an object is projected larger as it approaches the observer's point of view and smaller as it moves away; when this is expressed as lines, the lines converge as they recede from the observer, and the vanishing point 715 can be formed where they meet.
- The vanishing point 715 may be inferred to be the area farthest from the camera in the image, excluding the sky area. Accordingly, the bokeh effect determiner 430 may determine the vanishing point 715 to be the background area that receives the strongest blur effect.
- The characteristic information analysis unit 420 may detect the horizon line 725 in an image 720 having a background element from which the horizon can be detected, and may transmit it to the bokeh effect determination unit 430.
- The horizon line 725 may likewise be inferred to be the area farthest from the camera in the image, excluding the sky area. Accordingly, the bokeh effect determiner 430 may determine the horizon line 725 to be the background area that receives the strongest blur effect.
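- One plausible realization of detecting a vanishing point as the intersection of edge components, sketched with OpenCV; the Canny/Hough thresholds, the pairwise-intersection approach, and the use of a median are illustrative choices not specified by the disclosure.
```python
import cv2
import numpy as np

def estimate_vanishing_point(image):
    """Estimate a vanishing point as a robust intersection of edge lines."""
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    edges = cv2.Canny(gray, 50, 150)
    lines = cv2.HoughLinesP(edges, 1, np.pi / 180, threshold=80,
                            minLineLength=60, maxLineGap=10)
    if lines is None:
        return None
    segs = [l[0] for l in lines[:50]]  # cap the O(n^2) pairing for speed
    points = []
    for i in range(len(segs)):
        for j in range(i + 1, len(segs)):
            x1, y1, x2, y2 = map(float, segs[i])
            x3, y3, x4, y4 = map(float, segs[j])
            d = (x1 - x2) * (y3 - y4) - (y1 - y2) * (x3 - x4)
            if abs(d) < 1e-6:
                continue  # (nearly) parallel lines never meet
            a = x1 * y2 - y1 * x2
            b = x3 * y4 - y3 * x4
            points.append(((a * (x3 - x4) - (x1 - x2) * b) / d,
                           (a * (y3 - y4) - (y1 - y2) * b) / d))
    if not points:
        return None
    return tuple(np.median(np.array(points), axis=0))  # robust to outliers
```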
- Determining the bokeh effect to be applied to the image may include determining the type and application method of the bokeh effect to be applied to at least a portion of the image based on the analyzed characteristic information of the image.
- The bokeh effect determination unit 430 may be implemented as a simple artificial neural network that performs a classification task. In the blur-strength maps for the background area, the closer to black, the stronger the blur, and the closer to white, the weaker the blur.
- FIG. 8 is an exemplary diagram for describing a process of determining the type and application method of a bokeh effect to be applied to an image based on analyzed characteristic information of an image according to an embodiment of the present disclosure.
- the bokeh effect determiner 430 may determine a bokeh effect to be applied to the input image 810.
- the type of the bokeh effect determined by the bokeh effect determiner 430 may include a flat bokeh or a gradient bokeh.
- the bokeh effect determiner 430 may determine distributions of blur strength of Flat Bokeh and blur strength of Gradient Bokeh.
- a blur effect of the same intensity may be applied to the background area in the image.
- a background area in the image may have different blur strengths along a horizontal or vertical axis.
- the image 830 to which the gradient bokeh is applied in the vertical direction may have the same blur intensity in the horizontal direction.
- the in-focus portion of the background area (or the portion analyzed as in-focus) may receive the least blur effect or may not receive the blur effect.
- the part where the actual distance of the background area is the farthest (or the part analyzed to be the farthest) can receive the strongest blur effect.
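- These two intensity distributions can be written as per-pixel blur-strength maps, as in the sketch below, where 1 means the strongest blur; the linear vertical ramp between the farthest row (e.g., horizon or vanishing point) and the nearest row (e.g., the ground next to the subject) is an assumption.
```python
import numpy as np

def flat_strength(h, w, strength=1.0):
    """Flat Bokeh: one blur strength over the whole background."""
    return np.full((h, w), strength, dtype=np.float32)

def gradient_strength(h, w, far_row, near_row):
    """Gradient Bokeh along the vertical axis: strongest blur at far_row,
    no blur at near_row, constant along each horizontal row."""
    rows = np.arange(h, dtype=np.float32)
    ramp = (rows - near_row) / (far_row - near_row)
    return np.tile(np.clip(ramp, 0.0, 1.0)[:, None], (1, w))
```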
- the input image 810 illustrated in FIG. 8 may be an image determined by the characteristic information analysis unit 420 to be a selfie.
- In a selfie image, the part occupied by the person object in the entire image may be large and the area occupied by the background small. Also, the difference in distance from the camera between background areas may not be large. Therefore, even if Flat Bokeh, which applies a uniform blur intensity, is used, the bokeh-applied image may look natural. Accordingly, the bokeh effect determiner 430 may determine that it is appropriate to apply Flat Bokeh to an image whose image characteristic is a selfie.
- The bokeh effect determination unit 430 may apply Flat Bokeh, which runs faster and requires less computation than Gradient Bokeh, to the input image 810, and the segmentation mask corresponding to the person region, which is the area to which the blur effect is not applied and the aesthetically important part of the bokeh processing of a selfie image, may be calculated again.
- the bokeh effect determiner 430 may determine a bokeh effect to be applied to the input image 910.
- the input image 910 illustrated in FIG. 9 may be an image determined by the characteristic information analysis unit 420 to be a full body shot.
- In a full-body shot image, the portion occupied by the person object in the entire image may be small and the area occupied by the background large. Also, the difference in distance from the camera between background areas may be large. Accordingly, the image 920, to which Flat Bokeh with a uniform blur intensity is applied, may look unnatural.
- the image 930 to which the gradient bokeh is applied may look natural because the blur effect maintains the depth characteristic of the background area. Accordingly, the bokeh effect determiner 430 may determine that it is appropriate to apply a gradient bokeh to an image whose image characteristic is a full body shot.
- In the image 930 to which Gradient Bokeh is applied, the part of the background analyzed by the characteristic information analysis unit 420 to be at, or to appear at, the farthest actual distance, such as the vanishing point or the horizon line, may be determined as the region 932 with the strongest blur effect, as illustrated in FIG. 9.
- The ground area, which the characteristic information analysis unit 420 analyzes to be the part of the background at, or appearing at, the closest actual distance, may be determined as the region 934 where the blur effect is weakest or is not applied at all.
- The method of applying the bokeh effect may further include receiving input information on the intensity of the bokeh effect for the video image.
- Applying the bokeh effect to the image may include determining the intensity of the bokeh effect based on the received input information and applying it to the image. In the blur-strength maps for the background area, the closer to black, the stronger the blur, and the closer to white, the weaker the blur.
- the bokeh effect application unit 440 may determine the intensity of the flat bokeh based on the input information 1015 and 1025 about the intensity of the bokeh effect received from the user. For example, when the bokeh effect application unit 440 receives the input information 1015 having a low blur intensity, the bokeh effect application unit 440 may output an image 1010 to which the background blur intensity is weakly applied. When input information 1025 having a high blur intensity is received, an image 1020 to which a background blur intensity is strongly applied may be output.
- the bokeh effect applying unit 440 may determine the intensity and position of a region with a strong blur effect and an intensity and position of a region with a low blur effect in the gradient bokeh from the user and apply it to the image.
- Applying the determined bokeh effect to the image may include generating sub-images corresponding to the regions to which the blur effect is to be applied, applying the blur effect to the sub-images, and mixing the sub-images to which the blur effect has been applied.
- It may further include downsampling the image to generate an image with a resolution lower than that of the original, and generating the sub-images corresponding to the areas to which the blur effect is applied may include applying the blur effect to the corresponding regions of the low-resolution image.
- Mixing the sub-images to which the blur effect is applied may include mixing the low-resolution image with the sub-images corresponding to the regions to which the blur effect is applied, upsampling the mixed low-resolution image to the resolution of the original image, and correcting the sharpness of the upsampled image by mixing it with the original image.
- The bokeh effect application unit 440 may include a sub-image generation module 1110, a sub-image mixing module 1120, a mixed image upsampling module 1130, and a sharpness correction module 1140.
- When Flat Bokeh is applied to the input image, the sub-image generation module 1110 may, in order to secure a fast operation speed, downsample the input image to a resolution lower than the original, generate a sub-image to which the blur effect is applied from the downsampled image, and store it in the storage unit 350. The amount of computation required for the blur effect can be reduced by omitting the blur effect in the area inside the segmentation mask corresponding to the object area.
- a series of image processing steps may be added before and after the blur processing.
- Likewise, when applying a gradient bokeh to the input image, the sub-image generation module 1110 may downsample the input image to a resolution lower than the original in order to secure a fast operation speed, generate sub-images to which the blur effect is applied from the downsampled image, and store them in the storage unit 350.
- Instead of directly generating an image whose blur strength varies with pixel position as in a gradient bokeh, the sub-images may be generated as Flat Bokeh images, each with a specific blur strength.
- The area between a region with blur intensity level 1 and a region with blur intensity level 2 can then be obtained by mixing the image with level-1 blur intensity and the image with level-2 blur intensity.
- Accordingly, for the image to which the level-K blur effect is applied, the amount of computation can be reduced by computing only the areas whose blur level is greater than K-1 and less than K+1.
- The amount of computation required for the blur effect can also be reduced by omitting the blur effect in the area inside the segmentation mask corresponding to the person area.
- a series of image processing steps may be added before and after the blur processing.
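- A sketch of this computation-saving scheme follows: for each blur level K, the level-K sub-image is computed only where the desired (fractional) blur level lies strictly between K-1 and K+1, and never inside the person mask. The map and mask representations are assumptions.
```python
import numpy as np

def level_regions(target_level, person_mask, num_levels):
    """Yield, for each blur level K, the boolean region where the level-K
    flat-bokeh sub-image actually needs to be computed.

    target_level: H x W float map of desired blur level (0 .. num_levels-1)
    person_mask:  H x W bool, True inside the segmentation mask (no blur)
    """
    for k in range(num_levels):
        band = (target_level > k - 1) & (target_level < k + 1)
        yield k, band & ~person_mask  # omit the person area entirely
```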
- In the step of applying the blur effect while generating the sub-images, the sub-image generation module 1110 may, in order to secure a fast operation speed, use a method of applying a 3x3 blur kernel twice when computing a 5x5 blur kernel. Likewise, in order to secure a fast operation speed, a method of composing 1x3 and 3x1 blur kernels may be used when computing a 3x3 blur kernel.
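- These decompositions can be checked numerically. Assuming a binomial (Gaussian-like) blur kernel, the 3x3 kernel is the outer product of 1x3 and 3x1 vectors, and convolving twice with it equals a single 5x5 convolution:
```python
import numpy as np
from scipy.signal import convolve2d

v = np.array([[1.0, 2.0, 1.0]]) / 4.0   # 1x3 binomial kernel
k3 = convolve2d(v, v.T)                  # compose 1x3 and 3x1 -> 3x3 kernel
k5 = convolve2d(k3, k3)                  # two 3x3 passes -> one 5x5 kernel

img = np.random.rand(64, 64)
once_5x5 = convolve2d(img, k5, mode="same")
twice_3x3 = convolve2d(convolve2d(img, k3, mode="same"), k3, mode="same")
# Identical away from the borders (mode="same" pads with zeros).
assert np.allclose(once_5x5[4:-4, 4:-4], twice_3x3[4:-4, 4:-4])
```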
- the sub-image mixing module 1120 may be omitted when flat bokeh is applied to an input image.
- When applying a gradient bokeh to an input image, the sub-image mixing module 1120 may mix the blurred sub-images generated by the sub-image generation module 1110 and store the result in the storage unit 350.
- For regions between a region with level-1 blur intensity and a region with level-2 blur intensity, the image with level-1 blur intensity and the image with level-2 blur intensity may be mixed linearly.
- Alternatively, the sub-image mixing module 1120 may mix the image with level-1 blur intensity and the image with level-2 blur intensity in proportions whose weights increase along a quadratic-function curve.
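- A sketch of this per-pixel mixing of two adjacent level images follows; gamma=1 gives the linear mix and gamma=2 a quadratic-curve weight. The parameterization is an assumption, since the disclosure does not specify the exact curve.
```python
import numpy as np

def mix_adjacent_levels(img_k, img_k1, target_level, k, gamma=1.0):
    """Blend the level-K and level-(K+1) sub-images where the desired blur
    level lies between K and K+1.

    target_level: H x W float map of desired blur level
    gamma: 1.0 = linear weights, 2.0 = quadratic-curve weights
    """
    t = np.clip(target_level - k, 0.0, 1.0) ** gamma  # weight toward K+1
    t = t[..., None]                                  # broadcast over channels
    return (1.0 - t) * img_k + t * img_k1
```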
- the mixed image upsampling module 1130 may upsample an image of a low resolution mixed by the sub image mixing module 1120.
- The sharpness correction module 1140 may correct the sharpness of the bokeh-applied image by using the original image from before the blur effect was applied. If the no-blur kernel (1x1) and blur kernels (e.g., 3x3, 5x5) are mixed at a lower resolution and the result is upsampled to a higher resolution, the image may appear blurred even at locations where no blur effect was applied, due to pixel values lost during downsampling and upsampling.
- At positions corresponding to the no-blur (1x1) kernel, for example the person object area inside the segmentation mask, the sharpness of the input image can be maintained only if the high-resolution image without the blur effect is mixed in. At such locations, three images are therefore mixed: the high-resolution original input image, the low-resolution image without the blur effect, and the blur-applied image; if the blending ratio is set incorrectly, image noise may appear or pixel values may change. Therefore, by using a square-root ratio with respect to the original image, the sharpness can be corrected while the ratio of the initially applied blur effect is maintained.
- For example, the blur-applied image and the low-resolution image without the blur effect may be mixed at a ratio of sqrt(0.7):1-sqrt(0.7) to generate a first mixed image.
- The first mixed image may be upsampled.
- The upsampled first mixed image and the high-resolution input image may then be mixed at a ratio of sqrt(0.7):1-sqrt(0.7). Accordingly, the final blending ratio of the blur-applied image, the low-resolution unblurred image, and the high-resolution input image is about 0.7:0.14:0.16.
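- The stated final ratio follows from applying the sqrt(0.7) split twice, which a few lines of arithmetic confirm:
```python
import math

s = math.sqrt(0.7)     # ~0.8367
w_blur = s * s         # blur-applied image: exactly 0.7
w_low = (1 - s) * s    # low-resolution unblurred image: ~0.137
w_high = 1 - s         # high-resolution input image: ~0.163
print(round(w_blur, 2), round(w_low, 2), round(w_high, 2))  # 0.7 0.14 0.16
```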
- A method of applying a bokeh effect to a video image in a user terminal may include: (a) receiving information on a plurality of image frames, (b) inputting the information on the plurality of image frames into a first artificial neural network model to generate a segmentation mask 1220 for one or more objects included in the image frames, (c) inputting the information on the plurality of image frames into a second artificial neural network model to extract a depth map 1210 for the image frames, and (d) applying a depth effect to the plurality of image frames based on the generated segmentation mask 1220 and the extracted depth map 1210.
- the imaging unit 310 may include a depth camera, and a depth effect for a plurality of image frames may be applied by using the depth map 1210 obtained through the depth camera.
- The depth camera may include a Time of Flight (ToF) sensor or a structured light sensor, but is not limited thereto; in the present disclosure, even a configuration that obtains the depth map 1210 by a stereo vision method (for example, a depth value calculated by a dual camera), using additional cameras and a processor that calculates depth values from a plurality of cameras, may be referred to as a depth camera.
- a system for applying a bokeh effect may include an image version for applying a bokeh effect to an image and a video version for applying a bokeh effect to a video image.
- In the image version of the bokeh effect application system, after input data (e.g., an image or a plurality of video frames) is input, applying the bokeh effect to the input data can be processed in real time whenever the position of the focus and/or the intensity of the blur is changed through user input.
- In the image version, applying the bokeh effect to an image may include obtaining filtered images for the blur kernels to be used in advance, processing around the edges of the mask in consideration of the person mask, and, whenever the focus position and/or the blur intensity is changed through user input, obtaining an interpolation result for each pixel from the multiple pre-prepared filtered images (e.g., including an image filled with a special filtering value) according to the value of a normalized depth map obtained by considering the average depth and intensity of the focused area.
- Here, the special filtering value used when interpolating from the filtered images may be an arbitrary value that can improve the sharpness of the image; for example, it may contain values produced by a Laplacian kernel that gives a sharpening effect.
- When the input data is an image, filtering may be applied first, and interpolation may then be applied according to the normalized depth map.
- a filtering image may be generated in advance according to at least one of the sizes and types of several kernels.
- various filters may be applied to one image in advance, and the image to which the filter was applied may be blended according to a required effect.
- For example, if the filtering kernel sizes are 1, 3, 7, and 15, a result similar to that of a filtering kernel of size 11 may be output by blending the filtering results of kernel sizes 7 and 15.
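- A sketch of this precomputed filter bank and of approximating an intermediate kernel size by blending the two nearest results; the equal 0.5/0.5 weights used to approximate size 11 from sizes 7 and 15 are an assumption.
```python
import cv2

KERNEL_SIZES = [1, 3, 7, 15]  # image-version filter bank from the passage

def build_filter_bank(image):
    """Precompute one filtered copy of the image per kernel size."""
    return {k: image if k == 1 else cv2.GaussianBlur(image, (k, k), 0)
            for k in KERNEL_SIZES}

def approx_size_11(bank):
    """Approximate a size-11 kernel by blending sizes 7 and 15."""
    return cv2.addWeighted(bank[7], 0.5, bank[15], 0.5, 0)
```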
- In the video version, the bokeh effect application system may perform a method in which filtering and interpolation are performed simultaneously according to the normalized depth map.
- Since filtering can then be performed only once per pixel, the set of filtering kernel sizes can be configured more densely.
- For example, while the kernel sizes may be 1, 3, 7, and 15 in the image version, the kernel sizes can be 1, 3, 5, 7, 9, 11, 13, and 15 in the video version. That is, since a video must output many images in a short time, it may be advantageous in terms of performance to create the necessary filters and apply them at once, rather than blending images to which multiple filters have been applied as in the image version.
- The image-version and video-version methods may be swapped according to the performance of the hardware constituting the bokeh effect application system.
- For example, in a bokeh effect application system composed of low-performance hardware, a task that would originally run as the video version may be executed with the image-version method, and in a system composed of high-performance hardware, a task that would originally run as the image version may be executed with the video-version method, but the methods are not limited thereto, and various types of filtering processes may be performed.
- In the video version, the focus point and the blur intensity may change each time a frame's color, depth, and mask are input.
- the throughput may be set to match the speed at which the video device processes frames (eg, 30 fps (frame per second) or 60 fps).
- a pipeline technique may be applied using a computing unit in order to process according to a throughput of 30 fps or 60 fps.
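- A minimal sketch of such a pipeline follows, overlapping the per-frame stages across worker threads with bounded queues so that one frame can be segmented while the previous one is being filtered; the stage functions are placeholders, not the disclosed implementation.
```python
import queue
import threading

def stage(fn, q_in, q_out):
    """Run one pipeline stage: pull an item, process it, pass it on."""
    while True:
        item = q_in.get()
        if item is None:          # poison pill: shut the stage down
            q_out.put(None)
            break
        q_out.put(fn(item))

def run_pipeline(frames, preprocess, infer_masks, apply_bokeh):
    q1, q2, q3, q_out = (queue.Queue(maxsize=2) for _ in range(4))
    steps = [(preprocess, q1, q2), (infer_masks, q2, q3), (apply_bokeh, q3, q_out)]
    workers = [threading.Thread(target=stage, args=s, daemon=True) for s in steps]

    def feed():
        for f in frames:          # stages overlap: ~one frame in flight per stage
            q1.put(f)
        q1.put(None)              # signal end of stream

    workers.append(threading.Thread(target=feed, daemon=True))
    for t in workers:
        t.start()
    while (item := q_out.get()) is not None:
        yield item
```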
- Interpolation of each pixel value according to the value of the normalized depth map, obtained by considering the average depth and intensity of the focused area, may be performed according to the depth value of each pixel.
- For example, with eight filter levels at normalized positions 0/7 through 7/7, a pixel with a depth value of 0.6 falls between the levels at 4/7 and 5/7, and the results of those two filters may be interpolated.
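- On this reading, with eight filter levels the per-pixel interpolation could look like the sketch below; the hat-weight accumulation is equivalent to linear interpolation between the two bracketing filter results, and the mapping of depth to level position is an assumption consistent with the 4/7 figure above.
```python
import numpy as np

KERNEL_SIZES = [1, 3, 5, 7, 9, 11, 13, 15]   # video-version bank: 8 levels

def interpolate_from_bank(filtered, depth):
    """Per-pixel interpolation between the two bracketing filter results.

    filtered: list of len(KERNEL_SIZES) arrays, each H x W x C float
    depth:    H x W float in [0, 1] (0 = in focus, 1 = farthest)
    """
    n = len(filtered)
    pos = np.clip(depth, 0.0, 1.0) * (n - 1)  # depth 0.6 -> 4.2 (between 4/7 and 5/7)
    out = np.zeros_like(filtered[0])
    for k in range(n):
        # Hat weight: 1 at level k, falling to 0 at k-1 and k+1, so only
        # the two bracketing levels contribute at any pixel.
        w = np.clip(1.0 - np.abs(pos - k), 0.0, 1.0)[..., None]
        out += w * filtered[k]
    return out
```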
- Step (d) may include correcting the extracted depth map 1210 using the generated segmentation mask 1220, and applying a depth effect to the plurality of image frames based on the corrected depth map.
- The depth map 1210 extracted through the second artificial neural network may contain inaccurate depth information.
- For example, the boundaries of small, detailed parts, such as the fingers of a person included in an image or a plurality of image frames, may be ambiguous, and their depth information may be extracted incorrectly.
- To address this, the bokeh effect application system may use the segmentation mask 1220 generated through the first artificial neural network to normalize the depth information inside and outside the segmentation mask 1220, respectively; that is, the segmentation mask is used when correcting the depth map.
- the method of correcting the depth map may include normalizing a range of a depth map within a divided area to be focused within a predetermined range.
- the process of improving the depth map may include making the depth map inside the unselected divided area homogeneous. For example, the average value may be unified, the variance may be made small through Equation 1 below, or median filtering may be applied.
- It may also include subtracting a representative value (e.g., an average) of the depth map inside the divided area to be focused from the depth map, taking the absolute value, and averaging.
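- A sketch of these correction steps on one frame follows, assuming depth and mask come from the two neural network models above; the focus range, the median filter size, and the min-max normalization are assumed choices, not values from the disclosure.
```python
import numpy as np
from scipy.ndimage import median_filter

def correct_depth(depth, mask, focus_range=(0.0, 0.1)):
    """Correct a noisy depth map using a person segmentation mask.

    depth: H x W float depth map from the second neural network model
    mask:  H x W bool, True inside the divided area to be focused (nonempty)
    """
    d = depth.astype(np.float32).copy()
    inside = d[mask]
    lo, hi = inside.min(), inside.max()
    # 1. Normalize the focused area's depth into a tight in-focus range.
    if hi > lo:
        d[mask] = focus_range[0] + (inside - lo) / (hi - lo) * (focus_range[1] - focus_range[0])
    else:
        d[mask] = focus_range[0]
    # 2. Homogenize the background: median filtering suppresses stray values.
    bg = median_filter(d, size=5)
    d[~mask] = bg[~mask]
    return d
```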
- each of steps (a) to (d) may be executed by any one of a plurality of heterogeneous processors.
- Processors A, B, and C shown in FIGS. 13 to 16 may be, respectively, a processor that performs simple preprocessing and data execution and can play a mediating role, a processor that draws the screen, and a processor optimized for performing neural network operations (for example, a DSP, NPU, or Neural Accelerator), but are not limited thereto.
- For example, processor A may be a CPU, processor B may be a GPU (e.g., a GPU with a GL interface), and processor C may be a DSP, but they are not limited thereto; each of processors A to C may be any known processor capable of executing its assigned role.
- Although FIG. 13 illustrates a system in which image data captured by the camera is input directly to processor C, it is not limited thereto, and the camera input may be received and processed directly by one or more processors.
- Although FIG. 14 illustrates the neural network operations being performed by two processors (processors A and C), they are not limited thereto and may be performed in parallel by several processors.
- each task is shown to be processed by each processor, but each task may be divided and processed in stages by a plurality of processors. For example, multiple processors may process a single task serially and together.
- the flow chart of the data processed in FIG. 16 is an exemplary embodiment, and is not limited thereto, and various types of data flow charts may be implemented according to a configuration and function of a processor.
- the processor B 1320 may receive a frame image from the imaging unit 310 (S1340). In an embodiment, the processor B 1320 may perform pre-processing of the received frame image (S1342).
- the processor C 1330 may receive the frame image at the same time that the processor B 1320 receives the frame image (S1350). In an embodiment, the processor C 1330 may include a first artificial neural network model. In an embodiment, the processor C 1330 may generate a segmentation mask corresponding to the frame image received in step S1350 using the first artificial neural network model (S1352). In an embodiment, the processor C 1330 may include a second artificial neural network model. The processor C 1330 may generate a depth map corresponding to the frame image received in step S1350 using the second artificial neural network model. Processor C 1330 may transmit the generated segmentation mask and depth map to processor A 1310 (S1356).
- the processor A 1310 may receive a segmentation mask and a depth map from the processor C 1330 (S1360). In an embodiment, the processor A 1310 may transmit the received segmentation mask and depth map to the processor B 1320 (S1362).
- Processor B 1320 may receive the segmentation mask and depth map from processor A 1310 (S1364). In an embodiment, processor B 1320 may preprocess the received depth map (S1370). In an embodiment, processor B 1320 may apply a bokeh filter to the image it preprocessed in step S1342, using the segmentation mask received from processor A 1310 and the depth map it preprocessed in step S1370 (S1372). In an embodiment, processor B 1320 may output a frame image corresponding to the result of applying the bokeh filter through the output unit 330 (S1374).
- the processor B 1320 may receive a frame image from the imaging unit 310 (S1410). In an embodiment, the processor B 1320 may perform pre-processing of the received frame image (S1412). The processor B 1320 may transmit the preprocessed image to the processor A 1310 (S1414). Also, the processor B 1320 may transmit the preprocessed image to the processor C 1330 (S1416).
- the processor A 1310 may receive a preprocessed image from the processor B 1320 (S1420).
- the processor A 1310 may include a second artificial neural network model.
- the processor A 1310 may generate a depth map corresponding to the preprocessed image received in step S1420 by using the second artificial neural network model (S1422).
- the processor A 1310 may perform pre-processing of the depth map generated in step S1422 (S1424).
- the processor C 1330 may receive the preprocessed image from the processor B 1320 (S1430).
- the processor C 1330 may include a first artificial neural network model.
- the processor C 1330 may generate a segmentation mask corresponding to the preprocessed image received in step S1430 by using the first artificial neural network model (S1432). Further, the processor C 1330 may transmit the generated segmentation mask to the processor A 1310 (S1434).
- Processor A 1310 may receive the segmentation mask from processor C 1330 (S1440). In one embodiment, processor A 1310 may apply a bokeh effect filter to the preprocessed image received in step S1420, using the segmentation mask received in step S1440 and the depth map it preprocessed in step S1424 (S1442). In addition, processor A 1310 may transmit the bokeh filter application result to processor B 1320 (S1444).
- the processor B 1320 may receive the bokeh filter application result from the processor A 1310 (S1446). In an embodiment, the processor B 1320 may output a frame image corresponding to a result of applying the bokeh filter through the output unit 330 (S1450).
- the processor B 1320 may receive a frame image from the imaging unit 310 (S1510). In an embodiment, the processor B 1320 may perform pre-processing of the received frame image (S1512). The processor B 1320 may transmit the preprocessed image to the processor A 1310 (S1514).
- the processor A 1310 may receive a preprocessed image from the processor B 1320 (S1520). In an embodiment, the processor A 1310 may transmit the preprocessed image to the processor C 1330 (S1522). In an embodiment, the processor C 1330 may receive the preprocessed image from the processor A 1310 (S1524).
- the processor C 1330 may include a first artificial neural network model.
- the processor C 1330 may generate a segmentation mask corresponding to the preprocessed image received in step S1524 by using the first artificial neural network model.
- the processor C 1330 may include a second artificial neural network model.
- the processor C 1330 may generate a depth map corresponding to the preprocessed image received in step S1524 using the second artificial neural network model. Further, the processor C 1330 may transmit the generated segmentation mask and depth map to the processor A 1310 (S1534).
- the processor A 1310 may receive a segmentation mask and a depth map from the processor C 1330 (S1540). In an embodiment, the processor A 1310 may pre-process the depth map received in step S1540 (S1542). Further, the processor A 1310 may transmit the segmentation mask received in step S1540 and the depth map preprocessed in step S1542 to the processor B 1320 (S1544).
- the processor B 1320 may receive a segmentation mask and a depth map from the processor A 1310 (S1546). In an embodiment, the processor B 1320 may preprocess again the depth map already preprocessed by the processor A 1310 (S1550). In an embodiment, the processor B 1320 may apply a bokeh effect filter to the image it preprocessed in step S1512, using the received segmentation mask and depth map (S1552). Further, the processor B 1320 may output a frame image corresponding to the result of applying the bokeh filter through the output unit 330 (S1554).
- the input/output interface 1610 illustrated in FIG. 16 may include the imaging unit 310, the input unit 320, and the output unit 330 described above in FIG. 3.
- the input/output interface 1610 may acquire an image or a plurality of image frames through the imaging unit 310.
- the input/output interface 1610 may receive an input for changing the position of the focus and/or the intensity of the bokeh effect from the user through the input unit 320.
- the input/output interface 1610 may output a result of applying the bokeh filter generated by the processor A 1630, the processor B 1620, and the processor C 1640 through the output unit 330.
- processor B 1620 may include a bokeh kernel 1622.
- the processor B 1620 may be configured as a GPU responsible for rendering the screen, but is not limited thereto.
- the bokeh kernel 1622 may apply a bokeh effect filter to an image or a plurality of image frames using a segmentation mask and a depth map.
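- as a minimal sketch of what such a bokeh kernel can do, the code below blends a blurred copy of the frame back into the original, with the blur weight growing as a pixel's depth departs from an assumed focus depth while the segmentation mask holds the subject sharp; the blending rule and parameter names are illustrative choices of this sketch, not the patent's specified kernel.

```python
import cv2
import numpy as np

def bokeh_kernel(image, mask, depth, focus=0.0, ksize=21):
    # Blur the whole frame once; ksize must be odd for GaussianBlur.
    blurred = cv2.GaussianBlur(image, (ksize, ksize), 0)
    # Per-pixel blur weight in [0, 1]: farther from the focus depth = more blur.
    weight = np.clip(np.abs(depth.astype(np.float32) - focus), 0.0, 1.0)
    weight *= (1.0 - mask)                 # the segmentation mask keeps the subject sharp
    weight = weight[..., None]             # broadcast over the color channels
    out = image.astype(np.float32) * (1.0 - weight) + blurred.astype(np.float32) * weight
    return out.astype(image.dtype)

# usage: mask and depth are float arrays in [0, 1] with the image's height/width
img = (np.random.rand(480, 640, 3) * 255).astype(np.uint8)
mask = np.zeros((480, 640), np.float32); mask[120:360, 160:480] = 1.0
depth = np.tile(np.linspace(0, 1, 480, dtype=np.float32)[:, None], (1, 640))
result = bokeh_kernel(img, mask, depth, focus=0.3)
```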
- the processor A 1630 may include a data bridge 1632 and a preprocessor 1634.
- the processor A 1630 may be configured as a CPU that performs simple preprocessing and serves as a mediator relaying data, but is not limited thereto.
- simple preprocessing tasks may include a blur process that smooths the boundary line between a person and the background, a median filter that removes signal noise such as depth map noise, depth map completion that fills the empty regions of the depth map, and depth map upsampling that increases the resolution quality of the depth map; however, the present invention is not limited thereto, and various preprocessing operations capable of improving the quality of the output may be included.
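- purely as an illustration, each of the listed tasks maps onto a standard image-processing primitive; the OpenCV calls below are assumed stand-ins rather than the operations the disclosure actually uses.

```python
import cv2
import numpy as np

def simple_preprocess(depth_u8, mask_u8):
    # median filter: remove signal noise such as depth map noise
    d = cv2.medianBlur(depth_u8, 5)
    # depth map completion: fill empty (zero-valued) regions by inpainting
    holes = (d == 0).astype(np.uint8)
    d = cv2.inpaint(d, holes, 3, cv2.INPAINT_TELEA)
    # depth map upsampling: increase the resolution quality (here 2x bicubic)
    d = cv2.resize(d, None, fx=2, fy=2, interpolation=cv2.INTER_CUBIC)
    # blur process: smooth the boundary line between person and background
    soft_mask = cv2.GaussianBlur(mask_u8, (9, 9), 0)
    return d, soft_mask

depth = (np.random.rand(240, 320) * 255).astype(np.uint8)
mask = np.zeros((240, 320), np.uint8); mask[60:180, 80:240] = 255
depth_hi, mask_soft = simple_preprocess(depth, mask)
```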
- the bokeh effect application system may include a separate artificial neural network model for simple preprocessing tasks such as depth map completion and depth map upsampling.
- the data bridge 1632 may play the role of relaying data between the input/output interface 1610, the processor B 1620, and the processor C 1640.
- the processor A 1630 may perform an operation to distribute a task to be processed by the processor B 1620 and the processor C 1640, but is not limited thereto.
- the preprocessor 1634 may perform preprocessing of an image, a plurality of image frames, or a depth map received from the input/output interface 1610, the processor B 1620 and the processor C 1640.
- processor C 1640 may include segmentation network 1642 and depth network 1644.
- the processor C 1640 may be configured as a processor optimized for performing neural network operations, such as a DSP, an NPU, or a neural accelerator, but is not limited thereto.
- the segmentation network 1642 may receive an image, a plurality of image frames, a preprocessed image, or a plurality of preprocessed image frames to generate a segmentation mask.
- the segmentation network 1642 may include a first artificial neural network model, which will be described later in more detail with reference to FIG. 17.
- the depth network 1644 may receive an image, a plurality of image frames, a preprocessed image, or a plurality of preprocessed image frames to extract a depth map.
- the depth network 1644 may include a second artificial neural network model, which will be described later in more detail with reference to FIG. 17.
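- for illustration, processor C's role amounts to running the two models on the same preprocessed frame; the tiny convolutional stand-ins below are assumptions of this sketch, since the patent does not specify the network architectures.

```python
import torch
import torch.nn as nn

# stand-ins for segmentation network 1642 (first ANN model) and depth
# network 1644 (second ANN model); the architectures are placeholders
seg_net = nn.Sequential(nn.Conv2d(3, 8, 3, padding=1), nn.ReLU(),
                        nn.Conv2d(8, 1, 3, padding=1)).eval()
depth_net = nn.Sequential(nn.Conv2d(3, 8, 3, padding=1), nn.ReLU(),
                          nn.Conv2d(8, 1, 3, padding=1)).eval()

frame = torch.rand(1, 3, 256, 256)          # a preprocessed image frame
with torch.no_grad():
    mask = torch.sigmoid(seg_net(frame))    # segmentation mask in [0, 1]
    depth = depth_net(frame)                # depth map for the same frame
```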
- the artificial neural network model 1700 is a statistical learning algorithm implemented based on the structure of a biological neural network, or a structure that executes such an algorithm, in machine learning technology and cognitive science.
- in the artificial neural network model 1700, nodes, which are artificial neurons that form a network through synaptic connections as in a biological neural network, repeatedly adjust the synaptic weights and learn to reduce the error between a correct output corresponding to a specific input and an inferred output, thereby representing a machine learning model having problem-solving ability.
- the artificial neural network model 1700 may include an arbitrary probability model, a neural network model, and the like used in artificial intelligence learning methods such as machine learning and deep learning.
- the artificial neural network model 1700 may include a first artificial neural network configured to receive, as input, a plurality of image frames including at least one object and/or image features extracted from the plurality of image frames, and to output a segmentation mask.
- the artificial neural network model 1700 may include a second artificial neural network configured to receive, as input, a plurality of image frames including at least one object and/or image features extracted from the plurality of image frames, and to output a depth map.
- the artificial neural network model 1700 may be implemented as a multilayer perceptron (MLP) composed of multiple layers of nodes and connections between them.
- the artificial neural network model 1700 may be implemented using one of various artificial neural network model structures, including the MLP.
- the artificial neural network model 1700 consists of an input layer 1720 that receives an input signal or data 1710 from the outside, an output layer 1740 that outputs an output signal or data 1750 corresponding to the input data, and n hidden layers 1730_1 to 1730_n (where n is a positive integer) that are located between the input layer 1720 and the output layer 1740, receive a signal from the input layer 1720, extract characteristics, and transmit them to the output layer 1740.
- the output layer 1740 receives signals from the hidden layers 1730_1 to 1730_n and outputs them to the outside.
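- the layered structure just described can be written down directly; the NumPy sketch below runs a forward pass through an input layer, n hidden layers, and an output layer, with all dimensions chosen arbitrarily for illustration.

```python
import numpy as np

def mlp_forward(x, weights, biases):
    h = x                                        # input layer 1720: input data 1710
    for W, b in zip(weights[:-1], biases[:-1]):
        h = np.maximum(h @ W + b, 0.0)           # hidden layers 1730_1 .. 1730_n (ReLU)
    return h @ weights[-1] + biases[-1]          # output layer 1740: output data 1750

# example: n = 3 hidden layers mapping a 64-dim input vector to a 16-dim output
rng = np.random.default_rng(0)
dims = [64, 128, 128, 128, 16]
weights = [rng.normal(0, 0.1, (a, b)) for a, b in zip(dims[:-1], dims[1:])]
biases = [np.zeros(b) for b in dims[1:]]
y = mlp_forward(rng.normal(size=64), weights, biases)
```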
- the processing unit 340 may analyze a plurality of training image frames using supervised learning and train the artificial neural network model 1700 so that the segmentation mask and/or depth map corresponding to the plurality of image frames can be inferred.
- the artificial neural network model 1700 trained in this way may be stored in the storage unit 350 and may output a segmentation mask and/or a depth map in response to the input of a plurality of image frames, including at least one object, received through the communication unit 360 and/or the input unit 320.
- the input variable of the artificial neural network model 1700 capable of extracting depth information may be a plurality of training image frames including at least one object.
- an input variable input to the input layer 1720 of the artificial neural network model 1700 may be an image vector 1710 in which a training image is encoded as a single vector data element.
- an output variable output from the output layer 1740 of the artificial neural network model 1700 may be a vector 1750 representing a segmentation mask and/or a depth map.
- the output variable of the artificial neural network model 1700 is not limited to the type described above, and may include arbitrary information/data representing a deformable 3D motion model.
- as a plurality of output variables corresponding to a plurality of input variables are matched to the input layer 1720 and the output layer 1740 of the artificial neural network model 1700, respectively, and the synaptic values between the nodes included in the input layer 1720, the hidden layers 1730_1 to 1730_n, and the output layer 1740 are adjusted, the model may be trained so that a correct output corresponding to a specific input can be extracted.
- through this training, the characteristics hidden in the input variables of the artificial neural network model 1700 can be identified, and the synaptic values (or weights) between the nodes can be adjusted so as to reduce the error between the output variable calculated from the input variables and the target output.
- the artificial neural network model 1700 trained in this way can output information on the segmentation mask and/or depth map corresponding to a plurality of input image frames in response to a plurality of image frames including at least one object.
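- the supervised training just described can be sketched as follows; the small convolutional model, the MSE loss, and the dummy tensors are assumptions of this sketch rather than the patent's actual training setup.

```python
import torch
import torch.nn as nn

model = nn.Sequential(                          # stand-in for model 1700
    nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
    nn.Conv2d(16, 1, 3, padding=1),             # 1 output channel: mask or depth
)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
loss_fn = nn.MSELoss()

frames = torch.rand(4, 3, 64, 64)               # training image frames (input 1710)
targets = torch.rand(4, 1, 64, 64)              # ground-truth mask/depth (target 1750)

for step in range(100):
    pred = model(frames)                        # inferred output
    loss = loss_fn(pred, targets)               # error between inferred and correct output
    optimizer.zero_grad()
    loss.backward()                             # gradients for the weight (synaptic) update
    optimizer.step()                            # adjust weights to reduce the error
```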
- the processing units used to perform the techniques may be implemented in one or more ASICs, DSPs, digital signal processing devices (DSPDs), programmable logic devices (PLDs), field programmable gate arrays (FPGAs), processors, controllers, microcontrollers, microprocessors, electronic devices, other electronic units designed to perform the functions described herein, a computer, or a combination thereof.
- various illustrative logic blocks, modules, and circuits described in connection with the disclosure herein may be implemented or performed with a general purpose processor, a DSP, an ASIC, an FPGA or other programmable logic device, discrete gate or transistor logic, discrete hardware components, or any combination thereof designed to perform the functions described herein.
- a general purpose processor may be a microprocessor, but in the alternative, the processor may be any conventional processor, controller, microcontroller, or state machine.
- the processor may also be implemented as a combination of computing devices, eg, a combination of a DSP and a microprocessor, a plurality of microprocessors, one or more microprocessors in connection with the DSP core, or any other such configuration.
- the techniques may also be implemented as instructions stored on a computer-readable medium, such as random access memory (RAM), read-only memory (ROM), non-volatile random access memory (NVRAM), programmable read-only memory (PROM), erasable programmable read-only memory (EPROM), electrically erasable PROM (EEPROM), flash memory, compact disc (CD), or a magnetic or optical data storage device.
- the instructions may be executable by one or more processors, and may cause the processor(s) to perform certain aspects of the functionality described herein.
- Computer-readable media includes both computer storage media and communication media, including any medium that facilitates transfer of a computer program from one place to another.
- Storage media may be any available media that can be accessed by a computer.
- such computer-readable media may include RAM, ROM, EEPROM, CD-ROM or other optical disk storage, magnetic disk storage or other magnetic storage devices, or any other medium that can be used to store or transfer desired program code in the form of instructions or data structures and that can be accessed by a computer. Also, any connection is properly termed a computer-readable medium.
- disk and disc, as used herein, include CD, laser disc, optical disc, digital versatile disc (DVD), floppy disk, and Blu-ray disc, where disks usually reproduce data magnetically, while discs reproduce data optically using a laser. Combinations of the above should also be included within the scope of computer-readable media.
- the software module may reside in RAM memory, flash memory, ROM memory, EPROM memory, EEPROM memory, registers, hard disk, removable disk, CD-ROM, or any other type of storage medium known in the art.
- An exemplary storage medium may be coupled to a processor such that the processor can read information from or write information to the storage medium.
- the storage medium may be integrated into the processor.
- the processor and storage medium may also reside within the ASIC.
- the ASIC may exist in the user terminal.
- the processor and storage medium may exist as separate components in the user terminal.
- although exemplary implementations may refer to utilizing aspects of the presently disclosed subject matter in the context of one or more standalone computer systems, the subject matter is not so limited and may be implemented in connection with any computing environment, such as a network or distributed computing environment. Furthermore, aspects of the presently disclosed subject matter may be implemented in or across multiple processing chips or devices, and storage may similarly be distributed across multiple devices. Such devices may include PCs, network servers, and handheld devices.
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Geometry (AREA)
- Image Processing (AREA)
- Image Analysis (AREA)
Claims (13)
- A method for applying a bokeh effect to a video image in a user terminal, the method comprising: extracting characteristic information of an image included in the video image from the image; analyzing the extracted characteristic information of the image; determining a bokeh effect to be applied to the image based on the analyzed characteristic information of the image; and applying the determined bokeh effect to the image.
- The method of claim 1, wherein analyzing the extracted characteristic information of the image comprises: detecting an object in the image; generating a region corresponding to the object in the image; determining at least one of a position, a size, and a direction of the region corresponding to the object in the image; and analyzing characteristics of the image based on information on at least one of the position, the size, and the direction of the region corresponding to the object.
- The method of claim 2, wherein the object in the image may include at least one of a person object, a face object, and a landmark object included in the image; determining at least one of the position, the size, and the direction of the object in the image comprises determining a ratio between the size of the image and the size of the region corresponding to the object; and analyzing the characteristics of the image based on the information on at least one of the position, the size, and the direction of the object comprises classifying a pose of the object included in the image.
- The method of claim 1, wherein analyzing the extracted characteristic information of the image comprises: detecting at least one of an asymptote (horizon line) included in the image and a height of a vanishing point; and analyzing a depth characteristic in the image based on at least one of the detected asymptote and the height of the vanishing point.
- The method of claim 1, wherein determining the bokeh effect to be applied to the image comprises determining, based on the analyzed characteristic information of the image, a type of bokeh effect to be applied to at least a portion of the image and a manner of applying it.
- The method of claim 1, further comprising receiving input information on an intensity of the bokeh effect for the video image, wherein applying the bokeh effect to the image comprises determining the intensity of the bokeh effect based on the received input information on the intensity and applying it to the image.
- The method of claim 1, wherein applying the determined bokeh effect to the image comprises: generating sub-images corresponding to regions in the image to which a blur effect is to be applied; applying the blur effect to the sub-images; and blending the sub-images to which the blur effect has been applied.
- The method of claim 7, further comprising downsampling the image to generate an image of a resolution lower than that of the image, wherein generating the sub-images corresponding to the regions to which the blur effect is applied comprises applying the blur effect to the regions corresponding to the sub-images in the lower-resolution image.
- The method of claim 8, wherein blending the sub-images to which the blur effect has been applied comprises: blending the lower-resolution image and the sub-images corresponding to the regions to which the blur effect has been applied; upsampling the lower-resolution image in which the sub-images have been blended so as to match the resolution of the image; and blending the image and the upsampled image to correct the sharpness of the upsampled image.
- A method for applying a bokeh effect to a video image in a user terminal, the method comprising: (a) receiving information on a plurality of image frames; (b) inputting the information on the plurality of image frames to a first artificial neural network model to generate a segmentation mask for one or more objects included in the plurality of image frames; (c) inputting the information on the plurality of image frames to a second artificial neural network model to extract a depth map for the plurality of image frames; and (d) applying a depth effect to the plurality of image frames based on the generated segmentation mask and the extracted depth map.
- The method of claim 10, wherein step (d) comprises: correcting the extracted depth map using the generated segmentation mask; and applying the depth effect to the plurality of image frames based on the corrected depth map.
- The method of claim 10, wherein each of steps (a) to (d) is executed by any one of a plurality of heterogeneous processors.
- A computer-readable recording medium on which a computer program for executing, on a computer, the method for applying a bokeh effect to a video image in a user terminal according to any one of claims 1 to 12 is recorded.
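- as an illustrative, non-limiting sketch of the downsample, blur, blend, and upsample sequence recited in claims 7 to 9, the following Python code applies the blur at a lower resolution and then blends the upsampled result with the original image to correct sharpness; the parameter names and the single Gaussian blur are simplifications assumed by this sketch.

```python
import cv2
import numpy as np

def bokeh_blend(image, blur_mask, scale=0.5, ksize=15):
    h, w = image.shape[:2]
    low = cv2.resize(image, None, fx=scale, fy=scale)             # claim 8: downsample
    low_mask = cv2.resize(blur_mask, (low.shape[1], low.shape[0]))
    sub = cv2.GaussianBlur(low, (ksize, ksize), 0)                # claim 7: blur the sub-image
    m = low_mask[..., None].astype(np.float32)
    mixed = low.astype(np.float32) * (1 - m) + sub.astype(np.float32) * m  # blend at low resolution
    up = cv2.resize(mixed, (w, h), interpolation=cv2.INTER_LINEAR)         # claim 9: upsample
    m_full = blur_mask[..., None].astype(np.float32)
    # claim 9: blend with the original image to correct the sharpness
    out = image.astype(np.float32) * (1 - m_full) + up * m_full
    return out.astype(image.dtype)

img = (np.random.rand(480, 640, 3) * 255).astype(np.uint8)
blur_mask = np.ones((480, 640), np.float32)
blur_mask[120:360, 160:480] = 0.0                                 # keep the subject sharp
result = bokeh_blend(img, blur_mask)
```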
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US17/668,545 US20220270215A1 (en) | 2019-09-06 | 2022-02-10 | Method for applying bokeh effect to video image and recording medium |
Applications Claiming Priority (4)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
KR10-2019-0111055 | 2019-09-06 | ||
KR20190111055 | 2019-09-06 | ||
KR1020200113328A KR102262671B1 (ko) | 2019-09-06 | 2020-09-04 | Method for applying bokeh effect to video image and recording medium |
KR10-2020-0113328 | 2020-09-04 |
Related Child Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US17/668,545 Continuation US20220270215A1 (en) | 2019-09-06 | 2022-02-10 | Method for applying bokeh effect to video image and recording medium |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2021045599A1 true WO2021045599A1 (ko) | 2021-03-11 |
Family
ID=74852671
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/KR2020/012058 WO2021045599A1 (ko) | Method for applying bokeh effect to video image and recording medium |
Country Status (2)
Country | Link |
---|---|
US (1) | US20220270215A1 (ko) |
WO (1) | WO2021045599A1 (ko) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2023195833A1 (en) * | 2022-04-09 | 2023-10-12 | Samsung Electronics Co., Ltd. | Method and electronic device for detecting blur in image |
WO2024025224A1 (en) * | 2022-07-25 | 2024-02-01 | Samsung Electronics Co., Ltd. | Method and system for generation of a plurality of portrait effects in an electronic device |
Families Citing this family (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112085701A (zh) * | 2020-08-05 | 2020-12-15 | Shenzhen Ubtech Technology Co., Ltd. | Face blur detection method and apparatus, terminal device, and storage medium |
CN117319812A (zh) * | 2022-06-20 | 2023-12-29 | Beijing Xiaomi Mobile Software Co., Ltd. | Image processing method and apparatus, mobile terminal, and storage medium |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
KR20130038076A (ko) * | 2011-10-07 | 2013-04-17 | LG Electronics Inc. | Mobile terminal and method for generating out-of-focus image thereof |
JP2015012437A (ja) * | 2013-06-28 | 2015-01-19 | Nikon Corp. | Digital camera |
US20180286059A1 (en) * | 2017-04-04 | 2018-10-04 | Rolls-Royce Plc | Determining surface roughness |
KR20180120022A (ko) * | 2017-04-26 | 2018-11-05 | Samsung Electronics Co., Ltd. | Electronic device and method for displaying an image in the electronic device |
US20190213714A1 (en) * | 2018-01-11 | 2019-07-11 | Qualcomm Incorporated | Low-resolution tile processing for real-time bokeh |
- 2020-09-07: WO PCT/KR2020/012058 (WO2021045599A1) active, Application Filing
- 2022-02-10: US 17/668,545 (US20220270215A1) active, Pending
Also Published As
Publication number | Publication date |
---|---|
US20220270215A1 (en) | 2022-08-25 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
WO2021045599A1 (ko) | Method for applying bokeh effect to video image and recording medium | |
WO2020192483A1 (zh) | Image display method and device | |
US11457138B2 (en) | Method and device for image processing, method for training object detection model | |
EP3871405A1 (en) | Techniques for convolutional neural network-based multi-exposure fusion of multiple image frames and for deblurring multiple image frames | |
WO2021179820A1 (zh) | Image processing method and apparatus, storage medium, and electronic device | |
WO2019085792A1 (en) | Image processing method and device, readable storage medium and electronic device | |
CN116018616A (zh) | Maintaining a fixed size of a target object in a frame | |
ES2967691T3 (es) | Adjustment of a digital representation of a head region | |
KR101679290B1 (ko) | Image processing method and apparatus | |
EP4198875A1 (en) | Image fusion method, and training method and apparatus for image fusion model | |
US11538175B2 (en) | Method and apparatus for detecting subject, electronic device, and computer readable storage medium | |
CN110889410A (zh) | Robust use of semantic segmentation in shallow depth-of-field rendering | |
WO2019050360A1 (en) | ELECTRONIC DEVICE AND METHOD FOR AUTOMATICALLY SEGMENTING TO BE HUMAN IN AN IMAGE | |
KR102262671B1 (ko) | Method for applying bokeh effect to video image and recording medium | |
CN110335216B (zh) | Image processing method, image processing apparatus, terminal device, and readable storage medium | |
WO2019105297A1 (zh) | Image blurring processing method and apparatus, mobile device, and storage medium | |
US20220189029A1 (en) | Semantic refinement of image regions | |
WO2022076116A1 (en) | Segmentation for image effects | |
US20180198994A1 (en) | Compressive sensing capturing device and method | |
WO2023138629A1 (zh) | Encrypted image information acquisition apparatus and method | |
CN108462831B (zh) | Image processing method and apparatus, storage medium, and electronic device | |
WO2023146698A1 (en) | Multi-sensor imaging color correction | |
US20060010582A1 (en) | Chin detecting method, chin detecting system and chin detecting program for a chin of a human face | |
CN114445864A (zh) | Gesture recognition method and apparatus, and storage medium | |
WO2024025134A1 (en) | A system and method for real time optical illusion photography |
Legal Events
Date | Code | Title | Description
---|---|---|---
| 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 20860204; Country of ref document: EP; Kind code of ref document: A1
| NENP | Non-entry into the national phase | Ref country code: DE
| 122 | Ep: pct application non-entry in european phase | Ref document number: 20860204; Country of ref document: EP; Kind code of ref document: A1
| 32PN | Ep: public notification in the ep bulletin as address of the addressee cannot be established | Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 12.09.2022)