WO2020262977A1 - Method for removing an object from an image by using artificial intelligence - Google Patents

Method for removing an object from an image by using artificial intelligence

Info

Publication number
WO2020262977A1
Authority
WO
WIPO (PCT)
Prior art keywords
image
objects
user terminal
user
artificial intelligence
Prior art date
Application number
PCT/KR2020/008267
Other languages
English (en)
Korean (ko)
Inventor
유수연
Original Assignee
유수연
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 유수연 filed Critical 유수연
Publication of WO2020262977A1

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 5/00: Image enhancement or restoration
    • G06T 5/50: Image enhancement or restoration using two or more images, e.g. averaging or subtraction
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00: Computing arrangements based on biological models
    • G06N 3/02: Neural networks
    • G06N 3/04: Architecture, e.g. interconnection topology
    • G06N 3/045: Combinations of networks
    • G06N 3/08: Learning methods
    • G06T 7/00: Image analysis
    • G06T 7/10: Segmentation; Edge detection
    • G06T 7/11: Region-based segmentation
    • G06T 7/194: Segmentation; Edge detection involving foreground-background segmentation
    • G06T 7/20: Analysis of motion
    • G06T 7/254: Analysis of motion involving subtraction of images
    • G06T 2207/00: Indexing scheme for image analysis or image enhancement
    • G06T 2207/20: Special algorithmic details
    • G06T 2207/20084: Artificial neural networks [ANN]
    • G06T 2207/20212: Image combination
    • G06T 2207/20224: Image subtraction

Definitions

  • The present invention relates to a method of removing an object from an image using artificial intelligence.
  • Very often, the intended subject is captured well, but an element that detracts from the overall aesthetics appears within the frame, or the position of the subject does not match an aesthetically pleasing angle.
  • GAN: Generative Adversarial Network.
  • The two neural network models included in a GAN are called the generator and the discriminator, and they have opposing objectives.
  • The generator learns from real data and generates fake data based on it.
  • The generator aims to produce fake data that is close to reality.
  • The discriminator is trained to determine whether the data presented by the generator is real or fake; that is, it is trained to determine whether the provided data was produced by the generator or is actual data. Thus, the discriminator is trained with the goal of not being fooled by the generator's fake data.
  • The generator learns from the data with which it failed to deceive the discriminator, and the discriminator learns from the data by which it was deceived. As this process is repeated, the GAN becomes increasingly able to create fake data that looks more and more realistic.
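  • To make the adversarial interplay described above concrete, the following is a minimal sketch of one GAN training step written in PyTorch. The tiny fully connected networks, random "real" data, and hyperparameters are placeholders chosen for illustration and are not part of the disclosed method.

```python
import torch
import torch.nn as nn

latent_dim, data_dim = 16, 64

# Placeholder generator and discriminator networks.
generator = nn.Sequential(nn.Linear(latent_dim, 128), nn.ReLU(), nn.Linear(128, data_dim))
discriminator = nn.Sequential(nn.Linear(data_dim, 128), nn.LeakyReLU(0.2), nn.Linear(128, 1))

opt_g = torch.optim.Adam(generator.parameters(), lr=2e-4)
opt_d = torch.optim.Adam(discriminator.parameters(), lr=2e-4)
bce = nn.BCEWithLogitsLoss()

def train_step(real_batch):
    batch = real_batch.size(0)
    real_labels = torch.ones(batch, 1)
    fake_labels = torch.zeros(batch, 1)

    # Discriminator: learn to tell real data from the generator's fakes.
    fake_batch = generator(torch.randn(batch, latent_dim)).detach()
    d_loss = bce(discriminator(real_batch), real_labels) + bce(discriminator(fake_batch), fake_labels)
    opt_d.zero_grad(); d_loss.backward(); opt_d.step()

    # Generator: learn to produce fakes the discriminator labels as real.
    g_loss = bce(discriminator(generator(torch.randn(batch, latent_dim))), real_labels)
    opt_g.zero_grad(); g_loss.backward(); opt_g.step()
    return d_loss.item(), g_loss.item()

# Example: one step on a random "real" batch standing in for real image data.
print(train_step(torch.randn(32, data_dim)))
```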
  • The problem to be solved by the present invention is to provide a method of removing an object from an image using artificial intelligence.
  • A method of removing an object from an image using artificial intelligence for solving the above-described problem includes: photographing an image (S110); recognizing one or more objects in the photographed image (S120); determining a first object to be removed from the photographed image (S130); obtaining, from the photographed image, data for generating the background image covered by the first object (S140); generating the hidden background image (S150); and generating a result image from which the first object has been removed by synthesizing the hidden background image with the photographed image (S160).
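  • As a rough illustration only, the flow S110 to S160 can be written as the following Python skeleton. Every function body is a placeholder, not the disclosed implementation; the names merely mirror the steps listed above.

```python
# Skeleton of the claimed flow S110-S160; each body is a placeholder.
def photograph_image():                          # S110
    raise NotImplementedError

def recognize_objects(image):                    # S120
    raise NotImplementedError

def determine_object_to_remove(image, objects):  # S130
    raise NotImplementedError

def acquire_background_data(image, target):      # S140
    raise NotImplementedError

def generate_hidden_background(data):            # S150
    raise NotImplementedError

def synthesize_result(image, background):        # S160
    raise NotImplementedError

def remove_object_pipeline():
    image = photograph_image()
    objects = recognize_objects(image)
    target = determine_object_to_remove(image, objects)
    data = acquire_background_data(image, target)
    background = generate_hidden_background(data)
    return synthesize_result(image, background)
```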
  • FIG. 1 is a flowchart illustrating a method of removing an object of an image using artificial intelligence according to an exemplary embodiment.
  • FIG. 2 is a flowchart illustrating a process for collecting data required for execution of a GAN according to an embodiment.
  • FIG. 3 is a flowchart illustrating a process of stopping data collection according to an exemplary embodiment.
  • FIG. 4 is a flowchart illustrating a process of selecting an object to be removed according to an exemplary embodiment.
  • FIG. 5 is a flowchart illustrating a process of classifying an object to be removed into a main and a sub, according to an exemplary embodiment.
  • FIG. 6 is a flowchart illustrating a process of categorizing an object according to an exemplary embodiment.
  • FIG. 7 is a flowchart illustrating a process when there is no classification to which an object belongs, according to an exemplary embodiment.
  • FIG. 8 is a flowchart illustrating a process of designating a removal area by a user according to an exemplary embodiment.
  • FIG. 9 is a block diagram of an apparatus according to an exemplary embodiment.
  • FIGS. 10 to 14 are diagrams illustrating how an object is removed from an original photograph according to a user's input, according to an exemplary embodiment.
  • The method may further include outputting a direction for moving the user terminal in a predetermined direction (S210), collecting image data corresponding to the hidden background image captured according to the movement (S220), and generating at least a part of the covered background image using the collected image data (S230).
  • The step (S220) may further include measuring the amount of image data collected corresponding to the hidden background image (S310), calculating the increase in the amount of collected image data (S320), and stopping the collection of image data when the increase falls below a predetermined value (S330).
  • The method may further include a step (S410) of generating the remainder of the hidden background image not generated in step S230, in which the image corresponding to that remainder is generated using a Generative Adversarial Network (GAN).
  • The step (S130) may further include displaying an area corresponding to the one or more objects recognized in step S120 (S510), and receiving, from the user, a selection of an object to be removed from among the one or more objects (S520).
  • The step (S120) may further include classifying the one or more objects into a main object and a sub-object based on at least one of the size of the one or more objects, the movement of the objects, and the type of the objects (S610).
  • The step (S130) may further include determining the sub-object as an object to be removed (S620).
  • The step (S130) may include determining a classification to which each of the recognized one or more objects belongs (S710), outputting the classifications of the one or more objects (S720), requesting the user to select one or more of the output classifications to be removed (S730), obtaining the user's selection input (S740), and determining one or more objects belonging to the classification selected by the user as objects to be removed (S750).
  • The step (S710) may further include searching a previously stored database for a classification of each of the one or more objects (S712), generating a classification corresponding to a second object when no classification corresponding to the second object is found in the database (S714), and storing the generated classification in the database (S716).
  • The method may further include displaying the photographed image (S810), obtaining the user's region selection information for the photographed image (S820), generating a background image corresponding to the selected region (S830), and generating a result image by synthesizing the generated background image and the photographed image (S840).
  • An apparatus for removing an object from an image using artificial intelligence according to an aspect of the present invention for solving the above-described problem includes a memory storing one or more instructions and a processor executing the one or more instructions stored in the memory, and by executing the one or more instructions, the processor performs the method of removing an object from an image using artificial intelligence according to the disclosed embodiments.
  • A computer program for removing an object from an image using artificial intelligence according to an aspect of the present invention for solving the above-described problem is combined with a computer, which is hardware, and is stored on a computer-readable recording medium so that the method of removing an object from an image using artificial intelligence according to the disclosed embodiments can be performed.
  • The term "unit" or "module" refers to software or a hardware component such as an FPGA or ASIC, and a "unit" or "module" performs certain roles. However, "unit" or "module" is not limited to software or hardware.
  • A "unit" or "module" may be configured to reside in an addressable storage medium, or may be configured to run on one or more processors.
  • Accordingly, a "unit" or "module" includes components such as software components, object-oriented software components, class components, and task components, as well as processes, functions, attributes, procedures, subroutines, segments of program code, drivers, firmware, microcode, circuits, data, databases, data structures, tables, arrays, and variables. The components and the functionality provided within "units" or "modules" may be combined into a smaller number of components and "units" or "modules", or further separated into additional components and "units" or "modules".
  • a computer refers to all kinds of hardware devices including at least one processor, and may be understood as encompassing a software configuration operating in a corresponding hardware device according to embodiments.
  • the computer may be understood as including all of a smartphone, a tablet PC, a desktop, a laptop, and a user client and an application running on each device, but is not limited thereto.
  • FIG. 1 is a flowchart illustrating a method of removing an object of an image using artificial intelligence according to an exemplary embodiment.
  • In step S110, the user terminal photographs an image according to the user's manipulation.
  • The user terminal may also recognize the image input through the camera according to preset criteria and, without user manipulation, capture the moment at which a specific shooting angle or composition is reached or a specific object comes into focus.
  • That is, the user terminal can photograph the image without the user's manipulation.
  • The above-described preset composition may be set by an artificial intelligence trained on image data such as professional photographers' works, masterpieces, or illustrations as learning data, or it may be set based on an image with a specific composition input by the user.
  • A composition may be defined by factors such as the location of the horizon line, the area ratio of the portions above and below the horizon line, the location of the vanishing point, the perspective method, the location of objects, the size of objects, the distance between the camera and objects, or the amount of light, but is not limited thereto.
  • The above-described 'conformity to a certain degree' does not mean a mechanical, physical 100% match; if a degree of match above a preset value is obtained through a probabilistic calculation, the user terminal may recognize the composition as matching the preset composition.
  • The user terminal may edit the horizontal alignment, saturation, contrast, or exposure of the captured image based on the perspective of the captured image, the color composition, the location of the light source, the type of the light source, the intensity of the light, the direction of the light, whether the scene is indoors or outdoors, whether it is backlit, the location of the camera at the time of shooting, or information from the gyro sensor included in the user terminal.
  • The user terminal may display the above-described edit details, and may adopt some, all, or none of the edit details according to the user's choice.
  • In step S120, the user terminal recognizes one or more objects in the captured image.
  • The recognition of the objects may be performed by connecting to a server on which trained artificial intelligence is mounted.
  • The 'trained artificial intelligence' may mean an artificial intelligence model trained by machine learning, but is not limited thereto.
  • The recognition of the above-described objects may be performed regardless of whether they are objects, animals, plants, natural features, buildings, decorations, people, or anything else.
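  • As one possible way to realize step S120, a hedged sketch using an off-the-shelf detector (torchvision's pretrained Mask R-CNN) is shown below. The disclosure does not prescribe this particular model or library; the score threshold is an assumption.

```python
import torch
import torchvision
from torchvision.transforms.functional import to_tensor
from PIL import Image

# Pretrained COCO detector; torchvision >= 0.13 uses weights=, older versions use pretrained=True.
model = torchvision.models.detection.maskrcnn_resnet50_fpn(weights="DEFAULT")
model.eval()

def recognize_objects(image_path, score_threshold=0.5):
    """Return boxes, class labels, and segmentation masks for detected objects."""
    image = to_tensor(Image.open(image_path).convert("RGB"))
    with torch.no_grad():
        pred = model([image])[0]          # dict with boxes, labels, scores, masks
    keep = pred["scores"] >= score_threshold
    return {
        "boxes": pred["boxes"][keep],     # (x1, y1, x2, y2) per object
        "labels": pred["labels"][keep],   # COCO class indices
        "masks": pred["masks"][keep],     # per-object soft segmentation masks
    }
```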
  • In step S130, the user terminal determines a first object to be removed from the captured image.
  • Among the objects recognized in step S120, the user terminal may identify the main subject, other subjects, subjects obscuring the main subject, or subjects that hinder the overall aesthetics; this may be done together with the process in which the above-described artificial intelligence recognizes the composition and object arrangement that has been preset or learned.
  • The user terminal may recognize, as the main subject, an object placed at a specific location in a specific composition, an object that receives relatively concentrated light depending on the location of the light source, the amount of light, the direction of light, and the like, or an object whose labeling data is frequently selected as the main object.
  • The user terminal may determine one or more people and one or more landmarks among the objects included in the photo as the main subject, but is not limited thereto.
  • The user terminal may determine the distance and depth of the objects to be photographed using the camera or one or more sensors including the camera, and may also measure the size, height, width, angle, and the like of each object.
  • These distances and heights may be measured using triangulation, but are not limited thereto.
  • The user terminal may determine the object in focus as the main object in consideration of the depth and focal length of the image to be photographed.
  • The user terminal may further perform a step of requesting the user to select which of the recognized objects is to be the main subject.
  • The user terminal may recognize an object that obscures the main subject or hinders the overall aesthetics as an object to be removed.
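  • As an illustrative heuristic only (the disclosure may rely on learned composition models, lighting, focus, and depth instead), the sketch below scores detected objects by size and closeness to the frame center, treats the highest-scoring object as the main subject, and treats the rest as removal candidates. The scoring function and the idea of using only geometry are assumptions made for this example.

```python
import numpy as np

def pick_main_subject(boxes, image_shape):
    """Score each box by area and centrality; the highest score is the main subject."""
    h, w = image_shape[:2]
    cx, cy = w / 2.0, h / 2.0
    scores = []
    for x1, y1, x2, y2 in boxes:
        area = (x2 - x1) * (y2 - y1) / float(w * h)         # relative size
        bx, by = (x1 + x2) / 2.0, (y1 + y2) / 2.0
        dist = np.hypot(bx - cx, by - cy) / np.hypot(cx, cy)  # 0 at center, 1 at corner
        scores.append(area * (1.0 - dist))
    main = int(np.argmax(scores))
    removal_candidates = [i for i in range(len(boxes)) if i != main]
    return main, removal_candidates

# Example with two hypothetical boxes in a 1080x1920 frame.
print(pick_main_subject([(800, 300, 1200, 900), (50, 50, 150, 200)], (1080, 1920)))
```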
  • In step S140, the user terminal acquires, from the captured image, data for generating the background image covered by the first object.
  • In step S150, the user terminal generates the hidden background image.
  • The user terminal may further perform a step of requesting the user to select a data acquisition method for generating the background image and a step of requesting data acquisition according to the selected method.
  • The above-described data acquisition methods for generating the background image may include additional photographing by the user at the request of the user terminal, selection of an already photographed image by the user at the request of the user terminal, or re-input of the original image data.
  • The user terminal may use a Generative Adversarial Network (GAN) to generate image data that follows the probability distribution of the original image while the object to be removed is deleted from the original image.
  • In step S160, the user terminal generates a result image from which the first object has been removed by synthesizing the hidden background image and the captured image.
  • The synthesis of the generated background image and the photographed image may be performed using a GAN.
  • That is, the user terminal may generate the background covered by the first object using the GAN, or may combine the generated or acquired background image with the image including the first object using the GAN.
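  • The generative step itself (S150) is GAN-based in the disclosure; the sketch below shows only the masking and compositing mechanics of S150 and S160, using OpenCV's classical inpainting as a stand-in for the generative model. The inpainting call and the binary object mask are assumptions for illustration, not the patented approach.

```python
import cv2
import numpy as np

def remove_object(image_bgr, object_mask):
    """image_bgr: 8-bit BGR image; object_mask: uint8 array, 255 where the object to remove is."""
    # Stand-in for S150: fill the masked region from surrounding pixels.
    # The disclosure uses a GAN here; classical inpainting is only illustrative.
    generated_background = cv2.inpaint(image_bgr, object_mask, 5, cv2.INPAINT_TELEA)

    # S160: composite the generated background into the original frame.
    mask3 = cv2.merge([object_mask] * 3).astype(bool)
    result = np.where(mask3, generated_background, image_bgr)
    return result
```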
  • The user terminal may display the generated result image and request the user to answer how satisfied they are with the reproduction of the result image.
  • If the user provides negative feedback, the user terminal may input that feedback into the learning model in such a way that the GAN classifies the result image as a fake image.
  • The user terminal may improve the corresponding learning model by receiving such feedback together with the original image data, the image data acquired for generating the hidden background image, the generated background image data, and the synthesized result image data.
  • FIG. 2 is a flowchart illustrating a process for collecting data required for execution of a GAN according to an embodiment.
  • In step S210, the user terminal outputs a direction for moving the user terminal in a predetermined direction.
  • For example, the user terminal may display a message such as 'Please rotate the camera clockwise' or 'Please move the camera slightly to the left' on the display.
  • Such directions may be displayed sequentially for each direction, or the movement path (movement line) of the user terminal may be displayed on the screen of the user terminal, and the method is not limited.
  • This movement is intended to photograph the background of the rear portion covered by the first object to be removed. That is, the user terminal may direct the movement of the user terminal so that at least a part of the rear portion covered by the first object can be photographed.
  • The direction may be output as text information and audio information.
  • The direction in which the camera should be moved may also be output as a direction indicator such as an arrow.
  • The user terminal may simulate the image that would result if the object recognized as the main subject were photographed from the angle to which the camera is moved, and display it on the shooting screen as an overlay, thereby inducing the user to move the camera's position.
  • In some cases, the user terminal may determine that such movement is unnecessary and omit the above-described step (S210).
  • This determination may be performed by determining the distance and depth of the object to be removed. For example, if the distance to the object to be removed is greater than a preset reference value, the rear portion covered by the object will hardly be photographed even if the user terminal is moved. In this case, the user terminal may skip this step, or terminate it immediately and proceed to the next step.
  • In step S220, the user terminal may collect image data corresponding to the hidden background image, captured according to the movement of the user terminal.
  • The user terminal may terminate the collection of image data when a predetermined ratio or more of the background image covered by the first object has been acquired, but is not limited thereto.
  • The user terminal may calculate an expected value of the image data that can be obtained based on the distance to the first object. For example, as the distance to the first object increases, the expected value of the image data that can be acquired will decrease. When image data corresponding to a predetermined ratio of the expected value has been acquired, the user terminal may terminate the image data collection.
  • That is, the step (S220) may be terminated without user manipulation.
  • In step S230, the user terminal may generate at least a part of the hidden background image using the collected image data.
  • For example, the user terminal may create at least a part of the hidden background image by stitching the collected image data, but is not limited thereto.
  • The user terminal may generate any portion that cannot be created in this way using the GAN. Since the hidden background image data generated in this way has a probability distribution indistinguishable from that of the original data to be edited by the GAN, it can also serve as original data for generating the hidden background image, which resolves situations in which taking additional pictures is difficult.
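  • One way to realize the stitching of collected image data mentioned above is OpenCV's high-level stitcher, as sketched below. This is an assumption about tooling rather than the disclosed implementation; the fallback to GAN-based generation mirrors step S410.

```python
import cv2

def stitch_collected_frames(frames):
    """frames: list of BGR images collected while the user terminal is moved (S220)."""
    stitcher = cv2.Stitcher_create(cv2.Stitcher_SCANS)
    status, panorama = stitcher.stitch(frames)
    if status != cv2.Stitcher_OK:
        # Not enough overlap or texture; fall back to GAN-based generation (S410).
        return None
    return panorama
```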
  • FIG. 3 is a flowchart illustrating a process of stopping data collection according to an exemplary embodiment.
  • In step S310, the user terminal may measure the amount of image data collected corresponding to the hidden background image.
  • The image data collected corresponding to the hidden background image may be used as material for generating the hidden background image.
  • For example, the user terminal may generate the hidden background image by transforming and stitching the collected image data, but is not limited thereto.
  • The image data collected corresponding to the hidden background image may also mean image data collected to obtain a probability distribution from the original data when a GAN is used to generate the hidden background image.
  • In step S320, the user terminal may calculate the increase in the amount of collected image data.
  • The amount of image data may mean the amount of the hidden portion that can be generated through transformation and stitching.
  • The increase in the amount of image data described above may be defined as increasing when the amount of image data capable of filling the area covered by the object to be removed increases.
  • The increase in the amount of image data described above may also be defined as increasing when the probability distribution obtained from the collected image data becomes more refined in the case where a GAN is used.
  • In step S330, the user terminal may stop collecting image data when the increase falls below a predetermined value.
  • The user terminal may ignore increases in image data smaller than a predetermined pixel unit, so that the corresponding process is performed at a perceptually meaningful scale and computation can remain fast, efficient, and finite.
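  • The stopping rule of S310 to S330 can be illustrated as follows: track how much of the hidden region has been recovered so far and stop when the marginal gain from the latest frame drops below a threshold. Measuring coverage as a raw pixel count and the threshold value are assumptions for this sketch.

```python
import numpy as np

def should_stop_collection(coverage_masks, min_gain_px=500):
    """coverage_masks: one boolean array per collected frame, marking which
    pixels of the hidden background that frame recovers."""
    covered = np.zeros_like(coverage_masks[0], dtype=bool)
    amounts = []
    for mask in coverage_masks:
        covered |= mask                        # S310: accumulate collected data
        amounts.append(int(covered.sum()))     # amount collected so far
    if len(amounts) < 2:
        return False
    gain = amounts[-1] - amounts[-2]           # S320: increase from the latest frame
    return gain < min_gain_px                  # S330: stop once the gain is small
```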
  • In step S410, the user terminal generates the remainder of the hidden background image not generated in the above-described step S230; the image corresponding to that remainder may be generated using a Generative Adversarial Network (GAN).
  • The user terminal may re-input such data as training data for the classification model and the generation model.
  • The user terminal may ask the user what objects are in the area covered by the object to be removed. For example, the user may input that there are trees, street lights, and the like in the area covered by the object to be removed; such input may be made directly by the user, or by selecting one or more of the items presented by the user terminal.
  • the user terminal may generate a list of objects expected to be located at a portion covered by the object to be removed based on the surrounding image, and sort and display the list in descending order based on probability.
  • the user may select an object located in a portion covered by the object to be removed.
  • the user terminal may generate a hidden background image using the GAN based on information obtained from the user.
  • the user terminal may generate a plurality of candidates for a hidden background image, and generate a plurality of result images obtained by synthesizing them and provide them to the user.
  • the user may select and store one of a plurality of result images.
  • FIG. 4 is a flowchart illustrating a process of selecting an object to be removed according to an exemplary embodiment.
  • the user terminal may perform a step S510 of displaying an area corresponding to the recognized one or more objects.
  • the user terminal may display the area by a rectangle, an ellipse, a polygon, or a closed curve including a boundary of a recognized object, but is not limited thereto.
  • the user terminal may display the area through different colors, brightness, saturation, highlight, or the like.
  • the display of the area may be displayed in a manner that is overlaid on an image to be edited.
  • the user terminal may receive an object to be removed from among the one or more objects from the user.
  • The above-described method of receiving, from the user, the selection of an object to be removed may include selecting an object by touching the inside of the boundary of the displayed area, displaying a list of objects recognized by the user terminal and having the user select one or more items from it, or selecting by voice recognition according to the user's voice command.
  • the user terminal may change the color of the border displaying the area of the selected object to a color different from that of the unselected object.
  • FIG. 5 is a flowchart illustrating a process of classifying an object to be removed into a main and a sub, according to an exemplary embodiment.
  • The user terminal may further perform a step (S610) of classifying the one or more objects into a main object and sub-objects based on at least one of the size of the one or more objects, the movement of the objects, and the type of the objects.
  • For example, the user terminal may continuously capture images for a preset time, and in this process may classify as a sub-object any object that moves more than a preset distance, moves faster than a preset speed, corresponds to a preset size, falls outside a preset color range, or belongs to a preset classification.
  • For example, the user terminal may classify all objects that have moved a distance of 30 cm or more during a predetermined shooting time (e.g., 3 seconds) as objects to be removed, and generate a result image without additional user manipulation.
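  • A hedged sketch of the motion rule above: track object centroids across a short burst of frames and mark anything whose displacement exceeds a threshold as a sub-object to remove. Pixel displacement is used here as a proxy; converting it to the 30 cm figure would require the depth information discussed earlier, which this example assumes away.

```python
import numpy as np

def classify_moving_objects(tracks, max_displacement_px=80):
    """tracks: {object_id: [(x, y), ...]} centroid positions over a burst of frames.
    Returns the ids classified as sub-objects (removal candidates)."""
    sub_objects = []
    for object_id, positions in tracks.items():
        start, end = np.array(positions[0]), np.array(positions[-1])
        if np.linalg.norm(end - start) > max_displacement_px:
            sub_objects.append(object_id)
    return sub_objects

# Example: a pedestrian crossing the frame is flagged, a static statue is not.
print(classify_moving_objects({"pedestrian": [(10, 200), (400, 210)],
                               "statue": [(600, 300), (602, 301)]}))
```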
  • For example, objects such as red buoys or garbage with unusually high reflectivity may be classified by the user terminal as sub-objects to be removed, and at least some of the objects classified as sub-objects may be reclassified as objects not to be removed by the user's selection.
  • For a person, the user terminal may select, from among the images of the person captured during the preset continuous shooting time, the frame in which the person's features are most clearly photographed, use it as a reference, combine it with the images of the other frames, and use the result as image data for generating the final result image.
  • The above-described main object may correspond to the above-described main subject of the photographed picture, and may mean an object belonging to a specific area of a preset composition, an object belonging to a preset classification, or an object not classified as a sub-object.
  • The above-described settings may be recommended by an artificial intelligence connected to the user terminal, and the artificial intelligence may be a model trained based on the previous selection history of the user of the terminal, or on the selections input by the users of all user terminals performing this method.
  • The user terminal may vectorize the motion of moving objects during the continuous shooting, and when two or more objects have vector values that are similar within a predetermined range, recognize the two or more objects as an original object and its reflection.
  • In this case, the two or more objects may be selected together as objects to be removed.
  • Even during a single shot, the user terminal may recognize a water surface, a mirror, or the like in the captured image, and when an object to be removed is selected, it may recognize and remove the objects reflected in the mirror or water surface together with it.
  • The user terminal may also recognize and remove the shadow corresponding to the object to be removed, and may further determine the refraction, diffraction, and reflection of light caused by the object to be removed through image processing based on a physics engine or ray tracing.
  • In this case, the entire image may be modified in consideration of the refraction, diffraction, and reflection of light that change when the object is removed.
  • the user terminal may further perform a step (S620) of determining the sub-object as an object to be removed.
  • That is, the user terminal may skip the step of receiving the user's selection, classify the main subject and the subjects that hinder the aesthetics by means of artificial intelligence, determine the object to be removed, and acquire the corresponding data.
  • the user terminal may further include reclassifying at least some of the objects classified as sub-objects in step S610 into an object to be removed and an object not to be removed according to a preset criterion.
  • the user terminal may generate a learning model through selection history of an object previously selected as an object to be removed by a user of the corresponding user terminal, and allow the learning model to select an object to be removed.
  • the user terminal may recognize the captured background and determine the main object and sub-object that may appear in the background based on a previously stored database.
  • the user terminal may obtain information from the database indicating that a buoy, garbage, etc. floating in the sea is an object that is frequently selected and removed from an image with the sea in the background.
  • the user terminal may search, recognize, and remove images corresponding to buoys or garbage from the image.
  • The user terminal may acquire information on the main subject at the corresponding location (for example, a symbol of a tourist destination), recognize that subject, and determine the main subject. In addition, this determination may be performed based on the GPS information of the location where the picture was taken. For example, if it is determined that the location where the picture was taken is the Louvre Museum in Paris, the user terminal may search for and recognize the pyramid entrance, which is a symbol of the Louvre, and recognize it as the main object.
  • FIG. 6 is a flowchart illustrating a process of categorizing an object according to an exemplary embodiment.
  • FIGS. 10 to 14 are diagrams illustrating how an object is removed from an original photograph according to a user's input, according to an exemplary embodiment.
  • the user terminal may further perform a step (S710) of determining a classification to which each of the recognized one or more objects belongs.
  • the user terminal may designate one or more classifications for one object.
  • For example, the user terminal may assign one or more pieces of labeling data, such as man, pedestrian, person, or sub-object, to an object.
  • The user terminal may determine the classification to which an object belongs by inference based on the photographed part of the object even when only part of the object is photographed; in this case, the assigned labeling data may be displayed as 'specific classification' or 'part of a specific classification'.
  • For example, when only part of a man's face is photographed, the user terminal may classify it as a sub-object and assign labeling data such as 'part of a male face' to the object.
  • the user terminal may output the classification of the one or more objects.
  • the user terminal may display a list of labeling data assigned to the recognized object by overlaying the captured image.
  • the user terminal may display the area of the object displayed by overlaying the image captured in step S510 and labeling data assigned to the object.
  • the user terminal may request the user to select one or more categories to be removed from among the output categories.
  • The user terminal may request the user to select one or more items from the displayed list of labeling data, and when there is a selection input from the user, the details of that selection may be stored in the corresponding user terminal or in a server.
  • the user terminal may obtain the user's selection input.
  • the user's selection input may be a touch input through a display or a voice input through a microphone.
  • the user terminal may determine one or more objects belonging to the classification selected by the user as the object to be removed.
  • the user terminal may determine all objects belonging to one or more classifications selected by the user as objects to be removed.
  • For example, if the user selects the labeling data 'car' or 'pedestrian' as the classification of objects to be removed, all vehicles, whether small cars or large vehicles, or all people in motion may be determined as objects to be removed.
  • Conversely, the user terminal may request the user to select a classification of objects not to be removed.
  • For example, when the user wants to capture a natural scene that includes pedestrians, if the user selects 'pedestrian' as the category of objects not to be removed at the request of the user terminal, objects belonging to that category may be excluded from the removal process.
  • These settings may be applied for each image to be captured, or may be applied in common to images to be captured as a basic setting.
  • the user terminal may collect labeling data of objects selected as objects to be removed by other users from a server connected to the user terminal.
  • The user terminal may select, as objects to be removed, the objects most frequently selected for removal by other users, even when there is no selection action by the user.
  • FIG. 7 is a flowchart illustrating a process when there is no classification to which an object belongs, according to an exemplary embodiment.
  • The user terminal may further perform the steps of searching a previously stored database for the classification of each of the one or more objects (S712), generating a classification corresponding to a second object when no classification corresponding to the second object is found in the database (S714), and storing the generated classification in the database (S716).
  • That is, if the labeling data or classification to be assigned to a recognized object is stored neither in a database in the internal storage of the user terminal nor in a database stored on a server connected to the user terminal, new labeling data may be created and included as a new kind of classification.
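  • The lookup/create/store behavior of S712 to S716 can be sketched with the standard-library sqlite3 module, as below. The database file name and single-column table layout are assumptions made purely for illustration.

```python
import sqlite3

conn = sqlite3.connect("object_classes.db")   # hypothetical local classification store
conn.execute("CREATE TABLE IF NOT EXISTS classification (label TEXT PRIMARY KEY)")

def classify_or_create(label):
    # S712: search the previously stored database for the classification.
    row = conn.execute("SELECT label FROM classification WHERE label = ?", (label,)).fetchone()
    if row is None:
        # S714-S716: generate a new classification and store it in the database.
        conn.execute("INSERT INTO classification (label) VALUES (?)", (label,))
        conn.commit()
    return label

classify_or_create("part of a male face")
```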
  • FIG. 8 is a flowchart illustrating a process of designating a removal area by a user according to an exemplary embodiment.
  • In step S810, the user terminal may display the captured image.
  • In step S820, the user terminal may obtain the user's region selection information for the captured image.
  • The user terminal may provide various tools so that the user can directly designate the portion to be removed from the captured image.
  • For example, tools commonly adopted in image editing programs may be provided, such as designating an area with a brush-shaped spline whose size can be adjusted, selecting one of several polygons to designate its inner area, designating the inner area of a closed curve formed by a spline, or automatically selecting and designating a region within a similar color range.
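  • The user-designated region of S820 can be turned into a binary removal mask, for example with OpenCV's fillPoly as sketched below. The polygon points are placeholders standing in for the user's touch input, and the choice of a polygon tool is only one of the options listed above.

```python
import cv2
import numpy as np

def mask_from_polygon(image_shape, polygon_points):
    """polygon_points: [(x, y), ...] vertices of the user-drawn closed region (S820)."""
    mask = np.zeros(image_shape[:2], dtype=np.uint8)
    pts = np.array(polygon_points, dtype=np.int32).reshape(-1, 1, 2)
    cv2.fillPoly(mask, [pts], 255)   # 255 inside the selected region
    return mask                      # usable as the removal mask for the later steps

# Example with hypothetical vertices in a 1080x1920 frame.
example = mask_from_polygon((1080, 1920, 3), [(400, 300), (700, 320), (650, 800), (380, 760)])
```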
  • The user terminal may generate a background image corresponding to the selected area.
  • The user terminal may generate a result image by synthesizing the generated background image and the captured image.
  • That is, the user terminal may remove the image data of the region designated by the user in step S820, generate image data for the removed region using the GAN, and combine it with the original image data to generate the result image data.
  • the user terminal may generate a new image based on two or more images that have already been captured.
  • To this end, the user terminal may further perform the steps of requesting selection of a first image that will serve as the basis for image generation, recognizing objects in the first image, requesting the user to select an object to be removed, removing the selected objects, and generating a result image by synthesizing the first image from which the objects have been removed.
  • the user terminal may generate completed image data based on the previously photographed image data.
  • FIG. 9 is a block diagram of an apparatus according to an exemplary embodiment.
  • The processor 102 may include one or more cores (not shown) and a graphics processing unit (not shown), and/or a connection path (e.g., a bus) for transmitting and receiving signals to and from other components.
  • The processor 102 executes one or more instructions stored in the memory 104 to perform the method described with reference to FIGS. 1 to 13.
  • The processor 102 may further include RAM (Random Access Memory, not shown) and ROM (Read-Only Memory, not shown) that temporarily and/or permanently store signals (or data) processed inside the processor 102.
  • The processor 102 may be implemented in the form of a system on chip (SoC) including at least one of a graphics processing unit, RAM, and ROM.
  • the memory 104 may store programs (one or more instructions) for processing and controlling the processor 102. Programs stored in the memory 104 may be divided into a plurality of modules according to functions.
  • Such a program may reside in RAM (Random Access Memory), ROM (Read-Only Memory), EPROM (Erasable Programmable ROM), EEPROM (Electrically Erasable Programmable ROM), flash memory, a hard disk, a removable disk, a CD-ROM, or any other type of computer-readable recording medium well known in the art to which the present invention pertains.
  • Components of the present invention may be implemented as a program (or application) and stored in a medium in order to be executed in combination with a computer that is hardware.
  • Components of the present invention may be implemented as software programming or software elements; similarly, embodiments may include various algorithms implemented as combinations of data structures, processes, routines, or other programming elements, and may be implemented in a programming or scripting language such as C, C++, Java, or assembler. Functional aspects may be implemented as algorithms running on one or more processors.

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Computational Linguistics (AREA)
  • Molecular Biology (AREA)
  • Artificial Intelligence (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Health & Medical Sciences (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Multimedia (AREA)
  • Image Analysis (AREA)
  • Processing Or Creating Images (AREA)
  • Image Processing (AREA)

Abstract

The present invention relates to a method for removing an object from an image by using artificial intelligence. The method for removing an object from an image by using artificial intelligence comprises: capturing an image (S110); recognizing at least one object in the captured image (S120); determining a first object to be removed from the captured image (S130); acquiring data for generating a background image covered by the first object in the captured image (S140); generating the covered background image (S150); and synthesizing the covered background image with the captured image to generate a result image with the first object removed therefrom (S160).
PCT/KR2020/008267 2019-06-26 2020-06-25 Procédé pour éliminer un objet dans une image par utilisation de l'intelligence artificielle WO2020262977A1 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
KR1020190076115A KR102231794B1 (ko) 2019-06-26 2019-06-26 인공지능을 이용하여 이미지의 객체를 제거하는 방법
KR10-2019-0076115 2019-06-26

Publications (1)

Publication Number Publication Date
WO2020262977A1 true WO2020262977A1 (fr) 2020-12-30

Family

ID=74060295

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/KR2020/008267 WO2020262977A1 (fr) 2019-06-26 2020-06-25 Procédé pour éliminer un objet dans une image par utilisation de l'intelligence artificielle

Country Status (2)

Country Link
KR (1) KR102231794B1 (fr)
WO (1) WO2020262977A1 (fr)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20210304449A1 (en) * 2020-03-26 2021-09-30 Snap Inc. Machine learning-based modification of image content

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR102403166B1 (ko) * 2021-09-29 2022-05-30 주식회사 인피닉 기계 학습용 데이터 증강 방법 및 이를 실행하기 위하여 기록매체에 기록된 컴퓨터 프로그램
KR20230155718A (ko) * 2022-05-04 2023-11-13 서울대학교산학협력단 이미지에서의 측색 방법, 장치, 및 컴퓨터-판독가능 매체
WO2024085352A1 (fr) * 2022-10-18 2024-04-25 삼성전자 주식회사 Procédé et dispositif électronique pour générer des données d'apprentissage pour l'apprentissage d'un modèle d'intelligence artificielle
WO2024112185A1 (fr) * 2022-11-27 2024-05-30 삼성전자주식회사 Dispositif pouvant être porté pour commander l'affichage d'un objet visuel correspondant à un objet externe, et procédé associé

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR100588002B1 (ko) * 2004-12-07 2006-06-08 한국전자통신연구원 동영상에서 배경을 복원하기 위한 장치 및 그 방법
JP2009302902A (ja) * 2008-06-13 2009-12-24 Nikon Corp カメラ
US20110103644A1 (en) * 2009-10-30 2011-05-05 Zoran Corporation Method and apparatus for image detection with undesired object removal
KR20150009184A (ko) * 2013-07-16 2015-01-26 삼성전자주식회사 카메라를 구비하는 장치의 이미지 처리장치 및 방법
JP2018147019A (ja) * 2017-03-01 2018-09-20 株式会社Jストリーム オブジェクト抽出装置、オブジェクト認識システム及びメタデータ作成システム

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR102438201B1 (ko) 2017-12-01 2022-08-30 삼성전자주식회사 사진 촬영과 관련된 추천 정보를 제공하는 방법 및 시스템

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR100588002B1 (ko) * 2004-12-07 2006-06-08 한국전자통신연구원 동영상에서 배경을 복원하기 위한 장치 및 그 방법
JP2009302902A (ja) * 2008-06-13 2009-12-24 Nikon Corp カメラ
US20110103644A1 (en) * 2009-10-30 2011-05-05 Zoran Corporation Method and apparatus for image detection with undesired object removal
KR20150009184A (ko) * 2013-07-16 2015-01-26 삼성전자주식회사 카메라를 구비하는 장치의 이미지 처리장치 및 방법
JP2018147019A (ja) * 2017-03-01 2018-09-20 株式会社Jストリーム オブジェクト抽出装置、オブジェクト認識システム及びメタデータ作成システム

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
RAKSHITH SHETTY, MARIO FRITZ, BERNT SCHIELE: "Adversarial Scene Editing: Automatic Object Removal from Weak Supervision", ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 31 : 32ND CONFERENCE ON NEURAL INFORMATION PROCESSING SYSTEMS (NEURIPS 2018) : MONTRÉAL, CANADA, 3-8 DECEMBER 2018, 3 December 2018 (2018-12-03), pages 7717 - 7727, XP009525358, ISBN: 978-1-5108-8447-2, DOI: 10.5555/3327757.3327869 *

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20210304449A1 (en) * 2020-03-26 2021-09-30 Snap Inc. Machine learning-based modification of image content

Also Published As

Publication number Publication date
KR102231794B1 (ko) 2021-03-25
KR20210000872A (ko) 2021-01-06

Similar Documents

Publication Publication Date Title
WO2020262977A1 (fr) Procédé pour éliminer un objet dans une image par utilisation de l'intelligence artificielle
WO2020085694A1 (fr) Dispositif de capture d'image et procédé de commande associé
WO2021029648A1 (fr) Appareil de capture d'image et procédé de photographie auxiliaire associé
WO2020032354A1 (fr) Procédé, support de stockage et appareil pour convertir un ensemble d'images 2d en un modèle 3d
WO2019050360A1 (fr) Dispositif électronique et procédé de segmentation automatique d'être humain dans une image
WO2021095916A1 (fr) Système de suivi pouvant suivre le trajet de déplacement d'un objet
WO2016006946A1 (fr) Système de création et de reproduction de contenus de réalité augmentée, et procédé l'utilisant
WO2020130747A1 (fr) Appareil et procédé de traitement d'image pour transformation de style
KR20080060265A (ko) 디지털 영상 컬렉션의 특정 인물 식별 방법
WO2019125029A1 (fr) Dispositif électronique permettant d'afficher un objet dans le cadre de la réalité augmentée et son procédé de fonctionnement
WO2019066373A1 (fr) Procédé de correction d'image sur la base de catégorie et de taux de reconnaissance d'objet inclus dans l'image et dispositif électronique mettant en œuvre celui-ci
US10133932B2 (en) Image processing apparatus, communication system, communication method and imaging device
WO2020032383A1 (fr) Dispositif électronique permettant de fournir un résultat de reconnaissance d'un objet externe à l'aide des informations de reconnaissance concernant une image, des informations de reconnaissance similaires associées à des informations de reconnaissance, et des informations de hiérarchie, et son procédé d'utilisation
WO2018117538A1 (fr) Procédé d'estimation d'informations de voie et dispositif électronique
WO2016126083A1 (fr) Procédé, dispositif électronique et support d'enregistrement pour notifier des informations de situation environnante
WO2013085278A1 (fr) Dispositif de surveillance faisant appel à un modèle d'attention sélective et procédé de surveillance associé
WO2021025509A1 (fr) Appareil et procédé d'affichage d'éléments graphiques selon un objet
WO2020189909A2 (fr) Système et procédé de mise en oeuvre d'une solution de gestion d'installation routière basée sur un système multi-capteurs 3d-vr
EP3922036A1 (fr) Appareil et procédé de génération d'image
WO2019190142A1 (fr) Procédé et dispositif de traitement d'image
WO2021149947A1 (fr) Dispositif électronique et procédé de commande de dispositif électronique
EP3707678A1 (fr) Procédé et dispositif de traitement d'image
WO2020122513A1 (fr) Procédé de traitement d'image bidimensionnelle et dispositif d'exécution dudit procédé
WO2020036468A1 (fr) Procédé d'application d'effet bokeh sur une image et support d'enregistrement
JPH10124655A (ja) デジタルアルバムの作成装置及びデジタルアルバム装置

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20831107

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 20831107

Country of ref document: EP

Kind code of ref document: A1

32PN Ep: public notification in the ep bulletin as address of the adressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 29.04.2022)
