US20210090232A1 - Single Image Completion From Retrieved Image Collections - Google Patents

Single Image Completion From Retrieved Image Collections

Info

Publication number
US20210090232A1
Authority
US
United States
Prior art keywords
neural network
image
images
subset
threshold value
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Abandoned
Application number
US17/110,290
Inventor
Stephen Gould
Samuel Toyer
David Reiner
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Seesure
Original Assignee
Seesure
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Seesure filed Critical Seesure
Priority to US17/110,290
Publication of US20210090232A1
Legal status: Abandoned

Classifications

    • G06T5/77
    • G06T7/001 Industrial image inspection using an image reference approach
    • G06N3/08 Computing arrangements based on biological models; Neural networks; Learning methods
    • G06T5/60
    • G06T7/97 Determining parameters from multiple pictures
    • G06T2207/20004 Adaptive image processing
    • G06T2207/20081 Training; Learning
    • G06T2207/20084 Artificial neural networks [ANN]

Abstract

A method of completing a masked image includes, in part, identifying a multitude of images that are visually similar to the masked image, retrieving a first subset of images from the multitude of images, setting parameters of a neural network to a first set of values in accordance with the data represented by the first retrieved subset, and using the neural network with the first set of parameters to complete the masked image if the neural network having the first set of parameters is determined to meet a threshold value. If the neural network having the first set of parameters is determined not to meet the threshold value, in an iterative manner, another subset of images different from the first subset is retrieved, and the parameters of the neural network are then updated in accordance with the data represented by the other subset until the threshold value is met.

Description

    CROSS-REFERENCE TO RELATED APPLICATION
  • This patent application is a continuation of U.S. application Ser. No. 16/394,410, filed Apr. 25, 2019, which claims priority from U.S. Provisional Application Ser. No. 62/662,699, filed Apr. 25, 2018, each of which is incorporated herein by reference in its entirety.
  • FIELD OF THE INVENTION
  • The present invention relates to artificial intelligence, and more particularly to forming images using artificial intelligence.
  • BACKGROUND OF THE INVENTION
  • A common operation in digital image manipulation and editing is to remove an object or region from an existing image. Image editing software such as Photoshop and GIMP provides tools, such as "smart erase" and "content-aware fill," for semi-automating the removal operation. Usually the process of removing an object or region and filling in the background is iterative, with only a small number of pixels erased at each step. The erased pixels and the immediately surrounding pixels are re-painted and blended with the rest of the image. Using these tools, a user can guide the software to erase large objects and fill in the missing parts of the image to make the manipulated image look as natural as possible. A person viewing the manipulated image should therefore not be able to easily tell that it has been modified in any way; more particularly, the filled-in region should be a plausible reconstruction of what the scene could have looked like had the image been taken without the removed object or region in it.
  • More recently, algorithms have been developed for automatically filling image regions that have been erased with plausible reconstructions. Such algorithms, known as image completion or inpainting algorithms, are provided with an image and a masked region and attempt to paint pixels into the masked region with the same goal as stated above, namely that a person viewing the automatically completed image should not be able to perceive that it has been manipulated in any way. State-of-the-art techniques use deep neural network models, and, more particularly, convolutional neural networks (CNNs), that have been trained on a large corpus of images to reconstruct missing parts of images in the corpus. In some cases, additional cues are given to guide the image completion, such as user-provided boundary sketches or an auxiliary image with the desired target texture or features.
  • An example of an image, mask and automatically completed region is shown in FIGS. 1A-1D. FIG. 1A shows a cup 10 positioned in front of a tape dispenser 12, a stapler 14, a pen holder 16, a number of other pens/pencils 18 that appear to have been placed in a second pen holder not visible in FIG. 1A, a ruler 20, a notepad 22 and a pen 24. FIG. 1B shows a mask 30. FIG. 1C shows mask 30 after it is superimposed on a region of FIG. 1A. FIG. 1D shows what FIG. 1A would have looked like if the masked region were not present in FIG. 1A.
  • BRIEF SUMMARY OF THE INVENTION
  • A method of training a neural network to complete an image when masked, in accordance with one embodiment of the present invention, includes, in part, identifying a multitude of images that are visually similar to the image being masked, forming a first subset of images from the multitude of images, setting the parameters of a neural network to a first set of values in accordance with the data represented by the first subset, and using the neural network having the first set of parameter values to complete the masked image if the neural network having the first set of parameter values is determined to meet a threshold value.
  • If the neural network having the first set of parameters is determined not to meet the threshold value, another subset of images, different from the first subset, is formed from the multitude of images. The parameters of the neural network are then updated in accordance with the data represented by the newly formed subset. A determination is then made to assess whether the neural network with the updated parameter values meets a threshold value. If so, the neural network with the updated parameter values is applied to complete the masked image. If not, the process of forming another subset of images and updating the parameters of the neural network in accordance with the new subset of images is repeated iteratively until the threshold value is met.
  • In one embodiment, the threshold value is defined by a convergence of the neural network. In another embodiment, the threshold value is defined by a maximum number of updates to the parameters. In one embodiment, the multitude of images are identified by searching a collection of images. In one embodiment, the mask is variable.
  • In one embodiment, the method further includes, in part, performing a post-processing on the completed image. In one embodiment, each image subset is formed by sampling image-mask pairs. In one embodiment, the sampling is a random sampling. In one embodiment, the neural network is a convolutional neural network.
  • A computer system with a neural network and configured to complete an image when masked, in accordance with one embodiment of the present invention, is further configured to identify a multitude of images that are visually similar to the image being masked, form a first subset of images from the multitude of images, set the parameters of the neural network to a first set of values in accordance with the data represented by the first subset, and use the neural network having the first set of parameter values to complete the masked image if the neural network having the first set of parameter values is determined to meet a threshold value.
  • If the neural network having the first set of parameters is determined not to meet the threshold value, the computer system forms another subset of images, different from the first subset, from the multitude of images. The computer system then updates the parameters of the neural network in accordance with the data represented by the newly formed subset. A determination is then made to assess whether the neural network with the updated parameter values meets a threshold value. If so, the neural network with the updated parameter values is applied to complete the masked image. If not, the computer system repeats the process of forming another subset of images and updating the parameters of the neural network in accordance with the new subset of images iteratively until the threshold value is met.
  • In one embodiment, the threshold value is defined by a convergence of the neural network. In another embodiment, the threshold value is defined by a maximum number of updates to the parameters. In one embodiment, the multitude of images are identified by searching a collection of images. In one embodiment, the mask is variable.
  • In one embodiment, the computer system is further configured to, in part, perform a post-processing on the completed image. In one embodiment, each image subset is formed by sampling image-mask pairs. In one embodiment, the sampling is a random sampling. In one embodiment, the neural network is a convolutional neural network.
  • BRIEF DESCRIPTION OF THE DRAWINGS
  • FIG. 1A shows an image, as known in the prior art.
  • FIG. 1B shows a mask, as known in the prior art.
  • FIG. 1C shows the mask of FIG. 1B superimposed on a region of FIG. 1A, as known in the prior art.
  • FIG. 1D shows what FIG. 1A would look like when the region covered by the mask is removed therefrom, as known in the prior art.
  • FIG. 2A shows an image, as known in the prior art.
  • FIG. 2B shows a collection of images, as known in the prior art.
  • FIG. 2C shows a set of images retrieved from the collection of FIG. 2B that are visually similar to the image of FIG. 2A, as known in the prior art.
  • FIG. 3 shows a flowchart for training a convolutional neural network to perform image completion, as known in the prior art.
  • FIG. 4 shows a flowchart for applying a trained convolutional neural network to perform image completion, as known in the prior art.
  • FIG. 5 shows a flowchart for training a convolutional neural network and applying the trained neural network to perform image completion, in accordance with one embodiment of the present invention.
  • FIG. 6 is a simplified block diagram of an exemplary computing device, in which the various aspects of the present invention may be embodied.
  • DETAILED DESCRIPTION OF THE INVENTION
  • Embodiments of the present invention improve the quality of image completion by guiding a deep neural network based model with images that have similar content to the one being edited. More specifically, embodiments of the present invention fine-tune the parameters of the neural network based model using a set of images that have been retrieved from a content-based search procedure, which looks for images that appear similar (but not identical) to the one from which the object or region is being removed. Because the image completion model is refined for every specific image being edited, embodiments of the present invention complete the masked regions with more relevant content than a pre-trained model.
  • FIG. 2A is an image 45 of interest, i.e. the query image, showing a surfer riding a wave. FIG. 2B shows a collection of images 50 being searched to determine therefrom images matching image 45 shown in FIG. 2A. FIG. 2C shows images 52, 54, 56, retrieved from image collection 50, matching query image 45. Image collection 50 may include millions of images of diverse scenes from different viewpoints, lighting conditions, etc. The search results forming the retrieved set, shown in FIG. 2C, may include hundreds to thousands of images similar in appearance to query image 45.
  • The construction of the retrieved set (such as the set shown in FIG. 2C) used to fine-tune the image inpainting may be achieved using Content-Based Image Retrieval (CBIR) techniques. Such techniques may use a descriptor function, a similarity measure, and a nearest-neighbor search method. The descriptor function d(I), such as the Fuzzy Color and Texture Histogram (FCTH) descriptor or the Color and Edge Directivity Descriptor (CEDD), maps a high-dimensional image I to a low-dimensional vector, which captures the global appearance and structure of that image.
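For illustration only, a minimal sketch of such a descriptor function follows, using a joint RGB color histogram as a simplified stand-in for FCTH or CEDD (which additionally encode texture and edge-directivity information); the function name and bin count are illustrative choices, not part of the patent.

```python
import numpy as np

def global_descriptor(image: np.ndarray, bins: int = 8) -> np.ndarray:
    """Map an H x W x 3 uint8 image to a low-dimensional global descriptor d(I).

    A joint RGB color histogram stands in here for FCTH/CEDD, which also
    capture texture and edge structure.
    """
    hist, _ = np.histogramdd(
        image.reshape(-1, 3).astype(np.float64),
        bins=(bins, bins, bins),
        range=((0, 256), (0, 256), (0, 256)),
    )
    d = hist.ravel()
    # L2-normalize so that a dot product between descriptors equals cosine similarity
    return d / (np.linalg.norm(d) + 1e-12)
```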
  • A similarity measure s(d1, d2) provides a degree of similarity between two descriptors d1 and d2 under the assumption that images with similar descriptors must have similar appearance. For instance, this measure may be the negative squared Euclidean distance s(a, b) = −‖a − b‖² or the cosine similarity measure s(a, b) = (a · b) / (‖a‖ ‖b‖).
  • The nearest-neighbor search method is used to scan a large database of pre-computed descriptors d1, . . . , dN to find the K descriptors d1, . . . , dK, and hence K images, which have maximum similarity s(dq, di) to a query descriptor dq. The search may be accelerated using specialized data structures like the k-d tree or approximation methods such as locality sensitive hashing (LSH).
  • Embodiments of the present invention provide a method and a system for image completion. To achieve this, the descriptors d1, . . . , dN of each of the N images in an image collection are pre-computed and added to a database. In one aspect, a data structure for accelerating subsequent search queries may be formed. Next, the descriptor of the query image (i.e., the image that has been provided) is computed and used as the query descriptor dq. Thereafter, a nearest-neighbor search method is used, for example, to find the K closest descriptors to dq. The images corresponding to these descriptors constitute the retrieval set.
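A minimal sketch of this retrieval pipeline, reusing the `global_descriptor` stand-in above and a brute-force scan (for a large N, a k-d tree or an LSH index would replace the matrix product):

```python
import numpy as np

def build_descriptor_db(images) -> np.ndarray:
    """Pre-compute and stack the descriptors d1, ..., dN of the image collection."""
    return np.stack([global_descriptor(img) for img in images])

def retrieve_top_k(query_image: np.ndarray, db: np.ndarray, k: int = 500) -> np.ndarray:
    """Return the indices of the K images most similar to the query image.

    Because the descriptors are L2-normalized, the dot product db @ d_q
    computes the cosine similarity s(d_q, d_i) for every database descriptor.
    """
    d_q = global_descriptor(query_image)
    sims = db @ d_q
    return np.argsort(-sims)[:k]  # indices of the K nearest neighbors
```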
  • FIG. 3 shows a conventional flowchart 100 used to train a convolutional neural network to perform image completion. Training starts at 102, subsequent to which a convolutional neural network architecture and its associated parameters are initialized at 104. Next, at 106, a mini-batch is created by sampling image-mask pairs. The training algorithm then iteratively updates the convolutional neural network parameters to improve the quality of image completion on a set of training images. Each mini-batch may include tens to hundreds of images obtained by selecting a subset of images from the training dataset and randomly generating masks for those images. In some methods the masks may be pre-determined. The parameters of the convolutional neural network are then updated at 108, typically using stochastic gradient descent with gradients computed via backpropagation. The image completion algorithm with the current parameter settings then infills the masked regions on a validation set of images, and their quality is assessed. Training iterations are repeated at 110 until convergence or until a maximum number of iterations, typically many thousands, is reached. The training ends at 112. With a trained convolutional neural network image completion model in hand, a user can provide new images and masks for the algorithm to complete.
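A condensed sketch of this conventional training loop is shown below; the `model` interface (taking a masked image and its mask), the `sample_batch` helper, and the L1 reconstruction loss are assumptions made for illustration, since the patent does not fix a specific architecture or loss.

```python
import torch

def train_inpainting_cnn(model, sample_batch, max_iters=100_000, lr=1e-4, tol=1e-6):
    """Sketch of flowchart 100: iterate image-mask mini-batches and update
    the CNN parameters by stochastic gradient descent."""
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    prev_loss = float("inf")
    for step in range(max_iters):                        # iterations repeated at 110
        images, masks = sample_batch()                   # image-mask pairs (106)
        pred = model(images * (1 - masks), masks)
        loss = ((pred - images) * masks).abs().mean()    # penalize only the infilled region
        opt.zero_grad()
        loss.backward()                                  # gradients via backpropagation
        opt.step()                                       # parameter update (108)
        if abs(prev_loss - loss.item()) < tol:           # crude convergence test
            break
        prev_loss = loss.item()
    return model
```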
  • FIG. 4 shows a conventional flowchart 200 for invoking a trained convolutional neural network based model to complete a masked region of a new image provided by a user. After obtaining the image and mask from the user at 202, the trained convolutional neural network is applied at 204 to perform image completion. Thereafter, following a post-processing step at 206 to further improve the perceptual quality of the infilled region by blending it with the surrounding pixels, the process ends at 208.
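One simple form of the post-processing at 206 is to feather the mask boundary so the infilled pixels transition smoothly into their surroundings; the Gaussian feathering below is an illustrative choice, not the patent's prescribed method.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def blend_infill(original, completed, mask, sigma=2.0):
    """Alpha-blend the completed region into the original image using a
    feathered (Gaussian-smoothed) version of the binary H x W mask."""
    soft = gaussian_filter(mask.astype(np.float32), sigma)[..., None]
    return soft * completed + (1.0 - soft) * original
```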
  • FIG. 5 shows a flowchart 300 for performing image completion, in accordance with one embodiment of the present invention. At 302 the query image and mask are obtained from the user. Next, at 304, a relatively large collection of images is searched to identify and retrieve images that are visually similar to the query image. At 306, a mini-batch is formed from the images so retrieved. The mini-batch formed at 306 during each iteration may include tens to hundreds of images. Then, at 308, the parameters of the neural network are updated based on the mini-batch obtained at 306 to improve the quality of image completion on the mini-batch.
  • If at 310 a determination is made that the trained neural network having the updated parameters meets a predefined threshold value, characterized either by a convergence criterion or by a maximum number of iterations of the loop involving 306, 308, and 310, the trained neural network is applied to image completion at 312. To achieve this, in one embodiment, the image completion algorithm with the current parameter settings infills the masked region to determine its quality. Thereafter, a post-processing step is performed at 314 to further improve the perceptual quality of the infilled region by blending it with the surrounding pixels, following which image completion ends at 316. Because the parameters of the neural network are changed to improve the quality of image completion on images in the retrieved set that are similar to the one provided by the user, the final quality of the image provided by a neural network trained in accordance with embodiments of the present invention is substantially enhanced.
  • If at 310 a determination is made that the trained neural network having the updated parameters does not meet the predefined threshold, the process moves to 306, at which point a new mini-batch is created by sampling image-mask pairs. The parameters of the neural network are then updated at 308 based on the newly created mini-batch. Thereafter, a determination is made at 310 as to whether the neural network having parameters updated in accordance with the newly created mini-batch meets the threshold value, in an iterative manner and as described above. In one embodiment, the neural network is a convolutional neural network.
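Putting flowchart 300 together, a minimal sketch of the per-image fine-tuning loop follows, reusing the retrieval helpers above. The optimizer, learning rate, batch size, and the `to_tensor` and `sample_mask` helpers are assumptions; the threshold is realized here as a maximum number of parameter updates.

```python
import random
import torch

def finetune_and_complete(model, query_image, query_mask, collection,
                          to_tensor, sample_mask, k=500, batch_size=16,
                          max_updates=200, lr=1e-4):
    """Fine-tune a pre-trained completion model on retrieved similar images
    (steps 304-310), then complete the user's masked image (step 312)."""
    db = build_descriptor_db(collection)
    retrieval_set = [collection[i] for i in retrieve_top_k(query_image, db, k)]  # 304
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    for update in range(max_updates):                     # threshold test (310)
        batch = random.sample(retrieval_set, batch_size)  # new mini-batch (306)
        images = torch.stack([to_tensor(img) for img in batch])
        masks = torch.stack([sample_mask(img) for img in batch])
        pred = model(images * (1 - masks), masks)
        loss = ((pred - images) * masks).abs().mean()
        opt.zero_grad()
        loss.backward()
        opt.step()                                        # parameter update (308)
    q = to_tensor(query_image)[None]
    m = to_tensor(query_mask)[None]
    return model(q * (1 - m), m)                          # image completion (312)
```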
  • In one embodiment of the invention, in creating a mini-batch by sampling image-mask pairs, as shown at 306 of flowchart 300, the masked region for each image in a mini-batch is a randomly sampled rectangle whose location and size may vary over the image. In another embodiment of the invention, the masked region for each image in a mini-batch is sampled so as to be of similar relative size and location to the masked region of the user-provided image and mask. Images from the retrieval (also referred to herein as retrieved) set may be contained in multiple mini-batches. An image selected for multiple mini-batches may have different sampled masked regions in different mini-batches. Thus, the convolutional neural network algorithm learns to complete different regions of an image even though it is presented with the same image in different mini-batches. In one embodiment, each image from the retrieval set is sampled for a mini-batch before any image is sampled again for inclusion in a subsequent mini-batch. Although not shown in flowchart 300, in some embodiments, the parameters of the neural network are either initialized with random values or pre-trained to facilitate the further training of the neural network for the image completion task at hand, as described in detail above.
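The two mask-sampling strategies described above might be sketched as follows; the size ranges and the jitter factor are illustrative parameters.

```python
import numpy as np

def random_rect_mask(h, w, rng=np.random):
    """Randomly sampled rectangle whose location and size vary over the image."""
    mh, mw = rng.randint(h // 8, h // 2), rng.randint(w // 8, w // 2)
    top, left = rng.randint(0, h - mh + 1), rng.randint(0, w - mw + 1)
    mask = np.zeros((h, w), dtype=np.float32)
    mask[top:top + mh, left:left + mw] = 1.0
    return mask

def matched_rect_mask(h, w, user_mask, jitter=0.1, rng=np.random):
    """Rectangle of similar relative size and location to the user-provided
    mask, jittered slightly so different mini-batches see varied masks."""
    ys, xs = np.nonzero(user_mask)
    top, left = ys.min(), xs.min()
    mh, mw = ys.max() - top + 1, xs.max() - left + 1
    top = int(np.clip(top + jitter * h * (rng.rand() - 0.5), 0, h - mh))
    left = int(np.clip(left + jitter * w * (rng.rand() - 0.5), 0, w - mw))
    mask = np.zeros((h, w), dtype=np.float32)
    mask[top:top + mh, left:left + mw] = 1.0
    return mask
```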
  • FIG. 6 is an exemplary block diagram of a computing device 600 that may incorporate embodiments of the present invention. FIG. 6 is merely illustrative of a machine system to carry out aspects of the technical processes described herein, and does not limit the scope of the claims. One of ordinary skill in the art would recognize other variations, modifications, and alternatives. In one embodiment, the computing device 600 includes a monitor or graphical user interface 602, a data processing system 620, a communication network interface 612, input device(s) 608, output device(s) 606, and the like.
  • As depicted in FIG. 6, the data processing system 620 may include one or more central processing units (CPU) or graphical processing units 604 (collectively referred to herein as processor(s)) that communicate with a number of peripheral devices via a bus subsystem 618. These peripheral devices may include input device(s) 608, output device(s) 606, communication network interface 612, and a storage subsystem, such as a volatile memory 610 and a nonvolatile memory 614.
  • The volatile memory 610 and/or the nonvolatile memory 614 may store computer-executable instructions, thus forming logic 622 that, when executed by the processor(s) 604, implements embodiments of the processes disclosed herein.
  • The input device(s) 608 include devices and mechanisms for inputting information to the data processing system 620. These may include a keyboard, a keypad, a touch screen incorporated into the monitor or graphical user interface 602, audio input devices such as voice recognition systems, microphones, and other types of input devices. In various embodiments, the input device(s) 608 may be embodied as a computer mouse, a trackball, a track pad, a joystick, wireless remote, drawing tablet, voice command system, eye tracking system, and the like. The input device(s) 608 typically allow a user to select objects, icons, control areas, text and the like that appear on the monitor or graphical user interface 602 via a command such as a click of a button or the like.
  • The output device(s) 606 include devices and mechanisms for outputting information from the data processing system 620. These may include speakers, printers, infrared LEDs, and so on as well understood in the art.
  • The communication network interface 612 provides an interface to communication networks (e.g., communication network 616) and devices external to the data processing system 620. The communication network interface 612 may serve as an interface for receiving data from and transmitting data to other systems. Embodiments of the communication network interface 612 may include an Ethernet interface, a modem (telephone, satellite, cable, ISDN), (asynchronous) digital subscriber line (DSL), FireWire, USB, a wireless communication interface such as Bluetooth or Wi-Fi, a near-field communication wireless interface, a cellular interface, and the like.
  • The communication network interface 612 may be coupled to the communication network 616 via an antenna, a cable, or the like. In some embodiments, the communication network interface 612 may be physically integrated on a circuit board of the data processing system 620, or in some cases may be implemented in software or firmware, such as “soft modems”, or the like.
  • The computing device 600 may include logic that enables communications over a network using protocols such as HTTP, TCP/IP, RTP/RTSP, IPX, UDP and the like.
  • The volatile memory 610 and the nonvolatile memory 614 are examples of tangible media configured to store computer readable data and instructions to implement various embodiments of the processes described herein. Other types of tangible media include removable memory (e.g., pluggable USB memory devices, mobile device SIM cards), optical storage media such as CD-ROMs and DVDs, semiconductor memories such as flash memories, non-transitory read-only memories (ROMs), battery-backed volatile memories, networked storage devices, and the like. The volatile memory 610 and the nonvolatile memory 614 may be configured to store the basic programming and data constructs that provide the functionality of the disclosed processes and other embodiments thereof that fall within the scope of the present invention.
  • Logic 622 that implements embodiments of the present invention may be stored in the volatile memory 610 and/or the nonvolatile memory 614. Said software may be read from the volatile memory 610 and/or nonvolatile memory 614 and executed by the processor(s) 604. The volatile memory 610 and the nonvolatile memory 614 may also provide a repository for storing data used by the software.
  • The volatile memory 610 and the nonvolatile memory 614 may include a number of memories including a main random access memory (RAM) for storage of instructions and data during program execution and a read only memory (ROM) in which read-only non-transitory instructions are stored. The volatile memory 610 and the nonvolatile memory 614 may include a file storage subsystem providing persistent (non-volatile) storage for program and data files. The volatile memory 610 and the nonvolatile memory 614 may include removable storage systems, such as removable flash memory.
  • The bus subsystem 618 provides a mechanism for enabling the various components and subsystems of data processing system 620 to communicate with each other as intended. Although the bus subsystem 618 is depicted schematically as a single bus, some embodiments of the bus subsystem 618 may utilize multiple distinct busses.
  • It will be readily apparent to one of ordinary skill in the art that the computing device 600 may be a device such as a smartphone, a desktop computer, a laptop computer, a rack-mounted computer system, a computer server, or a tablet computer device. As commonly known in the art, the computing device 600 may be implemented as a collection of multiple networked computing devices. Further, the computing device 600 will typically include operating system logic (not illustrated) the types and nature of which are well known in the art.
  • Those having skill in the art will appreciate that there are various logic implementations by which processes and/or systems described herein can be effected (e.g., hardware, software, or firmware), and that the preferred vehicle will vary with the context in which the processes are deployed. If an implementer determines that speed and accuracy are paramount, the implementer may opt for a hardware or firmware implementation; alternatively, if flexibility is paramount, the implementer may opt for a solely software implementation; or, yet again alternatively, the implementer may opt for some combination of hardware, software, or firmware. Hence, there are numerous possible implementations by which the processes described herein may be effected, none of which is inherently superior to the others, in that any vehicle to be utilized is a choice dependent upon the context in which the implementation will be deployed and the specific concerns (e.g., speed, flexibility, or predictability) of the implementer, any of which may vary. Those skilled in the art will recognize that optical aspects of implementations may involve optically-oriented hardware, software, and/or firmware.
  • The above embodiments of the present invention are illustrative and not limitative. Other additions, subtractions or modifications are obvious in view of the present disclosure and are intended to fall within the scope of the appended claims.

Claims (20)

What is claimed is:
1. A method of training a neural network to complete an image when masked, the method comprising:
identifying a plurality of images that are visually similar to the image;
forming a first subset of images from the plurality of images;
setting parameters of a neural network to a first set of values in accordance with data represented by the first subset;
using the neural network having the first set of parameter values to complete the masked image if the neural network having the first set of parameter values is determined to meet a threshold value;
forming another subset of images, different from the first subset, from the plurality of images if the neural network having the first set of parameter values is determined not to meet the threshold value;
updating the parameters of the neural network in accordance with data represented by the other subset;
applying the neural network having the updated parameter values to complete the masked image if the neural network having the updated parameter values is determined to meet the threshold value; and
repeating the forming of another set and the updating of the parameters if the neural network having the updated set of parameter values is determined not to meet the threshold value.
2. The method of claim 1 wherein said threshold value is defined by a convergence of the neural network.
3. The method of claim 1 wherein said threshold value is defined by a number of updates of the parameters.
4. The method of claim 1 wherein said plurality of images are identified by searching a collection of images.
5. The method of claim 1 wherein said mask is variable.
6. The method of claim 1 further comprising performing a post-processing on the completed image.
7. The method of claim 1 wherein forming the first subset comprises sampling image-mask pairs.
8. The method of claim 7 wherein said sampling is a random sampling.
9. The method of claim 1 wherein the neural network is a convolutional neural network.
10. The method of claim 1 wherein the neural network is pre-trained to perform image completion.
11. A computer system comprising a neural network configured to complete an image when masked, the computer system further configured to:
identify a plurality of images that are visually similar to the image;
form a first subset of images from the plurality of images;
set parameters of a neural network to a first set of values in accordance with data represented by the first subset;
use the neural network having the first set of parameter values to complete the masked image if the neural network having the first set of parameter values is determined to meet a threshold value;
form another subset of images, different from the first subset, from the plurality of images if the neural network having the first set of parameter values is determined not to meet the threshold value;
update the parameters of the neural network in accordance with data represented by the other subset; and
apply the neural network having the updated parameter values to complete the masked image if the neural network having the updated parameter values is determined to meet the threshold value; and
repeat the forming of another set and the updating of the parameters if the neural network having the updated set of parameter values is determined not to meet the threshold value.
12. The computer system of claim 11 wherein said threshold value is defined by a convergence of the neural network.
13. The computer system of claim 11 wherein said threshold value is defined by a number of updates of the parameters.
14. The computer system of claim 11 wherein said plurality of images are identified by searching a collection of images.
15. The computer system of claim 11 wherein said mask is variable.
16. The computer system of claim 11 wherein said computer system is further configured to post-process the completed masked image.
17. The computer system of claim 11 wherein the first subset is formed by sampling image-mask pairs.
18. The computer system of claim 17 wherein said sampling is a random sampling.
19. The computer system of claim 11 wherein the neural network is a convolutional neural network.
20. The computer system of claim 11 wherein the neural network is pre-trained to perform image completion.

Priority Applications (1)

Application Number Priority Date Filing Date Title
US17/110,290 US20210090232A1 (en) 2018-04-25 2020-12-03 Single Image Completion From Retrieved Image Collections

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
US201862662699P 2018-04-25 2018-04-25
US16/394,410 US10885628B2 (en) 2018-04-25 2019-04-25 Single image completion from retrieved image collections
US17/110,290 US20210090232A1 (en) 2018-04-25 2020-12-03 Single Image Completion From Retrieved Image Collections

Related Parent Applications (1)

Application Number Title Priority Date Filing Date
US16/394,410 Continuation US10885628B2 (en) 2018-04-25 2019-04-25 Single image completion from retrieved image collections

Publications (1)

Publication Number Publication Date
US20210090232A1 true US20210090232A1 (en) 2021-03-25

Family

ID=68295076

Family Applications (2)

Application Number Title Priority Date Filing Date
US16/394,410 Active 2039-07-03 US10885628B2 (en) 2018-04-25 2019-04-25 Single image completion from retrieved image collections
US17/110,290 Abandoned US20210090232A1 (en) 2018-04-25 2020-12-03 Single Image Completion From Retrieved Image Collections

Family Applications Before (1)

Application Number Title Priority Date Filing Date
US16/394,410 Active 2039-07-03 US10885628B2 (en) 2018-04-25 2019-04-25 Single image completion from retrieved image collections

Country Status (2)

Country Link
US (2) US10885628B2 (en)
WO (1) WO2019207524A1 (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10885628B2 (en) * 2018-04-25 2021-01-05 Seesure Single image completion from retrieved image collections
CN109993825B (en) * 2019-03-11 2023-06-20 北京工业大学 Three-dimensional reconstruction method based on deep learning
CN112967356A (en) * 2021-03-05 2021-06-15 北京百度网讯科技有限公司 Image filling method and device, electronic device and medium

Citations (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20180357537A1 (en) * 2017-06-12 2018-12-13 Nvidia Corporation Systems and methods for training neural networks with sparse data
US10319364B2 (en) * 2017-05-18 2019-06-11 Telepathy Labs, Inc. Artificial intelligence-based text-to-speech system and method
US20200160175A1 (en) * 2018-11-15 2020-05-21 D-Wave Systems Inc. Systems and methods for semantic segmentation
US10775174B2 (en) * 2018-08-30 2020-09-15 Mapbox, Inc. Map feature extraction system for computer map visualizations
US10885436B1 (en) * 2020-05-07 2021-01-05 Google Llc Training text summarization neural networks with an extracted segments prediction objective
US10885628B2 (en) * 2018-04-25 2021-01-05 Seesure Single image completion from retrieved image collections
US10937169B2 (en) * 2018-12-18 2021-03-02 Qualcomm Incorporated Motion-assisted image segmentation and object detection
US11030782B2 (en) * 2019-11-09 2021-06-08 Adobe Inc. Accurately generating virtual try-on images utilizing a unified neural network framework
US11195280B2 (en) * 2017-06-08 2021-12-07 The United States Of America, As Represented By The Secretary, Department Of Health And Human Services Progressive and multi-path holistically nested networks for segmentation
US11321604B2 (en) * 2017-06-21 2022-05-03 Arm Ltd. Systems and devices for compressing neural network parameters
US11335004B2 (en) * 2020-08-07 2022-05-17 Adobe Inc. Generating refined segmentation masks based on uncertain pixels

Family Cites Families (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20130060790A1 (en) * 2011-09-07 2013-03-07 Michael Chertok System and method for detecting outliers
US20150339838A1 (en) 2012-06-26 2015-11-26 Eyeconit Ltd. Image mask providing a machine-readable data matrix code
WO2016145516A1 (en) * 2015-03-13 2016-09-22 Deep Genomics Incorporated System and method for training neural networks
WO2016197303A1 (en) * 2015-06-08 2016-12-15 Microsoft Technology Licensing, Llc. Image semantic segmentation
US9607391B2 (en) * 2015-08-04 2017-03-28 Adobe Systems Incorporated Image object segmentation using examples
US9864901B2 (en) * 2015-09-15 2018-01-09 Google Llc Feature detection and masking in images based on color distributions
US10540768B2 (en) * 2015-09-30 2020-01-21 Samsung Electronics Co., Ltd. Apparatus and method to segment object from image
US10235771B2 (en) * 2016-11-11 2019-03-19 Qualcomm Incorporated Methods and systems of performing object pose estimation
US10140544B1 (en) * 2018-04-02 2018-11-27 12 Sigma Technologies Enhanced convolutional neural network for image segmentation
US10671855B2 (en) * 2018-04-10 2020-06-02 Adobe Inc. Video object segmentation by reference-guided mask propagation
US10672174B2 (en) * 2018-06-28 2020-06-02 Adobe Inc. Determining image handle locations
US10936912B2 (en) * 2018-11-01 2021-03-02 International Business Machines Corporation Image classification using a mask image and neural networks
US11092899B2 (en) * 2018-11-30 2021-08-17 Taiwan Semiconductor Manufacturing Co., Ltd. Method for mask data synthesis with wafer target adjustment
US10482584B1 (en) * 2019-01-31 2019-11-19 StradVision, Inc. Learning method and learning device for removing jittering on video acquired through shaking camera by using a plurality of neural networks for fault tolerance and fluctuation robustness in extreme situations, and testing method and testing device using the same

Patent Citations (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10319364B2 (en) * 2017-05-18 2019-06-11 Telepathy Labs, Inc. Artificial intelligence-based text-to-speech system and method
US11195280B2 (en) * 2017-06-08 2021-12-07 The United States Of America, As Represented By The Secretary, Department Of Health And Human Services Progressive and multi-path holistically nested networks for segmentation
US20180357537A1 (en) * 2017-06-12 2018-12-13 Nvidia Corporation Systems and methods for training neural networks with sparse data
US11244226B2 (en) * 2017-06-12 2022-02-08 Nvidia Corporation Systems and methods for training neural networks with sparse data
US11321604B2 (en) * 2017-06-21 2022-05-03 Arm Ltd. Systems and devices for compressing neural network parameters
US10885628B2 (en) * 2018-04-25 2021-01-05 Seesure Single image completion from retrieved image collections
US10775174B2 (en) * 2018-08-30 2020-09-15 Mapbox, Inc. Map feature extraction system for computer map visualizations
US20200160175A1 (en) * 2018-11-15 2020-05-21 D-Wave Systems Inc. Systems and methods for semantic segmentation
US10937169B2 (en) * 2018-12-18 2021-03-02 Qualcomm Incorporated Motion-assisted image segmentation and object detection
US11030782B2 (en) * 2019-11-09 2021-06-08 Adobe Inc. Accurately generating virtual try-on images utilizing a unified neural network framework
US10885436B1 (en) * 2020-05-07 2021-01-05 Google Llc Training text summarization neural networks with an extracted segments prediction objective
US11335004B2 (en) * 2020-08-07 2022-05-17 Adobe Inc. Generating refined segmentation masks based on uncertain pixels

Also Published As

Publication number Publication date
WO2019207524A1 (en) 2019-10-31
WO2019207524A9 (en) 2019-12-19
US20190385292A1 (en) 2019-12-19
US10885628B2 (en) 2021-01-05

Legal Events

Date Code Title Description
STPP Information on status: patent application and granting procedure in general

Free format text: APPLICATION DISPATCHED FROM PREEXAM, NOT YET DOCKETED

STPP Information on status: patent application and granting procedure in general

Free format text: DOCKETED NEW CASE - READY FOR EXAMINATION

STPP Information on status: patent application and granting procedure in general

Free format text: NON FINAL ACTION MAILED

STCB Information on status: application discontinuation

Free format text: ABANDONED -- FAILURE TO RESPOND TO AN OFFICE ACTION