US20170069101A1 - Method and system for unsupervised image segmentation using a trained quality metric - Google Patents
- Publication number
- US20170069101A1
- Authority
- US
- United States
- Prior art keywords
- image
- value
- parameter
- segmentation
- segment
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Abandoned
Classifications
- 
        - G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/10—Segmentation; Edge detection
- G06T7/162—Segmentation; Edge detection involving graph-based methods
 
- 
        - G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/10—Segmentation; Edge detection
- G06T7/11—Region-based segmentation
 
- 
        - G06T7/0083—
 
- 
        - G06T7/0085—
 
- 
        - G06T7/0093—
 
- 
        - G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/10—Segmentation; Edge detection
- G06T7/12—Edge-based segmentation
 
- 
        - G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/10—Segmentation; Edge detection
- G06T7/13—Edge detection
 
- 
        - G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/10—Image acquisition modality
- G06T2207/10004—Still image; Photographic image
 
- 
        - G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/10—Image acquisition modality
- G06T2207/10016—Video; Image sequence
 
- 
        - G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20081—Training; Learning
 
Definitions
- the present invention relates to the processing of image data and in particular to segmentation of images, as well as 3-dimensional or higher-dimensional data.
- Segmentation is a key processing step in many applications, ranging for instance from medical imaging to machine vision and video compression technology. Although different approaches to segmentation have been proposed, those based on graphs have attracted many researchers because of their computational efficiency.
- Segmentation algorithms are known to practitioners in the field today. Some examples include the watershed algorithm, and SLIC, a superpixel algorithm based on nearest neighbor aggregation. Typically, these algorithms have a common disadvantage in that they require a scale parameter to be set by a human supervisor. Thus, the practical applications have, in general, involved supervised segmentation. This may limit the range of applications, since in many instances segmentation is to be generated dynamically and there may be no time or opportunity for human supervision.
- a graph-based segmentation algorithm based on the work of P. F. Felzenszwalb and D. P. Huttenlocher is used. They discussed basic principles of segmentation in general and applied these principles to develop an efficient segmentation algorithm based on graph cutting in their paper “Efficient Graph-Based Image Segmentation,” Int. Jour. Comp. Vis., 59(2), September 2004, herein incorporated by reference in its entirety. Felzenszwalb and Huttenlocher stated that any segmentation algorithm should “capture perceptually important groupings or regions, which often reflect global aspects of the image.”
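- Image segmentation is identified by finding a partition of V such that each component is connected, the internal difference between the elements of each component is minimal, and the difference between elements of different components is maximal. This is achieved by the definition of a predicate in Equation (1) that determines if a boundary exists between two adjacent components $C_1$ and $C_2$, that is: $D(C_1, C_2) = \text{true}$ if $\mathrm{Dif}(C_1, C_2) > \mathrm{MInt}(C_1, C_2)$, and $\text{false}$ otherwise. (1)
- $\mathrm{Dif}(C_1, C_2)$ is the difference between the two components, defined as the minimum weight of the set of edges that connects $C_1$ and $C_2$;
- $\mathrm{MInt}(C_1, C_2)$ is the minimum internal difference, defined in Equation (2) as: $\mathrm{MInt}(C_1, C_2) = \min(\mathrm{Int}(C_1) + \tau(C_1),\ \mathrm{Int}(C_2) + \tau(C_2))$, with $\tau(C) = k/|C|$, where, following the cited paper, $\mathrm{Int}(C)$ is the internal difference of component $C$ and $\tau$ is the threshold function. (2)
- The threshold function keeps two small segments from fusing only if there is strong evidence of a difference between them.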
- segment parameter k sets the scale of observation.
- Felzenszwalb and Huttenlocher demonstrate that the algorithm generates a segmentation map that is neither too fine nor too coarse, but the definition of fineness and coarseness ultimately depends on k, which has to be carefully set by the user to obtain a perceptually reasonable segmentation.
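The boundary test can be sketched in Python as follows. This is a minimal illustration, assuming the predicate of the cited Felzenszwalb and Huttenlocher paper, in which MInt(C1, C2) = min(Int(C1) + k/|C1|, Int(C2) + k/|C2|); the `Component` class and the function names are hypothetical and not part of the disclosure.

```python
from dataclasses import dataclass


@dataclass
class Component:
    """Hypothetical summary of one segment (assumption for illustration)."""
    size: int             # |C|: number of pixels in the component
    internal_diff: float  # Int(C): internal difference of the component


def threshold(component: Component, k: float) -> float:
    """Threshold function tau(C) = k / |C| (form assumed from the cited paper)."""
    return k / component.size


def min_internal_difference(c1: Component, c2: Component, k: float) -> float:
    """MInt(C1, C2): the smaller of the two thresholded internal differences."""
    return min(c1.internal_diff + threshold(c1, k),
               c2.internal_diff + threshold(c2, k))


def boundary_exists(dif: float, c1: Component, c2: Component, k: float) -> bool:
    """Predicate: a boundary is kept when Dif(C1, C2) exceeds MInt(C1, C2)."""
    return dif > min_internal_difference(c1, c2, k)


small = Component(size=10, internal_diff=2.0)
large = Component(size=20, internal_diff=1.0)
print(boundary_exists(5.0, small, large, k=10))   # small k keeps the boundary
print(boundary_exists(5.0, small, large, k=100))  # large k merges the components
```

With the same edge weight between the two components, raising k from 10 to 100 removes the boundary, which is how larger values of k lead to coarser segmentations.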
- edges are detected by looking at gray-scale gradient maxima with gradient magnitudes above a threshold value.
- k is this threshold value and needs to be set appropriately for proper segmentation.
- segmentation based on edge-extraction may be used.
- edge thresholds are established based on a strength parameter k.
- a parameter is used to set the scale of observation.
- a human user selects the k value for a particular image. It is however clear that the segmentation quality provided by a certain algorithm is generally related to the quality perceived by a human observer, especially for applications (like video compression) where a human being does constitute the final beneficiary of the output of the algorithm.
- For example, a 640×480 color image is provided in FIG. 1A.
- a graph cut algorithm was used to generate the segmentation results associated with the image of FIG. 1A as discussed herein.
- k is 3
- k is 100
- k is 10,000.
- values of k that are too small may lead to over-segmentation.
- large values of k may introduce under-segmentation.
- Embodiments of the presently disclosed methods can be used to perform segmentation with no supervision, using an algorithm that automatically adapts a segment parameter along the image to generate a segmentation map that is perceptually reasonable for a human observer.
- Embodiments include training a segmentation quality model with a set of training images, classifying the images as over-segmented, well-segmented, or under-segmented for various values of segment parameter k and then defining conditions when segmentation quality is desirable.
- Embodiments of the invention include an illustrative method of segmenting an image.
- Embodiments of the illustrative method include determining a first value of a segment parameter, where the segment parameter relates to a threshold function for establishing a boundary condition between a first segment and a second segment; determining a first value of a similarity function configured to indicate a similarity between the image and its segmentation based on the first value of the segment parameter; comparing the first value of the segment parameter and the first value of the similarity function to a predetermined function; determining a second value of the segment parameter based on a result of the comparing; and segmenting the image based on the second value of the segment parameter.
- a bisection algorithm may be used to find the value of k, such that the point (k, numsegments(k)), where “numsegments(k)” represents the number of segments produced when using the value, k, lies within a predetermined region or on a predetermined curve.
- the similarity function may include a symmetric uncertainty function.
- the predetermined function may be a linear function representing a linear relationship between the log of the segment parameter and the symmetric uncertainty, where a value above the linear function indicates over-segmentation and a value below the linear function indicates under-segmentation. Additional, alternative, and/or overlapping embodiments may include other similarity functions between the image and segmentation map of the image.
- Embodiments of the method of paragraph [0011] further include determining an optimal value of the segment parameter, where the optimal value is a value of the segment parameter that generates a segmentation of the image for which a difference between a corresponding value of the symmetric uncertainty and a portion of the linear relationship is minimized.
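The comparison against the linear function can be sketched as below. This is an illustration only: the base-10 logarithm, the `tolerance` band around the line, and the example slope `m` and intercept `b` are assumptions introduced here, not values taken from the disclosure.

```python
import math


def classify_segmentation(k: float, u: float, m: float, b: float,
                          tolerance: float = 0.02) -> str:
    """Classify the point (log(k), U) relative to the line m*log(k) + b.

    A point above the line is treated as over-segmented and a point below
    it as under-segmented. The tolerance band is a hypothetical way of
    deciding when a point is "close enough" to the line.
    """
    residual = u - (m * math.log10(k) + b)  # log base 10 is an assumption
    if residual > tolerance:
        return "over-segmented"
    if residual < -tolerance:
        return "under-segmented"
    return "well-segmented"


# Hypothetical line parameters, for illustration only.
m, b = -0.1, 0.5
for u in (0.45, 0.30, 0.10):
    print(u, classify_segmentation(100, u, m, b))
```

Under this reading, the optimal value of the segment parameter is the one whose residual is smallest in magnitude.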
- segmenting the image based on the second value of the segment parameter may include dividing the image into a plurality of sub-images; generating a scale map for the image by determining a plurality of values of the segment parameter, wherein each of the plurality of values corresponds to one of the plurality of sub-images; and smoothing the scale map for the image using a filter.
- the filter may be a low-pass filter.
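Generating and smoothing a scale map might look like the following sketch. The text specifies only that a low-pass filter is used; the box (moving-average) filter here is one simple choice made for illustration, and the function names and the tiny block size are hypothetical.

```python
def build_scale_map(block_k, block_h, block_w):
    """Expand one k value per sub-image into a per-pixel scale map k(x, y)."""
    scale_map = []
    for row in block_k:
        pixel_row = [k for k in row for _ in range(block_w)]
        scale_map.extend(list(pixel_row) for _ in range(block_h))
    return scale_map


def box_filter(scale_map, radius):
    """Low-pass filter the map by averaging over a (2*radius+1)^2 window.

    Windows are clipped at the image border, so edge pixels average over
    fewer neighbours.
    """
    height, width = len(scale_map), len(scale_map[0])
    smoothed = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            window = [scale_map[j][i]
                      for j in range(max(0, y - radius), min(height, y + radius + 1))
                      for i in range(max(0, x - radius), min(width, x + radius + 1))]
            smoothed[y][x] = sum(window) / len(window)
    return smoothed


# Two adjacent 2x2 sub-images with very different k values.
raw = build_scale_map([[10, 100]], block_h=2, block_w=2)
smooth = box_filter(raw, radius=1)
print(raw[0])     # sharp jump between the sub-images
print(smooth[0])  # transition is spread across the block border
```

Smoothing removes the step in k at sub-image borders, so the segmentation does not change character abruptly along block edges.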
- Embodiments of the method of paragraph [0013] may further include providing an additional image, where the additional image is disposed subsequent to the image in a video; dividing the additional image into an additional plurality of sub-images, where the additional plurality of sub-images corresponds to the plurality of sub-images in at least one of size and location; providing the plurality of values of the segment parameter as a plurality of initial estimates for the segment parameter corresponding to the additional plurality of sub-images; determining a plurality of optimized values of the segment parameter, wherein each of the plurality of optimized values corresponds to one of the additional plurality of sub-images; and segmenting the additional image based on the plurality of optimized values of the segment parameter.
- determining the linear function may include providing a plurality of training images; generating a segmentation map for each of the plurality of training images at a plurality of values of the segment parameter; determining a value of a symmetric uncertainty for each segmentation map; and classifying each segmentation map as being over-segmented, well segmented, or under-segmented, based on a visual perception by at least one observer.
- another illustrative method of segmenting an image may include providing an image; and dividing the image into a plurality of sub-images, each sub-image including a plurality of pixels. For each sub-image, embodiments of the illustrative method may include determining a first value of a parameter, where the parameter relates to a threshold function for establishing a boundary condition between a first segment and a second segment; determining a first value of a symmetric uncertainty of the sub-image based on the first value of the first parameter; comparing the first value of the parameter and the first value of the symmetric uncertainty to a predetermined function; and determining a second value of the parameter based on a result of the comparing.
- the illustrative method of paragraph [0016] may further include assigning the determined value of the parameter to each pixel in the sub-image; applying a filter to the assigned values to obtain a filtered value of the parameter for each pixel in the sub-image; and segmenting the image based on the filtered values of the parameter.
- the filter includes a low-pass filter.
- Embodiments of the illustrative method of paragraph [0017] may further include providing an additional image; dividing the additional image into an additional plurality of sub-images; segmenting the additional image based in part on the second value of the parameter determined for each of the first plurality of sub-images of the image.
- the predetermined function may be a linear function representing a linear relationship between the log of the first parameter and the symmetric uncertainty, where a value above the linear function indicates over-segmentation and a value below the linear function indicates under-segmentation.
- the method of paragraph [0018] may further include providing a plurality of training images; generating a map for each of the plurality of training images at a plurality of values of the segment parameter; determining a value of a symmetric uncertainty for each segmentation map; and classifying each segmentation map as being over-segmented, well-segmented, or under-segmented, based on a visual perception by at least one observer.
- an illustrative system includes an image segmentation device having a processor and a memory.
- the memory includes computer-readable media having computer-executable instructions embodied thereon that, when executed by the processor, cause the processor to instantiate one or more components.
- the one or more components include a segment module configured to determine a functional relationship between a first parameter and a second parameter based on an input electronically received about a plurality of training images.
- the segment module may be further configured to, for a sub-image of an image to be segmented, determine an initial value of the first parameter for the sub-image; and determine an initial value of the second parameter.
- the one or more components may further include a comparison module configured to perform a comparison between the initial value of the first parameter and the initial value of the second parameter to the functional relationship, where the segment module is further configured to determine an updated value of the first parameter based on the comparison.
- the segment module may be further configured to segment the image, based in part on the updated value of the first parameter, to create segmented image data.
- the system of paragraph [0021] may further include an encoder configured to encode the segmented image data.
- the system of paragraph [0022] may further include a communication module configured to facilitate communication of at least one of the image to be segmented and the segmented image data.
- the segment module may be further configured to divide the additional image into a plurality of sub-images; and segment the additional image, based in part on the updated value of the first parameter.
- the second parameter may include a symmetric uncertainty of the sub-image.
- FIG. 1A is an exemplary 640×480 color image
- FIG. 2 illustrates an exemplary image segmentation system, in accordance with embodiments of the invention
- FIG. 3 illustrates an exemplary segmentation device of the image segmentation system shown in FIG. 2 , in accordance with embodiments of the invention
- FIG. 4A is a series of segmentation maps for a block of 160×120 pixels labeled A in the color image of FIG. 1A, wherein k ranges from 1 to 10,000;
- FIG. 4B illustrates the weighted uncertainty, U w of the segmentation maps of FIG. 4A as a function of k together with an evaluation performed by a human observer;
- FIG. 5A illustrates the classification of training images performed by a human observer for image resolutions of 320×240 and an optimal segmentation line showing a desired segment number versus k;
- FIG. 5B illustrates the classification of training images performed by a human observer for image resolutions of 640×480 and an optimal segmentation line showing a desired segment number versus k;
- FIG. 6 illustrates an exemplary method for determining a value of k, in accordance with embodiments of the invention
- FIG. 7A illustrates the iterative method of estimating k of FIG. 6 for a 160×120 sub-image taken from a 640×480 image in the (log(k), U w ) plane, in accordance with embodiments of the invention
- FIG. 7B illustrates the corresponding segmentation based on the estimates in FIG. 7A , in accordance with embodiments of the invention.
- FIG. 8 illustrates another exemplary method for determining a value of k, in accordance with embodiments of the invention.
- FIG. 9A is an image at 640×480 resolution
- FIG. 9B is a scale map of k(x,y) of the image of FIG. 9A obtained using the method of FIG. 8 , in accordance with embodiments of the invention.
- FIG. 9C illustrates the corresponding segmentation of FIG. 9A using the method of FIG. 8 , in accordance with embodiments of the invention.
- FIG. 9D illustrates the corresponding segmentation of FIG. 9A using the method of Felzenszwalb and Huttenlocher, in which the scale parameter was chosen to obtain the same number of total segments as in the segmentation depicted in FIG. 9C ;
- FIG. 10A is an image at 640×480 resolution
- FIG. 10B is a scale map of k(x,y) of the image of FIG. 10A obtained using the method of FIG. 8 , in accordance with embodiments of the invention.
- FIG. 10C illustrates the corresponding segmentation of FIG. 10A using the method of FIG. 8 , in accordance with embodiments of the invention
- FIG. 10D illustrates the corresponding segmentation of FIG. 10A using the method of Felzenszwalb and Huttenlocher, in which the same number of total segments were obtained as in the segmentation depicted in FIG. 10C ;
- FIG. 11 illustrates a method of segmenting a second image based on the segmentation of a first image, in accordance with embodiments of the invention.
- Image segmentation system 12 includes a segmentation device 14 .
- Segmentation device 14 is illustratively coupled to image source 16 by communication link 18 A.
- segmentation device 14 illustratively receives an image file from image source 16 over communication link 18 A.
- Exemplary image files include, but are not limited to, digital photographs, digital image files from medical imaging, machine vision image files, video image files, and any other suitable images having a plurality of pixels.
- Segmentation device 14 is illustratively coupled to receiving device 20 by communication link 18 B. In one exemplary embodiment, segmentation device 14 communicates an image file over communication link 18 B.
- communication links 18 A, 18 B are independently a wired connection, or a wireless connection, or a combination of wired and wireless networks. In some embodiments, one or both of communication links 18 A, 18 B are a network.
- Illustrative networks include any number of different types of communication networks such as, a short messaging service (SMS), a local area network (LAN), a wireless LAN (WLAN), a wide area network (WAN), the Internet, a P2P network, or other suitable networks.
- the network may include a combination of multiple networks.
- the receiving device 20 may include any combination of components described herein with reference to segmentation device 14 , components not shown or described, and/or combinations of these.
- the segmentation device 14 may include, or be similar to, the encoding computing systems described in U.S. application Ser. No. 13/428,707, filed Mar. 23, 2012, entitled “VIDEO ENCODING SYSTEM AND METHOD;” and/or U.S. application Ser. No. 13/868,749, filed Apr. 23, 2013, entitled “MACROBLOCK PARTITIONING AND MOTION ESTIMATION USING OBJECT ANALYSIS FOR VIDEO COMPRESSION;” the disclosure of each of which is expressly incorporated by reference herein.
- segmentation device 14 is schematically illustrated in FIG. 3 . Although referred to as a single device, in some embodiments, segmentation device 14 may be implemented in multiple instances, distributed across multiple computing devices, instantiated within multiple virtual machines, and/or the like. Segmentation device 14 includes a processor 22 . Processor 22 may include one or multiple processors. Processor 22 executes various program components stored in memory 24 , which may facilitate encoding the image data 26 of the received image file. As shown in FIG.
- segmentation device 14 further includes at least one input/output device 28 , such as, for example, a monitor or other suitable display 30 , a keyboard, a printer, a disk drive, a universal serial bus (USB) port, a speaker, pointer device, a trackball, a button, a switch, a touch screen, and/or other suitable I/O devices.
- a computing device may include any type of computing device suitable for implementing embodiments of the invention. Examples of computing devices include specialized computing devices or general-purpose computing devices such as “workstations,” “servers,” “laptops,” “desktops,” “tablet computers,” “hand-held devices,” and the like, provided that the computing device has been configured as disclosed herein.
- the segmentation device 14 may be or include a specially-designed computing device (e.g., a dedicated video encoding device), and/or a general purpose computing device (e.g., a desktop computer, a laptop, a mobile device, and/or the like) configured to execute computer-executable instructions stored in a memory 24 for causing the processor 22 to implement aspects of embodiments of system components discussed herein and/or to perform aspects of embodiments of methods and procedures discussed herein.
- a computing device includes a bus that, directly and/or indirectly, couples the following devices: a processor, a memory, an input/output (I/O) port, an I/O component, and a power supply. Any number of additional components, different components, and/or combinations of components may also be included in the computing device.
- the bus represents what may be one or more busses (such as, for example, an address bus, data bus, or combination thereof).
- the computing device may include a number of processors, a number of memory components, a number of I/O ports, a number of I/O components, and/or a number of power supplies. Additionally any number of these components, or combinations thereof, may be distributed and/or duplicated across a number of computing devices.
- the memory 24 includes computer-readable media in the form of volatile and/or nonvolatile memory and may be removable, nonremovable, or a combination thereof.
- Media examples include Random Access Memory (RAM); Read Only Memory (ROM); Electronically Erasable Programmable Read Only Memory (EEPROM); flash memory; optical or holographic media; magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices; data transmissions; or any other medium that can be used to store information and can be accessed by a computing device such as, for example, quantum state memory, and the like.
- the memory 24 may be local to processor 22 , and/or the memory 24 may be remote from processor 22 and accessible over a network.
- the memory 24 stores computer-executable instructions for causing the processor 22 to implement aspects of embodiments of system components discussed herein and/or to perform aspects of embodiments of methods and procedures discussed herein.
- Computer-executable instructions may include, for example, computer code, machine-useable instructions, and the like such as, for example, program components capable of being executed by one or more processors associated with a computing device. Examples of such program components include a segment module 32 , a comparison module 36 , and a filter module 38 . Some or all of the functionality contemplated herein may also, or alternatively, be implemented in hardware and/or firmware.
- training information may include a plurality of images segmented at different values of k.
- the training images may be used to determine a value for k that corresponds to a well-segmented segmentation for the given image type (medical image, video image, landscape image, etc.). As explained in more detail herein, this value of k from the training images may be used to assist in determining appropriate segmentation of further images automatically by segmentation device 14 .
- training information 33 includes information on a variety of different image types.
- training information includes a segmentation quality model that was derived from a set of training images, their segmentations and the classification of these segmentations by a human observer. In embodiments, the training images and their segmentations are not retained.
- the segment module 32 is configured to segment an image into a plurality of segments, as described in more detail below.
- the segments may be stored in memory 24 as segmented image data 34 .
- Segment data includes a plurality of pixels of the image.
- Segment image data 34 may also comprise one or more parameters associated with image data 34 , such as the scale maps illustrated in FIGS. 10B and 11B .
- the segments may include, for example, objects, groups, slices, tiles, and/or the like.
- the segment module 32 may employ any number of various automatic image segmentation methods known in the field.
- the segment module 32 may use image color of the pixels and corresponding gradients of the pixels to subdivide an image into segments that have similar color and texture.
- Two examples of image segmentation techniques include the watershed algorithm and optimum cut partitioning of a pixel connectivity graph.
- the segment module 32 may use Canny edge detection to detect edges on a video frame for optimum cut partitioning, and create segments using the optimum cut partitioning of the resulting pixel connectivity graph.
- the comparison module 36 may compare a calculated value or pair of values as described in more detail below.
- the comparison module 36 may compare the parameter k in Equation (2) above, and/or U w in Equation (4) below, with a reference value or pair of values, such as is shown in FIG. 5A .
- the filter module 38 may apply a filter to image data 26 or segmented image data 34 as described in more detail below.
- the filter module 38 may apply a low pass filter to a scale map of an image to avoid sharp transitions between adjacent sub-images, such as is shown in FIGS. 10B and 11B .
- segmentation device 14 includes an encoder 40 configured for encoding image data 26 to produce encoded image data 42 .
- image data 26 is both segmented and encoded.
- segmentation device 14 further includes a communication module 44 .
- the communication module 44 may facilitate communication of image data 26 between image source 16 and segmentation device 14 .
- the communication module 44 may facilitate communication of segmented image data 34 and/or encoded image data 42 between segmentation device 14 and receiving device 20 .
- on visual inspection, over-segmentation generally occurs, meaning that perceptually important regions are erroneously divided into sets of segments.
- k ranging from 350 to 10,000 (last nine images of twenty shown)
- too few segments are present in the segmentation map, resulting from under-segmentation.
- for values of k from 75 to 200 (remaining three images of twenty shown), the segmentation appears generally good.
- a similarity function such as, for example, a quantitative index, can be defined to represent the amount of information contained in the original image, img, that is captured by the segmentation process.
- a color image may be defined by substituting the RGB value in each pixel with the average RGB value of the pixels in the corresponding segment, seg.
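The construction of the image seg from a segmentation can be sketched as follows. The flat list-of-pixels layout, the integer segment labels, and the rounding of the averages are assumptions made for brevity; the function name is hypothetical.

```python
def mean_color_image(pixels, labels):
    """Replace each pixel's RGB value with the mean RGB of its segment.

    pixels: list of (r, g, b) tuples; labels: segment id for each pixel.
    Returns a new list of (r, g, b) tuples of the same length.
    """
    sums, counts = {}, {}
    for rgb, label in zip(pixels, labels):
        acc = sums.setdefault(label, [0, 0, 0])
        for channel in range(3):
            acc[channel] += rgb[channel]
        counts[label] = counts.get(label, 0) + 1
    means = {label: tuple(round(total / counts[label]) for total in acc)
             for label, acc in sums.items()}
    return [means[label] for label in labels]


pixels = [(0, 0, 0), (10, 20, 30), (200, 200, 200)]
labels = [0, 0, 1]  # first two pixels share a segment
print(mean_color_image(pixels, labels))
```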
- the symmetric uncertainty U between img and seg can be computed by Equation (3), as given by Witten & Frank in Witten, Ian H. & Frank, Eibe, Data Mining: Practical Machine Learning Tools and Techniques. Morgan Kaufmann, Amsterdam, ISBN 978-0-12-374856-0, the disclosures of which are hereby incorporated by reference in their entirety.
- S i j denotes the Shannon entropy, in bits, of the i-th channel of the image j
- I(i,j) is the mutual information, in bits, of the images i and j.
- the symmetric uncertainty U expresses the percentage of bits that are shared between img and seg for each color channel.
- the value of U tends to zero when the segmentation map is uncorrelated with the original color image channel, whereas it is close to one when the segmentation map represents every fine detail in the corresponding channel of img.
- the index U W lies between 0 and 1 and is correlated with the segmentation quality.
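A per-channel symmetric uncertainty and a weighted index can be computed as sketched below. The sketch assumes the standard definition from Witten & Frank, U = 2·I(img, seg) / (S_img + S_seg), for Equation (3), and assumes that U W in Equation (4) averages the per-channel values weighted by the entropy of each channel of img; that weighting is a guess for illustration, and all function names are hypothetical.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy, in bits, of a sequence of discrete values."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())


def mutual_information(a, b):
    """I(a, b) = S(a) + S(b) - S(a, b), in bits."""
    return entropy(a) + entropy(b) - entropy(list(zip(a, b)))


def symmetric_uncertainty(a, b):
    """U = 2 * I(a, b) / (S(a) + S(b)); defined as 0 when both are constant."""
    denom = entropy(a) + entropy(b)
    return 0.0 if denom == 0 else 2.0 * mutual_information(a, b) / denom


def weighted_uncertainty(img_channels, seg_channels):
    """Entropy-weighted mean of per-channel U values (assumed weighting)."""
    weights = [entropy(channel) for channel in img_channels]
    total = sum(weights)
    if total == 0:
        return 0.0
    return sum(w * symmetric_uncertainty(i, s)
               for w, i, s in zip(weights, img_channels, seg_channels)) / total


channel = [0, 0, 1, 1]
unrelated = [0, 1, 0, 1]
print(symmetric_uncertainty(channel, channel))    # identical channels
print(symmetric_uncertainty(channel, unrelated))  # statistically independent
```

Identical channels share all of their bits and independent channels share none, which matches the limiting behaviour of U described above.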
- the weighted uncertainty index U w is plotted as a function of log(k) for each of the 160×120 pixel blocks illustrated in FIG. 1A.
- U W will decrease as k increases, passing from over-segmentation to under-segmentation.
- a representative set of training images at representative resolutions may be selected. For example, the curve depicted in FIG. 4B shows how the number of segments varies with k for the segmentation of portion A of the image in FIG. 1A at a given resolution.
- a quality model can be derived, as was done with line 80 in FIGS. 5A and 5B, by determining a straight-line fit through the well-segmented points on the graphs.
- a single quality model can be used for multiple resolutions, or a quality model can be generated for each resolution.
- a set of twelve images, including flowers, portraits, landscapes, and sports-environment images, at 320×240 and 640×480 resolutions was next considered as the training set for the segmentation quality in one embodiment.
- Each segmented block was displayed with display 30 and classified by a human observer through I/O devices 28 as over-segmented, well segmented or under-segmented.
- a weighted uncertainty index, U w was determined for each segmented block according to Equation (4).
- The results are presented in FIGS. 5A and 5B. As illustrated in FIGS. 5A and 5B, a single value or range of k does not correspond to well-segmented blocks at a given resolution. However, an area in the (log(k), U w ) plane can be defined for this purpose.
- Output of the segmentation algorithm was classified as under-segmented, well-segmented, or over-segmented by a human supervisor for each training image and each input value of k.
- the straight-line quality model was stored.
- all of the training results may be stored and the quality model may be derived as needed.
- some form of classifier may be stored that allows the classification of the (k, numsegments(k)) ordered pair as over-segmented, under-segmented, or well-segmented.
- the average line between the line dividing the under-segmented and well-segmented and the line dividing the well-segmented and over-segmented was assumed to be the optimal line 80 for segmentation in the (log(k), U W ) plane.
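Two ways of obtaining the optimal line from classified training points can be sketched as below: a least-squares straight-line fit through the well-segmented points, and the average of two dividing lines. Ordinary least squares is an assumption (the text says only "straight-line fit"), the function names are hypothetical, and the sample numbers are invented purely to exercise the code.

```python
def fit_line(points):
    """Least-squares line through (log(k), U_W) points; returns (m, b)."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    m = sxy / sxx
    return m, mean_y - m * mean_x


def average_line(lower, upper):
    """Average of two lines, each given as (m, b)."""
    (m1, b1), (m2, b2) = lower, upper
    return (m1 + m2) / 2.0, (b1 + b2) / 2.0


# Invented well-segmented points in the (log(k), U_W) plane.
well_segmented = [(0.0, 1.0), (1.0, 0.5), (2.0, 0.0)]
print(fit_line(well_segmented))

# Invented dividing lines: under/well boundary and well/over boundary.
print(average_line((-0.5, 0.8), (-0.3, 1.0)))
```

Only the resulting pair (m, b) needs to be stored if the training images and their segmentations are not retained.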
- an exemplary method 102 of determining a k value is provided for an image.
- the optimal line for segmentation, m·log(k)+b, derived above constitutes a set of points in the (log(k), U W ) plane that may be reasonably perceived as good segmentation by a human observer. Consequently, given a sub-image of size 160×120 pixels, an optimal k value may be defined as a k value that generates a segmentation whose weighted symmetric uncertainty U W is close to m·log(k)+b.
- an optimal k value may include a k value for which a difference between the symmetric uncertainty, U W , of the generated segmentation and a portion of a linear relationship such as, for example, m·log(k)+b, is minimized.
- An optimal k value can be computed iteratively through a bisection method 102 , as illustrated in FIGS. 7A and 7B .
- the results of an exemplary implementation of the first five iterations of embodiments of method 102 ( FIG. 6 ) are provided in FIGS. 7A and 7B .
- an image is provided, illustratively the image shown in FIG. 7B multiple times. The image is divided into a plurality of sub-images. The remainder of FIG. 6 is carried out for each sub-image of the image.
- the value of i is increased for the first iteration.
- the current iteration i is compared to the maximum number of iterations.
- the maximum number of iterations is a predetermined integer, such as 5 or any other integer, which may be selected, for example, to optimize a trade-off between computational burden and image segmentation quality.
- the maximum number of iterations is based on a difference between the k and/or U w value determined in successive iterations. If the maximum number of iterations has been reached, the k value determined in block 110 is chosen as the final value of k for segmentation, as shown in block 114 .
- the image is segmented and the corresponding U W is computed for the k value determined in block 110 .
- the first iteration of k was 100, and the U w calculated in the first iteration was 0.28.
- An example of a resulting segmentation of a first iteration of k is shown in FIG. 7B (second row, left image).
- the determined k i and U w values are compared to the optimal line in the (log k i , U w ) plane.
- (log k i , U w ) is located above the optimal line.
- the value of k right is replaced with k i in block 122 , and the method 102 returns to block 108 .
- the value of k left is replaced with k i in block 120 , and the method 102 returns to block 108 .
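The iteration described above can be sketched as a bisection on k. This is an illustrative sketch under stated assumptions, not the patented implementation: the midpoint is taken in log space (the geometric mean of k left and k right), which is inferred from the example values 100 and 133.3521 over a range of 1 to 10,000; the logarithm is assumed to be base 10; and `uncertainty` is a hypothetical callable that segments the sub-image with a given k and returns its U w.

```python
import math
from typing import Callable


def bisect_k(
    uncertainty: Callable[[float], float],  # hypothetical: k -> U_w of the segmentation
    m: float,
    b: float,
    k_left: float = 1.0,        # assumed lower bound of the search range
    k_right: float = 10_000.0,  # assumed upper bound of the search range
    max_iter: int = 5,
) -> float:
    """Bisection search for k such that (log k, U_w) approaches the line m*log(k)+b."""
    k = math.sqrt(k_left * k_right)
    for _ in range(max_iter):
        k = math.sqrt(k_left * k_right)   # midpoint of the bracket in log space
        u_w = uncertainty(k)
        if u_w > m * math.log10(k) + b:
            k_left = k    # above the line: over-segmented, so try a larger k
        else:
            k_right = k   # on or below the line: under-segmented, so try a smaller k
    return k
```

With a fixed iteration budget the cost per sub-image is bounded by `max_iter` segmentations; a stopping rule based on the change in k or U w between iterations, as mentioned above, would be an alternative.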
- Exemplary results of embodiments of method 102 are presented in FIG. 7A for the image of FIG. 7B.
- the image of FIG. 7B appears well-segmented, and the corresponding point in the (log(k), U W ) space lies close to the optimal segmentation line, as shown in FIG. 7A .
- the k of 133.3521 and U w of 0.27 lie very close to the optimal segmentation line (FIG. 7A), and the image appears well-segmented (FIG. 7B, bottom image).
- the parameters of the segmentation quality model change with the image resolution.
- the optimal segmentation line shown in FIG. 7A for a 320×240 resolution is lower than the optimal segmentation line shown in FIG. 7B for a 640×480 resolution.
- the application of the segmentation quality model to other image resolutions may therefore require re-classifying segmented sub-images of 160×120 pixels for the given resolution.
- interpolation or extrapolation from known image or sub-image resolutions is used.
- Method 102 ( FIG. 6 ) was used to estimate the optimal k value for a sub-image of 160×120 pixels as illustrated in FIGS. 7A and 7B .
- a set of adjacent sub-images may be considered.
- putting together the independent segmentations of each sub-image may not produce a satisfying segmentation map, since segments across the borders of the sub-images may be divided into multiple segments.
- τ(C) in Equation (2) becomes τ(C,x,y)=k(x,y)/|C|.
- each image is then divided into a plurality of sub-images.
- each sub-image may be 160×120 pixels.
- a value of k for each sub-image was determined.
- the value of k is determined for the sub-image using method 102 as described above with respect to FIG. 6 .
- the value of k determined in block 212 is assigned to all pixels in the sub-image.
- a scale map of k(x,y) for the image is smoothed through a low pass filter to avoid sharp transition of k(x,y) along the image.
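The construction of the scale map can be sketched as follows. This is a minimal pure-Python illustration: the per-sub-image k values are expanded to a per-pixel map, which is then smoothed. The box (mean) filter is an assumed stand-in, since the excerpt specifies only "a low pass filter", and the function names and the `radius` parameter are hypothetical.

```python
from typing import List

Grid = List[List[float]]


def build_scale_map(k_blocks: Grid, block_h: int, block_w: int) -> Grid:
    """Assign each sub-image's k value to every pixel of that sub-image."""
    rows: Grid = []
    for block_row in k_blocks:
        pixel_row = [k for k in block_row for _ in range(block_w)]
        rows.extend(list(pixel_row) for _ in range(block_h))
    return rows


def box_smooth(scale_map: Grid, radius: int) -> Grid:
    """Low-pass the map with a (2*radius+1)^2 mean filter, clipped at the borders."""
    h, w = len(scale_map), len(scale_map[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            y0, y1 = max(0, y - radius), min(h, y + radius + 1)
            x0, x1 = max(0, x - radius), min(w, x + radius + 1)
            window = [scale_map[j][i] for j in range(y0, y1) for i in range(x0, x1)]
            out[y][x] = sum(window) / len(window)
    return out
```

After smoothing, k(x,y) changes gradually across sub-image borders instead of jumping from one sub-image's value to the next. For full-resolution images a separable or library filter would be far faster than this direct loop.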
- FIGS. 9A and 10A are two exemplary 640×480 pixel images taken from the dataset used for estimating the segmentation quality model in FIG. 5B .
- the k(x,y) scale map for each image following the smoothing through the low pass filter in block 216 is presented in FIGS. 9B and 10B , and the corresponding segmentation is shown in FIGS. 9C and 10C .
- the value of k was set experimentally to guarantee an equivalent number of segments as in FIGS. 9C and 10C .
- k was set to 115
- k was set to 187.
- FIGS. 9B and 9C illustrate that the present method favors large segments (high k value) in the area occupied by persons in the image, and finer segmentation (low k value) in the upper left area of the image, where a large number of small leaves are present, when compared to the method of Felzenszwalb and Huttenlocher in FIG. 9D .
- FIGS. 10B and 10C illustrate that embodiments of the present method may favor larger segments in the homogeneous area of the sky and skyscrapers, for example, preventing over-segmentation in the sky area, when compared to the method of Felzenszwalb and Huttenlocher as shown in FIG. 10D .
- overlapping rectangular regions may be used.
- the segmentation of a second image can be estimated based on the segmentation of a first image.
- Exemplary embodiments include video processing or video encoding, in which adjacent frames of images may be highly similar or highly correlated.
- a method 302 for segmenting a second image is provided in FIG. 11 .
- the first image is provided.
- the first image is segmented by dividing the first image into a plurality of sub-images in block 306 , determining a value of k for each sub-image in block 308 , and segmenting the image based on the determined k value in block 310 .
- segmenting the first image in blocks 306 - 310 is performed using method 102 ( FIG. 6 ) or method 202 ( FIG.
- a second image is provided.
- the first and second images are subsequent video images.
- the second image is divided into a plurality of sub-images in block 314 .
- one or more of the plurality of sub-images of the second image in block 314 correspond in size and/or location to one or more of the plurality of sub-images of the first image in block 306 .
- the k value for each sub-image of the first image determined in block 308 is provided as an initial estimate for the k value of each corresponding sub-image of the second image.
- the k values for the second image are optimized, using the estimated k values from the first image as an initial iteration, followed by segmenting the second image in block 340 .
- the second image is segmented based on the estimated k value in block 340 without first being optimized in block 318 .
- segmenting the second image in blocks 316 - 340 is performed using method 102 ( FIG. 6 ) or method 202 ( FIG. 8 ).
- the computational cost of segmenting the video images can be significantly reduced.
- the proposed method performs a search for the optimal k value for each sub-image considering the entire range for k.
- the range for k can be significantly reduced by considering the estimates obtained at previous frames for the same sub-image and/or corresponding sub-image.
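One way to picture the saving is to narrow the search bracket around the k found for the corresponding sub-image in the previous frame. The sketch below is an assumption-laden illustration: the multiplicative `factor`, the full range of 1 to 10,000, and the log-space tolerance are hypothetical values chosen for the example, not figures from the source.

```python
import math
from typing import Tuple


def narrowed_range(
    k_prev: float,
    factor: float = 2.0,       # hypothetical half-width of the bracket, as a ratio
    k_min: float = 1.0,        # assumed full search range
    k_max: float = 10_000.0,
) -> Tuple[float, float]:
    """Bracket [k_prev/factor, k_prev*factor], clipped to the full range."""
    if factor <= 1.0:
        raise ValueError("factor must be greater than 1")
    return max(k_min, k_prev / factor), min(k_max, k_prev * factor)


def bisection_steps(k_lo: float, k_hi: float, log_tol: float) -> int:
    """Halvings needed to shrink the log10(k) bracket to a width of log_tol."""
    width = math.log10(k_hi / k_lo)
    return max(0, math.ceil(math.log2(width / log_tol)))
```

Under these assumptions, reaching a log-space bracket of 0.125 takes five halvings over the full range but only three over the bracket around a previous estimate of k=100.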
- k values may be updated only at certain frame intervals and/or scene changes.
- the above methods of automatically optimizing a segmentation algorithm are performed based on edge thresholding and working in the YUV color space, achieving similar results.
- a similar segmentation quality model is used, but the optimal segmentation line as shown in FIGS. 5A, 5B, and 7A is replaced with a plane or hyper-plane.
Landscapes
- Engineering & Computer Science (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- Image Analysis (AREA)
Abstract
A method and apparatus for unsupervised segmentation of an image is provided. In some exemplary embodiments, the method adjusts a segmentation parameter of a traditional graph-based segmentation algorithm along the image to generate a segmentation map that is perceptually reasonable for a human observer. In some embodiments, the method reduces over-segmentation and under-segmentation of the image.
  Description
-  This application is a continuation of U.S. application Ser. No. 14/696,255, filed on Apr. 24, 2015, and issued as U.S. Pat. No. 9,501,837 on Nov. 22, 2016, which claims priority to Provisional Applications No. 62/058,647, filed on Oct. 1, 2014; and 62/132,167, filed on Mar. 12, 2015. The entirety of each of the above-identified applications is hereby incorporated by reference for all purposes.
-  The present invention relates to the processing of image data and in particular to segmentation of images, as well as 3-dimensional or higher-dimensional data.
-  Segmentation is a key processing step in many applications, ranging for instance from medical imaging to machine vision and video compression technology. Although different approaches to segmentation have been proposed, those based on graphs have attracted many researchers because of their computational efficiency.
-  Many segmentation algorithms are known to practitioners in the field today. Some examples include the watershed algorithm, and SLIC, a superpixel algorithm based on nearest neighbor aggregation. Typically, these algorithms have a common disadvantage in that they require a scale parameter to be set by a human supervisor. Thus, the practical applications have, in general, involved supervised segmentation. This may limit the range of applications, since in many instances segmentation is to be generated dynamically and there may be no time or opportunity for human supervision.
-  In embodiments, a graph-based segmentation algorithm based on the work of P. F. Felzenszwalb and D. P. Huttenlocher is used. They discussed basic principles of segmentation in general and applied these principles to develop an efficient segmentation algorithm based on graph cutting in their paper “Efficient Graph-Based Image Segmentation,” Int. Jour. Comp. Vis., 59(2), September 2004, herein incorporated by reference in its entirety. Felzenszwalb and Huttenlocher stated that any segmentation algorithm should “capture perceptually important groupings or regions, which often reflect global aspects of the image.”
-  Based on the principle of a graph-based approach to segmentation, Felzenszwalb and Huttenlocher first build an undirected graph G=(V, E), where vi ∈ V is the set of pixels of the image that has to be segmented and (vi, vj) ∈ E is the set of edges that connects pairs of neighboring pixels; a non-negative weight w(vi, vj) is associated with each edge, with a magnitude proportional to the difference between vi and vj. Image segmentation is identified by finding a partition of V such that each component is connected, the internal difference between the elements of each component is minimal, and the difference between elements of different components is maximal. This is achieved by the definition of a predicate in Equation (1) that determines if a boundary exists between two adjacent components C1 and C2, that is:
-  D(C1, C2) = true if Dif(C1, C2) > MInt(C1, C2), false otherwise (1)
-  where Dif(C1, C2) is the difference between the two components, defined as the minimum weight of the set of edges that connects C1 and C2; MInt(C1, C2) is the minimum internal difference, defined in Equation (2) as:
-  
 MInt(C1, C2) = min[Int(C1) + τ(C1), Int(C2) + τ(C2)] (2)
-  where Int(C) is the largest weight in the minimum spanning tree of the component C and therefore describes the internal difference between the elements of C; and where τ(C)=k/|C| is a threshold function used to establish whether there is evidence for a boundary between two components. The threshold function forces two small segments to fuse unless there is strong evidence of a difference between them.
-  In practice, the segment parameter k sets the scale of observation. Although Felzenszwalb and Huttenlocher demonstrate that the algorithm generates a segmentation map that is neither too fine nor too coarse, the definition of fineness and coarseness ultimately depends on k, which has to be carefully set by the user to obtain a perceptually reasonable segmentation.
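As a small illustration of how the threshold function enters the boundary decision, the following Python sketch evaluates the predicate for two components given their summary statistics. The function and parameter names are chosen for this example, and the numbers in the usage note are made up.

```python
def tau(k: float, size: int) -> float:
    """Threshold function tau(C) = k / |C|."""
    return k / size


def min_internal_difference(int_c1: float, size_c1: int,
                            int_c2: float, size_c2: int, k: float) -> float:
    """MInt(C1, C2) = min(Int(C1) + tau(C1), Int(C2) + tau(C2))."""
    return min(int_c1 + tau(k, size_c1), int_c2 + tau(k, size_c2))


def boundary_exists(dif: float, int_c1: float, size_c1: int,
                    int_c2: float, size_c2: int, k: float) -> bool:
    """True when Dif(C1, C2) exceeds MInt(C1, C2), i.e. the components stay separate."""
    return dif > min_internal_difference(int_c1, size_c1, int_c2, size_c2, k)
```

For two small components of 5 and 10 pixels with Int values 2 and 3 and Dif = 10, a large k such as 100 gives MInt = 13, so no boundary is found and the components merge, whereas k = 3 gives MInt = 2.6 and the boundary is kept. This is how k sets the scale of observation.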
-  The definition of the proper value of k for the graph-based algorithm, as well as the choice of the threshold value used for edge extraction in other edge-based segmentation algorithms such as, for example, the algorithms described by Iannizzotto and Vita in “Fast and Accurate Edge-Based Segmentation with No Contour Smoothing in 2-D Real Images,” Giancarlo Iannizzotto and Lorenzo Vita, IEEE Transactions on Image Processing, Vol. 9, No. 7, pp. 1232-1237 (July 2000), the entirety of which is hereby incorporated by reference herein for all purposes, remains up to now an open issue when “perceptually important groupings or regions” have to be extracted from the image. In the algorithm described by Iannizzotto and Vita, edges are detected by looking at gray-scale gradient maxima with gradient magnitudes above a threshold value. For this algorithm, k is this threshold value and needs to be set appropriately for proper segmentation. In embodiments, segmentation based on edge-extraction may be used. In those embodiments, edge thresholds are established based on a strength parameter k. In the field of segmentation algorithms, in general, a parameter is used to set the scale of observation. In cases in which segmentation is performed in a supervised mode, a human user selects the k value for a particular image. It is however clear that the segmentation quality provided by a certain algorithm is generally related to the quality perceived by a human observer, especially for applications (like video compression) where a human being does constitute the final beneficiary of the output of the algorithm.
-  For example, a 640×480 color image is provided in FIG. 1A. A graph cut algorithm was used to generate the segmentation results associated with the image of FIG. 1A as discussed herein. Segmentation maps of the image of FIG. 1A with σ=0.5 and a min size of 5 are provided in FIGS. 1B-1D for various values of k. In FIG. 1B, k is 3; in FIG. 1C, k is 100; and in FIG. 1D, k is 10,000. As illustrated in FIG. 1B, values of k that are too small may lead to over-segmentation. As illustrated in FIG. 1D, large values of k may introduce under-segmentation.
-  Embodiments of the presently disclosed methods can be used to perform segmentation with no supervision, using an algorithm that automatically adapts a segment parameter along the image to generate a segmentation map that is perceptually reasonable for a human observer. Embodiments include training a segmentation quality model with a set of training images, classifying the images as over-segmented, well-segmented, or under-segmented for various values of segment parameter k and then defining conditions when segmentation quality is desirable.
-  Embodiments of the invention include an illustrative method of segmenting an image. Embodiments of the illustrative method include determining a first value of a segment parameter, where the segment parameter relates to a threshold function for establishing a boundary condition between a first segment and a second segment; determining a first value of a similarity function configured to indicate a similarity between the image and its segmentation based on the first value of the segment parameter; comparing the first value of the segment parameter and the first value of the similarity function to a predetermined function; determining a second value of the segment parameter based on a result of the comparing; and segmenting the image based on the second value of the segment parameter. In embodiments, a bisection algorithm may be used to find the value of k, such that the point (k, numsegments(k)), where “numsegments(k)” represents the number of segments produced when using the value, k, lies within a predetermined region or on a predetermined curve. The similarity function may include a symmetric uncertainty function. In embodiments, the predetermined function may be a linear function representing a linear relationship between the log of the segment parameter and the symmetric uncertainty, where a value above the linear function indicates over-segmentation and a value below the linear function indicates under-segmentation. Additional, alternative, and/or overlapping embodiments may include other similarity functions between the image and segmentation map of the image.
-  Embodiments of the method of paragraph [0011] further include determining an optimal value of the segment parameter, where the optimal value is a value of the segment parameter that generates a segmentation of the image for which a difference between a corresponding value of the symmetric uncertainty and a portion of the linear relationship is minimized.
-  According to embodiments of the method of paragraph [0011], segmenting the image based on the second value of the segment parameter may include dividing the image into a plurality of sub-images; generating a scale map for the image by determining a plurality of values of the segment parameter, wherein each of the plurality of values corresponds to one of the plurality of sub-images; and smoothing the scale map for the image using a filter. The filter may be a low-pass filter.
-  Embodiments of the method of paragraph [0013] may further include providing an additional image, where the additional image is disposed subsequent to the image in a video; dividing the additional image into an additional plurality of sub-images, where the additional plurality of sub-images corresponds to the plurality of sub-images in at least one of size and location; providing the plurality of values of the segment parameter as a plurality of initial estimates for the segment parameter corresponding to the additional plurality of sub-images; determining a plurality of optimized values of the segment parameter, wherein each of the plurality of optimized values corresponds to one of the additional plurality of sub-images; and segmenting the additional image based on the plurality of optimized values of the segment parameter.
-  According to embodiments of the method of paragraph [0011], determining the linear function may include providing a plurality of training images; generating a segmentation map for each of the plurality of training images at a plurality of values of the segment parameter; determining a value of a symmetric uncertainty for each segmentation map; and classifying each segmentation map as being over-segmented, well segmented, or under-segmented, based on a visual perception by at least one observer.
-  According to embodiments, another illustrative method of segmenting an image may include providing an image; and dividing the image into a plurality of sub-images, each sub-image including a plurality of pixels. For each sub-image, embodiments of the illustrative method may include determining a first value of a parameter, where the parameter relates to a threshold function for establishing a boundary condition between a first segment and a second segment; determining a first value of a symmetric uncertainty of the sub-image based on the first value of the first parameter; comparing the first value of the parameter and the first value of the symmetric uncertainty to a predetermined function; and determining a second value of the parameter based on a result of the comparing.
-  In embodiments, the illustrative method of paragraph [0016] may further include assigning the determined value of the parameter to each pixel in the sub-image; applying a filter to the assigned values to obtain a filtered value of the parameter for each pixel in the sub-image; and segmenting the image based on the filtered values of the parameter. In embodiments, the filter includes a low-pass filter.
-  Embodiments of the illustrative method of paragraph [0017] may further include providing an additional image; dividing the additional image into an additional plurality of sub-images; segmenting the additional image based in part on the second value of the parameter determined for each of the first plurality of sub-images of the image. In embodiments, the predetermined function may be a linear function representing a linear relationship between the log of the first parameter and the symmetric uncertainty, where a value above the linear function indicates over-segmentation and a value below the linear function indicates under-segmentation.
-  In embodiments, the method of paragraph [0018] may further include providing a plurality of training images; generating a map for each of the plurality of training images at a plurality of values of the segment parameter; determining a value of a symmetric uncertainty for each segmentation map; and classifying each segmentation map as being over-segmented, well-segmented, or under-segmented, based on a visual perception by at least one observer.
-  According to embodiments, an illustrative system includes an image segmentation device having a processor and a memory. The memory includes computer-readable media having computer-executable instructions embodied thereon that, when executed by the processor, cause the processor to instantiate one or more components. In embodiments, the one or more components include a segment module configured to determine a functional relationship between a first parameter and a second parameter based on an input electronically received about a plurality of training images. The segment module may be further configured to, for a sub-image of an image to be segmented, determine an initial value of the first parameter for the sub-image; and determine an initial value of the second parameter. The one or more components may further include a comparison module configured to perform a comparison between the initial value of the first parameter and the initial value of the second parameter to the functional relationship, where the segment module is further configured to determine an updated value of the first parameter based on the comparison.
-  In embodiments of the system of paragraph [0020], the segment module may be further configured to segment the image, based in part on the updated value of the first parameter, to create segmented image data.
-  In embodiments, the system of paragraph [0021] may further include an encoder configured to encode the segmented image data.
-  In embodiments, the system of paragraph [0022] may further include a communication module configured to facilitate communication of at least one of the image to be segmented and the segmented image data.
-  In embodiments of the system of paragraph [0022], the segment module may be further configured to divide the additional image into a plurality of sub-images; and segment the additional image, based in part on the updated value of the first parameter.
-  According to embodiments of the system of paragraph [0020], the second parameter may include a symmetric uncertainty of the sub-image.
-  The above mentioned and other features of the invention, and the manner of attaining them, will become more apparent and the invention itself will be better understood by reference to the following description of embodiments of the invention taken in conjunction with the accompanying drawings.
-  The patent or application file contains at least one drawing executed in color. Copies of this patent or patent application publication with color drawings will be provided by the Office upon request and payment of the necessary fee.
-  FIG. 1A is an exemplary 640×480 color image;
-  FIG. 1B is a segmentation map of the color image of FIG. 1A wherein k=3;
-  FIG. 1C is a segmentation map of the color image of FIG. 1A wherein k=100;
-  FIG. 1D is a segmentation map of the color image of FIG. 1A wherein k=10,000;
-  FIG. 2 illustrates an exemplary image segmentation system, in accordance with embodiments of the invention;
-  FIG. 3 illustrates an exemplary segmentation device of the image segmentation system shown in FIG. 2, in accordance with embodiments of the invention;
-  FIG. 4A is a series of segmentation maps for a block of 160×120 pixels labeled A in the color image of FIG. 1A, wherein k ranges from 1 to 10,000;
-  FIG. 4B illustrates the weighted uncertainty, Uw, of the segmentation maps of FIG. 4A as a function of k together with an evaluation performed by a human observer;
-  FIG. 5A illustrates the classification of training images performed by a human observer for image resolutions of 320×240 and an optimal segmentation line showing a desired segment number versus k;
-  FIG. 5B illustrates the classification of training images performed by a human observer for image resolutions of 640×480 and an optimal segmentation line showing a desired segment number versus k;
-  FIG. 6 illustrates an exemplary method for determining a value of k, in accordance with embodiments of the invention;
-  FIG. 7A illustrates the iterative method of estimating k of FIG. 6 for a 160×120 sub-image taken from a 640×480 image in the (log(k), Uw) plane, in accordance with embodiments of the invention;
-  FIG. 7B illustrates the corresponding segmentation based on the estimates in FIG. 7A, in accordance with embodiments of the invention;
-  FIG. 8 illustrates another exemplary method for determining a value of k, in accordance with embodiments of the invention;
-  FIG. 9A is an image at 640×480 resolution;
-  FIG. 9B is a scale map of k(x,y) of the image of FIG. 9A obtained using the method of FIG. 8, in accordance with embodiments of the invention;
-  FIG. 9C illustrates the corresponding segmentation of FIG. 9A using the method of FIG. 8, in accordance with embodiments of the invention;
-  FIG. 9D illustrates the corresponding segmentation of FIG. 9A using the method of Felzenszwalb and Huttenlocher, in which the scale parameter was chosen to obtain the same number of total segments as in the segmentation depicted in FIG. 9C;
-  FIG. 10A is an image at 640×480 resolution;
-  FIG. 10B is a scale map of k(x,y) of the image of FIG. 10A obtained using the method of FIG. 8, in accordance with embodiments of the invention;
-  FIG. 10C illustrates the corresponding segmentation of FIG. 10A using the method of FIG. 8, in accordance with embodiments of the invention;
-  FIG. 10D illustrates the corresponding segmentation of FIG. 10A using the method of Felzenszwalb and Huttenlocher, in which the same number of total segments were obtained as in the segmentation depicted in FIG. 10C; and
-  FIG. 11 illustrates a method of segmenting a second image based on the segmentation of a first image, in accordance with embodiments of the invention.
-  The embodiments disclosed below are not intended to be exhaustive or to limit the invention to the precise forms disclosed in the following detailed description. Rather, the embodiments are chosen and described so that others skilled in the art may utilize their teachings.
-  Although the term “block” may be used herein to connote different elements of illustrative methods employed, the term should not be interpreted as implying any requirement of, or particular order among or between, various steps disclosed herein unless and except when explicitly referring to the order of individual steps.
-  Referring to FIG. 2, an exemplary image segmentation system 12 is shown. Image segmentation system 12 includes a segmentation device 14. Segmentation device 14 is illustratively coupled to image source 16 by communication link 18A. In one exemplary embodiment, segmentation device 14 illustratively receives an image file from image source 16 over communication link 18A. Exemplary image files include, but are not limited to, digital photographs, digital image files from medical imaging, machine vision image files, video image files, and any other suitable images having a plurality of pixels. Segmentation device 14 is illustratively coupled to receiving device 20 by communication link 18B. In one exemplary embodiment, segmentation device 14 communicates an image file over communication link 18B. In some embodiments, communication links communication links 
-  Although not illustrated herein, the receiving device 20 may include any combination of components described herein with reference to segmentation device 14, components not shown or described, and/or combinations of these. In embodiments, the segmentation device 14 may include, or be similar to, the encoding computing systems described in U.S. application Ser. No. 13/428,707, filed Mar. 23, 2012, entitled “VIDEO ENCODING SYSTEM AND METHOD;” and/or U.S. application Ser. No. 13/868,749, filed Apr. 23, 2013, entitled “MACROBLOCK PARTITIONING AND MOTION ESTIMATION USING OBJECT ANALYSIS FOR VIDEO COMPRESSION;” the disclosure of each of which is expressly incorporated by reference herein.
-  An exemplary segmentation device 14 is schematically illustrated in FIG. 3. Although referred to as a single device, in some embodiments, segmentation device 14 may be implemented in multiple instances, distributed across multiple computing devices, instantiated within multiple virtual machines, and/or the like. Segmentation device 14 includes a processor 22. Processor 22 may include one or multiple processors. Processor 22 executes various program components stored in memory 24, which may facilitate encoding the image data 26 of the received image file. As shown in FIG. 3, segmentation device 14 further includes at least one input/output device 28, such as, for example, a monitor or other suitable display 30, a keyboard, a printer, a disk drive, a universal serial bus (USB) port, a speaker, pointer device, a trackball, a button, a switch, a touch screen, and/or other suitable I/O devices.
-  Various components of image segmentation system 12 and/or segmentation device 14 may be implemented on one or more computing devices. A computing device may include any type of computing device suitable for implementing embodiments of the invention. Examples of computing devices include specialized computing devices or general-purpose computing devices such as “workstations,” “servers,” “laptops,” “desktops,” “tablet computers,” “hand-held devices,” and the like, provided that the computing device has been configured as disclosed herein. For example, according to embodiments, the segmentation device 14 may be or include a specially-designed computing device (e.g., a dedicated video encoding device), and/or a general purpose computing device (e.g., a desktop computer, a laptop, a mobile device, and/or the like) configured to execute computer-executable instructions stored in a memory 24 for causing the processor 22 to implement aspects of embodiments of system components discussed herein and/or to perform aspects of embodiments of methods and procedures discussed herein.
-  In embodiments, a computing device includes a bus that, directly and/or indirectly, couples the following devices: a processor, a memory, an input/output (I/O) port, an I/O component, and a power supply. Any number of additional components, different components, and/or combinations of components may also be included in the computing device. The bus represents what may be one or more busses (such as, for example, an address bus, data bus, or combination thereof). Similarly, in embodiments, the computing device may include a number of processors, a number of memory components, a number of I/O ports, a number of I/O components, and/or a number of power supplies. Additionally any number of these components, or combinations thereof, may be distributed and/or duplicated across a number of computing devices.
-  In embodiments, thememory 24 includes computer-readable media in the form of volatile and/or nonvolatile memory and may be removable, nonremovable, or a combination thereof. Media examples include Random Access Memory (RAM); Read Only Memory (ROM); Electronically Erasable Programmable Read Only Memory (EEPROM); flash memory; optical or holographic media; magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices; data transmissions; or any other medium that can be used to store information and can be accessed by a computing device such as, for example, quantum state memory, and the like. Thememory 24 may be local toprocessor 22, and/or thememory 24 may be remote fromprocessor 22 and accessible over a network. In embodiments, thememory 24 stores computer-executable instructions for causing theprocessor 22 to implement aspects of embodiments of system components discussed herein and/or to perform aspects of embodiments of methods and procedures discussed herein. Computer-executable instructions may include, for example, computer code, machine-useable instructions, and the like such as, for example, program components capable of being executed by one or more processors associated with a computing device. Examples of such program components include asegment module 32, acomparison module 36, and afilter module 38. Some or all of the functionality contemplated herein may also, or alternatively, be implemented in hardware and/or firmware.
-  In embodiments, one or more of the program components utilize training information 33 to assist in determining the appropriate segmentation of an image. For example, training information may include a plurality of images segmented at different values of k. The training images may be used to determine a value for k that corresponds to a well-segmented segmentation for the given image type (medical image, video image, landscape image, etc.). As explained in more detail herein, this value of k from the training images may be used to assist in determining appropriate segmentation of further images automatically by segmentation device 14. In one embodiment, training information 33 includes information on a variety of different image types. In embodiments, training information includes a segmentation quality model that was derived from a set of training images, their segmentations and the classification of these segmentations by a human observer. In embodiments, the training images and their segmentations are not retained.
-  In embodiments, the segment module 32 is configured to segment an image into a plurality of segments, as described in more detail below. The segments may be stored in memory 24 as segmented image data 34. Segment data includes a plurality of pixels of the image. Segmented image data 34 may also comprise one or more parameters associated with image data 34, such as the scale maps illustrated in FIGS. 10B and 11B. The segments may include, for example, objects, groups, slices, tiles, and/or the like. The segment module 32 may employ any number of various automatic image segmentation methods known in the field. In embodiments, the segment module 32 may use image color of the pixels and corresponding gradients of the pixels to subdivide an image into segments that have similar color and texture. Two examples of image segmentation techniques include the watershed algorithm and optimum cut partitioning of a pixel connectivity graph. For example, the segment module 32 may use Canny edge detection to detect edges on a video frame for optimum cut partitioning, and create segments using the optimum cut partitioning of the resulting pixel connectivity graph.
-  In embodiments, the comparison module 36 may compare a calculated value or pair of values as described in more detail below. For example, the comparison module 36 may compare the parameter k in Equation (2) above, and/or Uw in Equation (4) below, with a reference value or pair of values, such as is shown in FIG. 5A.
-  In embodiments, the filter module 38 may apply a filter to image data 26 or segmented image data 34 as described in more detail below. For example, the filter module 38 may apply a low pass filter to a scale map of an image to avoid sharp transitions between adjacent sub-images, such as is shown in FIGS. 10B and 11B.
-  In the illustrative embodiment of FIG. 3, segmentation device 14 includes an encoder 40 configured for encoding image data 26 to produce encoded image data 42. In embodiments, image data 26 is both segmented and encoded. As illustrated in FIG. 3, segmentation device 14 further includes a communication module 44. In some embodiments, the communication module 44 may facilitate communication of image data 26 between image source 16 and segmentation device 14. In some embodiments, the communication module 44 may facilitate communication of segmented image data 34 and/or encoded image data 42 between segmentation device 14 and receiving device 20.
-  FIG. 4A illustrates portion A of FIG. 1A, showing a block of 160×120 pixels segmented with the graph-based approach of Felzenszwalb and Huttenlocher for σ=0.5, min size=5, and values of k ranging from 1 to 10,000. For relatively low values of k from 1 to 50 (first eight images of twenty shown), over-segmentation generally occurs on visual inspection, meaning that perceptually important regions are erroneously divided into sets of segments. For relatively high values of k ranging from 350 to 10,000 (last nine images of twenty shown), under-segmentation leaves too few segments in the segmentation map. For values of k from 75 to 200 (remaining three images of twenty shown), the segmentation appears generally good. These results are indicated in FIG. 4B.
-  A similarity function such as, for example, a quantitative index, can be defined to represent the amount of information contained in the original image, img, that is captured by the segmentation process. In embodiments, for example, a color image may be defined by substituting the RGB value in each pixel with the average RGB value of the pixels in the corresponding segment, seg. For each color channel, the symmetric uncertainty U between img and seg can be computed by Equation (3), as given by Witten & Frank in Witten, Ian H. & Frank, Eibe, Data Mining: Practical Machine Learning Tools and Techniques. Morgan Kaufmann, Amsterdam, ISBN 978-0-12-374856-0, the disclosures of which are hereby incorporated by reference in their entirety.
-  $$U^{i} = \frac{2\, I\left(img^{i}, seg^{i}\right)}{S^{i}_{img} + S^{i}_{seg}} \qquad (3)$$
-  where $S^{i}_{j}$ indicates the Shannon entropy, in bits, of the i-th channel for the image j, and where I(i,j) is the mutual information, in bits, of the images i and j.
-  The symmetric uncertainty U expresses the percentage of bits that are shared between img and seg for each color channel. The value of U tends to zero when the segmentation map is uncorrelated with the original color image channel, whereas it is close to one when the segmentation map represents every fine detail in the corresponding channel of img.
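-  The per-channel computation described above can be sketched as follows. This is a minimal illustration, not the implementation of the embodiments: it assumes a channel is given as a flat list of integer pixel values and a segmentation as a flat list of segment labels, that segment means are rounded to integers before entropies are taken, and that U is the symmetric uncertainty of Witten & Frank, 2·I/(S_img + S_seg). The function names are hypothetical.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy, in bits, of a sequence of hashable values."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())


def segment_mean_channel(channel, labels):
    """Replace each pixel value with the rounded mean of its segment (assumed rounding)."""
    sums, counts = Counter(), Counter()
    for value, label in zip(channel, labels):
        sums[label] += value
        counts[label] += 1
    return [round(sums[label] / counts[label]) for label in labels]


def symmetric_uncertainty(img, seg):
    """U = 2 * I(img, seg) / (S_img + S_seg) for one color channel."""
    s_img, s_seg = entropy(img), entropy(seg)
    if s_img + s_seg == 0:
        return 0.0  # both channels constant; convention chosen for this sketch
    # Mutual information from marginal and joint entropies.
    mutual_info = s_img + s_seg - entropy(list(zip(img, seg)))
    return 2.0 * mutual_info / (s_img + s_seg)


channel = [10, 12, 200, 202]
seg = segment_mean_channel(channel, [0, 0, 1, 1])
print(seg, symmetric_uncertainty(channel, seg))
```

-  In this toy example a one-segment-per-pixel labeling gives U equal to one, and a single segment covering all pixels gives U equal to zero, which matches the limiting behavior described above.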
-  Different images have different quantities of information in each color channel. For example, the color image of FIG. 1A contains a large amount of information in the green channel. A weighted uncertainty index, Uw, can be defined as in Equation (4):
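-  $$U_{W} = \frac{U^{R}\, S^{R}_{img} + U^{G}\, S^{G}_{img} + U^{B}\, S^{B}_{img}}{S^{R}_{img} + S^{G}_{img} + S^{B}_{img}} \qquad (4)$$
-  where U is determined for each channel as in Equation (3), and S is the Shannon entropy for each channel.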
-  The index UW lies between 0 and 1 and is correlated with the segmentation quality. Referring to FIG. 4B, the weighted uncertainty index Uw is plotted as a function of log(k) for each of the 160×120 pixel blocks illustrated in FIG. 1A.
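-  As a small worked sketch of the weighting, the function below assumes that Equation (4) is the entropy-weighted average of the per-channel symmetric uncertainties, with the entropies taken on the original image. The dictionary-based interface and the zero-entropy convention are assumptions made for illustration.

```python
def weighted_uncertainty(u_by_channel, s_by_channel):
    """Entropy-weighted mean of per-channel symmetric uncertainties."""
    total = sum(s_by_channel.values())
    if total == 0:
        return 0.0  # no information in any channel; convention assumed here
    return sum(u_by_channel[c] * s_by_channel[c] for c in s_by_channel) / total


u = {"R": 0.2, "G": 0.5, "B": 0.3}   # illustrative per-channel U values
s = {"R": 1.0, "G": 2.0, "B": 1.0}   # illustrative per-channel entropies, in bits
print(weighted_uncertainty(u, s))
```

-  Because the weights are non-negative and sum to one after normalization, the result stays within the range of the per-channel U values, and a channel carrying more information (here the green channel) contributes more to the index.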
-  For a typical image, UW will decrease as k increases, passing from over-segmentation to under-segmentation. For a particular segmentation quality model, a representative set of training images at representative resolutions may be selected. For example, the curve depicted in FIG. 4B shows how the number of segments varies with k for the segmentation of portion A of the image in FIG. 1A at a given resolution. Given multiple training images and human classification of their segmentation quality at different values of k as in FIGS. 5A and 5B, a quality model can be derived, as was done with line 80 in FIGS. 5A and 5B, by determining a straight-line fit through the well-segmented points on the graphs. A single quality model can be used for multiple resolutions, or a quality model can be generated for each resolution. In one embodiment, a set of twelve images, including images of flowers, portraits, landscapes and sports environments, at 320×240 and 640×480 resolutions was next considered as a training set for the segmentation quality model. According to embodiments, each image was divided into blocks of 160×120 pixels, and each block was segmented with the graph-based algorithm of Felzenszwalb and Huttenlocher for σ=0.5, min size=5, and values of k ranging from 1 to 10,000. Each segmented block was displayed with display 30 and classified by a human observer through I/O devices 28 as over-segmented, well segmented or under-segmented. A weighted uncertainty index, Uw, was determined for each segmented block according to Equation (4).
-  The results are presented in FIGS. 5A and 5B. As illustrated in FIGS. 5A and 5B, a single value or range of k does not correspond to well-segmented blocks at a given resolution. However, an area in the (log(k), Uw) plane can be defined for this purpose.
-  For each block considered, an S-shaped curve UW=UW[log(k)] in the (log(k), UW) space was observed. As shown in FIGS. 5A and 5B, for relatively small k values, UW remains almost constant, and a human observer generally classifies these data as over-segmented. As k increases, UW decreases rapidly and a human observer generally classifies these data as well segmented. For relatively high k values, a human observer generally classifies these data as under-segmented, and UW approaches another almost constant value.
-  Output of the segmentation algorithm was classified as under-segmented, well-segmented, or over-segmented by a human supervisor for each training image and each input value of k. In embodiments, the straight-line quality model was stored. In other embodiments, all of the training results may be stored and the quality model may be derived as needed. In embodiments, some form of classifier may be stored that allows the classification of the (k, numsegments(k)) ordered pair as over-segmented, under-segmented, or well-segmented. The (log(k), UW) plane is subdivided into three different regions corresponding to 3 qualities of the segmentation result. Equation (5) was utilized to estimate the (m, b) parameters of the line UW=m·log(k)+b that separates under-segmented and well-segmented regions.
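-  $$(m, b) = \underset{m,\,b}{\arg\min}\left[\sum_{i=1}^{N_{US}} \delta_{US,i}\,\frac{\left|m\cdot\log(k_{i}) - U_{W,i} + b\right|}{\sqrt{m^{2}+1}} + \sum_{i=1}^{N_{WE}} \delta_{WE,i}\,\frac{\left|m\cdot\log(k_{i}) - U_{W,i} + b\right|}{\sqrt{m^{2}+1}}\right] \qquad (5)$$
-  where NUS and NWE are respectively the number of under-segmented and well-segmented points; where (log(ki), UW,i) is the i-th point of the corresponding class; and where δUS,i and δWE,i are 0 if the point is correctly classified (for instance, any under-segmented point should lie under the UW=m·log(k)+b line) and 1 otherwise.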
-  Equation (6) was utilized to estimate the (m, b) parameters of the line UW=m·log(k)+b that divides over-segmented and well-segmented regions.
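-  $$(m, b) = \underset{m,\,b}{\arg\min}\left[\sum_{i=1}^{N_{OS}} \delta_{OS,i}\,\frac{\left|m\cdot\log(k_{i}) - U_{W,i} + b\right|}{\sqrt{m^{2}+1}} + \sum_{i=1}^{N_{WE}} \delta_{WE,i}\,\frac{\left|m\cdot\log(k_{i}) - U_{W,i} + b\right|}{\sqrt{m^{2}+1}}\right] \qquad (6)$$
-  where NOS and NWE are respectively the number of over-segmented and well-segmented points; where (log(ki), UW,i) is the i-th point of the corresponding class; and where δOS,i and δWE,i are 0 if the point is correctly classified (for instance, any well-segmented point should lie under the UW=m·log(k)+b line) and 1 otherwise.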
-  The values of Equations (5) and (6) were minimized using a numerical algorithm. In embodiments, a simplex method is used. In practice, the cost function in each of Equation (5) and Equation (6) is the sum of the distances from the line UW=m·log(k)+b of all the points that are misclassified. The two lines that divide the (log(k), UW) plane may be estimated independently.
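-  The cost described above can be sketched as a small function. This is an illustration under stated assumptions: points are (log k, UW) pairs already grouped by the side of the line on which they are expected to lie, and "distance" is taken to be the perpendicular point-to-line distance. The function and argument names are hypothetical.

```python
import math


def misclassification_cost(m, b, below_points, above_points):
    """Sum of distances to the line U_W = m * log(k) + b over misclassified points.

    below_points: (log_k, u_w) pairs expected to lie under the line.
    above_points: (log_k, u_w) pairs expected to lie above the line.
    """
    norm = math.sqrt(m * m + 1.0)  # converts vertical offset to perpendicular distance
    cost = 0.0
    for log_k, u_w in below_points:
        if u_w > m * log_k + b:            # on the wrong side: counts toward the cost
            cost += (u_w - (m * log_k + b)) / norm
    for log_k, u_w in above_points:
        if u_w < m * log_k + b:
            cost += ((m * log_k + b) - u_w) / norm
    return cost


# Horizontal line at U_W = 0.5; one point on each side is misclassified.
print(misclassification_cost(0.0, 0.5, [(1.0, 0.4), (2.0, 0.7)], [(1.0, 0.6), (2.0, 0.2)]))
```

-  A function of this shape could be passed to a general-purpose simplex minimizer, for example SciPy's `scipy.optimize.minimize` with `method="Nelder-Mead"`, once for each of the two separating lines; that particular choice of library is an assumption and is not stated in the disclosure.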
-  The average of the line dividing the under-segmented and well-segmented regions and the line dividing the well-segmented and over-segmented regions was assumed to be the optimal line 80 for segmentation in the (log(k), UW) plane. Given the S-like shape of the typical UW=UW[log(k)] curve in the (log(k), UW) plane, a point of intersection between the optimal line for segmentation and the UW=UW[log(k)] curve can generally be identified. In embodiments, identification of this point gives an optimal k value for a given 160×120 image.
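-  A short sketch of how two fitted separating lines might be combined and used follows. It assumes each line is stored as an (m, b) pair and that the "average line" is obtained by averaging the two lines coefficient by coefficient, which is the same as averaging their UW values at every log(k). The names and the numeric values are illustrative only.

```python
def average_line(line_a, line_b):
    """Average two (m, b) lines coefficient by coefficient."""
    return ((line_a[0] + line_b[0]) / 2.0, (line_a[1] + line_b[1]) / 2.0)


def classify(log_k, u_w, under_well_line, over_well_line):
    """Label a (log k, U_W) point relative to the two separating lines."""
    m1, b1 = under_well_line
    m2, b2 = over_well_line
    if u_w < m1 * log_k + b1:
        return "under-segmented"
    if u_w > m2 * log_k + b2:
        return "over-segmented"
    return "well-segmented"


under_well = (-0.5, 1.0)    # illustrative line between under- and well-segmented
over_well = (-0.25, 2.0)    # illustrative line between well- and over-segmented
print(average_line(under_well, over_well))
print(classify(1.0, 1.0, under_well, over_well))
```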
-  Referring next to FIG. 6, an exemplary method 102 of determining a k value is provided for an image. The optimal line for segmentation, m·log(k)+b, derived above constitutes a set of points in the (log(k), UW) plane that may be reasonably perceived as good segmentation by a human observer. Consequently, given a sub-image of size 160×120 pixels, an optimal k value may be defined as a k value that generates a segmentation whose weighted symmetric uncertainty UW is close to m·log(k)+b. In other words, an optimal k value may include a k value for which a difference between the symmetric uncertainty, UW, of the generated segmentation and a portion of a linear relationship such as, for example, m·log(k)+b, is minimized.
-  An optimal k value can be computed iteratively through a bisection method 102, as illustrated in FIGS. 7A and 7B. The results of an exemplary implementation of the first five iterations of embodiments of method 102 (FIG. 6) are provided in FIGS. 7A and 7B. In block 104, an image is provided, illustratively the image shown in FIG. 7B multiple times. The image is divided into a plurality of sub-images. The remainder of FIG. 6 is carried out for each sub-image of the image. As shown in block 106, at iteration i=0, the sub-image is segmented for kLeft=1 and kRight=10,000. In embodiments, other values of k may be utilized. As illustrated in FIG. 7A, in this example, the corresponding values of UW,Left and UW,Right are computed for each of kLeft=1 and kRight=10,000. FIG. 7B illustrates the segmentation of the exemplary image (all sub-images) at iteration i=0 for kLeft=1 (upper left of FIG. 7B, over-segmented) and kRight=10,000 (upper right of FIG. 7B, under-segmented).
-  In block 108, the value of i is increased for the first iteration. In block 110, the mean log value (k=exp{[log(kLeft)+log(kRight)]/2}) is used to determine a new k value.
-  In block 112, the current iteration i is compared to the maximum number of iterations. In some exemplary embodiments, the maximum number of iterations is a predetermined integer, such as 5 or any other integer, which may be selected, for example, to optimize a trade-off between computational burden and image segmentation quality. In other exemplary embodiments, the maximum number of iterations is based on a difference between the k and/or Uw value determined in successive iterations. If the maximum number of iterations has been reached, the k value determined in block 110 is chosen as the final value of k for segmentation, as shown in block 114.
-  If the maximum number of iterations has not yet been reached in block 112, in block 116, the image is segmented and the corresponding UW is computed for the k value determined in block 110. As shown in FIG. 7A, the first iteration of k was 100, and the Uw calculated in the first iteration was 0.28. An example of a resulting segmentation of a first iteration of k is shown in FIG. 7B (second row, left image).
-  In block 118, the determined ki and Uw values are compared to the optimal line in the (log ki, Uw) plane. For example, as shown in FIG. 7A, for the first iteration, (log ki, Uw) is located above the optimal line, which indicates over-segmentation. The value of kLeft is replaced with ki in block 120, and the method 102 returns to block 108. In contrast, for the second iteration, the point for k=1000 (second row, right image in FIG. 7B) and Uw=0.17 is below the optimal line in the (log ki, Uw) plane, which indicates under-segmentation. In the second iteration, the value of kRight is replaced with ki in block 122, and the method 102 returns to block 108.
-  Exemplary results of embodiments of method 102 are presented in FIG. 7A for the image of FIG. 7B. Although the initial k values of 1 and 10,000 resulted in strong over-segmentation and under-segmentation, respectively, after several iterations, the image of FIG. 7B appears well-segmented, and the corresponding point in the (log(k), UW) space lies close to the optimal segmentation line, as shown in FIG. 7A. At iteration i=5, the point with k of 133.3521 and Uw of 0.27 lies very close to the optimal segmentation line (FIG. 7A), and the image appears well-segmented (FIG. 7B, bottom image).
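-  The iteration of method 102 can be sketched as a bisection in log(k). The sketch below rests on several assumptions: `uw_of_k` is a hypothetical callable that segments the sub-image at scale k and returns its weighted uncertainty UW; the logarithm in the line m·log(k)+b is taken as the natural logarithm; and a point above the line moves the lower bound up, which is the update direction that reproduces the reported k values of 100, 1000 and 133.3521.

```python
import math


def find_k(uw_of_k, m, b, k_left=1.0, k_right=10000.0, max_iter=5):
    """Bisection in log(k) toward the line U_W = m * log(k) + b."""
    i = 0
    while True:
        i += 1
        # Mean log value: the geometric mean of the current bracket.
        k = math.exp((math.log(k_left) + math.log(k_right)) / 2.0)
        if i >= max_iter:
            return k  # final k; no further segmentation is run for it
        if uw_of_k(k) > m * math.log(k) + b:
            k_left = k    # above the line: over-segmented, so raise k
        else:
            k_right = k   # below the line: under-segmented, so lower k


# Stand-in for segmentation: a step-like U_W that drops once k passes 150.
def fake_uw(k):
    return 0.28 if k < 150 else 0.17


print(find_k(fake_uw, m=0.0, b=0.2))
```

-  With this stand-in, the iterates are approximately 100, 1000, 316.2, 177.8 and 133.35, so four segmentations are evaluated and the fifth k is returned as the final value. A convergence-based stopping rule could replace the fixed iteration count.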
-  Although sub-images of 160×120 pixels were considered in FIGS. 7A and 7B, the parameters of the segmentation quality model change with the image resolution. In addition, the optimal segmentation line shown in FIG. 5A for a 320×240 resolution is lower than the optimal segmentation line shown in FIG. 5B for a 640×480 resolution. In embodiments, it is believed that at a higher resolution, more details may be generally visible in the image, thus indicating a higher segmentation quality (i.e., higher Uw). In embodiments, applying the segmentation quality model to other image resolutions may therefore call for re-classifying segmented sub-images of 160×120 pixels for the given resolution. In some embodiments, interpolation or extrapolation from known image or sub-image resolutions is used.
-  Method 102 (FIG. 6) was used to estimate the optimal k value for a sub-image of 160×120 pixels as illustrated in FIGS. 7A and 7B. In embodiments, to segment a full image at resolution of 320×240 or 640×480 pixels, a set of adjacent sub-images may be considered. In some embodiments, putting together the independent segmentations of each sub-image may not produce a satisfying segmentation map, since segments across the borders of the sub-images may be divided into multiple segments.
-  Referring next to FIG. 8, a modified method 202 is presented. Method 202 makes use of an adaptive scale factor k(x,y), and the threshold function τ(C)=k/|C| in Equation (2) becomes τ(C,x,y)=k(x,y)/|C|. In step 204, an image is provided. Exemplary images are shown in FIG. 9A and FIG. 10A. For each image, as shown in block 206, the image is segmented using k(x,y)=1 and k(x,y)=10,000 for all the image pixels.
-  As shown in block 208, each image is then divided into a plurality of sub-images. Illustratively, each sub-image may be 160×120 pixels. In block 210, for each sub-image, and independently from the other sub-images, a value of k for each sub-image was determined. In some embodiments, the value of k is determined for the sub-image using method 102 as described above with respect to FIG. 6. In block 214, the value of k determined in block 212 is assigned to all pixels in the sub-image.
-  In block 216, a scale map of k(x,y) for the image is smoothed through a low pass filter to avoid sharp transitions of k(x,y) across the image.
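-  The construction and smoothing of the scale map can be sketched as below. The disclosure does not specify the low pass filter, so a simple box (moving average) filter with border clipping is assumed here; the function names and the tiny block sizes in the example are illustrative only.

```python
def build_scale_map(block_k, block_h, block_w):
    """Expand a grid of per-sub-image k values into a per-pixel map k(x, y)."""
    return [[block_k[y // block_h][x // block_w]
             for x in range(len(block_k[0]) * block_w)]
            for y in range(len(block_k) * block_h)]


def box_smooth(scale_map, radius):
    """Low pass filter: mean over a (2r+1) x (2r+1) window clipped at the borders."""
    h, w = len(scale_map), len(scale_map[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            ys = range(max(0, y - radius), min(h, y + radius + 1))
            xs = range(max(0, x - radius), min(w, x + radius + 1))
            total = sum(scale_map[j][i] for j in ys for i in xs)
            out[y][x] = total / (len(ys) * len(xs))
    return out


# Two adjacent sub-images (2x2 pixels each for brevity) with k = 100 and k = 200.
k_map = build_scale_map([[100.0, 200.0]], block_h=2, block_w=2)
print(box_smooth(k_map, radius=1)[0])
```

-  In the example, the pixels next to the sub-image border take intermediate values between 100 and 200, while pixels away from the border keep the k value of their own sub-image.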
-  The results of an exemplary method 202 are illustrated in FIGS. 9 and 10. FIGS. 9A and 10A are two exemplary 640×480 pixel images taken from the dataset used for estimating the segmentation quality model in FIG. 5B. The k(x,y) scale map for each image following the smoothing through the low pass filter in block 216 is presented in FIGS. 9B and 10B, and the corresponding segmentation is shown in FIGS. 9C and 10C.
-  FIGS. 9D and 10D illustrate segmentation achieved with the graph-based approach of Felzenszwalb and Huttenlocher for σ=0.5, min size=5. The value of k was set experimentally to guarantee an equivalent number of segments as in FIGS. 9C and 10C. For FIG. 9D, k was set to 115, and for FIG. 10D, k was set to 187.
-  FIGS. 9B and 9C illustrate that the present method favors large segments (high k value) in the area occupied by persons in the image, and finer segmentation (low k value) in the upper left area of the image, where a large number of small leaves are present, when compared to the method of Felzenszwalb and Huttenlocher in FIG. 9D.
-  FIGS. 10B and 10C illustrate that embodiments of the present method may favor larger segments in the homogeneous area of the sky and skyscrapers, for example, preventing over-segmentation in the sky area, when compared to the method of Felzenszwalb and Huttenlocher as shown in FIG. 10D. In other embodiments, overlapping rectangular regions may be used.
-  In some embodiments, the segmentation of a second image can be estimated based on the segmentation of a first image. Exemplary embodiments include video processing or video encoding, in which adjacent frames of images may be highly similar or highly correlated. A method 302 for segmenting a second image is provided in FIG. 11. In block 304, the first image is provided. The first image is segmented by dividing the first image into a plurality of sub-images in block 306, determining a value of k for each sub-image in block 308, and segmenting the image based on the determined k value in block 310. In some embodiments, segmenting the first image in blocks 306-310 is performed using method 102 (FIG. 6) or method 202 (FIG. 8). In block 312, a second image is provided. In some embodiments, the first and second images are subsequent video images. The second image is divided into a plurality of sub-images in block 314. In some embodiments, one or more of the plurality of sub-images of the second image in block 314 correspond in size and/or location to one or more of the plurality of sub-images of the first image in block 306. In block 316, the k value for each sub-image of the first image determined in block 308 is provided as an initial estimate for the k value of each corresponding sub-image of the second image.
-  In other embodiments, as shown in FIG. 11, in block 318 the k values for the second image are optimized, using the estimated k values from the first image as an initial iteration, followed by segmenting the second image in block 340. In other embodiments, the second image is segmented based on the estimated k value in block 340 without first being optimized in block 318. In some embodiments, segmenting the second image in blocks 316-340 is performed using method 102 (FIG. 6) or method 202 (FIG. 8).
-  In some embodiments, such as in video-encoding applications, the computational cost of segmenting the video images can be significantly reduced. When applied to a single frame, the proposed method searches for the optimal k value for each sub-image over the entire range of k. For a video-encoding application, since adjacent frames are highly correlated in videos, the range for k can be significantly reduced by considering the estimates obtained at previous frames for the same sub-image and/or corresponding sub-image. In embodiments, k values may be updated only at certain frame intervals and/or scene changes.
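-  One way the search range might be narrowed for a subsequent frame is sketched below. The disclosure does not say how the reduced range is chosen; the multiplicative `factor` around the previous estimate, the function name, and the clipping bounds are all assumptions made for illustration.

```python
def narrowed_range(k_prev, factor=4.0, k_min=1.0, k_max=10000.0):
    """Bracket for the k search on the next frame.

    The bracket is centred, in log scale, on the k estimated for the
    corresponding sub-image of the previous frame and clipped to the full range.
    """
    return max(k_min, k_prev / factor), min(k_max, k_prev * factor)


# Previous frame gave k = 100 for this sub-image: search [25, 400] instead of [1, 10000].
print(narrowed_range(100.0))
```

-  The returned pair can serve as the initial kLeft and kRight of a bisection search on the new frame, so fewer iterations are needed to reach a comparable bracket width.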
-  In some embodiments, the above methods of automatically optimizing a segmentation algorithm are performed based on edge thresholding and working in the YUV color space, achieving similar results. In embodiments in which multiple input parameters are used by the segmentation algorithm, a similar segmentation quality model is used, but the optimal segmentation line as shown in FIGS. 5A, 5B, and 7A is replaced with a plane or hyper-plane.
-  While this invention has been described as relative to exemplary designs, the present invention may be further modified within the spirit and scope of this disclosure. Further, this application is intended to cover such departures from the present disclosure as come within known or customary practice in the art to which this invention pertains.
Claims (20)
 1. A method of segmenting an image, comprising:
    providing an image;
 determining a first value of a segment parameter, wherein the segment parameter relates to a threshold function for establishing a boundary condition between a first segment and a second segment;
 determining a first value of a similarity function configured to indicate a similarity between the image and its segmentation based on the first value of the segment parameter;
 comparing the first value of the segment parameter and the first value of the similarity function to a predetermined function;
 determining a second value of the segment parameter based on a result of the comparing; and
 segmenting the image based on the second value of the segment parameter.
  2. The method of claim 1 , wherein the similarity function comprises a symmetric uncertainty function.
     3. The method of claim 2 , wherein the predetermined function is a linear function representing a linear relationship between the log of the segment parameter and the symmetric uncertainty, wherein a value above the linear function indicates over-segmentation and a value below the linear function indicates under-segmentation.
     4. The method of claim 3 , further comprising determining an optimal value of the segment parameter, the optimal value of the segment parameter comprising a value of the segment parameter that generates a segmentation of the image for which a difference between a corresponding value of the symmetric uncertainty and a portion of the linear relationship is minimized.
     5. The method of claim 1 , wherein segmenting the image based on the second value of the segment parameter comprises:
    dividing the image into a plurality of sub-images, wherein the sub-images are overlapping or non-overlapping;
 generating a scale map for the image by determining a plurality of values of the segment parameter, wherein each of the plurality of values corresponds to one of the plurality of sub-images; and
 smoothing the scale map for the image using a filter.
  6. The method of claim 5 , wherein the filter comprises a low-pass filter.
     7. The method of claim 5 , further comprising:
    providing an additional image, wherein the additional image is disposed subsequent to the image in a video;
 dividing the additional image into an additional plurality of sub-images, wherein the additional plurality of sub-images corresponds to the plurality of sub-images in at least one of size and location;
 providing the plurality of values of the segment parameter as a plurality of initial estimates for the segment parameter corresponding to the additional plurality of sub-images;
 determining a plurality of optimized values of the segment parameter, wherein each of the plurality of optimized values corresponds to one of the additional plurality of sub-images; and
 segmenting the additional image based on the plurality of optimized values of the segment parameter.
  8. The method of claim 2 , wherein determining the linear function includes:
    providing a plurality of training images;
 generating a segmentation map for each of the plurality of training images at a plurality of values of the segment parameter;
 determining a value of a symmetric uncertainty for each segmentation map; and
 classifying each segmentation map as being over-segmented, well segmented, or under-segmented, based on a visual perception by at least one observer.
  9. A method of segmenting an image, comprising:
    providing an image;
 dividing the image into a plurality of sub-images, each sub-image comprising a plurality of pixels; and for each sub-image, the method comprising:
 determining a first value of a parameter, wherein the parameter relates to a threshold function for establishing a boundary condition between a first segment and a second segment;
determining a first value of a symmetric uncertainty of the sub-image based on the first value of the first parameter;
comparing the first value of the parameter and the first value of the symmetric uncertainty to a predetermined function; and
determining a second value of the parameter based on a result of the comparing.
 10. The method of claim 9 , further comprising:
    assigning the determined value of the parameter to each pixel in the sub-image;
 applying a filter to the assigned values to obtain a filtered value of the parameter for each pixel in the sub-image; and
 segmenting the image based on the filtered values of the parameter.
  11. The method of claim 10 , wherein the filter comprises a low-pass filter.
     12. The method of claim 9 , further comprising:
    providing an additional image;
 dividing the additional image into an additional plurality of sub-images;
 segmenting the additional image based in part on the second value of the parameter determined for each of the first plurality of sub-images of the image.
  13. The method of claim 9 , wherein the predetermined function is a linear function representing a linear relationship between the log of the first parameter and the symmetric uncertainty, wherein a value above the linear function indicates over-segmentation and a value below the linear function indicates under-segmentation.
     14. The method of claim 13 , wherein determining the linear function includes:
    providing a plurality of training images;
 generating a segmentation map for each of the plurality of training images at a plurality of values of the segment parameter;
 determining a value of a symmetric uncertainty for each segmentation map; and
 classifying each segmentation map as being over-segmented, well segmented, or under-segmented, based on a visual perception by at least one observer.
  15. A system, comprising:
    an image segmentation device, the image segmentation device comprising a processor and a memory, the memory comprising computer-readable media having computer-executable instructions embodied thereon that, when executed by the processor, cause the processor to instantiate one or more components, the one or more components comprising:
 a segment module configured to (1) determine a functional relationship between a first parameter and a second parameter based on input electronically received about a plurality of training images; and (2) for a sub-image of an image to be segmented:
determine an initial value of the first parameter for the sub-image; and
determine an initial value of the second parameter; and
a comparison module configured to perform a comparison between the initial value of the first parameter and the initial value of the second parameter to the functional relationship;
wherein the segment module is further configured to determine an updated value of the first parameter based on the comparison.
 16. The system of claim 15 , wherein the segment module is further configured to segment the image, based in part on the updated value of the first parameter, to create segmented image data.
     17. The system of claim 16 , further comprising an encoder configured to encode the segmented image data.
     18. The system of claim 17 , further comprising a communication module configured to facilitate communication of at least one of the image to be segmented and the segmented image data.
     19. The system of claim 17 , wherein the segment module is further configured to:
    divide the additional image into a plurality of sub-images; and
 segment the additional image, based in part on the updated value of the first parameter.
  20. The system of claim 15 , wherein the second parameter comprises a symmetric uncertainty of the sub-image.
    Priority Applications (2)
| Application Number | Priority Date | Filing Date | Title | 
|---|---|---|---|
| US15/357,906 US20170069101A1 (en) | 2014-10-01 | 2016-11-21 | Method and system for unsupervised image segmentation using a trained quality metric | 
| US15/480,361 US20170337711A1 (en) | 2011-03-29 | 2017-04-05 | Video processing and encoding | 
Applications Claiming Priority (4)
| Application Number | Priority Date | Filing Date | Title | 
|---|---|---|---|
| US201462058647P | 2014-10-01 | 2014-10-01 | |
| US201562132167P | 2015-03-12 | 2015-03-12 | |
| US14/696,255 US9501837B2 (en) | 2014-10-01 | 2015-04-24 | Method and system for unsupervised image segmentation using a trained quality metric | 
| US15/357,906 US20170069101A1 (en) | 2014-10-01 | 2016-11-21 | Method and system for unsupervised image segmentation using a trained quality metric | 
Related Parent Applications (2)
| Application Number | Title | Priority Date | Filing Date | 
|---|---|---|---|
| US14/696,255 Continuation US9501837B2 (en) | 2011-03-29 | 2015-04-24 | Method and system for unsupervised image segmentation using a trained quality metric | 
| US14/696,255 Continuation-In-Part US9501837B2 (en) | 2011-03-29 | 2015-04-24 | Method and system for unsupervised image segmentation using a trained quality metric | 
Related Child Applications (1)
| Application Number | Title | Priority Date | Filing Date | 
|---|---|---|---|
| US14/737,401 Continuation-In-Part US20160065959A1 (en) | 2011-03-29 | 2015-06-11 | Learning-based partitioning for video encoding | 
Publications (1)
| Publication Number | Publication Date | 
|---|---|
| US20170069101A1 true US20170069101A1 (en) | 2017-03-09 | 
Family
ID=54330885
Family Applications (2)
| Application Number | Title | Priority Date | Filing Date | 
|---|---|---|---|
| US14/696,255 Expired - Fee Related US9501837B2 (en) | 2011-03-29 | 2015-04-24 | Method and system for unsupervised image segmentation using a trained quality metric | 
| US15/357,906 Abandoned US20170069101A1 (en) | 2011-03-29 | 2016-11-21 | Method and system for unsupervised image segmentation using a trained quality metric | 
Family Applications Before (1)
| Application Number | Title | Priority Date | Filing Date | 
|---|---|---|---|
| US14/696,255 Expired - Fee Related US9501837B2 (en) | 2011-03-29 | 2015-04-24 | Method and system for unsupervised image segmentation using a trained quality metric | 
Country Status (7)
| Country | Link | 
|---|---|
| US (2) | US9501837B2 (en) | 
| EP (1) | EP3201873A1 (en) | 
| JP (1) | JP2017531867A (en) | 
| KR (1) | KR20170057362A (en) | 
| AU (2) | AU2015324988A1 (en) | 
| CA (1) | CA2963132A1 (en) | 
| WO (1) | WO2016054285A1 (en) | 
Cited By (1)
| Publication number | Priority date | Publication date | Assignee | Title | 
|---|---|---|---|---|
| US10949948B1 (en) * | 2020-06-21 | 2021-03-16 | Alexandru Kuzmin | Closed form method and system for large image matting | 
Families Citing this family (11)
| Publication number | Priority date | Publication date | Assignee | Title | 
|---|---|---|---|---|
| US9501837B2 (en) * | 2014-10-01 | 2016-11-22 | Lyrical Labs Video Compression Technology, LLC | Method and system for unsupervised image segmentation using a trained quality metric | 
| CN104346801B (en) * | 2013-08-02 | 2018-07-20 | 佳能株式会社 | Image composition apparatus for evaluating, information processing unit and its method | 
| KR102267871B1 (en) * | 2014-09-03 | 2021-06-23 | 삼성전자주식회사 | Display apparatus, mobile and method for controlling the same | 
| CN106683041B (en) * | 2016-12-12 | 2023-03-07 | 长春理工大学 | A Quantum Image Miscutting Method Based on NEQR Expression | 
| US10769422B2 (en) | 2018-09-19 | 2020-09-08 | Indus.Ai Inc | Neural network-based recognition of trade workers present on industrial sites | 
| US10853934B2 (en) | 2018-09-19 | 2020-12-01 | Indus.Ai Inc | Patch-based scene segmentation using neural networks | 
| CN113330485B (en) | 2019-01-08 | 2025-09-26 | 诺沃库勒有限责任公司 | Assess the quality of image segmentation into different tissue types for treatment planning using tumor treating fields (TTField) | 
| CN110322445B (en) * | 2019-06-12 | 2021-06-22 | 浙江大学 | A Semantic Segmentation Method Based on Maximizing Prediction and Inter-Label Correlation Loss Function | 
| CN112419344B (en) * | 2020-11-27 | 2022-04-08 | 清华大学 | An Unsupervised Image Segmentation Method Based on Chan-Vese Model | 
| CN113362345B (en) * | 2021-06-30 | 2023-05-30 | 武汉中科医疗科技工业技术研究院有限公司 | Image segmentation method, device, computer equipment and storage medium | 
| CN115439938B (en) * | 2022-09-09 | 2023-09-19 | 湖南智警公共安全技术研究院有限公司 | Anti-splitting face archive data merging processing method and system | 
Family Cites Families (17)
| Publication number | Priority date | Publication date | Assignee | Title | 
|---|---|---|---|---|
| US6266443B1 (en) * | 1998-12-22 | 2001-07-24 | Mitsubishi Electric Research Laboratories, Inc. | Object boundary detection using a constrained viterbi search | 
| US6625308B1 (en) * | 1999-09-10 | 2003-09-23 | Intel Corporation | Fuzzy distinction based thresholding technique for image segmentation | 
| US7676081B2 (en) * | 2005-06-17 | 2010-03-09 | Microsoft Corporation | Image segmentation of foreground from background layers | 
| US20080123959A1 (en) * | 2006-06-26 | 2008-05-29 | Ratner Edward R | Computer-implemented method for automated object recognition and classification in scenes using segment-based object extraction | 
| US8233712B2 (en) * | 2006-07-28 | 2012-07-31 | University Of New Brunswick | Methods of segmenting a digital image | 
| CA2718343A1 (en) * | 2007-03-15 | 2008-09-18 | Jean Meunier | Image segmentation | 
| US8135216B2 (en) * | 2007-12-11 | 2012-03-13 | Flashfoto, Inc. | Systems and methods for unsupervised local boundary or region refinement of figure masks using over and under segmentation of regions | 
| JP5200993B2 (en) * | 2009-02-24 | 2013-06-05 | 富士ゼロックス株式会社 | Image processing apparatus and image processing program | 
| US8391603B2 (en) * | 2009-06-18 | 2013-03-05 | Omisa Inc. | System and method for image segmentation | 
| GB2489272B (en) * | 2011-03-23 | 2013-03-13 | Toshiba Res Europ Ltd | An image processing system and method | 
| US9501837B2 (en) * | 2014-10-01 | 2016-11-22 | Lyrical Labs Video Compression Technology, LLC | Method and system for unsupervised image segmentation using a trained quality metric | 
| US20160065959A1 (en) * | 2014-08-26 | 2016-03-03 | Lyrical Labs Video Compression Technology, LLC | Learning-based partitioning for video encoding | 
| US8428363B2 (en) * | 2011-04-29 | 2013-04-23 | Mitsubishi Electric Research Laboratories, Inc. | Method for segmenting images using superpixels and entropy rate clustering | 
| WO2013040673A1 (en) * | 2011-09-19 | 2013-03-28 | The University Of British Columbia | Method and systems for interactive 3d image segmentation | 
| EP2842325A4 (en) * | 2012-04-24 | 2015-10-14 | Lyrical Labs Video Compression Technology Llc | Macroblock partitioning and motion estimation using object analysis for video compression | 
| US20150071541A1 (en) * | 2013-08-14 | 2015-03-12 | Rice University | Automated method for measuring, classifying, and matching the dynamics and information passing of single objects within one or more images | 
| US9280831B1 (en) * | 2014-10-23 | 2016-03-08 | International Business Machines Corporation | Image segmentation | 
- 2015
    - 2015-04-24: US, US14/696,255 (US9501837B2), not active: Expired - Fee Related
    - 2015-09-30: KR, KR1020177010319A (KR20170057362A), not active: Abandoned
    - 2015-09-30: AU, AU2015324988A (AU2015324988A1), not active: Abandoned
    - 2015-09-30: CA, CA2963132A (CA2963132A1), not active: Abandoned
    - 2015-09-30: WO, PCT/US2015/053355 (WO2016054285A1), active: Application Filing
    - 2015-09-30: EP, EP15781823.8A (EP3201873A1), not active: Withdrawn
    - 2015-09-30: JP, JP2017517682A (JP2017531867A), not active: Ceased
- 2016
    - 2016-11-21: US, US15/357,906 (US20170069101A1), not active: Abandoned
- 2018
    - 2018-11-22: AU, AU2018267620A (AU2018267620A1), not active: Abandoned
Also Published As
| Publication number | Publication date | 
|---|---|
| AU2018267620A1 (en) | 2018-12-13 | 
| JP2017531867A (en) | 2017-10-26 | 
| KR20170057362A (en) | 2017-05-24 | 
| AU2015324988A1 (en) | 2017-04-27 | 
| WO2016054285A1 (en) | 2016-04-07 | 
| US9501837B2 (en) | 2016-11-22 | 
| CA2963132A1 (en) | 2016-04-07 | 
| US20160098842A1 (en) | 2016-04-07 | 
| EP3201873A1 (en) | 2017-08-09 | 
Similar Documents
| Publication | Publication Date | Title | 
|---|---|---|
| US9501837B2 (en) | | Method and system for unsupervised image segmentation using a trained quality metric |
| US11074734B2 (en) | | Image processing apparatus, image processing method and storage medium |
| US11120556B2 (en) | | Iterative method for salient foreground detection and multi-object segmentation |
| Arévalo et al. | | Shadow detection in colour high‐resolution satellite images |
| US20170337711A1 (en) | | Video processing and encoding |
| US8260048B2 (en) | | Segmentation-based image processing system |
| US20180295375A1 (en) | | Video processing and encoding |
| CN113781402A (en) | | Chip surface scratch defect detection method, device and computer equipment |
| US20150227810A1 (en) | | Visual saliency estimation for images and video |
| US20130330004A1 (en) | | Finding text in natural scenes |
| EP1120742A2 (en) | | Method for automatically creating cropped and zoomed versions of digital photographic images |
| US20090278859A1 (en) | | Closed form method and system for matting a foreground object in an image having a background |
| JP4979033B2 (en) | | Saliency estimation of object-based visual attention model |
| Rosenfeld | | Image pattern recognition |
| US9477885B2 (en) | | Image processing apparatus, image processing method and image processing program |
| EP3073443B1 (en) | | 3d saliency map |
| US20170178341A1 (en) | | Single Parameter Segmentation of Images |
| US9916662B2 (en) | | Foreground detection using fractal dimensional measures |
| CN114913463A (en) | | Image identification method and device, electronic equipment and storage medium |
| Mukherjee et al. | | A hybrid algorithm for disparity calculation from sparse disparity estimates based on stereo vision |
| Geetha et al. | | An improved method for segmentation of point cloud using minimum spanning tree |
| Zhou et al. | | Stereo matching based on guided filter and segmentation |
| Charpiat et al. | | Machine learning methods for automatic image colorization |
| Li et al. | | Local stereo matching algorithm using rotation-skeleton-based region |
| Datar et al. | | Color image segmentation based on Initial seed selection, seeded region growing and region merging |
Legal Events
| Date | Code | Title | Description | 
|---|---|---|---|
| | AS | Assignment | Owner name: LYRICAL LABS VIDEO COMPRESSION TECHNOLOGY, LLC, NE Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:FROSIO, IURI;RATNER, EDWARD;SIGNING DATES FROM 20170413 TO 20170705;REEL/FRAME:042990/0662 |
| | STCB | Information on status: application discontinuation | Free format text: ABANDONED -- FAILURE TO RESPOND TO AN OFFICE ACTION |