WO2014075224A1 - Video object segmentation with llc modeling - Google Patents
Video object segmentation with LLC modeling
- Publication number
- WO2014075224A1 (application PCT/CN2012/084536)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- video
- video object
- llc
- model
- frame
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/10—Segmentation; Edge detection
- G06T7/162—Segmentation; Edge detection involving graph-based methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/10—Segmentation; Edge detection
- G06T7/11—Region-based segmentation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/10—Segmentation; Edge detection
- G06T7/143—Segmentation; Edge detection involving probabilistic approaches, e.g. Markov random field [MRF] modelling
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/10—Segmentation; Edge detection
- G06T7/194—Segmentation; Edge detection involving foreground-background segmentation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/10—Image acquisition modality
- G06T2207/10016—Video; Image sequence
- G06T2207/10021—Stereoscopic video; Stereoscopic image sequence
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Computer Vision & Pattern Recognition (AREA)
- General Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- Probability & Statistics with Applications (AREA)
- Software Systems (AREA)
- Image Analysis (AREA)
Abstract
Video object segmentation is accomplished utilizing Locality-constrained Linear Coding (LLC) modeling and adaptive model learning. A video object isolator utilizes an LLC segmentor to segment a video object from a video based on the LLC modeling. Additionally, the video object isolator can use an LLC model learner to further adapt the model.
Description
VIDEO OBJECT SEGMENTATION WITH LLC MODELING
BACKGROUND
[0001] With the development of capture and storage devices, video data has increased tremendously in the last few years. It is commonly believed that it will continue to increase in the future. However, only a few objects in the video are useful for content understanding and analyzing. Therefore, video object segmentation is necessary for video processing.
[0002] Video object segmentation can be regarded as a labeling problem, where each pixel in all frames is assigned a unique label - foreground or background. Intuitively, this can be done by image segmentation if video is decoded into a sequence of image frames. There are methods in image segmentation such as, for example, Magic Wand (see generally, Li, Y., Sun, J., and Shum, H.-Y. : 'Video object cut and paste'. Proc. ACM SIGGRAPH 2005 Papers, Los Angeles, California 2005), Graph cuts (see generally, Y. Y. Boykov and M. P. Jolly, "Interactive graph cuts for optimal boundary region segmentation of objects in N-D images," in Computer Vision, 2001 and ICCV 2001. Proceedings. Eighth IEEE International Conference on, 2001, pp. 105-112 vol.1) and so on.
[0003] However, user interaction is needed for most of the segmentation methods, and it is tedious for a user to do it for all the frames. In order to solve this problem, Li et al. (Y. Li, J. Sun, and H.-Y. Shum, "Video object cut and paste," presented at the ACM SIGGRAPH 2005 Papers, Los Angeles, California, 2005) proposed to construct a 3D graph on video frames which can be viewed as a spatial-temporal volume. They used watershed to pre-segment the image and optimized the energy function with graph cuts. However, all of these approaches are either based on superpixels or are not very robust. As a result, temporal coherency has been difficult to maintain and post-processing, such as feature tracking or a constrained 2D graph cut, is needed for noisy segmentation results.
SUMMARY
[0004] Video object segmentation is accomplished utilizing locality-constrained linear coding (LLC) modeling and adaptive model learning. The video sequence is processed frame by frame. In each iteration, a three-dimensional (3D) graph based on two successive frames is constructed and then graph cuts are used to determine a video object. In one instance, LLC is utilized to model the foreground and background, and online model learning combined with LLC is used to adapt to the variation of an object in a video. The techniques permit better constructions, local smooth sparsity and analytical solutions.
[0005] The above presents a simplified summary of the subject matter in order to provide a basic understanding of some aspects of subject matter embodiments. This summary is not an extensive overview of the subject matter. It is not intended to identify key/critical elements of the embodiments or to delineate the scope of the subject matter. Its sole purpose is to present some concepts of the subject matter in a simplified form as a prelude to the more detailed description that is presented later.
[0006] To the accomplishment of the foregoing and related ends, certain illustrative aspects of embodiments are described herein in connection with the following description and the annexed drawings. These aspects are indicative, however, of but a few of the various ways in which the principles of the subject matter can be employed, and the subject matter is intended to include all such aspects and their equivalents. Other advantages and novel features of the subject matter can become apparent from the following detailed description when considered in conjunction with the drawings.
BRIEF DESCRIPTION OF THE DRAWINGS
[0007] FIG. 1 is a flow diagram of a method of video segmentation with LLC.
[0008] FIG. 2 is an example of results of a sample video frame.
[0009] FIG. 3 is a flow diagram of a method of model learning.
[0010] FIG. 4 is an example system employing an embodiment.
DETAILED DESCRIPTION
[0011] The subject matter is now described with reference to the drawings, wherein like reference numerals are used to refer to like elements throughout. In the following description, for purposes of explanation, numerous specific details are set forth in order to provide a thorough understanding of the subject matter. It can be evident, however, that subject matter embodiments can be practiced without these specific details. In other instances, well-known structures and devices are shown in block diagram form in order to facilitate describing the embodiments.
[0012] Video object segmentation utilizing LLC to model the foreground and background provides better construction, local smooth sparsity and an analytical solution. In addition, online model learning combined with LLC adapts to the variations of an object in a video. These techniques solve many of the difficulties faced with other methods.
[0013] Video object segmentation by minimal cuts of the graph can be viewed as identical to the problem of energy minimization. Although many energy functions have been proposed in recent works, likelihood energy is one of the most widely used. It evaluates the conformity of each node to the foreground or background model. In Boykov and Jolly (supra), a histogram of intensity distribution is used to model the foreground and background. Likelihood energy can then be calculated as the negative log-likelihood of the probability density function. While this is efficient for gray images, it is not tractable for color images because of the massive number of histogram bins (256x256x256). In order to solve this problem, Li et al. (see generally, Y. Li, J. Sun, C.-K. Tang, and H.-Y. Shum, "Lazy snapping," presented at the ACM SIGGRAPH 2004 Papers, Los Angeles, California, 2004) introduce K-means to cluster the colors and then use the cluster centers to calculate the likelihood energy based on the distance between each pixel and its nearest center. In addition, Gaussian Mixture Models (GMM) can be introduced to replace the histogram of intensity distribution, and with GMM, iterative optimization can also be used to refine the segmentation via user interaction (see generally, C. Rother, V. Kolmogorov, and A. Blake, "'GrabCut': interactive foreground extraction using iterated graph cuts," presented at the ACM SIGGRAPH 2004 Papers, Los Angeles, California, 2004).
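The cluster-center approach described above can be sketched in a few lines of Python. This is a minimal illustration and not the cited authors' implementation: the function names, the sample centers, and the normalization d_F / (d_F + d_B) are assumptions, since the paragraph only states that the energy is based on the distance between a pixel and its nearest center.

```python
import math

def nearest_center_distance(pixel, centers):
    """Euclidean distance from an RGB pixel to its nearest cluster center."""
    return min(math.dist(pixel, c) for c in centers)

def likelihood_energy(pixel, fg_centers, bg_centers):
    """Return (energy if labeled foreground, energy if labeled background).

    Assumption: the two nearest-center distances are normalized so that the
    energies sum to 1; the source text does not give the exact formula.
    """
    d_f = nearest_center_distance(pixel, fg_centers)
    d_b = nearest_center_distance(pixel, bg_centers)
    total = d_f + d_b
    if total == 0.0:  # pixel coincides with a center of both models
        return 0.5, 0.5
    return d_f / total, d_b / total

# Number of bins a full RGB histogram would need (256 levels per channel).
full_rgb_bins = 256 ** 3  # 16777216

fg = [(200, 30, 30), (220, 60, 50)]  # hypothetical foreground color centers
bg = [(20, 20, 200), (40, 40, 40)]   # hypothetical background color centers
e_fg, e_bg = likelihood_energy((210, 40, 40), fg, bg)
```

A reddish pixel close to the foreground centers receives a low foreground energy and a high background energy, which is the behavior the likelihood term is meant to capture.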
[0014] However, while these methods can be practicable for image segmentation, they are insufficient for video object segmentation. Because of the computational complexity, only a few cluster centers or GMM components are used to calculate the likelihood energy, which is not enough for a cluttered foreground or background. More importantly, an object in a video varies in almost all frames, for example by moving, rotating or scaling. Therefore, it is necessary to design a scheme that adapts to the variation of an object in the video. In image categorization, each image is commonly modeled by a histogram of its local features. If a model in image categorization can be viewed as a histogram or GMM, components in a video object segmentation coding scheme can be used to calculate the likelihood energy. Recently, LLC has achieved excellent performance in image categorization (see generally, Kai Yu, Tong Zhang, and Yihong Gong. Nonlinear learning using local coordinate coding. NIPS'09). Compared with sparse coding, LLC presents some attractive properties such as better construction, local smooth sparsity and an analytical solution.
[0015] FIG. 1 is an example method 100 of one instance of a technique to provide video object segmentation. Given a video sequence 102, a first frame is grabbed to do the initialization 104. The initialization includes two parts: getting an object mask and initializing the foreground and background model. The object mask can be calculated by any image segmentation method. The foreground or background model is represented by color centers which can be clustered by K-means. Then the video sequence is processed frame by frame. The likelihood energy is calculated with LLC 106 and the smoothness energies $E_2$ and $E_3$ are calculated as well 108. In each iteration, a 3D graph $G = (V, \varepsilon)$ is constructed 110. $V$ is the set of all nodes, which are divided into two parts: terminals $\{s, t\}$ (denoting foreground and background) and non-terminals (denoting pixels in both frames). $\varepsilon$ is the set of all edges, which are of two types: intra-frame edges $\varepsilon_{\text{intra}}$ (connecting adjacent pixels in the same frame) and inter-frame edges $\varepsilon_{\text{inter}}$ (connecting adjacent pixels in adjacent frames). It has been proven that the minimal cut problem of such a graph is identical to the following energy minimization:

$$E(X) = \sum_{i} E_1(x_i) + \alpha \sum_{(i,j) \in \varepsilon_{\text{intra}}} E_2(x_i, x_j) + \beta \sum_{(i,j) \in \varepsilon_{\text{inter}}} E_3(x_i, x_j), \qquad \text{(Eq. 1)}$$

where $x_i$ is the label of each node $p_i$ and $X = \{x_i : \forall i\}$. The first term $E_1$ evaluates the conformity of each node to the foreground or background model, so it is also referred to as likelihood energy. The last two terms $E_2$ and $E_3$ measure the differences of adjacent nodes: $E_2$ for the ones in the same frame, $E_3$ for the ones between two adjacent frames. Therefore, they are commonly viewed as the representation for smoothness and can be defined as follows:

(Eq. 2) [equation not recoverable from the source text; the surviving fragments are $\mathrm{dist}(i, j)$ and $\|p_i - p_j\|$]

where $E$ is the expectation of color contrast.
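To make the structure of the energy in (Eq. 1) concrete, the following Python sketch builds the intra-frame and inter-frame edges of a two-frame graph and evaluates the energy of a given labeling. It is an illustration under stated assumptions: the function names are hypothetical, inter-frame edges connect only pixels at the same position in the two frames, and a plain Potts penalty |x_i - x_j| stands in for the contrast-sensitive smoothness terms, whose exact form is not reproduced here.

```python
def build_edges(height, width):
    """Edges of a graph over two successive frames (4-connectivity per frame).

    Nodes are (frame, row, col) tuples with frame in {0, 1}.
    Assumption: inter-frame edges link only co-located pixels.
    """
    intra, inter = [], []
    for f in (0, 1):
        for r in range(height):
            for c in range(width):
                if c + 1 < width:
                    intra.append(((f, r, c), (f, r, c + 1)))
                if r + 1 < height:
                    intra.append(((f, r, c), (f, r + 1, c)))
    for r in range(height):
        for c in range(width):
            inter.append(((0, r, c), (1, r, c)))
    return intra, inter

def total_energy(labels, e1, e2, e3, intra, inter, alpha=1.0, beta=1.0):
    """E(X) = sum E1 + alpha * sum_intra E2 + beta * sum_inter E3."""
    likelihood = sum(e1(node, x) for node, x in labels.items())
    spatial = sum(e2(labels[i], labels[j], i, j) for i, j in intra)
    temporal = sum(e3(labels[i], labels[j], i, j) for i, j in inter)
    return likelihood + alpha * spatial + beta * temporal

def potts(x_i, x_j, i, j):
    """Placeholder smoothness term: 1 when neighboring labels differ."""
    return abs(x_i - x_j)

intra, inter = build_edges(2, 2)
# Left column foreground (1), right column background (0), in both frames.
labels = {(f, r, c): 1 if c == 0 else 0
          for f in (0, 1) for r in range(2) for c in range(2)}
energy = total_energy(labels, lambda node, x: 0.0, potts, potts, intra, inter)
```

With a zero likelihood term, the energy simply counts the label discontinuities: four horizontal intra-frame edges are cut and no inter-frame edge is, so the result is 4.0.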
[0016] With LLC coding, for each pixel $p_i \in \mathbb{R}^3$ (RGB), the following criterion should be satisfied:

$$\min_{C} \sum_{i} \|p_i - D c_i\|^2 + \lambda \|d_i \odot c_i\|^2 \quad \text{s.t. } \mathbf{1}^\top c_i = 1, \ \forall i, \qquad \text{(Eq. 3)}$$

where $\odot$ denotes element-wise multiplication, $d_i \in \mathbb{R}^M$ is the locality adaptor, $D \in \mathbb{R}^{3 \times M}$ represents the model and $c_i \in \mathbb{R}^M$ is the coefficient. Although (Eq. 3) has an analytical solution and is fast to calculate, approximated LLC is used to speed up the optimization. With approximated LLC, the K nearest neighbors of $p_i$ are first calculated and taken as the local bases $D_i$, and then a much smaller linear system is solved to get the optimized coefficient $c_i$:

$$c_i = \arg\min_{c_i} \|p_i - D_i c_i\|^2 \quad \text{s.t. } \mathbf{1}^\top c_i = 1, \ \forall i. \qquad \text{(Eq. 4)}$$

Residual is then computed according to:

(Eq. 5) [equation not preserved in the source text]
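The approximated LLC step can be sketched with NumPy as follows. This is a minimal sketch, not the patented implementation: the function names, the value of K, and the small regularization added before solving are assumptions, and the residual is assumed to be the squared reconstruction error because the residual formula itself is not available in the text. The closed form relies on the sum-to-one constraint: with the local bases shifted by the pixel, the objective becomes c^T C c, whose constrained minimizer is proportional to C^{-1} 1.

```python
import numpy as np

def llc_code(p, D, k=3, reg=1e-6):
    """Approximated LLC coding of one pixel.

    p : (3,) RGB pixel. D : (3, M) model whose columns are words.
    Returns (indices of the k nearest words, coefficients summing to 1).
    """
    p = np.asarray(p, dtype=float)
    D = np.asarray(D, dtype=float)
    dists = np.linalg.norm(D - p[:, None], axis=0)
    idx = np.argsort(dists)[:k]            # K nearest words form the local bases
    Z = D[:, idx] - p[:, None]             # local bases shifted to the pixel
    C = Z.T @ Z                            # (k, k) local covariance
    # Assumed regularization so the small system is always solvable.
    C += (reg * np.trace(C) + 1e-12) * np.eye(k)
    w = np.linalg.solve(C, np.ones(k))
    return idx, w / w.sum()                # enforce the sum-to-one constraint

def residual(p, D, idx, c):
    """Squared reconstruction error (assumed form of the residual)."""
    p = np.asarray(p, dtype=float)
    D = np.asarray(D, dtype=float)
    return float(np.sum((p - D[:, idx] @ c) ** 2))

# Hypothetical model with five color words stored as columns.
D = np.array([[0, 255, 0, 0, 255],
              [0, 0, 255, 0, 255],
              [0, 0, 0, 255, 255]], dtype=float)
pixel = (100.0, 50.0, 0.0)
idx, c = llc_code(pixel, D, k=3)
res = residual(pixel, D, idx, c)
```

Because the example pixel lies in the plane spanned by its three nearest words (black, red and green), the residual is close to zero; a pixel far from every word of a model would give a large residual, which is what makes the residual usable as a conformity measure.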
As two models are kept, one for the foreground and the other for the background, the likelihood energy is defined as follows:

[equation not preserved in the source text]

where $U$ is the uncertain region and $\sigma$ is the maximum value of $E_2$ and $E_3$. Thus, the maxflow of the graph is solved to get the object 112. The model is then updated with LLC based learning (discussed infra) (114) and then the iteration continues to the next frame 116 or ends 118 if completed. In FIG. 2, a comparison result 200 is shown with a sample video frame 202. Here, it can be seen that processing with LLC 208 yields a superior isolated object from the sample frame 202 compared to GMM 204 and K-means 206.
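The max-flow step can be illustrated on a toy graph. The sketch below uses a plain Edmonds-Karp max-flow on a dense capacity matrix, which is a swapped-in textbook algorithm chosen for brevity and not necessarily the solver used in practice. The function name, the energies and the edge weights are hypothetical. Each pixel is linked to the source with the cost of labeling it background and to the sink with the cost of labeling it foreground, so the pixels that remain reachable from the source after max-flow form the foreground.

```python
from collections import deque

def min_cut_labels(n, e1_fg, e1_bg, edges):
    """Label n pixels by an s-t minimum cut (Edmonds-Karp max-flow).

    e1_fg[p], e1_bg[p]: likelihood energy of labeling pixel p foreground
    or background. edges: (p, q, w) smoothness weights between pixels.
    Returns a list with 1 for foreground and 0 for background.
    """
    s, t = n, n + 1
    cap = [[0.0] * (n + 2) for _ in range(n + 2)]
    for p in range(n):
        cap[s][p] = e1_bg[p]   # this link is cut when p is labeled background
        cap[p][t] = e1_fg[p]   # this link is cut when p is labeled foreground
    for p, q, w in edges:
        cap[p][q] += w
        cap[q][p] += w

    def bfs():
        """Breadth-first search over edges with remaining capacity."""
        parent = [-1] * (n + 2)
        parent[s] = s
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for v in range(n + 2):
                if parent[v] == -1 and cap[u][v] > 1e-12:
                    parent[v] = u
                    queue.append(v)
        return parent

    while True:
        parent = bfs()
        if parent[t] == -1:        # no augmenting path left: flow is maximal
            break
        bottleneck, v = float("inf"), t
        while v != s:
            bottleneck = min(bottleneck, cap[parent[v]][v])
            v = parent[v]
        v = t
        while v != s:              # push flow and update residual capacities
            cap[parent[v]][v] -= bottleneck
            cap[v][parent[v]] += bottleneck
            v = parent[v]
    # Pixels still reachable from the source lie on the foreground side.
    return [1 if parent[p] != -1 else 0 for p in range(n)]

# Four pixels in a row; the first two resemble the foreground model.
labels = min_cut_labels(
    4,
    e1_fg=[0.1, 0.2, 0.9, 0.8],
    e1_bg=[0.9, 0.8, 0.1, 0.2],
    edges=[(0, 1, 0.5), (1, 2, 0.5), (2, 3, 0.5)],
)
```

For this toy input the minimum cut separates the first two pixels from the last two, at a total cost of 1.1 (0.6 of likelihood energy plus one cut smoothness edge of weight 0.5).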
[0017] Generally, a foreground or background model is generated by a clustering method such as K-means 206, Gaussian Mixture Models 204 and so on. In most cases, these methods are enough to model the foreground and background for image segmentation. In order to tackle foreground or background clutter, iterative optimization based on interactive user input is used to update the model. While this is acceptable for image segmentation, it is insufficient for video object segmentation, because it is tedious for users to do interactive labeling with every frame of the video.
[0018] As video object segmentation is done frame by frame, once a segmentation is given, the current segmentation result can be used to learn the model for the next frame. In addition, when a motion estimation method is used to propagate the labels of specified pixels, model learning can be reinforced with these labeled pixels. Segmentation and model learning can be optimized iteratively, which can be solved with the Coordinate Descent method. Given a video and a first manually labeled frame, an initialized model $D_{init}$ can be trained by K-means or other clustering methods. As illustrated in FIG. 3, a method 300 begins to learn a model 302 by initializing parameters (304). Then an outer loop 324 can be implemented for each frame of the video to update the model.
[0019] The segmentation is first done based on a previously updated model, and the summation of the coefficients $c_{sum}$ is also initialized in order to check the validity of each word of the model 306. Next, the labeled pixels are looped through 322 to update the corresponding words of the model, which is solved in a gradient descent manner 308, 310, 312, 314. In each inner loop, only the related words, which are in practice the nearest ones, are updated rather than the whole model. As a result, a larger model can be used in an instance of the present methods for accuracy of modeling, which is an advantage over GMM or K-means based methods, whose model size is limited by efficiency. At the end of the iteration, $c_{sum}$ is checked so that unused words of the model are replaced by randomly sampled words, for the purpose of adapting to the variation of the foreground or background 316. When the frames 318 are completed, the learning ends 320.
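The per-frame learning loop described above can be sketched with NumPy. This is an illustrative sketch only: the function name, the learning rate, the usage threshold, the normalization of the gradient step by the squared coefficient norm, and the choice to draw replacement words from the labeled pixels of the current frame are all assumptions that the text does not specify.

```python
import numpy as np

def update_model(D, pixels, k=3, lr=0.1, min_usage=1e-3, rng=None):
    """One frame of online model learning (sketch).

    D      : (3, M) model; columns are words. A modified copy is returned.
    pixels : (N, 3) pixels labeled as belonging to this model.
    Returns (updated model, accumulated coefficient magnitude per word).
    """
    rng = np.random.default_rng(0) if rng is None else rng
    D = np.array(D, dtype=float)              # copy, the input is left intact
    pixels = np.asarray(pixels, dtype=float)
    c_sum = np.zeros(D.shape[1])              # usage counter for each word
    for p in pixels:
        # Code the pixel with its k nearest words (approximated LLC).
        idx = np.argsort(np.linalg.norm(D - p[:, None], axis=0))[:k]
        Z = D[:, idx] - p[:, None]
        C = Z.T @ Z
        C += (1e-6 * np.trace(C) + 1e-12) * np.eye(k)
        w = np.linalg.solve(C, np.ones(k))
        c = w / w.sum()
        # Gradient of ||p - D_i c||^2 with respect to D_i is -2 * err * c^T.
        # Only the k nearest words move; the factor 2 is folded into lr and
        # the step is normalized by ||c||^2 (an assumed stabilization).
        err = p - D[:, idx] @ c
        D[:, idx] += lr * np.outer(err, c) / (c @ c)
        c_sum[idx] += np.abs(c)
    # Words that were (almost) never used are replaced by sampled pixels.
    unused = np.flatnonzero(c_sum < min_usage)
    if unused.size and len(pixels):
        picks = rng.integers(len(pixels), size=unused.size)
        D[:, unused] = pixels[picks].T
    return D, c_sum

# Hypothetical model: three dark words and one bright word that is never used.
D0 = np.array([[0.0, 10.0, 0.0, 250.0],
               [0.0, 0.0, 10.0, 250.0],
               [0.0, 0.0, 0.0, 250.0]])
frame_pixels = np.array([[1.0, 1.0, 5.0], [2.0, 1.0, 5.0], [1.0, 2.0, 5.0]])
D1, usage = update_model(D0, frame_pixels)
```

In this example the bright word is never among the nearest neighbors of any labeled pixel, so its usage stays at zero and it is replaced by one of the frame's pixels, while the three dark words are nudged toward the pixels they reconstruct.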
[0020] The above methods and processes can be employed in whole or in part in an example system 400 shown in FIG. 4. The video object isolator 402 segments objects found in a video 404 such that it can output an isolated video object 406. The video object isolator 402 can reside on a processor that has been configured to perform the steps and methods described herein. The processor can utilize hardware, firmware and/or software as required to properly perform these functions. The video object isolator 402 segments objects from the video 404 by utilizing an LLC object segmentor 408 that employs the techniques described above to segment an object from the video 404. An LLC model learner 410 can be optionally employed to facilitate adapting the model to better segment an object from the video 404.
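The composition of system 400 can be pictured as a small class that wires a segmentor to an optional model learner. The sketch below is purely illustrative: the class and method names, the callable signatures, and the toy threshold segmentor are hypothetical stand-ins for the LLC object segmentor 408 and the LLC model learner 410, which would contain the actual LLC coding and graph cut logic.

```python
class VideoObjectIsolator:
    """Sketch of an isolator: a segmentor plus an optional model learner."""

    def __init__(self, segmentor, learner=None):
        # segmentor(prev_frame, frame, model) -> mask
        # learner(model, frame, mask) -> updated model (optional)
        self.segmentor = segmentor
        self.learner = learner

    def isolate(self, frames, model):
        """Process the video frame by frame and return one mask per frame."""
        masks = []
        prev = None
        for frame in frames:
            mask = self.segmentor(prev, frame, model)
            if self.learner is not None:
                model = self.learner(model, frame, mask)
            masks.append(mask)
            prev = frame
        return masks

# Toy stand-ins: the "model" is a brightness threshold.
def threshold_segmentor(prev_frame, frame, model):
    return [1 if value > model else 0 for value in frame]

def mean_learner(model, frame, mask):
    return sum(frame) / len(frame)

isolator = VideoObjectIsolator(threshold_segmentor, mean_learner)
masks = isolator.isolate([[1, 9, 2, 8], [4, 12, 6, 14]], model=5)
```

Passing `learner=None` reproduces the configuration in which the model learner is not employed and the initial model is used for every frame.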
[0021] What has been described above includes examples of the embodiments. It is, of course, not possible to describe every conceivable combination of components or methodologies for purposes of describing the embodiments, but one of ordinary skill in the art can recognize that many further combinations and permutations of the embodiments are possible. Accordingly, the subject matter is intended to embrace all such alterations, modifications and variations that fall within the scope of the appended claims. Furthermore, to the extent that the term "includes" is used in either the detailed description or the claims, such term is intended to be inclusive in a manner similar to the term "comprising" as "comprising" is interpreted when employed as a transitional word in a claim.
Claims
1. A system that provides video object segmentation, comprising:
a video object isolator that performs video object segmentation based on a Locality-constrained Linear Coding (LLC) model.
2. The system of claim 1, wherein the video object isolator uses adaptive model learning.
3. The system of claim 1, wherein the video object isolator uses an iterative frame by frame process to segment an object.
4. The system of claim 1, wherein the video object isolator uses the LLC model on a foreground and background of a video.
5. The system of claim 4, wherein the video object isolator determines the energy likelihood based on the LLC model.
6. The system of claim 1, wherein the video object isolator constructs a three-dimensional graph based on two successive frames and uses graph cuts to segment a video object.
7. A method for video object segmentation, comprising:
segmenting a video object using a Locality-constrained Linear Coding (LLC) model.
8. The method of claim 7 further comprising:
using adaptive model learning to adapt the LLC model to improve video object segmentation.
9. The method of claim 7 further comprising:
processing a video frame by frame to build a three-dimensional graph based on two successive frames; and
using graph cuts to segment the video object.
10. The method of claim 7 further comprising:
using LLC to model a foreground and a background of a video to segment a video object.
11. The method of claim 7 further comprising:
using online model learning to adapt to a variation of an object in a video combined with LLC.
12. A system that isolates an object in a video, comprising:
a means for isolating a video object using a Locality-constrained Linear Coding (LLC) model; and
a means for using adaptive model learning to adapt to a variation of an object in a video combined with LLC.
13. The system of claim 12 further comprising:
a means for processing a video frame by frame to build a three-dimensional graph based on two successive frames; and
a means for using graph cuts to isolate the video object.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/CN2012/084536 WO2014075224A1 (en) | 2012-11-13 | 2012-11-13 | Video object segmentation with llc modeling |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/CN2012/084536 WO2014075224A1 (en) | 2012-11-13 | 2012-11-13 | Video object segmentation with llc modeling |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2014075224A1 (en) | 2014-05-22 |
Family
ID=50730458
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2012/084536 WO2014075224A1 (en) | 2012-11-13 | 2012-11-13 | Video object segmentation with llc modeling |
Country Status (1)
Country | Link |
---|---|
WO (1) | WO2014075224A1 (en) |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20060233436A1 (en) * | 2005-04-19 | 2006-10-19 | Honeywell International Inc. | 3D dense range calculations using data fusion techniques |
CN101789124A (en) * | 2010-02-02 | 2010-07-28 | 浙江大学 | Segmentation method for space-time consistency of video sequence of parameter and depth information of known video camera |
- 2012-11-13 WO PCT/CN2012/084536 patent/WO2014075224A1/en active Application Filing
Non-Patent Citations (2)
Title |
---|
WANG, LEI ET AL.: "Adaptive FKCN Method for Image Segmentation", ACTA ELECTRONICA SINICA, vol. 28, no. 2, 29 February 2000 (2000-02-29), pages 4-6 * |
ZHENG, LING: "Design and Implementation of Cloud-based Rapid Object Recognition", CHINA MASTERS' THESES FULL-TEXT DATABASE, 29 March 2012 (2012-03-29), pages 1-5, 16-20 * |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
GB2523330A (en) * | 2014-02-20 | 2015-08-26 | Nokia Technologies Oy | Method, apparatus and computer program product for segmentation of objects in media content |
US9633446B2 (en) | 2014-02-20 | 2017-04-25 | Nokia Technologies Oy | Method, apparatus and computer program product for segmentation of objects in media content |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 12888236; Country of ref document: EP; Kind code of ref document: A1 |
| | NENP | Non-entry into the national phase | Ref country code: DE |
| | 122 | Ep: pct application non-entry in european phase | Ref document number: 12888236; Country of ref document: EP; Kind code of ref document: A1 |