CN107590818B - An interactive video segmentation method - Google Patents
- Publication number
- CN107590818B CN107590818B CN201710794283.5A CN201710794283A CN107590818B CN 107590818 B CN107590818 B CN 107590818B CN 201710794283 A CN201710794283 A CN 201710794283A CN 107590818 B CN107590818 B CN 107590818B
- Authority
- CN
- China
- Prior art keywords
- pixel
- indicate
- previous frame
- space
- target
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Expired - Fee Related
Landscapes
- Image Analysis (AREA)
Abstract
The invention discloses an interactive video segmentation method. First, the target contour is estimated, yielding the estimated initial contour of the target object in the current frame; then, taking the estimated contour in the current frame as reference, a distance mapping yields the shortest distance from each pixel to the estimated contour, which serves as that pixel's position attribute. On top of the three colour attributes of each current-frame pixel, a position attribute reflecting the spatio-temporal constraint is added, namely each pixel's distance to the estimated target contour, extending the feature space to a higher dimension. When the graph structure is built, each attribute of the higher-dimensional space is first divided into histogram bins; the inter-frame smooth term is then converted and added to the data term computed by the global probability model, and the sum serves as the data term of the energy function model; finally the energy function model is solved by the max-flow/min-cut algorithm. The motion information of the target is thereby incorporated, strengthening the spatio-temporal continuity of the video segmentation.
Description
Technical field
The invention belongs to the field of video segmentation techniques in image processing and machine vision, and more particularly relates to an interactive video segmentation method.
Background technique
Video segmentation is a binary labelling problem: it treats a video or image sequence as a whole and, by some method, partitions out objects of practical significance. Video segmentation plays an important role in many fields. For example, in target recognition it can provide prior information; in image coding it can improve the efficiency of video compression. Generally, according to whether human interaction is involved, video segmentation divides into non-interactive and interactive methods. Non-interactive methods mainly rely on the motion features of the video object, such as methods based on optical flow or on gradient descent. Such methods apply well when the video contains moving objects; but if the target to be segmented is stationary, moves slowly, or moves intermittently, they cannot predict the probable region of the target object from motion features and thus fail to achieve the segmentation. Interactive video segmentation methods, by adding human interaction, handle the above problem of irregular target motion much better.
A video is a spatio-temporal whole, and the spatio-temporal constraint manifests as the spatio-temporal continuity of the segmentation result, comprising temporal continuity and spatial continuity. Temporal continuity shows up as the motion of the target between two consecutive frames and is an important guarantee for effectively propagating the segmentation result. Spatial continuity was used earliest in image segmentation; it expresses the similarity of adjacent pixels or adjacent regions, is commonly known as the smooth term (N-link) in the energy function, and is a necessary condition for preserving the integrity of the target object in the segmentation result. Since video segmentation extends image segmentation along the time dimension, spatio-temporal continuity is crucial for propagating the video segmentation result.
Spatio-temporal continuity is an important indicator of how well a video segmentation result propagates, and it is the attribute most used by video segmentation methods based on motion analysis. It comprises temporal continuity, which usually reflects the motion features of the target, and spatial continuity, which mainly reflects the target's shape information. Many video segmentation methods exploit temporal and spatial continuity. Some first perform a superpixel pre-segmentation and then assign the inter-frame smooth term by computing the similarity of superpixels in two adjacent frames, using only the spatial distance between the two superpixel centres and describing the temporal continuity, i.e. the motion information, with an appearance model. Point-trajectory-based video segmentation methods use dense optical flow to track long-term point trajectories and cluster the trajectories to obtain the target's spatio-temporal continuity. Other video segmentation methods, when considering spatio-temporal continuity, usually treat the time dimension and the space dimension equally, i.e. one adjacent unit of the time dimension is equated with one adjacent unit of the space dimension; but time and space in fact differ. Pixels at the same position in two adjacent frames are temporally adjacent in conventional video segmentation, the distance along the time dimension is simply one unit, and the spatio-temporal distance between a current-frame pixel and its neighbourhood pixels in the previous frame is then briefly computed as a Euclidean distance over time and space. Currently, video segmentation methods based on bilateral space regard the video to be segmented as a whole with six attributes in total: time, space, and pixel colour. Each dimension is mapped by linear interpolation into a six-dimensional bilateral space, which is then solved with a traditional graph-cut method to obtain the labels of the bilateral-space nodes; finally, inverse linear interpolation yields the foreground/background probability of every pixel of each frame of the video. Most current video segmentation methods are based on graph theory, and many directly generalize the graph-cut model from image segmentation to video segmentation, adding temporal continuity via optical flow or other tracking on top of the original spatial continuity. Traditional graph-cut models, when considering similarity between pixels, usually consider only nearby pixels; such methods do not model well the connection between pixels of similar colour over a wide range.
Summary of the invention
In view of the above defects or improvement needs of the prior art, the present invention provides an interactive video segmentation method, thereby solving the technical problems present in existing interactive video segmentation techniques: insufficient segmentation accuracy, inconsistent spatio-temporal continuity, and excessive interaction.
To achieve the above object, the present invention provides an interactive video segmentation method, comprising:
(1) obtaining the contour of the target in the previous frame image according to the segmentation result of the previous frame image;
(2) mapping the contour of the target in the previous frame image to the current frame image, matching each pixel on the contour to its position in the current frame image, and obtaining the estimated initial contour of the target in the current frame image;
(3) based on the estimated initial contour of the target in the current frame image, obtaining, by distance mapping, the shortest distance from each pixel to the estimated initial contour as that pixel's position attribute;
(4) converting each pixel of the current frame image from the RGB colour space to the YUV colour space and, on top of the YUV colour attributes of each pixel of the current frame image, adding the position attribute of each pixel, thereby extending the feature dimension of each pixel's attributes to a higher-dimensional space;
(5) converting the smooth term from each current-frame pixel to its previous-frame neighbourhood pixels into a data term according to the labels of the previous-frame pixels, adding the converted data term to the data term computed by the global probability model, and taking the summed data term as the data term of the energy function model, thereby obtaining the energy function model;
(6) solving the energy function model to obtain its solution, taking the current frame image as the previous frame image, and continuing to execute steps (1) to (5) until the video segmentation ends.
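The per-frame loop of steps (1) to (6) can be sketched as follows. Every helper here is a trivial stand-in (the real method uses optical flow matching, distance mapping, and a graph-cut solve, all placeholder names below are hypothetical); only the data flow between the steps and the frame-to-frame propagation are shown.

```python
def get_contour(prev_mask):          # step (1): contour of previous segmentation
    return prev_mask                 # stand-in: the mask plays the contour role

def map_to_current(contour, frame):  # step (2): estimated initial contour
    return contour                   # stand-in: identity motion assumed

def position_attr(contour):          # step (3): distance-to-contour attribute
    return 0.0

def solve_energy(frame, attrs, prev_mask):  # steps (4)-(6), stubbed
    return prev_mask

def segment_video(frames, keyframe_mask):
    mask = keyframe_mask             # accurate, interactively obtained key frame
    results = []
    for frame in frames:
        contour = map_to_current(get_contour(mask), frame)
        attrs = position_attr(contour)
        mask = solve_energy(frame, attrs, mask)
        results.append(mask)         # current frame becomes "previous" next
    return results

out = segment_video(['f1', 'f2', 'f3'], 'keyframe-mask')
```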
Preferably, step (4) specifically includes:
(4.1) converting each pixel of the current frame image from the RGB colour space to the YUV colour space by [cy cu cv]T = A·[cr cg cb]T, wherein [cy cu cv]T denotes the pixel value in the YUV colour space, [cr cg cb]T denotes the value in the RGB colour space, and A is the RGB-to-YUV conversion matrix;
(4.2) extending the feature dimension of each pixel's attributes to the higher-dimensional space by b(x) = [cy, cu, cv, l]T, wherein l denotes the position attribute of pixel x and b(x) denotes the corresponding higher-dimensional attribute of pixel x.
Preferably, in step (5), converting the smooth term from a current-frame pixel to its previous-frame neighbourhood pixels into a data term according to the labels of the previous-frame pixels comprises:
converting, by D~(x) = Σ_{y′∈N_y′} ω_xy′·|s_y′|, the smooth term from current-frame pixel x to its previous-frame neighbourhood pixels into the data term D~(x) according to the labels of the previous-frame pixels, wherein y′ denotes the previous-frame pixel corresponding to current-frame pixel x, indexed by the optical flow value of the current-frame pixel; N_y′ denotes the intra-frame neighbourhood pixel set of y′; ω_xy′ denotes the similarity between pixel x and previous-frame neighbourhood pixel y′; and |s_y′| denotes the label value of pixel y′.
Preferably, ω_xy′ is computed as ω_xy′ = exp(−ΔI²/(2σ²)) / ‖x−y′‖, wherein ‖x−y′‖ denotes the spatio-temporal distance between pixel x and previous-frame neighbourhood pixel y′, ΔI denotes the colour distance between pixel x and previous-frame neighbourhood pixel y′, σ denotes the gradient mean of the image, ‖x−y′‖ = sqrt((x_x−y_x′)² + (x_y−y_y′)²), (x_x, x_y) denotes the horizontal and vertical coordinates of pixel x, and (y_x′, y_y′) denotes the horizontal and vertical coordinates of pixel y′.
Preferably, the energy function model obtained in step (5) is expressed as: E(S) = Σ_x D(x) + Σ_x D~(x) − τ·‖θ_S − θ_S̄‖₁ + η·Σ_{(x,y)∈N} ω_xy·|s_x − s_y|, wherein S denotes the foreground pixel set, S̄ denotes the background pixel set, D~(x) is the data term converted from the inter-frame smooth term N-link, D(x) denotes the data term computed by the global probability model, θ_S denotes the histogram statistics of the foreground, θ_S̄ denotes the histogram statistics of the background, τ is the weight of the foreground/background distribution difference, η denotes the weight of the intra-frame smooth term, ω_xy denotes the similarity of adjacent pixels in the current frame, and ‖θ_S − θ_S̄‖₁ denotes the foreground/background similarity difference in the colour histogram.
Preferably, the intra-frame smooth term is computed as Σ_{(x,y)∈N} ω_xy·|s_x − s_y|, wherein s_x and s_y respectively denote the labels of pixels x and y, N denotes the set of adjacent pixel pairs in the current frame, ω_xy denotes the similarity of pixels x and y, ω_xy = exp(−ΔI²/(2σ²)) / ‖x−y‖, ‖x−y‖ denotes the spatio-temporal distance between pixels x and y, ΔI denotes the colour distance between pixels x and y, and σ is the gradient mean of the image.
Preferably, foreground pixels and background pixels are distinguished as follows:
after the estimated initial contour of the target in the current frame image is obtained, pixels are divided into foreground seed points and background seed points according to whether they lie within the estimated initial contour, by
Seeds(x) = 1, if M(x) = 1 and d(M(x)) > dis;
Seeds(x) = 0, if M(x) = 0 and d(M(x)) > dis;
Seeds(x) = −1, otherwise;
wherein M denotes the mask matrix after mapping, d(M(x)) denotes the distance map generated from the mask matrix, dis is a distance threshold, and M(x) denotes the mask value of pixel x. Seeds(x) takes three values: when Seeds(x) is 1, x is set as a foreground seed point; when Seeds(x) is 0, x is set as a background seed point; and when Seeds(x) is −1, x is set as an unknown region.
In general, compared with the prior art, the above technical scheme conceived by the present invention achieves the following beneficial effects:
1. For each pixel of the current frame, in addition to its R, G, B colour attributes, the present invention adds a position attribute reflecting the spatio-temporal continuity, namely each pixel's distance to the estimated target contour, thereby incorporating the motion information of the target and strengthening the spatio-temporal continuity of the video segmentation.
2. The present invention accounts for the difference between the time dimension and the space dimension of spatio-temporal continuity, replaces the traditional one-unit time step with the optical flow value, and makes a single-layer graph structure equivalent to a multi-layer graph structure by converting the inter-frame smooth term.
3. The present invention improves the spatio-temporal continuity of the inter-frame segmentation results through the optical-flow-based inter-frame smooth-term conversion and the bilateral spatio-temporal constraint, while also improving segmentation accuracy.
Detailed description of the invention
Fig. 1 is a schematic flow chart of an interactive video segmentation method provided by an example of the present invention;
Fig. 2 is a narrow-band optical flow visualization provided by an example of the present invention;
Fig. 3 is an RGB colour space provided by an example of the present invention;
Fig. 4 is a YUV colour space provided by an example of the present invention;
Fig. 5 is a distance map provided by an example of the present invention;
Fig. 6 is a final segmentation result, obtained by the interactive video segmentation method of the present invention, provided by an example of the present invention.
Specific embodiment
In order to make the objectives, technical solutions, and advantages of the present invention clearer, the present invention is further elaborated below with reference to the accompanying drawings and embodiments. It should be appreciated that the specific embodiments described herein are merely illustrative of the present invention and are not intended to limit it. In addition, the technical features involved in the various embodiments of the present invention described below can be combined with each other as long as they do not conflict.
In the interactive video segmentation method proposed by the present invention, first, the segmentation result of the previous frame is taken as input and the contour of the target in the previous frame is obtained; the target contour is then estimated, yielding the estimated initial contour of the target object in the current frame; and, taking the estimated contour in the current frame as reference, a distance mapping yields the shortest distance from each pixel to the estimated contour, which serves as that pixel's position attribute. Then each pixel of the current frame is converted from the RGB colour space to the YUV colour space and, in addition to its Y, U, V colour attributes, is given a position attribute reflecting the spatio-temporal constraint, namely its distance to the estimated target contour, extending the feature space to a higher dimension. When the graph structure is built, each attribute of the higher-dimensional space is first divided into histogram bins; the data term converted from the inter-frame smooth term N-link is then added to the data term computed by the global probability model, and the sum serves as the data term of the energy function model, yielding the energy function model. Finally the energy function model is solved. The present invention thereby incorporates the motion information of the target, strengthens the spatio-temporal continuity of the video segmentation, and obtains satisfactory segmentation results with less human interaction.
Fig. 1 shows a schematic flow chart of an interactive video segmentation method provided by an embodiment of the present invention; the method shown in Fig. 1 specifically includes the following steps:
(1) obtaining the contour of the target in the previous frame image according to the segmentation result of the previous frame image;
(2) mapping the contour of the target in the previous frame image to the current frame image, matching each pixel on the contour to its position in the current frame image, and obtaining the estimated initial contour of the target in the current frame image;
In an optional embodiment, a sparse optical flow matching algorithm can be used to propagate the target contour of the previous frame image to the current frame, obtaining the estimated initial contour of the target in the current frame image; the specific way in which the estimated initial contour of the target in the current frame image is obtained is not uniquely restricted in the embodiments of the present invention.
(3) based on the estimated initial contour of the target in the current frame image, obtaining, by distance mapping, the shortest distance from each pixel to the estimated initial contour as that pixel's position attribute;
(4) converting each pixel of the current frame image from the RGB colour space to the YUV colour space and, on top of the YUV colour attributes of each pixel of the current frame image, adding the position attribute of each pixel, thereby extending the feature dimension of each pixel's attributes to a higher-dimensional space;
In an optional embodiment, the present invention can use a bilateral grid Γ composed of regular sampling points ν: the pixels are first lifted into the higher-dimensional space and then distributed onto the grid sampling points, and the graph structure is then built on the grid sampling points.
In an optional embodiment, when computing the data term (T-link) and smooth term (N-link) of the higher-dimensional graph-structure nodes, the T-links and N-links can be assigned according to the standard OneCut segmentation model; however, the embodiment of the present invention does not uniquely restrict the way T-links and N-links are assigned.
Here, the RGB and YUV colour spaces are both colour models describing the colours of an image. RGB (red, green, blue) is a space defined according to how the eye recognizes colour and can represent most colours. However, in the fields of machine vision and image processing, images are usually not processed in the RGB colour space, because it contains only the three RGB colour channels and mixes together image details such as hue, brightness, and saturation, making these details hard to process quantitatively. In the YUV space, by contrast, each pixel has one luminance signal Y and two chrominance signals U and V. The luminance signal is a measure of intensity; separating the luminance and chrominance signals makes it possible to change the brightness value without affecting colour. The YUV colour space can be obtained by conversion from the RGB colour space: the colour image is first converted to greyscale, and the three main colour channels are turned into two additional chrominance signals describing colour; a YUV colour space converted from the RGB colour space can also be inverse-transformed back to the RGB colour space. Step (4) specifically includes the following sub-steps:
(4.1) converting each pixel of the current frame image from the RGB colour space to the YUV colour space by formula (1), [cy cu cv]T = A·[cr cg cb]T, wherein [cy cu cv]T denotes the pixel value in the YUV colour space, [cr cg cb]T denotes the value in the RGB colour space, and A is the RGB-to-YUV conversion matrix;
(4.2) extending the feature dimension of each pixel's attributes to the higher-dimensional space by formula (2), b(x) = [cy, cu, cv, l]T, wherein l denotes the position attribute of pixel x and b(x) denotes the corresponding higher-dimensional attribute of pixel x.
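Formula (1) itself is an image lost from this text, so the exact coefficients of the conversion matrix A are not reproduced here. As an illustration only, the standard full-range BT.601 RGB-to-YUV matrix (an assumption, not necessarily the patent's coefficients) can play the role of A:

```python
import numpy as np

# Assumed BT.601 full-range RGB->YUV matrix standing in for the patent's A.
M = np.array([[ 0.299,  0.587,  0.114],
              [-0.147, -0.289,  0.436],
              [ 0.615, -0.515, -0.100]])

def rgb_to_yuv(rgb):
    """Convert an (H, W, 3) float RGB image to YUV: [Y U V]^T = M [R G B]^T."""
    return rgb @ M.T

img = np.zeros((2, 2, 3))
img[0, 0] = [1.0, 1.0, 1.0]       # one white pixel
yuv = rgb_to_yuv(img)
# white maps to Y ~ 1.0 with near-zero chroma U, V
```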
Here, the three colour attributes and one position attribute of each pixel are combined to lift it into a four-dimensional feature space: if the colour attribute of a pixel is c(x) = [cy, cu, cv]T, then the lifted higher-dimensional attribute is b(x) = [cy, cu, cv, l]T. The position attribute l of each pixel is obtained by computing the pixel's shortest distance to the estimated contour; the contour estimation contains the motion information of the target.
Here, the pixels of the colour space can be mapped to the higher-dimensional space through various kinds of interpolation; common interpolation modes include nearest-neighbour interpolation, linear interpolation, and exponential interpolation.
In an optional embodiment, to reduce the amount of computation, the mapped higher-dimensional space needs to be down-sampled. For example, nearest-neighbour interpolation can be used: after the high-dimensional mapping above, the value of each dimension of a high-dimensional node is rounded in nearest-neighbour fashion. If the j-th dimension value of higher-dimensional node i is b_j(i), nearest-neighbour interpolation takes, as in formula (3), the largest integer not greater than b_j(i), i.e. ⌊b_j(i)⌋.
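The nearest-neighbour rounding onto the regular grid can be sketched as below; the grid cell sizes are illustrative assumptions, not values from the patent:

```python
import numpy as np

def to_grid(features, cell=(8.0, 8.0, 8.0, 4.0)):
    """Down-sample lifted 4-D features [Y, U, V, l] onto a regular grid by
    flooring each dimension (largest integer not greater than the value),
    after scaling by an assumed per-dimension cell size."""
    return np.floor(features / np.asarray(cell)).astype(int)

f = np.array([[120.0, 30.0, 40.0, 6.0]])
idx = to_grid(f)   # -> [[15, 3, 5, 1]]
```

Pixels that floor to the same grid index share one graph node, which is what reduces the size of the graph to be solved.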
(5) converting the smooth term from each current-frame pixel to its previous-frame neighbourhood pixels into a data term according to the labels of the previous-frame pixels, adding the converted data term to the data term computed by the global probability model, and taking the summed data term as the data term of the energy function model, thereby obtaining the energy function model;
In an optional embodiment: traditional graph-cut models are mostly based on a single-layer graph structure, i.e. they model the whole current frame image or a local narrow-band region within it. This way of building the graph is effective and necessary in image segmentation; but when extended to video segmentation, since object motion has temporal continuity and the displacement of an object between two adjacent frames is small, building the graph structure only within the current frame fails to make full use of the segmentation result of the previous frame. Building a graph structure over two adjacent frames, or over multiple frames, can effectively guarantee the spatio-temporal continuity of the segmentation result and the accuracy of the final segmentation.
However, as the number of layers of a multi-layer graph structure grows, the numbers of nodes and edges of the graph increase sharply and the computation time cost becomes enormous. The present invention therefore makes a single-layer graph equivalent to a multi-layer one by converting the inter-frame N-links to the previous frame. Specifically, since the segmentation result of the previous frame divides all its pixels into two classes, each previous-frame pixel adjacent to a current-frame pixel to be segmented is already labelled either foreground or background. Converting the N-link from a current-frame pixel to a previous-frame neighbourhood pixel into a T-link according to the previous-frame pixel's label, on the one hand, reduces the amount of computation of the final solve and, on the other hand, uses optical flow as a spatio-temporal constraint.
The amount of computation for solving a graph structure depends on the number of its nodes and the number of edges connecting each node. For an unoptimized multi-layer graph structure, nodes and edges increase sharply with the number of layers. In view of the space and time complexity of the computation, the present invention uses inter-frame smooth-term conversion to make a single-layer graph structure equivalent to a multi-layer one. The traditional inter-frame conversion is as follows:
D~(x) = Σ_{y∈N_y} w_xy·|s_y|   (4)
wherein N_y denotes the neighbourhood pixel set, in the previous frame, of the pixel y at the same position as pixel x; |s_y| denotes the label value of pixel y (0 or 1); w_xy denotes the similarity between pixel x and previous-frame neighbourhood pixel y, w_xy = exp(−ΔI²/(2σ²)) / ‖x−y‖; ‖x−y‖ denotes the spatio-temporal distance between pixels x and y; ΔI denotes the colour distance between pixels x and y; and σ denotes the gradient mean of the whole image. When the spatio-temporal distance ‖x−y‖ of pixels x and y is computed, the time interval between consecutive frames is taken as 1, i.e. one pixel within the frame. The spatio-temporal distance ‖x−y‖ of pixels x and y computed in this way is then given by the following formula:
‖x−y‖ = sqrt((x_x−y_x)² + (x_y−y_y)² + 1)   (5)
wherein x_x and x_y respectively denote the horizontal and vertical coordinates of pixel x, and similarly for pixel y. When computing the spatio-temporal distance, formula (5) roughly treats the time dimension and the space dimension equally, ignoring the difference between them in the spatio-temporal continuity.
Considering the difference between the time dimension and the space dimension in the spatio-temporal continuity, a unit of the time dimension and a unit of the space dimension cannot be treated equally when computing neighbourhood relations. Therefore, in an optional embodiment, the inter-frame smooth-term conversion of the present invention is based on an optical flow constraint: the optical flow value, i.e. the flow vector, of each current-frame pixel is solved and indexes the previous-frame pixel y′ corresponding to current-frame pixel x. The optical flow mapping is given by the following formula:
y′ = f(x)   (6)
In formula (6), f denotes the optical flow mapping, i.e. the optical flow field obtained through an optical flow pyramid; the pixel y′ of the previous frame corresponding to pixel x is found through the position mapping f. Fig. 2 shows a narrow-band optical flow visualization provided by an example of the present invention.
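The mapping y′ = f(x) of formula (6) amounts to indexing a dense flow field at x. A minimal sketch, assuming a synthetic constant flow field rather than the pyramid optical flow of the method:

```python
import numpy as np

def flow_map(xs, flow):
    """xs: (N, 2) integer pixel coords (row, col) in the current frame;
    flow: (H, W, 2) per-pixel displacement back to the previous frame.
    Returns the corresponding previous-frame coords y'."""
    d = flow[xs[:, 0], xs[:, 1]]          # look up each pixel's flow vector
    return xs + d.astype(int)

flow = np.full((4, 4, 2), 1)              # everything displaced by (+1, +1)
xs = np.array([[1, 1], [2, 0]])
ys = flow_map(xs, flow)                   # -> [[2, 2], [3, 1]]
```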
In an optional embodiment, in step (5), converting the smooth term from a current-frame pixel to its previous-frame neighbourhood pixels into a data term according to the labels of the previous-frame pixels specifically includes:
converting, by formula (7), D~(x) = Σ_{y′∈N_y′} ω_xy′·|s_y′|, the smooth term from current-frame pixel x to its previous-frame neighbourhood pixels into the data term D~(x) according to the labels of the previous-frame pixels, wherein y′ denotes the previous-frame pixel corresponding to current-frame pixel x, indexed by the optical flow value of the current-frame pixel; N_y′ denotes the intra-frame neighbourhood pixel set of y′; ω_xy′ denotes the similarity between pixel x and previous-frame neighbourhood pixel y′; and |s_y′| denotes the label value of pixel y′. Then, according to whether pixel y′ is labelled foreground or background, the data term converted by formula (7) is added to the source-node or sink-node side of the graph respectively.
Here, ω_xy′ is computed as ω_xy′ = exp(−ΔI²/(2σ²)) / ‖x−y′‖, wherein ‖x−y′‖ denotes the spatio-temporal distance between pixel x and previous-frame neighbourhood pixel y′, ΔI denotes the colour distance between pixel x and previous-frame neighbourhood pixel y′, σ denotes the gradient mean of the image, ‖x−y′‖ = sqrt((x_x−y_x′)² + (x_y−y_y′)²), (x_x, x_y) denotes the horizontal and vertical coordinates of pixel x, and (y_x′, y_y′) denotes the horizontal and vertical coordinates of pixel y′.
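A sketch of the N-link-to-T-link conversion. The weight below is the standard graph-cut boundary weight reconstructed from the terms defined above (the patent's formula images are not reproduced here, so treat the exact form as an assumption):

```python
import math

def omega(dI, dist, sigma):
    """Similarity weight: exp(-dI^2 / (2 sigma^2)) / dist, where dI is the
    colour distance, dist the spatio-temporal distance, sigma the image
    gradient mean."""
    return math.exp(-dI * dI / (2.0 * sigma * sigma)) / dist

def tlink_from_prev(label_prev, dI, dist, sigma):
    """Convert the N-link to previous-frame pixel y' into a T-link:
    attach the weight to the source if y' was foreground (label 1),
    to the sink if it was background (label 0)."""
    w = omega(dI, dist, sigma)
    return ('source' if label_prev == 1 else 'sink', w)

# similar colours yield a larger weight than dissimilar ones
assert omega(1.0, 1.0, 10.0) > omega(50.0, 1.0, 10.0)
```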
In an optional embodiment, the energy function model obtained in step (5) is expressed as formula (9): E(S) = Σ_x D(x) + Σ_x D~(x) − τ·‖θ_S − θ_S̄‖₁ + η·Σ_{(x,y)∈N} ω_xy·|s_x − s_y|, wherein S denotes the foreground pixel set, S̄ denotes the background pixel set, D~(x) is the data term converted from the inter-frame smooth term N-link, and D(x) denotes the data term computed by the global probability model. The global probability model consists of foreground/background Gaussian mixture models, initialized on the segmentation result of the key frame, and the parameter update of the global probability model is completed according to the generated distance map. θ_S denotes the histogram statistics of the foreground, θ_S̄ denotes the histogram statistics of the background, τ is the weight of the foreground/background distribution difference, η denotes the weight of the intra-frame smooth term, ω_xy denotes the similarity of adjacent pixels in the current frame, and ‖θ_S − θ_S̄‖₁ denotes the foreground/background similarity difference in the colour histogram, namely the L1 distance between foreground and background.
Here, the intra-frame smooth term is computed as Σ_{(x,y)∈N} ω_xy·|s_x − s_y|, wherein s_x and s_y respectively denote the labels of pixels x and y, N denotes the set of adjacent pixel pairs in the current frame, ω_xy denotes the similarity of pixels x and y, ω_xy = exp(−ΔI²/(2σ²)) / ‖x−y‖, ‖x−y‖ denotes the spatio-temporal distance between pixels x and y, ΔI denotes the colour distance between pixels x and y, and σ is the gradient mean of the image.
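The intra-frame smooth term only charges adjacent pairs whose labels differ. A minimal sketch over a 4-neighbourhood with unit weights (a real implementation would use the colour-based ω_xy above):

```python
import numpy as np

def smooth_cost(labels, eta=1.0):
    """eta * sum over 4-neighbour pairs of |s_x - s_y|, with all pair
    weights taken as 1 for illustration."""
    s = labels.astype(int)
    cost = np.abs(np.diff(s, axis=0)).sum() + np.abs(np.diff(s, axis=1)).sum()
    return eta * cost

lab = np.array([[1, 1, 0],
                [1, 0, 0]])
# three label-discontinuous neighbour pairs -> cost 3.0
print(smooth_cost(lab))
```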
Here, foreground pixels and background pixels are distinguished as follows:
after the estimated initial contour of the target in the current frame image is obtained, pixels are divided into foreground seed points and background seed points according to whether they lie within the estimated initial contour:
Seeds(x) = 1, if M(x) = 1 and d(M(x)) > dis;
Seeds(x) = 0, if M(x) = 0 and d(M(x)) > dis;
Seeds(x) = −1, otherwise;
wherein M denotes the mask matrix after mapping, in which the estimated initial contour becomes the dividing line between the regions mapped as foreground and as background; d(M(x)) denotes the distance map generated from the mask matrix; dis is a distance threshold; and M(x) denotes the mask value of pixel x: through the contour transformation, M(x) takes the value 1 if pixel x is mapped as foreground and 0 if pixel x is mapped as background. Seeds(x) therefore takes three values: when Seeds(x) is 1, x is set as a foreground seed point; when Seeds(x) is 0, x is set as a background seed point; and when Seeds(x) is −1, x is set as an unknown region.
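The three-way Seeds(x) rule can be written down directly. The mask M and distance map d below are toy inputs; in the method, d comes from the distance mapping of step (3):

```python
import numpy as np

def seeds(M, d, dis):
    """Seeds(x): 1 = foreground seed (inside the contour and far from it),
    0 = background seed (outside and far), -1 = unknown region."""
    out = np.full(M.shape, -1)
    out[(M == 1) & (d > dis)] = 1
    out[(M == 0) & (d > dis)] = 0
    return out

M = np.array([[1, 1, 0],
              [1, 0, 0]])
d = np.array([[5, 1, 1],
              [1, 1, 5]])
print(seeds(M, d, dis=2))
# -> [[ 1 -1 -1]
#     [-1 -1  0]]
```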
(6) solving the energy function model to obtain its solution, taking the current frame image as the previous frame image, and continuing to execute steps (1) to (5) until the video segmentation ends.
The present invention is now further described with reference to the drawings and specific embodiments. The method flow of the invention is shown in Fig. 1 and is illustrated here with the test video "bear" as an example:
(1) Obtain the accurately segmented contour of the previous frame
Assume the segmentation result of the previous frame is reliable; the previous frame may be a key frame, for which an accurate segmentation result with added interaction is required and the global probability model, i.e. the foreground/background Gaussian mixture models, is initialized. The clustering algorithm uses kmeans++, and the numbers of foreground and background Gaussian mixtures are both set to 5. From the segmentation result of the previous frame, the precise contour of the target object in the previous frame is obtained.
(2) Obtain the initial position of the target in the current frame, preferably by sparse optical flow matching
Through sparse optical flow matching, the target contour of the previous frame is mapped to the current frame, i.e. each pixel on the contour is matched to its position in the current frame, yielding the initial estimated contour of the target in the current frame.
(3) Generate a distance map from the initially estimated target contour
From the initially estimated contour of the target in the current frame, distance mapping yields a distance map of each pixel's distance to the contour: the closer to the contour, the smaller the distance, and the farther from the contour, the larger. The generated distance map is shown in Fig. 5.
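A brute-force version of this distance mapping (a real implementation would use a fast distance transform); the single-pixel contour is toy data:

```python
import numpy as np

def distance_map(contour_mask):
    """Shortest Euclidean distance from every pixel to any contour pixel."""
    H, W = contour_mask.shape
    cy, cx = np.nonzero(contour_mask)          # contour pixel coordinates
    yy, xx = np.mgrid[0:H, 0:W]
    d2 = (yy[..., None] - cy) ** 2 + (xx[..., None] - cx) ** 2
    return np.sqrt(d2.min(axis=-1))            # min over contour pixels

c = np.zeros((3, 3), bool)
c[1, 1] = True                                 # single contour pixel
d = distance_map(c)
# centre pixel has distance 0; corners are sqrt(2) away
```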
(4) RGB color is converted to YUV color space
The original RGB color colour space is converted by above formula (1), obtains YUV color space, RGB color such as Fig. 3 institute
Show, YUV color space is as shown in Figure 4.
(5) higher dimensional space maps
YUV color attribute intrinsic dimensionality is carried out plus position attribution according to above formula (2) to extend to obtain higher dimensional space.
(6) Convert the inter-frame N-links
The inter-frame N-links are converted by formula (7) above, and the converted values are added onto the corresponding T-links.
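Formula (7) folds the inter-frame smoothness of a pixel x into a per-pixel data term: the label-weighted sum over x's previous-frame neighbourhood. Since the exact form of formula (7) is not reproduced in this excerpt, the sketch below is a hedged reading of the claim text.

```python
import numpy as np

def nlink_to_tlink(weights, prev_labels):
    """Convert the inter-frame N-link of one pixel x into a data
    (T-link) contribution: sum over x's previous-frame neighbourhood of
    w_xy' * |s_y'|, where s_y' is the previous frame's label of y'.
    A sketch of formula (7); the weights w_xy' would come from the
    similarity defined in claim 2."""
    return float(np.dot(weights, np.abs(prev_labels)))

term = nlink_to_tlink(np.array([0.5, 0.25, 0.25]),
                      np.array([1, 0, 1]))   # -> 0.75
```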
(7) Construct the segmentation energy function model
Foreground and background seed points are chosen from the distance map: among the selected seed points, pixels whose distance to the contour line exceeds a given threshold are set as seed points, and all remaining pixels are set as the unknown region; the foreground/background probabilistic models are then updated. The data term of the energy function model mainly includes the part converted from the inter-frame smoothness term (N-link); its calculation is given by formula (7) above. In the graph structure, the data term corresponds to the edges connecting ordinary nodes to the source and the sink.
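The seed selection from the distance map can be sketched directly: pixels farther than a threshold from the estimated contour become foreground or background seeds depending on whether they lie inside the mapped mask, and everything near the contour stays unknown, matching the three Seeds(x) values of claim 4. The function name and threshold are illustrative.

```python
import numpy as np

def pick_seeds(dist, inside, thresh):
    """1 = foreground seed, 0 = background seed, -1 = unknown region,
    following the three values of Seeds(x): only pixels farther than
    `thresh` from the estimated contour are committed as seeds."""
    seeds = np.full(dist.shape, -1, dtype=int)
    seeds[(dist > thresh) & inside] = 1
    seeds[(dist > thresh) & ~inside] = 0
    return seeds

dist = np.array([[5.0, 1.0], [5.0, 1.0]])            # toy distance map
inside = np.array([[True, True], [False, False]])    # toy mapped mask
seeds = pick_seeds(dist, inside, thresh=2.0)
```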
(8) Obtain the accurate segmentation result by the max-flow/min-cut algorithm
The final energy function model is given by formula (9) above. Solving this model is equivalent to solving a min-cut problem, and since the max-flow and min-cut problems are duals, it is ultimately equivalent to computing the maximum flow of the graph; the maxflow algorithm is preferably used. The accurate segmentation result of the current frame is thus obtained. The current frame then becomes the previous frame, and steps (1) through (7) above are repeated until the video segmentation is complete. The final segmentation result is shown in Figure 6.
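The final solve reduces to a max-flow computation on the segmentation graph. The patent prefers the maxflow (Boykov-Kolmogorov) algorithm; the sketch below uses the simpler textbook Edmonds-Karp variant on a dense capacity matrix, which computes the same max-flow/min-cut value on a toy graph.

```python
from collections import deque

def max_flow(cap, s, t):
    """Edmonds-Karp max-flow on a dense capacity matrix -- a simple
    stand-in for the Boykov-Kolmogorov maxflow algorithm, enough to
    illustrate the final solving step."""
    n = len(cap)
    flow = [[0] * n for _ in range(n)]
    total = 0
    while True:
        # breadth-first search for an augmenting path in the residual graph
        parent = [-1] * n
        parent[s] = s
        q = deque([s])
        while q and parent[t] == -1:
            u = q.popleft()
            for v in range(n):
                if parent[v] == -1 and cap[u][v] - flow[u][v] > 0:
                    parent[v] = u
                    q.append(v)
        if parent[t] == -1:          # no augmenting path left: done
            return total
        # find the bottleneck capacity along the path, then augment
        aug, v = float('inf'), t
        while v != s:
            u = parent[v]
            aug = min(aug, cap[u][v] - flow[u][v])
            v = u
        v = t
        while v != s:
            u = parent[v]
            flow[u][v] += aug
            flow[v][u] -= aug
            v = u
        total += aug

# Tiny graph: node 0 = source, node 3 = sink
cap = [[0, 3, 2, 0],
       [0, 0, 1, 2],
       [0, 0, 0, 3],
       [0, 0, 0, 0]]
value = max_flow(cap, 0, 3)   # -> 5
```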
As will be readily appreciated by those skilled in the art, the foregoing is merely a description of preferred embodiments of the present invention and is not intended to limit the invention; any modifications, equivalent substitutions, and improvements made within the spirit and principles of the present invention shall fall within the protection scope of the present invention.
Claims (4)
1. An interactive video segmentation method, characterized by comprising:
(1) obtaining the contour line of a target in a previous frame image according to the segmentation result of the previous frame image;
(2) mapping the contour line of the target in the previous frame image to a current frame image, matching each pixel on the contour line to that pixel's position in the current frame image, and obtaining an estimated initial contour line of the target in the current frame image;
(3) based on the estimated initial contour line of the target in the current frame image, obtaining by distance mapping the shortest distance from each pixel to the estimated initial contour line, as the position attribute of that pixel;
(4) transforming each pixel in the current frame image from the RGB color space to the YUV color space, and adding, on the basis of the YUV color attributes of each pixel in the current frame image, the position attribute of each pixel, thereby expanding the feature dimension of each pixel's attributes into a higher-dimensional space;
(5) converting the smoothness term from current-frame pixels to previous-frame neighborhood pixels into a data term according to the labels of the previous-frame pixels, superposing the converted data term onto the data term calculated from the global probabilistic model, and taking the superposed result as the data term of an energy function model, thereby obtaining the energy function model;
(6) solving the energy function model to obtain its solution, then taking the current frame image as the previous frame image and continuing to execute steps (1) through (5) until the video segmentation is complete;
wherein step (4) specifically comprises:
(4.1) transforming each pixel in the current frame image from the RGB color space to the YUV color space by formula (1), wherein [cy cu cv]T denotes the pixel value in the YUV color space and [cr cg cb]T denotes the value in the RGB color space;
(4.2) expanding the feature dimension of each pixel's attributes into a higher-dimensional space by b(x) = [cy, cu, cv, l]T, wherein l denotes the position attribute of pixel x and b(x) denotes the corresponding higher-dimensional attribute of pixel x;
in step (5), the converting of the smoothness term from current-frame pixels to previous-frame neighborhood pixels into a data term according to the labels of the previous-frame pixels comprises:
converting the smoothness term from current-frame pixel x to its previous-frame neighborhood pixels into a data term according to the labels of the previous-frame pixels, wherein y' denotes the previous-frame pixel corresponding to current-frame pixel x through the optical flow value of each current-frame pixel, the neighborhood is the intra-frame neighborhood pixel set of y', ωxy' denotes the similarity between pixel x and previous-frame neighborhood pixel y', and |sy'| denotes the label value of pixel y';
the energy function model obtained in step (5) is expressed by the formula, wherein S denotes the foreground pixel set, S̄ denotes the background pixel set, the converted term is the data term obtained by converting the inter-frame smoothness term (N-link), D(x) denotes the data term calculated from the global probabilistic model, θS denotes the foreground histogram statistics, θS̄ denotes the background histogram statistics, τ is the weight of the foreground/background distribution difference, η denotes the weight of the intra-frame smoothness term, ωxy denotes the similarity of adjacent pixels in the current frame, and the final term denotes the foreground/background similarity difference in the color histogram.
2. the method according to claim 1, wherein ωxy′Calculation are as follows:Wherein, | | x-y'| | indicate the time-space matrix of pixel x and previous frame field pixel y ', Δ I
Indicate that the color distance of pixel x and previous frame field pixel y ', σ indicate the gradient mean value of image, and(xx,xy) indicate pixel x transverse and longitudinal coordinate, (yx',yy') indicate pixel y ' cross
Ordinate.
3. The method according to claim 1 or 2, characterized in that the intra-frame smoothness term is calculated by the formula, wherein sx and sy respectively denote the labels of pixels x and y, N denotes the set of adjacent pixel pairs in the current frame, ωxy denotes the similarity of pixels x and y, ||x − y|| denotes the space-time distance between pixels x and y, ΔI denotes the color distance between pixels x and y, and σ is the gradient mean of the image.
4. The method according to claim 1 or 2, characterized in that foreground pixels and background pixels are distinguished as follows: after the estimated initial contour line of the target in the current frame image is obtained, pixels are divided into foreground seed points and background seed points according to whether they lie inside the estimated initial contour line, wherein M denotes the mask matrix after mapping, d(M(x)) denotes the distance map generated from the mask matrix, dis is the distance threshold, and M(x) denotes the mask value of pixel x; Seeds(x) takes three values: when Seeds(x) is 1, x is set as a foreground seed point; when Seeds(x) is 0, x is set as a background seed point; and when Seeds(x) is −1, x is set as the unknown region.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710794283.5A CN107590818B (en) | 2017-09-06 | 2017-09-06 | A kind of interactive video dividing method |
Publications (2)
Publication Number | Publication Date |
---|---|
CN107590818A CN107590818A (en) | 2018-01-16 |
CN107590818B true CN107590818B (en) | 2019-10-25 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
CF01 | Termination of patent right due to non-payment of annual fee |
Granted publication date: 20191025 Termination date: 20200906 |