CN107590818A - Interactive video segmentation method - Google Patents

Interactive video segmentation method

Info

Publication number: CN107590818A
Application number: CN201710794283.5A
Authority: CN (China)
Legal status: Granted; Expired - Fee Related
Other languages: Chinese (zh)
Other versions: CN107590818B
Inventors: 韩守东, 杨迎春, 刘昱均, 陈阳, 胡卓
Assignees: Huazhong University of Science and Technology; Shenzhen Huazhong University of Science and Technology Research Institute
Application filed by Huazhong University of Science and Technology and Shenzhen Huazhong University of Science and Technology Research Institute
Priority and filing date: 2017-09-06
Publication of CN107590818A: 2018-01-16
Grant and publication of CN107590818B: 2019-10-25


Abstract

The invention discloses an interactive video segmentation method. First, the target contour is estimated to obtain the estimated initial contour of the target object in the current frame; then, taking the estimated target contour of the current frame as reference, a distance mapping yields the shortest distance from each pixel to the estimated target contour, which serves as the position attribute of that pixel. On the basis of the three-dimensional color attribute of each current-frame pixel, a position attribute reflecting the spatio-temporal constraint is added, namely the distance from each pixel to the estimated target contour, extending the features to a high-dimensional space. When the graph structure is built, each attribute of the high-dimensional space is first divided in advance into multiple histogram bins; the interframe smooth term is then converted and superimposed onto the data term computed from the global probabilistic model to serve as the data term of the energy function model; finally, the solution of the energy function model is obtained with a max-flow/min-cut algorithm. The motion information of the target is thus successfully incorporated, strengthening the spatio-temporal consistency of the video segmentation.

Description

Interactive video segmentation method
Technical field
The invention belongs to the technical field of video segmentation in image processing and machine vision, and more particularly relates to an interactive video segmentation method.
Background technology
Video segmentation is a binary labeling problem that aims to treat a video or image sequence as a whole and, by a given method, separate out the objects of practical significance. Video segmentation plays an important role in many fields. For example, in target recognition, video segmentation can provide prior information for the recognizer; in image coding, video segmentation can improve the efficiency of video compression coding. In general, according to whether human interaction is added, video segmentation can be divided into two kinds: non-interactive video segmentation and interactive video segmentation. Non-interactive video segmentation mainly uses the motion features of the video object, such as non-interactive methods based on optical flow or on gradient descent. Such methods have good applicability to moving objects in a video, but if the target to be segmented is stationary, moves slowly, or moves intermittently, they cannot predict the probable region of the target object through motion features and thus cannot achieve the purpose of segmentation. Interactive video segmentation methods, by adding human interaction, can better solve the problem of such irregular target motion.
A video is a spatio-temporal whole, and the spatio-temporal constraint manifests itself as the spatio-temporal consistency of the video segmentation result, comprising temporal continuity and spatial continuity. Temporal continuity manifests as the motion of the target across two adjacent frames and is an important guarantee for the effective propagation of segmentation results. Spatial continuity was first used in image segmentation; it takes the form of the similarity of adjacent pixels or adjacent regions, is commonly known as the smooth term (N-link) in the energy function, and is a necessary condition for ensuring the integrity of the target object in the segmentation result. Video segmentation is the extension of image segmentation along the time dimension; spatio-temporal consistency is therefore essential to the propagation of video segmentation results.
Spatio-temporal consistency is an important indicator for judging how well a video segmentation result propagates, and it is the attribute most used by the various video segmentation methods based on motion analysis. Spatio-temporal consistency comprises temporal continuity and spatial continuity: temporal continuity usually reflects the motion features of the target, while spatial continuity mainly reflects the shape information of the target. Many video segmentation methods exploit spatio-temporal continuity. Some methods first perform a superpixel pre-segmentation and then assign the interframe smooth term by computing the similarity of superpixels in two adjacent frames, using only the spatial distance of the two superpixel centers, while temporal continuity, i.e. motion information, is described with an appearance model. Point-trajectory tracking methods use dense optical flow to track point trajectories over long time scales and obtain the spatio-temporal consistency of the target by clustering the trajectories. Other video segmentation methods, when considering spatio-temporal consistency, usually treat the time dimension and the space dimension equally, i.e. one adjacent unit of the time dimension is equated with one adjacent unit of the space dimension. But time and space actually differ: pixels at the same position in two adjacent frames are temporally adjacent in conventional video segmentation, the distance along the time dimension is exactly one unit, and the spatio-temporal distance between a current-frame pixel and the neighborhood pixels of the previous frame is then computed simply as a Euclidean distance over time and space. Video segmentation methods based on bilateral space regard the video to be segmented as a whole with six attributes in total, comprising time, space, and pixel color; each dimension is linearly interpolated into a six-dimensional bilateral space, the bilateral space is solved with a conventional graph-cut method to obtain the labels of the bilateral-space nodes, and finally the foreground/background probabilities of the pixels of every frame of the video are obtained by inverse linear interpolation. Most current video segmentation methods are based on graph theory, and many directly generalize the graph-cut model from image segmentation to video segmentation, adding temporal continuity on top of the original spatial continuity via optical flow or other tracking. When considering the similarity between pixels, the traditional graph-cut model usually considers only a few nearby pixels; such methods do not model well the connection between pixels of similar color over a wide range.
The content of the invention
In view of the above defects or improvement needs of the prior art, the invention provides an interactive video segmentation method, thereby solving technical problems present in existing interactive video segmentation techniques, such as low accuracy of segmentation results, inconsistent spatio-temporal continuity, and an excessive amount of interaction.
To achieve the above object, the invention provides an interactive video segmentation method, comprising:
(1) according to the segmentation result of the previous frame image, obtaining the contour line of the target in the previous frame image;
(2) mapping the contour line of the target in the previous frame image to the current frame image, matching for each pixel on the contour line its position in the current frame image, and obtaining the estimated initial contour line of the target in the current frame image;
(3) based on the estimated initial contour line of the target in the current frame image, deriving by distance mapping the shortest distance from each pixel to the estimated initial contour line, as the position attribute of that pixel;
(4) transforming each pixel in the current frame image from the RGB color space to the YUV color space, and, on the basis of the YUV color attributes of each pixel in the current frame image, adding the position attribute of each pixel, thereby extending the feature dimension of each pixel's attributes to a high-dimensional space;
(5) converting the smooth term between current-frame pixels and previous-frame neighborhood pixels into a data term according to the labels of the previous-frame pixels, superimposing the converted data term onto the data term computed from the global probabilistic model, and taking the superimposed data term as the data term of the energy function model, to obtain the energy function model;
(6) solving the energy function model to obtain its solution, taking the current frame image as the previous frame image, and continuing to execute steps (1) to (5) until the video segmentation ends.
Preferably, step (4) specifically comprises:
(4.1) transforming each pixel in the current frame image from the RGB color space to the YUV color space by

[c_y]   [ 0.299   0.587   0.114] [c_r]
[c_u] = [-0.147  -0.289   0.436] [c_g]
[c_v]   [ 0.615  -0.515  -0.100] [c_b]

where [c_y c_u c_v]^T denotes the pixel value in the YUV color space and [c_r c_g c_b]^T denotes the value in the RGB color space;
(4.2) extending the feature dimension of each pixel's attributes to the high-dimensional space by b(x) = [c_y, c_u, c_v, l]^T, where l denotes the position attribute of pixel x and b(x) denotes the high-dimensional attribute corresponding to pixel x.
Preferably, in step (5), converting the smooth term between current-frame pixels and previous-frame neighborhood pixels into a data term according to the labels of the previous-frame pixels comprises:
converting the smooth term between current-frame pixel x and the previous-frame neighborhood pixels into a data term according to the labels of the previous-frame pixels by D_f(x) = Σ_{y′∈N_y′} ω_xy′·|s_y′|, where y′ denotes the previous-frame pixel to which the optical flow value of the current frame maps current-frame pixel x, N_y′ is the neighborhood pixel set around y′ in the previous frame, ω_xy′ denotes the similarity between pixel x and previous-frame neighborhood pixel y′, and |s_y′| denotes the label value of pixel y′.
Preferably, ω_xy′ is computed as ω_xy′ = exp(−ΔI²/(2σ²)) / ‖x−y′‖, where ‖x−y′‖ denotes the spatio-temporal distance between pixel x and previous-frame neighborhood pixel y′, ΔI denotes the color distance between pixel x and previous-frame neighborhood pixel y′, σ denotes the gradient mean of the image, and ‖x−y′‖ = √((x_x−y′_x)² + (x_y−y′_y)²), where (x_x, x_y) denotes the horizontal and vertical coordinates of pixel x and (y′_x, y′_y) denotes those of pixel y′.
Preferably, the energy function model obtained in step (5) is expressed as E(S, S̄) = Σ_x (D(x) + D_f(x)) − τ·‖θ_S − θ_S̄‖₁ + η·Σ_{(x,y)∈N} ω_xy·|s_x − s_y|, where S denotes the foreground pixel set, S̄ denotes the background pixel set, D_f(x) is the data term obtained by converting the interframe smooth term N-link, D(x) denotes the data term computed from the global probabilistic model, θ_S denotes the foreground histogram statistics, θ_S̄ denotes the background histogram statistics, τ is the weight of the foreground/background distribution difference, η denotes the weight of the intra-frame smooth term, Σ_{(x,y)∈N} ω_xy·|s_x − s_y| measures the similarity of adjacent pixels in the current frame, and ‖θ_S − θ_S̄‖₁ denotes the foreground/background similarity difference over the color histogram.
Preferably, the intra-frame smooth term is computed as Σ_{(x,y)∈N} ω_xy·|s_x − s_y| with ω_xy = exp(−ΔI²/(2σ²)) / ‖x−y‖, where s_x and s_y denote the labels of pixels x and y respectively, N denotes the set of adjacent pixel pairs in the current frame, ω_xy denotes the similarity of pixels x and y, ‖x−y‖ denotes the spatio-temporal distance between pixels x and y, ΔI denotes the color distance of pixels x and y, and σ is the gradient mean of the image.
Preferably, foreground pixels and background pixels are distinguished as follows:
after obtaining the estimated initial contour line of the target in the current frame image, pixels are divided into foreground seed points and background seed points, according to whether they lie within the estimated initial contour line, by
Seeds(x) = 1, if d(M(x)) > dis and M(x) = 1;
Seeds(x) = 0, if d(M(x)) > dis and M(x) = 0;
Seeds(x) = −1, otherwise;
where M denotes the mapped mask matrix, d(M(x)) denotes the distance map generated from the mask matrix, dis is the distance threshold, and M(x) denotes the mask value of pixel x. Seeds(x) takes three values: when Seeds(x) is 1, x is set as a foreground seed point; when Seeds(x) is 0, x is set as a background seed point; when Seeds(x) is −1, x is set as an unknown region.
In general, compared with the prior art, the above technical solutions conceived by the invention can achieve the following beneficial effects:
1. For each pixel of the current frame, in addition to its R, G, B color attributes, the invention adds a position attribute reflecting the spatio-temporal constraint, namely the distance from the pixel to the estimated target contour; this successfully incorporates the motion information of the target and strengthens the spatio-temporal consistency of the video segmentation.
2. The invention takes into account the difference between the time dimension and the space dimension in the spatio-temporal constraint, replaces the traditional time-dimension measure with the optical flow value, and makes a single-layer graph structure equivalent to a multi-layer graph structure by converting the interframe smooth term.
3. The invention improves the spatio-temporal continuity of the interframe segmentation results through the optical-flow interframe smooth-term conversion and the bilateral spatio-temporal constraint, while also improving the accuracy of the segmentation results.
Brief description of the drawings
Fig. 1 is a flow chart of an interactive video segmentation method provided by an example of the invention;
Fig. 2 is a narrow-band optical flow visualization provided by an example of the invention;
Fig. 3 is an RGB color space provided by an example of the invention;
Fig. 4 is a YUV color space provided by an example of the invention;
Fig. 5 is a distance map provided by an example of the invention;
Fig. 6 is the final segmentation result obtained by an interactive video segmentation method based on the invention, provided by an example of the invention.
Embodiment
In order to make the objects, technical solutions, and advantages of the invention clearer, the invention is further elaborated below with reference to the drawings and embodiments. It should be understood that the specific embodiments described here are merely illustrative of the invention and are not intended to limit it. In addition, the technical features involved in the embodiments of the invention described below may be combined with each other as long as they do not conflict.
In the interactive video segmentation method proposed by the invention, first, the segmentation result of the previous frame is passed in to obtain the contour line of the target in the previous frame; the target contour is then estimated to obtain the estimated initial contour of the target object in the current frame, and, taking the estimated target contour of the current frame as reference, a distance mapping yields the shortest distance from each pixel to the estimated target contour, as the position attribute of that pixel. Then, each pixel of the current frame is transformed from the RGB color space to the YUV color space, and on the basis of the Y, U, V three-dimensional color attributes of each current-frame pixel, a position attribute reflecting the spatio-temporal constraint is added, namely the distance from each pixel to the estimated target contour, extending the features to a high-dimensional space. When the graph structure is built, each attribute of the high-dimensional space is first divided in advance into multiple histogram bins; the data term obtained by converting the interframe smooth term N-link is then superimposed onto the data term computed from the global probabilistic model, as the data term of the energy function model, finally yielding the energy function model. The energy function model is then solved. The invention thereby successfully incorporates the motion information of the target, strengthens the spatio-temporal consistency of the video segmentation, and can obtain a satisfactory segmentation result with less human interaction.
Fig. 1 is a flow chart of an interactive video segmentation method provided by an embodiment of the invention; the method shown in Fig. 1 specifically comprises the following steps:
(1) according to the segmentation result of the previous frame image, obtaining the contour line of the target in the previous frame image;
(2) mapping the contour line of the target in the previous frame image to the current frame image, matching for each pixel on the contour line its position in the current frame image, and obtaining the estimated initial contour line of the target in the current frame image;
In an optional embodiment, a sparse optical flow matching algorithm may be used to propagate the target contour in the previous frame image to the current frame, obtaining the estimated initial contour line of the target in the current frame image; the embodiments of the invention do not uniquely restrict the particular way in which the estimated initial contour line in the current frame image is obtained.
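As a concrete illustration of this optional embodiment, the following minimal sketch propagates the previous-frame contour with pyramidal Lucas-Kanade sparse optical flow; the window size and pyramid depth are illustrative assumptions rather than values fixed by the invention.

```python
# Minimal sketch: propagate the previous-frame contour into the current
# frame with sparse Lucas-Kanade optical flow (one possible realization
# of step (2); parameters are illustrative).
import cv2
import numpy as np

def propagate_contour(prev_gray, curr_gray, contour_pts):
    """contour_pts: Nx2 array of contour pixel coordinates (x, y) in the
    previous frame; returns the matched positions in the current frame."""
    pts = contour_pts.astype(np.float32).reshape(-1, 1, 2)
    next_pts, status, _ = cv2.calcOpticalFlowPyrLK(
        prev_gray, curr_gray, pts, None, winSize=(21, 21), maxLevel=3)
    # Keep only contour points that were tracked successfully.
    return next_pts[status.ravel() == 1].reshape(-1, 2)
```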
(3) based on the estimated initial contour line of the target in the current frame image, deriving by distance mapping the shortest distance from each pixel to the estimated initial contour line, as the position attribute of that pixel;
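One way to realize this distance mapping is a standard distance transform on a rasterized contour; the sketch below is an assumption about the concrete implementation, not the only form the invention admits.

```python
# Minimal sketch of step (3): shortest distance from every pixel to the
# estimated initial contour line via a distance transform.
import cv2
import numpy as np

def position_attribute(shape, contour_pts):
    """shape: (H, W); contour_pts: Nx2 estimated contour coordinates.
    Returns an HxW map whose value at x is the position attribute l."""
    canvas = np.full(shape, 255, dtype=np.uint8)
    cv2.polylines(canvas, [contour_pts.astype(np.int32).reshape(-1, 1, 2)],
                  isClosed=True, color=0, thickness=1)
    # For each non-zero pixel, distance to the nearest zero (contour) pixel.
    return cv2.distanceTransform(canvas, cv2.DIST_L2, 3)
```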
(4) transforming each pixel in the current frame image from the RGB color space to the YUV color space, and, on the basis of the YUV color attributes of each pixel in the current frame image, adding the position attribute of each pixel, thereby extending the feature dimension of each pixel's attributes to a high-dimensional space;
In an optional embodiment, the invention may represent the high-dimensional space with a bilateral grid Γ composed of regular sampling points ν: the pixels are first lifted into the high-dimensional space and distributed onto the grid sampling points, and the graph structure is then built on the grid sampling points.
In an optional embodiment, when computing the data terms (T-links) and smooth terms (N-links) of the nodes of the high-dimensional graph structure, the T-links and N-links may be assigned according to the standard OneCut segmentation model; however, the embodiments of the invention do not uniquely restrict the way in which T-links and N-links are assigned.
Here, the RGB color space and the YUV color space are both color models for describing the colors of an image. RGB (red, green, blue) is a space defined according to the colors recognized by the human eye and can represent most colors. But in the fields of machine vision and image processing, images are generally not processed in the RGB color space, because RGB contains only the three color channels red, green, and blue, and image details such as hue, brightness, and saturation are mixed together, making it difficult to process these details quantitatively. In YUV space, each pixel has one luminance signal Y and two chrominance signals U and V. The luminance signal is a measure of intensity; by separating the luminance signal from the chrominance signals, the brightness can be changed without affecting the color. The YUV color space can be converted from the RGB color space: the color image is first converted into a gray-scale map, and the three main color channels are transformed into two additional chrominance signals that describe the color. A YUV color space converted from the RGB color space can also be inversely transformed back to the RGB color space. Step (4) specifically comprises the following sub-steps:
(4.1) transforming each pixel in the current frame image from the RGB color space to the YUV color space by

[c_y]   [ 0.299   0.587   0.114] [c_r]
[c_u] = [-0.147  -0.289   0.436] [c_g]    (1)
[c_v]   [ 0.615  -0.515  -0.100] [c_b]

where [c_y c_u c_v]^T denotes the pixel value in the YUV color space and [c_r c_g c_b]^T denotes the value in the RGB color space;
(4.2) extending the feature dimension of each pixel's attributes to the high-dimensional space by b(x) = [c_y, c_u, c_v, l]^T (2), where l denotes the position attribute of pixel x and b(x) denotes the high-dimensional attribute corresponding to pixel x.
Here, the three color attributes and one position attribute of each pixel are combined and lifted into a four-dimensional feature space: assuming the color attribute of a pixel is c = [c_y, c_u, c_v]^T, the lifted high-dimensional attribute is b = [c_y, c_u, c_v, l]^T. The position attribute l of each pixel is obtained by computing the distance from the pixel to the nearest point of the estimated contour line; the contour estimation contains the motion information of the target.
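A sketch of sub-steps (4.1) and (4.2) follows; the conversion matrix is the one given in formula (1), while the array layout and function name are illustrative assumptions.

```python
# Minimal sketch of steps (4.1)-(4.2): YUV conversion plus lifting each
# pixel to the 4-D attribute b(x) = [cy, cu, cv, l].
import numpy as np

RGB2YUV = np.array([[ 0.299,  0.587,  0.114],
                    [-0.147, -0.289,  0.436],
                    [ 0.615, -0.515, -0.100]])

def lift_to_4d(rgb, dist_map):
    """rgb: HxWx3 float image; dist_map: HxW position attributes l.
    Returns HxWx4 high-dimensional attributes."""
    yuv = rgb @ RGB2YUV.T                       # per-pixel formula (1)
    return np.concatenate([yuv, dist_map[..., None]], axis=-1)
```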
The pixels of the color space can be mapped to the high-dimensional space by various interpolation schemes; common ones include nearest-neighbor interpolation, linear interpolation, and exponential interpolation.
In an optional embodiment, to reduce the amount of computation, the high-dimensional space after mapping needs to be downsampled. For example, nearest-neighbor interpolation may be used: after the above high-dimensional mapping, the value of each dimension of a high-dimensional node is rounded in a nearest-neighbor manner. If the j-th dimension value of high-dimensional node i is b_i^j, the nearest-neighbor rounding is

b_i^j ← ⌊b_i^j⌋, if b_i^j − ⌊b_i^j⌋ < 0.5; otherwise b_i^j ← ⌊b_i^j⌋ + 1,    (3)

where b_i^j − ⌊b_i^j⌋ denotes the difference between b_i^j and the largest integer not exceeding b_i^j.
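A sketch of this nearest-neighbor downsampling onto the grid sampling points follows; the bin widths are illustrative assumptions.

```python
# Minimal sketch of formula (3): divide each 4-D attribute by its bin
# width and round to the nearest grid sampling point.
import numpy as np

def to_grid_nodes(b, bin_widths=(8.0, 8.0, 8.0, 4.0)):
    """b: HxWx4 lifted attributes; returns integer grid coordinates."""
    scaled = b / np.asarray(bin_widths)
    # floor(v + 0.5): rounds down when the fractional part is < 0.5,
    # up otherwise, which is exactly the rule of formula (3).
    return np.floor(scaled + 0.5).astype(np.int64)
```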
(5) converting the smooth term between current-frame pixels and previous-frame neighborhood pixels into a data term according to the labels of the previous-frame pixels, superimposing the converted data term onto the data term computed from the global probabilistic model, and taking the superimposed data term as the data term of the energy function model, to obtain the energy function model;
In an optional embodiment, it is noted that the traditional graph-cut model is mostly based on a single-layer graph structure, i.e. the whole current frame image or a local narrow-band region is modeled. This way of building the graph is effective and necessary in image segmentation; but when extended to video segmentation, since object motion has temporal continuity and the displacement of an object between two adjacent frames is small, building the graph structure only in the current frame cannot make full use of the segmentation result of the previous frame. Building the graph structure over two adjacent frames or multiple frames can effectively guarantee the spatio-temporal consistency of the segmentation results and the accuracy of the final segmentation result.
However, as the number of layers of a multi-layer graph structure increases, the numbers of nodes and edges of the graph grow sharply and the computation time cost becomes enormous. The invention therefore makes a single-layer graph equivalent to a multi-layer graph structure by converting the interframe N-links to the previous frame. Specifically, since the segmentation result of the previous frame divides all pixels of the previous frame into two classes, every previous-frame pixel adjacent to a current-frame pixel to be segmented is labeled either as foreground or as background. Converting the N-links from current-frame pixels to previous-frame neighborhood pixels into T-links according to the labels of the previous-frame pixels on the one hand reduces the amount of computation of the final solution, and on the other hand uses optical flow as a spatio-temporal constraint.
The amount of computation for solving a graph structure depends on the number of its nodes and the number of edges connecting each node. For an unoptimized multi-layer graph structure, the nodes and edges grow sharply with the number of layers. Considering the space and time complexity of the computation, the invention adopts interframe smooth-term conversion to make a single-layer graph structure equivalent to a multi-layer one. The traditional interframe conversion is as follows:

D(x) = Σ_{y∈N_y} ω_xy·|s_y|,    (4)

where N_y denotes the neighborhood pixel set, in the previous frame, of the pixel y at the same position as pixel x, |s_y| denotes the label value of pixel y (0 or 1), ω_xy denotes the similarity between pixel x and previous-frame neighborhood pixel y, ω_xy = exp(−ΔI²/(2σ²)) / ‖x−y‖, ‖x−y‖ denotes the spatio-temporal distance of pixels x and y, ΔI denotes the color distance of pixels x and y, and σ denotes the gradient mean of the whole image. When computing the spatio-temporal distance ‖x−y‖, the time interval between consecutive frames is taken as 1, i.e. one pixel. The spatio-temporal distance of pixels x and y computed in this way is

‖x−y‖ = √((x_x−y_x)² + (x_y−y_y)² + 1),    (5)

where x_x and x_y denote the horizontal and vertical coordinates of pixel x, and similarly for pixel y. When computing the spatio-temporal distance, formula (5) crudely treats the time dimension and the space dimension equally, ignoring the difference between the time dimension and the space dimension in the spatio-temporal constraint.
Considering the difference between the time dimension and the space dimension in the spatio-temporal constraint, a unit of the time dimension and a unit of the space dimension cannot be treated equally when computing neighborhood relations. Therefore, in an optional embodiment, the interframe smooth-term conversion of the invention, based on the optical flow constraint, solves the optical flow value of each pixel of the current frame and uses the flow vector as an index to the previous-frame pixel y′ corresponding to current-frame pixel x. The optical flow mapping is

y′ = f(x),    (6)

where f denotes the optical flow mapping, i.e. the optical flow field is obtained through an optical flow pyramid, and the previous-frame pixel y′ corresponding to pixel x is found through the position mapping f. Fig. 2 shows a narrow-band optical flow visualization provided by an example of the invention.
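The sketch below realizes the mapping f with Farneback's pyramidal dense flow, used here as an assumed stand-in for the optical flow pyramid named above; all parameters are illustrative.

```python
# Minimal sketch of formula (6): a dense flow field indexes, for each
# current-frame pixel x, the corresponding previous-frame pixel y'.
import cv2
import numpy as np

def flow_correspondence(curr_gray, prev_gray):
    flow = cv2.calcOpticalFlowFarneback(
        curr_gray, prev_gray, None, pyr_scale=0.5, levels=3, winsize=15,
        iterations=3, poly_n=5, poly_sigma=1.2, flags=0)
    h, w = curr_gray.shape
    xs, ys = np.meshgrid(np.arange(w), np.arange(h))
    # y' = f(x): displace x by its flow vector into the previous frame.
    yx = np.clip(np.rint(xs + flow[..., 0]), 0, w - 1).astype(np.int32)
    yy = np.clip(np.rint(ys + flow[..., 1]), 0, h - 1).astype(np.int32)
    return yx, yy
```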
In an optional embodiment, in step (5), converting the smooth term from current-frame pixels to previous-frame neighborhood pixels into a data term according to the labels of the previous-frame pixels specifically comprises:
converting the smooth term from current-frame pixel x to the previous-frame neighborhood pixels into a data term according to the labels of the previous-frame pixels by

D_f(x) = Σ_{y′∈N_y′} ω_xy′·|s_y′|,    (7)

where y′ denotes the previous-frame pixel to which the optical flow value of the current frame maps current-frame pixel x, N_y′ is the neighborhood pixel set around y′ in the previous frame, ω_xy′ denotes the similarity between pixel x and previous-frame neighborhood pixel y′, and |s_y′| denotes the label value of pixel y′. The data term converted by formula (7) is then added to the source node or the sink node of the graph, according to whether pixel y′ is labeled as foreground or as background.
Here, ω_xy′ is computed as

ω_xy′ = exp(−ΔI²/(2σ²)) / ‖x−y′‖,    (8)

where ‖x−y′‖ denotes the spatio-temporal distance between pixel x and previous-frame neighborhood pixel y′, ΔI denotes the color distance between pixel x and previous-frame neighborhood pixel y′, σ denotes the gradient mean of the image, and ‖x−y′‖ = √((x_x−y′_x)² + (x_y−y′_y)²), where (x_x, x_y) denotes the horizontal and vertical coordinates of pixel x and (y′_x, y′_y) denotes those of pixel y′.
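The following sketch turns the interframe smooth term of formulas (7) and (8) into T-link capacities; the terminal convention (source = foreground) and the guard against zero spatial distance are assumptions of this illustration.

```python
# Minimal sketch of formulas (7)-(8): the pairwise terms between x and
# its flow-matched previous-frame neighbourhood collapse into unary
# T-link capacities, because the labels s_y' are already fixed.
import numpy as np

def interframe_tlink(x, neighbours, labels, curr_img, prev_img, sigma):
    """x: (col, row) of a current-frame pixel; neighbours: previous-frame
    coords (col, row) around f(x); labels: previous-frame mask (1 = fg).
    Returns (to_source, to_sink) capacities, source being foreground."""
    to_source = to_sink = 0.0
    for yx, yy in neighbours:
        d_sp = max(np.hypot(x[0] - yx, x[1] - yy), 1.0)   # avoid /0
        d_col = np.linalg.norm(curr_img[x[1], x[0]] - prev_img[yy, yx])
        w = np.exp(-d_col ** 2 / (2 * sigma ** 2)) / d_sp  # formula (8)
        # A foreground-labelled neighbour penalizes labelling x as
        # background, and vice versa.
        if labels[yy, yx] == 1:
            to_source += w
        else:
            to_sink += w
    return to_source, to_sink
```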
In an optional embodiment, the energy function model obtained in step (5) is expressed as

E(S, S̄) = Σ_x (D(x) + D_f(x)) − τ·‖θ_S − θ_S̄‖₁ + η·Σ_{(x,y)∈N} ω_xy·|s_x − s_y|,    (9)

where S denotes the foreground pixel set, S̄ denotes the background pixel set, D_f(x) is the data term obtained by converting the interframe smooth term N-link, and D(x) denotes the data term computed from the global probabilistic model; the global probabilistic model is composed of foreground/background Gaussian mixture models, the foreground/background Gaussian mixture models are initialized on the segmentation result of the key frame, and the parameters of the global probabilistic model are updated according to the generated distance map. θ_S denotes the foreground histogram statistics, θ_S̄ denotes the background histogram statistics, τ is the weight of the foreground/background distribution difference, and η denotes the weight of the intra-frame smooth term; Σ_{(x,y)∈N} ω_xy·|s_x − s_y| measures the similarity of adjacent pixels in the current frame, and ‖θ_S − θ_S̄‖₁ denotes the foreground/background similarity difference over the color histogram, namely the L1 distance between foreground and background.
Here, the intra-frame smooth term is computed as Σ_{(x,y)∈N} ω_xy·|s_x − s_y| with ω_xy = exp(−ΔI²/(2σ²)) / ‖x−y‖, where s_x and s_y denote the labels of pixels x and y respectively, N denotes the set of adjacent pixel pairs in the current frame, ω_xy denotes the similarity of pixels x and y, ‖x−y‖ denotes the spatio-temporal distance of pixels x and y, ΔI denotes the color distance of pixels x and y, and σ is the gradient mean of the image.
Here, foreground pixels and background pixels are distinguished as follows:
after obtaining the estimated initial contour line of the target in the current frame image, pixels are divided into foreground seed points and background seed points, according to whether they lie within the estimated initial contour line, by
Seeds(x) = 1, if d(M(x)) > dis and M(x) = 1;
Seeds(x) = 0, if d(M(x)) > dis and M(x) = 0;
Seeds(x) = −1, otherwise;
where M denotes the mapped mask matrix, so the estimated initial contour line is the dividing line in M between the region mapped as foreground and the region mapped as background; d(M(x)) denotes the distance map generated from the mask matrix; dis is the distance threshold; and M(x) denotes the mask value of pixel x: through the contour transformation, M(x) takes the value 1 if pixel x is mapped as foreground and 0 if pixel x is mapped as background. Seeds(x) therefore takes three values: when Seeds(x) is 1, x is set as a foreground seed point; when Seeds(x) is 0, x is set as a background seed point; when Seeds(x) is −1, x is set as an unknown region.
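A sketch of this seed rule follows; the threshold value is an illustrative assumption.

```python
# Minimal sketch of the seed rule: far-from-contour pixels become seeds
# (foreground inside the mapped mask M, background outside); pixels
# near the contour remain unknown.
import numpy as np

def select_seeds(mask_M, dist_map, dis=15.0):
    """mask_M: HxW, 1 inside the mapped contour, 0 outside; dist_map:
    HxW distances d(M(x)). Returns 1 (fg), 0 (bg), or -1 (unknown)."""
    seeds = np.full(mask_M.shape, -1, dtype=np.int8)
    far = dist_map > dis
    seeds[far & (mask_M == 1)] = 1   # confident foreground seed
    seeds[far & (mask_M == 0)] = 0   # confident background seed
    return seeds
```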
(6) solving the energy function model to obtain its solution, taking the current frame image as the previous frame image, and continuing to execute steps (1) to (5) until the video segmentation ends.
The invention is further described below with reference to the drawings and a specific embodiment.
The method flow of the invention is shown in Fig. 1; it is now illustrated by taking the test video bear as an example:
(1) Obtaining the accurate segmentation contour of the previous frame
Assume the segmentation result of the previous frame is reliable; the previous frame may be a key frame, and for a key frame an accurately interacted segmentation result needs to be supplied. The global probabilistic model, i.e. the foreground/background Gaussian mixture models, is initialized; the clustering algorithm uses kmeans++, and the numbers of foreground and background Gaussian mixture components are both set to 5. From the segmentation result of the previous frame, the accurate contour of the target object in the previous frame is obtained.
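A sketch of this initialization follows, using scikit-learn's GaussianMixture with k-means++ initialization (available in recent scikit-learn versions) as an assumed stand-in for the kmeans++ clustering named above; the library choice is not prescribed by the invention.

```python
# Minimal sketch: initialize the global probabilistic model on a key
# frame as two 5-component Gaussian mixtures (foreground, background).
import numpy as np
from sklearn.mixture import GaussianMixture

def init_global_model(pixels, labels):
    """pixels: Nx3 pixel colors; labels: N, 1 = foreground, 0 = background."""
    fg = GaussianMixture(n_components=5, init_params='k-means++').fit(
        pixels[labels == 1])
    bg = GaussianMixture(n_components=5, init_params='k-means++').fit(
        pixels[labels == 0])
    return fg, bg  # score_samples() later yields per-pixel log-likelihoods
```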
(2) Matching with sparse optical flow to obtain the initial position of the target in the current frame
The target contour of the previous frame is mapped to the current frame by sparse optical flow matching, i.e. for each pixel on the contour, its position in the current frame is matched, thereby obtaining the initial estimated contour of the target in the current frame.
(3) Generating the distance map from the initial estimated contour of the target
According to the initial estimated contour of the target in the current frame, the distance map of distances to the contour is obtained by distance mapping: the closer to the contour, the smaller the distance; the farther from the contour, the larger the distance. The generated distance map is shown in Fig. 5.
(4) Converting the RGB color space to the YUV color space
The original RGB color space is converted by formula (1) above to obtain the YUV color space; the RGB color space is shown in Fig. 3 and the YUV color space in Fig. 4.
(5) High-dimensional space mapping
The YUV color attributes are extended in feature dimension with the position attribute according to formula (2) above, yielding the high-dimensional space.
(6) Converting the interframe N-links
The interframe N-links are converted by formula (7) above, and the converted values are added to the corresponding T-links.
(7) Constructing the segmentation energy function model
Foreground seed points and background seed points are chosen according to the distance map: among the candidate points, pixels whose distance to the contour line exceeds a certain threshold are set as seed points; apart from the seed points, the remaining pixels are set as unknown regions; and the foreground/background probabilistic models are updated. The data term of the energy function model mainly includes the part converted from the interframe smooth term N-links, computed as in formula (7) above; the data terms are the edges connecting the ordinary nodes with the source and the sink in the graph structure.
(8) Obtaining the accurate segmentation result by the max-flow/min-cut algorithm
The final energy function model is given by formula (9) above. Solving this model is equivalent to solving a min-cut problem, and since the max-flow problem and the min-cut problem are dual, it finally amounts to solving the max flow of the graph, for which the maxflow algorithm may preferably be used. The accurate segmentation result of the current frame is obtained, the current frame is then treated as the previous frame, and the above steps (1) to (7) are continued until the video segmentation ends; the final segmentation result obtained is shown in Fig. 6.
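A sketch of the max-flow solve follows, using the PyMaxflow library as an assumed stand-in for the maxflow algorithm named above; the grid connectivity and capacity layout are illustrative.

```python
# Minimal sketch: solve the final energy by max-flow/min-cut on a
# 4-connected pixel grid.
import maxflow
import numpy as np

def solve_segmentation(cap_fg, cap_bg, pairwise, shape):
    """cap_fg / cap_bg: HxW T-link capacities toward the foreground and
    background terminals; pairwise: scalar or HxW N-link weights."""
    g = maxflow.Graph[float]()
    nodes = g.add_grid_nodes(shape)
    g.add_grid_edges(nodes, weights=pairwise)   # intra-frame N-links
    g.add_grid_tedges(nodes, cap_fg, cap_bg)    # T-links (data terms)
    g.maxflow()
    # get_grid_segments marks the sink side of the cut; invert so that
    # True means foreground under the source = foreground convention.
    return ~g.get_grid_segments(nodes)
```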
As will be readily understood by those skilled in the art, the foregoing is merely a description of preferred embodiments of the invention and is not intended to limit the invention; any modification, equivalent substitution, and improvement made within the spirit and principles of the invention shall be included within the protection scope of the invention.

Claims (7)

1. An interactive video segmentation method, characterized by comprising:
(1) according to the segmentation result of the previous frame image, obtaining the contour line of the target in the previous frame image;
(2) mapping the contour line of the target in the previous frame image to the current frame image, matching for each pixel on the contour line its position in the current frame image, and obtaining the estimated initial contour line of the target in the current frame image;
(3) based on the estimated initial contour line of the target in the current frame image, deriving by distance mapping the shortest distance from each pixel to the estimated initial contour line, as the position attribute of that pixel;
(4) transforming each pixel in the current frame image from the RGB color space to the YUV color space, and, on the basis of the YUV color attributes of each pixel in the current frame image, adding the position attribute of each pixel, thereby extending the feature dimension of each pixel's attributes to a high-dimensional space;
(5) converting the smooth term between current-frame pixels and previous-frame neighborhood pixels into a data term according to the labels of the previous-frame pixels, superimposing the converted data term onto the data term computed from the global probabilistic model, and taking the superimposed data term as the data term of the energy function model, to obtain the energy function model;
(6) solving the energy function model to obtain its solution, taking the current frame image as the previous frame image, and continuing to execute steps (1) to (5) until the video segmentation ends.
2. The method according to claim 1, characterized in that step (4) specifically comprises:
(4.1) transforming each pixel in the current frame image from the RGB color space to the YUV color space by [c_y, c_u, c_v]^T = A·[c_r, c_g, c_b]^T with A = [0.299, 0.587, 0.114; −0.147, −0.289, 0.436; 0.615, −0.515, −0.100], where [c_y c_u c_v]^T denotes the pixel value in the YUV color space and [c_r c_g c_b]^T denotes the value in the RGB color space;
(4.2) extending the feature dimension of each pixel's attributes to the high-dimensional space by b(x) = [c_y, c_u, c_v, l]^T, where l denotes the position attribute of pixel x and b(x) denotes the high-dimensional attribute corresponding to pixel x.
3. The method according to claim 1, characterized in that, in step (5), converting the smooth term between current-frame pixels and previous-frame neighborhood pixels into a data term according to the labels of the previous-frame pixels comprises:
converting the smooth term from current-frame pixel x to the previous-frame neighborhood pixels into a data term according to the labels of the previous-frame pixels by D_f(x) = Σ_{y′∈N_y′} ω_xy′·|s_y′|, where y′ denotes the previous-frame pixel to which the optical flow value of the current frame maps current-frame pixel x, N_y′ is the neighborhood pixel set around y′ in the previous frame, ω_xy′ denotes the similarity between pixel x and previous-frame neighborhood pixel y′, and |s_y′| denotes the label value of pixel y′.
4. The method according to claim 3, characterized in that ω_xy′ is computed as ω_xy′ = exp(−ΔI²/(2σ²)) / ‖x−y′‖, where ‖x−y′‖ denotes the spatio-temporal distance between pixel x and previous-frame neighborhood pixel y′, ΔI denotes the color distance between pixel x and previous-frame neighborhood pixel y′, σ denotes the gradient mean of the image, and ‖x−y′‖ = √((x_x−y′_x)² + (x_y−y′_y)²), where (x_x, x_y) denotes the horizontal and vertical coordinates of pixel x and (y′_x, y′_y) denotes those of pixel y′.
5. The method according to any one of claims 1 to 4, characterized in that the energy function model obtained in step (5) is expressed as E(S, S̄) = Σ_x (D(x) + D_f(x)) − τ·‖θ_S − θ_S̄‖₁ + η·Σ_{(x,y)∈N} ω_xy·|s_x − s_y|, where S denotes the foreground pixel set, S̄ denotes the background pixel set, D_f(x) is the data term obtained by converting the interframe smooth term N-link, D(x) denotes the data term computed from the global probabilistic model, θ_S denotes the foreground histogram statistics, θ_S̄ denotes the background histogram statistics, τ is the weight of the foreground/background distribution difference, η denotes the weight of the intra-frame smooth term, Σ_{(x,y)∈N} ω_xy·|s_x − s_y| measures the similarity of adjacent pixels in the current frame, and ‖θ_S − θ_S̄‖₁ denotes the foreground/background similarity difference over the color histogram.
6. The method according to claim 5, characterized in that the intra-frame smooth term is computed as Σ_{(x,y)∈N} ω_xy·|s_x − s_y| with ω_xy = exp(−ΔI²/(2σ²)) / ‖x−y‖, where s_x and s_y denote the labels of pixels x and y respectively, N denotes the set of adjacent pixel pairs in the current frame, ω_xy denotes the similarity of pixels x and y, ‖x−y‖ denotes the spatio-temporal distance of pixels x and y, ΔI denotes the color distance of pixels x and y, and σ is the gradient mean of the image.
7. The method according to claim 5, characterized in that foreground pixels and background pixels are distinguished as follows:
after obtaining the estimated initial contour line of the target in the current frame image, pixels are divided into foreground seed points and background seed points, according to whether they lie within the estimated initial contour line, by Seeds(x) = 1 if d(M(x)) > dis and M(x) = 1, Seeds(x) = 0 if d(M(x)) > dis and M(x) = 0, and Seeds(x) = −1 otherwise, where M denotes the mapped mask matrix, d(M(x)) denotes the distance map generated from the mask matrix, dis is the distance threshold, and M(x) denotes the mask value of pixel x; Seeds(x) takes three values: when Seeds(x) is 1, x is set as a foreground seed point; when Seeds(x) is 0, x is set as a background seed point; when Seeds(x) is −1, x is set as an unknown region.
CN201710794283.5A 2017-09-06 2017-09-06 Interactive video segmentation method Expired - Fee Related CN107590818B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710794283.5A 2017-09-06 2017-09-06 Interactive video segmentation method

Publications (2)

Publication Number Publication Date
CN107590818A 2018-01-16
CN107590818B 2019-10-25

Family

ID=61051076

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710794283.5A (Expired - Fee Related) Interactive video segmentation method 2017-09-06 2017-09-06

Country Status (1)

Country Link
CN (1) CN107590818B (en)



Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20050226502A1 (en) * 2004-03-31 2005-10-13 Microsoft Corporation Stylization of video
CN101527043A (en) * 2009-03-16 2009-09-09 江苏银河电子股份有限公司 Video picture segmentation method based on moving target outline information

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
包红强: "基于内容的视频运动对象分割技术研究" [Research on content-based video moving object segmentation technology], 《中国优秀博硕士学位论文全文数据库(博士)信息科技辑》 [China Excellent Doctoral and Master's Dissertations Full-text Database (Doctoral), Information Science and Technology] *
张佳伟: "三维物体的高效分割与重建" [Efficient segmentation and reconstruction of three-dimensional objects], 《中国优秀硕士学位论文全文数据库 信息科技辑》 [China Excellent Master's Theses Full-text Database, Information Science and Technology] *
章国锋: "视频场景的重建与增强处理" [Reconstruction and enhancement of video scenes], 《中国博士学位论文全文数据库 信息科技辑》 [China Doctoral Dissertations Full-text Database, Information Science and Technology] *
韩军等 [Han Jun et al.]: "交互式分割视频运动对象的研究与实现" [Research and implementation of interactive segmentation of video moving objects], 《中国图象图形学报》 [Journal of Image and Graphics] *

Cited By (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108389217A (en) * 2018-01-31 2018-08-10 华东理工大学 A kind of image synthesizing method based on gradient field mixing
CN108961261A (en) * 2018-03-14 2018-12-07 中南大学 A kind of optic disk region OCT image Hierarchical Segmentation method based on spatial continuity constraint
CN108961261B (en) * 2018-03-14 2022-02-15 中南大学 Optic disk region OCT image hierarchy segmentation method based on space continuity constraint
CN109978891A (en) * 2019-03-13 2019-07-05 浙江商汤科技开发有限公司 Image processing method and device, electronic equipment and storage medium
CN110163873A (en) * 2019-05-20 2019-08-23 长沙理工大学 A kind of bilateral video object dividing method and system
CN110163873B (en) * 2019-05-20 2023-02-24 长沙理工大学 Bilateral video target segmentation method and system
CN111985266A (en) * 2019-05-21 2020-11-24 顺丰科技有限公司 Scale map determination method, device, equipment and storage medium
CN110610453B (en) * 2019-09-02 2021-07-06 腾讯科技(深圳)有限公司 Image processing method and device and computer readable storage medium
CN110610453A (en) * 2019-09-02 2019-12-24 腾讯科技(深圳)有限公司 Image processing method and device and computer readable storage medium
CN112784630A (en) * 2019-11-06 2021-05-11 广东毓秀科技有限公司 Method for re-identifying pedestrians based on local features of physical segmentation
CN111539993A (en) * 2020-04-13 2020-08-14 中国人民解放军军事科学院国防科技创新研究院 Space target visual tracking method based on segmentation
CN113191266A (en) * 2021-04-30 2021-07-30 江苏航运职业技术学院 Remote monitoring management method and system for ship power device
CN113191266B (en) * 2021-04-30 2021-10-22 江苏航运职业技术学院 Remote monitoring management method and system for ship power device
CN116912246A (en) * 2023-09-13 2023-10-20 潍坊医学院 Tumor CT data processing method based on big data
CN116912246B (en) * 2023-09-13 2023-12-29 潍坊医学院 Tumor CT data processing method based on big data

Also Published As

Publication number Publication date
CN107590818B (en) 2019-10-25

Similar Documents

Publication Publication Date Title
CN107590818B (en) Interactive video segmentation method
CN108537239B (en) Method for detecting image saliency target
CN108682017B (en) Node2Vec algorithm-based super-pixel image edge detection method
CN106056155B (en) Superpixel segmentation method based on boundary information fusion
CN110276264B (en) Crowd density estimation method based on foreground segmentation graph
CN111428765B (en) Target detection method based on global convolution and local depth convolution fusion
CN102903128A (en) Video image content editing and spreading method based on local feature structure keeping
CN103258203B (en) The center line of road extraction method of remote sensing image
CN105787948B (en) A kind of Fast image segmentation method based on shape changeable resolution ratio
CN105118049A (en) Image segmentation method based on super pixel clustering
CN103136537B (en) Vehicle type identification method based on support vector machine
CN109829449A (en) A kind of RGB-D indoor scene mask method based on super-pixel space-time context
CN104463843B (en) Interactive image segmentation method of Android system
CN104408733B (en) Object random walk-based visual saliency detection method and system for remote sensing image
CN109903331A (en) A kind of convolutional neural networks object detection method based on RGB-D camera
CN105809716B (en) Foreground extraction method integrating superpixel and three-dimensional self-organizing background subtraction method
CN103561258A (en) Kinect depth video spatio-temporal union restoration method
CN110443173A (en) A kind of instance of video dividing method and system based on inter-frame relation
CN106937120A (en) Object-based monitor video method for concentration
CN112766291A (en) Matching method of specific target object in scene image
US7602966B2 (en) Image processing method, image processing apparatus, program and recording medium
CN113052859A (en) Super-pixel segmentation method based on self-adaptive seed point density clustering
CN115757604B (en) GDP space-time evolution analysis method based on noctilucent image data
CN110111351A (en) Merge the pedestrian contour tracking of RGBD multi-modal information
CN104866853A (en) Method for extracting behavior characteristics of multiple athletes in football match video

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20191025

Termination date: 20200906