CN1179223A - Method and apparatus for multi-frame based segmentation of data streams - Google Patents

Method and apparatus for multi-frame based segmentation of data streams

Info

Publication number
CN1179223A
Authority
CN
China
Prior art keywords
frame
matrix
reference picture
vector
segmentation
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN 96192717
Other languages
Chinese (zh)
Inventor
哈拉尔德·奥高·马滕斯
让·奥托·雷伯格
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
IDT International Digital Technologies Deutschland GmbH
Original Assignee
IDT International Digital Technologies Deutschland GmbH
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by IDT International Digital Technologies Deutschland GmbH
Priority to CN 96192717
Publication of CN1179223A
Legal status: Pending

Landscapes

  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

The present invention concerns methods for attaining grouping or segmentation in large signal streams by development and analysis of a manifold or subspace of parameters. A method for segmenting samples in frames of an input signal comprises the steps of: (1) forming a reference image consisting of samples from a plurality of said frames, (2) estimating motion from said reference image to each of said frames, (3) reformatting the estimated motion into row vectors, (4) collecting said row vectors into a motion matrix, (5) performing a Principal Component Analysis on the motion matrix, thereby obtaining a score matrix consisting of a plurality of column vectors called score vectors and a loading matrix consisting of a plurality of row vectors called loading vectors, such that each score vector has one element for each frame, such that each element of each loading vector corresponds to one element of the reference image, such that one column of said score matrix and one loading vector together constitute a factor, and such that the number of factors is lower than or equal to the number of said frames, (6) reformatting each loading back into the same format as used for the motion, and (7) performing segmentation based on the plurality of reformatted loadings.
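Steps (6) and (7) can be illustrated with a minimal NumPy sketch. It assumes a single motion dimension, so that each loading has nv*nh elements stored row by row. The function names `loadings_to_images` and `label_by_loading_signs` are hypothetical, and the sign-based labelling is only a stand-in for the segmentation step, which the abstract does not specify.

```python
import numpy as np

def loadings_to_images(Pt, nv, nh):
    """Step (6): reshape each loading (one row of Pt) back to an nv x nh image."""
    return Pt.reshape(Pt.shape[0], nv, nh)

def label_by_loading_signs(loading_images):
    """Step (7), stand-in rule (an assumption, not the patented method):
    pixels with the same sign pattern across all loadings share a label."""
    bits = (loading_images > 0).astype(int)           # shape (nf, nv, nh)
    place_values = 2 ** np.arange(bits.shape[0])      # 1, 2, 4, ...
    return np.tensordot(place_values, bits, axes=1)   # shape (nv, nh)

# Two toy loadings for a 2 x 2 reference image.
Pt = np.array([[1.0, 1.0, -1.0, -1.0],
               [1.0, -1.0, 1.0, -1.0]])
loading_images = loadings_to_images(Pt, nv=2, nh=2)
labels = label_by_loading_signs(loading_images)
```

In this toy case every pixel has a distinct sign pattern, so each pixel ends up in its own group; with real loadings, neighbouring pixels that move together would share a label.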

Description

Method and apparatus for multi-frame based segmentation of data streams
Related applications
The present invention is related to the following applications:
- "Method and Apparatus for Coordination of Motion Determination over Multiple Frames" (attorney docket no. IDT 011 WO),
- "Method and Apparatus for Depth Modelling and Providing Depth Information of Moving Objects" (attorney docket no. IDT 015 WO), which have the same applicant and the same filing date as the present application.
Field of the invention
The present invention relates to methods for attaining grouping or segmentation in large signal streams by development and analysis of a manifold or subspace of parameters. In particular, it relates to spatiotemporal segmentation of video signals.
Background
Mathematical parameterization of large signal streams (such as video or audio signals) poses problems of statistical estimation and of computational capacity. Segmenting the signal stream can reduce both problems.
First, by dividing the signal stream into two or more groups of interrelated signals (segments), the resulting data can be described with fewer independent parameters. This simplifies the statistical model.
Second, being more compact, each segment model is also easier to control and interpret, for example when it is used for editing.
Third, once the signal stream has been segmented, processing each segment can be computationally simpler than processing the whole stream; for example, less high-speed memory is needed for efficient computation.
To obtain these statistical and computational advantages from segmenting a data stream, the segmentation process itself must be statistically and computationally efficient. The present invention concerns how to obtain such segmentation reliably.
Although the invention can also be applied to other types of signals, such as audio signals, multi-frame digital video is a main application and is therefore used as the example throughout.
Use of segmentation in video coding
In model-based video coding, image segmentation is important: groups of pixels that show consistent, correlated spatial patterns of change across frames should be modelled together, since this gives the best compression, editability and interpretability.
A segment may correspond to one physical object, but it may also correspond to only part of a physical object, or to a group of several such objects. It may also correspond to an intangible object or phenomenon, such as a shadow.
In video coding based on statistical models, the best definition of a segment ('holon') depends on the purpose of the coding: for pure compression, a segment ideally corresponds to a group of pixels that can be compressed efficiently, whereas if the coding is intended for later visual manipulation, such as editing or video games, segments should ideally correspond more closely to physical objects.
The segmentation process must be robust, that is, it must give acceptable, statistically meaningful and useful segments that apply to many related image frames. It must also be computationally feasible with respect to CPU time and memory requirements.
Some existing segmentation methods are described in:
Boyer, K.L., Mirza, M.J. and Ganguly, G. (1994), The Robust Sequential Estimator: A General Approach and its Application to Surface Organization in Range Data. IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 16, no. 10, October 1994, pp. 987-1001;
Guensel, B. and Panayirci, E. (1994), Segmentation of Range and Intensity Image Using Multiscale Markov Random Field Representation. Proceedings, IEEE Intl. Conf. on Image Proc., Austin, Texas, 13-16 November 1994, vol. II, pp. 187-191, IEEE Computer Soc. Press, Los Alamitos, CA, USA;
Dellepiane, S., Fontanta, F. and Vernazza, G. (1994), A Robust Non-Iterative Method for Image Labelling Using Context. Proceedings, IEEE Intl. Conf. on Image Proc., Austin, Texas, 13-16 November 1994, vol. II, pp. 207-211, IEEE Computer Soc. Press, Los Alamitos, CA, USA;
and
Russ, J.C. (1995), The Image Processing Handbook, 2nd edition, IEEE Press/CRC Press, London, pp. 347-401. These publications are incorporated herein by reference.
Segmentation methods for video coding fall into two main types: still-image segmentation and motion-based segmentation.
Still-image segmentation is based on spatial intensity patterns within individual images. Its drawback is that contours inside an object are difficult to distinguish from the spatial contours along the object's edges.
Motion-based segmentation is concerned with how image intensity changes between images. In automatic video coding, segmentation is usually of this latter type and is obtained by analysing estimated motion fields. An established approach is to estimate the motion field between two frames, for example from a reference frame R to another frame n (here called the 'address difference' DA_Rn), and to search for groups of pixels with similar motion in DA_Rn. It is usually also required that the pixels are spatially close to each other in at least one of the images. DA_Rn can have one, two or more motion dimensions.
Motion-based segmentation generalizes to change-based segmentation, where the changes may also include 'intensity differences', that is, intensity changes DI_Rn between pairs of frames, for example after motion compensation and in different colour channels.
When segmentation is to be used for many frames, basing it on only one frame or one pair of frames is unsatisfactory, both because of statistical overfitting and because the selected frame may not adequately represent the remaining frames. A segmentation that fits the frame or frame pair actually used may turn out to be a very poor grouping for the other frames.
To find a segmentation that is statistically valid for many frames, many such frames, for example 5-50, must be searched for statistically related clusters of pixels. This is not without problems. If the frames are segmented individually, the segmentation results of the different frames must subsequently be reconciled, and each per-frame segmentation is sensitive to noise in that frame's input data. If, instead, all motion fields are analysed simultaneously, storing the motion fields of many individual frames requires much memory and the joint segmentation analysis is computationally expensive.
Objects of the invention
It is an object of the invention to facilitate finding groupings of signals in a signal stream, such that the grouping or segmentation has high statistical robustness and validity for several signal frames.
It is a further object of the invention to perform multi-frame segmentation in a computationally efficient way that reduces the amount of data needed in the segmentation.
It is a further object of the invention to allow the segmentation to be carried out recursively, with forward or backward updating.
It is a further object of the invention to use different kinds of phenomena in the segmentation: temporal information on motion and intensity changes, and spatial information on continuity and discontinuity.
It is a further object of the invention to estimate segmentation information for use in the subsequent estimation of motion and intensity changes and in the bilinear modelling of this information.
It is another object of the invention to allow segments to overlap.
It is a further object of the invention to allow segments to be partially transparent.
It is another object of the invention to define segments that strike a proper balance between internal similarity and rigour (for statistical stability) on the one hand, and internal heterogeneity and flexibility (for adequate description of the input data) on the other.
Summary of the invention
In the present invention, segmentation is based on change information from several related frames rather than from only two frames. The resulting segmentation is therefore statistically more reliable and more valid.
In the segmentation computations, the many changes are preferably represented by a common manifold or subspace model, mainly a bilinear model expressed relative to a common reference position. Because some noise-based and other unimportant types of change can be ignored, this further improves the statistical precision and validity of the segmentation. By reducing the dimensionality of the change data that must be analysed, it also reduces the computational complexity of the segmentation.
Where the subspace representation itself is updated recursively, the subspace segmentation can also be updated recursively, which gives computational advantages.
The change information used in the segmentation can be of various kinds, for example motion information or intensity change information.
The invention is applicable to signal streams in general. In particular, it can be applied to spatiotemporal segmentation of digital video signals and to temporal segmentation of digital audio data.
Brief description of the drawings
Fig. 1 shows how a frame-sized motion field (with nv x nh pixels) for one direction of motion (here DV_Rn, the vertical motion of each pixel) that moves (warps) image R to approximate image n can be strung out as a one-dimensional vector (with nv*nh elements).
Fig. 2 shows how two frame-sized motion fields (each with nv x nh pixels), DA_Rn = [DV_Rn and DH_Rn], can be strung out together when both motion directions, vertical and horizontal, are modelled simultaneously. Further dimensions (for example depth change) can be included in DA_Rn in the same way.
Fig. 3 shows how a matrix D (for example the motion fields DA_Rn for many frames n = 1, 2, ...) can be modelled as the bilinear product T*P^T of two lower-rank matrices plus a residual matrix.
Fig. 4 shows the parameters from Fig. 3 that pertain to a single frame.
Fig. 5 shows the third preferred embodiment, in which motion estimation and segmentation are performed separately.
Fig. 6 shows the fourth preferred embodiment, in which motion estimation and segmentation are performed together.
Description
Notation and definitions
In the following, the symbol '*' is used for multiplication where needed. The symbol 'x' is used for matrix dimensions (for example, size = n rows x n columns). Boldface uppercase letters denote data matrices, and boldface lowercase letters denote data vectors.
Extracting a bilinear summary of many motion fields
Some background for the invention is given in patent applications WO 95/08240 and WO 95/34172. Additional information on the coordination between multi-frame segmentation, motion estimation and bilinear modelling is given in the above-mentioned application "Method and Apparatus for Coordination of Motion Determination over Multiple Frames". Information on depth estimation from occlusions between segments over several frames is given in the above-mentioned application "Method and Apparatus for Depth Modelling and Providing Depth Information of Moving Objects".
A motion field describes how the pixels of one image (for example the reference frame R) must be moved to approximate another image (for example frame n). Such a motion field can itself be regarded as an 'image' with one value per pixel for each motion dimension: for example, an image DH_Rn for horizontal motion (zero = no horizontal motion, negative = move left, positive = move right) and an image DV_Rn for vertical motion (zero = no vertical motion, negative = move up, positive = move down).
As shown in Fig. 1, each motion field image (for example DV_Rn) can be strung out as a one-dimensional vector d_n with npixels elements, where each element gives the motion information for one pixel of the reference image.
As shown in Fig. 2, the different motion dimensions can be strung out one after another in the same vector, which then has a multiple of npixels elements.
Once a set of such motion field vectors d_n has been estimated for several frames n = 1, 2, ..., nframes, they can be analysed together as a matrix D.
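The stringing-out of Figs. 1 and 2 and the stacking into D can be sketched in NumPy as follows. This is a minimal illustration: the function name `motion_fields_to_matrix`, the row-by-row flattening order and the choice to place the vertical field before the horizontal one are assumptions, not details fixed by the text.

```python
import numpy as np

def motion_fields_to_matrix(dv_fields, dh_fields):
    """Stack per-frame vertical and horizontal motion fields into a matrix D
    with one row per frame (assumed layout: vertical part, then horizontal)."""
    rows = []
    for dv, dh in zip(dv_fields, dh_fields):
        # String out each nv x nh field row by row and join both dimensions.
        rows.append(np.concatenate([dv.ravel(), dh.ravel()]))
    return np.vstack(rows)

# Toy example: 3 frames of 2 x 4 pixels with constant motion per frame.
nv, nh = 2, 4
dv_fields = [np.full((nv, nh), float(n)) for n in range(3)]
dh_fields = [np.full((nv, nh), -float(n)) for n in range(3)]
D = motion_fields_to_matrix(dv_fields, dh_fields)
```

Each row of `D` can be turned back into images with `reshape`, for example `D[1, :nv * nh].reshape(nv, nh)` for the vertical field of the second frame.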
Well-established bilinear modelling (BLM) can be used to approximate a set of related vectors (Fig. 3). A bilinear factor model can be written as a product of two matrices plus a residual matrix (see Martens, H. and Naes, T. (1989), Multivariate Calibration, J. Wiley & Sons Ltd, Chichester, UK, incorporated herein by reference):
$$D = T \, P^{T} + E \qquad (1)$$
where
D is the data to be modelled; it has one row for each frame to be modelled and one column for each pixel variable to be modelled simultaneously (for example one horizontal and one vertical motion element for each pixel).
T is the matrix of so-called temporal scores of the bilinear factors; it has one row for each modelled frame and one column for each bilinear factor f = 1, 2, ..., nf.
P^T is the matrix of so-called spatial loadings of the bilinear factors; it has one column for each pixel variable to be modelled simultaneously and one row for each bilinear factor f = 1, 2, ..., nf. The superscript T denotes 'transpose'.
E is the matrix of errors or unmodelled residuals; it has the same dimensions as D.
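One way to obtain the matrices of equation (1) is a truncated singular value decomposition. The sketch below is a minimal NumPy illustration under that assumption; the function name `bilinear_model` and the synthetic data are hypothetical, and the text allows other estimation methods as well.

```python
import numpy as np

def bilinear_model(D, nf):
    """Approximate D by T @ Pt + E with nf factors (truncated SVD)."""
    U, s, Vt = np.linalg.svd(D, full_matrices=False)
    T = U[:, :nf] * s[:nf]      # scores: one row per frame
    Pt = Vt[:nf, :]             # loadings: one row per factor
    E = D - T @ Pt              # unmodelled residual
    return T, Pt, E

# Synthetic data: 6 frames, 10 pixel variables, 2 underlying factors plus noise.
rng = np.random.default_rng(0)
D = rng.normal(size=(6, 2)) @ rng.normal(size=(2, 10))
D += 1e-3 * rng.normal(size=D.shape)
T, Pt, E = bilinear_model(D, nf=2)
```

With two factors the residual `E` contains little more than the added noise, which is the sense in which the model summarizes the common variation in `D`.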
For the motion field between frame R and an individual frame n, the bilinear model (Fig. 4) is written:
$$d_n = t_n \, P^{T} + e_n \qquad (2)$$
When the data D are defined as the motion fields DA_Rn, n = 1, 2, ..., from a set of frames or sub-frames, or as modifications of these motion fields, the loading subspace P^T, which spans the most significant part of the row space of D, represents the motion information that is more or less common to several frames of the sequence. The score vectors t_n, n = 1, 2, ..., estimated for each frame separately or for many frames jointly (see below), serve to convey this common motion information P^T back to each individual frame.
In the present context of multi-frame segmentation, several different methods can be used to extract the bilinear model T*P^T from D, for example weighted singular value decomposition based on the QR algorithm, with or without adaptive forward and backward updating. In the following, these will be referred to as bilinear modelling (BLM) or principal component analysis (PCA).
Details of bilinear modelling methods can be found in:
Martens, H. and Naes, T. (1989), Multivariate Calibration, J. Wiley & Sons Ltd, Chichester, UK;
Martens, M. and Martens, H. (1986), Partial Least Squares regression. In: Statistical Procedures in Food Research (J.R. Piggott, ed.), Elsevier Applied Sciences, London, pp. 293-360;
Jackson, J.E. (1991), A User's Guide to Principal Components, J. Wiley & Sons, Inc., New York;
Jolliffe, I.T. (1986), Principal Component Analysis, Springer Series in Statistics, Springer-Verlag, New York;
Mardia, K.V., Kent, J.T. and Bibby, J.M. (1979), Multivariate Analysis, Academic Press, Inc., New York;
Sharaf, M.A., Illman, D.L. and Kowalski, B.R. (1986), Chemometrics, J. Wiley & Sons, New York;
Kung, S.Y., Diamantaras, K.I. and Tauer, J.S. (1991), Neural networks for extracting pure/constrained/oriented principal components. In: R. Vaccaro (ed.), SVD and Signal Processing II, Elsevier Science Publishers, pp. 57-81.
These publications are incorporated herein by reference.
It is important to note that, for this purpose, the bilinear model need not be fully converged or optimal with respect to orthogonality, separation of eigenvalues and so on; what matters is to find a suitable subspace basis for approximating the data D.
As described in the above-mentioned application "Method and Apparatus for Coordination of Motion Determination over Multiple Frames", the bilinear model can be updated incrementally.
Bilinear modelling may be preceded by a pre-processing step in which the mean of each column is subtracted. The column median may be used in place of the mean. These centre values must be added back when data are reconstructed from the bilinear model. More advanced pre-processing may also be used, such as multiplicative scatter correction (MSC) and its extensions, described in Martens, H. and Naes, T. (1989), Multivariate Calibration, J. Wiley & Sons Ltd, Chichester, UK, incorporated herein by reference. The bilinear parameter estimation may also include smoothing of scores and loadings, or modification of individual data elements of the data matrix D.
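The centring step and the need to add the column means back on reconstruction can be shown with a small NumPy sketch. The function name `center_columns` and the toy data are hypothetical, and a truncated SVD is assumed as the bilinear estimator.

```python
import numpy as np

def center_columns(D):
    """Subtract the mean of each column; return centred data and the means."""
    col_mean = D.mean(axis=0)
    return D - col_mean, col_mean

# Toy data: 3 frames, 2 pixel variables that vary together around a mean.
D = np.array([[1.0, 2.0],
              [3.0, 6.0],
              [5.0, 10.0]])
D_centered, col_mean = center_columns(D)

# One-factor bilinear model of the centred data.
U, s, Vt = np.linalg.svd(D_centered, full_matrices=False)
T = U[:, :1] * s[:1]
Pt = Vt[:1, :]

# Reconstruction: the column means must be added back.
D_hat = T @ Pt + col_mean
```

Using the median instead would only change the line that computes `col_mean` (to `np.median(D, axis=0)`).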
If information about the reliability or validity of each pixel in each frame is available, it can be used to weight the relative importance of the different input data, and the bilinear extraction of factors can then be performed on the weighted data. Let G be the motion fields DA_Rn, n = 1, 2, ..., (possibly after column centring) or modifications of these motion fields, from a set of frame pairs or sub-frame pairs, and let
$$D = V_{frame} \, G \, V_{pixel} \qquad (3)$$
where
V_frame = weighting matrix for frames, for example diag(1/s_n, n = 1, 2, ...), where
s_n = estimate of the uncertainty standard deviation of frame n, and
V_pixel = weighting matrix for pixels, for example diag(1/s_pel, pel = 1, 2, ...), where
s_pel = estimate of the uncertainty standard deviation of pixel pel.
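Equation (3) translates directly into NumPy. The sketch below uses the diagonal example weights given in the text; the function name `weight_data` and the numbers are hypothetical.

```python
import numpy as np

def weight_data(G, s_frame, s_pixel):
    """Compute D = V_frame @ G @ V_pixel with inverse-uncertainty weights."""
    V_frame = np.diag(1.0 / np.asarray(s_frame, dtype=float))
    V_pixel = np.diag(1.0 / np.asarray(s_pixel, dtype=float))
    return V_frame @ G @ V_pixel

# 2 frames, 3 pixel variables.
G = np.array([[2.0, 4.0, 8.0],
              [1.0, 3.0, 9.0]])
s_frame = [1.0, 2.0]        # the second frame is twice as uncertain
s_pixel = [1.0, 2.0, 4.0]   # the third pixel is the most uncertain
D = weight_data(G, s_frame, s_pixel)
```

For large images the same result is obtained more cheaply by broadcasting, `G / s_frame[:, None] / s_pixel[None, :]` with the uncertainties held in NumPy arrays, which avoids building the diagonal matrices.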
In this way, pixels (columns of G) with high uncertainty are weighted down, but are still modelled together with the more certain pixels. Alternatively, certain and uncertain pixels can be separated by using two BLMs instead of one. The uncertain pixels can also be left out of the bilinear modelling of the certain pixels; the loadings of the uncertain pixels can then be estimated by regression on the scores obtained from the certain pixels, as described for principal component regression (PCR) and partial least squares regression (PLSR) in Martens & Naes (1989), cited above. The same applies when pixels are reassigned from one segment to another: the loadings of the reassigned pixels must be estimated with respect to the new segment to which they have been assigned.
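Estimating loadings for pixels that were left out of the modelling amounts to regressing their data on the existing scores. The following NumPy sketch shows the plain least-squares case; the function name `loadings_from_scores` and the toy numbers are hypothetical, and no weighting is applied.

```python
import numpy as np

def loadings_from_scores(T, D_pixels):
    """Least-squares loadings Pt such that D_pixels ≈ T @ Pt,
    with the scores T taken from a model of other (certain) pixels."""
    Pt_pixels, *_ = np.linalg.lstsq(T, D_pixels, rcond=None)
    return Pt_pixels

# Scores for 3 frames and 2 factors, obtained from the certain pixels.
T = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0]])
# Data for 2 uncertain pixels that happen to follow the same factors exactly.
true_Pt = np.array([[2.0, 5.0],
                    [3.0, -1.0]])
D_uncertain = T @ true_Pt
Pt_uncertain = loadings_from_scores(T, D_uncertain)
```

The same call can be used when pixels are moved to another segment, with `T` then taken from the model of the new segment.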
A main purpose of the bilinear model is to give a compact representation of a large amount of input data. To achieve this, the number of 'significant' factors used in the model T*P^T must be small, that is, the model must have low rank. The number of significant factors can be estimated in various ways, for example by cross validation as described in Martens & Naes (1989), cited above, or from the way residuals and leverages change with the number of factors.
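As a much simpler stand-in for cross validation, the number of factors can be chosen from how the residual sum of squares shrinks as factors are added. The sketch below is an assumption-laden illustration: the function name `choose_num_factors` and the 1 % threshold are hypothetical and are not taken from the text.

```python
import numpy as np

def choose_num_factors(D, max_residual_fraction=0.01):
    """Smallest number of factors whose residual sum of squares is at most
    the given fraction of the total sum of squares of D."""
    s = np.linalg.svd(D, compute_uv=False)
    total = np.sum(s ** 2)
    residual = total - np.cumsum(s ** 2)   # residual after 1, 2, ... factors
    enough = np.nonzero(residual <= max_residual_fraction * total)[0]
    return int(enough[0]) + 1

# Toy data with two strong factors and one very weak one.
D = np.diag([5.0, 3.0, 0.01])
nf = choose_num_factors(D)
```

Unlike cross validation, this rule does not guard against fitting noise, so the threshold would have to be tuned to the noise level of the motion estimates.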
Loadings that have been defined or estimated beforehand ('pseudo-loadings') can be used as part of the model of the data matrix D. In that case, the scores of these a priori factors are estimated by projecting D onto the pseudo-loadings (weighted regression), and the residuals remaining after this projection are then subjected to bilinear modelling.
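The two-stage procedure with pseudo-loadings can be sketched as follows. This is a minimal, unweighted NumPy illustration (the text mentions weighted regression); the function name `model_with_pseudo_loadings` and the toy data are hypothetical.

```python
import numpy as np

def model_with_pseudo_loadings(D, Pt_prior, nf_extra):
    """Project D on a priori loadings, then model the residual bilinearly."""
    # Scores of the a priori factors: least-squares projection of D on Pt_prior.
    T_prior = np.linalg.lstsq(Pt_prior.T, D.T, rcond=None)[0].T
    residual = D - T_prior @ Pt_prior
    # Bilinear model (truncated SVD) of what the a priori factors leave behind.
    U, s, Vt = np.linalg.svd(residual, full_matrices=False)
    T_extra = U[:, :nf_extra] * s[:nf_extra]
    Pt_extra = Vt[:nf_extra, :]
    return T_prior, T_extra, Pt_extra

# One known loading plus one unknown pattern, 3 frames, 4 pixel variables.
Pt_prior = np.array([[0.5, 0.5, 0.5, 0.5]])
other = np.array([0.5, -0.5, 0.5, -0.5])
t_prior = np.array([1.0, 2.0, 3.0])
t_other = np.array([1.0, -1.0, 2.0])
D = np.outer(t_prior, Pt_prior[0]) + np.outer(t_other, other)

T_prior, T_extra, Pt_extra = model_with_pseudo_loadings(D, Pt_prior, nf_extra=1)
D_hat = T_prior @ Pt_prior + T_extra @ Pt_extra
```

Here the extra loading recovered from the residual matches the unknown pattern up to sign, which is the usual sign ambiguity of an SVD.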
The scores of further frames are estimated one frame at a time by projecting d_Rn onto P^T by linear regression, using some weighted or robustly reweighted least-squares minimization. Alternatively, they can be estimated by nonlinear iterative curve fitting, for example SIMPLEX optimization (J.A. Nelder and R. Mead, 'A simplex method for function minimization', Computer Journal, vol. 7, pp. 308-313). In that case the criterion may also be based on the decoding intensity error that results when these scores are used.
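The weighted least-squares projection of one frame's motion vector onto the loadings can be sketched in NumPy as below. The function name `estimate_scores` and the per-pixel weights are hypothetical; robust reweighting would call this repeatedly with updated weights.

```python
import numpy as np

def estimate_scores(d, Pt, weights=None):
    """Scores t minimizing sum(weights * (d - t @ Pt) ** 2) for one frame."""
    if weights is None:
        weights = np.ones_like(d)
    w = np.sqrt(weights)
    A = (Pt * w).T              # weighted loadings, shape (n_elements, nf)
    b = d * w                   # weighted data
    t, *_ = np.linalg.lstsq(A, b, rcond=None)
    return t

# Two loadings over 4 pixel variables.
Pt = np.array([[1.0, 0.0, 1.0, 0.0],
               [0.0, 1.0, 0.0, 1.0]])
d = np.array([2.0, -3.0, 2.0, -3.0])   # equals [2, -3] @ Pt
t_clean = estimate_scores(d, Pt)

# A corrupted first element is ignored when its weight is zero.
d_bad = d.copy()
d_bad[0] = 100.0
t_robust = estimate_scores(d_bad, Pt, weights=np.array([0.0, 1.0, 1.0, 1.0]))
```

Setting a weight to zero removes that pixel from the fit entirely, which is the limiting case of down-weighting unreliable motion estimates.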
As described here, the change information d_Rn is expressed as the motion field DA_Rn in the reference position, so that it is compatible with the bilinear loadings P, which are also expressed in the reference position. Alternatively, the change information can be expressed in the position of the pixels in frame n, for example as the reverse motion field DA_nR, and projected onto a compatible version of the loadings P, that is, P temporarily moved to the same position using the motion field DA_Rn.
Representing spatial information from many frames in one common image position
The validity of a bilinear model of motion depends on how the motion fields are represented. When a rigid body moves in three-dimensional space in front of a camera (translation, rotation, scaling), the corresponding motion fields can be described by a low-dimensional bilinear model. Systematic motion of non-rigid bodies (for example, a face that begins to smile) can also be approximated by a bilinear model.
However, a low-dimensional bilinear model is mainly valid when the motion fields (or other change fields) are stored in D in a representation in which all information about a given object is stored at the same pixel positions for all frames. This can be achieved by relating the motion of every frame in a set of related frames to a given 'reference image' R and storing it in the coordinate system of that reference image. The reference image can be, for example, the first, middle or last image of the sequence n = 1, 2, ..., N, or a composite image model with parts from several frames.
An example is the IDLE type of codec (according to WO 95/08240 and WO 95/34172), in which the motion, intensity changes and other modelled change information for several (consecutive) frames are represented, directly or indirectly, relative to a common 'extended reference image model' for a given class of pixels (a spatial 'holon') in a given set of related frames. Before segmentation starts, the whole initial reference image (for example the first frame of the sequence) is regarded as one holon. The main purpose of spatial segmentation is then to split this initial spatial holon into separate data structures, each of which lends itself to simple, low-dimensional mathematical modelling.
The motion field can be estimated directly from the reference image I_R to frame I_n and analysed directly in D. Alternatively, it can be estimated from a moved (warped) version of the reference image: with I_m = move(I_R by DA_Rm) approximating I_n, a local motion field DA_mn is estimated and then moved back to the reference position, for example using the inverse DA_mR of the motion field that produced I_m, giving the estimate DA_Rn = DA_Rm + move(DA_mn by DA_mR).
Thus, one advantage of the invention is that a compact, low-rank summary of the motion fields and other change fields of several frames is used to enhance and stabilize segmentation in video coding.
Similarly, segmentation can be carried out in the time domain, in order to find groups of frames that share a certain spatial model. Temporal segmentation then uses subspace information obtained by bilinear modelling of time-shifted versions of related time series (for example, time-shifted versions of the frame scores obtained by bilinear modelling of the change fields as described by equation (1)) (H. Martens & M. Martens (1992), NIR Spectroscopy - applied philosophy. Proceedings, 5th Internatl. Conf. NIR Spectroscopy (K.I. Hildrum, ed.), North Holland, pp. 1-10).
Application of segmentation based on multi-frame motion
The bilinear summary of multi-frame motion fields can be applied to segmentation in several ways.
In the preferred embodiments, the frames are modelled in forward, sequential order. The order can, however, also be chosen by other criteria, for example according to which frame at a given time is most in need of model refinement and shows the greatest potential for it.
Segmentation based on bilinear models can be used pyramidally. One example is to segment the frames at reduced resolution in order to identify the main holons of the sequence, and then to use the results as preliminary, tentative input to the same process at higher frame resolution.
In the preferred embodiments, motion estimation, bilinear modelling and segmentation can be carried out either for each already identified holon ('input holons') or for the complete, unsegmented images I_n. In either case, multi-holon pre- or post-processing of the motion fields obtained by motion estimation, bilinear modelling and segmentation is needed in order to resolve overlaps between the input holons.
Such pre- or post-processing is based on storing each holon with a 'halo' of neighbouring pixels whose holon membership is uncertain, that is, pixels that can only tentatively be assigned to a given holon (and that are therefore also tentatively stored in other holons, or stored in a separate list of unclear pixels). In motion estimation, these tentative halo pixels are treated specially, for example by being fitted for all relevant holons, and their membership of the different holons is updated according to the success of the motion estimation. In the bilinear modelling, such halo pixels are given low weight or are fitted passively (see principal component regression in Martens, H. and Naes, T. (1989), Multivariate Calibration, J. Wiley & Sons Ltd, Chichester, UK, incorporated herein by reference).
Supplementary variables
Additional columns in the raw data matrix G (equation (3)) can be formed from 'external scores' from other data blocks. Sources of such 'external scores' are:
- scores from bilinear modelling of some other data domain (for example, the motion-compensated intensity residuals of the same holon),
- scores from other holons, preferably in a nonlinear representation (see A. Gifi, Nonlinear Multivariate Analysis, J. Wiley & Sons Ltd, Chichester, 1990, incorporated herein by reference), with each score vector quantized and analysed as an indicator matrix (p. 67) or at the ordinal level (p. 187),
- scores from the same holon at a different spatial resolution,
- scores from external data such as sound (for example, after bilinear modelling of the sound vibration energy spectrum for the same frames).
The weights of these supplementary variables must be adjusted so that their uncertainties become similar to those of the weighted pixels in the final data matrix D to be modelled (equations (1) and (2)).
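Appending external scores as extra columns, scaled so that their uncertainty is comparable to that of the weighted pixels, can be sketched as follows. The function name `append_external_scores` and the assumption that an uncertainty standard deviation is known for each external score column are hypothetical.

```python
import numpy as np

def append_external_scores(G, external_scores, s_external):
    """Append external score columns to G, each divided by its assumed
    uncertainty standard deviation (same 1/s weighting as for pixels)."""
    weighted = external_scores / np.asarray(s_external, dtype=float)
    return np.hstack([G, weighted])

# 3 frames, 4 pixel variables, 2 external score columns.
G = np.ones((3, 4))
external_scores = np.array([[2.0, 10.0],
                            [4.0, 20.0],
                            [6.0, 30.0]])
G_augmented = append_external_scores(G, external_scores, s_external=[2.0, 10.0])
```

After this step `G_augmented` has one extra column per external score and can be passed to the bilinear modelling in place of `G`.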
Another kind of mildly make up uncertain similar or external label and do not force that information is joined the mode of going in the bilinear model is to replace (one-block) bilinearity simulation with secondary simulation (two-modelling), for example PLS returns and (sees Martens, H. and Naes, T. (1989): Multivariate Calibration.J.Wiley ﹠amp; SonsLtd, Chichester UK), or adopt polylith or N mode (N-way) simulation, as Parafac (Sharaf, M.A., IIIman, D.L. and Kowalski, B.R.Chemometrics, J.Wiley ﹠amp; Sons, New York 1986) or Consensus PCA/PLS (referring to Martens, M. and Martens, H.1986 the Partial Least Squares Regression in Statistical Procedurein Food Research (J.R.Piggott publication), ElsevierApplied Sciences London 293-360 page or leaf, and Geladi, P., Martens, H, Marten, M., Kalvenes, S. and Esbensen, K. (1988) Multivariate Compearison of Laboratory ResultsProceding, Symp.Applied Statistics, Copenhagen, 25-27 day in January, 1988 (Per Thorboell publication), Uni.C, Copenhagen 16-30 page or leaf.These are for reference in this citation).
In this way, uncertain pixels and external scores contribute positively to the modelling if they fit well, but do not affect it strongly in a negative way if they do not fit. In either case, these uncertain pixels and external scores are fitted to the bilinear model obtained.
Scores from the present holon's model, at the present resolution and in the present domain, can in turn be used as 'external scores' for other holons, at other resolutions or in other domains.
Preferred embodiment
Stabilizing the segmentation by means of multi-frame summaries can be implemented in different ways.
In a first preferred embodiment, the bilinear segmentation process takes a top-down approach, removing segments from an input holon: pixel regions that do not fit within the motion subspace of the general holon model are detected as outliers and are segmented away from the remaining input holon.
In a second preferred embodiment, the segmentation takes a bottom-up approach, attempting to grow stable seed points within the input holon into homogeneous, coherent segments.
In a third preferred embodiment, segmentation is separated from motion estimation and motion compensation (Fig. 5), with bilinear modelling of the estimated motion fields and other change data of the frames taking place in between.
In a fourth preferred embodiment, motion estimation and segmentation are in effect combined (Fig. 6) and are followed by bilinear modelling.
In a fifth preferred embodiment, the bilinear modelling and segmentation process is carried out for a whole sequence at a time.
In a sixth preferred embodiment, the motion estimation, bilinear modelling, segmentation and model are updated gradually for each frame.
In a seventh preferred embodiment, the bilinear modelling method used for segmentation is extended to include additional criteria beyond explained covariance alone; in this case, spatial and temporal smoothness serve as the additional criteria. Reweighting of the rows and columns of the input data within the bilinear modelling is also included.
In an eighth preferred embodiment, bilinear modelling is combined with optimal scaling. Not only the weights but also the input data themselves are then changed during model estimation: whenever the prediction of a data element from a preliminary low-rank bilinear model does not give clearly worse decoding results than the input value of that element, the input value is replaced by its bilinear prediction.
Preferred embodiment
First preferred embodiment: segmentation based on outlier analysis relative to a spatial model
Fig. 5 shows the main building blocks of segmentation based on a bilinear model: a motion estimator unit EstMov 520, a bilinear modelling unit EstBLM 540 and a segmentation unit EstSegm 560, together with the data streams between them. The data flow is described in more detail in the third preferred embodiment.
The first two embodiments present two structures for the segmentation unit EstSegm 560: top-down and bottom-up.
In the first preferred embodiment, the holon input to the EstSegm unit 560 is kept intact as far as possible. However, if the holon contains parts whose characteristics differ clearly and consistently from those of the rest of the holon, these parts are split off into separate new segments. In addition, individual pixels whose preliminary classification is questionable, for example along the edges of the holon, are removed or marked as unreliable outliers.
A method for such top-down holon segmentation is described below in pseudo-code with line numbers:
Single-frame segmentation
A method for a single frame, used for detecting rigidly moving objects, is described first.
A spatial model of each such potential segment is formed by iteratively reweighted least squares spatial modelling:
Assume that the vertical and horizontal motion estimates of a given frame n relative to the reference frame R are taken as the regressand Y in the spatial modelling:
$Y = [DV_{Rn} \; DH_{Rn}]$ (701)
Assume the regressors $X = [1 \; v \; h]$ (702)
where v is the column of the pixels' vertical addresses and h is the column of their horizontal addresses.
The motion model for an affine transformation is then:
$Y = XB + F$ (703)
The 3 × 2 regression coefficient matrix B is estimated by reweighted least squares regression:
Estimate the uncertainty standard deviation of each pixel (row): $s = [s_{pel},\ pel = 1, 2, \ldots, npels]$ (704)
Define the matrix W of initial pixel weights, for example with all pixels weighted equally:
$W = \mathrm{diag}(1, 1, 1, 1, \ldots, 1_{npels})$ (705)
While the reweighting process has not converged: (706)
    $B = (X^T W X)^{-1} X^T W Y$ (estimate the regression coefficients) (710)
    $F = Y - XB$ (residuals) (720)
    $R = f(F, s)$ (residuals relative to the noise level s) (730)
    Here the residual f(pel, j) of each pixel in each column j of Y is divided by the uncertainty standard deviation s(pel) of that pixel:
    $r(pel, j) = f(pel, j) / s(pel)$ (735)
    $W = f(R)$ (update the pixel weights) (740)
    for example as a function of the relative residuals summed over all Y variables j = 1, 2, ...:
    $w(pel, pel) = c / (c + r(pel,1)^2 + r(pel,2)^2 + \ldots)$ (745)
    where the sensitivity coefficient c is, for example, 1.0.
    Check convergence: for example, is B stable? (750)
End of the loop when the reweighting process has converged (750)
Estimators other than the sum of squared relative residuals $r(pel,1)^2 + r(pel,2)^2 + \ldots$ can also be used, for example the median or some other robust distance measure.
In this process, pixels that do not fit well to the spatial model supported by the majority of the pixels obtain distinctly large relative residuals R and are therefore down-weighted, so that their influence on the estimate of the coefficients B is reduced in the next iteration. Their residuals then become even larger in the next iteration, so that at convergence they have very little influence on the final spatial model B.
Pixels with a low final weight (for example, w(pel, pel) < 0.1) are defined as outliers that do not belong to the input holon, and are collected into a new segment. This new outlier segment can be submitted to the same reweighted regression modelling to check whether it should be split further into smaller segments. The resulting segments then constitute the output result 565.
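The reweighting loop (705) to (750) and the outlier threshold above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the implementation of the patent: the function name `reweighted_affine_fit`, the iteration cap and the convergence tolerance are hypothetical, and the weight function is the example (745) with c = 1.0.

```python
import numpy as np

def reweighted_affine_fit(v, h, Y, s, c=1.0, max_iter=50, tol=1e-8):
    """Iteratively reweighted least squares fit of Y = X B + F, X = [1 v h].

    v, h : pixel addresses, shape (npels,)
    Y    : motion estimates, shape (npels, n_vars)
    s    : uncertainty standard deviation per pixel, shape (npels,)
    Returns the coefficient matrix B and the final pixel weights w.
    """
    X = np.column_stack([np.ones_like(v, dtype=float), v, h])  # (702)
    w = np.ones(len(v))                                        # (705)
    B = np.zeros((X.shape[1], Y.shape[1]))
    for _ in range(max_iter):                                  # (706)
        Xw = X * w[:, None]                                    # W X, with W diagonal
        B_new = np.linalg.solve(X.T @ Xw, Xw.T @ Y)            # (710)
        F = Y - X @ B_new                                      # (720)
        R = F / s[:, None]                                     # (730), (735)
        w = c / (c + np.sum(R ** 2, axis=1))                   # (745)
        converged = np.max(np.abs(B_new - B)) < tol            # (750): is B stable?
        B = B_new
        if converged:
            break
    return B, w
```

Pixels whose final weight falls below a chosen threshold (0.1 in the example above) would then be collected into a new outlier segment, for instance with `outliers = w < 0.1`.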
When the pixel weights are redefined (740), neighbouring pixels can also be taken into account in order to strengthen the spatial continuity of the holon. The a priori weights (705) can also be modified, for example by giving lower initial weights to pixels that are known to be potentially invalid or particularly uncertain, because of occlusion or because of unsatisfactory bilinear modelling.
Each element (pel, j) in Y may have its own estimated uncertainty measure s(pel, j), which can be used instead of the overall uncertainty of each pixel in (745). This individual uncertainty measure can be asymmetric, so that positive and negative residuals are assessed differently. This corresponds to the motion estimate of a pixel in a flat intensity region close to an intensity edge (asymmetric slack): the pixel can be moved away from the edge without loss of fit in the synthesized intensity, but it cannot be moved beyond the edge.
The full-rank regression used in (710) can be replaced by other estimators, for example the PLS regression described in Martens, H. and Naes, T. (1989): Multivariate Calibration (J. Wiley & Sons Ltd, Chichester UK), or some similar or generalized reduced-rank regression method.
Multi-frame segmentation
This basic top-down segmentation method can be used for multi-frame segmentation: instead of segmenting the holon on the basis of the motion field of a single frame only, using the regressand
$Y = [DV_{Rn} \; DH_{Rn}]$
it can be based on the motion fields of several frames:
$Y = [DV_{R1} \; DH_{R1},\ DV_{R2} \; DH_{R2},\ \ldots,\ DV_{Rn} \; DH_{Rn},\ \ldots]$ (760)
In this first preferred embodiment, it is based on the scaled loading space that spans the motion patterns of the holon over the several modelled frames:
$Y = [PV \; PH] = [pV_{R1}, pV_{R2}, \ldots, pV_{RJ},\ pH_{R1}, pH_{R2}, \ldots, pH_{RK}]$ (770)
Here the numbers of bilinear factors for vertical and horizontal motion (J and K, respectively) are chosen so that only significant and reliable factors are used (as judged, for example, by cross-validation over frames). The factor loadings (the columns of PV and PH) should be scaled, for example so that their uncertainty variances are equal.
Return operator Y and can also be defined by comprising strength information, for example motion-compensated intensity difference image.
Y=[D1 R1D1 R2D1 R3......D1 Rn] (775)
Here, D1 Rn=for each color metric (for example RGB) define or be defined as be similar to brightness certain accumulative total frame n and common reference frame R between motion-compensated intensity difference.In addition, can be according to motion-compensated intensity difference as load (loading) from the bilinearity intensity factor.
Making the preferred mode of so motion-compensated intensity difference between frame n and reference frame R is at first to set up the sports ground of frame, DA=[DV in exercise estimator EstMov 520 RnDH Rn] and corresponding estimation of Depth amount etc., use this DA subsequently RnMove (reeling (warp)) reference picture, to produce I nHat is (promptly based on I R'sFrame I nApproximate), then calculate I nHat and I nBetween intensity difference, and finally use the contrary DA of mobile operator NR1: D1 Rn=Move ((I nHat-I n) take advantage of DA NR), this difference is moved back to the reference position.
Such strength information can use with movable information, or separates use with movable information.No matter be under any situation, should get ratio to row Y, to reflect them to cutting apart relative desired influence, for example, the average estimation uncertain variance relative with them is inversely proportional to.
Alternative spatial models
The spatial model used for computing the residuals F in (703) can be of a type other than that used in (702). For example, X can also include square and cross-product terms of the addresses v and h (see Lancaster, P. and Salkauskas, K. (1986): Curve and Surface Fitting, Academic Press, p. 133, hereby incorporated by reference). Splines or piecewise polynomials can also be used (Lancaster & Salkauskas 1986, p. 245, hereby incorporated by reference). Such higher-order models can help to distinguish outlier pixels from pixels that take part in a smooth motion that is not an affine transformation.
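As a minimal sketch of the extended regressor matrix, assuming NumPy and ordinary (unweighted) least squares for brevity, the following builds X with or without the square and cross-product terms. The helper names `design_matrix` and `spatial_residuals` are hypothetical.

```python
import numpy as np

def design_matrix(v, h, order=1):
    """Regressors for the spatial model.

    order=1 gives the affine model X = [1 v h] of (702);
    order=2 adds the square and cross-product terms v*v, v*h, h*h.
    """
    v = np.asarray(v, dtype=float)
    h = np.asarray(h, dtype=float)
    cols = [np.ones_like(v), v, h]
    if order == 2:
        cols += [v * v, v * h, h * h]
    return np.column_stack(cols)

def spatial_residuals(X, Y):
    """Residuals F = Y - X B of an ordinary least squares fit, as in (720)."""
    B, *_ = np.linalg.lstsq(X, Y, rcond=None)
    return Y - X @ B
```

A smoothly curved motion field leaves systematic residuals under the affine model but is fitted by the second-order model, so its pixels need not be mistaken for outliers.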
X can also include a spatial autoregressive part, in which spatially shifted versions of Y are included in X and a reduced-rank regression method such as PLS regression is used (see H. Martens & M. Martens (1992): NIR Spectroscopy - Applied Philosophy, Proceedings, 5th Internatl. Conf. NIR Spectroscopy (K.I. Hildrum, ed.), North Holland, pp. 1-10). This spatial autoregressive part of the model makes it possible to distinguish outlier pixels, which should be down-weighted, from pixels that take part in a smooth motion that is well described neither by the affine transformation structure nor by a spatial polynomial structure within the holon.
Alternative mechanisms for detecting segment borders
Additional information can be brought in to optimize the exact placement of the segment borders. One such source of information is the intensity edges in the reference image $I_R$, as detected with, for example, a Sobel filter (J.C. Russ: The Image Processing Handbook, 2nd ed., IEEE Press 1995, hereby incorporated by reference). If the relative weights W (740) after the spatial modelling of Y indicate a segment border close to such an intensity edge, the segment border is moved to that intensity edge.
More advanced statistical methods can also be used for determining segment edges. An example of such a method can be found in Kok F. Lai: 'Deformable Contours: Modelling, Extraction, Detection and Classification', PhD Thesis, University of Wisconsin-Madison 1995, hereby incorporated by reference. For the present application, the input information can be the intensity $I_R$, the intensity residuals $DI_{Rn}$ (or BLM summaries of these), the spatial residuals F (720) or R (730), the model weights W (740), and/or the Y data themselves.
Second preferred embodiment: segmentation based on cluster analysis
The second preferred embodiment represents a bottom-up method for segmenting the input holon. It consists of a cluster analysis of the multi-frame motion data or of their bilinear summaries.
Several different clustering techniques can be used for finding groups of pixels. The choice of clustering criterion and clustering algorithm then defines the statistical properties of the clusters. For example, the cluster analysis can be carried out separately for each motion direction (vertical, horizontal, depth) or jointly over all directions. The latter is the preferred implementation (although the depth direction may be left out, at least at the start of the encoding).
Two main groups of clustering techniques can be used: cluster analysis that makes no spatial assumptions about parameter smoothness or neighbourhood relations in the image plane, and cluster analysis that does make such assumptions.
Conventional cluster analysis
The aim here is to find clusters of pixels that show similar motion patterns in at least some of the factor dimensions in P, that is, pixels that show similar motion patterns in at least some of the significant dimensions.
Several different clustering principles can be applied to the bilinear motion subspace. Distances in this space can be computed as ordinary or weighted Euclidean distances or as normalized (Mahalanobis) distances. One approach is standard non-hierarchical cluster analysis (Mardia, K.V., Kent, J.T. and Bibby, J.M. (1979): Multivariate Analysis, Academic Press, Inc., New York; Gunderson, Bob (1983): An Adaptive FCV Cluster Algorithm, International J. of Man-Machine Studies 19, pp. 97-104; Bezdek et al.: Detection and Characteristics of Cluster Sub-Structures, SIAM J. of Applied Math. 40 (2), April 1981; Bezdek, J.C. and Pal, S.K. (1992): Fuzzy Models for Pattern Recognition, IEEE, New York). An example of such cluster analysis is the partitioning technique described in Mardia, K.V., Kent, J.T. and Bibby, J.M. (1979): Multivariate Analysis, Academic Press, Inc., New York, pp. 361-368, hereby incorporated by reference.
Cluster analysis is described in more detail in Mardia, K.V., Kent, J.T. and Bibby, J.M. (1979): Multivariate Analysis, Academic Press, Inc., New York, pp. 360-390; Bezdek et al.: Detection and Characteristics of Cluster Sub-Structures, SIAM J. of Applied Math. 40 (2), April 1981; and Bezdek, J.C. and Pal, S.K. (1992): Fuzzy Models for Pattern Recognition, IEEE, New York. Fuzzy clustering techniques (see Gunderson, Bob (1983): An Adaptive FCV Cluster Algorithm, International J. of Man-Machine Studies 19, pp. 97-104) are particularly useful; in these techniques, bilinear modelling is used for parameterizing the internal structure of the clusters, and the clusters are allowed to overlap. For hierarchical cluster analysis, see Sharaf, M.A., Illman, D.L. and Kowalski, B.R.: Chemometrics, J. Wiley & Sons, New York 1986, p. 219. These references are hereby incorporated by reference.
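One simple instance of non-hierarchical cluster analysis in the loading space is a k-means partitioning of the pixels by ordinary Euclidean distance. The sketch below is only an illustration of that family of techniques, not the specific partitioning method of the cited references; the function name `kmeans_pixels` and the farthest-point initialization are assumptions made here to keep the example deterministic.

```python
import numpy as np

def kmeans_pixels(P, k, n_iter=100):
    """Partition pixels into k clusters by Euclidean distance.

    P : array of shape (npels, n_factors); each row holds one pixel's
        (scaled) loadings, for example a row of Y = [PV PH] in (770).
    Returns the cluster label of each pixel and the cluster centres.
    """
    P = np.asarray(P, dtype=float)
    # Farthest-point initialization: start at the first pixel, then pick
    # the pixel farthest from all centres chosen so far.
    centres = [P[0]]
    for _ in range(1, k):
        d = np.min([((P - c) ** 2).sum(axis=1) for c in centres], axis=0)
        centres.append(P[d.argmax()])
    centres = np.array(centres)
    for _ in range(n_iter):
        # Squared distance from every pixel to every centre.
        d = ((P[:, None, :] - centres[None, :, :]) ** 2).sum(axis=2)
        labels = d.argmin(axis=1)
        new = np.array([P[labels == j].mean(axis=0) if np.any(labels == j)
                        else centres[j] for j in range(k)])
        if np.allclose(new, centres):
            break
        centres = new
    return labels, centres
```

Weighted or Mahalanobis distances could be obtained by rescaling or whitening the columns of P before calling the function.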
Cluster analysis with assumptions of spatial continuity in the image plane
In this embodiment, the cluster analysis searches for spatially connected pixels with similar motion patterns. Boyer et al. published in 1994 a method for image segmentation that makes extensive, but not exclusive, use of spatial continuity (Boyer, K.L., Mirza, M.J. and Ganguly, G. (1994): The Robust Sequential Estimator: A General Approach and Its Application to Surface Organization in Range Data, IEEE Transactions on Pattern Analysis and Machine Intelligence 16 (10), October 1994, pp. 987-1001, hereby incorporated by reference). One embodiment of the present invention generalizes their method from measurements in one frame (a single range image Z) and in one dimension (distance) to measurements from several frames and in several dimensions (vertical motion, horizontal motion and possibly other features, as described below).
The segmentation technique of Boyer et al. can be summarized as follows:
* Analyse the spatial data (in the case of Boyer et al. 1994: range data Z) to find sufficiently large smooth spatial regions that can serve as starting points for potential segments.
* Form a spatial model for each such starting point by reweighted least squares spatial modelling: around each starting point, fit the Y variables (Y = Z) to a spatial model and estimate the residuals. In this preferred embodiment, a linear model Y = XB + F is used, fitted by reweighted least squares with the motion model X for affine transformation (702) as described above. However, this can also be extended to use polynomials and/or autoregression in Z.
* Update the segment model gradually by including neighbouring pixels that appear to fit the preliminary segment model, so that the potential segment grows locally. This growth process continues until no new pixels fit the spatial segment model of the holon well.
* Attempt to extend each spatial model to the rest of the image, in order to find more distant pixels that may belong to the segment.
* Merge potential segments whose spatial models are compatible.
* Prune and fill in outlier parts, and resolve ambiguities along the segment edges.
The exact segment edges can be optimized by the methods described for the first embodiment.
In the present invention, the above technique is applied to spatial parameter data other than the range data to which Boyer et al. applied it. Y is no longer defined as the depth Z of an image; instead, Y is defined as motion information from several frames according to (701), (760) or (770). As described above, intensity information (775) can also be included.
Modelling in other parameter representations
In another embodiment, not described in detail here, the segmentation technique is applied to 1D, 2D or higher-dimensional data after the data have been transformed into the frequency domain. The transformed data can be represented as the direct result of an FFT, by phase and amplitude, or by real and imaginary parts. More complex representations can also be used; one example is a phase-correlation representation.
It should be noted that the top-down and bottom-up approaches to multi-frame segmentation can be combined. For example, the input holon is first subjected to a top-down segmentation analysis, in order to identify outlier sub-regions that do not fit the majority or dominant structure of the holon. A bottom-up segmentation analysis is then used to search for homogeneous regions within the outlier sub-regions.
The following two preferred embodiments distinguish two ways of combining motion estimation and segmentation.
Third preferred embodiment: separate motion estimation and segmentation
In the third preferred embodiment (Fig. 5), the intensity data 505 for each frame $I_n$, n = 1, 2, ..., and for the reference image $I_R$ are input to the motion estimator 520. The resulting motion estimates $DA_{Rn}$ 525 are passed to the bilinear modelling unit EstBLM 540. The resulting bilinear model parameters 545 are passed to the segmentation unit EstSegm 560, which produces the segmentation result 565.
The EstMov operator 520 may include internal means for detecting preliminary segmentation indicators, such as edges, depth and spatial innovation in $I_R$ or $I_n$. These can be used to enhance the motion estimates $DA_{Rn}$ 525 (for example, by not blurring the motion field across clear preliminary segment borders), and they are passed on to the other units together with the motion estimates.
The bilinear model parameters 545 consist mainly of parameters modelling the motion data $DA_{Rn}$ and their uncertainties, but may also include parameters for the related motion-compensated intensity changes $DI_{Rn}$ etc.
Methods for coordinating motion estimation with bilinear modelling are given in the above-mentioned patent application entitled "Method and Apparatus for Coordination of Motion Determination Over Multiple Frames".
Several levels of feedback loops are used in this embodiment.
First, the motion estimator EstMov 520 uses previously established preliminary segmentation information 521 to optimize the handling of edges, occlusions and depth: the motion field is not smoothed across reliable preliminary segment borders.
EstMov 520 also uses previously established bilinear modelling results 522 to stabilize the motion estimation against ambiguity and noise sensitivity. The preliminary information 521 and 522 comes from the units EstSegm 560 and EstBLM 540, respectively, as obtained for earlier frames, other pyramid resolution levels or earlier iterations.
In the bilinear modelling unit EstBLM 540, a bilinear model is developed separately for each preliminary segment (holon) according to the preliminary segmentation information 521. Information from other related holons, and from pixels whose holon membership is unclear, can be handled in EstBLM 540 in such a way that it does not affect the bilinear model adversely (for example, as extra X variables with low weights in one-block bilinear modelling, or as Y variables in two-block bilinear modelling with an X block and a Y block, such as PLS2 or PCR).
Likewise, the preliminary bilinear model parameter estimates 522 can be modelled together with the new motion fields $DA_{R,n}$ (n = 1, 2, ...; 525) to produce the updated bilinear sequence model 545 for motion and, optionally, for motion-compensated intensity changes etc.
Fourth preferred embodiment: joint motion estimation and segmentation
Motion estimation, depth assessment and segmentation are closely related processes and are ideally treated in an integrated way. In the third preferred embodiment, the motion estimation and segmentation operators are coordinated through feedback of the preliminary results 521, 522. In the fourth preferred embodiment, these operators are fully integrated. In this case, the bilinear estimation can be carried out with less computing power, since it operates separately on relatively complete and independent segments.
In this embodiment, according to Fig. 6, the input data 605 and the preliminary segmentation and bilinear modelling results 623 are input to EstMov/EstSeg 620, which delivers the motion fields $DA_{Rn}$ for the holons found, together with estimated uncertainties, occlusions etc. 625 and segmentation information 665. In EstBLM 640, the bilinear models are updated separately for each holon. In addition, as described for the third preferred embodiment, (preliminary) relations between holons, and pixels with unclear holon classification, are temporarily down-weighted or defined as Y variables.
It should be noted that, with the feedback structures described in Fig. 5 and Fig. 6, motion estimation and segmentation can also be regarded as integral parts of the bilinear model estimation. In EstBLM 540, 640, the estimation process need not be iterated to convergence or full stability, as it would be in conventional singular value decomposition or eigenvalue decomposition. The subspace 545 obtained for each holon is sufficient for improving the next round of motion estimation and segmentation. Modifying the input data to EstBLM through improved motion estimation and segmentation can therefore be regarded as part of the bilinear estimation process.
The next two embodiments concern coordination between the frames of a sequence.
Fifth preferred embodiment: modelling the whole frame sequence in one step
In the fifth preferred embodiment, the whole sequence is first submitted to motion estimation; these motion estimates for the whole sequence are then submitted to bilinear modelling. Finally, the bilinear model or models of the holon or holons are used for segmenting the sequence. In terms of Fig. 5, the bilinear model results 522 and the segmentation results 521 from the previous iteration (or pyramid resolution level) are fed back to the motion estimation 520 and the bilinear modelling 540 in order to stabilize and facilitate these estimation processes.
The modelling of the whole sequence can then be repeated, using the updated bilinear motion and intensity change models and the updated segmentation.
Sixth preferred embodiment: gradual updating of the sequence model
In the sixth preferred embodiment, the bilinear model 545 is updated after motion estimation has been completed for each frame n = 1, 2, .... This can be done separately for each holon already segmented, but also for all holons in the frame together. The segmentation 565 can likewise be updated after each frame. In the preferred embodiment, apart from pruning along the holon edges, major re-segmentation is allowed only when the motion data clearly show that it is necessary.
Further details on the updating of bilinear models are given in the above-mentioned patent application "Method and Apparatus for Coordination of Motion Determination Over Multiple Frames".
Likewise, the order in which the frames are introduced into the modelling and segmentation need not be fixed. Once all frames have been modelled and segmented, the whole process can be started again, now with better initial values for the bilinear model and the segmentation.
The estimation uncertainty of the estimated segment borders is stored together with the segment border information, for use in subsequent encoding steps. Pixels with unclear segment classification, for example pixels in the region around a chosen segment border, are treated as having high uncertainty. In subsequent motion estimation and bilinear modelling, uncertain pixels are either given low weights as described above or fitted passively by principal component regression (Martens & Naes 1989). In subsequent segmentation, uncertain pixels are included in the new segmentation estimation, but with low a priori input weights (705).
The next two embodiments concern special techniques for adapting the bilinear model parameter estimation to the segmentation application.
Seventh preferred embodiment: bilinear modelling modified by additional smoothness criteria
The bilinear parameter estimation used for obtaining the above bilinear model can be modified to require or favour additional statistical constraints, such as requiring or biasing the temporal score vectors in T or the spatial loading vectors in P to be smooth, at least where no preliminary segment borders have been found.
This is done inside the iterations of the NIPALS algorithm for each factor a (see Martens, H. and Naes, T. (1989): Multivariate Calibration, J. Wiley & Sons Ltd, Chichester UK), as shown in the following pseudo-code with line numbers:
For each factor a, each new iteration is defined as follows:
(810) Obtain a raw estimate $p_{a,raw}$ of the spatial loading vector by projecting the data in D remaining after the previous factors onto the smoothed, scaled score vector $t_a$ obtained in the previous iteration.
(820) Submit the raw spatial loading vector $p_{a,raw}$ to spatial smoothing: $p_a = f(p_{a,raw})$. The smoothing method can be a simple boxcar filter, or a method that avoids smoothing across distinct edges that should be allowed to remain unsmoothed. The smoothed loading $p_a$ is orthogonalized with respect to the loadings of the previously estimated factors $[p_1, p_2, \ldots, p_{a-1}]$.
(830) Estimate the raw scores $t_{a,raw}$ by projecting the residual data onto the smoothed loading $p_a$.
(840) Submit this raw score vector to temporal smoothing, for example boxcar smoothing or more advanced smoothing: $t_a = f(t_{a,raw})$.
(850) Scale the smoothed score vector $t_a$ to length 1, and repeat the process iteratively until sufficient convergence.
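Steps (810) to (850) can be sketched as follows for a single factor. This is an illustration under simplifying assumptions, not the implementation of the patent: the pixels are treated as lying along one spatial dimension so that a 1D boxcar filter can serve as the spatial smoother, the filter width must be odd, edge-aware smoothing is omitted, and the names `boxcar` and `smooth_nipals_factor` are hypothetical.

```python
import numpy as np

def boxcar(x, width=3):
    """Moving average of odd width; the end values are replicated at the edges."""
    pad = width // 2
    xp = np.pad(x, pad, mode="edge")
    return np.convolve(xp, np.ones(width) / width, mode="valid")

def smooth_nipals_factor(E, P_prev=None, width=3, n_iter=100, tol=1e-10):
    """Estimate one bilinear factor with smoothed loadings and scores.

    E      : data remaining after the previous factors, shape (frames, pixels)
    P_prev : loadings of previous factors, shape (factors, pixels), assumed
             mutually orthogonal, or None for the first factor
    Returns the unit-length score vector t and the smoothed loading p.
    """
    # Start from the pixel column with the largest sum of squares.
    t = E[:, np.argmax((E ** 2).sum(axis=0))].copy()
    t /= np.linalg.norm(t)
    for _ in range(n_iter):
        p_raw = E.T @ t                          # (810) project data onto t
        p = boxcar(p_raw, width)                 # (820) spatial smoothing
        if P_prev is not None:
            for q in P_prev:                     # (820) orthogonalize against
                p = p - q * (q @ p) / (q @ q)    #       previous loadings
        t_raw = E @ p / (p @ p)                  # (830) project data onto p
        t_new = boxcar(t_raw, width)             # (840) temporal smoothing
        t_new /= np.linalg.norm(t_new)           # (850) scale to length 1
        done = np.linalg.norm(t_new - t) < tol
        t = t_new
        if done:
            break
    return t, p
```

After each factor, its contribution would be subtracted from E before the next factor is estimated, as in ordinary NIPALS.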
A further enhancement of the bilinear modelling in this embodiment is to apply iteratively reweighted least squares fitting of the bilinear model to the data, in order to reduce the influence of outlier frames or outlier pixels: the row weights $V_{frames}$ and the column weights $V_{pels}$ in equation (3) can be updated iteratively as the inverse of updated estimates of the average uncertainty standard deviation of the rows and columns, based on the residuals from the low-rank bilinear model in the previous iteration.
This is described in more detail in the above-mentioned patent application "Method and Apparatus for Coordination of Motion Determination".
Eighth preferred embodiment: rule-based optimal scaling as part of the bilinear modelling
In the bilinear modelling 540, 640, not only can the bilinear model parameters be changed in order to obtain better modelling; the values of the input data, for example $DA_{Rn}$, can also be changed during the estimation of the bilinear model parameters. Each element $da_{n,pel}$ of the motion data for a frame and a pixel can be modified iteratively so that it conforms better to the model obtained from other frames or pixels:
$da_{n,pel} = f(da_{n,pel}(\mathrm{input}),\ \hat{da}_{n,pel},\ \mathrm{Rules})$,
where, for the bilinear modelling, $\hat{da}_{n,pel} = t_n \cdot p_{pel}$.
An example of a rule is:
if ($\hat{da}_{n,pel}$ gives a motion fit $di_{n,pel}$ that is as good as or better than that given by $da_{n,pel}(\mathrm{input})$)
and ($\hat{da}_{n,pel}$ lies within the statistical uncertainty range of $da_{n,pel}(\mathrm{input})$)
then ($da_{n,pel} = \hat{da}_{n,pel}$)
else ($da_{n,pel} = da_{n,pel}(\mathrm{input})$).
Instead of this discrete definition of the data element $da_{n,pel}$, a more continuous weighted-average function of $\hat{da}_{n,pel}$ and $da_{n,pel}(\mathrm{input})$ can also be used.
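The example rule can be written as an element-wise operation over the whole motion matrix. In this sketch the function name `optimal_scaling_step` and its arguments are hypothetical, the motion fit is represented by an error measure where smaller means better, and the statistical uncertainty range is assumed to be plus or minus one uncertainty standard deviation s around the input value.

```python
import numpy as np

def optimal_scaling_step(da_input, da_hat, fit_err_input, fit_err_hat, s):
    """Apply the example rule to every element (frame n, pixel pel).

    da_input      : input motion data
    da_hat        : bilinear predictions, e.g. T @ P
    fit_err_input : motion fit error obtained with the input value
    fit_err_hat   : motion fit error obtained with the prediction
    s             : uncertainty standard deviation of the input value
    All arguments are arrays of the same shape (frames, pixels).
    Returns the modified data and the mask of replaced elements.
    """
    fits_as_well = fit_err_hat <= fit_err_input
    within_range = np.abs(da_hat - da_input) <= s
    accept = fits_as_well & within_range
    return np.where(accept, da_hat, da_input), accept
```

The modified data would then be fed back into the next round of bilinear model estimation, so that the input values and the model are refined together.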
This is described in more detail in the above-mentioned patent application "Method and Apparatus for Coordination of Motion Determination".
Combinations of the above constitute yet another embodiment.
Other applications:
Segmentation of bilinear structure in the time domain
The above segmentation and clustering techniques can be used to determine groups of frames (sequences) that are suitable for being analysed together, that is, to detect scene shifts. One embodiment of this is to carry out a simple bilinear modelling of the frame intensities (possibly sub-sampled) and a non-hierarchical cluster analysis in the score space T, in order to find clusters of frames that share much common image material. Robust single-cluster analysis is preferably used in this embodiment, so that motion within the scene can be followed within a single cluster.
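A minimal sketch of this idea, assuming NumPy, computes the scores of a principal component model of the frame intensities and partitions the frames into two clusters in score space. It uses a plain two-cluster k-means rather than the robust single-cluster analysis preferred above, and the names `frame_scores` and `split_frames` are hypothetical.

```python
import numpy as np

def frame_scores(frames, n_factors=2):
    """Scores T of a bilinear (PCA) model of frame intensities.

    frames : array of shape (n_frames, n_pixels), one row per frame
    """
    D = frames - frames.mean(axis=0)              # centre each pixel over frames
    U, S, _ = np.linalg.svd(D, full_matrices=False)
    return U[:, :n_factors] * S[:n_factors]

def split_frames(T, n_iter=50):
    """Partition the frames into two clusters in score space (k-means)."""
    # Initial centres: the first frame and the frame farthest from it.
    c = np.array([T[0], T[((T - T[0]) ** 2).sum(axis=1).argmax()]])
    for _ in range(n_iter):
        d = ((T[:, None, :] - c[None, :, :]) ** 2).sum(axis=2)
        labels = d.argmin(axis=1)
        c = np.array([T[labels == j].mean(axis=0) if np.any(labels == j)
                      else c[j] for j in range(2)])
    return labels
```

A change of label between consecutive frames then suggests a scene shift at that point in the sequence.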
Application to other types of data
In another embodiment, the above principles are applied to time-series data, for example audio data, in order to define a temporal segmentation. In this case, the spatial motion field data correspond to time warp estimates, and the spatial intensity change data correspond to temporal intensity change data.
For the use of the output of the present invention in decoding (reconstruction of frames), see WO 95/34172.
Those skilled in the art can make various modifications to the invention within the scope of the claims below. In particular, the term "a plurality of" can be interpreted as meaning "one or more".

Claims (36)

1. A method for segmenting an image sequence, said sequence consisting of frames, each frame consisting of samples of an input signal, characterized in that said method comprises the following steps:
(1) forming a reference image, said reference image consisting of samples obtained from a plurality of said frames,
(2) estimating motion from said reference image to each of said frames,
(3) reformatting said estimated motion into row vectors,
(4) collecting said row vectors into a motion matrix,
(5) performing a Principal Component Analysis on said motion matrix, thereby obtaining a score matrix consisting of a plurality of column vectors called score vectors and a loading matrix consisting of a plurality of row vectors called loading vectors, such that each score vector corresponds to one element for each frame, such that each element of each loading vector corresponds to one element of the reference image, such that one column of said score matrix and one loading vector together constitute a factor, and such that the number of factors is lower than or equal to the number of said frames,
(6) reformatting each loading back into the same format as used for the motion,
(7) performing segmentation based on the first of said reformatted loadings.
2. the method for claim 1 is characterized in that,
Comprise following step cutting apart in the step (7):
(7a) in first load, select a position,
(7b) form a local motion models according to the element adjacent with described chosen position,
(7c) candidate's element of the described reference picture of selection in also not having divided element,
(7d) determine the described candidate's element and the adaptive good degree of described local motion models of described first load,
(7e) those candidate's elements that will satisfy a certain fidelity are included in the described local motion models,
Wherein, represented one to cut apart with the set of the adaptive element of described local motion models.
3. The method of claim 2, characterized in that
said local motion model comprises model motions such as affine transformations.
4. The method of claim 2, characterized in that
said local motion model comprises model motions such as piecewise polynomial transformations.
5. The method of any one of claims 2 to 4, characterized in that said step (7d) is replaced by the following step:
(7d1) determining how well a plurality of further candidate elements in said first loading fit a subset of the elements already included in said local model, said subset being selected for each candidate element according to the position of said candidate element.
6. The method of any one of claims 2 to 5, characterized in that
step (7e) concerning said fidelity criterion comprises:
calculating, for each candidate element, the difference between the motion extrapolated by the local motion model at the position of the candidate element and the motion calculated by multiplying the loadings by the score vectors.
7. The method of any one of claims 2 to 6, characterized in that
said fidelity criterion also takes into account an uncertainty, corresponding to said motion field, for each element in said reference image.
8. The method of claim 7, characterized in that said fidelity criterion is the ratio between said difference, calculated for each element in said reference image, and said uncertainty corresponding to said motion field.
9. The method of claim 8, characterized in that said uncertainty has a corresponding value for each of a plurality of spatial directions, and one of said values is selected, according to the direction of said difference, for calculating said fidelity criterion.
10. The method of any one of claims 2 to 9, characterized in that
said method further comprises the following step:
(7f) updating said local motion model.
11. The method of claim 10, characterized in that said updating comprises the step of adjusting weights corresponding to the elements according to said fidelity criterion.
12. The method of claim 10 or 11, characterized in that
steps (7c) to (7f) are repeated a plurality of times.
13. The method of any one of claims 2 to 12, characterized in that it further comprises the following steps:
(7g) marking as segmented the positions of the elements that satisfy said fidelity criterion,
(7h) selecting a new position among those positions not marked as segmented,
(7i) repeating steps (7c) to (7h) until a given convergence criterion is satisfied.
14. The method of any one of claims 1 to 13, characterized in that,
in addition to said first factor, a plurality of further factors is also used.
15. The method of any one of claims 1 to 14, characterized in that
the step (5) of performing a Principal Component Analysis is omitted, and the subsequent steps operate directly on the motion fields corresponding to each of said frames instead of on said loading vectors.
16. A method for improving a segmentation of an image sequence, said sequence consisting of frames, each frame consisting of samples of an input signal, said segmentation being represented by a plurality of segments of a reference image, characterized in that said method comprises the following steps:
(1) for each given input segment, performing steps (1) to (6) of claim 1, using as reference image the intensities corresponding to said given input segment plus a layer of neighbouring elements, the set of segment reference images together constituting a total reference image, wherein each element in the total reference image may be represented in more than one segment reference image,
(2) calculating a fidelity criterion for each element in each of said segment reference images,
(3) for each element of the total reference image that is represented in more than one segment reference image, finding which segment gives the best fidelity, and removing the element from the other segments.
17. The method of any one of claims 1 to 15, characterized in that said method further comprises the following steps:
(8) forming a potential energy image based on indications of fidelity,
(9) for each segment, finding the borders with neighbouring segments,
(10) for each border, iteratively optimizing its position so that the sum of the energy along its path is minimized,
wherein the reference image elements within each border represent one segment.
18. The method of claim 17, characterized in that
said potential energy function is also based on the intensity gradient of the reference image.
19. The method of any one of claims 17 to 18, characterized in that it further comprises:
(11) for each frame in the sequence, moving the frame back to the reference position, using the motion field calculated by multiplying score values by loading vectors, or a reconstruction thereof,
(12) reformatting the moved frames into row vectors,
(13) collecting said row vectors into an intensity matrix,
(14) performing a Principal Component Analysis on the intensity matrix, producing row vectors called intensity loading vectors and column vectors called intensity score vectors,
wherein said potential energy image is also dependent on the intensity loading vectors and intensity score vectors.
20. The method of any one of claims 17 to 19, characterized in that said optimization of said borders also comprises optimizing the spatial simplicity of the borders.
21. A method for segmenting an image sequence, said image sequence consisting of frames, each frame consisting of samples of an input signal, characterized in that said method comprises the following steps:
(1) forming a reference image,
(2) estimating motion for one frame, producing for each element in said reference image a position in said frame,
(3) forming a regressor matrix, said regressor matrix consisting of column vectors, such that there is one column vector for each spatial dimension of the reference image, each said column containing, for each element in said reference image, the spatial position of said element, and one column vector containing
(4) forming a regressand matrix, said regressand matrix consisting of column vectors, such that there is one column vector for each spatial dimension of the reference image, each column containing, for each element, the estimated position in said frame,
(4) estimating a regression coefficient matrix, such that said regressand matrix is approximated by the product of said regression coefficient matrix and said regressor matrix,
(5) calculating, based on regression residuals, a fit measure for each element in said reference image, said regression residuals being calculated as said regressand matrix minus the product of the regression coefficient matrix and the regressor matrix,
(6) forming a segment based on the fit measures calculated in step (5),
wherein the segment formed in step (6) represents a segmentation of said image sequence.
22. The method of claim 21, characterized in that said estimating in step (4) is performed using least squares regression with robust weighting.
23. The method of claim 21 or 22, characterized in that said motion estimation is performed for a plurality of frames, and said regressand matrix contains one column vector for each spatial dimension of each resulting motion field.
24. The method of claim 21 or 22, characterized in that
motion estimation is performed for each frame in said sequence, a Principal Component Analysis is performed on the results of said motion estimation, and the regressand matrix contains one column vector for each loading vector.
25. The method of any one of claims 21 to 24, characterized in that
said regressand matrix also contains, for each frame, one column containing motion-compensated intensity residuals, said motion-compensated intensity residuals being formed by moving each frame to the reference position according to the estimated motion and subtracting the reference image.
26. The method of any one of claims 21 to 24, characterized in that
a Principal Component Analysis is performed on the motion-compensated intensity residuals, and said regressand matrix also contains the resulting loading vectors as columns.
27. The method of any one of claims 21 to 26, characterized in that
said regressor matrix also contains polynomials of said spatial positions as columns.
28. The method of any one of claims 22 to 27, characterized in that
spatially shifted versions of said regressand are also included as columns in the regressor matrix.
29. A method for segmenting an image sequence, said image sequence consisting of frames, each frame consisting of samples of an input signal, characterized in that
said method consists of the following steps:
(1) segmenting according to the method of any one of claims 21 to 28,
(2) repeating step (1), disregarding those parts of said reference image for which segments have already been found,
wherein the segments found in the repetitions of step (1) together represent said segmentation.
30. The method of any one of claims 21 to 28, characterized in that said regression is performed by a region growing method.
31. A method for segmenting an image sequence, said image sequence consisting of frames, each frame consisting of samples of an input signal, characterized in that
said method comprises the following steps:
(1) segmenting according to the method of claim 30,
(2) repeating step (1), disregarding those parts of said reference image for which segments have already been found,
wherein the segments found in the repetitions of step (1) together represent said segmentation.
32. The method of any one of claims 21 to 31, characterized in that
uncertainties of said regressand are calculated, and said uncertainties are used when estimating said regression coefficient matrix.
33. The method of any one of claims 21 to 32, characterized in that
uncertainties of the regressand are calculated, and said uncertainties are used when calculating said fit measure.
34. The method of any one of claims 21 or 23 to 33, characterized in that
said steps (4) and (5) comprise:
(4) performing a cluster analysis of the samples in the space spanned by the regressand columns,
(5) calculating a fit measure according to said cluster analysis.
35. An apparatus for segmenting an image sequence, said sequence consisting of frames, each frame consisting of samples of an input signal, characterized in that said apparatus comprises:
(1) means for forming a reference image, said reference image consisting of samples obtained from a plurality of said frames,
(2) means for estimating motion from said reference image to each of said frames,
(3) means for reformatting the estimated motion into row vectors,
(4) means for collecting said row vectors into a motion matrix,
(5) means for performing a Principal Component Analysis on the motion matrix, thereby obtaining a score matrix consisting of a plurality of column vectors called score vectors and a loading matrix consisting of a plurality of row vectors called loading vectors, such that each score vector corresponds to one element for each frame, such that each element of each loading vector corresponds to one element of the reference image, such that one column of said score matrix and one loading vector together constitute a factor, and such that the number of said factors is lower than or equal to the number of said frames,
(6) means for reformatting each loading back into the same format as used for the motion,
(7) means for segmenting based on the plurality of reformatted loadings.
36. The apparatus of claim 35, characterized in that said apparatus is adapted to carry out the method of any one of claims 2 to 34.
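As an informal illustration of steps (3) to (6) of claims 1 and 35, the following NumPy sketch reformats motion fields into a motion matrix, factorizes it, and reshapes the loadings back into motion-field format. It rests on assumptions that the claims do not specify: the array layout (frames, height, width, 2), the use of an SVD to compute the Principal Component Analysis, the absence of mean-centring, and the hypothetical function name motion_pca.

```python
import numpy as np

def motion_pca(motion_fields, n_factors):
    """motion_fields: array of shape (n_frames, height, width, 2) holding the
    estimated vertical and horizontal motion for each reference image element.
    Returns (scores, loadings) with
      scores:   (n_frames, n_factors), one column (score vector) per factor
      loadings: (n_factors, height, width, 2), loadings in motion-field format
    """
    D = np.asarray(motion_fields, dtype=float)
    n_frames = D.shape[0]
    field_shape = D.shape[1:]

    # Steps (3)-(4): one row vector per frame, stacked into the motion matrix.
    M = D.reshape(n_frames, -1)

    # Step (5): bilinear factorization M ~ scores @ loadings via the SVD.
    # The number of factors cannot exceed the number of frames.
    n_factors = min(n_factors, n_frames)
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    scores = U[:, :n_factors] * s[:n_factors]
    loadings = Vt[:n_factors]

    # Step (6): reshape each loading vector back into motion-field format.
    return scores, loadings.reshape((n_factors,) + field_shape)
```

Elements of the reference image that move together over the sequence receive similar values in the reformatted loadings, which is why a segmentation such as step (7) can operate on the loadings instead of on every per-frame motion field.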
CN 96192717 1995-03-22 1996-03-22 Method and apparatus for multi-frame based segmentation of data streams Pending CN1179223A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN 96192717 CN1179223A (en) 1995-03-22 1996-03-22 Method and apparatus for multi-frame based segmentation of data streams

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
EP95104229.0 1995-03-22
CN 96192717 CN1179223A (en) 1995-03-22 1996-03-22 Method and apparatus for multi-frame based segmentation of data streams

Publications (1)

Publication Number Publication Date
CN1179223A true CN1179223A (en) 1998-04-15

Family

ID=5128423

Family Applications (1)

Application Number Title Priority Date Filing Date
CN 96192717 Pending CN1179223A (en) 1995-03-22 1996-03-22 Method and apparatus for multi-frame based segmentation of data streams

Country Status (1)

Country Link
CN (1) CN1179223A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103814384A (en) * 2011-06-09 2014-05-21 香港科技大学 Image based tracking
CN103814384B (en) * 2011-06-09 2017-08-18 香港科技大学 Tracking based on image

Similar Documents

Publication Publication Date Title
CN1193620C (en) Motion estimation method and system for video coder
CN1145901C (en) Intelligent decision supporting configuration method based on information excavation
CN1320490C (en) Face detection and tracking
CN1747559A (en) Three-dimensional geometric mode building system and method
CN1208970C (en) Image processing apparatus
CN1132123C (en) Methods for creating image for three-dimensional display, for calculating depth information, and for image processing using depth information
CN1267863C (en) Method for aligning a lattice of points in response to features in a digital image
CN1194047A (en) Method and apparatus for coordination of motion determination over multiple frames
CN1151465C (en) Model identification equipment using condidate table making classifying and method thereof
CN1275201C (en) Parameter estimation apparatus and data collating apparatus
CN1419679A (en) Estimating text color and segmentation of images
CN1871622A (en) Image collation system and image collation method
CN1306650A (en) System, method, and computer program product for representing proximity data in multi-dimensional space
CN1607551A (en) Method and apparatus for image-based photorealistic 3D face modeling
CN1310825A (en) Methods and apparatus for classifying text and for building a text classifier
CN1288914C (en) Image coding and decoding method, corresponding devices and application
CN1554074A (en) Method and system for modifying a digital image taking into account its noise
CN1097396C (en) Vector quantization apparatus
CN1278349A (en) Image processing method and apparatus
CN1627315A (en) Object detection
CN101039422A (en) Image encoding apparatus, image decoding apparatus and control method therefor
CN1940912A (en) Document production system, document production method, program, and storage medium
CN1130969A (en) Method and apparatus for data analysis
CN1765124A (en) Image processing device, image processing method, and program
CN1678021A (en) Image processing apparatus and method, recording medium and program

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C01 Deemed withdrawal of patent application (patent law 1993)
WD01 Invention patent application deemed withdrawn after publication