AU767741B2 - A method for automatic segmentation of image data from multiple data sources - Google Patents
- Publication number
- AU767741B2 (AU 57783/01)
- Authority
- AU
- Australia
- Prior art keywords
- merging
- image
- regions
- segmentation
- pixel
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Ceased
Landscapes
- Image Analysis (AREA)
Description
S&FRef: 565702
AUSTRALIA
PATENTS ACT 1990 COMPLETE SPECIFICATION FOR A STANDARD PATENT
ORIGINAL
Name and Address of Applicant: Canon Kabushiki Kaisha, 30-2, Shimomaruko 3-chome, Ohta-ku, Tokyo 146, Japan
Actual Inventor(s): Julian Frank Andrew Magarey, Brian Parker
Address for Service: Spruson & Ferguson, St Martins Tower, Level 31 Market Street, Sydney NSW 2000 (CCN 3710000177)
Invention Title: A Method for Automatic Segmentation of Image Data from Multiple Data Sources
ASSOCIATED PROVISIONAL APPLICATION DETAILS: [33] Country: AU; [31] Applic. No(s): PQ9218; [32] Application Date: 04 Aug 2000
The following statement is a full description of this invention, including the best method of performing it known to me/us:
A METHOD FOR AUTOMATIC SEGMENTATION OF IMAGE DATA FROM MULTIPLE DATA SOURCES
Field of Invention
The present invention relates to automatic scene analysis and, in particular, to a statistical-model-based segmentation of multichannel image data.
Background
Modern imaging devices are capable of generating vast amounts of image data in the form of two-dimensional arrays of samples (known as pixels) of some measurable quantity. Examples of directly measurable image data include luminance and chrominance of reflected light (from optical cameras), range or distance from some reference point to the imaged points (from active range sensors), or density (from tomographic scanners). Moreover, many quantities can be derived from the raw image data. Such quantities may be referred to as metadata, this being data that is used to describe other data. Examples of such "metadata" quantities include range (from passive, optical range sensors) and motion (from multiple images of dynamic scenes).
The sheer volume of image data necessitates some kind of automatic analysis of content in most applications. An important step in analysing the content of an imaged scene is to partition the image into disjoint segments corresponding to semantically meaningful objects. Because human expectation is that real world objects are in some sense compact and coherent, each segment of the partitioned image consists of a region of adjacent pixels over which some property of the data (image data, metadata, or both) is uniform. Many approaches to this task of segmentation have been tried. One that has met with some success is region merging. In this paradigm, each pixel is initially labelled as its own unique region. Adjacent regions are then compared using some similarity criterion and merged if they are sufficiently similar. In this way small regions take shape and are gradually built into larger ones. It may be shown that region merging is a practical approximate solution to a variational formulation of the image segmentation problem. In this formulation, the "best" segmentation is expressed as the global minimum of some cost functional defined over the space of all possible segmentations of an image. An advantage of region merging methods (as compared with, for example, edge-based methods) is the adaptability of region merging to handle multichannel image data, i.e. data which is vector-valued at each pixel. For example, in colour images the vector components might be the red, green, and blue intensities. This facility makes region merging techniques suitable for fusing multiple sources of data and metadata to produce a single segmentation. In this way range and motion information may be integrated with colour to provide an analysis that colour data alone cannot. This is of particular interest when the images are of complex, dynamic scenes. An example of such is disclosed in the paper "Region-based Representation of Image and Video: Segmentation Tools for Multimedia Services", P. Salembier, F. Marques; IEEE Transactions on Circuits and Systems for Video Technology, Vol. 9, No. 8, December 1999, pages 1147-1169.
Traditional region merging has dealt with the definition of segmentation functionals and/or similarity criteria. Most successful cost functionals have two components: a model fitting cost and a model complexity cost. The model fitting cost encourages a proliferation of regions, while the complexity cost encourages few regions. The functional must therefore balance the two components to achieve a reasonable result.
The most soundly based model fitting costs use statistically valid definitions such as residuals. This provides optimal handling of data or metadata which is subject to spatially varying uncertainty. This situation often arises from metadata such as range obtained by passive optical means, when the certainty of the range estimate depends strongly on the underlying image texture.
Traditional statistical region merging has assumed all channels have independent, identically distributed uncertainties. Instances where the uncertainties of each channel are unequal and/or correlated between channels have not been addressed. However this will be the case when fusing pixel data and derived metadata such as range. A similar situation also occurs when segmenting on estimated motion vector images, in which the uncertainties not only vary over the image, but are correlated between horizontal and vertical components.
Another difficulty with automatic segmentation by region merging is deciding when to halt the merging process. Some implementations have required a predetermined "schedule" of thresholds to govern the merging process and converge to the segmentation which minimises the cost functional. Others have removed the need for a schedule, but still require an arbitrary threshold. This threshold is related to the weighting of fitting error and model complexity in the final cost functional. The use of a predetermined arbitrary threshold means the segmentation algorithm is unable to adapt to different types of image without substantial operator effort.
Summary of the Invention
It is an object of the present invention to substantially overcome or at least ameliorate one or more problems associated with existing arrangements.
The problems mentioned above may be addressed by explicitly formulating the cost functional to incorporate different and correlated uncertainties between channels, and providing a flexible, meaningful automatic halting criterion for the merging process.
In accordance with one aspect of the present invention there is disclosed a method for segmenting an image formed by a plurality of pixels using a region-merging process characterised by using covariance data and a plurality of vector components of each said pixel to evaluate a merging criterion for regions of said image.
In accordance with another aspect of the present invention there is disclosed a method for segmenting an image formed by a plurality of pixels, each said pixel being described by a vector having components each relating to a different measured image characteristic, said method comprising the steps of: receiving, for each said pixel, a plurality of said vector components and a corresponding error covariance representation of said pixel; for each said pixel, fitting each said component and the corresponding covariance representation to a predetermined linear model to obtain a set of model parameters and corresponding confidence representations; statistically analysing the sets of model parameters and corresponding confidence representations to derive a segmentation of said image that minimises a predetermined cost function.
In accordance with another aspect of the present invention there is disclosed a method for unsupervised selection of a stopping point for a region-merging segmentation process, said method comprising the steps of: analysing a graph of merging cost values to identify departures from substantial monotonicity of said graph; and selecting said stopping point to be a merging cost value corresponding to a return to monotonicity of said graph, said selected stopping point being associated with one of a limited plurality of final said departures in said region merging process.
Other aspects of the present invention are also disclosed.
Brief Description of the Drawings
One or more preferred embodiments of the present invention will now be described with reference to the drawings in which:
Fig. 1 is a block diagram showing the structure of the preferred implementation;
Fig. 2A is a plot of the value of the test statistic as the algorithm proceeds in a typical case;
Fig. 2B is a plot similar to Fig. 2A but simplified and shown over an entire segmentation;
Fig. 3 is a flow chart representing processing steps used in the preferred implementation to determine when to cease the segmentation; and
Fig. 4 is a schematic block diagram representation of a computer system in which the preferred implementation may be implemented.
Detailed Description
Some portions of the description which follows are explicitly or implicitly presented in terms of algorithms and symbolic representations of operations on data within a computer memory. These algorithmic descriptions and representations are the means used by those skilled in the data processing arts to most effectively convey the substance of their work to others skilled in the art. An algorithm is here, and generally, conceived to be a self-consistent sequence of steps leading to a desired result. The steps are those requiring physical manipulations of physical quantities. Usually, though not necessarily, these quantities take the form of electrical or magnetic signals capable of being stored, transferred, combined, compared, and otherwise manipulated. It has proven convenient at times, principally for reasons of common usage, to refer to these signals as bits, values, elements, symbols, characters, terms, numbers, or the like.
It should be borne in mind, however, that the above and similar terms are to be associated with the appropriate physical quantities and are merely convenient labels applied to these quantities. Unless specifically stated otherwise, and as apparent from the following, it will be appreciated that throughout the present specification, discussions utilizing terms such as "scanning", "calculating", "determining", "replacing", "generating", "initializing", "outputting", or the like, refer to the action and processes of a computer system, or similar electronic device, that manipulates and transforms data represented as physical (electronic) quantities within the registers and memories of the computer system into other data similarly represented as physical quantities within the computer system memories or registers or other such information storage, transmission or display devices.
The present specification also discloses apparatus for performing the operations of the methods. Such apparatus may be specially constructed for the required purposes, or may comprise a general-purpose computer or other device selectively activated or reconfigured by a computer program stored in the computer. The algorithms and displays presented herein are not inherently related to any particular computer or other apparatus. Various general-purpose machines may be used with programs in accordance with the teachings herein. Alternatively, the construction of more specialized apparatus to perform the required method steps may be appropriate. The structure of a conventional general-purpose computer will appear from the description below.
In addition, the present specification also discloses a computer readable medium comprising a computer program for performing the operations of the methods. The computer readable medium is taken herein to include any transmission medium for communicating the computer program between a source and a destination. The transmission medium may include storage devices such as magnetic or optical disks, memory chips, or other storage devices suitable for interfacing with a general purpose computer. The transmission medium may also include a hard-wired medium such as exemplified in the Internet system, or wireless medium such as exemplified in the GSM mobile telephone system. The computer program is not intended to be limited to any particular programming language and implementation thereof. It will be appreciated that a variety of programming languages and coding thereof may be used to implement the teachings of the arrangements described herein.
With reference to Fig. 1, the preferred implementation comprises a processing algorithm 100 which may be implemented in a programmable device such as a digital computer. One set of inputs 102 to the algorithm are components f(x) of the vector of measurements at each pixel x. These components may each be derived from the same source (for example, the colour channels of an RGB image sensor), or from different sources (for example a range map produced by a passive optical range sensor, along with the corresponding intensity image). The number of components is referred to throughout as m, and the column-vector of measurements at each pixel is written as f(x).
The pixel lattice is a regular two-dimensional (2D) grid, with each interior pixel having four neighbours, those being pixels directly above, below, and to the left and right of the pixel in question. Pixels at the periphery of the lattice each have two or three neighbours. Separating each pair of neighbours or adjacent pixels is a boundary element or edgel.
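The lattice and its edgels can be illustrated with a short sketch. The function names `neighbours` and `edgels` and the (row, column) indexing are assumptions made for illustration only; the specification does not prescribe any particular data layout.

```python
def neighbours(r, c, height, width):
    """Return the 4-connected neighbours of pixel (r, c) that lie inside the lattice."""
    candidates = [(r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)]
    return [(y, x) for y, x in candidates if 0 <= y < height and 0 <= x < width]


def edgels(height, width):
    """Yield each edgel once, as the pair of adjacent pixels it separates."""
    for r in range(height):
        for c in range(width):
            if c + 1 < width:
                yield (r, c), (r, c + 1)   # edgel between horizontal neighbours
            if r + 1 < height:
                yield (r, c), (r + 1, c)   # edgel between vertical neighbours


# Hypothetical 3 x 4 lattice: corner, border and interior pixels
corner = neighbours(0, 0, 3, 4)
border = neighbours(0, 1, 3, 4)
interior = neighbours(1, 1, 3, 4)
all_edgels = list(edgels(3, 4))
```

On a lattice of height H and width W this enumerates H(W-1) + (H-1)W edgels, which is the domain of the edge map introduced later in the description.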
An assumption underlying the segmentation problem is that each measurement f(x) is associated with a particular state. The form of the state (or the model) must be decided in advance, but (unknown) parameters of the state are contained in a state vector of length n, containing model parameters. Each state is assumed to be valid over a connected region of pixels. A connected region is one in which each pixel in the region can be reached from every other pixel via a neighbourhood relation. Such a requirement forbids disjoint regions. The aim of segmentation is to identify these regions and the prevailing state (i.e. the model parameters) for each region. Together, these quantities specify a model image representing a desired output 104 of the processing algorithm 100.
The neighbourhood or adjacency rule for pixels, known per se in the art, extends to the regions. That is, a region is said to be adjacent to another region if some pixel in the first region is a neighbour of any pixel in the second region.
Let us index the regions by the integer j, and denote each (connected) region of pixels as $\Omega_j$, over which the state is specified by the n-vector $a_j$ of model parameters. The size of $\Omega_j$ shall be denoted by $n_j$. For the purposes of the preferred implementation, the model image over each region is assumed to be a linear projection of the model parameters for that region:
$$g(x) = A(x)\,a_j, \quad x \in \Omega_j \tag{1}$$
where A(x) is a known m by n matrix which encapsulates the nature of the model.
Each actual measurement is subject to a random error e(x) such that
$$f(x) = g(x) + e(x) \tag{2}$$
The error may be assumed to be drawn from a zero-mean normal (Gaussian) distribution with covariance $\Lambda(x)$:
$$e(x) \sim N(0, \Lambda(x)) \tag{3}$$
The m by m covariance matrix $\Lambda(x)$ at each pixel is an additional input to the algorithm (see Fig. 1). In traditional arrangements, it has been assumed that each component of e(x) is independently and identically distributed, i.e.:
$$\Lambda(x) = \sigma^2(x)\,I_m \tag{4}$$
However, the preferred implementation generalises this to encompass disparate and possibly mutually dependent measurement error components.
Variational Formulation of the Segmentation Problem.
Variational segmentation requires that a cost function E be assigned to each possible segmentation. A model-based segmentation of an image is completely described by the model image which is defined by the list of regions and the model parameters prevailing over each region. A partition into regions may be compactly described by a binary function K(d) on the edgels, in which the value one is assigned to each edgel d bordering a region. This function is referred to as an edge map. It should be noted that because of the requirement of region connectedness, not every edge map defines a valid segmentation.
The preferred implementation defines a cost functional in a traditional fashion in which the model fitting error is balanced with the overall complexity of the model. The sum of the statistical residuals of each region is used as the model fitting error.
Combining Equations (1), (2) and (3), the residual over region j as a function of the model parameters $a_j$ is given by
$$E_j(a_j) = \sum_{x \in \Omega_j} [f(x) - A(x)a_j]^T \Lambda^{-1}(x) [f(x) - A(x)a_j] \tag{5}$$
where $\Lambda(x)$ is the error covariance of Equation (3). The model complexity is simply the number of region-bounding edgels. Hence the overall cost functional may be defined as
$$E(g, K, \lambda) = \sum_j E_j(a_j) + \lambda \sum_d K(d) \tag{6}$$
where the (non-negative) parameter $\lambda$ controls the relative importance of model fitting error and model complexity. The aim of variational segmentation is to find the minimising arguments g and K of E, for a given $\lambda$ value.
Note that if the region boundaries are given as a valid edge map K, the minimising model parameters $a_j$ over each region j may be found by minimising $E_j$. This may be evaluated using a simple weighted linear least squares calculation. Given this fact, any valid edge map K will fully and uniquely describe a segmentation. Therefore, E may be regarded as a function over the space of valid edge maps (K-space), whose minimisation yields the optimal region partition. The corresponding model parameters may then be assumed to be those which minimise the residuals $E_j$ over each region. The corresponding minimum residuals will hereafter be written as $E_j$.
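The weighted linear least squares calculation mentioned above can be sketched as follows. This is an illustrative sketch using NumPy, not the implementation of the specification: the function name `fit_region` and the argument layout (one model matrix, measurement vector and covariance matrix per pixel of the region) are assumptions. It accumulates the normal equations pixel by pixel, which is equivalent to stacking the per-pixel quantities.

```python
import numpy as np


def fit_region(A_list, f_list, cov_list):
    """Weighted least squares fit over one region.

    A_list:   per-pixel model matrices A(x), each m by n
    f_list:   per-pixel measurement vectors f(x), each of length m
    cov_list: per-pixel error covariances, each m by m
    Returns (a, K, E): minimising parameters, their confidence
    (inverse covariance) and the minimum residual.
    """
    n = A_list[0].shape[1]
    K = np.zeros((n, n))   # accumulates sum of A^T W A
    b = np.zeros(n)        # accumulates sum of A^T W f
    c = 0.0                # accumulates sum of f^T W f
    for A, f, cov in zip(A_list, f_list, cov_list):
        W = np.linalg.inv(cov)          # per-pixel weight = inverse covariance
        K += A.T @ W @ A
        b += A.T @ W @ f
        c += f @ W @ f
    a = np.linalg.solve(K, b)           # minimising model parameters
    E = c - b @ a                       # residual at the minimum
    return a, K, E


# Hypothetical two-pixel region, two channels, zero-order model (A = identity)
a, K, E = fit_region(
    [np.eye(2), np.eye(2)],
    [np.array([1.0, 2.0]), np.array([3.0, 4.0])],
    [np.eye(2), np.eye(2)],
)
```

With identity covariances the fit reduces to the channel-wise mean, and the residual is the sum of squared deviations from that mean.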
The parameter $\lambda$ is clearly critical to the appearance of the result. At one extreme, the global minimiser K(0), where the model complexity is completely discounted, is the most trivial segmentation, in which every pixel constitutes its own region, and which gives zero model fitting error. On the other hand, the global minimiser $K(\infty)$, where the model fitting error is completely discounted, is the null or empty segmentation in which the entire image is represented by a single region. Somewhere between these two extremes lies the segmentation which will appear ideal in that the regions correspond to a semantically meaningful partition. However, there is no prima facie method of choosing the corresponding $\lambda$, and even if $\lambda$ were given, the minimisation task remains extremely difficult. In this regard, a brute force method of evaluating E for every possible valid segmentation map K is computationally infeasible for images of size greater than a few pixels.
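The trade-off governed by the weighting parameter can be made concrete with a deliberately simplified sketch. It assumes a scalar image, unit error variance and a zero-order model (so that the best-fitting parameter of each region is its mean); the function name `total_cost` and the label-image representation are hypothetical, and the sketch does not check that each label forms a connected region.

```python
def total_cost(image, labels, lam):
    """Cost of a labelled segmentation: fitting error + lam * boundary edgels.

    image, labels: 2D lists of equal shape (pixel values and region labels).
    """
    h, w = len(image), len(image[0])
    # Model fitting error: squared deviation from each region's mean
    groups = {}
    for r in range(h):
        for c in range(w):
            groups.setdefault(labels[r][c], []).append(image[r][c])
    fit = 0.0
    for values in groups.values():
        mean = sum(values) / len(values)
        fit += sum((v - mean) ** 2 for v in values)
    # Model complexity: number of edgels separating different regions
    boundary = 0
    for r in range(h):
        for c in range(w):
            if c + 1 < w and labels[r][c] != labels[r][c + 1]:
                boundary += 1
            if r + 1 < h and labels[r][c] != labels[r + 1][c]:
                boundary += 1
    return fit + lam * boundary


image = [[0, 0, 10, 10]]
trivial = [[0, 1, 2, 3]]      # every pixel its own region
two_regions = [[0, 0, 1, 1]]  # the intuitively "ideal" partition
null = [[0, 0, 0, 0]]         # whole image as one region
costs = [total_cost(image, seg, 1.0) for seg in (trivial, two_regions, null)]
```

For this toy image and a weighting of 1.0, the two-region partition has the lowest cost of the three candidates, while at a weighting of 0 the trivial segmentation already reaches zero cost.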
Approximate Solution by Region Merging
To find an approximate solution to the variational segmentation problem, the region-merging strategy has been employed. This strategy employs the concept of a 2-normal segmentation. A 2-normal segmentation is defined as one in which the cost functional increases after any 2 neighbouring regions are merged. Based on the idea that nearby valid edge maps in K-space share almost all of their region boundaries, a 2-normal segmentation is clearly a local minimum for the cost functional. Therefore an algorithm which finds a 2-normal segmentation for a given $\lambda$ is a good approximate solution to the variational formulation for that value of $\lambda$.
A second observation is that any 2-normal segmentation $K_0$ for a given $\lambda_0$ is a superset of a 2-normal segmentation $K_1$ for any $\lambda_1 > \lambda_0$, i.e. $K_0$ contains all the boundaries that $K_1$ does (as well as some others). In other words, a 2-normal segmentation for any $\lambda$ may be derived from any 2-normal segmentation for any smaller $\lambda$ just by merging adjacent regions. Knowing that the trivial segmentation is the global minimiser for the smallest possible $\lambda$ value of 0, from these two observations, an approximate solution to the variational formulation for any given $\lambda$ may be determined according to the following steps:
1. Set k = 0 and set $K_k = K(0)$ (the trivial segmentation).
2. Increment k and set $K_k = K_{k-1}$.
3. Form a trial segmentation $K_k(i,j)$ by merging any adjacent pair of regions i and j within the segmentation $K_k$.
4. Compare the cost functional $E(K_k(i,j), \lambda_k)$ with $E(K_k, \lambda_k)$. If it is less, allow the merge by setting $K_k = K_k(i,j)$, and $E(K_k, \lambda_k) = E(K_k(i,j), \lambda_k)$.
5. Repeat steps 3 and 4 until no further merging is allowed. (A 2-normal segmentation has thus been achieved).
6. If $\lambda_k = \lambda$, halt; otherwise go to step 2.
The above algorithm requires a monotonically increasing $\lambda$-schedule $0 = \lambda_0 < \lambda_1 < \dots < \lambda_{k_{max}} = \lambda$. The minimising (2-normal) segmentation at each point in the schedule is used as the "starting guess" for the minimising segmentation at the succeeding point. Regions are grown by pairwise merging from the original, pixel level according to whether a given merge decreases the cost functional. If the schedule is gradual enough, the final segmentation $K_k$ should be close to the global minimiser $K(\lambda)$ of E for the given, final $\lambda$ value.
Efficient Region Merging
Step 4 of the algorithm above requires the comparison of two cost functionals. It is desirable for this step to be carried out as efficiently as possible. The two segmentations differ in only two regions, so the test computation may be confined to those regions. By examining Equations (5) and (6), a test statistic for the adjacent region pair (i, j) may be written as
$$t_{ij} = \frac{E_{ij} - (E_i + E_j)}{l(\delta_{ij})} \tag{7}$$
where $l(\delta_{ij})$ is the length of the common boundary between regions i and j. If $t_{ij}$ (the merging cost) is less than $\lambda_k$, the merge is allowed. The key to efficient region merging is to compute the numerator of $t_{ij}$ as fast as possible. First, let us rewrite Equation (5) as:
$$E_j(a_j) = (F_j - H_j a_j)^T \kappa_j (F_j - H_j a_j) \tag{8}$$
where:
$H_j$ is an ($n_j m$) by n matrix composed of the individual A(x) matrices stacked on top of one another as x varies over region j;
$F_j$ is a column vector of length ($n_j m$) composed of the individual f(x) vectors stacked on top of one another;
$\kappa_j$ is an ($n_j m$) by ($n_j m$) block diagonal matrix, where each m by m diagonal block is the inverse of the covariance matrix $\Lambda(x)$ at the pixel denoted by the corresponding rows in $F_j$.
By weighted least squares theory, the minimising model parameter vector $a_j$ is given by
$$a_j = K_j^{-1} H_j^T \kappa_j F_j \tag{9}$$
where $K_j$ is the confidence in the parameter estimate, defined as the inverse of its covariance:
$$K_j = H_j^T \kappa_j H_j \tag{10}$$
The corresponding residual is given by
$$E_j = F_j^T \kappa_j F_j - F_j^T \kappa_j H_j K_j^{-1} H_j^T \kappa_j F_j \tag{11}$$
When merging two regions i and j, the "merged" matrix $H_{ij}$ is obtained by concatenating $H_i$ with $H_j$; likewise for $F_{ij}$ and $\kappa_{ij}$. These facts may be used to show that the best fitting model parameter vector for the merged region is given by:
$$a_{ij} = a_i - K_{ij}^{-1} K_j (a_i - a_j) \tag{12}$$
where the merged confidence is
$$K_{ij} = K_i + K_j \tag{13}$$
and the merged residual is given by
$$E_{ij} = E_i + E_j + (a_i - a_j)^T K_i K_{ij}^{-1} K_j (a_i - a_j) \tag{14}$$
Combining Equations (13) and (14), the test statistic $t_{ij}$ in Equation (7) may be computed as:
$$t_{ij} = \frac{(a_i - a_j)^T K_i (K_i + K_j)^{-1} K_j (a_i - a_j)}{l(\delta_{ij})} \tag{15}$$
from the model parameters and confidences of the regions to be merged. The matrix to be inverted is always of size n by n (i.e. it does not increase with region size). If the merge is allowed, Equations (12) and (13) give the model parameters and confidences of the merged region.
Note that under this strategy, only Equations (12), (13) and (15) need to be applied throughout the merging process. Only the model parameters and their confidences for each region are therefore required as segmentation proceeds. Further, neither the original measurements f(x) nor the model structure itself (i.e. the matrices A(x)) are required.
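The merging cost of Equation (15) and the merged parameters and confidence of Equations (12) and (13) might be computed along the following lines. This is a sketch using NumPy; the function names `merge_cost` and `merge` and the use of `numpy.linalg.solve` in place of an explicit inverse are illustrative choices, not taken from the specification.

```python
import numpy as np


def merge_cost(a_i, K_i, a_j, K_j, boundary_length):
    """Test statistic for merging regions i and j (Equation 15).

    a_i, a_j: parameter vectors of length n
    K_i, K_j: n by n confidence matrices
    boundary_length: length of the common boundary between the regions
    """
    d = a_i - a_j
    K_ij = K_i + K_j
    # (a_i - a_j)^T K_i (K_i + K_j)^{-1} K_j (a_i - a_j), via an n by n solve
    increase = d @ K_i @ np.linalg.solve(K_ij, K_j @ d)
    return float(increase) / boundary_length


def merge(a_i, K_i, a_j, K_j):
    """Parameters and confidence of the merged region (Equations 12 and 13)."""
    K_ij = K_i + K_j
    a_ij = a_i - np.linalg.solve(K_ij, K_j @ (a_i - a_j))
    return a_ij, K_ij


# Hypothetical single-pixel regions with two channels and identity confidences
a_i, K_i = np.array([1.0, 2.0]), np.eye(2)
a_j, K_j = np.array([3.0, 4.0]), np.eye(2)
t = merge_cost(a_i, K_i, a_j, K_j, boundary_length=1.0)
a_ij, K_ij = merge(a_i, K_i, a_j, K_j)
```

Neither function touches the original measurements or the model matrices, which reflects the point made above: only the per-region parameters and confidences are carried through the merging process.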
Statistical linear-model-based segmentation may thus be separated into two stages as seen in Fig. 1, those stages being an initial model fitting stage 106 where parameters a(x) and confidences K(x) are found for the data at each pixel, followed by a region merging stage 108. In the case of a zero-order model the initial model fitting stage is trivial:
$$a(x) = f(x) \tag{16}$$
$$K(x) = \Lambda^{-1}(x) \tag{17}$$
In the case of higher-order models, model parameters and confidences at each pixel may be obtained in any manner desired. In the preferred implementation, they are estimated over a small window of pixels surrounding the pixel in question. The window size w must be sufficiently large, i.e.
$$w^2 m \geq n \tag{18}$$
to prevent under-determination of the model-fitting Equations (9) and (10). In the case where not all the window pixels actually belong to the same state, an estimation technique robust to "outliers" should be used. Robust estimation is a statistical technique known to those skilled in the art.
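For the zero-order case, the initial model fitting stage can be sketched in a few lines of NumPy. The array layout (an H by W by m measurement array and an H by W by m by m covariance array) and the function name `init_zero_order` are assumptions for illustration; the windowed, robust estimation used for higher-order models is not shown.

```python
import numpy as np


def init_zero_order(f, cov):
    """Initial per-pixel parameters and confidences for a zero-order model.

    f:   array of shape (H, W, m), the measurement vector at each pixel
    cov: array of shape (H, W, m, m), the error covariance at each pixel
    Returns (a, K) with a = f and K = inverse covariance at each pixel.
    """
    a = f.copy()
    K = np.linalg.inv(cov)   # inverts every trailing m-by-m block independently
    return a, K


# Hypothetical 2 x 3 image with two channels of unequal uncertainty
f = np.zeros((2, 3, 2))
cov = np.tile(np.diag([4.0, 0.25]), (2, 3, 1, 1))
a, K = init_zero_order(f, cov)
```

A channel with a large variance (for example a range estimate over weak texture) receives a correspondingly small confidence, so it contributes less to the merging costs computed later.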
Removing the Need for a $\lambda$-Schedule.
Recall that the variational algorithm stated above in steps 1 to 6 requires a monotonically increasing $\lambda$-schedule $\lambda_k$. At each point in the schedule, the algorithm searches at random over adjacent pairs and merges all those pairs whose test statistic is less than the schedule value. Only after no merges are possible may the algorithm advance to the next point in the schedule.
The need for a schedule may be removed by slightly reformulating the algorithm. Suppose at the initialisation stage all adjacent region pairs are determined and their corresponding test statistics evaluated. It is then possible to sort all pairs into a list in ascending order of the test statistic. Region merging then involves popping a pair off the top of the list (i.e. the pair with the lowest merging cost), merging this pair, deleting all the pairs containing either of the merged regions, evaluating a new test statistic for each pair containing the newly merged region, and re-inserting these into the list at the appropriate point(s).
This modified region merging algorithm effectively provides a $\lambda$ value at each merge operation, namely the test statistic $t_{ij}$ of the pair being merged. It is thus possible to build up a sequence of $t_{ij}$ values as the algorithm proceeds, using only the measurement data. The algorithm halts if this value exceeds a predetermined threshold $\lambda_{stop}$ at any time.
This new version of the algorithm may be shown to have complexity O(N log N) where N is the number of pixels in the image (assuming that the number of neighbouring regions remains small relative to N), provided the sorting and insertion can be done in "log time". This can be guaranteed if the list structure is maintained in computer memory in a structure called a heap or priority queue.
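A heap-driven merging loop of this kind might look like the following sketch, which uses Python's `heapq` module. Several simplifications are assumed and are not part of the specification: the model is scalar (n = 1), so parameters and confidences are plain numbers; region identifiers are integers; and stale heap entries are skipped when popped (lazy invalidation) rather than being deleted from the heap explicitly. The names `cost` and `merge_all` are hypothetical.

```python
import heapq


def cost(region_i, region_j, length):
    """Scalar form of the merging cost: (a_i - a_j)^2 K_i K_j / (K_i + K_j) / length."""
    (a_i, k_i), (a_j, k_j) = region_i, region_j
    return (a_i - a_j) ** 2 * k_i * k_j / (k_i + k_j) / length


def merge_all(regions, adjacency):
    """Merge down to a single region, always taking the cheapest pair first.

    regions:   {integer id: (a, K)} scalar parameter and confidence per region
    adjacency: {id: {neighbour id: common boundary length}}, symmetric
    Returns the merging costs in the order in which the merges happened.
    """
    regions = dict(regions)
    adjacency = {i: dict(nb) for i, nb in adjacency.items()}
    heap = [(cost(regions[i], regions[j], l), i, j)
            for i, nb in adjacency.items() for j, l in nb.items() if i < j]
    heapq.heapify(heap)
    next_id = max(regions) + 1
    history = []
    while heap:
        t, i, j = heapq.heappop(heap)
        if i not in regions or j not in regions:
            continue                       # stale pair: a member was already merged
        (a_i, k_i), (a_j, k_j) = regions.pop(i), regions.pop(j)
        k_new = k_i + k_j                  # merged confidence
        a_new = a_i - k_j * (a_i - a_j) / k_new   # merged parameter
        new, next_id = next_id, next_id + 1
        nb_new = {}
        for old in (i, j):                 # rewire neighbours to the new region
            for n, l in adjacency.pop(old).items():
                if n in (i, j):
                    continue
                nb_new[n] = nb_new.get(n, 0) + l  # shared boundaries add up
                del adjacency[n][old]
        regions[new] = (a_new, k_new)
        adjacency[new] = nb_new
        for n, l in nb_new.items():
            adjacency[n][new] = l
            heapq.heappush(heap, (cost(regions[new], regions[n], l), new, n))
        history.append(t)
    return history


# Hypothetical 1 x 4 strip of pixels with unit confidences
regions = {0: (1.0, 1.0), 1: (1.1, 1.0), 2: (5.0, 1.0), 3: (5.2, 1.0)}
adjacency = {0: {1: 1}, 1: {0: 1, 2: 1}, 2: {1: 1, 3: 1}, 3: {2: 1}}
history = merge_all(regions, adjacency)
```

Because a merged region always receives a fresh identifier, any heap entry that mentions a retired identifier is known to be out of date, which is what makes the lazy check sufficient here. The returned sequence of costs is the kind of record that a stopping rule can subsequently be applied to.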
One problem, however, remains: the value of $\lambda_{stop}$ must be decided in advance. Otherwise the algorithm will continue to merge until the null segmentation is reached.
A suitable value of $\lambda_{stop}$ may be obtained by empirical means. In such an approach, an image or set of images deemed to be typical of the kind likely to be encountered by the algorithm are chosen as the training set. Different values of $\lambda_{stop}$ are trialed on the whole set until finally a value is obtained which produces segmentations in all or most of the training examples which correspond to those a human segmenter would achieve. There are two main disadvantages of this method. The first is the expense of training, which must be repeated every time a new data set is encountered. The second is the lack of flexibility since the results of applying the predetermined $\lambda_{stop}$ to an image which does not resemble the training set are unpredictable.
Automatic Determination of $\lambda_{stop}$.
In the preferred implementation, the value of $\lambda_{stop}$ is determined automatically from each image it is applied to, without the need for training, according to the processing method 300 shown in Fig. 3, which implements a preferred form of the region merging 108 of Fig. 1. This means the value of $\lambda_{stop}$ varies from image to image in a manner determined by the data itself. This approach is more flexible than the use of a single, fixed $\lambda_{stop}$ for all images. It is most useful for the class of images in which a small number of distinct but not necessarily homogeneous foreground objects are ranged against a cluttered (i.e. non-homogeneous) background.
To see how $\lambda_{stop}$ is determined, note first that as merging proceeds, the merging cost $t_{ij}$ of the regions being merged generally increases. This increase however is not purely monotonic. In fact, the overall rise in $t_{ij}$ is punctuated by departures from monotonicity, which herein are termed local minima. A local minimum represents the collapse of what might be termed a self-supporting group of adjacent regions. Such occurs if one boundary within the group is removed, and the merging costs for adjacent boundaries then suddenly reduce. In effect, the hypothesis that the regions of the group are part of the same object is confirmed as more regions merge and so $t_{ij}$ decreases. The result is that all the boundaries in the group are removed in quick succession. These self-supporting groups tend to represent the internal structure of objects and background clutter. A measure of merit such as the number of boundaries removed or their total length or the maximum (absolute or relative) decrease in $t_{ij}$ may be assigned to each local minimum.
The point immediately after a local minimum, being a return to substantial monotonicity, is termed herein a stable configuration. Visually, a stable configuration represents a point in the segmentation process at which an area of internal object structure or background clutter has just been removed, and is thus a good potential halting point. Each stable configuration has an associated value of $t_{ij}$. Fig. 2A shows a plot of $t_{ij}$ during part of a segmentation of a real image, showing local minima and stable configurations.
If a complete pass is made through the segmentation, in which all regions are merged until only one (the whole image) remains, all local minima and stable configurations for the image may be found automatically by analysing the values of $t_{ij}$. Significant local minima, being those whose measure of merit exceeds a certain threshold, are flagged. The final segmentation stopping value is chosen to be the last such stable configuration. An example of this is seen in Fig. 2B, where an artificial plot of $t_{ij}$ over time is shown for an entire region merging process. As can be seen, during the early stages of region merging, local minima are common, giving the plot an erratic behaviour. As the regions become more established and substantial, the local minima frequency reduces until the null segmentation is reached (i.e. the image forms a single region). Those segmentations approaching the null will however be useful since the number of regions will be manageable computationally and most likely will be visually perceptible (e.g. a person distinguished from background, or the major body parts (head, torso, arms, legs) of a person distinguished from background). As indicated above, a stable configuration is a desirable location to cease region merging, and Fig. 2B illustrates the identification of a limited number of candidate stopping locations $\lambda_{stop\_1}$, $\lambda_{stop\_2}$, $\lambda_{stop\_3}$ at stable configurations near the null segmentation. The last stable configuration is typically chosen as the $\lambda_{stop}$, although any of the limited number of candidate stopping locations may be selected depending on the particular image and/or application being processed. Further, where the image has a large number of local minima (e.g. hundreds, thousands or more), the limited number of candidate stopping positions may be significant (e.g. in the "tens").
At this point, given the underlying assumptions about the image, unwanted internal object structure and background clutter can thus be removed. To achieve this stable configuration (whose t_ij value is deemed to be λ_stop), the processing method 300 need only reverse its last few merging operations by restoring the algorithm state appropriately. Alternatively, the merging process may be run again from the start, halting when the value of t_ij reaches λ_stop.
The complete method (using the latter, more expensive alternative) is set out as a flow chart in Fig. 3. The method 300 starts at step 302, and step 304 which follows receives the vector-data sets a(x) and confidences K(x) for each pixel in the image. Step 306 then computes the test statistic t_ij according to Equation 15 for the pixels. Step 308 inserts the test statistics into a heap T in priority order. Steps 310 to 324 are iterated in a loop to group the pixels into regions. Step 310 finds the first entry T(1) in the heap T and merges the corresponding region pair. Step 312 records the test statistic value t_ij in the list L. Step 314 identifies all adjoining regions and step 316 acts to delete the test statistic values corresponding to all the adjoining regions from the heap T. Step 318 follows and creates a new test statistic for each adjoining region. Step 320 then inserts the new t_ij into the heap T. Step 324 follows and seeks to detect the null segmentation. If the null segmentation has not been reached, control returns to step 310 and steps 310-320 are performed again on the regions.
When all regions have been combined into the null segmentation, step 324 passes control to step 326, which identifies the t_ij value of the last stable configuration and assigns it to λ_stop. Control then returns to step 304 and the pixels are again merged to form regions. With λ_stop selected, step 320 passes control to step 322 to determine whether the merging has reached the stopping point. If so, the method 300 finishes. If not, control is returned to step 310 and further regions are merged.
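The merging loop of Fig. 3 can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the specification's implementation: each region carries a single scalar value with unit confidence, the function `cost` is a hypothetical stand-in for the test statistic of Equation 15 (a size-weighted squared difference of means divided by the common boundary length), and stale heap entries are skipped lazily using version counters rather than being deleted explicitly as in step 316. The names `merge_all`, `values` and `boundaries` are illustrative only.

```python
import heapq

def merge_all(values, boundaries):
    """Merge adjacent regions until one remains; return the recorded costs.

    values:     {region_id: scalar value}, one entry per initial region
    boundaries: {(i, j): common boundary length} for adjacent regions
    """
    mean = dict(values)
    size = {r: 1 for r in values}
    # Symmetric adjacency map: nbrs[i][j] is the boundary length of i and j.
    nbrs = {r: {} for r in values}
    for (i, j), length in boundaries.items():
        nbrs[i][j] = length
        nbrs[j][i] = length
    version = {r: 0 for r in values}  # bumped whenever a region changes

    def cost(i, j):
        # Hypothetical stand-in for the test statistic (an assumption).
        w = size[i] * size[j] / (size[i] + size[j])
        return w * (mean[i] - mean[j]) ** 2 / nbrs[i][j]

    heap = []

    def push(i, j):
        heapq.heappush(heap, (cost(i, j), i, j, version[i], version[j]))

    for i, j in boundaries:          # analogue of steps 306 and 308
        push(i, j)

    history = []                     # analogue of the list L
    while heap:
        t, i, j, vi, vj = heapq.heappop(heap)   # analogue of step 310
        # Lazy deletion: ignore entries that refer to outdated regions.
        if i not in version or j not in version:
            continue
        if version[i] != vi or version[j] != vj:
            continue
        history.append(t)            # analogue of step 312
        # Merge region j into region i.
        total = size[i] + size[j]
        mean[i] = (size[i] * mean[i] + size[j] * mean[j]) / total
        size[i] = total
        for k, length in nbrs.pop(j).items():
            del nbrs[k][j]
            if k != i:
                nbrs[i][k] = nbrs[i].get(k, 0) + length
                nbrs[k][i] = nbrs[i][k]
        del version[j], mean[j], size[j]
        version[i] += 1
        for k in nbrs[i]:            # analogue of steps 318 and 320
            push(i, k)
    return history
```

For a connected adjacency graph of n initial regions the returned list has n - 1 entries, one per merge, and it is this sequence that the stopping-point analysis inspects.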
In order to apply the merging algorithm described above to a wide variety of data sources and models, it is necessary to choose a threshold Nsig, to thereby determine what constitutes a significant local minimum. A stable configuration must follow a significant local minimum according to this definition. In the preferred implementation, a measure of merit is used corresponding to the number of boundaries removed during a local minimum, and Nsig defaults to a small integer such as 3. Larger values of Nsig remove only larger self-supporting structures and thus leave more internal structure and background clutter intact, and so Nsig can be passed as an algorithm parameter to control the depth of segmentation.
Thus in the preferred implementation there is a semantically meaningful halting criterion which, while based on the properties of the image itself, is insensitive to the actual measurement values.
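One way to picture the halting criterion is the following sketch, which scans the recorded sequence of test statistic values for returns to monotonicity. It rests on assumptions that are not confirmed by the text: a departure is taken to begin when a value falls below the running maximum, a return occurs when a value reaches or exceeds that maximum again, and the merit of a local minimum is the number of merges performed while below the running maximum. The function name `stopping_candidates` and the parameter `n_sig` are hypothetical; the definition of a local minimum given in the specification should govern any real implementation.

```python
def stopping_candidates(history, n_sig=3):
    """Return (index, t) pairs at which the cost curve returns to
    monotonicity after a dip lasting at least n_sig merges."""
    candidates = []
    running_max = float("-inf")
    dip_length = 0
    for idx, t in enumerate(history):
        if t < running_max:
            dip_length += 1          # still inside a departure
        else:
            if dip_length >= n_sig:  # significant local minimum just ended
                candidates.append((idx, t))
            running_max = t
            dip_length = 0
    return candidates

# Illustrative values only; the stopping value is taken from the last candidate.
history = [1, 2, 3, 1, 1, 1, 4, 5, 2, 6]
candidates = stopping_candidates(history)
stop_value = candidates[-1][1] if candidates else None
```

With a smaller `n_sig` more dips qualify as significant, so more candidate stopping locations are reported; the single-merge dip near the end of the illustrative sequence is only picked up when `n_sig` is 1.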
Extension to Three-dimensional (Volume) Data
An additional benefit of the described processes for segmentation of image data is their ready extension to data defined on higher dimensional spaces. For example, medical images are often recorded as volume data, ie. a three dimensional array of some scalar (or vector) quantity. All that is required to extend the processing algorithm 100 to handle volume data is a definition of the neighbourhood relation between adjacent voxels (the volume equivalent of pixels). The logical extension of the pixel neighbourhood relation to three dimensions allows six neighbours for each voxel (one in each direction of each dimension). Regions may then be deemed to be bounded by surfaces whose areas are readily computable in the same fashion as two-dimensional perimeter lengths, by adding up the boundary edgels as in the two-dimensional case. The same procedures as described above may be applied without alteration to automatically segment volume data into regions of linearly modellable data.
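The six-neighbour relation and the counting of boundary faces can be illustrated with a short sketch. It assumes, purely for illustration, that a labelled volume is held as a dictionary from integer voxel coordinates to integer region identifiers and that every shared voxel face contributes one unit of area; the names `FORWARD_STEPS` and `boundary_areas` are hypothetical.

```python
from collections import Counter

# One step forward along each axis. The three backward neighbours are
# covered when the loop visits the voxel on the other side of the face,
# so each shared face is counted exactly once.
FORWARD_STEPS = ((1, 0, 0), (0, 1, 0), (0, 0, 1))

def boundary_areas(labels):
    """labels: {(x, y, z): region_id}.

    Returns a Counter mapping (smaller_id, larger_id) to the number of
    unit faces separating the two regions.
    """
    areas = Counter()
    for (x, y, z), here in labels.items():
        for dx, dy, dz in FORWARD_STEPS:
            there = labels.get((x + dx, y + dy, z + dz))
            if there is not None and there != here:
                areas[(min(here, there), max(here, there))] += 1
    return areas
```

The resulting face counts play the role that common boundary lengths play in two dimensions.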
The region-merging and processing methods described above are preferably practiced using a conventional general-purpose computer system 400, such as that shown in Fig. 4 wherein the processes of Figs. 1 to 3 may be implemented as software, such as an application program executing within the computer system 400. In particular, the steps of the region merging method are effected by instructions in the software that are carried out by the computer. The software may be divided into two separate parts; one part for carrying out the merging methods and another part to manage the user interface between the latter and the user. The software may be stored in a computer readable medium, including the storage devices described below, for example. The software is loaded into the computer from the computer readable medium, and then executed by the computer. A computer readable medium having such software or computer program recorded on it is a computer program product. The use of the computer program product in the computer preferably effects an advantageous apparatus for region merging.
The computer system 400 comprises a computer module 401, input devices such as a keyboard 402 and mouse 403, output devices including a printer 415 and a display device 414. A Modulator-Demodulator (Modem) transceiver device 416 is used by the computer module 401 for communicating to and from a communications network 420, for example connectable via a telephone line 421 or other functional medium. The modem 416 can be used to obtain access to the Internet, and other network systems, such as a Local Area Network (LAN) or a Wide Area Network (WAN).
The computer module 401 typically includes at least one processor unit 405, a memory unit 406, for example formed from semiconductor random access memory (RAM) and read only memory (ROM), input/output interfaces including a video interface 407, and an I/O interface 413 for the keyboard 402 and mouse 403 and optionally a joystick (not illustrated), and an interface 408 for the modem 416. A storage device 409 is provided and typically includes a hard disk drive 410 and a floppy disk drive 411. A magnetic tape drive (not illustrated) may also be used. A CD-ROM drive 412 is typically provided as a non-volatile source of data. The components 405 to 413 of the computer module 401 typically communicate via an interconnected bus 404 and in a manner which results in a conventional mode of operation of the computer system 400 known to those in the relevant art. Examples of computers on which the implementations can be practised include IBM-PCs and compatibles, Sun Sparcstations or alike computer systems evolved therefrom.
Typically, the application program of the preferred implementation is resident on the hard disk drive 410 and read and controlled in its execution by the processor 405.
Intermediate storage of the program and any data fetched from the network 420 may be accomplished using the semiconductor memory 406, possibly in concert with the hard disk drive 410. In some instances, the application program may be supplied to the user encoded on a CD-ROM or floppy disk and read via the corresponding drive 412 or 411, or alternatively may be read by the user from the network 420 via the modem device 416.
Still further, the software can also be loaded into the computer system 400 from other computer readable media including magnetic tape, a ROM or integrated circuit, a magneto-optical disk, a radio or infra-red transmission channel between the computer module 401 and another device, a computer readable card such as a PCMCIA card, and the Internet and Intranets including e-mail transmissions and information recorded on websites and the like. The foregoing is merely exemplary of relevant computer readable media. Other computer readable media may be practiced without departing from the scope and spirit of the invention.
The methods described may alternatively be implemented in dedicated hardware such as one or more integrated circuits performing the functions or sub-functions and for example incorporated in a digital video camera 450. Such dedicated hardware may include graphic processors, digital signal processors, or one or more microprocessors and associated memories. As seen, the camera 450 includes a display screen 452 which can be used to display the segmented image or information regarding the same. In this fashion, a user of the camera may record an image and, using the processing methods described above, create metadata that may be associated with the image to conveniently describe the image, thereby permitting the image to be used or otherwise manipulated without a specific need for a user to view the image. A connection 448 to the computer module 401 may be utilised to transfer data to and/or from the computer module 401 for performing the segmentation process.
Industrial Applicability
It is apparent from the above that the embodiment(s) of the invention are applicable to the image processing industries where images may require cataloguing according to their content.
The foregoing describes only one embodiment/some embodiments of the present invention, and modifications and/or changes can be made thereto without departing from the scope and spirit of the invention, the embodiment(s) being illustrative and not restrictive.
In the context of this specification, the word "comprising" means "including principally but not necessarily solely" or "having" or "including" and not "consisting only of". Variations of the word comprising, such as "comprise" and "comprises" have corresponding meanings.
Claims (6)
1. A method for segmenting an image formed by a plurality of pixels using a region-merging process characterised by using covariance data and a plurality of vector components of each said pixel to evaluate a merging criterion for regions of said image.
2. A method according to claim 1 wherein said plurality of vector components comprise at least two of colour, range and motion.
3. A method according to claim 2 wherein said colour vector component comprises at least one colour channel of a colour space in which said image can be reproduced.
4. A method for segmenting an image formed by a plurality of pixels, each said pixel being described by a vector having components each relating to a different measured image characteristic, said method comprising the steps of:
(a) receiving, for each said pixel, a plurality of said vector components and a corresponding error covariance representation of said pixel;
(b) for each said pixel, fitting each said component and the corresponding covariance representation to a predetermined linear model to obtain a set of model parameters and corresponding confidence representations;
(c) statistically analysing the sets of model parameters and corresponding confidence representations to derive a segmentation of said image that minimises a predetermined cost function.
5. A method according to claim 4 wherein step (c) comprises the sub-steps of:
(ca) defining said pixels to each be initial regions of said image;
(cb) merging said regions in a statistical order using said sets of model parameters and confidence representations to obtain a null segmentation of said image;
(cc) analysing a curve formed using said model parameters and corresponding confidence representations to determine an optimal halting criterion at which to cease the merging of said regions; and
(cd) processing said merging of said initial regions to halt when said optimal merging criterion is reached.
6. A method according to claim 5 wherein sub-step (cd) comprises re-executing the entire merge of said initial regions using said model parameters and confidence representations to provide said merged segmentation.
7. A method according to claim 5 wherein sub-step (cc) comprises identifying returns to monotonicity from local minima in said curve and selecting a predetermined said return approaching the null segmentation as said optimal halting criterion.
8. A method according to claim 7 wherein step (cd) comprises re-executing the merge of said regions using said model parameters up until said predetermined return is reached to provide said merged segmentation.
9. A method according to claim 5 wherein said statistical order is determined using an order of minimum covariance-normalised vector distance between adjacent regions of said segmentation.
10. A method according to claim 5 wherein said statistical order is determined using a length of a common boundary between adjacent regions.
11. A method according to claim 5 wherein said statistical order is determined by dividing a minimum covariance-normalised vector distance between adjacent regions of said segmentation by a length of a common boundary between adjacent regions, and ordering the resulting quotients.
12. A method according to claim 11 wherein each said quotient forms a test statistic, a record of which is retained at each merging step.
13. A method according to claim 4, wherein said plurality of vector components comprise at least two of colour, range and motion.
14. A method according to claim 5, wherein said colour vector component comprises at least one colour channel of a colour space in which said image can be reproduced.
15. A method for unsupervised selection of a stopping point for a region-merging segmentation process, said method comprising the steps of:
analysing a graph of merging cost values to identify departures from substantial monotonicity of said graph; and
selecting said stopping point to be a merging cost value corresponding to a return to monotonicity of said graph, said selected stopping point being associated with one of a limited plurality of final said departures in said region merging process.
16. A method according to claim 15 wherein said selected stopping point comprises a return from said final departure.
17. A method according to claim 15 wherein said departures are larger than a predetermined threshold.
18. A method according to claim 15 wherein said merging cost function comprises an ordered series of test statistics, each said test statistic being formed, for each adjacent pair of regions in the segmented image, by dividing a covariance-normalised vector distance between the pair by a length of a common boundary between the pair.
19. Apparatus for segmenting an image formed by a plurality of pixels using a region-merging process characterised by using covariance data and a plurality of vector components of each said pixel to evaluate a merging criterion for regions of said image.
20. Apparatus according to claim 19 wherein said plurality of vector components comprise at least two of colour, range and motion.
21. Apparatus according to claim 20 wherein said colour vector component comprises at least one colour channel of a colour space in which said image can be reproduced.
22. Apparatus for segmenting an image formed by a plurality of pixels, each said pixel being described by a vector having components each relating to a different measured image characteristic, said apparatus comprising:
means for receiving, for each said pixel, a plurality of said vector components and a corresponding error covariance representation of said pixel;
means for fitting, for each said pixel, each said component and the corresponding covariance representation to a predetermined linear model to obtain a set of model parameters and corresponding confidence representations; and
analysing means for statistically analysing the sets of model parameters and corresponding confidence representations to derive a segmentation of said image that minimises a predetermined cost function.
23. Apparatus according to claim 22 wherein said analysing means comprises:
defining means for defining said pixels to each be initial regions of said image;
merging means for merging said regions in a statistical order using said sets of model parameters and confidence representations to obtain a null segmentation of said image;
curve analysing means for analysing a curve formed using said model parameters and corresponding confidence representations to determine an optimal halting criterion at which to cease the merging of said regions; and
processing means for processing said merging of said initial regions to halt when said optimal merging criterion is reached.
24. Apparatus according to claim 23 wherein said processing means comprises means for re-executing the entire merge of said initial regions using said model parameters and confidence representations to provide said merged segmentation.
25. Apparatus according to claim 23 wherein said curve analysing means comprises means for identifying returns to monotonicity from local minima in said curve and means for selecting a predetermined said return approaching the null segmentation as said optimal halting criterion.
26. Apparatus according to claim 25 wherein said processing means comprises means for re-executing the merge of said regions using said model parameters up until said predetermined return is reached to provide said merged segmentation.
27. Apparatus according to claim 23 wherein said statistical order is determined using an order of minimum covariance-normalised vector distance between adjacent regions of said segmentation.
28. Apparatus according to claim 23 wherein said statistical order is determined using a length of a common boundary between adjacent regions.
29. Apparatus according to claim 23 wherein said statistical order is determined by dividing a minimum covariance-normalised vector distance between adjacent regions of said segmentation by a length of a common boundary between adjacent regions, and ordering the resulting quotients.
30. Apparatus according to claim 29 wherein each said quotient forms a test statistic, a record of which is retained at each merging.
31. Apparatus according to claim 22, wherein said plurality of vector components comprise at least two of colour, range and motion.
32. Apparatus according to claim 23, wherein said colour vector component comprises at least one colour channel of a colour space in which said image can be reproduced.
33. Apparatus for unsupervised selection of a stopping point for a region-merging segmentation process, said apparatus comprising:
means for analysing a graph of merging cost values to identify departures from substantial monotonicity of said graph; and
means for selecting said stopping point to be a merging cost value corresponding to a return to monotonicity of said graph, said selected stopping point being associated with one of a limited plurality of final said departures in said region merging process.
34. Apparatus according to claim 33 wherein said selected stopping point comprises a return from said final departure.
35. Apparatus according to claim 33 wherein said departures are larger than a predetermined threshold.
36. Apparatus according to claim 33 wherein said merging cost function comprises an ordered series of test statistics, each said test statistic being formed, for each adjacent pair of regions in the segmented image, by dividing a covariance-normalised vector distance between the pair by a length of a common boundary between the pair.
37. A program for making a computer execute a procedure to segment an image formed by a plurality of pixels using a region-merging process characterised by using covariance data and a plurality of vector components of each said pixel to evaluate a merging criterion for regions of said image.
38. A program according to claim 37 wherein said plurality of vector components comprise at least two of colour, range and motion.
39. A program according to claim 38 wherein said colour vector component comprises at least one colour channel of a colour space in which said image can be reproduced.
40. A program for making a computer execute a procedure to segment an image formed by a plurality of pixels, each said pixel being described by a vector having components each relating to a different measured image characteristic, said program comprising:
code for receiving, for each said pixel, a plurality of said vector components and a corresponding error covariance representation of said pixel;
code for, for each said pixel, fitting each said component and the corresponding covariance representation to a predetermined linear model to obtain a set of model parameters and corresponding confidence representations; and
analysing code for statistically analysing the sets of model parameters and corresponding confidence representations to derive a segmentation of said image that minimises a predetermined cost function.
41. A program according to claim 40 wherein said analysing code comprises:
code for defining said pixels to each be initial regions of said image;
code for merging said regions in a statistical order using said sets of model parameters and confidence representations to obtain a null segmentation of said image;
code for analysing a curve formed using said model parameters and corresponding confidence representations to determine an optimal halting criterion at which to cease the merging of said regions; and
code for processing said merging of said initial regions to halt when said optimal merging criterion is reached.
42. A program for making a computer execute a procedure for unsupervised selection of a stopping point for a region-merging segmentation process, said program comprising:
code for analysing a graph of merging cost values to identify departures from substantial monotonicity of said graph; and
code for selecting said stopping point to be a merging cost value corresponding to a return to monotonicity of said graph, said selected stopping point being associated with one of a limited plurality of final said departures in said region merging process.
43. A program according to claim 42 wherein said selected stopping point comprises a return from said final departure.
44. A program according to claim 43 wherein said departures are larger than a predetermined threshold.
45. A program according to claim 42 wherein said merging cost function comprises an ordered series of test statistics, each said test statistic being formed, for each adjacent pair of regions in the segmented image, by dividing a covariance-normalised vector distance between the pair by a length of a common boundary between the pair.
46. A method of segmenting an image substantially as described herein with reference to Figs. 1 to 3 of the drawings.
47. Apparatus for segmenting an image substantially as described herein with reference to Figs. 1 to 3 of the drawings.
48. A method for unsupervised selection of a stopping point for a region-merging segmentation process, said method being substantially as described herein with reference to Figs. 2 and 3 of the drawings.
49. Apparatus for unsupervised selection of a stopping point for a region-merging segmentation process, said apparatus being substantially as described herein with reference to Figs. 2 and 3 of the drawings.
Dated this FOURTH day of AUGUST 2001
CANON KABUSHIKI KAISHA
Patent Attorneys for the Applicant
Spruson & Ferguson
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
AU57783/01A AU767741B2 (en) | 2000-08-04 | 2001-08-03 | A method for automatic segmentation of image data from multiple data sources |
Applications Claiming Priority (3)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
AUPQ9218 | 2000-08-04 | ||
AUPQ9218A AUPQ921800A0 (en) | 2000-08-04 | 2000-08-04 | A method for automatic segmentation of image data from multiple data sources |
AU57783/01A AU767741B2 (en) | 2000-08-04 | 2001-08-03 | A method for automatic segmentation of image data from multiple data sources |
Publications (2)
Publication Number | Publication Date |
---|---|
AU5778301A AU5778301A (en) | 2002-02-07 |
AU767741B2 (en) | 2003-11-20 |
Family
ID=25631765
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
AU57783/01A Ceased AU767741B2 (en) | 2000-08-04 | 2001-08-03 | A method for automatic segmentation of image data from multiple data sources |
Country Status (1)
Country | Link |
---|---|
AU (1) | AU767741B2 (en) |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
EP0579319A2 (en) * | 1992-07-16 | 1994-01-19 | Philips Electronics Uk Limited | Tracking moving objects |
US5631970A (en) * | 1993-05-21 | 1997-05-20 | Hsu; Shin-Yi | Process for identifying simple and complex objects from fused images and map data |
AU5261899A (en) * | 1998-10-02 | 2000-04-13 | Canon Kabushiki Kaisha | Segmenting moving objects and determining their motion |
Also Published As
Publication number | Publication date |
---|---|
AU5778301A (en) | 2002-02-07 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| DA3 | Amendments made section 104 | Free format text: THE NATURE OF THE AMENDMENT IS: SUBSTITUTE PATENT REQUEST REGARDING ASSOCIATED DETAILS |
| FGA | Letters patent sealed or granted (standard patent) | |