WO2004099942A2

WO2004099942A2 - Gait recognition system

Info

Publication number: WO2004099942A2
Application number: PCT/US2004/006743
Authority: WO
Inventors: Prem Kuchi; Sethuraman Panchanathan
Original assignee: The Arizona Board Of Regents
Priority date: 2003-03-05
Filing date: 2004-03-05
Publication date: 2004-11-18
Also published as: WO2004099942A3

Abstract

Gait recognition systems are disclosed using a number of different algorithms for manipulating nonlinear, non-stationary trajectory signals generated by one or more subjects and detected by a motion analysis software or particle filters. The algorithms include a dynamic time warping function for normalizing the tracked signals for comparative analyses, an empirical mode decomposition function for pairing the normalized signals into signals unique to a subject; and a comparative function for comparing the unique signals from the tracked subject against other known unique signals stored in a data bank for ascertaining an identity or some desired output.

Description

GAIT RECOGNITION SYSTEM

Gait recognition systems for recognizing human beings from their walking patterns are generally discussed herein with particular discussions extended to gait recognition systems using nonlinear, nonstationary signal analyses for analyzing gait trajectories.

BACKGROUND

Gait recognition of human beings by human beings was first reported in the late sixties in several psychophysical studies. However, it was not until the late nineties that gait recognition by machines became a possibility. Several methods were proposed for machine recognition, of which some representative methods are described here. These methods can be divided into two basic categories: (1) structural methods and (2) structure-free methods.

In the structural methods, the trajectories of specific points on the body or the angles between them are used to derive features for recognition. One early study proposed a gait recognition technique that recovers static body and stride parameters (different distances between various parts of the body). Tests on about 20 subjects showed that the technique held some promise. However, it remains to be seen how these static parameters scale up for larger databases. Another study used trajectories of specific locations on a subject and a Fourier transform to derive the feature vector (FV). A Multi-Layer Perceptron (MLP) was used for classification and a Radial Basis Function (RBF) network was used to predict the data that is lost due to self-occlusion from using Fourier transform. In another study, Fourier magnitude spectra of joint angles weighted with phase information were used to generate the FV. A correct classification rate (CCR) of 90% was obtained using the k-nearest neighbor rule on a database of ten subjects.

In the structure-free methods, the motion pattern of the body is characterized without regard to its underlying structure. In one study, moments of the moving points weighted by the dense optical flow were used for characterizing motion. The phase of these scalar sequences was then used as feature vectors for recognizing gait. In another study, the principal components of temporal information from optical-flow changes between two consecutive spatial templates were used for extracting features for gait recognition. In yet another study, the researchers used symmetry patterns of human motion for recognition. The generalized symmetry operator was then used for gait analysis. Using this approach, the researchers achieved a CCR of 95%. However, there are studies that show that gait might not be symmetric.

While these studies have shown a positive potential for machine recognition of a person's gait, they have several shortcomings. Among other things, the studies are based on an assumption that gait recognition comprises an analysis of linear and stationary signals. However, recent research efforts in the field of biomechanics have led to the conclusion that gait is both nonlinear and nonstationary. Accordingly, there is a need for a system that is more accurate, more reliable, and based on proper signal interpretation.

SUMMARY

The present invention may be implemented by providing a gait recognition system comprising a cycle extraction module, a normalization module, and an empirical mode decomposition module for determining a gait cycle based in part on nonlinear, nonstationary signals and on at least one virtual marker.

In another aspect of the present invention, there is provided a gait recognition system comprising a recording module for recording nonlinear, nonstationary signals; a data extraction module for compiling at least one trajectory based on the nonlinear, nonstationary signals; a data manipulation module for normalizing the at least one trajectory and for decomposing the at least one trajectory into a first set of intrinsic mode functions; and a data comparison module for comparing the first set of intrinsic mode functions with a second set of intrinsic mode functions stored in a database.

In yet another aspect of the present invention, there is provided a gait recognition system comprising a recording module for recording nonlinear, nonstationary signals; a motion analysis software comprising a virtual marker for tracing at least one trajectory from the nonlinear, nonstationary signals; a data manipulation module for decomposing the trajectory into a first set of intrinsic mode functions; and a data comparison module that uses a neural network to compare the first set of intrinsic mode functions with a second set of intrinsic mode functions.

BRIEF DESCRIPTION OF THE DRAWINGS

These and other features and advantages of the present invention will become appreciated as the same become better understood with reference to the specification, claims, and appended drawings wherein:

FIG. 1 is a schematic flow diagram of an exemplary gait recognition system provided in accordance with aspects of the present invention;

FIG. 2 is a semi-schematic diagram of a subject captured on video with exemplary virtual marker locations;

FIG. 3 is an exemplary displacement trajectory of a single marker location produced from a motion analysis software;

FIG. 4 is an exemplary output of two non-normalized gait signals;

FIG. 5 is a graph of FIG. 4 with the Signal 2 data normalized relative to the Signal 1 data using simple linear resampling method;

FIG. 6 is a graph of FIG. 4 with the Signal 2 data normalized relative to the Signal 1 data using dynamic time warping;

FIG. 7 is an exemplary nonlinear, non-stationary signal;

FIG. 8 shows a set of feature vectors produced from empirical mode decomposition of a kinematic signal, such as the signal from FIG. 7;

FIG. 9 is an exemplary topology of multilayer perceptrons for solving classification problems using supervised training provided in accordance with aspects of the present invention;

FIG. 10 is a schematic diagram of a single marker gait recognition system provided in accordance with aspects of the present invention;

FIG. 11 is a schematic diagram of a multiple-marker gait recognition system provided in accordance with aspects of the present invention;

FIG. 12 is a plot of a mean and standard deviations of all last intrinsic mode functions of a subject group; and

FIG. 13 shows a correlation between different first intrinsic mode functions of a subject group.

DETAILED DESCRIPTION

The detailed description set forth below in connection with the appended drawings is intended as a description of the presently preferred embodiments of a gait recognition system provided in accordance with practice of the present invention and is not intended to represent the only forms in which the present invention may be constructed or utilized. The description sets forth the features and the steps for constructing and using the gait recognition system of the present invention in connection with the illustrated embodiments. It is to be understood, however, that the same or equivalent functions and structures may be accomplished by different embodiments that are also intended to be encompassed within the spirit and scope of the invention. Also, as denoted elsewhere herein, like element numbers are intended to indicate like or similar elements or features.

Referring now to FIG. 1, an exemplary block flow diagram of an embodiment of the present invention is shown, which is a gait recognition system generally designated as 10. Broadly speaking, the gait recognition system 10 comprises an input image module 12, a data extraction module 14, a data manipulation module 16, a data comparison module 20, and a data bank module 22. The gait recognition system 10 is configured to record image data of a subject's gait or trajectory, manipulate that data into unique data, and provide a desired output. When the captured signal is to be compared to a known signal to determine whether the captured signal matches a previously known signal, a comparison is conducted between the captured signal and a plurality of signals stored in a data bank. If a match is detected within some acceptable deviation, a positive or a confirming output signal is provided to a technician or security guard. For purposes of the following discussion, a person's gait will be discussed herein throughout although it is recognized that an animal's gait may also be analyzed, such as for breeding, horse racing, etc.

As used herein, the term gait means a manner of walking or stepping. The term gait cycle means a series of events that occur between a point on a subject at t_0 (time zero) and the same point at t_1, such as a right heel strike to a right heel strike, or between a left heel strike and a left heel strike.

An exemplary use of the gait recognition system 10 is for security monitoring at a bank, an airport, a sporting event, or a merchant location, just to name a few. The system 10 may comprise one or more cameras, such as analog cameras with analog-to-digital converters and digital cameras, mounted at a location for recording gaits of various individuals. Multiple remote locations may also be monitored with all the cameras connected to one or more servers via transmission means, such as cable, microwave radio, the internet, or other conventional transmission means. Video capturing, video transmission, video storage, and video manipulation are well known in the art, for example as disclosed in U.S. Pat. Nos. 6,226,031 (Barraclough et al.), 6,049,353 (Gray), 5,751,346 (Dozier et al.), 5,508,736 (Cooper), and 5,382,943 (Tanaka), which descriptions are expressly incorporated herein by reference, and further discussion is deemed unnecessary. Thus, captured image signals as discussed herein include signals captured from any prior art surveillance or recording systems and then subsequently digitized for storage and/or further manipulation. Referring to FIG. 1, these steps are represented by the input image module 12 and the data extraction module 14.

The captured signals are then interfaced to an automatic motion analysis software, which is the data manipulation module 16 shown in FIG. 1. One specific example of data collection, data extraction, and data manipulation is the Mikromak system offered by Mikromak Service GmbH of Berlin, Germany. More information about Mikromak Service is available at its web site: www.mikromak.com. The Mikromak system offers, among others, high speed cameras and measurement software, such as the WINanalyze software, for both tracking and motion analysis of 2D and 3D images using two synchronized cameras.

After video images are captured and digitized into a computer, for example, of people walking into a bank and conducting banking transactions, the WINanalyze software is used to automatically detect the subject or subjects selected by a user/technician and then track the subject or subjects in the following frames. The user/technician may select a subject(s) to track by placing one or more virtual markers on the subject, which then, through algorithms programmed in the WINanalyze software, produce trajectories. Thus, this software not only tracks the subject(s), but also specific points on the subject's body. The trajectories produced from the placement of the specific points are then used for the gait analyses. The WINanalyze software can also compute the velocities, accelerations, and angle projections of the trajectories generated by the markers placed on the subject for data analysis or data comparison. The computed data and the trajectories generated by the WINanalyze software are then saved as ASCII text files, which can further be processed by a number of other programs, including empirical mode decomposition (EMD) for discriminating the tracked trajectories against different known trajectories, as further discussed below.

In an exemplary embodiment, if 2D coordinate trajectories are analyzed (e.g., captured image taken from a single video camera), the optical axis of each camera is preferably positioned along a plane normal to the walking plane of the subject. This mounting configuration allows the image to be tracked as it traverses along a linear path to facilitate trajectory extraction. In FIG. 2, fifteen (15) proposed virtual marker points for tracking trajectories on a subject 24 captured on a video are shown. Each marker point represents a trajectory (for a single gait cycle) or a set of trajectories (for several gait cycles) when analyzed by a motion analysis software. While a single virtual marker may be used to analyze a gait, using trajectories of multiple markers and using y-z coordinates instead of just a single y coordinate (such as for tracking a single point on a heel) would better discriminate between the tracked trajectories and known trajectories stored in a database for gait comparison, as further discussed below. If multiple virtual markers are used, recommended placement of the multiple markers for tracking includes any combination of markers 3-15. In an alternative embodiment, particle filters may be used to track particular points on the subject's body. These particle filters would be used as an alternative to the WINanalyze software. Thus, particle filter tracking is another example of how trajectories may be obtained to analyze a gait at a later stage. Particle filter tracking is well known in the art.

From the trajectories and from knowledge that between two lowest points of a foot trajectory lies a gait cycle, a single gait cycle (or several cycles for repeatability and/or decrease of false positives) for analysis is extracted using a single virtual marker placed on a heel, such as marker 14 of FIG. 2. Again, if more than one marker is used, the process is simply repeated for each additional marker. Referring now to FIG. 3, a sample trajectory 26 of a y-axis displacement of the 14th marker is shown, which is a nonlinear, nonstationary signal plotted on a graph. The Y-coordinate represents a Y-displacement of the 14th marker while the X-coordinate represents time. In order to compare the trajectory 26 of the 14th marker to known trajectories from several subjects in a database, the trajectory is manipulated into signals unique to a subject's gait. These unique components are then used as feature vectors for comparing to feature vectors belonging to known subjects already stored in a database.
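The cycle extraction described above can be illustrated with a minimal sketch. It assumes the heel trajectory is a plain list of y-displacement samples and that each lowest point shows up as a strict local minimum; the function name `extract_gait_cycles` and the sample data are hypothetical, and real trajectories would likely need smoothing and handling of flat stretches first.

```python
def extract_gait_cycles(heel_y):
    """Split a heel y-displacement trajectory into cycles between local minima."""
    # A sample is treated as a "lowest point" when it is lower than the sample
    # before it and no higher than the sample after it (illustrative rule only).
    minima = [
        i for i in range(1, len(heel_y) - 1)
        if heel_y[i] < heel_y[i - 1] and heel_y[i] <= heel_y[i + 1]
    ]
    # Each pair of consecutive minima bounds one gait cycle (end points included).
    return [heel_y[a:b + 1] for a, b in zip(minima, minima[1:])]


# Hypothetical heel displacement samples, not measured data.
heel = [0.0, 2.0, 5.0, 3.0, 0.5, 2.5, 6.0, 2.0, 0.0, 3.0, 4.0]
cycles = extract_gait_cycles(heel)
```

With more than one marker, the same cycle boundaries could be reused to cut the trajectories of the other markers.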

In one exemplary embodiment, tracked trajectories are first normalized using a dynamic time warping (DTW) method. This step is preferred as different people have different stride lengths, which is the distance covered during one gait cycle, and hence their gait cycles will be of different time durations. In addition, the duration of the gait cycle might vary from time to time for the same participant, due to fatigue, sickness, etc. The walking pace might also naturally vary within a single gait cycle without external influence. Thus, to facilitate comparing different gait signals, tracked trajectories and known trajectories must be normalized to some common base or length. The normalization process should compensate not only for the walking pace, but also for variations within each gait cycle.

The DTW method has been widely used in connection with speech recognition analysis. To use DTW for normalizing an input or tracked signal for gait recognition, the input signal is first compared to a template signal, which is a sample gait signal or trajectory having a desired duration. Instead of comparing the value of the input signal at time t to that of the template signal at time t, the DTW algorithm searches a space of mappings from the time sequence of the input signal to that of the template signal so that the distance, according to some criterion, such as mean squared error for some specified percent deviation, is minimized. For example, let D be a matrix whose rows correspond to the samples of the template signal, and whose columns correspond to the samples of the input signal. The following cumulative distortion measure is defined for each element in this matrix D:

$$D(i,j) = d(i,j) + \min_{p(i,j)} \left\{ D\big(p(i,j)\big) + T\big[p(i,j),\,(i,j)\big] \right\} \tag{1.1}$$

where d is a local distance measure between value i of the input sample and value j of the template sample, p(i, j) is the set of possible predecessors to (i, j), and T[] is the cost associated with any particular transition. Qualitatively speaking, the DTW procedure can be summarized as follows:

1. The local distance (i.e., cumulative distortion value) for each element in column 1 of the distortion matrix is computed;

2. From the second sample value of the input (i.e., the second column of the matrix), starting with the bottom row (i.e., the last sample value of the template), equation 1.1 is evaluated. This equation is evaluated for each element in the column (i.e., for each sample of the template signal);

3. The best (smallest) distortion number in the last column for the template is found; and

4. The path from the upper-left corner to the lower-right corner with the best distortion is used for normalization.
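The procedure above can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the implementation used in the system: the local distance d is taken to be the squared difference, the predecessors of a cell are its left, lower, and diagonal neighbors, the transition cost T is set to zero, and the path is forced to run from the first pair of samples to the last. The names `dtw_path` and `normalize_to_template` are hypothetical.

```python
def dtw_path(template, signal):
    """Return (total distortion, warping path) aligning signal to template."""
    n, m = len(template), len(signal)
    inf = float("inf")
    D = [[inf] * m for _ in range(n)]  # rows: template samples, columns: input samples
    for i in range(n):
        for j in range(m):
            d = (template[i] - signal[j]) ** 2  # local distance (assumed squared error)
            if i == 0 and j == 0:
                D[i][j] = d
                continue
            best_prev = min(
                D[i - 1][j] if i > 0 else inf,
                D[i][j - 1] if j > 0 else inf,
                D[i - 1][j - 1] if i > 0 and j > 0 else inf,
            )
            D[i][j] = d + best_prev  # transition cost T assumed to be zero
    # Backtrack from the last cell to the first, always taking the cheapest predecessor.
    i, j = n - 1, m - 1
    path = [(i, j)]
    while (i, j) != (0, 0):
        candidates = []
        if i > 0 and j > 0:
            candidates.append((D[i - 1][j - 1], i - 1, j - 1))
        if i > 0:
            candidates.append((D[i - 1][j], i - 1, j))
        if j > 0:
            candidates.append((D[i][j - 1], i, j - 1))
        _, i, j = min(candidates)
        path.append((i, j))
    path.reverse()
    return D[n - 1][m - 1], path


def normalize_to_template(template, signal):
    """Warp signal onto the template's time base by averaging the samples mapped to each index."""
    _, path = dtw_path(template, signal)
    buckets = [[] for _ in template]
    for i, j in path:
        buckets[i].append(signal[j])
    return [sum(b) / len(b) for b in buckets]


# Hypothetical signals: the input lingers on two samples relative to the template.
template = [0, 1, 2, 1, 0]
signal = [0, 0, 1, 2, 2, 1, 0]
cost, path = dtw_path(template, signal)
warped = normalize_to_template(template, signal)
```

Averaging the input samples that map to the same template index is only one way to produce a normalized signal of the template's length; other choices, such as taking the first mapped sample, would fit the same path.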

FIGs. 4-6 show the effect of DTW on two comparative signals. Referring initially to FIG. 4, two non-normalized gait signals are shown, one of which is a base or reference signal. Alternatively, one of the signals is a normalized reference signal and the other signal is a non-normalized signal. In this case, the Signal 1 graph is a non-normalized reference signal. In FIG. 5, the Signal 2 graph has been normalized relative to the Signal 1 graph using a simple linear resampling method. From the result, it is clear that the durations of the normalized Signal 2 have been uniformly resized as there are fewer peaks and valleys. However, as a subject will generally not slow down or speed up in a uniform manner for the entire gait cycle, the simple linear resampling method will lead to unsatisfactory results. Thus, the DTW method is preferred for normalizing the signal, which result is shown in FIG. 6. As can be seen, Signal 2 shows a significant improvement in normalization over the same signal when the simple linear resampling method is used.

Following the normalization step, the captured signal is further manipulated into a class of functions called intrinsic mode functions using a method called empirical mode decomposition, otherwise known in the art as sifting. Empirical mode decomposition is a process of decomposing a signal into its constituent "intrinsic mode functions". The process is empirical because each signal has different basis functions, and the decomposition varies from signal to signal. Sifting is the step-by-step method for achieving this process.

A function is an intrinsic mode function (IMF) if: (1) the number of extrema and the number of zero crossings are either equal or differ at most by one, and (2) at any point, the mean value of the envelope defined by the local maxima and the envelope defined by the local minima is zero. The two criteria satisfy the physically necessary conditions to define a meaningful instantaneous frequency. The method to decompose any signal into a set of IMFs is called sifting.
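The first of the two IMF criteria lends itself to a short check. The sketch below is illustrative only: it counts slope sign changes as extrema and sample-to-sample sign changes as zero crossings, and it does not test the second (envelope mean) criterion. All function names are hypothetical.

```python
import math


def count_extrema(x):
    """Count interior samples where the slope changes sign."""
    return sum(
        1 for i in range(1, len(x) - 1)
        if (x[i] - x[i - 1]) * (x[i + 1] - x[i]) < 0
    )


def count_zero_crossings(x):
    """Count sign changes between consecutive samples."""
    return sum(1 for a, b in zip(x, x[1:]) if a * b < 0)


def satisfies_first_imf_condition(x):
    """True when extrema and zero crossings are equal or differ by at most one."""
    return abs(count_extrema(x) - count_zero_crossings(x)) <= 1


# A zero-mean sinusoid meets the condition; the same sinusoid riding on an offset does not.
wave = [math.sin(0.5 * t + 0.2) for t in range(60)]
offset_wave = [2.0 + v for v in wave]
```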

The signals found in nature generally do not fit the definition of an intrinsic mode function. For a signal to be successfully decomposed into IMFs using the sifting method, it has to satisfy one assumption, that is, it has to have at least two extrema - one maximum and one minimum. The method essentially involves two steps. Assuming X(t) to be the given signal, the two steps are:

1. A smooth spline is constructed, connecting all the maxima of X(t) to obtain its upper envelope Xmax(t). Then another smooth spline is constructed connecting all of the minima, to obtain its lower envelope Xmin(t). The extrema are found by detecting the change of sign of the derivative of the signal. All of the original data points that define the signal should be enclosed between the upper and lower envelopes.

2. The mean of the two envelopes is then subtracted from the data to get a different signal X_1(t). Expressed mathematically, it appears as follows:

$$X_1(t) = X(t) - \frac{X_{\max}(t) + X_{\min}(t)}{2} \tag{1.2}$$

The process is then repeated for X_1(t) until the resulting C_1(t) satisfies the criteria of an intrinsic mode function. Typically, after several iterations, the equation produces a pure frequency modulated signal of constant amplitude. Thus, a stopping criterion should be implemented to stop the iteration process. In one exemplary embodiment, sifting can stop by limiting the amount of standard deviation (SD) computed from two consecutive results. The SD is calculated as follows:
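A form of this criterion that is widely used in the empirical mode decomposition literature is given here as an illustration. The symbol X_{1,k}(t) for the result of the k-th sifting iteration and the record length T are notational assumptions rather than symbols taken from this description:

```latex
SD = \sum_{t=0}^{T} \frac{\left| X_{1,k-1}(t) - X_{1,k}(t) \right|^{2}}{X_{1,k-1}^{2}(t)}
```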

where k is the iteration number for steps 1 and 2 above. With each iteration, SD becomes smaller. Usually, the threshold value for SD is set between 0.2 and 0.3, or within some other desired range.

At this point, IMF C_1(t) is obtained. C_1(t) is then subtracted from the original signal to get the residue R_1(t), and the two steps described above are repeated on each successive residue until a residue R_n(t) is obtained that is smaller than a predetermined value or otherwise becomes a monotonic function. A function f is called a monotonic function if: (1) x > y implies f(x) > f(y), or (2) x > y implies f(x) < f(y). The original signal can be reconstructed using the following equation:

$$X(t) = \sum_{i=1}^{n} C_i(t) + R_n(t) \tag{1.4}$$

Thus, this method decomposes the original signal into n intrinsic mode functions, each with a distinct time scale. The first component has the finest time scale, and this scale increases with successive components. The decomposition is based on the local time scale of the data, and thus yields adaptive basis functions. Hence, it can be used for nonlinear and non-stationary signal analysis.
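The sifting loop can be sketched as follows. This is a simplified illustration and not the described implementation: the envelopes are piecewise-linear interpolations held flat at the ends, swapped in for the smooth splines of step 1 to keep the example dependency-free, and the stopping rule is a normalized squared difference between consecutive sifting results, which is one common variant that avoids dividing by near-zero samples. All names, thresholds, and the test signal are assumptions.

```python
import math


def local_extrema(x):
    """Indices of interior local maxima and minima (slope sign changes)."""
    maxima, minima = [], []
    for i in range(1, len(x) - 1):
        if x[i - 1] < x[i] >= x[i + 1]:
            maxima.append(i)
        elif x[i - 1] > x[i] <= x[i + 1]:
            minima.append(i)
    return maxima, minima


def envelope(x, knots):
    """Piecewise-linear envelope through x at the knots (stand-in for a spline)."""
    env = []
    for t in range(len(x)):
        if t <= knots[0]:
            env.append(x[knots[0]])
        elif t >= knots[-1]:
            env.append(x[knots[-1]])
        else:
            k = next(k for k in range(len(knots) - 1) if knots[k] <= t < knots[k + 1])
            a, b = knots[k], knots[k + 1]
            env.append(x[a] + (x[b] - x[a]) * (t - a) / (b - a))
    return env


def sift_one_imf(x, sd_threshold=0.3, max_iter=50):
    """Repeat the two sifting steps until consecutive results barely change."""
    h = list(x)
    for _ in range(max_iter):
        maxima, minima = local_extrema(h)
        if not maxima or not minima:
            break
        upper, lower = envelope(h, maxima), envelope(h, minima)
        new_h = [v - (u + l) / 2 for v, u, l in zip(h, upper, lower)]  # equation 1.2
        denom = sum(a * a for a in h) or 1.0
        sd = sum((a - b) ** 2 for a, b in zip(h, new_h)) / denom  # assumed SD variant
        h = new_h
        if sd < sd_threshold:
            break
    return h


def emd(x, max_imfs=10):
    """Peel off components until the residue has no interior maximum or minimum."""
    imfs, residue = [], list(x)
    for _ in range(max_imfs):
        maxima, minima = local_extrema(residue)
        if not maxima or not minima:
            break
        imf = sift_one_imf(residue)
        imfs.append(imf)
        residue = [r - c for r, c in zip(residue, imf)]
    return imfs, residue


# Synthetic test signal: two oscillations plus a slow trend (not gait data).
signal = [math.sin(0.3 * t) + 0.5 * math.sin(1.7 * t) + 0.01 * t for t in range(200)]
imfs, residue = emd(signal)
rebuilt = [sum(c[t] for c in imfs) + residue[t] for t in range(len(signal))]
```

Because each component is subtracted from the running residue, adding the components and the final residue back together recovers the input up to floating-point rounding, which mirrors the reconstruction property stated above.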

As an example of a sifting process, refer initially to FIG. 7 for a typical nonlinear, non-stationary signal. The above-described procedure is applied on the FIG. 7 signal to decompose the signal into a set of intrinsic mode functions. The resulting IMFs are shown in FIG. 8. The first subplot 28 in this figure is the original signal from FIG. 7, and the rest of the subplots, except for the last subplot 30, are the IMFs. The last subplot 30 is special in the sense that it is not an IMF, but the final residue, which indicates the overall signal trend during the sample period.

An analysis of the IMF plots reveals two important features when compared to IMF plots for trajectories of a different subject or subjects:

1. The last IMF subplots 30 of different subjects are similar to one another and are generally large in magnitude. In fact, this component has the largest amplitude and the deviation between the IMFs of different subjects is mathematically insignificant.

2. The first IMF subplot 28 for each subject is different and different for different trials of the same subject.

Thus, since the largest component of the kinematic signal is common for all subjects, distinguishing between them is difficult. Hence, that component has to be discarded when differences between subjects' gaits are to be determined. Also, the first component appears random and does not add any information as far as distinguishing between different people is concerned. Hence, that component is also discarded. The rest of the components can be used to generate feature vectors.

Referring again to FIG. 1, the gait recognition system provided in accordance with aspects of the present invention would then involve an undertaking to populate a data bank 22 with feature vectors from subjects to be tracked. This may be performed by the above-described steps with the ultimate outcome of producing a plurality of sets of feature vectors described in terms of IMF plots for different subjects, such as the different IMF plots shown in FIG. 8. When a target subject is to be analyzed, such as a masked bank robber whose footage is captured by a security camera, a set of IMF plots is then generated for that captured footage. These generated IMF plots are then compared against the IMF plots in the data bank 22 for determining whether the target subject has a prior history, which would correspond to a set of IMF plots already in the data bank 22.

The unique signatures of each person from these IMFs may be extracted as follows: From experiments, it is known that the last IMF is similar in all participants and is large in magnitude. In fact, this component has the largest magnitude. FIG. 12 illustrates this with the mean of all the last IMFs and the amount of deviation that might occur from the mean. The deviation is not significant between different subjects and different trials of each subject. Also, the first IMF is different for different participants and different for different trials of the same participant. FIG. 13 shows the correlation between different first IMFs. As is evident from FIG. 13, the correlation between different trials of the same participant and between different participants is nearly the same. That is, by plotting the first IMFs for various subjects, the results confirm that they are random and therefore that it is difficult to distinguish the first IMF plot of one subject from the first IMF plot of another subject. Accordingly, since the largest component of the kinematic signal is common for all participants, distinguishing between them is difficult. Therefore, that component has to be discarded when differences between participants' gaits are to be determined. Also, the first component appears random and does not add any information as far as distinguishing between different participants is concerned. Therefore, that component should also be discarded. The rest of the components are added together and a Fourier transform is taken to serve as the unique signature.
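The signature extraction described above may be sketched as follows. The sketch assumes the components are supplied as equal-length lists ordered from finest to coarsest, with the last entry being the large common component; it uses a direct discrete Fourier transform and keeps only the magnitudes, which is an assumption since the description does not say how the transform output is represented. The name `gait_signature` and the sample components are hypothetical.

```python
import cmath


def gait_signature(components):
    """Drop the first and last components, sum the rest, and return DFT magnitudes."""
    if len(components) < 3:
        raise ValueError("need at least three components to discard the first and last")
    middle = components[1:-1]
    summed = [sum(samples) for samples in zip(*middle)]
    n = len(summed)
    spectrum = [
        sum(summed[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]
    return [abs(c) for c in spectrum]  # magnitude-only signature (assumption)


# Hypothetical four-sample components, finest first, largest common component last.
components = [
    [9.0, 9.0, 9.0, 9.0],
    [1.0, 0.0, -1.0, 0.0],
    [0.0, 1.0, 0.0, -1.0],
    [5.0, 5.0, 5.0, 5.0],
]
signature = gait_signature(components)
```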

Multilayer Perceptrons (MLPs) are feed-forward neural networks that are used to solve classification problems using supervised training. In the present embodiment, MLPs are preferably used on the database (for example, database 22 in FIG. 1) to classify the extracted feature vectors (or unique signatures of each person) from the various trajectories and to compare the stored feature vectors with the feature vector from the subject to be tracked. A typical topology of an MLP is shown in FIG. 9. It consists of one input layer, one or more hidden layers (typically only one hidden layer is used and only one is shown in FIG. 9) and one output layer. The neurons are represented in FIG. 9 as circles and, in all these layers, are connected to each other. Each of these connections is associated with a particular weight value. The weights for these connections are mainly responsible for learning complex tasks. Each neuron in this network has a nonlinear activation function. A logistic function, shown as equation 1.5 below, is commonly used to represent this activation function.

$$y_j = \frac{1}{1 + \exp(-v_j)} \tag{1.5}$$

where v_j is the induced local field, or input, of neuron j and y_j is the output of that neuron.

The back-propagation algorithm is commonly used for training an MLP. This algorithm is implemented as two passes through the different layers of the network by way of the forward pass and the backward pass. The back propagation algorithm is a well known training technique for neural networks. Its two passes (forward and backward) are standard in the field and are available in numerous textbooks.

An input vector (i.e. the training vector or unique signature) is presented to the input layer. The effect of this vector is propagated (layer by layer) and an output vector is produced as the response of the network. During the forward pass, the connection weights are all fixed.
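The forward pass can be sketched for a network with a single hidden layer and logistic activations. The sketch omits bias terms for brevity, which is an assumption, and the function names and weight layout (one row of weights per neuron) are hypothetical.

```python
import math


def logistic(v):
    """Logistic activation applied to the induced local field v."""
    return 1.0 / (1.0 + math.exp(-v))


def forward(x, hidden_weights, output_weights):
    """Propagate input x through one hidden layer and one output layer with fixed weights."""
    hidden = [logistic(sum(w * xi for w, xi in zip(row, x))) for row in hidden_weights]
    return [logistic(sum(w * h for w, h in zip(row, hidden))) for row in output_weights]


# With all weights at zero every induced local field is zero, so each output is 0.5.
outputs = forward([1.0, 2.0], [[0.0, 0.0], [0.0, 0.0], [0.0, 0.0]], [[0.0, 0.0, 0.0]])
```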

During the backward pass, the connection weights are altered to minimize the error. The error signal is produced by subtracting the actual response of the network obtained in the forward pass from the desired response, which is available during the training stage. The error signal is then propagated backwards, and the connection weights are adjusted such that the difference between the actual response and the desired response is reduced in some statistical sense. The result is that the network learns from the examples presented at its input.

Bayes' Risk Criterion (BRC) is a very effective method of sensor fusion. Sensor fusion is a process where information from two or more modalities is combined. In this case, information from two or more markers is combined. BRC should be used only in the case of a multiple marker system. This method is used to minimize the expected cost function called the Bayes' Risk. The solution of this minimization problem is of the form:

$$\frac{p(z|m_2)}{p(z|m_1)} \underset{d_1}{\overset{d_2}{\gtrless}} \frac{(C_{21} - C_{11})\,P(m_1)}{(C_{12} - C_{22})\,P(m_2)} \tag{1.6}$$

where m_1 denotes the presence of the claimed identity, m_2 its absence, and d_1, d_2 the respective decisions taken. p(z|m_1) and p(z|m_2) are the conditional probability density functions, and C_ij represents the cost of deciding d_i when m_j is present. The conditional probabilities are obtained from the single marker processor, at the output layer of the neural network. Also, cost is assigned to the decisions made. For example, a wrong decision by the system would have 'more' cost than a correct decision. As another example, in this case, the probability that the classification output from one sensor is correct is p(z|m_1), and the probability that the output is wrong is given by p(z|m_2) = 1 - p(z|m_1). When multiple sensors must be fused or combined, equation 1.6 becomes:

$$\prod_{i=1}^{n} \frac{p(z_i|m_2)}{p(z_i|m_1)} \underset{d_1}{\overset{d_2}{\gtrless}} \frac{(C_{21} - C_{11})\,P(m_1)}{(C_{12} - C_{22})\,P(m_2)} \tag{1.7}$$

where n is the number of sensors being fused. If it is assumed that the events m_1 and m_2 are equally probable, i.e., P(m_1) = P(m_2) = 1/2, and using the standard "0-1" cost function (where the cost is 0 when a correct decision is made and the cost is 1 when a wrong decision is made), then equation 1.7 is reduced to:

$$\prod_{i=1}^{n} \frac{p(z_i|m_2)}{p(z_i|m_1)} \underset{d_1}{\overset{d_2}{\gtrless}} \lambda \tag{1.8}$$

where λ = 1. λ can then be tuned to meet specific False Acceptance Rates and False Rejection Rates.
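The reduced decision rule can be illustrated with a short sketch. It assumes that each marker reports a probability p(z_i|m_1) that the claimed identity is present, that the complementary probability is 1 - p(z_i|m_1), and that the claim is rejected when the product of the ratios p(z_i|m_2)/p(z_i|m_1) exceeds λ. The function name and the returned labels are hypothetical.

```python
def fuse_decisions(p_claim_per_marker, lam=1.0):
    """Fuse per-marker probabilities with a product of likelihood ratios compared to lam."""
    ratio = 1.0
    for p1 in p_claim_per_marker:
        if not 0.0 < p1 <= 1.0:
            raise ValueError("each probability must lie in (0, 1]")
        ratio *= (1.0 - p1) / p1  # p(z_i | m_2) / p(z_i | m_1)
    # A ratio above lam favors decision d_2 (claimed identity absent).
    return "reject" if ratio > lam else "accept"


# Hypothetical outputs from three single-marker processors.
confident = fuse_decisions([0.9, 0.8, 0.6])
doubtful = fuse_decisions([0.3, 0.4, 0.6])
```

Raising `lam` above 1 makes the rule accept more readily, while lowering it makes the rule reject more readily, which is the tuning knob for trading false acceptances against false rejections.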

Various numbers of neurons in the hidden layer and various decay rates of learning for the MLP may be tried to find the optimal parameters for classification. The numbers of neurons can include [100, 150, 200, 250, 300, 350], and the decay rates can be [0.01, 0.03, 0.05, 0.07, 0.08, 1.0]. The optimal number of neurons in the hidden layer and the optimal decay rate can then be used to classify the aforementioned feature vectors (i.e., of length 100). A leave one out strategy with cross validation may be used for testing the efficacy of the feature vector. Leave one out strategy is where one sequence among several sequences produced from each subject is used for testing and the rest for training. For example, if there are five sequences from each of five subjects, then four sequences of each subject are used for training and the one remaining sequence for testing. Cross validation involves leaving out a different sequence from the training set each time the training/testing procedure is performed. The average recognition rate over all the sessions is then computed.
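The leave one out procedure with cross validation can be sketched as follows. To stay self-contained, the sketch uses a nearest-mean classifier as a stand-in for the MLP; that substitution, the function names, and the toy feature vectors are all assumptions made for illustration.

```python
def leave_one_out_rate(sequences, train, classify):
    """Average recognition rate when the k-th sequence of every subject is held out in turn."""
    n_rounds = min(len(v) for v in sequences.values())
    rates = []
    for held in range(n_rounds):
        training = {
            s: [fv for k, fv in enumerate(v) if k != held]
            for s, v in sequences.items()
        }
        model = train(training)
        hits = sum(classify(model, v[held]) == s for s, v in sequences.items())
        rates.append(hits / len(sequences))
    return sum(rates) / len(rates)


def train_means(training):
    """Stand-in trainer: store the mean feature vector of each subject."""
    return {s: [sum(col) / len(col) for col in zip(*vs)] for s, vs in training.items()}


def classify_nearest(model, fv):
    """Stand-in classifier: pick the subject whose mean vector is closest."""
    return min(model, key=lambda s: sum((a - b) ** 2 for a, b in zip(model[s], fv)))


# Toy feature vectors for two hypothetical subjects, three sequences each.
data = {
    "A": [[0.0, 0.1], [0.1, 0.0], [0.0, 0.0]],
    "B": [[1.0, 0.9], [0.9, 1.0], [1.0, 1.0]],
}
rate = leave_one_out_rate(data, train_means, classify_nearest)
```

The same outer loop could be wrapped in a search over the candidate neuron counts and decay rates, keeping the pair with the highest average rate.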

Turning now to FIG. 10, an exemplary schematic diagram of a single marker gait recognition system 34 provided in accordance with aspects of the present invention is shown. In the exemplary system 34, the trajectory (i.e., the vertical displacement) from only one marker 36 is used for gait recognition. As discussed above, the subject 38 to be analyzed is captured on video. The video of the subject 38 is then processed by a motion analysis software 40, such as the Mikromak system from Mikromak Service of Berlin, Germany, or by particle filters. The motion analysis software 40 tracks the trajectories 42 produced by the marker 36 placed on the subject 38. Velocities, accelerations, and angle projections of the trajectories 42 are tabulated by the software 40, which then outputs the results into ASCII text files or other machine readable files.

The trajectories are then normalized using a dynamic time warping algorithm 44. This algorithm normalizes the trajectories so that different trajectories may be compared using a common base scale. Using another algorithm 46, empirical mode decomposition is performed on the normalized trajectories, and feature vectors are extracted 48 in the form of intrinsic mode functions (IMFs). The group of modules 51 within the dotted lines may be grouped as a single-marker processor 56. In other words, the single-marker processor comprises the various modules for analyzing a gait in accordance with aspects of the present invention. However, the different modules may be processed by more than one microprocessor or computer.
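The normalization step can be illustrated with a minimal Python sketch of classic dynamic time warping. The sketch assumes one-dimensional trajectories (such as vertical displacement), an absolute-difference local cost, and normalization by warping each trajectory onto a chosen reference trajectory; the function names and the choice of reference are assumptions made for illustration and are not taken from the described system.

```python
def dtw_path(a, b):
    """Return (total_cost, path) aligning sequences a and b with classic DTW.

    path is a list of (i, j) index pairs, ordered from start to end.
    """
    if not a or not b:
        raise ValueError("both sequences must be non-empty")
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] is the cheapest alignment of a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(
                cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1]
            )
    # Backtrack from the end, always moving to the cheapest predecessor.
    path, i, j = [], n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        _, i, j = min(
            (cost[i - 1][j - 1], i - 1, j - 1),
            (cost[i - 1][j], i - 1, j),
            (cost[i][j - 1], i, j - 1),
        )
    path.reverse()
    return cost[n][m], path


def warp_to_reference(signal, reference):
    """Resample signal onto the time axis of reference.

    Each output sample is the mean of the signal samples that DTW aligns
    with the corresponding reference sample.
    """
    _, path = dtw_path(signal, reference)
    buckets = [[] for _ in reference]
    for i, j in path:
        buckets[j].append(signal[i])
    return [sum(bucket) / len(bucket) for bucket in buckets]
```

After warping, every trajectory has the same length as the reference, so trajectories recorded at different walking speeds can be compared sample by sample before decomposition.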

The intrinsic mode functions of the tracked subject 38 can then be compared with feature vectors stored in a database 50 using a feature comparison engine 52. The feature comparison engine 52 uses the multilayer perceptrons in feed-forward neural networks discussed above to solve classification problems using supervised training. An output 54 is produced with an identity of the tracked subject (or a zero match if the results fall outside some predefined limits) using the neural network, in the case of a single-marker system, or the neural network in conjunction with Bayes' risk criterion, in the case of a multiple-marker system, as described above.

Referring now to FIG. 11, a multiple-marker gait recognition system 58 provided in accordance with aspects of the present invention is shown. In the present system 58, trajectories from multiple markers 60a - 60e are used for recognition. The single-marker system 34 described above, and particularly the single-marker processor 56, may form the core processing unit of the present system. A trajectory 62 from each marker is passed through the single-marker processor 56, which outputs the recognition result and confidence measure directly to the Bayes' Risk Criterion 64 to provide the final result 66. The confidence measure module 68 represents the probability measures output by the neural network, i.e., the probability that the output of the neural network is correct. These probabilities are used to make effective decisions using Bayes' Risk Criterion.

In an alternative embodiment, one of the gait recognition systems as described above is used with one or more other biometrics for enhanced classification, such as with a face recognition system. It has been demonstrated by many field tests that face recognition is not accurate to a satisfactory level. The answer to this problem could be the use of multimodal biometrics, where more than one recognition technique may be used to increase the true recognition rate and decrease false positives. One simple method for integrating the outputs from the face and gait recognition systems is to use another Bayes' Risk Criterion classifier, which takes the probabilities from the face recognition system and the gait recognition system and makes an effective decision. Also, the confidence in each individual mode may be adjusted dynamically; e.g., more confidence would be given to the face recognition subsystem if the subject of interest is near the camera.
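One possible way to adjust the confidence in each mode dynamically is sketched below in Python. The weighting scheme is an assumption made purely for illustration: the face weight falls linearly from 1 to 0 between two hypothetical distances (`near_m` and `far_m`), and the two subsystems' per-identity probabilities are combined as a weighted sum of log-probabilities. None of these names, distances, or formulas are taken from the described system.

```python
import math


def face_weight(distance_m, near_m=2.0, far_m=10.0):
    """Weight given to the face subsystem: 1 when near, 0 when far.

    near_m and far_m are hypothetical distances in meters.
    """
    if far_m <= near_m:
        raise ValueError("far_m must be greater than near_m")
    t = (distance_m - near_m) / (far_m - near_m)
    return 1.0 - min(1.0, max(0.0, t))


def fuse_identities(face_probs, gait_probs, distance_m):
    """Pick the identity with the highest distance-weighted log score.

    face_probs and gait_probs map identity -> probability as reported by
    each subsystem; only identities known to both are considered.
    """
    w = face_weight(distance_m)
    eps = 1e-12  # guards against log(0)
    scores = {
        identity: w * math.log(face_probs[identity] + eps)
        + (1.0 - w) * math.log(gait_probs[identity] + eps)
        for identity in face_probs.keys() & gait_probs.keys()
    }
    if not scores:
        raise ValueError("no identity is known to both subsystems")
    return max(scores, key=scores.get)
```

In this sketch a subject close to the camera is identified almost entirely by face, while a distant subject is identified almost entirely by gait.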

Another exemplary application of gait recognition is its use in the design of intelligent human-computer interfaces. For example, a person approaching a computer could be recognized by the gait recognition system, and a signal could then be relayed to the computer, which could customize the platform for that person.

Although limited embodiments of the gait recognition systems and their algorithms have been specifically described and illustrated herein, many modifications and variations will be apparent to those skilled in the art. Accordingly, it is to be understood that the gait recognition systems and their components constructed according to principles of this invention may be embodied other than as specifically described herein. The invention is defined in the following claims.

Claims

WHAT IS CLAIMED IS:

1. A gait recognition system comprising a cycle extraction module, a normalization module, and an empirical mode decomposition module for determining a gait cycle based in part on nonlinear, nonstationary signals and on at least one virtual marker.

2. The gait recognition system as recited in claim 1, wherein the cycle extraction module comprises the at least one virtual marker.

3. The gait recognition system as recited in claim 2, wherein the cycle extraction module comprises a WINanalyze software.

4. The gait recognition system as recited in claim 1, further comprising a recording module for recording a walking motion generated by a subject.

5. The gait recognition system as recited in claim 4, wherein the nonlinear, nonstationary signals are normalized using a dynamic time warping method.

6. The gait recognition system as recited in claim 5, wherein the normalized signals are decomposed into a first set of intrinsic mode functions using an empirical mode decomposition method.

7. The gait recognition system as recited in claim 6, wherein the decomposition includes using the mathematical equation:

$$X_1(t) = X(t) - \frac{X_{\max}(t) + X_{\min}(t)}{2}$$

8. The gait recognition system as recited in claim 6, further comprising a database comprising a second set of intrinsic mode functions and a third set of intrinsic mode functions.

9. The gait recognition system as recited in claim 8, further comprising a comparison module.

10. The gait recognition system as recited in claim 9, wherein the comparison module comprises a neural network for comparing the first set of intrinsic mode functions with the second and the third set of intrinsic mode functions.

11. A gait recognition system comprising a recording module for recording nonlinear, nonstationary signals; a data extraction module for compiling at least one trajectory based on the nonlinear, nonstationary signals; a data manipulation module for normalizing the at least one trajectory and for decomposing the at least one trajectory into a first set of intrinsic mode functions; and a data comparison module for comparing the first set of intrinsic mode functions with a second set of intrinsic mode functions stored in a database.

12. The gait recognition system as recited in claim 11, wherein the recording module comprises one or more video cameras.

13. The gait recognition system as recited in claim 12, wherein the data extraction module comprises a motion analysis software or particle filters for extracting at least one trajectory from the nonlinear, nonstationary signals.

14. The gait recognition system as recited in claim 13, wherein the motion analysis software comprises two or more virtual markers for extracting two or more trajectories from the nonlinear, nonstationary signals.

15. The gait recognition system as recited in claim 11, wherein the data manipulation module comprises a data normalization algorithm and a data decomposition algorithm.

16. The gait recognition system as recited in claim 15, wherein the data decomposition algorithm produces a set of intrinsic mode functions from the nonlinear, nonstationary signals.

17. A gait recognition system comprising a recording module for recording nonlinear, nonstationary signals, a motion analysis software comprising a virtual marker for tracing at least one trajectory from the nonlinear, nonstationary signals; a data manipulation module for decomposing the trajectory into a first set of intrinsic mode functions, and a data comparison module that uses a neural network to compare the first set of intrinsic mode functions with a second set of intrinsic mode functions.

18. The gait recognition system as recited in claim 17, wherein the recording module comprises two or more video cameras.

19. The gait recognition system as recited in claim 17, wherein the motion analysis software comprises a second virtual marker for tracing a second trajectory from the nonlinear, nonstationary signals.

20. The gait recognition system as recited in claim 17, further comprising a face recognition system.
