COST-EFFECTIVE SYSTEM AND METHOD FOR DETECTING, CLASSIFYING AND TRACKING THE PEDESTRIAN USING NEAR INFRARED CAMERA
FIELD OF THE INVENTION
The present invention generally relates to a system and method for detecting, classifying and tracking the pedestrian. More particularly, this invention relates to a cost effective system and method for detecting, classifying and tracking the pedestrian present in front of the vehicle by using real time images captured by near infrared (IR) camera disposed on the vehicle.
BACKGROUND OF THE INVENTION
Road accidents involving pedestrians are far more frequent at night than during the day. Worldwide, the number of people killed in road traffic crashes each year is estimated at almost 1.2 million, while the number injured could be as high as 50 million, which is the combined population of five of the world's largest cities. Many of the people killed in such accidents are pedestrians. The most important factor is the driver's dramatically reduced range of vision. Fewer pedestrians would be killed or seriously injured if vehicles were equipped with improved pedestrian detection systems combined with driver warning strategies.
Some known inventions that deal with pedestrian detection and tracking are as follows:
US7139411 to Fujimura et al. teaches a system and method for detecting and tracking pedestrians, in low visibility conditions or otherwise. A night vision camera periodically captures an infrared image of a road from a single perspective. A pedestrian detection module determines a position of a pedestrian in the frame by processing the captured image. The pedestrian detection module includes a support vector machine to compare information derived from the night vision camera to a training database. A pedestrian tracking module estimates the movement of the detected pedestrian in subsequent frames by applying filters. The tracking module uses Kalman filtering to estimate pedestrian movement at periodic times and mean-shifting to adjust the estimation.
US7526102 to Ibrahim Burak Ozer teaches methods and systems for providing real-time video surveillance of crowded environments. The method consists of several object detection and tracking processes that may be selected automatically to track individual objects or groups of objects based on the resolution and occlusion levels in the input videos. Possible objects of interest (OOI) may be humans, animals, cars, etc. The invention may be used for tracking people in crowded environments or cars in heavy traffic conditions.
US7421091 to Hiroshi Satoh teaches that outputs of pixels present around a given pixel at an image-capturing unit having a plurality of pixels disposed two-dimensionally are added to the output of the given pixel.
US5694487 to Min-Sub Lee teaches a method for determining a feature point for each of the blocks based on the gradient magnitude and variance corresponding to each of the pixels therein.
Most of these known devices, systems and methods use complex methods to detect and track pedestrians and are therefore costly. The accuracy of these methods is also not adequate for detecting and tracking pedestrians.
Thus, in the light of the above mentioned background of the art, it is evident that, there is a need for a system and method for detecting, classifying and tracking the pedestrian present in front of the vehicle for avoiding collision, which is simple, easy to install and provides higher accuracy at a lower cost.
OBJECTIVES OF THE INVENTION
The primary objective of the present invention is to provide a system and method for detecting, classifying and tracking the pedestrian present in front of the vehicle for avoiding collision, which is simple, easy to install, and provides higher accuracy at a lower cost.
Another objective of the invention is to provide a systematic way of detecting the road region by estimating ground plane using near IR camera.
A further objective of the invention is to provide a systematic way of eliminating non-ground objects based on their distance to ground using near IR camera.
Still another objective of the invention is to provide a systematic way of filtering the non-ROI objects based on the shape of such objects by computing the signal to noise ratio (SNR) for each of such non-ROI objects using near IR camera.
Still another objective of the invention is to provide a systematic way of eliminating the non-vertical objects from the IR images by computing inertial moment relative to x and y axis with respect to the centre of mass of such non-vertical objects.
Yet another objective of the invention is to provide a systematic way of classifying the pedestrians in the analyzed frame of the image based on their shape using near IR camera.
A further objective of the invention is to provide a systematic way of tracking the movement of the classified pedestrian using mean shift algorithm.
Yet another objective of the invention is to provide a system and method for detecting and tracking the pedestrians which is simple and cost effective.
SUMMARY OF THE INVENTION
Before the method, system, and hardware enablement of the present invention are described, it is to be understood that this invention is not limited to the particular systems and methodologies described, as there can be multiple possible embodiments of the present invention which are not expressly illustrated in the present disclosure. It is also to be understood that the terminology used in the description is for the purpose of describing the particular versions or embodiments only, and is not intended to limit the scope of the present invention which will be limited only by the appended claims.
The present invention provides a system and method for detecting, classifying and tracking the pedestrian present in front of the vehicle while driving for avoiding collision, which is simple, easy to install, and provides higher accuracy at a lower cost.
The present invention embodies a cost effective method for detecting, classifying and tracking the pedestrian present in front of the vehicle using images captured by near infrared (IR) camera disposed on the vehicle, wherein the said method comprises the processor implemented steps of:
detecting the road to focus attention and to filter the region of interest (ROI) objects in the said image by estimating the ground plane; eliminating the non-ground objects based on their distance to ground; filtering the non-ROI objects based on the shape of such objects by computing the signal to noise ratio (SNR) for each of such non-ROI objects; eliminating the non-vertical objects by computing inertial moment relative to x and y axis with respect to the centre of mass of such non-vertical objects; classifying the pedestrians in the analyzed frame of the image based on their shape; and tracking the movement of the classified pedestrian using mean shift algorithm.
In one aspect of the invention tracked data with respect to the pedestrian is further communicated by the processor to an output means.
In yet another aspect of the invention, an alert means warns the driver of the presence of one or more pedestrians, wherein the alert means can be an audio or audio visual device, such as an alarm, a voice based caution, an indicator or a display.
In accordance with another aspect of the invention, near IR camera can be disposed either on the dashboard or in front or inside or top of the vehicle. In one exemplary embodiment of the invention, the near IR camera can be disposed in front of the vehicle.
BRIEF DESCRIPTION OF DRAWINGS
The foregoing summary, as well as the following detailed description of preferred embodiments, is better understood when read in conjunction with the appended drawings. For the purpose of illustrating the invention, there is shown in the drawings example constructions of the invention; however, the invention is not limited to the specific methods and apparatus disclosed in the drawings:
Figure 1 is a flowchart which illustrates a method for detecting, classifying and tracking the pedestrian present in front of the vehicle while driving for avoiding collision according to various embodiments of the invention.
DETAILED DESCRIPTION OF THE INVENTION
Some embodiments of this invention, illustrating its features, will now be discussed in detail. The words "comprising," "having," "containing," and "including," and other forms thereof, are intended to be equivalent in meaning and be open ended in that an item or items following any one of these words is not meant to be an exhaustive listing of such item or items, or meant to be limited to only the listed item or items. It must also be noted that as used herein and in the appended claims, the singular forms "a," "an," and "the" include plural references unless the context clearly dictates otherwise. Although any systems and methods similar or equivalent to those described herein can be used in the practice or testing of embodiments of the present invention, the preferred systems and methods are now described. The disclosed embodiments are merely exemplary of the invention, which may be embodied in various forms by a person skilled in the art.
A cost effective method for detecting, classifying and tracking the pedestrian present in front of the vehicle by using images captured by near infrared (IR) camera disposed on the vehicle, the said method comprising the processor implemented steps of:
a) detecting the road to focus attention and to filter the region of interest (ROI) objects in the said image by estimating the ground plane;
b) eliminating the non-ground objects based on their distance to ground;
c) filtering the non-ROI objects based on the shape of such objects by computing the signal to noise ratio (SNR) for each of such non-ROI objects;
d) eliminating the non-vertical objects by computing inertial moment relative to x and y axis with respect to the centre of mass of such non-vertical objects;
e) classifying the pedestrians in the analyzed frame of the image based on their shape; and
f) tracking the movement of the classified pedestrian using mean shift algorithm.
The present invention provides a system and method for detecting, classifying and tracking the pedestrian present in front of the vehicle while driving for avoiding collision which is simple, easy to install, provides higher accuracy at a lower cost.
According to one exemplary embodiment of the invention, a cost effective system comprises a near IR camera disposed on the vehicle for capturing an image; and a processor for analyzing the captured image in real-time for detecting, classifying and tracking the pedestrian present in front of the vehicle.
Figure 1 is a flowchart which illustrates a method 100 for detecting, classifying and tracking the pedestrian present in front of the vehicle while driving for avoiding collision according to various embodiments of the invention.
In one embodiment of the invention, a cost effective method 100 is provided for detecting, classifying and tracking the pedestrian present in front of the vehicle by using images captured by a near infrared (IR) camera disposed on the vehicle. In accordance with another aspect of the invention, the near IR camera can be disposed on the dashboard, in front of, inside, or on top of the vehicle. In one example embodiment of the invention, the near IR camera is disposed in front of the vehicle.
In accordance with one aspect of the invention the resolution of the near IR camera can be selected from 640*480, 720*480, etc. In an exemplary embodiment of the invention, the resolution of the near IR camera is 720*480. In accordance with one aspect of the invention the IR range of the near IR camera can be selected from a range of (0.7-1) to 5 microns for detecting and tracking the pedestrians. In an exemplary embodiment of the invention, the IR range of the near IR camera can be selected from a range of 0.7 to 5 microns. In accordance with one aspect of the invention the temperature range of the near IR camera can be selected from 740 to 3,000-5,200 Kelvin for detecting and tracking the pedestrians.
In accordance with one aspect of the invention the processor can be disposed either in the body of the near IR camera or on the dashboard of the vehicle. In one exemplary embodiment of the invention, the processor is disposed in the body of the near IR camera. In accordance with another aspect of the invention the processor can be selected from the group of Davinci DM6446 Processor, ADSP-BF533, 750 MHz Blackfin Processor.
The above said cost effective method comprises various processor implemented steps. In the case of object detection with an on-board near IR camera, the ground region first needs to be estimated. In the first step of the proposed method, the processor detects the road to focus attention and to filter the region of interest (ROI) objects in the said image by estimating the ground plane. In order to detect the ground region, the processor executes the following steps:
Initially, image differentiation 102 is done using the Sobel operator, and the differentiated image is then thresholded 104 using the Otsu algorithm. According to one aspect of the invention, threshold parameters for differentiated images can be varied based on the application. In the thresholded binary image, starting at each bottommost pixel along the width of the image, the processor traverses upwards until there is a 0-1 transition. The region 106 from the bottommost pixel to the 0-1 transition indicates a smooth region, which indicates road. The pixels of such a region are marked as 'X'. This procedure is repeated by the processor for each bottommost pixel of the image.
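The column-wise traversal described above can be sketched as follows. This is a minimal illustration rather than the claimed implementation: it assumes the thresholded edge image is already available as a list of rows of 0/1 values (top row first), and the function name `mark_road_columns` is hypothetical.

```python
def mark_road_columns(edges):
    """Mark the smooth region at the bottom of each column as road ('X').

    edges: list of rows (top to bottom) of 0/1 ints taken from the
    thresholded, differentiated image (assumed input format).
    Returns a mask of the same size; True means the pixel is marked 'X'.
    """
    height = len(edges)
    width = len(edges[0]) if height else 0
    marked = [[False] * width for _ in range(height)]
    for col in range(width):
        # Walk upwards from the bottommost pixel of this column.
        for row in range(height - 1, -1, -1):
            if edges[row][col] == 1:
                break  # 0-1 transition: an edge ends the smooth region
            marked[row][col] = True
    return marked
```

A column whose bottommost pixel is already an edge pixel contributes no marked pixels in this sketch.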
In order to detect the complete ground region 108, the processor divides the image intensity into 16 bins and traverses from the bottommost pixel up to half of the height of the image. Those pixels which are marked as 'X' are considered as seed pixels. For each seed pixel, the processor checks the neighboring pixels and adds them to the ground region if they are similar to the seed, where similarity is measured by computing the Euclidean distance (D). This process is repeated for each of the newly added pixels and stops when no more pixels can be added. Newly added pixels are also marked as 'X' by the processor. This method is based on the assumption that roads have a relatively constant temperature and thus produce no edges in an edge detected image.
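The region growing step can be illustrated with the short sketch below. Several details are assumptions not fixed by the text: 8-bit gray levels quantized as `value // 16`, 4-connected neighbors, comparison of each neighbor against the pixel it grows from, and a hypothetical similarity threshold `max_dist` on the distance D (for scalar bin values the Euclidean distance reduces to an absolute difference).

```python
from collections import deque

def grow_ground(gray, marked, max_dist=1):
    """Grow the ground region from seed pixels already marked 'X'.

    gray:   list of rows of 8-bit intensities (assumed input format).
    marked: mask of pixels marked 'X' by the column traversal.
    max_dist: hypothetical threshold on the bin distance D.
    """
    height, width = len(gray), len(gray[0])
    bins = [[v // 16 for v in row] for row in gray]  # 256 levels -> 16 bins
    ground = [row[:] for row in marked]
    # Seeds: marked pixels found between the bottom row and half the height.
    queue = deque((r, c) for r in range(height // 2, height)
                  for c in range(width) if marked[r][c])
    while queue:
        r, c = queue.popleft()
        for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            if 0 <= nr < height and 0 <= nc < width and not ground[nr][nc]:
                # For scalar bins the Euclidean distance D is |difference|.
                if abs(bins[nr][nc] - bins[r][c]) <= max_dist:
                    ground[nr][nc] = True      # newly added pixel, marked 'X'
                    queue.append((nr, nc))     # and grown from in turn
    return ground
```

The loop ends when the queue is empty, which corresponds to the condition that no more pixels can be added.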
In the second step of the proposed method, the processor eliminates the non-ground objects based on their distance to ground. In order to eliminate the non-ground objects, the processor executes the following steps:
Initially, the original gray level image is thresholded 110 using the Otsu algorithm. In the thresholded binary image, the processor eliminates the pixels which are marked as 'X' and then executes connected component analysis on the binary image. Let 'Lr', 'Hr', 'Lc' and 'Hc' be the lowermost row, highermost row, leftmost column and rightmost column, respectively, which constitute the bounding box of a component. The processor then computes the mean (μ) and standard deviation (σ) of 'Hr' over all the components. Any component whose 'Hr' is less than (μ + σ) 112 is deleted by the processor.
After the performance of the above mentioned steps there would still be many non-ROIs which are bright stripe objects 114, such as metallic sign boards, light sign boards, traffic signs, headlights, electric poles, and the poles of a guardrail. These objects are regular in shape.
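The statistical filter on 'Hr' can be sketched as below, assuming the connected component analysis has already produced one bounding box per component. The tuple layout `(Lr, Hr, Lc, Hc)`, the function name, and the use of the population standard deviation are assumptions made for illustration.

```python
from statistics import mean, pstdev

def filter_by_hr(boxes):
    """Keep only components whose 'Hr' is at least mu + sigma.

    boxes: list of (Lr, Hr, Lc, Hc) bounding boxes (assumed layout).
    """
    if not boxes:
        return []
    hr_values = [box[1] for box in boxes]
    mu = mean(hr_values)
    sigma = pstdev(hr_values)  # population standard deviation (assumption)
    # Components with Hr < mu + sigma are deleted, as described in the text.
    return [box for box in boxes if box[1] >= mu + sigma]
```

For example, four components with 'Hr' values 10, 10, 10 and 50 give μ = 20 and σ ≈ 17.3, so only the component with 'Hr' = 50 is retained.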
In the third step of the proposed method, the processor filters the non-ROI objects based on the shape of such objects by computing the signal to noise ratio (SNR) for each of such non-ROI objects. In order to detect these regular shapes, the processor executes the following steps:
For each component, the processor finds the exterior boundary. A boundary pixel is a pixel whose value is '1' and at least one of whose 8 neighbors has the pixel value '0'. The processor then computes the centroid $(x_c, y_c)$ of the blob and the distance $r(n)$ from the centroid of the blob to the shape contour.
The processor then computes the N-point DFT $R(f)$ of $r(n)$ and searches for the local maxima of the amplitudes of the frequency content. If the local maxima occur periodically, their period is computed. It is observed that objects which are regular in shape have periodic local maxima, whereas objects which are irregular in shape have no periodic local maxima. The processor then computes a replicated version $\hat{R}(f)$ of $R(f)$, where M is the periodicity.
In the next step, the processor computes the N-point IDFT of $\hat{R}(f)$ to obtain $\hat{r}(n)$ and then computes the error signal
$$e(n) = r(n) - \hat{r}(n)$$
The processor then computes the signal to noise ratio (SNR),
$$\mathrm{SNR} = 10 \log(S_r / S_e)$$
where $S_r = \sum_n (r[n])^2$ and $S_e = \sum_n (e[n])^2$.
The SNR will be very high for a regularly shaped object, whereas for irregularly shaped objects the SNR will be low. In this way, street lights, car headlights, pillars and lamp posts are easily eliminated.
In the fourth step of the proposed method, the processor eliminates the non-vertical objects 116 by computing inertial moment relative to x and y axis with respect to the centre of mass of such non-vertical objects. In the gray level thresholded binary image, there would still be components which are aligned more horizontally than vertically. Pedestrian objects are usually vertical in nature. In order to detect vertically elongated objects 118, the normalized central moments are used by the processor.
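The SNR test can be illustrated with the sketch below. The exact construction of the replicated spectrum is not recoverable from the text, so the sketch makes a loudly stated assumption: the replicated version keeps only the DFT coefficients at multiples of the periodicity M and zeroes the rest. A base-10 logarithm is also assumed, and the function names are hypothetical.

```python
import cmath
import math

def dft(x):
    """Plain N-point DFT."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * f * t / n) for t in range(n))
            for f in range(n)]

def idft(spectrum):
    """Plain N-point inverse DFT."""
    n = len(spectrum)
    return [sum(spectrum[f] * cmath.exp(2j * cmath.pi * f * t / n)
                for f in range(n)) / n
            for t in range(n)]

def shape_snr_db(r, period):
    """SNR between a centroid-distance signal r(n) and its periodic model."""
    spectrum = dft(r)
    # ASSUMPTION: the replicated spectrum keeps only multiples of `period`.
    replicated = [c if f % period == 0 else 0 for f, c in enumerate(spectrum)]
    r_hat = [v.real for v in idft(replicated)]
    error = [a - b for a, b in zip(r, r_hat)]
    s_r = sum(v * v for v in r)
    s_e = sum(v * v for v in error)
    if s_e == 0.0:
        return math.inf  # perfectly periodic signal
    return 10 * math.log10(s_r / s_e)
```

Under these assumptions, a signal whose energy sits only at multiples of M is reconstructed almost exactly and yields a very large SNR, while energy at other frequencies ends up in the error signal and lowers the SNR.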
$$M_x = \frac{1}{\mathrm{Area}} \sum_x \sum_y (x - x_c)^2 \, I(x, y)$$
$$M_y = \frac{1}{\mathrm{Area}} \sum_x \sum_y (y - y_c)^2 \, I(x, y)$$
$$M_{xy} = \frac{1}{\mathrm{Area}} \sum_x \sum_y (x - x_c)(y - y_c) \, I(x, y)$$
where $M_x$ is the inertial moment relative to the X axis with respect to the centre of mass, $M_y$ is the inertial moment relative to the Y axis with respect to the centre of mass, and $M_{xy}$ is the inertial moment relative to both the X and Y axes with respect to the centre of mass.
Since pedestrians are more vertical in nature, $M_y > M_x$.
If $M_y / M_x > th$, then the object is vertically elongated; otherwise it is horizontally elongated and is deleted by the processor. In one exemplary embodiment of the invention, the estimated threshold was around 3.2.
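The vertical elongation test can be sketched as follows. The sketch assumes that the moment relative to the X axis measures the spread of blob pixels along x and the moment relative to the Y axis measures the spread along y, which is the reading consistent with vertical objects having the larger Y moment. The blob is assumed to be a list of rows of 0/1 values, and the function name is hypothetical.

```python
def is_vertically_elongated(blob, th=3.2):
    """Return True if the ratio of Y moment to X moment exceeds th.

    blob: list of rows of 0/1 ints (assumed input format).
    th:   threshold; 3.2 is the value quoted in the text.
    """
    pixels = [(x, y) for y, row in enumerate(blob)
              for x, value in enumerate(row) if value]
    area = len(pixels)
    if area == 0:
        return False
    x_c = sum(x for x, _ in pixels) / area  # centre of mass
    y_c = sum(y for _, y in pixels) / area
    m_x = sum((x - x_c) ** 2 for x, _ in pixels) / area
    m_y = sum((y - y_c) ** 2 for _, y in pixels) / area
    if m_x == 0:
        # Single-column blob: treated as vertical if it has any height
        # (an edge-case choice made for this sketch only).
        return m_y > 0
    return m_y / m_x > th
```

For a blob 2 pixels wide and 8 pixels tall the two moments are 0.25 and 5.25, giving a ratio of 21, well above the quoted threshold.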
In the fifth step of the proposed method, the processor classifies the pedestrians in the analyzed frame of the image based on their shape. Once the basic structure of the objects and a qualitative hint about pedestrians have been obtained, a more accurate check is necessary. For classification, pedestrian shape is used as a cue.
In order to classify the pedestrians in the analyzed frame of the image, the processor executes the following steps: For each ROI, the processor finds the exterior boundary. A boundary pixel is a pixel whose value is '1' and at least one of whose 8 neighbors has the pixel value '0'. Initially, the processor computes the centroid $(x_c, y_c)$ of the blob and represents the boundary as a complex co-ordinate function
$$Z(n) = [x(n) - x_c] + i[y(n) - y_c]$$
This shift makes the shape representation invariant to translation. The object shape and the model shape can have different sizes. Consequently, the number of data points of the object and model representations will also be different. To avoid this problem, the shape boundaries of objects and models must be sampled to have the same number of data points.
a) Assume K is the total number of candidate points to be sampled along the shape boundary. Equal angle sampling selects candidate points spaced at the equal angle
$$\theta = 2\pi / K$$
b) The Fourier descriptor (FD) is obtained by computing a 32-point FFT on the complex co-ordinate function:
$$a(u) = \sum_{k=0}^{K-1} Z(k) \exp(-j 2\pi u k / K)$$
c) Rotation invariance of the FDs is achieved by ignoring the phase information and by taking only the magnitude values of the FDs.
d) Scale invariance is then obtained by dividing the magnitude values of the first half of the FDs by the DC component:
$$f = \left[ \frac{|FD_1|}{|FD_0|}, \frac{|FD_2|}{|FD_0|}, \ldots, \frac{|FD_{N/2}|}{|FD_0|} \right]$$
e) Now, for a model shape indexed by the FD feature $f_m = [f_m^1, f_m^2, \ldots, f_m^{N_c}]$ and a data shape indexed by the FD feature $f_d = [f_d^1, f_d^2, \ldots, f_d^{N_c}]$, since both features are normalized as to translation, rotation and scale, the Euclidean distance between the two feature vectors can be used as the similarity measurement:
$$d = \left( \sum_{i=1}^{N_c} |f_m^i - f_d^i|^2 \right)^{1/2}$$
where $N_c$ is the truncated number of harmonics needed to index the shape.
$$\mathrm{Pedestrian} = \begin{cases} 1, & d < \text{threshold} \\ 0, & d > \text{threshold} \end{cases}$$
Finally, pedestrians are classified based on the above comparison in the image.
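The shape comparison can be sketched as below, under stated assumptions. The centroid is taken as the mean of the boundary points rather than of the full blob, and equal angle sampling picks the boundary point nearest in angle to each of the K directions. The sketch also deliberately swaps the normalization: it divides by the magnitude of the first harmonic instead of the DC component, because once the boundary is shifted to its centroid the DC term can be close to zero and the division becomes unstable. All function names and the threshold value are hypothetical.

```python
import cmath
import math

def fourier_descriptor(boundary, k_samples=32):
    """Magnitude-only Fourier descriptor of a closed boundary.

    boundary: list of (x, y) points on the exterior contour (assumed input).
    """
    x_c = sum(x for x, _ in boundary) / len(boundary)
    y_c = sum(y for _, y in boundary) / len(boundary)
    # Translation invariance: shift to the centroid, skip a degenerate zero.
    shifted = [complex(x - x_c, y - y_c) for x, y in boundary]
    shifted = [z for z in shifted if z != 0]
    samples = []
    for k in range(k_samples):
        direction = cmath.exp(2j * math.pi * k / k_samples)
        # Equal angle sampling: nearest boundary point in angle.
        samples.append(min(shifted,
                           key=lambda z: abs(cmath.phase(z / direction))))
    mags = [abs(sum(samples[k] * cmath.exp(-2j * math.pi * u * k / k_samples)
                    for k in range(k_samples)))
            for u in range(k_samples)]
    # ASSUMPTION: normalize by |FD_1| rather than by the DC component.
    return [m / mags[1] for m in mags[2:k_samples // 2 + 1]]

def shape_distance(f_model, f_data):
    """Euclidean distance between two descriptor vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(f_model, f_data)))

def is_pedestrian(f_model, f_data, threshold):
    """Decision rule: pedestrian if the distance is below the threshold."""
    return shape_distance(f_model, f_data) < threshold
```

With this sketch, an ellipse and a scaled, translated copy of the same ellipse produce nearly identical descriptors, whereas an ellipse and a circle are clearly separated.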
In the final step of the proposed method, the processor tracks 120 the movement of the classified pedestrian using mean shift algorithm. In order to track the movement of the classified pedestrian, the processor executes the following steps:
After locating pedestrians in the current frame, mean shift tracking is employed from the next frame onwards to track the pedestrians. The mean shift tracking algorithm is an appearance based tracking method. It employs mean shift iterations to find the target candidate that is the most similar to a given model in terms of intensity distribution, with the similarity of the two distributions being expressed by a metric based on the Bhattacharyya coefficient. The derivation of the Bhattacharyya coefficient from sample data involves the estimation of the target density q and the candidate density p, for which the histogram formulation is employed.
First, considering the centroid of the pedestrian blob as the centre $y_0$, the target model histogram is calculated by considering the feature space.
Compute the 32-bin histogram q on the edge based thresholded image. Target model: $q = \{q_u\}, \; u = 1, 2, 3, \ldots, 32$.
From the next frame onwards, the centre of the target is initialized at its previous location ($y_0$) and the target candidate histogram is calculated by considering the same feature space.
Update the 32-bin histogram p on the edge based thresholded image.
Now the distance between the target model and the target candidate histogram is calculated,
$$d(y) = \sqrt{1 - \rho[p(y), q]}$$
where $\rho[\cdot]$ is the Bhattacharyya coefficient between p and q. The displacement of the target centre is calculated by the weighted mean,
$$y_1 = \frac{\sum_i x_i w_i}{\sum_i w_i}, \quad \text{where } w_i = \sqrt{\frac{q_u}{p_u(y_0)}}$$
with $x_i$ denoting the pixel locations in the candidate window and u the histogram bin of pixel $x_i$.
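The histogram comparison and the weighted-mean displacement can be sketched as follows. The sketch assumes a uniform weighting of pixels inside the candidate window, that each pixel has already been assigned a histogram bin index, and that each weight is the square root of the ratio of the model bin value to the candidate bin value for that pixel's bin. The function names and data layout are hypothetical.

```python
import math

def histogram(bin_indices, n_bins=32):
    """Normalized histogram from per-pixel bin indices."""
    counts = [0] * n_bins
    for b in bin_indices:
        counts[b] += 1
    total = sum(counts) or 1
    return [c / total for c in counts]

def bhattacharyya(p, q):
    """Bhattacharyya coefficient between two normalized histograms."""
    return sum(math.sqrt(a * b) for a, b in zip(p, q))

def histogram_distance(p, q):
    """Distance d = sqrt(1 - rho), clamped against rounding error."""
    return math.sqrt(max(0.0, 1.0 - bhattacharyya(p, q)))

def mean_shift_step(pixels, p, q):
    """New centre as the weighted mean of the window pixel positions.

    pixels: list of ((x, y), bin_index) for pixels in the candidate window.
    p, q:   candidate and model histograms.
    """
    sum_x = sum_y = sum_w = 0.0
    for (x, y), u in pixels:
        w = math.sqrt(q[u] / p[u]) if p[u] > 0 else 0.0
        sum_x += w * x
        sum_y += w * y
        sum_w += w
    if sum_w == 0:
        return None  # no usable pixels in the window
    return (sum_x / sum_w, sum_y / sum_w)
```

Pixels whose bin is more prominent in the model than in the candidate receive larger weights, so the centre moves towards the part of the window that looks more like the target.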
Once the new location of target is found, the processor executes the following steps:
a. Computes the target candidate histogram at the new location with the same feature space involving the histogram equalized image range and the bottom hat transformed image.
b. Computes $\rho[p(y_1), q]$.
c. While $\rho[p(y_1), q] < \rho[p(y_0), q]$, do $y_1 \leftarrow (y_0 + y_1)/2$ and evaluate $\rho[p(y_1), q]$.
d. If $\|y_1 - y_0\| < \epsilon$, stop.
e. Otherwise, set $y_0 \leftarrow y_1$, derive the weights, then the new location, and go to step 1.
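The refinement loop in the steps above can be sketched generically. To keep the example self-contained, the similarity evaluation and the weighted-mean proposal are passed in as the hypothetical callables `rho` and `propose`; the convergence tolerance `eps` and the iteration caps are also assumptions added so that the sketch always terminates.

```python
import math

def refine_location(y0, y1, rho, max_halvings=20):
    """Halve the step towards y0 while the similarity at y1 is worse."""
    for _ in range(max_halvings):
        if rho(y1) >= rho(y0):
            break
        y1 = ((y0[0] + y1[0]) / 2, (y0[1] + y1[1]) / 2)
    return y1

def track(y0, propose, rho, eps=0.5, max_iter=20):
    """Iterate propose/refine until the centre moves less than eps.

    propose(y) -> candidate centre from the weighted mean (hypothetical).
    rho(y)     -> Bhattacharyya coefficient at centre y (hypothetical).
    """
    for _ in range(max_iter):
        y1 = refine_location(y0, propose(y0), rho)
        if math.dist(y0, y1) < eps:
            return y1      # converged
        y0 = y1            # otherwise restart from the new location
    return y0
```

In a real tracker `propose` would recompute the weights and the weighted mean from the current frame, and `rho` would rebuild the candidate histogram at the given centre.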
The above said method further comprises the step of warning the driver for avoiding collision, characterized by use of the tracking data of the pedestrian, wherein the alert means is used for warning the driver, and wherein the said alert means can be audio and audio visual means including but not limited to an alarm, a voice based caution, or an indicator.
The preceding description has been presented with reference to various embodiments of the invention. Persons skilled in the art and technology to which this invention pertains will appreciate that alterations and changes in the described process and methods of operation can be practiced without meaningfully departing from the principle, spirit and scope of this invention.
ADVANTAGES OF THE INVENTION
A system and method as proposed in the present invention has the following advantages:
1. The present invention provides a system and method for detecting, classifying and tracking the pedestrian present in front of the vehicle while driving for avoiding collisions which is easy to install and execute.
2. The system of the present invention also provides a method for detecting, classifying and tracking the pedestrian present in front of the vehicle while driving for avoiding collision having a reasonably high accuracy as compared to the existing conventional systems.
3. The present invention also provides a system and method for detecting, classifying and tracking the pedestrian present in front of the vehicle while driving for avoiding collision which is cost effective as compared to the conventional systems.