WO2002097758A1

WO2002097758A1 - Drowning early warning system

Info

Publication number: WO2002097758A1
Application number: PCT/SG2002/000105
Authority: WO
Inventors: Wei Yun Yau; Wen Miao Lu; Siew Wah Alvin Harvey Kam; Yap Peng Tan
Original assignee: Nanyang Technological University, Centre For Signal Processing
Priority date: 2001-05-25
Filing date: 2002-05-24
Publication date: 2002-12-05
Also published as: SG95652A1; WO2002097758A9

Abstract

This invention describes an audio-visual method and system for early drowning detection. A number of cameras (100, 200, 201) are mounted above a swimming pool (101). These cameras (100, 200, 201) monitor swimmers in the pool (101) with the aid of an array of microphones (103), which is likewise mounted above the water to cover the entire swimming pool (101). Based on the motion and activity of the swimmer detected through the video cameras (100, 200, 201), the swimmer's condition is automatically analyzed. Such automated video analysis includes building the visual background model of the pool (101), detecting the presence of swimmers in the monitored area (106), estimating the number of swimmers inside the monitored area (106), tracking the swimmers and analyzing the behaviour of each tracked swimmer in terms of body orientation, moving direction and motion patterns. In addition, the microphone array (103) is deployed to pick up audio signals (104) originating from distress calls. Once the system detects a potential drowning, both visual and audio alarms are activated to draw the attention of the person in charge for further confirmation and, if necessary, follow-up rescue action.

Description

Drowning Early Warning System

Background of the invention

The present invention relates to an audio-visual system capable of monitoring a swimming pool and an automated method of analyzing swimmers' conditions to detect potential drowning incidents based on the received audio and video signals. A number of video cameras and an array of microphones are strategically placed above the water around the pool such that the entire swimming pool is covered. Through the processing of the video sequence and/or the audio signal, abnormal conditions are detected and treated as potential drowning cases. Such a system serves as an aid to the life-guards on duty or as a distress call to alert nearby people.

At present, there does not exist a viable automated system that monitors a swimming pool for a person in distress by analyzing swimmers' behavior. Yet there are many swimming pools worldwide, located in public places, private houses, condominiums and hotels. Unlike public pools, most private pools do not have lifeguards on duty. Therefore, a system that can provide monitoring assistance to such pools would be useful. In addition, a system capable of alerting the life-guards on duty to potential distress or drowning incidents would be helpful. Timely rescue of the person in distress or the drowning victim is critical in saving lives and in preventing irreversible injuries. It is thus highly desirable to have an automatic system capable of detecting distress or potential drowning accidents at an early stage.

Related art

A few prior-art documents describe the monitoring of swimming pools to detect drowning cases. Most describe floating devices that sound an alarm if there is a wave (US Patent Nos. 3953843, 4510487, 4775854), an echo (US Patent Nos. 5274607, 5369623) or a disturbance of the water (US Patent Nos. 3969712 and 5923263). These devices are mainly useful for detecting people, presumably unauthorized, entering swimming pools. US Patent Nos. 4747085 and 4932009 describe the use of an array of transducers (typically ultrasonic) to detect the rate of motion of the swimmer; if the rate is suspicious, drowning is assumed. US Patent No. 5043705 describes the use of ultrasonic radar to scan the bottom and top layer of the swimming pool to detect motionless bodies, assumed to be possible drowning victims, while US Patent No. 5043705 detects motionless bodies by using sonar scanning upward from the bottom of the pool. In another approach, US Patent Nos. 5097254, 5408222 and 5907281 rely on a device worn by the swimmer. If the swimmer goes below a certain depth for an extended duration of time, an alarm is generated or a float activated. US Patent No. 6111510 uses a microphone system to detect heartbeats and breathing sounds and measures the interval between the absence and presence of these sounds to determine the possibility of drowning.

The only patents that use video cameras for processing are as follows:

1. 5886630 describes the use of video cameras to detect motionless bodies at the bottom of a swimming pool. The presence of such motionless bodies is used as an indication of possible drowning.

2. 6133838 describes a method which involves the installation of multiple underwater cameras at the side of a swimming pool. Drowning is assumed if a body is detected to be moving slowly or motionless underwater beyond a predetermined time.

None of them, however, describes a method of analyzing swimmers' behavior. Most approaches assume that absence of motion or slow motion is an indication of drowning. There is, however, a significant possibility that standing, swimming on the spot, diving or playing underwater could trigger false alarms. To reduce such false alarms, the systems above have to be installed sufficiently deep. In that case, distress or the early drowning of a child at shallower places cannot be detected; moreover, early distress calls by a swimmer and the typical initial struggles on the water surface before sinking are ignored. By the time the body sinks to such a depth, there is little time left to rescue the victim. Furthermore, the above systems require underwater cameras, whose installation is laborious and expensive and interrupts the pool's operation. Therefore, a system of this type leaves much to be desired.

Summary of the Invention

It is an object of the present invention to provide a system and method for monitoring a swimming pool to indicate a risk of drowning of swimmers, which system and method only make use of above-water video cameras with optional microphones to cover a swimming pool and which can reliably indicate a risk of drowning of swimmers.

The object is achieved by analyzing swimmers' conditions to detect distress or possible drowning, making use of cues of swimmers' behavior such as body orientation, area of the body above water, moving direction and motion symmetries, the image features of the areas surrounding the swimmers (for example, water ripple patterns), sudden changes in the swimming pattern, irregular activity, and calls for assistance.

The present invention differs from the existing methods at least in the following aspects:

1. The system is installed above the water, making installation and maintenance easy and inexpensive. The system can be installed in existing swimming pools without the troublesome and expensive procedures of complete water drainage or costly renovations for cabling and installation of underwater devices.

2. The system monitors distress and early drowning signs such as irregular swimming patterns, signs of struggling and sudden submerging of the body, as well as calls for assistance. It does not depend solely on the motionless cue.

The invention provides a method for monitoring a swimming pool to indicate a risk of drowning of swimmers, the method comprising: taking a plurality of subsequent images of a monitoring region at least partly containing a water surface of water contained in the swimming pool by means of a camera outside the water at a plurality of predetermined subsequent moments in time, in each image, detecting the presence of swimmer image portions each of which shows a swimmer present in the image, processing each detected swimmer image portion, so as to assign to the respective swimmer image portion a characteristic two-dimensional geometrical figure, wherein the geometrical figure is characterized by at least one predetermined geometrical attribute, assigning to each geometrical figure a figure position of the geometrical figure, wherein the figure position corresponds to a position of the detected swimmer image portion in the image, for at least one pair of two subsequent images, i.e. of a presently processed present image and a previously processed previous image, comparing values of at least one out of the figure position and the at least one geometrical attribute of the present image to that of the previous image, so as to detect a change in the figure position/ geometrical attribute of the present image as compared to the previous image, based on the detected change in the figure position/ geometrical attribute of the subsequent images, assigning to the corresponding swimmer either a drowning condition indicating that there is a risk of drowning of the swimmer or a safe condition indicating that there is no risk of drowning of the swimmer, and outputting an output signal if a drowning condition is assigned to the swimmer.

The invention further provides a system for monitoring a swimming pool, the system comprising: at least one camera being installed for taking a plurality of images of a monitoring region at least partly containing a water surface of water contained in the swimming pool at a plurality of subsequent moments in time, the camera further being installed outside the water, and a computer being coupled to the camera to receive an image taken by the camera and being installed to process the image, wherein the computer comprises: a means for detecting the presence of swimmer image portions each of which shows a swimmer present in the image, processing each detected swimmer image portion, so as to assign to the respective swimmer image portion a characteristic two-dimensional geometrical figure, wherein the geometrical figure is characterized by at least one predetermined geometrical attribute, a means for assigning to each geometrical figure a figure position of the geometrical figure, wherein the figure position corresponds to a position of the detected swimmer image portion in the image, a means for comparing, for at least one pair of two subsequent images, i.e. of a presently processed present image and a previously processed previous image, values of at least one out of the figure position and the at least one geometrical attribute of the present image to that of the previous image, so as to detect a change in the figure position/ geometrical attribute of the present image as compared to the previous image, a means for assigning to the corresponding swimmer, based on the detected change in the figure position/ geometrical attribute of the subsequent images, either a drowning condition indicating that there is a risk of drowning of the swimmer or a safe condition indicating that there is no risk of drowning of the swimmer, and a means for outputting an output signal if a drowning condition is assigned to the swimmer.

Brief discussion of the drawings

Fig. 1 shows a system setup for an embodiment of the system according to the present invention;

Fig. 2 illustrates three overlapping images of three different sub-regions of a monitoring region, the images taken by three different cameras;

Fig. 3 shows (a) a typical image of a swimming pool and histograms of the (b) Red, (c) Green and (d) Blue color component of the image;

Fig. 4a shows a background scene of a swimming pool;

Fig. 4b shows a segmentation of the image of Fig. 4a with four clusters;

Fig. 4c shows a 3D (4D) scatter plot of the background scene of Fig. 4a and 4b in RGB color space, where each cluster is shown in a different color (gray scale);

Fig. 5a shows one time series of intensity values of a pixel where ripples occur, for a plurality of subsequent images (frames);

Fig. 5b shows one time series of intensity values of a pixel that the swimmer goes through, for a plurality of subsequent images (frames);

Fig. 6 (a) to (d) show images of swimmers in a swimming pool, with detected swimmer image portions, wherein the contour of each detected swimmer image portion is approximated by a best-fit ellipse;

Fig. 7 shows a state flow diagram for the detection of potential drowning;

Fig. 8 shows the orientation of the body of a tracked swimmer for a plurality of subsequent frames taken by a video camera;

Fig. 9 shows the rate of orientation change obtained from Fig. 8; and

Fig. 10 shows an illustration of the overall process flow of an embodiment of the system according to the present invention.

Detailed description of the preferred embodiments

Hardware Setup

A number of video cameras are mounted above the swimming pool and are located around the pool such that these cameras cover the view of the entire swimming pool. Typically, each camera is mounted high up and at an angle viewing downward to the pool area so as to cover a large field of view, reduce occlusions of swimmers and minimize the perspective foreshortening effects. All the cameras are enclosed in a rain-proof compartment suitable for outdoor setting. Figure 1 shows an example of one such camera 100. There is an overlap in the view of each camera 100 and the two cameras 200, 201 at its side. An example is shown in Figure 2 in which 3 video cameras 100, 200, 201 are used to cover the entire view of the pool 101. The view of camera 100 and camera 200 overlaps as well as camera 200 and camera 201.

These video cameras 100, 200, 201 capture the video sequences of the activities inside the pool 101. All the cameras 100, 200, 201 are identical and each is responsible for monitoring a portion of the pool 101. The video sequences obtained will be processed by computers 102 to analyze for potential drowning cases.

In the following section, the method of analyzing one swimmer's behavior is discussed within the context of a single camera 100, the typical arrangement of which is depicted in Figure 1. Similarly, the array of microphones 103 is located such that sound from any part of the swimming pool 101 can be picked up reliably. The audio signal 104 will be enhanced to increase the signal-to-noise ratio. This is also depicted in Figure 1.

The video signal 105 and audio signal 104 are processed by a computer 102 or a cluster of computers. The video signal 105 can be sent to a dedicated monitor for viewing by the person in-charge either via a wired line or a wireless link. Similarly, the audio signal 104 can be sent to the speaker via either wired or wireless means. When an abnormal condition is detected, which could possibly be a distress or drowning incident, the person in-charge will be alerted. The person in-charge can view the video signal 105 and hear the audio signal 104 to decide whether it is a genuine drowning incident and if it is, further rescue operation would ensue. If there is no response from the person in-charge after a short duration of time, a loud audible sound can be emitted to alert people nearby the swimming pool 101.

Operating Principle

The video data acquired from the video camera 100 is sampled and digitized, and the digitized data is made available to the computer 102. Typically, between 1 and 8 samples (frames) are acquired per second. The video data being processed is broken into short sequences, each typically 20 to 60 seconds long. The operation is the same for each sequence and is as follows.

For each sequence, a swimmer detection module is launched to detect and count the number of swimmers. This process is divided into three stages, namely global statistical model generation, segmentation of swimmers and updating the statistical model. These will be described in more detail below.

1. Global statistical model generation

Figure 3 shows the histogram 300 of a background scene of a typical swimming pool 101 in RGB (red, green, and blue) color space. The data behave as expected: the black strip and water pixels form two fairly well-defined peaks. This observation inspired us to employ the kernel-based mean shift procedure [1] to perform mean shift clustering. It provides a mixture of Gaussian distributions for the background scene.

The uniqueness of background scene of swimming pool 101 in a data analysis context lies in the fact that the image data are strongly correlated. Clusters in the joint domain correspond to underlying contiguous regions within the image, the recovery of which has been achieved using the mean shift procedure. Figure 4 shows the 3D scatter plot 402 of the scene 400 of the swimming pool 101 in the RGB color space and its segmentation 401. The image data are assigned to clusters using Euclidean distance. Each cluster is assumed to be a multivariate Gaussian characterized by its mean value and covariance matrix.
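The cluster statistics described above can be illustrated with a minimal sketch. It assumes the cluster centers have already been found (for example by the mean shift procedure, which is not implemented here); the function name `fit_clusters` and its data layout are hypothetical and chosen only for illustration.

```python
def fit_clusters(pixels, centers):
    """Assign each RGB pixel to its nearest center (Euclidean distance) and
    return, per cluster, a tuple (mean, covariance, sample_count)."""
    groups = [[] for _ in centers]
    for px in pixels:
        dists = [sum((a - b) ** 2 for a, b in zip(px, c)) for c in centers]
        groups[dists.index(min(dists))].append(px)
    models = []
    for members in groups:
        n = len(members)
        if n == 0:
            models.append(None)  # empty cluster: no statistics available
            continue
        dim = len(members[0])
        mean = [sum(p[d] for p in members) / n for d in range(dim)]
        cov = [[sum((p[r] - mean[r]) * (p[c] - mean[c]) for p in members) / n
                for c in range(dim)] for r in range(dim)]
        models.append((mean, cov, n))
    return models
```

The sample count is kept because the cluster with the largest sample size is later used to estimate the overall brightness of the pool.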

2. Segmentation of swimmers

Swimmers in the scene are characterized by a relatively large deviation from the background statistics. Changes in the model scene are computed at every frame by comparing the current frame with the model. With the set of M clusters $\{C_i\}_{i=1,\dots,M}$ that model the background scene at time t, a similarity measure is computed between the incoming frame and the background using the normalized Mahalanobis distance. This provides an estimate of the similarity with the background at every pixel. The distance measure of a pixel in the current frame is:

$$D_p(X_t) = \min_{i = 1, \dots, M} D_p(X_t \mid C_i),$$

$$D_p(X_t \mid C_i) = \ln\lvert\Sigma_{i,t}\rvert + (X_t - \mu_{i,t})^{T}\, \Sigma_{i,t}^{-1}\, (X_t - \mu_{i,t}),$$

where $D_p(X_t)$ measures the lowest difference between a cluster centroid and the projection of the pixel's observed value on the sub-space spanned by the cluster; $\mu_{i,t}$ and $\Sigma_{i,t}$ are the mean value and covariance matrix of the i-th cluster at time t. Thus, if a region of the image is sufficiently dissimilar from the modeled background, the system will consider it to be a swimmer. A binary map is then formed for the current frame, in which regions corresponding to the swimmers are highlighted as motion blobs. Basic morphological operations (erosion-dilation) [2] are applied to remove small noise regions and to fill holes in these regions of interest. However, some segmentation errors will still persist due to reflections (or ripples) on the surface of the water, which are not explicitly included in the background model. By including a-priori knowledge of the swimmers, our system is able to further suppress this type of false segmentation.
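As an illustration of the distance measure, the following sketch evaluates the log-determinant plus Mahalanobis term for each cluster and takes the lowest value. To stay short it assumes a diagonal covariance matrix per cluster (one variance per color channel), which is a simplification not stated in the text; the function names are hypothetical.

```python
import math

def cluster_distance(x, mean, var):
    """ln|Sigma| + (x - mu)^T Sigma^-1 (x - mu) for a diagonal Sigma = diag(var)."""
    log_det = sum(math.log(v) for v in var)
    quad = sum((xi - mi) ** 2 / v for xi, mi, v in zip(x, mean, var))
    return log_det + quad

def background_distance(x, clusters):
    """Lowest distance from pixel x to any background cluster.
    `clusters` is a list of (mean, var) pairs."""
    return min(cluster_distance(x, mean, var) for mean, var in clusters)
```

A pixel whose `background_distance` exceeds a chosen threshold would be a candidate swimmer pixel. With full covariance matrices the same formula applies, with the determinant and inverse of each 3x3 matrix in place of the per-channel variances.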

In order to differentiate reflections on the surface of the water from swimmers, we take a time series of intensity values for two types of pixels. One type of pixel has the occurrence of reflections only (with no swimmers, compare first time series 500), while swimmers pass through the other type of pixel (compare second time series 501). Figure 5 shows the typical time series 500, 501 of these two types of pixels. Clearly, there is a distinctive contrast between swimmers and reflections in terms of intensity changes. Swimmers generally cause a significant decrease in the intensity of the background scene, while reflections and/or ripples make the scene brighter. As the pool 101 mainly contains water, the overall brightness of the scene $B_s$ can be obtained from the mean value of the cluster having the largest sample size. We then impose another constraint on the segmentation of the swimmers: $B_s - I_t > T_B$, where $I_t$ is the brightness of the pixel in the current frame and $T_B$ is a conveniently chosen threshold. The binary map in which the foreground regions contain the swimmers is represented by:

$$b(t) = \{\, x : [D_p(X_t) > T] \wedge [B_s - I_t > T_B] \,\},$$

where $\wedge$ denotes the logical 'AND' operator. Thus, the binary map for the frame at time t is defined by the locations where the difference from the background model is greater than a given threshold T and the brightness level is lower than the overall brightness of the pool by more than the threshold $T_B$.
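The two-part test for a foreground pixel can be sketched as follows. The per-pixel distances and intensities are assumed to be given as nested lists of equal shape; the function and parameter names are hypothetical.

```python
def swimmer_mask(distance, intensity, pool_brightness, t_dist, t_bright):
    """Binary map: 1 where a pixel is both dissimilar from the background
    (distance > t_dist) and darker than the pool by more than t_bright."""
    return [[1 if (d > t_dist and pool_brightness - i > t_bright) else 0
             for d, i in zip(d_row, i_row)]
            for d_row, i_row in zip(distance, intensity)]
```

In this sketch a bright ripple pixel fails the second condition even when its distance from the background model is large, which is the suppression effect the brightness constraint is meant to provide.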

3. Updating the background global statistical model

Since our background model gives the global description of the background scene, the update of the model mainly caters for the changes in overall lighting condition. For each frame considered, the existing Gaussian clusters for the background model are updated with the color values of pixels that are not classified as swimmers. The background pixels are assigned to the respective nearest clusters according to the normalized Mahalanobis distance.

The parameters $\mu_{i,t}$ and $\Sigma_{i,t}$ of the i-th cluster which matches the set of N new samples $\{X_k\}$ are updated as follows:

$$\mu_{i,t} = (1 - \rho)\,\mu_{i,t-1} + \rho\,\bar{X}_{i,t}$$

$$\Sigma_{i,t} = (1 - \rho)\,\Sigma_{i,t-1} + \rho\,C_{i,t}$$

where

$$C_{i,t} = \frac{1}{N} \sum_{k=1}^{N} (X_k - \bar{X}_{i,t})(X_k - \bar{X}_{i,t})^{T},$$

$\bar{X}_{i,t}$ is the mean of the N new samples, and $\rho$ is the learning factor for adapting the current distribution of the i-th cluster. In our implementation, the learning factor is a constant to provide faster Gaussian tracking at some expense of accuracy.
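The update rule can be illustrated with a short sketch that blends the stored mean and covariance of one cluster with the sample mean and sample covariance of the newly assigned background pixels. The function name and the use of plain nested lists are assumptions made for illustration.

```python
def update_cluster(mean, cov, samples, rho):
    """Blend a cluster's mean and covariance with the statistics of new
    background samples, using a constant learning factor rho in [0, 1]."""
    n, dim = len(samples), len(mean)
    new_mean = [sum(s[d] for s in samples) / n for d in range(dim)]
    new_cov = [[sum((s[r] - new_mean[r]) * (s[c] - new_mean[c]) for s in samples) / n
                for c in range(dim)] for r in range(dim)]
    mean_out = [(1 - rho) * mean[d] + rho * new_mean[d] for d in range(dim)]
    cov_out = [[(1 - rho) * cov[r][c] + rho * new_cov[r][c] for c in range(dim)]
               for r in range(dim)]
    return mean_out, cov_out
```

A larger learning factor makes the model follow lighting changes more quickly but also makes it more sensitive to misclassified pixels.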

At the end of this segmentation module, the system represents each detected swimmer with a set of attributes, including an identity label, size, colour, centroid position and major orientation of the segmented swimmer, the orientation being represented by the major and minor axes of a best-fit ellipse 600 to 603 around the swimmer. Figure 6 shows detected swimmers in a swimming pool 101 with the superimposed best-fit ellipses 600 to 603.
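The text does not state how the best-fit ellipse is computed. One common approach, assumed here purely for illustration, is to take the ellipse that has the same second-order moments as the blob's pixel set. The function name `ellipse_attributes` is hypothetical.

```python
import math

def ellipse_attributes(pixels):
    """Centroid, size, orientation (radians, image coordinates) and semi-axes
    of the ellipse with the same second moments as the blob's (x, y) pixels."""
    n = len(pixels)
    cx = sum(x for x, _ in pixels) / n
    cy = sum(y for _, y in pixels) / n
    mu20 = sum((x - cx) ** 2 for x, _ in pixels) / n
    mu02 = sum((y - cy) ** 2 for _, y in pixels) / n
    mu11 = sum((x - cx) * (y - cy) for x, y in pixels) / n
    theta = 0.5 * math.atan2(2 * mu11, mu20 - mu02)   # angle of the major axis
    spread = math.hypot((mu20 - mu02) / 2, mu11)
    major = 2 * math.sqrt((mu20 + mu02) / 2 + spread)
    minor = 2 * math.sqrt(max((mu20 + mu02) / 2 - spread, 0.0))
    return {"centroid": (cx, cy), "size": n, "orientation": theta,
            "major": major, "minor": minor}
```

The factor 2 in the axis lengths follows from the variance of a uniformly filled ellipse along a semi-axis of length a being a²/4.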

This entire detection and tracking process is summarized as follows:

1. Convert the original RGB color space into the HSV color space.

2. Determine the area of interest to monitor 106 (water area) during system setup.

3. Compute a global statistical model to represent the background of the swimming pool 101. During system setup, an estimated background scene 400 to 402 for the swimming pool 101 is computed by observing the empty pool without any swimmer for some time, and pre-stored.

4. Detect swimmers by locating regions with large deviation from the background global statistical model.

5. Apply a series of morphological erosion and dilation operations [2] to remove isolated noise pixels obtained above.

6. Connect the changed pixels into contiguous foreground regions, called blobs, using the connected component analysis algorithm [2].

7. Update the background global statistical model.

8. Enclose each newly detected blob with a best-fit ellipse 600 to 603.

9. Label and count the blobs inside the monitored area.
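Steps 6 and 9 of the list above can be sketched with a breadth-first connected component labelling. The sketch assumes 4-connectivity and uses a minimum blob size as a crude stand-in for the morphological noise removal of step 5; both choices, and the function name, are assumptions rather than details taken from the text.

```python
from collections import deque

def label_blobs(mask, min_size=1):
    """4-connected component labelling of a binary map. Returns a list of
    blobs, each a list of (row, col) pixels, dropping blobs under min_size."""
    rows, cols = len(mask), len(mask[0])
    seen = [[False] * cols for _ in range(rows)]
    blobs = []
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c] or seen[r][c]:
                continue
            queue, blob = deque([(r, c)]), []
            seen[r][c] = True
            while queue:
                y, x = queue.popleft()
                blob.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and mask[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            if len(blob) >= min_size:
                blobs.append(blob)
    return blobs
```

The number of swimmers in the monitored area is then the length of the returned list.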

After the swimmers are detected, they are tracked using a multi-swimmer tracking module. This module uses a Kalman filter based multiple hypotheses tracking algorithm that incorporates color, position and size as the matching features. The system initiates a Kalman model for each detected swimmer. At each frame, the available pool of Kalman models is used to identify the detected swimmers with respect to the swimmers detected in the previous frame (this process is called correspondence). When unambiguous correspondence between a model and a swimmer can be established, the model is updated using the latest information of that swimmer. Models that cannot be used to explain any detected swimmer within a certain period are removed. In that case, the system assumes that the swimmers corresponding to those models have left the monitored area.

The multi-swimmer tracking process is as follows:

1. A region corresponding to the pool area will be designated in each view of the video camera 100 so that the computer 102 can easily know the area of interest (AOI) (monitoring region 106) to monitor. This will be done only once during system setup;

2. The tracking module will monitor for swimmers entering the AOI (monitoring region 106);

3. From the blobs detected using the above method, attributes of the swimmers such as the position and velocity will be extracted to form a tracking attribute vector, a. For example, if the tracking attributes used are position, p, and velocity, v, then the vector a for swimmer i will be in the form: $a_i = (p_i, v_i)$;

4. Compute the matching score matrix, M. Each entry $s_{ij}$ in M is the inverse weighted sum of the Euclidean distance $D_{ij}$ between the i-th blob's tracking attribute vector, $a_i$, in the current frame and the j-th swimmer's tracking attribute vector, $a_j$, in the previous frame, and the Euclidean distance $\hat{D}_{ij}$ between $a_i$ in the current frame and the predicted tracking attribute vector, $\hat{a}_j$, of the j-th swimmer. Therefore, the matrix M measures the likelihood of blobs in the current frame corresponding to swimmers in the previous frame. Assuming that there are m swimmers in the current frame and n swimmers in the previous frame, the matrix M will have the form:

$$M = \begin{pmatrix} s_{11} & s_{21} & \cdots & s_{m1} \\ s_{12} & s_{22} & \cdots & s_{m2} \\ \vdots & \vdots & \ddots & \vdots \\ s_{1n} & s_{2n} & \cdots & s_{mn} \end{pmatrix}$$

where

$$s_{ij} = \frac{1}{D_{ij} + \hat{D}_{ij}}, \qquad D_{ij} = \lVert a_i - a_j \rVert, \qquad \hat{D}_{ij} = \lVert a_i - \hat{a}_j \rVert,$$

$\lVert x \rVert$ denotes the Euclidean norm of x, i = 1 ... m and j = 1 ... n.

5. Establish the one-to-one correspondence of swimmers between the current and previous frames by obtaining the scalar product of two matrices M₁ and M₂, both of which are derived from the matching score matrix, M. M₁ has entries equal to 1 corresponding to the highest blob-to-swimmer matching scores and other entries equal to 0 for the current frame; similarly, M₂ has entries equal to 1 corresponding to the highest swimmer-to-blob matching scores and other entries equal to 0 for the previous frame. The scalar product of M₁ and M₂ creates a matrix M₃, whose nonzero entries indicate the unambiguous correspondences between swimmers;

6. Analyze the locations of any unmatched blobs. If an unmatched blob is sufficiently close to one of the existing swimmers and appears away from the boundary of the AOI, it will be merged into the nearest swimmer. Otherwise, the unmatched blob will be added to the list of newly appearing swimmers if the system manages to track it over a few frames. If the new blob disappears after a few frames, it will be considered noise and will be dropped;

7. Feed the blob's information into a Kalman filter [3] to obtain the swimmer's predicted attributes for the next frame.
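Steps 4 and 5 of the tracking process can be sketched as a mutual best-match test. The sketch reads the product of M₁ and M₂ as an entry-by-entry product, so that a pair survives only if the blob's best swimmer and the swimmer's best blob agree; this reading, the unweighted sum of the two distances, the small constant `eps` that avoids division by zero, and the function name are all assumptions.

```python
import math

def match_blobs(blobs, previous, predicted, eps=1e-9):
    """Return (i, j) pairs where blob i and swimmer j are each other's best
    match. Each argument is a list of attribute vectors of equal length."""
    if not blobs or not previous:
        return []
    score = [[1.0 / (math.dist(a, previous[j]) + math.dist(a, predicted[j]) + eps)
              for j in range(len(previous))] for a in blobs]
    pairs = []
    for i, row in enumerate(score):
        j = row.index(max(row))                     # best swimmer for blob i (M1)
        column = [score[k][j] for k in range(len(blobs))]
        if column.index(max(column)) == i:          # best blob for swimmer j (M2)
            pairs.append((i, j))                    # nonzero entry of M3
    return pairs
```

Blobs that do not appear in any returned pair are the unmatched blobs handled in step 6.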

Once the swimmers are successfully tracked, the multiple-swimmer tracking module will extract the attributes of the tracked swimmers. For each swimmer, multiple attributes are extracted, such as the spatial location of the centroid of the swimmer, body orientation and size. From the temporal sequence of these attributes, other attributes such as the rate of orientation change, moving directions, motion symmetry, regularity of motion, sudden change in swimming pattern and water ripple patterns can be obtained. By learning the temporal model of these attributes, the system will compute an overall score for each swimmer to determine whether the swimmer is normal or at risk of drowning (including cases of distress and early drowning).

The analysis process of the temporal model is based on the optimal filtering of past measurements. The state flow diagram 700 of the detection of potential drowning is given in Figure 7. The system will consider the swimmer to be in an abnormal condition if the system fails to give good prediction of the swimmer's attributes. This is illustrated using the rate of orientation change of the swimmer's body as an example. A sample plot 800 is shown in Figure 8. As can be seen, starting from around Frame 350, the body orientation of the swimmer changes much faster and is irregular. The fast and irregular change in body orientation serves as an indication to the breach of predictability of the swimming pattern. Figure 9 shows the rate of orientation change 900. This plot 900 is obtained from Figure 8 using the following equation:

[Equation for the rate of orientation change over a temporal window; not legible in the source text], where T is the preset length of the temporal window.
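As an illustration only, the sketch below assumes that the rate of orientation change is the mean absolute frame-to-frame change in body orientation over the last T frames. This definition is an assumption and may differ from the equation intended in the text; the function name is hypothetical, and wrap-around of the angle is ignored.

```python
def orientation_change_rate(theta, window):
    """Mean absolute frame-to-frame change of orientation (degrees) over the
    last `window` frames, for every frame with enough history."""
    rates = []
    for t in range(window, len(theta)):
        steps = [abs(theta[k] - theta[k - 1]) for k in range(t - window + 1, t + 1)]
        rates.append(sum(steps) / window)
    return rates
```

A steady swimmer yields values near zero, while rapid and irregular changes in orientation raise the rate within a few frames.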

Once the rate of orientation change 900 is larger than a preset threshold, the system will not be able to accurately predict the body orientation in the new frame using prior measurements. Therefore, a breach of predictability of the swimming pattern is detected in this case, and becomes one possible good indicator of an abnormal condition. The occurrence of several abnormal conditions together would be a good indication of a swimmer at risk of drowning.
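The idea of a breach of predictability can be sketched with a scalar Kalman filter that predicts one attribute (for example body orientation) from past measurements and flags the frames where the prediction error is large. The random-walk state model, the noise variances `q` and `r`, the threshold and the function name are assumptions chosen for illustration, not values from the text.

```python
def unpredictable_frames(measurements, q=1.0, r=4.0, threshold=15.0):
    """Run a scalar Kalman filter (random-walk state model) over a measured
    attribute and return the frame indices whose prediction error exceeds
    `threshold`."""
    x, p = measurements[0], r          # initial state estimate and its variance
    flagged = []
    for t, z in enumerate(measurements[1:], start=1):
        p = p + q                      # predict: state unchanged, uncertainty grows
        innovation = z - x             # prediction error for this frame
        if abs(innovation) > threshold:
            flagged.append(t)
        k = p / (p + r)                # Kalman gain
        x = x + k * innovation         # update the state estimate
        p = (1 - k) * p                # update the variance
    return flagged
```

Several flagged frames in close succession, ideally across more than one attribute, would correspond to the co-occurrence of abnormal conditions described above.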

The extraction of relevant features and their interpretation is most crucial for automatic detection of swimmers at risk of drowning. These features are an important aspect of this invention and besides the rate of change of orientation 900 described above, other features will now be described in more detail:

a) forward motion of the swimmer;

This attribute considers a swimmer to be in an abnormal condition if the swimmer is not moving forward but there is detected movement of the arms. This attribute is characterized by the spatial location of the centroid of the swimmer not changing beyond a preset boundary, given by:

$$\lVert v_c - v_p \rVert \le D$$

where

$v_c$ = spatial location of the centroid at the current frame,

$v_p$ = spatial location of the centroid at the previous frame,

$v_i = (x_i, y_i)$, with the x, y coordinates in the image, and

$D$ = threshold of the change to consider as not moving forward.

The rate of motion will also be considered. An abnormal condition arises if the motion slows to almost a halt.
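The centroid test can be sketched as follows for a short run of frames. The function name and the choice to require the condition on every consecutive pair of frames are assumptions; the detection of arm movement mentioned above is not modeled.

```python
import math

def not_moving_forward(centroids, max_shift):
    """True if every frame-to-frame centroid displacement stays within
    max_shift (the threshold D). `centroids` is a list of (x, y) positions."""
    return all(math.dist(prev, cur) <= max_shift
               for prev, cur in zip(centroids, centroids[1:]))
```

For example, a swimmer whose centroid stays within one or two pixels of its previous position over the whole run would be reported as not moving forward.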

b) posture of the swimmer, whether upright or just slightly leaning;

This attribute considers a swimmer to be in a potentially abnormal condition if the posture of the body is upright. This is characterized by the major axis of the ellipse 600 to 603 being vertical or close to vertical.

c) size of the swimmer's body above the water;

This attribute considers a swimmer to be in a potentially abnormal condition if the size of the body above the water, including the head, is decreasing, or if no increase in size is detected for a period of time. This feature can be characterized by the change in the area of the best-fit ellipse 600 to 603.

d) path of the swimmer's movement;

This attribute considers a swimmer to be in a potentially abnormal condition if there is a significant change in the path taken by the swimmer as predicted from the past frames. If the body is in an upright position, the path to be checked could include up-down movement. The path can be obtained from the plot of the centroid over time. The best fit curve of the plot gives the path taken by the swimmer and the deviation of the path is seen as a change in the best fit curve or the presence of a deflection point.

e) motion symmetry of the swimmer;

This attribute considers a swimmer to be in a potentially abnormal condition if the motion of the swimmer does not show any symmetry. One way to obtain this attribute is to divide the image into two along the major axis of the ellipse 600 to 603, flip one of the two halves about the axis and compute the correlation [2] between the two. If the value of the correlation is small, the motion of the swimmer shows little symmetry.
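The symmetry test can be sketched as below. The sketch assumes the swimmer's image patch has already been rotated so that the major axis of the ellipse is the vertical center line of the patch, and it uses the Pearson correlation coefficient as the correlation measure; both are assumptions, as are the function names.

```python
import math

def pearson(a, b):
    """Pearson correlation of two equally long sequences (0.0 if undefined)."""
    n = len(a)
    if n == 0:
        return 0.0
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    if va == 0 or vb == 0:
        return 0.0  # a flat half carries no evidence of symmetry
    return cov / math.sqrt(va * vb)

def symmetry_score(patch):
    """Correlate the left half of each row with the mirrored right half."""
    half = len(patch[0]) // 2
    left = [v for row in patch for v in row[:half]]
    right = [v for row in patch for v in reversed(row[len(row) - half:])]
    return pearson(left, right)
```

A score close to 1 indicates a mirror-symmetric patch; a score below a chosen threshold would mark the motion as asymmetric.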

f) periodicity or repeatability of the swimmer's movement pattern;

This attribute considers a swimmer to be in a potentially abnormal condition if the motion of the swimmer does not show any periodic or repeatable pattern. One way to obtain this attribute is to normalize the image extracted from the best-fit ellipse 600 to 603 and then compute the cross-correlation [2] of these images over different frames. If the value of the correlation is smaller than a predetermined threshold, then no repeatability of the motion is detected.
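The repeatability test can be sketched by correlating the latest normalized patch with patches from earlier frames. The sketch assumes each patch has been flattened to a list of equal length, uses the Pearson coefficient as the correlation, and skips the most recent frames through a hypothetical `min_lag` parameter (at least 1) so that a patch is not simply compared with its immediate neighbor.

```python
import math

def correlation(a, b):
    """Pearson correlation of two equally sized, flattened patches."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    if va == 0 or vb == 0:
        return 0.0  # flat patch: correlation undefined, treat as no match
    return cov / math.sqrt(va * vb)

def is_repeatable(patches, threshold, min_lag=2):
    """True if the latest patch correlates at or above `threshold` with at
    least one patch seen min_lag or more frames earlier."""
    latest = patches[-1]
    earlier = patches[:-min_lag]
    return any(correlation(latest, p) >= threshold for p in earlier)
```

A regular stroke cycle brings the swimmer back to a similar appearance every few frames, so at least one earlier patch correlates strongly with the latest one.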

g) ripple pattern in the surrounding area of the swimmer;

This attribute considers a swimmer to be in a potentially abnormal condition if the ripple surrounding the swimmer is more violent than normal. This attribute is characterized by the overall brightness of the water surrounding the swimmer. If the overall brightness increases beyond a certain threshold over the average of the water, then abnormal ripple is considered present.

For all of the above indicators, the presence of several abnormal conditions together serves as an indication that a swimmer is at risk of drowning. For example, if the body orientation is vertical, there is no forward motion and the motion is not symmetrical, then the swimmer is considered at risk of drowning. Once such a condition is detected, the person in charge will be alerted and the video signal 105 and audio signal 104 corresponding to the area of the swimmer will be made available to the person in charge. Figure 10 shows the overall process flow 1000 of the proposed system.
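The combination of indicators may be sketched as follows. The first rule reproduces the example given above; the second, which simply counts abnormal indicators, is an assumed generalization and not a rule stated in this specification. All names and the default count are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Indicators:
    """Per-swimmer flags produced by the individual attribute tests."""
    vertical: bool            # body orientation is vertical
    forward_motion: bool      # swimmer is moving forward
    symmetric: bool           # motion shows symmetry
    periodic: bool = True     # motion shows a repeatable pattern
    path_deviation: bool = False
    abnormal_ripple: bool = False


def example_rule(ind: Indicators) -> bool:
    """Example from the text: vertical, no forward motion, not symmetrical."""
    return ind.vertical and not ind.forward_motion and not ind.symmetric


def count_rule(ind: Indicators, min_abnormal: int = 3) -> bool:
    """Assumed generalization: flag the swimmer when at least
    `min_abnormal` indicators are abnormal at the same time."""
    abnormal = [
        ind.vertical,
        not ind.forward_motion,
        not ind.symmetric,
        not ind.periodic,
        ind.path_deviation,
        ind.abnormal_ripple,
    ]
    return sum(abnormal) >= min_abnormal
```

For instance, `example_rule(Indicators(vertical=True, forward_motion=False, symmetric=False))` evaluates to true, whereas a swimmer moving forward horizontally with symmetric strokes is not flagged by either rule.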

Summarizing, this invention describes an audio-visual method and system for early drowning detection. In this invention, a number of cameras 100, 200, 201 are mounted on top of a swimming pool 101. These cameras 100, 200, 201 are used to monitor swimmers in the pool 101 together with the aid of an array of microphones 103. Similarly, the microphone array 103 is mounted above the water to cover the entire swimming pool 101. Based on the motion and activity of the swimmer detected through the video camera 100, 200, 201, the swimmer's condition is automatically analyzed. Such automated video analysis includes building the visual background model of the pool 101, detecting the presence of swimmers in the monitored areas 106, estimating the number of swimmers inside the monitored area 106, tracking the swimmers and analyzing the behavior of each tracked swimmer in terms of body orientation, moving direction and motion patterns. In addition, the microphone array 103 is deployed to pick up audio signals 104 originating from distress calls. Once the system detects the presence of a potential drowning, both visual and audio alarms will be activated to draw the attention of the person in charge for further confirmation and, if necessary, to provide follow-up rescue actions.

List of reference signs

100 video camera

101 swimming pool

102 computer

103 microphone

104 audio signal

105 video signal

106 monitored area of interest

200 video camera

201 video camera

300 histogram

400 background scene

401 segmentation

402 3D scatter plot

500 first time series

501 second time series

600 best-fit ellipse

601 best-fit ellipse

602 best-fit ellipse

603 best-fit ellipse

700 state flow diagram

800 sample plot

900 plot showing the rate of orientation change

1000 overall process flow

References

1. Yizong Cheng, "Mean Shift, Mode Seeking and Clustering", IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 17, No. 8, 1995, pp. 790-799.

2. Anil K. Jain, "Fundamentals of Digital Image Processing", Prentice Hall, 1989.

3. Yaakov Bar-Shalom & Xiao-Rong Li, "Estimation and Tracking: Principles, Techniques and Software", Artech House, 1993.

Claims

1. A method for monitoring a swimming pool (101) to indicate a risk of drowning of swimmers, the method comprising: taking a plurality of subsequent images of a monitoring region (106) at least partly containing a water surface of water contained in the swimming pool (101) by means of a camera (100) outside the water at a plurality of predetermined subsequent moments in time, in each image, detecting the presence of swimmer image portions each of which shows a swimmer present in the image, processing each detected swimmer image portion, so as to assign to the respective swimmer image portion a characteristic two-dimensional geometrical figure, wherein the geometrical figure characterizes at least one geometrical attribute, assigning to each geometrical figure a figure position of the geometrical figure, wherein the figure position corresponds to a position of the detected swimmer image portion in the image, for at least one pair of two subsequent images, i.e. of a presently processed present image and a previously processed previous image, comparing values of at least one out of the figure position and the at least one geometrical attribute of the present image to that of the previous image, so as to detect a change in the figure position/geometrical attribute of the present image as compared to the previous image, based on the detected change in the figure position/geometrical attribute of the subsequent images, assigning to the corresponding swimmer either a drowning condition indicating that there is a risk of drowning of the swimmer or a safe condition indicating that there is no risk of drowning of the swimmer, and outputting an output signal if a drowning condition is assigned to the swimmer.

2. The method according to claim 1, wherein the at least one geometrical attribute comprises at least one attribute out of: a two-dimensional area of the geometrical figure corresponding to an area of the corresponding swimmer's body visible to the camera (100), a shape of the geometrical figure corresponding to an approximate contour of the corresponding swimmer's body visible to the camera (100), an orientation of an axis of the geometrical figure, corresponding to an orientation of an axis of the corresponding swimmer's body visible to the camera (100), a length of an axis of the geometrical figure, corresponding to a length or width of the corresponding swimmer's body visible to the camera (100), and a symmetry condition of the geometrical figure.

3. The method according to claim 1, wherein the characteristic two-dimensional geometrical figure is a best-fit ellipse (600 to 603) by which a contour of the respective swimmer image portion is approximated, and wherein a major axis, a minor axis and a centroid (center) of the best-fit ellipse (600 to 603) are adjusted to best approximate said contour of the swimmer.

4. The method according to claim 3, wherein the at least one geometrical attribute comprises at least one attribute out of: an area of the best-fit ellipse (600 to 603), an orientation of the major axis of the best-fit ellipse (600 to 603), an orientation of the minor axis of the best-fit ellipse (600 to 603), a length of the major axis of the best-fit ellipse (600 to 603), a length of the minor axis of the best-fit ellipse (600 to 603), a symmetry of the swimmer's body visible to the camera (100) with respect to the major axis of the best-fit ellipse (600 to 603), and a symmetry of the swimmer's body visible to the camera (100) with respect to the minor axis of the best-fit ellipse (600 to 603).

5. The method according to claim 1, further comprising: for each image, forming a binary map of the image, wherein each detected swimmer image portion is represented by a highlighted blob in the binary map and wherein the surface image portion is represented by dark regions in the binary map.

6. The method according to claim 1, wherein the images are video frames taken over a predetermined length of time by means of a video camera (100).

7. The method according to claim 1, wherein a moving velocity of the figure position corresponding to a moving velocity of the corresponding swimmer is determined from the detected change in the figure position and a possible drowning condition is assigned to the swimmer if the moving velocity of the figure position is lower than a predetermined moving velocity threshold.

8. The method according to claim 2, wherein a possible drowning condition is assigned to the swimmer if the orientation of the axis of the swimmer's body is vertical for a predetermined certain number out of a predetermined total number of subsequent images or if the rate of change of the orientation (900) is higher than a predetermined threshold for the rate of orientation change (900).

9. The method according to claim 2, wherein a possible drowning condition is assigned to the swimmer if the detected area of the geometrical figure is lower than a predetermined area threshold for a predetermined certain number out of a predetermined total number of subsequent images.

10. The method according to claim 1, wherein a movement path of the figure position corresponding to a movement path of the corresponding swimmer is determined from the detected change in the figure position of a plurality of subsequent images and a possible drowning condition is assigned to the swimmer if the movement path of the figure position shows at least one deviation from an expected normal feature obtained by predicting the movement path.

11. The method according to claim 1, wherein a movement periodicity of the geometrical figure corresponding to a movement periodicity of the corresponding swimmer is determined from the detected change in the figure position and/or at least one geometrical attribute of a plurality of subsequent images and/or repeatable pattern of the swimmer's body visible to the camera (100) and a drowning condition is assigned to the swimmer if the movement periodicity of the figure position shows at least one predetermined abnormal feature.

12. The method according to claim 1, further comprising: assigning to each geometrical figure a surrounding area of the geometrical figure corresponding to an area of water surrounding the corresponding swimmer, and assigning to the swimmer a drowning condition if the surrounding area shows at least one predetermined abnormal feature.

13. The method according to claim 1, further comprising labeling each assigned geometrical figure, so as to distinguish different underlying swimmers.

14. The method according to claim 1, further comprising counting the number of assigned geometrical figures.

15. The method according to claim 1, wherein a present assigned geometrical figure of a present image is assigned to a predetermined swimmer, and wherein a subsequent assigned geometrical figure of a subsequent image is assigned to the identical predetermined swimmer if the figure position of the subsequent assigned geometrical figure is sufficiently close to the corresponding figure position of the present assigned geometrical figure.

16. The method according to claim 15, wherein the predetermined swimmer is deemed to have left the monitoring region if, for a predetermined number of subsequent images, no subsequent assigned geometrical figure having a figure position which is sufficiently close to the corresponding figure position of the respective present assigned geometrical figure can be found.

17. The method according to claim 1, further comprising providing a gauge global statistical model of the swimming pool (101), the gauge global statistical model comprising at least a water surface model characteristic of the water, and wherein detecting the presence of swimmer image portions comprises: by processing of the image, generating a current global statistical model, the current global statistical model comprising at least a water surface model characterizing a surface image portion comprising the water surface and, if at least one swimmer is present in the image, at least one swimmer model characterizing at least one swimmer image portion comprising at least one swimmer, identifying the at least one swimmer model by a comparison of the current global statistical model to the gauge global statistical model, so as to detect the presence of at least one swimmer image portion.

18. The method according to claim 17, wherein the gauge global statistical model is generated from a previously processed image.

19. The method according to claim 17, wherein the gauge global statistical model is generated from an image taken of the swimming pool (101) not containing any swimmers.

20. The method according to claim 17, further comprising: updating the gauge global statistical model by the current global statistical model, ignoring the swimmer model if any is present.

21. The method according to claim 17, wherein any surface image portion has mainly or on average a surface color and the at least one swimmer image portion has mainly or on average a swimmer color, and wherein the generating of the global statistical model comprises: for a predetermined plurality of moments out of the subsequent moments in time, converting respectively at least one image into a color-distribution in a four-dimensional color space spanned by three primary colors and a brightness, wherein any surface image portion is converted into a surface cluster representing the surface color, and wherein any swimmer image portion is converted into a swimmer cluster representing the swimmer color, the converting of the image being performed by generating a histogram (300) of the image in each of the three primary colors, wherein each histogram (300) indicates the frequency distribution of the image of the swimming pool (101) in the respective primary color, and the brightness at each point in said color space being equal to a distance computed from the three frequencies of occurrence in the three primary colors at the respective point.

22. The method according to claim 21, wherein the distance measure is computed using the Euclidean distance.

23. The method according to claim 21, wherein the color space is the RGB color space.

24. The method according to claim 21, wherein the color space is the HSV color space.

25. The method according to claim 21, wherein the at least one swimmer image portion is detected in that the swimmer color deviates from the surface color.

26. The method according to claim 21, further comprising: assigning to each geometrical figure a surrounding area of the geometrical figure corresponding to an area of water surrounding the corresponding swimmer, and assigning to the swimmer a drowning condition if the surrounding area shows a color which is sufficiently different from the swimmer color and which is sufficiently different from the surface color.

27. The method according to claim 21, further comprising: after detecting the presence of swimmer image portions in a present image and before detecting the presence of swimmer image portions in a subsequent image, updating the water surface model of the gauge global statistical model by the water surface model of the current global statistical model of the present image.

28. The method according to claim 21, wherein the detecting of the swimmer model is performed by computing a difference measure between the current global statistical model and the gauge global statistical model, wherein the swimmer model is identified if the difference measure is larger than a predetermined difference threshold.

29. The method according to claim 28, wherein the difference measure is computed using the normalized Mahalanobis distance.

30. The method according to claim 21, wherein the images are video frames taken over a predetermined length of time by means of a video camera (100) and wherein a sequence of several subsequent video frames is used in generating the respective color distribution.

31. A system for monitoring a swimming pool (101), the system comprising: at least one camera (100) being installed for taking a plurality of images of a monitoring region (106) at least partly containing a water surface of water contained in the swimming pool (101) at a plurality of subsequent moments in time, the camera (100) further being installed outside the water, and a computer (102) being coupled to the camera (100) to receive an image taken by the camera (100) and being installed to process the image, wherein the computer (102) comprises: a means for detecting the presence of swimmer image portions each of which shows a swimmer present in the image, processing each detected swimmer image portion, so as to assign to the respective swimmer image portion a characteristic two-dimensional geometrical figure, wherein the geometrical figure characterizes at least one geometrical attribute, a means for assigning to each geometrical figure a figure position of the geometrical figure, wherein the figure position corresponds to a position of the detected swimmer image portion in the image, a means for comparing, for at least one pair of two subsequent images, i.e. of a presently processed present image and a previously processed previous image, values of at least one out of the figure position and the at least one geometrical attribute of the present image to that of the previous image, so as to detect a change in the figure position/geometrical attribute of the present image as compared to the previous image, a means for assigning to the corresponding swimmer, based on the detected change in the figure position/geometrical attribute of the subsequent images, either a drowning condition indicating that there is a risk of drowning of the swimmer or a safe condition indicating that there is no risk of drowning of the swimmer, and a means for outputting an output signal if a drowning condition is assigned to the swimmer.

32. The system according to claim 31, further comprising a display means being coupled to the computer (102) for displaying a display signal which is dependent on the output signal.

33. The system according to claim 32, wherein the display means is an alarm means and wherein the display signal is a signal which is suited for driving the alarm means.

34. The system according to claim 32, wherein the display means is a computer monitor and wherein the display signal is a signal which is suited for being displayed on the computer monitor.

35. The system according to claim 31, wherein a plurality of cameras (100, 200, 201) is provided as the at least one camera (100, 200, 201), wherein each camera (100, 200, 201) is installed for taking a plurality of images of a predetermined sub-region of the monitoring region at subsequent moments in time, wherein images of different sub-regions are taken by different cameras (100, 200, 201).

36. The system according to claim 35, wherein adjacent sub-regions partly overlap.

37. The system according to claim 31, further comprising at least one microphone (103) for detecting sounds such as distress calls from swimmers.

38. A system for monitoring a swimming pool (101), the system comprising: at least one camera (100) being installed for taking a plurality of images of a monitoring region (106) at least partly containing a water surface of water contained in the swimming pool (101) at a plurality of subsequent moments in time, the camera (100) further being installed outside the water, and a computer (102) being coupled to the camera (100) to receive an image taken by the camera (100) and being installed to process the image, wherein the computer (102) comprises: a means for detecting the presence of swimmer image portions each of which shows a swimmer present in the image, processing each detected swimmer image portion, so as to assign to the respective swimmer image portion a characteristic two-dimensional geometrical figure, wherein the geometrical figure characterizes at least one geometrical attribute, a means for assigning to each geometrical figure a figure position of the geometrical figure, wherein the figure position corresponds to a position of the detected swimmer image portion in the image, a means for comparing, for at least one pair of two subsequent images, i.e. of a presently processed present image and a previously processed previous image, values of at least one out of the figure position and the at least one geometrical attribute of the present image to that of the previous image, so as to detect a change in the figure position/geometrical attribute of the present image as compared to the previous image, a means for assigning to the corresponding swimmer, based on the detected change in the figure position/geometrical attribute of the subsequent images, either a drowning condition indicating that there is a risk of drowning of the swimmer or a safe condition indicating that there is no risk of drowning of the swimmer, and a means for outputting an output signal if a drowning condition is assigned to the swimmer,

wherein the means for detecting the presence of swimmer image portions comprises: a means for generating a global statistic model from a taken image, a storage means for storing global statistic models, and a comparison means for comparing a current global statistic model to a gauge global statistic model which is stored in the storage means.

39. The method according to claim 1, further comprising the step of differentiating a swimmer from ripples on the surface of the water based on contrast information.

40. The system according to claim 31 or 38, further comprising means for differentiating a swimmer from ripples on the surface of the water based on contrast information.