WO2009039350A1 - System and method for estimating characteristics of persons or things - Google Patents

System and method for estimating characteristics of persons or things Download PDF

Info

Publication number
WO2009039350A1
WO2009039350A1 PCT/US2008/076977
Authority
WO
WIPO (PCT)
Prior art keywords
data
persons
human
boundary
means adapted
Prior art date
Application number
PCT/US2008/076977
Other languages
French (fr)
Inventor
Thomas Slowe
Peter Hall
Joseph Dachuk
Original Assignee
Micro Target Media Holdings Inc.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Micro Target Media Holdings Inc. filed Critical Micro Target Media Holdings Inc.
Priority to CA002670021A priority Critical patent/CA2670021A1/en
Publication of WO2009039350A1 publication Critical patent/WO2009039350A1/en

Links

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/40Extraction of image or video features
    • G06V10/50Extraction of image or video features by performing operations within image blocks; by using histograms, e.g. histogram of oriented gradients [HoG]; by summing image-intensity values; Projection analysis

Definitions

  • the invention relates to systems and methods for estimating a number and/or other characteristics of persons or things, and particularly to systems and methods useful for estimating numbers and other characteristics of persons and other things included in visual representations and/or images of such persons, things and the like.
  • there is provided an apparatus, systems, methods and computer programming for estimating a number of persons or things.
  • the method and system include receiving data representing a visual image of the persons or things; analyzing the data in the frequency domain to observe one or more edge properties of one or more edges of an outline of the person or things in the visual representation; and estimating a number of persons or things represented by the data by comparing the one or more edge properties against a model set of characteristics for the persons or things. A person or thing is counted in the number of persons or things for each set of the one or more edge properties that correlate to the model set of characteristics.
  • analyzing the data may include separating one or more areas of the visual representation showing the persons or things from one or more background areas, and analyzing the one or more areas showing the persons or things to observe the one or more edge properties of the persons or things.
  • the model set of characteristics can be predetermined.
  • the model set of characteristics can be updated.
  • the model set of characteristics can be updated by self-training.
  • the one or more background areas may be determined by comparison to a background model set of characteristics, and the background model may be updatable.
  • the one or more edge properties may be determined to correlate to the model set of characteristics by meeting a threshold number of characteristics in the model set of characteristics.
  • Figure 1 is a flow chart block diagram of an exemplary method of estimating a number of persons or things in accordance with the invention;
  • Figure 2 provides transition charts relating to data analysis techniques useful in implementing embodiments of the method of Figure 1;
  • Figure 3 is a flow chart block diagram of an exemplary method of estimating a number of persons or things in accordance with the invention, incorporating the method of Figure 1;
  • Figure 4 is a graph showing a density and probability curve in an exemplary implementation of the method of Figure 1;
  • FIGS. 5 and 6 are schematic block diagrams of exemplary processes useful in implementing embodiments of the invention.
  • FIGS. 7 and 8 are schematic block diagrams of exemplary processes useful in implementing alternate embodiments of the invention.
  • Figure 9 is a block diagram of a further embodiment of the invention.
  • Figure 10 is a block diagram of a still further embodiment of the invention.
  • the apparatus, systems and methods described herein are useful in determining numbers and other characteristics of persons and/or other things present within or otherwise appearing in a given area or image, such as, for example, within a live or stored visual representation (still or moving images) or within a field of view.
  • Such apparatus, systems and methods are particularly useful, for example, for implementation in computer-controlled applications for estimating the numbers and reactions of persons in a crowd being monitored, such as by a surveillance camera or cameras at an event.
  • Such embodiments of the invention can be useful, for instance, for estimating the number and other characteristics of spectators at an event, numbers and other characteristics of persons at designated locations (at an event or otherwise), or the numbers or other characteristics of persons that are in the vicinity of certain buildings, landmarks, attractions, or advertising media.
  • the estimation of numbers and other characteristics of other things can also be desired.
  • Figure 1 is a flow chart block diagram of an exemplary method for use in estimating numbers or other characteristics of objects in accordance with the invention.
  • Feature extraction process 100 of Figure 1 comprises providing data corresponding to a visual representation 102 to a computing system or other data processor for processing.
  • visual representation data 102 is compared to data representing a background model, which permits the analysis of data representing "foreground" areas that may represent objects of interest, such as people. Such areas of interest are sometimes referred to herein as "blobs".
  • the background model can be updated as appropriate, at 106, such as to adjust for daylight to nighttime changes and/or to stationary objects placed into the scene and which become part of the background.
  • the extraction of foreground data for further number analysis can be limited to one or more particular areas of the visual representation that are of interest, such as may be desirable if one is trying to determine the number of persons in line at a concession stand or the number of persons within a certain distance from an advertisement.
  • background models useful in processes according to the invention are models of any information likely to be present in a representation of an image that is not of interest.
  • Such models can represent, for example, static items located within a field of view, regardless of their relative position within the field of view, or predicted or expected items, such as items which appear on a recurrent or predictable basis and are not of interest to the analyst.
  • a background model can be defined using a number of characteristics of a background scene. For instance, for a scene at an event in which a number of persons present within a given area is to be estimated, a background model can be derived using a statistical model of the scene as it appears prior to entry of the people to be counted. For example, one manner of analyzing a background model is to record data representing the background scene on a pixel-by-pixel basis. Referring to Figure 2, one concept of an exemplary method of updating a model of a stationary background, as shown at 106 in Figure 1, is shown. The entry of a new object into the visual field can be determined as a sharp change in the image characteristics over time. For example, changes within pixels representing the entirety or a sampling of an image can be observed over time, such that a sharp transition (shown as I: New Object) can be interpreted as entry into the scene of a new object, whereas a gradual change in the pixel (image) quality or characteristics can be interpreted as merely a change in the background, such as due to changing lighting conditions. Should a new object be determined to have entered the scene, and should the new object remain in the scene for long enough, the background model can be updated to reflect that the background scene should include the new object.
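The sharp-versus-gradual distinction described above can be sketched as follows; the jump and drift thresholds and the exponential update rate are illustrative assumptions, not values taken from the patent:

```python
def classify_transition(history, jump_thresh=40, drift_thresh=5):
    """Classify a pixel's recent intensity history.

    A large frame-to-frame jump suggests a new object entering the
    scene; a small cumulative drift suggests a lighting change that
    the background model should simply absorb.
    """
    diffs = [b - a for a, b in zip(history, history[1:])]
    if any(abs(d) >= jump_thresh for d in diffs):
        return "new_object"
    if abs(history[-1] - history[0]) >= drift_thresh:
        return "background_drift"
    return "stable"


def update_background(bg, frame, alpha=0.05):
    """Exponential running average: gradual changes are folded into
    the background model over time."""
    return [(1 - alpha) * b + alpha * f for b, f in zip(bg, frame)]
```

A pixel jumping from ~100 to ~180 between frames would be classified as a new object, while a slow climb over many frames would be absorbed as background drift.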
  • a short-term or other previously-undetected presence of a new object can be interpreted as entry of a person or other thing of interest into the scene.
  • the processes of locating of areas of interest and updating of background models can inform one another.
  • the process of updating the background model can also include manual intervention by an operator of the computing system for estimating the number of objects, especially for difficult cases in which the system has lower confidence in determining background change or area location.
  • the system can flag particular change scenarios for operator intervention, either in real time or as stored scenarios for later analysis.
  • background model 106 can include a set of statistical measures of the behavior of the pixels that collectively represent the appearance of the scene from a given camera in the form of an image, such as a video frame image.
  • the background model is for measuring static areas of the image, such that when a new dynamic object enters the field of view of the camera, a difference can be detected between its visual appearance and the appearance of the scene behind it. For example, if a pixel is thought of as a random variable with modelable statistical traits, pixels depicting portions of a new object on the scene would be detected as having significantly changed statistical traits.
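Treating each pixel as a random variable with modelable statistical traits, as the passage suggests, a minimal foreground test might flag pixels whose current value falls far outside their historical distribution. The three-sigma rule and the variance floor below are illustrative choices, not the patent's:

```python
from statistics import mean, stdev


def foreground_mask(pixel_histories, frame, k=3.0, min_sigma=2.0):
    """True where the current pixel deviates from its modeled mean by
    more than k standard deviations; a sigma floor avoids flagging
    sensor noise on very stable pixels."""
    mask = []
    for history, value in zip(pixel_histories, frame):
        mu = mean(history)
        sigma = max(stdev(history), min_sigma)
        mask.append(abs(value - mu) > k * sigma)
    return mask
```

Pixels covered by a newly arrived object would show large deviations and be marked foreground, while pixels still showing the modeled background would not.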
  • the identification of areas of interest within an image can be accomplished through visual comparison of a background model against another visual representation.
  • foreground models can be constructed to detect foregrounds (i.e., areas of interest). This could for example be accomplished using orthogonal models to detect areas that appear to include objects for which a number or other characteristic is to be determined, which models set out generic features of the object.
  • Another foreground detection method that can be used is motion detection, in which frame subtraction methods are used to determine foregrounds when the object is a mobile one (such as a person or vehicle).
  • While potentially more computationally intensive, it can tend to reduce or eliminate the need to create and update a background model.
  • one way of proceeding can include using foreground modeling and/or segmentation processing to find any areas of interest. Regardless of whether areas of interest are identified, the process then can move to edge detection processing 108 of the area(s) of interest, or the entire visual representation 102, as the case may be.
  • the description of edge detection processing 108 refers to "blobs" or "areas of interest", but it is equally applicable to an implementation in which the entire visual representation 102 is analyzed.
  • the system analyzes the areas of interest to observe one or more frequency properties of the edges of the outline(s) of each area of interest. For example, a frequency transform applied to an exemplary two-dimensional (such as an x, y pixel pair) signal of the visual representation 102 can be taken to determine edge properties of the area(s) of interest.
  • a frequency decomposition algorithm known in the art, such as the Fourier transform, discrete cosine transform and/or wavelets, can be used to reorganize image information in terms of frequency instead of space, which can be considered a visual image's innate form.
  • frequency decomposition algorithms can be used to perform a subset of the normal decompositions, focusing only upon a range of frequencies.
  • In general, these algorithms are termed "edge detection algorithms".
  • the Sobel Edge Detection algorithm can be employed with standard settings for both horizontally and vertically oriented frequencies to obtain edge property information.
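A bare-bones version of the horizontal and vertical Sobel convolution mentioned above, in pure Python on a small intensity grid (a real implementation would use an image-processing library and handle borders properly):

```python
SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]  # responds to vertical edges
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]  # responds to horizontal edges


def convolve3x3(img, kernel):
    """Correlate a 3x3 kernel with a 2D intensity grid, leaving the
    one-pixel border at zero."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            out[y][x] = sum(img[y + dy][x + dx] * kernel[dy + 1][dx + 1]
                            for dy in (-1, 0, 1) for dx in (-1, 0, 1))
    return out
```

For a vertical step edge the SOBEL_X response is strong at the step while the SOBEL_Y response is zero, which is what lets the later stages classify edge orientation.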
  • Edge detection processing 108 can also be informed by a scene model 110, which like the background model can be updatable to describe a geometric relationship between a visual source (e.g. a camera) and a three dimensional scene being observed by the visual source.
  • Scene model 110 can, but need not, also describe a camera's parameters such as lens focal length, field of view, or other properties.
  • Scene model 110 can be used with edge detection 108 to help inform processing 108 in its detection of the edge properties of any identified areas of interest.
  • following edge detection 108, the process moves on to breaking each edge, and its associated edge properties, into oriented feature(s) at 112.
  • An oriented feature is, for example, an edge property that relates to the orientation of an edge in the visual representation, such as vertical, horizontal, or diagonal, including at various degrees and angles.
  • Generation of edge properties, such as oriented features, can be tabulated or tracked as a feature list 114.
  • Feature list 114 can, for example, include a plot or a histogram of information for any edge property, or feature, that is broken out at 112. To estimate the number of objects in the visual representation, feature list 114 can be compared against a model set of characteristics.
  • the image is convolved with a horizontal and vertical Sobel filter using standard settings, resulting in two corresponding horizontal and vertical images, in which the intensity of the pixel value at any given location implies a strength of an edge.
  • the total strength of the edge at any particular point in the image can therefore be defined as a vector magnitude as calculated from the horizontal and vertical edge images. In this example, if this magnitude is greater than half the maximum magnitude across the entire image being considered, then it is considered a feature.
  • the particular feature can be measured for its orientation by calculating the vector angle. For example, a 360 degree range can be broken up into eight equal parts each representing 45 degrees, the first of which can be defined to start at -22.5 degrees.
  • a histogram of these eight features can then be assembled based upon the number of incidences of each feature within a given region. It will be appreciated that the example given above is a simplification of an approach that can incorporate the use of more than a slice of image frequencies, coupled with spatial constraints that can further model the outline of object(s) in an area of interest.
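The worked example in the preceding bullets (half-maximum magnitude threshold, eight 45-degree bins with the first starting at -22.5 degrees) can be sketched directly; function names are illustrative:

```python
import math


def orientation_bin(gx, gy):
    """Map an edge vector to one of eight 45-degree bins; the first
    bin is centered on 0 degrees, i.e. it starts at -22.5 degrees."""
    angle = math.degrees(math.atan2(gy, gx))  # range -180 .. 180
    return int(((angle + 22.5) % 360) // 45)


def feature_histogram(gradients):
    """Keep only gradients stronger than half the maximum magnitude
    over the region, then histogram their orientations into eight
    bins, as in the example above."""
    mags = [math.hypot(gx, gy) for gx, gy in gradients]
    threshold = max(mags) / 2
    hist = [0] * 8
    for (gx, gy), mag in zip(gradients, mags):
        if mag > threshold:
            hist[orientation_bin(gx, gy)] += 1
    return hist
```

Three strong gradients pointing right, up, and left land in bins 0, 2, and 4 respectively, while a weak gradient below the half-maximum threshold is discarded.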
  • the estimation of a number of objects can be handled by the computing system by matching a histogram of feature list 114 against an object model and looking for the number of matches.
  • one or more edge characteristics can be defined for each body part (such as the head and/or arms), which can be matched against feature list 114 generated from visual representation 102.
  • an estimate can be made, within desired or otherwise-specified error margins as dictated in part by the level of detail in the object model, of the number of persons (i.e. objects) in visual representation 102.
  • the system can be trained by providing multiple examples of humans at a distance and crowds varied in density and numbers, which can be hand labeled for location and rough outline.
  • the training can be a fully automated process, such as with artificial intelligence, or partially or wholly be based on manual operator intervention.
  • a feature histogram can be generated for each person, where it is normalized for person size given by a scene model.
  • Each of these "people models" can then be used to train a machine-learning algorithm, producing a generic model of human appearance (GMHA).
  • a simple initial approach can be to accumulate individual feature histograms to create a collection of features of an entire group, which can then be normalized by a total number of people used for training to result in the GMHA.
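The "simple initial approach" above — accumulate individual feature histograms and normalize by the total number of training people — amounts to averaging. The function name follows the text's GMHA; the rest is an illustrative sketch:

```python
def build_gmha(person_histograms):
    """Average the per-person feature histograms: accumulate them,
    then normalize by the number of people used for training."""
    n = len(person_histograms)
    total = [0.0] * len(person_histograms[0])
    for hist in person_histograms:
        total = [t + v for t, v in zip(total, hist)]
    return [t / n for t in total]
```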
  • new images and/or sub-parts thereof can be feature- extracted, normalized and used to produce feature histogram(s).
  • These new feature histogram(s) can then be compared to the GMHA, using a machine learning algorithm such as those described above.
  • the number of incidences of GMHA features within the new feature histograms can denote the number of objects (i.e., persons or things) within a given visual representation, such as an image or a sub-image.
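One crude reading of "the number of incidences of GMHA features within the new feature histograms" is to ask how many average-person histograms fit into a region's accumulated histogram. This is a deliberate simplification; the patent contemplates a machine-learning comparator doing the matching:

```python
def estimate_count(region_hist, gmha):
    """Estimate how many people a region's feature histogram
    represents by dividing its total feature mass by the per-person
    average from the GMHA."""
    per_person = sum(gmha)
    return round(sum(region_hist) / per_person) if per_person else 0
```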
  • because the model characteristics, and the threshold or criteria for declaring a match, can all be set and adjusted as desired for a particular application, the estimation process can tend to be optimized towards particular applications. For example, for the estimation of numbers of persons in dense crowds, the system would tend to have a more detailed object model of a human head and/or shoulder, so that only a partial view of the head and/or shoulder would be sufficient to generate the edge property that would result in a match.
  • process 100 provides feature list 114 (not shown in Figure 3) to comparator 308 for matching edge properties of the visual representation 102 against features of object model 306. Also shown in Figure 3 is a training process that can optionally be used to update the object feature model 306. Therein, a video archive of crowds can be fed through feature extraction process 100 to generate an archive feature list that the system can learn at 304 as being characteristics of persons in a crowd, which can then be used to update or revise model 306 with edge properties as appropriate.
  • a number (or density)/probability curve 310 can be constructed to track if a match has been made.
  • An example of such a curve is shown in Figure 4.
  • Such a curve shows the number (or density) of persons at different probabilities, and permits a performance threshold to be set by a user of the system.
  • the curve of Figure 4 permits reports to be generated to state that a certain number of persons are shown in the visual representation at a particular percentage probability.
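Reading a report off such a curve might look like the following, where the curve is a list of hypothetical (count, probability) pairs and the threshold is the user-set performance level:

```python
def count_at_probability(curve, min_prob):
    """Return the largest count whose estimated probability meets the
    user's performance threshold, or None if none qualifies."""
    eligible = [count for count, prob in curve if prob >= min_prob]
    return max(eligible) if eligible else None
```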
  • additional or alternative characteristics of persons or other objects can be determined in addition to merely the number of objects.
  • more parameters regarding the persons can be specified, such as the number of persons of a particular age/gender/ethnicity, number of persons with positive facial expressions, number of persons with negative facial expressions, or number of persons wearing clothes of a particular color or style.
  • one audience measurement metric is whether there is a strong reaction to advertising that can be correlated to memory retention by the audience. It will be appreciated that for other objects, different estimation parameters or characteristics may be specified.
  • referring to FIG. 5, there is shown an example of a video analysis architecture 500 for estimating the number and determining other characteristics of a group of persons within a video image.
  • visual representation 502 of the group of persons is analyzed by the feature extraction process 100, customized for persons as described above, in addition to one or more of face view estimator 506, gender/ethnicity estimation 508, expression estimation 510, or other analysis 512.
  • Feature models 516 relating to each of these analysis processes can then be compared with extracted features from each or any of 100, 506, 508, 510 and 512 at block 514, so as to determine a number of matches for each feature to estimate the number of persons fitting the parameters defined in 506, 508, 510 and 512.
  • Models 516 in this example could include model object features in model 306 described above, as well as other features relating to the estimation parameters defined with 506, 508, 510 and 512.
  • a person of skill in the relevant arts will appreciate that these model characteristics and the comparison thereof to generate number (density)/probability curves 518 are similar to those described above with respect to curve 316, and so such details are not described again here.
  • referring to FIG. 6, there is shown an architecture 600 (designated as "macro" as opposed to the "micro" designation of architecture 500 shown in Figure 5) that utilizes multiple cameras to provide multiple visual representations of different locations of an event, in which a micro architecture 500 is associated with each camera in order to generate number estimations and number (density)/probability curves 605 for the event.
  • any overlaps in views captured by different cameras can be calculated and stored as global scene models 604, which can be used to ensure that the same objects, such as persons, are not counted more than once due to the object appearing within views of two or more cameras or visual sources.
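Double-count suppression via the global scene model can be sketched by mapping each camera's detections into shared world coordinates and counting unique locations; here `to_world` is a hypothetical stand-in for whatever calibration the global scene model 604 provides:

```python
def global_count(detections_per_camera, to_world):
    """Count objects seen by a camera network, counting an object
    that appears in two overlapping camera views only once."""
    world_points = set()
    for cam_id, points in detections_per_camera.items():
        for p in points:
            world_points.add(to_world(cam_id, p))
    return len(world_points)
```

With two cameras whose fields of view overlap, detections that map to the same world location collapse to a single counted object.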
  • the total cumulative number (density)/probability estimates of an event can then be created as curves 606, representing estimates as seen by the entire camera or visual source network.
  • micro architecture 700 and macro architecture 800 similar to architectures 500 and 600 respectively are shown. However, in place of outputting a number (density)/probability curve as in architecture 500, architecture 700 is set to estimate and output demographic-based counts and scene (such as, of visual representation 502) statistics.
  • macro architecture 800 shown in Figure 8 can be utilized to measure large scale event statistics similarly to architecture 600, but output results as event demographic counts and statistics 806.
  • any type of information derivable from data representing images may be used as output; in advertising applications, particularly useful are those types of data that assist in assessing the effectiveness of displayed images, including, for example, advertising images.
  • a camera used in a system described herein can be calibrated in order to give greater confidence in number estimations.
  • a camera can be calibrated to generate geometric relationship(s) between the camera and a three-dimensional scene being observed by the camera.
  • Such calibration can be automatic or manual, and can include use of template patterns, calibration periods and/or computer annotation of video frames.
  • an automatic approach can leverage any prior or common knowledge about a size of readily detectable objects.
  • persons can generally be readily detected through an
  • referring to FIG. 9, there is shown an embodiment of the invention in which an estimation system is configured for use with mobile station 900.
  • Station 900 can include a vehicle, or a mobile platform that can be moved by a person or vehicle from location to location.
  • the embodiment of Figure 9 is useful, for example, for having an estimation system set up at temporary locations with one or more stations 900 at a time and location when an event is taking place and estimations are desired.
  • a plurality of cameras 902 providing one or more visual representations can be connected to station 900 via post 906.
  • it can be desirable to elevate cameras 902 above persons or objects to be counted, so that the depth perception of the visual representation(s) can be improved.
  • a mobile station can have multiple posts and/or other camera mounts to provide additional cameras 902, visual sources, and/or viewing angles.
  • Each station 900, with its array of cameras 902 can monitor an area 904 defined by the viewing angle and range of its associated cameras 902.
  • mobile station 900 is shown to be among persons in area 904, the numbers and/or characteristics of whom, including demographics, can be estimated by the systems and methods of estimation described above operating in conjunction with mobile station 900.
  • estimation operation can occur locally at station 900, or alternatively, raw, compressed, and/or processed data can be transferred live to another location for processing. Still alternatively, such data can be stored at station 900 for a time, and then off-loaded or transferred for processing such as when mobile station 900 returns to dock at a processing base or centre.
  • Processing 908 includes models for estimating, for the crowd in area 904, different numbers and characteristics such as those set out in data sets 910 and 912. These include head counts (an estimate of the number of persons in area 904), traffic density, and face views.
  • System 1000 can be set up with a billboard style advertisement that may have a passive or fixed image, or actively changing image or multimedia presentations.
  • an enclosure 1004 having one or more cameras 1002 that are set up to estimate the number and characteristics of possible observers of the billboard advertisement or objects near the billboard.
  • System 1000 further includes a battery 1012 to operate the system's electronics and computing circuitry, and a solar panel 1010 to charge battery 1012 when there is daylight. Alternatively, wired AC power can be used as well.
  • System 1000 further includes processing 1014 to process the visual representation(s) that are observed from camera(s) 1002, such as described above with reference to Figures 1 to 8.
  • System 1000 is also equipped with a transceiver 1006 connected to antenna 1008 for wirelessly transmitting the results of processing 1014 to a remote location for review.
  • the results of processing 1014 (such as number/probability curves, demographic information, face reactions and/or event statistics) can be transferred from system 1000 to a server (not shown) which then posts the results for access over the Internet or a private network.
  • raw, compressed or processed data from camera(s) 1002 can be stored and later transferred, or transferred live, through wired or wireless connections to a remote location for estimation processing as described above with reference to Figures 1 to 8.
  • system 1000 is set up near a road 1016 with sidewalk 1020.
  • Cameras 1002 are set up for observing vehicles 1018 on road 1016 and persons 1022 on sidewalk 1020, so as to be able to estimate the number of persons and/or vehicles that come into proximity of an advertisement associated with system 1000, and to estimate characteristics such as demographics and/or reactions of viewers to the advertisement, such as face view estimations, gender/ethnicity estimation, face expression estimation, length of face views, person/vehicle counts and traffic density, emotional reaction to the advertisement, and/or demographics.
  • system 1000 can also be trained to detect the direction of travel of vehicles 1018, so as to be able to determine the length of time that a billboard advertisement associated with system 1000 is, for example, in direct frontal view of a vehicle 1018 or the number of vehicles 1018 and the length of time that they are not in a direct frontal, but still visible angle to the billboard advertising.
  • By utilizing higher resolution cameras 1002, it is also possible to observe and estimate the number and characteristics of persons within vehicles 1018.

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Theoretical Computer Science (AREA)
  • Image Analysis (AREA)

Abstract

In one embodiment of the subject application, there is a method and system of estimating the number of persons or things. The method and system include receiving data representing a visual image of the persons or things; analyzing the data in the frequency domain to observe one or more edge properties of one or more edges of an outline of the person or things in the visual representation; and estimating a number of persons or things represented by the data by comparing the one or more edge properties against a model set of characteristics for the persons or things. A person or thing is counted in the number of persons or things for each set of the one or more edge properties that correlate to the model set of characteristics.

Description

SYSTEM AND METHOD FOR ESTIMATING CHARACTERISTICS OF PERSONS
OR THINGS
Cross-Reference to Related Applications This application claims priority to U.S. Provisional Application 60/973,678, filed 19
September 2007 and incorporates by this reference the disclosures of co-pending patent applications 11/558,031 entitled METHOD FOR DISPLAY OF ADVERTISING and filed 9 November 2006, 60/870,258 entitled SYSTEM AND METHOD FOR DISPLAY OF ADVERTISING, AND METHODS OF TRACKING VIEWINGS THEREOF and filed 15 December 2006, 60/871,507 entitled SYSTEM AND METHOD FOR DISPLAYING ADVERTISING AND TRACKING VIEWINGS THEREOF and filed 22 December 2006, 60/938,013 entitled SYSTEM AND METHOD FOR OBTAINING AND UTILIZING ADVERTISING INFORMATION and filed 15 May 2007, and 60/970,191 entitled SYSTEM AND METHOD FOR ESTIMATING CHARACTERISTICS OF PERSONS OR THINGS and filed 5 September 2007; including all appendices and other documents attached thereto.
Background of the Invention
The invention relates to systems and methods for estimating a number and/or other characteristics of persons or things, and particularly to systems and methods useful for estimating numbers and other characteristics of persons and other things included in visual representations and/or images of such persons, things and the like.
Summary of the Invention
In one embodiment of the subject application, there is provided an apparatus, systems, methods and computer programming for estimating a number of persons or things.
In one embodiment of the subject application, there is a method and system of estimating the number of persons or things. The method and system include receiving data representing a visual image of the persons or things; analyzing the data in the frequency domain to observe one or more edge properties of one or more edges of an outline of the person or things in the visual representation; and estimating a number of persons or things represented by the data by comparing the one or more edge properties against a model set of characteristics for the persons or things. A person or thing is counted in the number of persons or things for each set of the one or more edge properties that correlate to the model set of characteristics.
Analyzing the data may include separating one or more areas of the visual representation showing the persons or things from one or more background areas, and analyzing the one or more areas showing the persons or things to observe the one or more edge properties of the persons or things. The model set of characteristics can be predetermined. The model set of characteristics can be updated. The model set of characteristics can be updated by self-training.
The one or more background areas may be determined by comparison to a background model set of characteristics, and the background model may be updatable. The one or more edge properties may be determined to correlate to the model set of characteristics by meeting a threshold number of characteristics in the model set of characteristics.
Still other advantages, aspects and features of the subject application will become readily apparent to those skilled in the art from the following description, wherein there is shown and described a preferred embodiment of the subject application, simply by way of illustration of one of the best modes suited to carry out the subject application. As will be realized, the subject application is capable of other different embodiments, and its several details are capable of modifications in various obvious aspects, all without departing from the scope of the subject application. Accordingly, the drawings and descriptions will be regarded as illustrative in nature and not as restrictive.
Brief Description of the Drawings
The foregoing and other aspects of the invention will become more apparent from the following description of specific embodiments thereof and the accompanying drawings which illustrate, by way of example only, the principles of the invention. In the drawings, where like elements bear like reference numerals (and wherein individual elements bear unique alphabetical suffixes):
Figure 1 is a flow chart block diagram of an exemplary method of estimating a number of persons or things in accordance with the invention; Figure 2 provides transition charts relating to data analysis techniques useful in implementing embodiments of the method of Figure 1;
Figure 3 is a flow chart block diagram of an exemplary method of estimating a number of persons or things in accordance with the invention, incorporating the method of Figure 1;
Figure 4 is a graph showing a density and probability curve in an exemplary implementation of the method of Figure 1;
Figures 5 and 6 are schematic block diagrams of exemplary processes useful in implementing embodiments of the invention;
Figures 7 and 8 are schematic block diagrams of exemplary processes useful in implementing alternate embodiments of the invention; Figure 9 is a block diagram of a further embodiment of the invention; and
Figure 10 is a block diagram of a still further embodiment of the invention.
Detailed Description of the Preferred Embodiments
The description which follows, and the embodiments described therein, are provided by way of illustration of an example, or examples, of particular embodiments of the principles of the present invention. These examples are provided for the purposes of explanation, and not limitation, of those principles and of the invention.
The apparatus, systems and methods described herein are useful in determining numbers and other characteristics of persons and/or other things present within or otherwise appearing in a given area or image, such as for example within a live or stored visual representation, such as still or moving images, or within a field of view. Such apparatus, systems and methods are particularly useful, for example, for implementation in computer-controlled applications for estimating the numbers and reactions of persons in a crowd being monitored, such as by surveillance camera or cameras at an event. Such embodiments of the invention can be useful, for instance, for estimating the number and other characteristics of spectators at an event, numbers and other characteristics of persons at designated locations (at an event or otherwise), or the numbers or other characteristics of persons that are in the vicinity of certain buildings, landmarks, attractions, or advertising media. In addition to estimating numbers and other characteristics of persons in such circumstances, the estimation of numbers and other characteristics of other things can also be desired.
The estimation of the number and other characteristics of objects (be it either persons or things) within a visual representation can tend to be difficult, particularly where such persons or objects are present in high density, due to different factors including occlusion of
objects by each other; varied motion or the lack thereof; unknown intrinsic camera parameters for obtaining the visual representation; unknown camera position relative to the scene of the visual representation; and/or unpredictable lighting changes.
Figure 1 is a flow chart block diagram of an exemplary method for use in estimating numbers or other characteristics of objects in accordance with the invention. Feature extraction process 100 of Figure 1 comprises providing data corresponding to a visual representation 102 to a computing system or other data processor for processing. At 104, visual representation data 102 is compared to data representing a background model, which permits the analysis of data representing "foreground" areas that may represent objects of interest, such as people. Such areas of interest are sometimes referred to herein as "blobs". As visual representation 102 is processed, the background model can be updated as appropriate, at 106, such as to adjust for daylight to nighttime changes and/or to stationary objects placed into the scene and which become part of the background. In some applications, the extraction of foreground data for further number analysis can be limited to one or more particular areas of the visual representation that are of interest, for example, such as may be desirable if one is trying to determine the number of persons in line at a concession stand or the number of persons within a certain distance from an advertisement.
As will be appreciated by those skilled in the relevant arts, "background" models useful in processes according to the invention are models of any information likely to be present in a representation of an image that is not of interest. Such models can represent, for example, static items located within a field of view, regardless of their relative position within the field of view, or predicted or expected items, such as items which appear on a recurrent or predictable basis and are not of interest to the analyst.
A background model can be defined using a number of characteristics of a background scene. For instance, for a scene at an event in which a number of persons present within a given area is to be estimated, a background model can be derived using a statistical model of the scene as it appears prior to entry of people to be counted. For example, one manner of analyzing a background model is to record data representing the background scene on a pixel-by-pixel basis. Referring to Figure 2, one concept of an exemplary method of updating a model of a stationary background, as shown at 106 in Figure 1, is shown. The entry of a new object into the visual field can be determined as a sharp change in the image characteristics over time. For example, changes within pixels representing the entirety or a sampling of an image can be observed over time, such that a sharp transition (shown as I: New Object) can
be interpreted as entry into the scene of a new object, whereas a gradual change in the pixel (image) quality or characteristics can be interpreted to be merely a change in the background, such as due to changing lighting conditions. Should a new object be determined to have entered the scene, and if the new object remains in the scene for long enough, the background model can be updated to reflect that the background scene should include the new object.
Conversely, a short-term or other previously-undetected presence of a new object can be interpreted as entry of a person or other thing of interest to the scene. Thus, a person skilled in the relevant arts would appreciate that the processes of locating areas of interest and updating of background models can inform one another. Furthermore, as shown in Figure 1, the process of updating the background model can also include manual intervention by an operator of the computing system for estimating the number of objects, especially for difficult cases in which the system has lower confidence in determining background change or area location. For example, the system can flag particular change scenarios for operator intervention, either in real time or as stored scenarios for later analysis. Thus, in an exemplary embodiment background model 106 can include a set of statistical measures of the behavior of the pixels that collectively represent the appearance of the scene from a given camera in the form of an image, such as a video frame image. The background model is for measuring static areas of the image, such that when a new dynamic object enters the field of view of the camera, a difference can be detected between its visual appearance and the appearance of the scene behind it. For example, if a pixel is thought of as a random variable with modelable statistical traits, pixels depicting portions of a new object on the scene would be detected as having significantly changed statistical traits.
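By way of illustration only, the per-pixel statistical background model described above can be sketched as a running mean and variance for each pixel, with sharp deviations flagged as candidate foreground and gradual drift (such as lighting change) absorbed into the model. The class name, learning rate and deviation threshold below are illustrative assumptions, not part of the disclosed system:

```python
class PixelBackgroundModel:
    """Sketch of a per-pixel statistical background model: a running
    mean and variance of grayscale intensity for one pixel."""

    def __init__(self, learning_rate=0.05, k=2.5):
        self.mean = None            # running estimate of background intensity
        self.var = 1.0              # running estimate of intensity variance
        self.alpha = learning_rate  # adaptation rate for gradual change (illustrative)
        self.k = k                  # deviation threshold, in standard deviations

    def observe(self, intensity):
        """Return True if this observation looks like a new foreground
        object (sharp change), False if it fits the background."""
        if self.mean is None:       # first frame initializes the model
            self.mean = float(intensity)
            return False
        deviation = abs(intensity - self.mean)
        is_foreground = deviation > self.k * (self.var ** 0.5)
        if not is_foreground:
            # Gradual changes (e.g. lighting) are absorbed into the model,
            # mirroring the background update step 106 described above.
            self.mean += self.alpha * (intensity - self.mean)
            self.var += self.alpha * (deviation ** 2 - self.var)
        return is_foreground
```

In a full system one such model (or a richer mixture model) would be maintained for every pixel, with persistent foreground eventually folded back into the background as the text describes.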
The identification of areas of interest within an image can be accomplished through visual comparison of a background model against another visual representation. Alternatively or additionally, foreground models can be constructed to detect foregrounds (i.e., areas of interest). This could for example be accomplished using orthogonal models to detect areas that appear to include objects for which a number or other characteristic is to be determined, which models set out generic features of the object. Another foreground detection method that can be used is motion detection, in which frame subtraction methods are used to determine foregrounds, if the object is a mobile one (such as persons or vehicles).
Referring back to Figure 1, a person of skill will appreciate that, optionally, background separation and the identification of areas of interest 104 can be skipped, and the visual representation can be passed directly to edge detection 108 without first removing or otherwise accounting for the background. While this may tend to be more computationally
intensive, it can tend to reduce or eliminate the need to create and update a background model. For example, one way of proceeding can include using foreground modeling and/or segmentation processing to find any areas of interest. Regardless of whether areas of interest are identified, the process then can move to edge detection processing 108 of the area(s) of interest, or the entire visual representation 102, as the case may be. The following description refers to "blobs" or "areas of interest", but it is equally applicable to an implementation in which the entire visual representation 102 is analyzed.
In edge detection processing 108, the system analyses the areas of interest to observe one or more frequency properties of the edges of the outline(s) of each area of interest. For example, a frequency transform applied to an exemplary two-dimensional (such as an x, y pixel pair) signal of the visual representation 102 can be taken to determine edge properties of the area(s) of interest. A frequency decomposition algorithm known in the art, such as the Fourier transform, discrete cosine transform and/or wavelets, can be used to reorganize image information in terms of frequency instead of space, which can be considered a visual image's innate form. Several frequency decomposition algorithms can be used to perform a subset of the normal decompositions, focusing only upon a range of frequencies. In general, these algorithms are termed "edge detection algorithms". In an exemplary implementation, the Sobel Edge Detection algorithm can be employed with standard settings for both horizontally and vertically oriented frequencies to obtain edge property information. Edge detection processing 108 can also be informed by a scene model 110, which like the background model can be updatable to describe a geometric relationship between a visual source (e.g. a camera) and a three dimensional scene being observed by the visual source. Scene model 110 can, but need not, also describe a camera's parameters such as lens focal length, field of view, or other properties. Scene model 110 can be used with edge detection 108 to help inform processing 108 in its detection of edge properties of any identified areas of interest.
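As a hedged illustration, the horizontal and vertical Sobel filtering step mentioned above can be sketched in pure Python as follows (a real implementation would use an optimized image-processing library; the helper name is ours):

```python
# Standard 3x3 Sobel kernels for horizontal (x) and vertical (y) gradients.
SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]

def convolve3x3(image, kernel):
    """Slide a 3x3 kernel over a list-of-lists grayscale image and
    return a same-size gradient image (borders left at zero).
    Strictly this computes cross-correlation, which suffices for
    Sobel edge strength since only magnitudes and angles are used."""
    h, w = len(image), len(image[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            acc = 0.0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    acc += kernel[dy + 1][dx + 1] * image[y + dy][x + dx]
            out[y][x] = acc
    return out
```

Applying `convolve3x3` with `SOBEL_X` and `SOBEL_Y` yields the two edge images from which edge strength and orientation are derived in the feature computation described below.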
Once edge detection 108 is complete, the process moves onto breaking each edge, and its associated edge properties 112, into oriented feature(s). An oriented feature is, for example, an edge property that relates to the orientation of an edge on the visual representation, such as vertical, horizontal, or diagonal, including at various degrees and angles. Generation of edge properties, such as oriented features, can be tabulated or tracked as a feature list 114.
Feature list 114 can for example include a plot or a histogram of information for any edge property, or feature, that is broken out at 112. To estimate the number of objects in the visual representation, feature list 114 can be compared against a model set of characteristics
for the object whose number is being estimated. For instance, if the number of persons is being estimated, there can be edge characteristics of persons that are set out in the model, which can be compared to feature list 114 to estimate the number of persons in visual representation 102. In one implementation, it has been found that a human model with eight defined edge characteristics can provide a fairly reliable indication of person(s) in a visual representation. In the exemplary implementation, the eight edge features are derived from their orientations, and can be computed as follows. The image is convolved with a horizontal and vertical Sobel filter using standard settings, resulting in two corresponding horizontal and vertical images, in which the intensity of the pixel value at any given location implies a strength of an edge. The total strength of the edge at any particular point in the image can therefore be defined as a vector magnitude as calculated from the horizontal and vertical edge images. In this example, if this magnitude is greater than half the maximum magnitude across the entire image being considered, then it is considered a feature. The particular feature can be measured for its orientation by calculating the vector angle. For example, a 360 degree range can be broken up into eight equal parts each representing 45 degrees, the first of which can be defined to start at -22.5 degrees. A histogram of these eight features can then be assembled based upon the number of incidences of each feature within a given region. It will be appreciated that the example given above is a simplification of an approach that can incorporate the use of more than a slice of image frequencies coupled with spatial constraints that can further model the outline of object(s) in an area of interest.
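The eight-bin orientation histogram just described can be illustrated, under the same stated assumptions (magnitude threshold of half the region maximum, 45-degree bins with the first starting at -22.5 degrees), with the following sketch, where `gx` and `gy` are the horizontal and vertical Sobel edge images:

```python
import math

def orientation_histogram(gx, gy):
    """Bin strong edge points into eight 45-degree orientation bins,
    the first bin spanning -22.5 to 22.5 degrees. A point counts as a
    feature only if its gradient magnitude exceeds half the maximum
    magnitude over the region, as in the example in the text."""
    h, w = len(gx), len(gx[0])
    mags = [[math.hypot(gx[y][x], gy[y][x]) for x in range(w)] for y in range(h)]
    max_mag = max(max(row) for row in mags)
    hist = [0] * 8
    for y in range(h):
        for x in range(w):
            if max_mag > 0 and mags[y][x] > 0.5 * max_mag:
                # Vector angle in degrees (-180..180), shifted so that
                # bin 0 covers [-22.5, 22.5), then sliced into 45-degree bins.
                angle = math.degrees(math.atan2(gy[y][x], gx[y][x]))
                hist[int(((angle + 22.5) % 360) // 45)] += 1
    return hist
```

The returned eight-element histogram corresponds to feature list 114 for a region, ready to be matched against the object model.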
Thus, in an embodiment the estimation of a number of objects can be handled by the computing system by matching a histogram of feature list 114 against an object model and looking for the number of matches. In the example of a person, one or more edge characteristics can be defined for each body part (such as the head and/or arms), which can be matched against feature list 114 generated from visual representation 102. From the number of resulting matches, an estimate can be made, within desired or otherwise-specified error margins as dictated in part by the level of detail in the object model, of the number of persons (i.e. objects) in visual representation 102. In the embodiment, the system can be trained by providing multiple examples of humans at a distance and crowds varied in density and numbers, which can be hand labeled for location and rough outline. The training can be a fully automated process, such as with artificial intelligence, or can be partially or wholly based on manual operator intervention. With this training information, a feature histogram can be generated for each person, where it is normalized for person size given by a scene model. Each of these "people models" can then be used to train a machine-learning algorithm such as
a support vector machine, neural network, or other algorithm, resulting in a generalized model of human appearance ("GMHA") in the feature space. Thus, a simple initial approach can be to accumulate individual feature histograms to create a collection of features of an entire group, which can then be normalized by a total number of people used for training to result in the GMHA. During live operation, new images and/or sub-parts thereof, can be feature-extracted, normalized and used to produce feature histogram(s). These new feature histogram(s) can then be compared to the GMHA, using a machine learning algorithm such as those described above. In a basic example, the number of incidences of GMHA features within the new feature histograms can denote the number of objects (i.e., persons or things) within a given visual representation, such as an image or a sub-image.
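The "simple initial approach" to the GMHA described above, accumulating per-person histograms and normalizing by the number of training people, might be sketched as follows; the count estimate shown is a deliberately crude stand-in for the support vector machine or neural network comparison named in the text:

```python
def build_gmha(training_histograms):
    """Accumulate per-person feature histograms and normalize by the
    number of training people, yielding a generalized model of human
    appearance (GMHA) as an average per-person histogram."""
    n = len(training_histograms)
    bins = len(training_histograms[0])
    total = [sum(h[b] for h in training_histograms) for b in range(bins)]
    return [t / n for t in total]

def estimate_count(new_histogram, gmha):
    """Crude count estimate: how many 'average people' worth of
    features the new histogram contains. This stands in for the
    machine-learning comparison (SVM, neural network) in the text."""
    per_person = sum(gmha)
    if per_person == 0:
        return 0.0
    return sum(new_histogram) / per_person
```

A trained classifier would replace `estimate_count` with a learned decision over the full feature space rather than a simple feature-mass ratio.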
Thus, it will be appreciated that greater or fewer characteristics can be defined in an object model with respect to the object being estimated, which can provide for greater or lesser confidence in an estimation of the number of objects in a visual representation being analyzed. Since the model characteristics, and the threshold or criteria for declaring a match can all be set and adjusted as desired for a particular application, the estimation process can tend to be optimized towards particular applications. For example, for the estimation of numbers of persons in dense crowds, the system would tend to have a more detailed object model of a human head and/or shoulder, so that only a partial view of the head and/or shoulder would be sufficient to generate the edge property that would result in a match. Referring for example to an implementation for counting persons in a crowd, as shown in Figure 3, process 100 provides feature list 114 (not shown in Figure 3) to comparator 308 for matching edge properties of the visual representation 102 against features of object model 306. Also shown in Figure 3 is a training process that can optionally be used to update the object feature model 306. Therein, a video archive of crowds can be fed through feature extraction process 100 to generate an archive feature list that the system can learn at 304 as being characteristics of persons in a crowd, which can then be used to update or revise model 306 with edge properties as appropriate.
From a comparison of feature list 114 with object model 306 in block 308, a number (or density)/probability curve 310 can be constructed to track if a match has been made. An example of such a curve is shown in Figure 4. Such a curve shows the number (or density) of persons at different probabilities, and permits a performance threshold to be set by a user of the system. For example, the curve of Figure 4 permits reports to be generated to state that a certain number of persons are shown in the visual representation at a particular percentage probability.
In alternate embodiments, additional or alternative characteristics of persons or other objects can be determined in addition to merely the number of objects. For example, if the system is used to estimate the number of persons, more parameters regarding the persons can be specified, such as number of persons of particular age/gender/ethnicity, number of persons with positive facial expressions, number of persons with negative facial expressions, or number of persons wearing clothes of a particular color or style. In particular, for implementations relating to advertising, it can be desirable to be able to estimate or otherwise determine the number of persons that react "positively" or "strongly" to the advertising by observing the number of persons with "positive" or "strong" facial expressions in the vicinity of the advertising. For example, in advertising media, one audience measurement metric is whether there is a strong reaction to advertising that can be correlated to memory retention by the audience. It will be appreciated that for other objects, different estimation parameters or characteristics may be specified.
Referring to Figure 5, there is shown an example of a video analysis architecture 500 for estimating the number and determining other characteristics of a group of persons within a video image. In architecture 500, visual representation 502 of the group of persons is analyzed by the feature extraction process 100, customized for persons as described above, in addition to one or more of face view estimator 506, gender/ethnicity estimation 508, expression estimation 510, or other analysis 512. Feature models 516 relating to each of these analysis processes can then be compared with extracted features from each or any of 100, 506, 508, 510 and 512 at block 514, so as to determine a number of matches for each feature to estimate the number of persons fitting parameters defined with the feature extraction in 506, 508, 510 and 512. Models 516 in this example could include model object features in model 306 described above, as well as other features relating to the estimation parameters defined with 506, 508, 510 and 512. A person of skill in the relevant arts will appreciate that these model characteristics and the comparison thereof to generate number (density)/probability curves 518 are similar to that described above with respect to curve 310, and so such details are not described again with respect to
506, 508, 510, 512 and 518. While the foregoing has been described with reference to a single source of visual information, the apparatus, systems and methods described herein can be applied to multiple sources of visual information so as to provide scalability over large areas. Alternatively, if two or more visual information sources are provided at the same physical location, the estimates resulting from each source can be correlated to provide
greater confidence in the estimate of the number of objects in the location covered by the visual information sources. For example, building on the example described above with reference to Figure 5, in Figure 6 there is shown an architecture 600 (designated as "macro" as opposed to the "micro" designation of architecture 500 shown in Figure 5) that utilizes multiple cameras to provide multiple visual representations of different locations of an event, in which a micro architecture 500 is associated with each camera in order to generate number estimations and number (density)/probability curves 605 for the event. In architecture 600, any overlaps in views captured by different cameras can be calculated and stored as global scene models 604, which can be used to ensure that the same objects, such as persons, are not counted more than once due to the object appearing within views of two or more cameras or visual sources. The total cumulative number (density)/probability estimates of an event can then be created as curves 606, representing estimates as seen by the entire camera or visual source network.
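For illustration, the role of global scene models 604 in preventing double counting can be sketched as follows, assuming (hypothetically) that the overlap analysis yields an estimated number of persons visible to each pair of overlapping cameras; the function and its input shapes are ours, not the patent's:

```python
def combine_camera_counts(per_camera_counts, pairwise_overlaps):
    """Combine per-camera person counts into an event-wide total.

    per_camera_counts: {camera_id: estimated count for that view}
    pairwise_overlaps: {(camera_a, camera_b): persons estimated to be
                        visible in both views}

    Each pairwise overlap is subtracted once so that a person seen by
    two cameras is counted only once in the event total."""
    total = sum(per_camera_counts.values())
    total -= sum(pairwise_overlaps.values())
    return total
```

A fuller treatment would handle three-way overlaps (inclusion-exclusion) and weight each camera's estimate by its confidence curve.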
The output of the micro/macro architectures need not be number (density)/probability estimates or curves, but the system can be specified to output other types of information as well, including for example statistics and counts. Referring for example to Figures 7 and 8, micro architecture 700 and macro architecture 800 similar to architectures 500 and 600 respectively are shown. However, in place of outputting a number (density)/probability curve as in architecture 500, architecture 700 is set to estimate and output demographic-based counts and scene (such as, of visual representation 502) statistics. Thus, macro architecture 800 shown in Figure 8 can be utilized to measure large scale event statistics similarly to architecture 600, but output results as event demographic counts and statistics 806.
As will be appreciated by those skilled in the relevant arts, any type of information derivable from data representing images may be used as output, particularly in advertising applications those types of data useful in assessing the effectiveness of displayed images, including for example, advertising images.
A camera used in a system described herein can be calibrated in order to give greater confidence in number estimations. For example, a camera can be calibrated to generate geometric relationship(s) between the camera and a three-dimensional scene being observed by the camera. Such calibration can be automatic or manual, and can include use of template patterns, calibration periods and/or computer annotation of video frames. For instance, an automatic approach can leverage any prior or common knowledge about a size of readily detectable objects. As an example, persons can generally be readily detected through an
approach involving background segmentation as discussed above. If an algorithm is tuned to assume that objects of particular pixel masses are persons, the knowledge that people are generally roughly 170 cm tall can be used to calculate a rough relationship between the size of objects in an observed scene and their pixel representation(s). Thus, if the algorithm performs this task upon people standing in at least 3 locations in an image, an estimate of the camera's orientation relative to the physical scene can be calculated.
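A sketch of this calibration idea, under the stated assumption that detected people are roughly 170 cm tall, is a least-squares fit of apparent pixel height against image row using three or more detections; the function name and inputs are illustrative:

```python
def fit_person_scale(detections, person_height_cm=170.0):
    """Least-squares fit of a person's pixel height as a linear
    function of image row, from detections [(row, pixel_height), ...]
    of people assumed to be ~170 cm tall. Returns (slope, intercept)
    so that cm-per-pixel at row y is roughly
    person_height_cm / (slope * y + intercept)."""
    n = len(detections)
    sum_y = sum(y for y, _ in detections)
    sum_h = sum(h for _, h in detections)
    sum_yy = sum(y * y for y, _ in detections)
    sum_yh = sum(y * h for y, h in detections)
    denom = n * sum_yy - sum_y * sum_y  # nonzero for >= 2 distinct rows
    slope = (n * sum_yh - sum_y * sum_h) / denom
    intercept = (sum_h - slope * sum_y) / n
    return slope, intercept
```

With three or more detections at different rows, the fitted relation gives the rough scale of objects anywhere in the image, which in turn constrains the camera's orientation relative to the scene.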
Referring to Figure 9, there is shown an embodiment of the invention in which an estimation system is configured for use with mobile station 900. Station 900 can include a vehicle, or a mobile platform that can be moved by a person or vehicle from location to location. The embodiment of Figure 9 is useful, for example, for having an estimation system set up at temporary locations with one or more stations 900 at a time and location when an event is taking place and estimations are desired.
As shown in Figure 9, a plurality of cameras 902 providing one or more visual representations can be connected to station 900 via post 906. For some embodiments, it can be desirable to elevate cameras 902 above persons or objects to be counted, so that the depth perception of the visual representation(s) can be improved. It will be appreciated that in other embodiments, a mobile station can have multiple posts and/or other camera mounts to provide additional cameras 902, visual sources, and/or viewing angles. Each station 900, with its array of cameras 902, can monitor an area 904 defined by the viewing angle and range of its associated cameras 902. In this example, mobile station 900 is shown to be among persons in area 904, the numbers and/or characteristics of which, including demographics, can be estimated by the systems and methods of estimation as described above and operating in conjunction with mobile station 900. Such estimation operation can occur locally at station 900, or alternatively, raw, compressed, and/or processed data can be transferred live to another location for processing. Still alternatively, such data can be stored at station 900 for a time, and then off-loaded or transferred for processing such as when mobile station 900 returns to dock at a processing base or centre.
For the example shown in Figure 9, processing as described above with reference to Figures 1 to 8 can be conducted locally at station 900 with processing 908. Processing 908 includes models for estimating, in the crowd in area 904, different numbers and characteristics such as set out in data sets 910 and 912. These include head counts (or an estimate of the number of persons in area 904), traffic density, face views,
length of face views, ethnicity of viewers, gender of viewers, an emotional reaction (such as to an advertisement associated with station 900) and/or group demographics.
Systems and methods of estimation can also be used at a stationary position. Referring to Figure 10, there is an exemplary embodiment in which the systems and methods of estimation are implemented at a fixed location, such as with a fixed billboard (shown in side view) advertisement. System 1000 can be set up with a billboard style advertisement that may have a passive or fixed image, or actively changing image or multimedia presentations. In system 1000, there can be provided an enclosure 1004 having one or more cameras 1002 that are set up to estimate the number and characteristics of possible observers to the billboard advertisement or objects near the billboard. System 1000 further includes a battery 1012 to operate the system's electronics and computing circuitry, and a solar panel 1010 to charge battery 1012 when there is daylight. Alternatively, wired AC power can be used as well. System 1000 further includes processing 1014 to process the visual representation(s) that are observed from camera(s) 1002, such as described above with reference to Figures 1 to 8. System 1000 is also equipped with a transceiver 1006 connected to antenna 1008 for wirelessly transmitting the results of processing 1014 to a remote location for review. For example, the results of processing 1014 (such as number/probability curves, demographic information, face reactions and/or event statistics) can be transferred from system 1000 to a server (not shown) which then posts the results for access over the Internet or a private network. Alternatively, raw, compressed or processed data from camera(s) 1002 can be stored and later transferred, or transferred live, through wired or wireless connections to a remote location for estimation processing as described above with reference to Figures 1 to 8.
For the embodiment shown, system 1000 is set up near a road 1016 with sidewalk 1020. Cameras 1002 are set up for observing vehicles 1018 on road 1016, and persons 1022 on sidewalk 1020, so as to be able to estimate the number of persons and/or vehicles that come in proximity of an advertisement associated with system 1000, and to estimate characteristics such as demographics and/or reactions of viewers to the advertisement, such as face view estimations, gender/ethnicity estimation, face expression estimation, length of face views, persons/vehicle counts and traffic density, emotional reaction to the advertisement, and/or demographics.
The observation of persons 1022 on sidewalk 1020 is similar to that described above with respect to Figures 1 to 9, and so the details are not repeated here again. With respect to vehicles 1018, in addition to training to estimate the numbers and characteristics of
the vehicles, system 1000 can also be trained to detect the direction of travel of vehicles 1018, so as to be able to determine the length of time that a billboard advertisement associated with system 1000 is, for example, in direct frontal view of a vehicle 1018, or the number of vehicles 1018 and the length of time that they are not in a direct frontal, but still visible, angle to the billboard advertising. By utilizing higher resolution cameras 1002, it is also possible to observe and estimate the number and characteristics of persons in vehicles 1018 as well.
While the foregoing invention has been described in some detail for purposes of clarity and understanding, it will be appreciated by those skilled in the relevant arts, once they have been made familiar with this disclosure, that various changes in form and detail can be made without departing from the true scope of the invention in the appended claims. The invention is therefore not to be limited to the exact components or details of methodology or construction set forth above. Except to the extent necessary or inherent in the processes themselves, no particular order to steps or stages of methods or processes described in this disclosure, including the Figures, is intended or implied. In many cases the order of process steps may be varied without changing the purpose, effect, or import of the methods described.

Claims

What is claimed:
1. A system for extracting population information from a digital image, comprising:
means adapted for receiving digital image data inclusive of geographic data corresponding to an image of a geographical location;
means adapted for receiving sector data defining at least one sub-portion of interest relative to the digital image data;
means adapted for receiving boundary data;
boundary detection means adapted for detecting at least one boundary area within the at least one sub-portion of interest in accordance with received boundary data;
means adapted for generating population data in accordance with each detected boundary area; and
means adapted for outputting the population data.
2. The system of claim 1 wherein the boundary data includes histogram data corresponding to at least one physical human characteristic such that the at least one boundary area includes a human.
3. The system of claim 2 further comprising means adapted for isolating feature data corresponding to at least one human feature disposed within the at least one boundary area.
4. The system of claim 3 wherein the human feature includes clothing information.
5. The system of claim 3 wherein the human feature includes a human body characteristic.
6. The system of claim 5 wherein the at least one sub-portion of interest is associated with delivery of a commercial message directed thereto.
7. The system of claim 6 further comprising:
categorizing means adapted for categorizing each isolated human body characteristic as positive or negative; and
means adapted for generating feedback data corresponding to effectiveness of the commercial message in accordance with an output of the categorizing means.
8. A method for extracting population information from a digital image, comprising the steps of:
receiving digital image data inclusive of geographic data corresponding to an image of a geographical location;
receiving sector data defining at least one sub-portion of interest relative to the digital image data;
receiving boundary data;
detecting at least one boundary area within the at least one sub-portion of interest in accordance with received boundary data;
generating population data in accordance with each detected boundary area; and
outputting the population data.
9. The method of claim 8 wherein the boundary data includes histogram data corresponding to at least one physical human characteristic such that the at least one boundary area includes a human.
10. The method of claim 9 further comprising the step of isolating feature data corresponding to at least one human feature disposed within the at least one boundary area.
11. The method of claim 10 wherein the human feature includes clothing information.
12. The method of claim 10 wherein the human feature includes a human body characteristic.
13. The method of claim 12 wherein the at least one sub-portion of interest is associated with delivery of a commercial message directed thereto.
14. The method of claim 13 further comprising the steps of:
categorizing each isolated human body characteristic as positive or negative; and
generating feedback data corresponding to effectiveness of the commercial message in accordance with an output of the categorizing step.
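The categorizing and feedback steps of claims 7 and 14 can be sketched as follows. The particular characteristics, labels, and scoring rule are hypothetical, chosen only to illustrate the positive/negative categorization and the derivation of effectiveness feedback; the claims do not enumerate them.

```python
# Hypothetical mapping from isolated human body characteristics to a
# positive/negative category; the disclosure does not fix a specific set.
POSITIVE = {"smile", "gaze_toward", "slowing_down"}
NEGATIVE = {"frown", "gaze_away", "walking_past"}

def categorize(characteristic):
    """Categorize one isolated characteristic, or None if it carries no signal."""
    if characteristic in POSITIVE:
        return "positive"
    if characteristic in NEGATIVE:
        return "negative"
    return None

def effectiveness(observations):
    """Feedback datum: fraction of categorized observations that are positive,
    or None when no observation could be categorized."""
    labels = [c for c in map(categorize, observations) if c is not None]
    if not labels:
        return None
    return labels.count("positive") / len(labels)
```

Under this sketch, a stream of observed reactions to a commercial message reduces to a single effectiveness score that the feedback-generating means of claim 7 could output.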
PCT/US2008/076977 2007-09-19 2008-09-19 System and method for estimating characteristics of persons or things WO2009039350A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CA002670021A CA2670021A1 (en) 2007-09-19 2008-09-19 System and method for estimating characteristics of persons or things

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US97367807P 2007-09-19 2007-09-19
US60/973,678 2007-09-19

Publications (1)

Publication Number Publication Date
WO2009039350A1 true WO2009039350A1 (en) 2009-03-26

Family

ID=40468367

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2008/076977 WO2009039350A1 (en) 2007-09-19 2008-09-19 System and method for estimating characteristics of persons or things

Country Status (2)

Country Link
CA (1) CA2670021A1 (en)
WO (1) WO2009039350A1 (en)


Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20030033145A1 (en) * 1999-08-31 2003-02-13 Petrushin Valery A. System, method, and article of manufacture for detecting emotion in voice signals by utilizing statistics for voice signal parameters
US20040249650A1 (en) * 2001-07-19 2004-12-09 Ilan Freedman Method apparatus and system for capturing and analyzing interaction based content
US20070186165A1 (en) * 2006-02-07 2007-08-09 Pudding Ltd. Method And Apparatus For Electronically Providing Advertisements


Cited By (17)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8417780B2 (en) 2007-12-21 2013-04-09 Waldeck Technology, Llc Contiguous location-based user networks
US9098723B2 (en) 2009-02-02 2015-08-04 Waldeck Technology, Llc Forming crowds and providing access to crowd data in a mobile environment
US8321509B2 (en) 2009-02-02 2012-11-27 Waldeck Technology, Llc Handling crowd requests for large geographic areas
US8208943B2 (en) 2009-02-02 2012-06-26 Waldeck Technology, Llc Anonymous crowd tracking
US9515885B2 (en) 2009-02-02 2016-12-06 Waldeck Technology, Llc Handling crowd requests for large geographic areas
US8825074B2 (en) 2009-02-02 2014-09-02 Waldeck Technology, Llc Modifying a user'S contribution to an aggregate profile based on time between location updates and external events
US9763048B2 (en) 2009-07-21 2017-09-12 Waldeck Technology, Llc Secondary indications of user locations and use thereof by a location-based service
US9300704B2 (en) 2009-11-06 2016-03-29 Waldeck Technology, Llc Crowd formation based on physical boundaries and other rules
US8560608B2 (en) 2009-11-06 2013-10-15 Waldeck Technology, Llc Crowd formation based on physical boundaries and other rules
US9046987B2 (en) 2009-12-22 2015-06-02 Waldeck Technology, Llc Crowd formation based on wireless context information
US8711737B2 (en) 2009-12-22 2014-04-29 Waldeck Technology, Llc Crowd formation based on wireless context information
US8898288B2 (en) 2010-03-03 2014-11-25 Waldeck Technology, Llc Status update propagation based on crowd or POI similarity
US9886727B2 (en) 2010-11-11 2018-02-06 Ikorongo Technology, LLC Automatic check-ins and status updates
US9177385B2 (en) 2010-11-18 2015-11-03 Axis Ab Object counter and method for counting objects
US11842369B2 (en) 2019-07-10 2023-12-12 Theatricality Llc Mobile advertising system
CN112801377A (en) * 2021-01-29 2021-05-14 腾讯大地通途(北京)科技有限公司 Object estimation method, device, equipment and storage medium
CN112801377B (en) * 2021-01-29 2023-08-22 腾讯大地通途(北京)科技有限公司 Object estimation method, device, equipment and storage medium

Also Published As

Publication number Publication date
CA2670021A1 (en) 2009-03-26

Similar Documents

Publication Publication Date Title
WO2009039350A1 (en) System and method for estimating characteristics of persons or things
US10372970B2 (en) Automatic scene calibration method for video analytics
US11176382B2 (en) System and method for person re-identification using overhead view images
US20080172781A1 (en) System and method for obtaining and using advertising information
US9646212B2 (en) Methods, devices and systems for detecting objects in a video
US7747075B2 (en) Salient motion detection system, method and program product therefor
WO2017122258A1 (en) Congestion-state-monitoring system
US20090158309A1 (en) Method and system for media audience measurement and spatial extrapolation based on site, display, crowd, and viewership characterization
US20140003710A1 (en) Unsupervised learning of feature anomalies for a video surveillance system
US20090304230A1 (en) Detecting and tracking targets in images based on estimated target geometry
EP2983131A1 (en) Method and device for camera calibration
US10621734B2 (en) Method and system of tracking an object based on multiple histograms
CN102609724B (en) Method for prompting ambient environment information by using two cameras
JP2022542566A (en) Object tracking method and device, storage medium and computer program
Wang et al. Crowd density estimation based on texture feature extraction
US9361705B2 (en) Methods and systems for measuring group behavior
CN110111129B (en) Data analysis method, advertisement playing device and storage medium
KR20170006356A (en) Method for customer analysis based on two-dimension video and apparatus for the same
CN113920585A (en) Behavior recognition method and device, equipment and storage medium
KR101529620B1 (en) Method and apparatus for counting pedestrians by moving directions
Maalouf et al. Offline quality monitoring for legal evidence images in video-surveillance applications
JP2018165966A (en) Object detection device
JP2018185623A (en) Object detection device
Bai et al. Image quality assessment in first-person videos
Kim et al. Directional pedestrian counting with a hybrid map-based model

Legal Events

Date Code Title Description
WWE Wipo information: entry into national phase

Ref document number: 2670021

Country of ref document: CA

121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 08831564

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 08831564

Country of ref document: EP

Kind code of ref document: A1