WO2011071838A1 - Video processing system providing enhanced tracking features for moving objects outside of a viewable window and related methods
- Publication number
- WO2011071838A1 (PCT/US2010/059153)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- video
- moving object
- area
- location data
- geospatial
- Prior art date
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/20—Analysis of motion
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/30—Subject of image; Context of image processing
- G06T2207/30212—Military
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/30—Subject of image; Context of image processing
- G06T2207/30232—Surveillance
Definitions
- the present invention relates to the field of video processing, and, more particularly, to processing of geospatially referenced video and related methods.
- georeferenced video includes video imagery frames encapsulated in a transport stream along with geospatial metadata that correlates the pixel space of the imagery to geospatial coordinate values (e.g., latitude/longitude coordinates).
- 6,392,661 which discloses an apparatus for arranging and presenting situational awareness information on a computer display screen using maps and/or other situational awareness information, so that greater amounts of relevant information can be presented to a user within the confines of the viewable area on small computer screen displays.
- the map display layout for a screen display utilizes multiple, independent map displays arranged on a computer screen to maximize situational awareness information and display that information efficiently.
- the apparatus provides the ability to independently scale with respect to distance, time and velocity, as well as zoom and pan each map on the screen display.
- the system utilizes the imagery and terrain information contained in a geo-spatial database to align geodetically calibrated reference imagery with an input image, e.g., dynamically generated video images, and thus achieve a high accuracy identification of locations within the scene.
- when a sensor, such as a video camera, images a scene contained in the geo-spatial database, the system recalls a reference image pertaining to the imaged scene. This reference image is aligned with the sensor's images using a parametric transformation. Thereafter, other information that is associated with the reference image can be overlaid upon or otherwise associated with the sensor imagery.
- Tracking objects within georeferenced video feeds is also a desirable feature that may be problematic in some circumstances.
- the Full-Motion Video Asset Management Engine (FAME™) from the present Assignee Harris Corporation.
- UAV: unmanned aerial vehicle
- KLV: Key-Length-Value
- a video processing system which may include a display and a video processor coupled to the display. More particularly, the video processor may be configured to display a georeferenced video feed on the display defining a viewable area, determine actual geospatial location data for a selected moving object within the viewable area, and generate estimated geospatial location data along a predicted path for the moving object when the moving object is no longer within the viewable area and based upon the actual geospatial location data.
- the video processor may be further configured to define a successively expanding search area for the moving object when the moving object is no longer within the viewable window and based upon the estimated geospatial location data, and search within the successively expanding search area for the moving object when the successively expanding search area is within the viewable area.
- the video processor may relatively quickly re-acquire a moving object after it exits and re-enters the viewable area, after panning away from and back to the object, etc., to thereby provide enhanced tracking and/or monitoring of objects within georeferenced video feeds.
- the video processor may be further configured to overlay a tracking indicator for the moving object within the viewable area. Moreover, the video processor may also be configured to overlay an annotation indicator for the moving object within the viewable area. The video processor may further be configured to discontinue generating the estimated geospatial location data based upon a threshold.
- the video processing system may further include a geospatial database for storing the actual geospatial location data. Additionally, the viewable area may be defined by a first set of boundary pixels, and the video processor may be further configured to define the successively expanding search area by a second set of boundary pixels.
- the video processor may be configured to generate the estimated geolocation data further based upon a velocity model. More particularly, the velocity model may account for elevation, earth curvature, etc. Also by way of example, the video feed may be from a video camera.
- a related video processing method may include displaying a georeferenced video feed on the display defining a viewable area, determining actual geospatial location data for a selected moving object within the viewable area, and generating estimated geospatial location data along a predicted path for the moving object when the moving object is no longer within the viewable area and based upon the actual geospatial location data.
- the method may further include defining a successively expanding search area for the moving object when the moving object is no longer within the viewable window and based upon the estimated geospatial location data, and searching within the successively expanding search area for the moving object when the successively expanding search area is within the viewable area.
- FIG. 1 is a schematic block diagram of a video processing system in accordance with one aspect of the invention providing overlayed geospatially-tagged metadata onto a viewable display area and relating to a geolocation outside of the viewable area.
- FIG. 2 is a schematic block diagram of an alternative embodiment of the video processing system of FIG. 1.
- FIG. 3 is a view of the display of the system of FIG. 1 showing an aerial image with overlayed geospatially-tagged metadata.
- FIGS. 4 and 5 are flow diagrams illustrating video processing method aspects associated with the systems of FIGS. 1 and 2.
- FIG. 6 is a schematic block diagram of a video processing system in accordance with another aspect of the invention providing geospatial correlation of annotations to an object in overlapping geospatial video feeds.
- FIG. 7 is a schematic block diagram of an alternative embodiment of the video processing system of FIG. 6.
- FIGS. 8A and 8B are respective frame views of overlapping geospatial video feeds taken from different vantage points and showing correlation of an annotation to an object from the first feed to the second feed.
- FIGS. 9 and 10 are flow diagrams illustrating video processing method aspects associated with the systems of FIGS. 6 and 7.
- FIG. 11 is a schematic block diagram of a video processing system in accordance with still another aspect of the invention providing a successively expanding search area for moving objects when outside of a viewable area to allow for ready searching for the moving object when it is within the search area.
- FIGS. 12 and 13 are flow diagrams illustrating method aspects associated with the system of FIG. 11.
- FIGS. 14A-14D are a series of display views illustrating the use of a successively expanding search area by the video processing system of FIG. 11.
- FIG. 15 is a schematic block diagram of a video processing system in accordance with yet another aspect of the invention providing correction of geospatial metadata among a plurality of georeferenced video feeds.
- FIG. 16 is a schematic block diagram of an alternative embodiment of the video processing system of FIG. 15.
- FIGS. 17 and 18 are flow diagrams illustrating method aspects associated with the system of FIGS. 15 and 16.
- FIG. 19 is a view of the display of the system of FIG. 15 illustrating geospatial metadata correction operations performed by the systems of FIGS. 15 and 16.
- the system 30 illustratively includes a display 31, one or more geospatial databases 32, and a video processor 33.
- the video processors described herein and the various functions that they perform may be implemented using a combination of computer hardware (e.g., microprocessor(s), memory, etc.) and software modules including computer-executable instructions, as will be appreciated by those skilled in the art.
- the geospatial database(s) 32 may be implemented using a suitable storage device(s) (i.e., memory) and a database server application to be run on a computing platform and including computer-executable instructions for performing the various data storage and retrieval operations described herein, as will also be appreciated by those skilled in the art.
- While existing satellite positioning technology e.g., GPS units
- in the FAME™ system, video and metadata from multiple sources may be viewed by many different people, and situational awareness is accomplished through the referencing of external applications or area maps, such as Google™ Earth, for example.
- although annotations may be added to video by users, those annotations typically cannot be referenced from other videos or visualization tools.
- the system 30 advantageously provides a unified approach to manage geospatially tagged metadata, user-defined features, and points of interest, which may be implemented in a video platform such as the FAME™ system, for example, although the present invention may be used with other suitable systems as well. That is, the system 30 may advantageously be used for a video-centric environment to apply reverse geocoding techniques to increase real-time situational awareness in video.
- the video processor 33 cooperates with the display 31 and the database 32 and is configured to display a georeferenced video feed on the display and defining a viewable area, at Blocks 40-41.
- an aerial view of a vehicle 34 being tracked by a video sensor as it travels along a road 35 is shown, which defines a viewable area (i.e., what can be seen on the display 31).
- An object highlight box 36 and accompanying annotation (“Ground: Vehicle”) indicates to the viewer or operator that the object is being tracked, and what the object is, although such indicators are not required in all embodiments.
- the video processor 33 advantageously overlays selected geospatially-tagged metadata onto the viewable area and relating to a geolocation outside the viewable area, at Block 42, thus concluding the illustrated method (Block 43).
- the distance may be measured between a current frame center (as determined from sensor metadata) and the desired feature location obtained from the internal geospatial database 51'.
- the bearing angle may be measured between true north and a line connecting the current frame center and the selected feature, for example.
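The distance and bearing measurements described above can be sketched as follows. This is a minimal illustration, not the patent's implementation: it assumes a spherical Earth (haversine distance and initial great-circle bearing), and the function name `distance_and_bearing` and the mean-radius constant are hypothetical choices made for the example.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumption: mean Earth radius, spherical approximation


def distance_and_bearing(center, feature):
    """Return (distance_km, bearing_deg) from the frame center to a feature.

    Both arguments are (latitude, longitude) pairs in decimal degrees.
    The bearing is measured clockwise from true north.
    """
    lat1, lon1 = map(math.radians, center)
    lat2, lon2 = map(math.radians, feature)
    dlat, dlon = lat2 - lat1, lon2 - lon1

    # Haversine great-circle distance.
    h = math.sin(dlat / 2) ** 2 + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2
    distance_km = 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))

    # Initial bearing along the great circle, normalized to [0, 360).
    y = math.sin(dlon) * math.cos(lat2)
    x = math.cos(lat1) * math.sin(lat2) - math.sin(lat1) * math.cos(lat2) * math.cos(dlon)
    bearing_deg = math.degrees(math.atan2(y, x)) % 360.0
    return distance_km, bearing_deg
```

In this sketch the frame center would come from the sensor metadata and the feature location from the geospatial database; an ellipsoidal model could be substituted where higher accuracy is needed.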
- the selected geospatially-tagged metadata may comprise at least one of geospatially referenced feature annotations, geospatially referenced video source locations, and geospatially referenced points of interest. Turning additionally to FIG.
- annotations/metadata may be provided by an external geospatial database 50', such as a commercial off-the-shelf (COTS) database including geospatial points of interest, etc., as well as through user input (i.e., user-generated geospatial data).
- the external and user-generated geospatial data may be stored in a common internal geospatial database 51' for convenience of access by the video processor 33' in the illustrated example, but the geospatial metadata may be distributed among multiple databases or sources in other embodiments.
- the database 51' stores the respective geospatial features along with their absolute latitude and longitude, as will be appreciated by those skilled in the art.
- the external geospatial database 50' may be conceptually viewed as a fixed or static set of geospatial data, even though such commercially available data sets may be customized or modified in some embodiments, and the user-generated geospatial data may be considered as variable data that may be readily changed by users. That is, the system 30' advantageously provides for reverse geocoding with both static and user-defined geospatial features on a video-centric platform.
- the video processor 33' illustratively includes a request handler 52' configured to accept a query or request from a user.
- the query is communicated to the geospatial database 51' to generate selected geospatially-tagged metadata that satisfies the given query (i.e., filtering) parameters, at Blocks 44'-45'.
- the query may be based upon one or more filtering parameters, which in the present example includes a category filtering parameter (i.e., airports), and a distance filtering parameter (i.e., within 50 km).
- the category filtering parameters may include categories such as buildings, landmarks (e.g., airports or airfields, etc.), natural formations (e.g., rivers, mountains, etc.), vehicles, etc.
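A query with a category filter and a distance filter, such as "airports within 50 km", might look like the sketch below. Everything here is illustrative: the `Feature` record, the `query_features` function, and the sample entries are hypothetical stand-ins for the geospatial database 51' and request handler 52', and distances use a spherical-Earth approximation.

```python
import math
from dataclasses import dataclass


@dataclass
class Feature:
    name: str
    category: str
    lat: float  # decimal degrees
    lon: float  # decimal degrees


def haversine_km(lat1, lon1, lat2, lon2, radius_km=6371.0):
    """Great-circle distance on a sphere (an approximation)."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    h = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * radius_km * math.asin(math.sqrt(h))


def query_features(features, center, category=None, max_km=None):
    """Return (feature, distance_km) pairs that pass both filters, nearest first."""
    hits = []
    for f in features:
        if category is not None and f.category != category:
            continue  # category filter
        d = haversine_km(center[0], center[1], f.lat, f.lon)
        if max_km is not None and d > max_km:
            continue  # distance filter
        hits.append((f, d))
    return sorted(hits, key=lambda pair: pair[1])


# Hypothetical sample data, relative to a frame center at (0, 0).
FEATURES = [
    Feature("Airport A", "airport", 0.0, 0.3),  # roughly 33 km east
    Feature("Airport B", "airport", 0.0, 1.0),  # roughly 111 km east
    Feature("River C", "river", 0.1, 0.0),
]
```

Calling `query_features(FEATURES, (0.0, 0.0), category="airport", max_km=50)` keeps only the first entry; the other airport fails the distance filter and the river fails the category filter.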
- the video processor 33' further illustratively includes a marker handler 53' which is configured to overlay the selected geospatially-tagged metadata obtained from the database 51', if any (Block 46'), onto the viewable area.
- the video processor 33' also illustratively includes an overlay generator 54' for overlaying the appropriate annotation(s) on the video feed displayed on the display 31' as described above. This advantageously allows the video to be viewed by the user as normal, while at the same time providing ready access to information for off-screen or out of view features of interest, including names, locations, and any other relevant information stored in the database 51'. Other information, such as population, size, speed, priority level, etc., may also be included in the database 51' and handled by the marker handler 53'. Location information for mobile or moving off-screen objects may be provided by a secondary tracking system, e.g., a secondary user viewing station with a separate video interfaced to the system 31', as will be appreciated by those skilled in the art.
- Exemplary applications for the systems 30, 30' may include applications such as surveillance, planning, or reconnaissance where it is desirable to remain aware of objects or features which are out of frame. Moreover, the systems 30, 30' may also advantageously be used for location-based services and advertising, as will be appreciated by those skilled in the art.
- Turning to FIGS. 6-10, another video processing system 60 and associated method aspects are now described.
- as imagery/video detectors or sensors become more ubiquitous, there is a need to translate geospatial metadata between the different sensor feeds.
- applications may include government, emergency services, and broadcast industry applications.
- annotation standards and tools to accommodate those standards are in current development.
- telestration tools are in use, but are typically limited to visual annotation of still frames.
- the system 60 advantageously allows for transferring visual annotations between disparate sensors.
- Extracted metadata may be utilized to spatially correlate sensor perspectives.
- annotations may be projected onto an alternative georeferenced video feed, whether temporal or non-temporal in nature.
- annotations may be transferred onto temporal data which overlaps spatially within a user-defined offset, and annotations may also be transferred onto spatially overlapping non-temporal data.
- the video processing system 60 illustratively includes a first video input 61 configured to receive a first georeferenced video feed from a first video source, and a second video input 62 configured to receive a second georeferenced video feed from a second video source, which overlaps the first georeferenced video feed, at Blocks 90-91, as will be discussed further below.
- the system 60 further illustratively includes a video processor 63 coupled to the first and second video inputs 61, 62.
- the video processor 63 further illustratively includes an annotation module 64 configured to generate an annotation for an object in the first georeferenced video feed, at Block 92.
- the video processor 63 also illustratively includes a geospatial correlation module 65 configured to geospatially correlate the annotation to the object in the second georeferenced video feed overlapping the first georeferenced video feed, at Block 93, thus concluding the method illustrated in FIG. 9 (Block 94).
- the video processing system 60 advantageously allows annotations made in one perspective to be translated to other perspectives, and thus provides tracking abilities and correlation of objects between different georeferenced video feeds.
- the first video source includes a first video camera 70' and a first metadata generation module 72'
- the second video source includes a second video camera 71' and a second metadata generation module 73'. That is, the cameras 70', 71' generate video imagery based upon their particular vantage point, while the metadata generation modules 72', 73' generate respective metadata for each of the video feeds, at Block 95'.
- the metadata generation modules 72', 73' may be incorporated within the cameras 70', 71'.
- the cameras 70', 71' may be stationary cameras, etc., without the capability to produce metadata (e.g., traffic cameras), and the metadata generation modules may be inserted downstream to create the necessary metadata and package it together with the video imagery in a media transport format, as will be appreciated by those skilled in the art.
- the video cameras 70', 71' are directed at a common scene, namely a football player 80 which in FIG. 8A is diving toward the first video camera 70' to make a catch.
- the video frame of FIG. 8B shows the same player making the catch from a side view a few moments later in time.
- an annotation 81 is input from a telestrator via the annotation module 64' by a user (e.g., a commentator).
- the annotation reads "keeps his eye on the ball," and it is overlayed on the video feeds as shown.
- suitable input devices for providing annotation input e.g., computer mouse/keyboard, etc.
- Various types of visual, textual, etc., annotations may be used, as will be appreciated by those skilled in the art.
- the video processor 63' illustratively includes a geospatial metadata extraction module 74' for extracting geospatial metadata from the first and second georeferenced video feeds (Block 91'), which may also be stored in a metadata database 75' (e.g., a COTS database).
- An archival storage device or database 77' may also be included and configured to store the first and second georeferenced video feeds.
- the archive storage database 77' may also be implemented with a COTS database or other data storage medium.
- the geospatial correlation module 65' illustratively includes a coordinate transformation module 76' configured to transform geospatial coordinates for the annotation in the first georeferenced video feed to pixel coordinates in the second georeferenced video feed.
- the first and second video sources may have respective first and second source models associated therewith, and the transformation module may perform affine transformations using the first and second source models.
- the affine transformations between image and ground space are performed using sensor models that are unique to each sensor (here the video cameras 70', 71'), according to the following equation: where a is the ground point, c is the location of the camera, and ⁇ is the rotation of the camera (compounded with platform rotation).
- accuracy may be increased by using an elevation surface, rather than a spheroidal/ellipsoidal reference surface, in some embodiments, if desired.
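To illustrate how a sensor model with a camera location and a camera rotation can map a ground point into the pixel space of a second feed, here is a minimal sketch. It is not the equation from the source, which is not reproduced in this text: it assumes a simple pinhole camera in a local metric frame, and the function `ground_to_pixel`, the focal length in pixels, and the principal point are all hypothetical parameters introduced for the example.

```python
def ground_to_pixel(a, c, R, focal_px, principal_point):
    """Project ground point a into pixel coordinates for a camera at c.

    a, c: (x, y, z) in a shared local metric frame (e.g., east, north, up).
    R: 3x3 rotation (list of rows) taking world offsets into camera axes,
       with the camera z axis along the optical axis (an assumption).
    Returns (u, v) in pixels, or None if the point is behind the camera.
    """
    d = [a[i] - c[i] for i in range(3)]  # offset from camera to ground point
    x, y, z = (sum(R[r][k] * d[k] for k in range(3)) for r in range(3))
    if z <= 0:
        return None  # point is not in front of the camera
    u = principal_point[0] + focal_px * x / z
    v = principal_point[1] + focal_px * y / z
    return u, v


# Assumed nadir-looking camera: image x = east, image y = south, optical axis = down.
R_NADIR = [[1, 0, 0], [0, -1, 0], [0, 0, -1]]
```

With the camera 100 m above the origin, a ground point 10 m to the east lands to the right of the principal point, and a point 10 m to the north lands above it. An annotation anchored to a geospatial coordinate in the first feed could be drawn in the second feed by running that coordinate through the second camera's model in this way.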
- the geospatial correlation module 65' may further include a velocity model module 78' configured to generate velocity models of the object (i.e., the football player 80 in FIGS. 8A and 8B) in the first and second georeferenced video feeds for tracking the object therebetween, at Block 93'.
- Velocity models are computed to provide accurate interpolation between sensors with differing collect intervals.
- Pixel-based tracking may be used to reacquire the object of interest and continue tracking between the various video feeds so that the annotation may continue to follow the tracked object as the video progresses. Exemplary velocity models and tracking operations will be discussed further below.
- the system 60' further illustratively includes one or more displays 79' coupled to the video processor 63' for displaying one or both of the video feeds with the overlayed annotation(s).
- the systems 60, 60' thus advantageously provide for "chaining" visual sensors to track annotated objects across wide areas for real-time or forensic purposes. Moreover, this may also reduce user workload necessary to mark up multiple sources, as well as improving user situational awareness. This approach may also be used to automatically enhance metadata repositories (since metadata generated for one feed may automatically be translated over to other overlapping feeds), and it has application across multiple media source types including video, motion imagery, and still imagery.
- Turning to FIGS. 11-14D, another exemplary video processing system 110 and associated method aspects are now described.
- maintaining situational awareness over a large area using georeferenced video may be difficult.
- Current tracking technologies may be of some help in certain applications, but such trackers are usually limited to tracking objects within a viewable frame area and which are not occluded. Since video has become an important tool for decision making in tactical, disaster recovery, and other situations, tools to enhance the effectiveness of video as an informative data source are desirable.
- the system 110 maintains an up-to-date geospatial location for tracked objects and generates a movement model for that object.
- the object's location may be predicted even if primary tracking is lost due to the object no longer being within the viewable frame area, occlusion, etc.
- the system 110 may be particularly advantageous for civil programs (e.g., police, search and rescue, etc.), and various video applications such as telestration tools, collaborative video tools, etc., as will be appreciated by those skilled in the art.
- the video processing system 110 illustratively includes a display 111 and a video processor 112 coupled to the display.
- the video processor 112 illustratively includes a display module 114 configured to display a georeferenced video feed on the display defining a viewable area, at Block 121.
- the video processor 112 further illustratively includes a geospatial tracking module 115 configured to determine actual geospatial location data for a selected moving object 140 within the viewable area, at Block 122, which in the example shown in FIG. 14A is a vehicle traveling along a road 141.
- the modules 114, 115 may be implemented with an existing video processing platform that performs pixel tracking, such as the above-noted FAME™ system, for example.
- a tracking indicator and/or annotation may also be displayed along with the object being tracked, as discussed above with reference to FIG. 3, for example (Block 121').
- the module 115 is further configured to generate estimated geospatial location data along a predicted path for the moving object 140 when the moving object is no longer within the viewable area and based upon the actual geospatial location data, at Blocks 123-124, and as seen in FIG. 14B.
- the predicted path is visually represented by an arrow 144.
- the moving object 140 (which is not shown in FIGS. 14B and 14C to illustrate that it is outside the viewable area) may cease to be within the viewable area for a variety of reasons. As noted above, the object may go behind or underneath a structure (e.g., building, tunnel, etc.), which occludes the moving object 140 from view by the sensor. Another reason is that the object moves outside of the viewable area. Still another reason is that the zoom ratio of the image sensor capturing the video feed may be changed so that the moving object 140 is no longer within the viewable area, which is the case illustrated in FIGS. 14A-14D.
- in FIG. 14B, the viewable area that can be seen on the display 111 is zoomed in to focus on a house 142.
- the moving object 140 and road 141 would no longer be viewable to the user on the display 111, although the imagery included within the original viewable area 143 from FIG. 14A is still shown in FIG. 14B for reference.
- the video processor 112 defines a successively expanding search area 145 for the moving object 140 based upon the estimated geospatial location data (i.e., the last known geospatial position of the object before it was lost from the viewable area), at Block 125.
- the viewable area is defined by a first set of boundary pixels in pixel space (e.g., the corner points of the viewable area), and the video processor 112 may be further configured to define the successively expanding search area by a second set of boundary pixels (e.g., a second set of corner pixel points), for example, in the case of a rectangular boundary area.
- Other boundary area shapes may also be used, if desired (e.g., circular, etc.).
- the search area 145 continues to expand while the moving object 140 is outside of the viewable area. This is because the longer the object 140 is outside the viewable area, the more the confidence level as to the object's estimated position 146 will decrease. That is, the velocity of the object 140 may change from the last known velocity, which would result in the object being nearer or farther away from an estimated location 146 based solely on its last known velocity.
- the search area 145 may advantageously expand to accommodate a range of increasing and decreasing velocities as time progresses.
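The successively expanding search area can be sketched as a box centered on the dead-reckoned position whose half-width grows with the time since the object was lost. This is an illustrative sketch only: it works in a local planar frame in meters rather than in pixel space, and the names `Box` and `search_area`, the linear growth rule, and the default values are assumptions, not parameters given in the source.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    min_x: float
    min_y: float
    max_x: float
    max_y: float

    def intersects(self, other):
        """True when the two axis-aligned boxes overlap or touch."""
        return (self.min_x <= other.max_x and other.min_x <= self.max_x and
                self.min_y <= other.max_y and other.min_y <= self.max_y)


def search_area(last_pos, velocity, elapsed_s, base_half_m=20.0, speed_slack_mps=5.0):
    """Square search box around the estimated position after elapsed_s seconds.

    last_pos: (x, y) of the last actual location, in meters.
    velocity: (vx, vy) last known velocity, in meters per second.
    The half-width grows linearly with elapsed time so that the box covers
    speeds up to speed_slack_mps above or below the last known speed.
    """
    cx = last_pos[0] + velocity[0] * elapsed_s
    cy = last_pos[1] + velocity[1] * elapsed_s
    half = base_half_m + speed_slack_mps * elapsed_s
    return Box(cx - half, cy - half, cx + half, cy + half)
```

A caller would recompute the box each frame and start pixel-level searching only once `search_area(...).intersects(viewable)` becomes true for the current viewable area. A faster growth rate could be used near intersections, in line with the refinements discussed in the text.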
- knowledge of the moving object's last position may be used in refining the velocity model. For example, if the object 140 was at an intersection and had just begun moving when it was lost from the viewable area, knowledge of the speed limit would allow the video processor 112 to refine the velocity model to account for acceleration up to the speed limit, and use the speed limit as the estimated rate of travel from that point forward.
- Another way in which the expandable search area 145 could be adjusted to account for the particular area where the object is would be if the projected path of the object takes it to an intersection, where the object could potentially change its direction of travel. In such case, the rate of expansion of the search area 145 could be increased to account for the potential change in direction, as well as continued travel along the predicted path.
- Other similar refinements to the rate of expansion of the search area 145 may also be used, as will be appreciated by those skilled in the art.
- the moving object 140 is once again within the viewable area in FIG. 14D.
- the video processor 112 now has a relatively well-defined area in which to search (e.g., through pixel tracking operations) to find the moving object 140 when it is once again within the viewable area, at Blocks 126-127, thus concluding the method illustrated in FIG. 12 (Block 128).
- the video processor 112 may relatively quickly re-acquire the moving object 140 after it exits and re-enters the viewable area, after panning away from and back to the object, etc., to thereby provide enhanced tracking and/or monitoring of objects within georeferenced video feeds. Yet, even if the moving object 140 is not recovered once it is again within the viewable area, its last known location and predicted path are potentially important pieces of information.
- the system 110 may optionally include one or more geospatial databases 113, which provide the ability to maintain or store known locations of important objects. This may advantageously allow tracking of targets to be resumed by other UAVs or video sensors, even though the object can no longer be tracked by the current sensor.
- the moving object 140 location in pixel space may be converted to geospatial coordinates, from which the velocity model is generated.
- the velocity model may take a variety of forms.
- One straightforward approach is to calculate the velocity of the object 140 as a ratio of distance traveled to time between measurements as follows:

  $$V_{inst} = \frac{\Delta pos}{\Delta t}$$

  where $\Delta pos$ is the change in position, and $t$ is time. An average may then be used to estimate future velocity as follows:

  $$V_{avg} = \frac{1}{n} \sum_{i=1}^{n} V_{inst,i}$$

  where $n$ is the number of measurements over which the velocity is averaged.
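The distance-over-time velocity estimate and its average can be sketched as below. The track format (timestamp in seconds plus x/y position in meters in a local planar frame) and the function names are assumptions made for illustration.

```python
import math


def instantaneous_speeds(track):
    """Speed (m/s) between each pair of consecutive measurements.

    track is a list of (t_seconds, x_m, y_m) tuples in time order
    (a hypothetical layout chosen for this sketch).
    """
    speeds = []
    for (t0, x0, y0), (t1, x1, y1) in zip(track, track[1:]):
        dt = t1 - t0
        if dt <= 0:
            raise ValueError("timestamps must be strictly increasing")
        # Ratio of distance traveled to time between measurements.
        speeds.append(math.hypot(x1 - x0, y1 - y0) / dt)
    return speeds


def average_speed(track):
    """Mean of the instantaneous speeds, used to estimate future velocity."""
    speeds = instantaneous_speeds(track)
    if not speeds:
        raise ValueError("need at least two measurements")
    return sum(speeds) / len(speeds)
```

Averaging over several intervals smooths out single-frame position noise, at the cost of reacting more slowly to genuine changes in speed.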
- More sophisticated alternatives of the velocity model may account for elevation, earth curvature, etc., (Block 124') to further improve accuracy where desired. Accounting for earth curvature or elevation may be particularly helpful when tracking objects over relatively long distances/measurement intervals, for example.
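One common way to account for earth curvature is to measure the distance between two geospatial fixes along a great circle rather than on a flat plane. The sketch below uses the haversine formula with a spherical earth of mean radius 6,371,008.8 m, and folds in elevation with a simple Pythagorean approximation; both choices are assumptions for illustration and are not stated in the source.

```python
import math

EARTH_MEAN_RADIUS_M = 6371008.8  # spherical approximation (assumed)


def haversine_m(lat1_deg, lon1_deg, lat2_deg, lon2_deg):
    """Great-circle distance in meters between two latitude/longitude points."""
    phi1, phi2 = math.radians(lat1_deg), math.radians(lat2_deg)
    dphi = phi2 - phi1
    dlam = math.radians(lon2_deg - lon1_deg)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    # Clamp to guard against rounding slightly above 1.0.
    return 2 * EARTH_MEAN_RADIUS_M * math.asin(math.sqrt(min(1.0, a)))


def distance_with_elevation_m(lat1_deg, lon1_deg, elev1_m,
                              lat2_deg, lon2_deg, elev2_m):
    """Surface distance combined with the elevation change.

    Treats the ground distance and height difference as perpendicular legs,
    which is reasonable only for short measurement intervals.
    """
    ground = haversine_m(lat1_deg, lon1_deg, lat2_deg, lon2_deg)
    return math.hypot(ground, elev2_m - elev1_m)
```

Dividing either distance by the time between measurements gives a curvature-aware input to the velocity model.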
- the video processor 112 may discontinue generating the estimated geospatial location data. For example, if the expandable search area 145 exceeds a threshold, such as a size threshold, or a threshold time for position estimation, at Block 129', then the search area may have expanded to the point that it is no longer beneficial for re-acquiring the object 140 for tracking. That is, the search area may have become so large that there is no practical benefit to continuing expansion of the search area 145, and the processing/memory overhead requirements associated therewith.
- the length or size of such thresholds may vary based upon the particular implementation, and could be changed from one implementation or application to the next.
- Factors that may affect the duration or size of the threshold include the nature of the objects being tracked, their ability to change directions (e.g., complexity of road system), expected velocities of the objects in a given environment, etc., as will be appreciated by those skilled in the art. For example, it may be desirable to track a vehicle traveling along a long, straight dirt road where the top speed may be relatively slow, as opposed to a vehicle in a metropolitan area where there is ready access to high-speed interstates that go in many different directions.
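The decision to discontinue position estimation can be expressed as a simple check against size and time thresholds. The `SearchLimits` type and the example values below are hypothetical; as noted above, suitable thresholds depend on the implementation and environment.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SearchLimits:
    """Application-specific thresholds (illustrative names and units)."""
    max_radius_m: float
    max_elapsed_s: float


def should_stop_estimating(radius_m, elapsed_s, limits):
    """True once the search area is too large, or too old, to be useful."""
    return radius_m > limits.max_radius_m or elapsed_s > limits.max_elapsed_s
```

A slow vehicle on a long, straight rural road might justify generous limits, for instance `SearchLimits(max_radius_m=5000.0, max_elapsed_s=600.0)`, while a dense urban road network would call for much tighter ones.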
- Turning now to FIGS. 15 through 19, another exemplary video processing system 150 and associated method aspects are now described.
- geospatial metadata sent from UAVs or other aerial platforms is often not precise enough for position sensitive activities or determinations.
- the system 150 advantageously provides an approach for automatically correcting inaccuracies in geospatial metadata across multiple video feeds due to misaligned frames in the video or through a lack of coordinate precision, for example.
- the video processing system 150 illustratively includes one or more video ingest modules 151 for receiving a plurality of georeferenced video feeds each comprising a sequence of video frames and initial geospatial metadata associated therewith. Moreover, each georeferenced video feed has a respective different geospatial accuracy level associated therewith. In the illustrated example, there are two georeferenced video feeds, but other numbers of feeds may be used in some embodiments as well.
- the system 150 further illustratively includes a video processor 152 coupled to the video ingest module 151 that is configured to perform image registration among the plurality of georeferenced video feeds, at Block 171.
- the video processor 152 further generates corrected geospatial metadata for at least one of the georeferenced video feeds based upon the initial geospatial metadata, the image registration and the different geospatial accuracy levels, at Block 172, thus concluding the method illustrated in FIG. 17.
- the system 150 may thereby provide automatic real-time metadata correction that may use geospatial metadata to find a general area of reference between two or more sensor feeds (UAVs, stationary camera, etc.), and use a predefined accuracy metric to determine which feed is more accurate. For example, some sensor feeds that produce full motion video (30 fps) are less accurate than high definition surveillance feeds (~15 fps) that are captured at a higher altitude.
- the video processor 152 may perform image registration not only against reference images, which may be stored in a geospatial image database 153', but also may perform image registration between the overlapping portions of different video frames.
- the video processor 152 uses their respective geospatial metadata to find a common region of interest 191 between the feeds, typically corresponding to a landmark.
- the reference geospatial images in the database 153' may be used as well.
- the video image frames (and, optionally, images from the database 153') are used to perform the image registration around the common region of interest 191.
- four different aerial sensors are used to generate a georeferenced video feed for the area surrounding a building 190, and the particular area of interest 197 corresponds to a specific landmark on the building, namely a dome 191.
- the coordinates for the dome 191 derived from the first video feed yield a point 192 (represented with a star) within a few meters of the dome, making the first feed the most accurate of the video feeds.
- Points 193 and 194 are from the second and third video sensors, respectively, and are farther away from the dome 191.
- the fourth sensor video feed is the least accurate, and provides a geospatial coordinate set for the dome 191 that is approximately two hundred meters away and in the middle of a completely different building 196 due to a floating point imprecision associated with the fourth sensor.
- Accuracy metrics for the various sensor types are typically known or may be measured prior to video capture, as will be appreciated by those skilled in the art.
- the video processor 152 may automatically correct the geospatial metadata for video frames in one or more of the video feeds using a metadata correction algorithm.
- the correction algorithm may be relatively straightforward, or more complex, depending upon the speed and accuracy required.
- faster and slightly less accurate algorithms may be used.
- One straightforward approach is to correct the metadata for the less accurate sensor with that of the most accurate sensor (i.e., based upon their respective accuracy metrics).
- the video processor 152 would determine which video feed from the provided video feeds is from the sensor with the greatest accuracy, and it would perform the correction based upon the metadata therefrom.
- a somewhat more sophisticated approach is to use the predefined accuracy ratings to rank each sensor feed.
- This approach uses a weighted average of the metadata from all of the feeds to determine the new or corrected geospatial metadata based on their respective accuracy rankings, at Block 172'.
- One exemplary algorithm for performing the weighted average is as follows:
- where G = new corrected geospatial metadata, N = number of sensors, R = sensor ranking, T = total rankings, and O = old geospatial metadata.
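Both correction strategies described above can be sketched with the listed symbols. This is one plausible reading rather than the source's exact formula: it assumes a higher ranking R means a more accurate sensor, that T is the sum of all rankings, and that each feed's old metadata O is a (latitude, longitude) pair for the common region of interest, giving G as the sum over the N sensors of (R / T) × O.

```python
from typing import Sequence, Tuple

Coordinate = Tuple[float, float]        # (latitude, longitude) in degrees
RankedFeed = Tuple[Coordinate, float]   # (old metadata O, ranking R)


def most_accurate(feeds: Sequence[RankedFeed]) -> Coordinate:
    """Straightforward approach: adopt the metadata of the best-ranked sensor."""
    if not feeds:
        raise ValueError("at least one feed is required")
    return max(feeds, key=lambda feed: feed[1])[0]


def weighted_correction(feeds: Sequence[RankedFeed]) -> Coordinate:
    """Weighted average: G = sum over sensors of (R / T) * O, with T = sum of R."""
    total = sum(rank for _, rank in feeds)
    if total <= 0:
        raise ValueError("rankings must sum to a positive value")
    lat = sum(rank * coord[0] for coord, rank in feeds) / total
    lon = sum(rank * coord[1] for coord, rank in feeds) / total
    return (lat, lon)
```

Averaging latitude and longitude directly is adequate when the feeds disagree by at most a few hundred meters, as in the dome example; it would not be appropriate near the poles or across the antimeridian.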
- the video processing system 150' also illustratively includes a geospatial metadata database 154' coupled to the video processor 152' for storing the corrected geospatial metadata.
- a geospatial video database or storage device 155' is coupled to the video processor 152' and is for storing the sequence of video images for each video feed. In some embodiments, some or all of the data may be combined into a common database, for example.
- the system 150' further illustratively includes a display 156' coupled to the video processor 152', which is configured to display the sequence of video frames of one or more of the georeferenced video feeds on the display and with the corrected geospatial metadata associated therewith, at Block 177'.
- the video processor 152' may perform the correction operations on an interval basis, rather than on every frame.
- the video processor 152' may generate the corrected geospatial metadata every N number of video frames, where N is greater than 1.
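Interval-based correction can be sketched as recomputing the correction only on every Nth frame and reusing it in between. Representing the correction as a latitude/longitude offset that is held constant between updates is an assumption made here for illustration, as are the function and parameter names.

```python
def correct_on_interval(frame_coords, compute_offset, n):
    """Apply a geospatial correction that is refreshed every n frames.

    frame_coords: list of (lat, lon) initial metadata, one entry per frame.
    compute_offset: callable(frame_index, coord) -> (dlat, dlon); stands in
        for the registration-based correction step (hypothetical hook).
    n: refresh interval in frames; must be greater than 1.
    """
    if n <= 1:
        raise ValueError("n must be greater than 1")
    corrected = []
    offset = (0.0, 0.0)
    for index, (lat, lon) in enumerate(frame_coords):
        if index % n == 0:
            # The expensive step runs only on every n-th frame.
            offset = compute_offset(index, (lat, lon))
        corrected.append((lat + offset[0], lon + offset[1]))
    return corrected
```

A larger N reduces processing load but lets metadata drift go uncorrected for longer between updates.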
- the video feed may have missing geospatial metadata due to errors, etc.
- the video processor 152' may be further configured to fill in the missing geospatial metadata using the same approach outlined above, i.e., based upon the initial geospatial metadata, the image registration and the different geospatial accuracy levels, at Blocks 175'-176'.
- the above-described approach may advantageously be implemented on a platform independent basis. As such, with little or no operator intervention, the geospatial information in the video frames may be automatically corrected to produce a more accurate georeferenced video than relying on raw sensor video alone.
- the system 150 also advantageously provides ingest and metadata correction abilities for new video streams where reference imagery is not otherwise available, but other, more accurate aerial sensor video feeds are.
- the corrected metadata and video feed may be respectively stored in the geospatial metadata database 154' and geospatial video database 155' to provide the video analyst with accurate georeferenced video to perform future metadata correction (i.e., from archived video feeds), as opposed to real-time or live video feeds.
- the systems 150, 150' may therefore advantageously save users time and money by automatically correcting frames in one or more video feeds which would otherwise have inaccurate geospatial information.
- These systems may advantageously be used in a variety of applications for government and civilian sectors where relatively accurate georeferenced video streams are required, such as targeting systems, surveillance systems, and aerial mapping, for example.
Landscapes
- Engineering & Computer Science (AREA)
- Multimedia (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
- Processing Or Creating Images (AREA)
- User Interface Of Digital Computer (AREA)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
EP10798872A EP2510503A1 (en) | 2009-12-10 | 2010-12-07 | Video processing system providing enhanced tracking features for moving objects outside of a viewable window and related methods |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US12/634,843 US8363109B2 (en) | 2009-12-10 | 2009-12-10 | Video processing system providing enhanced tracking features for moving objects outside of a viewable window and related methods |
US12/634,843 | 2009-12-10 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2011071838A1 (fr) | 2011-06-16 |
Family
ID=43501318
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/US2010/059153 WO2011071838A1 (en) | 2009-12-10 | 2010-12-07 | Video processing system providing enhanced tracking features for moving objects outside of a viewable window and related methods |
Country Status (4)
Country | Link |
---|---|
US (1) | US8363109B2 (fr) |
EP (1) | EP2510503A1 (fr) |
TW (1) | TW201139989A (fr) |
WO (1) | WO2011071838A1 (fr) |
- 2009-12-10: US application US12/634,843, published as US8363109B2, status: Active
- 2010-12-07: WO application PCT/US2010/059153, published as WO2011071838A1, status: Application Filing
- 2010-12-07: EP application EP10798872A, published as EP2510503A1, status: Withdrawn
- 2010-12-10: TW application TW099143349A, published as TW201139989A, status: unknown
Also Published As
Publication number | Publication date |
---|---|
US20110141287A1 (en) | 2011-06-16 |
TW201139989A (en) | 2011-11-16 |
US8363109B2 (en) | 2013-01-29 |
EP2510503A1 (fr) | 2012-10-17 |
Legal Events
Code | Title | Description |
---|---|---|
121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 10798872; Country of ref document: EP; Kind code of ref document: A1 |
DPE1 | Request for preliminary examination filed after expiration of 19th month from priority date (pct application filed from 20040101) | |
NENP | Non-entry into the national phase | Ref country code: DE |
WWE | Wipo information: entry into national phase | Ref document number: 2010798872; Country of ref document: EP |
REG | Reference to national code | Ref country code: BR; Ref legal event code: B01A; Ref document number: 112012013446; Country of ref document: BR |
REG | Reference to national code | Ref country code: BR; Ref legal event code: B01E; Ref document number: 112012013446; Country of ref document: BR |
ENPW | Started to enter national phase and was withdrawn or failed for other reasons | Ref document number: 112012013446; Country of ref document: BR |