WO2011133720A2 - Auto-adaptive event detection network: details of video encoding and decoding - Google Patents

Auto-adaptive event detection network: details of video encoding and decoding

Info

Publication number
WO2011133720A2
Authority
WO
WIPO (PCT)
Prior art keywords
frame
frames
data
change
image data
Prior art date
Application number
PCT/US2011/033323
Other languages
English (en)
Other versions
WO2011133720A3 (fr)
Inventor
Robert J. Jannarone
John T. Tatum
Leronzo Lidell Tatum
David J. Cohen
Original Assignee
Brainlike, Inc.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Brainlike, Inc.
Publication of WO2011133720A2
Publication of WO2011133720A3

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N7/00Television systems
    • H04N7/18Closed-circuit television [CCTV] systems, i.e. systems in which the video signal is not broadcast
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/102Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
    • H04N19/115Selection of the code volume for a coding unit prior to coding
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/134Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
    • H04N19/162User input
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/169Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
    • H04N19/17Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/50Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N19/59Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving spatial sub-sampling or interpolation, e.g. alteration of picture size or resolution

Definitions

  • the present invention generally relates to efficient, auto-adaptive detection of events, and specifically relates to efficient reduction of video data to important events.
  • Auto-adaptive systems have many applications. These applications include event recognition based on data measured over a number of successive time periods. Events take many different forms. For example, event recognition may include detection of a target in a particular area, sensing an out-of-specification condition in a physical environment or identifying psychometric measurements with a particular behavior prediction profile. Preliminary anomaly sensing, clutter removal and data reduction or triage are often key elements of event recognition. Event recognition may also comprise evaluation of sensed data to recognize or reject the existence of conditions indicated by the data or to initiate a particular action. In order to be effective, auto-adaptive systems must keep up with continuously arriving data. Otherwise, data must be stored before processing, and events may not be recognized in time to take effective action.
  • CODEC: coder/decoder
  • the auto-adaptive detection system may partition sensor data and compare subsets of current sensor data against subsets of template data to determine if the current sensor data matches previous sensor data.
  • the auto- adaptive system may continuously learn changing background or event conditions, so as to match current sensor data with template data robustly. In this fashion the system may identify changed sensor data and a region in which the changed sensor data was found more effectively than if conventional template matching, which does not automatically and continuously adapt, were used. The region data may be used to focus later comparisons on those regions where changes are most likely to have occurred.
  • the auto-adaptive detection system may also identify events of interest without template matching. For example, an initial baseline frame of sensor data that includes the entire sensor data set may be used in combination with a series of change frames that include only changed sensor data with respect to the baseline frame. Periodically a new baseline frame may be established, after which change frames are again used. Reduced sensor data, including baseline frames and change frames, may be encoded into data packets for efficient transmission over a network or directly between two devices.
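As a rough illustration of the baseline/change-frame scheme described above, the following Python sketch encodes a list of grayscale frames as one full baseline frame plus per-frame lists of changed pixel values and addresses, and rebuilds any frame by overlaying those values onto the baseline. The numeric threshold and the packet layout are assumptions for illustration, not details taken from the disclosure.

```python
import numpy as np

def encode_packet(frames, threshold=12):
    """frames: list of 2-D uint8 arrays of identical shape; threshold is an assumed value."""
    baseline = frames[0].copy()
    change_frames = []
    for frame in frames[1:]:
        diff = np.abs(frame.astype(np.int16) - baseline.astype(np.int16))
        rows, cols = np.nonzero(diff > threshold)              # addresses of changed pixels
        change_frames.append((rows, cols, frame[rows, cols]))  # changed values, full resolution
    return baseline, change_frames

def decode_frame(baseline, change_frame):
    """Rebuild one frame by overlaying its changed pixels onto the baseline frame."""
    rows, cols, values = change_frame
    frame = baseline.copy()
    frame[rows, cols] = values
    return frame
```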
  • the auto-adaptive detection system may also operate efficiently by reducing sensor data to useful information at various network points, which may include detectors where data triage may be performed before display, transmission or storage; network devices, where combined detector information may be further reduced before display, broadcasting or storage; and computer servers, where detailed analysis may be performed prior to display, broadcasting, or storage.
  • FIG. 1 is a network diagram illustrating an example auto-adaptive detection network according to an embodiment of the invention
  • FIG. 2 is a block diagram illustrating example detector according to an embodiment of the invention.
  • FIG. 3 is a block diagram illustrating example network device according to an embodiment of the invention.
  • FIG. 4 is a block diagram illustrating example server according to an embodiment of the invention.
  • FIG. 5 is a block diagram illustrating an example partitioning configuration according to an embodiment of the invention.
  • FIG. 6 is a block diagram illustrating an example region surrounded by a box according to an embodiment of the invention.
  • FIG. 7 is a block diagram illustrating an example region surrounded by a box according to an embodiment of the invention.
  • FIG. 8 is a block diagram illustrating two example pixel reductions in an image frame that is part of a sequence of image frames comprising a video according to an embodiment of the invention
  • FIG. 9 is a block diagram illustrating example encoding of reduced video data over time according to an embodiment of the invention.
  • FIG. 10 is a block diagram illustrating example full pixel encoding of video data over time according to an embodiment of the invention.
  • FIG. 11 is a block diagram illustrating example compact pixel encoding of video data over time according to an embodiment of the invention.
  • FIG. 12 is a flow diagram illustrating an example process for encoding video data according to an embodiment of the invention.
  • FIG. 13 is a block diagram illustrating an example pixel change locator according to an embodiment of the invention.
  • FIG. 14 is a block diagram illustrating an example wired or wireless processor enabled device that may be used in connection with various embodiments described herein.
  • Embodiments disclosed herein describe systems and methods for auto- adaptive detection of events in sensor data streams, reducing the size of sensor data and processing sensor data in real time.
  • one method disclosed herein allows for a stream of video frames to be sensed, events in the video stream to be identified, background information in the video stream to be eliminated to encode and reduce the transmission size of the video stream, the stream of video frames to be reconstructed on the receiving end without loss of fidelity, and encoding as well as reconstruction to keep up with continuously arriving video frames.
  • a sensor is an item that provides information that may be used to produce a meaningful result.
  • Sensor data is collected continuously over successive time periods, generally from an array of sensors.
  • Sensor data may include, but is not limited to, audio data, image data or a combination of the two.
  • Image data may be collected at rapid enough intervals to be considered video data.
  • a data point characterizing a position of a point in a plane may be characterized by x and y coordinates. Such a point has two spatial dimensions.
  • a data point may also be characterized by x, y, and z coordinates, as in a tomograph. Such a point has three spatial dimensions.
  • Time dimensions may also exist. For example, if the data point describes the condition of a pixel in a television display, the data point may be further characterized by points in a current frame, along with points along the 10 most recent frames. Other dimensions may also exist.
  • the data point may be further characterized by intensity values of luminance and chroma along a feature dimension.
  • the data point may be characterized by frequency and volume intensity values along a feature dimension. These values are characterized as data points along further dimensions.
  • auto-adaptive systems process successive signals in one or a plurality of dimensions to track the background environment's dynamic change.
  • When an event occurs within a sensor's area of response (e.g., within the field of view of an optical sensor or within reception range of an audio sensor), auto-adaptive systems determine whether the return is sufficiently different from the background prediction to verify that an event has occurred, while minimizing the likelihood and number of false positives and reducing the cost of transmitting unnecessary information.
  • An important aspect of auto-adaptive systems is a dynamic detection threshold that enables these systems to find signals and events that could not otherwise be distinguished from noise in a naturally changing environment. Having a dynamic threshold also allows a system to maintain a tighter range on alarm limits. Broader alarm ranges from alternative systems decrease their capacity to distinguish anomalous conditions from normal conditions.
  • Hierarchical aspects of auto-adaptive systems include arrangements of input intensity values into cells along one, two, or three space dimensions, along with an input feature dimension within each cell. Derived feature values may further be computed within each cell as functions of dependent feature values, which may be input feature values or other derived feature values. One or more output feature values may be computed for each cell, as a function of feature values in a surrounding window.
  • the surrounding window may include feature values in cells that are nearest spatial neighbors, measured currently or recently.
  • a further important aspect of auto-adaptive systems is their capacity to identify anomalous or interesting events efficiently and in real time, that is, sufficiently fast to keep up with continuously arriving input measurements without falling behind.
  • High speed operation of auto-adaptive systems results from their use of recursive updating of learned statistical metrics in real time, along with their avoidance of iterative estimation procedures and time consuming estimation based on historical data.
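A minimal sketch of this kind of recursive, non-iterative updating is shown below: a per-pixel running mean and variance are updated at constant cost per frame, and the dynamic detection threshold flags only pixels that depart from the learned background by more than k standard deviations. The learning rate, initial variance, and k are assumed values chosen for illustration.

```python
import numpy as np

class RecursiveBackground:
    """Recursively learned per-pixel background statistics with a dynamic threshold."""

    def __init__(self, first_frame, rate=0.05, k=3.0):
        self.mean = first_frame.astype(np.float64)
        self.var = np.full(first_frame.shape, 25.0)   # arbitrary starting variance
        self.rate, self.k = rate, k

    def update(self, frame):
        frame = frame.astype(np.float64)
        diff = frame - self.mean
        # Recursive updates: no stored history, no iterative estimation.
        self.mean += self.rate * diff
        self.var += self.rate * (diff * diff - self.var)
        # Dynamic threshold: k standard deviations of the learned background.
        return np.abs(diff) > self.k * np.sqrt(self.var)
```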
  • A further benefit of auto-adaptive systems stems from their effective reduction of sensor data to useful information in real time.
  • Auto-adaptive data reduction results in lower costs of data transmission, broadcasting, storage, display, analysis, and related network operations.
  • Auto-adaptive data reduction may occur while reducing sensor data into selected or encoded form, at or near remote sensor locations, and prior to transmission.
  • Auto-adaptive data reduction may also occur while reducing sensor data to selected or encoded form on network devices and computer servers, after they have received sensor data, and before the data have been saved, transmitted, broadcast, or prepared for further display or analysis. Preparation of data that has been encoded by auto-adaptive encoders may require special purpose, auto-adaptive decoding.
  • Auto-adaptive systems may operate as part of, or as alternatives to, other conventional compression methods such as video encoders within the h.264 family.
  • conventional compression methods for video data use fixed data compression algorithms to reduce data uniformly over small blocks of pixels within images.
  • Conventional encoding limitations, such as slow operation, distortion or lossiness and poor performance when video cameras are jittery, may be mitigated by auto-adaptive data processes, which may select only pixels of interest in full resolution, while ignoring other pixels entirely.
  • FIG. 1 is a network diagram illustrating an example auto-adaptive detection network 10 according to an embodiment of the invention.
  • the system 10 may comprise one or more detectors 20 and 30, optional network devices 40, 50 and 60 and one or more optional servers 70.
  • Each of the detectors, devices and servers has a corresponding data storage area 25, 35, 45, 55 and 75, respectively.
  • the detectors, devices, and servers are communicatively coupled over a data communication network 80.
  • a detector may be directly connected to a device such as detector 20 being directly connected to device 40 and a detector can be directly connected to a server such as detector 30 being directly connected to server 70.
  • Detectors 20 and 30 can also be directly connected to each other or indirectly connected to each other via network 80.
  • the detectors such as detector 20 may be any data collection devices that may be connected to the network devices such as device 40 and server devices such as server 70 via a wired or wireless network such as network 80.
  • the direct connection between detector 20 and device 40 or between detector 30 and server 70 may also be a wired or wireless connection.
  • the device 40 or server 70 may comprise a wireless base station that receives sensor data from the various detectors and processes the sensor data as described herein.
  • detector 20 may be a stand alone detection device that senses data, displays sensed data or detected information and provides sensed data or detected information to a receiving device such as device 40 or server 70 or another detector 30.
  • detector 20 may be integrated into another device such as an unmanned vehicle or some other monitoring or sensor device or vehicle.
  • detector 20 may provide information to a base station incorporated into network device 40 or server 70 via a direct connection or the base station function may be part of the network 80.
  • the detector 20 may be integrated into a processor enabled device that is capable of information processing in order to reduce the size of the sensed data prior to transmission.
  • device 40 or server 70 is configured with sufficient processor power to manage the data provided by the plurality of detectors such as detectors 20 and 30.
  • server 70 may include a base station and server 70 may be housed on a ship, a bunker, aircraft or satellite.
  • Detector 20 may also include a transmitter having sufficient bandwidth to provide detected information to server 70 or device 40.
  • Detector 20 may also include processing systems to ensure that video information provided by detector 20 to the network device 40 or server 70 is useful. To the extent that detector 20 transmits non-useful information, detector 20 and network device 40 or server 70 will unnecessarily and inefficiently expend resources.
  • detector 20 may also display its own sensed data or reduced information and use sensed or reduced information to control its own operation automatically.
  • detector 20 may include a video camera, the gain of which may be automatically controlled, once the gain has been determined at the detector.
  • Detector 20 may also include a receiver, allowing it to display or further use information from another detector 30.
  • Network device 40 or server 70 that incorporates a base station may also be in communication with other network devices in the field (not shown).
  • network device 40 may be in communication with server 70 via network 80 or via another communication medium, such as a separate wireless or wired network, cellular network or direct connection.
  • device 40 may receive, decompress and display data from detector 20 in real-time.
  • Device 40 may also send configuration packets to detector 20. These configuration packets may include feature specifications, sensitivity metrics and other sensing information that may be communicated to the detector 20 and/or may aid in identifying useful information and/or masking clutter.
  • network 80 may be a cellular network, wireless LAN, wired network, the Internet or any combination of networks and communication infrastructures capable of transmitting data.
  • Detector 20, device 40, detector 30 and server 70 may directly communicate via a wired or wireless direct connection and may also include hardware to record data. Any combination of detector 20, device 40 or server 70 may then perform various operations to identify events, remove clutter and reduce the sensor data prior to or after further transmission of the sensor data.
  • FIG. 2 is a block diagram illustrating example detector 20 according to an embodiment of the invention.
  • detector 20 comprises a sensor module 100, a display module 110, a data reduction module 115 and an encoder/decoder module 120.
  • the detector 20 also includes a processor 127 to execute the various modules and handle wired and wireless communications.
  • the detector 20 may be configured to communicate with other devices via a direct wired or wireless connection or via a wired or wireless network.
  • the sensor module 100 may be configured to sense any variety of information as will be understood by those skilled in the art. This capability of sensor module 100 will not be further discussed.
  • the display module 110 is configured to output information to a user interface device such as a monitor (for image data) or speaker (for audio data) or any other type of user interface device.
  • the data reduction module 115 is configured to reduce the size of the sensed data and processor 127 is configured to execute the various modules and implement communications between detector 20 and other devices.
  • the detector 20 is also configured with data storage area 25 that may comprise volatile and/or non-volatile memory as suitable for the particular purpose of detector 20. Examples of detectors include platforms dedicated to special purpose sensors such as video cameras, microphones and radar, as well as general purpose platforms such as cell phones, personal digital assistants and other wired or wireless communication devices that may be connected to any variety of sensors.
  • FIG. 3 is a block diagram illustrating example network device 40 according to an embodiment of the invention.
  • network device 40 comprises sensor module 140, a display module 150, a data reduction module 155 and an encoder/decoder module 160.
  • the network device 40 also includes a processor 167 to execute the various modules and handle wired and wireless communications.
  • the network device 40 is configured to communicate with other devices via a direct wired or wireless connection or via a wired or wireless network.
  • the network device 40 includes many of the same functional capabilities as the previously described detector 20. This contemplates, but does not require, an embodiment in which the detector 20 lacks the processing capability to fully analyze and process the information it collects, although the detector may have some capability to detect events and reduce data.
  • the network device 40 may be integrated with a base station and the detector 20 may be a wireless camera that streams its sensor data directly to the network device 40.
  • the sensor module 140 may be configured to recognize events in the data that may be received by the device 40.
  • the display module 150 is configured to output information to a user interface device such as a monitor (for image data) or speaker (for audio data) or any other type of user interface device.
  • the data reduction module 155 may be configured to reduce the size of the recognized events and the processor 167 may be configured to execute the various modules and implement communications between the device 40 and other devices.
  • the device 40 may also be configured with data storage area 45 that may comprise volatile and/or non-volatile memory as suitable for the particular purpose of the device 40.
  • network devices include base stations, access points, wireless communication devices, personal computers, mobile cell towers and other processor enabled devices with the capacity to communicate with other processor enabled devices either directly (wired or wirelessly) or over a wired or wireless network and the capacity to process large amounts of data.
  • the encoder/decoder module 160 may be configured to decode reduced information that has been transmitted from one or more detectors 20.
  • the encoder/decoder module 160 may also be configured to decode raw sensor data or reduced information that has been received from other devices, sensors, or networks.
  • the encoder/decoder module 160 may also be configured first to decode information from one form, such as the auto-adaptive packets to be described in FIG. 12 below, and then to encode information in another form, such as one or more of the formats within the h.264 family.
  • FIG. 4 is a block diagram illustrating an example server 70 according to an embodiment of the invention.
  • the server 70 comprises a converter module 170, a stabilizer module 175, a video change module 180, an event window module 185, a data cropper module 190, a compressor module 195 and a processor 197.
  • the server 70 may also include the modules of the previously described network device, for example in an embodiment where the server is integral with a base station and directly receives information from one or more sensors.
  • the converter module 170 is configured to convert data.
  • the converter module 170 may transform video camera data to a suitable format for further analysis or transmission.
  • the stabilizer module 175 may be configured to analyze image information and replace pixels in a first frame of image data with pixels from a second frame of image data to correct for intra-frame camera motion.
  • the video change module 180 may be configured to identify inter-frame changes in individual pixels or features that depend on nearest neighbor pixels and also identify inter-frame changes from either temporally adjacent frames or between a baseline frame and subsequent frames in the packet.
  • the video change module 180 may also be configured to include changed pixels in reduced pixel packets and include non-changed pixels surrounding changed pixels in baseline pixel packets.
  • the event window module 185 may be configured to supply windows with configurable dimensions, containing events of interest that have been detected by the video change module 180.
  • the data cropper module 190 may be configured to supply image frame data (e.g., windows within a frame having configurable locations and sizes as requested).
  • the data cropper module 190 may also supply image frame data in full resolution or in reduced resolution.
  • the compressor module 195 may be configured to compress sensed information by creating packets that contain only information that has changed. For example, in an embodiment where the sensor produces image data, the compressor module 195 may create packets that contain only changed pixels with respect to a previous frame such as a baseline frame, which may be the adjacent previous frame in the data stream or another (earlier) previous frame in the data stream from the sensor.
  • the server 70 may also be configured with data storage area 75 that may comprise volatile and/or non-volatile memory as suitable for the particular purpose of the server 70.
  • the server may also be configured with one or more user interface devices for input and output of data and information. Examples of servers include base stations, access points, wireless communication devices, personal computers and other processor enabled devices with the capacity to communicate with other processor enabled devices either directly (wired or wirelessly) or over a wired or wireless network and the capacity to process large amounts of data.
  • the server 70 is thus made up of various analytical components, which may include, but may not be limited to the visual analytical components shown in FIG. 4.
  • the system 10 will utilize servers 70 to perform certain analytic operations that detectors 20 and devices 40 cannot perform due to limited processing power.
  • Some of the specific analytic components shown in FIG. 4 may alternatively reside on devices 40 or detectors 20, provided that they have sufficient processing power.
  • processing power is limited on devices 40 or detectors 20, due to power and space constraints that are more severe than on servers 70.
  • the system 10 will detect, recognize, analyze, and display events of interest, in real time, by utilizing processing power from detectors 20, network devices 40 and servers 70.
  • detectors 20 will operate in remote regions with limited electric power and limited processing assets.
  • detectors 20 may only be capable of performing relatively coarse data triage.
  • Network devices 40 may be capable of further and more refined data reduction and event recognition, by having access to more powerful processing and information from multiple detectors 20.
  • Servers 70 may be capable of still more substantial analytics, by having access to highly powerful processing and by using preliminary processing that may be performed by detectors 20 and network devices 40 as points of departure.
  • FIG. 5 is a block diagram illustrating an example partitioning configuration according to an embodiment of the invention.
  • FIGS. 6 and 7 are block diagrams illustrating example regions surrounded by a box according to embodiments of the invention.
  • the embodiments shown in FIGS. 5-7 provide for real time frame alignment to facilitate auto-adaptive event detection.
  • the embodiment shown in FIGS. 5-7 contemplates a two dimensional video embodiment where each data segment is an image, gray scale pixels within each image are arranged along a two dimensional grid and alignment occurs from one image to another, or alternatively alignment occurs from one time slice to another. Similar considerations apply if data segments are arranged in zero, one or three space dimensions and if at each point in a grid more than one value is recorded. For example red, blue, and green intensity values, instead of only gray scale values, may be measured at every point in a grid and therefore more than one value may be recorded at each point in a grid.
  • the directional change will be reflected in the shift amount along the direction of travel axis as well as the axis perpendicular to the direction of travel. If the UAV additionally banks while turning or the UAV flies over land that is not flat, the shift amount will differ among pixels in any given image. Accordingly, it should be understood that pixel shifts will be present from one image to the next image and also within any given image.
  • the auto-adaptive process first partitions each image into a number of panes, surrounded by a frame, in a configurable way.
  • FIG. 5 shows one such partitioning configuration, for a 20 row pixel by 36 column pixel image.
  • Four panes each of which has six rows and eight columns, are shown with each pane surrounded by a box.
  • the process also specifies a configurable template within each current pane. For example, each of the four templates within each FIG. 5 pane has two rows and four columns.
  • the process also specifies a configurable previous image search region for each pane. Within the search region for any given pane, its template may be matched with windows of the same size from the previous pane.
  • FIG. 6 shows one such region surrounded by a box from the pixel at row 2, column 6 through the pixel at row 11, column 21 corresponding to the upper left pane.
  • FIG. 7 shows another such region surrounded by a box from row 8, column 14 through row 17, column 29, corresponding to the lower right pane.
  • For each template within each current image, the auto-adaptive process identifies a box of the same size within its corresponding search region in the previous image that most closely matches the template. For example, the search for the best fitting match for the upper left template could begin by computing mean absolute differences between pixels in the template beginning in row 6, column 12 and corresponding pixels in a box with two rows and four columns, beginning in row 2, column 6 in the previous image. The search would proceed by comparing the same template with all windows of the same size in its corresponding search region.
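The following sketch illustrates the mean-absolute-difference template search just described, under the assumption that images are numpy arrays of gray-scale values; coordinates are (row, column) offsets of a window's upper-left pixel. It is an illustration of the general technique rather than the patented routine.

```python
import numpy as np

def best_match(template, prev_image, region_top, region_left, region_h, region_w):
    """Slide the template over every same-sized window in the search region of the
    previous image and return the position with the smallest mean absolute difference."""
    th, tw = template.shape
    best_score, best_pos = None, None
    for r in range(region_top, region_top + region_h - th + 1):
        for c in range(region_left, region_left + region_w - tw + 1):
            window = prev_image[r:r + th, c:c + tw]
            score = np.mean(np.abs(window.astype(np.int16) - template.astype(np.int16)))
            if best_score is None or score < best_score:
                best_score, best_pos = score, (r, c)
    return best_pos, best_score
```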
  • the auto-adaptive process also identifies its location in the previous image.
  • the best fitting location box for the upper left pane might be the small box shown in FIG. 6 (see box located entirely within columns 9-16, rows 3-8), surrounding pixels from row 3, column 9 through row 4, column 12. If that were the case, the current pane would have evidently shifted by three row pixels and three column pixels, relative to the previous pane.
  • the process aligns the pixels in the previous pane accordingly. For example, to align pixels in the upper left pane with its previous pane and correct the shift that the process has identified in FIG. 6, the process would create a new change score image.
  • Each pixel in the change score image contains the corresponding pixel in the current image minus its shifted pixel in the previous image. For example, each pixel from row 4, column 10 through row 9, column 17 in FIG. 6 would contain its corresponding current pixel value minus the previous pixel value shifted three columns to the left of it and three rows above it.
  • the auto-adaptive process also normalizes change scores adaptively, allowing corrections to be made for baseline changes such as changing light levels from one image to another or changing light levels from one sub-image to another.
  • the process also masks pixels that have change values near zero, while showing changing pixels in their raw form.
  • video may be displayed in real time, and aligned just as the unmasked video, except pixels corresponding to no movement will be masked while pixels corresponding to movement will not.
  • the auto-adaptive process may optionally be partitioned to operate using many parallel processors, so that each processor operates on a subset of the panes or on a sub-image within each pane.
  • the process may also optionally be operated by skipping images. Through parallel processing or image skipping, the process may be employed to operate in real time, even on high resolution video, streaming at high rates.
  • the auto-adaptive process may be simply extended to allow images from more than one video camera, viewing the same region, to be aligned in real time. As a result, composite images with higher resolution or with three-dimensional renderings may be viewed in real time. The process may also be extended to allow images from more than one video camera, viewing overlapping regions, to be aligned in real time, such that stitching between images is visually negligible.
  • Shadow effects may occur because changes between images may include differences between where a moving object is located now and where it was not located before, as well as differences between where a moving object is not located now and where it was located before. Such effects may be effectively removed by using plausibility values to mask shadow effects.
  • the above process reduces data storage and transmission by overlaying detected changes on baseline images to align consecutive images in space.
  • the above process may be configured to create a baseline image at the beginning of every 100 frames, and then paste only changed values to the baseline image during the next 100 frames.
  • the above process also reduces data storage and transmission by transmitting only changed pixels while setting other pixel values to zero.
  • received and recovered data may be used to reproduce images with high resolution by overlaying changed values on baseline images. As image locations shift, views of new image regions may be preserved by transmitting or storing only pixels containing new locations, rather than all pixels.
  • the above process also reduces data storage and transmission by operating in conjunction with other space alignment processes, based on global positioning systems, gyroscopic stabilization, or rigid mounting.
  • All such other processes have limited capacity to align high resolution pixels.
  • the above process may use alignment from other such processes as a point of departure for fast alignment and/or fine tuning.
  • the above process may also reduce data substantially, while reproducing images in full pixel resolution.
  • alternative data reduction processes such as encoders in the h.264 family, may not reproduce images in full resolution.
  • Estimating shift values may take substantial computer time, especially when the above process and other template matching alternatives use jpeg or mpeg values as input, or when images contain little discriminating pixel variation.
  • the above process may be extended to include provision for adaptive searching to establish faster estimation. For example, small template and shift region sizes may be used initially to localize more accurate follow-up estimation based on larger template and shift region sizes. Previous pane shift values may also be used as points of departure. Pane shift estimates based on a small number of panes may also be used to identify shift values in other panes by interpolating and extrapolating estimates to other positions within images, based on fitting geometric planes to the estimated values.
  • the above process may operate in conjunction with parallel, adaptive searching to establish faster estimation.
  • the above process may initially operate in real time by skipping frames, allowing changes to be detected only every 100 frames, due to initially slow shift estimation.
  • adaptive searching to establish faster operation may determine that use of previous shifts, smaller template sizes, smaller search regions, interpolation and extrapolation can increase speed by a factor of 100. Once such a faster operation method has been determined, it may be deployed at once by changing the above process configuration.
  • the above process may be configured to operate effectively in some settings, and reconfigured to operate effectively as conditions change or in other settings.
  • an operator may be viewing video from an unmanned aircraft camera, based on the above process, when it has been tuned to identify relatively few changes against highly cluttered background.
  • the operator may send a signal to the unmanned aircraft, changing the configuration for the above process, accordingly.
  • the above process may be made more effective by blurring prior images through window averaging or by compensating for differences in global image characteristics such as differences in image means, variances, minimum values and maximum values.
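One plausible sketch of such pre-processing, assuming numpy gray-scale frames, blurs the prior image with a simple box average and rescales it so that its global mean and spread match the current frame before differencing. The window size and the choice of statistics are assumptions for illustration.

```python
import numpy as np

def normalize_prior(prior, current, window=3):
    """Box-blur the prior frame and match its global mean and standard deviation
    to the current frame, to reduce spurious change scores before differencing."""
    prior = prior.astype(np.float64)
    pad = window // 2
    padded = np.pad(prior, pad, mode="edge")
    blurred = np.zeros_like(prior)
    for dr in range(window):                 # window averaging (simple box blur)
        for dc in range(window):
            blurred += padded[dr:dr + prior.shape[0], dc:dc + prior.shape[1]]
    blurred /= window * window
    # Compensate for global differences in image means and variances.
    blurred = (blurred - blurred.mean()) / (blurred.std() + 1e-9)
    return blurred * current.std() + current.mean()
```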
  • sensor data from multiple sensors may be reduced into smaller portions called snippets and then snippets from multiple sensors may be combined. Snippets containing events of interest may be combined in order to further distill or resolve snippet information, prior to transmission or final analysis.
  • the sensor data is image data.
  • the sensor data can be audio data, audio-video data or any of a variety of other types of sensor data.
  • the sensor data is image data
  • several cameras (i.e., sensor modules 100 or 140)
  • the cameras can be stationary (e.g., in a room) or moving (e.g., on a vehicle) and can be positioned so as to sense an object of interest.
  • multiple complete camera image frames may be combined to produce a clearer picture. Doing so is very difficult, because the frames must be spatially aligned, rescaled, and refocused in order to combine information usefully at the pixel level. However, the problem becomes much easier if much smaller windows within a frame of image data, where the window contains the object or event of interest, can be combined.
  • the cameras are very low power, solar powered video cameras that can each communicate with a powered hub at distances of a few hundred feet.
  • the cameras may be the sensors 120 and the hub may be a network device that processes the sensor data.
  • a plurality of sensors may include a plurality of wireless communication devices - in this example they are cell phones.
  • nine video cameras may be monitoring a scene of interest in real time and each video camera may be connected to its own cell phone/wireless communication device.
  • each wireless communication device would have sufficiently powerful processors and hardware resources to process its own video camera image data and identify events of interest and windows of interest from its own camera in real time.
  • each wireless communication device would be configured to transmit its windows of interest (i.e., its pre-processed sensor data) to a central server in real time and thereby drastically reduce necessary bandwidth and increase the number of available sensors that can be deployed.
  • the number of wireless communication devices that could be monitoring the scene of interest might drop from nine to three.
  • the central server module may be configured to receive the windows of interest from each of the nine wireless communication devices and combine them as inputs for further processing to produce composite output in real time. Operating on windows of interest rather than full frames greatly simplifies fusion, including spatial and temporal registration and also greatly reduces the required bandwidth.
  • the detector, the server or another network device may include a display to show the windows of interest from each camera along with the composite output in real time.
  • displaying only the windows of interest greatly simplifies operator interpretation of the sensor data and improves clarity by removing unimportant clutter data.
  • analysis of the composite sensor data after event detection and data reduction increases precision and clarity over analysis of the perceived output from any one of the nine individual cameras.
  • FIG. 8 is a block diagram illustrating two example image frames before and after pixel reduction according to an embodiment of the invention.
  • original image frames 280 and 290 are each part of a separate sequence of image frames comprising a video and reduced image frames 285 and 295 are subsequent image frames in each respective video that are reduced versions of original image frames 280 and 290, respectively.
  • In reduced image 285, the only pixels remaining from the original image frame 280 are those associated with objects in the video that have moved in the time period between the capture of original image frame 280 and reduced image frame 285, namely the walking pedestrian 287 and the driving vehicle 289.
  • the stationary objects have been removed from the reduced image 285.
  • the camera that captured the video comprising image frames 280 and 285 is a stationary video camera.
  • Scaling may include, but may not be limited to computing average values among pixels in a 10 row by 10 column by 10 consecutive time slice window and transmitting only those average values rather than all pixel values.
  • An encoder/decoder module 120 or 160 may encode the pixels in the image 290, such that the average pixel values in image 290 are encoded as a reference frame and the pixels of interest 297 in the reduced image 295, along with their addresses, are also encoded. The resulting encoded packet may then be transmitted or stored for future use.
  • a second encoder/decoder module 120 or 160 may then, after transmission or storage, reproduce the image 295 for display by overlaying the pixels within the event of interest 297 onto the reference frame 290. As a result, substantial data reduction during transmission or storage may occur.
  • Transmitting or storing pixels within events of interest in this way, along with scaled counterparts of the other pixels, may allow the original events of interest to be reproduced in full resolution, with the background in greatly reduced form, yet with sufficient clarity to provide useful context for the event of interest.
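A hedged sketch of this reference-plus-event packet is given below: the background of the reference frame is kept only as 10 by 10 block averages, while pixels inside the event-of-interest window are kept in full resolution together with their addresses; the decoder expands the averaged background and overlays the full-resolution event pixels. The block size, the window description as (top, left, height, width), and the function names are assumptions.

```python
import numpy as np

def encode(frame, event_box, block=10):
    """Return a block-averaged background plus full-resolution event pixels with addresses."""
    top, left, h, w = event_box
    rows, cols = frame.shape
    trimmed = frame[:rows - rows % block, :cols - cols % block].astype(np.float64)
    background = trimmed.reshape(rows // block, block, cols // block, block).mean(axis=(1, 3))
    event_pixels = frame[top:top + h, left:left + w].copy()
    return background, (event_box, event_pixels)

def decode(background, event, block=10):
    """Expand the averaged background and overlay the full-resolution event pixels."""
    (top, left, h, w), event_pixels = event
    frame = np.kron(background, np.ones((block, block)))   # expand each average to a block
    frame[top:top + h, left:left + w] = event_pixels
    return frame
```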
  • the system reduces the total amount of data that is analyzed and processed by minimizing sensor data prior to storage or transmission. However, because the original image data is preserved in the baseline frame, the system can recover the full image data in any subsequent frame for display upon demand.
  • sensor data may include conventional video in gray scale or red, green, and blue (RGB) form, infrared (IR) data, radar display data, tomography data, or any variety of similar data. Sensors may be either stationary or moving.
  • a packet of data comprises a baseline reference frame and a configurable number of corresponding reduced frames.
  • the initial reference frame is transmitted or stored in full resolution without any reduction in pixel data.
  • subsequent reduced frames contain only pixels that have changed, relative to the reference frame.
  • subsequent reference frames may be captured and these subsequent reference frames may also be reduced with respect to the original reference frame. In operation, this further reduction of subsequent reference frames is more practical when the video data is captured by a stationary camera.
  • Data packets containing the video data as a series of reference frames and reduced frames may be compressed in any of several ways, including JPEG, MPEG, and run length encoding ("RLE").
  • the data packets may be formatted in a variety of ways, depending on compression type, camera pixel format, and display format as will be understood by those having skill in the art.
  • the system initially converts raw video camera data to a suitable format for further data reduction analysis.
  • a camera stabilizer may transform pixels from a frame to those from a previous frame, to correct for intra-frame camera motion.
  • a video change detector may identify inter-frame changes, in either individual pixels or features that depend on nearest neighbor pixels and identify inter-frame changes from either temporally adjacent frames or between an initial, baseline frame and the subsequent frame and include those changed pixels in the reduced pixel image.
  • the system may also include some non-changed pixels surrounding changed pixels in the reduced pixel image.
  • an event window generator may determine windows within an image frame (e.g., windows with configurable dimensions) where a window contains an event of interest that has been detected by the video change detector.
  • an event window generator may employ any variety of automatic event detection processors, including but not limited to adaptive clutter reduction, template matching, and feature-based detection processes.
  • the system may also provide a window within an image frame by cropping the image frame in accordance with configurable locations and sizes specified by an operator or a client application.
  • the cropper may also provide the image data from within the identified window in full resolution while supplying the remainder of the image data for the frame in a reduced resolution.
  • a compressor may create data packets containing only the pixels that have changed and in an alternative embodiment, the data packets may optionally include reduced resolution pixels provided by the cropper.
  • the system may transmit the data packets containing the reduced image data that collectively convey the video captured by the camera or store the data packets in a data storage area.
  • the system may also include a decompressor configured to restore image frames to their original form so that the original captured video can be played out to a monitor or speaker or other suitable device for outputting the captured sensor data.
  • the compressed or original captured video can also be provided to a programmed module or hardware device (or a module comprising both hardware and software) for further computer implemented analysis of the data.
  • the system takes in each frame of data and provides alerts as to where in each image frame a change has occurred.
  • the output is a pixel map of the image frame specifying where in each image frame the changed pixels occurred.
  • the system may convert the image frame data to an image frame where the original pixel values of changed pixels are included and the pixel values of unchanged pixels are reset to zero.
  • more than one pixel can be included in a reduced image frame for each changed pixel. For example, the original pixel values of the two pixels above, below, to the right and to the left of each changed pixel may be included in a reduced image and stored.
  • a baseline reference frame is captured at the beginning of each series of reduced frames and a reference frame may also be captured at a regular interval as specified by configuration metrics.
  • the baseline image frame is used to rebuild a reduced image frame by applying the changed values in the reduced image frame to the baseline image frame.
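The rebuild step can be illustrated with a one-line sketch: wherever the reduced frame is nonzero its value is kept, and the baseline value is used elsewhere. This assumes, as a simplification, that genuinely zero-valued changed pixels are handled separately (for example via the changed-pixel address list); the function name is hypothetical.

```python
import numpy as np

def rebuild(baseline, reduced):
    """Apply the changed (nonzero) values in the reduced frame to the baseline frame."""
    return np.where(reduced != 0, reduced, baseline)
```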
  • the system may also store a time stamp which can later be used for reconstructing the overall video data and used during replay or analysis.
  • the system extends the storage capacity of a fifty video camera sensor system from less than three weeks to more than a year. With more aggressive configurations, the system can extend the storage capacity by orders of magnitude, while still providing extremely high fidelity replay of the originally captured video data.
  • real time data processing as described herein provides reducing of sensor data from stationary or moving sensors to useful, high quality information during encoding and results in high quality pixel values during decoding.
  • the data processing disclosed herein replaces conventional CODEC systems and methods entirely.
  • the data processing disclosed herein is used in combination to further reduce output data from conventional CODEC systems and methods.
  • the data processing disclosed herein converts and/or reduces sensor data to a format that is supplied as input to established CODEC methods, resulting in output CODEC video with substantially improved compression or quality or both.
  • the data processing can be implemented by a processor executing certain programmed modules or it can be implemented using dedicated hardware or using some combination of programmed modules and hardware. Whether implemented in a software, hardware or combination embodiment, the data processing disclosed herein allows frames to be processed in about 10 milliseconds, using conventional software applications implemented on currently available processors. Advantageously, this speed is over 10 times faster using graphics processing units and over 100 times faster using currently available digital signal processing and field programmable gate arrays, and is even faster using application specific chips. The data processing described herein also allows frames to be reproduced at roughly the same speeds, regardless of frame resolution.
  • the data processing system identifies pixels of interest ("POI") automatically, encodes POIs in high resolution, and either encodes other pixels (non POIs) in low resolution or does not encode other pixels at all to further reduce data size.
  • POI: pixels of interest
  • Many advantages to the presently described data processing system result, including: (1) identifying and encoding all pixels of interest in full resolution, rather than compressing all pixels within a frame or a block; (2) identifying pixels of interest using auto-adaptive functions that continuously learn how to distinguish POIs from other pixels; and (3) identifying POIs at a high rate that keeps pace with many high resolution cameras running at once, using one encoder.
  • FIG. 9 is a block diagram illustrating example encoding of reduced video data over time according to an embodiment of the invention.
  • FIG. 9 illustrates an embodiment with five consecutive frames, labeled from 301 to 305, which were generated by a video camera while a vehicle was moving. In alternative embodiments, both the camera and the vehicle may be in motion, or only the camera may be in motion relative to the vehicle. The motion may be synchronous or asynchronous.
  • FIG. 9 shows the system converting the frames into compressed form.
  • the encoded version of reference frame 301 looks like the complete video camera frame. However, the encoded version of frame 302, labeled 302A, contains only pixels that changed between 302 and 301, and the encoded version of frame 303, labeled 303A, contains only pixels that changed between 303 and 301, and so on.
  • FIG. 10 is a block diagram illustrating example full pixel encoding of video data over time according to an embodiment of the invention.
  • FIG. 10 further illustrates an embodiment with frame 301 and 302 broken down into pixels, showing the frames containing 18 rows and 20 columns of pixels.
  • images may contain many more pixels.
  • the smaller number of pixels shown in FIG. 10 is to illustrate an embodiment.
  • a more representative density, namely 2,400 pixel rows and 3,200 pixel columns, will be used to describe various embodiments in further detail.
  • the encoder shown in FIG. 10 uses the pixels in frame 302, along with the pixels in reference frame 301 to identify changed pixels relative to the corresponding reference frame 301 pixels.
  • the encoder may then output a reduced frame 302A, containing color intensity values for only the changed pixels.
  • the encoder may use frame 303 pixels (not shown), along with reference frame 301 pixels, to identify changed frame 303 pixels, and output color intensity values for only the changed pixels as a reduced frame 303A (not shown).
  • The encoder may proceed in the same way for frame 304 and so on.
  • FIG. 11 is a block diagram illustrating example compact pixel encoding of video data over time according to an embodiment of the invention.
  • FIG. 11 further illustrates one embodiment. Among the 18 by 20 full frame pixels shown in 301-303, only 5 by 5 reduced frame pixels are shown in B301-B303.
  • the reduced frame pixels illustrate one embodiment that reduces the number of pixels in a full resolution frame, such as 2,400 by 3,200 pixels, to a much lower number, say 240 by 320. The smaller number may then be processed in real time to identify change regions in the full resolution frame.
  • FIG. 11 also illustrates a further embodiment.
  • Once frame 302 has been contrasted with frame 301 in order to identify change regions in frame 302A, those same change regions may be used to identify the regions of frame 303 to compare against reference frame 301 when generating reduced frame 303A, without having to first identify change regions in frame 303.
  • encoded packets containing reference frame pixels and change pixels may be created.
  • encoded packets may be stored or transmitted.
  • stored or transmitted packets may be further processed by a decoder.
  • the decoder may prepare reference frame pixels, such as those shown in frame 301 within FIG. 11, for display.
  • the decoder may then sequentially prepare subsequent frame pixels, such as those shown in frames 302 through 305 within FIG. 11, for display.
  • Each such frame may be prepared for display by overlaying its encoded change pixels from reduced frames (e.g., 302A and 303A) onto their corresponding pixels from reference frame 301.
  • change regions based on contrasting frame 302 with the reference frame 301 may be enlarged so that they will cover further change between frame 302 and 303. Enlarging change regions to cover all anticipated change will not result in reduced display frame quality. Instead, outer change region pixels, which may not contain pixels that actually changed, would have the same intensity values as their corresponding reference frame pixels. As a result, overlaying their values onto the same corresponding reference frame values will have no visible effect.
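A minimal sketch of enlarging a change region is shown below: each detected change box is grown by a configurable margin so that it also covers motion anticipated in the following frames. The margin value and the (top, left, height, width) box convention are assumptions.

```python
def enlarge_box(box, margin, frame_h, frame_w):
    """Grow a change region by `margin` pixels on every side, clipped to the frame."""
    top, left, h, w = box
    new_top = max(0, top - margin)
    new_left = max(0, left - margin)
    new_bottom = min(frame_h, top + h + margin)
    new_right = min(frame_w, left + w + margin)
    return new_top, new_left, new_bottom - new_top, new_right - new_left
```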
  • FIG. 12 is a flow diagram illustrating an example process for encoding video data according to an embodiment of the invention.
  • the illustrated process may be implemented by a system or devices such as previously described with respect to FIGS. 1-4.
  • Alternative embodiments of FIG. 12 may be configured manually through the use of a configuration console which may be operated to create a configuration array.
  • Alternative embodiments may also be configured automatically.
  • Alternative embodiments may include, but not be limited to, determining how often in a frame sequence a new reference frame should be created and how often a frame should be used to detect changes against the reference frame. Elements in an array that are used to determine preferred embodiments are called configuration metrics.
  • In step 410 the process starts and proceeds to reading image data in step 415.
  • the image data may be read from a data storage area or may be read in real time from a communication interface, e.g., from a sensor via the communication interface.
  • each frame may be initially read into a computer.
  • the frame may then be optionally preprocessed in order to reduce the frame size to cover only regions of interest, as specified by configuration metrics. Preprocessing may range from simple image cropping based on manual image control to auto-adaptive identification of within-frame windows covering faces, periscope wakes, marine mammals, tumors, or other events of interest.
  • in step 420 the system determines if the frame is a reference frame. This determination may be made based on configuration metrics; for example, every 100th frame may be determined to be a reference frame, or every 25th frame, depending on the particular implementation.
  • each image data input that is read in step 415 may be stored in an image buffer for later use.
  • the frame is encoded in step 425 and then compacted in step 430.
  • When a frame is encoded, its encoded values are copied into a packet buffer.
  • compacting may include skipping pixel rows and columns.
  • the number of resulting pixels to be used for change detection may be reduced by a factor of 100 from 2,400 times 3,200 pixels to 240 times 320 pixels.
  • each non-overlapping, continuous set of 10 by 10 pixels in a full resolution frame is replaced by its average, its maximum value, or some other statistic that may be more representative of the block than its center pixel.
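  • As an illustration only, a minimal Python/NumPy sketch of this compaction step follows; the block size of 10, the choice of statistic, and the function name compact_frame are assumptions rather than requirements of the application.

    import numpy as np

    def compact_frame(frame, block=10, statistic="mean"):
        # Replace each non-overlapping block-by-block region with one value,
        # e.g. 2,400 x 3,200 pixels -> 240 x 320 values when block = 10.
        rows, cols = frame.shape
        trimmed = frame[: rows - rows % block, : cols - cols % block]
        blocks = trimmed.reshape(rows // block, block, cols // block, block)
        if statistic == "mean":
            return blocks.mean(axis=(1, 3))
        if statistic == "max":
            return blocks.max(axis=(1, 3))
        # fall back to simple row and column skipping
        return frame[::block, ::block]

  • With block = 10 and the "mean" statistic, a 2,400 by 3,200 frame is reduced to 240 by 320 values, matching the factor-of-100 reduction described above.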
  • in step 435 the system determines if the frame is a "change detection frame."
  • a change detection frame, such as frame 302 in FIG. 11, may be preprocessed in the same way that the reference frame was processed, as specified by configuration metrics.
  • the system next compacts the frame in step 440 and locates the changed pixels based on the compacted frame in step 445.
  • An overlay including the changed pixels and their addresses is then encoded in step 450 and stored in the packet buffer 490.
  • the reference frame's pixels are subtracted from the change frame's pixels.
  • changed pixel addresses are identified when encoding a frame.
  • Changed pixel addresses may be represented in a variety of ways. For example, frame pixels may be stored sequentially in a one-dimensional array, and changed pixels may be organized into contiguous regions. Continuing with the example, change pixel addresses can be represented by a number of runs containing contiguous change addresses. Each such run may be further represented by only its starting address and its run length. Once a change frame's changed pixel addresses have been identified and stored, they may also be stored in a changed pixel addresses array within the encoder as well as in a changed pixel addresses array within the encoded packet buffer 490.
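  • A minimal Python sketch of this run-based representation follows, for illustration only; the function names are hypothetical. For example, encode_change_runs([4, 5, 6, 20, 21]) returns [(4, 3), (20, 2)], and decode_change_runs recovers the original addresses.

    def encode_change_runs(changed_addresses):
        # Represent sorted, flat changed-pixel addresses as (start, length)
        # runs of contiguous addresses.
        runs = []
        for address in changed_addresses:
            if runs and address == runs[-1][0] + runs[-1][1]:
                runs[-1] = (runs[-1][0], runs[-1][1] + 1)   # extend the current run
            else:
                runs.append((address, 1))                   # start a new run
        return runs

    def decode_change_runs(runs):
        # Expand (start, length) runs back into individual pixel addresses.
        return [start + k for start, length in runs for k in range(length)]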
  • the change pixel frame may be encoded as an overlay frame in step 450.
  • This step includes fetching all changed pixel color intensity values from the image data that may be stored in an input buffer.
  • the pixel addresses in the changed pixel addresses array are encoded into the overlay frame along with their intensity values and stored in the encoded packet buffer.
  • the packet buffer 490 contains the intensity values in a single array for all reference frames, change frames, and other frames so that a decoder may separate them into individual frame intensity values along with the contents of the packet buffer's 490 changed pixel addresses array. This functionality may be based on configuration metrics to add flexibility to the overall system.
  • if in step 435 it is determined that the frame is neither a reference frame nor a change detection frame, the system proceeds to step 450 and encodes the frame.
  • a change detection frame always follows a reference frame and standard frames always follow a change detection frame until such time as a new reference frame is established, for example based on configuration metrics.
  • standard frame 303 follows change detection frame 302, which follows reference frame 301.
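  • For illustration, a minimal Python sketch of this frame-ordering rule follows; the interval of 100 frames is only the example configuration metric mentioned above, and the function name classify_frame is hypothetical.

    def classify_frame(index, reference_every=100):
        # Classify a frame by its position in the sequence: frames 0, 100, 200,
        # ... are reference frames, the frame immediately following each
        # reference frame is the change detection frame, and all remaining
        # frames are standard frames.
        position = index % reference_every
        if position == 0:
            return "reference"
        if position == 1:
            return "change detection"
        return "standard"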
  • Standard frames may be preprocessed in the same way that reference frames and change frames are processed, as specified by configuration metrics.
  • an overlay frame of the standard frame is encoded in step 450 during which changed pixel addresses are identified.
  • Changed pixel addresses are identified for a standard frame in the same way that they are identified for a change frame. For example, the change regions from the change detection frame are used to identify the region of potential change with respect to the current standard frame and the pixel intensity values in the change regions of the current frame are compared against the pixel intensity values in the reference frame.
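  • A simplified Python/NumPy sketch of this comparison follows, for illustration only; it assumes the change regions are supplied as a boolean mask and uses a fixed intensity threshold (here 8) in place of the auto-adaptive deviance test described with respect to FIG. 13, so the threshold and the function name are assumptions.

    import numpy as np

    def encode_standard_overlay(frame, reference, change_region_mask, threshold=8):
        # Restrict the comparison to the change regions found for the preceding
        # change detection frame, then keep only pixels whose intensity differs
        # from the reference frame by more than a threshold.
        difference = np.abs(frame.astype(np.int32) - reference.astype(np.int32))
        changed = change_region_mask & (difference > threshold)
        addresses = np.flatnonzero(changed)      # flat changed-pixel addresses
        values = frame.ravel()[addresses]        # their intensity values
        return addresses, values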
  • in step 455, after a reference frame, a change frame, or a standard frame has been encoded, compacted, and stored in the packet buffer, the system determines if the packet should be sent. If the packet is to be sent, then in step 460 the packet buffer is sent. The packet buffer can be sent by transmitting it directly or over a network to another device, or the packet buffer can be sent by saving the data in a memory or other data storage area. These alternatives can advantageously be selected by the use of configuration metrics. If the packet is not to be sent, or after the packet has been sent, the system next determines in step 465 if there is more image data to be processed. If there is more image data, the system returns to step 415 to read more image data and process any number of additional reference frames, change frames, and standard frames. If there is no more image data to be processed, the process ends as shown in step 470.
  • the process 400 may first store and compact an initial reference frame. When subsequent reference frames are read and identified, those reference frames may be compacted and "reference change frames" created based on the compacted values of the initial reference frame (which contains all of its pixels) and the compacted values of the subsequent reference frame. The compacted reference change frame values may then be used to compute changed pixel addresses, which in turn may be used to encode a "reference change overlay frame".
  • subsequent reference frames may be converted to reference change frames during encoding, and they may be overlaid onto the initial reference frame during decoding, just as change frames can be created and overlaid onto reference frames.
  • subsequent reference frames may be substantially compressed, just as other frames may be substantially compressed.
  • FIG. 13 is a block diagram illustrating an example pixel change locator 545 according to an embodiment of the invention.
  • the pixel change locator 545 can be used in the method described with respect to FIG. 12 to locate change pixels.
  • the pixel change locator 545 can be implemented in hardware or software or any combination of the two.
  • the pixel change locator 545 comprises a locate deviant pixels module 510, a deviant pixel addresses module 520, and a locate change regions module 530. Additionally, as input the pixel change locator 545 receives a compacted difference frame 500, and as output the pixel change locator 545 provides changed pixel addresses.
  • the locate deviant pixels module 510 is configured to convert each pixel difference value to an auto-adaptive deviance value.
  • Deviance value conversion may include first subtracting a learned mean from a compacted difference value, and then dividing the result by a normalizing factor.
  • the normalizing factor may be the square root of a learned mean squared deviation ("MSD").
  • the learned mean and MSD may advantageously be updated recursively.
  • the learned mean and MSD values may be periodically updated and used for each pixel or the learned mean and MSD values may be established and then used for all pixels.
  • deviance values may be evaluated using robust deviance cutoff values, which do not depend on varying intensity under different environmental conditions such as lighting. Converting pixel difference values to robust deviance values in this way is highly valuable, particularly when the deviance values are obtained quickly.
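  • For illustration only, a Python/NumPy sketch of this deviance conversion follows; the application does not specify the exact recursion, so the exponentially weighted updates and the learning rate of 0.01 are assumptions, as is the function name.

    import numpy as np

    def update_deviance(difference, mean, msd, rate=0.01, eps=1e-6):
        # Convert a compacted difference frame to auto-adaptive deviance values:
        # subtract the learned mean, divide by the square root of the learned
        # mean squared deviation (MSD), then update both statistics recursively.
        deviation = difference - mean
        deviance = deviation / np.sqrt(msd + eps)
        mean = mean + rate * deviation                    # recursive mean update
        msd = (1.0 - rate) * msd + rate * deviation ** 2  # recursive MSD update
        return deviance, mean, msd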
  • the deviant pixel addresses module 520 is configured to identify changed pixel addresses based on contiguous pixels in a region, rather than using only individual pixel deviance values.
  • regions are established by configuration metrics as all existing windows within a frame that contain 11 contiguous rows and 11 contiguous columns within a compacted difference frame.
  • a rule may be established by configuration metrics that identifies the center pixel as deviant if the number of deviant pixels within its window exceeds a value of 50.
  • the locate change regions module 530 may be tasked with implementing this rule to analyze and identify changed regions within the frame. Incorporating such an embodiment reduces the number of sporadic deviant pixels, which in turn reduces the size and complexity of deviant pixel addresses. Obtaining contiguous pixel addresses in this way is highly valuable, especially if they are obtained quickly.
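  • A Python sketch of this windowed counting rule follows, for illustration only; the 11 by 11 window and the count of 50 come from the example above, while the deviance cutoff of 3.0 and the use of SciPy's uniform_filter are assumptions.

    import numpy as np
    from scipy.ndimage import uniform_filter

    def locate_change_regions(deviance, cutoff=3.0, window=11, min_count=50):
        # Flag a compacted pixel as a change pixel when more than min_count
        # pixels in its window-by-window neighborhood exceed the deviance
        # cutoff, suppressing isolated, sporadic deviant pixels.
        deviant = (np.abs(deviance) > cutoff).astype(np.float32)
        # uniform_filter returns the local mean; multiply by the window area
        # to recover the local count of deviant pixels.
        counts = uniform_filter(deviant, size=window, mode="constant") * window * window
        return counts > min_count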
  • the efficiency of the pixel change locator 545 may be increased by utilizing overlap in change region windows to reduce deviant pixel counting; by storing all possible normalizing constant values in arrays to avoid computing square roots and dividing (which saves processor resources); by performing special purpose fixed-point and floating-point arithmetic, based on minimal word lengths, for increased storage and speed efficiency; by using fast, available, and affordable chips, such as Spartan 6 chips, each of which contains a number of DSPs, a sufficiently large FPGA, and sufficient local memory; by performing more complicated arithmetic on DSPs within a chip and less complicated counting on its FPGA; and by optimizing parallel pipelined operation.
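  • As an illustration of the table-lookup idea mentioned above (avoiding square roots and division in the inner loop), a small Python/NumPy sketch follows; the 16-bit range of MSD values and the names are assumptions.

    import numpy as np

    # Precompute reciprocal square roots for every representable MSD value so
    # that per-pixel normalization becomes a table lookup and a multiply
    # instead of a square root and a division.
    MAX_MSD = 1 << 16
    INV_SQRT = 1.0 / np.sqrt(np.arange(1, MAX_MSD + 1, dtype=np.float32))

    def normalize(deviation, msd_index):
        # Equivalent to deviation / sqrt(msd) for integer msd_index >= 1.
        return deviation * INV_SQRT[msd_index - 1]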
  • the pixel change locator 545 is advantageously able to keep pace with high speed, high resolution, and highly compacted frames from multiple cameras.
  • use of the pixel change locator 545 has been shown to operate at less than 10 milliseconds per 240 by 320 pixel frame, when running as application software on a 2 GHz computer in a Microsoft Windows® operating system environment.
  • Use of the pixel change locator 545 has also been shown to operate about 10 times faster on GPUs than on conventional CPUs.
  • Use of the pixel change locator 545 has also been shown to operate at less than 100 microseconds per 240 by 320 pixel frame, when running as a special purpose module (or modules) on a Spartan 6 chip.
  • (1) an encoder can process 2,400 by 3,200 frames at a rate of 1,250 frames per second when running as a software application on a conventional CPU; (2) an encoder can process 2,400 by 3,200 frames at a rate of 12,500 frames per second when running as a software application on a GPU; and (3) an encoder can process 2,400 by 3,200 frames at a rate of 125,000 frames per second when running as a special purpose process on a Spartan 6 chip.
  • FIG. 14 is a block diagram illustrating an example wired or wireless processor enabled device that may be used in connection with various embodiments described herein.
  • the device 550 may be used in conjunction with a sensor, a network device, a server, a base station, or other device as previously described with respect to FIGS. 1-4 and 12-13.
  • alternative processor enabled systems and/or architectures may also be used.
  • the processor enabled device 550 preferably includes one or more processors, such as processor 560. Additional processors may be provided, such as an auxiliary processor to manage input/output, an auxiliary processor to perform floating point mathematical operations, a special-purpose microprocessor having an architecture suitable for fast execution of signal processing algorithms (e.g., digital signal processor), a slave processor subordinate to the main processing system (e.g., back-end processor), an additional microprocessor or controller for dual or multiple processor systems, or a coprocessor. Such auxiliary processors may be discrete processors or may be integrated with the processor 560.
  • the processor 560 is preferably connected to a communication bus 555.
  • the communication bus 555 may include a data channel for facilitating information transfer between storage and other peripheral components of the processor enabled device 550.
  • the communication bus 555 further may provide a set of signals used for communication with the processor 560, including a data bus, address bus, and control bus (not shown).
  • the communication bus 555 may comprise any standard or non-standard bus architecture such as, for example, bus architectures compliant with industry standard architecture ("ISA”), extended industry standard architecture (“EISA”), Micro Channel Architecture (“MCA”), peripheral component interconnect (“PCI”) local bus, or standards promulgated by the Institute of Electrical and Electronics Engineers (“IEEE”) including IEEE 488 general-purpose interface bus (“GPIB”), IEEE 696/S-100, and the like.
  • Processor enabled device 550 preferably includes a main memory 565 and may also include a secondary memory 570.
  • the main memory 565 provides storage of instructions and data for programs executing on the processor 560.
  • the main memory 565 is typically semiconductor-based memory such as dynamic random access memory (“DRAM”) and/or static random access memory (“SRAM”).
  • Other semiconductor-based memory types include, for example, synchronous dynamic random access memory (“SDRAM”), Rambus dynamic random access memory (“RDRAM”), ferroelectric random access memory (“FRAM”), and the like, including read only memory (“ROM”).
  • the secondary memory 570 may optionally include an internal memory 575 and/or a removable medium 580, for example a floppy disk drive, a magnetic tape drive, a compact disc (“CD”) drive, a digital versatile disc (“DVD”) drive, etc.
  • the removable medium 580 is read from and/or written to in a well-known manner.
  • Removable storage medium 580 may be, for example, a floppy disk, magnetic tape, CD, DVD, SD card, etc.
  • the removable storage medium 580 is a non-transitory computer readable medium having stored thereon computer executable code (i.e., software) and/or data.
  • the computer software or data stored on the removable storage medium 580 is read into the processor enabled device 550 for execution by the processor 560.
  • secondary memory 570 may include other similar means for allowing computer programs or other data or instructions to be loaded into the processor enabled device 550.
  • Such means may include, for example, an external storage medium 610 and an interface 590.
  • external storage medium 610 may include an external hard disk drive or an external optical drive, or an external magneto-optical drive.
  • secondary memory 570 may include semiconductor-based memory such as programmable read-only memory (“PROM”), erasable programmable read-only memory (“EPROM”), electrically erasable read-only memory (“EEPROM”), or flash memory (block oriented memory similar to EEPROM). Also included are any other removable storage media 580 and communication interface 590, which allow software and data to be transferred from an external medium 610 to the processor enabled device 550.
  • Processor enabled device 550 may also include a communication interface 590.
  • the communication interface 590 allows software and data to be transferred between processor enabled device 550 and external devices (e.g. printers), networks, or information sources.
  • computer software or executable code may be transferred to processor enabled device 550 from a network server via communication interface 590.
  • Examples of communication interface 590 include a modem, a network interface card (“NIC”), a wireless data card, a communications port, a PCMCIA slot and card, an infrared interface, and an IEEE 1394 FireWire interface, just to name a few.
  • Communication interface 590 preferably implements industry promulgated protocol standards, such as Ethernet IEEE 802 standards, Fiber Channel, digital subscriber line (“DSL”), asynchronous digital subscriber line (“ADSL”), frame relay, asynchronous transfer mode (“ATM”), integrated digital services network (“ISDN”), personal communications services (“PCS”), transmission control protocol/Internet protocol (“TCP/IP”), serial line Internet protocol/point to point protocol (“SLIP/PPP”), and so on, but may also implement customized or non-standard interface protocols as well.
  • Software and data transferred via communication interface 590 are generally in the form of electrical communication signals 605. These signals 605 are preferably provided to communication interface 590 via a communication channel 600.
  • the communication channel 600 may be a wired or wireless network, or any variety of other communication link.
  • Communication channel 600 carries signals 605 and can be implemented using a variety of wired or wireless communication means including wire or cable, fiber optics, conventional phone line, cellular phone link, wireless data communication link, radio frequency (“RF”) link, or infrared link, just to name a few.
  • Computer executable code (i.e., computer programs or software) is stored in the main memory 565 and/or the secondary memory 570. Computer programs can also be received via communication interface 590 and stored in the main memory 565 and/or the secondary memory 570. Such computer programs, when executed, enable the processor enabled device 550 to perform the various functions of the present invention as previously described.
  • computer readable medium is used to refer to any non-transitory computer readable storage media used to provide computer executable code (e.g., software and computer programs) to the processor enabled device 550. Examples of these media include main memory 565, secondary memory 570 (including internal memory 575, removable medium 580, and external storage medium 610), and any peripheral device communicatively coupled with communication interface 590 (including a network information server or other network device). These non-transitory computer readable mediums are means for providing executable code, programming instructions, and software to the processor enabled device 550.
  • the software may be stored on a computer readable medium and loaded into processor enabled device 550 by way of removable medium 580, I/O interface 585, or communication interface 590.
  • the software is loaded into the processor enabled device 550 in the form of electrical communication signals 605.
  • the software when executed by the processor 560, preferably causes the processor 560 to perform the inventive features and functions previously described herein.
  • the system 550 also includes optional wireless communication components that facilitate wireless communication over a voice network and over a data network.
  • the wireless communication components comprise an antenna system 625, a radio system 615 and a baseband system 620.
  • the antenna system 625 may comprise one or more antennae and one or more multiplexors (not shown) that perform a switching function to provide the antenna system 625 with transmit and receive signal paths.
  • received RF signals can be coupled from a multiplexor to a low noise amplifier (not shown) that amplifies the received RF signal and sends the amplified signal to the radio system 615.
  • the radio system 615 may comprise one or more radios that are configured to communicate over various frequencies.
  • the radio system 615 may combine a demodulator (not shown) and modulator (not shown) in one integrated circuit ("IC").
  • the demodulator and modulator can also be separate components. In the incoming path, the demodulator strips away the RF carrier signal leaving a baseband receive audio signal, which is sent from the radio system 615 to the baseband system 620.
  • baseband system 620 decodes the signal and converts it to an analog signal. Then the signal is amplified and sent to a speaker.
  • the baseband system 620 also receives analog audio signals from a microphone. These analog audio signals are converted to digital signals and encoded by the baseband system 620.
  • the baseband system 620 also codes the digital signals for transmission and generates a baseband transmit audio signal that is routed to the modulator portion of the radio system 615.
  • the modulator mixes the baseband transmit audio signal with an RF carrier signal generating an RF transmit signal that is routed to the antenna system and may pass through a power amplifier (not shown).
  • the power amplifier amplifies the RF transmit signal and routes it to the antenna system 625 where the signal is switched to the antenna port for transmission.
  • the baseband system 620 is also communicatively coupled with the processor 560.
  • the central processing unit 560 has access to data storage areas 565 and 570.
  • the central processing unit 560 is preferably configured to execute instructions (i.e., computer programs or software) that can be stored in the memory 565 or the secondary memory 570.
  • Computer programs can also be received from the baseband processor 620 and stored in the data storage area 565 or in secondary memory 570, or executed upon receipt.
  • Such computer programs when executed, enable the communication device 550 to perform the various functions of the present invention as previously described.
  • data storage areas 565 may include various software modules (not shown) that were previously described with respect to FIGS. 1-5.
  • Various embodiments may also be implemented primarily in hardware using, for example, components such as application specific integrated circuits ("ASICs"), or field programmable gate arrays ("FPGAs"). Implementation of a hardware state machine capable of performing the functions described herein will also be apparent to those skilled in the relevant art. Various embodiments may also be implemented using a combination of both hardware and software.
  • a general-purpose processor can be a microprocessor, but in the alternative, the processor can be any processor, controller, microcontroller, or state machine.
  • a processor can also be implemented as a combination of computing devices, for example, a combination of a DSP and a microprocessor, a plurality of microprocessors, one or more microprocessors in conjunction with a DSP core, or any other such configuration.
  • a software module can reside in RAM memory, flash memory, ROM memory, EPROM memory, EEPROM memory, registers, hard disk, a removable disk, a CD-ROM, or any other form of storage medium including a network storage medium.
  • An exemplary storage medium can be coupled to the processor such that the processor can read information from, and write information to, the storage medium.
  • the storage medium can be integral to the processor.
  • the processor and the storage medium can also reside in an ASIC.

Abstract

Auto-adaptive systems and methods, and an auto-adaptive network, are disclosed for auto-adaptive event detection and for video encoding and decoding. One or more sensors generate sensor data, and the sensor data are then analyzed to identify events of interest. The sensor data are reduced to a series of reference frames, change frames, and standard frames and are encoded for transmission. At the receiving end, the reference frames, change frames, and standard frames are decoded to provide a high-fidelity reproduction of the original sensor data.
PCT/US2011/033323 2010-04-20 2011-04-20 Réseau de détection d'événement auto-adaptative : détails du codage et du décodage vidéo WO2011133720A2 (fr)

Applications Claiming Priority (10)

Application Number Priority Date Filing Date Title
US32590610P 2010-04-20 2010-04-20
US61/325,906 2010-04-20
US34970510P 2010-05-28 2010-05-28
US61/349,705 2010-05-28
US35878710P 2010-06-25 2010-06-25
US61/358,787 2010-06-25
US39238110P 2010-10-12 2010-10-12
US61/392,381 2010-10-12
US201161448607P 2011-03-02 2011-03-02
US61/448,607 2011-03-02

Publications (2)

Publication Number Publication Date
WO2011133720A2 true WO2011133720A2 (fr) 2011-10-27
WO2011133720A3 WO2011133720A3 (fr) 2012-01-12

Family

ID=44834796

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2011/033323 WO2011133720A2 (fr) 2010-04-20 2011-04-20 Réseau de détection d'événement auto-adaptative : détails du codage et du décodage vidéo

Country Status (1)

Country Link
WO (1) WO2011133720A2 (fr)

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2003087772A (ja) * 2001-09-10 2003-03-20 Fujitsu Ltd 画像制御装置
US20040143602A1 (en) * 2002-10-18 2004-07-22 Antonio Ruiz Apparatus, system and method for automated and adaptive digital image/video surveillance for events and configurations using a rich multimedia relational database
US20060152636A1 (en) * 2003-10-20 2006-07-13 Matsushita Electric Industrial Co Multimedia data recording apparatus, monitor system, and multimedia data recording method
US20090259615A1 (en) * 2005-07-08 2009-10-15 Brainlike Surveillance Research, Inc. Efficient Processing in an Auto-Adaptive Network

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2014133977A1 (fr) * 2013-03-01 2014-09-04 Robotex Inc. Système et procédé de liaison de données à faible latence
US10440086B2 (en) 2016-11-28 2019-10-08 Microsoft Technology Licensing, Llc Reading multiplexed device streams
EP3477594A1 (fr) * 2017-10-27 2019-05-01 Vestel Elektronik Sanayi ve Ticaret A.S. Détection du mouvement d'un objet mobile et transmission d'une image de l'objet mobile
EP3846461A1 (fr) * 2019-12-30 2021-07-07 Axis AB Déviation en temps réel dans un système de surveillance vidéo
US11232686B2 (en) 2019-12-30 2022-01-25 Axis Ab Real-time deviation in video monitoring

Also Published As

Publication number Publication date
WO2011133720A3 (fr) 2012-01-12

Similar Documents

Publication Publication Date Title
US10986338B2 (en) Thermal-image based video compression systems and methods
US9258564B2 (en) Visual search system architectures based on compressed or compact feature descriptors
GB2507395B (en) Video-based vehicle speed estimation from motion vectors in video streams
US8315481B2 (en) Image transmitting apparatus, image receiving apparatus, image transmitting and receiving system, recording medium recording image transmitting program, and recording medium recording image receiving program
US20140369417A1 (en) Systems and methods for video content analysis
KR100883632B1 (ko) 고해상도 카메라를 이용한 지능형 영상 감시 시스템 및 그 방법
EP1869641A2 (fr) Procedes et appareils d'analyse adaptative d'avant-plan/ arriere-plan
US8724912B2 (en) Method, apparatus, and program for compressing images, and method, apparatus, and program for decompressing images
US8737727B2 (en) Color similarity sorting for video forensics search
EP2717475B1 (fr) Procédé et appareil de compression de données de capteur généralisées
WO2011133720A2 (fr) Réseau de détection d'événement auto-adaptative : détails du codage et du décodage vidéo
US20230127009A1 (en) Joint objects image signal processing in temporal domain
US20140043491A1 (en) Methods and apparatuses for detection of anomalies using compressive measurements
US8587651B2 (en) Surveillance system for transcoding surveillance image files while retaining image acquisition time metadata and associated methods
EP2608151A2 (fr) Système pour communiquer des données de relation associées à des caractéristiques d'image
CN102726042B (zh) 视频处理系统和视频解码系统
US20120039395A1 (en) System and method for time series filtering and data reduction
US20230188679A1 (en) Apparatus and method for transmitting images and apparatus and method for receiving images
EP3629577B1 (fr) Procédé de transmission de données, caméra et dispositif électronique
US20070046781A1 (en) Systems and methods for processing digital video data
KR20060087732A (ko) 스마트 네트워크 카메라
CN109859200B (zh) 一种基于背景分析的低空慢速无人机快速检测方法
CN111859001B (zh) 图像相似度检测方法、装置、存储介质与电子设备
EP4195166A1 (fr) Appareil et procédé de transmission d'images et appareil et procédé de réception d'images
CN116939170B (zh) 一种视频监控方法、视频监控服务器及编码器设备

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 11772673

Country of ref document: EP

Kind code of ref document: A2

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 11772673

Country of ref document: EP

Kind code of ref document: A2