US20230077815A1 - System and method for enhanced video image recognition using motion sensors
- Publication number: US20230077815A1 (application US 18/056,402)
- Authority: US (United States)
- Legal status: Pending
Classifications
- H04N21/8547—Content authoring involving timestamps for synchronizing content
- G06F18/24133—Classification techniques based on distances to prototypes
- G06K9/6271
- G06V10/454—Integrating biologically inspired filters into a hierarchical structure, e.g. convolutional neural networks [CNN]
- G06V10/764—Image or video recognition or understanding using classification, e.g. of video objects
- G06V10/82—Image or video recognition or understanding using neural networks
- G06V20/41—Higher-level, semantic clustering, classification or understanding of video scenes, e.g. detection, labelling or Markovian modelling of sport events or news items
- G06V20/44—Event detection in video content
- G06V20/52—Surveillance or monitoring of activities, e.g. for recognising suspicious objects
- G11B27/031—Electronic editing of digitised analogue information signals, e.g. audio or video signals
- G11B27/28—Indexing; Addressing; Timing or synchronising by using information signals recorded by the same method as the main recording
- G11B27/3036—Time code signal
- H04N21/8456—Structuring of content by decomposing the content in the time domain, e.g. in time segments
Definitions
- the embodiments described in the disclosure relate to the field of image processing and, specifically, to systems and methods for video and image processing, image recognition, and video annotation using sensor measurements.
- time synchronization must be very accurate to provide the desired visual effect. For example, to slow down only the frames showing a skier's jump, time synchronization must be accurate to tenths of a second to create the appropriate visual effect.
- for example, to select a particular part of a frame for enhancement (e.g., of a basketball player performing a dunk), a camera frame must be well calibrated to real-world three-dimensional coordinates. While camera calibration is well known (e.g., Tsai, Roger Y. (1987) “A Versatile Camera Calibration Technique for High Accuracy 3D Machine Vision Metrology Using Off-the-Shelf TV Cameras and Lenses,” IEEE Journal of Robotics and Automation, Vol. RA-3, No. 4, August 1987, pp. 323-344), for mass-market adoption such procedures must be highly automated, possibly using image recognition of a sample target in the video frame.
- a separate requirement may be the graphical enhancement of the video by adding graphics to particular images in the frame, such as a person's face, etc.
- Image recognition has become a common part of video and image processing. It is used to recognize particular images, such as faces, cars, or animals, and to recognize and track particular objects or activities, such as an athlete jumping or moving.
- Embodiments of the disclosure overcome the aforementioned difficulties by combining sensor and video processing to provide multiple advantages.
- This sensor-camera time and position synchronization creates a virtuous cycle: simple feature recognition allows accurate 3D-to-2D sensor-camera mapping; that mapping automatically creates a large number of samples from which video recognition can learn more complicated motion via deep learning, neural networks, algorithmically, or any other method; and the trained recognition then allows reverse mapping of image-recognized motions into world 3D/time space, even for subjects, machines, or equipment that have no attached sensors.
- the disclosure describes a method for improving image recognition by using information from sensor data.
- the method may comprise receiving one or more sensor records, the sensor records representing timestamped sensor data collected by a sensor recording device; selecting an event based on the sensor records; identifying a time associated with the event; retrieving a plurality of timestamped video frames; synchronizing the sensor records and the video frames, wherein synchronizing the sensor records and the video frames comprises synchronizing the timestamped sensor data with individual frames of the timestamped video frames according to a common timeframe; and selecting a subset of video frames from the plurality of timestamped video frames based on the selected event.
- the disclosure describes a system for improving image recognition by using information from sensor data.
- the system comprises a sensor recording device configured to capture one or more sensor records, the sensor records representing timestamped sensor data collected by a sensor recording device and one or more cameras configured to record a plurality of timestamped video frames.
- the system further comprises an event processing system configured to receive one or more sensor records, the sensor records representing timestamped sensor data collected by a sensor recording device; select an event based on the sensor records; identify a time associated with the event; retrieve a plurality of timestamped video frames; synchronize the sensor records and the video frames, wherein synchronizing the sensor records and the video frames comprises synchronizing the timestamped sensor data with individual frames of the timestamped video frames according to a common timeframe; and select a subset of video frames from the plurality of timestamped video frames based on the selected event.
- FIG. 1 is a flow diagram illustrating a method for improving image recognition by using information from sensor data according to one embodiment of the disclosure.
- FIG. 2 is a flow diagram illustrating a method for automatically creating a large video training set for image recognition and deep learning based on sensor readings according to one embodiment of the disclosure.
- FIG. 3 is a flow diagram illustrating a method for camera calibration based on sensor readings according to one embodiment of the disclosure.
- FIG. 4 is a block diagram illustrating a video and sensor processing device according to one embodiment of the disclosure.
- FIG. 5 is a block diagram illustrating a system for enhanced video image recognition according to one embodiment of the disclosure.
- FIG. 6 is a block diagram illustrating a database system for enhanced video image recognition according to one embodiment of the disclosure.
- FIG. 7 is a block diagram illustrating a system for enhanced video image recognition according to one embodiment of the disclosure.
- a plurality of cameras may be utilized to capture timestamped video of events such as sporting events.
- the participants captured on video may be equipped with a sensor recording device designed to capture movement and other activity data.
- the systems and methods utilize the timestamps of the sensor record data and the video data to time synchronize the two streams of data.
- the systems and methods select a subset of the video frames for further processing.
- the systems and methods select this subset by identifying events of interest (e.g., spins, jumps, flips) within the sensor record data and calculating a period of video footage to analyze.
- the systems and methods then embed the sensor record data within the video footage to provide an enhanced video stream that overlays performance data on top of the segment of video footage.
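As an illustrative sketch only (not the technique of the incorporated patent), the overlay step above amounts to pairing each video frame with the sensor record nearest in time; the function name `overlay_annotations` and the record layout are assumptions for illustration:

```python
import bisect

def overlay_annotations(frame_times, sensor_records):
    """Pair each frame timestamp with the nearest-in-time sensor record.

    frame_times: sorted list of frame timestamps (seconds).
    sensor_records: sorted list of (timestamp, reading) tuples.
    Returns (frame_time, reading) pairs for overlay rendering.
    """
    times = [t for t, _ in sensor_records]
    pairs = []
    for ft in frame_times:
        i = bisect.bisect_left(times, ft)
        # candidates: the record at i and the one just before it
        best = min(
            (j for j in (i - 1, i) if 0 <= j < len(times)),
            key=lambda j: abs(times[j] - ft),
        )
        pairs.append((ft, sensor_records[best][1]))
    return pairs
```

Each pair can then be rendered as a data overlay on the corresponding frame.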
- Techniques for overlaying performance data are described in, for example, commonly owned U.S. Pat. No. 8,929,709 entitled “Automatic digital curation and tagging of action videos”, the entire disclosure of which is incorporated herein by reference.
- FIG. 1 is a flow diagram illustrating a method for improving image recognition by using information from sensor data according to one embodiment of the disclosure.
- synchronizing the time between video records and sensor records may comprise synchronizing the time between video records and sensor records using a standard time reference.
- a standard time reference may comprise a GPS time reference or a common network time reference.
- a device capturing video data may additionally tag one or more frames with a timestamp derived from a standard time reference.
- a device capturing sensor records may utilize the same standard time reference (e.g., GPS) and may associate a timestamp with each recorded event.
- the method 100 selects an event of interest based on sensor data.
- the method 100 may analyze a stream of sensor records to detect when the sensor records indicate an event has occurred.
- a stream of sensor record data may include a time-delimited stream of sensor readings corresponding to an activity (e.g., a race or other event).
- various parameters of the sensor records may indicate that one or more subsets of the sensor record data may be associated with an event.
- sensor records may store information regarding the acceleration, velocity, rotation, or other movement of a participant equipped with a performance recording device.
- sensor records may record aberrant readings when compared to the majority of the sensor records.
- for example, if a participant performs a spin, rotational sensor readings may be recorded that indicate as much.
- the method 100 may analyze the stream of sensor records to determine such anomalous readings and identify those readings as a potential event of interest.
- events may comprise various events such as jumps, flips, rotations, high speed movement, turns, or any other finite portion of a user's performance that may be of interest.
- the method 100 determines an event time in a sensor data time frame after identifying a potential event of interest. As discussed above, the method 100 may first identify a set of sensor records that correspond to a potential event. In step 106, the method then identifies a time (T_SNS) associated with the event. In one embodiment, T_SNS may comprise a point within a range of sensor readings. For example, sensor readings for a “spin” event may span multiple seconds. In one embodiment, the method 100 identifies the point T_SNS as a moment in time occurring between the start and end of the spin, such as the midpoint of the event. Thus, in step 106, the method 100 converts a stream of sensor records into a limited set of “markers” that indicate when an event has occurred, the markers each being associated with a timestamp recorded using a standard time reference (e.g., GPS).
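The marker-extraction step just described could be sketched as follows; the record layout `(timestamp, magnitude)` and the threshold-based anomaly test are illustrative assumptions, not the claimed method:

```python
def event_markers(records, threshold):
    """Convert a stream of (timestamp, magnitude) sensor records into event
    markers: the midpoint timestamp of each run of anomalous readings."""
    markers = []
    start = end = None
    for t, magnitude in records:
        if abs(magnitude) >= threshold:
            if start is None:
                start = t   # event begins
            end = t         # event continues
        elif start is not None:
            markers.append((start + end) / 2.0)  # T_SNS = midpoint of event
            start = end = None
    if start is not None:
        markers.append((start + end) / 2.0)      # event ran to end of stream
    return markers
```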
- in step 108, the method 100 transfers the time, or times, determined in step 106 into the video time frame.
- transferring the time or times determined in step 106 may comprise converting the time, T_SNS, to a timestamp (T_V) associated with video data.
- determining a timestamp T_V may comprise applying a synchronization function to the timestamp T_SNS to obtain T_V.
- the method 100 may apply a linear transformation to the camera and/or sensor time if the video record frames and corresponding sensor times are known. In alternative embodiments, the method 100 may utilize this transformation as a synchronization function and may apply the synchronization function to all video record data.
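A minimal sketch of such a linear synchronization function, fit by ordinary least squares from known (T_SNS, T_V) pairs; the function name and data layout are assumptions for illustration:

```python
def fit_sync_function(pairs):
    """Least-squares fit of T_V = a * T_SNS + b from known (T_SNS, T_V) pairs.

    Returns a callable mapping sensor time to video time.
    """
    n = len(pairs)
    sx = sum(s for s, _ in pairs)
    sy = sum(v for _, v in pairs)
    sxx = sum(s * s for s, _ in pairs)
    sxy = sum(s * v for s, v in pairs)
    a = (n * sxy - sx * sy) / (n * sxx - sx * sx)  # slope (clock drift)
    b = (sy - a * sx) / n                          # offset between clocks
    return lambda t_sns: a * t_sns + b
```

Once fit, the same function can be applied to all sensor timestamps in the stream.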
- the method 100 selects video segments that correspond to the selected time segment obtained in step 106 .
- the method 100 may utilize the timestamp T_V in order to obtain a time period to extract video segments from a video database.
- the method 100 may select a period T and may select video segments from a video database that were recorded between T_V - T and T_V + T.
- a period T may be predetermined (e.g., a standard 1 or 2 second period may be used).
- the method 100 may determine a period T based on the sensor record data or video record data.
- the method 100 may determine a period T based on sensor record data by determining a time period in which sensor record data is anomalous.
- for example, if the anomalous sensor readings span three seconds, the method 100 may set T as 3 seconds based on the sensor data.
- the method 100 may set T based on analyzing the changes in frames of video record data. For example, starting at T_V, the method 100 may analyze the preceding and subsequent frames and calculate the similarity of those frames to the frame at T_V.
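The similarity-based segment selection could be sketched as below, assuming a caller-supplied similarity function scoring frame pairs in [0, 1]; all names are illustrative:

```python
def select_segment(frames, t_v, similarity, min_similarity):
    """Select a contiguous run of frames around the frame nearest T_V,
    expanding while neighbouring frames remain similar to the centre frame.

    frames: list of (timestamp, frame), sorted by timestamp.
    similarity: callable scoring two frames in [0, 1].
    """
    center = min(range(len(frames)), key=lambda i: abs(frames[i][0] - t_v))
    lo = hi = center
    # expand backwards while frames stay similar to the centre frame
    while lo > 0 and similarity(frames[lo - 1][1], frames[center][1]) >= min_similarity:
        lo -= 1
    # expand forwards likewise
    while hi < len(frames) - 1 and similarity(frames[hi + 1][1], frames[center][1]) >= min_similarity:
        hi += 1
    return frames[lo:hi + 1]
```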
- selecting a video segment may additionally comprise selecting only a portion of each of the video frames present within the video segment.
- the portion of the video frames may be selected based on detecting movement of a participant in the video.
- the method 100 may analyze the video segment to identify those pixels which change between frames. The method 100 may then identify a bounding box that captures all changing pixels and select only pixels within the bounding box for each frame.
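The bounding-box computation over changing pixels might look like the following sketch, assuming frames are 2D grids of intensity values (the function name and tolerance parameter are illustrative):

```python
def changing_pixel_bbox(frames, tolerance=0):
    """Bounding box (row_min, col_min, row_max, col_max) covering every pixel
    that changes between consecutive frames.  frames: list of 2D grids."""
    changed = [
        (r, c)
        for a, b in zip(frames, frames[1:])    # consecutive frame pairs
        for r in range(len(a))
        for c in range(len(a[0]))
        if abs(a[r][c] - b[r][c]) > tolerance  # pixel changed between frames
    ]
    if not changed:
        return None  # static segment: nothing moved
    rows = [r for r, _ in changed]
    cols = [c for _, c in changed]
    return (min(rows), min(cols), max(rows), max(cols))
```

Each frame can then be cropped to the returned box before recognition.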
- in step 112, the method 100 performs image recognition on the selected frames or sub-frames.
- image recognition can be done by a multitude of methods known in the art.
- in step 114, using the result of the image recognition obtained in step 112, the method 100 improves time synchronization between sensors and video up to a one-frame time resolution. As an example, this could be done by detecting the jump start or the first landing video frame. This step may not be required if an initial time synchronization is accurate to better than one video frame time interval.
- the method 100 may be performed for multiple events occurring within a stream of sensor record data.
- the method 100 may be executed for each event detected during an athlete's performance (e.g., a downhill skiing race).
- FIG. 2 is a flow diagram illustrating a method for automatically creating a large video training set for image recognition and deep learning based on sensor readings according to one embodiment of the disclosure.
- the method 200 identifies an event of interest using sensor data.
- the method 200 may analyze a stream of sensor record data (e.g., data associated with a race or event) and identify one or more events, such as spins, flips, rotations, etc., based on anomalies detected within the sensor record data.
- the method 200 may not receive sensor record data directly from a sensor recording device when identifying an event of interest using sensor record data.
- the method 200 may have previously received sensor record data from, for example, databases.
- the method 200 may use this data to train a machine learning algorithm or predictive model to automatically detect participants in a video.
- the method 200 may provide, as inputs to a machine learning algorithm or predictive model, sensor record data and video frames previously synchronized with the sensor record data.
- the machine learning algorithm or predictive model may then use these inputs as a training set to automatically classify video frames based on the sensor record data.
- the changes in pixels of video frames may be classified by the machine learning algorithm or predictive model using the sensor record data to predict future events without the need for explicit sensor record data.
- the method 200 may generate a machine learning algorithm or predictive model that receives video frames and can predict the type of event occurring within the video frames.
- the machine learning algorithm or predictive model may also be trained to predict various performance data associated with the video feeds such as velocity, acceleration, event type, etc.
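Under the assumption that event markers and a synchronization function are already available from the sensor processing described above, assembling such a labeled training set could be sketched as follows (all names are illustrative):

```python
def build_training_set(events, frames, sync, half_window):
    """Label video frames with sensor-derived event types.

    events: list of (label, t_sns) markers from sensor processing.
    frames: list of (t_v, frame) timestamped video frames.
    sync:   synchronization function mapping T_SNS to T_V.
    half_window: half-width of the time window around each event.
    """
    samples = []
    for label, t_sns in events:
        t_v = sync(t_sns)  # transfer event time into the video time frame
        for t, frame in frames:
            if abs(t - t_v) <= half_window:
                samples.append((frame, label))  # (input, target) pair
    return samples
```

The resulting (frame, label) pairs can feed any supervised learning algorithm without manual annotation.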
- the method 200 selects all videos that contain an event of interest from a performance-tagged video database.
- the videos have metadata that describe such events in the video frames, or the events are retrieved from a related sensor database that enables their identification.
- the method 200 identifies an event in each video. This can be done by time synchronization between sensors and video, or directly from video tags or metadata if present. In one embodiment, tags may be generated during event sensor processing and event identification.
- the method 200 selects only frames where the selected feature or event is present. This could be done by using time synchronization or directly from tags or metadata associated with the video. In one embodiment, the method 200 may synchronize the operational cameras with sensor record data as described in connection with FIG. 1 and, specifically, as described in connection with step 108.
- in steps 210 and 212, if the camera field of view is calibrated, an area or sub-region of each frame is selected to simplify image recognition.
- in step 214, the method 200 provides the selected video frames that contain the selected feature of interest to an image recognition learning algorithm.
- These video clips taken by different cameras during multiple events from different angles represent a sample environment for algorithm training and verification.
- metadata such as camera parameters, distance, focus, and viewing angle can be provided to the learning algorithm as well. This data can be derived from the sensory information about the camera and events.
- in step 216, the method 200 trains video recognition algorithms using the entirety of the selected video frames and their tags and metadata.
- the method 200 may further be configured to automatically recalibrate the operational cameras based on the results of the method 200 .
- the method 200 may be configured to utilize a machine learning algorithm or predictive model to classify unknown parameters of the operational cameras (e.g., angle, focus, etc.).
- the method 200 may utilize sensor record data to compute a three-dimensional position of the user recording the sensor record data and may generate updated focus parameters to automatically recalibrate the operational cameras.
- the method 200 may provide updated parameters for operational cameras to a camera operator for manual calibration.
- the method 200 may further be configured to calculate a three-dimensional position of the sensor recording device for each frame in the subset of video frames. After calculating these positions, the method 200 may determine a set of image areas for each frame in the subset of video frames, the image areas framing a participant equipped with the sensor recording device. Finally, the method 200 may digitally zoom each frame in the subset of video frames based on the set of image areas.
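A minimal sketch of the digital-zoom step, assuming the participant's projected position is already known in pixel coordinates (the function name and grid representation are assumptions); clamping keeps the crop inside the frame:

```python
def digital_zoom(frame, center, size):
    """Crop a size x size window around a projected participant position,
    clamped to the frame boundary.

    frame: 2D grid of pixel values; center: (row, col) of the participant.
    """
    rows, cols = len(frame), len(frame[0])
    half = size // 2
    # clamp the window origin so the crop never leaves the frame
    r0 = max(0, min(center[0] - half, rows - size))
    c0 = max(0, min(center[1] - half, cols - size))
    return [row[c0:c0 + size] for row in frame[r0:r0 + size]]
```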
- the method 200 may be performed for multiple events occurring within a stream of sensor record data.
- the method 200 may be executed for each event detected during an athlete's performance (e.g., a downhill skiing race).
- FIG. 3 is a flow diagram illustrating a method for camera calibration based on sensor readings according to one embodiment of the disclosure.
- the method 300 receives video data with associated metadata.
- video data may comprise video data captured during an event, such as a sporting event.
- in some embodiments, video data may be recorded live, while in other embodiments the method 300 may receive stored video from a user.
- the method 300 may be implemented by a server-based media platform, wherein users upload video data to the server for sharing among other users.
- metadata associated with the video data may include information related to the video or the participants in the video.
- metadata may include the geographic location of the video, the date and/or time the video was taken, and, as discussed further herein, a user identifier associated with the video.
- a user identifier may comprise a numeric identifier, a username, an e-mail address, or any other data that uniquely identifies a user.
- processing the video metadata may comprise extracting the video metadata from a video file or associated database.
- the method 300 may receive a single, flat file containing video data and metadata. The method 300 may then split the single file into video data and associated metadata.
- the method 300 may reformat the received video metadata into a format useable for later processing.
- video metadata may comprise binary data that the method 300 may convert into a structured format such as JSON or XML.
- selecting performance data may comprise selecting performance data based on a user identifier (ID) present within the video metadata, or selecting data from a performance database that has the same time and location tag or metadata as the video performance database.
- the method 300 may isolate performance data upon determining that a user ID is present within the video metadata.
- the method 300 may perform steps 306 and 308 to limit the amount of performance data processed in later steps based on the presence of a user identifier.
- in step 310, the method 300 time-synchronizes video frames and performance data, unless both databases are already synchronized to a common time frame.
- the method 300 determines the actual pixels in each frame where a particular event or feature(s) is present.
- a feature may comprise a particular user, user equipment, or the actual sensor that provides sensory data for this event.
- for example, the feature may comprise the pixels that correspond to a surfer's location or to the tip of the surfboard where the sensor is located.
- in one embodiment, this pixel identification is done manually; in another embodiment, it is done via image recognition; or it may be done semi-automatically, by providing one or more pixels in the first frame and then using image recognition in each following video frame.
- the method 300 calibrates a camera field of view by using pairs of real world sensor coordinates and video frame pixel locations for the same event since both time frames were previously synchronized by the method 300 .
- the actual calibration can be done by any of the multiple methods that are well known to the practitioners of the art, such as those described in Tsai, Roger Y. (1987) “A Versatile Camera Calibration Technique for High Accuracy 3D Machine Vision Metrology Using Off-the-Shelf TV Cameras and Lenses,” IEEE Journal of Robotics and Automation, Vol. RA-3, No. 4, August 1987, pp. 323-344.
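As a simplified illustration only (a least-squares fit of an affine camera from paired world points and pixel locations, not Tsai's full technique, which also models lens distortion), the calibration step might be sketched as:

```python
def fit_affine_camera(world_points, pixels):
    """Least-squares fit of an affine camera: u = p_u . [X, Y, Z, 1] and
    v = p_v . [X, Y, Z, 1], from paired 3D points and 2D pixel locations.
    Requires at least 4 points in general position."""

    def solve4(m, b):
        # Gauss-Jordan elimination with partial pivoting for a 4x4 system.
        a = [row[:] + [rhs] for row, rhs in zip(m, b)]
        for i in range(4):
            p = max(range(i, 4), key=lambda r: abs(a[r][i]))
            a[i], a[p] = a[p], a[i]
            for r in range(4):
                if r != i:
                    f = a[r][i] / a[i][i]
                    a[r] = [x - f * y for x, y in zip(a[r], a[i])]
        return [a[i][4] / a[i][i] for i in range(4)]

    rows = [[x, y, z, 1.0] for x, y, z in world_points]
    # normal equations: (A^T A) p = A^T b, solved once per pixel coordinate
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(4)] for i in range(4)]
    p_u = solve4(ata, [sum(r[i] * u for r, (u, _) in zip(rows, pixels)) for i in range(4)])
    p_v = solve4(ata, [sum(r[i] * v for r, (_, v) in zip(rows, pixels)) for i in range(4)])
    return p_u, p_v
```

The fitted rows p_u and p_v map world coordinates to pixels, and their least-squares inverse supports the reverse mapping described above.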
- a camera field of view may be accurately calibrated and may be used to provide an accurate mapping between real world coordinates and pixels for better image recognition, view selection, selected digital zoom, or use of multiple cameras for virtual reality or 3D synthetic views.
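Once calibrated, the mapping from real-world coordinates to pixels can be expressed with a pinhole camera model. The sketch below is a deliberately simplified version (camera held level, rotated only about the vertical axis); the function and parameter names are illustrative assumptions, and a full calibration such as Tsai's would recover a complete rotation matrix and lens distortion terms:

```python
import math

def project(point_world, cam_pos, yaw, focal_px, cx, cy):
    """Project a 3D world point into pixel coordinates using a simplified
    pinhole model. The level-camera assumption is illustrative only.

    point_world -- (x, y, z) world coordinates
    cam_pos     -- (x, y, z) camera position in the same frame
    yaw         -- camera heading about the vertical axis, radians
    focal_px    -- focal length expressed in pixels
    cx, cy      -- principal point (image centre) in pixels
    """
    # Translate into the camera frame.
    dx = point_world[0] - cam_pos[0]
    dy = point_world[1] - cam_pos[1]
    dz = point_world[2] - cam_pos[2]
    # Rotate about the vertical axis by -yaw so the camera looks along +x.
    c, s = math.cos(-yaw), math.sin(-yaw)
    xc = c * dx - s * dy   # depth along the optical axis
    yc = s * dx + c * dy   # horizontal offset
    zc = dz                # vertical offset
    if xc <= 0:
        return None        # point is behind the camera
    u = cx - focal_px * yc / xc
    v = cy - focal_px * zc / xc
    return (u, v)
```

Calibration is then the inverse problem: given several (world point, pixel) pairs for the same synchronized events, solve for `cam_pos`, `yaw`, and `focal_px` so that projected and observed pixels agree.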
- FIG. 4 is a block diagram illustrating a video and sensor processing device according to one embodiment of the disclosure.
- the device 400 includes a CPU 402 , memory 404 , non-volatile storage 406 , accelerometer 408 , GPS receiver 410 , sensors 412 , camera 414 , microphone 416 , cellular transceiver 418 , Bluetooth transceiver 422 , and wireless transceiver 420 .
- the device 400 may comprise a computing device designed to be worn, or otherwise carried, by a user.
- the device 400 includes an accelerometer 408 and GPS receiver 410 which monitor the device 400 to identify its position (via GPS receiver 410 ) and its acceleration (via accelerometer 408 ).
- the device 400 includes one or more sensors 412 that may record additional data regarding the activity of the device 400 .
- sensors 412 may include speedometers, tachometers, pedometers, biometric sensors, or other sensor reading devices.
- accelerometer 408 , GPS receiver 410 , and sensors 412 may alternatively each include multiple components providing similar functionality.
- Accelerometer 408 , GPS receiver 410 , and sensors 412 generate data, as described in more detail herein, and transmit the data to other components via CPU 402 .
- accelerometer 408 , GPS receiver 410 , and sensors 412 may transmit data to memory 404 for short-term storage.
- memory 404 may comprise a random access memory device or similar volatile storage device.
- accelerometer 408 , GPS receiver 410 , and sensors 412 may transmit data directly to non-volatile storage 406 .
- CPU 402 may access the data (e.g., location and/or sensor data) from memory 404 .
- non-volatile storage 406 may comprise a solid-state storage device (e.g., a “flash” storage device) or a traditional storage device (e.g., a hard disk).
- GPS receiver 410 may transmit location data (e.g., latitude, longitude, etc.) to CPU 402 , memory 404 , or non-volatile storage 406 in similar manners.
- CPU 402 may comprise a field programmable gate array or customized application-specific integrated circuit.
- Device 400 additionally includes camera 414 and microphone 416 .
- Camera 414 and microphone 416 may be capable of recording audio and video signals and transmitting these signals to CPU 402 for long-term storage in non-volatile storage 406 or short-term storage in memory 404 .
- the device 400 includes multiple network interfaces including cellular transceiver 418 , wireless transceiver 420 , and Bluetooth transceiver 422 .
- Cellular transceiver 418 enables the device 400 to transmit performance or audio/video data, processed by CPU 402 , to a server via a mobile or radio network. Additionally, CPU 402 may determine the format and contents of data transferred using cellular transceiver 418 , wireless transceiver 420 , and Bluetooth transceiver 422 based upon detected network conditions.
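The patent only states that CPU 402 selects the format and contents of transmitted data based on detected network conditions; one plausible sketch of that selection logic follows. The thresholds, transport labels, and payload names are all illustrative assumptions:

```python
def choose_transport(bandwidth_kbps, bluetooth_peer_available):
    """Pick a transceiver and payload format from detected conditions.

    bandwidth_kbps          -- measured uplink bandwidth (hypothetical input)
    bluetooth_peer_available -- whether a short-range peer is in reach
    """
    if bluetooth_peer_available:
        return ("bluetooth", "sensor_summary")      # short-range, small payload
    if bandwidth_kbps >= 5000:
        return ("wireless", "full_video_with_overlay")  # plenty of headroom
    if bandwidth_kbps >= 500:
        return ("cellular", "compressed_video")
    return ("cellular", "sensor_records_only")      # degrade to telemetry only
```

The design choice here is graceful degradation: when bandwidth drops, the device falls back from full video to compressed video and finally to the compact sensor records alone.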
- FIG. 5 is a block diagram illustrating a system for enhanced video image recognition according to one embodiment of the disclosure.
- sensors 502 may comprise a variety of sensors used to record the movement of a device (and user of the device) during a finite period of time.
- sensors 502 may comprise gyroscopes, accelerometers, pedometers, speedometers, tachometers, and any other sensor-based device capable of recording data relating to the movement of a device or user of the device.
- sensors 502 may additionally include biometric sensors.
- Audio/visual capture devices 504 may include one or more video cameras, still cameras, microphones, three-dimensional cameras, or any other devices capable of recording multimedia data.
- sensors 502 may comprise a distributed network of sensors installed in multiple performance recording devices.
- processing system 506 may receive data from multiple performance recording devices operated by multiple users. Each of these performance recording devices may include one or more sensors as described above.
- audio/visual capture devices 504 may comprise multiple audio/visual capture devices, each recording data and transmitting that data to processing system 506 .
- audio/visual capture devices 504 may include personal recording devices as well as fixed recording devices.
- system 500 may be a locally-installed system.
- system 500 may be installed at a known location of an event for processing data specific to that event and location.
- system 500 may comprise a globally-available system wherein devices providing video and/or performance data may be located throughout the world.
- processing system 506 may comprise a single server-based device or multiple server-based devices (co-located or distributed) processing data simultaneously.
- the system 500 includes a processing system 506 .
- processing system 506 may comprise a device, or multiple devices, receiving sensor data and audio/video data from sensors 502 and audio/video capture devices 504 .
- processing system 506 is capable of processing the received data and storing the received data in performance database 508 or video database 510 .
- Embodiments of the structure of performance database 508 and video database 510 are described more fully with respect to FIG. 6 , the description of which is incorporated herein in its entirety.
- processing system 506 may further be configured to embed performance data within video data and transmit the combined data to an output device (e.g., a display device, network connection, or other communications channel).
- FIG. 6 is a block diagram illustrating a database system for enhanced video image recognition according to one embodiment of the disclosure.
- database system 600 includes a video database 602 and a performance database 618 .
- Each database 602 , 618 may contain indexes 616 , 630 , respectively.
- indexes 616 , 630 may comprise various indexes used for the retrieval of information from databases 602 , 618 .
- indexes 616 , 630 may comprise tree-based indexes (e.g., B+ trees), bitmap indexes, dense indexes, reverse indexes, or sparse indexes, as applicable.
- Video database 602 may comprise various data structures or fields for storing information related to captured video. As described herein, video data may be captured by one or more cameras associated with a user or with an event. In one embodiment, video capture devices may transmit data to database system 600 for processing and storage.
- Video database 602 includes a GPS data storage component 604 .
- video database 602 may store the GPS coordinates associated with a camera transmitting video data to database system 600 .
- GPS data may include the latitude, longitude, and altitude of the camera supplying data to video database 602 .
- GPS data may be constant over time. In other embodiments, GPS data may comprise a time sequence of GPS coordinates if the camera is mobile.
- Video database 602 additionally includes direction storage component 606 .
- direction storage component 606 may store information regarding the direction a camera is positioned during the capture of video data.
- direction data may comprise a three dimensional representation of the angle in which the camera is positioned.
- direction information may be constant.
- direction information may comprise a time sequence of x, y, and z coordinates if the camera is mobile.
- Video database 602 additionally includes focus storage component 608 .
- focus storage component 608 stores information regarding the focal length of the camera transmitting video data to database system 600 .
- Video database 602 additionally includes user storage component 610 .
- user storage component 610 may store user information relating to the user capturing the video transmitted to database system 600 .
- video may be captured by devices owned and operated by users (e.g., portable video cameras, cellphones, etc.). Each of these devices may be associated with a user (e.g., via an application requiring a login, via a MAC address, etc.).
- video database 602 may not record user information if the camera is not associated with a specific user (e.g., if the camera is operated by an organization). Alternatively, the video database 602 may record the organization as the user within user storage component 610 .
- Video database 602 additionally includes video file storage component 612 .
- video file storage 612 may comprise a storage device for storing raw video data, such as a filesystem.
- video file storage component 612 may comprise a special purpose database for storing video data.
- video file storage component 612 may comprise a remote “cloud”-based storage device.
- Video database 602 additionally includes tag storage component 614 .
- tag storage component 614 may store additional annotations regarding video data transmitted to video database 602 .
- video data may be captured by users and transmitted to database system 600 .
- the user may add additional tags or annotations to the video data via an application (e.g., a mobile application). For example, a user may add tags describing the actions in the video, the scene of the video, or any other information deemed relevant by the user.
- Performance database 618 may comprise various data structures or fields for storing information related to performance data captured by performance recording devices. As described herein, performance data may be captured by one or more performance recording devices associated with a user. In one embodiment, performance recording devices may transmit data to database system 600 for processing and storage. Alternatively, performance database 618 may be stored locally within the performance recording device.
- Performance database 618 includes a user storage component 620 .
- user storage component 620 stores user information associated with the owner or operator of a performance recording device transmitting sensor record data to performance database 618 .
- a user may be equipped with a performance recording device that has been setup for use by that specific user.
- the performance recording device may be associated with an identifier uniquely identifying the user.
- the performance recording device may additionally provide the user identifier which database system 600 may store in performance database 618 via user storage component 620 .
- Performance database 618 additionally includes a bounding box storage component 622 .
- a performance recording device may supply bounding box information to database system 600 for storage in bounding box storage component 622 .
- a bounding box may comprise an estimated rectangular area surrounding the performance recording device and/or user.
- a bounding box may comprise a fixed rectangular area; alternatively, the bounding box information may be updated as the performance recording device moves.
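An updated bounding box of the kind described above can be computed as the axis-aligned rectangle around the device's recent positions, plus a margin. The function name and the padding default are illustrative assumptions:

```python
def bounding_box(track, pad_m=2.0):
    """Axis-aligned rectangle around recent planar positions, padded.

    track -- list of (x, y) coordinates in metres for the device/user
    pad_m -- illustrative margin around the track, in metres
    """
    xs = [p[0] for p in track]
    ys = [p[1] for p in track]
    # Returned as (min_x, min_y, max_x, max_y) with the margin applied.
    return (min(xs) - pad_m, min(ys) - pad_m,
            max(xs) + pad_m, max(ys) + pad_m)
```

Recomputing this over a sliding window of positions keeps the stored bounding box current as the performance recording device moves.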
- Performance database 618 additionally includes a GPS data storage component 624 .
- GPS data storage component 624 stores information regarding the location of the performance recording device while recording movements of the device.
- GPS data may comprise the latitude, longitude, and altitude of the performance recording device.
- GPS data may comprise a time sequence of GPS coordinates.
- Performance database 618 additionally includes a sensor data storage component 626 .
- sensor data storage component 626 stores sensor data received from sensors within a performance recording device.
- sensors may comprise gyroscopes, accelerometers, speedometers, pedometers, or other sensor recordings devices.
- sensor data storage component 626 may store sensor data as a time-series of sensor readings.
- Performance database 618 additionally includes an event data storage component 628 .
- event data storage component 628 stores information regarding events detected using the aforementioned information. Techniques for detecting events are discussed more fully with respect to FIGS. 1 through 3 and the disclosure of those Figures is incorporated herein in its entirety.
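The two database layouts described above (components 604 - 614 and 620 - 628 ) can be sketched as record types. All field names and types below are illustrative assumptions inferred from the component descriptions, not the patent's own schema:

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

@dataclass
class VideoRecord:
    """One row of the video database (components 604-614)."""
    gps: List[Tuple[float, float, float]]        # (lat, lon, alt) time sequence
    direction: List[Tuple[float, float, float]]  # camera pointing, per sample
    focal_length_mm: float                       # focus storage component 608
    user_id: Optional[str]                       # None for org-operated cameras
    video_path: str                              # raw footage in file storage 612
    tags: List[str] = field(default_factory=list)  # user annotations (614)

@dataclass
class PerformanceRecord:
    """One row of the performance database (components 620-628)."""
    user_id: str                                     # user storage component 620
    bounding_box: Tuple[float, float, float, float]  # component 622
    gps: List[Tuple[float, float, float]]            # component 624
    sensor_series: List[Tuple[float, dict]]          # (timestamp, readings), 626
    events: List[dict] = field(default_factory=list)  # detected events, 628
```

A time-series field such as `sensor_series` keeps readings ordered by timestamp, which is what the synchronization and event-detection steps elsewhere in the disclosure rely on.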
- FIG. 7 is a block diagram illustrating a system for enhanced video image recognition according to one embodiment of the disclosure.
- two performers 702 a , 702 b are moving along trajectories 704 a , 704 b , respectively.
- performers 702 a , 702 b may comprise athletes, such as skiers, and the performance may comprise a sporting event such as a downhill or freestyle race.
- trajectories 704 a , 704 b may comprise the path of the performers 702 a , 702 b determined based on sensor data recorded by devices (not illustrated) present on the performers 702 a , 702 b .
- devices recording sensor record data may comprise a device such as that illustrated in FIG. 4 , the description of which is incorporated herein in its entirety.
- two cameras 706 a , 706 b may be installed to record performers 702 a , 702 b .
- cameras 706 a , 706 b may be pre-installed at an event location, such as the location of a sporting event.
- a course designed for competitive skiing may have cameras 706 a , 706 b installed to record and/or broadcast skiing events taking place at the location.
- Each camera 706 a , 706 b has an associated field of view 708 a , 708 b .
- the field of view 708 a , 708 b of cameras 706 a , 706 b may comprise the surface area, in three dimensions, that cameras 706 a , 706 b capture at any given moment.
- cameras 706 a , 706 b may be fixedly mounted and thus field of view 708 a , 708 b may be constant, that is, may continuously record a fixed portion of a location.
- cameras 706 a , 706 b may be movable and thus field of view 708 a , 708 b may move in accordance with the movement of cameras 706 a , 706 b.
- cameras 706 a , 706 b may be communicatively coupled to processing device 710 .
- cameras 706 a , 706 b may transmit video data to processing device 710 for storage and processing, as discussed in more detail with respect to FIGS. 1 through 3 .
- each performer 702 a , 702 b may be equipped with a performance recording device and may transmit sensor record data to processing device 710 .
- sensor record data may be transmitted to processing device 710 using a cellular connection.
- sensor record data may first be transmitted to a server device (not illustrated) for processing prior to transmittal to processing device 710 .
- sensor record data may be stored locally by the device and transferred to processing device 710 at a later time and date.
- the trajectory 704 a of performer 702 a illustrates the scenario wherein the performer 702 a is performing an event (e.g., a high speed event, jump, spin, etc.) while the performer 702 a is in the field of view 708 a of camera 706 a .
- both performers 702 a , 702 b may be performing events while not in the field of view 708 a , 708 b of cameras 706 a , 706 b .
- processing device 710 may be configured to detect an event performed by performer 702 a using video data from camera 706 a and sensor data transmitted by performer 702 a .
- processing device 710 may receive sensor data from performer 702 a and be configured to identify camera 706 a as the device providing corresponding video footage for events identified by performer 702 a .
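Identifying which camera can provide corresponding footage reduces to testing whether the performer's sensor-derived position falls inside a camera's field of view. The sketch below uses planar geometry and a range cutoff as simplifying assumptions; the real fields of view 708 a , 708 b are three-dimensional:

```python
import math

def in_field_of_view(cam_pos, cam_heading_deg, fov_deg, target_pos, max_range_m):
    """Return True if a performer's planar position lies inside a camera's
    horizontal field of view. All parameter names are illustrative.

    cam_pos         -- (x, y) camera position
    cam_heading_deg -- direction the camera points, degrees
    fov_deg         -- total horizontal field-of-view angle, degrees
    target_pos      -- (x, y) performer position from sensor data
    max_range_m     -- hypothetical usable camera range, metres
    """
    dx = target_pos[0] - cam_pos[0]
    dy = target_pos[1] - cam_pos[1]
    dist = math.hypot(dx, dy)
    if dist == 0 or dist > max_range_m:
        return False
    bearing = math.degrees(math.atan2(dy, dx))
    # Smallest signed angle between the camera heading and the target bearing.
    off = (bearing - cam_heading_deg + 180) % 360 - 180
    return abs(off) <= fov_deg / 2
```

Running this test against each installed camera, for the timestamps surrounding a detected event, selects the camera (here, 706 a ) whose footage should contain the performer.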
- the selection of cameras 706 a , 706 b is described more fully with respect to FIG. 2 .
- terms, such as “a,” “an,” or “the,” again, may be understood to convey a singular usage or to convey a plural usage, depending at least in part upon context.
- the term “based on” may be understood as not necessarily intended to convey an exclusive set of factors and may, instead, allow for existence of additional factors not necessarily expressly described, again, depending at least in part on context.
- These computer program instructions can be provided to a processor of: a general purpose computer to alter its function to a special purpose; a special purpose computer; ASIC; or other programmable digital data processing apparatus, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, implement the functions/acts specified in the block diagrams or operational block or blocks, thereby transforming their functionality in accordance with embodiments herein.
- a computer readable medium stores computer data, which data can include computer program code (or computer-executable instructions) that is executable by a computer, in machine readable form.
- a computer readable medium may comprise computer readable storage media, for tangible or fixed storage of data, or communication media for transient interpretation of code-containing signals.
- Computer readable storage media refers to physical or tangible storage (as opposed to signals) and includes without limitation volatile and non-volatile, removable and non-removable media implemented in any method or technology for the tangible storage of information such as computer-readable instructions, data structures, program modules or other data.
- Computer readable storage media includes, but is not limited to, RAM, ROM, EPROM, EEPROM, flash memory or other solid state memory technology, CD-ROM, DVD, or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other physical or material medium which can be used to tangibly store the desired information or data or instructions and which can be accessed by a computer or processor.
- server should be understood to refer to a service point which provides processing, database, and communication facilities.
- server can refer to a single, physical processor with associated communications and data storage and database facilities, or it can refer to a networked or clustered complex of processors and associated network and storage devices, as well as operating software and one or more database systems and application software that support the services provided by the server.
- Servers may vary widely in configuration or capabilities, but generally a server may include one or more central processing units and memory.
- a server may also include one or more mass storage devices, one or more power supplies, one or more wired or wireless network interfaces, one or more input/output interfaces, or one or more operating systems, such as Windows Server, Mac OS X, Unix, Linux, FreeBSD, or the like.
- a “network” should be understood to refer to a network that may couple devices so that communications may be exchanged, such as between a server and a client device or other types of devices, including between wireless devices coupled via a wireless network, for example.
- a network may also include mass storage, such as network attached storage (NAS), a storage area network (SAN), or other forms of computer or machine readable media, for example.
- a network may include the Internet, one or more local area networks (LANs), one or more wide area networks (WANs), wire-line type connections, wireless type connections, cellular or any combination thereof.
- sub-networks which may employ differing architectures or may be compliant or compatible with differing protocols, may interoperate within a larger network.
- Various types of devices may, for example, be made available to provide an interoperable capability for differing architectures or protocols.
- a router may provide a link between otherwise separate and independent LANs.
- a communication link or channel may include, for example, analog telephone lines, such as a twisted wire pair, a coaxial cable, full or fractional digital lines including T1, T2, T3, or T4 type lines, Integrated Services Digital Networks (ISDNs), Digital Subscriber Lines (DSLs), wireless links including satellite links, or other communication links or channels, such as may be known to those skilled in the art.
- a computing device or other related electronic devices may be remotely coupled to a network, such as via a wired or wireless line or link, for example.
- a “wireless network” should be understood to couple client devices with a network.
- a wireless network may employ stand-alone ad-hoc networks, mesh networks, Wireless LAN (WLAN) networks, cellular networks, or the like.
- a wireless network may further include a system of terminals, gateways, routers, or the like coupled by wireless radio links, or the like, which may move freely, randomly or organize themselves arbitrarily, such that network topology may change, at times even rapidly.
- a wireless network may further employ a plurality of network access technologies, including Wi-Fi, Long Term Evolution (LTE), WLAN, Wireless Router (WR) mesh, or 2nd, 3rd, or 4th generation (2G, 3G, or 4G) cellular technology, or the like.
- Network access technologies may enable wide area coverage for devices, such as client devices with varying degrees of mobility, for example.
- a network may enable RF or wireless type communication via one or more network access technologies, such as Global System for Mobile communication (GSM), Universal Mobile Telecommunications System (UMTS), General Packet Radio Services (GPRS), Enhanced Data GSM Environment (EDGE), 3GPP Long Term Evolution (LTE), LTE Advanced, Wideband Code Division Multiple Access (WCDMA), Bluetooth, 802.11b/g/n, or the like.
- a wireless network may include virtually any type of wireless communication mechanism by which signals may be communicated between devices, such as a client device or a computing device, between or within a network, or the like.
- a computing device may be capable of sending or receiving signals, such as via a wired or wireless network, or may be capable of processing or storing signals, such as in memory as physical memory states, and may, therefore, operate as a server.
- devices capable of operating as a server may include, as examples, dedicated rack-mounted servers, desktop computers, laptop computers, set top boxes, integrated devices combining various features, such as two or more features of the foregoing devices, or the like.
- a module is a software, hardware, or firmware (or combinations thereof) system, process or functionality, or component thereof, that performs or facilitates the processes, features, and/or functions described herein (with or without human interaction or augmentation).
- a module can include sub-modules.
- Software components of a module may be stored on a computer readable medium for execution by a processor. Modules may be integral to one or more servers, or be loaded and executed by one or more servers. One or more modules may be grouped into an engine or an application.
- the term “user,” “subscriber,” “consumer,” or “customer” should be understood to refer to a user of an application or applications as described herein and/or a consumer of data supplied by a data provider.
- the term “user” or “subscriber” can refer to a person who receives data provided by the data or service provider over the Internet in a browser session, or can refer to an automated software application which receives the data and stores or processes the data.
Description
- The present application claims priority to U.S. patent application Ser. No. 17/151,071 filed on Jan. 15, 2021, issued as U.S. Pat. No. 11,516,557 on Nov. 29, 2022, which is a continuation of U.S. patent application Ser. No. 16/401,017, filed on May 1, 2019, issued as U.S. Pat. No. 10,897,659 on Jan. 19, 2021 and titled “SYSTEM AND METHOD FOR ENHANCED VIDEO IMAGE RECOGNITION USING MOTION SENSORS,” which is a continuation of U.S. patent application Ser. No. 15/334,131, filed on Oct. 25, 2016, issued as U.S. Pat. No. 10,321,208 on Jun. 11, 2019 and titled “SYSTEM AND METHOD FOR ENHANCED VIDEO IMAGE RECOGNITION USING MOTION SENSORS,” which claims the benefit of Prov. U.S. Pat. App. Ser. No. 62/246,324, filed on Oct. 26, 2015, the entire disclosures of which applications are hereby incorporated by reference in their entirety.
- This application includes material that may be subject to copyright protection. The copyright owner has no objection to the facsimile reproduction by anyone of the patent disclosure, as it appears in the Patent and Trademark Office files or records, but otherwise reserves all copyright rights whatsoever.
- The embodiments described in the disclosure relate to the field of image processing and specifically, to systems and methods for video and image processing, image recognition, and video annotation using sensor measurements.
- Recently, action videos have become very popular due to the wide availability of portable video cameras. At the same time, professional and semi-professional videos of sporting events have become more common and more sophisticated. To achieve near-professional quality of mass sport video and to make sport video more interesting and appealing to the viewer, multiple special effects are employed. It is often very desirable to annotate video with on-screen comments and data, e.g., velocity, altitude, etc. These parameter values are usually obtained from sources that are not internal or connected to the camera device. It may also be desirable to analyze activities captured in a video, compare them with other videos, or select specific parts of the video for zooming, annotation, or enhancement. To achieve these effects, multiple conditions must be satisfied.
- To do this correctly, video frames capturing the selected event must be determined exactly. Since the most interesting events are often very fast motions, the time synchronization must be very accurate to provide the desired visual effect. For example, to slow down only the frames showing a skier's jump, time synchronization must be accurate to tenths of a second to create the appropriate visual effect.
- For example, to select a particular part of a frame for enhancement (e.g., of a basketball player performing a dunk), a camera frame must be well calibrated to the real world three-dimensional coordinates. While camera calibration is well known (e.g. Tsai, Roger Y. (1987) “A Versatile Camera Calibration Technique for High Accuracy 3D Machine Vision Metrology Using Off-the-Shelf TV Cameras and Lenses,” IEEE Journal of Robotics and Automation, Vol. RA-3, No. 4, August 1987, pp. 323-344), for a mass market adaptation such procedures must be highly automated with a possible use of image recognition of the sample target in the video frame.
- There are methods that sync camera time and sensor time by using a common time source such as GPS or network time (e.g., commonly owned U.S. Pat. No. 8,929,709). Such methods require an accurate time source in both camera and sensor. Unfortunately, some cameras don't allow very accurate sub-second timestamps. Therefore, additional synchronization tuning is required. Image recognition methods can determine the video frame where a particular action starts or ends and, therefore, allow synchronization up to the time resolution of a frame.
- A separate requirement may be the graphical enhancement of the video by adding graphics to particular images in the frame, such as a person's face, etc.
- Image recognition has become a common part of video and image processing. It is used to recognize particular images, like faces, cars, or animals, or to recognize and track particular objects or activities, such as an athlete jumping or moving.
- In all the above applications image recognition methods are very CPU intensive. To make video image analysis efficient one needs to know what kind of motion or image to search for. Modern automatic cameras and drones that can work in autonomous or “start and forget” modes produce gigabytes of video data that needs to be analyzed for image recognition. Therefore, for efficient image recognition, it is very advantageous to know the range of frames in which to search for the desired images and an area of the screen where such images should appear.
- Embodiments of the disclosure overcome the aforementioned difficulties by combining sensor and video processing to provide multiple advantages.
- Even non-perfect time synchronization between sensor data and video frames can significantly improve the efficiency of video image recognition. Using image recognition, in turn, significantly improves time synchronization between sensors and the video by identifying the exact frame where a particular sensor-detected action starts, ends, or occurs. Further improvement in video recognition can be achieved by identifying a screen area and predicted pixel motion by mapping sensor-derived three-dimensional position and motion into two-dimensional camera frame coordinates.
- This sensor-camera time and position synchronization creates a virtuous cycle where simple feature recognition allows accurate 3D-to-2D sensor-camera mapping, which then automatically creates a large number of samples for video recognition to learn more complicated motion via deep learning, neural networks, algorithms, or any other method, which then allows reverse mapping of image-recognized motions into world 3D/time space even for subjects, machines, or equipment that do not have attached sensors.
- Specifically, in one embodiment, the disclosure describes a method for improving image recognition by using information from sensor data. The method may comprise receiving one or more sensor records, the sensor records representing timestamped sensor data collected by a sensor recording device; selecting an event based on the sensor records; identifying a time associated with the event; retrieving a plurality of timestamped video frames; synchronizing the sensor records and the video frames, wherein synchronizing the sensor records and the video frames comprises synchronizing the timestamped sensor data with individual frames of the timestamped video frames according to a common timeframe; and selecting a subset of video frames from the plurality of timestamped video frames based on the selected event.
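The claimed method steps above can be sketched end-to-end: select an event from the timestamped sensor records, identify its time, and select the subset of timestamped video frames around that time. The event criterion (largest sensor magnitude) and the window size are illustrative assumptions:

```python
def select_event_frames(sensor_records, video_frames, window_s=2.0):
    """End-to-end sketch of the claimed method.

    sensor_records -- list of (timestamp, magnitude) tuples, common timeframe
    video_frames   -- list of (timestamp, frame) tuples, same timeframe
    window_s       -- hypothetical half-width of the frame window, seconds
    """
    # Select an event: here, the sensor reading with the largest magnitude
    # (e.g., peak acceleration during a jump), and identify its time.
    event_time, _ = max(sensor_records, key=lambda r: r[1])
    # Select the subset of video frames within the window around the event.
    return [f for f in video_frames
            if abs(f[0] - event_time) <= window_s]
```

The payoff stated elsewhere in the disclosure is that image recognition then only has to search this small subset of frames rather than the full recording.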
- In another embodiment, the disclosure describes a system for improving image recognition by using information from sensor data. In one embodiment, the system comprises a sensor recording device configured to capture one or more sensor records, the sensor records representing timestamped sensor data collected by a sensor recording device and one or more cameras configured to record a plurality of timestamped video frames. The system further comprises an event processing system configured to receive one or more sensor records, the sensor records representing timestamped sensor data collected by a sensor recording device; select an event based on the sensor records; identify a time associated with the event; retrieve a plurality of timestamped video frames; synchronize the sensor records and the video frames, wherein synchronizing the sensor records and the video frames comprises synchronizing the timestamped sensor data with individual frames of the timestamped video frames according to a common timeframe; and select a subset of video frames from the plurality of timestamped video frames based on the selected event.
- The foregoing and other objects, features, and advantages of the disclosure will be apparent from the following description of embodiments as illustrated in the accompanying drawings, in which reference characters refer to the same parts throughout the various views. The drawings are not necessarily to scale, emphasis instead being placed upon illustrating principles of the disclosure.
-
FIG. 1 is a flow diagram illustrating a method for improving image recognition by using information from sensor data according to one embodiment of the disclosure. -
FIG. 2 is a flow diagram illustrating a method for automatically creating a large video training set for image recognition and deep learning based on sensor readings according to one embodiment of the disclosure. -
FIG. 3 is a flow diagram illustrating a method for camera calibration based on sensor readings according to one embodiment of the disclosure. -
FIG. 4 is a block diagram illustrating a video and sensor processing device according to one embodiment of the disclosure. -
FIG. 5 is a block diagram illustrating a system for enhanced video image recognition according to one embodiment of the disclosure. -
FIG. 6 is a block diagram illustrating a database system for enhanced video image recognition according to one embodiment of the disclosure. -
FIG. 7 is a block diagram illustrating a system for enhanced video image recognition according to one embodiment of the disclosure. - The present disclosure will now be described more fully hereinafter with reference to the accompanying drawings, which form a part hereof, and which show, by way of illustration, certain example embodiments.
- Disclosed herein are systems and methods for embedding performance data within a video segment. In the disclosed embodiments, a plurality of cameras may be utilized to capture timestamped video of events such as sporting events. Additionally, the participants captured on video may be equipped with a sensor recording device designed to capture movement and other activity data. Generally, the systems and methods utilize the timestamps of the sensor record data and the video data to time synchronize the two streams of data.
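The time synchronization just described can be sketched in two parts: estimating a linear map between the sensor clock and the video clock from known corresponding timestamps, and then matching each sensor record to its nearest video frame. The Python sketch below is illustrative only; the function names and data layout are assumptions, not part of the disclosure:

```python
from bisect import bisect_left

def fit_clock_map(sensor_times, video_times):
    """Least-squares fit of video_t ~= a * sensor_t + b from known pairs,
    compensating for clock offset (b) and drift (a)."""
    n = len(sensor_times)
    ms = sum(sensor_times) / n
    mv = sum(video_times) / n
    cov = sum((s - ms) * (v - mv) for s, v in zip(sensor_times, video_times))
    var = sum((s - ms) ** 2 for s in sensor_times)
    a = cov / var
    return a, mv - a * ms

def nearest_frame(frame_times, t):
    """Index of the frame whose timestamp is closest to t (same timeframe).
    frame_times must be sorted, e.g., GPS-derived timestamps."""
    i = bisect_left(frame_times, t)
    if i == 0:
        return 0
    if i == len(frame_times):
        return len(frame_times) - 1
    # Choose the closer of the two neighboring frames.
    return i if frame_times[i] - t < t - frame_times[i - 1] else i - 1

def align_records(frame_times, sensor_records, a=1.0, b=0.0):
    """Map each (timestamp, value) sensor record to its nearest frame index
    after converting the sensor timestamp into the video timeframe."""
    return [(nearest_frame(frame_times, a * ts + b), value)
            for ts, value in sensor_records]
```

With an accurate standard time reference on both devices, `a` is close to 1 and `b` close to 0; the fit simply absorbs residual offset and drift.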
- After synchronizing the data streams, the systems and methods select a subset of the video frames for further processing. In some embodiments, the systems and methods select this subset by identifying events of interest (e.g., spins, jumps, flips) within the sensor record data and calculating a period of video footage to analyze. The systems and methods then embed the sensor record data within the video footage to provide an enhanced video stream that overlays performance data on top of the segment of video footage. Techniques for overlaying performance data are described in, for example, commonly owned U.S. Pat. No. 8,929,709 entitled “Automatic digital curation and tagging of action videos”, the entire disclosure of which is incorporated herein by reference.
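Calculating the period of video footage to analyze can be sketched as mapping a window around the event time onto frame indices, under the simplifying assumption of a constant frame rate (the function name and parameters are hypothetical):

```python
import math

def segment_frame_range(video_start, fps, tv, half_window):
    """Map the interval [tv - T, tv + T] (T = half_window, in seconds) onto an
    inclusive range of frame indices for a video starting at video_start."""
    first = max(0, math.floor((tv - half_window - video_start) * fps))
    last = math.ceil((tv + half_window - video_start) * fps)
    return first, last
```

For example, a one-second window around an event at 10 seconds into a 30 fps recording selects frames 270 through 330.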
-
FIG. 1 is a flow diagram illustrating a method for improving image recognition by using information from sensor data according to one embodiment of the disclosure. - In
step 102, the method 100 synchronizes time between video and sensor records. In one embodiment, synchronizing the time between video records and sensor records may comprise synchronizing the time between video records and sensor records using a standard time reference. In one embodiment, a standard time reference may comprise a GPS time reference or a common network time reference. For example, a device capturing video data may additionally tag one or more frames with a timestamp derived from a standard time reference. Similarly, a device capturing sensor records may utilize the same standard time reference (e.g., GPS) and may associate a timestamp with each recorded event. - In
step 104, the method 100 selects an event of interest based on sensor data. In one embodiment, upon receiving the video records and sensor records, the method 100 may analyze a stream of sensor records to detect when the sensor records indicate an event has occurred. For example, a stream of sensor record data may include a time-delimited stream of sensor readings corresponding to an activity (e.g., a race or other event). Within this stream of sensor records, various parameters of the sensor records may indicate that one or more subsets of the sensor record data may be associated with an event. For example, sensor records may store information regarding the acceleration, velocity, rotation, or other movement of a participant equipped with a performance recording device. During an event, sensor records may record aberrant readings when compared to the majority of the sensor records. For example, if a participant is a skier and performs a rotational movement (e.g., a spin), rotational sensor readings may be recorded to indicate as such. Thus, the method 100 may analyze the stream of sensor records to determine such anomalous readings and identify those readings as a potential event of interest. In some embodiments, events may comprise jumps, flips, rotations, high speed movement, turns, or any other finite portion of a user's performance that may be of interest. - In
step 106, the method 100 determines an event time in a sensor data time frame after identifying a potential event of interest. As discussed above, the method 100 may first identify a set of sensor records that correspond to a potential event. In step 106, the method then identifies a time (TSNS) associated with the event. In one embodiment, TSNS may comprise a point within a range of sensor readings. For example, sensor readings for a “spin” event may span multiple seconds. In one embodiment, the method 100 identifies the point TSNS as a moment in time occurring between the start and end of the spin, such as the midpoint of the event. Thus, in step 106, the method 100 converts a stream of sensor records into a limited set of “markers” that indicate when an event has occurred, the markers each being associated with a timestamp recorded using a standard time reference (e.g., GPS). - In
step 108, the method 100 transfers the time, or times, determined in step 106 into the video time frame. In one embodiment, transferring the time or times determined in step 106 may comprise converting the time, TSNS, to a timestamp (TV) associated with video data. In one embodiment, determining a timestamp TV may comprise applying a synchronization function to the timestamp TSNS to obtain TV. - In one embodiment, there may be clock drift between the standard time references used by the device recording sensor record data and the device recording video record data. To offset this drift, the
method 100 may apply a linear transformation to the camera and/or sensor time if the video record frames and corresponding sensor times are known. In alternative embodiments, the method 100 may utilize this transformation as a synchronization function and may apply the synchronization function to all video record data. - In
step 110, the method 100 selects video segments that correspond to the selected time segment obtained in step 106. In one embodiment, the method 100 may utilize the timestamp TV in order to obtain a time period to extract video segments from a video database. In one embodiment, the method 100 may select a period T and may select video segments from a video database that were recorded between TV−T and TV+T. In one embodiment, a period T may be predetermined (e.g., a standard 1 or 2 second period may be used). In alternative embodiments, the method 100 may determine a period T based on the sensor record data or video record data. In one embodiment, the method 100 may determine a period T based on sensor record data by determining a time period in which sensor record data is anomalous. For example, during a “spin” event, rotational data may be abnormal for a period of 3 seconds. If so, the method 100 may set T as 3 seconds based on the sensor data. In alternative embodiments, the method 100 may set T based on analyzing the changes in frames of video record data. For example, starting at TV, the method 100 may analyze the preceding and subsequent frames and calculate the similarity between the preceding and subsequent frames to the frame at TV. - In some embodiments, selecting a video segment may additionally comprise selecting only a portion of each of the video frames present within the video segment. In this embodiment, the portion of the video frames may be selected based on detecting movement of a participant in the video. For example, the
method 100 may analyze the video segment to identify those pixels which change between frames. The method 100 may then identify a bounding box that captures all changing pixels and select only pixels within the bounding box for each frame. - In
step 112, the method 100 performs image recognition on the selected frames or sub-frames. By performing image recognition only on selected frames, CPU/GPU processing load is significantly reduced while recognition probability and performance are increased. Image recognition can be done by a multitude of methods known in the art. - In
step 114, using the result of the image recognition obtained in step 112, the method 100 improves time synchronization between sensors and video up to a one-frame time resolution. For example, this could be done by detecting the jump start or the first landing video frame. This step may not be required if an initial time synchronization is accurate to better than one video frame time interval. - Although illustrated using a single event, the
method 100 may be performed for multiple events occurring within a stream of sensor record data. For example, the method 100 may be executed for each event detected during an athlete's performance (e.g., a downhill skiing race). -
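As a concrete, hypothetical illustration of steps 104 through 110 above, the sketch below groups over-threshold sensor readings into candidate events, takes the event midpoint as the marker TSNS, and computes a bounding box of changing pixels for sub-frame selection. The threshold rule and data layout are assumptions for illustration, not the disclosed method itself:

```python
def detect_events(readings, threshold):
    """readings: (timestamp, magnitude) pairs, e.g., rotational rate.
    Groups runs of consecutive over-threshold readings into (start, end)
    candidate events."""
    events, run = [], []
    for ts, mag in readings:
        if abs(mag) >= threshold:
            run.append(ts)
        elif run:
            events.append((run[0], run[-1]))
            run = []
    if run:
        events.append((run[0], run[-1]))
    return events

def event_marker(start, end):
    """Marker time T_SNS for an event: here, its midpoint."""
    return (start + end) / 2.0

def change_bounding_box(frame_a, frame_b, tol=0):
    """(top, left, bottom, right) bounding every pixel that differs by more
    than tol between two frames (2-D lists of grayscale values), or None
    if nothing changed."""
    ys, xs = [], []
    for y, (row_a, row_b) in enumerate(zip(frame_a, frame_b)):
        for x, (pa, pb) in enumerate(zip(row_a, row_b)):
            if abs(pa - pb) > tol:
                ys.append(y)
                xs.append(x)
    if not ys:
        return None
    return min(ys), min(xs), max(ys), max(xs)
```

Restricting recognition to the returned bounding box is what reduces the CPU/GPU load in step 112.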
FIG. 2 is a flow diagram illustrating a method for automatically creating a large video training set for image recognition and deep learning based on sensor readings according to one embodiment of the disclosure. - In
step 202, the method 200 identifies an event of interest using sensor data. As described more fully in connection with FIGS. 4 through 7, multiple performers may be equipped with a sensor reading device which records sensor record data regarding the performer, such as the performer's acceleration, velocity, etc. This sensor record data may be received by the method 200. In response, the method 200 may analyze a stream of sensor record data (e.g., data associated with a race or event) and identify one or more events, such as spins, flips, rotations, etc., based on anomalies detected within the sensor record data. - In alternative embodiments, the
method 200 may not receive sensor record data directly from a sensor recording device when identifying an event of interest using sensor record data. In this embodiment, the method 200 may have previously received sensor record data from, for example, databases. The method 200 may use this data to train a machine learning algorithm or predictive model to automatically detect participants in a video. Specifically, the method 200 may provide, as inputs to a machine learning algorithm or predictive model, sensor record data and video frames previously synchronized with the sensor record data. The machine learning algorithm or predictive model may then use these inputs as a training set to automatically classify video frames based on the sensor record data. Specifically, the changes in pixels of video frames may be classified by the machine learning algorithm or predictive model using the sensor record data to predict future events without the need for explicit sensor record data. In this manner, the method 200 may generate a machine learning algorithm or predictive model that receives video frames and can predict the type of event occurring within the video frames. Alternatively, or in conjunction with the foregoing, the machine learning algorithm or predictive model may also be trained to predict various performance data associated with the video feeds such as velocity, acceleration, event type, etc. - In
step 204, the method 200 selects all videos that contain an event of interest from a performance-tagged video database. In some embodiments, the videos have metadata that describes such events in the video frames, or they are retrieved from a related sensor database that enables the identification of such events. - In
step 206, the method 200 identifies an event in each video. This can be done by time synchronization between sensors and video, or directly from video tags or metadata if present. In one embodiment, tags may be generated during event sensor processing and event identification. - In
step 208, the method 200 selects only frames where the selected feature or event is present. This could be done by using time synchronization or directly from tags or metadata associated with the video. In one embodiment, the method 200 may synchronize the operational cameras with sensor record data as described in connection with FIG. 1 and, specifically, as described in connection with step 108. - In
step - In
step 214, the method 200 provides selected video frames that contain the selected feature of interest to an image recognition learning algorithm. These video clips taken by different cameras during multiple events from different angles represent a sample environment for algorithm training and verification. In addition, such metadata as camera parameters, distance, focus, and viewing angle can be provided to the learning algorithm as well. This data can be derived from the sensory information about the camera and events. - In
step 216, the method 200 trains video recognition algorithms using the entirety of the selected video frames and their tags and metadata. - Alternatively, or in conjunction with the foregoing, the
method 200 may further be configured to automatically recalibrate the operational cameras based on the results of the method 200. For example, the method 200 may be configured to utilize a machine learning algorithm or predictive model to classify unknown parameters of the operational cameras (e.g., angle, focus, etc.). For example, the method 200 may utilize sensor record data to compute a three-dimensional position of the user recording the sensor record data and may generate updated focus parameters to automatically recalibrate the operational cameras. Alternatively, the method 200 may provide updated parameters for operational cameras to a camera operator for manual calibration. - In another embodiment, the
method 200 may further be configured to calculate a three-dimensional position of the sensor recording device for each frame in the subset of video frames. After calculating these positions, the method 200 may determine a set of image areas for each frame in the subset of video frames, the image areas framing a participant equipped with the sensor recording device. Finally, the method 200 may digitally zoom each frame in the subset of video frames based on the set of image areas. - Although illustrated using a single event, the
method 200 may be performed for multiple events occurring within a stream of sensor record data. For example, the method 200 may be executed for each event detected during an athlete's performance (e.g., a downhill skiing race). -
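Two of the operations described above for the method 200 can be sketched directly: building labeled clips for the learning algorithm of step 214 by pairing each sensor-derived event with its synchronized frames, and the digital zoom of the alternative embodiment as a clamped crop around the sensor's projected pixel position. Both functions are hypothetical illustrations, not the disclosed implementation:

```python
def build_training_set(events, frame_times, frames, window):
    """events: (label, marker_time) pairs. Returns (clip, label) samples,
    each clip being the frames within +/- window seconds of the marker."""
    samples = []
    for label, t in events:
        clip = [f for ft, f in zip(frame_times, frames)
                if abs(ft - t) <= window]
        if clip:  # skip events with no synchronized footage
            samples.append((clip, label))
    return samples

def crop_around(frame, cx, cy, half):
    """Digital zoom: a (2*half+1)-pixel square centered on (cx, cy),
    clamped so the crop stays inside the frame (a 2-D list of pixels)."""
    h, w = len(frame), len(frame[0])
    size = 2 * half + 1
    top = max(0, min(cy - half, h - size))
    left = max(0, min(cx - half, w - size))
    return [row[left:left + size] for row in frame[top:top + size]]
```

Clips collected this way from many cameras and angles form the large, automatically labeled sample set that step 216 trains on.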
FIG. 3 is a flow diagram illustrating a method for camera calibration based on sensor readings according to one embodiment of the disclosure. - In
step 302, the method 300 receives video data with associated metadata. As discussed in connection with the preceding Figures, video data may comprise video data captured during an event, such as a sporting event. In some embodiments, video data may be recorded live, while in other embodiments the method 300 may receive stored video from a user. For example, the method 300 may be implemented by a server-based media platform, wherein users upload video data to the server for sharing among other users. In some embodiments, metadata associated with the video data may include information related to the video or the participants in the video. For example, metadata may include the geographic location of the video, the date and/or time the video was taken, and, as discussed further herein, a user identifier associated with the video. In one embodiment, a user identifier may comprise a numeric identifier, a username, an e-mail address, or any other data that uniquely identifies a user. - In
step 304, the method 300 processes the video metadata. In one embodiment, processing the video metadata may comprise extracting the video metadata from a video file or associated database. For example, the method 300 may receive a single, flat file containing video data and metadata. The method 300 may then split the single file into video data and associated metadata. Alternatively, or in conjunction with the foregoing, the method 300 may reformat the received video metadata into a format usable for later processing. For example, video metadata may comprise binary data that the method 300 may convert into a structured format such as JSON or XML. - In
step, the method 300 selects performance data associated with this video. In one embodiment, selecting performance data may comprise selecting performance data based on a user identifier (ID) present within the video metadata, or by selecting data from a performance database that has the same time and location tag or metadata as the video performance database. In some embodiments, the method 300 may isolate performance data upon determining that a user ID is present within the video metadata. In one embodiment, the method 300 may perform steps - In
step 310, the method 300 time synchronizes video frames and performance data, unless both databases are already synchronized to a common time frame. - In
step 312, the method 300 determines the actual pixels in each frame where a particular event or feature(s) is present. In one embodiment, a feature may comprise a particular user, user equipment, or the actual sensor that provides sensory data for this event. For example, in each frame the pixels that correspond to a surfer's location (or the tip of the surfboard where the sensor is located) are identified. In one embodiment, this pixel identification is done manually; in another embodiment, it is done via image recognition, or semi-automatically by providing one or more pixels in the first frame and then using image recognition in each following video frame. - In
step 314, the method 300 calibrates a camera field of view by using pairs of real world sensor coordinates and video frame pixel locations for the same event, since both time frames were previously synchronized by the method 300. The actual calibration can be done by any of the multiple methods that are well known to practitioners of the art, such as those described in Tsai, Roger Y. (1987) “A Versatile Camera Calibration Technique for High Accuracy 3D Machine Vision Metrology Using Off-the-Shelf TV Cameras and Lenses,” IEEE Journal of Robotics and Automation, Vol. RA-3, No. 4, August 1987, pp. 323-344. - Therefore, a camera field of view may be accurately calibrated and may be used to provide an accurate mapping between real world coordinates and pixels for better image recognition, view selection, selected digital zoom, or use of multiple cameras for virtual reality or 3D synthetic views.
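As one simplified illustration of calibrating from (world coordinate, pixel) pairs, the sketch below fits an affine camera model u = a1·x + a2·y + a3·z + a4 (and similarly for v) by least squares. A full calibration such as Tsai's method also recovers lens distortion and perspective parameters; this affine fit is only a toy stand-in to show the structure of the problem:

```python
def solve(A, b):
    """Gaussian elimination with partial pivoting for small linear systems."""
    n = len(A)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x

def fit_affine_camera(world_pts, pixel_pts):
    """Least-squares fit (via normal equations) of an affine camera mapping
    3D world points to 2D pixels. Returns ([a1..a4] for u, [a1..a4] for v)."""
    rows = [[x, y, z, 1.0] for x, y, z in world_pts]
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(4)] for i in range(4)]
    fitted = []
    for k in (0, 1):  # k = 0 fits the u row, k = 1 fits the v row
        atb = [sum(r[i] * p[k] for r, p in zip(rows, pixel_pts))
               for i in range(4)]
        fitted.append(solve(ata, atb))
    return fitted
```

At least four non-coplanar world points are needed for the normal equations to be solvable; in practice a production system would use an established calibration library rather than this sketch.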
-
FIG. 4 is a block diagram illustrating a video and sensor processing device according to one embodiment of the disclosure. The device 400 includes a CPU 402, memory 404, non-volatile storage 406, accelerometer 408, GPS receiver 410, sensors 412, camera 414, microphone 416, cellular transceiver 418, Bluetooth transceiver 422, and wireless transceiver 420. - In the illustrated embodiment, the
device 400 may comprise a computing device designed to be worn, or otherwise carried, by a user. The device 400 includes an accelerometer 408 and GPS receiver 410 which monitor the device 400 to identify its position (via GPS receiver 410) and its acceleration (via accelerometer 408). Additionally, the device 400 includes one or more sensors 412 that may record additional data regarding the activity of the device 400. For example, sensors 412 may include speedometers, tachometers, pedometers, biometric sensors, or other sensor reading devices. Although illustrated as single components, accelerometer 408, GPS receiver 410, and sensors 412 may alternatively each include multiple components providing similar functionality. -
Accelerometer 408, GPS receiver 410, and sensors 412 generate data, as described in more detail herein, and transmit the data to other components via CPU 402. Alternatively, or in conjunction with the foregoing, accelerometer 408, GPS receiver 410, and sensors 412 may transmit data to memory 404 for short-term storage. In one embodiment, memory 404 may comprise a random access memory device or similar volatile storage device. Alternatively, or in conjunction with the foregoing, accelerometer 408, GPS receiver 410, and sensors 412 may transmit data directly to non-volatile storage 406. In this embodiment, CPU 402 may access the data (e.g., location and/or sensor data) from memory 404. In some embodiments, non-volatile storage 406 may comprise a solid-state storage device (e.g., a “flash” storage device) or a traditional storage device (e.g., a hard disk). Specifically, GPS receiver 410 may transmit location data (e.g., latitude, longitude, etc.) to CPU 402, memory 404, or non-volatile storage 406 in similar manners. In some embodiments, CPU 402 may comprise a field programmable gate array or customized application-specific integrated circuit. -
Device 400 additionally includes camera 414 and microphone 416. Camera 414 and microphone 416 may be capable of recording audio and video signals and transmitting these signals to CPU 402 for long-term storage in non-volatile storage 406 or short-term storage in memory 404. - As illustrated in
FIG. 4, the device 400 includes multiple network interfaces including cellular transceiver 418, wireless transceiver 420, and Bluetooth transceiver 422. Cellular transceiver 418 enables the device 400 to transmit performance or audio/video data, processed by CPU 402, to a server via a mobile or radio network. Additionally, CPU 402 may determine the format and contents of data transferred using cellular transceiver 418, wireless transceiver 420, and Bluetooth transceiver 422 based upon detected network conditions. -
FIG. 5 is a block diagram illustrating a system for enhanced video image recognition according to one embodiment of the disclosure. - As illustrated in
FIG. 5, a plurality of sensors 502 and audio/visual capture devices 504 transmit data to processing system 506. In one embodiment, sensors 502 may comprise a variety of sensors used to record the movement of a device (and user of the device) during a finite period of time. In some embodiments, sensors 502 may comprise gyroscopes, accelerometers, pedometers, speedometers, tachometers, and any other sensor-based device capable of recording data relating to the movement of a device or user of the device. In alternative embodiments, sensors 502 may additionally include biometric sensors. - Audio/
visual capture devices 504 may include one or more video cameras, still cameras, microphones, three-dimensional cameras, or any other devices capable of recording multimedia data. - Although illustrated as single elements,
sensors 502 may comprise a distributed network of sensors installed in multiple performance recording devices. For example, processing system 506 may receive data from multiple performance recording devices operated by multiple users. Each of these performance recording devices may include one or more sensors as described above. - Likewise, audio/
visual capture devices 504 may comprise multiple audio/visual capture devices, each recording data and transmitting that data to processing system 506. For example, audio/visual capture devices 504 may include personal recording devices as well as fixed recording devices. - In one embodiment, the
system 500 may be a locally-installed system. For example, the system 500 may be installed at a known location of an event for processing data specific to that event and location. Alternatively, system 500 may comprise a globally-available system wherein devices providing video and/or performance data may be located throughout the world. In this embodiment, processing system 506 may comprise a single server-based device or multiple server-based devices (co-located or distributed) processing data simultaneously. - As illustrated in
FIG. 5, the system 500 includes a processing system 506. In one embodiment, processing system 506 may comprise a device, or multiple devices, receiving sensor data and audio/video data from sensors 502 and audio/video capture devices 504. - In one embodiment,
processing system 506 is capable of processing the received data and storing the received data in performance database 508 or video database 510. Embodiments of the structure of performance database 508 and video database 510 are described more fully with respect to FIG. 6, the description of which is incorporated herein in its entirety. - In addition to processing and storing received data,
processing system 506 may further be configured to embed performance data within video data and transmit the combined data to an output device (e.g., a display device, network connection, or other communications channel). The processing of video data to include performance data is described more fully with respect to FIGS. 1 through 3, the descriptions of which are incorporated herein in their entirety. -
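Embedding performance data within the video stream can be sketched as attaching, to each frame, the sensor values that were synchronized to it; a renderer would then draw this payload as an overlay. The pairing below is a minimal, hypothetical illustration of that step:

```python
def embed_overlay(frames, aligned_records):
    """aligned_records: (frame_index, value) pairs produced by time
    synchronization. Returns (frame, overlay_payload) pairs; the payload is
    None for frames with no synchronized sensor value."""
    overlays = {}
    for i, value in aligned_records:
        overlays.setdefault(i, []).append(value)
    return [(f, overlays.get(i)) for i, f in enumerate(frames)]
```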
FIG. 6 is a block diagram illustrating a database system for enhanced video image recognition according to one embodiment of the disclosure. - In the embodiment illustrated in
FIG. 6, database system 600 includes a video database 602 and a performance database 618. Each database 602, 618 may maintain one or more indexes; in one embodiment, these indexes allow records in the databases 602, 618 to be cross-referenced. -
Video database 602 may comprise various data structures or fields for storing information related to captured video. As described herein, video data may be captured by one or more cameras associated with a user or with an event. In one embodiment, video capture devices may transmit data to database system 600 for processing and storage. -
Video database 602 includes a GPS data storage component 604. In one embodiment, video database 602 may store the GPS coordinates associated with a camera transmitting video data to database system 600. GPS data may include the latitude, longitude, and altitude of the camera supplying data to video database 602. In one embodiment, GPS data may be constant over time. In other embodiments, GPS data may comprise a time sequence of GPS coordinates if the camera is mobile. -
Video database 602 additionally includes direction storage component 606. In one embodiment, direction storage component 606 may store information regarding the direction a camera is positioned during the capture of video data. In one embodiment, direction data may comprise a three-dimensional representation of the angle in which the camera is positioned. In one embodiment, direction information may be constant. In other embodiments, direction information may comprise a time sequence of x, y, and z coordinates if the camera is mobile. -
Video database 602 additionally includes focus storage component 608. In one embodiment, focus storage component 608 stores information regarding the focal length of the camera transmitting video data to database system 600. -
Video database 602 additionally includes user storage component 610. In one embodiment, user storage component 610 may store user information relating to the user capturing the video transmitted to database system 600. In one embodiment, video may be captured by devices owned and operated by users (e.g., portable video cameras, cellphones, etc.). Each of these devices may be associated with a user (e.g., via an application requiring a login, via a MAC address, etc.). In alternative embodiments, video database 602 may not record user information if the camera is not associated with a specific user (e.g., if the camera is operated by an organization). Alternatively, the video database 602 may record the organization as the user within user storage component 610. -
Video database 602 additionally includes video file storage component 612. In one embodiment, video file storage 612 may comprise a storage device for storing raw video data, such as a filesystem. Alternatively, video file storage component 612 may comprise a special purpose database for storing video data. In some embodiments, video file storage component 612 may comprise a remote “cloud”-based storage device. -
Video database 602 additionally includes tag storage component 614. In one embodiment, tag storage component 614 may store additional annotations regarding video data transmitted to video database 602. In one embodiment, video data may be captured by users and transmitted to database system 600. Prior to transmitting the video data, the user may add additional tags or annotations to the video data via an application (e.g., a mobile application). For example, a user may add tags describing the actions in the video, the scene of the video, or any other information deemed relevant by the user. -
Performance database 618 may comprise various data structures or fields for storing information related to performance data captured by performance recording devices. As described herein, performance data may be captured by one or more performance recording devices associated with a user. In one embodiment, performance recording devices may transmit data to database system 600 for processing and storage. Alternatively, performance database 618 may be stored locally within the performance recording device. -
Performance database 618 includes a user storage component 620. In one embodiment, user storage component 620 stores user information associated with the owner or operator of a performance recording device transmitting sensor record data to performance database 618. For example, a user may be equipped with a performance recording device that has been set up for use by that specific user. Thus, the performance recording device may be associated with an identifier uniquely identifying the user. When transmitting performance data to performance database 618, the performance recording device may additionally provide the user identifier, which database system 600 may store in performance database 618 via user storage component 620. -
Performance database 618 additionally includes a bounding box storage component 622. In one embodiment, a performance recording device may supply bounding box information to database system 600 for storage in bounding box storage component 622. In one embodiment, a bounding box may comprise an estimated rectangular area surrounding the performance recording device and/or user. In one embodiment, a bounding box may comprise a fixed rectangular area; alternatively, the bounding box information may be updated as the performance recording device moves. -
Performance database 618 additionally includes a GPS data storage component 624. In one embodiment, GPS data storage component 624 stores information regarding the location of the performance recording device while recording movements of the device. In one embodiment, GPS data may comprise the latitude, longitude, and altitude of the performance recording device. In one embodiment, GPS data may comprise a time sequence of GPS coordinates. -
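For illustration, a time sequence of GPS coordinates might be stored and queried as sketched below; the linear interpolation step is an assumption added for the example, not a requirement of the disclosure:

```python
# Illustrative only: a GPS time sequence of timestamped
# (latitude, longitude, altitude) samples, with linear interpolation
# to estimate the device's location at an arbitrary time.
import bisect

gps_track = [
    (0.0, (37.7749, -122.4194, 12.0)),   # (timestamp, (lat, lon, alt))
    (1.0, (37.7750, -122.4193, 12.5)),
    (2.0, (37.7751, -122.4192, 13.0)),
]

def position_at(track, t):
    """Estimate (lat, lon, alt) at time t by interpolating between samples."""
    times = [sample[0] for sample in track]
    i = bisect.bisect_right(times, t)
    if i == 0:                 # before the first sample
        return track[0][1]
    if i == len(track):        # after the last sample
        return track[-1][1]
    (t0, p0), (t1, p1) = track[i - 1], track[i]
    f = (t - t0) / (t1 - t0)
    return tuple(a + f * (b - a) for a, b in zip(p0, p1))

lat, lon, alt = position_at(gps_track, 0.5)
```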
Performance database 618 additionally includes a sensor data storage component 626. In one embodiment, sensor data storage component 626 stores sensor data received from sensors within a performance recording device. In one embodiment, sensors may comprise gyroscopes, accelerometers, speedometers, pedometers, or other sensor recording devices. In one embodiment, sensor data storage component 626 may store sensor data as a time series of sensor readings. -
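A hypothetical sketch of sensor data kept as a time series of readings, keyed by user and sensor type (all names and values are illustrative, not part of the disclosure):

```python
# Illustrative sketch of a sensor-data store: each reading is a
# (timestamp, value) pair filed under (user_id, sensor_type).
from collections import defaultdict

sensor_store = defaultdict(list)   # (user_id, sensor_type) -> [(t, value), ...]

def record(user_id, sensor_type, t, value):
    """Append one timestamped reading to the store."""
    sensor_store[(user_id, sensor_type)].append((t, value))

def window(user_id, sensor_type, t_start, t_end):
    """Return readings in [t_start, t_end], e.g., around a detected event."""
    return [(t, v) for t, v in sensor_store[(user_id, sensor_type)]
            if t_start <= t <= t_end]

record("user-42", "accelerometer", 0.00, (0.1, 0.0, 9.8))
record("user-42", "accelerometer", 0.02, (0.3, 0.1, 2.1))
record("user-42", "gyroscope", 0.00, (0.0, 0.0, 1.5))
readings = window("user-42", "accelerometer", 0.0, 0.01)
```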
Performance database 618 additionally includes an event data storage component 628. In one embodiment, event data storage component 628 stores information regarding events detected using the aforementioned information. Techniques for detecting events are discussed more fully with respect to FIGS. 1 through 3, and the disclosure of those Figures is incorporated herein in its entirety. -
FIG. 7 is a block diagram illustrating a system for enhanced video image recognition according to one embodiment of the disclosure.

In the diagram illustrated in FIG. 7, two performers 702 a and 702 b traverse trajectories 704 a and 704 b, respectively. In one embodiment, performers 702 a and 702 b may each be equipped with one or more performance recording devices, as discussed with respect to FIG. 4, the description of which is incorporated herein in its entirety.

As illustrated in FIG. 7, two cameras 706 a and 706 b are positioned to capture video of performers 702 a and 702 b as they traverse trajectories 704 a and 704 b.

Each camera 706 a, 706 b has an associated field of view 708 a, 708 b, respectively. In one embodiment, fields of view 708 a and 708 b define the regions within which cameras 706 a and 706 b may capture video of performers 702 a and 702 b.

As illustrated, cameras 706 a and 706 b are communicatively coupled to processing device 710. In the illustrated embodiment, cameras 706 a and 706 b may transmit recorded video data to processing device 710 for storage and processing, as discussed in more detail with respect to FIGS. 1 through 3. Alternatively, or in conjunction with the foregoing, each performer 702 a, 702 b may transmit sensor record data to processing device 710. In one embodiment, sensor record data may be transmitted to processing device 710 using a cellular connection. In alternative embodiments, sensor record data may first be transmitted to a server device (not illustrated) for processing prior to transmittal to processing device 710. In alternative embodiments, sensor record data may be stored locally by the device and transferred to processing device 710 at a later time and date. Notably, as illustrated, the trajectory 704 a of performer 702 a illustrates the scenario wherein performer 702 a is performing an event (e.g., a high speed event, jump, spin, etc.) while performer 702 a is in the field of view 708 a of camera 706 a. Conversely, both performers 702 a and 702 b may appear within the fields of view 708 a, 708 b of cameras 706 a, 706 b at various points along their trajectories. In this scenario, processing device 710 may be configured to detect an event performed by performer 702 a using video data from camera 706 a and sensor data transmitted by performer 702 a. Notably, as discussed in more detail herein, processing device 710 may receive sensor data from performer 702 a and be configured to identify camera 706 a as the device providing corresponding video footage for events identified by performer 702 a. The selection of cameras 706 a, 706 b is discussed more fully in connection with FIG. 2.

The subject matter described above may be embodied in a variety of different forms and, therefore, covered or claimed subject matter is intended to be construed as not being limited to any example embodiments set forth herein; example embodiments are provided merely to be illustrative. Likewise, a reasonably broad scope for claimed or covered subject matter is intended. Among other things, for example, subject matter may be embodied as methods, devices, components, or systems.
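The camera-selection scenario described with respect to FIG. 7 might be sketched, under simplified planar geometry, as follows. The camera identifiers mirror the figure's reference numerals, but the rectangular fields of view, names, and coordinates are assumptions made for illustration; the actual selection logic is described with respect to FIG. 2:

```python
# Illustrative sketch only: select the camera whose field of view
# contains a performer's position at event time (e.g., derived from
# GPS data). Rectangular fields of view are a simplifying assumption.

cameras = {
    "706a": {"fov": (0.0, 0.0, 50.0, 50.0)},     # x_min, y_min, x_max, y_max
    "706b": {"fov": (60.0, 0.0, 110.0, 50.0)},
}

def in_fov(fov, x, y):
    """Return True if point (x, y) lies within the rectangular field of view."""
    x_min, y_min, x_max, y_max = fov
    return x_min <= x <= x_max and y_min <= y <= y_max

def select_camera(cameras, x, y):
    """Identify a camera providing footage of position (x, y), if any."""
    for cam_id, cam in cameras.items():
        if in_fov(cam["fov"], x, y):
            return cam_id
    return None

# An event by performer 702a occurring inside camera 706a's field of view:
chosen = select_camera(cameras, 25.0, 25.0)
```

In practice the performer's position and the camera fields of view would be derived from the GPS and bounding box data described above; both are hard-coded here for brevity.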
Accordingly, embodiments may, for example, take the form of hardware, software, firmware or any combination thereof (other than software per se). The description presented above is, therefore, not intended to be taken in a limiting sense.
- Throughout the specification and claims, terms may have nuanced meanings suggested or implied in context beyond an explicitly stated meaning. Likewise, the phrase “in one embodiment” as used herein does not necessarily refer to the same embodiment and the phrase “in another embodiment” as used herein does not necessarily refer to a different embodiment. It is intended, for example, that claimed subject matter include combinations of example embodiments in whole or in part.
- In general, terminology may be understood at least in part from usage in context. For example, terms, such as “and”, “or”, or “and/or,” as used herein may include a variety of meanings that may depend at least in part upon the context in which such terms are used. Typically, “or” if used to associate a list, such as A, B or C, is intended to mean A, B, and C, here used in the inclusive sense, as well as A, B or C, here used in the exclusive sense. In addition, the term “one or more” as used herein, depending at least in part upon context, may be used to describe any feature, structure, or characteristic in a singular sense or may be used to describe combinations of features, structures or characteristics in a plural sense. Similarly, terms, such as “a,” “an,” or “the,” again, may be understood to convey a singular usage or to convey a plural usage, depending at least in part upon context. In addition, the term “based on” may be understood as not necessarily intended to convey an exclusive set of factors and may, instead, allow for existence of additional factors not necessarily expressly described, again, depending at least in part on context.
- The present disclosure is described below with reference to block diagrams and operational illustrations of methods and devices. It is understood that each block of the block diagrams or operational illustrations, and combinations of blocks in the block diagrams or operational illustrations, can be implemented by means of analog or digital hardware and computer program instructions. These computer program instructions can be provided to a processor of a general purpose computer to alter its function as detailed herein, a special purpose computer, ASIC, or other programmable data processing apparatus, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, implement the functions/acts specified in the block diagrams or operational block or blocks. In some alternate implementations, the functions/acts noted in the blocks can occur out of the order noted in the operational illustrations. For example, two blocks shown in succession can in fact be executed substantially concurrently or the blocks can sometimes be executed in the reverse order, depending upon the functionality/acts involved.
- These computer program instructions can be provided to a processor of: a general purpose computer to alter its function to a special purpose; a special purpose computer; ASIC; or other programmable digital data processing apparatus, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, implement the functions/acts specified in the block diagrams or operational block or blocks, thereby transforming their functionality in accordance with embodiments herein.
- For the purposes of this disclosure a computer readable medium (or computer readable storage medium/media) stores computer data, which data can include computer program code (or computer-executable instructions) that is executable by a computer, in machine readable form. By way of example, and not limitation, a computer readable medium may comprise computer readable storage media, for tangible or fixed storage of data, or communication media for transient interpretation of code-containing signals. Computer readable storage media, as used herein, refers to physical or tangible storage (as opposed to signals) and includes without limitation volatile and non-volatile, removable and non-removable media implemented in any method or technology for the tangible storage of information such as computer-readable instructions, data structures, program modules or other data. Computer readable storage media includes, but is not limited to, RAM, ROM, EPROM, EEPROM, flash memory or other solid state memory technology, CD-ROM, DVD, or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other physical or material medium which can be used to tangibly store the desired information or data or instructions and which can be accessed by a computer or processor.
- For the purposes of this disclosure the term “server” should be understood to refer to a service point which provides processing, database, and communication facilities. By way of example, and not limitation, the term “server” can refer to a single, physical processor with associated communications and data storage and database facilities, or it can refer to a networked or clustered complex of processors and associated network and storage devices, as well as operating software and one or more database systems and application software that support the services provided by the server. Servers may vary widely in configuration or capabilities, but generally a server may include one or more central processing units and memory. A server may also include one or more mass storage devices, one or more power supplies, one or more wired or wireless network interfaces, one or more input/output interfaces, or one or more operating systems, such as Windows Server, Mac OS X, Unix, Linux, FreeBSD, or the like.
- For the purposes of this disclosure a “network” should be understood to refer to a network that may couple devices so that communications may be exchanged, such as between a server and a client device or other types of devices, including between wireless devices coupled via a wireless network, for example. A network may also include mass storage, such as network attached storage (NAS), a storage area network (SAN), or other forms of computer or machine readable media, for example. A network may include the Internet, one or more local area networks (LANs), one or more wide area networks (WANs), wire-line type connections, wireless type connections, cellular or any combination thereof. Likewise, sub-networks, which may employ differing architectures or may be compliant or compatible with differing protocols, may interoperate within a larger network. Various types of devices may, for example, be made available to provide an interoperable capability for differing architectures or protocols. As one illustrative example, a router may provide a link between otherwise separate and independent LANs.
- A communication link or channel may include, for example, analog telephone lines, such as a twisted wire pair, a coaxial cable, full or fractional digital lines including T1, T2, T3, or T4 type lines, Integrated Services Digital Networks (ISDNs), Digital Subscriber Lines (DSLs), wireless links including satellite links, or other communication links or channels, such as may be known to those skilled in the art. Furthermore, a computing device or other related electronic devices may be remotely coupled to a network, such as via a wired or wireless line or link, for example.
- For purposes of this disclosure, a “wireless network” should be understood to couple client devices with a network. A wireless network may employ stand-alone ad-hoc networks, mesh networks, Wireless LAN (WLAN) networks, cellular networks, or the like. A wireless network may further include a system of terminals, gateways, routers, or the like coupled by wireless radio links, or the like, which may move freely, randomly or organize themselves arbitrarily, such that network topology may change, at times even rapidly.
- A wireless network may further employ a plurality of network access technologies, including Wi-Fi, Long Term Evolution (LTE), WLAN, Wireless Router (WR) mesh, or 2nd, 3rd, or 4th generation (2G, 3G, or 4G) cellular technology, or the like. Network access technologies may enable wide area coverage for devices, such as client devices with varying degrees of mobility, for example.
- For example, a network may enable RF or wireless type communication via one or more network access technologies, such as Global System for Mobile communication (GSM), Universal Mobile Telecommunications System (UMTS), General Packet Radio Services (GPRS), Enhanced Data GSM Environment (EDGE), 3GPP Long Term Evolution (LTE), LTE Advanced, Wideband Code Division Multiple Access (WCDMA), Bluetooth, 802.11b/g/n, or the like. A wireless network may include virtually any type of wireless communication mechanism by which signals may be communicated between devices, such as a client device or a computing device, between or within a network, or the like.
- A computing device may be capable of sending or receiving signals, such as via a wired or wireless network, or may be capable of processing or storing signals, such as in memory as physical memory states, and may, therefore, operate as a server. Thus, devices capable of operating as a server may include, as examples, dedicated rack-mounted servers, desktop computers, laptop computers, set top boxes, integrated devices combining various features, such as two or more features of the foregoing devices, or the like. Servers may vary widely in configuration or capabilities, but generally a server may include one or more central processing units and memory. A server may also include one or more mass storage devices, one or more power supplies, one or more wired or wireless network interfaces, one or more input/output interfaces, or one or more operating systems, such as Windows Server, Mac OS X, Unix, Linux, FreeBSD, or the like.
- For the purposes of this disclosure a module is a software, hardware, or firmware (or combinations thereof) system, process or functionality, or component thereof, that performs or facilitates the processes, features, and/or functions described herein (with or without human interaction or augmentation). A module can include sub-modules. Software components of a module may be stored on a computer readable medium for execution by a processor. Modules may be integral to one or more servers, or be loaded and executed by one or more servers. One or more modules may be grouped into an engine or an application.
- For the purposes of this disclosure the term “user,” “subscriber,” “consumer,” or “customer” should be understood to refer to a user of an application or applications as described herein and/or a consumer of data supplied by a data provider. By way of example, and not limitation, the term “user” or “subscriber” can refer to a person who receives data provided by the data or service provider over the Internet in a browser session, or can refer to an automated software application which receives the data and stores or processes the data.
- Those skilled in the art will recognize that the methods and systems of the present disclosure may be implemented in many manners and as such are not to be limited by the foregoing exemplary embodiments and examples. In other words, functional elements being performed by single or multiple components, in various combinations of hardware and software or firmware, and individual functions, may be distributed among software applications at either the client level or server level or both. In this regard, any number of the features of the different embodiments described herein may be combined into single or multiple embodiments, and alternate embodiments having fewer than, or more than, all of the features described herein are possible.
- Functionality may also be, in whole or in part, distributed among multiple components, in manners now known or to become known. Thus, myriad software/hardware/firmware combinations are possible in achieving the functions, features, interfaces and preferences described herein. Moreover, the scope of the present disclosure covers conventionally known manners for carrying out the described features and functions and interfaces, as well as those variations and modifications that may be made to the hardware or software or firmware components described herein as would be understood by those skilled in the art now and hereafter.
- Furthermore, the embodiments of methods presented and described as flowcharts in this disclosure are provided by way of example in order to provide a more complete understanding of the technology. The disclosed methods are not limited to the operations and logical flow presented herein. Alternative embodiments are contemplated in which the order of the various operations is altered and in which sub-operations described as being part of a larger operation are performed independently.
- While various embodiments have been described for purposes of this disclosure, such embodiments should not be deemed to limit the teaching of this disclosure to those embodiments. Various changes and modifications may be made to the elements and operations described above to obtain a result that remains within the scope of the systems and processes described in this disclosure.
Claims (20)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US18/056,402 US20230077815A1 (en) | 2015-10-26 | 2022-11-17 | System and method for enhanced video image recognition using motion sensors |
Applications Claiming Priority (5)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US201562246324P | 2015-10-26 | 2015-10-26 | |
US15/334,131 US10321208B2 (en) | 2015-10-26 | 2016-10-25 | System and method for enhanced video image recognition using motion sensors |
US16/401,017 US10897659B2 (en) | 2015-10-26 | 2019-05-01 | System and method for enhanced video image recognition using motion sensors |
US17/151,071 US11516557B2 (en) | 2015-10-26 | 2021-01-15 | System and method for enhanced video image recognition using motion sensors |
US18/056,402 US20230077815A1 (en) | 2015-10-26 | 2022-11-17 | System and method for enhanced video image recognition using motion sensors |
Related Parent Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US17/151,071 Continuation US11516557B2 (en) | 2015-10-26 | 2021-01-15 | System and method for enhanced video image recognition using motion sensors |
Publications (1)
Publication Number | Publication Date |
---|---|
US20230077815A1 true US20230077815A1 (en) | 2023-03-16 |
Family
ID=58562253
Family Applications (4)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US15/334,131 Active 2036-12-20 US10321208B2 (en) | 2015-10-26 | 2016-10-25 | System and method for enhanced video image recognition using motion sensors |
US16/401,017 Active US10897659B2 (en) | 2015-10-26 | 2019-05-01 | System and method for enhanced video image recognition using motion sensors |
US17/151,071 Active US11516557B2 (en) | 2015-10-26 | 2021-01-15 | System and method for enhanced video image recognition using motion sensors |
US18/056,402 Pending US20230077815A1 (en) | 2015-10-26 | 2022-11-17 | System and method for enhanced video image recognition using motion sensors |
Family Applications Before (3)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US15/334,131 Active 2036-12-20 US10321208B2 (en) | 2015-10-26 | 2016-10-25 | System and method for enhanced video image recognition using motion sensors |
US16/401,017 Active US10897659B2 (en) | 2015-10-26 | 2019-05-01 | System and method for enhanced video image recognition using motion sensors |
US17/151,071 Active US11516557B2 (en) | 2015-10-26 | 2021-01-15 | System and method for enhanced video image recognition using motion sensors |
Country Status (1)
Country | Link |
---|---|
US (4) | US10321208B2 (en) |
US20110208822A1 (en) | 2010-02-22 | 2011-08-25 | Yogesh Chunilal Rathod | Method and system for customized, contextual, dynamic and unified communication, zero click advertisement and prospective customers search engine |
WO2011101858A1 (en) | 2010-02-22 | 2011-08-25 | Yogesh Chunilal Rathod | A system and method for social networking for managing multidimensional life stream related active note(s) and associated multidimensional active resources & actions |
JP5152231B2 (en) | 2010-03-12 | 2013-02-27 | オムロン株式会社 | Image processing method and image processing apparatus |
JP5047326B2 (en) | 2010-03-31 | 2012-10-10 | 株式会社東芝 | Action determination apparatus, method, and program |
US20110275907A1 (en) | 2010-05-07 | 2011-11-10 | Salvatore Richard Inciardi | Electronic Health Journal |
US9940508B2 (en) | 2010-08-26 | 2018-04-10 | Blast Motion Inc. | Event detection, confirmation and publication system that integrates sensor data and social media |
US8903521B2 (en) | 2010-08-26 | 2014-12-02 | Blast Motion Inc. | Motion capture element |
US9280851B2 (en) | 2010-11-08 | 2016-03-08 | Sony Corporation | Augmented reality system for supplementing and blending data |
US9001886B2 (en) * | 2010-11-22 | 2015-04-07 | Cisco Technology, Inc. | Dynamic time synchronization |
US9213405B2 (en) | 2010-12-16 | 2015-12-15 | Microsoft Technology Licensing, Llc | Comprehension and intent-based content for augmented reality displays |
US9167228B2 (en) | 2012-01-03 | 2015-10-20 | Lawrence Maxwell Monari | Instrumented sports paraphernalia system |
US9423272B2 (en) | 2012-02-17 | 2016-08-23 | Honeywell International Inc. | Estimation of conventional inertial sensor errors with atomic inertial sensor |
US9901815B2 (en) | 2012-03-22 | 2018-02-27 | The Regents Of The University Of California | Devices, systems, and methods for monitoring, classifying, and encouraging activity |
US9257054B2 (en) | 2012-04-13 | 2016-02-09 | Adidas Ag | Sport ball athletic activity monitoring methods and systems |
US20130316840A1 (en) | 2012-05-24 | 2013-11-28 | Gary James Neil Marks | Golf swing grading software system, golf swing component scoring chart and method |
US8929709B2 (en) * | 2012-06-11 | 2015-01-06 | Alpinereplay, Inc. | Automatic digital curation and tagging of action videos |
US10008237B2 (en) | 2012-09-12 | 2018-06-26 | Alpinereplay, Inc. | Systems and methods for creating and enhancing videos |
US9566021B2 (en) | 2012-09-12 | 2017-02-14 | Alpinereplay, Inc. | Systems and methods for synchronized display of athletic maneuvers |
US10408857B2 (en) | 2012-09-12 | 2019-09-10 | Alpinereplay, Inc. | Use of gyro sensors for identifying athletic maneuvers |
US10548514B2 (en) | 2013-03-07 | 2020-02-04 | Alpinereplay, Inc. | Systems and methods for identifying and characterizing athletic maneuvers |
US9060682B2 (en) | 2012-10-25 | 2015-06-23 | Alpinereplay, Inc. | Distributed systems and methods to measure and process sport motions |
US9476730B2 (en) * | 2014-03-18 | 2016-10-25 | Sri International | Real-time system for multi-modal 3D geospatial mapping, object recognition, scene annotation and analytics |
US9196039B2 (en) | 2014-04-01 | 2015-11-24 | Gopro, Inc. | Image sensor read window adjustment for multi-camera array tolerance |
US10321208B2 (en) | 2015-10-26 | 2019-06-11 | Alpinereplay, Inc. | System and method for enhanced video image recognition using motion sensors |
2016
- 2016-10-25 US US15/334,131 patent/US10321208B2/en active Active

2019
- 2019-05-01 US US16/401,017 patent/US10897659B2/en active Active

2021
- 2021-01-15 US US17/151,071 patent/US11516557B2/en active Active

2022
- 2022-11-17 US US18/056,402 patent/US20230077815A1/en active Pending
Patent Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20100287159A1 (en) * | 2000-11-21 | 2010-11-11 | Aol Inc. | Methods and systems for enhancing metadata |
US20110071792A1 (en) * | 2009-08-26 | 2011-03-24 | Cameron Miner | Creating and viewing multimedia content from data of an individual's performance in a physical activity |
US8731239B2 (en) * | 2009-12-09 | 2014-05-20 | Disney Enterprises, Inc. | Systems and methods for tracking objects under occlusion |
US20130278727A1 (en) * | 2010-11-24 | 2013-10-24 | Stergen High-Tech Ltd. | Method and system for creating three-dimensional viewable video from a single video stream |
US8648919B2 (en) * | 2011-06-06 | 2014-02-11 | Apple Inc. | Methods and systems for image stabilization |
US20140320681A1 (en) * | 2011-06-06 | 2014-10-30 | Apple Inc. | Correcting rolling shutter using image stabilization |
US20170099441A1 (en) * | 2015-10-05 | 2017-04-06 | Woncheol Choi | Virtual flying camera system |
Also Published As
Publication number | Publication date |
---|---|
US20170118539A1 (en) | 2017-04-27 |
US20210136466A1 (en) | 2021-05-06 |
US10321208B2 (en) | 2019-06-11 |
US10897659B2 (en) | 2021-01-19 |
US20190261065A1 (en) | 2019-08-22 |
US11516557B2 (en) | 2022-11-29 |
Similar Documents
Publication | Title |
---|---|
US11516557B2 (en) | System and method for enhanced video image recognition using motion sensors |
US10721439B1 (en) | Systems and methods for directing content generation using a first-person point-of-view device |
US10559324B2 (en) | Media identifier generation for camera-captured media |
US9554160B2 (en) | Multi-angle video editing based on cloud video sharing |
US9792951B2 (en) | Systems and methods for identifying potentially interesting events in extended recordings |
US10402445B2 (en) | Apparatus and methods for manipulating multicamera content using content proxy |
Pettersen et al. | Soccer video and player position dataset |
US8692885B2 (en) | Method and apparatus for capture and distribution of broadband data |
RU2617691C2 (en) | Automatic digital collection and marking of dynamic video images |
EP3384495B1 (en) | Processing of multiple media streams |
CN105262942B (en) | Distributed automatic image and video processing |
US20160225410A1 (en) | Action camera content management system |
US20160337718A1 (en) | Automated video production from a plurality of electronic devices |
US10334217B2 (en) | Video sequence assembly |
US9787862B1 (en) | Apparatus and methods for generating content proxy |
US20180103197A1 (en) | Automatic Generation of Video Using Location-Based Metadata Generated from Wireless Beacons |
US9871994B1 (en) | Apparatus and methods for providing content context using session metadata |
CN112287771A (en) | Method, apparatus, server and medium for detecting video event |
JP2022140458A (en) | Information processing device, information processing method, and program |
WO2018147089A1 (en) | Information processing device and method |
JP6089865B2 (en) | Information processing apparatus, display method and program in information processing apparatus |
US20240214543A1 (en) | Multi-camera multiview imaging with fast and accurate synchronization |
JP6146108B2 (en) | Information processing apparatus, image display system, image display method, and program |
JP6296182B2 (en) | Information processing apparatus, display method and program in information processing apparatus |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
AS | Assignment |
Owner name: ALPINEREPLAY, INC., CALIFORNIA Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:LOKSHIN, DAVID J.;LOKSHIN, ANATOLE M.;ROBERTS-THOMSON, CLAIRE LOUISE;SIGNING DATES FROM 20190214 TO 20190220;REEL/FRAME:061811/0100 |
|
STPP | Information on status: patent application and granting procedure in general |
Free format text: SENT TO CLASSIFICATION CONTRACTOR |
|
STPP | Information on status: patent application and granting procedure in general |
Free format text: NON FINAL ACTION MAILED |
|
STPP | Information on status: patent application and granting procedure in general |
Free format text: RESPONSE TO NON-FINAL OFFICE ACTION ENTERED AND FORWARDED TO EXAMINER |
|
STPP | Information on status: patent application and granting procedure in general |
Free format text: EX PARTE QUAYLE ACTION MAILED |
|
STPP | Information on status: patent application and granting procedure in general |
Free format text: RESPONSE TO EX PARTE QUAYLE ACTION ENTERED AND FORWARDED TO EXAMINER |
|
STPP | Information on status: patent application and granting procedure in general |
Free format text: FINAL REJECTION MAILED |
|
STPP | Information on status: patent application and granting procedure in general |
Free format text: RESPONSE AFTER FINAL ACTION FORWARDED TO EXAMINER |
|
STPP | Information on status: patent application and granting procedure in general |
Free format text: ADVISORY ACTION MAILED |