US20130283143A1 - System for Annotating Media Content for Automatic Content Understanding - Google Patents

Info

Publication number
US20130283143A1
US20130283143A1 (Application US13/836,605)
Authority
US
United States
Prior art keywords
gtm
prs
metadata
pad
media stream
Prior art date
Legal status
Abandoned
Application number
US13/836,605
Inventor
Eric David Petajan
David Eugene Weite
Douglas W. Vunic
Current Assignee
JBSHBM, LLC
LiveClips LLC
Original Assignee
Individual
Priority date
Filing date
Publication date
Application filed by Individual filed Critical Individual
Priority to US13/836,605 priority Critical patent/US20130283143A1/en
Priority to BR112014026589A priority patent/BR112014026589A2/en
Priority to PCT/US2013/037545 priority patent/WO2013163066A2/en
Priority to US14/385,989 priority patent/US9659597B2/en
Priority to EP13781985.0A priority patent/EP2842054A4/en
Priority to CA2870454A priority patent/CA2870454A1/en
Priority to MX2014012970A priority patent/MX339009B/en
Assigned to LIVECLIPS LLC reassignment LIVECLIPS LLC ASSIGNMENT OF ASSIGNORS INTEREST (SEE DOCUMENT FOR DETAILS). Assignors: VUNIC, DOUGLAS W., PETAJAN, ERIC DAVID, WEITE, DAVID EUGENE
Publication of US20130283143A1 publication Critical patent/US20130283143A1/en
Assigned to DIRECTV INVESTMENTS, INC. reassignment DIRECTV INVESTMENTS, INC. SECURITY AGREEMENT Assignors: LIVECLIPS LLC
Priority to US14/186,163 priority patent/US9367745B2/en
Assigned to LIVECLIPS LLC reassignment LIVECLIPS LLC MERGER (SEE DOCUMENT FOR DETAILS). Assignors: LIVECLIPS LLC
Priority to CO14244442A priority patent/CO7121323A2/en
Priority to US15/170,460 priority patent/US10491961B2/en
Priority to US15/491,031 priority patent/US10056112B2/en
Assigned to JBSHBM, LLC reassignment JBSHBM, LLC ASSIGNMENT OF ASSIGNORS INTEREST (SEE DOCUMENT FOR DETAILS). Assignors: BUCHHEIT, BRIAN K., MCGHIE, SEAN I.
Priority to US16/044,084 priority patent/US10381045B2/en
Priority to US16/457,113 priority patent/US10553252B2/en
Status: Abandoned

Classifications

    • G06F17/241 Annotation, e.g. comment data or footnotes
    • G11B27/036 Electronic editing of digitised analogue information signals, e.g. audio or video signals; insert-editing
    • G06F16/48 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F40/169 Annotation, e.g. comment data or footnotes
    • G11B27/19 Indexing; addressing; timing or synchronising by using information detectable on the record carrier
    • G11B27/28 Indexing; addressing; timing or synchronising by using information signals recorded by the same method as the main recording
    • H04N21/23418 Processing of video elementary streams involving operations for analysing video streams, e.g. detecting features or characteristics
    • H04N21/23424 Processing of video elementary streams involving splicing one content stream with another, e.g. for inserting or substituting an advertisement
    • H04N21/84 Generation or processing of descriptive data, e.g. content descriptors
    • H04N21/854 Content authoring

Abstract

A system for annotating frames in a media stream includes a pattern recognition system (PRS) to generate PRS output metadata for a frame; an archive for storing ground truth metadata (GTM); a device to merge the GTM and PRS output metadata and thereby generate proposed annotation data (PAD); and a user interface for use by a human annotator (HA). The user interface includes an editor and an input device used by the HA to approve GTM for the frame. An optimization system receives the approved GTM and metadata output by the PRS, and adjusts input parameters for the PRS to minimize a distance metric corresponding to the difference between the GTM and the PRS output metadata.

Description

    CROSS REFERENCE TO RELATED PATENT APPLICATION
  • This patent application claims the benefit of the filing date of U.S. Provisional Patent Application Ser. No. 61/637,344, titled “System for Annotating Media Content for Improved Automatic Content Understanding Performance,” by Petajan et al., filed on Apr. 24, 2012. The disclosure of U.S. Ser. No. 61/637,344 is incorporated by reference herein in its entirety.
  • FIELD OF THE DISCLOSURE
  • This disclosure relates to media presentations (e.g. live sports events), and more particularly to a system for improving performance by generating annotations for the media stream.
  • BACKGROUND OF THE DISCLOSURE
  • A media presentation, such as a broadcast of an event, may be understood as a stream of audio/video frames (live media stream). It is desirable to add information to the media stream to enhance the viewer's experience; this is generally referred to as annotating the media stream. The annotation of a media stream is a tedious and time-consuming task for a human. Visual inspection of text, players, balls, and field/court position is mentally taxing and error prone. Keyboard and mouse entry are needed to enter annotation data but are also error prone and mentally taxing. Accordingly, systems have been developed to at least partially automate the annotation process.
  • Pattern Recognition Systems (PRS), e.g. computer vision or Automatic Speech Recognition (ASR), process media streams in order to generate meaningful metadata. Recognition systems operating on natural media streams always perform with less than absolute accuracy due to the presence of noise. Computer Vision (CV) is notoriously error prone, and ASR is usable only under constrained conditions. The measurement of system accuracy requires knowledge of the correct PRS result, referred to here as Ground Truth Metadata (GTM). The development of a PRS requires the generation of GTM, which must be validated by Human Annotators (HA). GTM can consist of positions in space or time, labeled features, events, text, region boundaries, or any data with a unique label that allows referencing and comparison.
  • A compilation of acronyms used herein is appended to this Specification.
  • There remains a need for a system that can reduce the human time and effort required to create the GTM.
  • SUMMARY OF THE DISCLOSURE
  • We refer to a system for labeling features in a given frame of video (or audio), or events at a given point in time, as a Media Stream Annotator (MSA). If it is accurate enough, a given PRS can automatically generate metadata from the media streams that reduces the human time and effort required to create the GTM. According to an aspect of the disclosure, an MSA system and process, with a Human-Computer Interface (HCI), provides more efficient GTM generation and PRS input parameter adjustment.
  • GTM is used to verify PRS accuracy and adjust PRS input parameters or to guide algorithm development for optimal recognition accuracy. The GTM can be generated at low levels of detail in space and time, or at higher levels as events or states with start times and durations that may be imprecise compared to low-level video frame timing.
  • Adjustments to PRS input parameters that are designed to be static during a program should be applied to all sections of a program with associated GTM in order to maximize the average recognition accuracy and not just the accuracy of the given section or video frame. If the MSA processes live media, the effect of any automated PRS input parameter adjustments must be measured on all sections with (past and present) GTM before committing the changes for generation of final production output.
  • A system embodying the disclosure may be applied to both live and archived media programs and has the following features:
      • Random access into a given frame or section of the archived media stream and associated metadata
      • Real-time display or graphic overlay of PRS-generated metadata on or near video frame display
      • Single click approval of conversion of Proposed Annotation Data (PAD) into GTM
      • PRS recomputes all metadata when GTM changes
      • Merge metadata from 3rd parties with human annotations
      • Graphic overlay of compressed and decoded metadata on or near decoded low bit-rate video to enable real-time operation on mobile devices and consumer-grade internet connections
  • The foregoing has outlined, rather broadly, the preferred features of the present disclosure so that those skilled in the art may better understand the detailed description of the disclosure that follows. Additional features of the disclosure will be described hereinafter that form the subject of the claims of the disclosure. Those skilled in the art should appreciate that they can readily use the disclosed conception and specific embodiment as a basis for designing or modifying other structures for carrying out the same purposes of the present disclosure and that such other structures do not depart from the spirit and scope of the disclosure in its broadest form.
  • BRIEF DESCRIPTION OF THE DRAWINGS
  • FIG. 1 is a schematic illustration of the Media Stream Annotator (MSA), according to an embodiment of the disclosure.
  • FIG. 2 is a schematic illustration of the Media Annotator flow chart during Third Party Metadata (TPM) ingest, according to an embodiment of the disclosure.
  • FIG. 3 is a schematic illustration of the Media Annotator flow chart during Human Annotation, according to an embodiment of the disclosure.
  • FIG. 4 is a schematic illustration of a football miniboard, according to an embodiment of the disclosure.
  • DETAILED DESCRIPTION
  • The accuracy of any PRS depends on the application of constraints that reduce the number or range of possible results. These constraints can take the form of a priori information, physical and logical constraints, or partial recognition results with high reliability. A priori information for sports includes the type of sport, stadium architecture and location, date and time, teams, players, broadcaster, language, and the media ingest process (e.g., original A/V resolution and transcoding). Physical constraints include camera inertia, camera mount type, lighting, and the physics of players, balls, equipment, courts, fields, and boundaries. Logical constraints include the rules of the game, sports production methods, uniform colors and patterns, and scoreboard operation. Some information can be reliably extracted from the media stream with minimal a priori information and can be used to “bootstrap” subsequent recognition processes. For example, the presence of the graphical miniboard overlaid on the game video (shown in FIG. 4) can be detected with knowledge of only the sport and the broadcaster (e.g., ESPN, FOX Sports, etc.).
  • If a live media sporting event is processed in real time, only the current and past media streams are available for pattern recognition and metadata generation. A recorded sporting event can be processed with access to any frame in the entire program. The PRS processing a live event can become more accurate as time progresses since more information is available over time, while any frame from a recorded event can be analyzed repeatedly from the past or the future until maximum accuracy is achieved.
  • The annotation of a media stream is a tedious and time-consuming task for a human. Visual inspection of text, players, balls, and field/court position is mentally taxing and error prone. Keyboard and mouse entry are needed to enter annotation data but are also error prone and mentally taxing. Human annotation productivity (speed and accuracy) is greatly improved by properly displaying available automatically generated Proposed Annotation Data (PAD) and thereby minimizing the mouse and keyboard input needed to edit and approve the PAD. If the PAD is correct, the Human Annotator (HA) can simultaneously approve the current frame and select the next frame for annotation with only one press of a key or mouse button. The PAD is the current best automatically generated metadata that can be delivered to the user without significant delay. Waiting for the system to maximize the accuracy of the PAD may decrease editing by the HA but will also delay the approval of the given frame.
  • FIG. 1 shows a Media Stream Annotator (MSA) system according to an embodiment of the disclosure. The MSA ingests both live and archived media streams (LMS 114 and AMS 115), and optional Third Party Metadata (TPM) 101 and input from the HA 118. The PAD is derived from a combination of PRS 108 result metadata and TPM 101. Metadata output by PRS 108 is archived in Metadata Archive 109. If the TPM 101 is available during live events the system can convert the TPM 101 to GTM via the Metadata Mapper 102 and then use the Performance Optimization System (POS) 105 to adjust PRS Input Parameters to improve metadata accuracy for both past (AMS 115) and presently ingested media (LMS 114). The PAD Encoder 110 merges GTM with metadata for each media frame and encodes the PAD into a compressed form suitable for transmission to the Human Annotator User Interface (HAUI) 104 via a suitable network, e.g. Internet 103. This information is subsequently decoded and displayed to the HA, in a form the HA can edit, by a Media Stream and PAD Decoder, Display and Editor (MSPDE) 111. The HAUI also includes a Media Stream Navigator (MSN) 117 which the HA uses to select time points in the media stream whose corresponding frames are to be annotated. A low bit-rate version of the media stream is transcoded from the AMS by a Media Transcoder 116 and then transmitted to the HAUI.
  • As GTM is generated by the HA 118 and stored in the GTM Archive 106, the POS 105 compares the PRS 108 output metadata to the GTM and detects significant differences between them. During the design and development of the PRS 108, input parameters are set with initial estimated values that produce accurate results on an example set of media streams and associated GTM. These parameter values are adjusted by the POS 105 until the difference between all GTM and the PRS 108 generated metadata is minimized.
  • During development (as opposed to live production) the POS 105 does not need to operate in real time and exhaustive optimization algorithms may be used. During a live program the POS 105 should operate as fast as possible to improve PRS 108 performance each time new GTM is generated by the HA 118; faster optimization algorithms are therefore used during a live program. The POS 105 is also invoked when new TPM 101 is converted to GTM.
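  • As an illustration of this optimization loop, the following Python sketch performs a simple coordinate-descent search over PRS input parameters, scoring each candidate parameter set against all frames that have associated GTM. The names run_prs and distance are stand-ins (not from the disclosure) for invoking the PRS on a frame and for the distance metric discussed below.

```python
# Hypothetical POS parameter search: coordinate descent over PRS input
# parameters, scoring each candidate against every frame that has GTM.
def optimize_parameters(params, gtm_archive, run_prs, distance, step=0.1):
    def global_distance(p):
        # Measure against all sections with GTM, not just one frame.
        return sum(distance(run_prs(frame, p), gtm)
                   for frame, gtm in gtm_archive.items())

    best = global_distance(params)
    improved = True
    while improved:
        improved = False
        for name in params:
            for delta in (step, -step):
                candidate = dict(params, **{name: params[name] + delta})
                score = global_distance(candidate)
                if score < best:  # keep only strict improvements
                    params, best, improved = candidate, score, True
    return params, best
```

  During live operation a greedy search of this kind would be preferred over exhaustive search, since each evaluation requires re-running the PRS over the archived frames.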
  • The choice of distance metric between PRS 108 output metadata and GTM depends on the type of data and the allowable variation. For example, in a presentation of a football game the score information extracted from the miniboard must be absolutely accurate while the spatial position of a player on the field can vary. If one PRS input parameter affects multiple types of results, then the distance values for each type can be weighted in a linear combination of distances in order to calculate a single distance for a given frame or time segment of the game.
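  • A minimal sketch of such a per-frame distance follows, assuming illustrative field names and weights: an exact-match field such as the miniboard score contributes a heavily weighted 0/1 error, while a spatial position contributes a tolerant Euclidean term.

```python
import math

# Illustrative weights; the disclosure does not specify values.
WEIGHTS = {"score": 10.0, "player_pos": 1.0}

def frame_distance(prs_meta, gtm):
    d = 0.0
    # Miniboard score must be absolutely accurate: any mismatch is penalized.
    if prs_meta["score"] != gtm["score"]:
        d += WEIGHTS["score"]
    # Player position may vary: penalize in proportion to the spatial error.
    (x1, y1), (x2, y2) = prs_meta["player_pos"], gtm["player_pos"]
    d += WEIGHTS["player_pos"] * math.hypot(x1 - x2, y1 - y2)
    return d
```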
  • A variety of TPM 101 (e.g. from stats.com) becomes available after a delay period from the live action, and can be used as GTM either during development or, once the delay period has elapsed, during a live program. Since the TPM is delayed by a non-specific period of time, it must be aligned in time with the program. Alignment can be done manually, or the GTM, the TPM 101, and/or the PRS 108 result metadata can be aligned using fuzzy matching techniques.
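  • One hedged way to implement this alignment, assuming a roughly constant delay: slide a candidate offset over the TPM event times and keep the offset that matches the most PRS-detected event times within a tolerance. The tolerance-based match below is a simple stand-in for the fuzzy matching techniques mentioned above.

```python
def estimate_offset(tpm_times, prs_times, max_delay=60.0, step=0.5, tol=1.0):
    """Return the offset (seconds) aligning the most TPM events to PRS events."""
    best_offset, best_hits, offset = 0.0, -1, 0.0
    while offset <= max_delay:
        # Count TPM events that land within tol seconds of some PRS event.
        hits = sum(any(abs((t + offset) - p) <= tol for p in prs_times)
                   for t in tpm_times)
        if hits > best_hits:
            best_offset, best_hits = offset, hits
        offset += step
    return best_offset
```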
  • The PRS 108 maintains a set of state variables that change over time as models of the environment, players, overlay graphics, cameras, and weather are updated. The arrival of TPM 101 and, in turn, GTM can drive changes to both current and past state variables. If the history of the state variables is not stored persistently, the POS 105 would have to start the media stream from the beginning in order to use the PRS 108 to regenerate metadata using new PRS 108 Input Parameters. The amount of PRS 108 state variable information can be large, and is compressed using State Codec 112 into one or more sequences of Group Of States (GOS) such that a temporal section of PRS States is encoded and decoded as a group for greater compression efficiency and retrieval speed. The GOS is stored in a GOS Archive 113. The number of media frames in a GOS can be as few as one.
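  • The State Codec might be sketched as follows; pickle and zlib are assumptions, since the disclosure does not name a serialization or compression format. A temporal run of per-frame state snapshots is encoded and decoded as one GOS so that a whole section can be restored together.

```python
import pickle
import zlib

def encode_gos(state_snapshots):
    """Compress a list of per-frame PRS state dicts into one GOS blob."""
    return zlib.compress(pickle.dumps(state_snapshots))

def decode_gos(blob):
    """Restore the list of per-frame PRS state dicts from a GOS blob."""
    return pickle.loads(zlib.decompress(blob))

# Example grouping: one GOS per 30 frames.
# gos_archive[start] = encode_gos(states[start:start + 30])
```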
  • If the PRS 108 result metadata is stored persistently, the HA can navigate to a past point in time and immediately retrieve the associated metadata or GTM via the PAD Encoder 110, which formats and compresses the PAD for delivery to the HA 118 over the network.
  • FIG. 2 shows a flow chart for MSA operation, according to an embodiment of the disclosure in which both a live media stream (LMS) and TPM are ingested. All LMS is archived in the AMS (step 201). At system startup, the initial or default values of the GOS are input to the PRS, which then starts processing the LMS in real time (step 202). If the PRS does not have sufficient resources to process every LMS frame, the PRS will skip frames to minimize the latency between a given LMS frame and its associated result Metadata (step 203). Periodically, the internal state variable values of the PRS are encoded into GOS and archived (step 204). Finally, the PRS generates metadata, which is archived (step 205); the process returns to step 201 and the next (or most recent) media frame is ingested. The processing loop 201-205 may iterate indefinitely.
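  • A minimal sketch of this live loop (steps 201-205), with placeholder objects for the frame queue, archives, and PRS: every incoming frame is archived, but when the PRS falls behind it skips ahead to the newest frame so that result latency stays bounded.

```python
def live_loop(frames, ams, prs, gos_archive, metadata_archive,
              snapshot_every=30):
    """frames is assumed to behave like queue.Queue; the rest are stand-ins."""
    processed = 0
    while True:
        frame = frames.get()                 # step 201: ingest and archive
        ams.append(frame)
        while not frames.empty():            # step 203: skip to newest frame
            frame = frames.get()
            ams.append(frame)                # skipped frames are still archived
        metadata = prs.process(frame)        # step 202: run the PRS
        if processed % snapshot_every == 0:  # step 204: archive state as GOS
            gos_archive.store(prs.state())
        metadata_archive.store(frame, metadata)  # step 205: archive metadata
        processed += 1
```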
  • When TPM arrives via the Internet, it is merged with any GTM that exists for that media frame via the Metadata Mapper (step 206). The POS is then notified of the new GTM and generates new sets of PRS Input Parameters, comparing all resulting Metadata to any corresponding GTM for each set until an optimal set of PRS Input Parameters is found that minimizes the global distance between all GTM and the corresponding Metadata (step 207).
  • FIG. 3 shows a flow chart for MSA operation while the HA approves new GTM. This process operates in parallel with the process shown in the flowchart of FIG. 2. The HA must first select a point on the media stream timeline for annotation (step 301). The HA can find a point in time by dragging a graphical cursor on a media player while viewing a low bit-rate version of the media stream transcoded from the AMS (step 302). The Metadata and any existing GTM associated with the selected time point are retrieved from their respective archives 109, 106 and encoded into the PAD (step 303); transmitted with the Media Stream to the HAUI over the Internet (step 304); and presented to the HA via the HAUI after decoding of both the PAD and the low bit-rate Media Stream (step 305). The HAUI displays the PAD on or near the displayed Media Frame (step 306). The HA compares the PAD with the Media Frame and either clicks on an Approve button 107 or corrects the PAD using an editor and then approves it (step 307). After approval of the PAD, the HAUI transmits the corrected and/or approved PAD as new GTM for storage in the GTM Archive (step 308). The POS is then notified of the new GTM and generates new sets of PRS Input Parameters, comparing all resulting Metadata to any corresponding GTM for each set (step 309) until an optimal set of PRS Input Parameters is found that minimizes the global distance between all GTM and the corresponding Metadata (step 310).
  • If the MSA is operating only on the AMS (and not on the LMS), the POS can perform more exhaustive and time consuming algorithms to minimize the distance between GTM and Metadata; the consequence of incomplete or less accurate Metadata is more editing time for the HA. If the MSA is operating on LMS during live production, the POS is constrained to not update the PRS Input Parameters for live production until the Metadata accuracy is maximized.
  • The HA does not need any special skills other than a basic knowledge of the media stream content (e.g. rules of the sporting event) and facility with a basic computer interface. PRS performance depends on the collection of large amounts of GTM to ensure that optimization by the POS will result in optimal PRS performance on new media streams. Accordingly, it is usually advantageous to employ multiple HAs for a given media stream. The pool of available HAs is increased if the HAUI client can communicate with the rest of the system over consumer-grade or mobile internet connections, which have limited capacity. The main consumer of internet capacity is the media stream that is delivered to the HAUI for decoding and display. Fortunately, the bit-rate of the media stream can be greatly lowered, to allow carriage over consumer or mobile internet connections, by transcoding the video to a lower resolution and quality. Much of the bit-rate needed for high-quality compression of sporting events is spent on complex regions in the video, such as views containing the numerous spectators at the event; however, the HA does not need high-quality video of the spectators for annotation. Instead, the HA needs only minimal visual quality for the miniboard, player identification, ball tracking, and field markings, which is easily achieved at a minimal compressed bit-rate.
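  • As one hedged example of producing such a proxy stream (the disclosure does not name a transcoder), ffmpeg can downscale the video and cap its bit-rate so that the miniboard and field markings remain legible over a consumer-grade connection:

```python
import subprocess

def make_proxy(src: str, dst: str) -> None:
    """Transcode src to a low bit-rate proxy suitable for the HAUI."""
    subprocess.run([
        "ffmpeg", "-i", src,
        "-vf", "scale=640:-2",              # lower resolution, keep aspect
        "-c:v", "libx264", "-b:v", "400k",  # modest video bit-rate
        "-c:a", "aac", "-b:a", "48k",       # low bit-rate audio
        dst,
    ], check=True)
```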
  • The PAD is also transmitted to the HAUI, but this information is easily compressed as text, graphical coordinates, geometric objects, color properties or animation data. All PAD can be losslessly compressed using statistical compression techniques (e.g. zip), but animation data can be highly compressed using lossy animation stream codecs such as can be found in the MPEG-4 SNHC standard tools (e.g. Face and Body Animation and 3D Mesh Coding).
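  • The lossless path can be sketched with JSON plus zlib (a zip-style statistical compressor); the PAD field names here are illustrative only.

```python
import json
import zlib

pad = {"miniboard": {"home": 14, "away": 7, "quarter": 2, "clock": "08:42"},
       "players": [{"id": 88, "pos": [312, 190]}]}

blob = zlib.compress(json.dumps(pad).encode("utf-8"))
restored = json.loads(zlib.decompress(blob).decode("utf-8"))
assert restored == pad  # statistical compression is a lossless round trip
```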
  • The display of the transmitted and decoded PAD to the HA is arranged for clearest viewing and comparison between the video and the PAD. For example, as shown in FIG. 4, the miniboard content from the PAD should be displayed below the video frame in its own window pane 402 and vertically aligned with the miniboard in the video 401. PAD content relating to natural (non-graphical) objects in the video should be graphically overlaid on the video.
  • Editing of the PAD by the HA can be done either in the miniboard text window directly for miniboard data or by dragging spatial location data directly on the video into the correct position (e.g. field lines or player IDs). The combined use of low bit-rate, adequate quality video and compressed text, graphics and animation data which is composited on the video results in a HAUI that can be used with low bit-rate internet connections.
  • Referring back to FIG. 1, the Metadata Archive 109 and the GTM Archive 106 are ideally designed and implemented to provide fast in-memory access to metadata while writing archive contents to disk as often as needed to allow fast recovery after system failure (power outage, etc.). In addition to the inherent speed of memory access (vs. disk access), the metadata archives should ideally be architected to provide fast search and data derivation operations. Fast search is needed to find corresponding entries in the GTM 106 and Metadata 109 archives, and to support the asynchronous writes to the GTM Archive 106 from the Metadata Mapper 102. Preferred designs of the archive data structures that support fast search include linked lists and hash tables. Linked lists enable insert edit operations without the need to move blocks of data to accommodate new data. Hash tables provide fast address lookup of sparse datasets.
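  • A minimal in-memory archive along these lines, assuming a snapshot cadence and JSON on-disk format that the disclosure does not specify: a dict (hash table) keyed by timestamp gives fast lookup of sparse entries, and periodic snapshots to disk support recovery after a failure.

```python
import json
import time

class MetadataArchive:
    def __init__(self, path, snapshot_interval=5.0):
        self.path = path
        self.entries = {}  # timestamp -> metadata; hash lookup of sparse keys
        self.interval = snapshot_interval
        self.last_snapshot = time.monotonic()

    def store(self, timestamp, metadata):
        self.entries[timestamp] = metadata
        if time.monotonic() - self.last_snapshot >= self.interval:
            self.snapshot()

    def snapshot(self):
        # Write out as often as needed to allow fast recovery after failure.
        with open(self.path, "w") as f:
            json.dump(self.entries, f)
        self.last_snapshot = time.monotonic()
```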
  • The ingest of TPM 101 requires that the TPM timestamps be aligned with the GTM 106 and Metadata 109 Archive timestamps. This alignment operation may involve multiple passes over all datasets while calculating accumulated distance metrics to guide the alignment. The ingest of multiple overlapping/redundant TPM requires that a policy be established for dealing with conflicting or inconsistent metadata. In case of conflict between TPMs 101, the Metadata Mapper 102 should ideally compare the PRS 108 generated Metadata 109 to the conflicting TPMs 101 if other prior knowledge does not resolve the conflict. If the conflict cannot be reliably resolved, then a confidence value should ideally be established for the given metadata, which is also stored in the GTM 106. Alternatively, conflicting data can be omitted from the GTM 106.
  • The GTM 106 and Metadata 109 Archives should ideally contain processes for efficiently performing common operations on the archives. For example, if the time base of the metadata needs adjustment, an internal archive process could adjust each timestamp in the whole archive without impacting other communication channels, or tying up other processing resources.
  • An example of TPM is the game clock from a live sporting event. TPM game clocks typically consist of an individual message for each tick/second of the clock containing the clock value. The delay between the live clock value at the sports venue and the delivered clock value message can be seconds or tens of seconds, with variation. The PRS recognizes the clock from the live video feed, and the start time of the game is published in advance. The Metadata Mapper 102 should use all of this information to accurately align the TPM clock ticks with the time base of the GTM 106 and Metadata 109 Archives. At the beginning of the game, there might not be enough data to determine this alignment very accurately, but as time moves forward, more metadata is accumulated and past alignments can be updated to greater accuracy.
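  • A hedged sketch of this clock alignment: pair up TPM and PRS messages that carry the same clock reading and take the median difference of their arrival times as the current offset estimate, re-running the estimate as more ticks accumulate.

```python
import statistics

def clock_offset(tpm_ticks, prs_ticks):
    """tpm_ticks and prs_ticks map clock values (e.g. "08:42") to arrival
    times in seconds; returns the estimated TPM delay in seconds."""
    common = tpm_ticks.keys() & prs_ticks.keys()
    deltas = [tpm_ticks[v] - prs_ticks[v] for v in common]
    return statistics.median(deltas) if deltas else 0.0
```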
  • Another desirable feature of the GTM 106 and Metadata 109 archives is the ability to virtually repopulate the archives as an emulation of replaying of the original ingest and processing of the TPM. This emulation feature is useful for system tuning and debugging.
  • While the disclosure has been described in terms of specific embodiments, it is evident in view of the foregoing description that numerous alternatives, modifications and variations will be apparent to those skilled in the art. Accordingly, the disclosure is intended to encompass all such alternatives, modifications and variations which fall within the scope and spirit of the disclosure and the following claims.
  • COMPILATION OF ACRONYMS
    • AMS Archived Media Stream
    • ASR Automatic Speech Recognition
    • CV Computer Vision
    • GOS Group Of States
    • GTM Ground Truth Metadata
    • HA Human Annotator
    • HAUI Human Annotator User Interface
    • HCI Human Computer Interface
    • LMS Live Media Stream
    • MSA Media Stream Annotator
    • MSN Media Stream Navigator
    • MSPDE Media Stream and PAD Decoder, Display and Editor
    • PAD Proposed Annotation Data
    • POS Performance Optimization System
    • PRS Pattern Recognition System
    • TPM Third Party Metadata

Claims (18)

We claim:
1. A system to annotate media content, comprising:
a pattern recognition system (PRS) having an initial set of input parameters that generates PRS output metadata associated with a frame of a media stream;
an archive for storing ground truth metadata (GTM) associated with the same frame of the media stream;
a device to merge the GTM and the PRS output metadata and thereby generate proposed annotation data (PAD); and
a user interface for use by a human annotator (HA) including
an editor and
an input device to approve or edit the PAD for the frame; and
an optimization system to adjust input parameters for the PRS to minimize a distance metric corresponding to a difference between the GTM and PRS output metadata.
2. The system of claim 1 wherein the GTM is obtained from one or more of third party metadata, archived media stream and the HA.
3. The system of claim 2 wherein a time delay between third party metadata and the media stream is corrected by alignment.
4. The system of claim 2 including a communication network to enable a plurality of HAs to interface with the same media stream.
5. The system of claim 2 wherein when the PAD is approved it is converted to GTM.
6. The system of claim 5 wherein when the PAD is approved, it is graphically overlayed on the media stream.
7. The system of claim 1 wherein the optimization system adjusts the PRS initial set of input parameters to minimize the difference between the GTM and PRS output metadata thereby increasing accuracy.
8. The system of claim 1 wherein the PRS includes a set of state variables stored as a temporal group adjustable as a group in response to GTM.
9. A method comprising:
receiving data from a media stream, the data organized into frames;
processing the data using a pattern recognition system (PRS);
storing a state of the PRS;
generating metadata associated with the frame using the PRS;
receiving input, characterized as ground truth metadata (GTM), into an optimization system; and
adjusting input parameters for the PRS to minimize a distance metric corresponding to a difference between the GTM and PRS output metadata.
10. The method of claim 9 wherein said input is obtained from one or more of archived media streams, third party metadata and one or more human annotators.
11. The method of claim 10 wherein subsequent to receiving said input, said GTM and said metadata associated with said PRS are temporally aligned.
12. The method of claim 10 wherein said GTM and said metadata associated with said PRS are continuously stored in memory and periodically stored to disk, thereby enabling fast recovery from system failure.
13. A method comprising:
receiving from a human annotator (HA), via a human annotator user interface (HAUI), information regarding a time point selected by the HA on a timeline of the media stream;
merging existing ground truth metadata (GTM) relating to a media frame corresponding to the selected time point with pattern recognition system (PRS) output metadata relating to said media frame, thereby generating proposed annotation data (PAD) for the media frame;
displaying the media frame and the PAD to the HA;
receiving input from the HA including correction and/or approval of the PAD, where approved PAD is characterized as new GTM related to the selected time point;
storing the new GTM;
comparing the PRS output metadata and the new GTM related to the selected time point; and
adjusting PRS input parameters so that a distance metric corresponding to a difference between the new GTM and PRS output metadata related to the selected time point is minimized.
14. The method of claim 13 wherein said GTM is obtained from one or more of archived media streams, third party metadata, said human annotator, and other human annotators.
15. The method of claim 14 wherein when said human annotator approves said PAD, said PAD is graphically overlaid on said media stream.
16. A method comprising:
generating, by a pattern recognition system (PRS), output metadata associated with a frame of a media stream;
storing in an archive input from a human annotator (HA) related to the frame, characterized as ground truth metadata (GTM);
merging the GTM and the PRS output metadata to thereby generate proposed annotation data (PAD); and
displaying the PAD to the HA by a user interface;
receiving via the user interface an input from the HA indicating approval of the GTM for the frame; and
adjusting input parameters for the PRS using an optimization system, to minimize a distance metric corresponding to a difference between the GTM and the PRS output metadata.
17. The method of claim 16 wherein said GTM is obtained from one or more of archived media streams, third party metadata, said human annotator, and other human annotators.
18. The method of claim 17 wherein when said human annotator approves said PAD, said PAD is graphically overlaid on said media stream.
US13/836,605 · Priority date 2012-04-24 · Filing date 2013-03-15 · System for Annotating Media Content for Automatic Content Understanding · Abandoned · US20130283143A1 (en)

Priority Applications (13)

Application Number Priority Date Filing Date Title
US13/836,605 US20130283143A1 (en) 2012-04-24 2013-03-15 System for Annotating Media Content for Automatic Content Understanding
MX2014012970A MX339009B (en) 2012-04-24 2013-04-22 System for annotating media content for automatic content understanding.
PCT/US2013/037545 WO2013163066A2 (en) 2012-04-24 2013-04-22 System for annotating media content for automatic content understanding
US14/385,989 US9659597B2 (en) 2012-04-24 2013-04-22 Annotating media content for automatic content understanding
EP13781985.0A EP2842054A4 (en) 2012-04-24 2013-04-22 Annotating media content for automatic content understanding
BR112014026589A BR112014026589A2 (en) 2012-04-24 2013-04-22 system for annotating media content for automatic content understanding
CA2870454A CA2870454A1 (en) 2012-04-24 2013-04-22 System for annotating media content for automatic content understanding
US14/186,163 US9367745B2 (en) 2012-04-24 2014-02-21 System for annotating media content for automatic content understanding
CO14244442A CO7121323A2 (en) 2012-04-24 2014-11-05 Media content annotation for automatic content understanding
US15/170,460 US10491961B2 (en) 2012-04-24 2016-06-01 System for annotating media content for automatic content understanding
US15/491,031 US10056112B2 (en) 2012-04-24 2017-04-19 Annotating media content for automatic content understanding
US16/044,084 US10381045B2 (en) 2012-04-24 2018-07-24 Annotating media content for automatic content understanding
US16/457,113 US10553252B2 (en) 2012-04-24 2019-06-28 Annotating media content for automatic content understanding

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US201261637344P 2012-04-24 2012-04-24
US13/836,605 US20130283143A1 (en) 2012-04-24 2013-03-15 System for Annotating Media Content for Automatic Content Understanding

Related Child Applications (4)

Application Number Title Priority Date Filing Date
PCT/US2013/037545 Continuation-In-Part WO2013163066A2 (en) 2012-04-24 2013-04-22 System for annotating media content for automatic content understanding
US14/385,989 Continuation US9659597B2 (en) 2012-04-24 2013-04-22 Annotating media content for automatic content understanding
US14/385,989 Continuation-In-Part US9659597B2 (en) 2012-04-24 2013-04-22 Annotating media content for automatic content understanding
US14/186,163 Continuation-In-Part US9367745B2 (en) 2012-04-24 2014-02-21 System for annotating media content for automatic content understanding

Publications (1)

Publication Number Publication Date
US20130283143A1 true US20130283143A1 (en) 2013-10-24

Family

ID=49381315

Family Applications (5)

Application Number Title Priority Date Filing Date
US13/836,605 Abandoned US20130283143A1 (en) 2012-04-24 2013-03-15 System for Annotating Media Content for Automatic Content Understanding
US14/385,989 Active US9659597B2 (en) 2012-04-24 2013-04-22 Annotating media content for automatic content understanding
US15/491,031 Active US10056112B2 (en) 2012-04-24 2017-04-19 Annotating media content for automatic content understanding
US16/044,084 Active US10381045B2 (en) 2012-04-24 2018-07-24 Annotating media content for automatic content understanding
US16/457,113 Active US10553252B2 (en) 2012-04-24 2019-06-28 Annotating media content for automatic content understanding

Family Applications After (4)

Application Number Title Priority Date Filing Date
US14/385,989 Active US9659597B2 (en) 2012-04-24 2013-04-22 Annotating media content for automatic content understanding
US15/491,031 Active US10056112B2 (en) 2012-04-24 2017-04-19 Annotating media content for automatic content understanding
US16/044,084 Active US10381045B2 (en) 2012-04-24 2018-07-24 Annotating media content for automatic content understanding
US16/457,113 Active US10553252B2 (en) 2012-04-24 2019-06-28 Annotating media content for automatic content understanding

Country Status (7)

Country Link
US (5) US20130283143A1 (en)
EP (1) EP2842054A4 (en)
BR (1) BR112014026589A2 (en)
CA (1) CA2870454A1 (en)
CO (1) CO7121323A2 (en)
MX (1) MX339009B (en)
WO (1) WO2013163066A2 (en)

Cited By (16)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20140223279A1 (en) * 2013-02-07 2014-08-07 Cherif Atia Algreatly Data augmentation with real-time annotations
US20140229579A1 (en) * 2013-02-12 2014-08-14 Unicorn Media, Inc. Cloud-based video delivery
WO2015126830A1 (en) * 2014-02-21 2015-08-27 Liveclips Llc System for annotating media content for automatic content understanding
US9141860B2 (en) 2008-11-17 2015-09-22 Liveclips Llc Method and system for segmenting and transmitting on-demand live-action video in real-time
US9141859B2 (en) 2008-11-17 2015-09-22 Liveclips Llc Method and system for segmenting and transmitting on-demand live-action video in real-time
US9367745B2 (en) 2012-04-24 2016-06-14 Liveclips Llc System for annotating media content for automatic content understanding
US9659597B2 (en) 2012-04-24 2017-05-23 Liveclips Llc Annotating media content for automatic content understanding
CN109874047A (en) * 2017-12-04 2019-06-11 Tencent Technology (Shenzhen) Co., Ltd. Live broadcast interaction method, apparatus and system
US10609398B2 (en) * 2017-07-28 2020-03-31 Black Sesame International Holding Limited Ultra-low bitrate coding based on 3D map reconstruction and decimated sub-pictures
WO2020154556A1 (en) * 2019-01-25 2020-07-30 Gracenote, Inc. Methods and systems for extracting sport-related information from digital video frames
WO2020154553A1 (en) * 2019-01-25 2020-07-30 Gracenote, Inc. Methods and systems for sport data extraction
US10762675B2 (en) * 2016-12-12 2020-09-01 Facebook, Inc. Systems and methods for interactive broadcasting
US11010627B2 (en) 2019-01-25 2021-05-18 Gracenote, Inc. Methods and systems for scoreboard text region detection
US11036995B2 (en) 2019-01-25 2021-06-15 Gracenote, Inc. Methods and systems for scoreboard region detection
US11087161B2 (en) 2019-01-25 2021-08-10 Gracenote, Inc. Methods and systems for determining accuracy of sport-related information extracted from digital video frames
US11895369B2 (en) * 2017-08-28 2024-02-06 Dolby Laboratories Licensing Corporation Media-aware navigation metadata

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10431208B2 (en) * 2015-06-01 2019-10-01 Sinclair Broadcast Group, Inc. Content presentation analytics and optimization
US10765954B2 (en) 2017-06-15 2020-09-08 Microsoft Technology Licensing, Llc Virtual event broadcasting
CN110347866B (en) * 2019-07-05 2023-06-23 Lenovo (Beijing) Co., Ltd. Information processing method, information processing device, storage medium and electronic equipment
EP3961430A1 (en) * 2020-08-07 2022-03-02 Rawbet GmbH Recognition of visual information in moving visual media
CN112527374A (en) * 2020-12-11 2021-03-19 Beijing Baidu Netcom Science and Technology Co., Ltd. Annotation tool generation method, annotation method, apparatus, device and storage medium

Citations (51)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20020059644A1 (en) * 2000-04-24 2002-05-16 Andrade David De Method and system for automatic insertion of interactive TV triggers into a broadcast data stream
US20020099695A1 (en) * 2000-11-21 2002-07-25 Abajian Aram Christian Internet streaming media workflow architecture
US20020184020A1 (en) * 2001-03-13 2002-12-05 Nec Corporation Speech recognition apparatus
US20030083879A1 (en) * 2001-10-31 2003-05-01 James Cyr Dynamic insertion of a speech recognition engine within a distributed speech recognition system
US20030152277A1 (en) * 2002-02-13 2003-08-14 Convey Corporation Method and system for interactive ground-truthing of document images
US20030177503A1 (en) * 2000-07-24 2003-09-18 Sanghoon Sull Method and apparatus for fast metadata generation, delivery and access for live broadcast program
US20040021685A1 (en) * 2002-07-30 2004-02-05 Fuji Xerox Co., Ltd. Systems and methods for filtering and/or viewing collaborative indexes of recorded media
US6763069B1 (en) * 2000-07-06 2004-07-13 Mitsubishi Electric Research Laboratories, Inc Extraction of high-level features from low-level features of multimedia content
US20040143604A1 (en) * 2003-01-21 2004-07-22 Steve Glenner Random access editing of media
US20040168118A1 (en) * 2003-02-24 2004-08-26 Wong Curtis G. Interactive media frame display
US20040205482A1 (en) * 2002-01-24 2004-10-14 International Business Machines Corporation Method and apparatus for active annotation of multimedia content
US20040267698A1 (en) * 2003-06-26 2004-12-30 Mitsutoshi Shinkai Information processing apparatus and method, program, and recording medium
US20060031236A1 (en) * 2004-08-04 2006-02-09 Kabushiki Kaisha Toshiba Data structure of metadata and reproduction method of the same
US20060136205A1 (en) * 2004-12-21 2006-06-22 Song Jianming J Method of refining statistical pattern recognition models and statistical pattern recognizers
US20060161867A1 (en) * 2003-01-21 2006-07-20 Microsoft Corporation Media frame object visualization system
US20070245400A1 (en) * 1998-11-06 2007-10-18 Seungyup Paek Video description system and method
US20070256016A1 (en) * 2006-04-26 2007-11-01 Bedingfield James C Sr Methods, systems, and computer program products for managing video information
US20070277092A1 (en) * 2006-05-24 2007-11-29 Basson Sara H Systems and methods for augmenting audio/visual broadcasts with annotations to assist with perception and interpretation of broadcast content
US20080154908A1 (en) * 2006-12-22 2008-06-26 Google Inc. Annotation Framework for Video
US20080270338A1 (en) * 2006-08-14 2008-10-30 Neural Id Llc Partition-Based Pattern Recognition System
US20080317286A1 (en) * 2007-06-20 2008-12-25 Sony United Kingdom Limited Security device and system
US20090055419A1 (en) * 2007-08-21 2009-02-26 AT&T Labs, Inc. Method and system for content resyndication
US20090097815A1 (en) * 2007-06-18 2009-04-16 Lahr Nils B System and method for distributed and parallel video editing, tagging, and indexing
US20090106297A1 (en) * 2007-10-18 2009-04-23 David Howell Wright Methods and apparatus to create a media measurement reference database from a plurality of distributed sources
US20090164462A1 (en) * 2006-05-09 2009-06-25 Koninklijke Philips Electronics N.V. Device and a method for annotating content
US20090249387A1 (en) * 2008-03-31 2009-10-01 Microsoft Corporation Personalized Event Notification Using Real-Time Video Analysis
US20090265617A1 (en) * 2005-10-25 2009-10-22 Sonic Solutions, A California Corporation Methods and systems for use in maintaining media data quality upon conversion to a different data format
US20090319885A1 (en) * 2008-06-23 2009-12-24 Brian Scott Amento Collaborative annotation of multimedia content
US7742921B1 (en) * 2005-09-27 2010-06-22 AT&T Intellectual Property II, L.P. System and method for correcting errors when generating a TTS voice
US7773670B1 (en) * 2001-06-05 2010-08-10 AT&T Intellectual Property II, L.P. Method of content adaptive video encoding
US20100287473A1 (en) * 2006-01-17 2010-11-11 Arthur Recesso Video analysis tool systems and methods
US20100293187A1 (en) * 2007-06-22 2010-11-18 Bayerische Medientechnik Gmbh System and method for broadcast media tagging
US20100332809A1 (en) * 2009-06-26 2010-12-30 Micron Technology Inc. Methods and Devices for Saving and/or Restoring a State of a Pattern-Recognition Processor
US20110040760A1 (en) * 2009-07-16 2011-02-17 Bluefin Lab, Inc. Estimating Social Interest in Time-based Media
US20110143811A1 (en) * 2009-08-17 2011-06-16 Rodriguez Tony F Methods and Systems for Content Processing
US20110145182A1 (en) * 2009-12-15 2011-06-16 Micron Technology, Inc. Adaptive content inspection
US20110288862A1 (en) * 2010-05-18 2011-11-24 Ognjen Todic Methods and Systems for Performing Synchronization of Audio with Corresponding Textual Transcriptions and Determining Confidence Values of the Synchronization
US20120123777A1 (en) * 2008-04-24 2012-05-17 Nuance Communications, Inc. Adjusting a speech engine for a mobile computing device based on background noise
US20120147264A1 (en) * 2007-01-19 2012-06-14 International Business Machines Corporation Method for the semi-automatic editing of timed and annotated data
US20120159290A1 (en) * 2010-12-17 2012-06-21 Microsoft Corporation Validation analysis of human target
US20120192227A1 (en) * 2011-01-21 2012-07-26 Bluefin Labs, Inc. Cross Media Targeted Message Synchronization
US20120215329A1 (en) * 2011-02-22 2012-08-23 Dolby Laboratories Licensing Corporation Alignment and Re-Association of Metadata for Media Streams Within a Computing Device
US20120215903A1 (en) * 2011-02-18 2012-08-23 Bluefin Lab, Inc. Generating Audience Response Metrics and Ratings From Social Interest In Time-Based Media
US20120257875A1 (en) * 2008-01-11 2012-10-11 Bruce Sharpe Temporal alignment of video recordings
US20120303643A1 (en) * 2011-05-26 2012-11-29 Raymond Lau Alignment of Metadata
US20130014155A1 (en) * 2011-06-14 2013-01-10 Douglas Clarke System and method for presenting content with time based metadata
US20130124984A1 (en) * 2010-04-12 2013-05-16 David A. Kuspa Method and Apparatus for Providing Script Data
US20130263166A1 (en) * 2012-03-27 2013-10-03 Bluefin Labs, Inc. Social Networking System Targeted Message Synchronization
US20130294746A1 (en) * 2012-05-01 2013-11-07 Wochit, Inc. System and method of generating multimedia content
US20150139610A1 (en) * 2013-11-15 2015-05-21 Clipmine, Inc. Computer-assisted collaborative tagging of video content for indexing and table of contents generation
US20150227849A1 (en) * 2009-12-07 2015-08-13 Yahoo! Inc. Method and System for Invariant Pattern Recognition

Family Cites Families (78)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5091780A (en) 1990-05-09 1992-02-25 Carnegie-Mellon University A trainable security system and method for the same
WO1992000654A1 (en) 1990-06-25 1992-01-09 Barstow David R A method for encoding and broadcasting information about live events using computer simulation and pattern matching techniques
US5189630A (en) 1991-01-15 1993-02-23 Barstow David R Method for encoding and broadcasting information about live events using computer pattern matching techniques
US7373587B1 (en) 1990-06-25 2008-05-13 Barstow David R Representing sub-events with physical exertion actions
WO1995010915A1 (en) 1993-10-12 1995-04-20 Orad, Inc. Sports event video
US5539454A (en) 1995-02-06 1996-07-23 The United States Of America As Represented By The Administrator, National Aeronautics And Space Administration Video event trigger and tracking system using fuzzy comparators
US7055166B1 (en) 1996-10-03 2006-05-30 Gotuit Media Corp. Apparatus and methods for broadcast monitoring
US5892536A (en) 1996-10-03 1999-04-06 Personal Audio Systems and methods for computer enhanced broadcast monitoring
US7058376B2 (en) 1999-01-27 2006-06-06 Logan James D Radio receiving, recording and playback system
US6931451B1 (en) 1996-10-03 2005-08-16 Gotuit Media Corp. Systems and methods for modifying broadcast programming
US6088455A (en) 1997-01-07 2000-07-11 Logan; James D. Methods and apparatus for selectively reproducing segments of broadcast programming
US5986692A (en) 1996-10-03 1999-11-16 Logan; James D. Systems and methods for computer enhanced broadcast monitoring
US6031573A (en) 1996-10-31 2000-02-29 Sensormatic Electronics Corporation Intelligent video information management system performing multiple functions in parallel
US6920468B1 (en) 1998-07-08 2005-07-19 Ncr Corporation Event occurrence detection method and apparatus
US7211000B2 (en) 1998-12-22 2007-05-01 Intel Corporation Gaming utilizing actual telemetry data
DE60037119T3 (en) 1999-03-29 2012-10-04 Gotuit Media Corp. Electronic storage of music data and programs, with detection of program segments such as recorded music performances, and system for managing and playing such program segments
US6411724B1 (en) * 1999-07-02 2002-06-25 Koninklijke Philips Electronics N.V. Using meta-descriptors to represent multimedia information
US20040125877A1 (en) 2000-07-17 2004-07-01 Shin-Fu Chang Method and system for indexing and content-based adaptive streaming of digital video content
US7624337B2 (en) * 2000-07-24 2009-11-24 Vmark, Inc. System and method for indexing, searching, identifying, and editing portions of electronic multimedia files
US20060064716A1 (en) * 2000-07-24 2006-03-23 Vivcom, Inc. Techniques for navigating multiple video streams
US6925455B2 (en) 2000-12-12 2005-08-02 Nec Corporation Creating audio-centric, image-centric, and integrated audio-visual summaries
US6567536B2 (en) 2001-02-16 2003-05-20 Golftec Enterprises Llc Method and system for physical motion analysis
US6625310B2 (en) 2001-03-23 2003-09-23 Diamondback Vision, Inc. Video segmentation using statistical pixel modeling
US7339992B2 (en) 2001-12-06 2008-03-04 The Trustees Of Columbia University In The City Of New York System and method for extracting text captions from video and generating video summaries
US7399277B2 (en) 2001-12-27 2008-07-15 Medtronic Minimed, Inc. System for monitoring physiological characteristics
US7027124B2 (en) 2002-02-28 2006-04-11 Fuji Xerox Co., Ltd. Method for automatically producing music videos
US6749512B2 (en) 2002-03-15 2004-06-15 Macgregor Brian Computer network implemented gaming system and method of using same
US20050149299A1 (en) 2002-04-24 2005-07-07 George Bolt Method and system for detecting change in data streams
US20040080615A1 (en) 2002-08-21 2004-04-29 Strategic Vista International Inc. Digital video security system
US8087054B2 (en) 2002-09-30 2011-12-27 Eastman Kodak Company Automated event content processing method and system
US20040068758A1 (en) 2002-10-02 2004-04-08 Mike Daily Dynamic video annotation
GB2395852B (en) 2002-11-29 2006-04-19 Sony Uk Ltd Media handling system
US7904797B2 (en) 2003-01-21 2011-03-08 Microsoft Corporation Rapid media group annotation
US7244852B2 (en) 2003-02-27 2007-07-17 Abbott Laboratories Process for preparing 2-methylpyrrolidine and specific enantiomers thereof
WO2006009521A1 (en) 2004-07-23 2006-01-26 Agency For Science, Technology And Research System and method for replay generation for broadcast video
US20060218191A1 (en) * 2004-08-31 2006-09-28 Gopalakrishnan Kumar C Method and System for Managing Multimedia Documents
US8156427B2 (en) * 2005-08-23 2012-04-10 Ricoh Co. Ltd. User interface for mixed media reality
US7702673B2 (en) * 2004-10-01 2010-04-20 Ricoh Co., Ltd. System and methods for creation and use of a mixed media environment
US7609290B2 (en) 2005-01-28 2009-10-27 Technology Advancement Group, Inc. Surveillance system and method
US20060227237A1 (en) 2005-03-31 2006-10-12 International Business Machines Corporation Video surveillance system and method with combined video and audio recognition
US8016664B2 (en) 2005-04-13 2011-09-13 Take Two Interactive Software, Inc. Systems and methods for simulating a particular user in an interactive computer system
US20070100521A1 (en) 2005-10-31 2007-05-03 Eric Grae Reporting information related to a vehicular accident
US20070101394A1 (en) 2005-11-01 2007-05-03 Yesvideo, Inc. Indexing a recording of audiovisual content to enable rich navigation
US8117032B2 (en) 2005-11-09 2012-02-14 Nuance Communications, Inc. Noise playback enhancement of prerecorded audio for speech recognition operations
US20100005485A1 (en) 2005-12-19 2010-01-07 Agency For Science, Technology And Research Annotation of video footage and personalised video generation
JP4826333B2 (en) 2006-05-11 2011-11-30 Sony Corporation Image processing apparatus and method, and program
US7596759B2 (en) 2006-05-23 2009-09-29 Verna Anthony F Instant football widget
US7917514B2 (en) * 2006-06-28 2011-03-29 Microsoft Corporation Visual and multi-dimensional search
US8611723B2 (en) 2006-09-06 2013-12-17 James Andrew Aman System for relating scoreboard information with event video
WO2008043160A1 (en) * 2006-10-11 2008-04-17 Tagmotion Pty Limited Method and apparatus for managing multimedia files
JP2008123501A (en) 2006-10-15 2008-05-29 Fujitsu Ten Ltd Vehicle information recording device
TWI332640B (en) 2006-12-01 2010-11-01 Cyberlink Corp Method capable of detecting a scoreboard in a program and related system
JP5010292B2 (en) 2007-01-18 2012-08-29 Toshiba Corporation Video attribute information output device, video summarization device, program, and video attribute information output method
GB2447053A (en) 2007-02-27 2008-09-03 Sony Uk Ltd System for generating a highlight summary of a performance
US8316302B2 (en) * 2007-05-11 2012-11-20 General Instrument Corporation Method and apparatus for annotating video content with metadata generated using speech recognition technology
GB2449125A (en) * 2007-05-11 2008-11-12 Sony Uk Ltd Metadata with degree of trust indication
US7460149B1 (en) 2007-05-28 2008-12-02 Kd Secure, Llc Video data storage, search, and retrieval using meta-data and attribute data in a video surveillance system
US8171030B2 (en) 2007-06-18 2012-05-01 Zeitera, Llc Method and apparatus for multi-dimensional content search and video identification
DE102007034010A1 (en) 2007-07-20 2009-01-22 Dallmeier Electronic Gmbh & Co. Kg Method and device for processing video data
WO2009018171A1 (en) 2007-07-27 2009-02-05 Synergy Sports Technology, Llc Systems and methods for generating bookmark video fingerprints
US7983442B2 (en) 2007-08-29 2011-07-19 Cyberlink Corp. Method and apparatus for determining highlight segments of sport video
US8200063B2 (en) * 2007-09-24 2012-06-12 Fuji Xerox Co., Ltd. System and method for video summarization
US8311344B2 (en) 2008-02-15 2012-11-13 Digitalsmiths, Inc. Systems and methods for semantically classifying shots in video
US8804005B2 (en) * 2008-04-29 2014-08-12 Microsoft Corporation Video concept detection using multi-layer multi-instance learning
US7890512B2 (en) 2008-06-11 2011-02-15 Microsoft Corporation Automatic image annotation using semantic distance learning
US8335786B2 (en) 2009-05-28 2012-12-18 Zeitera, Llc Multi-media content identification using multi-level content signature correlation and fast similarity search
US9141859B2 (en) 2008-11-17 2015-09-22 Liveclips Llc Method and system for segmenting and transmitting on-demand live-action video in real-time
US9141860B2 (en) 2008-11-17 2015-09-22 Liveclips Llc Method and system for segmenting and transmitting on-demand live-action video in real-time
US9442933B2 (en) 2008-12-24 2016-09-13 Comcast Interactive Media, Llc Identification of segments within audio, video, and multimedia items
US20100245072A1 (en) 2009-03-25 2010-09-30 Syclipse Technologies, Inc. System and method for providing remote monitoring services
CN102369729B (en) 2009-03-31 2014-11-05 NEC Corporation Tracking judgment device, tracking judgment method, and tracking judgment program
KR20110021195A (en) 2009-08-25 2011-03-04 Samsung Electronics Co., Ltd. Method and apparatus for detecting important information from a moving picture
US8922718B2 (en) 2009-10-21 2014-12-30 Disney Enterprises, Inc. Key generation through spatial detection of dynamic objects
KR20130029082A (en) * 2010-05-04 2013-03-21 Shazam Entertainment Limited Methods and systems for processing a sample of a media stream
IL210427A0 (en) 2011-01-02 2011-06-30 Agent Video Intelligence Ltd Calibration device and method for use in a surveillance system for event detection
EP2707834B1 (en) 2011-05-13 2020-06-24 Vizrt Ag Silhouette-based pose estimation
US9367745B2 (en) 2012-04-24 2016-06-14 Liveclips Llc System for annotating media content for automatic content understanding
US20130283143A1 (en) 2012-04-24 2013-10-24 Eric David Petajan System for Annotating Media Content for Automatic Content Understanding

Patent Citations (54)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20070245400A1 (en) * 1998-11-06 2007-10-18 Seungyup Paek Video description system and method
US20020059644A1 (en) * 2000-04-24 2002-05-16 Andrade David De Method and system for automatic insertion of interactive TV triggers into a broadcast data stream
US6763069B1 (en) * 2000-07-06 2004-07-13 Mitsubishi Electric Research Laboratories, Inc Extraction of high-level features from low-level features of multimedia content
US20030177503A1 (en) * 2000-07-24 2003-09-18 Sanghoon Sull Method and apparatus for fast metadata generation, delivery and access for live broadcast program
US20020099695A1 (en) * 2000-11-21 2002-07-25 Abajian Aram Christian Internet streaming media workflow architecture
US20050038809A1 (en) * 2000-11-21 2005-02-17 Abajian Aram Christian Internet streaming media workflow architecture
US20020184020A1 (en) * 2001-03-13 2002-12-05 Nec Corporation Speech recognition apparatus
US7773670B1 (en) * 2001-06-05 2010-08-10 AT&T Intellectual Property II, L.P. Method of content adaptive video encoding
US20030083879A1 (en) * 2001-10-31 2003-05-01 James Cyr Dynamic insertion of a speech recognition engine within a distributed speech recognition system
US20040205482A1 (en) * 2002-01-24 2004-10-14 International Business Machines Corporation Method and apparatus for active annotation of multimedia content
US20030152277A1 (en) * 2002-02-13 2003-08-14 Convey Corporation Method and system for interactive ground-truthing of document images
US20040021685A1 (en) * 2002-07-30 2004-02-05 Fuji Xerox Co., Ltd. Systems and methods for filtering and/or viewing collaborative indexes of recorded media
US20040143604A1 (en) * 2003-01-21 2004-07-22 Steve Glenner Random access editing of media
US20060161867A1 (en) * 2003-01-21 2006-07-20 Microsoft Corporation Media frame object visualization system
US20040168118A1 (en) * 2003-02-24 2004-08-26 Wong Curtis G. Interactive media frame display
US20040267698A1 (en) * 2003-06-26 2004-12-30 Mitsutoshi Shinkai Information processing apparatus and method, program, and recording medium
US20060031236A1 (en) * 2004-08-04 2006-02-09 Kabushiki Kaisha Toshiba Data structure of metadata and reproduction method of the same
US20060136205A1 (en) * 2004-12-21 2006-06-22 Song Jianming J Method of refining statistical pattern recognition models and statistical pattern recognizers
US7742921B1 (en) * 2005-09-27 2010-06-22 AT&T Intellectual Property II, L.P. System and method for correcting errors when generating a TTS voice
US20090265617A1 (en) * 2005-10-25 2009-10-22 Sonic Solutions, A California Corporation Methods and systems for use in maintaining media data quality upon conversion to a different data format
US20100287473A1 (en) * 2006-01-17 2010-11-11 Arthur Recesso Video analysis tool systems and methods
US20070256016A1 (en) * 2006-04-26 2007-11-01 Bedingfield James C Sr Methods, systems, and computer program products for managing video information
US20090164462A1 (en) * 2006-05-09 2009-06-25 Koninklijke Philips Electronics N.V. Device and a method for annotating content
US20070277092A1 (en) * 2006-05-24 2007-11-29 Basson Sara H Systems and methods for augmenting audio/visual broadcasts with annotations to assist with perception and interpretation of broadcast content
US20080270338A1 (en) * 2006-08-14 2008-10-30 Neural Id Llc Partition-Based Pattern Recognition System
US20080154908A1 (en) * 2006-12-22 2008-06-26 Google Inc. Annotation Framework for Video
US20120166930A1 (en) * 2006-12-22 2012-06-28 Google Inc. Annotation Framework For Video
US20120147264A1 (en) * 2007-01-19 2012-06-14 International Business Machines Corporation Method for the semi-automatic editing of timed and annotated data
US20090097815A1 (en) * 2007-06-18 2009-04-16 Lahr Nils B System and method for distributed and parallel video editing, tagging, and indexing
US20080317286A1 (en) * 2007-06-20 2008-12-25 Sony United Kingdom Limited Security device and system
US20100293187A1 (en) * 2007-06-22 2010-11-18 Bayerische Medientechnik Gmbh System and method for broadcast media tagging
US20090055419A1 (en) * 2007-08-21 2009-02-26 AT&T Labs, Inc. Method and system for content resyndication
US20090106297A1 (en) * 2007-10-18 2009-04-23 David Howell Wright Methods and apparatus to create a media measurement reference database from a plurality of distributed sources
US20120257875A1 (en) * 2008-01-11 2012-10-11 Bruce Sharpe Temporal alignment of video recordings
US20090249387A1 (en) * 2008-03-31 2009-10-01 Microsoft Corporation Personalized Event Notification Using Real-Time Video Analysis
US20120123777A1 (en) * 2008-04-24 2012-05-17 Nuance Communications, Inc. Adjusting a speech engine for a mobile computing device based on background noise
US20090319885A1 (en) * 2008-06-23 2009-12-24 Brian Scott Amento Collaborative annotation of multimedia content
US20100332809A1 (en) * 2009-06-26 2010-12-30 Micron Technology Inc. Methods and Devices for Saving and/or Restoring a State of a Pattern-Recognition Processor
US20110040760A1 (en) * 2009-07-16 2011-02-17 Bluefin Lab, Inc. Estimating Social Interest in Time-based Media
US20110143811A1 (en) * 2009-08-17 2011-06-16 Rodriguez Tony F Methods and Systems for Content Processing
US20150227849A1 (en) * 2009-12-07 2015-08-13 Yahoo! Inc. Method and System for Invariant Pattern Recognition
US20110145182A1 (en) * 2009-12-15 2011-06-16 Micron Technology, Inc. Adaptive content inspection
US20130124203A1 (en) * 2010-04-12 2013-05-16 II Jerry R. Scoggins Aligning Scripts To Dialogues For Unmatched Portions Based On Matched Portions
US20130124984A1 (en) * 2010-04-12 2013-05-16 David A. Kuspa Method and Apparatus for Providing Script Data
US20110288862A1 (en) * 2010-05-18 2011-11-24 Ognjen Todic Methods and Systems for Performing Synchronization of Audio with Corresponding Textual Transcriptions and Determining Confidence Values of the Synchronization
US20120159290A1 (en) * 2010-12-17 2012-06-21 Microsoft Corporation Validation analysis of human target
US20120192227A1 (en) * 2011-01-21 2012-07-26 Bluefin Labs, Inc. Cross Media Targeted Message Synchronization
US20120215903A1 (en) * 2011-02-18 2012-08-23 Bluefin Lab, Inc. Generating Audience Response Metrics and Ratings From Social Interest In Time-Based Media
US20120215329A1 (en) * 2011-02-22 2012-08-23 Dolby Laboratories Licensing Corporation Alignment and Re-Association of Metadata for Media Streams Within a Computing Device
US20120303643A1 (en) * 2011-05-26 2012-11-29 Raymond Lau Alignment of Metadata
US20130014155A1 (en) * 2011-06-14 2013-01-10 Douglas Clarke System and method for presenting content with time based metadata
US20130263166A1 (en) * 2012-03-27 2013-10-03 Bluefin Labs, Inc. Social Networking System Targeted Message Synchronization
US20130294746A1 (en) * 2012-05-01 2013-11-07 Wochit, Inc. System and method of generating multimedia content
US20150139610A1 (en) * 2013-11-15 2015-05-21 Clipmine, Inc. Computer-assisted collaborative tagging of video content for indexing and table of contents generation

Cited By (33)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10102430B2 (en) 2008-11-17 2018-10-16 Liveclips Llc Method and system for segmenting and transmitting on-demand live-action video in real-time
US11625917B2 (en) 2008-11-17 2023-04-11 Liveclips Llc Method and system for segmenting and transmitting on-demand live-action video in real-time
US11036992B2 (en) 2008-11-17 2021-06-15 Liveclips Llc Method and system for segmenting and transmitting on-demand live-action video in real-time
US10565453B2 (en) 2008-11-17 2020-02-18 Liveclips Llc Method and system for segmenting and transmitting on-demand live-action video in real-time
US9141860B2 (en) 2008-11-17 2015-09-22 Liveclips Llc Method and system for segmenting and transmitting on-demand live-action video in real-time
US9141859B2 (en) 2008-11-17 2015-09-22 Liveclips Llc Method and system for segmenting and transmitting on-demand live-action video in real-time
US9659597B2 (en) 2012-04-24 2017-05-23 Liveclips Llc Annotating media content for automatic content understanding
US10056112B2 (en) 2012-04-24 2018-08-21 Liveclips Llc Annotating media content for automatic content understanding
US9367745B2 (en) 2012-04-24 2016-06-14 Liveclips Llc System for annotating media content for automatic content understanding
US10381045B2 (en) 2012-04-24 2019-08-13 Liveclips Llc Annotating media content for automatic content understanding
US10491961B2 (en) 2012-04-24 2019-11-26 Liveclips Llc System for annotating media content for automatic content understanding
US10553252B2 (en) 2012-04-24 2020-02-04 Liveclips Llc Annotating media content for automatic content understanding
US9524282B2 (en) * 2013-02-07 2016-12-20 Cherif Algreatly Data augmentation with real-time annotations
US20140223279A1 (en) * 2013-02-07 2014-08-07 Cherif Atia Algreatly Data augmentation with real-time annotations
US20140229579A1 (en) * 2013-02-12 2014-08-14 Unicorn Media, Inc. Cloud-based video delivery
US9112939B2 (en) * 2013-02-12 2015-08-18 Brightcove, Inc. Cloud-based video delivery
US10999340B2 (en) 2013-02-12 2021-05-04 Brightcove Inc. Cloud-based video delivery
WO2015126830A1 (en) * 2014-02-21 2015-08-27 Liveclips Llc System for annotating media content for automatic content understanding
US10762675B2 (en) * 2016-12-12 2020-09-01 Facebook, Inc. Systems and methods for interactive broadcasting
US10609398B2 (en) * 2017-07-28 2020-03-31 Black Sesame International Holding Limited Ultra-low bitrate coding based on 3D map reconstruction and decimated sub-pictures
US11895369B2 (en) * 2017-08-28 2024-02-06 Dolby Laboratories Licensing Corporation Media-aware navigation metadata
CN109874047A (en) * 2017-12-04 2019-06-11 Tencent Technology (Shenzhen) Co., Ltd. Live broadcast interaction method, apparatus and system
US11087161B2 (en) 2019-01-25 2021-08-10 Gracenote, Inc. Methods and systems for determining accuracy of sport-related information extracted from digital video frames
US11036995B2 (en) 2019-01-25 2021-06-15 Gracenote, Inc. Methods and systems for scoreboard region detection
US11010627B2 (en) 2019-01-25 2021-05-18 Gracenote, Inc. Methods and systems for scoreboard text region detection
WO2020154553A1 (en) * 2019-01-25 2020-07-30 Gracenote, Inc. Methods and systems for sport data extraction
US11568644B2 (en) 2019-01-25 2023-01-31 Gracenote, Inc. Methods and systems for scoreboard region detection
WO2020154556A1 (en) * 2019-01-25 2020-07-30 Gracenote, Inc. Methods and systems for extracting sport-related information from digital video frames
US11792441B2 (en) 2019-01-25 2023-10-17 Gracenote, Inc. Methods and systems for scoreboard text region detection
US11798279B2 (en) 2019-01-25 2023-10-24 Gracenote, Inc. Methods and systems for sport data extraction
US11805283B2 (en) 2019-01-25 2023-10-31 Gracenote, Inc. Methods and systems for extracting sport-related information from digital video frames
US11830261B2 (en) 2019-01-25 2023-11-28 Gracenote, Inc. Methods and systems for determining accuracy of sport-related information extracted from digital video frames
US10997424B2 (en) 2019-01-25 2021-05-04 Gracenote, Inc. Methods and systems for sport data extraction

Also Published As

Publication number Publication date
US20150071618A1 (en) 2015-03-12
US10553252B2 (en) 2020-02-04
WO2013163066A3 (en) 2015-01-22
US20180336928A1 (en) 2018-11-22
US10381045B2 (en) 2019-08-13
US20170221523A1 (en) 2017-08-03
US20190318765A1 (en) 2019-10-17
EP2842054A2 (en) 2015-03-04
EP2842054A4 (en) 2016-07-27
BR112014026589A2 (en) 2017-06-27
US10056112B2 (en) 2018-08-21
CO7121323A2 (en) 2014-11-20
MX2014012970A (en) 2015-02-05
US9659597B2 (en) 2017-05-23
CA2870454A1 (en) 2013-10-31
WO2013163066A2 (en) 2013-10-31
MX339009B (en) 2016-05-05

Similar Documents

Publication Publication Date Title
US10553252B2 (en) Annotating media content for automatic content understanding
US10491961B2 (en) System for annotating media content for automatic content understanding
US11805291B2 (en) Synchronizing media content tag data
US20230071225A1 (en) Media environment driven content distribution platform
US11412293B2 (en) Modifying digital video content
US11431775B2 (en) System and method for data stream synchronization
JP2021509795A (en) Coordinates as auxiliary data
US10951935B2 (en) Media environment driven content distribution platform
WO2015126830A1 (en) System for annotating media content for automatic content understanding

Legal Events

Date Code Title Description
AS Assignment

Owner name: LIVECLIPS LLC, CONNECTICUT

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:PETAJAN, ERIC DAVID;WEITE, DAVID EUGENE;VUNIC, DOUGLAS W.;SIGNING DATES FROM 20130503 TO 20130507;REEL/FRAME:030396/0203

AS Assignment

Owner name: DIRECTV INVESTMENTS, INC., CALIFORNIA

Free format text: SECURITY AGREEMENT;ASSIGNOR:LIVECLIPS LLC;REEL/FRAME:031515/0508

Effective date: 20131025

AS Assignment

Owner name: LIVECLIPS LLC, CALIFORNIA

Free format text: MERGER;ASSIGNOR:LIVECLIPS LLC;REEL/FRAME:033180/0650

Effective date: 20140314

AS Assignment

Owner name: JBSHBM, LLC, FLORIDA

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:MCGHIE, SEAN I.;BUCHHEIT, BRIAN K.;REEL/FRAME:042658/0338

Effective date: 20170524

STPP Information on status: patent application and granting procedure in general

Free format text: NON FINAL ACTION MAILED

STPP Information on status: patent application and granting procedure in general

Free format text: RESPONSE TO NON-FINAL OFFICE ACTION ENTERED AND FORWARDED TO EXAMINER

STPP Information on status: patent application and granting procedure in general

Free format text: FINAL REJECTION MAILED

STCB Information on status: application discontinuation

Free format text: ABANDONED -- FAILURE TO RESPOND TO AN OFFICE ACTION