WO2014178796A1 - System and method for identifying and synchronizing content - Google Patents

System and method for identifying and synchronizing content

Info

Publication number
WO2014178796A1
Authority
WO
WIPO (PCT)
Prior art keywords
content
primary content
primary
identification
time
Prior art date
Application number
PCT/SG2014/000194
Other languages
French (fr)
Inventor
Roland Benzon
Original Assignee
Telefun Transmedia Pte Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Telefun Transmedia Pte Ltd filed Critical Telefun Transmedia Pte Ltd
Publication of WO2014178796A1 publication Critical patent/WO2014178796A1/en


Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/80 Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
    • H04N21/83 Generation or processing of protective or descriptive data associated with content; Content structuring
    • H04N21/835 Generation of protective data, e.g. certificates
    • H04N21/8358 Generation of protective data, e.g. certificates involving watermark
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43 Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/4302 Content synchronisation processes, e.g. decoder synchronisation
    • H04N21/4307 Synchronising the rendering of multiple content streams or additional data on devices, e.g. synchronisation of audio on a mobile phone with the video output on the TV screen
    • H04N21/43072 Synchronising the rendering of multiple content streams or additional data on devices, e.g. synchronisation of audio on a mobile phone with the video output on the TV screen of multiple content streams on the same device
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/41 Structure of client; Structure of client peripherals
    • H04N21/4104 Peripherals receiving signals from specially adapted client devices
    • H04N21/4122 Peripherals receiving signals from specially adapted client devices additional display device, e.g. video projector
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/41 Structure of client; Structure of client peripherals
    • H04N21/422 Input-only peripherals, i.e. input devices connected to specially adapted client devices, e.g. global positioning system [GPS]
    • H04N21/42203 Input-only peripherals, i.e. input devices connected to specially adapted client devices, e.g. global positioning system [GPS] sound input device, e.g. microphone
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/45 Management operations performed by the client for facilitating the reception of or the interaction with the content or administrating data related to the end-user or to the client device itself, e.g. learning user preferences for recommending movies, resolving scheduling conflicts
    • H04N21/462 Content or additional data management, e.g. creating a master electronic program guide from data received from the Internet and a Head-end, controlling the complexity of a video stream by scaling the resolution or bit-rate based on the client capabilities
    • H04N21/4622 Retrieving content or additional data from different sources, e.g. from a broadcast channel and the Internet

Definitions

  • the invention relates to a system and method for identifying and synchronizing content.
  • the system and method are suited to (but not limited to) the identification of media content such as movies or broadcast shows for synchronized presentation of related secondary content such as multi-language tracks associated with the media content and will be described in this context.
  • Multiple audio tracks in a plurality of foreign languages, embedded in the physical media, are typically available for physically sold media content such as DVD movies. Consumers who buy popular movie DVDs can expect the movie to contain subtitles and language tracks. The consumer can select an alternative language track to play, which is generally a dubbed version of the original language track.
  • Broadcast content is either broadcast in the original language or in a dubbed foreign language.
  • Some movies or TV shows may contain subtitles, but the subtitles are typically in one language.
  • the present invention seeks to alleviate the above mentioned problems and needs, at least in part.
  • a system for identifying and synchronizing content comprising:- a primary content player operable to execute a primary content; a second content player comprising a receiver for receiving a portion of the executed primary content; and an identification and synchronization server operable to be in data communication with the second content device to receive at least one parameter related to the primary content; the received at least one parameter being utilized for identification of the primary content and retrieval of a second content contextually related to the primary content.
  • the primary content is identified using extraction of a watermark code from the received portion of the primary content; the watermark code comprises an alternation of an identifier of the primary content and a time code for recording a timing parameter of the primary content.
  • the identification and synchronization server proceeds to retrieve a full fingerprint of the primary content, and the secondary content device is operable to match the full fingerprint of the primary content with an unidentified portion of the primary content to determine a match point used to execute the second content in synchronization with the primary content.
  • the secondary content device may comprise a timer for calculating the time difference between the time the portion of the primary content is presented and the actual time the secondary content device receives the portion of the primary content.
  • the identification and synchronization server may comprise a timer for calculating the time difference between the time the secondary content is retrieved and the time the secondary content is executed by the secondary content device.
  • the system further comprises a publish-subscribe server to send event-triggered updates on the primary content to the identification and synchronization server.
  • the primary content player and second content player are integrated as one single device.
  • the identification and synchronization server is operable to, upon retrieval of the second content, provide for the second content device to execute the second content in synchronization with the primary content.
  • a method for identifying and synchronizing content between a primary content player operable to execute a primary content and a second content player operable to execute a secondary content comprising:- receiving a portion of a primary content on a secondary content device; sending at least one parameter related to the primary content to an identification and synchronization server; identifying the primary content using the at least one parameter; retrieving the secondary content contextually related to the identified primary content using the identified primary content; and synchronizing the secondary content on the secondary content device with the primary content.
  • the method includes the step of extracting a watermark code from the received portion of the primary content before sending the watermark code to the identification and synchronization server; the watermark code comprises an alternation of an identifier of the primary content and a time code for recording a timing parameter of the primary content.
  • the method may comprise an additional step wherein, after the step of identifying the primary content, the identification and synchronization server proceeds to retrieve a full fingerprint of the primary content.
  • the step of synchronizing the secondary content on the secondary content device with the primary content may include matching an unidentified portion of the primary content to determine a match point used to execute the second content in synchronization with the primary content.
  • the step of synchronizing may include calculating the time difference between the time the portion of the primary content is presented and the actual time the secondary content device receives the portion of the primary content.
  • the step of synchronizing may further include calculating the time difference between the time the secondary content is retrieved and the time the secondary content is executed by the secondary content device.
  • Fig. 1 is a system diagram according to an embodiment of the invention
  • Fig. 2 illustrates the components of a secondary content device in accordance with an embodiment of the invention
  • Fig. 3a to 3c illustrate content identification using audio fingerprints (Figs. 3a and 3b) or a watermark encoding scheme (Fig. 3c);
  • Fig. 4 illustrates an example of content identification server with content management system
  • Fig. 5 illustrates an example of the database entries of the reference server for look-up of secondary content in accordance with an embodiment of the invention
  • Fig. 6 is a flow diagram of the method of identifying and synchronizing content according to another embodiment of the invention based on a hybrid watermark and fingerprint content identification technique;
  • Fig. 7 is a flow diagram of the method of identifying and synchronizing content according to another embodiment of the invention based on a pure watermark content identification technique
  • Fig. 8 is a flow diagram of the method of identifying and synchronizing content according to another embodiment of the invention based on a pure fingerprinting content identification technique.
  • Other arrangements of the invention are possible and, consequently, the accompanying drawings are not to be understood as superseding the generality of the preceding description of the invention.
  • the term “broadcast” is used generally to refer to any content that is capable of being transmitted from one media source to a plurality of receivers/viewers/listeners.
  • One-to-many “broadcast content” includes, but is not limited to, TV, radio, satellite, IPTV, IP multicast, even shared space experiences such as theatre screening, conference presentations, etc.
  • the term "On-demand content", on the other hand, which is used in the context of describing one-to-one delivery (i.e., one content source to one receiver/viewer/listener), includes DVD, media players, on-demand services such as YouTube, Netflix, iTunes, Amazon, etc.
  • broadcast and on-demand content are referred to as primary content, which is to be distinguished from secondary content, which is contextual and/or supplementary content contextually related to the primary content, including but not limited to language tracks, audio commentary, supplementary video, web pages, program code, subtitles in different languages, etc.
  • a system 10 for identifying and synchronizing content comprises a primary content device 14, a secondary content device 16, and an identification and synchronization server 18.
  • Primary content may be broadcasted by a source 12.
  • Primary content may be media content such as audio tracks, videos, movies etc.
  • Primary content device 14 may be a television, radio, Internet TV, DVD player, computer, movie projector, portable media player, etc., or any other device capable of playing or presenting the primary content.
  • Secondary content device 16 is capable of presenting or playing secondary content.
  • secondary content refers to content that a user consumes to supplement the primary content.
  • Secondary content is usually, but not necessarily, contextually related to primary content.
  • the preferred embodiments of this invention are particularly optimized for the consumption of language tracks as a secondary content.
  • Secondary content device 16 may be a computer, smartphone, or any compatible device such as Smart TV, tablet, etc. enabled to run the secondary content device application 17 to present the secondary content suitable for consumption by the user.
  • the secondary content device 16 comprises various components as shown in Fig. 2.
  • the secondary content device 16 comprises a receiver such as an audio input mechanism 162 for receiving a portion of the played/executed primary content, the portion being audio sound from the presenting/presented primary content for further identification.
  • Audio input 162 may be a microphone (built-in or connected externally) or a line input.
  • An audio output 164 such as a speaker or headphone (Wired or wireless) may be integrated for listening to synchronized language tracks and/or other audible secondary content.
  • Secondary content device 16 may comprise a user interface 166 including keyboard, mouse, touch screen, buttons, etc., for controlling the secondary control device 16.
  • a display 168, for example an LED display, may be used to visually depict functions or the state of the content or device 16.
  • Secondary content device 16 is enabled to be in data communication with the identification and synchronization server 18 to achieve the identification and synchronization process and for storage of audio content where necessary.
  • the above functions are facilitated by capability for network connectivity 170 (e.g. Wi-Fi, 3G, etc.), a processor (CPU) 172, and memory 174.
  • a preferred secondary content device 16 is a smartphone which may run on an Android™ or iOS™ environment. The same may be achieved using other computing platforms: Windows Mobile, Symbian, Windows, Mac OS X, Linux, etc.
  • the secondary content device application 17 may further comprise software components suitable for encoding or decoding the portion of audio data received from the primary content.
  • the software components may include extractors 174, which decode audio watermarks embedded in the primary content, or generate audio fingerprints from the primary content.
  • the secondary content device application 17 may further comprise media player components that allow different types of secondary content (i.e., audio, video, HTML, JavaScript, etc.) files to be handled in one or more of the following ways: played, rendered, executed, etc.
  • the secondary content device 16 is operable to be in data communication with the identification and synchronization server 18. This may be via an Internet connection.
  • the identification and synchronization server 18 comprises a main server 182 in data communication with the following:- a content identification server 184, a reference server 186, a secondary content server 188, and a publish-subscribe server 190.
  • the main server 182 functions as a gateway between the identification and synchronization server 18 and the Internet.
  • Main server 182 comprises at least one application 183, which runs inter-process communications with the Content Identification Server 184, the Reference Server/Database 186, Secondary Content Server 188, Publish-Subscribe Server 190 and other support servers (not shown) where required.
  • Main server 182 and application 183 may further comprise logging, content management and analytic capabilities.
  • CIS 184 is primarily responsible for identifying the primary content.
  • CIS 184 accesses a plurality of databases, as shown in Fig. 4. These include, but are not limited to, the following:- (1) Electronic Program Guide (EPG) Database 184a, which contains a schedule of primary broadcast content programs, including each program's channel/station name/ID, date, time, content title/ID, and other program-related information.
  • the EPG database 184a is used to identify primary broadcast content based on a channel/station identifier, date and time.
  • Fingerprint Database 184b is used to store audio fingerprints of various primary content— movies, TV show, music, etc.
  • the fingerprint database 184b may contain pre-generated fingerprints or fingerprints that are generated in real-time (i.e., fingerprints that are generated during actual content broadcast or presentation).
  • Watermark Database 184c is used to store watermark codes that are related to a particular primary content.
  • the code may include the primary content ID, channel/station ID, content owner ID, timecode, or other types of information presumably related to the primary content the watermark is embedded in.
  • the above content identification databases 184a, 184b, and 184c may be managed in a decentralized manner by granting access to different broadcasters, content owners, or content managers.
  • a web-based Content Management System (CMS) 250 is utilized for managing these databases.
  • CMS may also be used to manage the Reference Database 186 and Secondary Content Database 188.
  • the Reference Server 186 comprises a database 186a operable to relate primary content with secondary content for the purpose of retrieving secondary content upon identification of a primary content, for example multiple language tracks for a particular movie stored in the Secondary Content Server 188.
  • Entries of the database 186a include, but are not limited to, the matched primary content ID 186b, the related secondary content ID 186c, and secondary content metadata 186d (i.e. title, file name, language, file type, path, etc.)
  • the relationship between a primary content and one or more secondary content may be established manually, by associating the primary and secondary content IDs (primary content ID 186b, secondary content ID 186c, and secondary content's metadata 186d) in the database of the Reference Server 186.
  • Establishing the primary-secondary content relationships can be done directly via manual entries or via the CMS. In the database, this typically appears as fields in the same record, as shown in Fig. 5.
  • a primary content, e.g. a movie, may have more than one secondary language track associated with it; one may be in Mandarin, another in Hindi, etc.
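The primary-to-secondary relationship described above can be sketched as a simple lookup table. The following Python sketch is illustrative only; the field names, content IDs, and file paths are assumptions, not values from the patent.

```python
# Hypothetical Reference Database records relating one primary content ID
# to several secondary content entries (multiple language tracks).
REFERENCE_DB = {
    "movie-001": [
        {"secondary_id": "track-zh", "title": "Mandarin track",
         "language": "zh", "file_type": "audio/aac", "path": "/tracks/zh.aac"},
        {"secondary_id": "track-hi", "title": "Hindi track",
         "language": "hi", "file_type": "audio/aac", "path": "/tracks/hi.aac"},
    ],
}

def lookup_secondary(primary_content_id):
    """Return all secondary content records for an identified primary content."""
    return REFERENCE_DB.get(primary_content_id, [])
```

A CMS would populate such records; retrieval by primary content ID then yields every candidate language track for the user to choose from.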
  • the secondary Content Server 188 contains a plurality of secondary content (i.e., companion language tracks) files, or other companion content (images, URL, HTML5 code, etc.)
  • the Secondary Content Server 188 may also include transport capabilities, such as file transfers, streaming, etc., to serve the related secondary content to the secondary content device 16 of a user.
  • the publish-subscribe server 190 contains message-oriented middleware and relevant software application that allows users to "subscribe” to particular "topics" (identified by topic IDs) of a primary content.
  • a user launching his secondary content device application 17 can "listen” to messages, events, notifications "published” by the topic publisher (i.e., broadcaster, content owner, brand, etc.), thereby making it unnecessary for users' applications to continually poll the server for changes such as more secondary content, play list changes, etc.
  • upon receiving primary content from the primary content device 14, it is important to identify the primary content for subsequent retrieval of secondary content and synchronization of both contents.
  • There are several ways to identify the primary content. Specific to the context of the embodiments, where the secondary content is multiple language audio tracks associated with primary content that is a movie playing in a default language, two of the most common approaches are audio fingerprinting and audio watermarking.
  • An audio fingerprint is a condensed digital summary, deterministically generated from an audio signal, that can be used to identify an audio sample or quickly locate similar items in an audio database. Practical uses of acoustic fingerprinting include identifying songs, melodies, tunes, or advertisements; sound effect library management; and video file identification.
  • the Fingerprint Database 184b is populated by a plurality of audio fingerprints generated or extracted from different primary content, either offline during production, or in real-time during broadcast.
  • the population of the fingerprint database 184b may be done in a variety of ways known to a skilled person and will not be elaborated further. Fingerprints in the database 184b may be associated with other information related to the primary content such as the Primary Content ID and any metadata (i.e., title, duration, content owner, etc.).
  • An unidentified primary Content Clip may be identified by extracting its fingerprint (Primary Content Clip Fingerprint) and matching it against a plurality of fingerprints stored in the Fingerprint Database 184b. If a match is found (step 320), the primary content is identified via its Content ID and relevant metadata which are passed on as parameters to subsequent synchronization processes.
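As a rough illustration of the matching in step 320, the sketch below treats a fingerprint as a plain list of integers and reports the first database entry containing the clip's sequence as a contiguous segment. Real acoustic fingerprints are robust hashes of spectral features and use indexed lookup rather than linear scanning; the IDs and values here are assumed.

```python
# Toy Fingerprint Database: each fingerprint is a list of integer hashes.
FINGERPRINT_DB = {
    "movie-001": [3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5],
    "show-042":  [2, 7, 1, 8, 2, 8, 1, 8, 2, 8],
}

def identify_clip(clip_fp):
    """Return (content_id, match_index) for the first database hit, else None."""
    n = len(clip_fp)
    for content_id, full_fp in FINGERPRINT_DB.items():
        for i in range(len(full_fp) - n + 1):
            if full_fp[i:i + n] == clip_fp:
                return content_id, i
    return None
```

On a match, the content ID and relevant metadata are passed to the subsequent synchronization steps, as described above.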
  • the Fingerprint Database 184b may reside on the secondary content device 16, and/or on the server 18.
  • the former presupposes that the application has a specific use (detecting a specific advertisement, for instance), the fingerprint for which is small enough to be reasonably stored in the content device 16.
  • a server-side Fingerprint Database 184b makes more practical sense.
  • Figure 3b illustrates the matching of a primary content with any unidentified primary content clip.
  • the matching process derives a temporal matching point, which is the matched content's timestamp at the point where the unidentified primary content clip fingerprint is matched.
  • the unidentified primary content clip's fingerprint matched a segment of an already identified primary content fingerprint in the database, and the matched segment ends at the 11.54155-second point of the primary content, so the temporal match value is 11.54155 seconds.
  • the temporal matching point used in this case is the tail of the segment.
  • the head or beginning point of the matched segment may also be used as a temporal matching point, as long as the same choice is used consistently throughout the fingerprint matching system.
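Under the assumption that fingerprint frames are spaced at a fixed hop interval, the temporal matching point follows directly from the match index. This sketch (hop value assumed purely for illustration) derives either the tail or the head of the matched segment as the temporal reference.

```python
HOP_SECONDS = 0.01  # assumed interval between fingerprint frames

def temporal_match_point(match_index, clip_len, use_tail=True):
    """Timestamp (seconds) of the matched segment's tail (or head)."""
    frame = match_index + clip_len if use_tail else match_index
    return frame * HOP_SECONDS
```

Whichever end is chosen, the same choice must be applied consistently throughout the matching system, as noted above.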
  • An audio watermark is a kind of digital watermark— a marker embedded in an audio signal, typically to identify the copyright owner for that audio.
  • Primary content may be embedded with one or more watermark codes during production/post-production or live, in real-time, and/or a combination of both.
  • This invention's preferred watermark encoding scheme is shown in Fig. 3c.
  • Primary content is embedded with a sequence of watermark codes.
  • the preferred sequence comprises alternating an ID Code and a timecode.
  • the ID Code may represent the channel/station ID, content ID, a key in the Watermark Database, or any other identifier of primary content.
  • the timecode, also known as the synchronization code, is a temporal reference denoting either the global clock-aligned time of day, the content's elapsed time in relation to the beginning of the content, or an arbitrary temporal cue.
  • the interval between ID Code and timecode is arbitrarily determined. In the example shown in Fig. 3c, the interval is 20 seconds, and it can be shorter or longer.
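The alternating encoding of Fig. 3c can be sketched as follows, assuming the 20-second interval from the example; the code format strings are hypothetical.

```python
INTERVAL = 20  # seconds between embedded codes (arbitrary, per the text)

def watermark_sequence(content_id, duration):
    """Return (embed_time, code) pairs alternating an ID code and a timecode."""
    codes = []
    t, use_id = 0, True
    while t < duration:
        codes.append((t, f"ID:{content_id}" if use_id else f"TC:{t}"))
        use_id = not use_id
        t += INTERVAL
    return codes
```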
  • using an intervallic timecode as a temporal reference, frame accuracy can be achieved by combining it with an application timer (client-side and/or server-side, depending on need), which measures the elapsed time since a particular timecode or synchronization code was detected.
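A client-side application timer of the kind described might look like the following sketch: record the wall-clock moment each timecode is detected, then interpolate the current content position from the elapsed time (class and method names are assumptions).

```python
import time

class SyncTimer:
    """Refines a coarse intervallic timecode into a continuous position."""

    def __init__(self):
        self.timecode = None
        self.detected_at = None

    def on_timecode(self, timecode_seconds):
        """Record a detected timecode and restart the elapsed-time timer."""
        self.timecode = timecode_seconds
        self.detected_at = time.monotonic()

    def current_position(self):
        """Estimated current position in the primary content, in seconds."""
        return self.timecode + (time.monotonic() - self.detected_at)
```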
  • the invention will next be described in the form of the method for identifying and synchronizing the primary and secondary content.
  • the method may employ hybrid watermark/fingerprint identification, a pure watermark identification mechanism or a pure fingerprint identification mechanism.
  • a user begins the process by launching the application 17 on the secondary content device 16 (step 601 ).
  • application 17 may be a dedicated software application, colloquially known as an "app", downloadable from the iTunes™ or Android™ app store.
  • the application 17 extracts the watermark from the primary content being transmitted from the primary content device 14 (step 602) and picked up by audio input or microphone 162.
  • the primary content in this case, is presumed to contain a watermark.
  • the extraction process yields a watermark code which is then passed (step 603), along with relevant parameters (user ID, date, time, etc.) to the main server 182, for processing by the server application 183.
  • the code may be validated to be "within range” (i.e., has a value between 1 and 100, for instance). If the code is valid, then it is presumed to have a counterpart entry either in the EPG Database or the Watermark Database.
  • the server application 183 first attempts to identify the received primary content represented by the watermark code.
  • the application does a lookup (step 604) on the EPG Database 184a to attempt to determine the primary content's identity based on the passed date and time parameters (step 605). If a match with an entry in the EPG Database 184a is found (step 606), the primary content is identified.
  • the EPG Database 184a is used to determine if the primary content is "now showing"; that is, it is currently being broadcast (as contrasted with primary content that was pre-recorded or was requested on-demand).
  • a topic ID of the primary content is also identified from the EPG Database 184a for the publish-subscribe application. If the primary content was not identified in the EPG Database 184a (step 607), it is presumed to be unscheduled broadcast content, which may include on-demand media. As such, application 183 looks up the Watermark Database 184c to identify the watermark code, which then yields its corresponding content ID. Once the primary content has been identified, application 183 then looks up the Fingerprint Database 184b for the full fingerprint of the primary content (step 608). Subsequently, the full fingerprint is passed back to the secondary content device 16, where it will be cached for later use (not shown).
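The EPG lookup in steps 604-606 amounts to finding the program scheduled on a channel at the current time. In this sketch the schedule times are seconds since midnight and all records, field names, and IDs are assumed for illustration.

```python
# Hypothetical EPG Database entries: channel, air window, content, topic.
EPG_DB = [
    {"channel": "CH7", "start": 3600, "end": 7200,
     "content_id": "movie-001", "topic_id": "topic-7"},
    {"channel": "CH7", "start": 7200, "end": 10800,
     "content_id": "show-042", "topic_id": "topic-7b"},
]

def epg_lookup(channel, now):
    """Return the EPG entry currently airing on a channel, or None."""
    for entry in EPG_DB:
        if entry["channel"] == channel and entry["start"] <= now < entry["end"]:
            return entry
    return None
```

A `None` result corresponds to step 607: content not in the EPG, so the Watermark Database is consulted instead.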
  • the main server application 183 then looks up the Reference Database 186 for one or more secondary content items related to the primary content, as referenced by the content ID of the primary content (step 610). If more than one secondary content item is referenced, which may be the case for movies with multiple language tracks, the user is given a choice (not shown) via a prompt on the secondary content device to select the desired secondary content for synchronization and play. Once a secondary content item is selected by the user, the main server application 183 passes several parameters back to the secondary content device application 17. The parameters include, but are not limited to, the full fingerprint of the matched primary content, the topic ID (if any), and the related secondary content, i.e. the preferred language track (step 611).
  • the secondary content device application 17 extracts a fingerprint (instead of a watermark) from the audio clip of the primary content (step 612).
  • the extracted primary content fingerprint clip is next matched against the full fingerprint of the primary content (step 613), which was earlier retrieved from the Fingerprint Database 184b (step 606 or step 607).
  • the matching determines a temporal match point as described in Fig. 3b, denoted by the matched time.
  • the related secondary content, i.e. the language audio track, is then played starting at the matched/synchronized time based on the temporal match point. This effectively plays the secondary content in synchronization with the primary content (step 614).
  • the next steps relate to the publish-subscribe method.
  • the application 17 determines if the primary content is being played or executed in 'real time', i.e. "now showing" or currently being broadcast, which can be determined based on its topic ID. If so, application 17 subscribes to the broadcaster's topic, referenced by its topic ID, in the Publish-Subscribe Server 190 (step 616). Subscribing allows the user application 17 to wait and "listen" for broadcaster updates, such as commercial interruptions, change of program notifications, etc., instead of incessantly querying the server for further fingerprint matches or status updates (step 617). In effect, publish-subscribe allows the publisher (broadcaster) to control the presentation of secondary content across all subscribers, i.e. all users currently "tuned in" to the primary content.
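The publish-subscribe pattern of steps 616-617 can be reduced to a minimal in-memory sketch; a production Publish-Subscribe Server would use message-oriented middleware (e.g. an MQTT or AMQP broker), so this class is purely illustrative.

```python
class PubSub:
    """Minimal topic-based publish-subscribe: no polling by subscribers."""

    def __init__(self):
        self.topics = {}

    def subscribe(self, topic_id, callback):
        """Register a subscriber callback under a topic ID."""
        self.topics.setdefault(topic_id, []).append(callback)

    def publish(self, topic_id, message):
        """Deliver a broadcaster update to every subscriber of the topic."""
        for callback in self.topics.get(topic_id, []):
            callback(message)
```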
  • the above-described embodiment is resource efficient, as it requires very few calls from the secondary content device application 17 to server-side components. It also allows broadcasters to control the secondary content device application 17's presentation of secondary content, whether preset/scheduled/scripted or dynamic in real time, including but not limited to play, pause, resume, stop, change/update secondary content, push play lists, etc.
  • the user application 17 may be programmed to periodically check if the user has changed channels by checking if the fingerprint profile of what the user is watching has changed. If so, application 17 "starts over" or loops back to step 601 to extract a new primary content clip's watermark (step 601 ).
  • An alternative embodiment using an all-watermark content identification mechanism may be desired by primary content broadcasters who only want to focus on providing secondary content for their programs. As such, this embodiment may not be suited for general-purpose use, particularly for handling non-watermarked content, such as on-demand content (DVD movies, Netflix, etc.)
  • The processing for the all-watermark content identification embodiment of this invention is shown in Fig. 7. Similarly, the process begins with a launch of the application 17 (step 701). The primary content would already have been watermarked with an alternating channel ID and time-of-day code, or an alternation of the content ID and the content sync code (the preferred mode for content with accompanying language tracks), which denotes the time elapsed since the content program began. This alternation between channel/content code and sync code is shown in Fig. 3c. Similar to step 602, the application 17 next extracts the watermark from the primary content being transmitted from the primary content device 14 (step 702) and picked up by the audio input or microphone 162.
  • the primary content latency or delay is calculated (step 703).
  • the secondary content device 16 and the server clocks are synchronized, using Network Time Protocol (NTP) or the like.
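The clock synchronization mentioned above follows the standard NTP offset calculation, which estimates how far the device clock deviates from the server clock using four timestamps. The helper name below is an assumption; the formula itself is the textbook NTP one.

```python
def ntp_offset(t0, t1, t2, t3):
    """Clock offset of the server relative to the client, per the standard
    NTP formula: t0 = client send time, t1 = server receive time,
    t2 = server send time, t3 = client receive time (all in seconds)."""
    return ((t1 - t0) + (t2 - t3)) / 2.0

# Example: server clock runs 1.5 s ahead of the device, 0.1 s network
# delay each way. The computed offset recovers the 1.5 s difference.
offset = ntp_offset(100.0, 101.6, 101.6, 100.2)
```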
  • Latency is then calculated using one or more sync codes extracted from the watermark. If the sync code is a time-of-day code, then latency is calculated as the difference between the sync code (time of day) and the timestamp of the secondary device 16 when that sync code was detected. For instance, if the sync code denoting 12 noon was detected by the secondary device at 12:00:01.5, then the latency is 1.5 seconds.
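The 12-noon example above reduces to a one-line subtraction once the clocks are synchronized; a minimal sketch, assuming a hypothetical helper name:

```python
from datetime import datetime

def latency_from_time_of_day(sync_code_time, detected_at):
    """Latency = device detection timestamp minus the time of day carried
    in the watermark sync code (clocks assumed NTP-synchronized)."""
    return (detected_at - sync_code_time).total_seconds()

# Sync code denotes 12 noon; the device detects it at 12:00:01.5.
noon = datetime(2014, 5, 1, 12, 0, 0)
detected = datetime(2014, 5, 1, 12, 0, 1, 500000)
latency = latency_from_time_of_day(noon, detected)
```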
  • If the sync code is a content time code, latency calculation is deferred to the server application 183.
  • the secondary content device application 17 should have the necessary code for extracting watermarks from the primary content.
  • a secondary device application timer is started to note the elapsed time since the extraction was completed. This will later be used for synchronization of the secondary content play.
  • the server application 183 looks up the EPG Database 184a (step 704) to identify the primary content based on the channel/content ID, time and location (step 705). If the primary content is found and is determined to be currently being aired (vis-a-vis a time-shifted program, such as those recorded by DVRs like Tivo™), then the broadcaster's topic (topic ID) is noted (not shown). Otherwise, it is null. If primary content was not found in the EPG Database 184a, the watermark database 184c is searched (step 707). Once primary content is determined, the Reference Database is queried to determine one or more secondary content associated with the primary content's ID (step 708).
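A minimal sketch of the EPG lookup in step 705, using an in-memory stand-in for the EPG Database 184a. The record field names and the seconds-since-midnight convention are illustrative assumptions, not from the patent.

```python
# Hypothetical EPG records: channel, airing window (seconds since
# midnight), content ID, and whether the program is a live broadcast.
EPG_DB = [
    {"channel_id": "CH7", "start": 46800, "end": 50400,   # 13:00-14:00
     "content_id": "news-evening", "live": True},
]

def identify_from_epg(channel_id, time_s):
    """Identify primary content by channel ID and time of day; returns the
    matching EPG record, or None (in which case the watermark database
    would be searched instead, step 707)."""
    for rec in EPG_DB:
        if rec["channel_id"] == channel_id and rec["start"] <= time_s < rec["end"]:
            return rec
    return None
```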
  • a time difference is calculated using the difference between the server timestamp when the primary sync code was detected at the server side and the secondary device timestamp of the same sync code (step 709). For instance, if the content timecode 0:0:40 (40 seconds into the content) was detected at the server at 1:00:00 pm, and detected at the secondary content device at 1:00:02.750 pm, then the latency is 2.75 seconds. It can also be inferred that the primary content broadcast started at 12:59:20 pm (1 pm less 40 seconds), server time.
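The arithmetic of step 709 can be reproduced directly; the helper name is an assumption:

```python
from datetime import datetime, timedelta

def latency_and_start(content_elapsed_s, server_ts, device_ts):
    """Latency = device detection time minus server detection time of the
    same sync code; inferred broadcast start = server detection time minus
    the elapsed-time value carried by the content timecode."""
    latency = (device_ts - server_ts).total_seconds()
    start = server_ts - timedelta(seconds=content_elapsed_s)
    return latency, start

# Timecode 0:0:40 detected at the server at 1:00:00 pm and at the
# secondary content device at 1:00:02.750 pm.
server_ts = datetime(2014, 5, 1, 13, 0, 0)
device_ts = datetime(2014, 5, 1, 13, 0, 2, 750000)
latency, start = latency_and_start(40, server_ts, device_ts)
```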
  • the main server application 183 then responds to the secondary device application, passing the primary content ID, the broadcaster's topic, one or more secondary content, and latency, among other relevant parameters (step 710).
  • the secondary content device application 17 may subscribe to the broadcaster's topic to listen for subsequent updates (step 711).
  • the application then presents the secondary content (step 712), starting at the primary content's starting server-time, plus timer and latency.
  • the steps of publish-subscribe are similar to steps 615, 616 and 617 and will not be elaborated further.
  • the secondary content device application 17 is programmed to periodically check if the user has changed channels by checking if the watermark profile of what the user is watching has changed.
  • the process begins with a launch of the application 17 (step 801).
  • the secondary content device application next either extracts a fingerprint clip or streams the fingerprint to the main server application 183.
  • fingerprint clip is extracted (step 802).
  • the main server application 183 looks up the Fingerprint Database 184b to find a match (step 803). If not found, the server application 183 informs the secondary device application accordingly (step 804), and the fingerprint extraction starts anew (loop back to step 802).
  • the server application 183 looks at the EPG Database (which can be consolidated with the Fingerprint Database 184b), to determine if the matched fingerprint is currently being broadcast (step 805). If so, the broadcaster's topic is noted. Thereafter one or more secondary content is identified from the Reference Database (step 806).
  • Latency, in this case, is calculated (step 807) by comparing the secondary device and server timestamps of a temporal match, such as that shown in Fig. 3b. If a reference point in the primary content clip was time-stamped on the server side at 12:00:00 pm, and the same was time-stamped by the secondary content device application at 12:00:02.5 pm, then the latency is the difference, or 2.5 seconds.
  • the entire fingerprint of the primary content is passed back to the secondary content device application 17, which is later used for the synchronized presentation of the secondary content (step 808).
  • the application 17 checks if the primary content is currently being broadcast (step 809). If so, the application subscribes to the server-side topic of the broadcaster (step 810), to await further updates, secondary content, instructions, etc.
  • the subsequent process is similar to the fingerprint aspect of the preferred hybrid watermark-fingerprint process, where the secondary content's starting position is synchronized with the primary content's current position.
  • the publish-subscribe facility is especially important for primary content that is currently being broadcast.
  • the publishing mechanism can be used by the broadcast source to send instructions to the user application in advance or instantaneously. For instance, a broadcaster can publish the instruction "load content123, play it at 1:00:00 pm" ahead of time, allowing all user applications subscribed to that broadcaster's topic to cache the said content on the secondary device and play it back at the specified time. Or the published message could be for immediate execution, such as "play content555 now".
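A subscriber-side handler for such published instructions might look like the sketch below. The message fields (`verb`, `content_id`, `at`) and the cache/play callbacks are hypothetical; the patent does not define a message format.

```python
def handle_published_message(msg, now, cache, play):
    """Dispatch a broadcaster-published instruction on the subscriber side.
    'load' prefetches content ahead of its scheduled play time; 'play'
    with no time (or a past time) executes immediately."""
    verb, content_id, at = msg["verb"], msg["content_id"], msg.get("at")
    if verb == "load":
        cache[content_id] = "cached"          # e.g. "load content123, play at 1:00:00 pm"
    elif verb == "play" and (at is None or at <= now):
        play(content_id)                      # e.g. "play content555 now"
    # a 'play' scheduled in the future would instead arm a timer (not shown)

cache, played = {}, []
handle_published_message({"verb": "load", "content_id": "content123", "at": 46800},
                         43200, cache, played.append)
handle_published_message({"verb": "play", "content_id": "content555"},
                         43200, cache, played.append)
```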
  • An all-fingerprint system may only need the Fingerprint Database 184b, whereas a hybrid watermark-fingerprint embodiment needs at least the Watermark database 184c and Fingerprint database 184b.
  • the invention provides smartphone, Smart TV, and computer owners an on-demand capability to download or stream an accompanying audio track in the language of their preference, which is then played synchronized to the content that the user is watching or listening to. Furthermore, the invention allows broadcasters to remotely control, via preset scripts or live, the presentation of the language track and contextual content.
  • Audio watermarks and fingerprints have been used to identify songs, advertisements, or TV programs, but not to serve synchronized language tracks.
  • Resource efficiency: efficient utilization of resources (bandwidth consumption and computer processing capacity) is primarily achieved using: 1) a hybrid watermark/fingerprint identification and synchronization mechanism; and 2) a publish-and-subscribe mechanism in place of continual server polling.
  • Broadcaster control: using a publish-and-subscribe mechanism, the broadcaster is given more control (particularly live, real-time orchestration) of the secondary content's presentation.
  • the preferred communication protocol is HTTP Live Streaming (HLS).
  • Other alternatives include Real-Time Messaging Protocol (RTMP), Adobe HTTP Dynamic Streaming (HDS), Microsoft Smooth Streaming, RTSP/RTP/MPEG-TS, etc.
  • the system 10 may also be used for non-entertainment purposes, such as listening to live translations (secondary content) of an event (primary content), such as a conference presentation, news, etc.
  • the system 10 can be used for enhancing accessibility, particularly for the hearing impaired.
  • a viewer who is hard of hearing can play back the original language track (primary content) as secondary content, at a volume higher than that used by those of normal hearing.
  • the service can also be used as a cordless headphone facility, whereby the main language track is streamed wirelessly to a smartphone for a cordless headphone effect.
  • the platform can also be used for musical accompaniment needs.
  • the primary content may be the rhythm section (drum and bass) of a musical piece
  • the companion audio track (secondary content) contains other parts, such as guitar, keyboards, and other instruments in an arrangement.
  • Several devices can be used concurrently, making possible an array of backing tracks playing from several devices.
  • the invention can be used for an on-demand karaoke service, whereby "minus one" tracks (secondary content) are served based on a song (primary content) that was sung or hummed by the user.
  • the platform can also be used in reverse, whereby an audio track is detected and a companion video is streamed to the application.
  • the primary content device 14 and secondary content device 16 may be integrated into a single device.
  • For example, in the case of a Smart TV, the TV (primary content device) is integrated with web connectivity and functionalities (secondary content device).
  • the secondary content device 16 is described in the embodiment(s) as suited for the execution and presentation of the secondary content, it is to be appreciated that the presentation of secondary content, for example sub-titles, may be displayed or presented on the primary content device 14.
  • the secondary content device 16 (smartphone or computer) detects the primary content via audio watermark and/or fingerprint, and retrieves the secondary content at the suitable time interval for display on the primary content device 14.


Abstract

A system for identifying and synchronizing content comprising: a primary content player operable to execute a primary content; a second content player comprising a receiver for receiving a portion of the executed primary content; and an identification and synchronization server operable to be in data communication with the second content device to receive at least one parameter related to the primary content; the received parameter utilized for identification of the primary content and retrieval of a second content contextually related to the primary content; wherein upon retrieval of the second content, the server provides for the second content device to execute the second content in synchronization with the primary content.

Description

SYSTEM AND METHOD FOR IDENTIFYING AND SYNCHRONIZING
CONTENT
FIELD OF THE INVENTION
The invention relates to a system and method for identifying and synchronizing content. In particular, the system and method are suited for (but not limited to) the identification of media content such as movies or broadcast shows for synchronized presentation of related secondary content, such as multi-language tracks associated with the media content, and will be described in this context.

BACKGROUND ART
The following discussion of the background to the invention is intended to facilitate an understanding of the present invention only. It should be appreciated that the discussion is not an acknowledgement or admission that any of the material referred to was published, known or part of the common general knowledge of the person skilled in the art in any jurisdiction as at the priority date of the invention.
Multiple language audio tracks in a plurality of foreign languages are typically available for physically sold media content such as DVD movies embedded in the physical DVD media. Consumers who buy popular movie DVDs can expect the movie to contain subtitles and language tracks. The consumer can select an alternative language track as an option to play, which is generally a dubbed version of the original language track.
However, the ubiquitous TV programme and other broadcast (radio, satellite, etc.) content do not have accompanying foreign language tracks available for selection by a user. Broadcast content is either broadcast in the original language or in a dubbed foreign language. Some movies or TV shows may contain subtitles, but the subtitles are typically in one language. As such, there currently exists no solution for providing a plurality of companion language tracks to broadcast content available for selection by a user, as in the case of a DVD.
With more than 200 million migrants around the world, and many more traveling for business or pleasure, millions of people in foreign countries find themselves watching a movie or TV show, or listening to a radio program, in a language that they do not understand. There exists a need for an available selection of secondary content, such as multiple language tracks, which can be synchronized to the primary broadcast content, with neither the broadcast content nor the multiple language content being constrained to physical media such as DVDs.
While there have been existing solutions, none has been able to provide a resource-efficient, reliable and complete solution to the above-mentioned need. Most existing solutions for synchronizing the primary media content broadcast or presented on a first device with a second content playing or presented on a second device are constrained by issues of video latency arising from transmission distance/medium, and by the need for the user to manually input a show identifier, timing information, etc. to assist in synchronization.
In addition, currently there are no synchronization solutions which allow a broadcast content owner to control or orchestrate the presentation of the device/player held by an audience.
The present invention seeks to alleviate the above mentioned problems and needs, at least in part.
SUMMARY OF INVENTION
In accordance with a first aspect of the invention there is a system for identifying and synchronizing content comprising:- a primary content player operable to execute a primary content; a second content player comprising a receiver for receiving a portion of the executed primary content; and an identification and synchronization server operable to be in data communication with the second content device to receive at least one parameter related to the primary content; the received one parameter utilized for identification of the primary content and retrieval of a second content contextually related to the primary content.
Preferably, the primary content is identified using extraction of a watermark code from the received portion of the primary content; the watermark code comprises an alternation of an identifier of the primary content and a time code for recording a timing parameter of the primary content. In such an instance, upon identification of the primary content, the identification and synchronization server proceeds to retrieve a full fingerprint of the primary content, and the secondary content device is operable to match the full fingerprint of the primary content with an unidentified portion of the primary content to determine a match point used to execute the second content in synchronization with the primary content. The secondary content device may comprise a timer for calculating the time difference between the time the portion of the primary content is presented and the actual time the secondary content device receives the portion of the primary content. In addition, the identification and synchronization server may comprise a timer for calculating the time difference between the time the secondary content is retrieved and the time the secondary content is executed by the secondary content device.
Preferably, the system further comprises a publish-subscribe server to send event-triggered updates on the primary content to the identification and synchronization server.
Preferably, the primary content player and second content player are integrated as one single device. Preferably, there is a plurality of secondary content. Preferably, the identification and synchronization server is operable to, upon retrieval of the second content, provide for the second content device to execute the second content in synchronization with the primary content.

In accordance with a second aspect of the invention there is a method for identifying and synchronizing content between a primary content player operable to execute a primary content and a second content player operable to execute a secondary content comprising:- receiving a portion of a primary content on a secondary content device; sending at least one parameter related to the primary content to an identification and synchronization server; identifying the primary content using the at least one parameter; retrieving the secondary content contextually related to the identified primary content using the identified primary content; and synchronizing the secondary content on the secondary content device with the primary content.
Preferably, the method includes the step of extracting a watermark code from the received portion of the primary content before sending the watermark code to the identification and synchronization server; the watermark code comprises an alternation of an identifier of the primary content and a time code for recording a timing parameter of the primary content. The method may comprise an additional step wherein, after the step of identifying the primary content, the identification and synchronization server proceeds to retrieve a full fingerprint of the primary content. In addition, the step of synchronizing the secondary content on the secondary content device with the primary content may include matching an unidentified portion of the primary content to determine a match point used to execute the second content in synchronization with the primary content.
The step of synchronizing may include calculating the time difference between the time the portion of the primary content is presented and actual time the secondary content device receives the portion of the primary content. The step of synchronizing may further include calculating the time difference between the time the secondary content is retrieved and the time the secondary content is executed by the secondary content device.
BRIEF DESCRIPTION OF THE DRAWINGS
The present invention will now be described, by way of example only, with reference to the accompanying drawing, in which:
Fig. 1 is a system diagram according to an embodiment of the invention;
Fig. 2 illustrates the components of a secondary content device in accordance with an embodiment of the invention;
Fig. 3a to 3c illustrate the identification of content using audio fingerprints (Figs. 3a and 3b) or a watermark encoding scheme (Fig. 3c);
Fig. 4 illustrates an example of a content identification server with content management system;

Fig. 5 illustrates an example of the database entries of the reference server for look-up of secondary content in accordance with an embodiment of the invention;
Fig. 6 is a flow diagram of the method of identifying and synchronizing content according to another embodiment of the invention based on a hybrid watermark and fingerprint content identification technique;
Fig. 7 is a flow diagram of the method of identifying and synchronizing content according to another embodiment of the invention based on a pure watermark content identification technique;
Fig. 8 is a flow diagram of the method of identifying and synchronizing content according to another embodiment of the invention based on a pure fingerprinting content identification technique.

Other arrangements of the invention are possible and, consequently, the accompanying drawings are not to be understood as superseding the generality of the preceding description of the invention.
PREFERRED EMBODIMENTS OF THE INVENTION
In the context of describing this invention, the term "broadcast" is used generally to refer to any content that is capable of being transmitted from one media source to a plurality of receivers/viewers/listeners. One-to-many "broadcast content" includes, but is not limited to, TV, radio, satellite, IPTV, IP multicast, even shared-space experiences such as theatre screenings, conference presentations, etc. The term "on-demand content", on the other hand, which is used in the context of describing one-to-one content (i.e., one content source to one receiver/viewer/listener), includes DVD, media players, on-demand services such as YouTube, Netflix, iTunes, Amazon, etc. For purpose of illustration and description, both broadcast and on-demand content are referred to as primary content, which is to be distinguished from secondary content, i.e. contextual and/or supplementary content contextually related to the primary content, including but not limited to language tracks, audio commentary, supplementary video, web pages, program code, sub-titles in different languages, etc.
In accordance with an embodiment of the invention as shown in Fig. 1 there is a system 10 for identifying and synchronizing content comprising a primary content device 14, secondary content device 16, and identification and synchronization server 18. Primary content may be broadcasted by a source 12. Primary content may be media content such as audio tracks, videos, movies etc.
Primary content device 14 includes television, radio, Internet TV, DVD players, computers, movie projectors, portable media players, etc. or any other devices capable of playing or presenting the primary content. Secondary content device 16 is capable of presenting or playing secondary content. In the context of this invention, secondary content refers to content that a user consumes to supplement the primary content. Secondary content is usually, but not necessarily, contextually related to primary content. The preferred embodiments of this invention are particularly optimized for the consumption of language tracks as a secondary content.
Secondary content device 16 may be a computer, smartphone, or any compatible device such as Smart TV, tablet, etc. enabled to run the secondary content device application 17 to present the secondary content suitable for consumption by the user.
The secondary content device 16 comprises various components as shown in Fig. 2. The secondary content device 16 comprises a receiver such as an audio input mechanism 162 for receiving a portion of the played/executed primary content, the portion being audio sound from the presented primary content for further identification. Audio input 162 may be a microphone (built-in or connected externally) or a line input. An audio output 164 such as a speaker or headphone (wired or wireless) may be integrated for listening to synchronized language tracks and/or other audible secondary content. Secondary content device 16 may comprise a user interface 166 including keyboard, mouse, touch screen, buttons, etc., for controlling the secondary content device 16. A display 168, for example an LED display, may be used to visually depict functions or the state of the content or device 16.
Secondary content device 16 is enabled to be in data communication with the identification and synchronization server 18 to achieve the identification and synchronization process and for storage of audio content where necessary. The above functions are facilitated by capability for network connectivity 170 (e.g. Wifi, 3G etc.), a processor (CPU) 172, and memory 174.
A preferred secondary content device 16 is a smartphone which may be run on Android™ or iOS™ environment. The same may be achieved using other computing platforms— Windows mobile, Symbian, Windows, Mac OS X, Linux, etc.
The secondary content device application 17 may further comprise software components suitable for encoding or decoding the portion of audio data received from the primary content. The software components may include extractors 174, which decode audio watermarks embedded in the primary content, or generate audio fingerprints from the primary content.
The secondary content device application 17 may further comprise media player components that allows different types of secondary content (i.e., audio, video, HTML, Javascript, etc.) files to be handled in one or more of the following:- played, rendered, executed, etc..
The secondary content device 16 is operable to be in data communication with the identification and synchronization server 18. This may be via an Internet connection. The identification and synchronization server 18 comprises a main server 182 in data communication with the following:- a content identification server 184, a reference server 186, a secondary content server 188, and a publish- subscribe server 190.
The main server 182 functions as a gateway between the identification and synchronization server 18 and the Internet. Main server 182 comprises at least one application 183, which runs inter-process communications with the Content Identification Server 184, the Reference Server/Database 186, Secondary Content Server 188, Publish-Subscribe Server 190 and other support servers (not shown) where required. Main server 182 and application 183 may further comprise logging, content management and analytic capabilities.
Content Identification Server (CIS) 184 is primarily responsible for identifying the primary content. In the preferred embodiment of this invention, CIS 184 accesses a plurality of databases, as shown in Fig. 4. This includes, but is not limited to, the following:- 1) Electronic Program Guide (EPG) Database 184a contains a schedule of primary broadcast content programs, which includes each program's channel/station name/ID, date, time, content title/ID, and other program-related information. The EPG database 184a is used to identify primary broadcast content based on a channel/station identifier, date and time.
2) Fingerprint Database 184b is used to store audio fingerprints of various primary content— movies, TV shows, music, etc. The fingerprint database 184b may contain pre-generated fingerprints or fingerprints that are generated in real-time (i.e., fingerprints that are generated during actual content broadcast or presentation).
3) Watermark Database 184c is used to store watermark codes that are related to a particular primary content. The code may include the primary content ID, channel/station ID, content owner ID, timecode, or other types of information presumably related to the primary content the watermark is embedded in.
The above content identification databases 184a, 184b, and 184c may be managed in a decentralized manner by granting access to different broadcasters, content owners, or content managers. In the embodiment, a web-based Content Management System (CMS) 250 is utilized for managing these databases. The same CMS may also be used to manage the Reference Database 186 and Secondary Content Database 188.
The Reference Server 186 comprises a database 186a operable to relate primary content with secondary content for purpose of retrieval of secondary content upon identification of a primary content, for example multiple language tracks for a particular movie stored in the Secondary Content Server 188.
Entries of the database 186a, as shown in Fig. 5, include, but are not limited to, matched primary content ID 186b, related secondary content ID 186c, and secondary content metadata 186d (i.e. title, file name, language, file type, path, etc.). The relationship between a primary content and one or more secondary content may be established manually, by associating the primary and secondary content IDs (primary content ID 186b, secondary content ID 186c, and secondary content's metadata 186d) in the database of the Reference Server 186. Establishing the primary-secondary content relationships can be done directly via manual entries or via the CMS. In the database, this typically appears as fields in the same record, as shown in Fig. 5.
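The primary-to-secondary association described above can be sketched as a simple record lookup. The in-memory record format and field names below are illustrative assumptions standing in for the Reference Database 186a.

```python
# Stand-in for Reference Database 186a records: each row associates a
# primary content ID (186b) with a secondary content ID (186c) and its
# metadata (186d), as in Fig. 5.
REFERENCE_DB = [
    {"primary_id": "movie-042", "secondary_id": "trk-mandarin",
     "meta": {"language": "Mandarin", "file_type": "aac"}},
    {"primary_id": "movie-042", "secondary_id": "trk-hindi",
     "meta": {"language": "Hindi", "file_type": "aac"}},
]

def secondary_for(primary_id):
    """All secondary content records associated with an identified
    primary content ID; a movie may have several language tracks."""
    return [r for r in REFERENCE_DB if r["primary_id"] == primary_id]
```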
There can be a plurality of secondary content associated with a primary content. For example, a primary content e.g. a movie may have more than one secondary language tracks associated with the same; one may be in Mandarin, another in Hindi, etc.
The Secondary Content Server 188 contains a plurality of secondary content (i.e., companion language tracks) files, or other companion content (images, URL, HTML5 code, etc.). The Secondary Content Server 188 may also include transport capabilities, such as file transfers, streaming, etc. to serve the related secondary content to the secondary content device 16 of a user.
The publish-subscribe server 190 contains message-oriented middleware and a relevant software application that allows users to "subscribe" to particular "topics" (identified by topic IDs) of a primary content. Once subscribed, a user launching his secondary content device application 17 can "listen" to messages, events and notifications "published" by the topic publisher (i.e., broadcaster, content owner, brand, etc.), thereby making it unnecessary for users' applications to continually poll the server for changes such as more secondary content, play list changes, etc.

Upon receiving primary content from the primary content device 14 at the secondary content device 16, it is important to identify the primary content for subsequent retrieval of secondary content and synchronization of the two. There are several ways to identify the primary content. Specific to the context of the embodiments, where the secondary content is multiple language audio tracks associated with a primary content which is a movie playing in a default language, two of the most common approaches which may be used are audio fingerprinting and audio watermarking.
An audio fingerprint is a condensed digital summary, deterministically generated from an audio signal, that can be used to identify an audio sample or quickly locate similar items in an audio database. Practical uses of acoustic fingerprinting include identifying songs, melodies, tunes, or advertisements; sound effect library management; and video file identification.
An example of the content identification process using audio fingerprints is shown in Figure 3a. The Fingerprint Database 184b is populated by a plurality of audio fingerprints generated or extracted from different primary content, either offline during production, or in real-time during broadcast. The population of the fingerprint database 184b may be done in a variety of ways known to a skilled person and will not be elaborated further. Fingerprints in the database 184b may be associated with other information related to the primary content such as Primary Content ID, and any metadata (i.e., title, duration, content owner, etc.).
An unidentified primary Content Clip may be identified by extracting its fingerprint (Primary Content Clip Fingerprint) and matching it against a plurality of fingerprints stored in the Fingerprint Database 184b. If a match is found (step 320), the primary content is identified via its Content ID and relevant metadata which are passed on as parameters to subsequent synchronization processes.
The Fingerprint Database 184b may reside on the secondary content device 16, and/or on the server 18. The former presupposes that the application has a specific use (detecting a specific advertisement, for instance), the fingerprint for which is small enough to be reasonably stored in the content device 16. In the case of broad searches though, such as matching against a large fingerprint database of movies, a server-side Fingerprint Database 184b makes more practical sense.

Figure 3b illustrates the matching of a primary content with an unidentified primary content clip. The matching process derives a temporal matching point, which is the primary content's timestamp when the unidentified primary content clip fingerprint is matched. In the example shown in Figure 3b, the unidentified primary content clip's fingerprint matched a segment of an already identified primary content fingerprint in the database, and the matched segment ends at the 11.54155th-second point of the primary content. So the temporal match value is 11.54155 seconds. The temporal matching point used in this case is the tail of the segment. The head or beginning point of the matched segment may also be used as a temporal matching point, as long as the same choice is used consistently throughout the fingerprint matching system.
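The tail-of-segment temporal matching of Fig. 3b can be sketched as follows. Fingerprints are modeled here as lists of hashable per-frame features with a fixed frame duration; real fingerprint matching is approximate rather than exact, so this is an illustration of the temporal bookkeeping only.

```python
def temporal_match_point(reference_fp, clip_fp, frame_seconds):
    """Slide the unidentified clip's fingerprint over the reference
    fingerprint; on a match, return the reference timestamp at the TAIL of
    the matched segment (the temporal matching point of Fig. 3b), else
    None. The head could be used instead, as long as one choice is applied
    consistently."""
    n = len(clip_fp)
    for i in range(len(reference_fp) - n + 1):
        if reference_fp[i:i + n] == clip_fp:
            return (i + n) * frame_seconds  # end of matched segment, seconds
    return None

# Illustration: clip "cde" matches frames 2..4 of the reference; with
# 0.5 s frames the matched segment ends at the 2.5-second point.
match = temporal_match_point(list("abcdefgh"), list("cde"), 0.5)
```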
An audio watermark is a kind of digital watermark— a marker embedded in an audio signal, typically to identify the copyright owner for that audio. Primary content may be embedded with one or more watermark codes during production/post-production or live, in real-time, and/or a combination of both.
This invention's preferred watermark encoding scheme is shown in Fig. 3c. Primary content is embedded with a sequence of watermark codes. The preferred sequence comprises an alternation of an ID Code and a timecode. The ID Code may represent the channel/station ID, content ID, a key in the Watermark Database, or any other identifier of the primary content. The timecode, also known as the synchronization code, is a temporal reference denoting either the global clock-aligned time of day, the content's elapsed time relative to the beginning of the content, or an arbitrary temporal cue. The interval between ID Code and timecode is arbitrarily determined; in the example shown in Fig. 3c the interval is 20 seconds, but it may be shorter or longer. Using this intervallic timecode as a temporal reference, frame accuracy can be achieved by combining it with an application timer (client-side and/or server-side, depending on need), which measures the elapsed time since a particular timecode or synchronization code was detected. The invention will next be described in the form of the method for identifying and synchronizing the primary and secondary content.
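The alternating ID Code/timecode schedule, and the timer-based frame accuracy it enables, can be sketched as below. The schedule representation and function names are illustrative assumptions, not the actual encoder:

```python
def watermark_schedule(content_id, duration_s, interval_s=20):
    """Return (embed_time, code) pairs alternating an ID Code and an
    elapsed-time timecode at a fixed interval, as in Fig. 3c."""
    schedule = []
    t, emit_id = 0, True
    while t <= duration_s:
        if emit_id:
            schedule.append((t, ("ID", content_id)))
        else:
            schedule.append((t, ("TIME", t)))  # elapsed time since content start
        emit_id = not emit_id
        t += interval_s
    return schedule

def current_position(last_timecode_s, elapsed_since_detect_s):
    """Frame accuracy: last detected timecode plus the application-timer
    reading (time elapsed since that timecode was detected)."""
    return last_timecode_s + elapsed_since_detect_s
```

For instance, a timecode of 40 seconds detected 2.5 seconds ago places the content at the 42.5-second mark.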
The method may employ hybrid watermark/fingerprint identification, a pure watermark identification mechanism or a pure fingerprint identification mechanism.
With reference to Fig. 6, showing the flow diagram of hybrid watermark/fingerprint identification, a user begins the process by launching the application 17 on the secondary content device 16 (step 601). Where the secondary content device 16 is a smartphone, application 17 may be a dedicated software application, colloquially known as an "app", downloadable from the iTunes™ or Android™ app store.
Once launched, the application 17 extracts the watermark from the primary content being transmitted from the primary content device 14 (step 602) and picked up by audio input or microphone 162. The primary content, in this case, is presumed to contain a watermark. As such, the extraction process yields a watermark code which is then passed (step 603), along with relevant parameters (user ID, date, time, etc.) to the main server 182, for processing by the server application 183.
As an optional feature, after the watermark code is extracted, the code may be validated to be "within range" (i.e., has a value between 1 and 100, for instance). If the code is valid, then it is presumed to have a counterpart entry either in the EPG Database or the Watermark Database.
The server application 183 first attempts to identify the primary content represented by the received watermark code. The application does a lookup (step 604) on the EPG Database 184a to attempt to determine the primary content's identification based on the passed date and time parameters (step 605). If found (step 606), i.e. a match is determined to exist with an entry in the EPG Database 184a, the primary content is identified. Furthermore, the EPG Database 184a is used to determine if the primary content is "now showing"; that is, it is currently being broadcast (as contrasted with primary content that was pre-recorded or was requested on-demand). If the primary content is currently being shown 'real time', a topic ID of the primary content is also identified from the EPG Database 184a for the publish-subscribe application. If the primary content was not identified in the EPG Database 184a (step 607), it is presumed to be unscheduled broadcast content, which may include on-demand media. As such, application 183 looks up the Watermark Database 184c to identify the watermark code, which then yields its corresponding content ID. Once the primary content has been identified, application 183 then looks up the Fingerprint Database 184b for the full fingerprint of the primary content (step 608). Subsequently, the full fingerprint is passed back to the secondary content device 16, where it will be cached for later use (not shown).
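The EPG-first lookup with a Watermark Database fallback (steps 604 to 607) can be sketched as follows. The database shapes, key formats, and field names here are hypothetical, chosen only to show the control flow:

```python
def identify_primary(code, date, time, epg_db, watermark_db):
    """Try the EPG Database first; if the primary content is not scheduled
    there, fall back to the Watermark Database for unscheduled or
    on-demand content. Returns (content_id, topic_id), where topic_id is
    None unless the content is a "now showing" broadcast."""
    entry = epg_db.get((code, date, time))
    if entry is not None:
        # Scheduled broadcast: topic ID enables publish-subscribe later.
        return entry["content_id"], entry.get("topic_id")
    # Unscheduled / on-demand: the watermark code itself keys the content ID.
    return watermark_db.get(code), None
```

A real implementation would query actual database tables and handle the not-found case explicitly; the sketch only mirrors the branching described in the text.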
Application 183 then looks up the Reference Database 186 for one or more secondary content related to the primary content, as referenced by the content ID of the primary content (step 610). If more than one secondary content is referenced, which may be the case for movies with multiple language tracks, the user is given a choice (not shown) via a prompt on the secondary content device to select the desired secondary content for synchronization and play. Once a secondary content is selected by the user, the main server application 183 passes several parameters back to the secondary content device application 17. The parameters include, but are not limited to, the full fingerprint of the matched primary content, the topic ID (if any), and the related secondary content, i.e. the preferred language track (step 611). The secondary content device application 17 extracts a fingerprint (instead of a watermark) from the audio clip of the primary content (step 612). The extracted primary content fingerprint clip is next matched against the full fingerprint of the primary content (step 613), which was earlier retrieved from the Fingerprint Database 184b (step 608). The matching determines a temporal match point as described in Fig. 3b, denoted by the matched time.
The related secondary content (i.e. the language audio track) is then played starting at the matched/synchronized time based on the temporal match point. This effectively plays the secondary content in-synchronization with primary content (step 614).
The next steps (steps 615, 616) relate to the publish-subscribe method. The application 17 determines if the primary content is being played or executed 'real time', i.e. "now showing" or currently being broadcast, which can be determined based on its topic ID. If so, application 17 subscribes to the broadcaster's topic, referenced by its topic ID, in the Publish-Subscribe Server 190 (step 616). Subscribing allows the user application 17 to wait and "listen" for broadcaster updates, such as commercial interruptions, change of program notifications, etc., instead of incessantly querying the server for further fingerprint matches or status updates (step 617). In effect, publish-subscribe allows the publisher (broadcaster) to control the presentation of secondary content across all subscribers— all users currently "tuned in" to the primary content.
The above-described embodiment is resource efficient, as it requires very few calls from the secondary content device application 17 to server-side components. It also allows broadcasters to control the secondary content device application 17's presentation of secondary content, whether preset/scheduled/scripted or dynamic in real-time, including but not limited to play, pause, resume, stop, change/update secondary content, push playlists, etc.
Where the primary content is broadcast 'real time', the user application 17 may be programmed to periodically check if the user has changed channels by checking if the fingerprint profile of what the user is watching has changed. If so, application 17 "starts over", looping back to step 601 to extract a new primary content clip's watermark (step 601). An alternative embodiment using an all-watermark content identification mechanism may be desired by primary content broadcasters who only want to focus on providing secondary content for their programs. As such, this embodiment may not be suited for general-purpose use, particularly for handling non-watermarked content, such as on-demand content (DVD movies, Netflix, etc.).
The processing for the all-watermark content identification embodiment of this invention is shown in Fig. 7. Similarly, the process begins with a launch of the application 17 (step 701). The primary content would already have been watermarked with alternating channel ID and time-of-day codes, or an alternation of the content ID and the content sync code (the preferred mode for content with accompanying language tracks), which denotes the time elapsed since the content program began. This alternation between channel/content code and sync code is shown in Fig. 3c. Similar to step 602, the application 17 next extracts the watermark from the primary content being transmitted from the primary content device 14 (step 702) and picked up by the audio input or microphone 162.
Upon extraction, the primary content latency or delay is calculated (step 703). To simplify the description of latency detection, it shall be assumed that the secondary content device 16 and the server clocks are synchronized, using Network Time Protocol (NTP) or the like. Latency is then calculated using one or more sync codes extracted from the watermark. If the sync code is a time-of-day code, then latency is calculated as the difference between the sync code (time of day) and the secondary content device's 16 timestamp when that sync code was detected. For instance, if the sync code denoting 12 noon was detected by the secondary device at 12:00:01.5, then the latency is 1.5 seconds.
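Assuming NTP-synchronized clocks as stated, the time-of-day latency calculation reduces to a timestamp subtraction; a minimal sketch using Python's datetime arithmetic, with the example figures from the text:

```python
from datetime import datetime

def latency_from_time_of_day(sync_code_time, detected_at):
    """Latency = device detection timestamp minus the time of day encoded
    in the sync code (clocks assumed synchronized via NTP or the like)."""
    return (detected_at - sync_code_time).total_seconds()

# Example from the text: sync code for 12 noon, detected at 12:00:01.5.
noon = datetime(2014, 5, 2, 12, 0, 0)
detected = datetime(2014, 5, 2, 12, 0, 1, 500000)
```

Here `latency_from_time_of_day(noon, detected)` gives the 1.5-second latency of the example.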
Alternatively, if the sync code is a content time code, then the latency calculation is deferred to the server application 183. The secondary content device application 17 should have the necessary code for extracting watermarks from the primary content. Upon extraction, a secondary device application timer is started to note the elapsed time since the extraction was completed. This will later be used for synchronization of the secondary content play.
Once latency is calculated, the parameters are passed to the identification and synchronization server 18. The server application 183 looks up the EPG Database 184a (step 704) to identify the primary content based on the channel/content ID, time, and location (step 705). If the primary content is found and is determined to be currently airing (as opposed to a time-shifted program, such as those recorded by DVRs like Tivo™), then the broadcaster's topic (topic ID) is noted (not shown); otherwise, it is null. If the primary content was not found in the EPG Database 184a, the Watermark Database 184c is searched (step 707). Once the primary content is determined, the Reference Database is queried to determine one or more secondary content associated with the primary content's ID (step 708).
If the primary content sync code is a content-specific timecode (as opposed to the broadcaster's 24-hour timecode), which is the preferred watermark encoding scheme for content with companion language tracks, then a time difference (secondary content latency) is calculated as the difference between the server timestamp when the primary sync code was detected at the server side and the secondary device timestamp of the same sync code (step 709). For instance, if the content timecode 0:0:40 (40 seconds into the content) was detected at the server at 1:00:00 pm, and detected at the secondary content device at 1:00:02.750 pm, then the latency is 2.75 seconds. It can also be inferred that the primary content broadcast started at 12:59:20 pm (1 pm less 40 seconds), server-time.
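The content-timecode latency, and the inference of the broadcast start time, can be sketched with the figures from the example above. Function names are illustrative:

```python
from datetime import datetime, timedelta

def content_code_latency(server_detect, device_detect):
    """Latency between server-side and device-side detection of the
    same content sync code (clocks assumed NTP-synchronized)."""
    return (device_detect - server_detect).total_seconds()

def broadcast_start(server_detect, content_elapsed_s):
    """Infer when the primary content began, server-time: detection time
    minus the elapsed content time encoded in the sync code."""
    return server_detect - timedelta(seconds=content_elapsed_s)

# Example from the text: timecode 0:0:40 detected at the server at
# 1:00:00 pm and at the secondary content device at 1:00:02.750 pm.
server_t = datetime(2014, 5, 2, 13, 0, 0)
device_t = datetime(2014, 5, 2, 13, 0, 2, 750000)
```

With these inputs, `content_code_latency` yields 2.75 seconds and `broadcast_start` yields 12:59:20 pm, matching the worked example.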
The main server application 183 then responds to the secondary device application, passing the primary content ID, the broadcaster's topic, one or more secondary content, and latency, among other relevant parameters (step 710).
If the primary content is being aired, the secondary content device application 17 may subscribe to the broadcaster's topic to listen for subsequent updates (step 711).
The application then presents the secondary content (step 712), starting at the primary content's starting server-time, plus timer and latency. The steps of publish-subscribe are similar to steps 615, 616 and 617 and will not be elaborated further. In addition, the secondary content device application 17 is programmed to periodically check if the user has changed channels by checking if the watermark profile of what the user is watching has changed.
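The playback start position described above (elapsed content time at detection, plus the application-timer reading, plus measured latency) reduces to a simple sum. The parameter names below are illustrative:

```python
def secondary_start_position(primary_elapsed_at_detect, timer_elapsed, latency):
    """Position (in seconds into the secondary content) at which to start
    playback so it aligns with the primary content: the content's elapsed
    time when the sync code was detected, plus the application-timer
    elapsed time since detection, plus the measured latency."""
    return primary_elapsed_at_detect + timer_elapsed + latency
```

Continuing the earlier example: a 40-second timecode, a 3.0-second timer reading, and 2.75 seconds of latency would start the language track 45.75 seconds in.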
In the case of an all-fingerprint embodiment, the process is as shown in Fig. 8.
Similarly to the previously described embodiments, the process begins with a launch of the application 17 (step 801). The secondary content device application next either extracts a fingerprint clip or streams the fingerprint to the main server application 183. For purposes of illustration, a fingerprint clip is extracted (step 802).
The main server application 183 looks up the Fingerprint Database 184b to find a match (step 803). If not found, the server application 183 informs the secondary device application accordingly (step 804), and the fingerprint extraction starts anew (loop back to step 802).
If found, the server application 183 looks at the EPG Database (which can be consolidated with the Fingerprint Database 184b), to determine if the matched fingerprint is currently being broadcast (step 805). If so, the broadcaster's topic is noted. Thereafter one or more secondary content is identified from the Reference Database (step 806).
Latency, in this case, is calculated (step 807) by comparing the secondary device and server timestamps of a temporal match, such as that shown in Fig. 3b. If a reference point in the primary content clip was time stamped on the server-side at 12:00:00 pm, and the same was time stamped by the secondary content device application at 12:00:02.5, then the latency is the difference, or 2.5 seconds.
If the secondary content is time-sensitive, such as in the case of an audio (language track) or video file, then the entire fingerprint of the primary content is passed back to the secondary content device application 17, which is later used for the synchronized presentation of the secondary content (step 808).
Back at the secondary content device 16, the application 17 checks if the primary content is currently being broadcast (step 809). If so, the application subscribes to the server-side topic of the broadcaster (step 810), to await further updates, secondary content, instructions, etc.
The subsequent process is similar to the fingerprint aspect of the preferred hybrid watermark-fingerprint process, where the secondary content's starting position is synchronized with the primary content's current position. The publish-subscribe facility is especially important for primary content that is currently being broadcast. In such cases, the publishing mechanism can be used by the broadcast source to send instructions to the user application in advance or instantaneously. For instance, a broadcaster can publish the instruction "load content123, play it at 1:00:00pm" ahead of time, allowing all user applications subscribed to that broadcaster's topic to cache the said content on the secondary device and then play it back at the specified time. Or the published message could be for immediate execution, such as "play content555 now".
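The broadcaster-to-subscribers control flow can be illustrated with a minimal in-memory publish-subscribe sketch. The topic IDs and message fields are assumptions for illustration, not the actual protocol of the Publish-Subscribe Server 190:

```python
class PubSubBroker:
    """Toy broker: one topic per broadcaster, callbacks per subscriber."""

    def __init__(self):
        self.topics = {}  # topic_id -> list of subscriber callbacks

    def subscribe(self, topic_id, callback):
        # A user application "tunes in" to a broadcaster's topic.
        self.topics.setdefault(topic_id, []).append(callback)

    def publish(self, topic_id, message):
        # The broadcaster pushes one instruction to every subscriber,
        # instead of each device polling the server for updates.
        for callback in self.topics.get(topic_id, []):
            callback(message)

broker = PubSubBroker()
received = []
broker.subscribe("broadcaster42", received.append)
broker.publish("broadcaster42", {"cmd": "play", "content": "content555"})
```

After the publish call, every subscribed application has received the same instruction, which is how a single broadcaster message can orchestrate secondary content across all tuned-in users.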
The need for these databases largely depends on the preferred embodiment. An all-fingerprint system may only need the Fingerprint Database 184b, whereas a hybrid watermark-fingerprint embodiment needs at least the Watermark database 184c and Fingerprint database 184b.
These databases may be consolidated into one large database, or they may be partitioned separately, as illustrated in the embodiments. The invention described herein provides smartphone, Smart TV, and computer owners an on-demand capability to download or stream an accompanying audio track in the language of their preference, which is then played synchronized to the content that the user is watching or listening to. Furthermore, the invention allows broadcasters to remotely control (via preset scripts or live) the presentation of the language track and contextual content.
There currently exists no commercial Internet service that dynamically serves language tracks, synchronized to a broadcast content.
Audio watermarks and fingerprints have been used to identify songs, advertisements, or TV programs, but not to serve synchronized language tracks.
The advantages of the invention, described herein, over the prior art include:
Resource efficiency— Efficient utilization of resources (bandwidth consumption and computer processing capacity) is primarily achieved using: 1) a hybrid watermark/fingerprint identification and synchronization mechanism; and
2) a publish-subscribe mechanism for eliminating the need for the secondary content device to continuously poll the server.
Broadcaster control— Using a publish-and-subscribe mechanism, the broadcaster is given more control— particularly live, real-time orchestration— of the secondary content's presentation.
The preferred communication protocol is HTTP Live Streaming (HLS). Other alternatives include Real-Time Messaging Protocol (RTMP), Adobe HTTP Dynamic Streaming (HDS), Microsoft Smooth Streaming, RTSP/RTP/MPEG-TS, etc. It is to be understood that the above embodiments have been provided only by way of exemplification of this invention, such as those detailed below, and that further modifications and improvements thereto, as would be apparent to persons skilled in the relevant art, are deemed to fall within the broad scope and ambit of the present invention described:- • The system 10 may be used in any case where the secondary content is a companion audio track, not just language tracks. The companion audio track can be an accompanying commentary by the program's director, a cast member, or other third parties.
• The system 10 may also be used for non-entertainment purposes, such as listening to live translations (secondary content) of an event (primary content), such as a conference presentation, news, etc.
• The system 10 can be used for enhancing accessibility, particularly for the hearing impaired. A viewer who is hard of hearing can play back the original language track (primary content) at a volume higher than normal (secondary content). That said, the service can also be used as a cordless headphone facility, whereby the main language track is streamed wirelessly to a smartphone for a cordless headphone effect.
• The platform can also be used for musical accompaniment needs. For instance, the primary content may be the rhythm section (drum and bass) of a musical piece, whereas the companion audio track (secondary content) contains other parts, such as guitar, keyboards, and other instruments in an arrangement. Several devices can be used concurrently, making possible an array of backing tracks playing from several devices.
• The invention can be used for an on-demand karaoke service, whereby "minus one" tracks (secondary content) are served based on a song (primary content) that was sung or hummed by the user.
• The platform can also be used in reverse, whereby an audio track is detected and a companion video is streamed onto the application.
• While the preferred embodiment of this invention is optimal for orchestrating language tracks, other forms of secondary content can also be orchestrated, such as web pages, program codes, etc. • The primary content device 14 and secondary content device 16 may be integrated into a single device. For example, in the case of a Smart television system, the TV (primary content device) is integrated with web connectivity and functionalities (secondary content device).
• Although the secondary content device 16 is described in the embodiment(s) as suited for the execution and presentation of the secondary content, it is to be appreciated that the presentation of secondary content, for example sub-titles, may be displayed or presented on the primary content device 14. In such instances, the secondary content device 16 (smartphone or computer) detects the primary content via audio watermark and/or fingerprint, and retrieves the secondary content at the suitable time interval for display on the primary content device 14.
It is to be further appreciated that features from one or more embodiments as described may be combined to form further embodiments without departing from the scope of the present invention.

Claims

The Claims Defining the Invention are as Follows
1. A system for identifying and synchronizing content comprising:- a primary content player operable to execute a primary content;
a second content player comprising a receiver for receiving a portion of the executed primary content; and
an identification and synchronization server operable to be in data communication with the second content player to receive at least one parameter related to the primary content; the received at least one parameter utilized for identification of the primary content and retrieval of a second content contextually related to the primary content.
2. A system according to claim 1, wherein the primary content is identified using extraction of a watermark code from the received portion of the primary content; the watermark code comprises an alternation of an identifier of the primary content and a time code for recording a timing parameter of the primary content.
3. A system according to claim 2, wherein upon identification of the primary content the identification and synchronization server proceeds to retrieve a full fingerprint of the primary content.
4. A system according to claim 3, wherein the secondary content device is operable to match the full fingerprint of the primary content with an unidentified portion of the primary content to determine a match point used to execute the second content in synchronization with the primary content.
5. A system according to claim 1, wherein the system further comprises a publish-subscribe server to send event-triggered updates on the primary content to the identification and synchronization server.
6. A system according to claim 2, wherein the secondary content device comprises a timer for calculating time difference between the time the portion of the primary content is presented and actual time the secondary content device receives the portion of the primary content.
7. A system according to claim 6, wherein the identification and synchronization server comprises a timer for calculating time difference between the time the secondary content is retrieved and the time the secondary content is executed by the secondary content device.
8. A system according to claim 1, wherein the primary content player and second content player are integrated as one single device.
9. A system according to claim 1, wherein there is a plurality of secondary content.
10. A system according to claim 1, wherein the identification and synchronization server is operable to, upon retrieval of the second content, provide for the second content device to execute the second content in synchronization with the primary content.
11. A method for identifying and synchronizing content between a primary content player operable to execute a primary content and a second content player operable to execute a secondary content comprising:- receiving a portion of a primary content on a secondary content device;
sending at least one parameter related to the primary content to an identification and synchronization server;
identifying the primary content using the at least one parameter;
retrieving the secondary content contextually related to the identified primary content using the identified primary content; and synchronizing the secondary content on the secondary content device with the primary content.
12. A method according to claim 11 including the step of extracting a watermark code from the received portion of the primary content before sending the watermark code to the identification and synchronization server; the watermark code comprising an alternation of an identifier of the primary content and a time code for recording a timing parameter of the primary content.
13. A method according to claim 12, wherein after the step of identifying the primary content the identification and synchronization server proceeds to retrieve a full fingerprint of the primary content.
14. A method according to claim 13, wherein the step of synchronizing the secondary content on the secondary content device with the primary content includes matching an unidentified portion of the primary content to determine a match point used to execute the secondary content in synchronization with the primary content.
15. A method according to claim 12, wherein the step of synchronizing includes calculating the time difference between the time the portion of the primary content is presented and actual time the secondary content device receives the portion of the primary content.
16. A method according to claim 15, wherein the step of synchronizing includes calculating the time difference between the time the secondary content is retrieved and the time the secondary content is executed by the secondary content device.
PCT/SG2014/000194 2013-05-03 2014-05-02 System and method for identifying and synchronizing content WO2014178796A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
SG201303433 2013-05-03
SG2013034335 2013-05-03

Publications (1)

Publication Number Publication Date
WO2014178796A1 true WO2014178796A1 (en) 2014-11-06

Family

ID=51843787

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/SG2014/000194 WO2014178796A1 (en) 2013-05-03 2014-05-02 System and method for identifying and synchronizing content

Country Status (1)

Country Link
WO (1) WO2014178796A1 (en)


Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20100280641A1 (en) * 2009-05-01 2010-11-04 David Henry Harkness Methods, apparatus and articles of manufacture to provide secondary content in association with primary broadcast media content
WO2012092247A1 (en) * 2010-12-30 2012-07-05 Thomson Licensing Method and system for providing additional content related to a displayed content
WO2013040533A1 (en) * 2011-09-16 2013-03-21 Umami Co. Second screen interactive platform


Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10907371B2 (en) 2014-11-30 2021-02-02 Dolby Laboratories Licensing Corporation Large format theater design
US11885147B2 (en) 2014-11-30 2024-01-30 Dolby Laboratories Licensing Corporation Large format theater design
EP3323244B1 (en) * 2015-07-16 2021-12-29 Inscape Data, Inc. System and method for improving work load management in acr television monitoring system
WO2019121904A1 (en) * 2017-12-22 2019-06-27 Nativewaves Gmbh Method for synchronizing an additional signal to a primary signal
US11570506B2 (en) 2017-12-22 2023-01-31 Nativewaves Gmbh Method for synchronizing an additional signal to a primary signal
EP4178212A1 (en) 2017-12-22 2023-05-10 NativeWaves GmbH Method for synchronising an additional signal to a main signal


Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 14792073

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 14792073

Country of ref document: EP

Kind code of ref document: A1