WO2016204873A1 - Systems and methods for embedding a user interaction function into a video - Google Patents


Info

Publication number
WO2016204873A1
Authority
WO
WIPO (PCT)
Prior art keywords
hotpath
video
data stream
embedding
stream
Prior art date
Application number
PCT/US2016/029890
Other languages
English (en)
Inventor
Jay Monahan
Original Assignee
Hotpathz, Inc.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hotpathz, Inc. filed Critical Hotpathz, Inc.
Publication of WO2016204873A1 (published in French)


Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/80 Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
    • H04N21/85 Assembly of content; Generation of multimedia applications
    • H04N21/858 Linking data to content, e.g. by linking an URL to a video object, by creating a hotspot
    • H04N21/8583 Linking data to content, e.g. by linking an URL to a video object, by creating a hotspot by creating hot-spots
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20 Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/23 Processing of content or additional data; Elementary server operations; Server middleware
    • H04N21/235 Processing of additional data, e.g. scrambling of additional data or processing content descriptors
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20 Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/23 Processing of content or additional data; Elementary server operations; Server middleware
    • H04N21/236 Assembling of a multiplex stream, e.g. transport stream, by combining a video stream with other content or additional data, e.g. inserting a URL [Uniform Resource Locator] into a video stream, multiplexing software data into a video stream; Remultiplexing of multiplex streams; Insertion of stuffing bits into the multiplex stream, e.g. to obtain a constant bit-rate; Assembling of a packetised elementary stream
    • H04N21/23614 Multiplexing of additional data and video streams
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/47 End-user applications
    • H04N21/472 End-user interface for requesting content, additional data or services; End-user interface for interacting with content, e.g. for content reservation or setting reminders, for requesting event notification, for manipulating displayed content
    • H04N21/4722 End-user interface for requesting content, additional data or services; End-user interface for interacting with content, e.g. for content reservation or setting reminders, for requesting event notification, for manipulating displayed content for requesting additional data associated with the content
    • H04N21/4725 End-user interface for requesting content, additional data or services; End-user interface for interacting with content, e.g. for content reservation or setting reminders, for requesting event notification, for manipulating displayed content for requesting additional data associated with the content using interactive regions of the image, e.g. hot spots
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/80 Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
    • H04N21/85 Assembly of content; Generation of multimedia applications
    • H04N21/854 Content authoring
    • H04N21/8545 Content authoring for generating interactive applications
    • H ELECTRICITY
    • H03 ELECTRONIC CIRCUITRY
    • H03M CODING; DECODING; CODE CONVERSION IN GENERAL
    • H03M7/00 Conversion of a code where information is represented by a given sequence or number of digits to a code where the same, similar or subset of information is represented by a different sequence or number of digits
    • H03M7/30 Compression; Expansion; Suppression of unnecessary data, e.g. redundancy reduction
    • H03M7/40 Conversion to or from variable length codes, e.g. Shannon-Fano code, Huffman code, Morse code

Definitions

  • the present invention generally relates to video streaming.
  • the present invention is directed to Systems and Methods for Embedding Information into a Video.
  • users can interact with their delivered video in many of the same ways they interact with online videos (e.g., play on demand, etc.), such that users are beginning to expect certain functionality to be co-delivered with the video: functionality such as clickable hyperlinks within the video frame, external links that lead the viewer to a specific website or other internet address, pop-outs that display information or images, or playing video or audio alongside of or instead of the video previously being played.
  • a hyperlink may be considered a connection between an element, such as a word, phrase, symbol or object in a document, such as a hypertext document, and a different element in the same document, another document, file or script.
  • the hyperlink may be activated by a user clicking or otherwise selecting the hyperlink.
  • the browser may be redirected to the element or other document.
  • the concept of a hyperlink may also be applied to images, particularly via a "map" tag on images in hypertext markup language. For example, when a user clicks on a region having the map tag, the browser is redirected to the linked webpage.
  • a method of allowing a user to interact with a video stream comprising: accessing the video stream, the video stream including a plurality of objects; developing a hotpath data stream, the hotpath data stream including a plurality of moveable clickable areas, wherein each of the plurality of moveable clickable areas is associated with a corresponding respective one of the plurality of objects; embedding the hotpath data stream with the video stream in a file container; and decoding the file container such that the user can interact with ones of the plurality of objects via the corresponding respective one of the plurality of moveable clickable areas.
  • a system for allowing a user to retrieve information related to objects found in a video stream comprising: a computing device, the computing device including a processor having a set of instructions, the set of instructions configured to: access the video stream, the video stream including a plurality of objects; develop a hotpath data stream, the hotpath data stream including information and a moveable clickable area associated with each of the plurality of objects; embed the hotpath data stream with the video stream in a file container.
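  • The claimed flow (access the video stream, develop the hotpath data stream, embed it in a file container, decode it for interaction) can be sketched at a high level. All function and field names below ("develop_hotpath_stream", "pos", "size", etc.) are illustrative assumptions, and a dict stands in for a real multimedia wrapper:

```python
import json

def develop_hotpath_stream(objects):
    """Build a hotpath data stream: two or more data sets per object,
    each giving a time, a position, and an MCA size (field names assumed)."""
    return [{"object": name, "datasets": datasets}
            for name, datasets in objects.items()]

def embed_in_container(video_bytes, hotpath_stream):
    """Toy 'file container': a dict standing in for a real wrapper
    (e.g., MOV, WebM, or OGG) carrying an arbitrary metadata track."""
    return {"video": video_bytes, "hotpath": json.dumps(hotpath_stream)}

def decode_container(container):
    """Recover both tracks so a player can render MCAs over the video."""
    return container["video"], json.loads(container["hotpath"])

video = b"...frames..."
hotpath = develop_hotpath_stream({
    "car": [{"time": 0.0, "pos": [10.0, 20.0], "size": [5.0, 5.0]},
            {"time": 1.0, "pos": [15.0, 22.0], "size": [5.0, 5.0]}]})
container = embed_in_container(video, hotpath)
recovered_video, recovered_hotpath = decode_container(container)
```

  • The sketch only illustrates the claimed sequence of steps; a real implementation would multiplex the hotpath track into a standard container rather than a dict.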
  • FIG. 1 is a block diagram of an embedding system according to an embodiment of the present invention
  • FIG. 2 is an illustration of the development of a hotpath data stream on a frame by frame basis
  • FIG. 3 is a block diagram of an exemplary process of embedding data in a video stream according to an embodiment of the present invention
  • FIG. 4 is a block diagram of the build of a video container or "wrapper" that incorporates hotpath data according to an embodiment of the present invention
  • FIG. 5 is another block diagram of the build of a video container that incorporates a hotpath data stream according to an embodiment of the present invention
  • FIG. 6 is an image of an embedded hotpath data stream within a video frame using an unused variable length code according to an embodiment of the present invention
  • FIG. 7 is a table showing the coding for embedding a hotpath data stream in an audio file.
  • FIG. 8 is a block diagram of a computing environment suitable for use with systems and methods of the present invention.
  • systems and methods for embedding information in a video stream embed a hotpath data stream within a video feed.
  • the system and method can define a data micro-format for "moveable clickable areas" (MCAs) for any desired object within a video and then embed this data into either: a) a data track, when using a video container format that supports arbitrary metadata tracks, b) the subtitle data stream, c) the video data stream, d) the audio data stream, or combinations of the same.
  • the MCA contains information about the object, including a dataset regarding the MCA's size and relative location as well as other information, such as, but not limited to, a hyperlink, a text box, an image, a scoring algorithm, and an expectant motion.
  • because the system allows the data to be carried with the video instead of in a separate database file, the data is more easily matched with the object(s) of interest in the video, and a larger amount of data can be put into the video (data associated with a single object or many objects) with no degradation in video loading or streaming speed.
  • system 100 includes a computer 104 that generally has, among other possible components, an operating system 108, a hotpath creator 112, a processor 116, a memory 120, an input device 124, and an output device 128.
  • computer 104 may be one of various computing or entertainment devices, such as, but not limited to, a personal computer, personal digital assistant, a set top box (e.g., Web TV, Internet Protocol TV, etc.), a mobile device (e.g., a smartphone, a tablet, etc.), and a system 700 (shown and described in FIG. 8).
  • Computer 104 may include additional applications aside from operating system 108 and hotpath creator 112.
  • computer 104 can include applications that record, play back, and store video streams, such as, but not limited to, the VLC media player, by VideoLAN Organization of Paris, France.
  • a browser program also may be used with or without other applications to play back video streams or to facilitate access to one or more embedded piece of information contained in the video, e.g., hyperlinks, redirection to websites, etc.
  • Input device 124 and output device 128 may be included with computer 104 so as to facilitate the receipt of video streams from external sources.
  • Input device 124 and output device 128 may include particular encoders, decoders, and communication interfaces (e.g., wired, wireless, etc.) that support various communication protocols, transportation protocols, and other protocols used to receive the video stream and other data.
  • Input device 124 and output device 128 may also further provide functionality to send other data and information.
  • output device 128 is a display that is capable of showing a video stream with MCAs associated with particular objects in the video stream.
  • hotpath creator 112 may use a display so that a user can identify or select an object or objects in a series of frames and associate information, such as a hyperlink, with the object(s), as further discussed below.
  • multiple objects 132, object 132A (a car) and 132B (a soccer ball) appear in a series of frames 136 (e.g., 136A-D).
  • Each object 132 has an area 140 (e.g., 140A and 140B) associated with the object.
  • a user could, for example, associate information, such as the type of object 132, projected path of the object, etc., with each object, by selecting each object at each frame 136.
  • a hotpath data stream is a series of two or more data sets that include the location (defined herein as at least a time and position), other desired information, and a size of a moving clickable area (MCA) (an MCA may be two or three dimensional and take on almost any desired shape) that is associated with and is positioned around or over an object of interest in the video.
  • a hotpath data stream includes more than two data sets related to each object of interest during the time the object appears in the video, because additional data sets allow the MCA to move through a series of locations with more accurate, fluid movement, to be resized, and to follow objects along non-trivial paths.
  • the data set is also configured to allow for multiple types of movement, e.g., linear, curvilinear, spline, etc.
  • hotpath creator 112 acquires an array of object related information, e.g., locations, MCAs, other information, by monitoring the user's interactions with objects in the video.
  • the video is stopped or paused by the user, automatically upon the appearance of a new object, or on a scheduled basis, e.g., every 10 frames, so as to allow the user to select desired object(s) and to associate information and an MCA with the object(s).
  • a user can reposition and resize the MCA as desired so as to provide for better tracking of the object.
  • the location is recorded, and any additional content the user desires to associate with the object can be added or removed.
  • the MCA can be moved or modified during which hotpath creator 112 records the movement of the MCA as a new dataset for that object.
  • the user can move the video to the desired playback time and reposition and resize the MCA.
  • an automated object tracking system allows the user to follow and resize desired objects.
  • a software program such as MATLAB®, produced by MathWorks, Inc. of Natick, MA, can be used to detect the moving objects in each video frame, follow the objects over time, and provide a hotpath data stream from the information collected.
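  • As a rough illustration of what such an automated tracker produces, the sketch below is a deliberately simplified stand-in for MATLAB or OpenCV tracking (all names and the frame representation are assumptions): it finds the bounding box of a single bright object in each synthetic frame and emits one hotpath data set per frame.

```python
def track_object(frames):
    """Simplified tracker: find the bounding box of nonzero pixels in
    each frame and emit one hotpath data set (time, pos, size) per frame."""
    datasets = []
    for t, frame in enumerate(frames):
        pts = [(x, y) for y, row in enumerate(frame)
               for x, v in enumerate(row) if v]
        if not pts:
            continue  # object absent from this frame
        xs = [p[0] for p in pts]
        ys = [p[1] for p in pts]
        datasets.append({
            "time": t,
            "pos": [min(xs), min(ys)],                        # top-left corner
            "size": [max(xs) - min(xs) + 1, max(ys) - min(ys) + 1]})
    return datasets

# A one-pixel "object" moving right across three 4x4 frames.
frames = [
    [[0, 0, 0, 0], [0, 1, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]],
    [[0, 0, 0, 0], [0, 0, 1, 0], [0, 0, 0, 0], [0, 0, 0, 0]],
    [[0, 0, 0, 0], [0, 0, 0, 1], [0, 0, 0, 0], [0, 0, 0, 0]],
]
hotpath = track_object(frames)
```

  • A production tracker would of course operate on decoded video frames and handle multiple objects, occlusion, and resizing; the point here is only the shape of the emitted data sets.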
  • the desired path accuracy of the MCA for each object is dependent upon, among other things, the frequency by which a user, or an algorithmic hotpath data set generation program, specifies the location and size of the MCA.
  • each MCA may have a few locations per second, e.g., 10 locations per second.
  • the MCA can be programmed to follow other types of paths between the locations, e.g., curvilinear.
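  • The default linear motion between recorded locations can be sketched as a playback-side interpolation. Field names ("time", "pos", "size") are assumptions; a curvilinear path would substitute a different easing function for the linear blend below.

```python
def mca_at(datasets, t):
    """Linearly interpolate MCA position and size between the two
    recorded data sets that bracket time t."""
    datasets = sorted(datasets, key=lambda d: d["time"])
    if t <= datasets[0]["time"]:
        return datasets[0]["pos"], datasets[0]["size"]
    for a, b in zip(datasets, datasets[1:]):
        if a["time"] <= t <= b["time"]:
            f = (t - a["time"]) / (b["time"] - a["time"])
            lerp = lambda u, v: [ui + f * (vi - ui) for ui, vi in zip(u, v)]
            return lerp(a["pos"], b["pos"]), lerp(a["size"], b["size"])
    return datasets[-1]["pos"], datasets[-1]["size"]  # past the last data set

sets = [{"time": 0.0, "pos": [10.0, 10.0], "size": [4.0, 4.0]},
        {"time": 2.0, "pos": [20.0, 30.0], "size": [8.0, 4.0]}]
pos, size = mca_at(sets, 1.0)   # halfway between the two data sets
```

  • Halfway through, the MCA sits at the midpoint of the two recorded positions with a width midway between the two recorded widths, which is exactly the "linear path between locations" behavior described above.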
  • the data sets developed by hotpath creator 112 can be configured to work with the same video at different resolutions; as such, the values used to indicate the position and size of each MCA are, in a preferred embodiment, not absolute pixel locations but relative numbers that correspond to percentages of the video's frame size.
  • the data set format is flexible and expansible so as to allow for additional attributes of each object to be associated with the object and that can be incorporated into multiple types of existing metadata or subtitle mechanisms of the various video file formats, available now, or in the future.
  • a suitable data set format can be an extendable JSON-style text format.
  • a text-based, JSON-style comma-separated name/value pairs format is used.
  • a JSON-style data format for two objects, e.g., Object One and Object Two, in a rectangular coordinate space can be described as follows:
  • Object One has a first data set that includes a hyperlink, e.g., http://www.somewhere.com/, and defines a first MCA which starts at time 1 :21 at position 15.3, 46.2 with size 24.5 x 18.7, and moves to time 1 :33 and position 19.8, 55.2 with size 28.1 x 22.6.
  • Object Two has three data sets in contrast to Object One's two data sets.
  • Object Two includes a hyperlink, e.g., http://www.otherplace.com/, and moves from point A to B to C during its appearance in the video. The data sets associated with Object Two include an additional attribute, i.e., a "note".
  • the data format used is flexible in that it can add any number of additional attributes to the hotpath data stream. For example, if added support for curved motion paths is desired (as opposed to linear projections), the data set can include an attribute that would result in the MCA's motion being curvilinear between locations.
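  • The patent's actual JSON listing is not reproduced in this excerpt, so the following is a hedged reconstruction from the description above. Object One's two data sets use the values given (time 1:21 to 1:33, positions and sizes expressed as percentages of the frame), while Object Two's times and coordinates are invented for illustration, as are all field names ("x", "y", "w", "h", "link", "note"):

```python
import json

# Field names and Object Two's numeric values are illustrative only.
hotpath_stream = [
    {"object": "One",
     "link": "http://www.somewhere.com/",
     "datasets": [
         {"time": "1:21", "x": 15.3, "y": 46.2, "w": 24.5, "h": 18.7},
         {"time": "1:33", "x": 19.8, "y": 55.2, "w": 28.1, "h": 22.6}]},
    {"object": "Two",
     "link": "http://www.otherplace.com/",
     "datasets": [
         {"time": "1:05", "x": 5.0, "y": 5.0, "w": 10.0, "h": 10.0,
          "note": "point A"},
         {"time": "1:10", "x": 40.0, "y": 20.0, "w": 10.0, "h": 10.0,
          "note": "point B"},
         {"time": "1:15", "x": 70.0, "y": 60.0, "w": 10.0, "h": 10.0,
          "note": "point C"}]},
]
text = json.dumps(hotpath_stream)   # comma-separated name/value pairs
```

  • The extra "note" attribute on Object Two shows the extensibility described above: any additional attribute can be attached to a data set without changing the format's overall shape.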
  • the shape of an MCA can be modified to be customized to the user's desires, such as, but not limited to, ovals, rectangles, stars, and human-shaped.
  • Hotpath data streams are particularly useful for video applications that require a large degree of interaction with the user.
  • Current applications include, but are not limited to, a driver training app wherein the user must notice and indicate they have seen, in the correct sequence, a variety of items included in a film of an actual drive in a congested area; a quarterback training app wherein the user must recognize and indicate they have seen the intended movements of key defensive players pre-snap and then indicate the choices available to them post-snap in a film of an actual scrimmage; a law enforcement training app wherein the user must recognize and indicate they have properly assessed a variety of threats contained in a film of a crowd; a baseball training app wherein the user must recognize and indicate their understanding of signals from the batting coach, first base coach, etc.
  • method 200 for the embedding of a hotpath data stream within a video stream.
  • method 200 allows for the inclusion of data and information that relates to objects found in video stream and for the retrieval of this data and information by a user.
  • method 200 combines multiple data streams without significantly impacting playback speed or recording/playback quality by selectively combining the hotpath data stream with the other streams or in lieu of other streams. Additionally, by embedding the hotpath data stream within the video stream, the two streams can be tightly coupled to each other, providing consistent object/hotpath registration and tracking.
  • a video stream is accessed by the user.
  • the video stream can be stored or recorded locally, retrieved from an external server or other storage device, or can be retrieved or produced via methods known in the art.
  • the video stream can include, among other things, images, still frames, audio, or combinations of the same.
  • the video stream is a prerecorded scene that depicts a desired setting for training an end user of the video stream with the embedded hotpath data stream.
  • a hotpath data stream is developed.
  • a hotpath data stream is developed by a hotpath creator, such as hotpath creator 112 described above with reference to FIG. 1.
  • hotpath creator includes an automated object tracking system, such as the system described herein.
  • the hotpath data stream developed at step 208 is associated/embedded within the video stream.
  • the hotpath data stream is placed into a multimedia file container via one of the association/embedding methodologies discussed in more detail below with reference to FIGS. 3 and 4.
  • a multimedia file container 300 which can also be called a wrapper, "wraps up", for example, a video stream 304, an audio stream 308, a subtitle stream 312, and a hotpath stream 316 (each of the aforementioned can also be called a "track") into a chronological, time-linear, single delivery stream or file by including, with the streams, a header 320 that includes data regarding how each of the streams is to be treated upon display to a user.
  • file container 300 encapsulates each stream and allows for the interleaving of audio, video, and other data inside a single package.
  • the use of header 320 allows file container 300 to administer overhead tasks such as packet framing, ordering, interleaving, error detection, and periodic timestamps for seeking video, audio, or subtitle information.
  • each stream is compressed and coded by an encoder or codec.
  • the codec encodes the stream on the originating end (during file container creation) and decodes it on the receiving end (during display).
  • a codec describes how video or audio data is to be compressed and decompressed. Codecs are traditionally licensed exclusively to a certain format. For example, the WMV video codec is only used in Windows Media containers.
  • File container types include, but are not limited to, QuickTime ® (a video player from Apple, Inc. of Cupertino, CA) (.MOV), Matroska, AVI, WMV, and MPEG-4.
  • Common codecs include, but are not limited to, MPEG-4, H.264, DivX, AAC, Vorbis (audio codec), and SRT.
  • Some file containers including, but not limited to, OGG and FFmpeg, are multimedia container formats, meaning they can contain additional streams beyond the traditional video, audio, and subtitle streams.
  • audio stream 308 may be delivered to audio encoder 408
  • video stream 304 may be delivered to video encoder 404
  • subtitle stream 312 may be delivered to subtitle encoder 412
  • hotpath stream 316 may be delivered to hotpath encoder 416.
  • the encoded stream can be included in the file container via several different embedding options, including metadata embedding 420, subtitle embedding 424, video embedding 428, and audio embedding 432.
  • a multimedia container format is chosen that supports arbitrary time-based metadata tracks.
  • Some available container formats meeting these criteria include, but are not limited to, MOV, WebM, and OGG.
  • an appropriate decoding application, such as VLC media player (mentioned above), that includes the appropriate codecs (including a hotpath codec) would be used to play back the video stream, audio stream, subtitle stream (if used), and hotpath data stream.
  • the hotpath codec specifies the format of the hotpath data stream on the creating end (as discussed above) and provides information on the interpretation, display, and actions of the hotpath data stream on the receiving end.
  • the file container developed can also include metadata that specifies how the data streams inside the file container are organized and which codecs are necessary so as to decipher the data streams.
  • the encoded hotpath data stream can be embedded using subtitle embedding 424, which treats the hotpath data stream similarly to a subtitle track, which is supported by many video container file formats, such as, but not limited to, MP4 and MKV.
  • the hotpath data stream is embedded and multiplexed (interleaved) into the video file along with the video and audio tracks instead of one of the subtitle tracks.
  • a multimedia file header such as header 320 (FIG. 4) includes information to alert the playback device to use the hotpath data stream contained in (one of) the subtitle data streams.
  • This approach is generally compatible with many video container formats, and is useful for video container formats that do not support arbitrary metadata tracks (i.e., would not support metadata embedding 420).
  • an appropriate decoding application would be used to appropriately play back the video stream with the embedded hotpath data stream (now substituted for the subtitle track).
  • the hotpath data stream can alternatively be encoded in either the video or audio data streams using either video embedding 428 or audio embedding 432.
  • Video embedding 428 entails encoding the hotpath data stream into the video stream by re-purposing unused variable length codes (VLCs) present in the video stream.
  • the video stream can be modeled as a linear sequence of JPEG images displayed one after another at a video rate.
  • in practice, MPEG takes advantage of the redundancy between successive frames to further compress the file. Nevertheless, the analogy works for illustration purposes.
  • because the JPEG standard (a commonly used method of lossy compression for digital images) defines 162 different VLCs for alternating component (AC) coefficients, many codes are not used during image compression.
  • unused codes can be found and repurposed to store hotpath data.
  • to use video embedding 428 to encode the hotpath data stream into the video data stream, a map of "unused" VLCs is developed by frame number and position within the frame, and the hotpath data stream file is compressed and sequentially inserted into the previously mapped unused VLCs. The presence of repurposed VLCs and the code mapping relationships to the hotpath data stream can be stored in the header of the multimedia container.
  • This method of embedding permits file-sized hotpath data streams to be inserted into a compressed video domain such as MPEG-4 or H.265, with little quality distortion.
  • the hotpath data stream is encoded using Huffman coding (a lossless data compression algorithm) and embedded in the "unused" VLCs that have been previously mapped/identified.
  • decoding includes two primary steps: 1) VLC mapping reconstruction using the mapping relationship information contained in the video data stream header, and 2) decompression of the hotpath data stream using a decoding table contained in the video container header.
  • a Huffman table (also called a Huffman tree) in the file container header allows the decoder to sequentially reconstitute the hotpath data stream from the repurposed VLCs.
  • a Huffman table is read from the file header.
  • the presence of this table allows for the examination of the video data stream header.
  • the file header includes information on where in the video stream to look for VLCs. The implementation time of this approach is comparable to the previously discussed embodiments where a separate hotpath track is included in the file container.
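  • The encode/decode round trip described above can be sketched with a toy Huffman coder and a hypothetical map of unused VLC slots (here modeled as (frame, position) pairs); all names below are assumptions, and a real implementation would operate on the actual entropy-coded bitstream rather than Python strings.

```python
import heapq
from collections import Counter

def huffman_table(text):
    """Toy Huffman coder (symbol -> bitstring); the real table would
    travel in the file-container header."""
    heap = [[count, sym] for sym, count in sorted(Counter(text).items())]
    heapq.heapify(heap)
    code = {sym: "" for _, sym in heap}
    if len(heap) == 1:
        code[heap[0][1]] = "0"          # degenerate single-symbol input
    while len(heap) > 1:
        lo, hi = heapq.heappop(heap), heapq.heappop(heap)
        for sym in lo[1]:               # symbols are single characters
            code[sym] = "0" + code[sym]
        for sym in hi[1]:
            code[sym] = "1" + code[sym]
        heapq.heappush(heap, [lo[0] + hi[0], lo[1] + hi[1]])
    return code

def embed_in_unused_vlcs(bits, slot_map, chunk=8):
    """slot_map: ordered (frame, position) pairs previously identified
    as unused VLCs; each slot carries one chunk of compressed bits."""
    chunks = [bits[i:i + chunk] for i in range(0, len(bits), chunk)]
    assert len(chunks) <= len(slot_map), "not enough unused VLCs"
    return dict(zip(slot_map, chunks))

def reconstitute(embedded, slot_map, code):
    """Rebuild the bitstream in slot order, then Huffman-decode it."""
    bits = "".join(embedded[s] for s in slot_map if s in embedded)
    inverse = {v: k for k, v in code.items()}
    out, cur = [], ""
    for b in bits:
        cur += b
        if cur in inverse:              # prefix code: match is unambiguous
            out.append(inverse[cur])
            cur = ""
    return "".join(out)

hotpath_text = '{"object":"One","x":15.3,"y":46.2}'
code = huffman_table(hotpath_text)
bits = "".join(code[c] for c in hotpath_text)
slots = [(f, p) for f in range(50) for p in range(4)]   # hypothetical map
payload = embed_in_unused_vlcs(bits, slots)
assert reconstitute(payload, slots, code) == hotpath_text
```

  • The two decoding steps above mirror the two steps in the text: rebuilding the VLC mapping (slot order) from header information, then decompressing the hotpath data with the Huffman table carried in the container header.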
  • FIG. 6 shows an illustration of a video frame 500, including an object 504 and a VLC 508.
  • when VLC 508 is encountered in the video, it is decompressed or expanded using the Huffman table.
  • a hotpath data stream 512 is reconstituted from VLC 508, thus providing the data related to object 504, for example, the size and shape of the MCA, a note, an object type, etc.
  • turning to FIG. 7, another option for embedding a hotpath data stream is audio embedding 432, in which the hotpath data stream is embedded in an audio stream, such as, but not limited to, AAC, that is included in the multimedia container.
  • Techniques for embedding information in digital audio include, but are not limited to, parity coding, phase coding, spread spectrum, echo hiding, and least significant bit (LSB) insertion.
  • LSB insertion is used to embed a hotpath data stream in the left or right audio channel of a video/audio recording. With this technique, the LSB of the binary sequence of each sample of a digital audio file is replaced with part of the hotpath information.
  • the letter "D", which has an ASCII code of 68, or 01000100, is embedded in the last bit of each of eight bytes of an audio file.
  • in column 604 there is shown the original audio code; in column 608, the exemplary text to embed (e.g., the letter "D"); and in column 612, the embedded text (far-right number of each row).
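  • The FIG. 7 operation can be sketched directly: replace the LSB of eight consecutive 8-bit samples (the sample values below are hypothetical, not those of the figure) with the bits of "D" (ASCII 68, binary 01000100), then read them back.

```python
def lsb_embed(samples, payload_byte):
    """Replace the least significant bit of eight consecutive audio
    samples with the bits of one payload byte (most significant first)."""
    bits = [(payload_byte >> (7 - i)) & 1 for i in range(8)]
    return [(s & ~1) | b for s, b in zip(samples, bits)]

def lsb_extract(samples):
    """Recover the payload byte from the LSBs of eight samples."""
    byte = 0
    for s in samples[:8]:
        byte = (byte << 1) | (s & 1)
    return byte

audio = [200, 113, 90, 41, 250, 7, 88, 163]   # hypothetical 8-bit samples
stego = lsb_embed(audio, ord("D"))            # "D" = 68 = 0b01000100
```

  • Each sample value changes by at most 1, which is why the technique injects only a low level of noise into the audio track.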
  • the LSB coding method can inject a low level of noise into the audio track (approximately 6 dB per bit), which may impact some uses. However, for certain videos, e.g., those with road noise or other significant background noise, more than one bit may be used.
  • For other applications where high-quality sound is desired, variations of LSB coding can be employed, such as the zigzag LSB coding method, where information is inserted into the last bit of the audio in a zigzag fashion. In this embodiment, on average, only half of the bits are used, thereby maintaining higher audio quality. As with the previously described method of embedding the hotpath information into the video data stream, the header(s) within the audio stream are used to signal the presence of the embedded hotpath stream.
  • the hotpath data stream can be readily synchronized to objects and events occurring in the video.
  • the hotpath data stream can be read by the video player having the appropriate codecs.
  • the MCA may be highlighted or otherwise visible to the user so that the user would know that the object could be tapped (when computer 104 is implemented as or includes a touchscreen) or clicked for more information.
  • the video player would move an object's MCA according to each dataset related to the object.
  • the video player would determine the position of the object in-between datasets using an expectant motion algorithm, until the object is no longer associated with an MCA (the object need not necessarily be off the video display).
  • the system described herein does not have a limit to the number of objects that can be tracked using MCAs.
  • decoding is performed by the media/video player employed by the user, which has the appropriate codecs to decode the information for playback.
  • FIG. 8 shows a diagrammatic representation of one embodiment of a computing system in the exemplary form of a system 700, e.g., computing device 104, within which a set of instructions may cause a processor 705 to perform any one or more of the aspects and/or methodologies, such as method 200, or to perform the encoding, decoding, and embedding functions described in the present disclosure. It is also contemplated that multiple computing devices, such as computing device 104, or mobile devices, or combinations of computing devices and mobile devices, may be utilized to implement a specially configured set of instructions for causing the performance of any one or more of the aspects and/or methodologies of the present disclosure.
  • System 700 includes a processor 705 and a memory 710 that communicate with each other via a bus 715.
  • Bus 715 may include any of several types of communication structures including, but not limited to, a memory bus, a memory controller, a peripheral bus, a local bus, and any combinations thereof, using any of a variety of architectures.
  • Memory 710 may include various components (e.g., machine-readable media) including, but not limited to, a random access memory component (e.g., a static RAM "SRAM" or a dynamic RAM "DRAM"), a read-only component, and any combinations thereof.
  • a basic input/output system 720 (BIOS), including basic routines that help to transfer information between elements within system 700, such as during startup, may be stored in memory 710.
  • Memory 710 may also include (e.g., stored on one or more machine-readable media) instructions (e.g., software) 725 embodying any one or more of the aspects and/or methodologies of the present disclosure.
  • memory 710 may further include any number of program modules including, but not limited to, an operating system, one or more application programs, other program modules, program data, and any combinations thereof.
  • System 700 may also include a storage device 730.
  • Examples of a storage device include, but are not limited to, a hard disk drive for reading from and/or writing to a hard disk, a magnetic disk drive for reading from and/or writing to a removable magnetic disk, an optical disk drive for reading from and/or writing to an optical media (e.g., a CD or a DVD), a solid-state memory device, and any combinations thereof.
  • Storage device 730 may be connected to bus 715 by an appropriate interface (not shown).
  • Example interfaces include, but are not limited to, SCSI, advanced technology attachment (ATA), serial ATA, universal serial bus (USB), IEEE 1394 (FIREWIRE), and any combinations thereof.
  • storage device 730 may be removably interfaced with system 700 (e.g., via an external port connector (not shown)). Particularly, storage device 730 and an associated non-transitory machine-readable medium 735 may provide nonvolatile and/or volatile storage of machine-readable instructions, data structures, program modules, and/or other data for system 700.
  • instructions 725 may reside, completely or partially, within non-transitory machine-readable medium 735. In another example, instructions 725 may reside, completely or partially, within processor 705.
  • System 700 may also include a connection to one or more systems or software modules included with system 700. Any system or device may be interfaced to bus 715 via any of a variety of interfaces (not shown), including, but not limited to, a serial interface, a parallel interface, a game port, a USB interface, a FIREWIRE interface, a direct connection to bus 715, and any combinations thereof. In one example, a user of system 700 may enter commands and/or other information into system 700 via an input device (not shown).
  • Examples of an input device include, but are not limited to, an alpha-numeric input device (e.g., a keyboard), a pointing device, a joystick, a gamepad, an audio input device (e.g., a microphone, a voice response system, etc.), a cursor control device (e.g., a mouse), a touchpad, an optical scanner, a video capture device (e.g., a still camera, a video camera), a touch screen (as discussed above), and any combinations thereof.
  • a user may also input commands and/or other information to system 700 via storage device 730 (e.g., a removable disk drive, a flash drive, etc.) and/or a network interface device 745.
  • A network interface device, such as network interface device 745, may be utilized for connecting system 700 to one or more of a variety of networks, such as network 750, and one or more remote devices 755 connected thereto. Examples of a network interface device include, but are not limited to, a network interface card, a modem, and any combination thereof.
  • Examples of a network include, but are not limited to, a wide area network (e.g., the Internet, an enterprise network), a local area network (e.g., a network associated with an office, a building, a campus, or other relatively small geographic space), a telephone network, a direct connection between two computing devices, and any combinations thereof.
  • A network, such as network 750, may employ a wired and/or a wireless mode of communication. In general, any network topology may be used.
  • Information (e.g., data, instructions 725, etc.) may be communicated to and/or from system 700 via network interface device 745.
  • System 700 may further include a video display adapter 760 for communicating a displayable image to a display device 765.
  • Examples of a display device 765 include, but are not limited to, a liquid crystal display (LCD), a cathode ray tube (CRT), a plasma display, and any combinations thereof.
  • system 700 may include a connection to one or more other peripheral output devices including, but not limited to, an audio speaker, a printer, and any combinations thereof.
  • Peripheral output devices may be connected to bus 715 via a peripheral interface 770. Examples of a peripheral interface include, but are not limited to, a serial port, a USB connection, a FIREWIRE connection, a parallel connection, a wireless connection, and any combinations thereof.
  • A method of allowing a user to interact with a video stream, comprising: accessing the video stream, the video stream including a plurality of objects; developing a hotpath data stream, the hotpath data stream including a plurality of moveable clickable areas, wherein each of the plurality of moveable clickable areas is associated with a corresponding respective one of the plurality of objects; embedding the hotpath data stream with the video stream in a file container; and decoding the file container such that the user can interact with ones of the plurality of objects via the corresponding respective one of the plurality of moveable clickable areas.
  • the embedding is metadata embedding.
  • the hotpath data stream is embedded as an arbitrary time-based metadata track.
  • the embedding is subtitle embedding.
  • a plurality of subtitle streams are provided with the video stream, and wherein one of the plurality of subtitle streams is replaced by the hotpath data stream.
  • the embedding is video embedding.
  • the video stream includes a plurality of unused variable length codes, and wherein the hotpath data stream is encoded into the video stream by re-purposing the plurality of unused variable length codes.
  • the plurality of unused variable length codes are statistically determined.
  • the hotpath data stream is encoded in the video stream using Huffman coding.
  • the embedding is audio embedding.
  • the audio embedding uses least significant bit (LSB) insertion to embed the hotpath data stream in an audio stream.
  • A system for allowing a user to retrieve information related to objects found in a video stream, comprising: a computing device, the computing device including a processor having a set of instructions, the set of instructions configured to: access the video stream, the video stream including a plurality of objects; develop a hotpath data stream, the hotpath data stream including information and a moveable clickable area associated with each of the plurality of objects; embed the hotpath data stream with the video stream in a file container; and decode the file container such that the user can interact with ones of the plurality of objects via the corresponding respective one of the plurality of moveable clickable areas.
  • embedding the hotpath data stream is performed using metadata embedding.
  • In certain embodiments, the hotpath data stream is embedded as an arbitrary time-based metadata track.
  • In certain embodiments, embedding the hotpath data stream is performed using subtitle embedding.
  • In certain embodiments, a plurality of subtitle streams are provided with the video stream, and one of the plurality of subtitle streams is replaced by the hotpath data stream.
  • In certain embodiments, embedding the hotpath data stream is performed using video embedding.
  • In certain embodiments, the video stream includes a plurality of unused variable length codes, and the hotpath data stream is encoded into the video stream by re-purposing the plurality of unused variable length codes.
  • In certain embodiments, the plurality of unused variable length codes are statistically determined.
  • the hotpath data stream is encoded in the video stream using Huffman coding.
  • embedding the hotpath data stream is performed using audio embedding.
  • the audio embedding uses least significant bit (LSB) insertion to embed the hotpath data stream in an audio stream.
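The claims above describe a "hotpath" data stream of moveable clickable areas (MCAs) carried alongside the video, for example as an arbitrary time-based metadata track or a re-purposed subtitle stream. As an illustration only (the application fixes no schema; the field names, coordinate convention, and choice of WebVTT-style cues below are assumptions), an MCA micro-format and its serialization as a timed cue track might look like this:

```python
import json

def make_mca(object_id, x, y, w, h, hyperlink=None, text=None):
    # One moveable clickable area (MCA). Field names are illustrative;
    # coordinates are fractions of the frame (0.0-1.0) so the clickable
    # region scales with the rendered video size.
    return {"id": object_id, "x": x, "y": y, "w": w, "h": h,
            "href": hyperlink, "text": text}

def to_cue_track(samples):
    # Serialize (start, end, [MCA, ...]) tuples as WebVTT-style cues.
    # A player that exposes subtitle/metadata cues can json.loads each
    # cue payload and hit-test mouse clicks against the MCA rectangles.
    lines = ["WEBVTT", ""]
    for start, end, mcas in samples:
        lines.append(f"{start} --> {end}")
        lines.append(json.dumps(mcas))
        lines.append("")
    return "\n".join(lines)

track = to_cue_track([
    ("00:00:01.000", "00:00:02.500",
     [make_mca("shirt", 0.40, 0.30, 0.15, 0.20,
               hyperlink="https://example.com/shirt")]),
])
print(track)
```

Because the MCA positions are keyed to cue timing, an object that moves across the frame is tracked simply by emitting a new cue with updated coordinates for each interval.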
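Several embodiments state that the hotpath data stream is encoded into the video stream using Huffman coding over statistically determined unused variable length codes. The application gives no algorithm; as a sketch under that reading only, a generic byte-level Huffman coder for the payload (the payload literal is a made-up MCA fragment, not from the application) can be written as:

```python
import heapq
from collections import Counter

def huffman_table(data):
    # Build a prefix-code table mapping each byte value in `data` to a
    # bit-string. The counter field breaks frequency ties so that the
    # heap never compares the dict payloads.
    heap = [(freq, i, {sym: ""})
            for i, (sym, freq) in enumerate(Counter(data).items())]
    heapq.heapify(heap)
    if len(heap) == 1:                      # degenerate one-symbol payload
        ((_, _, table),) = heap
        return {sym: "0" for sym in table}
    counter = len(heap)
    while len(heap) > 1:
        f1, _, t1 = heapq.heappop(heap)     # two least frequent subtrees
        f2, _, t2 = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in t1.items()}
        merged.update({s: "1" + c for s, c in t2.items()})
        heapq.heappush(heap, (f1 + f2, counter, merged))
        counter += 1
    return heap[0][2]

def encode(data, table):
    # Concatenate the variable-length codes for each payload byte.
    return "".join(table[b] for b in data)

payload = b'{"id":"shirt","x":0.4,"y":0.3}'   # hypothetical MCA record
table = huffman_table(payload)
bits = encode(payload, table)
```

In the claimed scheme, the resulting bit patterns would then be mapped onto VLC table entries that a statistical scan of the encoded video shows the encoder never emits, so the hotpath payload never collides with real video symbols.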
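The final embodiments embed the hotpath data stream in the audio stream by least significant bit (LSB) insertion. A minimal sketch over bare 16-bit PCM sample values follows; a real encoder would add framing, a length header, and error correction, none of which the application specifies:

```python
def embed_lsb(pcm_samples, payload):
    # Hide payload bytes, MSB first, in the least significant bit of
    # successive 16-bit PCM samples. Each sample changes by at most 1,
    # which is inaudible for typical program audio.
    bits = [(byte >> i) & 1 for byte in payload for i in range(7, -1, -1)]
    if len(bits) > len(pcm_samples):
        raise ValueError("payload too large for carrier")
    out = list(pcm_samples)
    for i, bit in enumerate(bits):
        out[i] = (out[i] & ~1) | bit
    return out

def extract_lsb(pcm_samples, n_bytes):
    # Recover n_bytes hidden by embed_lsb.
    data = bytearray()
    for byte_idx in range(n_bytes):
        value = 0
        for bit_idx in range(8):
            value = (value << 1) | (pcm_samples[byte_idx * 8 + bit_idx] & 1)
        data.append(value)
    return bytes(data)

carrier = [1000, -1000, 32767, -32768] * 8   # 32 fake PCM samples
stego = embed_lsb(carrier, b"hi")            # 2 bytes use 16 samples
assert extract_lsb(stego, 2) == b"hi"
```

At 8 audio samples per payload byte, even a low 8 kHz mono stream carries roughly 1 KB of hotpath data per second, ample for a handful of MCA records per frame.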

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Databases & Information Systems (AREA)
  • Human Computer Interaction (AREA)
  • Computer Security & Cryptography (AREA)
  • Two-Way Televisions, Distribution Of Moving Picture Or The Like (AREA)

Abstract

Systems and methods for embedding a "hotpath" data stream in a video feed are disclosed. The system and method allow a data micro-format to be defined for "moveable clickable areas" (MCAs) for any desired object in a video, and this data to then be embedded in one of the following, or combinations thereof: a) a data track, when using a video media format that supports arbitrary metadata tracks; b) the subtitle data stream; c) the video data stream; d) the audio data stream. The MCA contains information relating to the object, including a data set concerning the size and relative location of the MCA, as well as other information such as, but not limited to, a hyperlink, a text box, an image, a scoring algorithm, and an idle motion.
PCT/US2016/029890 2015-06-13 2016-04-28 Systems and methods for embedding a user interaction function in a video WO2016204873A1 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US201562175232P 2015-06-13 2015-06-13
US62/175,232 2015-06-13

Publications (1)

Publication Number Publication Date
WO2016204873A1 (fr) 2016-12-22

Family

ID=57545898

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2016/029890 WO2016204873A1 (fr) 2015-06-13 2016-04-28 Systems and methods for embedding a user interaction function in a video

Country Status (1)

Country Link
WO (1) WO2016204873A1 (fr)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP1332470B1 (fr) * 2000-07-06 2010-09-08 Thomas W. Meyer Steganographic embedding of data in digital signals
US20140181882A1 (en) * 2012-12-24 2014-06-26 Canon Kabushiki Kaisha Method for transmitting metadata documents associated with a video


Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
PURNAMASARI ET AL.: "Clickable and Interactive Video System Using HTML5", ICOIN, IEEE, 12 February 2014 (2014-02-12), pages 232 - 237, XP032586923, Retrieved from the Internet <URL:http://ieeexplore.ieee.org/xpl/login.jsp?tp=&arnumber=6799697&url=http%3A%2F%2Fieeexplore.ieee.org%2Fxpls%2Fabs_all.jsp%3Farnumber%3D6799697> [retrieved on 20160627] *

Similar Documents

Publication Publication Date Title
US9852762B2 (en) User interface for video preview creation
US8701008B2 (en) Systems and methods for sharing multimedia editing projects
US11350184B2 (en) Providing advanced playback and control functionality to video client
US8612623B2 (en) Protection of delivered media
JP6969013B2 (ja) Method, apparatus, and storage medium for synchronized playback of media files
JP7052070B2 (ja) Method, apparatus, and storage medium for network playback of media files
US20130335447A1 (en) Electronic device and method for playing real-time images in a virtual reality
KR20110056476A (ko) Multimedia distribution and playback systems and methods using enhanced metadata structures
US11006192B2 (en) Media-played loading control method, device and storage medium
WO2015060165A1 (fr) Display processing device, distribution device, and metadata
US11818406B2 (en) Data storage server with on-demand media subtitles
CN110545460B (zh) Media file preloading method, apparatus, and storage medium
KR20220031560A (ko) Information processing device, information processing method, playback processing device, and playback processing method
US9078049B2 (en) Protection of internet delivered media
US9070403B2 (en) Processing of scalable compressed video data formats for nonlinear video editing systems
WO2016204873A1 (fr) Systems and methods for embedding a user interaction function in a video
JP2016538755A (ja) Method for playing back and individually storing audio and video tracks over the internet
CN110545480A (zh) Media file preloading control method, apparatus, and storage medium
US11838602B2 (en) MPD chaining in a live CMAF/DASH player using W3C media source and encrypted extensions
US11973820B2 (en) Method and apparatus for mpeg dash to support preroll and midroll content during media playback
US10531142B2 (en) Multimedia progress tracker
KR20230086792A (ko) Method and apparatus for supporting preroll and midroll during media streaming and playback
WO2012037033A2 (fr) Protection of a multimedia file distributed over the internet

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 16812093

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 16812093

Country of ref document: EP

Kind code of ref document: A1