WO1997041504A1 - A method and system for synchronizing and navigating multiple streams of isochronous and non-isochronous data
- Publication number
- WO1997041504A1, PCT/US1997/006982
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- isochronous
- data
- streams
- data streams
- user
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/40—Information retrieval; Database structures therefor; File system structures therefor of multimedia data, e.g. slideshows comprising image and additional audio data
- G06F16/41—Indexing; Data structures therefor; Storage structures
Definitions
- the present invention generally relates to the production and delivery of video recordings of speakers giving presentations, and, more particularly, to the production and delivery of digital multimedia programs of speakers giving presentations.
- These digital multimedia programs consist of multiple synchronized streams of isochronous and non-isochronous data, including video, audio, graphics, text, hypertext, and other data types.
- a video camera is used to record the event onto a video tape, which is subsequently duplicated to an analog medium suitable for distribution, most commonly a VHS tape, which can be viewed using a commercially-available VCR and television set.
- Such video tapes generally contain a video recording of the speaker and a synchronized audio recording of the speaker's words. They may also contain a video recording of any visual aids which the speaker used, such as text or graphics projected in a manner visible to the audience.
- Such video tapes may also be edited prior to duplication to include a textual transcript of the audio component recording, typically presented on the bottom of the video display as subtitles.
- Such subtitles are of particular use to the hearing impaired, and if translated into other languages, are of particular use to viewers who prefer to read along in a language other than the language used by the speaker.
- Analog tape players offer limited navigation facilities, generally limited to fast forward and rewind capabilities.
- analog tapes have the capacity to store only a few hours of video and audio, resulting in the need to duplicate and distribute a large number of tapes, leading to the accumulation of a large number of such tapes by viewers.
- Isochronous data is data that is time ordered and must be presented at a particular rate.
- the isochronous data contained in such a digital recording generally includes video and audio.
- Non-isochronous data may or may not be time ordered, and need not be presented at a particular rate.
- Non-isochronous data contained in such a digital recording may include graphics, text, and hypertext.
- a method and system for manipulating multiple streams of isochronous and non-isochronous digital data including synchronizing multiple streams of isochronous and non-isochronous data by reference to a common time base, supporting navigation through each stream in the manner most appropriate to that stream, defining a framework of conceptual events and allowing a user to navigate through the streams using this structured framework, identifying the position in each stream corresponding to the position selected in the navigated stream, and simultaneously displaying to the user some or all of the streams at the position corresponding to the position selected in the navigated stream.
- a method and system of efficiently supporting sequential and random access into streams of isochronous and non-isochronous data across non-isochronous networks including reading the isochronous and non-isochronous data from the storage medium into memory of the server CPU, transmitting the data from the memory of the server CPU to the memory of the client CPU, and caching the different types of data in the memory of the client CPU in a manner that ensures continuous display of the isochronous data on the client CPU display device.
- FIG. 1 is a schematic diagram of the organization of a data processing system incorporating an embodiment of the present invention.
- FIGS. 2 and 3 are schematic diagrams of the organization of the data in an embodiment of the present invention.
- FIG. 4 is a diagram showing how two different sets of "conceptual events" may be associated with the same presentation in an embodiment of the present invention.
- FIGS. 5, 6 and 9 are exemplary screens produced in accordance with an embodiment of the present invention.
- FIGS. 7, 8, 10, and 11 are flow charts indicating the operation of an embodiment of the present invention.
- a data processing system 100 incorporating the invention.
- Conventional elements of the system include a client central processing unit 110 which includes high-speed memory, a local storage device 112 such as a hard disk or CD-ROM, input devices such as keyboard 114 and pointing device 116 such as a mouse, and a visual data presentation device 118, such as a computer display screen, capable of presenting visual data perceptible to the senses of a user, and an audio data presentation device 120, such as speakers or headphones, capable of presenting audio data to the senses of a user.
- server central processing unit 130 which includes high-speed memory, a local storage device 132 such as a hard disk or CD-ROM, input devices such as keyboard 134 and pointing device 136, and a visual data presentation device 138, and an audio data presentation device 140.
- the client CPU is connected to the server CPU by means of a network connection 150.
- the invention includes three basic aspects: (1) synchronizing multiple streams of isochronous and non-isochronous data, (2) navigating through the synchronized streams of data by means of a structured framework of conceptual events, or by means of the navigational method most appropriate to each stream, and (3) delivering the multiple synchronized streams of isochronous and non-isochronous data over a non-isochronous network connecting the client CPU and the server CPU.
- An exemplary form of the organization of the data embodied in the invention is shown in FIG. 2 and FIG. 3.
- the video/audio stream 200 is of a type known in the art capable of being played on a standard computer equipped with the appropriate video and audio subsystems, such as shown in FIG. 1.
- An example of such a video/audio stream is Microsoft Corporation's AVI™ format, which stands for "audio/video interleaved."
- AVI™ and other such video/audio formats consist of a series of digital images, each referred to as a frame of the video, and a series of samples that make up the digital audio.
- the frames are spaced equally in time, so that displaying consecutive frames on a display device at a sufficiently high and constant rate produces the sensation of continuous motion to the human perceptual system.
- the rate of displaying frames typically must exceed ten to fifteen frames per second to achieve the effect of continuous motion.
- the audio samples are synchronized with the video frames, so that the associated audio can be played in synchronization with the displayed video images. Both the digital images and digital audio samples may be compressed to reduce the amount of data that must be stored or transmitted.
- a time base 210 associates a time code with each video frame.
- the time base is used to associate other data with each frame of video.
- the audio data, which for the purposes of this invention consists primarily of spoken words, is transcribed into a textual format, called the Transcript 220.
- the transcript is synchronized to the audio data stream by assigning a time code to each word, producing the Time-Coded Transcript 225.
- the time codes (shown in angle brackets) preceding each word in the Time-Coded Transcript correspond to the time at which the speaker begins pronouncing that word. For example, the time code 230 of 22.51 s is associated with the word 235 "the."
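- The Time-Coded Transcript format above can be sketched in Python. The angle-bracket serialization and the function name are assumptions for illustration; the patent does not specify a file format:

```python
import re

def parse_time_coded_transcript(text):
    """Parse a time-coded transcript in which each word is preceded by
    its time code in angle brackets (e.g. "<22.51>the"). Returns a
    list of (time_in_seconds, word) pairs. The serialized format is a
    hypothetical one chosen for this sketch."""
    return [(float(t), w)
            for t, w in re.findall(r"<(\d+(?:\.\d+)?)>(\S+)", text)]

sample = "<0.00>Welcome <0.45>to <22.51>the <22.70>presentation"
words = parse_time_coded_transcript(sample)
```

Any per-word serialization with monotonically increasing time codes would serve equally well; the essential property is that each word carries the time at which the speaker begins pronouncing it.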
- the Time-Coded Transcript may be created manually or by means of an automatic procedure.
- Manual time-coding requires a person to associate a time code with each word in the transcript.
- Automatic time coding, for example, uses a speech recognition system of a type well-known in the art to automatically assign a time code to each word as it is recognized and recorded.
- the current state of the art of speech recognition systems renders automatic time coding of the transcript less economical than manual time coding.
- the set 310 of Slides S1 311, S2 312, ... that the speaker used as part of the presentation may be stored in an electronic format of any of the types well-known in the art.
- Each slide may consist of graphics, text, and other data that can be rendered on a computer display.
- a Slide Index 315 assigns a time code to each Slide.
- Slide S1 311 would have a time code 316 of 0 s, S2 312 a time code 317 of 20.40 s, and so on.
- the time code corresponds to the time during the presentation at which the speaker caused the specified Slide to be presented.
- all of the Slides are contained in the same disk file, and the Slide Index contains pointers to the locations of each Slide in the disk file.
- each Slide may be stored in a separate disk file, and the Slide Index contains pointers to the files containing the Slides.
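- The two Slide Index layouts described above (all Slides in one disk file with pointers, or one disk file per Slide) might look like this in Python; the index contents, offsets, and file names are illustrative assumptions:

```python
# Hypothetical Slide Index layouts, keyed by time code in seconds
# (per FIG. 3: S1 at 0 s, S2 at 20.40 s).
single_file_index = {0.0: 0, 20.40: 48_212}   # time -> byte offset in "slides.dat"
per_file_index = {0.0: "slide_s1.gif",        # time -> separate slide file
                  20.40: "slide_s2.gif"}

def slide_location(time_code, index):
    """Look up where the Slide presented at a given time code is stored."""
    return index[time_code]
```

Either layout supports the same lookup; the single-file variant trades simpler distribution for the need to seek within one large file.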
- An Outline 320 of the presentation is stored as a separate text data object.
- the Outline is a hierarchy of topics 321, 322, ... that describe the organization of the presentation, analogous to the manner in which a table of contents describes the organization of a book.
- the outline may consist of an arbitrary number of entries, and an arbitrary number of levels in the hierarchy.
- An Outline Index 325 assigns a time code to each entry in the Outline. The time code corresponds to the time during the presentation at which the speaker begins discussing the topic represented by the entry in the Outline.
- topic 321 “Introduction” has entry name “01” and time code 326 of 0 s
- topic 322 "The First Manned Flight” has entry name “02” and time code 327 of 20.50 s
- “The Wright Brothers” 323 has entry name "021” (and hence is a subtopic of topic 322) with time code 328 of 120.05 s, and so on.
- the Outline and the Outline Index may be created by means of a manual or an automatic procedure.
- Manual creation is accomplished by a person viewing the presentation, authoring the Outline, and assigning a time code to each element in the outline.
- Automatic creation may be accomplished by automatically constructing the outline consisting of the titles of each of the Slides, and associating with each entry on the Outline the time code of the corresponding Slide. Note that manual and automatic creation may produce different Outlines.
- the set 330 of Hypertext Objects 331, 332, ... relating to the subject of the presentation may be stored in electronic formats of various types well-known in the art.
- Each Hypertext Object may consist of graphics, text, and other data that can be rendered on a computer display, or pointers to other software applications, such as spreadsheets, word processors, and electronic mail systems, as well as more specialized applications such as proficiency testing applications or computer-based training applications.
- a Hypertext Index table 335 is used to assign two time codes and a display location to each Hypertext Object.
- the first time code 336 corresponds to the earliest time during the presentation at which the Hypertext Object relates to the content of the presentation.
- the second time code 337 corresponds to the latest time during the presentation at which the Hypertext Object relates to the content of the presentation.
- the Object Name 338 denotes the Hypertext Object's name.
- the display location 339 denotes how the connection to the Hypertext Object, referred to as the Hypertext Link, is to be displayed on the computer screen.
- Hypertext Links may be displayed as highlighted words in the Transcript or the Slides, as buttons or menu items on the end-user interface, or in other visual presentation that may be selected by the user.
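- The Hypertext Index, with its pair of time codes per object, supports an interval lookup: the links to display at any moment are those whose interval covers the current position. A minimal Python sketch, with hypothetical object names and display locations:

```python
# Hypothetical Hypertext Index rows: (first_time_code, second_time_code,
# object_name, display_location), per the two time codes described above.
hypertext_index = [
    (10.0, 45.0, "RobertJonesBio", "transcript-highlight"),
    (30.0, 30.0, "WindTunnelApplet", "button"),   # begins and ends at a point
    (50.0, 90.0, "QuizModule", "menu-item"),
]

def active_links(time_code):
    """Return the names of Hypertext Objects whose time interval covers
    the current position in the presentation."""
    return [name for start, end, name, _loc in hypertext_index
            if start <= time_code <= end]
```

Note that intervals may overlap, so more than one link can be active at once, as at 30.0 s above.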
- An Outline represents an example of what is termed here a set of "conceptual events.”
- a conceptual event is an association one makes with a segment of a data stream, having a beginning and end (though the beginning and end may be the same point), that represents something of interest. These data segments delineating a set of conceptual events may overlap each other, and furthermore, need not cover the entire data stream.
- An Outline represents a set of conceptual events that does cover the entire data stream and, if arranged hierarchically, such as with sections and subsections, has sections covering subsections. In the Outline 320 of FIG. 3, one has the sections 01: "Introduction" 321, 02: "The First Manned Flight" 322, and so on, covering the entire presentation.
- bookmarks that denote particular segments, or user-chosen “conceptual events” within presentations.
- the bookmarks allow the user, for example, to return quickly to interesting parts of the presentation, or to pick up at the previous stopping point.
- time lines representing the various data streams, as for example, video 350, audio 352, slides 354 and transcript 356.
- the first set S1 360, S2 362, S3 364, etc. would respectively invoke time codes 380 and 381, 382 and 383, 384 and 385, etc., not only for the video 350 data stream, but for the audio 352, slides 354 and transcript 356 streams.
- the second set S'1 370, S'2 372, S'3 384, etc. would invoke time codes 390 (a point), 391 and 392, 393 and 394 (394 shown collinear with 384, whether by choice or accident), etc., respectively, not only on the audio 352 data stream, but on the video 350, slides 354 and transcript 356 streams.
- a first Outline might list each skater and be broken down further into the individual moves of each skater's program.
- a second Outline might track the musical portion of the audio stream, following the music piece to piece, even movement to movement.
- one user might be interested in how a skater performed a particular move, while another user might wish to study how a particular passage of music inspired a skater to make a particular move. Note that there is no requirement that two sets of conceptual events track each other in any way; they represent two different ways of studying the same presentation.
- the exemplary screen 400 shows five windows 410, 420, 430, 440, 450 contained within the display.
- the Video Window 410 is used to display the video stream.
- the Slide Window 420 is used to display the slides used in the presentation.
- the Transcript Window 430 is used to display the transcribed audio of the speech.
- the Outline Window 440 is used to display the Outline of the presentation.
- the Control Panel 450 is used to control the display in each of the other four windows.
- the Transcript Window 430 includes a Transcript Slider Bar 432 that allows the user to scroll through the transcript, and Next 433 and Previous 434 Phrase Buttons that allow the user to step through the transcript a phrase at a time, where a phrase consists of a single line of the transcript. It also includes a Hypertext Link 436, as illustrated here in the form of the highlighted words, "Robert Jones", in the transcript.
- the Outline Window 440 includes an Outline Slider Bar 442 that allows the user to scroll through the outline, and Next 443 and Previous Entry buttons 444 that allow the user to jump directly to the next or previous topic.
- the Control Panel 450 includes a Video Slider Bar 452 used to select a position in the video stream, and a Play Button 454 used to play the program.
- Slider Bar 456 used to position the program at a Slide
- Previous 457 and Next 458 Slide Buttons used to display the next and previous Slides in the Slide Window 420.
- Search Box 460 used to search for text strings (e.g., words) in the Transcript.
- FIG. 5 shows the beginning of a presentation, corresponding to a time code of zero.
- the speaker's first slide is displayed in the Slide Window 420
- the speaker's first words are displayed in the Transcript Window 430
- the beginning of the outline is displayed in the Outline Window 440.
- the user can press the play button 454 to begin playing the presentation, which will cause the video and audio data to begin streaming, the transcript and outline to scroll in synchronization with the video and audio, and the slides to advance at the appropriate times.
- FIG. 6 shows the result of the user selecting the second entry in the Outline from Outline Window 440', entitled "The First Manned Flight" (recall entry 322 of Outline 320 in FIG. 3).
- the system determines that the time code 327 of "The First Manned Flight" is 20.50 s.
- the system looks in the Slide Index 315 (also in FIG. 3) and determines that the second slide S2 begins at time code 317 of 20.40 s, and thus the second slide should be displayed in the Slide Window 420'.
- the system looks at the Time-Coded Transcript 225 (shown in FIG. 2) and determines which words of the transcript correspond to the new time code, and thus which portion of the transcript should be displayed in the Transcript Window 430'.
- the flowchart starting at 600 indicates the operation of an embodiment of the present invention.
- the Event Handler 601 in FIG. 7 receives a Move Video Slider Event 610.
- the Move Video Slider Event 610 causes the invention to calculate the video frame of the new position of the slider 452.
- the position of the video slider 452 is translated into the position in the video data stream in a proportional fashion. For example, if the new position of the video slider 452 is half-way along its associated slider bar, and the video stream consists of 10,000 frames of video, then the 5,000th frame of video is displayed in the Video Window 410.
- the invention displays the new video frame 611, and computes the time code of the new video frame 612.
- the system looks up the Slide associated with the displayed video frame, and displays 613 the new Slide in the Slide Window 420. Again using this new time code, the system looks up the Phrase associated with the displayed video frame, and displays the new Phrase 614 in the Transcript Window 430. Again using this new time code, the system looks up the Outline Entry associated with the displayed video frame, and displays the new Outline Entry 615 in the Outline Window 440. Finally, using this new time code, the system looks up the Hypertext Links associated with the displayed video frame, and displays them 616 in the appropriate place in the Transcript Window 430.
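- The Move Video Slider handling above can be sketched end-to-end in Python: map the slider position to a frame proportionally, derive the frame's time code, then find the latest entry at or before that time code in each stream's index. The frame rate, index contents, and binary-search lookup are illustrative assumptions:

```python
import bisect

FPS = 15.0  # assumed frame rate; the text only requires 10-15+ frames/s

def frame_for_slider(slider_pos, slider_max, total_frames):
    """Proportional slider-to-frame mapping: half-way along the slider
    bar of a 10,000-frame stream selects the 5,000th frame."""
    return min(round(slider_pos / slider_max * total_frames),
               total_frames - 1)

def latest_at(times, items, t):
    """Latest index entry whose time code is <= t (times sorted ascending)."""
    return items[max(bisect.bisect_right(times, t) - 1, 0)]

# Hypothetical per-stream indexes keyed by time code in seconds.
slide_times, slides = [0.0, 20.40], ["S1", "S2"]
phrase_times, phrases = [0.0, 20.50, 120.05], ["P1", "P2", "P3"]
outline_times, outlines = [0.0, 20.50, 120.05], ["01", "02", "021"]

def on_move_video_slider(slider_pos, slider_max, total_frames):
    """Move Video Slider handler: compute the frame, derive its time
    code, then update every other window from its own index."""
    frame = frame_for_slider(slider_pos, slider_max, total_frames)
    t = frame / FPS
    return {"frame": frame,
            "slide": latest_at(slide_times, slides, t),
            "phrase": latest_at(phrase_times, phrases, t),
            "outline": latest_at(outline_times, outlines, t)}
```

The same `latest_at` lookup serves every index because each is sorted by time code; the other event handlers differ only in which index supplies the new time code.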
- the Event Handler 601 in FIG. 7 receives a New Slide Event 620.
- the New Slide Event causes the system to display the selected new Slide 621 in the Slide Window 420, and to look up the time code of the new Slide in the Slide Index 622.
- the system uses the time code of the new Slide as the new time code, the system computes the video frame associated with the new time code and displays the indicated video frame 623 in the Video Window.
- the system looks up the Phrase associated with the displayed Slide, and displays the new Phrase 624 in the Transcript Window 430.
- the invention looks up the Outline Entry associated with the displayed Slide, and displays the new Outline Entry 625 in the Outline Window 440. Finally, using the new time code, the system looks up the Hypertext Links associated with the displayed Slide, and displays them 626 in the appropriate place in the Transcript Window 430.
- the Event Handler 601 in FIG. 7 receives a New Phrase Event 630.
- the New Phrase Event causes the system to display the selected new Phrase 631 in the Transcript Window 430, and to look up the time code of the new Phrase in the Transcript Index 632.
- the invention uses the time code of the new Phrase as the new time code, the invention computes the video frame associated with the new time code and displays the indicated video frame 633 in the Video Window 410.
- the invention looks up the Slide associated with the displayed Phrase, and displays the new Slide 634 in the Slide Window.
- the invention looks up the Outline Entry associated with the displayed Phrase, and displays the new Outline Entry 635 in the Outline Window 440. Finally, using the new time code, the invention looks up the Hypertext Links associated with the displayed Phrase, and displays them 636 in the appropriate place in the Transcript Window 430.
- the Event Handler 601 in FIG. 7 receives a Search Transcript Event 640.
- the Search Transcript event causes the system to employ a string matching algorithm of a type well-known in the art to scan the Transcript and locate the first occurrence of the search string 641.
- the system uses the Transcript Index to determine which Phrase contains the matched string in the Transcript 642.
- the system displays the selected new Phrase 631 in the Transcript Window, and looks up the time code of the new Phrase in the Transcript Index 632.
- the system uses the time code of the new Phrase as the new time code to compute the video frame associated with the new time code and displays the indicated video frame 633 in the Video Window 410. Again using the new time code, the system looks up the Slide associated with the displayed Phrase, and displays the new Slide 634 in the Slide Window 420. Again using the new time code, the system looks up the Outline Entry associated with the displayed Phrase, and displays the new Outline Entry 635 in the Outline Window 440. Finally, using the new time code, the system looks up the Hypertext Links associated with the displayed Phrase, and displays them 636 in the appropriate place.
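- The Search Transcript flow above reduces to a substring scan followed by a Transcript Index lookup: find the first Phrase containing the search string, then use its time code to synchronize the other streams. A sketch with hypothetical phrases and time codes:

```python
# Hypothetical Transcript: each Phrase is one line of the transcript,
# and the Transcript Index gives the time code at which each begins.
phrases = ["welcome to this talk",
           "the first manned flight",
           "the wright brothers built their own wind tunnel"]
phrase_times = [0.0, 20.50, 120.05]

def search_transcript(query):
    """Scan the Transcript for the first Phrase containing the search
    string and return (phrase_number, time_code), as in the Search
    Transcript event; returns None when nothing matches."""
    q = query.lower()
    for i, phrase in enumerate(phrases):
        if q in phrase:
            return i, phrase_times[i]
    return None
```

A production system would use a more sophisticated string matching algorithm, as the text notes, but the output is the same: a Phrase number and its time code, from which every other window is updated.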
- the Event Handler 601 in FIG. 7 receives a New Outline Entry Event 650.
- the New Outline Entry Event causes the system to display the selected new Outline Entry 651 in the Outline Window 440, and to look up the time code of the new Outline Entry in the Outline Index 652.
- the system uses the time code of the new Outline Entry as the new time code, the system computes the video frame associated with the new time code and displays the indicated video frame 653 in the Video Window 410.
- the system looks up the Slide associated with the displayed Outline Entry, and displays the new Slide 654 in the Slide Window 420.
- the system looks up the Phrase associated with the displayed Outline Entry, and displays the new Phrase 655 in the Transcript Window 430. Finally, using the new time code, the system looks up the Hypertext Links associated with the displayed Outline Entry, and displays them 656 in the appropriate place in the Transcript Window 430. Referring again to FIG. 5, when the user selects a Hypertext Link 436, the Event Handler 601 in FIG. 7 receives a Display Hypertext Object 660. The system displays the data object pointed to by the selected Hypertext Link 661.
- Whenever the system is in a stationary state, that is, when no video/audio stream is being played, the system maintains a record of the current time code.
- the data displayed in FIGS. 4 and 5 always correspond to the current time code.
- the Event Handler 601 in FIG. 7 receives a Play Program Event 670.
- the system begins playing the video and audio streams, starting at the current time code.
- the system uses the time code of the displayed video frame to check the Transcript Index, the Slide Index, the Outline Index, and Hypertext Index and determine if the data displayed in the Slide Window 420, Transcript Window 430, or Outline Window 440 must be updated, or if new Hypertext Links must be displayed in the Transcript Window 430. If the time code of the new video frame corresponds to the time code of the next Phrase 710, the system displays the next Phrase 711 in the Transcript Window 430. If the time code of the new video frame corresponds to the time code of the next Slide 720, the system displays the next Slide 721 in the Slide Window 420.
- the system displays the next Outline Entry 731 in the Outline Window 440. Finally, if the time code of the new video frame corresponds to the time codes of a different set of Hypertext Links than are currently displayed 740, the system displays the new set of Hypertext Links 741 at the appropriate places on the display in the Transcript Window 430.
- the textual transcript may be translated into other languages. Multiple transcripts, corresponding to multiple languages, may be synchronized to the same time base, corresponding to a single video/audio stream. Users may choose which transcript language to view, and may switch among different transcripts in different languages during the operation of the invention.
- multiple synchronized streams of each data type may be incorporated into a single multimedia program.
- Multiple video/audio streams, each corresponding to different video resolution, audio sampling rate, or data compression technology, may be included in a single program.
- Multiple sets of slides, hypertext links, and other streams of non-isochronous data types may also be included in a single program.
- One or more of each data type may be displayed on the computer screen, and users may switch among the different streams of data available in the program.
- the present invention is capable of operating with a collection of many presentations, and of assisting users in locating the particular portion of the particular presentation that most interests them.
- the presentations are stored in a data base of a type well-known in the art, which may range from a simple non-relational data base that stores data in disk files to a complex relational or object-oriented data base that stores data in a specialized format.
- users can issue structured queries or full text queries to identify programs they wish to view.
- the user types in a query in the query type-in box 810.
- the titles of the programs that match the query are displayed in the results box 820.
- Structured queries are queries that allow the user to select programs on the basis of structured information associated with each program, such as title, author, or date.
- the user can specify a particular title, author, range of dates, or other structured query, and select only those programs which have associated structured information that matches the query.
- Full-text queries are queries that allow the user to select programs on the basis of text associated with each program, such as the abstract, transcript, slides, or ancillary materials connected via hypertext.
- the user can specify a particular combination of words and phrases, and select only those programs which have associated text that matches the full-text query. Users can also select which of the associated text elements to search.
- the user can specify to search only the transcript, only the slides, or a combination of both.
- the user can jump directly to the matched text, and display all of the other synchronized multimedia data types at that point in the program.
- FIG. 10 presents a flow chart of the agent mechanism starting at 900.
- the system constructs a summary of the program 920.
- the summary of the program may be constructed in multiple alternative ways. Each program may have associated with it a list of keywords that describe the major subjects discussed in the program. In this case, constructing the summary simply involves accessing this predefined list of keywords.
- any text summarization engine well-known in the art may be run across the text associated with the program, including the abstract, the transcript, and the slides, to generate a list of keywords that describe the major subjects discussed in the program. This summary is added to the user's profile.
- the user's profile is a list of keywords that collectively describe the programs that the user has viewed in the past. Each time the user views a new program, the keywords that describe that program are added to the user's profile. In this manner, the agent "learns" which subjects are most interesting to the user, and continues to learn about the user's changing interests as the user uses the system.
- the agent mechanism also incorporates the concept of memory.
- Each keyword that is added to the user's profile is labeled with the date at which its associated program was viewed. Whenever the agent mechanism is initiated, the difference between the current date and the date label on each keyword is used to assess the relative importance of that keyword. Keywords that entered the profile more recently are treated as more important than keywords that entered the profile in the distant past.
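- The date-based memory described above can be modeled as a recency weight on each profile keyword. The exponential decay and half-life value are assumptions for this sketch; the text states only that newer keywords are treated as more important than older ones:

```python
from datetime import date

def keyword_weight(viewed_on, today, half_life_days=90):
    """Recency weight for a profile keyword: decays with the age of the
    keyword's date label, so keywords from recently viewed programs are
    treated as more important than ones from the distant past. The
    half-life is an illustrative assumption."""
    age_days = (today - viewed_on).days
    return 0.5 ** (age_days / half_life_days)
```

Any monotonically decreasing function of keyword age would satisfy the description; exponential decay is simply a common, well-behaved choice.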
- the agent mechanism is initiated 901.
- the system creates a query from the current user's profile 940.
- the list of keywords in the profile is reorganized into the query syntax required by the full-text search engine.
- the ages of the keywords are converted into the relative importance measure required by the full-text search engine.
- the query is run against all of the programs on the server 950, and the resulting list of programs is presented to the user 960. This list of programs constitutes the programs which the system has determined may be of interest to the user, based on the user's past viewing behavior.
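- The conversion of a profile into a weighted full-text query might look like the following; the `keyword^weight` syntax is an assumption standing in for whatever syntax a particular search engine actually requires:

```python
def profile_to_query(profile):
    """Reorganize a user profile (keyword -> relative importance) into
    a weighted full-text query string, most important keywords first.
    The "keyword^weight" syntax is a hypothetical stand-in for a real
    engine's query language."""
    terms = sorted(profile.items(), key=lambda kv: -kv[1])
    return " OR ".join(f"{kw}^{w:.2f}" for kw, w in terms)
```

The weights here would come from the recency measure described earlier, so that recently acquired interests dominate the query.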
- users can create their own agents by manually constructing a query that describes their ongoing interest. Each time the agent mechanism is initiated, the user's manually-constructed agents are executed along with the system's automatically-constructed agent, and the selected programs are presented to the user.
- the user can create "virtual conferences" that consist of user-defined aggregations of programs.
- a user composes and executes a query that selects a set of programs that share a common attribute, such as author, or discuss a common subject.
- This thematic aggregation of programs can be named, saved, and distributed to other users interested in the same theme.
- the user can construct "synthetic programs" by sequencing together segments from multiple different programs.
- To create a synthetic program the user composes and executes a query, specifying that the invention should select only those portions of the programs that match the query.
- the user can then view the concatenated portions of multiple programs in a continuous manner.
- the synthetic program can be named, saved, and distributed to other users interested in the synthetic program content. Referring now to FIG. 11, the following describes the operation of an embodiment of the present invention across a non-isochronous network connection.
- This embodiment incorporates a cooperative processing data distribution and caching model that enables the isochronous data streams to play continuously immediately following a navigational event, such as moving to the next slide or searching to a particular word in the transcript.
- the system downloads the selected portions of the non-isochronous data from the server to the client.
- the downloaded non-isochronous data includes the Slide Index, the Slides, the Transcript Index, the Transcript, and the Hypertext Index.
- the downloaded non-isochronous data is stored in a disk cache 1010 on the client. The purpose of pre-downloading this non-isochronous data is to avoid transmitting it over the network connection simultaneously with the isochronous data, which would interrupt the transmission of the isochronous data.
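A minimal sketch of the pre-download and cache-emptying steps, assuming a generic `fetch` callable in place of the client-to-server download; the part names mirror the non-isochronous data streams listed above.

```python
import os
import tempfile

# The non-isochronous data pre-downloaded before playback begins.
NON_ISOCHRONOUS_PARTS = [
    "slide_index", "slides", "transcript_index", "transcript", "hypertext_index",
]

def predownload(fetch, cache_dir):
    """Fetch every non-isochronous part of a program into the disk
    cache before playback, so the network carries only isochronous
    data while the program plays.  `fetch(name)` stands in for the
    client-to-server download call and returns the part's bytes."""
    for name in NON_ISOCHRONOUS_PARTS:
        with open(os.path.join(cache_dir, name), "wb") as f:
            f.write(fetch(name))
    return cache_dir

def clear_cache(cache_dir):
    """Empty the disk cache in preparation for another program."""
    for name in os.listdir(cache_dir):
        os.remove(os.path.join(cache_dir, name))

cache = tempfile.mkdtemp()
predownload(lambda name: name.encode(), cache)  # dummy fetch for the sketch
```

Note that the Hypertext Objects are deliberately absent from the list, matching the design decision to fetch them on demand while playback is paused.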
- the Hypertext Objects are not pre-downloaded to the client; rather, the system is designed to pause the transmission of the isochronous data to accommodate the downloading of any Hypertext Objects.
- the client disk cache is emptied in preparation for use with another program.
- the system downloads a segment of the isochronous data from the server to a memory cache on the client.
- the downloaded isochronous data includes the initial segment of the video data and the corresponding initial segment of the audio data.
- the amount of isochronous data downloaded typically ranges from 5 to 60 seconds, but may be more or less.
- the downloaded isochronous data is stored in a memory cache 1020 on the client.
- the Event Handler 1030 receives a Play Program Event 1040.
- the system begins the continuous delivery of the isochronous data to the display devices 1041. Based on the time code of the currently displayed video frame, it also displays the associated non-isochronous data 1042, including the Transcript, the Slides, and the Hypertext Links. As the system streams the isochronous data to the display devices, it depletes the memory cache. When the amount of isochronous data in the memory cache falls below a specified threshold, the system causes the client CPU to send a request to the server CPU for the next contiguous segment of isochronous data 1043. This threshold is typically on the order of 5-10 seconds, and at most about 60 seconds.
- Upon receiving this data, the client CPU repopulates the isochronous data memory cache. If, as anticipated, the client CPU experiences a delay in receiving the requested data, caused by the non-isochronous network connection, the client CPU continues to deliver the isochronous data remaining in its memory cache in a continuous stream to the display device, until that cache is exhausted.
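The threshold-driven repopulation behavior described above can be simulated as follows. The class name, the threshold values, and the `request_segment` callback are illustrative, not part of the patent; the point of the sketch is that play continues from the cache while a refill is outstanding, and an interruption occurs only when the cache is fully exhausted.

```python
class IsochronousMemoryCache:
    """Client-side memory cache for isochronous data, measured in
    seconds of media.  When the buffered duration falls below
    `threshold_s`, the client asks the server for the next contiguous
    segment.  `request_segment()` stands in for the network round trip
    and returns the number of seconds delivered (0 models a delay)."""

    def __init__(self, threshold_s=10.0, request_segment=None):
        self.buffered_s = 0.0
        self.threshold_s = threshold_s
        self.request_segment = request_segment
        self.underruns = 0  # play interruptions when the cache empties

    def play(self, dt):
        """Deliver `dt` seconds of isochronous data to the display."""
        if self.buffered_s >= dt:
            self.buffered_s -= dt
        else:  # cache exhausted: play of the stream is interrupted
            self.buffered_s = 0.0
            self.underruns += 1
        if self.buffered_s < self.threshold_s and self.request_segment:
            self.buffered_s += self.request_segment()

# A well-fed cache: refills arrive whenever the threshold is crossed.
cache = IsochronousMemoryCache(threshold_s=10.0, request_segment=lambda: 30.0)
cache.buffered_s = 60.0
for _ in range(20):        # play 20 one-second ticks
    cache.play(1.0)
```

With 60 seconds buffered and only 20 seconds played, the threshold is never crossed and play is never interrupted; a stalled network (refills of 0 seconds) eventually exhausts the cache and counts underruns.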
- the method for repopulating the client's memory cache is a critical element in supporting efficient random access into isochronous data streams over a non-isochronous network.
- the method for downloading the isochronous data from the server to the memory cache on the client is designed to balance two competing requirements.
- the first requirement is for continuous, uninterrupted delivery of the isochronous data to the video display device and speakers attached to the client CPU.
- the network connection between the client and server is typically non-isochronous, and may introduce significant delays in the transmission of data between the server and the client.
- if the memory cache on the client becomes empty, requiring the client to send a request across the network to the server for additional isochronous data, the amount of time needed to send and service the request will interrupt the play of the isochronous data.
- the requirement for continuous delivery thus encourages the caching of as much data as possible on the client.
- the second requirement is to minimize the amount of data that is transmitted across the network.
- multiple users share a fixed amount of network bandwidth, and transmitting video and audio data across a network consumes a substantial portion of this limited resource. It is anticipated that a common user behavior will be to use the random access navigation capabilities to reposition the program. But the act of repositioning the program invalidates all or part of the data stored in the memory cache in the client.
- the present invention balances the need for continuous delivery of isochronous data to the display devices with the need to avoid wasting network bandwidth by implementing a novel cooperative processing data distribution and caching model.
- the memory cache on the client is designed specifically for compressed isochronous data, and more specifically for compressed digital video data.
- the caching strategy differs markedly from traditional caching strategies.
- Traditional caching strategies measure the number of bytes of data in the cache, and repopulate the cache when the number of bytes falls below a specified threshold.
- one embodiment of the present invention measures the number of seconds of isochronous data in the memory cache, and repopulates the cache when the number of seconds falls below a specified threshold.
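A small illustration of why the seconds-based measure differs from the traditional bytes-based one for compressed video: two caches can hold the same number of bytes yet very different durations, because compressed bitrate varies with scene content. The bitrates below are invented for the example.

```python
def seconds_buffered(segments):
    """Duration-based measure used by this embodiment: sum the playing
    time of each cached segment regardless of its size in bytes."""
    return sum(duration for duration, _ in segments)

def bytes_buffered(segments):
    """Traditional byte-count measure of cache fullness."""
    return sum(size for _, size in segments)

# Two caches holding the same number of bytes of compressed video:
# a low-motion (low-bitrate) scene versus a high-motion one.
low_motion = [(10.0, 50_000)]   # 10 s of video at ~5 KB/s
high_motion = [(1.0, 50_000)]   # 1 s of video at ~50 KB/s
```

A bytes-based policy judges both caches equally safe, while the seconds-based policy correctly recognizes that the high-motion cache is ten times closer to an interruption of play.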
- the cooperative distribution and caching model reduces the amount of data sent across the network compared to a traditional caching scheme.
- the cooperative distribution and caching model guarantees a certain number of seconds of video data cached on the client, reducing the likelihood of interrupted play of the video data stream compared to a traditional caching scheme.
- In addition to being designed to contain a range of a number of seconds of isochronous data, the memory cache employs a policy of unbalanced look ahead and look behind.
- Look ahead refers to caching the isochronous data corresponding to "N" seconds into the future. This isochronous data will be delivered to the display device under the normal operation of playing the program.
- Look behind refers to caching the isochronous data corresponding to "M" seconds into the past. This isochronous data will be delivered to the display device under the frequent operation of replaying the previously played few seconds of the program.
- Unbalanced refers to the policy of caching a different amount (that is, a different number of seconds) of look ahead and look behind data.
- more look ahead data is cached than look behind data, typically in an approximate ratio of 7:1.
- different caching policies can be employed in anticipation of different common user behaviors. For example, the use of a circular data structure, a structure well-known in the art, may effect this operation.
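The unbalanced window might be computed as follows, using the roughly 7:1 ratio named above (35 seconds ahead, 5 behind, which are illustrative values); actual segment storage could live in the circular data structure just mentioned, keyed by second number.

```python
class UnbalancedCacheWindow:
    """Sliding window over the program timeline with unbalanced look
    ahead and look behind.  Only the window bookkeeping is shown; the
    cached segments themselves could sit in a circular buffer."""

    def __init__(self, look_ahead_s=35, look_behind_s=5):  # ~7:1 ratio
        self.look_ahead_s = look_ahead_s
        self.look_behind_s = look_behind_s

    def window(self, position_s, program_length_s):
        """Return the [start, end) range of seconds to keep cached
        around the current play position, clamped to the program."""
        start = max(0, position_s - self.look_behind_s)
        end = min(program_length_s, position_s + self.look_ahead_s)
        return start, end

win = UnbalancedCacheWindow()
```

Look behind supports the frequent "replay the last few seconds" operation cheaply, while the larger look ahead protects continuous forward play; tuning the two values implements the different caching policies for different anticipated user behaviors.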
- the server sends data to the client at the nominal rate of one second of isochronous data each second.
- the server adapts to the characteristics of the network, bursting data if the network supports a high burst rate, or steadily transmitting data if the network does not support a high burst rate.
- the client monitors its memory cache, and sends requests to the server to speed up or slow down.
- the client also sends requests to the server to stop, restart at a new place in the program, or start playing a different program.
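The client-to-server control traffic described in the last few points can be sketched as a small message protocol; the message names and the low/high water marks are assumptions for illustration, since the patent does not specify a wire format.

```python
from enum import Enum

class ClientRequest(Enum):
    """Messages the client may send to the server."""
    SPEED_UP = "speed_up"          # cache draining faster than refills
    SLOW_DOWN = "slow_down"        # cache nearly full
    STOP = "stop"                  # halt transmission
    RESTART_AT = "restart_at"      # reposition within the program
    PLAY_PROGRAM = "play_program"  # start a different program

def choose_rate_request(buffered_s, low_water_s=10.0, high_water_s=60.0):
    """Client-side cache monitoring: translate the state of the memory
    cache into a rate request for the server, or None if the nominal
    one-second-per-second delivery rate should continue."""
    if buffered_s < low_water_s:
        return ClientRequest.SPEED_UP
    if buffered_s > high_water_s:
        return ClientRequest.SLOW_DOWN
    return None
```

On the server side, honoring a SPEED_UP request could mean bursting data if the network supports a high burst rate, per the adaptation behavior described above.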
- the system administrator can specify how much network bandwidth is available to the system, for each individual program, and collectively across all programs.
- the system automatically tunes its memory caching scheme to reflect these limits. If the transmitted data would exceed the specified limits, the system automatically drops video frames as necessary.
- the Event Handler 1030 receives a Navigational Event 1050.
- the system computes the time base value of the new position 1051. It then downloads a new segment of the isochronous data from the server to the memory cache on the client 1052.
- the downloaded isochronous data includes a segment of the video data and a corresponding segment of the audio data.
- the system displays the video frame corresponding to the current time base value, and the non-isochronous data corresponding to the displayed video frame 1053.
- the Event Handler 1030 receives a Display Hypertext Object Event 1060.
- the system pauses the play of the program 1061.
- the client CPU requests that the server CPU send the Hypertext Object across the network connection 1062, and upon receiving the Hypertext Object, causes it to be displayed 1063.
- the server 130 records the actions of each user, including not only which programs each user viewed, but also which portions of the programs each user viewed. This record can be used for usage analysis, billing, or report generation.
- the user can ask the server 130 for a usage summary, which contains an historical record of that particular user's usage.
- a manager or system administrator can ask the server 130 for a summary across some or all users, thereby developing an understanding of the patterns of usage.
- the usage record may serve as a guide to restructure old programs or to structure new ones, having learned what works from a presentation perspective and what does not, for example.
- the usage record furthermore enables the system to notify users of changing data.
- the list of users who have viewed a program can be determined from the usage records. If a program is updated, the system reviews the usage record to determine which users have viewed the program, and notifies them that the program that they previously viewed has changed.
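The viewed-program lookup behind this change notification might look like the following; the record shape (user, program, portions viewed) is an assumption based on the description of the usage record above.

```python
def users_to_notify(usage_records, updated_program):
    """Scan the usage record for users who have viewed the updated
    program, so they can be notified that it has changed.  Each record
    is a (user, program, portions_viewed) triple, where the portions
    are (start, end) time ranges within the program."""
    return sorted({user for user, program, _ in usage_records
                   if program == updated_program})

# A toy usage record: two users, two programs, viewed portions in seconds.
records = [
    ("alice", "prog-1", [(0, 120)]),
    ("bob", "prog-2", [(0, 30)]),
    ("alice", "prog-2", [(60, 90)]),
]
```

The same triples support the other uses named above: per-user usage summaries, cross-user pattern analysis for managers, and billing.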
Priority Applications (3)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
AU29922/97A AU2992297A (en) | 1996-04-26 | 1997-04-24 | A method and system for synchronizing and navigating multiple streams of isochronous and non-isochronous data |
EP97924520A EP0895617A4 (en) | 1996-04-26 | 1997-04-24 | A method and system for synchronizing and navigating multiple streams of isochronous and non-isochronous data |
JP09539064A JP2000510622A (en) | 1996-04-26 | 1997-04-24 | Method and system for synchronizing and guiding multiple streams of isochronous and non-isochronous data |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US63835096A | 1996-04-26 | 1996-04-26 | |
US08/638,350 | 1996-04-26 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO1997041504A1 true WO1997041504A1 (en) | 1997-11-06 |
Family
ID=24559673
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/US1997/006982 WO1997041504A1 (en) | 1996-04-26 | 1997-04-24 | A method and system for synchronizing and navigating multiple streams of isochronous and non-isochronous data |
Country Status (5)
Country | Link |
---|---|
EP (1) | EP0895617A4 (en) |
JP (1) | JP2000510622A (en) |
AU (1) | AU2992297A (en) |
CA (1) | CA2252490A1 (en) |
WO (1) | WO1997041504A1 (en) |
Cited By (12)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2002077966A2 (en) * | 2001-03-23 | 2002-10-03 | Koninklijke Philips Electronics N.V. | Synchronizing text/visual information with audio playback |
WO2002104036A1 (en) * | 2001-06-15 | 2002-12-27 | Yahoo Japan Corporation | Method, system, and program for creating, recording, and distributing digital stream contents______________ |
WO2004079709A1 (en) * | 2003-03-07 | 2004-09-16 | Nec Corporation | Scroll display control |
EP1517328A1 (en) | 2003-09-16 | 2005-03-23 | Ricoh Company | Information editing device, information editing method, and computer program product |
WO2005101237A1 (en) * | 2004-04-14 | 2005-10-27 | Tilefile Pty Ltd | A media package and a system and method for managing a media package |
US7085842B2 (en) | 2001-02-12 | 2006-08-01 | Open Text Corporation | Line navigation conferencing system |
WO2007049999A1 (en) * | 2005-10-26 | 2007-05-03 | Timetomarket Viewit Sweden Ab | Information intermediation system |
US7295548B2 (en) * | 2002-11-27 | 2007-11-13 | Microsoft Corporation | Method and system for disaggregating audio/visual components |
WO2008070993A1 (en) * | 2006-12-15 | 2008-06-19 | Desktopbox Inc. | Simulcast internet media distribution system and method |
US8499090B2 (en) | 2008-12-30 | 2013-07-30 | Intel Corporation | Hybrid method for delivering streaming media within the home |
US8977375B2 (en) | 2000-10-12 | 2015-03-10 | Bose Corporation | Interactive sound reproducing |
US9729594B2 (en) | 2000-09-12 | 2017-08-08 | Wag Acquisition, L.L.C. | Streaming media delivery system |
Families Citing this family (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US8560327B2 (en) * | 2005-08-26 | 2013-10-15 | Nuance Communications, Inc. | System and method for synchronizing sound and manually transcribed text |
JP2007208477A (en) * | 2006-01-31 | 2007-08-16 | Toshiba Corp | Video reproduction device, data structure of bookmark data, storage medium storing bookmark data, and bookmark data generation method |
Citations (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US4761781A (en) * | 1985-08-13 | 1988-08-02 | International Business Machines Corp. | Adaptative packet/circuit switched transportation method and system |
US5101274A (en) * | 1987-08-10 | 1992-03-31 | Canon Kabushiki Kaisha | Digital signal recording apparatus time-division multiplexing video and audio signals |
US5274758A (en) * | 1989-06-16 | 1993-12-28 | International Business Machines | Computer-based, audio/visual creation and presentation system and method |
US5471576A (en) * | 1992-11-16 | 1995-11-28 | International Business Machines Corporation | Audio/video synchronization for application programs |
US5613909A (en) * | 1994-07-21 | 1997-03-25 | Stelovsky; Jan | Time-segmented multimedia game playing and authoring system |
US5619733A (en) * | 1994-11-10 | 1997-04-08 | International Business Machines Corporation | Method and apparatus for synchronizing streaming and non-streaming multimedia devices by controlling the play speed of the non-streaming device in response to a synchronization signal |
US5630117A (en) * | 1989-02-27 | 1997-05-13 | Apple Computer, Inc. | User interface system and method for traversing a database |
US5642171A (en) * | 1994-06-08 | 1997-06-24 | Dell Usa, L.P. | Method and apparatus for synchronizing audio and video data streams in a multimedia system |
Family Cites Families (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5442744A (en) * | 1992-04-03 | 1995-08-15 | Sun Microsystems, Inc. | Methods and apparatus for displaying and editing multimedia information |
EP0597798A1 (en) * | 1992-11-13 | 1994-05-18 | International Business Machines Corporation | Method and system for utilizing audible search patterns within a multimedia presentation |
1997
- 1997-04-24 EP EP97924520A patent/EP0895617A4/en not_active Withdrawn
- 1997-04-24 CA CA002252490A patent/CA2252490A1/en not_active Abandoned
- 1997-04-24 AU AU29922/97A patent/AU2992297A/en not_active Abandoned
- 1997-04-24 WO PCT/US1997/006982 patent/WO1997041504A1/en not_active Application Discontinuation
- 1997-04-24 JP JP09539064A patent/JP2000510622A/en active Pending
Non-Patent Citations (4)
Title |
---|
IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, August 1993, Vol. 5, No. 4, RAVINDRAN K. et al., "Delay Compensation Protocols for Synchronization of Multimedia Data Streams", pages 574-589. * |
PRODUCI NEWS LAW TECHNOLOGY, Vol. 2, No. 6, June 1995, "The Re:Viewer Workstation Revolutionary Search, Retrieval and Organization of Litigation Discovery Data". * |
See also references of EP0895617A4 * |
SOFTCOM LEARNINGNET MULTIMEDIA DISTANCE LEARNING, http://www.softcom.com/Learningnet.html, 1995, Softcom. Inc. * |
Cited By (28)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US10567453B2 (en) | 2000-09-12 | 2020-02-18 | Wag Acquisition, L.L.C. | Streaming media delivery system |
US10298639B2 (en) | 2000-09-12 | 2019-05-21 | Wag Acquisition, L.L.C. | Streaming media delivery system |
US10298638B2 (en) | 2000-09-12 | 2019-05-21 | Wag Acquisition, L.L.C. | Streaming media delivery system |
US9762636B2 (en) | 2000-09-12 | 2017-09-12 | Wag Acquisition, L.L.C. | Streaming media delivery system |
US9742824B2 (en) | 2000-09-12 | 2017-08-22 | Wag Acquisition, L.L.C. | Streaming media delivery system |
US9729594B2 (en) | 2000-09-12 | 2017-08-08 | Wag Acquisition, L.L.C. | Streaming media delivery system |
US8977375B2 (en) | 2000-10-12 | 2015-03-10 | Bose Corporation | Interactive sound reproducing |
US10481855B2 (en) | 2000-10-12 | 2019-11-19 | Bose Corporation | Interactive sound reproducing |
US10140084B2 (en) | 2000-10-12 | 2018-11-27 | Bose Corporation | Interactive sound reproducing |
US9223538B2 (en) | 2000-10-12 | 2015-12-29 | Bose Corporation | Interactive sound reproducing |
US7085842B2 (en) | 2001-02-12 | 2006-08-01 | Open Text Corporation | Line navigation conferencing system |
US7058889B2 (en) | 2001-03-23 | 2006-06-06 | Koninklijke Philips Electronics N.V. | Synchronizing text/visual information with audio playback |
WO2002077966A3 (en) * | 2001-03-23 | 2003-02-27 | Koninkl Philips Electronics Nv | Synchronizing text/visual information with audio playback |
WO2002077966A2 (en) * | 2001-03-23 | 2002-10-03 | Koninklijke Philips Electronics N.V. | Synchronizing text/visual information with audio playback |
US7831916B2 (en) | 2001-06-15 | 2010-11-09 | Fry-Altec, Llc | Method, system, and program for creating, recording, and distributing digital stream contents |
US8276082B2 (en) | 2001-06-15 | 2012-09-25 | Fry-Altec, Inc. | Method and computer readable media for organizing digital stream contents |
WO2002104036A1 (en) * | 2001-06-15 | 2002-12-27 | Yahoo Japan Corporation | Method, system, and program for creating, recording, and distributing digital stream contents______________ |
US7295548B2 (en) * | 2002-11-27 | 2007-11-13 | Microsoft Corporation | Method and system for disaggregating audio/visual components |
US8671359B2 (en) | 2003-03-07 | 2014-03-11 | Nec Corporation | Scroll display control |
WO2004079709A1 (en) * | 2003-03-07 | 2004-09-16 | Nec Corporation | Scroll display control |
US7844163B2 (en) | 2003-09-16 | 2010-11-30 | Ricoh Company, Ltd. | Information editing device, information editing method, and computer product |
EP1517328A1 (en) | 2003-09-16 | 2005-03-23 | Ricoh Company | Information editing device, information editing method, and computer program product |
WO2005101237A1 (en) * | 2004-04-14 | 2005-10-27 | Tilefile Pty Ltd | A media package and a system and method for managing a media package |
JP2007533015A (en) * | 2004-04-14 | 2007-11-15 | デーヴィッド・ピーター・ボリジャー | Media package and media package management system and method |
WO2007049999A1 (en) * | 2005-10-26 | 2007-05-03 | Timetomarket Viewit Sweden Ab | Information intermediation system |
WO2008070993A1 (en) * | 2006-12-15 | 2008-06-19 | Desktopbox Inc. | Simulcast internet media distribution system and method |
US8280949B2 (en) | 2006-12-15 | 2012-10-02 | Harris Corporation | System and method for synchronized media distribution |
US8499090B2 (en) | 2008-12-30 | 2013-07-30 | Intel Corporation | Hybrid method for delivering streaming media within the home |
Also Published As
Publication number | Publication date |
---|---|
EP0895617A4 (en) | 1999-07-14 |
EP0895617A1 (en) | 1999-02-10 |
AU2992297A (en) | 1997-11-19 |
CA2252490A1 (en) | 1997-11-06 |
JP2000510622A (en) | 2000-08-15 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US6557042B1 (en) | Multimedia summary generation employing user feedback | |
US6636238B1 (en) | System and method for linking an audio stream with accompanying text material | |
US9729907B2 (en) | Synchronizing a plurality of digital media streams by using a descriptor file | |
US10735488B2 (en) | Method of downloading digital content to be rendered | |
US6956593B1 (en) | User interface for creating, viewing and temporally positioning annotations for media content | |
US6148304A (en) | Navigating multimedia content using a graphical user interface with multiple display regions | |
US6907570B2 (en) | Video and multimedia browsing while switching between views | |
US10805111B2 (en) | Simultaneously rendering an image stream of static graphic images and a corresponding audio stream | |
CA2140850C (en) | Networked system for display of multimedia presentations | |
US5692213A (en) | Method for controlling real-time presentation of audio/visual data on a computer system | |
US7051275B2 (en) | Annotations for multiple versions of media content | |
US6868440B1 (en) | Multi-level skimming of multimedia content using playlists | |
US7945857B2 (en) | Interactive presentation viewing system employing multi-media components | |
EP1999953B1 (en) | Embedded metadata in a media presentation | |
EP0895617A1 (en) | A method and system for synchronizing and navigating multiple streams of isochronous and non-isochronous data | |
US8612384B2 (en) | Methods and apparatus for searching and accessing multimedia content | |
US20140214907A1 (en) | Media management system and process | |
WO2007064715A2 (en) | Systems, methods, and computer program products for the creation, monetization, distribution, and consumption of metacontent | |
CN101491089A (en) | Embedded metadata in a media presentation | |
EP1405212B1 (en) | Method and system for indexing and searching timed media information based upon relevance intervals | |
Horner | NewsTime--a graphical user interface to audio news | |
JP2003283944A (en) | Interface unit usable with multimedia contents reproducer for searching multimedia contents being reproduced | |
WO2003021416A1 (en) | Method and apparatus for object oriented multimedia editing | |
Shirota et al. | A TV program generation system using digest video scenes and a scripting markup language | |
Layaïda et al. | SMIL: The new multimedia document standard of the W3C |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
AK | Designated states |
Kind code of ref document: A1 Designated state(s): AL AM AT AU AZ BA BB BG BR BY CA CH CN CU CZ DE DK EE ES FI GB GE HU IL IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MD MG MK MN MW MX NO NZ PL PT RO RU SD SE SG SI SK TJ TM TR TT UA UG US UZ VN AM AZ BY KG KZ MD RU TJ TM |
|
AL | Designated countries for regional patents |
Kind code of ref document: A1 Designated state(s): GH KE LS MW SD SZ UG AT BE CH DE DK ES FI FR GB GR IE IT LU MC NL PT SE BF BJ CF |
|
121 | Ep: the epo has been informed by wipo that ep was designated in this application | ||
DFPE | Request for preliminary examination filed prior to expiration of 19th month from priority date (pct application filed before 20040101) | ||
REG | Reference to national code |
Ref country code: DE Ref legal event code: 8642 |
|
ENP | Entry into the national phase |
Ref document number: 2252490 Country of ref document: CA Ref country code: CA Ref document number: 2252490 Kind code of ref document: A Format of ref document f/p: F |
|
WWE | Wipo information: entry into national phase |
Ref document number: 1997924520 Country of ref document: EP |
|
WWP | Wipo information: published in national office |
Ref document number: 1997924520 Country of ref document: EP |
|
WWW | Wipo information: withdrawn in national office |
Ref document number: 1997924520 Country of ref document: EP |