US20070033515A1 - System And Method For Arranging Segments Of A Multimedia File - Google Patents
- Publication number
- US20070033515A1
- Authority
- US
- United States
- Prior art keywords
- video
- multimedia
- segment
- information
- metadata
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Abandoned
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q50/00—Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
- G06Q50/10—Services
-
- G—PHYSICS
- G11—INFORMATION STORAGE
- G11B—INFORMATION STORAGE BASED ON RELATIVE MOVEMENT BETWEEN RECORD CARRIER AND TRANSDUCER
- G11B27/00—Editing; Indexing; Addressing; Timing or synchronising; Monitoring; Measuring tape travel
- G11B27/10—Indexing; Addressing; Timing or synchronising; Measuring tape travel
- G11B27/19—Indexing; Addressing; Timing or synchronising; Measuring tape travel by using information detectable on the record carrier
- G11B27/28—Indexing; Addressing; Timing or synchronising; Measuring tape travel by using information detectable on the record carrier by using information signals recorded by the same method as the main recording
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/70—Information retrieval; Database structures therefor; File system structures therefor of video data
- G06F16/71—Indexing; Data structures therefor; Storage structures
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/70—Information retrieval; Database structures therefor; File system structures therefor of video data
- G06F16/78—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
- G06F16/783—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
- G06F16/7844—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content using original textual content or text extracted from visual content or transcript of audio data
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/70—Information retrieval; Database structures therefor; File system structures therefor of video data
- G06F16/78—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
- G06F16/783—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
- G06F16/7847—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content using low-level visual features of the video content
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/70—Information retrieval; Database structures therefor; File system structures therefor of video data
- G06F16/78—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
- G06F16/7867—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using information manually generated, e.g. tags, keywords, comments, title and artist information, manually generated time, location and usage information, user ratings
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T3/00—Geometric image transformations in the plane of the image
- G06T3/40—Scaling of whole images or parts thereof, e.g. expanding or contracting
- G06T3/4092—Image resolution transcoding, e.g. by using client-server architectures
-
- G—PHYSICS
- G11—INFORMATION STORAGE
- G11B—INFORMATION STORAGE BASED ON RELATIVE MOVEMENT BETWEEN RECORD CARRIER AND TRANSDUCER
- G11B27/00—Editing; Indexing; Addressing; Timing or synchronising; Monitoring; Measuring tape travel
- G11B27/02—Editing, e.g. varying the order of information signals recorded on, or reproduced from, record carriers
- G11B27/031—Electronic editing of digitised analogue information signals, e.g. audio or video signals
- G11B27/034—Electronic editing of digitised analogue information signals, e.g. audio or video signals on discs
-
- G—PHYSICS
- G11—INFORMATION STORAGE
- G11B—INFORMATION STORAGE BASED ON RELATIVE MOVEMENT BETWEEN RECORD CARRIER AND TRANSDUCER
- G11B27/00—Editing; Indexing; Addressing; Timing or synchronising; Monitoring; Measuring tape travel
- G11B27/10—Indexing; Addressing; Timing or synchronising; Measuring tape travel
- G11B27/102—Programmed access in sequence to addressed parts of tracks of operating record carriers
- G11B27/105—Programmed access in sequence to addressed parts of tracks of operating record carriers of operating discs
-
- G—PHYSICS
- G11—INFORMATION STORAGE
- G11B—INFORMATION STORAGE BASED ON RELATIVE MOVEMENT BETWEEN RECORD CARRIER AND TRANSDUCER
- G11B2220/00—Record carriers by type
- G11B2220/20—Disc-shaped record carriers
-
- G—PHYSICS
- G11—INFORMATION STORAGE
- G11B—INFORMATION STORAGE BASED ON RELATIVE MOVEMENT BETWEEN RECORD CARRIER AND TRANSDUCER
- G11B2220/00—Record carriers by type
- G11B2220/40—Combinations of multiple record carriers
- G11B2220/41—Flat as opposed to hierarchical combination, e.g. library of tapes or discs, CD changer, or groups of record carriers that together store one title
-
- G—PHYSICS
- G11—INFORMATION STORAGE
- G11B—INFORMATION STORAGE BASED ON RELATIVE MOVEMENT BETWEEN RECORD CARRIER AND TRANSDUCER
- G11B27/00—Editing; Indexing; Addressing; Timing or synchronising; Monitoring; Measuring tape travel
- G11B27/10—Indexing; Addressing; Timing or synchronising; Measuring tape travel
- G11B27/34—Indicating arrangements
Definitions
- the present invention relates generally to marking multimedia files. More specifically, the present invention relates to applying or inserting tags into multimedia files for indexing and searching, as well as for editing portions of multimedia files, all to facilitate the storing, searching, and retrieving of the multimedia information.
- multimedia content is represented in a streaming file format so that a user has to view the file from the beginning in order to look for the exact point where the first user left off.
- a conventional bookmark marks a document such as a static web page for later retrieval by saving a link (address) to the document.
- For example, Internet browsers support a bookmark facility by saving an address, called a Uniform Resource Identifier (URI), to a particular file.
- Internet Explorer manufactured by the Microsoft Corporation of Redmond, Washington, uses the term “favorite” to describe a similar concept.
- bookmarks store only the information related to the location of a file, such as the directory name with a file name, a Universal Resource Locator (URL), or the URI.
- the files referred to by conventional bookmarks are treated in the same way regardless of the data formats for storing the content.
- a simple link is used for multimedia content as well; typically a URI is used. Each time the file is revisited using the bookmark, the multimedia content associated with the bookmark is always played from the beginning.
- FIG. 1 illustrates a list 108 of conventional bookmarks 110, each comprising positional information 112 and a title 114.
- the positional information 112 of a conventional bookmark is composed of a URI as well as a bookmarked position 106 .
- the bookmarked position is a relative time or byte position measured from a beginning of the multimedia content.
- the title 114 can be specified by a user, as well as delivered with the content, and it is typically used to make the user easily recognize the bookmarked URI in a bookmark list 108 .
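The bookmark structure just described (a URI, a position within the content, and a title) can be sketched as follows. Field names and the `#t=` resume hint are illustrative choices, not taken from the patent:

```python
from dataclasses import dataclass

@dataclass
class MultimediaBookmark:
    """Illustrative bookmark record: a URI plus a position within the content."""
    uri: str                 # location of the multimedia file (the URI/URL)
    position_seconds: float  # bookmarked offset from the beginning of the content
    title: str = ""          # optional user- or content-supplied label

    def resume_hint(self) -> str:
        # A player could seek to position_seconds instead of replaying from 0.
        return f"{self.uri}#t={self.position_seconds:.1f}"

bm = MultimediaBookmark("http://example.com/news.asf", 754.0, "Evening news")
print(bm.resume_hint())  # http://example.com/news.asf#t=754.0
```

A conventional bookmark, by contrast, would carry only the `uri` field.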
- with a conventional bookmark that lacks a bookmarked position, when a user wants to replay the specified multimedia file, the file is played from the beginning each time, regardless of how much of the file the user has already viewed.
- the user has no choice but to record the last accessed position on a memo and to move manually to the last stopped point. If the multimedia file is viewed by streaming, the user must go through a series of buffering steps to find the last accessed position, thus wasting much time. Even for a conventional bookmark with a bookmarked position, the same problem occurs when the multimedia content is delivered in a live broadcast, since the bookmarked position within the multimedia content is not usually available, as well as when the user wants to replay one of the variations of the bookmarked multimedia content.
- Multimedia content may be generated and stored in a variety of formats.
- video may be stored in the formats such as MPEG, ASF, RM, MOV, and AVI.
- Audio may be stored in formats such as MID, MP3, and WAV.
- Time information may be incorporated into a bookmark to return to the last-accessed segment within the multimedia content.
- the use of time information only, however, fails to return to exactly the same segment at a later time, for the following reasons. If a bookmark incorporating time information were used to save the last-accessed segment during a preview broadcast of multimedia content, the bookmark information would not be valid for returning to the last-accessed segment during a regular full-version broadcast. Similarly, if a bookmark incorporating time information were used to save the last-accessed segment during a real-time broadcast, the bookmark would not be effective during later access, because the later available version may have been edited or a time code may not have been available during the real-time broadcast.
- CNN.com provides five different streaming videos for a single video content: two different types of streaming videos with the bandwidths of 28.8 kbps and 80 kbps, both encoded in Microsoft's Advanced Streaming Format (ASF).
- CNN.com also provides RM streaming format by RealNetworks, Inc. of Seattle, Wash. (RM), and a streaming video with the smart bandwidth encoded in Apple Computer, Inc.'s QuickTime streaming format (MOV).
- the five video files may start and end at different time points from the viewpoint of the source video content, since each variation may be produced by an independent encoding process varying the values chosen for encoding formats, bandwidths, resolutions, etc. This results in mismatches of time points because a specific time point of the source video content may be presented as different media time points in the five video files.
- the mismatches of positions cause a problem of mis-positioned playback.
- consider a user who sets a multimedia bookmark on a master file of multimedia content (for example, video encoded in a given format) and then tries to play another variation (for example, video encoded in a different format) from the bookmarked position. If the two variations do not start at the same position of the source content, the playback will not start at the bookmarked position. That is, the playback will start at a position that is temporally shifted by the difference between the start positions of the two variations.
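The temporal shift described here can be corrected if each variation's start offset relative to the source content is known. A minimal sketch, assuming such offsets are supplied by the encoding process (the offsets and values below are hypothetical):

```python
def adjust_position(bookmark_pos: float,
                    start_master: float,
                    start_variation: float) -> float:
    """Map a position bookmarked in one variation onto another variation.

    All values are in seconds; start_master and start_variation are each
    variation's start offset measured against the common source content.
    """
    # Express the bookmark in source-content time...
    source_pos = start_master + bookmark_pos
    # ...then re-express it in the other variation's media time.
    return source_pos - start_variation

# Master starts 2 s into the source; the other variation starts 5 s in.
print(adjust_position(bookmark_pos=100.0, start_master=2.0, start_variation=5.0))  # 97.0
```

Without the correction, playback of the second variation would start 3 seconds late relative to the bookmarked scene.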
- the entire multimedia presentation is often lengthy. However, there are frequent occasions when the presentation is interrupted, voluntarily or forcibly, and terminates before finishing. Examples include a user who starts playing a video at work, leaves the office, and desires to continue watching the video at home, or a user who is forced to stop watching the video and log out due to a system shutdown. It is thus necessary to save the termination position of the multimedia file into persistent storage in order to return directly to the point of termination without a time-consuming playback of the multimedia file from the beginning.
- the interrupted presentation of the multimedia file will usually resume exactly at the previously saved terminated position. However, in some cases, it is desirable to begin the playback of the multimedia file a certain time before the terminated point, since such rewinding could help refresh the user's memory.
- EPG Electronic Program Guide
- two-dimensional presentation becomes cumbersome as terrestrial, cable, and satellite systems send out thousands of programs through hundreds of channels. Navigation through a large table of rows and columns in order to search for desired programs is frustrating.
- STB set-top box
- PVR personal video recording
- such an STB usually contains a digital video encoder/decoder based on an international digital video compression standard such as MPEG-1/2, as well as large local storage for the digitally compressed video data.
- Some of the recent STBs also allow connection to the Internet.
- STB users can experience new services such as time-shifting and web-enhanced television (TV).
- the first problem is that even the latest STBs alone cannot fully satisfy users' ever-increasing desire for diverse functionalities.
- the STBs now on the market are very limited in terms of computing and memory and so it is not easy to execute most CPU and memory intensive applications.
- the people who are bored with plain playback of the recorded video may desire more advanced features such as video browsing/summary and search.
- all of those features require metadata for the recorded video.
- the metadata are usually the data describing content, such as the title, genre and summary of a television program.
- the metadata also include audiovisual characteristic data such as raw image data corresponding to a specific frame of the video stream.
- the segment may be a single frame, a single shot consisting of successive frames, or a group of several successive shots.
- Each segment may be described by some elementary semantic information using texts.
- the segment is referenced by the metadata using media locators such as frame number or time codes.
- the generation of such video metadata usually requires intensive computation and a human operator's help, so practically speaking, it is not feasible to generate the metadata in the current STB.
- one possible solution for this problem is to generate the metadata in the server connected to the STB and to deliver it to the STB via network.
- it is essential to know the start position of recorded video with respect to the video stream used to generate the metadata in the server/content provider in order to match the temporal position referenced by the metadata to the position of the recorded video.
- the second problem is related to discrepancy between the two time instants: the time instant at which the STB starts the recording of the user-requested TV program, and the time instant at which the TV program is actually broadcast.
- the time mismatch problem can be solved by using metadata delivered from the server, for example, reference frames/segment representing the beginning of the TV program. The exact location of the TV program, then, can be easily found by simply matching the reference frames with all the recorded frames for the program.
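The matching step can be sketched as a search for the offset at which the delivered reference frames best align with the recorded frames. The per-frame signatures and the absolute-difference cost below are illustrative stand-ins for whatever visual features a real system would extract:

```python
def find_program_start(recorded, reference):
    """Locate the reference segment inside the recorded frame sequence.

    Each element of `recorded` and `reference` is a per-frame signature
    (here a tuple of numbers, e.g. a coarse color histogram). The best
    alignment is the offset minimizing the total absolute difference.
    """
    n, m = len(recorded), len(reference)
    best_offset, best_cost = 0, float("inf")
    for off in range(n - m + 1):
        cost = sum(abs(a - b)
                   for win, ref in zip(recorded[off:off + m], reference)
                   for a, b in zip(win, ref))
        if cost < best_cost:
            best_offset, best_cost = off, cost
    return best_offset

# Toy signatures: the program's reference frames appear at index 2.
recorded = [(0, 0), (1, 2), (9, 9), (8, 7), (3, 3)]
reference = [(9, 9), (8, 7)]
print(find_program_start(recorded, reference))  # 2
```

The returned offset gives the recording's start position relative to the stream used to generate the metadata, so all metadata time references can be shifted accordingly.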
- Content-based image retrieval (CBIR) typically relies on low-level image features such as color, texture, and shape. While high-level image descriptors are potentially more intuitive for common users, the derivation of high-level descriptors is still in its experimental stages in the field of computer vision and requires complex vision processing. Despite their efficiency and ease of implementation, on the other hand, the main disadvantage of low-level image features is that they are perceptually non-intuitive for both expert and non-expert users and therefore do not normally represent users' intent effectively. Furthermore, they are highly sensitive to small image variations in feature shape, size, position, orientation, brightness, and color. Perceptually similar images are often highly dissimilar in terms of low-level image features. Searches made by low-level features are often unsuccessful, and it usually takes many trials to find images satisfactory to a user.
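The sensitivity of low-level features to small image variations can be seen with a toy intensity histogram: a slight global brightening moves every pixel into the next bin, and a bin-wise distance then reports two perceptually near-identical images as maximally different (the values are contrived for illustration):

```python
def l1_distance(h1, h2):
    """Bin-wise L1 distance between two histograms of equal length."""
    return sum(abs(a - b) for a, b in zip(h1, h2))

# Two 4-bin intensity histograms of "the same scene", one slightly
# brightened so that every pixel shifted up by one bin. The images look
# nearly identical, yet the feature distance is as large as possible here.
hist_original   = [10, 0, 0, 0]
hist_brightened = [0, 10, 0, 0]
print(l1_distance(hist_original, hist_brightened))  # 20
```

This is the kind of mismatch between perceptual similarity and feature-space similarity that makes purely low-level CBIR searches require many trials.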
- Milanese et al. utilized hierarchical clustering to organize an image database into visually similar groupings. See, R. Milanese, D. Squire, and T. Pun, "Correspondence analysis and hierarchical indexing for content-based image retrieval," in Proc. IEEE Int. Conf. Image Processing, Vol. 3, Lausanne, Switzerland, pp. 859-862, September, 1996.
- Zhang and Zhong provided a hierarchical self-organizing map (HSOM) method to organize an image database into a two-dimensional grid. See, H. J. Zhang and D. Zhong, “A scheme for visual feature based image indexing,” in Proc. SPIE/IS & T Conf. Storage Retrieval Image Video Database III , Vol. 2420, pp. 36-46, San Jose, Calif., February, 1995.
- Peer-to-Peer (P2P) is a class of applications making the most of previously unused resources (for example, storage, content, and/or CPU cycles) that are available on the peers at the edges of networks.
- P2P computing allows the peers to share the resources and services, or to aggregate CPU cycles, or to chat with each other, by direct exchange.
- Two of the more popular implementations of P2P computing are Napster and Gnutella.
- Napster has its peers register files with a broker, and uses the broker to search for files to copy.
- the broker plays the role of server in a client-server model to facilitate the interaction between the peers.
- Gnutella has peers register files with network neighbors, and searches the P2P network for files to copy. Since this model does not require a centralized broker, Gnutella is considered to be a true P2P system.
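The brokered (Napster-style) model can be sketched as a central index mapping filenames to the peers that hold them; the broker only answers searches, while the copy itself happens peer-to-peer. The names and dictionary-based index below are illustrative, not Napster's actual protocol:

```python
# Central broker index: filename -> list of peers registering that file.
broker_index = {}

def register(peer, filename):
    """A peer registers one of its files with the broker."""
    broker_index.setdefault(filename, []).append(peer)

def lookup(filename):
    """The broker answers a search; the download itself is peer-to-peer."""
    return broker_index.get(filename, [])

register("peer-a", "clip.mov")
register("peer-b", "clip.mov")
print(lookup("clip.mov"))  # ['peer-a', 'peer-b']
```

Gnutella replaces the central `broker_index` with queries flooded to network neighbors, which is why it is considered a true P2P system: no single node holds the index.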
- the first problem of the prior art method is that it requires additional storage to store the new version of an edited video file.
- Conventional video editing software generally uses the original input video file to create an edited video.
- when editors having a large database of videos edit those videos to create a new one, storage is wasted storing duplicated portions of the video.
- the second problem with the prior art method is that entirely new metadata have to be generated for a newly created video. If the metadata are not edited in accordance with the editing of the video, then even when metadata for the specific segment of the input video have already been constructed, the metadata may not accurately reflect the content. Because considerable effort is required to create the metadata of videos, it is desirable to reuse existing metadata efficiently, if possible.
- Metadata of a video segment contain textual information such as time information (for example, starting frame number and duration, or starting frame number as well as the finishing frame number), title, keyword, and annotation, as well as image information such as the key frame of a segment.
- the metadata of segments can form a hierarchical structure in which a larger segment contains smaller segments. Because it is hard to store both the video and its metadata in a single file, the video metadata are separately stored as a metafile, or stored in a database management system (DBMS).
- Metadata having a hierarchical structure support browsing a whole video, searching for a segment using the keyword and annotation of each segment, and using the key frames of each segment for a visual summary of the video. Also, they support not only simple playback, but also the playback and repeated playback of a specific segment. Therefore, the use of hierarchically-structured metadata is becoming popular.
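A hierarchically structured segment description of this kind can be sketched as a tree of segments, each carrying time information and textual metadata, with keyword search walking the tree. Field names are illustrative, not from any particular metadata standard:

```python
from dataclasses import dataclass, field

@dataclass
class Segment:
    """A video segment: a time span, textual metadata, nested sub-segments."""
    start_frame: int
    end_frame: int
    title: str = ""
    keywords: list = field(default_factory=list)
    children: list = field(default_factory=list)  # smaller contained segments

    def search(self, keyword):
        """Depth-first search for segments annotated with `keyword`."""
        hits = [self] if keyword in self.keywords else []
        for child in self.children:
            hits.extend(child.search(keyword))
        return hits

news = Segment(0, 5000, "News", children=[
    Segment(0, 2000, "Headlines", keywords=["politics"]),
    Segment(2000, 5000, "Sports", keywords=["soccer"]),
])
print([s.title for s in news.search("soccer")])  # ['Sports']
```

The same tree supports the browsing and segment-playback uses described above: a player can jump straight to a matched segment's `start_frame`.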
- Multimedia data are accessed by ever increasing kinds of devices such as hand-held computers (HHCs), personal digital assistants (PDAs), and smart cellular phones.
- multimedia content is accessed in a universal fashion from a wide variety of devices. See, J. R. Smith, R. Mohan and C. Li, “Transcoding Internet Content for Heterogeneous Client Devices,” in Proc. ISCASA , Monterey, Calif., 1998.
- a data representation, the InfoPyramid is a framework for aggregating the individual components of multimedia content with content descriptions, and methods and rules for handling the content and content descriptions. See, C. Li, R. Mohan and J. R. Smith, “Multimedia Content Description in the InfoPyramid,” in Proc. IEEE Intern. Conf. on Acoustics, Speech and Signal Processing , May, 1998.
- the InfoPyramid describes content in different modalities, at different resolutions, and at multiple abstractions. Then a transcoding tool dynamically selects the resolutions or modalities that best meet the client capabilities from the InfoPyramid.
- the importance value describes the relative importance of a region/block in the image presentation compared with the other regions. This value ranges from 0 to 1, where 1 denotes the most important region and 0 the least. For example, regions of high importance are compressed with a lower compression factor than the remaining part of the image. The other parts of the image are first blurred and then compressed with a higher compression factor in order to reduce the overall data size of the compressed image.
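One way to read this scheme is as a mapping from a region's importance value to its compression factor. The endpoint factors below are invented for illustration and are not values from the prior art being described:

```python
def compression_factor(importance, min_factor=10, max_factor=80):
    """Map a region's importance (0..1) to a compression factor.

    Higher importance -> lower compression factor (better fidelity).
    The min/max endpoints are illustrative assumptions.
    """
    assert 0.0 <= importance <= 1.0
    return max_factor - importance * (max_factor - min_factor)

print(compression_factor(1.0))  # 10.0 -> most important region, least compressed
print(compression_factor(0.0))  # 80.0 -> least important region, most compressed
```

Note that this only ranks regions against each other; as the following paragraphs argue, it says nothing quantitative about how much compression or resolution reduction remains perceptually acceptable.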
- a scaling mechanism such as format/resolution change, bit-wise data size reduction, and object dropping, is needed. More specifically, when an image is transmitted to a variety of client devices with different display sizes, a system should generate a transcoded (e.g., scaled and cropped) image to fit the size of the respective client display.
- the extent of transcoding depends on the type of objects embedded in the image, such as cards, bridges, faces, and so forth. Consider, for example, an image containing embedded text or a human face. If the display size of a client device is smaller than the size of the image, the spatial resolution of the image must be reduced by sub-sampling and/or cropping to fit the client display.
- while the importance value may be used to provide information on which part of the image can be cropped, it does not provide a quantified measure of perceptibility indicating the degree of allowable transcoding.
- the prior art does not provide the quantitative information on the allowable compression factor with which the important regions can be compressed while preserving the minimum fidelity that an author or a publisher intended.
- the InfoPyramid does not provide either the quantitative information about how much the spatial resolution of the image can be reduced or ensure that the user will perceive the transcoded image as the author or publisher initially intended.
- the first step for indexing and retrieving of visual data is to temporally segment the input video, that is, to find shot boundaries due to camera shot transitions.
- the temporally segmented shots can improve the storing and retrieving of visual data if keywords to the shots are also available.
- a fast and accurate automatic shot detector needs to be developed as well as an automatic text caption detector to automatically annotate keywords to the temporally segmented shots.
- Visual rhythm contains distinctive patterns or visual features for many types of video editing effects, especially for wipe-like effects, which manifest as visually distinguishable lines or curves on the visual rhythm and can be computed in very little time. This enables easy human verification of automatically detected shots, without playing the whole frame sequence, in order to minimize or possibly eliminate both false and missing shots.
- Visual rhythm, on the other hand, also contains visual features readily usable for detecting caption text. See, H. Kim, J. Lee and S. M. Song, "An efficient graphical shot verifier incorporating visual rhythm," in Proceedings of IEEE International Conference on Multimedia Computing and Systems, pp. 827-834, June 1999.
- Lienhart and Stuber provided a split-and-merge algorithm based on characteristics of artificial text to segment text. See, R. Lienhart, "Automatic Text Recognition for Video Indexing," in Proc. of ACM Multimedia, pp. 11-20.
- Doermann and Kia used wavelet analysis and employed a multi-frame coherence approach to cluster edges into rectangular shapes. See, L. Doermann, O. Kia, "Automatic Text Detection and Tracking in Digital Video," in IEEE Trans. on Image Processing, Vol. 9, pp. 147-156. Sato et al. adopted a multi-frame integration technique to separate static text from a moving background. See, T. Sato, T. Kanade and S. Satoh, "Video OCR: Indexing Digital News Libraries by Recognition of Superimposed Captions," in Multimedia Systems, Vol. 7, pp. 385-394.
- Yeo and Liu proposed a method for detecting text caption events in video using modified scene change detection; however, it cannot handle captions that gradually enter or disappear from frames.
- See, B. L. Yeo, "Visual Content Highlighting via Automatic Extraction of Embedded Captions on MPEG Compressed Video," in SPIE/IS&T Symp. on Electronic Imaging Science and Technology, Vol. 2668, 1996.
- Zhong et al. examined the horizontal variations of AC values in the DCT domain to locate text frames and examined the vertical intensity variation within the text regions to extract the final text regions. See, Y. Zhong, K. Karu and A. Jain, "Automatic captions localization in compressed video," in IEEE Trans.
- the invention overcomes the above-identified problems, as well as other shortcomings and deficiencies of existing technologies, by providing the systems and methods described below.
- Multimedia Bookmark provides a system and method for accessing multimedia content stored in a multimedia file having a beginning and an intermediate point, the content having at least one segment at the intermediate point.
- the system includes a multimedia bookmark, the multimedia bookmark having content information about the segment at the intermediate point, wherein a user can utilize the multimedia bookmark to access the segment without accessing the beginning of the multimedia file.
- the system of the present invention can include a wide area network such as the Internet. Moreover, the method of the present invention can facilitate the creating, storing, indexing, searching, retrieving and rendering of multimedia content on any device capable of connecting to the network and performing one or more of the aforementioned functions.
- the multimedia content can be one or more frames of video, audio data, text data such as a string of characters, or any combination or permutation thereof.
- the system of the present invention includes a search mechanism that locates a segment in the multimedia file.
- An access mechanism is included in the system that reads the multimedia content at the segment designated by the multimedia bookmark.
- the multimedia content can be partial data that are related to a particular segment.
- the multimedia bookmark used in conjunction with the system of the present invention includes positional information about the segment.
- the positional information can be a URI, an elapsed time, a time code, or other information.
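The bookmark record described above (positional information plus content information about the segment) can be sketched as a simple data structure. The field names here are assumptions for illustration, not taken from the patent:

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class MultimediaBookmark:
    """Minimal sketch of a multimedia bookmark: positional information
    (URI, elapsed time, time code) plus content information (features)."""
    uri: str                            # where the multimedia file lives
    elapsed_time: float                 # seconds from the beginning of the file
    time_code: Optional[str] = None     # e.g. "00:12:34:05" (SMPTE-style)
    title: Optional[str] = None
    features: dict = field(default_factory=dict)  # content info for search

    def position(self):
        """Return the positional information needed to resume playback at
        the intermediate point without accessing the file's beginning."""
        return {"uri": self.uri, "elapsed": self.elapsed_time}
```

A player would pass `position()` to its access mechanism to seek directly to the bookmarked segment.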
- the multimedia file used in conjunction with the system of the present invention can be contained on local storage or stored at remote locations.
- the system of the present invention can be a computer server that is operably connected to a network that has connected to it one or more client devices.
- Local storage on the server can optionally include a database and sufficient circuitry and/or logic, in the form of hardware and/or software in any combination that facilitates the storing, indexing, searching, retrieving and/or rendering of multimedia information.
- the present invention further provides a methodology and implementation for adaptive refresh rewinding, as opposed to traditional rewinding, which simply performs a rewind from a particular position by a predetermined length.
- the exemplary embodiment described below will demonstrate the present invention using video data.
- Three essential parameters are identified to control the behavior of adaptive refresh rewinding, that is, how far to rewind, how to select certain frames in the rewind interval, and how to present the chosen refresh video frames on a display device.
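The first two of these parameters (how far to rewind and which frames to select in the rewind interval) can be sketched as below. This is an assumed selection strategy for illustration: one representative position per shot in the interval, evenly sampled if there are too many.

```python
def refresh_frames(current_pos, rewind_length, shot_boundaries, max_frames=5):
    """Sketch of adaptive refresh rewinding. Instead of replaying the
    whole interval, pick representative positions (here, shot boundaries)
    inside [current_pos - rewind_length, current_pos] as refresh frames."""
    start = max(0, current_pos - rewind_length)
    # keep only shot boundaries that fall inside the rewind interval
    inside = [b for b in shot_boundaries if start <= b <= current_pos]
    # if there are more shots than we can present, sample them evenly
    if len(inside) > max_frames:
        step = len(inside) / max_frames
        inside = [inside[int(i * step)] for i in range(max_frames)]
    return inside
```

The third parameter, how the chosen refresh frames are presented on the display device, is left to the player (e.g. a thumbnail strip or rapid slide show).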
- the present invention also provides a new way to generate and deliver programming information that is customized to the user's viewing preferences.
- This embodiment of the present invention removes the navigational difficulties associated with EPG. Specifically, data regarding the user's habits of recording, scheduling, and/or accessing TV programs or Internet movies are captured and stored. Over a long period of time, these data can be analyzed and used to determine the user's trends or patterns that can be used to predict future viewing preferences.
- the present invention also relates to the techniques to solve the two problems by downloading the metadata from a distant metadata server and then synchronizing/matching the content with the received metadata. While this invention is described in the context of video content stored on STB having PVR function, it can be extended to other multimedia content such as audio.
- the present invention also allows the reuse of content prerecorded on analog VCR videotapes, for which the present invention works equally well.
- the present invention also provides a method for searching for relevant multimedia content based on at least one feature saved in a multimedia bookmark.
- the method preferably includes transmitting at least one feature saved in a multimedia bookmark from a client system to a server system in response to a user's selection of the multimedia bookmark.
- the server may then generate a query for each feature received and, subsequently, use each query generated to search one or more storage devices.
- the search results may be presented to the user upon completion.
- the present invention provides a method for verifying inclusion of attachments to electronic mail messages.
- the method preferably includes scanning the electronic mail message for at least one indicator of an attachment to be included and determining whether at least one attachment to the electronic mail message is present upon detection of the at least one indicator.
- the method preferably also includes displaying a reminder to a user that no attachment is present.
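The attachment-verification method above (scan for an indicator, check for attachments, remind if none is present) can be sketched in a few lines. The indicator phrase list and function name are assumptions for the example:

```python
# Assumed indicator phrases; a real system would use a richer list.
INDICATORS = ("attached", "attachment", "enclosed", "see the file")

def missing_attachment_reminder(body, attachments):
    """Scan the message body for indicator phrases; if one is found while
    no attachment is present, return a reminder string, else None."""
    text = body.lower()
    if any(phrase in text for phrase in INDICATORS) and not attachments:
        return "Reminder: your message mentions an attachment, but none is attached."
    return None
```

A mail client would call this just before sending and display the returned reminder to the user.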
- the present invention provides a method for searching for multimedia content in a peer to peer environment.
- the method preferably includes broadcasting a message from a user system to announce its entrance to the peer to peer environment.
- Active nodes in the peer to peer environment preferably acknowledge receipt of the broadcast message while the user system preferably tracks the active nodes.
- a query message including multimedia features is preferably broadcast to the peer to peer environment.
- a multimedia search engine on a multimedia database included in a storage device on one or more active nodes is preferably executed.
- a search results message including a listing of found filenames and network locations is preferably sent to the user system upon completion of the database search.
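The peer-to-peer exchange above can be simulated in memory to show the message flow: announce on entry, acknowledgement and tracking of active nodes, then a broadcast feature query with per-node search results. Class and method names are assumptions, and real networking is omitted.

```python
class Node:
    """Minimal in-memory sketch of a peer in the described environment."""
    def __init__(self, name, files):
        self.name = name
        self.files = files          # filename -> set of multimedia features
        self.known_peers = []       # active nodes this peer tracks

    def announce(self, network):
        """Broadcast entrance; active nodes acknowledge by registering the
        newcomer, and the newcomer tracks each acknowledging node."""
        for peer in network:
            if peer is not self:
                peer.known_peers.append(self.name)
                self.known_peers.append(peer.name)

    def query(self, network, wanted_feature):
        """Broadcast a feature query; each node searches its own database
        and returns (filename, node_name) results."""
        results = []
        for peer in network:
            for fname, feats in peer.files.items():
                if wanted_feature in feats:
                    results.append((fname, peer.name))
        return results
```

In a real deployment the query would carry multimedia features (e.g. from a multimedia bookmark) and each node would run its own multimedia search engine over local storage.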
- the present invention further provides a method for sending a multimedia bookmark between devices over a wireless network.
- the method preferably includes acknowledging receipt of a multimedia bookmark by a video bookmark message service center upon receipt of the multimedia bookmark from a sending device.
- the video bookmark message service center preferably invokes a send multimedia bookmark operation at a mobile switching center.
- the mobile switching center then preferably sends the multimedia bookmark and, upon acknowledgement of receipt of the multimedia bookmark by the recipient device, notifies the video bookmark message service center of the completed multimedia bookmark transaction.
- the present invention provides a method for sending multimedia content over a wireless network for playback on a mobile device.
- the mobile device preferably sends a multimedia bookmark and a request for playback to a mobile switching center.
- the mobile switching center then preferably sends the request and the multimedia bookmark to a video bookmark message service center.
- the video bookmark message service center then preferably determines a suitable bit rate for transmitting the multimedia content to the mobile device. Based on the bit rate and various characteristics of the mobile device, the video bookmark message service center also preferably calculates a new multimedia bookmark.
- the new multimedia bookmark is then sent to a multimedia server which streams the multimedia content to the video bookmark message service center before the multimedia content is delivered to the mobile device via the mobile switching center.
- the present invention further provides a new approach to utilizing user-established relevance between images.
- the method of the present invention uses only direct links between images without relying on image descriptors such as low-level image features or textual annotations. Users provide relevance information in the form of relevance feedback, and the information is accumulated in each image's queue of links and propagated through linked images in a relevance graph.
- the collection of direct image links can be effective for the retrieval of subjectively similar images when they are gathered from a large number of users over a considerable period of time.
- the present invention can be used in conjunction with other content-based and text-based image retrieval methods.
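The link-accumulation idea above (per-image queues of direct links, filled from relevance feedback and propagated through the relevance graph) can be sketched as follows. The class name, queue size, and propagation depth are assumptions for illustration:

```python
from collections import defaultdict

class RelevanceGraph:
    """Sketch of accumulating user relevance feedback as direct links
    between images; no low-level features or text annotations are used."""
    def __init__(self, queue_size=5):
        self.queue_size = queue_size
        self.links = defaultdict(list)   # image -> queue of linked images

    def feedback(self, image_a, image_b):
        """A user judged image_a and image_b relevant to each other."""
        for src, dst in ((image_a, image_b), (image_b, image_a)):
            q = self.links[src]
            q.append(dst)
            if len(q) > self.queue_size:
                q.pop(0)                 # drop the oldest link

    def related(self, image, depth=2):
        """Propagate through linked images up to `depth` hops."""
        seen, frontier = {image}, [image]
        for _ in range(depth):
            frontier = [n for img in frontier for n in self.links[img]
                        if n not in seen]
            seen.update(frontier)
        return seen - {image}
```

Gathered from many users over time, such queues retrieve subjectively similar images that descriptor-based methods can miss.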
- the present invention also provides a new method to fast find from a large database of image/frames the objects close enough to a query image/frame under a certain distortion.
- the present invention reduces the number of distance evaluations at query time, thus resulting in fast retrieval of data objects from the database.
- the present invention sorts and stores in advance the distances to a group of predefined distinguished points (called reference points) in the feature space and performs binary searches on the distances so as to speed up the search.
- the present invention introduces an abstract multidimensional structure called a hypershell. More practically, a hypershell can be conceived as the set of all feature vectors in the feature space which lie at a distance of r±ε from its corresponding reference point, where r is the distance between a query feature point and the reference point, and ε is a real number indicating the fidelity of search results. The intersection of such hypershells yields intersected regions which are often small partitions of the whole feature space. Therefore, instead of the whole feature space, the present invention performs the search only on the intersected regions to improve the search speed.
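The reference-point scheme above can be sketched concretely: distances from every database point to a few fixed reference points are sorted in advance, and at query time a binary search keeps only points whose distance to each reference point lies within r ± ε, i.e. inside the intersection of the hypershells. Class and method names are assumptions; the example uses 1-D points for clarity.

```python
import bisect

class HypershellIndex:
    """Sketch of hypershell search via sorted reference-point distances."""
    def __init__(self, points, references, dist):
        self.points, self.refs, self.dist = points, references, dist
        # one sorted (distance, point_index) table per reference point
        self.tables = [sorted((dist(p, ref), i) for i, p in enumerate(points))
                       for ref in references]

    def candidates(self, query, eps):
        """Intersect the hypershells: for each reference point, binary-search
        the sorted table for distances in [r - eps, r + eps]."""
        surviving = None
        for ref, table in zip(self.refs, self.tables):
            r = self.dist(query, ref)
            lo = bisect.bisect_left(table, (r - eps, -1))
            hi = bisect.bisect_right(table, (r + eps, len(self.points)))
            shell = {i for _, i in table[lo:hi]}
            surviving = shell if surviving is None else surviving & shell
        return surviving

    def search(self, query, eps):
        """Evaluate the true distance only on the intersected region."""
        return sorted(self.points[i] for i in self.candidates(query, eps)
                      if self.dist(query, self.points[i]) <= eps)
```

By the triangle inequality, no true match can fall outside any shell, so the pruning is safe while the expensive distance evaluations are confined to the small intersected region.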
- the present invention further provides a new approach to editing video materials, in which it only virtually edits the metadata of input videos to create a new video, instead of actually editing videos stored as computer files.
- the virtual editing is performed either by copying the metadata of a video segment of interest in an input metafile or copying only the URI of the segment into a newly constructed metafile.
- the present invention provides a way of playing the newly edited video only with its metadata.
- the present invention also provides a system for the virtual editing.
- the present invention can be applied not only to videos stored on CD-ROM, DVD, and hard disk, but also to streaming videos over a network.
- the present invention also provides a method for virtual editing multimedia files. Specifically, one or more video files are provided. A metadata file is created for each of the video files, each of the metadata files having at least one segment to be edited. Thereafter, a single edited metafile is created that contains the segments that are to be edited from each of the metadata files so that, when the edited metafile is accessed, the user is able to play the segments in the edited order.
- the present invention also provides a method for virtual editing multimedia files. Specifically, one or more video files are provided. A metadata file is created for each of the video files, each of the metadata files having at least one segment to be edited. Thereafter, a single edited metafile is created that contains links to the segments that are to be edited from each of the metadata files so that, when the edited metafile is accessed, the user is able to play the segments in the edited order.
- the present invention also includes a method for editing a multimedia file by providing a metafile, the metafile having at least one segment that is selectable; selecting a segment in the metafile; determining if a composing segment should be created, and if the composing segment should be created, then creating a composing segment in a hierarchical structure; specifying the composing segment as a child of a parent composing segment; determining if metadata is to be copied or if a URI is to be used; if the metadata is to be copied, then copying metadata of the selected segment to the component segment; if the URI is to be used, then writing a URI of the selected segment to the component segment; writing a URL of an input video file to the component segment; determining if all URLs of any sibling files are the same; and if the URL is the same as any of the sibling's URLs, then writing the URL to the parent composing segment and deleting the URLs of all sibling segments.
- the method for editing a multimedia file includes determining if another segment is to be selected and if another segment is to be selected, then performing the step of selecting a segment in a metafile.
- the method includes determining if another metafile is to be browsed and if another metafile is to be browsed, then performing the step of providing a metafile.
- the metafiles may be XML files or some other format.
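Since the metafiles may be XML, the two variants of the editing method above (copying a segment's metadata into the component segment versus writing only a URI reference) can be sketched with the standard library. Element and attribute names are assumptions; the patent does not prescribe a schema.

```python
import xml.etree.ElementTree as ET

def add_component(parent, video_url, start, end, metadata=None, uri=None):
    """Add one component segment to the edited metafile. Either the
    segment's metadata is copied in, or only a URI reference is written;
    the video file itself is never touched."""
    seg = ET.SubElement(parent, "segment",
                        {"src": video_url, "start": str(start), "end": str(end)})
    if metadata is not None:                 # copy-metadata variant
        ET.SubElement(seg, "metadata").text = metadata
    elif uri is not None:                    # link-by-URI variant
        ET.SubElement(seg, "ref", {"uri": uri})
    return seg

# Build an edited metafile from segments of two source videos; playing it
# plays the segments in the edited order without editing the video files.
root = ET.Element("editedMetafile")
add_component(root, "a.mpg", 0, 30, metadata="opening scene")
add_component(root, "b.mpg", 120, 150, uri="b.xml#seg4")
```

A player walking `root` in document order would fetch each segment from its `src` URL at the given start/end positions, realizing the virtual edit.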
- the present invention also provides a virtual video editor in one embodiment.
- the virtual video editor includes a network controller constructed and arranged to access remote metafiles and remote video files and a file controller in operative connection to the network controller and constructed and arranged to access local metafiles and local video files, and to access the remote metafiles and the remote video files via the network controller.
- a parser constructed and arranged to receive information about the files from the file controller and an input buffer constructed and arranged to receive parser information from the parser are also included in the virtual video editor.
- a structure manager constructed and arranged to provide structure data to the input buffer
- a composing buffer constructed and arranged to receive input information from the input buffer and structure information from the structure manager to generate composing information
- a generator constructed and arranged to receive the composing information from the composing buffer is preferably included, wherein the generator generates output information in a pre-selected format.
- the virtual video editor also includes a playlist generator constructed and arranged to receive structure information from the structure manager in order to generate playlist information and a video player constructed and arranged to receive the playlist information from the playlist generator and file information from the file controller in order to generate display information.
- the virtual video editor also includes a display device constructed and arranged to receive the display information from the video player and to display the display information to a user.
- the present invention provides a method for transcoding an image for display at multiple resolutions.
- the method includes providing a multimedia file, designating one or more regions of the multimedia file as focus zones and providing a vector to each of the focus zones.
- the method continues by reading the multimedia file with a client device, the client device having a maximum display resolution, and determining if the resolution of the multimedia file exceeds the maximum display resolution of the client device. If the multimedia file resolution exceeds the maximum display resolution of the display device, the method determines the maximum number of focus zones that can be displayed on the client device. Finally, the method includes displaying the maximum number of focus zones on the client device.
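The focus-zone step above can be sketched as follows. The zone format `(x, y, w, h)` and the greedy bounding-box rule are assumptions: zones are taken in importance order, and zones are added while the bounding box of the chosen zones still fits the client display.

```python
def max_focus_zones(zones, display_w, display_h):
    """Return the focus zones that can be shown on a display of size
    display_w x display_h. `zones` is a list of (x, y, w, h) rectangles
    ordered by importance."""
    chosen = []
    for x, y, w, h in zones:
        trial = chosen + [(x, y, w, h)]
        min_x = min(z[0] for z in trial)
        min_y = min(z[1] for z in trial)
        max_x = max(z[0] + z[2] for z in trial)
        max_y = max(z[1] + z[3] for z in trial)
        if max_x - min_x <= display_w and max_y - min_y <= display_h:
            chosen = trial
        else:
            break   # zones are in importance order, so stop at first miss
    return chosen
```

The client would then crop the image to the bounding box of the returned zones rather than uniformly shrinking everything.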
- the present invention also provides a novel scheme for generating a transcoded (scaled and cropped) image to fit the size of the respective client display when an image is transmitted to a variety of client devices with different display sizes.
- the scheme has two key components: 1) perceptual hint for each image block, and 2) an image transcoding algorithm.
- the perceptual hint provides information on the minimum allowable spatial resolution. Specifically, it provides quantitative information on how much the spatial resolution of the image can be reduced while ensuring that the user will perceive the transcoded image as the author or publisher intended.
- the image transcoding algorithm, which is basically a content adaptation process, selects the best image representation to meet the client capabilities while delivering the largest content value.
- the content adaptation algorithm is modeled as a resource allocation problem to maximize the content value.
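The resource-allocation model above can be sketched directly: each embedded object offers several representations (each with a resource cost and a content value), and the adaptation picks one representation per object to maximize total value within the resource budget. This brute-force formulation is an illustration, not the patent's algorithm; the data layout is assumed.

```python
from itertools import product

def best_representations(objects, budget):
    """`objects` is a list where each entry lists the (cost, value)
    variants available for one object (a (0, 0) entry would model
    dropping the object). Pick one variant per object, maximizing total
    value subject to total cost <= budget. Brute force for clarity."""
    best_value, best_choice = -1, None
    for choice in product(*objects):
        cost = sum(c for c, _ in choice)
        value = sum(v for _, v in choice)
        if cost <= budget and value > best_value:
            best_value, best_choice = value, choice
    return best_choice, best_value
```

For realistic numbers of objects a dynamic-programming knapsack solver would replace the exhaustive enumeration, but the objective is the same.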
- One of the embodiments of the method of the present invention provides a fast and efficient approach for constructing visual rhythm. Unlike conventional approaches, which decode all pixels composing a frame to obtain a certain group of pixel values using conventional video decoders, the present invention provides a method in which only a few of the pixels composing a frame are decoded to obtain the actual group of pixels needed for constructing the visual rhythm. Most video compression schemes adopt intraframe and interframe coding to reduce spatial as well as temporal redundancies. Therefore, once the group of pixels is determined for constructing the visual rhythm, one only decodes this group of pixels in frames which are not referenced by other frames for interframe coding.
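Once the sampled pixel group is decoded, assembling the visual rhythm itself is simple. The sketch below uses the main diagonal of each frame as the sampled group (one common sampling strategy; the patent's FIG. 26 discusses several) and stacks one line per frame over time, so cuts and wipes appear as 2-D patterns.

```python
def visual_rhythm(frames):
    """Build a visual rhythm from decoded frames: sample the main
    diagonal of each frame and stack the lines over time. `frames` is a
    list of 2-D pixel grids (lists of rows); rows of the result are
    time, columns are position along the diagonal."""
    rhythm = []
    for frame in frames:
        n = min(len(frame), len(frame[0]))
        rhythm.append([frame[i][i] for i in range(n)])
    return rhythm
```

In the fast embodiment described above, only these diagonal pixels would be decoded from non-reference frames, rather than full frames.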
- another embodiment of the method of the present invention provides an efficient and fast compressed-DCT-domain method to locate caption text regions in intra-coded and inter-coded frames through visual rhythm. It builds on the observation that caption text generally tends to appear in certain areas of the video or in areas known a priori. The method employs a combination of contrast and temporal coherence information on the visual rhythm to detect text frames, and uses information obtained through the visual rhythm to locate caption text regions in the detected text frames along with their temporal duration within the video.
- a content transcoder for modifying and forwarding multimedia content maintained in one or more multimedia content databases to a wide area network for display on a requesting client device.
- the content transcoder preferably includes a policy engine coupled to the multimedia content database and a content analyzer operably coupled to both the policy engine and the multimedia content database.
- the content transcoder of the present invention also preferably includes a content selection module operably coupled to both the policy engine and the content analyzer and a content manipulation module operably coupled to the content selection module.
- the content transcoder preferably includes a content analysis and manipulation library operably coupled to the content analyzer, the content selection module and the content manipulation module.
- the policy engine may receive a request for multimedia content from the requesting client device via the wide area network and policy information from the multimedia content database.
- the content analyzer may retrieve multimedia content from the multimedia content database and forward the multimedia content to the content selection module.
- the content selection module may select portions of the multimedia content based on the policy information and information from the content analysis and manipulation library and forward the selected portions of multimedia content to the content manipulation module.
- the content manipulation module may then modify the multimedia content for display on the requesting client device before transmitting the modified multimedia content over the wide area network to the requesting client device.
- FIG. 1 is an illustration of a conventional prior art bookmark.
- FIG. 2 is an illustration of a multimedia bookmark in accordance with the present invention.
- FIG. 3 is an illustration of exemplary searching for multimedia content relevant to the content information saved in the multimedia bookmark of the present invention, where both positional and content information are used.
- FIG. 4 is an illustration of an exemplary tree structure used by two exemplary search methods in accordance with the present invention.
- FIG. 5 is an example of five variations encoded by the present invention from the same source video content.
- FIG. 6 is an example of two multimedia contents and their associated metadata of the present invention.
- FIG. 7 is a list of example multimedia bookmarks of the present invention.
- FIG. 8 is an illustration of an exemplary method of adjusting bookmarked positions in the durable bookmark system of the present invention.
- FIG. 9 is an illustration of an exemplary user interface incorporating a multimedia bookmark of the present invention.
- FIG. 10 is a flowchart illustrating an exemplary embodiment of a method of the present invention that is effective to implement the disclosed processing system.
- FIG. 11 is a flowchart illustrating the overall process of saving and retrieving multimedia bookmarks of the present invention.
- FIG. 12 is a flowchart illustrating an exemplary process of playing a multimedia bookmark of the present invention.
- FIG. 13 is a flowchart illustrating an exemplary process of deleting a multimedia bookmark of the present invention.
- FIG. 14 is a flowchart illustrating an exemplary process of adding a title to a multimedia bookmark of the present invention.
- FIG. 15 is a flowchart illustrating an exemplary process of the present invention for searching for the relevant multimedia content based upon content, as well as textual information if available.
- FIG. 16 is a flow chart illustrating an exemplary process of the present invention for sending a bookmark to other people via e-mail.
- FIG. 17 is a flowchart illustrating an exemplary method of the present invention for e-mailing a multimedia bookmark of the present invention.
- FIG. 18 is a block diagram illustrating an exemplary system for transmitting multimedia content to a mobile device using the multimedia bookmark of the present invention.
- FIG. 19 is a block diagram illustrating an exemplary message signal arrangement of the present invention between a personal computer and a mobile device.
- FIG. 20 is a block diagram illustrating an exemplary message signal arrangement of the present invention between two mobile devices.
- FIG. 21 is a block diagram illustrating an exemplary message signal arrangement of the present invention between a video server and a mobile device.
- FIG. 22 is a block diagram illustrating an exemplary data correlation method of the present invention.
- FIG. 23 is a block diagram illustrating an exemplary swiping technique of the present invention.
- FIG. 24 is a block diagram illustrating an alternate exemplary swiping technique of the present invention.
- FIG. 25 is a flowchart illustrating an exemplary peer-to-peer exchange of the multimedia bookmark of the present invention.
- FIG. 26 is a block diagram illustrating different sampling strategies.
- FIG. 27 is a block diagram illustrating an exemplary visual rhythm method of the present invention.
- FIG. 28 is a block diagram illustrating the localization and segmentation of text information according to the present invention.
- FIG. 29 is a block diagram illustrating the use of an exemplary Haar transformation according to the present invention.
- FIG. 30 is a block diagram illustrating an exemplary queue for image links of the present invention.
- FIG. 31 is a block diagram illustrating an alternate exemplary queue for image links of the present invention.
- FIGS. 32(a) and 32(b) are block diagrams illustrating a comparison of a prior art video methodology and an exemplary editing method of the present invention.
- FIG. 33 is a block diagram illustrating an exemplary segmentation and reconstruction of a new multimedia video presentation according to the method of the present invention.
- FIG. 34 is a block diagram illustrating an exemplary edited multimedia file according to the present invention.
- FIG. 35 is a flowchart of an exemplary method of the present invention for virtual video editing based on metadata.
- FIG. 36 is an exemplary pseudocode implementation of the method of the present invention.
- FIG. 37 is an exemplary pseudocode implementation of the method of the present invention.
- FIG. 38 is an exemplary pseudocode implementation of the method of the present invention.
- FIG. 39 is an exemplary pseudocode implementation of the method of the present invention.
- FIG. 40 is an exemplary pseudocode implementation of the method of the present invention.
- FIG. 41 is an exemplary pseudocode implementation of the method of the present invention.
- FIG. 42 is a block diagram illustrating an exemplary virtual video editor of the present invention.
- FIG. 43 is a block diagram illustrating an exemplary transcoding method of the present invention without SRR value.
- FIG. 44 is a block diagram illustrating an exemplary transcoding method of the present invention with SRR value.
- FIG. 45 is a block diagram illustrating an exemplary content transcoder of the present invention.
- FIG. 46 is a block diagram illustrating an exemplary adaptive window focusing method of the present invention.
- FIG. 47 is a block diagram and table illustrating image nodes and edges according to an exemplary method of the present invention.
- FIG. 48 is a block diagram illustrating an exemplary hypershell search method of the present invention.
- FIG. 49 is a block diagram illustrating the contents of an embodiment of the video bookmark of the present invention.
- FIG. 50 is a block diagram illustrating the recommendation engine of the present invention.
- FIG. 51 is a block diagram illustrating the video bookmark process of the present invention in conjunction with an EPG channel.
- FIG. 52 is a block diagram illustrating the video bookmark process of the present invention in conjunction with a network.
- FIG. 53 is a block diagram of the system of the present invention.
- FIG. 54 is a block diagram of an exemplary relevance queue of the present invention.
- FIG. 55 is a timeline diagram showing an exemplary embodiment of the rewind method of the present invention.
- FIG. 56 is a timeline diagram showing an exemplary embodiment of the rewind method of the present invention.
- FIG. 57 is a flowchart showing an exemplary embodiment of the retrieval method of the present invention.
- FIG. 58 is a flowchart showing another exemplary embodiment of the retrieval method of the present invention.
- FIG. 59 is a flowchart showing another exemplary embodiment of the retrieval method of the present invention.
- FIG. 60 is a block diagram illustrating a hierarchical arrangement of images that exemplifies a navigation method of the present invention.
- FIG. 61 is an illustration of a web page having an exemplary duration bar of the present invention.
- FIG. 62 is an illustration of a web page having an exemplary duration bar of the present invention.
- FIG. 63 is a diagram illustrating an exemplary hypershell search method of the present invention.
- FIG. 64 is a diagram illustrating another exemplary hypershell search method of the present invention.
- FIG. 65 is a diagram illustrating another exemplary hypershell search method of the present invention.
- FIG. 66 is a diagram illustrating another exemplary hypershell search method of the present invention.
- FIG. 67 is a diagram illustrating another exemplary hypershell search method of the present invention.
- FIG. 68 is a block diagram illustrating an exemplary embodiment of the metadata server and metadata agent of the present invention.
- FIG. 69 is a block diagram illustrating an alternate exemplary embodiment of the metadata server and metadata agent of the present invention.
- FIG. 70 is a timeline comparison illustrating exemplary offset recording capability of the present invention.
- FIG. 71 is a timeline comparison illustrating alternate exemplary offset recording capability of the present invention.
- FIG. 72 is a timeline comparison illustrating exemplary interrupt recording capability of the present invention.
- FIG. 73 is a timeline comparison illustrating the exemplary disparate and sequential recording capabilities of the present invention.
- FIG. 53 illustrates the system of the present invention.
- a Wide Area Network 5350, most famously embodied in the Internet.
- the present invention can be contained within the server 5314 , as well as within a series of clients such as Laptop 5322 , Video Camera 5324 , Telephone 5326 , Digitizing Pad 5328 , Personal Digital Assistant (PDA) 5330 , Television 5332 , Set Top Box 5340 (that is connected to and serves Television 5338 ), Scanner 5334 , Facsimile Machine 5336 , Automobile 5302 , Truck 5304 , Screen 5308 , Work Station 5312 , Satellite Dish 5310 , and Communications Tower 5306 , all useful for communications to or from remote devices for use with the system of the present invention.
- the present invention is particularly useful for set top boxes 5340 .
- the set top boxes 5340 may be used as intermediate video servers for home networking, serving televisions, personal computers, game stations and other appliances.
- the server 5314 can be connected to an internal local area network via, for example, Ethernet 5316 , although any type of communications protocol in a local area network or wide area network is possible for use with the present invention.
- the local area network for the server 5314 has connections for data storage 5318 , which can include database storage capability.
- the local area network connected to Ethernet 5316 may also hold one or more alternate servers 5320 for purposes of load balancing, performance, etc.
- the multimedia bookmarking scheme of the present invention can utilize the servers and clients of the system of the present invention, as illustrated in FIG. 53 , for use in transferring data to or loading data from the servers through the Wide Area Network 5350 .
- the present invention is useful for storing, indexing, searching, retrieving, editing, and rendering multimedia content over networks having at least one device capable of storing and/or manipulating an electronic file, and at least one device capable of playing the electronic file.
- the present invention provides various methodologies for tagging multimedia files to facilitate the indexing, searching, and retrieving of the tagged files.
- the tags themselves can be embedded in the electronic file, or stored separately in, for example, a search engine database.
- Other embodiments of the present invention facilitate the e-mailing of multimedia content.
- Still other embodiments of the present invention employ user preferences and user behavioral history that can be stored in a separate database or queue, or can also be stored in the tag related to the multimedia file in order to further enhance the rich search capabilities of the present invention.
- aspects of the present invention include using hypershell and other techniques to read text information embedded in multimedia files for use in indexing, particularly tag indexes. Still more methods of the present invention enable the virtual editing of multimedia files by manipulating metadata and/or tags rather than editing the multimedia files themselves. Then the edited file (with rearranged tags and/or metadata) can be accessed in sequence in order to link seamlessly one or more multimedia files in the new edited arrangement.
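The virtual-editing idea described above can be sketched as a simple edit decision list: an ordered list of segment references that a player follows in sequence, leaving the underlying multimedia files untouched. All names below are hypothetical illustrations, not the patent's actual data layout:

```python
from dataclasses import dataclass

@dataclass
class SegmentRef:
    """Reference to a span of a multimedia file; the file itself is never rewritten."""
    uri: str          # location of the source multimedia file
    start: float      # segment start, in seconds
    end: float        # segment end, in seconds

def virtual_edit_playlist(edit_list):
    """A 'virtual edit' is just an ordered list of segment references.

    Rearranging the list rearranges playback order without editing any
    multimedia file; a player renders the references seamlessly in sequence.
    """
    return [(ref.uri, ref.start, ref.end) for ref in edit_list]

# A rearranged edit: play the second scene before the first.
edit = [
    SegmentRef("http://example.com/movie.asf", 120.0, 240.0),
    SegmentRef("http://example.com/movie.asf", 0.0, 120.0),
]
print(virtual_edit_playlist(edit))
```

Because only the list of references is manipulated, the same source files can back any number of differently arranged virtual edits.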
- Still other methods of the present invention enable the transcoding of images/videos so that users can display them on devices that do not have the same resolution capabilities as the devices for which the images/videos were originally intended.
- This allows devices such as, for example, PDA 5330 , laptop 5322 , and automobile 5302 , to retrieve useable portions of the same image/video that can be displayed on, for example, workstation 5312 , screen 5308 , and television 5332 .
- indexing methods of the present invention are enhanced by the unique modification of visual rhythm techniques that are part of other methods of the present invention.
- Modification of prior art visual rhythm techniques enables the system of the present invention to capture text information in the form of captions that are embedded into multimedia information, and even from video streams as they are broadcast, so that text information about the multimedia information can be included in the multimedia bookmarks of the present invention and utilized for storing, indexing, searching, retrieving, editing and rendering of the information.
- the methods of the present invention described in this disclosure can be implemented, for example, in software on a digital computer having a processor that is operable with system memory and a persistent storage device. However, the methods described herein may also be implemented entirely in hardware, entirely in software, or in any combination thereof.
- the metadata usually include descriptive information about multimedia data content, such as distinctive characteristics of the data and the structure and semantics of the content. Some of the description provides information on the whole content, such as summary, bibliography and media format. However, in general, most of the description is structured around “segments” that represent spatial, temporal or spatial-temporal components of the audio-visual content.
- the segment may be a single frame, a single shot consisting of successive frames, or a group of several successive shots. Low-level features and some elementary semantic information may describe each segment. Examples of such descriptions include color, texture, shape, motion, audio features and annotated texts.
- the media positions (in terms of time points or bytes) contained in the metadata obtained with respect to the master file may not be directly applied to the other variations. This is because there may be mismatches of media positions between the master and the other variations if the master and the other variations do not start at the same position of the source content.
- the method and system of the present invention include a tag that can contain information about all or a portion of a multimedia file.
- the tag can come in several varieties, such as text information embedded into the multimedia file itself, appended to the end of the multimedia file, or stored separately from the multimedia file on the same or remote network storage device.
- the multimedia file has embedded within it one or more global unique identifiers (GUIDs).
- each scene in a movie can be provided with its own GUID.
- the GUIDs can be indexed by a search engine and the multimedia bookmarks of the present invention can reference the GUID that is in the movie.
- multiple multimedia bookmarks of the present invention can reference the same GUID in a multimedia document without impacting the size of the multimedia document, or the performance of servers handling the multimedia document.
- the GUID references in the multimedia bookmarks of the present invention are themselves indexable.
- a search on a given multimedia document can prompt a search for all multimedia bookmarks that reference a GUID embedded within the multimedia file, providing a richer and more extensive resource for the user.
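The GUID cross-referencing described above can be sketched as a simple inverted index from embedded GUIDs to the bookmarks that reference them. All identifiers below are hypothetical:

```python
from collections import defaultdict

class BookmarkIndex:
    """Index multimedia bookmarks by the GUID they reference, so a search on
    a multimedia document can pull up every bookmark pointing into it without
    touching the document itself."""

    def __init__(self):
        self._by_guid = defaultdict(list)

    def add(self, guid, bookmark_id):
        # Many bookmarks may reference the same GUID embedded in a scene.
        self._by_guid[guid].append(bookmark_id)

    def bookmarks_for(self, guid):
        return list(self._by_guid[guid])

# Hypothetical per-scene GUIDs embedded in a movie.
idx = BookmarkIndex()
idx.add("guid-scene-001", "alice-bm-1")
idx.add("guid-scene-001", "bob-bm-7")
print(idx.bookmarks_for("guid-scene-001"))
```

Because the index lives outside the multimedia document, adding bookmarks never changes the document's size or loads the servers handling it.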
- FIG. 2 shows a multimedia bookmark 210 of the present invention comprising positional information 212 and content information 214 .
- the positional information 212 is used for accessing a multimedia content 204 starting from a bookmarked position 206 .
- the content information 214 is used for visually displaying multimedia bookmarks in a bookmark list 208 , as well as for searching one or more multimedia content databases for the content that matches the content information 214 .
- the positional information 212 may be composed of a URI, a URL, or the like, and a bookmarked position (relative time or byte position) within the content.
- a URI is synonymous with a position of a file and can be used interchangeably with a URL or other file location identifier.
- the content information 214 may be composed of audio-visual features and textual features.
- the audio-visual features are the information, for example, obtained by capturing or sampling the multimedia content 204 at the bookmarked position 206 .
- the textual features are text information specified by the user(s), as well as delivered with the content. Other aspects of the textual features may be obtained by accessing metadata of the multimedia content.
- the positional information 212 is composed of a URI and a bookmarked position like an elapsed time, time code or frame number.
- the content information 214 is composed of audio-visual features, such as thumbnail image data of the captured video frame, and visual feature vectors like color histogram for one or more of the frames.
- the content information 214 of a multimedia bookmark 210 is also composed of such textual features as a title specified by a user as well as delivered with the content, and annotated text of a video segment corresponding to the bookmarked position.
- the positional information 212 is composed of a URI, a URL, or the like, and a bookmarked position such as elapsed time.
- the content information 214 is composed of audio-visual features such as the sampled audio signal (typically of short duration) and its visualized image.
- the content information 214 of an audio bookmark 210 is also composed of such textual features as a title, optionally specified by a user or simply delivered with the content, and annotated text of an audio segment corresponding to the bookmarked position.
- the positional information 212 is composed of a URI, URL, or the like, and an offset from the starting point of a text document.
- the offset can be of any size, but is normally expressed as a byte offset.
- the content information 214 is composed of a sampled text string present at the bookmarked position, and text information specified by user(s) and/or delivered with the content, such as the title of the text document.
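The bookmark structure described in the preceding paragraphs (positional information 212 plus content information 214) might be modeled as follows; every field and class name is a hypothetical illustration, not the patent's actual data layout:

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class PositionalInfo:
    """Element 212: where to resume playback."""
    uri: str                  # URI/URL or similar locator of the multimedia content
    position: float           # bookmarked position (elapsed time or byte offset)

@dataclass
class ContentInfo:
    """Element 214: what to display in the bookmark list and search on."""
    thumbnail: Optional[bytes] = None                     # captured frame at the position
    color_histogram: list = field(default_factory=list)   # visual feature vector
    audio_sample: Optional[bytes] = None                  # short sampled audio clip
    title: str = ""                                       # user-specified or delivered
    annotated_text: str = ""                              # annotation of the segment

@dataclass
class MultimediaBookmark:
    positional: PositionalInfo
    content: ContentInfo

bm = MultimediaBookmark(
    PositionalInfo("http://example.com/video.asf", 93.5),
    ContentInfo(title="News clip", annotated_text="opening segment"),
)
```

The positional half suffices to resume playback; the content half supports both visual display in the bookmark list and content-based or text-based searching when the positional half becomes invalid.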
- FIG. 3 shows an illustration of searching for multimedia contents that are relevant to the content information 314 (that correlates to element 214 of FIG. 2 ) that is stored in the multimedia bookmark 210 of FIG. 2 of the present invention where both positional and content information are used.
- the content information 314 is comprised of audio-visual features 320 such as a captured frame 322 and sampled audio data 324 , and textual features 326 such as annotated text 328 and a title 330 .
- the bookmark would not be valid for viewing a full version of the broadcast. If a bookmark were saved during a live Internet broadcast, the bookmark would not be valid for viewing an edited version of the live broadcast. Further, if a user wanted to access the bookmarked multimedia content from another site that also provides the content, even the positional information such as a URI would not be valid.
- the present invention uses content information 314 (element 214 of FIG. 2 ) that is saved in the multimedia bookmark to obtain the actual positional information of the last-visited segment by searching the multimedia database 310 using the content information 314 as a query input.
- Content information characteristics such as captured frame 322 , sampled audio data 324 , annotated text of the segment corresponding to a bookmarked position 328 , and the title delivered with the content 330 can be used as query input to a multimedia search engine 332 .
- the multimedia search engine searches its multimedia database 310 by performing content-based and/or text-based multimedia searches, and finds the relevant positions of multimedia contents.
- the search engine then retrieves a list of relevant segments 334 with their positional information such as URI, URL and the like, and the relative position.
- a multimedia player 336 a user can start playing from the retrieved segments of the contents.
- the retrieved segments 334 are usually those segments having contents relevant or similar to the content information saved in the multimedia bookmark.
- FIG. 4 illustrates an embodiment of a key frame hierarchy used by a search method of the multimedia search engine 332 (see FIG. 3 ) in accordance with the present invention.
- the method arranges key frames in a hierarchical fashion to enable fast and accurate searching of frames similar to a query image.
- the key frame hierarchy illustrated in FIG. 4 is a tree-structured representation for multi-level abstraction of a video by key frames, where a node denotes each key frame.
- a number Df is associated with each node and represents the maximum distance between the low-level feature vector of the node 414 and those of its descendant nodes in its subtree (for example, nodes 416 and 418 ).
- An example of such a feature vector is the color histogram of a frame.
- the dissimilarity between fq and a subtree rooted at the key frame fm is measured by testing d(fq, fm) > Df + e, where d(fq, fm) is a distance metric measuring dissimilarity, such as the L1 norm between feature vectors, and e is a threshold value set by a user. If the condition is satisfied, searching of the subtree rooted at the node fm is skipped (i.e., the subtree is “pruned” from the search).
- This method of the present invention reduces the search time substantially by pruning out the unnecessary comparison steps.
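The pruning test works because of the triangle inequality: if d(fq, fm) > Df + e, then every key frame in the subtree of fm is farther than e from the query, so the whole subtree can be skipped safely. A minimal sketch (hypothetical node structure, L1 distance on toy two-bin histograms):

```python
def l1(a, b):
    """L1 norm between two feature vectors (e.g., color histograms)."""
    return sum(abs(x - y) for x, y in zip(a, b))

class KeyFrameNode:
    """A node of the key frame hierarchy; 'feature' is its low-level feature vector."""
    def __init__(self, feature, children=()):
        self.feature = list(feature)
        self.children = list(children)
        # Df: maximum distance from this node's feature to any descendant's feature.
        self.df = max(
            (l1(self.feature, d.feature) for d in self._descendants()), default=0.0
        )

    def _descendants(self):
        for c in self.children:
            yield c
            yield from c._descendants()

def search(node, fq, e, results):
    """Collect key frames within e of the query fq, pruning whole subtrees."""
    if l1(fq, node.feature) > node.df + e:
        return  # d(fq, fm) > Df + e: no frame in this subtree can match
    if l1(fq, node.feature) <= e:
        results.append(node.feature)
    for child in node.children:
        search(child, fq, e, results)

leaf1 = KeyFrameNode([1.0, 0.0])
leaf2 = KeyFrameNode([0.9, 0.1])
root = KeyFrameNode([1.0, 0.05], [leaf1, leaf2])
hits = []
search(root, [1.0, 0.0], 0.15, hits)  # leaf2's subtree is pruned without comparison
```

Only the subtrees whose root passes the test are descended into, which is where the search-time reduction comes from.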
- FIG. 5 shows an example of five variations encoded from the same source video content 502 .
- FIG. 5 shows two ASF format files 504 , 506 with the bandwidths of 28.8 and 80 kbps that start and end exactly at the same time points.
- FIG. 5 also shows the first RM format file 508 with the bandwidth of 80 kbps.
- encoding of the source content starts the time interval o 1 before the start time point of the ASF files 504 , 506 , and ends the time interval o 4 before the end time point of the ASF files 504 and 506 .
- the RM file 508 thus has an extra video segment with the duration of o 1 at the beginning.
- the start time point of the video segment in the RM file is temporally shifted right by the time interval o 1
- the start time point of the video segment in the RM file can be computed by adding the time interval o 1 to the start time point of the video segment in the ASF files.
- the second RM file 510 with the bandwidth of 28.8 kbps does not have a leading video segment with the duration of o 2 .
- the start time point of the video segment 514 in the second RM file can be computed by subtracting the time interval o 2 from the start time point of the video segment in the ASF files.
- the MOV file 512 with the bandwidth of 56 kbps has two extra segments with the durations of o 3 and o 6 , respectively.
- the ASF file encoded at the bandwidth of 80 kbps 504 is to be the master file, and the other four files are slave files.
- an offset of a slave file will be the difference of positions in time duration or byte offset between a start position of a master file and a start position of the slave file.
- the differences in position o 1 , o 2 , and o 3 are offsets.
- the offset of a slave file is computed by subtracting the start position of a slave file from the start position of a master file. In this formula, the two start positions are measured with respect to the source content.
- the offset will have a positive value if the start position of a slave occurred before the start position of a master with reference to the source content. Conversely, the offset will have a negative value if the start position of a slave occurred after the start position of a master.
- the offsets o 1 and o 3 are positive values, and o 2 is negative.
- an offset of a master file is set to zero.
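Using the sign convention just stated (the slave's start position subtracted from the master's, both measured on the source-content timeline), the offsets can be computed directly. The start positions below are hypothetical numbers chosen to mirror the FIG. 5 situation:

```python
def offset(master_start, slave_start):
    """Offset of a slave file: master start minus slave start, both measured
    as positions on the source content (seconds). Positive when the slave
    starts before the master, negative when it starts after."""
    return master_start - slave_start

# Hypothetical start positions on the source content (seconds): the master
# ASF file starts at t = 10; one RM file starts earlier (extra leading
# segment, as in 508), another starts later (missing leading segment, 510).
MASTER_START = 10.0
print(offset(MASTER_START, 7.0))           # slave started earlier -> positive
print(offset(MASTER_START, 12.0))          # slave started later   -> negative
print(offset(MASTER_START, MASTER_START))  # the master's own offset is zero
```

The zero offset for the master file falls out of the same formula, consistent with the definition above.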
- a user generates a multimedia bookmark with respect to one of the variations that is to be called a bookmarked file. Then, the multimedia bookmark is used at a later time to play one of the variations that is called a playback file.
- the bookmarked file pointed to by the multimedia bookmark, and the playback file selected by the user may not be the same variation, but refer to the same multimedia content.
- if there is only one variation, both the bookmarked and the playback files are necessarily the same. However, if there are multiple variations, a user can store a multimedia bookmark for one variation and later play another variation by using the saved bookmark. In that case, the playback may not start at the last accessed position because there may be mismatches of positions between the bookmarked and the playback files.
- Each media profile corresponds to the different variation that can be produced from a single source content depending on the values chosen for the encoding formats, bandwidths, resolutions, etc.
- Each media profile of a variation contains at least a URI and an offset of the variation.
- Each media profile of a variation optionally contains a time scale factor of the media time of the variation encoded in different temporal data rates with respect to its master variation. The time scale factor is specified on a zero to one scale where a value of one indicates the same temporal data rate, and 0.5 indicates that the temporal data rate of the variation is reduced by half with respect to the master variation.
- Table 1 shows example metadata for the five variations in FIG. 5 .
- the metadata is written according to the ISO/IEC MPEG-7 metadata description standard which is under development.
- the metadata are described in XML since MPEG-7 adopted XML Schema as its description language.
- the temporal data rate of the variation 512 is assumed to be reduced by half with respect to the master variation 504 , and the other variations are not temporally reduced.
- FIG. 6 shows an example of two multimedia contents and their associated metadata. Since the first multimedia content has five variations and the second has three variations, there are five media profiles in the metadata of the first multimedia content 602 , and three media profiles in the metadata of the second 604 .
- two subscripts attached to identifiers of variations, URIs, URLs or the like, and offsets represent a specific variation of a multimedia content.
- the third variation of the first multimedia content 610 has the associated media profile 612 in the metadata of the first multimedia content 602 .
- the media profile 612 provides the values of a URI and an offset of the third variation of the first multimedia content 610 .
- a bookmark system stores the following positional information along with content information in the multimedia bookmark:
- FIG. 7 shows an example of a list of bookmarks 702 for the variations of two multimedia contents in FIG. 6 .
- the list contains the first and second bookmarks 704 and 706 for the first variation, and the third one 708 for the fourth variation of the first multimedia content. Because those three bookmarks are for the same multimedia content, they also have the same metadata ID.
- the list also contains the fourth and fifth bookmarks 710 and 712 for the first and third variations of the second multimedia content, respectively. Thus, these two bookmarks have the same metadata ID referring to the second multimedia content.
- the bookmark system checks whether the selected bookmarked file is equal to the playback file or not. If they are not equal, the bookmark system adjusts the saved bookmarked position in order to obtain an accurate playback position on the playback file. This adjustment is performed by using the offsets saved in a metafile and a bookmarked position saved in a multimedia bookmark. Assume that P b is a bookmarked position of a bookmarked file, and P p is the desirable position (adjusted bookmark position) of the playback file.
- Let o b and o p be the offsets of the bookmarked and playback files, respectively, and let s be the time scale factor relating the media time of the bookmarked file to that of the playback file.
- i) P p = s·P b + (|o p | + |s·o b |) if o p > 0 > s·o b ; ii) P p = s·P b + (|o p − s·o b |) if o p > s·o b ≥ 0 or 0 ≥ o p > s·o b ; iii) P p = s·P b − (|o p | + |s·o b |) if s·o b > 0 > o p ; iv) P p = s·P b − (|o p − s·o b |) if s·o b > o p ≥ 0 or 0 ≥ s·o b > o p .
- FIG. 8 shows the five distinct cases ( 802 , 804 , 806 , 808 , 810 ) illustrating the above formula.
- one offset is assumed for each slave file. In general, however, there may be a list of offset values for each slave file for cases where frame skipping occurs during the encoding of the slave file or part of the slave file is edited.
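If the sign cases of the adjustment formula are read as expansions of a single expression, they collapse to P p = s·P b + (o p − s·o b). A sketch under that assumption (the reading, not the patent's literal text, is ours):

```python
def adjust_position(p_b, o_b, o_p, s=1.0):
    """Adjust a bookmarked position p_b (on the bookmarked file) to the
    corresponding position on the playback file.

    o_b, o_p : offsets of the bookmarked and playback files (master start
               minus file start, on the source-content timeline).
    s        : time scale factor relating the two files' media time
               (1.0 when both have the same temporal data rate).

    All four sign cases of the formula reduce to this one expression.
    """
    return s * p_b + (o_p - s * o_b)

# Bookmark at 60 s into a file whose offset is -2 s (it starts 2 s after
# the master); play back on a file with offset +3 s (starts 3 s before).
print(adjust_position(60.0, -2.0, 3.0))  # resumes playback at 65.0 s
```

The case split in the patent's formula merely rewrites (o p − s·o b) in terms of absolute values according to the signs of the two offsets.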
- This durable multimedia bookmark is to be explained with the examples in FIGS. 6 and 7 .
- a user wants to play back the third variation 610 of the first multimedia content in FIG. 6 from the position stored in the second bookmark 706 in FIG. 7 .
- the second bookmark 706 was made with reference to the first variation 606 of the first multimedia content in FIG. 6 .
- the bookmarked file 606 is not equal to the playback file 610 .
- the bookmark system accesses the metadata of the first multimedia content 602 . From the metadata, the system reads the media profile of the first variation 608 and the third variation 612 .
- the system adjusts the bookmarked position, thus obtaining a correct playback position of a playback file.
- an offset of a slave file is defined as the difference between the start position of a master file and the start position of a slave file.
- This offset calculation requires locating a referential segment, for example, the segment A 514 in FIG. 5 .
- the offset is calculated as the start time of the master file minus the start time of the slave file.
- a referential segment may be any multimedia segment bounded by two different time positions. In practice, however, a segment bounded between two specific successive shot boundaries in the case of a video is frequently used as a referential segment. Thus, the following method may be used to determine a referential segment:
- FIG. 9 shows an example of a user interface incorporating the multimedia bookmark of the present invention.
- the user interface 900 is composed of a playback area 912 and a bookmark list 916 . Further, the playback area 912 is also composed of a multimedia player 904 and a variation list 910 .
- the multimedia player 904 provides various buttons 906 for normal VCR (Video Cassette Recorder) controls such as play, pause, stop, fast forward and rewind. It also provides an add-bookmark control button 908 for making a multimedia bookmark. If a user selects this button while playing a multimedia content, a new multimedia bookmark having both positional and content information is saved in a persistent storage. In the bookmark list 916 , the saved bookmark is visually displayed with its content information. For example, in the case of a video bookmark, a spatially reduced thumbnail image corresponding to the temporal location of interest saved by the user is presented to help the user easily recognize the previously bookmarked content of the video.
- In the bookmark list 916 , every bookmark has five bookmark controls just below its visually displayed content information.
- the left-most play-bookmark control button 918 is for playing a bookmarked multimedia content from a saved bookmarked position.
- the delete-bookmark control button 920 is for managing bookmarks. If this button is selected, the corresponding bookmark is deleted from the persistent storage.
- the add-bookmark-title control button 922 is used to input a title of bookmark given by a user. If this button is not selected, a default title is used.
- the search control button 924 is used for searching a multimedia database for multimedia contents relevant to the selected content information 914 as a multimedia query input. There are a variety of cases when this control might be selected.
- the send-bookmark control button 926 is used for sending both positional and content information saved in the corresponding bookmark to other people via e-mail. It should be noted that the positional information sent via e-mail includes either a URI or other locator, and a bookmarked position.
- the variation list 910 provides possible variations of a multimedia content with corresponding check boxes. Before a traditional normal playback or a bookmarked playback, a user selects a variation by checking the corresponding mark. If the multimedia content does not have multiple variations, this list may not appear in the user interface.
- FIG. 10 is an exemplary flow chart illustrating the overall method 1000 of saving and retrieving multimedia bookmarks with two additional functions: i) searching for other multimedia content relevant to the content pointed to by the bookmark, and ii) sending a bookmark to another person via e-mail.
- beginning at step 1002 , if a user wants to play the multimedia content (step 1004 ), the multimedia player is first displayed to the user in step 1006 . A check is made in step 1008 to determine if multiple variations of multimedia content are available. If so, then two extra steps are taken.
- in step 1010 , the variation list is presented to the user, and (optionally) a default variation is selected in step 1012 .
- in step 1014 , the list of multimedia bookmarks is displayed to the user by using their content information and bookmark controls.
- step 1016 is performed.
- a check is made to determine if the user wants to change the variation, step 1018 . If so, the user can select the other variation, step 1020 .
- step 1022 a check is made to determine if the user has selected one of the conventional VCR-type controls (e.g., play, pause, stop, fast forward, and rewind) or one of the bookmark-type controls (add-bookmark, play-bookmark, delete-bookmark, add-bookmark-title, search, and send-bookmark).
- the execution of the method jumps to the selected function 1024 . Otherwise, if the user selects one of the controls related to the bookmarks ( 1026 , 1030 , 1034 , 1038 , 1042 , and 1046 ), the program goes to the corresponding routine ( 1028 , 1032 , 1036 , 1040 , 1044 , and 1048 ), respectively. Until the different multimedia content is selected (step 1004 ), the multimedia player with the variation list and the bookmark list will continue to be displayed (steps 1006 , 1010 and 1014 ).
- FIG. 11 is a flow chart illustrating the process of adding a multimedia bookmark.
- the add-bookmark control is selected (step 1026 of FIG. 10 )
- execution of the method proceeds to step 1028 of FIG. 11 .
- the multimedia playback is suspended in step 1102 .
- the URI, URL or similar address is obtained in step 1104 .
- a check is made in step 1106 to determine if the information on the bookmarked position such as time code is available at the currently suspended multimedia content. If so, execution is moved to step 1108 , where the bookmarked position is obtained.
- the bookmarked position data, if available, are used to capture, sample or derive audio-visual features of the suspended multimedia content at the bookmarked position.
- in step 1112 , a check is made to determine if the metadata exists. If not, then execution jumps to step 1124 where the URI (or the like), the bookmarked position, and the audio-visual features are stored in persistent storage. Otherwise (i.e., the metadata of the suspended multimedia content exist), a search is conducted to find a segment corresponding to the bookmarked position in the metadata in step 1114 . Next, a check is made to determine if the annotated text is available for the segment. If so, then the annotated text is obtained in step 1118 . If not, step 1118 is skipped and execution resumes at step 1120 , where a check is made to determine if there are media profiles that contain offset values of the suspended multimedia content.
- step 1122 is performed where a metadata ID is obtained in order to adjust the bookmarked position in future playback. Otherwise, step 1122 is skipped and the method proceeds directly to step 1124 , where the annotated text and the metadata ID are also stored in persistent storage. Then, in step 1126 , the list of multimedia bookmarks is redisplayed with their content information and bookmark controls. The multimedia playback is resumed in step 1128 , and execution of the method is moved to a clearing-off routine 1610 (of FIG. 16 ) that is performed at the end of every bookmark control routine.
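The add-bookmark flow of FIG. 11 might be sketched as follows. The player and metadata interfaces are hypothetical stand-ins for illustration only; the flow chart's step numbers appear as comments:

```python
class _Segment:
    def __init__(self, annotated_text):
        self.annotated_text = annotated_text

class _Metadata:
    """Hypothetical metadata store entry for the suspended content."""
    metadata_id = "md-1"
    def segment_at(self, pos):
        return _Segment("goal scene")   # annotated text of the segment at pos
    def has_media_profiles(self):
        return True                     # media profiles with offsets exist

class _Player:
    """Minimal stand-in for the multimedia player (hypothetical API)."""
    def suspend(self): pass
    def resume(self): pass
    def uri(self): return "http://example.com/match.asf"
    def position(self): return 42.0
    def capture_features(self, pos): return {"histogram": [0.2, 0.8]}

def add_bookmark(player, metadata, saved):
    """Sketch of the add-bookmark routine of FIG. 11."""
    player.suspend()                                   # step 1102
    record = {"uri": player.uri()}                     # step 1104
    pos = player.position()                            # steps 1106-1108
    if pos is not None:
        record["position"] = pos
        record["av_features"] = player.capture_features(pos)  # step 1110
    if metadata is not None:                           # step 1112
        seg = metadata.segment_at(pos)                 # step 1114
        if seg is not None and seg.annotated_text:     # steps 1116-1118
            record["annotated_text"] = seg.annotated_text
        if metadata.has_media_profiles():              # steps 1120-1122
            record["metadata_id"] = metadata.metadata_id
    saved.append(record)                               # step 1124 (persist)
    player.resume()                                    # step 1128
    return record

store = []
bm = add_bookmark(_Player(), _Metadata(), store)
```

Note how the metadata ID is stored only when media profiles with offsets exist, since that ID is what later enables the bookmarked-position adjustment.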
- in step 1612 , a check is made to determine if the user wants to play back different multimedia content. If so, the method returns to step 1002 (see FIG. 10 ) where another multimedia process begins. Otherwise, the method resumes at step 1016 of FIG. 10 , where the multimedia process waits for the user to select one of the conventional VCR or bookmark controls.
- FIG. 12 is a flow chart illustrating the process of playing a multimedia bookmark.
- when the play-bookmark control is selected (step 1030 of FIG. 10 ), step 1032 is invoked.
- in step 1202 , the URI or the like, the bookmarked position, and the metadata ID for the multimedia content to be played back are read from persistent storage.
- in step 1204 , a check is made to determine if the URI of the content is valid. If not, execution of the method is shifted to step 1044 (see FIG. 10 ) where the process of the content-based and/or text-based search begins.
- the URI of the content becomes invalid when the multimedia content is moved to another location, for example.
- step 1204 If the URI of the content is valid (the result of step 1204 is positive), a check is made to determine if the bookmarked position is available. If not, a check is made to determine if the user desires to select the content-based and/or text-based search in step 1208 . If so, execution is moved to step 1044 (see FIG. 10 ). Otherwise, the method moves to step 1210 , where the user can just play the multimedia content from the beginning. If the URI of the content is valid and the bookmarked position is available (e.g., both results of steps 1204 and 1206 are positive), a check is made in step 1212 to determine if the metadata ID is available. If it is not available, the multimedia playback starts from the bookmarked position in step 1222 .
- the bookmarked and playback files are identified in step 1214 and the values of their respective offsets are read from the metadata in step 1216 . Then, in step 1218 , the bookmarked position is adjusted by using offsets. The multimedia playback starts from the adjusted bookmarked position in step 1220 . After starting one of the playbacks ( 1210 , 1220 , or 1222 ), the method executes the clearing-off routine in step 1610 of FIG. 16 .
- FIG. 13 is a flow chart illustrating the process of deleting a multimedia bookmark.
- the method invokes the routine illustrated in FIG. 13 .
- all positional and content information of the selected multimedia bookmark is deleted from the persistent storage in step 1302 .
- the list of multimedia bookmarks is redisplayed with their content information and bookmark controls in step 1304 , and then execution is shifted to the clearing-off routine, step 1610 of FIG. 16 .
- FIG. 14 is a flow chart illustrating the process of adding a title to a multimedia bookmark.
- when the add-bookmark-title control is selected, the program goes through this portion 1400 of the method of the present invention. In this routine, the user is prompted to enter a title in step 1402 for the saved multimedia bookmark. A check is made to determine if the user entered a title in step 1404 . If not, the program may provide a default title in step 1406 , generated in accordance with a predetermined routine. In any case, execution proceeds to step 1408 , where the list of multimedia bookmarks is redisplayed with their content information, including the titles and bookmark controls. Thereafter, the method executes the clearing-off routine of step 1610 of FIG. 16 .
- FIG. 15 is a flow chart illustrating the portion 1500 of the present invention for searching for the relevant multimedia content based on audio-visual features as well as textual features saved in a multimedia bookmark, if available.
- the search methods currently available can be largely categorized into two types: content-based search and text-based search.
- Most of the prior art search engines utilize a text-based information retrieval technique.
- the present invention also employs content-based multimedia search engines which use, for example, the retrieval technique based on such visual and audio characteristics or features as color histogram and audio spectrum.
- the content information of a particular segment, stored in a multimedia bookmark may be used to find other relevant information about the particular segment. For example, a frame-based video search may be employed to find other video segments similar to the particular video segments.
- a text-based search may be combined with a frame-based video search to improve the search result.
- Most frame-based video search methods are based on comparing low-level features such as colors and texture. These methods lack the semantics necessary for recognition of high-level features.
- This limitation may be overcome by combining a text-based search.
- Most available multimedia contents are annotated with text. For example, video segments showing President Clinton may be annotated with “Clinton.” In that case, the combined search using the image of Clinton wearing a red shirt as a bookmark may find other video segments containing Clinton, such as the segment showing Clinton wearing a blue shirt.
- the search routine ( 1044 of FIG. 15 ) is invoked in the following three scenarios:
- When this portion 1500 is invoked, the content information of the multimedia bookmark, such as the audio-visual and textual features of the query input, and the positional information, if available, are read from persistent storage in step 1502 .
- visual features for the multimedia bookmark include, but are not limited to, captured frames in JPEG image compression format or color histograms of the frames.
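As an illustration of the color-histogram visual feature mentioned above, the following sketch computes a coarse, normalized RGB histogram from raw pixel tuples and compares two such feature vectors. The bin count and the L1 distance are illustrative assumptions; the patent does not prescribe them.

```python
def color_histogram(pixels, bins_per_channel=4):
    """Compute a normalized, coarse RGB color histogram.

    `pixels` is a list of (r, g, b) tuples with values in 0..255.
    Returns a flat list of bins_per_channel**3 relative frequencies.
    (Illustrative sketch; bin count is an assumption.)
    """
    step = 256 // bins_per_channel
    hist = [0] * (bins_per_channel ** 3)
    for r, g, b in pixels:
        # map each channel to a coarse bin, then flatten to one index
        idx = ((r // step) * bins_per_channel ** 2
               + (g // step) * bins_per_channel
               + (b // step))
        hist[idx] += 1
    total = len(pixels) or 1
    return [count / total for count in hist]


def histogram_distance(h1, h2):
    """L1 distance between two histograms; smaller means more similar."""
    return sum(abs(a - b) for a, b in zip(h1, h2))
```

Two frames whose histograms are close under this distance would be treated as visually similar query results.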
- In step 1504 , a check is made to determine if the annotated texts are available. If so, the annotated text is retrieved directly from the content information of the bookmark in step 1506 and execution proceeds immediately to step 1516 , where the process of the text-based multimedia search is performed by using the annotated texts as query input, resulting in the multimedia segments having texts relevant to the query. If the result of step 1504 is negative, the annotated texts can also be obtained by accessing the metadata, using the positional information. Thus a check is made in step 1508 to determine if the positional information is available. If so, then another check is made to determine if the metadata exist in step 1510 .
- If so, step 1512 is executed, where a segment corresponding to the bookmarked position in the metadata is found.
- a check is then made to determine if some annotated texts for the segment are available in step 1514 . If so (i.e., the result of step 1514 is positive), the text-based multimedia search is also performed in step 1516 . If the annotated texts or the positional information is not available from the content information of the bookmark (i.e., the result of step 1514 is negative) or from the metadata (i.e., the result of step 1510 is negative), then a content-based multimedia search is performed by using the audio-visual features of the bookmark as query input in step 1518 .
- The result of step 1518 is that the resulting multimedia segments have audio-visual features similar to the query. It should be noted that both the text-based multimedia search (step 1516 ) and the content-based multimedia search (step 1518 ) can be performed in sequence, thus combining their results. Alternatively, one search can be performed based on the results of the other search, although these combinations are not presented in the flow chart of FIG. 15 .
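The sequencing of the text-based search (step 1516) with the content-based search (step 1518) can be sketched as a filter-then-rank pipeline: text overlap selects candidate segments, and visual-feature distance orders them. The dictionary field names (`text`, `feature`) and the keyword-overlap filter are illustrative assumptions, not taken from the patent.

```python
def combined_search(bookmark, segments):
    """Two-stage search sketch: keep segments whose annotation shares a
    keyword with the bookmark's annotated text, then rank survivors by
    similarity of their visual feature vectors (smaller L1 distance
    first). Field names are illustrative."""
    query_words = set(bookmark["text"].lower().split())
    candidates = [s for s in segments
                  if query_words & set(s["text"].lower().split())]

    def distance(seg):
        return sum(abs(a - b)
                   for a, b in zip(bookmark["feature"], seg["feature"]))

    return sorted(candidates, key=distance)
```

Using the Clinton example above, a bookmark of Clinton in a red shirt would first match all "Clinton"-annotated segments, then rank the blue-shirt segment by visual closeness.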
- The audio-visual features of the retrieved segments at their retrieved positions are computed in step 1520 and temporarily stored to show visually the search results in step 1522 , as well as to be used as query input to another search if desired by the user in steps 1530 , 1532 , and 1534 . If the user wants to play back one of the retrieved segments, i.e., the result of step 1524 is positive, the user selects a retrieved segment in step 1526 , and plays back the segment from the beginning of the segment in step 1528 . The beginning of the retrieved segment that was selected is called the retrieved position in either step 1528 or step 1508 .
- If the user wants another search (i.e., the result of step 1530 is positive), the user selects one of the retrieved segments in step 1532 . Then, the content information, including audio-visual features and annotated texts for the selected segment, is obtained by accessing the temporarily stored audio-visual features and/or the corresponding metadata in step 1534 , and the new search process begins at step 1504 . If the user wants no more playbacks and searches, execution is transferred to the clearing-off routine, step 1610 of FIG. 16 .
- Search Type A The multimedia bookmark has only image information.
- Search Type B The multimedia bookmark has only positional information.
- Search Type C The multimedia bookmark has only annotated text.
- When a user at a client side selects a multimedia bookmark, the client sends the annotated text to the server.
- Search Type D The multimedia bookmark has both image and positional information. This type of search can be implemented in the way of either Search Type A or B.
- Search Type E The multimedia bookmark has both image and annotated text.
- Search Type F The multimedia bookmark has both positional information and annotated text.
- Search Type G The multimedia bookmark has all the information: image, position, and annotated text. This type of search can be implemented in the way of either Search Type E or F.
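The dispatch among Search Types A through G reduces to a lookup over which of the three kinds of bookmark information are present. A minimal sketch, assuming a dict-based bookmark with illustrative `image`, `position`, and `text` keys:

```python
def classify_search_type(bookmark):
    """Map the fields present in a bookmark to the search types A-G
    enumerated above. `bookmark` is a dict that may contain 'image',
    'position', and 'text' keys (illustrative names, not from the
    patent). Returns None if no field is present."""
    has = ("image" in bookmark, "position" in bookmark, "text" in bookmark)
    table = {
        (True,  False, False): "A",  # image only
        (False, True,  False): "B",  # positional information only
        (False, False, True):  "C",  # annotated text only
        (True,  True,  False): "D",  # image + position
        (True,  False, True):  "E",  # image + text
        (False, True,  True):  "F",  # position + text
        (True,  True,  True):  "G",  # all three
    }
    return table.get(has)
```

A server could use the returned type to choose between the content-based engine (types with an image) and the text-based engine (types with annotations or resolvable positions).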
- FIG. 16 is a flow chart illustrating the method of sending a bookmark to other people via e-mail.
- This portion of the method is invoked at step 1048 of FIG. 10 .
- All saved bookmark information for a selected multimedia bookmark to be sent, including the URI, the bookmarked position, the metadata ID, and the audio-visual and textual features, is read from persistent storage in step 1602 .
- the user will be prompted to enter some related input in order to send an e-mail to another individual or a group of people. If all of the necessary information is input by the user in step 1606 , the e-mail is sent to the designated persons with the bookmark information in step 1608 .
- The clearing-off routine, step 1610 , may be entered from several other portions of the method shown in FIGS. 11, 12 , 13 , 14 , and 15 .
- In step 1612 , a check is made to determine if other multimedia contents are available. If so, execution of the method is transferred to step 1002 of FIG. 10 . Otherwise, execution of the method is transferred to step 1016 of FIG. 10 .
- the multimedia bookmark may consist of the following bookmarked information:
- the content information can be obtained at the client or server side when its corresponding multimedia content is being played in networked environment.
- For a multimedia bookmark, for example, the image captured at a bookmarked position (3) can be obtained from a user's video player or from a video file stored at a server.
- the title of a bookmark (5) might be obtained at a client side if a user types in his own title. Otherwise, a default title, such as a title of a bookmarked file stored at a server, can be used as the title of the bookmark.
- The textual annotations attached to a segment which contains the bookmarked position are stored in metadata, in which offsets and time scales of variations also exist for the durable bookmark.
- the textual annotations (4) and metadata ID (6) are obtained at a server.
- the bookmarked information can be stored at a client's or server's storage regardless of the place where the bookmarked information is obtained.
- the user can send the bookmarked information to others via e-mail.
- If the bookmarked information is stored at a server, it is simple to send the bookmarked information via e-mail; that is, one sends just a link to the bookmarked information stored at the server.
- If the bookmarked information is stored at a user's storage, the user has to send all of the information to another user via e-mail.
- the delivered bookmarked information can then be stored at the receiver's storage, and the bookmarked multimedia content starts to play exactly from the bookmarked position. Also, the bookmarked multimedia content can be replayed at any time the receiver wants.
- Some of the bookmarked information, namely the content information, is multimedia data, and all the other information, including the positional information, is textual data.
- Both forms of the bookmarked information stored at a user's storage are sent to another person within a single e-mail. There are two possible methods of sending the information from one user to another via e-mail:
- HTML (HyperText Markup Language)
- An HTML document can be sent via e-mail. All textual parts of the bookmarked information can be directly included in the HTML document to be sent via e-mail. The captured image of a multimedia bookmark, however, cannot be directly included in the HTML document because the image is represented in a binary file format; when received, such an image would have to be detached and stored at a receiver's local storage.
- Sending the binary image within an HTML document is possible by converting the binary image into a text string with an encoder, such as Base-16 or Base-64, and directly including it in the HTML document as a normal character string. The converted image is called inline media, by which one can locate any multimedia file in an HTML document.
- Table 3 is a sample HTML document which includes both the captured content image and the last of the textual bookmarked information.
- FIG. 17 is an exemplary flow chart illustrating the process of saving a multimedia bookmark at a receiving user's local storage.
- After a user invokes his e-mail program in step 1704 , the user selects a message to read in step 1706 .
- a check is made in step 1708 to determine if the message includes a multimedia bookmark. If not, execution is moved to step 1706 where the user selects another message to read. Otherwise, another check is made in step 1710 to determine if the user wants to play the multimedia bookmark by selecting a play control button, which appears within the message. If not, execution is also moved to step 1706 , where the user selects another message to read. Otherwise, in step 1712 , a multimedia bookmark program having such a user interface illustrated in FIG. 9 is invoked.
- step 1714 the delivered bookmark information included in the message is saved at the user's persistent storage, thus adding the delivered multimedia bookmark into the user's list of local multimedia bookmarks. Then, in step 1716 , content information of the saved multimedia bookmark can appear at the multimedia bookmark program. Next, the play-bookmark control is internally selected in step 1718 . Execution is then moved to step 1032 of FIG. 12 .
- SMS (Short Message Service)
- FIG. 18 illustrates the basic elements of this embodiment of the present invention.
- the video server VS 1804 of the server network 1802 is responsible for streaming video over wired or wireless networks.
- the server network 1802 also has the video database 1806 that is operably connected to the video server 1804 .
- the multimedia bookmark message service center (VMSC) 1818 acts as a store-and-forward system that delivers a multimedia bookmark of the present invention over mobile networks.
- the multimedia bookmark sent by a user PC 1810 either stand-alone or part of a local area network 1808 , is stored in VMSC 1818 , which then forwards it to the destination mobile phone 1828 when the mobile phone 1828 is available for receiving messages.
- The gateway mobile switching center (GWMSC) 1820 is a mobile network's point of contact with other networks. It receives a short message, such as a multimedia bookmark, from the VMSC, queries the HLR for routing information, and forwards the message to the MSC nearest the recipient mobile phone.
- the home location register (HLR) 1822 is the main database in the mobile network.
- the HLR 1822 retains information about the subscriptions and service profile, and also about the routing information.
- Upon the request by the GWMSC 1820 , the HLR 1822 provides the routing information for the recipient mobile phone 1828 or personal digital assistant 1830 .
- the mobile phone 1828 is typically a mobile handset.
- the PDA 1830 includes, but is not limited to, small handheld devices, such as a Blackberry, manufactured by Research in Motion (RIM) of Canada.
- the mobile switching center 1824 switches connections between mobile stations or between mobile stations and other telephone and data networks (not shown).
- FIG. 19 illustrates the method of the present invention for sending a multimedia bookmark from a personal computer to a mobile telephone over a mobile network.
- the personal computer submits a multimedia bookmark to the VMSC 1918 .
- the VMSC 1918 returns an acknowledgement to the PC 1910 , indicating the reception of the multimedia bookmark.
- The VMSC 1918 sends a request to the HLR 1922 to look up the routing information for the recipient mobile. Then the HLR 1922 sends the routing information back to the VMSC 1918 , step 4 .
- the VMSC 1918 invokes the operation to send the multimedia bookmark to the MSC 1924 .
- In step 6 , the MSC delivers the multimedia bookmark to the mobile phone 1928 .
- In step 7 , the mobile phone 1928 returns an acknowledgement to the MSC 1924 .
- In step 8 , the MSC 1924 notifies the VMSC 1918 of the outcome of the operation invoked in step 5 .
- the method described above is equally applicable to personal digital assistants that are connected to mobile networks.
- FIG. 20 illustrates an alternate embodiment of the present invention that enables the transmission of a multimedia bookmark from one mobile device to another.
- the method begins at step 1 , where the mobile phone 2028 submits a request to the MSC 2024 to send a multimedia bookmark to another mobile telephone customer.
- the MSC 2024 sends the multimedia bookmark to the VMSC 2018 .
- the VMSC 2018 returns an acknowledgement to the MSC 2024 .
- the MSC 2024 returns to the sending mobile phone 2028 an acknowledgement indicating the acceptance of the request.
- the VMSC 2018 queries the HLR 2022 for the location of the recipient mobile phone 2030 .
- the sender or the recipient need not be a mobile telephone.
- the sending and/or receiving device could be any device that can send or receive a signal on a mobile network.
- the HLR 2022 returns the identity of the destination MSC 2024 that is close to the recipient device 2030 .
- the VMSC 2018 delivers the multimedia bookmark to the MSC 2024 in step 7 .
- the MSC 2024 delivers the multimedia bookmark to the recipient mobile device 2030 .
- the mobile device 2030 returns an acknowledgement to the MSC 2024 for the acceptance of the multimedia bookmark.
- the MSC 2024 returns to the VMSC 2018 the outcome of the request (to send the multimedia bookmark).
- FIG. 21 illustrates an alternate embodiment of the present invention for playing video sequences on a mobile device.
- the method begins generally at step 1 , where the mobile device 2128 submits a request to the MSC 2124 to play the video associated with the multimedia bookmark.
- The MSC 2124 sends the request with the multimedia bookmark to the VMSC 2118 .
- the video pointed to by the multimedia bookmark cannot be streamed directly to the mobile device 2128 .
- If the marked video, which is in a high bit rate format, were transmitted to the mobile device 2128 , the high bit rate video data might not be delivered properly due to the limited bandwidth available. Further, the video might not be properly decoded on the mobile device 2128 due to its limited computing resources.
- the VMSC 2118 decides which bit rate video is the most suitable for the current mobile device 2128 .
- the VMSC 2118 also calculates the new marked location to compensate for the offset value due to the different encoding format or different frame rate needed to display the video on the mobile device 2128 .
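The offset compensation for a different frame rate described above amounts to holding the wall-clock time of the marked position constant while rescaling the frame count. A sketch under that assumption (the function name and rounding choice are illustrative):

```python
def remap_position(frames, source_fps, target_fps):
    """Recompute a bookmarked frame position when the video is served
    in a different encoding with a different frame rate: keep the
    wall-clock time constant and rescale the frame count.
    (Illustrative sketch of the offset compensation.)"""
    seconds = frames / source_fps   # marked time in seconds, unchanged
    return round(seconds * target_fps)
```

For example, a bookmark at frame 900 of a 30 fps source maps to frame 450 of a 15 fps low bit rate version.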
- the VMSC 2118 sends the modified multimedia bookmark to the video server 2104 , using the server IP address designated in the multimedia bookmark.
- In step 4 , the video server 2104 starts to stream the video data down to the VMSC 2118 .
- In step 5 , the VMSC 2118 passes the video data to the MSC 2124 .
- In step 6 , the MSC 2124 delivers the video data to the service requester, mobile device 2128 . Steps 4 through 6 are repeated until the mobile device 2128 issues a termination request.
- the metadata associated with multimedia bookmark include positional information and content information.
- the positional information can be a time code or byte offset to denote the marked time point of the video stream.
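For instance, a time code in hours:minutes:seconds:frames form (as in "01:21:13:29" used later in this description) can be converted to an absolute frame count to denote the marked point; the 30 fps rate here is an assumption:

```python
def timecode_to_frames(timecode, fps=30):
    """Convert an 'HH:MM:SS:FF' time code into an absolute frame
    count, one way to express the positional information of a
    bookmark. The 30 fps default is an assumption, not specified
    by the patent."""
    hh, mm, ss, ff = (int(part) for part in timecode.split(":"))
    return ((hh * 60 + mm) * 60 + ss) * fps + ff
```

A byte offset could serve the same role for file-based seeking, with the frame count translated through the stream's index.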
- the content information consists of textual information (features) and audio-visual information.
- There are two types of textual information, depending upon its source: i) a bookmark user and ii) a bookmark server.
- When a user makes a multimedia bookmark at a specific position of the video stream (generally, a multimedia file), i) the user can input the text annotation and other metadata that the user would like to associate with the bookmark, and/or ii) the multimedia bookmark system (server) delivers and associates the corresponding metadata with the bookmark.
- An example of metadata from the server includes the textual annotation describing the semantic information of the bookmarked position of the video stream.
- Semantic annotation, description, or indexing is often performed by humans, since it is usually difficult to generate semantic metadata automatically using current state-of-the-art video processing technologies.
- The problem is that the manual annotation process is time-consuming and, further, different people, even specialists, can describe the same video frames/segments differently.
- The present invention discloses an approach to solve the above problem by making use of (bookmark) users' annotations. It enables video metadata to be gradually populated with information from users as time goes by. That is, the textual metadata for each video frame/segment are improved using a large number of users' textual annotations.
- the idea behind the invention is as follows.
- When a user makes a multimedia bookmark at a specific position, the user is asked to enter a textual annotation. If the user is willing to annotate for his/her own later use, the user will describe the bookmark using his/her own words.
- This textual annotation is delivered to the server.
- the server collects and analyzes all the information from users for each video stream. Then, the analyzed metadata that basically represent the common view/description among a large number of users are attached to the corresponding position of the video stream.
- FIG. 54 shows a relevance queue 5402 having an enqueue 5404 and a dequeue 5406 with one or more intermediate elements 5408 .
- the queue of FIG. 54 is initially empty.
- a user makes a multimedia bookmark at the specific position of the video stream (generally multimedia file)
- a user inputs the text annotation that the user would like to associate with the bookmark.
- the text annotation is delivered to the server and is enqueued.
- the first element of the queue 5404 for the golf video stream V a is “Tiger Woods; 01:21:13:29.”
- a second user subsequently marks a new element at the 01:21:17:00 in hours:minutes:seconds:frames of the golf video stream V a (same video stream as before) and enters the keyword “Tee Shot.”
- the first element is shifted to the second and the new input is entered into the relevance queue 5402 for the video stream V a at the enqueue 5404 .
- This queue operation continues indefinitely.
- the video indexing server 5410 regularly analyzes each queue. Suppose, for instance, that the video stream is segmented into a finite number of time intervals using the automatic shot boundary detection method.
- The indexing server 5410 groups the elements inside the queue by checking time codes, so that the time codes in each group fall within the time interval corresponding to one segment. For each group, the frequency of each keyword is computed, and highly frequent keywords are taken as new semantic text annotations for the corresponding segment. In this way, the semantic textual metadata for each segment can be generated by utilizing the input of a large number of users.
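The grouping-and-frequency analysis performed by the indexing server can be sketched as follows. Annotations are (keyword, frame position) pairs, segment boundaries are a sorted list of segment start frames, and the agreement threshold `min_count` is an illustrative assumption:

```python
from collections import Counter


def annotate_segments(queue, boundaries, min_count=2):
    """Group queued (keyword, frame) annotations by the shot segments
    whose sorted start frames are `boundaries`, then promote keywords
    that at least `min_count` users agree on into segment metadata.
    (Sketch; field layout and threshold are assumptions.)"""
    groups = {}
    for keyword, frame in queue:
        # find the segment whose interval contains this time code
        seg = max(i for i, start in enumerate(boundaries) if start <= frame)
        groups.setdefault(seg, []).append(keyword.lower())
    return {seg: [word for word, n in Counter(words).most_common()
                  if n >= min_count]
            for seg, words in groups.items()}
```

Run periodically over each video's relevance queue, this yields per-segment keywords such as "tiger woods" or "tee shot" once enough users converge on them.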
- When users make a bookmark for a specific URL like www.google.com, they can add their own annotations. Thus, if the text engine maintains a queue for each document/URL, it can collect a large number of users' annotations. Therefore, it can analyze the queue and find the most frequent words, which become new metadata for the document/URL.
- the search engine would continuously have users update and enrich the text databases. This would help in the internationalization of the process, as users who are not native speakers of the particular web site content would annotate the contents in their own language and help their countrymen who conduct a search using their native tongue to find the site.
- the present invention provides a methodology and implementation for adaptive refresh rewinding, as opposed to traditional rewinding, which simply performs a rewind from a particular position by a predetermined length.
- the exemplary embodiment described below will demonstrate the present invention using video data.
- Three essential parameters control the behavior of adaptive refresh rewinding: how far to rewind, which refresh frames to select in the rewind interval, and how to present the chosen refresh frames on a display device.
- The scope of rewinding specifies how far to rewind a video back toward the beginning. For example, it is reasonable to set it to 30 seconds before the saved termination position, or to the last scene boundary position viewed by the user. Depending on user preference, the rewind scope may be set to a particular value.
- the selection can be static or dynamic.
- a static selection allows the refresh frames to be predetermined at the time of DB population or at the time of saving the termination position, while a dynamic selection determines the refresh frames at the time of the user's request to play back the terminated video.
- the candidate frames for user refresh can be selected in many different ways. For example, the frames can be picked out at random or at some fixed interval over the rewind interval. Alternatively, the frames at which a video scene change takes place can be selected.
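The three candidate-selection policies just described (random, fixed interval, scene change) can be sketched in one function; the parameter names, default count, and frame-count positions are illustrative assumptions:

```python
import random


def select_refresh_frames(rewind_start, terminate, scene_changes,
                          mode="scene", count=4, seed=None):
    """Pick refresh frames inside the rewind interval
    [rewind_start, terminate) using one of the three policies
    described above. Positions are frame counts (an assumption)."""
    if mode == "scene":
        # keep only the scene-change frames inside the rewind interval
        return [f for f in scene_changes if rewind_start <= f < terminate]
    if mode == "fixed":
        step = (terminate - rewind_start) // count
        return [rewind_start + i * step for i in range(count)]
    if mode == "random":
        rng = random.Random(seed)
        return sorted(rng.sample(range(rewind_start, terminate), count))
    raise ValueError("unknown selection mode: %r" % mode)
```

A static implementation would run this once when the termination position is saved; a dynamic one would run it at playback-request time.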
- The slide show is good for devices with a small display screen, while the storyboard may be preferred on devices having a large display screen.
- In the slide show presentation, the frames appear sequentially on the display screen at regular time intervals.
- In the storyboard presentation, a group of frames is simultaneously placed on the large display panel.
- FIG. 55 illustrates an embodiment of the rewind aspect of the present invention.
- the viewing user or the client system displaying the video preferably sends a request to mark the video at the point of interruption to the server delivering the multimedia content to the client device.
- An instance between beginning 5504 and end 5518 of video or multimedia content 5502 is preferably selected as the video's termination or marked position 5514 .
- the server randomly selects a sequence of refresh frames 5506 , 5508 , 5510 and 5512 from rewind interval 5516 for storage on a storage device.
- the server When the viewing user or client later initiates playback of the interrupted video, the server first delivers the sequence of refresh frames 5506 , 5508 , 5510 and 5512 to the client.
- refresh frames 5506 , 5508 , 5510 and 5512 are preferably displayed either in a slide-show or storyboard format before the video or multimedia content 5502 resumes playback from termination or marked position 5514 .
- FIG. 56 illustrates an alternate embodiment of the rewind aspect of the present invention.
- a request to mark the current location of video is sent by the client system to the network server.
- the network server has already retained a list of scene change frames 5610 , 5612 , 5618 , 5620 , 5622 , 5624 , 5628 and 5632 .
- The network server is able to determine the sequence of refresh frames 5618 , 5620 , 5622 , 5624 and 5628 over the interval between viewing termination position 5630 and beginning position 5614 , or alternatively, the rewind interval 5616 .
- the network server preferably delivers to the client the sequence of selected refresh frames 5618 , 5620 , 5622 , 5624 and 5628 .
- Refresh frames 5618 , 5620 , 5622 , 5624 and 5628 are then preferably displayed by the client in a slide-show or storyboard manner before the video or multimedia content 5602 continues from termination position 5630 .
- a third embodiment of the method of the present invention may also be gleaned from FIG. 56 .
- a request to mark the current location or termination position 5630 of the video is sent to the network server by the client.
- the server preferably executes a scene change detection algorithm on the rewind interval 5616 , i.e., the segment of multimedia content 5602 between viewing beginning position 5614 and termination position 5630 .
- the network server Upon completion of the scene detection algorithm, the network server sends the client system the resulting list of scene boundaries or scene change frames 5618 , 5620 , 5622 , 5624 and 5628 , which will serve as refresh frames.
- Playback of the video or multimedia content 5602 preferably begins upon completion of the client's display of refresh frames 5618 , 5620 , 5622 , 5624 and 5628 .
- Illustrated in FIG. 57 is a flow chart depicting a static method of adaptive refresh rewinding implemented on a network server according to teachings of the present invention.
- method 5700 Upon initiation at step 5702 , method 5700 preferably proceeds to step 5704 , where the network server runs a scene detection algorithm on video or other multimedia content to obtain a list of scene boundaries in advance of video or other multimedia content playback.
- Upon completion of the scene detection algorithm at step 5704 , method 5700 preferably proceeds to step 5706 , where a request received from a client system by the network server is evaluated to determine its type. Specifically, step 5706 determines whether the request received by the network server is a video or multimedia content bookmark request or a playback request.
- the playback request is preferably received by the network server at step 5708 .
- The network server then preferably sends the client system a pre-computed list of refresh frames and the previous termination position for the video or multimedia content requested for playback.
- method 5700 preferably proceeds to step 5712 .
- A multimedia bookmark, preferably using termination position information received from the client, may be created and saved in persistent storage.
- the rewind scope for the bookmark is preferably decided.
- the rewind scope generally defines how much to rewind the video or multimedia file back towards its beginning.
- the rewind scope may be a fixed amount before the termination position or the last scene boundary prior to the termination position.
- User preferences may also be employed to determine the rewind scope.
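A sketch of the rewind-scope decision combining the two policies described above: a fixed interval (e.g. 30 seconds) before the termination position, or the last scene boundary prior to it. Positions are frame counts, and the defaults are assumptions:

```python
def rewind_scope_start(terminate, scene_changes=None,
                       fixed_seconds=30, fps=30):
    """Decide where the rewind interval begins: the last scene
    boundary before the termination position if one is known,
    otherwise a fixed amount of time before it. All positions are
    frame counts; defaults are illustrative assumptions."""
    fixed_start = max(0, terminate - fixed_seconds * fps)
    if scene_changes:
        prior = [f for f in scene_changes if f < terminate]
        if prior:
            return max(prior)
    return fixed_start
```

A user-preference setting could select between the two branches instead of the fallback order sketched here.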
- In step 5716 , the method of frame selection for determining the refresh frames to be later displayed at the client system is determined.
- refresh frames can be selected in many different ways. For example, refresh frames can be selected randomly, at some fixed-interval or at each scene change. Depending upon user preference settings, or upon other settings, method 5700 may proceed from step 5716 to step 5718 where refresh frames may be selected randomly over the rewind scope. Method 5700 may also proceed from step 5716 to step 5720 where refresh frames may be selected at fixed or regular intervals. Alternatively, method 5700 may proceed from step 5716 to step 5722 where refresh frames are selected based on scene changes. Upon completion of the selection of refresh frames at any of steps 5718 , 5720 or 5722 , method 5700 preferably returns to step 5706 to await the next request from a client.
- method 5800 preferably waits at step 5804 for a user request.
- the request is evaluated to determine whether the request is a video or multimedia content bookmark request or whether the request is a video or multimedia content playback request.
- method 5800 preferably proceeds to step 5806 .
- a bookmark creation request is preferably sent to a network server configured to use method 5700 of FIG. 57 or method 5900 of FIG. 59 .
- method 5800 preferably returns to step 5804 where the next user request is awaited.
- method 5800 preferably proceeds to step 5808 .
- the client system sends a playback request to the network server providing the video or multimedia content.
- method 5800 preferably proceeds to step 5810 where the client system waits to receive the refresh frames from the network server.
- method 5800 Upon receipt of the refresh frames at step 5810 , method 5800 preferably proceeds to step 5812 where a determination is made whether to display the refresh frames in a storyboard or a slide show manner. Method 5800 preferably proceeds to step 5814 if a slide show presentation of the refresh frames is to be shown and to step 5816 if a storyboard presentation of the refresh frames is to be shown. Once the refresh frames have been presented at either step 5814 or 5816 , method 5800 preferably proceeds to step 5820 .
- In step 5820 , the client system begins playback of the interrupted video or multimedia content from the previously terminated position (see FIGS. 55 and 56 ).
- method 5800 preferably proceeds to step 5822 where a determination is made whether or not to end the client's connection with the network server. The determination to be made at step 5822 may be made from a user prompt, from user preferences, from server settings or by other methods. If it is determined at step 5822 that the client connection with the server is to end, method 5800 preferably severs the connection and proceeds to step 5824 where method 5800 ends. Alternatively, if a determination is made at step 5822 that the client connection with the server is to be maintained, method 5800 preferably proceeds to step 5804 to await a user request.
- In step 5904 , a request received from a client by the network server is evaluated to determine its type. Specifically, step 5904 determines whether the request received by the network server is a video or multimedia content bookmark request or a playback request.
- step 5904 If, at step 5904 , the request is determined to be a video or multimedia content bookmark request, method 5900 preferably proceeds to step 5906 .
- A bookmark, preferably using termination position information received from the client, may be created and saved in persistent storage.
- the playback request is preferably received by the network server at step 5908 .
- a decision regarding the rewind scope of the playback request is made by the network server at step 5908 .
- method 5900 preferably proceeds to step 5910 where the type of refresh frame selection to be made is determined.
- the network server determines whether refresh frame selection should be made based on randomly selected refresh frames from the rewind scope, refresh frames selected at fixed intervals throughout the rewind scope or scene boundaries during the rewind scope. If a determination is made that the refresh frames should be selected randomly, method 5900 preferably proceeds to step 5912 where refresh frames are randomly selected from the rewind scope. If, at step 5910 , a determination is made that the refresh frames should be selected at fixed or regular intervals over the rewind scope, such selection preferably occurs at step 5914 . Alternatively, if the scene boundaries should be used as the refresh frames, method 5900 preferably proceeds to step 5916 .
- the network server preferably runs a scene detection algorithm on the segment of video or multimedia content bounded by the rewind scope to obtain a listing of scene boundaries.
- method 5900 preferably proceeds to step 5918 .
- the network server preferably sends the selected refresh frames to the client system.
- the network server also preferably sends the client system its previous termination position for the video or multimedia content requested for playback. Once the selected refresh frames and the termination position have been sent to the client system, method 5900 preferably returns to step 5904 where another client request may be awaited.
- the multimedia bookmark of the present invention in its simplest form, denotes a marked location in a video that consists of positional information (URL, time code), content information (sampled audio, thumbnail image), and some metadata (title, type of content, actors).
- positional information URL, time code
- content information sampled audio, thumbnail image
- metadata title, type of content, actors
- multimedia bookmarks are created and stored when a user wants to watch the same video again at a later time.
- the multimedia bookmarks may be received from friends via e-mail (as described herein) and may be loaded into a receiving user's bookmark folder. If the bookmark so received does not attract the attention of the user, it may be deleted shortly thereafter.
- one aspect of the present invention provides a method and system embodied in a “recommendation engine” that uses multimedia bookmarks as an input element for the prediction of a user's viewing preferences.
- FIG. 49 , indicated generally at 4900 , illustrates the elements of an embodiment of a multimedia bookmark of the present invention.
- the multimedia bookmark 4902 contains positional information 4910 preferably consisting of a URL 4912 and a time code 4914 .
- Content information 4920 may also be stored in the multimedia bookmark 4902 .
- audio data 4922 and a thumbnail 4924 of the visual information are preferably stored in the content information 4920 .
- the metadata information 4930 of multimedia bookmark 4902 preferably includes a genre description 4932 , the title 4934 of the associated video and information regarding one or more actors 4936 featured in the video. Other types of information may also be stored in multimedia bookmark 4902 .
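The bookmark elements illustrated in FIG. 49 map naturally onto a simple record type. A minimal sketch, assuming the field names and types shown below (the disclosure does not prescribe a storage format):

```python
from dataclasses import dataclass, field

@dataclass
class MultimediaBookmark:
    # Positional information 4910
    url: str            # URL 4912
    time_code: float    # time code 4914, e.g. seconds from the start
    # Content information 4920
    audio_sample: bytes = b""   # sampled audio data 4922
    thumbnail: bytes = b""      # thumbnail image 4924
    # Metadata information 4930
    genre: str = ""             # genre description 4932
    title: str = ""             # title 4934
    actors: list = field(default_factory=list)  # actors 4936
```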
- a recommendation engine 5004 may be employed to evaluate a user's multimedia bookmark folder 5002 to determine or predict a user's viewing preferences.
- recommendation engine 5004 is preferably configured to read any positional, content and/or metadata information contained in any of the multimedia bookmarks 5006 , 5008 and 5010 maintained in a user's multimedia bookmark folder 5002 .
- the recommendation engine 5004 periodically visits the user's multimedia bookmark folder 5002 and performs a statistical analysis upon the multimedia bookmarks 5006 , 5008 and 5010 maintained therein. For example, assume that a user has 10 multimedia bookmarks in his multimedia bookmark folder. Further assume that five of the bookmarks are captured from sports programs, three are captured from science fiction programs, and two are captured from situation comedy programs. As the recommendation engine 5004 examines the “genre” attribute contained in the metadata of each multimedia bookmark, it preferably counts the number of specific keywords and infers that this user's most favorite genre is sports followed by science fiction and situation comedy. Over time and as the user saves additional multimedia bookmarks, the recommendation engine 5004 is better able to identify the user's viewing preferences.
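The genre-counting analysis described in the example above can be sketched as follows; the dictionary-based bookmark representation and the function name are assumptions:

```python
from collections import Counter

def rank_genres(bookmark_folder):
    """Tally the 'genre' metadata attribute across a user's multimedia
    bookmarks and return genres ordered from most to least frequent,
    i.e. the inferred viewing-preference ranking."""
    counts = Counter(b["genre"] for b in bookmark_folder if b.get("genre"))
    return [genre for genre, _ in counts.most_common()]
```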
- the recommendation engine can use its predictive capabilities to serve as a guide to the user through a multitude of program channels by automatically bringing together the user's preferred programs.
- the recommendation engine 5004 may also be configured to perform similar analyses on such metadata information as the “actors,” “title,” etc.
- Illustrated in FIG. 51 , indicated generally at 5100 , is a block diagram incorporating one or more EPG channel streams 5104 with teachings of the present invention.
- the preferred information to be associated with the multimedia bookmark i.e., the positional, content and metadata information illustrated in FIG. 49 .
- the positional information i.e., desired URL and time code information
- the content information i.e., a desired audio segment and thumbnail image
- the metadata (genre, title, actors) information sought by the multimedia bookmark process 5106 may be obtained from the EPG channel 5102 via EPG channel stream 5104 .
- This metadata is the source of information used by the recommendation engine of the present invention to examine the users' viewing preferences.
- After extracting the metadata from the EPG channel stream 5104 , the multimedia bookmark process 5106 creates a new multimedia bookmark and places it into the user's multimedia bookmark folder on the user's storage device 5108 .
- Illustrated in FIG. 52 is a block diagram of a system incorporating teachings of the present invention without an EPG channel.
- the preferred information to be associated with the multimedia bookmark i.e., the positional, content and metadata information illustrated in FIG. 49 .
- the multimedia bookmark process 5206 preferably accesses network 5202 via two-way communication medium 5204 to thereby establish a communication link with metadata server 5210 .
- Preferably located on metadata server 5210 is such metadata as genre, title, actors, etc.
- the multimedia bookmark process 5206 may download or otherwise obtain the metadata information it prefers for inclusion in the multimedia bookmark.
- the user's multimedia bookmark is preferably placed in the user's multimedia bookmark folder on the user's storage device 5208 .
- FIG. 68 shows the system to implement the present invention for a set top box (“STB”) with the personal video recorder (“PVR”) functionality.
- the metadata agent 6806 receives metadata for the video content of interest from a remote metadata server 6802 via the network 6804 .
- a user could provide the STB with a command to record a TV program beginning at 10:30 PM and ending at 11:00 PM.
- the TV signal 6816 is received by the tuner 6814 of the STB 6820 .
- the incoming TV signal 6816 is processed by the tuner 6814 and then digitized by MPEG encoder 6812 for storage of the video stream in the storage device 6810 .
- Metadata received by the metadata agent 6806 can be stored in a metadata database 6808 , or in the same data storage device 6810 that contains the video streams. The user could also indicate a desire to interactively browse the recorded video. Assume further that due to emergency news or some technical difficulties, the broadcasting station sends the program out on the air from 10:45 PM to 11:15 PM.
- the PVR on the STB starts recording the broadcast TV program at 10:30 sharp.
- since the user also wants to browse the video, the STB also needs the metadata for browsing the program.
- An example of such metadata is shown in Table 4.
- the metadata agent 6806 requests from the remote metadata server 6802 the metadata needed for browsing the video specified by the user.
- the corresponding metadata is delivered to the STB 6820 transparently to the user.
- the delivered metadata might include a set of time codes/frame numbers pointing to the segments of the video content of interest. Since these time codes are defined relative to the start of the video used to generate the metadata, they are meaningful only when the start of the recorded video matches that of the video used for metadata. However, in this scenario, there is a 15-minute time difference between the recorded content on the STB 6820 and the content on the metadata server 6802 . Therefore, the received metadata cannot be directly applied to the recorded content without proper adjustments. The detailed procedure to solve this mismatch will be described in the next section.
- FIG. 69 shows the system 6900 that implements the present invention when a STB 6930 with PVR is connected to the analog video cassette recorder (VCR) 6920 .
- VCR analog video cassette recorder
- metadata server 6902 interacts with the metadata agent 6906 via network 6904 .
- the metadata received by the metadata agent 6906 (and optionally any instructions stored by the user) are stored in metadata database 6908 or video stream storage device 6910 .
- the analog VCR 6920 provides an analog video signal 6916 to the MPEG encoder 6912 of the STB 6930 .
- the digitized video stream is stored by the MPEG encoder 6912 in the video stream storage device 6910 .
- this embodiment might be an excellent model to reuse the content stored in the conventional videotapes for the enhanced interactive video service.
- This model is beneficial to both consumers and content providers.
- unless consumers demand very high quality video compared to the VHS format, they can reuse content for which they have already paid, whereas the content providers can charge consumers a nominal cost for the metadata download.
- Video synchronization is necessary when a TV program is broadcast behind schedule (noted above and illustrated in FIG. 70 ).
- the forward collation is to match the reference frames/segment A 1 ( 7004 ) which is delivered from the server, against all the frames on the STB and to find the most similar frames/segment A 1 ′ ( 7024 ).
- the temporal media offset value d ( 7010 ) is determined, which implies that each representative frame number (or time code) that is received from the server for metadata services has to be increased by the offset d ( 7010 ).
- the downloaded metadata is synchronized with the video stream encoded in the STB.
- the use of the offset 7010 enables correlation of frames A 1 ( 7004 ) to A 1 ′ ( 7024 ), A 2 ( 7006 ), and A 3 ( 7008 ) to A 3 ′ ( 7028 ).
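The offset adjustment for forward collation (and, by symmetry, the backward collation of FIG. 71, where the offset is subtracted) can be sketched as follows; the function name and list-based representation are assumptions:

```python
def synchronize_time_codes(server_time_codes, offset_d, direction="forward"):
    """Apply the temporal media offset d to metadata time codes.

    direction='forward':  program aired behind schedule; add d (FIG. 70).
    direction='backward': program aired ahead of schedule; subtract d (FIG. 71).
    """
    sign = 1 if direction == "forward" else -1
    return [t + sign * offset_d for t in server_time_codes]
```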
- the server can send the STB characteristic data other than image data that represents the reference frame or segment.
- the important thing is to send the STB a characteristic set of data that uniquely represents the content of the reference frame or segment for the video under consideration.
- data can include audio data and image data such as color histogram, texture and shape as well as the sampled pixels.
- the information such as PTS (presentation time stamp) present in the packet header can be utilized for synchronization. This information is needed especially when the program is recorded from the middle of the program or when the recording of the program stops before the end of the program. Since both the first and last PTSs are not available in the STB, it is difficult to compute the media time code with respect to the start of the broadcast program unless such information is periodically broadcast with the program.
- the STB can synchronize the time code of the recorded program with respect to the time code used in the metadata by computing the difference between the first and last PTS since the video stream of the broadcast program is assumed to be identical to that used to generate the metadata.
- a backward collation is needed when a TV program ( 7102 ) is broadcast ahead of the schedule as illustrated in FIG. 71 .
- the backward collation is to match the reference frame A 1 ( 7104 ) from the metadata server against all the frames on the STB and to find the most similar frame A 1 ′ ( 7124 ) to the reference frame A 1 ( 7104 ).
- the offset value d ( 7110 ) is determined, which implies that each representative frame number or time code that is received from the server has to be subtracted by the offset d ( 7110 ) to obtain, for example, the correlation between frames A 2 ( 7106 ) with A 2 ′ ( 7126 ) and A 3 ( 7108 ) with A 3 ′ ( 7128 ) as illustrated in FIG. 71 .
- the user has set a flag instructing the STB to ignore commercials that are embedded in the video stream.
- the metadata server knows which advertisement clip is inserted in the regular TV program, but it does not know the exact temporal position of the inserted clip.
- the frame P ( 7212 ) is the first frame of the advertisement clip S C ( 7230 )
- the frame Q ( 7212 ) is the last frame of S C ( 7230 )
- the temporal length of the clip S C is d C ( 7236 )
- the total temporal length of the TV program (video stream A 7202 ) is dT ( 7204 ) as illustrated in FIG. 72 .
- the most similar frame P′ ( 7232 ) to the reference frame P ( 7212 ) is identified by using an image matching technique and the temporal distance h 1 ( 7224 ) between the start frame ( 7223 ) and the frame P′ ( 7232 ) is computed. Then, for each received representative frame whose frame number (or time code) is greater than h 1 ( 7224 ), the value of d C ( 7236 ) is added.
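The time-code adjustment for an inserted commercial clip can be sketched as follows; `h1` and `d_c` correspond to the quantities named above, and the function name is an assumption:

```python
def adjust_for_inserted_clip(time_codes, h1, d_c):
    """Shift metadata time codes past an inserted advertisement clip.

    h1:  temporal distance from the recorded stream's start frame to P',
         the frame matching the clip's first reference frame P.
    d_c: temporal length of the inserted clip S_C.
    Time codes at or before h1 are unchanged; later ones shift by d_c.
    """
    return [t + d_c if t > h1 else t for t in time_codes]
```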
- this procedure computes the frame (or time code) offset from the first frame of the video stream up to the frame which is most similar to the reference frame. For example, assume there are three reference start frames A 1 ( 7304 ), B 1 ( 7314 ), and C 1 ( 7324 ), and end frames 7306 , 7316 , and 7326 , that are selected from videos A 7302 , B 7312 , and C 7322 , respectively.
- the procedure matches the frame A 1 ( 7304 ) against all the frames on the stream 7303 and finds the most similar frame A 1 ′ ( 7344 ).
- the offset “offA” ( 7348 ) from the beginning 7305 to the location of A 1 ′ ( 7344 ) is now computed.
- This process is repeated in the same manner for the other reference frames B 1 ( 7314 ) and C 1 ( 7324 ) for video streams 7312 and 7322 , respectively. That is, find the most similar frames B 1 ′ ( 7354 ) and C 1 ′ ( 7364 ) of the video streams 7352 and 7362 , respectively and then compute the offset for the frame B 1 ′ ( 7354 ), which is “offB” ( 7358 ), followed by the offset for the frame C 1 ′ ( 7364 ), which is “offC” ( 7368 ) from the beginning 7305 . This enables calculation of the end frames 7352 and 7366 of video streams 7352 and 7362 , respectively.
- a solution to that problem would be to analyze the e-mail content and give a message to the user asking if he or she indeed attached it. For example, if the user sets an option flag on his e-mail client software program that is equipped with the present invention, a small program or other software routine then analyzes the e-mail content in order to determine if there is the possibility or likelihood of an attachment being referenced by the user. If so, then a check is made to determine if the draft e-mail message has an attachment. If there is no attachment, then a reminder message is issued to the user inquiring about the apparent need for an attachment.
- An example of the method of content analysis of the present invention includes:
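A hypothetical sketch of such a content analysis, using a small set of assumed cue phrases (the actual phrase list and matching method are illustrative, not prescribed by the disclosure):

```python
import re

# Assumed cue phrases suggesting the user meant to attach a file.
ATTACHMENT_CUES = re.compile(
    r"\b(attach(ed|ment)?|enclos(ed|ure)|see the (attached|enclosed))\b",
    re.IGNORECASE,
)

def needs_attachment_reminder(body_text, has_attachment):
    """Return True when the draft e-mail mentions an attachment but none
    is actually attached, so the client should issue a reminder message."""
    return bool(ATTACHMENT_CUES.search(body_text)) and not has_attachment
```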
- FIGS. 61 and 62 illustrate portions of the highlights of the Masters tournament of 1997 .
- a browser window 6102 having a Web page 6104 and a remote control bar button 6106 along the bottom of the window 6102 .
- the remote control buttons have various functionality, for example, there is a program list button 6108 , a browsing button 6110 , a play button 6112 , and a story board button 6116 .
- In the center of the buttons is a multifunction button 6114 that can be enabled with various functionality for moving among various selections within a web page. This is particularly useful if the page contains a number of thumbnail images in a tabular format.
- FIG. 62 contains a drill-down from one of the video links in FIG. 61 .
- the remote control button bar 6206 has identical functionality as the one described in FIG. 61 .
- the remote control buttons have various functionality, for example, there is a program list button 6208 , a browsing button 6210 , a play button 6212 , and a story board button 6216 .
- the selected image from FIG. 61 , namely 6120 , corresponds to the video portion of Tiger Woods' play on the ninth hole, element 6220
- the web page illustrates several other video clips, namely the play to the 18th hole 6232 , and the interview with players 6234 .
- FIG. 60 illustrates a hierarchical navigation scheme of the present invention as it relates to FIGS. 61 and 62 .
- This hierarchical tree is usually utilized as a semantic representation of video content. Specifically, there is the whole video 6002 that contains all the video segments which compose a single hierarchical tree. Subsets of the video segments are shown in video clip 6004 , the third round 6020 , the fourth round 6022 , Tiger Woods' biography 6024 , and the ending narration 6026 that correspond to elements 6120 , 6122 , 6124 and 6126 , respectively, of FIG. 61 .
- the lower three boxes of FIG. 60 correspond to the three choices available, as illustrated in FIG.
- the hierarchical navigation scheme allows a user to quickly drill down to the desired web page without having to wait for the rendering of multiple interceding web pages.
- the hierarchical status bar, using different colors, can be used to show the relative position of the segment currently selected by the user.
- FIG. 61 further contains a status bar 6150 that shows the relative position 6152 of the selected video segment 6120 , as illustrated in FIG. 61 .
- the status bar 6250 illustrates the relative position of the video segment 6120 as portion 6252 , and the sub-portion of the video segment 6120 , i.e., 6254 , that corresponds to Tiger Woods' play to the 18 th hole 6232 .
- the status bar 6150 , 6250 can be mapped such that a user can click on any portion of the mapped status bar to bring up web pages showing thumbnails of selectable video segments within the hierarchy. For example, if the user clicked on a portion of the map corresponding to element 6254 , the user would be given a web page containing the starting thumbnail of Tiger Woods' play to the 18th hole, Tiger Woods' play to the ninth hole, and the initial thumbnail for the highlights of the Masters tournament. In essence, this gives a quick map of the branch of the hierarchical tree from the position on which the user clicked on the map status bar.
- the video files are stored in each user's storage devices, such as a hard disk on a personal computer (PC) that are themselves connected to a P2P server so that those files can be downloaded to other users who are interested in watching them.
- PC personal computer
- user B cannot play the video starting from the position pointed to by the bookmark unless user B downloads the entire video file from user A's storage device.
- the full download could take a considerable length of time.
- the present invention solves this problem by sending the multimedia bookmark as well as a part of the video as follows:
- Yet another embodiment of the present invention addresses a problem with broadcast video: the user cannot bookmark his or her favorite segment once the segment disappears and a new scene appears at the same place in the video.
- One solution would be to use the time-shifting property of the digital personal video recorder (PVR).
- PVR digital personal video recorder
- the STB can check the electronic programming guide (EPG) and see if the same program is scheduled to be broadcast sometime in the future. If so, the STB can automatically record the same program at the scheduled time and then user B can play the bookmarked video.
- EPG electronic programming guide
- An embodiment of the present invention is based on the observation that perceptually relevant images often do not share any apparent low-level features but still appear conceptually and contextually similar to humans. For instance, photographs that show people in swimsuits may be drastically inconsistent in terms of shape, color and texture but conceptually look alike to humans.
- the present invention does not rely on the low-level image features, except in an initialization stage, but mostly on the perceptual links between images that are established by many human users over time. While it is infeasible to manually provide links between a huge number of images at once, the present invention is based on the notion that a large number of users over a considerable period of time can build a network of meaningful image links.
- the method of the present invention is a scheme that accumulates information provided by human interaction in a simpler way than image feature-based relevance feedback and utilizes the information for perceptually meaningful image retrieval. It is independent of and complementary to the image search methods that use low-level features and therefore can be used in conjunction with them.
- This embodiment of the method of the present invention is a set of algorithms and data structures for organizing and accumulating users' experience in order to build image links and to retrieve conceptually relevant images.
- a small amount of extra data space, a queue of image links, is needed for each query image in order to document the prior browsing and searching.
- a graph data structure with image objects and image links is formed and the constructed graph can be used to search and cluster perceptually relevant images effectively.
- the next section describes the underlying mathematical model for accumulating users' browsing and search based on image links.
- the subsequent section presents the algorithm for the construction of perceptual relevance graph and searching.
- the present invention utilizes the concept of collecting and propagating perceptual relevance information using simple data structures and algorithms.
- the relevance information provided by users can be based on image content, concept, or both.
- each image has a queue of finite length as illustrated in FIG. 30 . This is called the “relevance queue.”
- the relevance queue 3006 can be initially empty or filled with links to computationally similar images (CSIs) determined by low-level image feature descriptors such as color, shape and texture descriptors that are commonly used in a conventional content-based image search engine.
- a perceptually relevant image is determined by a user's selection in a manner that is similar to that of general relevance feedback schemes.
- the user views the retrieved images and establishes relevance by clicking perceptually related images as positive examples.
- FIG. 30 illustrates the case of Image 5 3004 of the retrieved images 3002 being clicked and its link being enqueued 3010 into the relevance queue Q n 3006 of the query Image n 3008 .
- the method of the present invention inserts the link to the clicked image, the PRI, into the query image's relevance queue by the normal “enqueue” operation 3010 .
- the oldest image link is deleted from the queue in a de-queue operation 3012 .
- the list of PRIs for each image queue is updated dynamically whenever a link is made to the image by a user's relevance feedback, and thus, an initially small set of links will grow over time.
- the frequency at which a PRI appears in the queue is the frequency of the users' selection and can be taken as the degree of relevance.
- This data structure that is comprised of image data and image links will become the basic vertex and edge structures, respectively, in the relevance graph that is developed for image searching, and the frequency of the PRI will be used for determining edge weights in the graph.
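The relevance queue and its enqueue/dequeue behavior (FIG. 30) can be sketched with a fixed-length deque; the class and method names are assumptions:

```python
from collections import Counter, deque

class RelevanceQueue:
    """Fixed-length queue of image links (PRIs) for one query image."""

    def __init__(self, max_len, initial_links=()):
        # maxlen makes the dequeue of the oldest link automatic (op. 3012)
        self.q = deque(initial_links, maxlen=max_len)

    def record_click(self, image_id):
        """Enqueue the clicked image's link (operation 3010); once the
        queue is full, the oldest link is dropped automatically."""
        self.q.append(image_id)

    def relevance(self):
        """Frequency of each PRI in the queue, taken as its degree of
        relevance for later use as graph edge weights."""
        n = len(self.q)
        return {img: c / n for img, c in Counter(self.q).items()}
```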
- Conventional relevance feedback methods explicitly require users to select positive or negative examples and may further require imposing weighting factors on selected images.
- users are not explicitly instructed to click similar images. Instead, the user simply browses and searches images motivated only by their interest. During the users' browsing and searching, it is expected that they are likely to click more often on relevant images than irrelevant images so the relevance information is likewise accumulated in the relevance queues.
- the structure of the image queue as defined above affords many different interpretations.
- the entire queue structure, one queue for each image in the database, may be viewed upon as a state vector that gets updated after each user interaction, namely by the enqueue and dequeue operations.
- the nth column of the queue matrix Q, denoted Q n , contains the image indices as its elements, and they may be initialized according to some low-level image relevance criteria.
- S n as defined above is basically a weighted histogram of image indices of the nth image queue Q n .
- the matrix is composed of elements r mn , the relevance values, which in essence is the probability of a viewer clicking the mth image while searching (querying) for images similar to the nth image.
- the actual values in the relevance matrix R will necessarily be different for different individuals. However, when all users are viewed upon as a collective whole, the assumption of the existence of a unique R becomes rather natural.
- the state vector (matrix) S converges to the image relevance matrix R, provided that an image relevance matrix exists.
- although the discussion of the state vector is helpful in identifying the state to which it converges, the actual construction and update of the state vector are not necessary.
- the image queue has all information that it needs to compute the state vector (or the image relevance values)
- the implementation requires only the image queue itself.
- the current state vector is computed as required, namely during the image retrieval process, when the forgetting factor is needed to return images similar to the query image based on the current image relevance values.
- the discussion in the previous subsection assumes steady state of the relevance queue.
- the relevance queue is initialized with CSIs obtained with a conventional search engine in a manner that makes higher-ranked CSIs have higher relevance values.
- CSI links are put into the relevance queue evenly but higher-ranked CSI links more frequently.
- An initialization method is illustrated for eight retrieved CSIs 3102 in the relevance queue 3106 in FIG. 31 where the image link numbers denote the ranks of the retrieved CSIs. This technique ensures that higher-ranked CSIs will remain longer in the queue as users replace CSIs with PRIs by relevance feedback.
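One plausible sketch of the rank-weighted initialization: each CSI link is repeated in proportion to a linear rank weight, so higher-ranked CSIs occupy more queue slots and survive longer as users replace them with PRIs (the exact weighting scheme is an assumption):

```python
def initialize_queue(csi_ranked_ids, queue_len):
    """Fill a relevance queue with CSI links so that higher-ranked CSIs
    appear more often in the queue."""
    k = len(csi_ranked_ids)
    weights = [k - r for r in range(k)]   # rank 0 (best match) gets weight k
    total = sum(weights)
    queue = []
    for img, w in zip(csi_ranked_ids, weights):
        # at least one slot per CSI, proportionally more for higher ranks
        queue += [img] * max(1, round(queue_len * w / total))
    return queue[:queue_len]
```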
- A graph is a natural model for representing syntactic and semantic relationships among multimedia data objects. Weighted graphs are used by the present invention to represent relevance relationships between images in an image database. As shown in FIG. 47 , the vertices 4706 of the graph 4702 represent the images and the edges 4708 are made by image links in the image queue.
- An edge between two image vertices P n and P j is established if image P j is selected by users when P n is used as a query image, and therefore image P j appears for a certain number of times in the image link queue of P n .
- the edge cost is determined by the frequency of image P j in the image link queue of P n , i.e., the degree of relevance established by users.
- the threshold function signifies the fact that P j is related to P n by a weighted edge only when P j appears in the image link of P n more than a certain number of times. If the frequency of P j is very low, P j is not considered to be relevant to P n .
- a relevance relationship can be asymmetric, so a relevance graph is generally a directed graph.
- the symmetry of a relevance relationship results in an undirected graph as shown in FIG. 47 . Specifically, FIG. 47 illustrates an undirected graph 4702 for a set of eight images and its adjacency matrix 4704 , respectively.
- the present invention employs a relevance graph structure that relates PRIs in a way that facilitates graph-based image search and clustering.
- a graph Once the image relevance is represented by a graph, one can use numerous well-established generic graph algorithms for image search.
- when a query image is given and it is a vertex in a relevance graph, it is possible to find the most relevant images by searching the graph for the lowest-cost image vertices from the source query vertex.
- a shortest-path algorithm such as Dijkstra's will assign lowest costs to each vertex from the source and the vertices can be sorted by their costs from the query vertex. See, Mark A. Weiss, “Algorithms, Data Structures, and Problem Solving with C++,” Addison-Wesley, MA, 1995.
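A sketch of building the relevance graph from the image-link queues and ranking vertices with Dijkstra's algorithm; the 1/frequency edge-cost function is an assumed choice, since the disclosure only requires that cost decrease as the users' selection frequency increases:

```python
import heapq
from collections import Counter

def build_relevance_graph(queues, threshold=2):
    """Edges from image-link queues: P_j is linked to P_n only if it appears
    at least `threshold` times in P_n's queue; edge cost falls as the
    selection frequency rises (here simply 1/frequency)."""
    graph = {}
    for n, queue in queues.items():
        graph[n] = {j: 1.0 / c for j, c in Counter(queue).items()
                    if c >= threshold and j != n}
    return graph

def most_relevant(graph, source):
    """Dijkstra's shortest paths from the query vertex; vertices are
    returned sorted by their path cost from the source."""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue
        for v, w in graph.get(u, {}).items():
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    dist.pop(source)
    return sorted(dist, key=dist.get)
```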
- the first step of most image/video search algorithms is to extract a K-dimensional feature vector for each image/frame representing the salient characteristics to be matched.
- the search problem is then translated as the minimization of a distance function d(o i , q) with respect to i, where q is the feature vector for the query image and o i is the feature vector for the i-th image/frame in the database.
- the hypershell search disclosed in the present invention also reduces the number of distance evaluations at query time, thus resulting in fast retrieval.
- the hypershell algorithm uses the distances to a group of predefined distinguished points (hereafter called reference points) in a feature space to speed up the search.
- the hypershell algorithm computes and stores in advance the distances to k reference points (d(o,p 1 ), . . . , d(o,p k )) for each feature vector o in the database of images/frames. Given the query image/frame q, its distances to the k reference points (d(q,p 1 ), . . . , d(q,p k )) are first computed.
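One plausible reading of the pruning step: by the triangle inequality, d(o, q) ≥ |d(o, p_j) − d(q, p_j)| for every reference point p_j, so an object whose precomputed reference-point distances differ too much from the query's cannot be a near match and is skipped without a full distance evaluation. The function name and epsilon threshold below are assumptions:

```python
def hypershell_candidates(db_dists, q_dists, epsilon):
    """Prune database objects using precomputed reference-point distances.

    db_dists: {object_id: [d(o, p_1), ..., d(o, p_k)]}
    q_dists:  [d(q, p_1), ..., d(q, p_k)]
    Keeps only objects that lie inside every hypershell of width epsilon
    around the query's distance to each reference point.
    """
    return [oid for oid, dists in db_dists.items()
            if max(abs(a - b) for a, b in zip(dists, q_dists)) <= epsilon]
```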
- the videos should be indexed.
- a special data structure for the videos should be built in order to minimize the search cost at query time.
- the indexing process of the hypershell algorithm consists of a couple of steps.
- the indexer simply takes a video as an input and sequentially scans the video frames to see if they can be representative frames (or key frames), subject to some predefined distortion measure. For each representative frame, the indexer extracts a low-level feature vector such as color correlogram, color histogram, or color coherent vector. The feature vector should be selected to well represent the significant characteristics of the representative frame.
- the current exemplary embodiment of the indexer uses color correlogram that has information on spatial correlation of colors as well as color distribution. See, J. Huang, S. K. Kumar, M. Mitra, W. Zhu and R. Zabih, “Image indexing using color correlogram,” in Proc. IEEE on Computer Vision and Pattern Recognition, 1997.
- the indexer performs PCA (Principal Component Analysis) on the whole set of the feature vectors extracted in the previous step.
- PCA Principal Component Analysis
- the PCA method reduces the dimensions of the feature vectors, thereby representing the video more compactly and revealing the relationship between feature vectors to facilitate the search.
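The PCA dimension-reduction step can be sketched in plain NumPy; using SVD on the centered data is a standard equivalent of eigen-decomposing the covariance matrix, and the function name is an assumption:

```python
import numpy as np

def pca_reduce(features, n_components):
    """Reduce K-dimensional feature vectors via PCA.

    features: (N, K) array of feature vectors (e.g. color correlograms).
    Returns the (N, n_components) projection onto the top principal axes.
    """
    centered = features - features.mean(axis=0)
    # right singular vectors of the centered data are the principal axes
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n_components].T
```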
- the LBG (Linde-Buzo-Gray) clustering is performed on the entire population of the dimension-reduced feature vectors. See, Y. Linde, A. Buzo and R. Gray, “An algorithm for vector quantization design,” in IEEE Trans. on Communications, 28(1), pp. 84-95, January, 1980.
- the clustering starts with a codebook of a single codevector (or cluster centroid) that is the average of the entire feature vectors.
- the code vector is split into two and the algorithm is run with these two codevectors.
- the two resulting codevectors are split again into four and the same process is repeated until the desired number of codevectors is obtained.
- These cluster centroids are used as the reference points for the hypershell search method.
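The LBG splitting procedure described above can be sketched as follows; the perturbation factor and iteration count are assumed tuning values:

```python
import numpy as np

def lbg_codebook(vectors, n_codevectors, n_iters=20, perturb=1e-3):
    """LBG clustering by repeated codevector splitting.

    Starts from a single codevector (the mean of all feature vectors),
    splits every codevector in two, refines with k-means-style updates,
    and repeats until the desired codebook size is reached."""
    codebook = vectors.mean(axis=0, keepdims=True)
    while len(codebook) < n_codevectors:
        # split each codevector into a slightly perturbed pair
        codebook = np.vstack([codebook * (1 + perturb), codebook * (1 - perturb)])
        for _ in range(n_iters):
            # assign each vector to its nearest codevector
            d = np.linalg.norm(vectors[:, None] - codebook[None], axis=2)
            labels = d.argmin(axis=1)
            # move each codevector to its cluster centroid
            for i in range(len(codebook)):
                if (labels == i).any():
                    codebook[i] = vectors[labels == i].mean(axis=0)
    return codebook
```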
- the indexer computes distance graphs for each reference point and each cluster.
- mk distance graphs are computed and stored into a database.
- the indexing data such as dimension-reduced feature vectors, cluster information, and distance graphs produced at the above steps are fully exploited by the hypershell search algorithm to find the best matches to the query image from the database.
- FIG. 48 illustrates this indexing process.
- FIG. 48 illustrates the system 4800 of the present invention for implementing the hypershell search.
- the system 4800 is composed generally of an indexing module 4802 and a query module 4804 .
- the indexing module contains storage devices in a storage module 4806 for storing frame and vector data. Specifically, storage space is allocated for key frames 4808 , dimension-reduced feature vectors 4810 , clusters and related centroids 4812 , and distance graphs 4816 .
- the storage elements mentioned above can be combined onto a single storage device, or dispersed over multiple storage devices such as a RAID array, storage area network, or multiple servers (not shown).
- the digital video 4836 is sent to a key frame module 4818 which extracts feature vector information from selected frames.
- the key frames and associated feature vectors are then forwarded to the PCA module 4820 , which both stores the feature vector information into storage module 4810 and forwards the dimension-reduced feature vectors 4840 to the LBG clustering module 4822 .
- the LBG clustering module 4822 stores the clusters and their associated centroids into the cluster storage module 4812 and forwards the clusters and their centroids to the compute module 4824 .
- the compute module 4824 computes the distance graphs and stores them into the distance graph storage module 4816 .
- the indexing module 4802 is typically a combination of hardware and software, although the indexing module is capable of being implemented solely in hardware or solely in software.
- the information stored in the indexing module is available to the query module 4804 (i.e., the query module 4804 is operably connected to the indexing module 4802 through a data bus, network, or other communications mechanism).
- the query module 4804 is typically implemented in software, although it can be implemented in hardware or a combination of hardware and software.
- the query module 4804 receives a query 4834 (typically in the form of an address or vector) for image or frame information.
- the query is received by the find module 4826 , which finds the one or more clusters nearest to the query vector.
- the hypershell intersection module 4828 then performs the intersection using one of the hypershell search algorithms (basic, partitioned, and/or partitioned-dynamic).
- in the ranking module 4830 , all of the feature vectors that are within the intersected regions (found by module 4828 ) are ranked. Thereafter, the ranked results are displayed to the user via display module 4832 .
- the problem of proximity search is to find all the feature points whose distance from a query point q is less than distance ⁇ where distance ⁇ is a real number indicating the fidelity of the search results. See, E. Chavez, J. Marroquin and G. Navarro, “Fixed queries array: a fast and economical data structure for proximity searching,” in Multimedia Tools and Applications , pp. 113-135, 2001.
- the hypershell search algorithm of the present invention provides an efficient solution to the proximity search problem.
- the feature points inside the circle S of FIG. 63 are those feature points similar to the query point q, up to the degree of ⁇ , and thus are the desired results of a proximity search.
- the value of ⁇ may be predetermined at the time of database buildup or determined dynamically by a user at the time of query. Since all the points in the circle are contained in the intersections I 1 and I 2 , it is desirable to search only the intersections instead of the whole feature space, thus dramatically reducing the search space.
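A proximity search at its simplest is just a range scan over the feature space. The following brute-force sketch (illustrative only; the distortion symbol, which renders garbled in this text, is written `delta` here) is the baseline that the hypershell algorithm is designed to accelerate:

```python
import numpy as np

def proximity_search(points, q, delta):
    """Return indices of all feature points within distance delta of query q
    (exhaustive baseline: scans the whole feature space)."""
    dists = np.linalg.norm(points - q, axis=1)
    return np.flatnonzero(dists <= delta)
```

Every point this scan returns lies inside the circle S of FIG. 63; the hypershell method reaches the same answer while examining only the intersected regions.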
- as shown in FIG. 63 , there may be more than one intersection resulting from hypershell intersection in a multidimensional feature space.
- the two intersected regions I 1 and I 2 , of the 2-D feature space are illustrated in FIG. 63 .
- the region I 1 is highly pertinent to the query point q while the region I 2 is not.
- the least relevant regions, such as I 2 should be eliminated.
- One way to achieve such elimination is to partition the original feature space into a certain number of smaller spaces (also called clusters) and to apply the hypershell intersection to the clusters or segmented feature spaces.
- FIGS. 63 and 64 illustrate clusters 6402 , 6404 , 6406 , 6408 , 6410 , 6412 , 6414 and 6416 whose boundaries are denoted by dotted lines. Collectively, the dotted lines may be referred to as a Voronoi diagram of cluster centroids. Referring to FIGS. 63 and 64 , of the intersections I 1 and I 2 , only the region I 1 would be considered a relevant region because it resides inside the same cluster to which the query point q belongs.
- a basic hypershell search algorithm may be used.
- a partitioned hypershell search algorithm or a partitioned-dynamic hypershell search algorithm may be used.
- the basic hypershell search algorithm is discussed below with reference to FIG. 65 .
- the partitioned hypershell search algorithm and the partitioned-dynamic hypershell search algorithm are also discussed below with reference to FIGS. 66 and 67 , respectively.
- O = {o k } denotes the set of feature (image/frame) points in the database.
- I j denotes the hypershell that is 2 ⁇ wide and centered at the reference point p j
- I denotes the set of intersections obtained by intersecting all the hypershells I j .
- three hypershells 6502 , 6504 , and 6506 are generated by the basic hypershell search algorithm upon running an image/frame query with a distortion ⁇ . Further, the use of the hypershells 6502 , 6504 and 6506 produces the intersection 6508 , bounded by bold lines. As mentioned above, the feature vector points within the intersection 6508 include those points that would be retrieved in a proximity search.
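The basic algorithm above can be sketched as follows: every point inside the delta-ball around q must, by the triangle inequality, fall inside every 2·delta-wide hypershell, so the precomputed point-to-reference distances (the "distance graphs") prune the database before any exact distance to q is computed. This is an illustrative implementation, not the patent's code.

```python
import numpy as np

def hypershell_search(points, ref_points, q, delta):
    """Basic hypershell search: prune with reference-point distances,
    then verify the surviving candidates exactly.

    points: (N, d) database feature vectors.
    ref_points: (m, d) reference points (e.g. LBG cluster centroids).
    """
    # "Distance graphs": distance from every point to every reference point.
    # In the indexing module these would be precomputed and stored.
    point_to_ref = np.linalg.norm(points[:, None, :] - ref_points[None, :, :], axis=2)
    q_to_ref = np.linalg.norm(ref_points - q, axis=1)
    # Triangle inequality: |d(o, p_j) - d(q, p_j)| <= d(o, q), so any match
    # must lie in every 2*delta-wide hypershell centered at p_j.
    in_all_shells = np.all(np.abs(point_to_ref - q_to_ref) <= delta, axis=1)
    candidates = np.flatnonzero(in_all_shells)
    # Exact verification only on the (hopefully small) intersection.
    d = np.linalg.norm(points[candidates] - q, axis=1)
    return candidates[d <= delta]
```

Because the pruning step never discards a true match, the result is identical to an exhaustive scan, only cheaper when the intersection is small.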
- the basic hypershell search algorithm tends to incur a considerable search cost, namely the time to intersect hypershells, because the number of data (image/frame) points contained in the intersection is usually larger than with the other two methods.
- O = {o k } denotes the set of feature (image/frame) points in the database.
- I j denotes the hypershell that is 2 ⁇ wide and centered at the reference point p j and I denotes the set of intersections obtained by intersecting all the hypershells.
- only the cluster C n expanded by the distortion ⁇, as shown in FIG. 66 , is searched.
- a feature point o that is close enough to the query image q (i.e., d(o,q) ≦ ⁇ ) but resides in a neighboring cluster would otherwise not be included in the outcome of the proximity search; many other cluster-based search algorithms do not guarantee search results at a given fidelity for this reason.
- the lines 6602 , 6604 , 6606 and 6608 indicate the original cluster boundaries
- the dotted lines 6610 and 6612 indicate the original cluster boundaries expanded by a distortion ⁇
- the darkened region 6614 denotes the expanded cluster C n that includes the expansion region 6616 over which the search is performed.
- FIG. 66 illustrates three hypershells 6618 , 6620 and 6622 that were created upon running an image/frame query q given a distortion ⁇ .
- the region 6614 can be selected as the most pertinent region for further consideration.
- the intersecting region 6624 is identified and actually searched.
- While the partitioned hypershell search algorithm is the fastest of the three algorithms, it also has a larger memory requirement than its alternatives. The extra storage is needed due to boundary expansion. For instance, a feature (image/frame) point near a cluster boundary, i.e., boundary lines 6702 , 6704 , 6706 and 6708 of FIG. 67 , often turns out to be an element contained in multiple clusters. Therefore, as an alternative, the partitioned-dynamic hypershell search algorithm is a light version of the partitioned hypershell search algorithm with a smaller memory requirement but approximately the same search time.
- I j denotes the hypershell that is 2 ⁇ wide and centered at the reference point p j , and I denotes the set of intersections obtained by intersecting all the hypershells.
- r is the shortest of all the distances between the query point and the cluster centroids.
- C is the set of clusters whose centroids are within the distance r+ ⁇ from the query point.
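The two definitions above translate directly into the cluster-selection rule of the partitioned-dynamic variant (an illustrative sketch; the distortion is written `delta`):

```python
import numpy as np

def select_clusters(centroids, q, delta):
    """Return indices of clusters whose centroids lie within r + delta of
    the query point, where r is the distance to the nearest centroid."""
    dists = np.linalg.norm(centroids - q, axis=1)
    r = dists.min()  # shortest query-to-centroid distance
    return np.flatnonzero(dists <= r + delta)
```

Only the selected clusters are then intersected with the hypershells, which avoids storing expanded (overlapping) cluster boundaries ahead of time.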
- the present invention of the fast codebook search is used to find the closest cluster for the hypershell search described previously.
- let H(•) stand for the Haar transform.
- a Haar-transform based multi-resolution structure for vector X is defined to be a sequence of vectors {X h,0 , X h,1 , . . . , X h,n , . . . }.
- FIG. 29 illustrates the use of the Haar transform in the present invention.
- the original feature space 2902 contains various elements X 0 2904 , X 1 2906 , X 2 2908 , and X 3 2910 as illustrated in FIG. 29 .
- through the transformation 2930 , the corresponding transform elements X h,0 2914 , X h,1 2916 , X h,2 2918 , and X h,3 2920 appear in the Haar transform space 2912 , corresponding to elements X 0 2904 , X 1 2906 , X 2 2908 , and X 3 2910 , respectively.
- D n (Q h , X h ) symbolizes the L 2 distance between two n-th level vectors Q h,n and X h,n in Haar transform space. Then the following inequality holds true: D m (Q h ,X h ) ≦ D m−1 (Q h ,X h ) ≦ . . . ≦ D 1 (Q h ,X h ) ≦ D 0 (Q h ,X h )
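This chain of lower bounds is what makes the fast codebook search possible: if a coarse-level distance already exceeds the current best match, the codevector can be rejected without computing the full distance. The sketch below uses an orthonormal Haar transform and assumes the coefficients are stored coarsest-first; it is illustrative, not the patent's layout.

```python
import numpy as np

def haar_1d(x):
    """Orthonormal Haar transform of a length-2^k vector, coarsest
    coefficients first (a sketch)."""
    out = np.asarray(x, dtype=float).copy()
    n = len(out)
    while n > 1:
        a = (out[:n:2] + out[1:n:2]) / np.sqrt(2)  # averages
        d = (out[:n:2] - out[1:n:2]) / np.sqrt(2)  # details
        out[: n // 2], out[n // 2 : n] = a, d
        n //= 2
    return out

def level_distance(qh, xh, level):
    """L2 distance using only the coarsest 2**level Haar coefficients.
    Coarser levels (fewer coefficients) lower-bound finer ones, matching
    the chain of inequalities above."""
    k = 2 ** level
    return np.linalg.norm(qh[:k] - xh[:k])
```

Because the transform is orthonormal, the distance over all coefficients equals the original L2 distance, so the coarse-level distances are true lower bounds and never cause a false rejection.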
- FIG. 25 is a flowchart illustrating the method 2500 of the present invention.
- the method begins generally at step 2502 .
- a new user enters the peer-to-peer (P2P) network in step 2504 .
- the new user multicasts a “ping” (service request) signal to announce its presence in step 2506 .
- the new user then waits to receive one or more “pong” (acknowledgement) signals from other users on the network, step 2508 .
- the new user keeps track of the nodes that sent “pong” messages in order to retain a list of active nodes for subsequent connections, step 2510 .
- the new user then initiates a search request by multicasting a query message to the network in step 2512 .
- the source node (SN) 2524 receives the new user's search request and executes a “visual” search using the query parameters in the new user's query message, step 2526 .
- the source node then routes the search results to the new user in step 2528 .
- the new user receives the search result message that contains the source node's IP address as well as a list of names and sizes of found files, step 2514 .
- the new user makes a connection to the source node using the source node's IP address, and downloads multimedia files, in step 2516 .
- a check is made to determine if the new user wants another search request in step 2518 . If so, the execution loops back to the step 2512 . Otherwise, the user leaves the P2P network in step 2520 and terminates the program in step 2522 .
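The new user's bookkeeping in the steps above might be sketched as below. The class and method names are hypothetical, and message transport is abstracted away entirely:

```python
class PeerNode:
    """Sketch of the new user's state in the ping/pong handshake.
    Illustrative only: no real sockets or multicast are involved."""

    def __init__(self):
        # Nodes that answered our "ping" (the list retained at step 2510).
        self.active_nodes = []

    def on_pong(self, node_address):
        # Remember every responder for subsequent connections.
        if node_address not in self.active_nodes:
            self.active_nodes.append(node_address)

    def on_search_result(self, result):
        # The result message carries the source node's IP address plus the
        # names and sizes of found files (step 2514); the caller then
        # connects to that IP to download the files (step 2516).
        return result["ip"], result["files"]
```

This captures only the state kept by the new user; the source node's "visual" search (step 2526) is a separate component.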
- the present invention includes a method and system of editing video materials that edits only the metadata of input videos to create a new video, instead of actually editing the videos stored as computer files.
- the present invention can be applied not only to videos stored on CD-ROM, DVD, and hard disk, but also to streaming videos on a local area network (LAN) and wide area networks (WAN) such as the Internet.
- the present invention further includes a method of automatically generating edited metadata using the metadata of input videos.
- the present invention can be used on a variety of systems related to video editing, browsing, and searching. This aspect of the present invention can also be used on stand-alone computers as well those connected to a LAN or WAN such as the Internet.
- Metadata of an input video file to be edited contain a URL of the video file and segment identifiers which enable one to uniquely identify metadata of a segment such as time information, title, keywords, annotations, and key frames of the segment.
- a virtually edited metafile contains metadata copied from some specific segments of several input metafiles, or contains only the URIs (Uniform Resource Identifier) of these segments. In the latter, each URI consists of both a URL of the input metafile and an identifier of the segment within the metafile.
- FIG. 32 compares the former video editing concept 3200 with the concept of virtual editing in the present invention 3200 ′.
- the metadata used during the virtual editing is stored on a separate metafile.
- unlike the prior art method of FIG. 32 ( a ), which edits the video files themselves, the method of the present invention utilizes metafiles 3204 of the videos 3202 and edits the metafiles 3204 in the virtual video editor 3206 ′ to produce a metafile 3210 of a virtually edited video.
- FIG. 33 is an example of the creation of a new video using the virtual editing of the present invention with the metafile of the three videos.
- Video 3340 consists of four segments 3342 , 3344 , 3346 , 3348 that correspond to elements 1 , 2 , 3 , and 4 , respectively, in the metafile 3302 of video 3340 .
- Segments 1 and 2 of metafile 3302 are grouped to segment 5 ; segments 3 and 4 are grouped to segment 6 , and segments 5 and 6 themselves are grouped into segment 7 of metafile 3302 .
- video 2 ( 3350 ) has three segments 3352 , 3354 , and 3356 which correspond to elements a, b, and c, respectively, of metafile 3304 .
- metafile 3304 groups the elements in a hierarchical structure (a and b into d, and c and d into e).
- Video 3 ( 3360 ) has five elements 3362 , 3364 , 3366 , 3368 , and 3370 that correspond to elements A, B, C, D, and E, respectively, of metafile 3306 as illustrated in FIG. 33 .
- metafile 3306 has its elements grouped in a hierarchical structure, namely, A, B, and C into F; and D and E into G from which F and G are grouped into H as illustrated in FIG. 33 .
- the virtually edited metadata 3308 is composed of segments 3310 , 3316 , 3322 , and 3328 , each of which has a segment identifier 3312 , 3318 , 3324 , and 3330 , respectively, indicating that, for example, segment 3310 is from segment 5 ( 3314 ) of metadata 3302 , segment 3316 is from segment c ( 3320 ) of metadata 3304 , and segments 3322 and 3328 are from segments A ( 3326 ) and C ( 3332 ) of metadata 3306 as shown in FIG. 33 .
- two segments 3380 and 3382 are defined in metafile 3308 as shown in FIG. 33 .
- a composing segment can have other composing segments and/or component segments as its child node, while the component segment cannot have any child nodes.
- Virtual video editing is, essentially, the process of selecting and rearranging segments from the several input video metafiles, hence the composing segments are defined in such a way as to form a desired hierarchical tree structure with the component segments chosen from the input metafiles.
- FIG. 33 describes the process of generating the virtually edited metadata.
- Segment 5 ( 3314 ) of metafile 3302 , the segment to be edited, is selected by browsing through metafile 3302 .
- Composing segment 3382 is newly generated, and it has the selected segment 5 ( 3314 ) as its child node by generating a new segment 3310 and saving an identifier of the segment 5 ( 3314 ) into the new segment. Therefore, the new segment 3310 becomes a component segment within the hierarchical structure being edited.
- Segment c ( 3320 ) is selected by browsing through metafile 3304 .
- a new segment 3316 is generated and an identifier of the segment c ( 3320 ) is saved into the new segment.
- the composing segment 3382 then has another newly created composing segment 3380 as its child node, and the title is written into the metadata of the segment 3380 .
- the segment 3380 has the selected segments A ( 3326 ) and C ( 3332 ) as its children by generating two new segments 3322 and 3328 , and saving identifiers of the segments A ( 3326 ) and C ( 3332 ) into the new segments, respectively.
- the new segments 3322 and 3328 thereby become component segments within the hierarchical structure being edited.
- FIG. 34 illustrates the virtually edited metadata 3408 and its corresponding restructured video 3440 .
- segment 5 presents video segments 3442 and 3444 .
- segment c presents video segment 3446
- segments A ( 3426 ) and C present video segments 3448 and 3450 , respectively.
- the copy operation can be performed by one of the two ways described below.
- the URL of the input video file containing the copied or referenced segment has to be stored in the corresponding input metafile.
- in a virtually edited metafile generated with the first method, if the video URLs of all the sibling component-segment nodes are equal, the URL of the video file is stored in the composing segment having these nodes as children, and the URL is removed from the metadata of these nodes.
- This step guarantees that, if the metadata of a composing segment has the URL of a video file, all the segments belonging to the composing segment come from the same video file.
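The URL-promotion rule can be sketched on a dictionary-based stand-in for the metafile tree; the keys "url" and "children" are illustrative, not the actual metafile schema:

```python
def hoist_video_urls(segment):
    """If every child of a composing segment carries the same video URL,
    move the URL up to the parent and delete it from the children.

    segment: {"url": str (optional), "children": [segment, ...] (optional)}
    """
    children = segment.get("children", [])
    for child in children:
        hoist_video_urls(child)  # process bottom-up
    urls = [c.get("url") for c in children]
    if urls and None not in urls and len(set(urls)) == 1:
        segment["url"] = urls[0]          # hoist the shared URL
        for c in children:
            del c["url"]                  # strip it from the children
    return segment
```

Applied bottom-up, this preserves the guarantee stated above: a URL on a composing segment means all of its descendants come from that one video file.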
- FIG. 35 is a flowchart of the method of the present invention for virtual video editing based on metadata.
- the present invention can only be applied in the situation where the content-based hierarchically structured metadata of the video is within the metafile itself or in a database management system (DBMS).
- the method of the present invention can be applied if each segment can be uniquely identified by providing some type of key or identifier of a database object.
- in step 3502 , a metafile of an input video is loaded.
- in step 3504 , one or more segments are selected while browsing through the metafile.
- a check is made in step 3506 to determine if a composing segment should be created. If so, step 3508 is performed, where the composing segment is created in a hierarchical structure being edited within the composing buffer. Thereafter, or if the result of step 3506 is negative, step 3510 is performed, where a composing segment is specified from newly created or pre-existing ones and a component segment is created as a child node of the specified composing segment.
- in step 3512 , a check is made to determine if a copy of the metadata is to be used, or a URI is used in its place. If a copy of the segment is used, then step 3516 is performed, where metadata of the selected segment is copied to the newly created component segment. If the URI is to be used, then step 3514 is executed, where the URI of the selected segment is copied to the component segment. In either case, step 3518 is next performed, where the URL of the input video file is written to the component segment. Next, a check is made at step 3520 to determine if the URLs of all of the sibling nodes are identical. If so, step 3522 is performed, where the URL is written to the parent composing segment and the URLs of all of the child segments are deleted.
- in step 3524 , a check is made to determine if another segment is to be selected. If so, execution loops back to step 3504 . Otherwise, a check is made at step 3526 to determine if another metafile is to be input to the process. If so, then execution loops back to step 3502 . Otherwise, a virtually edited metafile is generated from the composing buffer in step 3528 and the method ends.
- FIGS. 36, 37 , 38 , 39 , and 40 describe the preferred application of the present invention.
- Video 1 and its metafile along with video 2 and its metafile are stored in a computer with the domain name www.video.server 1 , as inputs.
- Video 3 and its metafile (see FIG. 33 ) are stored in www.video.server 2 .
- FIG. 36 is a description of the metafile for video 1 (see FIG. 33 ) using extensible markup language (XML), the universal format for structured documents.
- the metafile of video 1 contains the URL to video 1 , and every pre-defined segment contains several metadata items including the time information of the segment. Each pre-defined segment also has its own segment identifier to uniquely distinguish it within a file.
- Video 2 , and video 3 of FIG. 33 are described in XML in the same way in FIG. 37 and FIG. 38 , respectively.
- FIGS. 39 and 40 are representations of the metafile in XML after virtually editing video 1 , video 2 , and video 3 . Assume that the metafile is stored in www.video.server 2 . As indicated in FIG. 35 , there are two ways of copying the metadata of an input metafile's selected segment to a component segment of a virtually edited metafile.
- FIG. 39 was composed by the first method, which is to copy all the metadata within a selected segment to the component segment.
- FIG. 40 was composed by the second method, which is to store the URI of the selected segment to the component segment.
- the URI is composed of the input metafile's URL and the segment identifier within the file, according to the xlink and xpointer specification.
- the “#” between the URL and the segment identifier indicates that the URI is composed of a URL and a segment identifier within XML.
- the id( ) function, which takes the segment identifier as its parameter, indicates that the segment is uniquely identifiable by that identifier.
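The URI convention described here can be sketched with two small helpers (illustrative; the full xlink/xpointer machinery is not reproduced):

```python
def segment_uri(metafile_url, segment_id):
    """Build a segment URI in the xpointer-style form used in the text:
    the metafile URL, '#', and an id() expression naming the segment."""
    return f"{metafile_url}#id({segment_id})"

def parse_segment_uri(uri):
    """Split a segment URI back into its metafile URL and segment id."""
    url, _, pointer = uri.partition("#")
    if not (pointer.startswith("id(") and pointer.endswith(")")):
        raise ValueError("not an id()-style segment URI")
    return url, pointer[3:-1]
```

For instance, segment c of video 2 yields the URI quoted earlier in the text.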
- FIG. 41 is an XML representation of the play list of the root segment in FIGS. 39 and 40 .
- FIG. 42 is the block diagram of a virtual video editor supporting virtual video editing.
- the dotted line represents the flow of the data file, the solid line the flow of metadata, and the bold solid line the flow of the control signal.
- the major components of the virtual video editor are as follows.
- the input video files ( 4208 , 4210 , 4214 ) and their metafiles ( 4204 , 4206 , 4212 ) reside in the local computer or in computers connected by a network.
- video 1 ( 4208 ) and video 2 ( 4210 ) reside in the local computer and video 3 ( 4214 ) in a computer connected by the network.
- the video file and metafile are transferred to the virtual video editor 4202 through the network.
- the above transfer is handled by the file controller 4222 and the network controller 4220 .
- the file controller 4222 reads the video file as well as the metafile in the local computer, or the video file and the metafile transferred over the network.
- the metafile read from the file controller is transferred to the XML parser 4224 .
- after the XML parser validates that the transferred metadata are well-formed according to XML syntax, the metadata are stored in the input buffer 4226 .
- the metadata stored in the input buffer has a hierarchical structure described in the input metafile.
- a user performs virtual video editing with the structure manager 4228 .
- the process of copying the metadata of the selected segment to the composing buffer is done by the structure manager 4228 . That is, all the operations related to the creation of the edited hierarchical structure, as well as the management done within the input buffer (such as the selection of a particular composing segment, construction of a new composing segment or component segment, and copying of the metadata), are performed by the structure manager.
- segment c ( 3320 ) of video 2 ( 3304 ) (see FIG. 33 ) is selected by the editor.
- the URL of video 2 is www.video.server 1 /video 2
- the URI of a segment c 3320 in the metafile is www.video.server 1 /metafile 2 .xml#id(seg_c).
- the metadata of segment 'seg_c' of video 2 is as follows.
- the metadata of the newly created component segment contains the URL to the relevant videos of the segment.
- a play list generator 4236 is used to play segments in the hierarchical structure of the input buffer or composing buffer. Through the metafile's URL and time information obtained by the metadata, the play list generator passes the play list such as FIG. 41 , to video player 4238 . The video player plays the segments defined in the play list sequentially. The video being played is shown through the display device 4240 .
- the hierarchical structure edited in the composing buffer is saved as metafile 4242 by the XML generator 4234 .
- the present invention also provides a novel scheme for transcoding an image to fit the size of the respective client display when an image is transmitted to a variety of client devices with different display sizes.
- the method of perceptual hints for each image block is introduced, and then an image transcoding algorithm is presented as well as an embodiment in the form of a system that incorporates the algorithm to produce the desired result.
- the perceptual hint provides the information on the minimum allowable spatial resolution reduction for a given semantically important block in an image.
- the image transcoding algorithm selects the best image representation to meet the client capabilities while delivering the largest content value.
- the content value is defined as a quantitative measure of the information on importance and spatial resolution for the transcoded version of an image.
- a spatial resolution reduction (SRR) value is determined by the author or publisher, or by an image analysis algorithm, and can also be updated after each user interaction.
- SRR specifies a scale factor for the maximum spatial resolution reduction of each semantically important block within an image.
- a block is defined as a spatial segment/region within an image that often corresponds to the area of an image that depicts a semantic object such as car, bridge, face, and so forth.
- the SRR value represents the information on the minimum allowable spatial resolution, namely, width and height in pixels, at which users can still perceptually recognize each block, according to the author's expectation.
- the SRR value for each block can be used as a threshold that determines whether the block is to be sub-sampled or dropped when the block is transcoded.
- the SRR value ranges from 0 to 1 where 0.5 indicates that the resolution can be reduced by half and 1 indicates the resolution cannot be reduced.
- the author of a block of information can indicate, via its SRR value, that the resolution of the block could be reduced down to a size of 70×70 (thus, the minimum allowable resolution) without degrading the perceptibility of users. This value can then be used to determine the acceptable boundaries of resolutions that can be viewed by a given device over the system of the present invention illustrated in FIG. 53 .
- the SRR value also provides a quantitative measure of how much the important blocks in an image can be compressed to reduce the overall data size of the compressed image while preserving the image fidelity that the author intended.
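Reading the SRR value as a per-block threshold (per the sub-sample-or-drop behavior described above), a transcoder's decision for one block might be sketched as follows; the rule and names are illustrative:

```python
def transcode_block(width, height, srr, scale):
    """Decide how to transcode one block given its SRR value and the
    overall scale factor requested for the image (an illustrative rule).

    srr: minimum allowable resolution fraction in (0, 1]; e.g. 0.7 means
    the block stays recognizable down to 70% of its original size.
    """
    if scale >= srr:
        # The requested scale keeps the block above its perceptual floor.
        return ("subsample", round(width * scale), round(height * scale))
    # Scaling below the floor would make the block unrecognizable, so it
    # is dropped (or the crop is rearranged around more important blocks).
    return ("drop", 0, 0)
```

For example, a 100×100 block with SRR 0.7 survives a 0.8 scale (becoming 80×80) but is dropped at a 0.5 scale.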
- the SRR value can be best used with the importance value in J. R. Smith, R. Mohan, and C.-S. Li, “Content-based Transcoding of Images in the Internet,” in Proc. IEEE Intern. Conf. on Image Processing , October 1998; and S. Paek and J. R. Smith, “Detecting Image Purpose in World-Wide Web Documents,” in Proc. SPIE/IS & T Photonics West, Document Recognition , January 1998.
- Both SRR value (r i ) and importance value (s i ) are associated with each B i .
- Image transcoding can be viewed in a sense as adapting the content to meet resource constraints.
- Rakesh Mohan, et al. modeled the content adaptation process as a resource allocation in a generalized rate-distortion framework. See, e.g., R. Mohan, J. R. Smith and C.-S. Li, “Multimedia Content Customization for Universal Access,” in Multimedia Storage and Archiving Systems , Boston, Mass.: SPIE, Vol. 3527, November 1998; R. Mohan, J. R. Smith and C.-S. Li, “Adapting Multimedia Internet Content for Universal Access,” IEEE Trans. on Multimedia, Vol. 1, No. 1, pp. 104-14, March 1999; and R. Mohan, J. R. Smith and C.-S.
- the value-resource framework does not provide the quantitative information on the allowable factor with which blocks can be compressed while preserving the minimum fidelity that an author or a publisher intended. In other words, it does not provide the quantified measure of perceptibility indicating the degree of allowable transcoding. For example, it is difficult to measure the loss of perceptibility when an image is transcoded to a set of a cropped and/or scaled ones.
- V defines the quantitative measure of how much the transcoded version of an image can have both importance and perceptual information.
- V takes a value from 0 to 1, where 1 indicates that all of the important blocks are perceptible in the transcoded version of the image and 0 indicates that none are perceptible.
- the value function is assumed to have the following property:
- the content adaptation is modeled as the following resource allocation problem: maximize V(I, r) such that r·(x u −x l ) ≦ W and r·(y u −y l ) ≦ H, where the transcoded image is represented by a rectangular bounding box whose lower and upper bound points are (x l , y l ) and (x u , y u ), respectively.
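One way to read this allocation problem is as an exhaustive search over candidate crops and scales. The sketch below substitutes a simple additive stand-in for V (the sum of importances of blocks that remain perceptible under the chosen scale); that substitution is an assumption for illustration, not the patent's value function.

```python
from itertools import combinations

def best_transcoding(blocks, W, H):
    """Exhaustive sketch of the maximize-V allocation: try every subset of
    important blocks as the crop, scale the crop to fit the W x H display,
    and keep the version with the largest content value.

    blocks: list of (x_l, y_l, x_u, y_u, importance, srr) tuples.
    """
    best_value, best_plan = 0.0, None
    for k in range(1, len(blocks) + 1):
        for subset in combinations(blocks, k):
            # Bounding box of the chosen blocks = the crop.
            xl = min(b[0] for b in subset); yl = min(b[1] for b in subset)
            xu = max(b[2] for b in subset); yu = max(b[3] for b in subset)
            # Largest scale r satisfying r*(xu-xl) <= W and r*(yu-yl) <= H.
            scale = min(1.0, W / (xu - xl), H / (yu - yl))
            # A block contributes only if the scale respects its SRR floor.
            value = sum(b[4] for b in subset if scale >= b[5])
            if value > best_value:
                best_value, best_plan = value, (xl, yl, xu, yu, scale)
    return best_value, best_plan
```

This brute force is exponential in the number of blocks and is meant only to make the objective and constraints concrete; a practical transcoder would use a smarter search.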
- FIGS. 43 and 44 demonstrate the results of transcoding according to the method of the present invention.
- FIG. 43 illustrates a comparison 4300 of a non-transformed resolution reduction scheme 4302 to a transcoded scheme 4304 of the present invention.
- accompanying each example is a content value parameter indicative of the “value” seen by the user.
- the images for workstations 4306 and 4316 are identical in content value (1.0).
- when moved to a color PC with a smaller screen, the entire image is merely shrunk proportionally and the content value for the images 4308 and 4318 remains 1.0.
- a small television for example, has a smaller screen.
- the prior art method shrinks the image 4310 yet again, bringing the resolution detail and thus the content value to 0, while the transcoding method of the present invention preserves the resolution of the areas of interest 4330 in the image 4320 while removing (cropping) relatively extraneous information and thus commands a higher content value of 0.53.
- the same pattern holds for images 4312 and 4314 for the HHC and PDA of the prior art method, and for images 4322 and 4324 for the respective examples employing the method of the present invention.
- the designation of the area(s) of interest 4330 can be specified by the author or an image analysis algorithm, or it may be identified by adaptive techniques through user-feedback as explained elsewhere within this disclosure.
- FIG. 44 illustrates a comparison 4400 of a non-transformed resolution reduction scheme 4402 to a transcoded scheme 4404 of the present invention.
- accompanying each example is a content value parameter indicative of the “value” seen by the user.
- the images for workstations 4406 and 4416 are identical in content value (1.0). When moved to a color PC with a smaller screen, the entire image is merely shrunk proportionally and the content value for the images 4408 and 4418 remains 1.0. However, a small television, for example, has a smaller screen.
- the prior art method shrinks the image 4410 yet again, bringing the resolution detail and thus the content value to 0, while the transcoding method of the present invention preserves the resolution of the area of interest 4430 in the image 4420 while removing (cropping) relatively extraneous information and thus commands a higher content value of 1.0.
- the same pattern holds for images 4412 and 4414 for the HHC and PDA of the prior art method, and for images 4422 and 4424 for the respective examples employing the method of the present invention.
- this disclosure has provided a novel scheme for transcoding an image to fit the size of the respective client display when an image is transmitted to a variety of client devices with different display sizes.
- the method of the present invention further provides a scheme to transcode video with a variety of client devices having different display sizes.
- a general overview of the scheme is illustrated in FIG. 45 .
- the content transcoder 4502 contains various modules that take data from a content database 4504 , modify the content and forward the modified content to the Internet for viewing by various devices.
- the system 4500 has content database 4504 that maintains content information as well as (optionally) publisher and author preferences.
- a signal is received by the policy engine 4506 that resides within the content transcoder 4502 .
- the policy engine 4506 is operative with the content database 4504 and can receive policy information from the database 4504 as illustrated in FIG. 45 .
- Content information is retrieved from the database 4504 to the content analyzer 4508 that then forwards the content to the content selection module 4510 that is operative also with the policy engine 4506 .
- Based upon policy and information from the content analysis and manipulation library 4512 , specific content is selected and forwarded to the content manipulation module 4514 , which modifies the content for viewing by the specific requesting device.
- the content analysis and manipulation library 4512 is operative with most of the main modules, specifically the content analyzer 4508 as well as the content selection module 4510 and the content manipulation module 4514 .
- the output information from the content transcoder is forwarded to the Internet for eventual receipt and display on, for example, personal computer 4524 for the enjoyment of user 4526 , personal data appliance 4522 , laptop 4520 , mobile telephone 4518 , and television 4516 .
- the policy engine module 4506 gathers the capabilities of the client, the network conditions and the transcoding preferences of the user as well as from the publisher and/or author. This information is used to define the transcoding options for the client. The system then selects the output-versions of the content and uses a library of content analysis and manipulation routines to generate the optimal content to be delivered to the client device.
- the content analyzer 4508 analyzes the video, namely the scenes of video frames, to determine their type and purpose, the motion vector direction, the presence of faces and text, etc. Based on this information, the content selection module 4510 and the manipulation module 4514 transcode the video by adaptively selecting the attention area, defined, for example, by the position and size of a rectangular window in the video that is intended to fit the size of the respective client display.
- the system 4500 will select a dynamically transcoded (for example, scaled and/or cropped) area in the video without degrading perceptibility for users. The system also provides a manual editing routine by which the publisher or author can adjust the position and size of the transcoded area.
- FIG. 46 illustrates an example of focus of attention area 4604 within the video frame 4602 that is defined by an adaptive rectangular window in the figure.
- the adaptive window is represented by the position and size as well as by the spatial resolution (width and height in pixels).
- the scene (or content) analysis adaptively determines the window position as well as the spatial resolution for each frame/clip of the video.
- the information on the gradient of the edges in the image can be used to intelligently determine the minimum allowable spatial resolution given the window position and size.
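The gradient heuristic above can be sketched roughly as follows: frames with strong edge gradients (fine detail or text) tolerate less resolution reduction than smooth frames. The threshold and the two scale factors are invented for illustration; the disclosure does not specify them:

```python
def min_scale_factor(pixels, width, height, detail_threshold=20.0):
    """Return the smallest allowed scale factor (1.0 = full resolution).

    `pixels` is a row-major list of grayscale values for one frame.
    The mean horizontal gradient magnitude serves as a crude measure
    of edge strength within the window.
    """
    total, count = 0.0, 0
    for y in range(height):
        for x in range(width - 1):
            total += abs(pixels[y * width + x + 1] - pixels[y * width + x])
            count += 1
    mean_gradient = total / max(count, 1)
    # Strong edges: keep at least 75% of the resolution; smooth: allow 50%.
    return 0.75 if mean_gradient > detail_threshold else 0.5
```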
- the video is then fast transcoded by performing the cropping and scaling operations in the compressed domain such as DCT in case of MPEG-1/2.
- the present invention also enables the author or publisher to dictate the default window size. That size represents the maximum spatial resolution of area that users can perceptually recognize according to the author's expectation.
- the default window position is defined as the central point of the frame. For example, one can assume that this default window size is to contain the central 64% area by eliminating 10% background from each of the four edges, assuming no resolution reduction.
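A minimal sketch of the default window computation described above, trimming 10% of the frame from each of the four edges so that the central 64% of the area remains (the function name is illustrative):

```python
def default_window(frame_width, frame_height, trim_fraction=0.10):
    """Return (x, y, width, height) of the central default attention window.

    Trimming `trim_fraction` of the frame from each of the four edges
    keeps (1 - 2 * 0.10)**2 = 64% of the frame area, as in the example above.
    """
    x = int(frame_width * trim_fraction)
    y = int(frame_height * trim_fraction)
    return x, y, frame_width - 2 * x, frame_height - 2 * y
```

For a 100×100 frame this yields the window (10, 10, 80, 80), whose 6400-pixel area is 64% of the frame.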
- the default window can be varied or updated after the scene analysis.
- the content/scene analyzer module analyzes the video frames to adaptively track the attention area. The following are heuristic examples of how to identify the attention area, including frame scene types (e.g., background, synthetic graphics, complex scenes) that can help adjust the window position and size.
- Computers have difficulty identifying perceptually salient objects, but certain types of objects can be identified by text and face detection or by object segmentation. The objects are defined as spatial region(s) within a frame and may correspond to regions that depict different semantic objects such as cars, bridges, faces, embedded text, and so forth. For example, if no object (especially a face or text) larger than a specific threshold value exists within the frame, the frame can be defined as landscape or background, and the default window size and position may be used.
- the text detection algorithm can determine the window size.
- the importance of an object can be measured as follows:
- the visual rhythm of a video is a single image: a two-dimensional abstraction of the entire three-dimensional content of the video, constructed by sampling a certain group of pixels from each frame of the image sequence and temporally accumulating the samples along time.
- Each vertical line in the visual rhythm of a video consists of a small number of pixels sampled from a corresponding frame of the video according to a specific sampling strategy.
- FIG. 26 shows several different sampling strategies 2600 such as horizontal sampling 2603 , vertical sampling 2605 , and diagonal sampling 2607 .
- the diagonal sampling strategy 2607 is to sample some pixels regularly from those lying at a diagonal line of each frame of a video.
- the sampling strategies illustrated in FIG. 26 are only a partial list of all realizable sampling strategies for visual rhythm utilized for many useful applications such as shot detection and caption text detection.
- Diagonal sampling provides the best visual features for distinguishing various video editing effects on the visual rhythm. All visual rhythms presented hereafter are assumed to be constructed using the diagonal sampling strategy for shot detection, but the present invention can be easily applied to any sampling strategy.
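The construction just described can be sketched as follows, using the diagonal strategy on frames stored as row-major grayscale lists. This is a simplified illustration (uncompressed pixels, nearest-pixel diagonal positions); the names are not from the disclosure:

```python
def diagonal_samples(frame, width, height, num_samples):
    """Sample `num_samples` pixels along the main diagonal of one frame
    (a row-major list of grayscale values)."""
    samples = []
    for i in range(num_samples):
        x = (i * (width - 1)) // max(num_samples - 1, 1)
        y = (i * (height - 1)) // max(num_samples - 1, 1)
        samples.append(frame[y * width + x])
    return samples

def visual_rhythm(frames, width, height, num_samples):
    """Each row of the result is the diagonal sample of one frame;
    stacking them over time yields the visual rhythm image, with one
    vertical line per frame of the video."""
    return [diagonal_samples(f, width, height, num_samples) for f in frames]
```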
- a compression method that exploits only spatial redundancy is referred to as intraframe coding, and frames coded in such a way are defined as intra-coded frames.
- Most video coders adopt block-based coding either in the spatial or transform domain for intraframe coding to reduce spatial redundancy.
- MPEG adopts the discrete cosine transform (DCT) of 8×8 blocks, into which 64 neighboring pixels are exclusively grouped.
- Intra-pictures are compressed using intraframe coding; that is, they do not reference any other pictures in the coded bit stream.
- predicted pictures (P-picture) 2204 and 2202 are coded using motion-compensated prediction from past I-picture 2206 or P-picture 2204 , respectively.
- Bidirectionally predicted pictures (B-pictures) 2210 are coded using motion-compensated prediction from past and/or future I-pictures 2206 or P-pictures 2204 and 2202. Therefore, given a pixel selected by a predetermined sampling strategy for constructing visual rhythm, one needs only to decode those blocks in the I-, P-, and B-pictures that are required to decode the block containing the corresponding pixel in the current picture.
- FIG. 23 and FIG. 24 illustrate the shaded blocks that need to be decompressed for the construction of visual rhythm, in frames that can be referenced by other frames for motion compensation and in frames that cannot be referenced by other frames, respectively.
- To construct visual rhythm by sampling the diagonal pixels located on line 2308 of a frame 2302, one only needs to decompress the shaded blocks in FIG. 23, which lie between the lines 2304 and 2310 (separated by value 2306, the search range p of motion prediction).
- This approach obtains the required group of pixels without decoding unnecessary blocks, and guarantees that the pixel values needed for constructing visual rhythm can be obtained without fully decoding all the blocks composing each frame of the sequence.
- a DCT block of N×N pixels is transformed to the frequency-domain representation, resulting in one DC coefficient and (N×N−1) AC coefficients.
- the single DC coefficient is N times the average of all N×N pixel values. This means that the DC coefficient of a DCT block can serve as the pixel value of a pixel included in the block when exact pixel values are not required. Extraction of a DC coefficient from a DCT block can be performed quickly because it does not require fully decoding the DCT block.
- the extraction of DC coefficients from the blocks can be utilized instead of fully decoding the blocks and obtaining the pixel values of the pixels that will be selected by a predetermined sampling strategy for constructing visual rhythm.
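A small sketch of the DC shortcut described above. For an orthonormal N×N two-dimensional DCT, the DC coefficient equals N times the block's mean pixel value, so DC/N can stand in for any pixel of the block without performing an inverse DCT (function names are illustrative):

```python
def dct_dc(block, n=8):
    """DC coefficient of an n x n block under the orthonormal 2-D DCT:
    DC = sum(pixels) / n = n * mean(pixels)."""
    assert len(block) == n * n
    return sum(block) / n

def dc_pixel_estimate(block, n=8):
    """Approximate pixel value for the whole block, recovered from the
    DC coefficient alone (adequate when exact values are not required)."""
    return dct_dc(block, n) / n
```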
- the same approach can be applied to any given compression scheme by utilizing only those coefficients that are readily available from the compressed stream.
- Let f_DC(x, y, t) be the pixel value at location (x, y) of an arbitrary DC image that consists of the DC coefficients of the original frame t.
- the visual rhythm is a two-dimensional image consisting of DC coefficients sampled from a three-dimensional data (DC sequence).
- Visual rhythm is also an important visual feature that can be utilized to detect scene changes.
- FIG. 26 illustrates a set of sampling strategies for constructing visual rhythm from a set of frames making up a video stream.
- the frame sequence 2602 utilizes a single horizontal sampling 2603 across the middle of the frame.
- the frame sequence 2604 utilizes vertical sampling 2605 from top to bottom of the frame midway between the left and right sides.
- the frame sequence 2606 utilizes diagonal sampling 2607 from one corner of the frame to the diagonally opposite corner.
- the scanning techniques noted above can be mixed and matched (e.g., combining vertical and diagonal) and that multiple scans can take place (e.g., multiple horizontal scans, or cross-diagonal scans) to enhance the search, albeit with a potential performance loss due to the extra computational overhead.
- the sampling strategies can be set in a flexible manner for text detection of specific video materials where the approximate regions of caption text are known a priori.
- FIG. 27 ( a ) shows an example of visual rhythm when diagonals of a frame are sampled.
- frame 2714 is one of a set of frames used to construct binarized visual rhythm 2712 where only the pixels 2718 corresponding to caption text are represented in white.
- a caption 2716 is embedded in the frame 2714 and the subsequent set of frames used to construct the binarized visual rhythm 2712 so that “caption line” 2718 is formed within the binarized visual rhythm 2712 .
- FIG. 27 ( a ) and FIG. 27 (b) illustrate the visual rhythm 2702 of video content ( FIG. 27 ( a )) and its corresponding binarized visual rhythm 2708 where pixels corresponding to caption 2710 are represented in white ( FIG. 27 ( b )).
- Caption text embedded in zone 2706 of the visual rhythm illustrated in FIG. 27(a) shows that captions possess certain properties, such as in region 2704.
- This region 2704 of FIG. 27 ( a ) can be separated and is represented in white 2710 as in FIG. 27 ( b ) to form binarized visual rhythm 2708 .
- Once the binarized visual rhythm 2708 is obtained, only a portion of the content of the entire frame need be scanned in order to extract the textual information and create appropriate multimedia bookmarks according to the method of the present invention.
- As illustrated in FIG. 28, the method of the present invention similarly enables locating the caption text 2804 of a frame 2802, as well as multiple captions 2808, 2810, and 2812 from another frame 2806, extracting the text, and obtaining the binarized results 2804′, 2808′, 2810′, and 2812′ for subsequent processing: recognizing text, indexing, storing, and retrieving.
- the caption frame detection stage seeks caption frames, which herein are defined as video or image frames that contain one or more caption texts. The caption frame detection algorithm is based on the following characteristics of caption text within video:
- stationary caption text is more often an important carrier of information, and hence more suitable for indexing and retrieval, than moving caption text.
- the term “caption text” in the rest of this disclosure therefore refers to stationary caption text.
- Caption lines shorter than the frame length corresponding to a specific amount of time are neglected, since caption text usually remains in the scene for a number of consecutive frames.
- the shortest captions appear to be active for at least two seconds, which translates into a caption line with a frame length of 60 if the video is digitized at 30 frames per second. Thus, caption lines with a duration of less than 2 seconds can be eliminated.
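The duration rule above amounts to a simple filter: at 30 frames per second, a 2-second minimum means discarding caption lines that span fewer than 60 frames. A hedged sketch, with caption lines represented as assumed (start_frame, end_frame) pairs:

```python
def filter_caption_lines(lines, fps=30, min_seconds=2.0):
    """Drop caption lines shorter than `min_seconds` (60 frames at 30 fps),
    since real captions persist across many consecutive frames.

    Each line is a (start_frame, end_frame) pair, inclusive on both ends.
    """
    min_frames = int(fps * min_seconds)
    return [(s, e) for (s, e) in lines if (e - s + 1) >= min_frames]
```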
- the resulting set of caption lines with their temporal durations appears in the form LINE_k, k = 1, . . . , N_LINE, where N_LINE is the total number of caption lines.
- the caption lines are ordered by increasing starting frame number: t_1^start ≤ t_2^start ≤ . . . ≤ t_N_LINE^start.
- the frames not in between the temporal duration of the resulting set of caption lines can be assumed to not contain any caption text and are thus omitted as caption frame candidates.
- the caption text localization stage seeks to spatially localize caption text within the caption frame, along with its temporal duration within the video.
- Let f_DC(x, y, t) be the pixel value at (x, y) of the DC image of frame t.
- a caption line can be used to approximate the location of caption text within the frame, and enable one to provide an algorithm to focus on specific area of the frame.
- two regions are considered to be of similar height if the height of a shorter region is at least 40% of the height of a taller one.
- regions are projected onto the Y-axis. If the overlap of the projections of two regions is at least 50% of the shorter one, they are considered to be horizontally aligned.
- regions corresponding to the same caption text should be close to each other.
- the spacing between the characters and words of a caption text is usually less than three times the height of the tallest character, and so is the width of a character in most fonts.
- the following criterion is optionally used to merge regions corresponding to portions of caption text to obtain a bounding box around the caption text:
- Width / Height > τ_A, where τ_A = 0.7.
- Width and Height are the width and height of the final caption text region.
- a caption text region is expected to meet the above constraint; otherwise, it is removed as a text region.
- the final caption text region takes the temporal duration of its corresponding caption line.
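The localization criteria above (similar height with the shorter at least 40% of the taller, Y-projection overlap of at least 50% of the shorter, spacing under three character heights, and the final Width/Height > 0.7 check) can be sketched as follows. Regions are assumed to be (x, y, width, height) boxes; the representation and function names are illustrative, not the disclosure's implementation:

```python
def mergeable(r1, r2):
    """True when two regions plausibly belong to the same caption text."""
    (x1, y1, w1, h1), (x2, y2, w2, h2) = r1, r2
    # Similar height: the shorter region is at least 40% of the taller.
    if min(h1, h2) < 0.4 * max(h1, h2):
        return False
    # Horizontally aligned: Y-projections overlap >= 50% of the shorter.
    overlap = min(y1 + h1, y2 + h2) - max(y1, y2)
    if overlap < 0.5 * min(h1, h2):
        return False
    # Proximity: horizontal gap under three times the taller height.
    gap = max(x1, x2) - min(x1 + w1, x2 + w2)
    return gap < 3 * max(h1, h2)

def merged_box(r1, r2):
    """Bounding box around two merged caption regions."""
    (x1, y1, w1, h1), (x2, y2, w2, h2) = r1, r2
    x, y = min(x1, x2), min(y1, y2)
    return x, y, max(x1 + w1, x2 + w2) - x, max(y1 + h1, y2 + h2) - y

def is_caption_box(box, tau_a=0.7):
    """Final aspect-ratio constraint: Width / Height > tau_a."""
    _, _, w, h = box
    return w / h > tau_a
```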
Abstract
A method and system are provided for tagging, indexing, searching, retrieving, manipulating, and editing video images on a wide area network such as the Internet. A first set of methods is provided for enabling users to add bookmarks to multimedia files, such as movies, and audio files, such as music. The multimedia bookmark facilitates the searching of portions or segments of multimedia files, particularly when used in conjunction with a search engine. Additional methods are provided that reformat a video image for use on a variety of devices that have a wide range of resolutions by selecting some material (in the case of smaller resolutions) or more material (in the case of larger resolutions) from the same multimedia file. Still more methods are provided for interrogating images that contain textual information (in graphical form) so that the text may be copied to a tag or bookmark that can itself be indexed and searched to facilitate later retrieval via a search engine.
Description
- This application is a continuing application that is a divisional of commonly-owned, copending U.S. patent application Ser. No. 09/911,293, filed Jul. 23, 2001 by Sull et al.
- 1. Field of the Invention
- The present invention relates generally to marking multimedia files. More specifically, the present invention relates to applying or inserting tags into multimedia files for indexing and searching, as well as for editing portions of multimedia files, all to facilitate the storing, searching, and retrieving of the multimedia information.
- 2. Background of the Related Art
- 1. Multimedia Bookmarks
- With the phenomenal growth of the Internet, the amount of multimedia content that can be accessed by the public has virtually exploded. There are occasions where a user who once accessed particular multimedia content needs or desires to access the content again at a later time, possibly at or from a different place. For example, in the case of data interruption due to a poor network condition, the user may be required to access the content again. In another case, a user who once viewed multimedia content at work may want to continue to view the content at home. Most users would want to restart accessing the content from the point where they had left off. Moreover, subsequent access may be initiated by a different user in an exchange of information between users. Unfortunately, multimedia content is represented in a streaming file format so that a user has to view the file from the beginning in order to look for the exact point where the first user left off.
- In order to save the time involved in browsing the data from the beginning, the concept of a bookmark may be used. A conventional bookmark marks a document such as a static web page for later retrieval by saving a link (address) to the document. For example, Internet browsers support a bookmark facility by saving an address called a Uniform Resource Identifier (URI) to a particular file. Internet Explorer, manufactured by the Microsoft Corporation of Redmond, Washington, uses the term “favorite” to describe a similar concept.
- Conventional bookmarks, however, store only the information related to the location of a file, such as the directory name with a file name, a Universal Resource Locator (URL), or the URI. The files referred to by conventional bookmarks are treated in the same way regardless of the data formats for storing the content. Typically, a simple link is used for multimedia content also. For example, to link to a multimedia content file through the Internet, a URI is used. Each time the file is revisited using the bookmark, the multimedia content associated with the bookmark is always played from the beginning.
- FIG. 1 illustrates a list 108 of conventional bookmarks 110, each comprising positional information 112 and a title 114. The positional information 112 of a conventional bookmark is composed of a URI as well as a bookmarked position 106. The bookmarked position is a relative time or byte position measured from the beginning of the multimedia content. The title 114 can be specified by a user, as well as delivered with the content, and it is typically used to help the user easily recognize the bookmarked URI in a bookmark list 108. For a conventional bookmark without a bookmarked position, when a user wants to replay the specified multimedia file, the file is played from the beginning each time, regardless of how much of the file the user has already viewed. The user has no choice but to record the last-accessed position in a memo and to move manually to the last stopped point. If the multimedia file is viewed by streaming, the user must go through a series of buffering steps to find the last-accessed position, thus wasting much time. Even for a conventional bookmark with a bookmarked position, the same problem occurs when the multimedia content is delivered in a live broadcast, since the bookmarked position within the multimedia content is not usually available, as well as when the user wants to replay one of the variations of the bookmarked multimedia content.
- Further, conventional bookmarks do not provide a convenient way of switching between different data formats. Multimedia content may be generated and stored in a variety of formats. For example, video may be stored in formats such as MPEG, ASF, RM, MOV, and AVI. Audio may be stored in formats such as MID, MP3, and WAV. There may be occasions where a user wants to switch the play of content from one format to another. Since different data formats produced from the same multimedia content are often encoded independently, the same segment is stored at different temporal positions within the different formats. Since conventional bookmarks have no facility to store any content information, users have no choice but to review the multimedia content from the beginning and to search manually for the last-accessed segment within the content.
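The contrast drawn above can be made concrete with two record types: a conventional bookmark holding only a location and title, and a multimedia bookmark that additionally carries a media position and content identity so playback can resume in any variation of the same content. Field names are illustrative assumptions, not the disclosure's actual format:

```python
from dataclasses import dataclass, field

@dataclass
class ConventionalBookmark:
    """Stores only a location and a display title."""
    uri: str
    title: str

@dataclass
class MultimediaBookmark:
    """Adds a media-time position and content identity on top of the URI."""
    uri: str
    title: str
    position_seconds: float           # last-accessed point in media time
    content_id: str = ""              # identifies the source content, not one file
    variations: dict = field(default_factory=dict)  # format -> URI of that variation

def resume_uri(bm: MultimediaBookmark, preferred_format: str) -> str:
    """Pick the URI of the preferred variation, falling back to the original."""
    return bm.variations.get(preferred_format, bm.uri)
```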
- Time information may be incorporated into a bookmark to return to the last-accessed segment within the multimedia content. The use of time information only, however, fails to return to exactly the same segment at a later time for the following reasons. If a bookmark incorporating time information was used to save the last-accessed segment during the preview of multimedia content broadcast, the bookmark information would not be valid during a regular full-version broadcast, so as to return to the last-accessed segment. Similarly, if a bookmark incorporating time information was used to save the last-accessed segment during real-time broadcast, the bookmark would not be effective during later access because the later available version may have been edited or a time code was not available during the real-time broadcast.
- In many video and audio archiving systems, several differently compressed files called “variations” can be produced from a single source multimedia content. Many web-casting sites provide multiple streaming files for a single video content with different bandwidths according to each video format. For example, CNN.com provides five different streaming videos for a single video content: two different types of streaming videos with bandwidths of 28.8 kbps and 80 kbps, both encoded in Microsoft's Advanced Streaming Format (ASF). CNN.com also provides the RM streaming format by RealNetworks, Inc. of Seattle, Wash., and a streaming video with smart bandwidth encoded in Apple Computer, Inc.'s QuickTime streaming format (MOV). In this case, the five video files may start and end at different time points from the viewpoint of the source video content, since each variation may be produced by an independent encoding process varying the values chosen for encoding formats, bandwidths, resolutions, etc. This results in mismatches of time points, because a specific time point of the source video content may be presented as different media time points in the five video files.
- When a multimedia bookmark is utilized, the mismatches of positions cause a problem of mis-positioned playback. Consider a simple case where one makes a multimedia bookmark on a master file of a multimedia content (for example, video encoded in a given format), and tries to play another variation (for example, video encoded in a different format) from the bookmarked position. If the two variations do not start at the same position of the source content, the playback will not start at the bookmarked position. That is, the playback will start at the position that is temporally shifted with the difference between the start positions of the two variations.
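The temporal shift described above can be corrected when each variation's start offset relative to the source content is known: map the bookmarked media time back to source time, then into the other variation's media time. A hedged sketch (the names and the clamping-to-zero behavior are assumptions):

```python
def adjust_bookmark_position(position_in_a, start_offset_a, start_offset_b):
    """Map a bookmarked media time in variation A to the corresponding
    media time in variation B.

    `start_offset_*` is where each variation begins relative to the
    source content; all values are in seconds.
    """
    source_time = start_offset_a + position_in_a
    # Clamp: if B starts after the bookmarked point, play B from its start.
    return max(0.0, source_time - start_offset_b)
```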
- The entire multimedia presentation is often lengthy. However, there are frequent occasions when the presentation is interrupted, voluntarily or forcibly, and terminates before finishing. Examples include a user who starts playing a video at work, leaves the office, and desires to continue watching the video at home, or a user who is forced to stop watching the video and log out due to a system shutdown. It is thus necessary to save the termination position of the multimedia file into persistent storage in order to return directly to the point of termination without a time-consuming playback of the multimedia file from the beginning.
- The interrupted presentation of the multimedia file will usually resume exactly at the previously saved terminated position. However, in some cases, it is desirable to begin the playback of the multimedia file a certain time before the terminated point, since such rewinding could help refresh the user's memory.
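The rewind behavior above reduces to one line: resume slightly before the saved termination point, clamped to the start of the file. A minimal sketch with an assumed 5-second default margin (the disclosure does not fix a value):

```python
def resume_position(saved_seconds, rewind_seconds=5.0):
    """Resume playback slightly before the saved termination point so the
    user can recover the context; never rewind past the start."""
    return max(0.0, saved_seconds - rewind_seconds)
```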
- In the prior art, the EPG (Electronic Program Guide) has played a crucial role as a provider of TV programming information. EPG facilitates a user's efforts to search for TV programs that he or she wants to view. However, EPG's two-dimensional presentation (channels vs. time slots) becomes cumbersome as terrestrial, cable, and satellite systems send out thousands of programs through hundreds of channels. Navigation through a large table of rows and columns in order to search for desired programs is frustrating.
- One of the features provided by the recent set-top box (STB) is the personal video recording (PVR) that allows simultaneous recording and playback. Such STB usually contains digital video encoder/decoder based on an international digital video compression standard such as MPEG-1/2, as well as the large local storage for the digitally compressed video data. Some of the recent STBs also allow connection to the Internet. Thus, STB users can experience new services such as time-shifting and web-enhanced television (TV).
- However, there still exist some problems for the PVR-enabled STBs. The first problem is that even the latest STBs alone cannot fully satisfy users' ever-increasing desire for diverse functionalities. The STBs now on the market are very limited in terms of computing and memory and so it is not easy to execute most CPU and memory intensive applications. For example, the people who are bored with plain playback of the recorded video may desire more advanced features such as video browsing/summary and search. Actually, all of those features require metadata for the recorded video. The metadata are usually the data describing content, such as the title, genre and summary of a television program. The metadata also include audiovisual characteristic data such as raw image data corresponding to a specific frame of the video stream. Some of the description is structured around “segments” that represent spatial, temporal or spatio-temporal components of the audio-visual content. In the case of video content, the segment may be a single frame, a single shot consisting of successive frames, or a group of several successive shots. Each segment may be described by some elementary semantic information using texts. The segment is referenced by the metadata using media locators such as frame number or time codes. However, the generation of such video metadata usually requires intensive computation and a human operator's help, so practically speaking, it is not feasible to generate the metadata in the current STB. Thus, one possible solution for this problem is to generate the metadata in the server connected to the STB and to deliver it to the STB via network. However, in this scenario, it is essential to know the start position of recorded video with respect to the video stream used to generate the metadata in the server/content provider in order to match the temporal position referenced by the metadata to the position of the recorded video.
- The second problem is related to discrepancy between the two time instants: the time instant at which the STB starts the recording of the user-requested TV program, and the time instant at which the TV program is actually broadcast. Suppose, for instance, that a user initiated PVR request for a TV program scheduled to go on the air at 11:30 AM, but the actual broadcasting time is 11:31 AM. In this case, when the user wants to play the recorded program, the user has to watch the unwanted segment at the beginning of the recorded video, which lasts for one minute. This time mismatch could bring some inconvenience to the user who wants to view only the requested program. However, the time mismatch problem can be solved by using metadata delivered from the server, for example, reference frames/segment representing the beginning of the TV program. The exact location of the TV program, then, can be easily found by simply matching the reference frames with all the recorded frames for the program.
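The frame-matching idea above can be sketched with per-frame signatures (here, simple scalars such as DC averages; a real system would use richer audiovisual features). The recorded signatures are scanned for the position where the server-supplied reference signatures best align, which locates the true start of the TV program inside the recording:

```python
def locate_program_start(recorded, reference, tolerance=2.0):
    """Return the index in `recorded` where `reference` best aligns,
    or -1 if no window matches within the per-frame `tolerance`.

    `recorded` and `reference` are lists of per-frame signature values.
    """
    best_index, best_error = -1, float("inf")
    for i in range(len(recorded) - len(reference) + 1):
        error = sum(abs(recorded[i + j] - reference[j]) for j in range(len(reference)))
        if error < best_error:
            best_index, best_error = i, error
    return best_index if best_error <= tolerance * len(reference) else -1
```

Frames before the located index (e.g., the stray minute recorded before the actual broadcast) can then be skipped or trimmed.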
- 2. Search
- The rapid expansion of the World Wide Web (WWW) and mobile communications has also brought great interest in efficient multimedia data search, browsing and management. Content-based image retrieval (CBIR) is a powerful concept for finding images based on image contents, and content-based image search and browsing have been tested using many CBIR systems. See, M. Flickner, Harpreet Sawhney, Wayne Niblack, Jonathan Ashley, Q. Huang, Byron Dom, Monika Gorkani, Jim Hafner, Denis Lee, Dragutin Petkovic, David Steele and Peter Yanker, “Query by image and video content: The QBIC system,” IEEE Computer, Vol. 28, No. 9, pp. 23-32, September, 1995; Carson, Chad et al., “Region-Based Image Querying [Blobworld],” Workshop on Content-Based Access of Image and Video Libraries, Puerto Rico, June, 1997; J. R. Smith and S. Chang, “Visually searching the web for content,” IEEE Multimedia Magazine, Vol. 4, No. 3, pp. 12-20, Summer 1997, also Columbia U. CU/CTR Technical Report 459-96-25; A. Pentland, R. W. Picard and S. Sclaroff, “Photobook: tools for content-based manipulation of image databases,” in Proc. of SPIE Conf. on Storage and Retrieval for Image and Video Databases-II, No. 2185, pp. 34-47, San Jose, Calif., February, 1994; J. R. Bach, C. Fuller, A. Gupta, A. Hampapur, B. Horowitz, R. Humphrey, R. C. Jain and C. Shu, “Virage image search engine: an open framework for image management,” Symposium on Electronic Imaging: Science and Technology—Storage & Retrieval for Image and Video Databases IV, IS&T/SPIE '96, February, 1996; J. R. Smith and S. Chang, “VisualSEEk: A Fully Automated Content-Based Image Query System,” ACM Multimedia Conference, Boston, Mass., November, 1996; Jing Huang, S. Ravi Kumar, Mandar Mitra, Wei-Jing Zhu and Ramin Zabih, “Image Indexing Using Color Correlograms,” in IEEE Conference on Computer Vision and Pattern Recognition, pp. 762-768, June, 1997; and Simone Santini and Ramesh Jain, “The ‘El Nino’ Image Database System,” in International Conference on Multimedia Computing and Systems, pp. 524-529, June, 1999.
- Currently, most content-based image search engines rely on low-level image features such as color, texture and shape. While high-level image descriptors are potentially more intuitive for common users, the derivation of high-level descriptors is still in its experimental stages in the field of computer vision and requires complex vision processing. Despite their efficiency and ease of implementation, on the other hand, the main disadvantage of low-level image features is that they are perceptually non-intuitive for both expert and non-expert users and therefore do not normally represent users' intent effectively. Furthermore, they are highly sensitive to small amounts of image variation in feature shape, size, position, orientation, brightness and color.
Perceptually similar images are often highly dissimilar in terms of low-level image features. Searches made by low-level features are often unsuccessful and it usually takes many trials to find images satisfactory to a user.
- Efforts have been made to overcome the limitations of low-level features. Relevance feedback is a popular idea for incorporating user's perceptual feedback in the image search. See, Y. Rui, T. Huang, and S. Mehrota, “A relevance feedback architecture in content-based multimedia information retrieval systems,” in IEEE Workshop on Content-based Access of Image and Video Libraries, Puerto Rico, pp. 82-89, June, 1997; Yong Rui, Thomas S. Huang, Michael Ortega, and Sharad Mehrotra, “Relevance Feedback: A Power Tool in Interactive Content-Based Image Retrieval,” in IEEE Tran on Circuits and Systems for Video Technology, Special Issue on Segmentation, Description, and Retrieval of Video Content, pp. 644-655, Vol. 8, No. 5, September, 1998; G. Aggarwal, P. Dubey, S. Ghosal, A. Kulshreshtha, and A. Sarkar, “iPURE: perceptual and user-friendly retrieval of images,” in Proc. of IEEE International Conference on Multimedia and Exposition, Vol. 2, pp. 693-696, July, 2000; Ye Lu, Chunhui Hu, Xingquan Zhu, HongJiang Zhang and Qiang Yang, “A unified framework for semantics and feature based relevance feedback in image retrieval systems,” in Proc. of ACM International Conference on Multimedia, pp. 31-37, October , 2000; H. Muller, W. Muller, S. Marchand-Maillet, and T. Pun, “Strategies for positive and negative relevance feedback in image retrieval,” in Proc. of IEEE Conference on Pattern Recognition, Vol. 1, pp. 1043-1046, September, 2000; S. Aksoy, R. M. Haralick, F. A. Cheikh, and M. Gabbouj, “A weighted distance approach to relevance feedback,” in Proc. of IEEE Conference on Pattern Recognition, Vol. 4, pp. 812-815, September, 2000.; I. J. Cox, M. L. Miller, T. P. Minka, T. V. Papathomas, and P. N. Yianilos, “The Bayesian image retrieval system, PicHunter:theory, implementation, and psychophysical experiments,” in IEEE Transaction on Image Processing, Vol. 9, pp. 20-37, January, 2000; P. 
Muneesawang, and Guan Ling, “Multi-resolution-histogram indexing and relevance feedback learning for image retrieval,” in Proc. of IEEE International Conference on Image Processing, Vol. 2, pp. 526-529, January, 2001. A user can manually establish relevance between a query and retrieved images, and the relevant images can be used for refining the query. When the refinement is made by adjusting a set of low-level feature weights, however, the user's intent is still represented by low-level features, and their basic limitations remain.
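The weight-adjustment style of refinement described above can be sketched as follows. A common heuristic in the relevance-feedback literature weights each feature dimension inversely to its spread among the user-marked relevant images; the function names, the normalization, and the small epsilon guard below are illustrative, not taken from any cited system.

```python
import math

def refine_weights(relevant_feats):
    """Re-estimate per-dimension weights from user-marked relevant images.

    A dimension on which the relevant examples agree (low spread) gets a
    higher weight; weights are normalised to sum to 1.
    """
    dims = len(relevant_feats[0])
    weights = []
    for d in range(dims):
        vals = [f[d] for f in relevant_feats]
        mean = sum(vals) / len(vals)
        std = math.sqrt(sum((v - mean) ** 2 for v in vals) / len(vals))
        weights.append(1.0 / (std + 1e-6))  # epsilon avoids division by zero
    total = sum(weights)
    return [w / total for w in weights]

def weighted_distance(a, b, w):
    """Weighted Euclidean distance used for the refined query."""
    return math.sqrt(sum(wi * (ai - bi) ** 2
                         for ai, bi, wi in zip(a, b, w)))
```

For example, if the user marks three images whose first feature values nearly coincide while the second feature varies widely, the first dimension receives most of the weight in subsequent searches.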
- Several approaches have been made to the integration of human perceptual responses and low-level features in image retrieval. One notable approach is to adjust an image feature's distance attributes based on the human perceptual input. See, Simone Santini, and Ramesh Jain, “The ‘El Nino’ Image Database System,” in International Conference on Multimedia Computing and Systems, pp. 524-529, June, 1999. Another approach, called “blob world,” combines low-level features to derive slightly higher-level descriptions and presents the “blobs” of grouped features to a user to provide a better understanding of feature characteristics. See, Carson, Chad, et al., “Region-Based Image Querying [Blobworld],” Workshop on Content-Based Access of Image and Video Libraries, Puerto Rico, June, 1997. While those schemes successfully reflect a user's intent to some degree, it remains to be seen how grouping of features or feature-distance modification can achieve perceptual relevance in image retrieval. A more traditional computer-vision approach, deriving high-level object descriptors via generic object recognition, has also been presented for image retrieval. See, David A. Forsyth and Margaret Fleck, “Body Plans,” in IEEE Conference on Computer Vision and Pattern Recognition, pp. 678-683, June, 1997. Due to its limited feasibility for general image objects and its complex processing, its utility remains restricted.
- With the rapid proliferation of large image/video databases, there has been an increasing demand for effective methods to automatically search these databases by their content. Given a query image/video clip from a user, these methods search the databases for the images/videos that are most similar to the query. In other words, the goal of the image/video search is to find the best matches to the query image/video in the database.
- Several approaches have been made towards the development of fast, effective multimedia search methods. Milanese et al. utilized hierarchical clustering to organize an image database into visually similar groupings. See, R. Milanese, D. Squire, and T. Pun, “Correspondence analysis and hierarchical indexing for content-based image retrieval,” in Proc. IEEE Int. Conf. Image Processing, Vol. 3, Lausanne, Switzerland, pp. 859-862, September, 1996. Zhang and Zhong provided a hierarchical self-organizing map (HSOM) method to organize an image database into a two-dimensional grid. See, H. J. Zhang and D. Zhong, “A scheme for visual feature based image indexing,” in Proc. SPIE/IS&T Conf. Storage Retrieval Image Video Database III, Vol. 2420, pp. 36-46, San Jose, Calif., February, 1995. However, a weakness of HSOM is that it is generally too computationally expensive to apply to a large multimedia database.
- In addition, there are other well-known solutions using Voronoi diagrams, Kd-trees, and R-trees. See, J. Bentley, “Multidimensional binary search trees used for associative searching,” Comm. of the ACM, Vol. 18, No. 9, pp. 509-517, 1975; S. Brin, “Near neighbor search in large metric spaces,” in Proc. 21st Conf. on Very Large Databases (VLDB'95), Zurich, Switzerland, pp. 574-584, 1995. However, it is also known that those approaches are not adequate for high-dimensional feature vector spaces, and thus they are useful only in low-dimensional feature spaces.
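For the low-dimensional feature spaces where such tree indexes do work well, a Kd-tree nearest-neighbour search can be sketched as below. This is the generic textbook construction, not the indexing scheme of any cited reference; the dict-based node layout is an illustrative choice.

```python
import math

def build_kdtree(points, depth=0):
    """Build a Kd-tree by cycling the split axis and splitting at the median."""
    if not points:
        return None
    axis = depth % len(points[0])
    points = sorted(points, key=lambda p: p[axis])
    mid = len(points) // 2
    return {
        "point": points[mid],
        "axis": axis,
        "left": build_kdtree(points[:mid], depth + 1),
        "right": build_kdtree(points[mid + 1:], depth + 1),
    }

def nearest(node, query, best=None):
    """Recursive nearest-neighbour search with hypersphere pruning."""
    if node is None:
        return best
    if best is None or math.dist(node["point"], query) < math.dist(best, query):
        best = node["point"]
    diff = query[node["axis"]] - node["point"][node["axis"]]
    near, far = ((node["left"], node["right"]) if diff < 0
                 else (node["right"], node["left"]))
    best = nearest(near, query, best)
    # Only descend the far side if the query's hypersphere crosses the plane.
    if abs(diff) < math.dist(best, query):
        best = nearest(far, query, best)
    return best
```

In high dimensions the pruning test almost never fires, which is exactly why, as noted above, these structures degrade to near-linear scans on high-dimensional feature vectors.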
- Peer to Peer Searching
- Peer-to-Peer (P2P) is a class of applications that makes the most of previously unused resources (for example, storage, content, and/or CPU cycles) available on the peers at the edges of networks. P2P computing allows peers to share resources and services, aggregate CPU cycles, or chat with each other by direct exchange. Two of the more popular implementations of P2P computing are Napster and Gnutella. Napster has its peers register files with a broker and uses the broker to search for files to copy; the broker plays the role of the server in a client-server model, facilitating the interaction between the peers. Gnutella has peers register files with network neighbors and searches the P2P network for files to copy. Since this model does not require a centralized broker, Gnutella is considered to be a true P2P system.
- 3. Editing
- In the prior art, video files were edited through video editing software by copying several segments of the input videos and pasting them into an output video. The prior art method, however, confronts the two major problems described below.
- The first problem with the prior art method is that it requires additional storage to hold the new version of an edited video file. Conventional video editing software generally uses the original input video file to create an edited video. In most cases, editors having a large database of videos attempt to edit those videos to create a new one; storage is then wasted on duplicated portions of the video. The second problem with the prior art method is that entirely new metadata must be generated for a newly created video. If the metadata are not edited in accordance with the editing of the video, then even if metadata for the specific segment of the input video have already been constructed, the metadata may not accurately reflect the content. Because considerable effort is required to create video metadata, it is desirable to efficiently reuse existing metadata whenever possible.
- Metadata of a video segment contain textual information such as time information (for example, starting frame number and duration, or starting frame number together with finishing frame number), title, keywords, and annotation, as well as image information such as the key frame of the segment. The metadata of segments can form a hierarchical structure in which larger segments contain smaller ones. Because it is hard to store both a video and its metadata in a single file, video metadata are stored separately as a metafile, or stored in a database management system (DBMS).
- If metadata having a hierarchical structure are used, browsing a whole video, searching for a segment using the keywords and annotation of each segment, and using the key frames of each segment as a visual summary of the video are all supported. Hierarchical metadata also support not only conventional simple playback, but also the playback and repeated playback of a specific segment. Therefore, the use of hierarchically structured metadata is becoming popular.
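The hierarchical segment metadata and keyword search described above might be modeled as follows; the field names (`title`, `start`, `duration`, `keywords`, `annotation`, `children`) are an illustrative schema, not one taken from the prior art.

```python
def find_segments(segment, keyword):
    """Depth-first search of a hierarchical metadata tree for segments
    whose keywords or annotation mention `keyword`."""
    hits = []
    if (keyword in segment.get("keywords", [])
            or keyword in segment.get("annotation", "")):
        hits.append(segment)
    for child in segment.get("children", []):
        hits.extend(find_segments(child, keyword))
    return hits

# Illustrative metafile content: a whole video whose children are segments.
video_meta = {
    "title": "News at 9", "start": 0, "duration": 1800,
    "children": [
        {"title": "Headlines", "start": 0, "duration": 120,
         "keywords": ["headlines"], "children": []},
        {"title": "Sports", "start": 120, "duration": 600,
         "keywords": ["sports", "baseball"], "annotation": "league recap",
         "children": []},
    ],
}
```

A browser can walk the same tree to collect each segment's key frame for a visual summary, and a player can use a matched segment's `start`/`duration` to play or repeat just that segment.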
- 4. Transcoding
- With the advance of information technology, such as the popularity of the Internet, multimedia presentation proliferates into ever-increasing kinds of media, including wireless media. Multimedia data are accessed by ever-increasing kinds of devices, such as hand-held computers (HHCs), personal digital assistants (PDAs), and smart cellular phones. There is a need for accessing multimedia content in a universal fashion from a wide variety of devices. See, J. R. Smith, R. Mohan and C. Li, “Transcoding Internet Content for Heterogeneous Client Devices,” in Proc. ISCASA, Monterey, Calif., 1998.
- Several approaches have been made to effectively enable such universal multimedia access (UMA). One data representation, the InfoPyramid, is a framework for aggregating the individual components of multimedia content with content descriptions, along with methods and rules for handling the content and content descriptions. See, C. Li, R. Mohan and J. R. Smith, “Multimedia Content Description in the InfoPyramid,” in Proc. IEEE Intern. Conf. on Acoustics, Speech and Signal Processing, May, 1998. The InfoPyramid describes content in different modalities, at different resolutions, and at multiple abstractions. A transcoding tool then dynamically selects from the InfoPyramid the resolutions or modalities that best meet the client capabilities. J. R. Smith proposed a notion of an importance value for each of the regions of an image as a hint to reduce the overall data size in bits of the transcoded image. See, J. R. Smith, R. Mohan and C. Li, “Content-based Transcoding of Images in the Internet,” in Proc. IEEE Intern. Conf. on Image Processing, October, 1998; S. Paek and J. R. Smith, “Detecting Image Purpose in World-Wide Web Documents,” in Proc. SPIE/IS&T Photonics West, Document Recognition, January, 1998. The importance value describes the relative importance of a region/block in the image presentation compared with the other regions. This value ranges from 0 to 1, where 1 denotes the most important region and 0 the least. For example, the regions of high importance are compressed with a lower compression factor than the remaining part of the image; the other parts of the image are first blurred and then compressed with a higher compression factor in order to reduce the overall data size of the compressed image.
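The importance-value hint can be illustrated with a toy mapping from a region's importance to a JPEG-style quality factor, so that important regions receive a lower compression factor (higher quality). The quality bounds below are arbitrary illustrative defaults, not values from the cited work.

```python
def quality_for_region(importance, q_min=30, q_max=90):
    """Map an importance value in [0, 1] to a quality factor.

    importance == 1.0 yields the highest quality (least compression),
    importance == 0.0 the lowest quality (most compression).
    """
    if not 0.0 <= importance <= 1.0:
        raise ValueError("importance must lie in [0, 1]")
    return round(q_min + importance * (q_max - q_min))
```

A transcoder would then encode each region/block with its own quality factor, optionally blurring the low-importance regions first, as the cited work describes.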
- When an image is transmitted to a variety of client devices with different display sizes, a scaling mechanism, such as format/resolution change, bit-wise data size reduction, or object dropping, is needed. More specifically, the system should generate a transcoded (e.g., scaled and cropped) image to fit the size of the respective client display. The extent of transcoding depends on the type of objects embedded in the image, such as cars, bridges, faces, and so forth. Consider, for example, an image containing embedded text or a human face. If the display size of a client device is smaller than the size of the image, the spatial resolution of the image must be reduced by sub-sampling and/or cropping to fit the client display. In such a case, users often have difficulty recognizing the text or the human face due to the excessive resolution reduction. Although the importance value may be used to indicate which part of the image can be cropped, it does not provide a quantified measure of perceptibility indicating the degree of allowable transcoding. For example, the prior art does not provide quantitative information on the allowable compression factor with which the important regions can be compressed while preserving the minimum fidelity that an author or a publisher intended. The InfoPyramid neither provides quantitative information about how much the spatial resolution of the image can be reduced, nor ensures that the user will perceive the transcoded image as the author or publisher originally intended.
- 5. Visual Rhythm
- Fast Construction of Visual Rhythm
- Once a digital video is indexed, more manageable and efficient forms of retrieval that facilitate storage and retrieval may be developed based on the index. Generally, the first step in indexing and retrieving visual data is to temporally segment the input video, that is, to find the shot boundaries caused by camera shot transitions. The temporally segmented shots can improve the storing and retrieving of visual data if keywords for the shots are also available. Therefore, a fast and accurate automatic shot detector needs to be developed, as well as an automatic text-caption detector to automatically annotate the temporally segmented shots with keywords.
- While abrupt scene changes are relatively easy to detect, it is more difficult to identify special effects, such as dissolves and wipes. Unfortunately, these special effects are normally used to stress the importance of a scene change (from a content point of view), so they are extremely relevant and therefore should not be missed. Wipe detection, however, has been discussed far less than dissolve detection. For scene change detection, a matching process between two consecutive frames is required. In order to segment a video sequence into shots, a dissimilarity measure between two frames must be defined; this measure must return a high value only when the two frames fall in different shots. Several researchers have used dissimilarity measures based on luminance or color histograms, correlograms, or other visual features to match two frames. However, these approaches usually produce many false alarms, and it is very hard for humans to exactly locate the various types of shot transitions (especially dissolves and wipes) in a given video, even when the dissimilarity measure between frames is plotted, for example in a 1-D graph where the horizontal axis represents the time of the video sequence and the vertical axis represents the dissimilarity values between the histograms of the frames along time. These approaches also require a high computational load to handle the different shapes, directions, and patterns of the various wipe effects. Therefore, it is important to develop a tool that enables a human operator to efficiently verify the results of automatic shot detection, where there may be many falsely detected and missed shots. Visual rhythm satisfies many of these conditions. 
Visual rhythm contains distinctive patterns or visual features for many types of video editing effects; in particular, all wipe-like effects manifest as visually distinguishable lines or curves on the visual rhythm, which can be constructed with very little computation. This enables a human to easily verify automatically detected shots, minimizing or possibly eliminating false as well as missed shots, without actually playing the whole individual frame sequence. Visual rhythm also contains visual features readily usable to detect caption text. See, H. Kim, J. Lee and S. M. Song, “An efficient graphical shot verifier incorporating visual rhythm,” in Proceedings of IEEE International Conference on Multimedia Computing and Systems, pp. 827-834, June, 1999.
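A visual rhythm can be constructed, for example, by sampling the main diagonal of every frame and stacking the samples as columns over time: wipes then trace slanted lines and cuts appear as vertical discontinuities. The sketch below assumes grayscale frames given as nested lists and shows only one of several possible sampling strategies.

```python
def visual_rhythm(frames):
    """Build a visual-rhythm image: column t of the output is the
    main-diagonal sample of frame t.

    `frames` is a list of HxW grayscale frames (lists of row lists).
    """
    columns = []
    for frame in frames:
        h, w = len(frame), len(frame[0])
        n = min(h, w)
        # Sample pixels along the diagonal, scaled to the frame size.
        columns.append([frame[i * h // n][i * w // n] for i in range(n)])
    # Transpose so rows are diagonal positions and columns are time.
    return [list(row) for row in zip(*columns)]
```

Because only one line of pixels per frame is touched, the whole rhythm can be computed in a single fast pass over the video, which is what makes the interactive verification described above practical.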
- Detecting Text in Video and Graphic Images
- As contents become readily available on wide-area networks such as the Internet, archiving, searching, indexing, and locating desired content in large volumes of multimedia containing image and video, in addition to text information, will become even more difficult. One important source of information about image and video is the text contained therein. A video can be easily indexed if access to this textual information content is available. Text provides clear semantics for a video and is extremely useful in deducing its contents.
- There are many methods for segmenting and recognizing text in printed documents. Current video research tackles the text-caption recognition problem as a series of sub-problems: (a) identify the existence and location of text captions against complex backgrounds; (b) segment the text regions; and (c) post-process the text regions for recognition using a standard OCR. Most current research focuses on tackling sub-problems (a) and (b) in the raw spatial domain, with a few methods that can be extended to compressed-domain processing.
- A large number of methods for detecting text frames in uncompressed images and video have been studied extensively in recent years. Ohya et al. performed character extraction through local thresholding and detected character candidate regions by evaluating gray-level differences between adjacent regions. See, J. Ohya, A. Shio and S. Akamatsu, “Recognizing Characters in Scene Images,” in IEEE Trans. on Pattern Analysis and Machine Intelligence, Vol. 16, pp. 214-224. Hauptmann and Smith used the spatial context of text and the high contrast of text regions in scene images to merge large numbers of horizontal and vertical edges in spatial proximity to detect text. See, A. Hauptmann, M. Smith, “Text, Speech, and Vision for Video Segmentation: The Informedia Project,” in AAAI Symposium on Computational Models for Integrating Language and Vision, 1995. Shim et al. introduced a generalized region-labeling algorithm to find homogeneous regions for text extraction. See, J. Shim, C. Dorai and M. Smith, “Automatic Text Extraction from Video for Content-Based Annotation and Retrieval,” in Proc. ICPR, pp. 618-620, 1998. Manmatha presented an algorithm to detect and segment text as regions of distinctive texture, using a pyramid technique to handle text fonts of different sizes. See, W. Manmatha, “Finding Text in Images,” in Proc. of ACM Int'l Conf. on Digital Libraries, pp. 3-12. Lienhart and Stuber provided a split-and-merge algorithm based on characteristics of artificial text to segment text. See, R. Lienhart, “Automatic Text Recognition for Video Indexing,” in Proc. of ACM MM, pp. 11-20. Doermann and Kia used wavelet analysis and employed a multi-frame coherence approach to cluster edges into rectangular shapes. See, L. Doermann, O. Kia, “Automatic Text Detection and Tracking in Digital Video,” in IEEE Trans. on Image Processing, Vol. 9, pp. 147-156. Sato et al. adopted a multi-frame integration technique to separate static text from moving backgrounds. See, T. Sato, T. Kanade and S. Satoh, “Video OCR: Indexing Digital News Libraries by Recognition of Superimposed Captions,” in Multimedia Systems, Vol. 7, pp. 385-394.
- Finally, several compressed-domain methods have also been proposed to detect text regions. Yeo and Liu proposed a method for the detection of text-caption events in video by modified scene-change detection, which cannot handle captions that gradually enter or disappear from frames. See, B. L. Yeo, “Visual Content Highlighting via Automatic Extraction of Embedded Captions on MPEG Compressed Video,” in SPIE/IS&T Symp. on Electronic Imaging Science and Technology, Vol. 2668, 1996. Zhong et al. examined the horizontal variations of DCT AC values to locate text frames and examined the vertical intensity variation within the text regions to extract the final text regions. See, Y. Zhong, K. Karu and A. Jain, “Automatic caption localization in compressed video,” in IEEE Trans. on PAMI, 22(4), pp. 385-392. Zhong derived a binarized gradient-energy representation directly from DCT coefficients, subject to constraints on text properties and temporal coherence, to locate text. See, Y. Zhong, “Detection of text captions in compressed domain video,” in Proc. of Multimedia Information Retrieval Workshop, ACM Multimedia 2000, pp. 201-204, November, 2000. However, most of the compressed-domain methods restrict the detection of text to the I-frames of a video, because it is time-consuming to obtain the DCT AC values for inter-frame coded frames.
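The spatial-domain detectors surveyed above largely share one observation: caption text produces dense horizontal intensity transitions. A toy row-level version of that heuristic might look as follows; both thresholds are illustrative, and a real detector would work on blocks and apply the temporal-coherence constraints mentioned above.

```python
def text_row_candidates(img, grad_thresh=40, density_thresh=0.3):
    """Flag rows of a grayscale image that look like caption text.

    A row is a candidate when the fraction of adjacent-pixel pairs with a
    strong horizontal intensity transition exceeds `density_thresh`.
    """
    candidates = []
    for y, row in enumerate(img):
        edges = sum(1 for x in range(len(row) - 1)
                    if abs(row[x + 1] - row[x]) > grad_thresh)
        if edges / (len(row) - 1) >= density_thresh:
            candidates.append(y)
    return candidates
```

Grouping consecutive candidate rows would then yield the rectangular text regions that sub-problem (b) segments for OCR.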
- There is, therefore, a need in the art for a method and system that will enable the tagging of multimedia images for indexing, editing, searching, and retrieving. There is also a need in the art to enable the indexing of textual information that is embedded in graphical images or other multimedia data, so that the text in the image can be tagged, indexed, searched, and retrieved just as other textual information is. Further, there is a need in the art for editing multimedia data for display, indexing, and searching in ways the prior art does not provide.
- The invention overcomes the above-identified problems, as well as other shortcomings and deficiencies of existing technologies, by providing the following.
- 1. Multimedia Bookmark
- The present invention provides a system and method for accessing multimedia content stored in a multimedia file having a beginning and an intermediate point, the content having at least one segment at the intermediate point. At a minimum, the system includes a multimedia bookmark, the multimedia bookmark having content information about the segment at the intermediate point, wherein a user can utilize the multimedia bookmark to access the segment without accessing the beginning of the multimedia file.
- The system of the present invention can include a wide area network such as the Internet. Moreover, the method of the present invention can facilitate the creating, storing, indexing, searching, retrieving and rendering of multimedia content on any device capable of connecting to the network and performing one or more of the aforementioned functions. The multimedia content can be one or more frames of video, audio data, text data such as a string of characters, or any combination or permutation thereof.
- The system of the present invention includes a search mechanism that locates a segment in the multimedia file. An access mechanism is included in the system that reads the multimedia content at the segment designated by the multimedia bookmark. The multimedia content can be partial data that are related to a particular segment.
- The multimedia bookmark used in conjunction with the system of the present invention includes positional information about the segment. The positional information can be a URI, an elapsed time, a time code, or other information. While the multimedia file used in conjunction with the system of the present invention can be contained on local storage, it can also be stored at remote locations.
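A multimedia bookmark combining positional and content information, as described above, might be modeled as follows; the field names and the string returned by `resume` are illustrative, not the patent's actual data format.

```python
from dataclasses import dataclass

@dataclass
class MultimediaBookmark:
    """Minimal sketch of a multimedia bookmark: positional information
    (file URI plus elapsed time) and content information about the
    bookmarked segment."""
    uri: str             # location of the multimedia file (local or remote)
    position_sec: float  # elapsed time of the bookmarked segment
    title: str = ""
    keywords: tuple = ()

def resume(bookmark):
    """Access the segment directly, without reading from the beginning."""
    return f"open {bookmark.uri} and seek to {bookmark.position_sec:.1f}s"
```

The same record could equally carry a time code or byte offset as its positional information; the point is that the player seeks straight to the segment.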
- The system of the present invention can be a computer server that is operably connected to a network that has connected to it one or more client devices. Local storage on the server can optionally include a database and sufficient circuitry and/or logic, in the form of hardware and/or software in any combination that facilitates the storing, indexing, searching, retrieving and/or rendering of multimedia information.
- The present invention further provides a methodology and implementation for adaptive refresh rewinding, as opposed to traditional rewinding, which simply performs a rewind from a particular position by a predetermined length. For simplicity, the exemplary embodiment described below will demonstrate the present invention using video data. Three essential parameters are identified to control the behavior of adaptive refresh rewinding, that is, how far to rewind, how to select certain frames in the rewind interval, and how to present the chosen refresh video frames on a display device.
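The three controls of adaptive refresh rewinding identified above (how far to rewind, which frames in the interval to select, and how to present them) can be illustrated with a toy frame selector. The uniform every-`step`-frames policy is only one possible selection strategy, and all names are illustrative.

```python
def refresh_frames(current_pos, rewind_len, step):
    """Pick refresh frames for adaptive refresh rewinding.

    current_pos: current frame index
    rewind_len:  how far back to rewind (control 1)
    step:        select every step-th frame in the interval (control 2)
    Presentation of the returned frames (control 3) is left to the player.
    """
    start = max(0, current_pos - rewind_len)
    return list(range(start, current_pos, step))
```

A player might show the returned frames as a filmstrip or play them at high speed before resuming normal playback at `current_pos`.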
- The present invention also provides a new way to generate and deliver programming information that is customized to the user's viewing preferences. This embodiment of the present invention removes the navigational difficulties associated with an electronic program guide (EPG). Specifically, data regarding the user's habits of recording, scheduling, and/or accessing TV programs or Internet movies are captured and stored. Over a long period of time, these data can be analyzed to determine trends or patterns that can be used to predict the user's future viewing preferences.
- The present invention also relates to techniques that solve these two problems by downloading the metadata from a distant metadata server and then synchronizing/matching the content with the received metadata. While this invention is described in the context of video content stored on a set-top box (STB) having a personal video recorder (PVR) function, it can be extended to other multimedia content such as audio.
- The present invention also allows the reuse of content prerecorded on analog VCR videotapes. Once the content of a VCR tape is converted into digital video using the PVR function of the STB and is stored on the STB's hard disk, the present invention works equally well.
- The present invention also provides a method for searching for relevant multimedia content based on at least one feature saved in a multimedia bookmark. The method preferably includes transmitting at least one feature saved in a multimedia bookmark from a client system to a server system in response to a user's selection of the multimedia bookmark. The server may then generate a query for each feature received and, subsequently, use each query generated to search one or more storage devices. The search results may be presented to the user upon completion.
- In yet another embodiment, the present invention provides a method for verifying inclusion of attachments to electronic mail messages. The method preferably includes scanning the electronic mail message for at least one indicator of an attachment to be included and determining whether at least one attachment to the electronic mail message is present upon detection of the at least one indicator. In the event an indicator is present but an attachment is not, the method preferably also includes displaying a reminder to a user that no attachment is present.
- In yet another embodiment, the present invention provides a method for searching for multimedia content in a peer to peer environment. The method preferably includes broadcasting a message from a user system to announce its entrance to the peer to peer environment. Active nodes in the peer to peer environment preferably acknowledge receipt of the broadcast message while the user system preferably tracks the active nodes. Upon initiation of a search request at the user system, a query message including multimedia features is preferably broadcast to the peer to peer environment. Upon receipt of the query message, a multimedia search engine on a multimedia database included in a storage device on one or more active nodes is preferably executed. A search results message including a listing of found filenames and network locations is preferably sent to the user system upon completion of the database search.
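The peer-to-peer search flow described above can be sketched with a toy in-process model: a joining node announces itself, the network acknowledges the active nodes, and a query is broadcast so each node searches its local catalogue. Real systems exchange network messages; the class, message shapes, and filename-substring matching here are purely illustrative.

```python
class PeerNetwork:
    """Toy broadcast model of peer-to-peer multimedia search."""

    def __init__(self):
        self.nodes = {}  # node id -> {filename: network location}

    def join(self, node_id, catalogue):
        """A node broadcasts its entrance and registers its catalogue.

        Returns the acknowledged list of active nodes, which the joining
        node would track."""
        self.nodes[node_id] = dict(catalogue)
        return sorted(self.nodes)

    def search(self, keyword):
        """Broadcast a query; every active node searches its catalogue.

        Returns the combined results message: (filename, location) pairs."""
        results = []
        for catalogue in self.nodes.values():
            for filename, location in catalogue.items():
                if keyword in filename:
                    results.append((filename, location))
        return sorted(results)
```

In the embodiment above the query would carry multimedia features rather than a keyword, and each node would run its multimedia search engine against a local database, but the broadcast/collect pattern is the same.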
- The present invention further provides a method for sending a multimedia bookmark between devices over a wireless network. The method preferably includes acknowledging receipt of a multimedia bookmark by a video bookmark message service center upon receipt of the multimedia bookmark from a sending device. After requesting and receiving routing information from a home location register, the video bookmark message service center preferably invokes a send multimedia bookmark operation at a mobile switching center. The mobile switching center then preferably sends the multimedia bookmark and, upon acknowledgement of receipt of the multimedia bookmark by the recipient device, notifies the video bookmark message service center of the completed multimedia bookmark transaction.
- In another embodiment, the present invention provides a method for sending multimedia content over a wireless network for playback on a mobile device. In this embodiment, the mobile device preferably sends a multimedia bookmark and a request for playback to a mobile switching center. The mobile switching center then preferably sends the request and the multimedia bookmark to a video bookmark message service center. The video bookmark message service center then preferably determines a suitable bit rate for transmitting the multimedia content to the mobile device. Based on the bit rate and various characteristics of the mobile device, the video bookmark message service center also preferably calculates a new multimedia bookmark. The new multimedia bookmark is then sent to a multimedia server which streams the multimedia content to the video bookmark message service center before the multimedia content is delivered to the mobile device via the mobile switching center.
- 2. Search
- The present invention further provides a new approach to utilizing user-established relevance between images. Unlike conventional content-based and text-based approaches, the method of the present invention uses only direct links between images without relying on image descriptors such as low-level image features or textual annotations. Users provide relevance information in the form of relevance feedback, and the information is accumulated in each image's queue of links and propagated through linked images in a relevance graph. The collection of direct image links can be effective for the retrieval of subjectively similar images when they are gathered from a large number of users over a considerable period of time. The present invention can be used in conjunction with other content-based and text-based image retrieval methods.
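The queue-of-links relevance store described above might be sketched as follows; the queue size, the one-queue-per-image layout, and the bounded-depth link traversal are illustrative design choices, not the patent's exact mechanism.

```python
class RelevanceGraph:
    """Direct image links accumulated from user relevance feedback.

    Each image keeps a bounded queue of links to images users judged
    relevant; retrieval follows links instead of image descriptors."""

    def __init__(self, queue_size=5):
        self.queue_size = queue_size
        self.links = {}  # image id -> most-recent-first list of linked ids

    def add_feedback(self, query_img, relevant_img):
        q = self.links.setdefault(query_img, [])
        if relevant_img in q:
            q.remove(relevant_img)  # refresh its position in the queue
        q.insert(0, relevant_img)
        del q[self.queue_size:]     # keep the queue bounded

    def retrieve(self, query_img, depth=2):
        """Collect images reachable within `depth` link hops,
        propagating relevance through the graph."""
        seen, frontier = set(), [query_img]
        for _ in range(depth):
            nxt = []
            for img in frontier:
                for linked in self.links.get(img, []):
                    if linked not in seen and linked != query_img:
                        seen.add(linked)
                        nxt.append(linked)
            frontier = nxt
        return seen
```

With feedback gathered from many users over time, multi-hop traversal surfaces subjectively similar images that no low-level descriptor would connect.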
- The present invention also provides a new method to quickly find, from a large database of images/frames, the objects close enough to a query image/frame under a certain distortion. With the metric property of the distance function, information from LBG clustering, and a Haar-transform-based fast codebook search algorithm, which is also disclosed herein, the present invention reduces the number of distance evaluations at query time, thus resulting in fast retrieval of data objects from the database. Specifically, the present invention sorts and stores in advance the distances to a group of predefined distinguished points (called reference points) in the feature space and performs binary searches on the distances so as to speed up the search.
- The present invention introduces an abstract multidimensional structure called a hypershell. More practically, a hypershell can be conceived as the set of all feature vectors in the feature space that lie at a distance r±ε from the corresponding reference point, where r is the distance between the query feature point and the reference point, and ε is a real number indicating the fidelity of the search results. The intersection of such hypershells leads to intersected regions that are often small partitions of the whole feature space. Therefore, instead of searching the whole feature space, the present invention performs the search only on the intersected regions to improve the search speed.
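The hypershell idea rests on the triangle inequality: any database vector x within ε of the query q satisfies |d(q, ref) − d(x, ref)| ≤ ε, so pre-sorting each vector's distance to a reference point lets a binary search carve out the r±ε shell. The class below is a minimal illustration with Euclidean distance; the choice of reference points and the LBG/Haar machinery of the disclosure are omitted, and all names are illustrative.

```python
import bisect
import math

class HypershellIndex:
    """Reference-point index: intersect per-reference hypershells, then
    verify only the small intersected region exactly."""

    def __init__(self, vectors, references):
        self.vectors = vectors
        self.references = references
        # Per reference: (distance to reference, vector index), pre-sorted.
        self.tables = [
            sorted((math.dist(v, ref), i) for i, v in enumerate(vectors))
            for ref in references
        ]

    def search(self, query, eps):
        candidates = None
        for ref, table in zip(self.references, self.tables):
            r = math.dist(query, ref)
            # Binary-search the sorted distances for the r±eps shell.
            lo = bisect.bisect_left(table, (r - eps, -1))
            hi = bisect.bisect_right(table, (r + eps, len(self.vectors)))
            shell = {i for _, i in table[lo:hi]}
            candidates = shell if candidates is None else candidates & shell
        # Exact distance check on the intersected region only.
        return sorted(i for i in candidates
                      if math.dist(self.vectors[i], query) <= eps)
```

Each additional reference point shrinks the intersected region, trading a little preprocessing and memory for fewer exact distance evaluations at query time.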
- 3. Editing
- The present invention further provides a new approach to editing video materials, in which it only virtually edits the metadata of input videos to create a new video, instead of actually editing videos stored as computer files. In the present invention, the virtual editing is performed either by copying the metadata of a video segment of interest in an input metafile or copying only the URI of the segment into a newly constructed metafile. The present invention provides a way of playing the newly edited video only with its metadata. The present invention also provides a system for the virtual editing. The present invention can be applied not only to videos stored on CD-ROM, DVD, and hard disk, but also to streaming videos over a network.
- The present invention also provides a method for virtually editing multimedia files. Specifically, one or more video files are provided. A metadata file is created for each of the video files, each of the metadata files having at least one segment to be edited. Thereafter, a single edited metafile is created that contains the segments to be edited from each of the metadata files, so that when the edited metadata file is accessed, the user is able to play the selected segments in the edited order.
- In another embodiment, the present invention provides a method for virtually editing multimedia files in which one or more video files are provided and a metadata file is created for each, each of the metadata files having at least one segment to be edited. Thereafter, a single edited metafile is created that contains links to the segments to be edited from each of the metadata files, so that when the edited metadata file is accessed, the user is able to play the selected segments in the edited order.
- The present invention also includes a method for editing a multimedia file by providing a metafile, the metafile having at least one selectable segment; selecting a segment in the metafile; determining if a composing segment should be created and, if so, creating a composing segment in a hierarchical structure; specifying the composing segment as a child of a parent composing segment; determining if metadata are to be copied or if a URI is to be used; if the metadata are to be copied, then copying the metadata of the selected segment to the composing segment; if the URI is to be used, then writing the URI of the selected segment to the composing segment; writing the URL of an input video file to the composing segment; determining if the URLs of all sibling segments are the same; and, if so, writing the URL to the parent composing segment and deleting the URLs of all sibling segments.
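The core of virtual editing, building an edited metafile whose components merely reference (by URL and time range) segments of the input videos instead of copying frame data, can be sketched as follows; the dictionary schema and function names are illustrative, not the metafile format of the invention.

```python
def virtual_edit(sources, selections):
    """Build an edited metafile without touching any video data.

    sources:    maps a video URL to its metadata (with a "segments" list)
    selections: ordered (url, segment_title) pairs to include
    """
    edited = {"title": "edited video", "components": []}
    for url, seg_title in selections:
        seg = next(s for s in sources[url]["segments"]
                   if s["title"] == seg_title)
        edited["components"].append({
            "url": url,                 # where the real frames live
            "start": seg["start"],
            "duration": seg["duration"],
            "metadata": dict(seg),      # reuse of the existing metadata
        })
    return edited

def playlist(edited):
    """Play the edited video directly from its metafile:
    (url, start, end) intervals in the edited order."""
    return [(c["url"], c["start"], c["start"] + c["duration"])
            for c in edited["components"]]
```

Because the edited metafile stores only references, no storage is wasted on duplicated video portions, and the segments' existing metadata travel with the edit.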
- In a further embodiment, the method for editing a multimedia file includes determining if another segment is to be selected and if another segment is to be selected, then performing the step of selecting a segment in a metafile.
- In yet a further embodiment of the method for editing a multimedia file, the method includes determining if another metafile is to be browsed and if another metafile is to be browsed, then performing the step of providing a metafile. The metafiles may be XML files or some other format.
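By way of illustration only (not the claimed implementation), the virtual editing described above can be sketched in a few lines: the edited "file" is merely an ordered list of references into the source videos, and playback walks that list. All names here are illustrative assumptions.

```python
# Minimal sketch of virtual editing via metadata: the source video files
# are never modified; an edited metadata file simply references segments
# of them in a new order. All class/function names are illustrative.

class Segment:
    def __init__(self, video_url, start, end):
        self.video_url = video_url  # URL of the underlying video file
        self.start = start          # segment start time (seconds)
        self.end = end              # segment end time (seconds)

def compose_edit(selected_segments):
    """Create the 'edited metadata file' as an ordered list of segment
    references (URL plus in/out points)."""
    return [{"url": s.video_url, "start": s.start, "end": s.end}
            for s in selected_segments]

def play(edit_list, player):
    # The player seeks into each referenced file in the edited order,
    # giving the appearance of a single seamlessly edited video.
    for ref in edit_list:
        player(ref["url"], ref["start"], ref["end"])
```

Because only references are stored, the same source footage can appear in any number of edits without duplication.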
- The present invention also provides a virtual video editor in one embodiment. The virtual video editor includes a network controller constructed and arranged to access remote metafiles and remote video files and a file controller in operative connection to the network controller and constructed and arranged to access local metafiles and local video files, and to access the remote metafiles and the remote video files via the network controller. A parser constructed and arranged to receive information about the files from the file controller and an input buffer constructed and arranged to receive parser information from the parser are also included in the virtual video editor. Further, a structure manager constructed and arranged to provide structure data to the input buffer, a composing buffer constructed and arranged to receive input information from the input buffer and structure information from the structure manager to generate composing information, and a generator constructed and arranged to receive the composing information from the composing buffer are preferably included, wherein the generator generates output information in a pre-selected format.
- In a further embodiment, the virtual video editor also includes a playlist generator constructed and arranged to receive structure information from the structure manager in order to generate playlist information and a video player constructed and arranged to receive the playlist information from the playlist generator and file information from the file controller in order to generate display information.
- In yet a further embodiment, the virtual video editor also includes a display device constructed and arranged to receive the display information from the video player and to display the display information to a user.
- In a further embodiment, the present invention provides a method for transcoding an image for display at multiple resolutions. Specifically, the method includes providing a multimedia file, designating one or more regions of the multimedia file as focus zones and providing a vector to each of the focus zones. The method continues by reading the multimedia file with a client device, the client device having a maximum display resolution, and determining if the resolution of the multimedia file exceeds the maximum display resolution of the client device. If the multimedia file resolution exceeds the maximum display resolution of the client device, the method determines the maximum number of focus zones that can be displayed on the client device. Finally, the method includes displaying the maximum number of focus zones on the client device.
- 4. Transcoding
- The present invention also provides a novel scheme for generating a transcoded (scaled and cropped) image to fit the size of the respective client display when an image is transmitted to a variety of client devices with different display sizes. The scheme has two key components: 1) a perceptual hint for each image block, and 2) an image transcoding algorithm. For a given semantically important block in an image, the perceptual hint provides information on the minimum allowable spatial resolution. In effect, it quantifies how much the spatial resolution of the image can be reduced while ensuring that the user will perceive the transcoded image as the author or publisher intends it to be represented. The image transcoding algorithm, which is essentially a content adaptation process, selects the best image representation to meet the client capabilities while delivering the largest content value. The content adaptation algorithm is modeled as a resource allocation problem that maximizes the content value.
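The resource-allocation framing above can be illustrated with a toy exhaustive search: each image block offers several representations (each a cost/value pair, with the block's minimum allowable resolution already enforced by which options are listed), and the adapter picks one per block to maximize total value under the client's byte budget. This is a sketch of the general problem shape, not the invention's actual optimizer.

```python
# Toy content-adaptation solver: pick one representation per block,
# maximizing total content value under a resource budget. Exhaustive
# search is fine for a handful of blocks; real adapters would use a
# knapsack-style method.
from itertools import product

def adapt(blocks, budget):
    """blocks: list of option-lists; each option is (cost, value).
    Returns (best_total_value, chosen option tuple or None)."""
    best = (0, None)
    for combo in product(*blocks):
        cost = sum(o[0] for o in combo)
        value = sum(o[1] for o in combo)
        if cost <= budget and value > best[0]:
            best = (value, combo)
    return best
```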
- 5. Visual Rhythm
- One of the embodiments of the method of the present invention provides a fast and efficient approach for constructing a visual rhythm. Unlike the conventional approaches, which decode all pixels composing a frame to obtain a certain group of pixel values using conventional video decoders, the present invention provides a method in which only a few of the pixels composing a frame are decoded to obtain the actual group of pixels needed for constructing the visual rhythm. Most video compression schemes adopt intraframe and interframe coding to reduce spatial as well as temporal redundancies. Therefore, once the group of pixels is determined for constructing the visual rhythm, one decodes only this group of pixels in frames that are not referenced by other frames for interframe coding. For frames that are referenced by other frames for interframe coding, one decodes the determined group of pixels for constructing the visual rhythm, as well as the few other pixels needed to decode this group of pixels in the frames referencing those frames. This allows fast generation of the visual rhythm for its application to shot detection, caption text detection, or any other application derived from it.
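For orientation, a visual rhythm in its conventional full-decode form samples a fixed group of pixels from every frame and stacks the samples as successive columns of a two-dimensional image. The sketch below uses the main diagonal as the sampled group, one common choice; the invention's speed-up (decoding only this group plus the few pixels needed for interframe prediction) happens inside the decoder and is not shown here.

```python
# Conventional visual-rhythm construction over already-decoded frames:
# sample a fixed pixel group (here the main diagonal) from each frame
# and collect one column per frame.

def visual_rhythm(frames):
    """frames: list of 2-D pixel grids (lists of rows, uniform size).
    Returns the visual rhythm as a list of columns, one per frame."""
    columns = []
    for frame in frames:
        n = min(len(frame), len(frame[0]))
        columns.append([frame[i][i] for i in range(n)])  # diagonal sample
    return columns
```

Shot boundaries and caption text then appear as visible discontinuities and horizontal bands in the stacked columns.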
- The other embodiment of the method of the present invention provides an efficient and fast compressed-DCT-domain method to locate caption text regions in intra-coded and inter-coded frames through the visual rhythm. The method builds on the observations that caption text generally tends to appear in certain areas of the video, or that those areas are known a priori. The method employs a combination of contrast and temporal coherence information on the visual rhythm to detect text frames, and uses information obtained through the visual rhythm to locate caption text regions in the detected text frames along with their temporal duration within the video.
- In one embodiment of the present invention, a content transcoder for modifying and forwarding multimedia content maintained in one or more multimedia content databases to a wide area network for display on a requesting client device is provided. In this embodiment, the content transcoder preferably includes a policy engine coupled to the multimedia content database and a content analyzer operably coupled to both the policy engine and the multimedia content database. The content transcoder of the present invention also preferably includes a content selection module operably coupled to both the policy engine and the content analyzer and a content manipulation module operably coupled to the content selection module. Finally, the content transcoder preferably includes a content analysis and manipulation library operably coupled to the content analyzer, the content selection module and the content manipulation module. In operation, the policy engine may receive a request for multimedia content from the requesting client device via the wide area network and policy information from the multimedia content database. The content analyzer may retrieve multimedia content from the multimedia content database and forward the multimedia content to the content selection module. The content selection module may select portions of the multimedia content based on the policy information and information from the content analysis and manipulation library and forward the selected portions of multimedia content to the content manipulation module. The content manipulation module may then modify the multimedia content for display on the requesting client device before transmitting the modified multimedia content over the wide area network to the requesting client device.
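The module flow just described (policy engine, content analyzer, content selection, content manipulation) can be sketched as a plain function pipeline. Everything below is an illustrative assumption: dictionaries stand in for the databases, and width-capping stands in for the manipulation library.

```python
# Hypothetical sketch of the transcoder pipeline: policy lookup ->
# content retrieval -> selection by policy -> manipulation for the
# client. Data shapes and field names are assumptions for illustration.

def transcode_request(request, policy_db, content_db):
    policy = policy_db[request["client"]]             # policy engine
    content = content_db[request["content_id"]]       # content analyzer fetch
    selected = [part for part in content
                if part["kind"] in policy["allowed"]]  # content selection
    # content manipulation: scale each selected part to the client width
    return [{**p, "width": min(p["width"], policy["max_width"])}
            for p in selected]
```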
- Features and advantages of the invention will be apparent from the following description of the embodiments, given for the purpose of disclosure and taken in conjunction with the accompanying drawings.
- A more complete understanding of the present invention and advantages thereof may be acquired by referring to the following description taken in conjunction with the accompanying drawings, wherein:
-
FIG. 1 is an illustration of a conventional prior art bookmark. -
FIG. 2 is an illustration of a multimedia bookmark in accordance with the present invention. -
FIG. 3 is an illustration of exemplary searching for multimedia content relevant to the content information saved in the multimedia bookmark of the present invention, where both positional and content information are used. -
FIG. 4 is an illustration of an exemplary tree structure used by two exemplary search methods in accordance with the present invention. -
FIG. 5 is an example of five variations encoded by the present invention from the same source video content. -
FIG. 6 is an example of two multimedia contents and their associated metadata of the present invention. -
FIG. 7 is a list of example multimedia bookmarks of the present invention. -
FIG. 8 is an illustration of an exemplary method of adjusting bookmarked positions in the durable bookmark system of the present invention. -
FIG. 9 is an illustration of an exemplary user interface incorporating a multimedia bookmark of the present invention. -
FIG. 10 is a flowchart illustrating an exemplary embodiment of a method of the present invention that is effective to implement the disclosed processing system. -
FIG. 11 is a flowchart illustrating the overall process of saving and retrieving multimedia bookmarks of the present invention. -
FIG. 12 is a flowchart illustrating an exemplary process of playing a multimedia bookmark of the present invention. -
FIG. 13 is a flowchart illustrating an exemplary process of deleting a multimedia bookmark of the present invention. -
FIG. 14 is a flowchart illustrating an exemplary process of adding a title to a multimedia bookmark of the present invention. -
FIG. 15 is a flowchart illustrating an exemplary process of the present invention for searching for the relevant multimedia content based upon content, as well as textual information if available. -
FIG. 16 is a flow chart illustrating an exemplary process of the present invention for sending a bookmark to other people via e-mail. -
FIG. 17 is a flowchart illustrating an exemplary method of the present invention for e-mailing a multimedia bookmark of the present invention. -
FIG. 18 is a block diagram illustrating an exemplary system for transmitting multimedia content to a mobile device using the multimedia bookmark of the present invention. -
FIG. 19 is a block diagram illustrating an exemplary message signal arrangement of the present invention between a personal computer and a mobile device. -
FIG. 20 is a block diagram illustrating an exemplary message signal arrangement of the present invention between two mobile devices. -
FIG. 21 is a block diagram illustrating an exemplary message signal arrangement of the present invention between a video server and a mobile device. -
FIG. 22 is a block diagram illustrating an exemplary data correlation method of the present invention. -
FIG. 23 is a block diagram illustrating an exemplary swiping technique of the present invention. -
FIG. 24 is a block diagram illustrating an alternate exemplary swiping technique of the present invention. -
FIG. 25 is a flowchart illustrating an exemplary peer-to-peer exchange of the multimedia bookmark of the present invention. -
FIG. 26 is a block diagram illustrating different sampling strategies. -
FIG. 27 is a block diagram illustrating an exemplary visual rhythm method of the present invention. -
FIG. 28 is a block diagram illustrating the localization and segmentation of text information according to the present invention. -
FIG. 29 is a block diagram illustrating the use of an exemplary Haar transformation according to the present invention. -
FIG. 30 is a block diagram illustrating an exemplary queue for image links of the present invention. -
FIG. 31 is a block diagram illustrating an alternate exemplary queue for image links of the present invention. - FIGS. 32(a) and (b) are block diagrams illustrating a comparison of a prior art video methodology and an exemplary editing method of the present invention.
-
FIG. 33 is a block diagram illustrating an exemplary segmentation and reconstruction of a new multimedia video presentation according to the method of the present invention. -
FIG. 34 is a block diagram illustrating an exemplary edited multimedia file according to the present invention. -
FIG. 35 is a flowchart of an exemplary method of the present invention for virtual video editing based on metadata. -
FIG. 36 is an exemplary pseudocode implementation of the method of the present invention. -
FIG. 37 is an exemplary pseudocode implementation of the method of the present invention. -
FIG. 38 is an exemplary pseudocode implementation of the method of the present invention. -
FIG. 39 is an exemplary pseudocode implementation of the method of the present invention. -
FIG. 40 is an exemplary pseudocode implementation of the method of the present invention. -
FIG. 41 is an exemplary pseudocode implementation of the method of the present invention. -
FIG. 42 is a block diagram illustrating an exemplary virtual video editor of the present invention. -
FIG. 43 is a block diagram illustrating an exemplary transcoding method of the present invention without SRR value. -
FIG. 44 is a block diagram illustrating an exemplary transcoding method of the present invention with SRR value. -
FIG. 45 is a block diagram illustrating an exemplary content transcoder of the present invention. -
FIG. 46 is a block diagram illustrating an exemplary adaptive window focusing method of the present invention. -
FIG. 47 is a block diagram and table illustrating image nodes and edges according to an exemplary method of the present invention. -
FIG. 48 is a block diagram illustrating an exemplary hypershell search method of the present invention. -
FIG. 49 is a block diagram illustrating the contents of an embodiment of the video bookmark of the present invention. -
FIG. 50 is a block diagram illustrating the recommendation engine of the present invention. -
FIG. 51 is a block diagram illustrating the video bookmark process of the present invention in conjunction with an EPG channel. -
FIG. 52 is a block diagram illustrating the video bookmark process of the present invention in conjunction with a network. -
FIG. 53 is a block diagram of the system of the present invention. -
FIG. 54 is a block diagram of an exemplary relevance queue of the present invention. -
FIG. 55 is a timeline diagram showing an exemplary embodiment of the rewind method of the present invention. -
FIG. 56 is a timeline diagram showing an exemplary embodiment of the rewind method of the present invention. -
FIG. 57 is a flowchart showing an exemplary embodiment of the retrieval method of the present invention. -
FIG. 58 is a flowchart showing another exemplary embodiment of the retrieval method of the present invention. -
FIG. 59 is a flowchart showing another exemplary embodiment of the retrieval method of the present invention. -
FIG. 60 is a block diagram illustrating a hierarchical arrangement of images that exemplifies a navigation method of the present invention. -
FIG. 61 is a web page illustrating a web page having an exemplary duration bar of the present invention. -
FIG. 62 is a web page illustrating a web page having an exemplary duration bar of the present invention. -
FIG. 63 is a diagram illustrating an exemplary hypershell search method of the present invention. -
FIG. 64 is a diagram illustrating another exemplary hypershell search method of the present invention. -
FIG. 65 is a diagram illustrating another exemplary hypershell search method of the present invention. -
FIG. 66 is a diagram illustrating another exemplary hypershell search method of the present invention. -
FIG. 67 is a diagram illustrating another exemplary hypershell search method of the present invention. -
FIG. 68 is a block diagram illustrating an exemplary embodiment of the metadata server and metadata agent of the present invention. -
FIG. 69 is a block diagram illustrating an alternate exemplary embodiment of the metadata server and metadata agent of the present invention. -
FIG. 70 is a timeline comparison illustrating exemplary offset recording capability of the present invention. -
FIG. 71 is a timeline comparison illustrating alternate exemplary offset recording capability of the present invention. -
FIG. 72 is a timeline comparison illustrating exemplary interrupt recording capability of the present invention. -
FIG. 73 is a timeline comparison illustrating the exemplary disparate and sequential recording capabilities of the present invention. - While the present invention is susceptible to various modifications and alternative forms, specific exemplary embodiments thereof have been shown by way of example in the drawings and are herein described in detail. It should be understood, however, that the description herein of specific embodiments is not intended to limit the invention to the particular forms disclosed, but, on the contrary, the intention is to cover all modifications, equivalents, and alternatives falling within the spirit and scope of the invention as defined by the appended claims.
-
FIG. 53 illustrates the system of the present invention. At the heart of the system of the present invention is a Wide Area Network 5350, most famously embodied in the Internet. The present invention can be contained within the server 5314, as well as a series of clients such as Laptop 5322, Video Camera 5324, Telephone 5326, Digitizing Pad 5328, Personal Digital Assistant (PDA) 5330, Television 5332, Set Top Box 5340 (that is connected to and serves Television 5338), Scanner 5334, Facsimile Machine 5336, Automobile 5302, Truck 5304, Screen 5308, Work Station 5312, Satellite Dish 5310, and Communications Tower 5306, all useful for communications to or from remote devices for use with the system of the present invention. The present invention is particularly useful for set top boxes 5340. The set top boxes 5340 may be used as intermediate video servers for home networking, serving televisions, personal computers, game stations and other appliances. The server 5314 can be connected to an internal local area network via, for example, Ethernet 5316, although any type of communications protocol in a local area network or wide area network is possible for use with the present invention. Preferably, the local area network for the server 5314 has connections for data storage 5318, which can include database storage capability. The local area network connected to Ethernet 5316 may also hold one or more alternate servers 5320 for purposes of load balancing, performance, etc. The multimedia bookmarking scheme of the present invention can utilize the servers and clients of the system of the present invention, as illustrated in FIG. 53, for use in transferring data to or loading data from the servers through the Wide Area Network 5350.
- In general, the present invention is useful for storing, indexing, searching, retrieving, editing, and rendering multimedia content over networks having at least one device capable of storing and/or manipulating an electronic file, and at least one device capable of playing the electronic file. The present invention provides various methodologies for tagging multimedia files to facilitate the indexing, searching, and retrieving of the tagged files. The tags themselves can be embedded in the electronic file, or stored separately in, for example, a search engine database. Other embodiments of the present invention facilitate the e-mailing of multimedia content. Still other embodiments of the present invention employ user preferences and user behavioral history that can be stored in a separate database or queue, or can also be stored in the tag related to the multimedia file in order to further enhance the rich search capabilities of the present invention.
- Other aspects of the present invention include using hypershell and other techniques to read text information embedded in multimedia files for use in indexing, particularly tag indexes. Still more methods of the present invention enable the virtual editing of multimedia files by manipulating metadata and/or tags rather than editing the multimedia files themselves. Then the edited file (with rearranged tags and/or metadata) can be accessed in sequence in order to link seamlessly one or more multimedia files in the new edited arrangement.
- Still other methods of the present invention enable the transcoding of images/videos so that they enable users to display images/videos on devices that do not have the same resolution capabilities as the devices for which the images/videos were originally intended. This allows devices such as, for example,
PDA 5330, laptop 5322, and automobile 5302, to retrieve usable portions of the same image/video that can be displayed on, for example, workstation 5312, screen 5308, and television 5332. - Finally, the indexing methods of the present invention are enhanced by the unique modification of visual rhythm techniques that are part of other methods of the present invention. Modification of prior art visual rhythm techniques enables the system of the present invention to capture text information in the form of captions that are embedded into multimedia information, and even from video streams as they are broadcast, so that text information about the multimedia information can be included in the multimedia bookmarks of the present invention and utilized for storing, indexing, searching, retrieving, editing and rendering of the information.
- 1. Multimedia Bookmark
- The methods of the present invention described in this disclosure can be implemented, for example, in software on a digital computer having a processor that is operable with system memory and a persistent storage device. However, the methods described herein may also be implemented entirely in hardware, or entirely in software, and in any combination thereof.
- In general, after a multimedia content is analyzed automatically and/or annotated by a human operator, the results of the analysis and annotation are saved as "metadata" with the multimedia content. The metadata usually includes descriptive information about the multimedia content, such as distinctive characteristics of the data and the structure and semantics of the content. Some of the description provides information on the whole content, such as a summary, bibliography and media format. However, in general, most of the description is structured around "segments" that represent spatial, temporal or spatial-temporal components of the audio-visual content. In the case of video content, the segment may be a single frame, a single shot consisting of successive frames, or a group of several successive shots. Low-level features and some elementary semantic information may describe each segment. Examples of such descriptions include color, texture, shape, motion, audio features and annotated texts.
- If it is desired to generate metadata for several variations of a multimedia content, it would be natural to generate the metadata only for a single variation, called a master file, and then have the other variations share the same metadata. This sharing of metadata would save a lot of time and effort by skipping the time-consuming and labor-intensive work of generating multiple versions of metadata. In this case, the media positions (in terms of time points or bytes) contained in the metadata obtained with respect to the master file may not be directly applied to the other variations. This is because there may be mismatches of media positions between the master and the other variations if the master and the other variations do not start at the same position of the source content.
- The method and system of the present invention include a tag that can contain information about all or a portion of a multimedia file. The tag can come in several varieties, such as text information embedded into the multimedia file itself, appended to the end of the multimedia file, or stored separately from the multimedia file on the same or remote network storage device.
- Alternatively, the multimedia file has embedded within it one or more global unique identifiers (GUIDs). For example, each scene in a movie can be provided with its own GUID. The GUIDs can be indexed by a search engine and the multimedia bookmarks of the present invention can reference the GUID that is in the movie. Thus, multiple multimedia bookmarks of the present invention can reference the same GUID in a multimedia document without impacting the size of the multimedia document, or the performance of servers handling the multimedia document. Furthermore, the GUID references in the multimedia bookmarks of the present invention are themselves indexable. Thus, a search on a given multimedia document can prompt a search for all multimedia bookmarks that reference a GUID embedded within the multimedia file, providing a richer and more extensive resource for the user.
-
FIG. 2 shows a multimedia bookmark 210 of the present invention comprising positional information 212 and content information 214. The positional information 212 is used for accessing a multimedia content 204 starting from a bookmarked position 206. The content information 214 is used for visually displaying multimedia bookmarks in a bookmark list 208, as well as for searching one or more multimedia content databases for the content that matches the content information 214.
- The positional information 212 may be composed of a URI, a URL, or the like, and a bookmarked position (relative time or byte position) within the content. For the purposes of this disclosure, a URI is synonymous with a position of a file and can be used interchangeably with a URL or other file location identifier. The content information 214 may be composed of audio-visual features and textual features. The audio-visual features are the information obtained, for example, by capturing or sampling the multimedia content 204 at the bookmarked position 206. The textual features are text information specified by the user(s), as well as delivered with the content. Other aspects of the textual features may be obtained by accessing metadata of the multimedia content.
- In one embodiment of the multimedia bookmark 210 of the present invention, the positional information 212 is composed of a URI and a bookmarked position such as an elapsed time, time code or frame number. The content information 214 is composed of audio-visual features, such as thumbnail image data of the captured video frame, and visual feature vectors such as a color histogram for one or more of the frames. The content information 214 of a multimedia bookmark 210 is also composed of such textual features as a title, specified by a user or delivered with the content, and annotated text of a video segment corresponding to the bookmarked position.
- In the case of an audio bookmark of the present invention, the positional information 212 is composed of a URI, a URL, or the like, and a bookmarked position such as an elapsed time. Similarly, the content information 214 is composed of audio-visual features such as the sampled audio signal (typically of short duration) and its visualized image. The content information 214 of an audio bookmark 210 is also composed of such textual features as a title, optionally specified by a user or simply delivered with the content, and annotated text of an audio segment corresponding to the bookmarked position. In the case of a text bookmark 210, the positional information 212 is composed of a URI, URL, or the like, and an offset from the starting point of a text document. The offset can be of any size, but is normally about a byte in size. The content information 214 is composed of a sampled text string present at the bookmarked position, and text information specified by user(s) and/or delivered with the content, such as the title of the text document. -
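The two-part bookmark structure described above (positional information to locate the point; content information for display and for search) can be sketched as a minimal data structure. Field names are illustrative, not the disclosed format.

```python
# Minimal sketch of the multimedia bookmark of FIG. 2: positional
# information (URI + position) plus content information (audio-visual
# and textual features). Field names are assumptions for illustration.

class MultimediaBookmark:
    def __init__(self, uri, position,
                 av_features=None, textual_features=None):
        # positional information (212)
        self.uri = uri              # URI/URL of the content
        self.position = position   # elapsed time, frame number, or byte offset
        # content information (214)
        self.av_features = av_features or {}            # e.g. thumbnail, histogram
        self.textual_features = textual_features or {}  # e.g. title, annotation
```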
FIG. 3 shows an illustration of searching for multimedia contents that are relevant to the content information 314 (corresponding to element 214 of FIG. 2) that is stored in the multimedia bookmark 210 of FIG. 2 of the present invention, where both positional and content information are used. The content information 314 is comprised of audio-visual features 320, such as a captured frame 322 and sampled audio data 324, and textual features 326, such as annotated text 328 and a title 330. There are many cases where a bookmark system that utilizes only positional information, such as a URI and an elapsed time, as used by conventional bookmarks, may not be valid. For example, if a bookmark were generated during the preview of a multimedia content broadcast, the bookmark would not be valid for viewing a full version of the broadcast. If a bookmark were saved during a live Internet broadcast, the bookmark would not be valid for viewing an edited version of the live broadcast. Further, if a user wanted to access the bookmarked multimedia content from another site that also provides the content, even the positional information such as the URI would not be valid. - To solve the problems described in the background section, the present invention uses the content information 314 (element 214 of FIG. 2) that is saved in the multimedia bookmark to obtain the actual positional information of the last-visited segment by searching the multimedia database 310 using the content information 314 as a query input. Content information characteristics such as the captured frame 322, sampled audio data 324, annotated text of the segment corresponding to a bookmarked position 328, and the title delivered with the content 330 can be used as query input to a multimedia search engine 332. The multimedia search engine searches its multimedia database 310 by performing content-based and/or text-based multimedia searches, and finds the relevant positions of multimedia contents. The search engine then retrieves a list of relevant segments 334 with their positional information, such as URI, URL and the like, and the relative position. With a multimedia player 336, a user can start playing from the retrieved segments of the contents. The retrieved segments 334 are usually those segments having contents relevant or similar to the content information saved in the multimedia bookmark. -
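A search engine such as multimedia search engine 332 can avoid exhaustive frame comparisons by organizing key frames in a tree, as detailed below with FIG. 4: each node stores Df, the maximum feature distance to any of its descendants, and a whole subtree is skipped whenever d(fq, fm) > Df + e. A runnable sketch of that pruning rule follows, with L1 distance standing in for the color-histogram comparison; names are illustrative.

```python
# Subtree-pruning key-frame search: prune a subtree rooted at fm when
# d(fq, fm) > Df + e, since by the triangle inequality no descendant
# can then be within e of the query.

def l1(a, b):
    return sum(abs(x - y) for x, y in zip(a, b))

class Node:
    def __init__(self, feature, df=0.0, children=()):
        self.feature = feature     # low-level feature vector of the key frame
        self.df = df               # max distance to any descendant's feature
        self.children = list(children)

def search(node, fq, e, hits):
    d = l1(fq, node.feature)
    if d > node.df + e:
        return                     # prune: skip this whole subtree
    if d <= e:
        hits.append(node)          # this key frame matches the query
    for child in node.children:
        search(child, fq, e, hits)
```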
FIG. 4 illustrates an embodiment of a key frame hierarchy used by a search method of the multimedia search engine 332 (see FIG. 3) in accordance with the present invention. The method arranges key frames in a hierarchical fashion to enable fast and accurate searching of frames similar to a query image. - The key frame hierarchy illustrated in FIG. 4 is a tree-structured representation for multi-level abstraction of a video by key frames, where a node denotes each key frame. A number Df is associated with each node and represents the maximum distance between the low-level feature vector of the node 414 and those of its descendant nodes in its subtree (for example, nodes 416 and 418). An example of such a feature vector is the color histogram of a frame. If a video database composed of one or more key frame hierarchies, which correspond to different video sequences, must be searched to find a specific query image fq, the dissimilarity between fq and a subtree rooted at the key frame fm is measured by testing d(fq, fm) > Df + e, where d(fq, fm) is a distance metric measuring dissimilarity, such as the L1 norm between feature vectors, and e is a threshold value set by a user. If the condition is satisfied, searching of the subtree rooted at the node fm is skipped (i.e., the subtree is "pruned" from the search). This method of the present invention reduces the search time substantially by pruning out the unnecessary comparison steps. - Durable Multimedia Bookmark using Offset and Time Scale
-
FIG. 5 shows an example of five variations encoded from the same source video content 502. FIG. 5 shows two ASF format files 504, 506 with the bandwidths of 28.8 and 80 kbps that start and end at exactly the same time points. FIG. 5 also shows the first RM format file 508 with the bandwidth of 80 kbps. In the RM file 508, the source content starts to be encoded the time interval o1 before the start time point of the ASF files 504, 506, and stops being encoded the time interval o4 before the end time point of the ASF files 504 and 506. The RM file 508 thus has an extra video segment with the duration of o1 at the beginning. Consequently, compared with the start time point of a specific video segment 514 in the ASF files, the start time point of the video segment in the RM file is temporally shifted right by the time interval o1. The start time point of the video segment in the RM file can be computed by adding the time interval o1 to the start time point of the video segment in the ASF files. Similarly, the second RM file 510 with the bandwidth of 28.8 kbps does not have a leading video segment with the duration of o2. The start time point of the video segment 514 in the second RM file can be computed by subtracting the time interval o2 from the start time point of the video segment in the ASF files. Also, the MOV file 512 with the bandwidth of 56 kbps has two extra segments with the duration of o3 and o6, respectively. - In another example, designate one of the different variations encoded from the same source multimedia content as the master file, and the other variations as slave files. In the example illustrated in
FIG. 5, the ASF file encoded at the bandwidth of 80 kbps 504 is designated the master file, and the other four files are slave files. In this example, an offset of a slave file is the difference in position, measured in time duration or byte offset, between the start position of the master file and the start position of the slave file. In this example, the differences of positions o1, o2, and o3 are offsets. The offset of a slave file is computed by subtracting the start position of the slave file from the start position of the master file. In this formula, the two start positions are measured with respect to the source content. Thus, the offset will have a positive value if the start position of the slave occurred before the start position of the master with reference to the source content. Conversely, the offset will have a negative value if the start position of the slave occurred after the start position of the master. For the example shown in FIG. 5, the offsets o1 and o3 are positive values, and o2 is negative. Although not specifically required, by convention an offset of a master file is set to zero. - Consider the different variations encoded from the same source multimedia content. A user generates a multimedia bookmark with respect to one of the variations, which is called the bookmarked file. Then, the multimedia bookmark is used at a later time to play one of the variations, which is called the playback file. In other words, the bookmarked file pointed to by the multimedia bookmark and the playback file selected by the user may not be the same variation, but they refer to the same multimedia content.
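The sign convention above can be stated compactly. The sketch below is illustrative only; the function name and the sample start times are assumptions, with all positions measured on the source-content timeline in seconds.

```python
# Offset of a slave variation, per the convention above: subtract the
# slave's start position from the master's start position, both measured
# with respect to the source content. Positive means the slave starts
# earlier (extra leading material); negative means it starts later.
def offset(master_start, slave_start):
    return master_start - slave_start

# Hypothetical start times: the master begins 10 s into the source; one
# slave begins at 7 s (leads, like o1), another at 12 s (trails, like o2).
assert offset(10.0, 7.0) == 3.0
assert offset(10.0, 12.0) == -2.0
```

By convention the master's offset is offset(master_start, master_start), which is zero, matching the text.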
- If there is only one variation encoded from the original content, both the bookmarked and the playback files should be the same. However, if there are multiple variations, a user can store a multimedia bookmark for one variation and later play another variation by using the saved bookmark. The playback may not start at the last accessed position because there may be mismatches of positions between the bookmarked and the playback files.
- Associated with a multimedia content are metadata containing the offsets of the master and slave variations of the multimedia content in the form of media profiles. Each media profile corresponds to the different variation that can be produced from a single source content depending on the values chosen for the encoding formats, bandwidths, resolutions, etc. Each media profile of a variation contains at least a URI and an offset of the variation. Each media profile of a variation optionally contains a time scale factor of the media time of the variation encoded in different temporal data rates with respect to its master variation. The time scale factor is specified on a zero to one scale where a value of one indicates the same temporal data rate, and 0.5 indicates that the temporal data rate of the variation is reduced by half with respect to the master variation.
- Table 1 is an example metadata for the five variations in
FIG. 5. The metadata is written according to the ISO/IEC MPEG-7 metadata description standard, which is under development. The metadata are described in XML, since MPEG-7 adopted XML Schema as its description language. In the table, the offset values of the three variations 508, 510, and 512 are 3, −2, and 10 seconds, respectively. The temporal data rate of the variation 512 is assumed to be reduced by half with respect to the master variation 504, and the other variations are not temporally reduced.

TABLE 1 An example of Metadata Description for Five Variations

<VariationSet>
  <Source>
    <Video>
      <MediaLocator>
        <MediaUri>http://www.server.com/sample-80.asf</MediaUri>
      </MediaLocator>
    </Video>
  </Source>
  <Variation timeOffset="PT0S" timeScale="1">
    <Source>
      <Video>
        <MediaLocator>
          <MediaUri>http://www.server.com/sample-28.asf</MediaUri>
        </MediaLocator>
      </Video>
    </Source>
    <VariationRelationship>alternativeMediaProfile</VariationRelationship>
  </Variation>
  <Variation timeOffset="PT3S" timeScale="1">
    <Source>
      <Video>
        <MediaLocator>
          <MediaUri>http://www.server.com/sample-80.rm</MediaUri>
        </MediaLocator>
      </Video>
    </Source>
    <VariationRelationship>alternativeMediaProfile</VariationRelationship>
  </Variation>
  <Variation timeOffset="-PT2S" timeScale="1">
    <Source>
      <Video>
        <MediaLocator>
          <MediaUri>http://www.server.com/sample-28.rm</MediaUri>
        </MediaLocator>
      </Video>
    </Source>
    <VariationRelationship>alternativeMediaProfile</VariationRelationship>
  </Variation>
  <Variation timeOffset="PT10S" timeScale="0.5">
    <Source>
      <Video>
        <MediaLocator>
          <MediaUri>http://www.server.com/sample-56.mov</MediaUri>
        </MediaLocator>
      </Video>
    </Source>
    <VariationRelationship>alternativeMediaProfile</VariationRelationship>
    <VariationRelationship>temporalReduction</VariationRelationship>
  </Variation>
</VariationSet> -
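Metadata in the Table 1 style can be read with the Python standard library. The sketch below is an assumption-laden illustration: the tag names follow the table, and the duration parser handles only the simple "PTnS" form used there (full ISO 8601 durations are richer).

```python
import xml.etree.ElementTree as ET

def parse_offset(text):
    """Parse a simple 'PTnS' or '-PTnS' duration into seconds (assumed form)."""
    sign = -1.0 if text.startswith("-") else 1.0
    return sign * float(text.lstrip("-").lstrip("PT").rstrip("S"))

def media_profiles(xml_text):
    """Collect (URI, offset, time scale) for each Variation element."""
    root = ET.fromstring(xml_text)
    profiles = []
    for var in root.iter("Variation"):
        profiles.append({
            "uri": var.find(".//MediaUri").text,
            "offset": parse_offset(var.get("timeOffset")),
            "timeScale": float(var.get("timeScale")),
        })
    return profiles
```

Given the Table 1 content, this would yield offsets of 0, 3, −2, and 10 seconds for the four slave variations, and a time scale of 0.5 for the MOV file.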
FIG. 6 shows an example of two multimedia contents and their associated metadata. Since the first multimedia content has five variations and the second has three variations, there are five media profiles in the metadata of the first multimedia content 602, and three media profiles in the metadata of the second 604. In FIG. 6, the two subscripts attached to identifiers of variations, URIs, URLs or the like, and offsets represent a specific variation of a multimedia content. For example, the third variation of the first multimedia content 610 has the associated media profile 612 in the metadata of the first multimedia content 602. The media profile 612 provides the values of a URI and an offset of the third variation of the first multimedia content 610. - When a user at the client terminal wants to make a multimedia bookmark for a multimedia content having multiple variations, the following steps are taken. First, the user selects one of several variations of the multimedia content from a list of the variations and starts to play the selected variation from the beginning. When the user makes a multimedia bookmark on the selected variation, which now becomes a bookmarked file, a bookmark system stores the following positional information along with content information in the multimedia bookmark:
-
- a. A URI of the bookmarked file;
- b. A bookmarked position within the bookmarked file; and
- c. A metadata identification (ID) of the bookmarked file.
The metadata ID may be a URI, URL or the like of the metafile or an ID of the database object containing the metadata. The user then continues or terminates playing of the variation.
-
FIG. 7 shows an example of a list of bookmarks 702 for the variations of the two multimedia contents in FIG. 6. The list contains the first and second bookmarks, which were made on variations of the first multimedia content, and the third to fifth bookmarks, which were made on variations of the second multimedia content. - When a user wants to play the multimedia content from a saved bookmark position, the following steps are taken. The user selects one of the saved multimedia bookmarks from the user's bookmark list. The user can also select a variation from the list of possible variations. The selected variation now becomes a playback file. The bookmark system then checks whether the selected bookmarked file is equal to the playback file or not. If they are not equal, the bookmark system adjusts the saved bookmarked position in order to obtain an accurate playback position on the playback file. This adjustment is performed by using the offsets saved in a metafile and the bookmarked position saved in a multimedia bookmark. Assume that Pb is a bookmarked position of a bookmarked file, and Pp is the desired position (adjusted bookmark position) of the playback file. Also, let ob and op be the offsets of the bookmarked and playback files, respectively. Further, let sb and sp be the time scale factors of the bookmarked and playback files, respectively, and let s = sp/sb be a time scale ratio which converts a media time of the bookmarked file into the media time with respect to the playback file when multiplied by that media time. Then, Pp can be computed using the following formula:
Pp = s×Pb if op = s×ob (i)
Pp = s×Pb + (|op| + |s×ob|) if op > 0 > s×ob (ii)
Pp = s×Pb + |op − s×ob| if op > s×ob ≧ 0 or 0 ≧ op > s×ob (iii)
Pp = s×Pb − (|op| + |s×ob|) if s×ob > 0 > op (iv)
Pp = s×Pb − |op − s×ob| if 0 ≦ op < s×ob or op < s×ob ≦ 0 (v)
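In every case the correction added to s×Pb is the signed difference between the playback offset and the scaled bookmarked offset, so the five cases are equivalent to the single expression Pp = s×Pb + (op − s×ob). A minimal sketch of that observation, assuming positions and offsets in seconds (parameter names mirror Pb, ob, op, sb, and sp in the text):

```python
# The five cases (i)-(v) collapse to Pp = s*Pb + (op - s*ob),
# with s = sp/sb the time scale ratio.
def adjust_position(p_b, o_b, o_p, s_b=1.0, s_p=1.0):
    s = s_p / s_b  # time scale ratio
    return s * p_b + (o_p - s * o_b)

# A bookmark at 60 s in the master (offset 0) maps to 63 s in a variation
# whose offset is +3 s (that variation has 3 s of extra leading material).
assert adjust_position(60.0, 0.0, 3.0) == 63.0
# With a half-rate playback file (time scale 0.5) whose offset is +10 s:
assert adjust_position(60.0, 0.0, 10.0, s_b=1.0, s_p=0.5) == 40.0
```

The single-expression form makes the sign analysis in the five cases unnecessary in an implementation, since the absolute values only spell out the sign combinations.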
FIG. 8 shows the five distinct cases (802, 804, 806, 808, 810) illustrating the above formula. In FIG. 8, the time scale factors of the bookmarked and playback files are both assumed to be the same, making the time scale ratio one, that is, s = 1. In the above example, one offset is assumed for each slave file. In general, however, there may be a list of offset values for each slave file, for cases where frame skipping occurs during the encoding of the slave file or where part of the slave file is edited. - This durable multimedia bookmark is now explained with the examples in
FIGS. 6 and 7. Suppose that a user wants to play back the third variation 610 of the first multimedia content in FIG. 6 from the position stored in the second bookmark 706 in FIG. 7. The second bookmark 706 was made with reference to the first variation 606 of the first multimedia content in FIG. 6. Note that the bookmarked file 606 is not equal to the playback file 610. Using the metadata ID saved in the bookmark, the bookmark system accesses the metadata of the first multimedia content 602. From the metadata, the system reads the media profiles of the first variation 608 and the third variation 612. Using the offsets saved in the two profiles and the bookmarked position saved in the multimedia bookmark, the system adjusts the bookmarked position, thus obtaining a correct playback position of the playback file. - Offset Computation
- In
FIG. 5, an offset of a slave file is defined as the difference between the start position of a master file and the start position of a slave file. This offset calculation requires locating a referential segment, for example, the segment A 514 in FIG. 5. After aligning the start position of the referential segment from the master file with the start position of the same referential segment from the slave file, the offset is calculated as the start time of the master file minus the start time of the slave file. - A referential segment may be any multimedia segment bounded by two different time positions. In practice, however, a segment bounded between two specific successive shot boundaries, in the case of a video, is frequently used as a referential segment. Thus, the following method may be used to determine a referential segment:
-
- 1. Locate the first two shot boundaries from the beginning of each of the master and the slave file using a technique of shot boundary detection;
- 2. Check whether the starting frame at the first shot detected from the master file is visually similar to the corresponding frame detected from the slave file using a content-based frame/video matching technique. Check whether the same is true for the ending frames of the shots, too; and
- 3. Determine the segment satisfying the conditions in 1) and 2) and let it be the referential segment.
The method of choosing a referential segment is not limited to the procedure mentioned above. There may be other procedures within the framework of the above method of automatic detection of a referential segment and computation of an offset based on the referential segment detected.
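The steps above can be sketched in a few lines. This is an illustrative sketch only: frames are assumed to be pre-reduced to (timestamp, color histogram) pairs, only the first detected boundary is aligned, and the threshold values and helper names are assumptions rather than values from this description.

```python
def l1(a, b):
    """L1 distance between two feature vectors (e.g., color histograms)."""
    return sum(abs(x - y) for x, y in zip(a, b))

def shot_boundaries(frames, cut_threshold):
    """Indices where consecutive histograms differ sharply (a cut)."""
    return [i for i in range(1, len(frames))
            if l1(frames[i - 1][1], frames[i][1]) > cut_threshold]

def referential_offset(master, slave, cut_threshold=1.0, match_threshold=0.2):
    """Locate the first shot boundary in each file; if the frames at the
    two boundaries match visually, return the slave's boundary time minus
    the master's (positive when the slave has extra leading material)."""
    mb = shot_boundaries(master, cut_threshold)
    sb = shot_boundaries(slave, cut_threshold)
    if not mb or not sb:
        return None  # no referential segment detected
    m_i, s_i = mb[0], sb[0]
    if l1(master[m_i][1], slave[s_i][1]) <= match_threshold:
        return slave[s_i][0] - master[m_i][0]
    return None
```

With the start positions measured against the source content, the same scene appears later in a slave that leads the master, so the boundary-time difference reproduces the master-start-minus-slave-start convention defined above.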
- User Interface and Flow Chart
-
FIG. 9 shows an example of a user interface incorporating the multimedia bookmark of the present invention. The user interface 900 is composed of a playback area 912 and a bookmark list 916. Further, the playback area 912 is also composed of a multimedia player 904 and a variation list 910. The multimedia player 904 provides various buttons 906 for normal VCR (Video Cassette Recorder) controls such as play, pause, stop, fast forward and rewind. Also, it provides another add-bookmark control button 908 for making a multimedia bookmark. If a user selects this button while playing a multimedia content, a new multimedia bookmark having both positional and content information is saved in a persistent storage. Also, in the bookmark list 916, the saved bookmark is visually displayed with its content information. For example, a spatially reduced thumbnail image corresponding to the temporal location of interest saved by a user in the case of a multimedia bookmark is presented to help the user to easily recognize the previously bookmarked content of the video. - In the
bookmark list 916, every bookmark has five bookmark controls just below its visually displayed content information. The left-most play-bookmark control button 918 is for playing a bookmarked multimedia content from a saved bookmarked position. The delete-bookmark control button 920 is for managing bookmarks; if this button is selected, the corresponding bookmark is deleted from the persistent storage. The add-bookmark-title control button 922 is used to input a title of the bookmark given by the user. If this button is not selected, a default title is used. The search control button 924 is used for searching a multimedia database for multimedia contents relevant to the selected content information 914 as a multimedia query input. There are a variety of cases in which this control might be selected. For example, when a user selects the play-bookmark control to play a saved bookmark, the user might find that the multimedia content being played does not accord with the displayed content information, due to mismatches of positional information for some reason. Further, the user might want to find multimedia contents similar to the content information of the saved bookmark. The send-bookmark control button 926 is used for sending both the positional and content information saved in the corresponding bookmark to other people via e-mail. It should be noted that the positional information sent via e-mail includes either a URI or other locator, and a bookmarked position. - For durable bookmarks, the variation list 910 provides the possible variations of a multimedia content with corresponding check boxes. Before a normal playback or a bookmarked playback, a user selects a variation by checking the corresponding box. If the multimedia content does not have multiple variations, this list may not appear in the user interface.
-
FIG. 10 is an exemplary flow chart illustrating the overall method 1000 of saving and retrieving multimedia bookmarks, with two additional functions: i) searching for other multimedia content relevant to the content pointed to by the bookmark, and ii) sending a bookmark to another person via e-mail. In the multimedia process, step 1002, if a user wants to play the multimedia content (step 1004), the multimedia player is first displayed to the user in step 1006. A check is made in step 1008 to determine if multiple variations of the multimedia content are available. If so, two extra steps are taken: in step 1010, the variation list is presented to the user, and (optionally) a default variation is selected in step 1012. Thereafter, in step 1014, the list of multimedia bookmarks is displayed to the user by using their content information and bookmark controls. The method then waits for the user to select a control in step 1016. A check is made to determine if the user wants to change the variation, step 1018. If so, the user can select another variation, step 1020. Thereafter, in step 1022, a check is made to determine if the user has selected one of the conventional VCR-type controls (e.g., play, pause, stop, fast forward, and rewind) or one of the bookmark-type controls (add-bookmark, play-bookmark, delete-bookmark, add-bookmark-title, search, and send-bookmark). If the user selects a conventional control button, execution of the method jumps to the selected function 1024. Otherwise, if the user selects one of the controls related to the bookmarks (1026, 1030, 1034, 1038, 1042, and 1046), the program goes to the corresponding routine (1028, 1032, 1036, 1040, 1044, and 1048), respectively. Until different multimedia content is selected (step 1004), the multimedia player with the variation list and the bookmark list will continue to be displayed (steps …). -
FIG. 11 is a flow chart illustrating the process of adding a multimedia bookmark. When the add-bookmark control is selected (step 1026 of FIG. 10), execution of the method proceeds to step 1028 of FIG. 11. In this portion 1100 of the method of the present invention, the multimedia playback is suspended in step 1102. Then, the URI, URL or similar address is obtained in step 1104. A check is made in step 1106 to determine if information on the bookmarked position, such as a time code, is available for the currently suspended multimedia content. If so, execution moves to step 1108, where the bookmarked position is obtained. In step 1110, the bookmarked position data, if available, are used to capture, sample or derive audio-visual features of the suspended multimedia content at the bookmarked position. In step 1112, a check is made to determine if the metadata exist. If not, execution jumps to step 1124, where the URI (or the like), the bookmarked position, and the audio-visual features are stored in persistent storage. Otherwise (i.e., the metadata of the suspended multimedia content exist), a search is conducted to find a segment corresponding to the bookmarked position in the metadata in step 1114. Next, a check is made to determine if annotated text is available for the segment. If so, the annotated text is obtained in step 1118. If not, step 1118 is skipped and execution resumes at step 1120, where a check is made to determine if there are media profiles that contain offset values of the suspended multimedia content. If so, step 1122 is performed, where a metadata ID is obtained in order to adjust the bookmarked position in future playback. Otherwise, step 1122 is skipped and the method proceeds directly to step 1124, where the annotated text and the metadata ID, if obtained, are also stored in persistent storage. Then, in step 1126, the list of multimedia bookmarks is redisplayed with their content information and bookmark controls.
The multimedia playback is resumed in step 1128, and execution of the method moves to a clearing-off routine 1610 (of FIG. 16) that is performed at the end of every bookmark control routine. - In the clearing-off routine 1610, illustrated in FIG. 16, a check is made in step 1612 to determine if the user wants to play back different multimedia content. If so, the method returns to step 1002 (see FIG. 10), where another multimedia process begins. Otherwise, the method resumes at step 1016 of FIG. 10, where the multimedia process waits for the user to select one of the conventional VCR or bookmark controls. -
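The record assembled by the add-bookmark routine of FIG. 11 can be sketched as a small data structure. The field names and the JSON file standing in for "persistent storage" are assumptions for illustration, not part of this description.

```python
import json
from dataclasses import dataclass, asdict
from typing import Optional

@dataclass
class MultimediaBookmark:
    uri: str                              # step 1104
    position: Optional[float] = None      # step 1108 (seconds, if available)
    features: Optional[list] = None       # step 1110, e.g. a color histogram
    annotated_text: Optional[str] = None  # step 1118 (if metadata exist)
    metadata_id: Optional[str] = None     # step 1122 (if media profiles exist)

def save_bookmarks(bookmarks, path):
    """Persist the bookmark list (step 1124), here as a JSON file."""
    with open(path, "w") as f:
        json.dump([asdict(b) for b in bookmarks], f)

def load_bookmarks(path):
    """Reload the bookmark list for redisplay (step 1126) or playback."""
    with open(path) as f:
        return [MultimediaBookmark(**d) for d in json.load(f)]
```

Note that every field after the URI is optional, mirroring the flow chart's checks: each of steps 1106, 1112, and 1120 may skip the corresponding piece of information.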
FIG. 12 is a flow chart illustrating the process of playing a multimedia bookmark. When the play-bookmark control is selected by the user in step 1030 (see FIG. 10), step 1032 is invoked. In step 1202 (see FIG. 12), the URI or the like, the bookmarked position, and the metadata ID for the multimedia content to be played back are read from persistent storage. A check is made in step 1204 to determine if the URI of the content is valid. If not, execution of the method shifts to step 1044 (see FIG. 10), where the process of the content-based and/or text-based search begins. The URI of the content becomes invalid when the multimedia content is moved to another location, for example. If the URI of the content is valid (the result of step 1204 is positive), a check is made to determine if the bookmarked position is available. If not, a check is made to determine if the user desires the content-based and/or text-based search in step 1208. If so, execution moves to step 1044 (see FIG. 10). Otherwise, the method moves to step 1210, where the user can simply play the multimedia content from the beginning. If the URI of the content is valid and the bookmarked position is available (i.e., the results of both checks are positive), a check is made in step 1212 to determine if the metadata ID is available. If it is not available, the multimedia playback starts from the bookmarked position in step 1222. Otherwise, the bookmarked and playback files are identified in step 1214, and the values of their respective offsets are read from the metadata in step 1216. Then, in step 1218, the bookmarked position is adjusted by using the offsets. The multimedia playback starts from the adjusted bookmarked position in step 1220. After starting one of the playbacks (1210, 1220, or 1222), the method executes the clearing-off routine in step 1610 of FIG. 16. -
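The branching of FIG. 12 condenses to a short decision function. This is a sketch under stated assumptions: the uri_is_valid callable, the dictionary fields, and the action strings stand in for the real checks and playback routines.

```python
# Which action FIG. 12 would take for a stored bookmark record.
def playback_action(bookmark, uri_is_valid, prefer_search=False):
    if not uri_is_valid(bookmark["uri"]):       # step 1204 negative
        return "search"                         # jump to step 1044
    if bookmark.get("position") is None:        # step 1206 negative
        # step 1208: search if the user asks for it, else play from start
        return "search" if prefer_search else "play_from_start"
    if bookmark.get("metadata_id") is None:     # step 1212 negative
        return "play_from_bookmarked_position"  # step 1222
    return "play_from_adjusted_position"        # steps 1214-1220
```

The last branch is where the offset adjustment of the earlier formula would be applied before seeking.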
FIG. 13 is a flow chart illustrating the process of deleting a multimedia bookmark. When the delete-bookmark control is selected (step 1034 of FIG. 10), the method invokes the routine illustrated in FIG. 13. In this particular portion 1300 of the method of the present invention, all positional and content information of the selected multimedia bookmark is deleted from the persistent storage in step 1302. Then, the list of multimedia bookmarks is redisplayed with their content information and bookmark controls in step 1304, and execution shifts to the clearing-off routine, step 1610 of FIG. 16. -
FIG. 14 is a flow chart illustrating the process of adding a title to a multimedia bookmark. When the add-bookmark-title control is selected (step 1038 of FIG. 10), the program goes through this portion 1400 of the method of the present invention. In this routine, the user will be prompted to enter a title in step 1402 for the saved multimedia bookmark. A check is made to determine if the user entered a title in step 1404. If not, the program may provide a default title in step 1406 that may be made in accordance with a predetermined routine. In any case, execution proceeds to step 1408, where the list of multimedia bookmarks is redisplayed with their content information, including the titles, and bookmark controls. Thereafter, the method executes the clearing-off routine of step 1610 of FIG. 16. -
FIG. 15 is a flow chart illustrating the portion 1500 of the present invention for searching for relevant multimedia content based on the audio-visual features, as well as the textual features, saved in a multimedia bookmark, if available. The search methods currently available can be largely categorized into two types: content-based search and text-based search. Most of the prior art search engines utilize a text-based information retrieval technique. The present invention also employs content-based multimedia search engines, which use, for example, retrieval techniques based on visual and audio characteristics or features such as color histograms and audio spectra. The content information of a particular segment, stored in a multimedia bookmark, may be used to find other relevant information about the particular segment. For example, a frame-based video search may be employed to find other video segments similar to the particular video segment. - Alternatively, a text-based search may be combined with a frame-based video search to improve the search result. Most frame-based video search methods are based on comparing low-level features such as colors and texture. These methods lack the semantics necessary for recognition of high-level features. This limitation may be overcome by combining a text-based search. Most available multimedia contents are annotated with text. For example, video segments showing President Clinton may be annotated with “Clinton.” In that case, a combined search using the image of Clinton wearing a red shirt as a bookmark may find other video segments containing Clinton, such as a segment showing Clinton wearing a blue shirt.
- When the user selects as a query input a particular bookmark or partial segment of the multimedia content such as a thumbnail image in the case of a video search, the search routine (1044 of
FIG. 15 ) is invoked in the following three scenarios: -
- i. The user selects search control (
step 1042 ofFIG. 10 ) in order to retrieve the multimedia content relevant to the query; - ii. The URI of the bookmarked multimedia content is not valid (the result of
step 1204 ofFIG. 12 is negative); and - iii. The URI of the bookmarked multimedia content is valid, but the bookmarked position is not available (the result of
step 1206 ofFIG. 12 is negative and the result ofstep 1208 is positive).
- i. The user selects search control (
- Once invoked, this
portion 1500 reads, in step 1502, the content information of the multimedia bookmark, such as the audio-visual and textual features of the query input, and the positional information, if available, from persistent storage. Examples of visual features for the multimedia bookmark include, but are not limited to, captured frames in the JPEG image compression format or color histograms of the frames. - In
step 1504, a check is made to determine if the annotated texts are available. If so, the annotated text is retrieved directly from the content information of the bookmark in step 1506, and execution proceeds immediately to step 1516, where the text-based multimedia search is performed by using the annotated texts as query input, resulting in multimedia segments having texts relevant to the query. If the result of step 1504 is negative, the annotated texts can also be obtained by accessing the metadata, using the positional information. Thus, a check is made in step 1508 to determine if the positional information is available. If so, another check is made to determine if the metadata exist in step 1510. If so (i.e., the result of step 1510 is positive), step 1512 is executed, where a segment corresponding to the bookmarked position in the metadata is found. A check is then made to determine if annotated texts for the segment are available in step 1514. If so (i.e., the result of step 1514 is positive), the text-based multimedia search is likewise performed in step 1516. If the annotated texts or the positional information is not available from the content information of the bookmark (i.e., the result of step 1514 is negative) or from the metadata (i.e., the result of step 1510 is negative), then a content-based multimedia search is performed by using the audio-visual features of the bookmark as query input in step 1518. The result of step 1518 is that the resulting multimedia segments have audio-visual features similar to the query. It should be noted that both the text-based multimedia search (step 1516) and the content-based multimedia search (step 1518) can be performed in sequence, thus combining their results. Alternatively, one search can be performed based on the results of the other search, although these options are not presented in the flow chart of FIG. 15. - The audio-visual features of the retrieved segments at their retrieved positions are computed in
step 1520 and temporarily stored, both to visually show the search results in step 1522 and to be used as query input to another search if desired by the user. If the result of step 1524 is positive, the user selects a retrieved segment in step 1526, and plays back the segment from its beginning in step 1528. The beginning of the retrieved segment that was selected is called the retrieved position in either step 1528 or step 1508. If the user wants another search (i.e., the result of step 1530 is positive), the user selects one of the retrieved segments in step 1532. Then, the content information, including the audio-visual features and annotated texts for the selected segment, is obtained by accessing the temporarily stored audio-visual features and/or the corresponding metadata in step 1534, and the new search process begins at step 1504. If the user wants no more playbacks and searches, execution is transferred to the clearing-off routine, step 1610 of FIG. 16. - Depending on the kind of information available in the multimedia bookmark, there can be a handful of client-server-based search scenarios; the multimedia bookmark of the present invention is an excellent example. With the combinations of multimedia bookmark information tabulated in Table 2, some examples of the client-server-based search scenario are described below. Note that even though the text-based search is used in the description of the present invention, the user does not type in keywords to describe the video that the user seeks. Moreover, the user might be unaware of performing a text-based search; the present invention is designed to hide this cumbersome process of keyword typing from the user.

TABLE 2 Search types with available bookmark information

  Search Type    Captured Image    Positional Info.    Annotated Text
  A              ✓
  B                                ✓
  C                                                    ✓
  D              ✓                 ✓
  E              ✓                                     ✓
  F                                ✓                   ✓
  G              ✓                 ✓                   ✓
-
- 1. When a user at a client side selects a bookmarked image, the client sends the image data to the server as a query frame.
- 2. The server finds the segment containing the query frame using a frame-based video search.
- 3. The server checks if the segment has annotated text. If so, go to
step 4. Otherwise, provide the user with the result of the frame-based video search and terminate. - 4. The server performs a text-based video search using the annotated text as keywords.
- 5. Provide the user with the combined results of the frame-based search in
step 2 and the text-based search instep 4.
- Search Type B: The multimedia bookmark has only positional information.
-
- 1. When a user at a client side selects a multimedia bookmark, the client sends the position information about the image to the server.
- 2. The server performs a frame-based video search, using as a query frame the frame corresponding to the specified position.
- 3. The server checks if the segment at the specified position has annotated text. If so, go to
step 4. Otherwise, provide the user with the result of the frame-based video search and terminate. - 4. The server performs a text-based video search using the annotated text as keywords.
- 5. Provide the user with the combined results of
steps
- Search Type C: The multimedia bookmark has only annotated text. When a sever at a client side selects a multimedia bookmark, the client sends the annotated text to the server.
-
- 1. The server performs a text-based video search using the annotated text as keywords.
- 2. Provide the user with the result of
step 2.
- Search Type D: The multimedia bookmark has both image and positional information. This type of search can be implemented in the way of either Search Type A or B.
- Search Type E: The multimedia bookmark has both image and annotated text.
-
- 1. When a user at a client side selects a bookmark image, the client sends the image data and the annotated text to the server.
- 2. The server performs a frame-based video search using the image as a query image.
- 3. The server performs a text-based video search using the annotated texts as search keywords. Note that the execution order of steps 2 and 3 may be reversed.
- 4. Provide the user with the combined results of steps 2 and 3.
- Search Type F: The multimedia bookmark has both positional information and annotated text.
-
- 1. When a user at a client side selects a multimedia bookmark, the client sends the positional information and the annotated texts to the server.
- 2. The server performs a frame-based video search, using the frame corresponding to the specified position as a query frame.
- 3. The server performs a text-based video search using the annotated texts as search keywords. Note that the execution order of steps 2 and 3 may be reversed.
- 4. Provide the user with the combined results of steps 2 and 3.
- Search Type G: The multimedia bookmark has all the information: image, position, and annotated text. This type of search can be implemented in the way of either Search Type E or F.
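Taken together, Search Types A through G reduce to a dispatch on which bookmark fields are present. The sketch below illustrates that dispatch; `search_by_bookmark`, `frame_search`, `text_search`, and the field names are hypothetical stand-ins for the server's actual routines, not part of the patent.

```python
# Illustrative dispatch over Search Types A-G (Table 2). Function and
# field names are assumptions for this sketch.

def search_by_bookmark(bookmark, frame_search, text_search):
    """Combine frame-based and text-based search results according to
    which bookmark fields (image, position, annotated text) exist."""
    results = []
    # Types A, B, D (and the frame-based half of E, F, G): an image or
    # a position yields a query frame for the frame-based video search.
    if bookmark.get("image") is not None:
        results += frame_search(bookmark["image"])
    elif bookmark.get("position") is not None:
        results += frame_search(bookmark["position"])
    # Types C, E, F, G: annotated text drives a text-based video search.
    if bookmark.get("text"):
        results += text_search(bookmark["text"])
    return results
```

For Types A and B the server would additionally fall back to the segment's annotated text when it exists, as steps 3 through 5 above describe.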
-
FIG. 16 is a flow chart illustrating the method of sending a bookmark to other people via e-mail. When the send-bookmark control is selected (step 1046 of FIG. 10), step 1048 of FIG. 16 is invoked. According to the method of FIG. 16, all saved bookmark information, including the URI, the bookmarked position and metadata ID, and the audio-visual and textual features of a selected multimedia bookmark to be sent, is read from the persistent storage in step 1602. Then, in step 1604, the user is prompted to enter the input needed to send an e-mail to another individual or a group of people. If all of the necessary information is input by the user in step 1606, the e-mail is sent to the designated persons with the bookmark information in step 1608. At this point, the method goes into the clearing-off routine, step 1610, which may also be entered from several other portions of the method shown in FIGS. 11, 12, 13, 14, and 15. As shown in FIG. 16, a check is made in step 1612 to determine if other multimedia contents are available. If so, execution of the method is transferred to step 1002 of FIG. 10. Otherwise, execution of the method is transferred to step 1016 of FIG. 10. - The multimedia bookmark may consist of the following bookmarked information:
-
- 1. URI of a bookmarked file;
- 2. Bookmarked position;
- 3. Content information such as an image captured at a bookmarked position;
- 4. Textual annotations attached to a segment which contains the bookmarked position;
- 5. Title of the bookmark;
- 6. Metadata identification (ID) of the bookmarked file;
- 7. URI of an opener web page from which the bookmarked file started to play; and
- 8. Bookmarked date.
The bookmarked information includes not only positional information (1 and 2) and content information (3, 4, 5, and 6) but also other useful information, such as the opener web page and the bookmarked date.
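The eight bookmarked fields enumerated above map naturally onto a small record type. A minimal sketch follows; the field names and types are illustrative assumptions, since the patent does not prescribe a storage layout.

```python
# Hypothetical record for the bookmarked information (fields 1-8 above).
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class MultimediaBookmark:
    uri: str                                  # 1. URI of the bookmarked file
    position: float                           # 2. bookmarked position (e.g. seconds)
    captured_image: Optional[bytes] = None    # 3. image captured at the position
    annotations: list = field(default_factory=list)  # 4. textual annotations
    title: str = ""                           # 5. title of the bookmark
    metadata_id: Optional[str] = None         # 6. metadata ID of the bookmarked file
    opener_uri: str = ""                      # 7. URI of the opener web page
    bookmarked_date: str = ""                 # 8. bookmarked date
```

A record like this could be serialized for client-side storage or for e-mail delivery as discussed below.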
- The content information can be obtained at the client or server side while the corresponding multimedia content is being played in a networked environment. In the case of a multimedia bookmark, for example, the image captured at a bookmarked position (3) can be obtained from the user's video player or from a video file stored at a server. The title of a bookmark (5) may be obtained at the client side if the user types in his own title; otherwise, a default title, such as the title of the bookmarked file stored at a server, can be used. The textual annotations attached to the segment containing the bookmarked position are stored in metadata, in which the offsets and time scales of variations for the durable bookmark also exist. Thus, the textual annotations (4) and the metadata ID (6) are obtained at a server.
- The bookmarked information can be stored at a client's or server's storage regardless of where it was obtained, and the user can send it to others via e-mail. When the bookmarked information is stored at a server, sending it via e-mail is simple: just send a link to the bookmarked information stored at the server. But when the bookmarked information is stored at a user's storage, the user has to send all of the information to the other party via e-mail. The delivered bookmarked information can then be stored at the receiver's storage, and the bookmarked multimedia content starts to play exactly from the bookmarked position. The bookmarked multimedia content can also be replayed at any time the receiver wants.
- Some of the content information, such as a captured image, is itself multimedia data, while all the other information, including the positional information, is textual data. Both forms of the bookmarked information stored at a user's storage are sent to the other person within a single e-mail. There are two possible methods of sending the information from one user to another via e-mail:
-
- 1. Using watermarking technology: All textual information can be encoded into the content information. In the case of a multimedia bookmark, all textual information, such as the URL of a video file and the bookmarked position expressed as a time code, can be encoded into a thumbnail image captured at the bookmarked position. With watermarking, the image encoded with the text can be visually almost identical to the original image. The encoded image can be attached to any e-mail message. The image delivered with the message can then be decoded, and the separated image and text saved at the receiver's storage.
- 2. Using a HyperText Markup Language (HTML) document: An HTML document can be sent via e-mail, and all textual parts of the bookmarked information can be directly included in it. The captured image of a multimedia bookmark, however, cannot be directly included in the HTML document, because the image is represented in a binary file format. Sending the binary image within an HTML document becomes possible by converting the binary image into a text string with an encoder, such as Base-16 or Base-64, and including it directly in the HTML document as a normal character string. The converted image is called inline media, by which one can embed any multimedia file in an HTML document. When the HTML document is sent to another user, the included text-encoded image is decoded into a binary image, which is then saved at the user's storage and displayed on the screen. The receiving user may not view the detailed information, but can play the multimedia content from the bookmarked position. Table 3 is a sample HTML document that includes both the captured content image and the rest of the textual bookmarked information.
TABLE 3 An example of an HTML document holding bookmarked information

<Html>
 <Body>
  <Object id="IMDisplay"
          codebase="http://www.server.com/BookmarkViewer"
          classid="CLSID:FFD1F137-722C-46B7" VIEWASTEXT>
   <Param name="BookmarkedFile" value="mms://www.server.com/sample.mpg">
   <Param name="BookmarkedPosition" value="435.78705499999995">
   <Param name="OpenerURL" value="http://www.server.com/sample..html">
   <Param name="BookmarkTitle" value="Sample Title">
   <Param name="BookmarkDate" value="July 24">
   <!-- Inline media: character coded binary image -->
   <Param name="CapturedImage"
          value="/9j/4AAQSkZJRgABAQAAAQABAAD/2wBDAAMCAgMCAgMDAwMEAwMEBQgFBQQEBQoHBwYIDAoMDAsKCwsNDhIQDQ4RDgsLEBYQERMUFRUVDA8XGBYUGBIUFRT/2wBDAQMEBAUEBQkFBQkUDQsNFBQUFBQUFBQUFBQUFBQUFBQUF/ . . . . xXluhEakJ9+7Db8blCELwzAvsfiP4htpVE9yHtY12pawxwoI0MqyFUwhCrjeoUAAB8AYGD4lRR7Fdyrva59E6f+0F4s0HV7bXNHvDp2twwJb29zb29vGsK7JUkEapEMKyugKsSD5eW3fKEx/GfxZ8WfEOx0W28SarJq0GkI0NgJ4o1aGNipZA4UMV+UEAk4xk58OooVLl0uFz//2Q==">
  </Object>
 </Body>
</Html>
-
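The inline-media conversion used for the `CapturedImage` value above is ordinary Base-64 text encoding. A minimal round-trip sketch (function names are assumptions):

```python
import base64

def image_to_inline(image_bytes: bytes) -> str:
    """Encode a binary captured image as a Base-64 character string
    suitable for embedding as an HTML <Param> value (inline media)."""
    return base64.b64encode(image_bytes).decode("ascii")

def inline_to_image(text: str) -> bytes:
    """Decode the inline string back to the original binary image at
    the receiving side, for saving to local storage and display."""
    return base64.b64decode(text)
```

The receiver's mail or bookmark program would decode the string and write the resulting bytes to local storage, as FIG. 17 describes.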
FIG. 17 is an exemplary flow chart illustrating the process of saving a multimedia bookmark at a receiving user's local storage. When a user invokes his e-mail program in step 1704, the user selects a message to read in step 1706. A check is made in step 1708 to determine if the message includes a multimedia bookmark. If not, execution is moved to step 1706, where the user selects another message to read. Otherwise, another check is made in step 1710 to determine if the user wants to play the multimedia bookmark by selecting a play control button, which appears within the message. If not, execution is also moved to step 1706, where the user selects another message to read. Otherwise, in step 1712, a multimedia bookmark program having a user interface such as that illustrated in FIG. 9 is invoked. In step 1714, the delivered bookmark information included in the message is saved at the user's persistent storage, thus adding the delivered multimedia bookmark to the user's list of local multimedia bookmarks. Then, in step 1716, content information of the saved multimedia bookmark can appear in the multimedia bookmark program. Next, the play-bookmark control is internally selected in step 1718. Execution is then moved to step 1032 of FIG. 12. - Sending Messages to Mobile Devices
- Short Message Service (SMS) is a wireless service enabling the transmission of short alphanumeric messages to and from mobile phones, facsimile machines, and/or IP addresses. The method of the present invention, which provides for sending a multimedia bookmark between an IP address and a mobile phone, and also between mobile phones, is based on the SMS architecture and technologies.
-
FIG. 18 illustrates the basic elements of this embodiment of the present invention. Specifically, the video server VS 1804 of the server network 1802 is responsible for streaming video over wired or wireless networks. The server network 1802 also has the video database 1806, which is operably connected to the video server 1804. - The multimedia bookmark message service center (VMSC) 1818 acts as a store-and-forward system that delivers a multimedia bookmark of the present invention over mobile networks. The multimedia bookmark sent by a user PC 1810, either stand-alone or part of a local area network 1808, is stored in VMSC 1818, which then forwards it to the destination mobile phone 1828 when the mobile phone 1828 is available for receiving messages. - The gateway to the mobile switching center (GWMSC) 1820 is a mobile network's point of contact with other networks. It receives a short message, such as a multimedia bookmark, from the VMSC, queries the HLR for routing information, and forwards the message to the MSC nearest the recipient mobile phone. - The home location register (HLR) 1822 is the main database in the mobile network. The HLR 1822 retains information about subscriptions and service profiles, as well as routing information. Upon request by the GWMSC 1820, the HLR 1822 provides the routing information for the recipient mobile phone 1828 or personal digital assistant 1830. The mobile phone 1828 is typically a mobile handset. The PDA 1830 includes, but is not limited to, small handheld devices, such as a Blackberry, manufactured by Research in Motion (RIM) of Canada. - The mobile switching center (MSC) 1824 switches connections between mobile stations or between mobile stations and other telephone and data networks (not shown).
- Sending a Multimedia Bookmark to a Mobile Phone from a PC
-
FIG. 19 illustrates the method of the present invention for sending a multimedia bookmark from a personal computer to a mobile telephone over a mobile network. In step 1 of FIG. 19, the personal computer submits a multimedia bookmark to the VMSC 1918. Next, in step 2, the VMSC 1918 returns an acknowledgement to the PC 1910, indicating the reception of the multimedia bookmark. In step 3, the VMSC 1918 sends a request to the HLR 1922 to look up the routing information for the recipient mobile. Then the HLR 1922 sends the routing information back to the VMSC 1918 in step 4. In step 5, the VMSC 1918 invokes the operation to send the multimedia bookmark to the MSC 1924. Then, in step 6, the MSC delivers the multimedia bookmark to the mobile phone 1928. In step 7, the mobile phone 1928 returns an acknowledgement to the MSC 1924. Then, in step 8, the MSC 1924 notifies the VMSC 1918 of the outcome of the operation invoked in step 5. Incidentally, the method described above is equally applicable to personal digital assistants that are connected to mobile networks. - Sending a Multimedia Bookmark to a Mobile Phone from Another Mobile Phone
-
FIG. 20 illustrates an alternate embodiment of the present invention that enables the transmission of a multimedia bookmark from one mobile device to another. Referring to FIG. 20, the method begins at step 1, where the mobile phone 2028 submits a request to the MSC 2024 to send a multimedia bookmark to another mobile telephone customer. In step 2, the MSC 2024 sends the multimedia bookmark to the VMSC 2018. Thereafter, in step 3, the VMSC 2018 returns an acknowledgement to the MSC 2024. In step 4, the MSC 2024 returns to the sending mobile phone 2028 an acknowledgement indicating the acceptance of the request. In step 5, the VMSC 2018 queries the HLR 2022 for the location of the recipient mobile phone 2030. It should be noted that the sender or the recipient need not be a mobile telephone; the sending and/or receiving device could be any device that can send or receive a signal on a mobile network. In step 6 of FIG. 20, the HLR 2022 returns the identity of the destination MSC 2024 that is close to the recipient device 2030. Then the VMSC 2018 delivers the multimedia bookmark to the MSC 2024 in step 7. Then, in step 8, the MSC 2024 delivers the multimedia bookmark to the recipient mobile device 2030. In step 9, the mobile device 2030 returns an acknowledgement to the MSC 2024 for the acceptance of the multimedia bookmark. Finally, in step 10, the MSC 2024 returns to the VMSC 2018 the outcome of the request to send the multimedia bookmark. - Playing Video on a Mobile Handset or Other Mobile Device
-
FIG. 21 illustrates an alternate embodiment of the present invention for playing video sequences on a mobile device. Specifically, the method begins generally at step 1, where the mobile device 2128 submits a request to the MSC 2124 to play the video associated with the multimedia bookmark. In step 2, the MSC 2124 sends the request with the multimedia bookmark to the VMSC 2118. It is often the case that the video pointed to by the multimedia bookmark cannot be streamed directly to the mobile device 2128. For example, if the marked video is in a high bit rate format, the high bit rate video data might not be delivered properly due to the limited bandwidth available. Further, the video might not be properly decoded on the mobile device 2128 due to its limited computing resources. In that case, it is desirable to deliver a low bit rate version of the same video content to the mobile device 2128. However, a problem occurs when the position specified by the multimedia bookmark does not point to the same content in the low bit rate video. To solve the problem, prior to relaying the request to VS 2104, the VMSC 2118 decides which bit rate video is the most suitable for the current mobile device 2128. The VMSC 2118 also calculates the new marked location to compensate for the offset value due to the different encoding format or different frame rate needed to display the video on the mobile device 2128. After completing this internal decision and computation, in step 3, the VMSC 2118 sends the modified multimedia bookmark to the video server 2104, using the server IP address designated in the multimedia bookmark. Thereafter, in step 4, the video server 2104 starts to stream the video data down to the VMSC 2118. Subsequently, in step 5, the VMSC 2118 passes the video data to the MSC 2124. Then, in step 6, the MSC 2124 delivers the video data to the service requester, mobile device 2128. Steps 4 through 6 are repeated until the mobile device 2128 issues a termination request. - User History
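The VMSC's recomputation of the marked location can be illustrated with a simple frame-rate conversion. This sketch assumes the position is stored as a frame index and that the two renditions differ only by frame rate and a constant leading offset; `adapt_marked_frame` and its parameters are illustrative, not the patent's actual implementation.

```python
def adapt_marked_frame(frame_index: int, src_fps: float, dst_fps: float,
                       dst_offset_frames: int = 0) -> int:
    """Map a bookmarked frame index in the source rendition to the
    corresponding frame in a rendition with a different frame rate,
    optionally compensating a leading offset (e.g. a trimmed intro)."""
    time_s = frame_index / src_fps          # wall-clock time of the mark
    return round(time_s * dst_fps) + dst_offset_frames
```

For example, a mark at frame 300 of a 30 fps stream maps to frame 150 of a 15 fps low bit rate rendition of the same content.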
- The metadata associated with a multimedia bookmark include positional information and content information. The positional information can be a time code or byte offset that denotes the marked time point of the video stream. The content information consists of textual information (features) and audio-visual information. There are two types of textual information, depending on its source: i) a bookmark user and ii) the bookmark server. When a user makes a multimedia bookmark at a specific position of the video stream (generally, a multimedia file), i) the user can input the text annotation and other metadata that the user would like to associate with the bookmark, and/or ii) the multimedia bookmark system (server) delivers and associates the corresponding metadata with the bookmark. An example of metadata from the server is the textual annotation describing the semantic information of the bookmarked position of the video stream.
- Semantic annotation, description, or indexing is often performed by humans, since it is usually difficult to generate semantic metadata automatically using current state-of-the-art video processing technologies. The problem, however, is that the manual annotation process is time-consuming; furthermore, different people, even specialists, can describe the same video frames/segments differently.
- The present invention discloses an approach to solve the above problem by making use of (bookmark) users' annotations. It enables video metadata to be gradually populated with information from users over time. That is, the textual metadata for each video frame/segment are improved using a large number of users' textual annotations.
- The idea behind the invention is as follows. When a user makes a multimedia bookmark at a specific position, the user is asked to enter a textual annotation. If the user is willing to annotate it for his/her own later use, the user will describe the bookmark in his/her own words. This textual annotation is delivered to the server. The server collects and analyzes all the information from users for each video stream. Then the analyzed metadata, which essentially represent the common view/description among a large number of users, are attached to the corresponding position of the video stream.
- For each video stream, there is a queue of size N, called “relevance queue,” that keeps the textual annotation with the corresponding bookmarked position as shown in
FIG. 54. Specifically, FIG. 54 shows a relevance queue 5402 having an enqueue 5404 and a dequeue 5406 with one or more intermediate elements 5408. - The queue of
FIG. 54 is initially empty. When a user makes a multimedia bookmark at a specific position of the video stream (generally, a multimedia file), the user inputs the text annotation that he or she would like to associate with the bookmark. The text annotation is delivered to the server and is enqueued. For example, assume the first element of the queue 5404 for the golf video stream Va is "Tiger Woods; 01:21:13:29." A second user subsequently marks a new element at 01:21:17:00 (in hours:minutes:seconds:frames) of the same golf video stream Va and enters the keyword "Tee Shot." Then the first element is shifted to the second position, and the new input is entered into the relevance queue 5402 for the video stream Va at the enqueue 5404. This queue operation continues indefinitely. - Periodically, the
video indexing server 5410 analyzes each queue. Suppose, for instance, that the video stream is segmented into a finite number of time intervals using the automatic shot boundary detection method. The indexing server 5410 groups the elements inside the queue by checking their time codes, so that the time codes of each group fall within the time interval corresponding to one segment. For each group, the frequency of each keyword is computed, and the most frequent keywords are taken as new semantic text annotations for the corresponding segment. In this way, semantic textual metadata for each segment can be generated by leveraging a large number of users. - Application of User History to Text Search Engine
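The indexing server's analysis — group queued annotations by the segment containing their time code, then promote the most frequent keywords — can be sketched as follows; `analyze_relevance_queue` and its argument layout are illustrative assumptions.

```python
from collections import Counter

def analyze_relevance_queue(queue, segments, top_k=2):
    """Group (keyword, time_code_seconds) annotations by the segment
    whose [start, end) interval contains the time code, then promote
    each group's most frequent keywords to segment-level metadata.
    `segments` is a list of (start_s, end_s) intervals."""
    groups = {i: Counter() for i in range(len(segments))}
    for keyword, t in queue:
        for i, (start, end) in enumerate(segments):
            if start <= t < end:
                groups[i][keyword.lower()] += 1
                break
    # Keep only the top_k most frequent keywords per segment.
    return {i: [w for w, _ in c.most_common(top_k)]
            for i, c in groups.items()}
```

Run over the golf example above, repeated "Tiger Woods" annotations near the same time interval would surface as that segment's new semantic annotation.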
- When users make a bookmark for a specific URL like www.google.com, they can add their own annotations. Thus, if the text engine maintains a queue for each document/URL, it can collect a large number of users' annotations. Therefore, it can analyze the queue and find the most frequent words, which become new metadata for the document/URL.
- In this way, the search engine would continuously have users update and enrich its text databases. This would also help internationalize the process: users who are not native speakers of the language of a particular web site's content could annotate the content in their own language, helping their countrymen who search in that native tongue to find the site.
- Adaptive Refreshing
- The present invention provides a methodology and implementation for adaptive refresh rewinding, as opposed to traditional rewinding, which simply rewinds from a particular position by a predetermined length. For simplicity, the exemplary embodiment described below demonstrates the present invention using video data. Three essential parameters control the behavior of adaptive refresh rewinding: how far to rewind, how to select the refresh frames within the rewind interval, and how to present the chosen refresh frames on a display device.
- Rewind Scope
- The rewind scope specifies how far to rewind a video back toward its beginning. For example, it is reasonable to rewind to 30 seconds before the saved termination position, or to the last scene boundary viewed by the user. Depending on user preference, the rewind scope may be set to a particular value.
- Frame Selection
- Depending on when the set of refresh frames is determined, the selection can be static or dynamic. A static selection predetermines the refresh frames at the time of DB population or at the time of saving the termination position, while a dynamic selection determines the refresh frames at the time of the user's request to play back the terminated video.
- The candidate frames for user refresh can be selected in many different ways. For example, the frames can be picked out at random or at some fixed interval over the rewind interval. Alternatively, the frames at which a video scene change takes place can be selected.
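The three selection strategies above — random, fixed-interval, and scene-change-based — can be sketched in one hypothetical helper; the function name and parameters are assumptions, not the patent's implementation.

```python
import random

def select_refresh_frames(begin_s, end_s, method="fixed", n=4,
                          scene_changes=None, seed=None):
    """Pick refresh-frame time points within the rewind interval
    [begin_s, end_s) using one of three strategies."""
    if method == "random":
        # Random: n uniformly drawn points over the rewind interval.
        rng = random.Random(seed)
        return sorted(rng.uniform(begin_s, end_s) for _ in range(n))
    if method == "fixed":
        # Fixed interval: n evenly spaced points starting at begin_s.
        step = (end_s - begin_s) / n
        return [begin_s + i * step for i in range(n)]
    if method == "scene":
        # Scene change: keep detected boundaries inside the interval.
        return [t for t in (scene_changes or []) if begin_s <= t < end_s]
    raise ValueError(f"unknown method: {method}")
```

The `scene` branch corresponds to the static server-side method of FIG. 57, where scene boundaries are pre-computed at DB population time, and to the dynamic method of FIG. 59, where detection runs at playback time.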
- Frame Presentation
- Depending on the screen size of display devices, there might be two presentation styles: slide show and storyboard. The slide show is good for devices with a small display screen while the storyboard may be preferred with devices having a large display screen. In the slide show presentation, the frames keep appearing sequentially on the display screen at regular time intervals. In the storyboard presentation, a group of frames is simultaneously placed on the large display panel.
-
FIG. 55 illustrates an embodiment of the rewind aspect of the present invention. If a video is paused, terminated or otherwise interrupted during playback, the viewing user or the client system displaying the video preferably sends a request to mark the video at the point of interruption to the server delivering the multimedia content to the client device. As illustrated in FIG. 55, upon receipt of a request to mark, an instant between beginning 5504 and end 5518 of video or multimedia content 5502 is preferably selected as the video's termination or marked position 5514. Then, using marked position 5514 and metadata associated with the video or multimedia content, the server randomly selects a sequence of refresh frames from rewind interval 5516 for storage on a storage device. When the viewing user or client later initiates playback of the interrupted video, the server first delivers the sequence of refresh frames, after which the video or multimedia content 5502 resumes playback from termination or marked position 5514. -
FIG. 56 illustrates an alternate embodiment of the rewind aspect of the present invention. In this embodiment, upon interruption of multimedia content 5602, such as a video, having a length from beginning 5604 to end 5608, a request to mark the current location of the video is sent by the client system to the network server. Having preferably run a scene change detection algorithm over the video or multimedia content 5602 at the time of database population, the network server has already retained a list of scene change frames 5610, 5612, 5618, 5620, 5622, 5624, 5628 and 5632. Using the list of scene change frames 5610, 5612, 5618, 5620, 5622, 5624, 5628 and 5632, as well as the information associated with termination or marked position 5630, the network server is able to determine the sequence of refresh frames between viewing beginning position 5614 and termination position 5630, or alternatively, the rewind interval 5616. Once playback of the video or multimedia content 5602 is restarted, the network server preferably delivers to the client the sequence of selected refresh frames, after which playback of the video or multimedia content 5602 continues from termination position 5630. - A third embodiment of the method of the present invention may also be gleaned from
FIG. 56. In this embodiment, a request to mark the current location or termination position 5630 of the video is sent to the network server by the client. When playback of the interrupted video or multimedia content 5602 is later requested, the server preferably executes a scene change detection algorithm on the rewind interval 5616, i.e., the segment of multimedia content 5602 between viewing beginning position 5614 and termination position 5630. Upon completion of the scene detection algorithm, the network server sends the client system the resulting list of scene boundaries or scene change frames 5618, 5620, 5622, 5624 and 5628, which will serve as refresh frames. Playback of the video or multimedia content 5602 preferably begins upon completion of the client's display of the refresh frames. - Illustrated in
FIG. 57 is a flow chart depicting a static method of adaptive refresh rewinding implemented on a network server according to teachings of the present invention. Upon initiation at step 5702, method 5700 preferably proceeds to step 5704, where the network server runs a scene detection algorithm on video or other multimedia content to obtain a list of scene boundaries in advance of video or other multimedia content playback. - Upon completion of the scene detection algorithm at
step 5704, method 5700 preferably proceeds to step 5706, where a request received from a client system by the network server is evaluated to determine its type. Specifically, step 5706 determines whether the request received by the network server is a video or multimedia content bookmark or playback request. - If the request is determined to be a playback request, the playback request is preferably received by the network server at
step 5708. At step 5710, the network server then preferably sends the client system a pre-computed list of refresh frames and the previous termination position for the video or multimedia content requested for playback. - Alternatively, if the request is determined to be a video or multimedia content bookmark request at
step 5706, method 5700 preferably proceeds to step 5712. At step 5712, a multimedia bookmark, preferably using termination position information received from the client, may be created and saved in persistent storage. - At
step 5714, the rewind scope for the bookmark is preferably decided. As mentioned above, the rewind scope generally defines how far to rewind the video or multimedia file back toward its beginning. For example, the rewind scope may be a fixed amount before the termination position or the last scene boundary prior to the termination position. User preferences may also be employed to determine the rewind scope. - Once the rewind scope has been decided at
step 5714, method 5700 preferably proceeds to step 5716, where the method of frame selection for determining the refresh scenes to be later displayed at the client system is determined. As mentioned above, refresh frames can be selected in many different ways: for example, randomly, at some fixed interval, or at each scene change. Depending upon user preference settings, or upon other settings, method 5700 may proceed from step 5716 to step 5718, where refresh frames may be selected randomly over the rewind scope. Method 5700 may also proceed from step 5716 to step 5720, where refresh frames may be selected at fixed or regular intervals. Alternatively, method 5700 may proceed from step 5716 to step 5722, where refresh frames are selected based on scene changes. Upon completion of the selection of refresh frames at any of steps 5718, 5720 and 5722, method 5700 preferably returns to step 5706 to await the next request from a client. - Referring now to
FIG. 58, a flow chart illustrating a method of adaptive refresh rewinding implemented on a client system according to teachings of the present invention is shown. Upon initiation at step 5802, method 5800 preferably waits at step 5804 for a user request. Upon receipt of a user request, the request is evaluated to determine whether it is a video or multimedia content bookmark request or a video or multimedia content playback request. - If at
step 5804 a video or multimedia content bookmark request is received, method 5800 preferably proceeds to step 5806. At step 5806, a bookmark creation request is preferably sent to a network server configured to use method 5700 of FIG. 57 or method 5900 of FIG. 59. Once the bookmark request has been sent, method 5800 preferably returns to step 5804, where the next user request is awaited. - If at
step 5804 a video or multimedia content playback request is received, method 5800 preferably proceeds to step 5808. At step 5808, the client system sends a playback request to the network server providing the video or multimedia content. After sending the playback request to the network server, method 5800 preferably proceeds to step 5810, where the client system waits to receive the refresh frames from the network server. - Upon receipt of the refresh frames at
step 5810, method 5800 preferably proceeds to step 5812, where a determination is made whether to display the refresh frames in a storyboard or a slide show manner. Method 5800 preferably proceeds to step 5814 if a slide show presentation of the refresh frames is to be shown and to step 5816 if a storyboard presentation of the refresh frames is to be shown. Once the refresh frames have been presented at either step 5814 or 5816, method 5800 preferably proceeds to step 5820. - At
step 5820, the client system begins playback of the interrupted video or multimedia content from the previously terminated position (see FIGS. 55 and 56). Once the video or multimedia content has completed playback or is otherwise stopped, method 5800 preferably proceeds to step 5822, where a determination is made whether or not to end the client's connection with the network server. The determination made at step 5822 may come from a user prompt, from user preferences, from server settings or by other methods. If it is determined at step 5822 that the client connection with the server is to end, method 5800 preferably severs the connection and proceeds to step 5824, where method 5800 ends. Alternatively, if a determination is made at step 5822 that the client connection with the server is to be maintained, method 5800 preferably proceeds to step 5804 to await a user request. - Referring now to
FIG. 59, a flow chart illustrating a dynamic method of adaptive refresh rewinding implemented on a network server according to teachings of the present invention is shown. Upon initiation at step 5902, method 5900 preferably proceeds to step 5904, where a request received from a client by the network server is evaluated to determine its type. Specifically, step 5904 determines whether the request received by the network server is a video or multimedia content bookmark or playback request. - If, at
step 5904, the request is determined to be a video or multimedia content bookmark request, method 5900 preferably proceeds to step 5906. At step 5906, a bookmark, preferably using termination position information received from the client, may be created and saved in persistent storage. - Alternatively, if at
step 5904 the request is determined to be a playback request, the playback request is preferably received by the network server at step 5908. In addition, a decision regarding the rewind scope of the playback request is made by the network server at step 5908. Upon completing receipt of the playback request and determining the rewind scope, method 5900 preferably proceeds to step 5910 where the type of refresh frame selection to be made is determined. - At
step 5910, the network server determines whether refresh frame selection should be made based on randomly selected refresh frames from the rewind scope, refresh frames selected at fixed intervals throughout the rewind scope, or scene boundaries within the rewind scope. If a determination is made that the refresh frames should be selected randomly, method 5900 preferably proceeds to step 5912 where refresh frames are randomly selected from the rewind scope. If, at step 5910, a determination is made that the refresh frames should be selected at fixed or regular intervals over the rewind scope, such selection preferably occurs at step 5914. Alternatively, if the scene boundaries should be used as the refresh frames, method 5900 preferably proceeds to step 5916. At step 5916, the network server preferably runs a scene detection algorithm on the segment of video or multimedia content bounded by the rewind scope to obtain a listing of scene boundaries. Upon completion of the selection of refresh frames at any of steps 5912, 5914 or 5916, method 5900 preferably proceeds to step 5918. - At
step 5918, the network server preferably sends the selected refresh frames to the client system. In addition, the network server also preferably sends the client system its previous termination position for the video or multimedia content requested for playback. Once the selected refresh frames and the termination position have been sent to the client system, method 5900 preferably returns to step 5904 where another client request may be awaited. - Storage of User Preferences
- The multimedia bookmark of the present invention, in its simplest form, denotes a marked location in a video that consists of positional information (URL, time code), content information (sampled audio, thumbnail image), and some metadata (title, type of content, actors). In general, multimedia bookmarks are created and stored when a user wants to watch the same video again at a later time. Sometimes, however, the multimedia bookmarks may be received from friends via e-mail (as described herein) and may be loaded into a receiving user's bookmark folder. If the bookmark so received does not attract the attention of the user, it may be deleted shortly thereafter. With the lapse of time, only the multimedia bookmarks intriguing the user will likely remain in the user's bookmark folder, the remaining bookmarks thereby representing the most valuable information about a user's viewing tastes. Accordingly, one aspect of the present invention provides a method and system embodied in a “recommendation engine” that uses multimedia bookmarks as an input element for the prediction of a user's viewing preferences.
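The bookmark structure just described, and the statistical use the recommendation engine makes of it, can be sketched as follows. This is a minimal illustration only: the fields mirror the positional, content and metadata elements named above, but the class, function and field names are assumptions for illustration, not the patent's API.

```python
from collections import Counter
from dataclasses import dataclass, field

@dataclass
class MultimediaBookmark:
    # Positional information: where in which video the mark was made.
    url: str
    time_code: str            # e.g. "00:09:05:22"
    # Content information: sampled audio and a thumbnail image (raw bytes).
    audio_sample: bytes = b""
    thumbnail: bytes = b""
    # Metadata: the attributes the recommendation engine analyzes.
    genre: str = ""
    title: str = ""
    actors: list = field(default_factory=list)

def infer_favorite_genres(folder):
    """Rank genres by how often they occur in the user's bookmark folder."""
    counts = Counter(b.genre for b in folder if b.genre)
    return [genre for genre, _ in counts.most_common()]

# Only bookmarks that keep intriguing the user survive in the folder,
# so their genre frequencies approximate the user's viewing tastes:
folder = [MultimediaBookmark("u1", "00:01", genre="sports"),
          MultimediaBookmark("u2", "00:02", genre="sports"),
          MultimediaBookmark("u3", "00:03", genre="sports"),
          MultimediaBookmark("u4", "00:04", genre="science fiction"),
          MultimediaBookmark("u5", "00:05", genre="science fiction"),
          MultimediaBookmark("u6", "00:06", genre="situation comedy")]
preferences = infer_favorite_genres(folder)
```

The same counting can be applied to the "actors" or "title" attributes.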
-
FIG. 49, indicated generally at 4900, illustrates the elements of an embodiment of a multimedia bookmark of the present invention. The multimedia bookmark 4902 contains positional information 4910 preferably consisting of a URL 4912 and a time code 4914. Content information 4920 may also be stored in the multimedia bookmark 4902. Exemplary of the present invention, audio data 4922 and a thumbnail 4924 of the visual information are preferably stored in the content information 4920. Preferably included in metadata information 4930 of multimedia bookmark 4902 are genre description 4932, the title 4934 of the associated video and information regarding one or more actors 4936 featured in the video. Other types of information may also be stored in multimedia bookmark 4902. - Indicated generally at 5000 in
FIG. 50 is a block diagram depicting one aspect of the method of the present invention. According to teachings of the present invention, a recommendation engine 5004 may be employed to evaluate a user's multimedia bookmark folder 5002 to determine or predict the user's viewing preferences. Generally, recommendation engine 5004 is preferably configured to read any positional, content and/or metadata information contained in any of the multimedia bookmarks stored in the multimedia bookmark folder 5002. - In one embodiment, the
recommendation engine 5004 periodically visits the user's multimedia bookmark folder 5002 and performs a statistical analysis upon the multimedia bookmarks stored there. For example, when recommendation engine 5004 examines the "genre" attribute contained in the metadata of each multimedia bookmark, it preferably counts the occurrences of specific keywords and may infer, for instance, that the user's favorite genre is sports, followed by science fiction and situation comedy. Over time, and as the user saves additional multimedia bookmarks, the recommendation engine 5004 is better able to identify the user's viewing preferences. As a result, whenever the user wishes to view a program, the recommendation engine can use its predictive capabilities to guide the user through a multitude of program channels by automatically bringing together the user's preferred programs. The recommendation engine 5004 may also be configured to perform similar analyses on such metadata information as the "actors," "title," etc. - Illustrated in
FIG. 51, indicated generally at 5100, is a block diagram incorporating one or more EPG channel streams 5104 with teachings of the present invention. Upon receipt, by the multimedia bookmark process 5106, of a user request for creation of a multimedia bookmark, the preferred information to be associated with the multimedia bookmark, i.e., the positional, content and metadata information illustrated in FIG. 49, is preferably gathered. While aspects of the positional information, i.e., desired URL and time code information, used in the multimedia bookmark, as well as the content information, i.e., a desired audio segment and thumbnail image, may be gathered directly from the video's source, the metadata will likely have to be found elsewhere. Accordingly, in the embodiment illustrated in FIG. 51, the metadata (genre, title, actors) information sought by the multimedia bookmark process 5106 may be obtained from the EPG channel 5102 via EPG channel stream 5104. This metadata is the source of information used by the recommendation engine of the present invention to examine the users' viewing preferences. After extracting the metadata from the EPG channel stream 5104, the multimedia bookmark process 5106 creates a new multimedia bookmark and places the multimedia bookmark into the user's multimedia bookmark folder on the user's storage device 5108. - Illustrated in
FIG. 52 is a block diagram of a system incorporating teachings of the present invention without an EPG channel. Upon receipt, by the multimedia bookmark process 5206, of a user request to create a multimedia bookmark, the preferred information to be associated with the multimedia bookmark, i.e., the positional, content and metadata information illustrated in FIG. 49, is preferably gathered. Again, the positional and content information to be included in the multimedia bookmark may be readily obtained from the video's source. However, to obtain the desired metadata, the multimedia bookmark process 5206 preferably accesses network 5202 via two-way communication medium 5204 to thereby establish a communication link with metadata server 5210. Preferably located on metadata server 5210 is such metadata as genre, title, actors, etc. Once a communication link is established between multimedia bookmark process 5206 and metadata server 5210, the multimedia bookmark process 5206 may download or otherwise obtain the metadata information it prefers for inclusion in the multimedia bookmark. After the desired metadata has been obtained by the multimedia bookmark process 5206, the user's multimedia bookmark is preferably placed in the user's multimedia bookmark folder on the user's storage device 5208. - MetaSync First Embodiment
-
FIG. 68 shows the system to implement the present invention for a set top box (“STB”) with the personal video recorder (“PVR”) functionality. In this embodiment 6800 of the present invention, the metadata agent 6806 receives metadata for the video content of interest from a remote metadata server 6802 via the network 6804. For example, a user could provide the STB with a command to record a TV program beginning at 10:30 PM and ending at 11:00 PM. The TV signal 6816 is received by the tuner 6814 of the STB 6820. The incoming TV signal 6816 is processed by the tuner 6814 and then digitized by MPEG encoder 6812 for storage of the video stream in the storage device 6810. Metadata received by the metadata agent 6806 can be stored in a metadata database 6808, or in the same data storage device 6810 that contains the video streams. The user could also indicate a desire to interactively browse the recorded video. Assume further that, due to emergency news or some technical difficulties, the broadcasting station sends the program out on the air from 10:45 PM to 11:15 PM. - In accordance with the user's directions, the PVR on the STB starts recording the broadcast TV program at 10:30 sharp. In addition to the recording, since the user also wants to browse the video, the STB also needs the metadata for browsing the program. An example of such metadata is shown in Table 4. Unfortunately, it is not easy to automatically generate the metadata on the STB if it has only limited processing (CPU) capability. Thus, the
metadata agent 6806 requests from a remote metadata server 6802 the metadata needed for browsing the video that was specified by the user via the metadata agent 6806. In response to the request, the corresponding metadata is delivered to the STB 6820 transparently to the user. - The delivered metadata might include a set of time codes/frame numbers pointing to the segments of the video content of interest. Since these time codes are defined relative to the start of the video used to generate the metadata, they are meaningful only when the start of the recorded video matches that of the video used for metadata. However, in this scenario, there is a 15-minute time difference between the recorded content on the
STB 6820 and the content on the metadata server 6802. Therefore, the received metadata cannot be directly applied to the recorded content without proper adjustments. The detailed procedure to solve this mismatch will be described in the next section. - MetaSync Second Embodiment
-
FIG. 69 shows the system 6900 that implements the present invention when a STB 6930 with PVR is connected to an analog video cassette recorder (VCR) 6920. In this case, everything is the same as in the previous embodiment, except for the source of the video stream. Specifically, metadata server 6902 interacts with the metadata agent 6906 via network 6904. The metadata received by the metadata agent 6906 (and optionally any instructions stored by the user) are stored in metadata database 6908 or video stream storage device 6910. The analog VCR 6920 provides an analog video signal 6916 to the MPEG encoder 6912 of the STB 6930. As before, the digitized video stream is stored by the MPEG encoder 6912 in the video stream storage device 6910. - From the business point of view, this embodiment might be an excellent model for reusing the content stored on conventional videotapes for an enhanced interactive video service. This model is beneficial to both consumers and content providers. Thus, unless consumers want video of much higher quality than the VHS format provides, they can reuse content they have already paid for, while content providers can charge a nominal fee for the metadata download.
- Video Synchronization with the Metadata Delivered
- Forward Collation
- Video synchronization is necessary when a TV program is broadcast behind schedule (noted above and illustrated in
FIG. 70 ). Starting from the beginning 7024 of one recorded video stream A′ (7020) of interest in the STB, the forward collation matches the reference frames/segment A1 (7004), which is delivered from the server, against all the frames on the STB to find the most similar frames/segment A1′ (7024). As a result of this matching, the temporal media offset value d (7010) is determined, which implies that each representative frame number (or time code) received from the server for metadata services has to be increased by the offset d (7010). In this way, the downloaded metadata is synchronized with the video stream encoded in the STB. As illustrated in FIG. 70, the use of the offset 7010 enables correlation of frames A1 (7004) to A1′ (7024), A2 (7006) to its counterpart, and A3 (7008) to A3′ (7028). - For the synchronization, the server can send the STB characteristic data, other than raw image data, that represents the reference frame or segment. The important thing is to send the STB a characteristic set of data that uniquely represents the content of the reference frame or segment for the video under consideration. Such data can include audio data and image data such as a color histogram, texture and shape, as well as sampled pixels. This synchronization generally works for both analog and digital broadcasting of programs since the content information is utilized.
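The forward collation can be sketched as follows. This is a minimal, hypothetical illustration: frames are represented by plain values and matched by equality, whereas a real system would compare characteristic data (color histograms, audio features, sampled pixels) with a distance measure; the function name is an assumption.

```python
def forward_collation(recorded, reference_segment):
    """Scan a recording from its beginning for the position that best
    matches the server's reference segment, and return that position
    as the temporal media offset d."""
    n = len(reference_segment)
    best_offset, best_score = 0, -1
    for d in range(len(recorded) - n + 1):
        # Count matching frames at this alignment (stand-in for a
        # feature-based similarity score).
        score = sum(1 for i in range(n) if recorded[d + i] == reference_segment[i])
        if score > best_score:
            best_offset, best_score = d, score
    return best_offset

# The reference segment A1 appears 15 frames into the recording A':
recorded = ["x"] * 15 + ["a", "b", "c"] + ["y"] * 10
d = forward_collation(recorded, ["a", "b", "c"])

# Every time code delivered with the metadata is then increased by d:
adjusted_time_codes = [tc + d for tc in [0, 5, 9]]
```

Backward collation is the mirror image: the scan starts from the end of the recording and the offset is subtracted rather than added.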
- In the case when the broadcast TV program to be recorded is in the form of a digital video stream such as MPEG-2, and the downloaded metadata was generated with reference to the same digital stream, information such as the PTS (presentation time stamp) present in the packet header can be utilized for synchronization. This information is needed especially when the recording starts from the middle of the program or stops before the end of the program. Since the first and last PTSs are then not both available in the STB, it is difficult to compute the media time code with respect to the start of the broadcast program unless such information is periodically broadcast with the program. In this case, if the first and the last PTSs of the digital video stream are delivered to the STB with the metadata from the server, the STB can synchronize the time code of the recorded program with respect to the time code used in the metadata by computing the difference between the first and last PTS, since the video stream of the broadcast program is assumed to be identical to that used to generate the metadata.
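The PTS arithmetic involved can be sketched as follows, under stated assumptions: MPEG-2 PTS values run on a 90 kHz clock, and the sketch ignores the 33-bit PTS wrap-around; the function name is illustrative, not the patent's API.

```python
PTS_CLOCK_HZ = 90_000  # MPEG-2 presentation time stamps use a 90 kHz clock

def pts_offset_seconds(recorded_first_pts, metadata_first_pts):
    """Offset, in seconds, between the partially recorded stream and the
    stream used to generate the metadata, computed from their first PTS
    values. PTS wrap-around at 2**33 ticks is not handled in this sketch."""
    return (recorded_first_pts - metadata_first_pts) / PTS_CLOCK_HZ

# Recording began 900 seconds after the reference stream's first frame,
# so a metadata time code t maps to (t - 900) seconds in the recording:
offset = pts_offset_seconds(900 * PTS_CLOCK_HZ, 0)
```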
- Backward Collation
- A backward collation is needed when a TV program (7102) is broadcast ahead of schedule as illustrated in
FIG. 71 . Starting from the end of one recorded video stream A′ (7122) in the STB, the backward collation matches the reference frame A1 (7104) from the metadata server against all the frames on the STB to find the most similar frame A1′ (7124) to the reference frame A1 (7104). As a result of this matching, the offset value d (7110) is determined, which implies that each representative frame number or time code received from the server has to be decreased by the offset d (7110) to obtain, for example, the correlation between frames A2 (7106) and A2′ (7126) and between A3 (7108) and A3′ (7128), as illustrated in FIG. 71. - Detection of Commercial Clip
- In this scenario, the user has set a flag instructing the STB to ignore commercials that are embedded in the video stream. For this scenario, assume that the metadata server knows which advertisement clip is inserted in the regular TV program, but does not know the exact temporal position of the inserted clip. Assume further that the frame P (7212) is the first frame of the advertisement clip SC (7230), the frame Q (7212) is the last frame of SC (7230), the temporal length of the clip SC is dC (7236) and the total temporal length of the TV program (video stream A 7202) is dT (7204), as illustrated in
FIG. 72 . - i) Forward Detection of Advertisement Segment
- Given the reference frame P (7212), examining the frames from the beginning to the end of a recorded video stream A′ (7222), the most similar frame P′ (7232) to the reference frame P (7212) is identified by using an image matching technique, and the temporal distance h1 (7224) between the start frame (7223) and the frame P′ (7232) is computed. Then, for each received representative frame whose frame number (or time code) is greater than h1 (7224), the value of dC (7236) is added.
- ii) Backward Detection of Advertisement Segment
- Given the reference frame Q (7212), examining the frames from the end to the head of a recorded video stream A′ (7222), the most similar frame Q′ (7234) to the reference frame Q (7212) is found, and the temporal distance h2 (7226) between the end frame (7227) and the frame Q′ (7234) is computed. Then, each received representative frame whose frame number (or time code) is greater than dT−(h2+dC) is adjusted by adding dC (7236).
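Both detection variants end in the same time-code adjustment, which can be sketched as follows. The names are illustrative assumptions: `clip_start` plays the role of h1 in the forward detection (or dT−(h2+dC) in the backward one) and `clip_length` the role of dC.

```python
def skip_commercial(time_codes, clip_start, clip_length):
    """Push metadata time codes that fall after an inserted commercial
    later by the clip's length, leaving earlier codes untouched."""
    return [tc + clip_length if tc > clip_start else tc
            for tc in time_codes]

# A 30-second advertisement clip begins 120 seconds into the recording;
# codes after that point shift by 30 s, earlier ones stay put:
skip_commercial([60, 115, 130, 300], clip_start=120, clip_length=30)
```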
- Detection of Individual Program Segments from a Composite Video File
- This case takes place when a user issues a request to record multiple programs into a single video stream in a sequential order as shown in
FIG. 73 . For a given reference frame, this procedure computes the frame (or time code) offset from the first frame of the video stream up to the frame which is most similar to the reference frame. For example, assume there are three reference start frames A1 (7304), B1 (7314), and C1 (7324), and end frames B (7312) and C (7322), respectively. For the reference frame A1 (7304), moving in the direction from the beginning to the end of the video stream 7303, the procedure matches the frame A1 (7304) against all the frames on the stream 7303 and finds the most similar frame A1′ (7344). The offset “offA” (7348) from the beginning 7305 to the location of A1′ (7344) is now computed.
video streams video streams video streams TABLE 4 An example of metadata for video browsing in XML Schema <?xml version=“1.0” encoding=“EUC-KR”?> <Mpeg7 xmlns=http://www.mpeg7.org/2001/MPEG-7_Schema xmlns:xsi=“http://www.w3c.org/1999/XMLSchema-instance” xml:lang=“en” type=“complete”> <ContentDescription xsi:type=“SummaryDescriptionType”> <Summarization> <Summary xsi:type=“HierarchicalSummaryType” components=“keyVideoClips” hierarchy=“independent”> <SourceLocator> <MediaUri>mms://www.server.com/news.asf</MediaUri> </SourceLocator> <HighlightSummary level=“0” duration=“00:01:35:04”> <Name>Top Stories</Name> <HighlightSegment> <KeyVideoClip> <MediaTime> <MediaTimePoint>00:09:05:22</MediaTimePoint> <MediaDuration>00:00:24:28</MediaDuration> </MediaTime> </KeyVideoClip> <KeyFrame><MediaUri>16354.jpg</MediaUri></ KeyFrame> </HighlightSegment> <HighlightChild level=“1” duration=“00:00:24:28”> <Name>Wrestler Hogan</Name> <HighlightSegment> <KeyVideoClip> <MediaTime> <MediaTimePoint>00:09:05:22</MediaTimePoint> <MediaDuration>00:00:24:28</MediaDuration> </MediaTime> </KeyVideoClip> <KeyFrame><MediaUri>16354.jpg</MediaUri></ KeyFrame> </HighlightSegment> </HighlightChild> <HighlightChild level=“1” duration=“00:00:35:21”> <Name>Gun Shoots in Colorado</Name> <HighlightSegment> <KeyVideoClip> <MediaTime> <MediaTimePoint>00:09:30:20</MediaTimePoint> <MediaDuration>00:00:35:21</MediaDuration> </MediaTime> </KeyVideoClip> <KeyFrame><MediaUri>17096.jpg</MediaUri></ KeyFrame> </HighlightSegment> </HighlightChild> <HighlightChild level=“1” duration=“00:00:34:15”> <Name>Women Wages</Name> <HighlightSegment> <KeyVideoClip> <MediaTime> <MediaTimePoint>00:10:06:11</MediaTimePoint> <MediaDuration>00:00:34:15</MediaDuration> </MediaTime> </KeyVideoClip> <KeyFrame><MediaUri>18171.jpg</MediaUri></ KeyFrame> </HighlightSegment> </HighlightChild> </HighlightSummary> </Summary> </Summarization> </ContentDescription> </Mpeg7> - Automatic Labeling of Captured Video With Text from EPG
- Imagine that a show program from cable TV is stored on a user's hard disk using a PVR. If the user then wants to browse the video, he would need some metadata for it. One of the convenient ways to get the metadata about the show is to use the information from the EPG stream. Thus, if one could grab the EPG data, one could achieve some level of automatic authoring and associate at least the title, date, show time and other metadata with the video.
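The labeling step can be sketched as follows, assuming the EPG stream has already been parsed into a dictionary; the field names are illustrative and not taken from any real EPG schema.

```python
def label_recording(recording, epg_entry):
    """Copy EPG-derived metadata fields onto a recorded program's record,
    leaving the recording's own fields (e.g. its file name) intact."""
    labeled = dict(recording)
    for key in ("title", "date", "show_time", "genre"):
        if key in epg_entry:
            labeled[key] = epg_entry[key]
    return labeled

epg_entry = {"title": "Evening News", "date": "2005-06-01",
             "show_time": "22:30", "genre": "news"}
labeled = label_recording({"file": "rec_0042.mpg"}, epg_entry)
```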
- E-mail Attachments
- Users often forget to attach documents when they send e-mail. A solution to that problem would be to analyze the e-mail content and give a message to the user asking whether he or she indeed meant to attach a document. For example, if the user sets an option flag on his e-mail client software program that is equipped with the present invention, a small program or other software routine then analyzes the e-mail content in order to determine if there is the possibility or likelihood of an attachment being referenced by the user. If so, then a check is made to determine if the draft e-mail message has an attachment. If there is no attachment, then a reminder message is issued to the user inquiring about the apparent need for an attachment.
- An example of the method of content analysis of the present invention includes:
-
- 1. Matching the words in the e-mail text by scanning the e-mail contents for words like “enclose” or “attach,” or their equivalents in other languages, preferably in the language setting designated by the user.
- 2. If one of the keywords is present, then determining if the e-mail has at least one attachment.
- 3. If no attachment exists and a keyword was found, then issuing a reminder message to the user regarding the need for an attachment.
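The three steps above can be sketched as follows; the function and keyword names are illustrative assumptions, not part of any real e-mail client's API.

```python
# Keywords would be chosen per the user's designated language setting.
ATTACHMENT_KEYWORDS = ("attach", "enclose")

def needs_attachment_reminder(body, attachments):
    """Steps 1-3: scan the draft for attachment keywords and flag the
    message when a keyword is present but no file is attached."""
    mentions = any(kw in body.lower() for kw in ATTACHMENT_KEYWORDS)
    return mentions and len(attachments) == 0

needs_attachment_reminder("Please see the attached report.", [])
needs_attachment_reminder("Please see the attached report.", ["report.pdf"])
```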
- User Interface for Showing Relative Position
- Reference is made to
FIGS. 61 and 62 that illustrate portions of the highlights of the Masters tournament of 1997. Specifically, FIG. 61 shows a browser window 6102 having a Web page 6104 and a remote control button bar 6106 along the bottom of the window 6102. On the web page 6104 are various hyperlinks and references made to portions of video: the third round 6120, the fourth round 6122, Tiger Woods' biography 6124 and the ending narration 6126. The remote control buttons have various functionality; for example, there is a program list button 6108, a browsing button 6110, a play button 6112, and a story board button 6116. In the center of the buttons is a multifunction button 6114 that can be enabled with various functionality for moving among various selections within a web page. This is particularly useful if the page contains a number of thumbnail images in a tabular format. -
FIG. 62 contains a drill-down from one of the video links in FIG. 61. Specifically, in FIG. 62 there is the standard web browsing window 6202 with the web page 6204 and the button control bar 6206. As with FIG. 61, the remote control button bar 6206 has functionality identical to the one described in FIG. 61. Similarly, the remote control buttons have various functionality; for example, there is a program list button 6208, a browsing button 6210, a play button 6212, and a story board button 6216. As illustrated in FIG. 62, the selected image from FIG. 61, namely 6120, appears in FIG. 62 again as element 6120. The corresponding video portion of Tiger Woods' play on the ninth hole is element 6220, and the web page illustrates several other video clips, namely the play to the 18th hole 6232, and the interview with players 6234. -
FIG. 60 illustrates a hierarchical navigation scheme of the present invention as it relates to FIGS. 61 and 62. This hierarchical tree is usually utilized as a semantic representation of video content. Specifically, there is the whole video 6002 that contains all the video segments which compose a single hierarchical tree. Subsets of the video segments were shown in video clip 6004, the third round 6020, the fourth round 6022, Tiger Woods' biography 6024, and the ending narration 6026 that correspond to elements 6120, 6122, 6124 and 6126 of FIG. 61. The lower three boxes of FIG. 60 correspond to the three choices available, as illustrated in FIG. 62, namely, Tiger Woods' first nine holes 6021, which corresponds to element 6220 of FIG. 62, as well as Tiger Woods' second nine holes 6032, and the interview 6034, which correspond to the remaining two elements illustrated in FIG. 62. As shown in FIG. 60, the hierarchical navigation scheme allows a user to quickly drill down to the desired web page without having to wait for the rendering of multiple interceding web pages. The hierarchical status bar, using different colors, can be used to show the relative position of the segment as currently selected by the user. - Referring back to
FIG. 61, the figure further contains a status bar 6150 that shows the relative position 6152 of the selected video segment 6120, as illustrated in FIG. 61. Similarly, in FIG. 62, the status bar 6250 illustrates the relative position of the video segment 6120 as portion 6252, and the sub-portion of the video segment 6120, i.e., 6254, that corresponds to Tiger Woods' play to the 18th hole 6232. - Optionally, the
status bar element 6254 may be clickable; by clicking on it, the user would be given a web page containing the starting thumbnail of Tiger Woods' play to the 18th hole, as well as Tiger Woods' play to the ninth hole, as well as the initial thumbnail for the highlights of the Masters tournament, in essence giving a quick map of the branch of the hierarchical tree from the position on which the user clicked on the map status bar. - Alternate Embodiments
- Preferably, the video files are stored on each user's storage device, such as a hard disk on a personal computer (PC), that is itself connected to a P2P server so that those files can be downloaded by other users who are interested in watching them. In this case, if a user A makes a multimedia bookmark on a video file stored in his/her local storage and sends the multimedia bookmark via e-mail to the user B, the user B cannot play the video starting from the position pointed to by the bookmark unless the user B downloads the entire video file from user A's storage device. Depending upon the size of the video file and the bandwidth available, the full download could take a considerable length of time. The present invention solves this problem by sending the multimedia bookmark as well as a part of the video as follows:
-
- 1) The user A sends a summary of the video, generated manually, automatically by video analysis, or semiautomatically. The summary could be a set of key frames representing the whole video, in which the bookmarked frame is one of the key frames and is highlighted.
- 2) The user A then sends a short video clip file covering the region near the bookmarked position. The video clip file can be generated by editing the video file, such as an MPEG-2 file, among others.
Thus, the user B can decide if he/she wants to download the whole video after watching the part of the video containing the bookmarked position. By use of the present invention, bandwidth can be saved that would otherwise have been devoted to downloading whole video files in which user B would not have sufficient interest to justify the download.
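The two-step exchange above can be sketched as follows. This is a minimal illustration only: frames are plain indices, whereas a real system would send thumbnail images and an edited MPEG-2 clip; all names are assumptions.

```python
def build_bookmark_package(bookmark_frame, keyframes, clip_window=2):
    """Assemble what user A sends along with a multimedia bookmark:
    (1) a keyframe summary with the bookmarked frame highlighted, and
    (2) a short clip of frames around the bookmarked position."""
    summary = [{"frame": f, "highlighted": f == bookmark_frame}
               for f in keyframes]
    clip = list(range(bookmark_frame - clip_window,
                      bookmark_frame + clip_window + 1))
    return {"summary": summary, "clip": clip}

# Bookmark at frame 50; the summary spans the whole video:
pkg = build_bookmark_package(50, keyframes=[0, 25, 50, 75])
```

User B inspects the summary and clip, then decides whether the full download is worth the bandwidth.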
- Yet another embodiment of the present invention addresses the problem, with broadcast video, that the user cannot bookmark a favorite segment once that segment has passed and a new scene has appeared at the same position in the video. One solution would be to use the time-shifting property of the digital personal video recorder (PVR). Thus, as long as a certain amount of the video prior to the part currently being played is always recorded by the PVR and stored in temporary (or permanent non-volatile) storage, the user can always go back to his/her favorite position in the video.
- Alternatively, suppose that the user A sends a bookmark to the user B as described above. A problem still occurs if the video is broadcast without video-on-demand functionality. In this case, when the smart set-top box (STB) of the user B receives a bookmark, the STB can check the electronic programming guide (EPG) to see if the same program is scheduled to be broadcast sometime in the future. If so, the STB can automatically record the same program at the scheduled time, and the user B can then play the bookmarked video.
- 2. Search
- An embodiment of the present invention is based on the observation that perceptually relevant images often do not share any apparent low-level features but still appear conceptually and contextually similar to humans. For instance, photographs that show people in swimsuits may be drastically inconsistent in terms of shape, color and texture but conceptually look alike to humans. In contrast to the methodologies mentioned above, the present invention does not rely on low-level image features, except in an initialization stage, but mostly on the perceptual links between images that are established by many human users over time. While it is infeasible to manually provide links between a huge number of images at once, the present invention is based on the notion that a large number of users over a considerable period of time can build a network of meaningful image links. The method of the present invention is a scheme that accumulates information provided by human interaction in a simpler way than image feature-based relevance feedback and utilizes the information for perceptually meaningful image retrieval. It is independent of and complementary to the image search methods that use low-level features and therefore can be used in conjunction with them.
- This embodiment of the method of the present invention is a set of algorithms and data structures for organizing and accumulating users' experience in order to build image links and to retrieve conceptually relevant images. A small amount of extra data space, a queue of image links, is needed for each query image in order to document the prior browsing and searching. Based on this queue of image links, a graph data structure with image objects and image links is formed and the constructed graph can be used to search and cluster perceptually relevant images effectively. The next section describes the underlying mathematical model for accumulating users' browsing and search based on image links. The subsequent section presents the algorithm for the construction of perceptual relevance graph and searching.
- Information Accumulation Using Image Links
- Data Structure for Collecting Relevance Information
- There are potentially many ways of accumulating information about users' prior feedback. The present invention utilizes the concept of collecting and propagating perceptual relevance information using simple data structures and algorithms. The relevance information provided by users can be based on image content, concept, or both. For storing an image's links to other images to which some relevance has been established, each image has a queue of finite length as illustrated in
FIG. 30 . This is called the “relevance queue.” The relevance queue 3006 can be initially empty or filled with links to computationally similar images (CSIs) determined by low-level image feature descriptors such as color, shape and texture descriptors that are commonly used in a conventional content-based image search engine.
FIG. 30 illustrates the case ofImage 5 3004 of the retrievedimages 3002 being clicked and its link being enqueued 3010 into therelevance queue Q n 3006 of thequery Image n 3008. In contrast to previous relevance feedback schemes where the positive examples are used for adjusting low-level feature weights or distances, the method of the present invention inserts the link to the clicked image, the PRI, into the query image's relevance queue by the normal “enqueue”operation 3010. The oldest image link is deleted from the queue in ade-queue operation 3012. The list of PRIs for each image queue is updated dynamically whenever a link is made to the image by a user's relevance feedback, and thus, an initially small set of links will grow over time. The frequency at which a PRI appears in the queue is the frequency of the users' selection and can be taken as the degree of relevance. This data structure that is comprised of image data and image links will become the basic vertex and edge structures, respectively, in the relevance graph that is developed for image searching, and the frequency of the PRI will be used for determining edge weights in the graph. - Conventional relevance feedback methods explicitly require users to select positive or negative examples and may further require imposing weighting factors on selected images. In this embodiment of the present invention, users are not explicitly instructed to click similar images. Instead, the user simply browses and searches images motivated only by their interest. During the users' browsing and searching, it is expected that they are likely to click more often on relevant images than irrelevant images so the relevance information is likewise accumulated in the relevance queues.
- Mathematical Model for Information Accumulation
- It is conceivable to develop a sophisticated update scheme that minimizes the variability of users' expertise, experience, goodwill, and other psychological effects. In the present invention, however, only the basic framework for PRI links, without psychology-based user modeling, is presented. The assumption is that well-intentioned users outnumber others; in this case, the experimental studies show that the effect of sporadic false links to irrelevant images is minimized over time under the proposed scheme.
- The structure of the image queue as defined above affords many different interpretations. The entire queue structure, one queue for each image in the database, may be viewed as a state vector that gets updated after each user interaction, namely by the enqueue and dequeue operations. If all images in the database are labeled by the image index 1 through N, where N is the total number of images, the content of the queue may be represented by the queue matrix Q=[Q1| . . . |QN] of size NQ×N, where NQ is the length of the image queue. The nth column of the queue matrix, Qn, contains the image indices as its elements, and they may be initialized according to some low-level image relevance criteria. - When a user searches (queries) the database using the nth image, the system will return a list of similar images on the display window. Suppose the user then clicks the image with index m. This results in updating the nth column Qn of the queue matrix through the enqueue and dequeue operations, which can be modeled by the following update equation for the jth element of Qn:

Q n(1)=m, Q n(j)=Q n(j−1) for j=2, . . . ,NQ,

where Qn(1) holds the most recently enqueued image index and the previous element Qn(NQ) is dequeued.
The queue matrix so defined immediately allows the following definition of the state vector. - The state vector representing the image queue is defined by an N×N matrix S=[S1| . . . |SN] whose nth column Sn is an N×1 vector that basically represents the image queue for the nth image in the database. The jth element of Sn is defined to be:

S n(j)=Σ i=1 NQ α(1−α) i−1 1[Q n(i)=j],

where 1[•] equals 1 when its argument holds and 0 otherwise, and
where 0<α<1. Note that if the weighting α(1−α)i−1 inside the summation were 1, then Sn would simply be the histogram of image indices of the nth image queue, Qn. Thus, Sn as defined above is basically a weighted histogram of image indices of the nth image queue Qn. The weight α serves as the forgetting factor. Note that for an infinite queue (NQ=∞), Sn is a valid probability mass function, as ΣjSn(j)=1 and Sn(j)≧0. Even for a finite queue, for instance with NQ=256 and forgetting factor α=0.1, the sum ΣjSn(j)≈1−2×10−12. With the above relationship between the queue content and the state vector, the evolution of the state at time p may be described by the following update equation:
S n (p+1)=(1−α)S n (p) +αe EQ (p)
where eEQ (p) is a natural basis vector whose elements are all zero except for the one whose row index is identical to the index of the image currently being enqueued. What one would like, of course, is for this state vector to approach a state that makes sense for the current database content. Given a database of N images, assume that there exists a unique N×N image relevance matrix R=[R1| . . . |RN]. The matrix is composed of elements rmn, the relevance values, where rmn is in essence the probability of a viewer clicking the mth image while searching (querying) for images similar to the nth image. The actual values in the relevance matrix R will necessarily differ between individuals. However, when all users are viewed as a collective whole, the assumption of the existence of a unique R becomes rather natural. The state update equation, during steady-state operation, may be expressed by the expectation operation:
E[S n (∞)]=E[e EQ (∞)]=R n=nth column of R - The above equality expresses precisely the desired result. That is, the state vector (matrix) S converges to the image relevance matrix R, provided that an image relevance matrix exists. Although the discussion of the state vector is helpful in identifying the state to which it converges, the actual construction and update of the state vector are not necessary. As the image queue has all the information needed to compute the state vector (or the image relevance values), the implementation requires only the image queue itself; the current state vector is computed as required. It is only during the image retrieval process that the forgetting factor α is needed, to return images similar to the query image based on the current image relevance values.
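Computing the weighted-histogram relevance values directly from a queue's contents can be sketched as follows; the function name and the convention that position 0 of the list holds the newest link are illustrative assumptions:

```python
def state_vector(queue, num_images, alpha=0.1):
    """Weighted histogram S_n of one relevance queue (a sketch).

    queue[0] is the most recently enqueued image index (1-based indices).
    alpha is the forgetting factor, 0 < alpha < 1: older queue entries
    contribute geometrically smaller weights.
    """
    s = [0.0] * (num_images + 1)            # s[j] holds S_n(j) for image j
    for i, j in enumerate(queue, start=1):  # i = 1 for the newest entry
        s[j] += alpha * (1 - alpha) ** (i - 1)
    return s
```

For a finite queue the entries sum to 1 − (1 − alpha)**len(queue), which approaches 1 as the queue grows, consistent with the probability-mass interpretation above.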
- Relevance Queue Initialization
- The discussion in the previous subsection assumes a steady state of the relevance queue. When a new image is inserted into a database, it has no links to PRIs, and no images can be presented to a user to click. The relevance queue is therefore initialized with CSIs obtained with a conventional search engine, in a manner that gives higher-ranked CSIs higher relevance values. In the initialization stage, CSI links are put into the relevance queue evenly, but higher-ranked CSI links more frequently. An initialization method is illustrated for eight retrieved CSIs 3102 in the relevance queue 3106 in FIG. 31, where the image link numbers denote the ranks of the retrieved CSIs. This technique ensures that higher-ranked CSIs will remain longer in the queue as users replace CSIs with PRIs by relevance feedback. - Construction of Relevance Graph and Image Search
- Construction of Relevance Graph
- A graph is a natural model for representing syntactic and semantic relationships among multimedia data objects. Weighted graphs are used by the present invention to represent relevance relationships between images in an image database. As shown in FIG. 47, the vertices 4706 of the graph 4702 represent the images, and the edges 4708 are made by image links in the image queue. - An edge between two image vertices Pn and Pj is established if image Pj is selected by users when Pn is used as a query image, and image Pj therefore appears a certain number of times in the image link queue of Pn. The edge cost is determined by the frequency of image Pj in the image link queue of Pn, i.e., the degree of relevance established by users. Among many potential cost functions, the following function is used:
Cost(n,j)=Thr[1−S n(j)],
where the threshold function is defined as:

Thr[x]=x if x≦1−T, and Thr[x]=∞ otherwise,

with T a predefined minimum value for Sn(j). The threshold function signifies the fact that Pj is related to Pn by a weighted edge only when Pj appears in the image link queue of Pn more than a certain number of times. If the frequency of Pj is very low, Pj is not considered relevant to Pn. Associative and transitive relevance relationships are given as:
(P n→Cost(n,j) P j)→Cost(j,k) P k =P n→Cost(n,j)(P j→Cost(j,k) P k),
If P n→Cost(n,j) P j and P j→Cost(j,k) P k, then P n→Cost(n,k) P k,
where Pn→Cost(n,j) Pj denotes the relevance relationship from Pn to Pj with Cost(n,j), and Cost(n,k)=Cost(n,j)+Cost(j,k). - It would require many user studies using various sets of images to determine which of the symmetric and asymmetric relevance relationships is more effective. A relevance relationship can be asymmetric, in which case the relevance graph is a directed graph. In the present invention, however, a symmetric relationship is assumed, simply because it propagates image links further in the graph for a given number of user trials. The symmetry of relevance is represented by the symmetric cost function:
Cost(n,j)=Cost(j,n)=Min[Cost(n,j),Cost(j,n)],
and the commutative relevance relationship:
P n→Cost(n,j) P j =P j→Cost(j,n) P n.
The symmetry of the relevance relationship results in undirected graphs, as shown in FIG. 47. Specifically, FIG. 47 illustrates an undirected graph 4702 for a set of eight images and its adjacency matrix 4704, respectively. - Image Search
- The present invention employs a relevance graph structure that relates PRIs in a way that facilitates graph-based image search and clustering. Once the image relevance is represented by a graph, numerous well-established generic graph algorithms can be used for image search. When a query image is given and it is a vertex in a relevance graph, the most relevant images can be found by searching the graph for the lowest-cost image vertices from the source query vertex. A shortest-path algorithm such as Dijkstra's assigns the lowest cost to each vertex from the source, and the vertices can then be sorted by their costs from the query vertex. See, Mark A. Weiss, “Algorithms, Data Structures, and Problem Solving with C++,” Addison-Wesley, MA, 1995.
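A minimal sketch of ranking images by shortest-path cost from a query vertex with Dijkstra's algorithm might look like the following; the graph representation and all names are illustrative, and edge costs are assumed to come from the relevance frequencies described above:

```python
import heapq

def most_relevant(graph, query, k=5):
    """Rank images by shortest-path cost from the query vertex (a sketch).

    graph: vertex -> {neighbor: edge_cost}, with costs derived from the
    relevance frequencies (illustrative). Returns up to k image ids,
    nearest (most relevant) first.
    """
    dist = {query: 0.0}
    heap = [(0.0, query)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float('inf')):
            continue  # stale heap entry
        for v, w in graph.get(u, {}).items():
            nd = d + w
            if nd < dist.get(v, float('inf')):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    ranked = sorted((c, v) for v, c in dist.items() if v != query)
    return [v for c, v in ranked[:k]]
```

Sorting the reachable vertices by their accumulated cost gives exactly the "lowest-cost image vertices from the source query vertex" ordering described in the text.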
- Hypershell Search
- Generally, the first step of most image/video search algorithms is to extract a K-dimensional feature vector for each image/frame representing the salient characteristics to be matched. The search problem is then translated into the minimization of a distance function d(oi, q) with respect to i, where q is the feature vector for the query image and oi is the feature vector for the ith image/frame in the database. Further, it is known that search time can be reduced when the distance function d(•,•) has metric properties: 1) d(x,y)≧0; 2) d(x,y)=d(y,x); 3) d(x,y)≦d(x,z)+d(z,y) (the triangle inequality). Using the metric properties, particularly the triangle inequality, the hypershell search disclosed in the present invention also reduces the number of distance evaluations at query time, resulting in fast retrieval. Specifically, the hypershell algorithm uses the distances to a group of predefined distinguished points (hereafter called reference points) in a feature space to speed up the search.
- To be more specific, the hypershell algorithm computes and stores in advance the distances to k reference points (d(o,p1), . . . , d(o,pk)) for each feature vector o in the database of images/frames. Given the query image/frame q, its distances to the k reference points (d(q,p1), . . . , d(q,pk)) are first computed. If, for some reference point pi, |d(q,pi)−d(o,pi)|>ε, then d(o,q)>ε holds by the triangle inequality, which means that the feature vector o is not close enough to the query q and there is no need to explicitly evaluate d(o,q). This is one of the underlying ideas of the hypershell search algorithm.
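This pruning idea can be sketched as follows, assuming reference-point distances precomputed at indexing time; the function and parameter names are illustrative, and a 1-D toy distance is used in the test only for readability:

```python
def hypershell_search(db, index, query, refs, dist, eps):
    """Proximity search with triangle-inequality pruning (a sketch).

    db:    id -> feature point
    index: id -> precomputed [d(o, p1), ..., d(o, pk)] over reference points
    A candidate o is skipped without evaluating d(o, q) whenever
    |d(q, pi) - d(o, pi)| > eps for some reference point pi, since the
    triangle inequality then guarantees d(o, q) > eps.
    """
    q_dists = [dist(query, p) for p in refs]
    hits = []
    for oid, o_dists in index.items():
        if any(abs(qd - od) > eps for qd, od in zip(q_dists, o_dists)):
            continue  # pruned by the shell test: o cannot be within eps of q
        if dist(db[oid], query) <= eps:  # exact check only for survivors
            hits.append(oid)
    return sorted(hits)
```

Only the candidates surviving every shell test incur an exact distance evaluation, which is the source of the speed-up claimed in the text.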
- Indexing (or Preprocessing)
- To make videos searchable, the videos should be indexed. In other words, prior to searching the videos, a special data structure for the videos should be built in order to minimize the search cost at query time. The indexing process of the hypershell algorithm consists of several steps.
- First, the indexer simply takes a video as an input and sequentially scans the video frames to see if they can serve as representative frames (or key frames), subject to some predefined distortion measure. For each representative frame, the indexer extracts a low-level feature vector such as a color correlogram, color histogram, or color coherence vector. The feature vector should be selected to represent well the significant characteristics of the representative frame. The current exemplary embodiment of the indexer uses a color correlogram, which has information on the spatial correlation of colors as well as the color distribution. See, J. Huang, S. K. Kumar, M. Mitra, W. Zhu and R. Zabih, “Image indexing using color correlogram,” in Proc. IEEE on Computer Vision and Pattern Recognition, 1997.
- Second, the indexer performs PCA (Principal Component Analysis) on the whole set of the feature vectors extracted in the previous step. The PCA method reduces the dimensions of the feature vectors, thereby representing the video more compactly and revealing the relationship between feature vectors to facilitate the search.
- Third, given a metric distance such as the L2 norm, LBG (Linde-Buzo-Gray) clustering is performed on the entire population of the dimension-reduced feature vectors. See, Y. Linde, A. Buzo and R. Gray, “An algorithm for vector quantization design,” in IEEE Trans. on Communications, 28(1), pp. 84-95, January, 1980. The clustering starts with a codebook of a single codevector (or cluster centroid) that is the average of all the feature vectors. The codevector is split into two, and the algorithm is run with these two codevectors. The two resulting codevectors are split again into four, and the same process is repeated until the desired number of codevectors is obtained. These cluster centroids are used as the reference points for the hypershell search method.
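The splitting procedure can be sketched as follows; this is a simplified LBG variant in which the perturbation size and the fixed number of Lloyd refinement iterations are illustrative choices, and the target codebook size is assumed to be a power of two:

```python
def lbg(vectors, num_codevectors, iters=10, eps=1e-3):
    """LBG clustering by codevector splitting (a simplified sketch).

    Starts from the mean of all vectors, doubles the codebook by
    perturbing each codevector, and refines with Lloyd iterations
    until the desired codebook size is reached.
    """
    dim = len(vectors[0])
    mean = [sum(v[d] for v in vectors) / len(vectors) for d in range(dim)]
    codebook = [mean]
    while len(codebook) < num_codevectors:
        # split every codevector into two slightly perturbed copies
        codebook = [[c + s * eps for c in cv]
                    for cv in codebook for s in (+1, -1)]
        for _ in range(iters):  # Lloyd refinement of the doubled codebook
            clusters = [[] for _ in codebook]
            for v in vectors:
                i = min(range(len(codebook)),
                        key=lambda i: sum((a - b) ** 2
                                          for a, b in zip(v, codebook[i])))
                clusters[i].append(v)
            for i, cl in enumerate(clusters):
                if cl:  # keep old codevector if a cluster goes empty
                    codebook[i] = [sum(v[d] for v in cl) / len(cl)
                                   for d in range(dim)]
    return codebook
```

The final codevectors are the cluster centroids that the text uses as reference points for the hypershell search.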
- Finally, the indexer computes distance graphs for each reference point and each cluster. For a reference point pi and a cluster Cj, the distance graph Gi,j={(a, n)} is a data structure that stores a sequence of value pairs (a,n), where a is a distance from the reference point pi to the feature vectors in the cluster Cj and n is the number of feature vectors at the distance a from pi. Therefore, if the number of reference points is k and the number of clusters is m, then mk distance graphs are computed and stored into a database.
- The indexing data such as dimension-reduced feature vectors, cluster information, and distance graphs produced at the above steps are fully exploited by the hypershell search algorithm to find the best matches to the query image from the database.
FIG. 48 illustrates this indexing process. -
FIG. 48 illustrates the system 4800 of the present invention for implementing the hypershell search. The system 4800 is composed generally of an indexing module 4802 and a query module 4804. The indexing module contains storage devices in a storage module 4806 for storing frame and vector data. Specifically, storage space is allocated for key frames 4808, dimension-reduced feature vectors 4810, clusters and related centroids 4812, and distance graphs 4816. The storage elements mentioned above can be combined onto a single storage device, or dispersed over multiple storage devices such as a RAID array, storage area network, or multiple servers (not shown). In operation, the digital video 4836 is sent to a key frame module 4818, which extracts feature vector information from selected frames. The key frames and associated feature vectors are then forwarded to the PCA module 4820, which both stores the feature vector information into storage module 4810 and forwards the dimension-reduced feature vectors 4840 to the LBG clustering module 4822. The LBG clustering module 4822 stores the clusters and their associated centroids into the cluster storage module 4812 and forwards the clusters and their centroids to the compute module 4824. The compute module 4824 computes the distance graphs and stores them into the distance graph storage module 4816. The indexing module 4802 is typically a combination of hardware and software, although the indexing module is capable of being implemented solely in hardware or solely in software. - The information stored in the indexing module is available to the query module 4804 (i.e., the
query module 4804 is operably connected to the indexing module 4802 through a data bus, network, or other communications mechanism). The query module 4804 is typically implemented in software, although it can be implemented in hardware or a combination of hardware and software. The query module 4804 receives a query 4834 (typically in the form of an address or vector) for image or frame information. The query is received by the find module 4826, which finds the one or more clusters nearest to the query vector. Next, in module 4828, the hypershell intersection (either basic, partitioned, and/or partitioned-dynamic) is performed. Next, in module 4830, all of the feature vectors within the intersected regions (found by module 4828) are ranked. Thereafter, the ranked results are displayed to the user via display module 4832. - Search Algorithm
- The problem of proximity search is to find all the feature points whose distance from a query point q is less than a distance ε, where ε is a real number indicating the fidelity of the search results. See, E. Chavez, J. Marroquin and G. Navarro, “Fixed queries array: a fast and economical data structure for proximity searching,” in Multimedia Tools and Applications, pp. 113-135, 2001. The hypershell search algorithm of the present invention provides an efficient solution to the proximity search problem.
- A two-dimensional feature vector space is assumed in FIG. 63 for simplicity. Assume further that there are two reference points p1 and p2 in the 2D feature space. Given a query point q, the hypershell search first computes the distances Di (i=1,2) between the query point q and the reference points pi (i=1,2) and then generates one hypershell for each reference point. Each hypershell, denoted by 6302 and 6304, is preferably 2ε in thickness and lies Di (i=1,2) away from its center located at its corresponding reference point pi. The intersection of the two hypershells 6302 and 6304 leads to the two regions I1 and I2 indicated in bold lines in FIG. 63. As illustrated, the intersection region I1 includes a circle S of radius ε centered at query point q. - The feature points inside the circle S of
FIG. 63 are those feature points similar to the query point q, up to the degree of ε, and thus are the desired results of a proximity search. The value of ε may be predetermined at the time of database buildup or determined dynamically by a user at the time of query. Since all the points in the circle are contained in the intersections I1 and I2, it is desirable to search only the intersections instead of the whole feature space, thus dramatically reducing the search space. - As illustrated in
FIG. 63, there may be more than one intersection resulting from hypershell intersection in a multidimensional feature space. For example, the two intersected regions I1 and I2 of the 2-D feature space are illustrated in FIG. 63. In such a case, however, it is possible that one or more of the intersected regions may be irrelevant to the search. For example, in FIG. 63, the region I1 is highly pertinent to the query point q while the region I2 is not. Thus, to improve search performance, the least relevant regions, such as I2, should be eliminated. One way to achieve such elimination is to partition the original feature space into a certain number of smaller spaces (also called clusters) and to apply the hypershell intersection to the clusters or segmented feature spaces. FIG. 64 illustrates such clusters. As shown in FIGS. 63 and 64, of the intersections I1 and I2, only the region I1 would be considered a relevant region because it resides inside the same cluster to which the query point q belongs. - Three Preferred Embodiments
- In searching for information according to the present invention, one or more of three preferred methods may be employed. In one embodiment of the present invention where clusters are not employed, a basic hypershell search algorithm may be used. In another embodiment of the present invention where clusters obtained by using the LBG algorithm described above are employed to improve search times, a partitioned hypershell search algorithm or a partitioned-dynamic hypershell search algorithm may be used. The basic hypershell search algorithm is discussed below with reference to FIG. 65. The partitioned hypershell search algorithm and the partitioned-dynamic hypershell search algorithm are discussed below with reference to FIGS. 66 and 67, respectively. Regardless of the search algorithm employed, however, for a given query image/frame q and distortion ε, a set of the images/frames, O, satisfying
O={o k |d(o k ,q)≦ε, o k ∈R}
are searched, where R is an image/video database and d(•,•) is a metric distance. - Basic Hypershell Search Algorithm
- In the first preferred embodiment of the basic hypershell search algorithm,

I j ={o| |d(o,p j)−d(q,p j)|≦ε}, j=1, . . . ,J, and I=I 1 ∩I 2 ∩ . . . ∩I J,

where the pj's are the predetermined reference points and J is the number of reference points. Ij denotes the hypershell that is 2ε wide and centered at the reference point pj, and I denotes the set of intersections obtained by intersecting all the hypershells Ij. As illustrated in FIG. 65, three hypershells 6502, 6504, and 6506 are generated by the basic hypershell search algorithm upon running an image/frame query with a distortion ε. Further, the intersection of the hypershells 6502, 6504, and 6506 leads to the intersection region 6508, bounded by bold lines. As mentioned above, the feature vector points within the intersection 6508 include those points that would be retrieved in a proximity search. It is worth noting that, compared with the other two embodiments described below, the basic hypershell search algorithm tends to incur a considerable search cost, namely the time to intersect the hypershells, because the number of data (image/frame) points contained in the intersection is usually larger than with the other two methods. - Partitioned Hypershell Search Algorithm
- In the second preferred embodiment of the partitioned hypershell search algorithm,
where Cn represents the closest cluster to the query image/frame q. Similarly to the first embodiment, Ij denotes the hypershell that is 2ε wide and centered at the reference point pj, and I denotes the set of intersections obtained by intersecting all the hypershells. In this case, however, only the portion of the hypershells surrounded by the boundary of cluster Cn expanded by ε, as shown in FIG. 66, is searched. Without the boundary expansion, a feature point o that is close enough to the query image q (i.e., d(o,q)≦ε) but resides in a neighboring cluster would not be included in the outcome of the proximity search. It is often the case that many other cluster-based search algorithms do not guarantee search results with a given fidelity. The solid lines represent the cluster boundaries and the dotted lines the expanded boundaries, while the darkened region 6614 denotes the expanded cluster Cn that includes the expansion region 6616 over which the search is performed. - Similar to
FIG. 65, FIG. 66 illustrates three hypershells along with the cluster boundaries. The region 6614 can be selected as the most pertinent region for further consideration. For the region 6614, the intersecting region 6624 is identified and actually searched. - Partitioned-Dynamic Hypershell Search Algorithm
- While the partitioned hypershell search algorithm is the fastest of the three algorithms, it also has a larger memory requirement than its alternatives. The extra storage is needed due to the boundary expansion. For instance, a feature (image/frame) point near a cluster boundary, such as the boundary lines shown in FIG. 67, often turns out to be an element contained in multiple clusters. Therefore, as an alternative, the partitioned-dynamic hypershell search algorithm is a lighter version of the partitioned hypershell search algorithm, with a smaller memory requirement but approximately the same search time as the partitioned hypershell search algorithm.
In the third preferred embodiment,

r=Min k d(C k ,q), C={C k |d(C k ,q)≦r+ε}, I=I 1 ∩ . . . ∩I J,

where d(Ck,q) is the distance between the centroid of cluster Ck and the query point q. Ij denotes the hypershell that is 2ε wide and centered at the reference point pj, and I denotes the set of intersections obtained by intersecting all the hypershells. The r is the shortest of all the distances between the query point and the cluster centroids, and C is the set of clusters whose centroids are within the distance r+ε from the query point. - Fast Codebook Search
- Given an input vector Q, the codebook search problem is defined as selecting a particular codevector Xi in a codebook C such that
∥Q−X i ∥<∥Q−X j∥ for j=1,2, . . . ,N,j≠i
where N denotes the size of the codebook C. The fast codebook search of the present invention is used to find the closest cluster for the hypershell search described previously. - Multi-resolution Structure Based on Haar Transform
- Let H(•) stand for the Haar transform. Suppose further that a vector X=(x1,x2, . . . ,xk)∈Rk, and its transform Xh=H(X)=(x1 h, x2 h, . . . , xk h), where k is a power of 2, for example 2m. Then, a Haar-transform-based multi-resolution structure for the vector X is defined to be a sequence of vectors {Xh,0, Xh,1, . . . , Xh,n, . . . , Xh,m}, where Xh,n is the n-th level vector of size 2n and Xh,m=Xh. The multi-resolution structure is built in the bottom-up direction, taking the vector Xh=Xh,m as the initial input and successively producing the (m−1), (m−2), . . . , n, . . . , 2, 1, 0-th level vectors in this order. Specifically, the n-th level vector is obtained from the (n+1)-th level vector by simple substitution:
X h,n [p]=X h,(n+1) [p] for p=1,2, . . . ,2n
where Xh,n[p] denotes p-th coordinate of vector Xh,n. -
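The multi-resolution structure and the level-wise distances can be sketched as follows, using a recursive orthonormal Haar transform (which preserves L2 distances, so each level-n distance lower-bounds the full distance); the function names are illustrative:

```python
import math

def haar(x):
    """Orthonormal Haar transform of a length-2^m vector (recursive sketch).

    The first 2^n coefficients of the result form the n-th level vector,
    so truncation implements the substitution rule in the text.
    """
    if len(x) == 1:
        return list(x)
    s = [(a + b) / math.sqrt(2) for a, b in zip(x[::2], x[1::2])]  # averages
    d = [(a - b) / math.sqrt(2) for a, b in zip(x[::2], x[1::2])]  # details
    return haar(s) + d

def level(xh, n):
    # n-th level vector: simply the first 2^n Haar coefficients
    return xh[:2 ** n]

def level_dist(qh, xh, n):
    # L2 distance between the n-th level vectors in Haar transform space
    return math.sqrt(sum((a - b) ** 2
                         for a, b in zip(level(qh, n), level(xh, n))))
```

Because the transform is orthonormal, the full-level distance equals the original L2 distance (Property 1), and level_dist is non-decreasing in n (Property 2), which is what justifies the early termination in the codebook search pseudocode below.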
FIG. 29 illustrates the use of the Haar transform in the present invention. Specifically, the original feature space 2902 contains various elements X0 2904, X1 2906, X2 2908, and X3 2910, as illustrated in FIG. 29. Upon the transformation 2930, the corresponding transform elements Xh,0 2914, Xh,1 2916, Xh,2 2918, and Xh,3 2920 appear in the Haar transform space 2912, corresponding to elements X0 2904, X1 2906, X2 2908, and X3 2910, respectively. - Properties
- Property 1:
- Suppose Q=(q1, q2, . . . , qk), X=(x1, x2, . . . , xk), Qh=H(Q)=(q1 h, q2 h, . . . , qk h), and Xh=H(X)=(x1 h, x2 h, . . . , xk h). Then, the L2 distance between Q and X is equal to the L2 distance between Qh and Xh:

D(Q,X)=[Σ i=1 k(q i −x i)2]1/2 =[Σ i=1 k(q i h −x i h)2]1/2 =D(Q h ,X h).
- Property 2:
- Assume that Dn(Qh, Xh) symbolizes the L2 distance between the two n-th level vectors Qh,n and Xh,n in the Haar transform space. Then the following inequality holds true:

D m(Q h ,X h)≧D m−1(Q h ,X h)≧ . . . ≧D 1(Q h ,X h)≧D 0(Q h ,X h)

- The following pseudo code provides a workable method for the fast codebook search:
Input:
    Q          // query vector
    HaarCodeBk // codebook
    CbSize     // size of codebook
    VecSize    // dimension of codevector
Output:
    NN         // index of the codevector nearest to Q
- Algorithm:
min_dist = ∞;
Q_haar = HaarTrans(Q);  // Compute Haar transform of Q
for (i = 0; i < CbSize; i++) {
    for (length = 1; length <= VecSize; length = length * 2) {
        dist = LevelwiseL2Dist(Q_haar, HaarCodeBk[i], length);
        if (dist >= min_dist) {
            break;  // Go to the outer loop to try another codevector
        }
        if (length == VecSize) {
            min_dist = dist;
            NN = i;
        }
    }
}
return NN;
- Peer to Peer Searching
- To the best of the present inventors' knowledge, most current P2P systems perform searches using only a string of keywords. However, it is well known that if the search for multimedia content is made with visual features as well as textual keywords, it can yield enhanced results. Furthermore, if the search engine is reinforced by the advantages of P2P computing, the scope of the results can be expanded to include a plurality of diverse resources on peers' local storage as well as Web pages. Additionally, the time dedicated to the search will be remarkably reduced due to the distributed and concurrent computing. Taking the best parts of the visual search engine and the P2P computing architecture, the present invention offers a seamless, optimized integration of both technologies.
- Basic assumptions underlying the implementation of this method of the present invention (Gnutella model: a server-less, or pure peer-to-peer, model) are as follows:
-
- 1. The network consists of nodes (i.e., peers) and connections between them.
- 2. The nodes have the same capability and responsibility. There is no central server node. Each node functions as both a client and a server.
- 3. A node knows only its own neighbors.
- The following is a scenario to find image files according to an embodiment of the present invention:
-
- 1. A new user (denoted as NU) enters the P2P network.
- 2. NU broadcasts or multicasts a message called ping to announce its presence.
- 3. Nodes that receive the ping send a pong back to NU to acknowledge that they have received the ping message.
- 4. NU keeps track of nodes that sent those pong messages so that it retains a list of active nodes to which NU is able to connect.
- 5. When NU initiates a search request, it broadcasts or multicasts to the network the query message that contains visual features as well as a string of keywords.
- 6. A node (denoted as SN) that receives the query message runs an image search engine on the image database on the node's local storage. If SN finds images that satisfy the search criteria, it responds to NU with a search result message that may contain SN's IP address and a list of found file names and sizes.
- 7. NU attempts to make a connection to the node SN using SN's IP address and downloads the image files.
- 8. If NU triggers another search request, go to step 5. Otherwise, it terminates the connection and leaves the P2P network.
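The scenario above can be sketched as a toy in-process simulation; the Node class, its message handling, and the keyword-matching stub are illustrative assumptions, and a real implementation would use network sockets and a wire protocol rather than direct method calls:

```python
class Node:
    """Toy pure-P2P node (Gnutella-style sketch): every node acts as both
    client and server and knows only its own neighbors."""
    def __init__(self, name, images=()):
        self.name, self.images = name, set(images)
        self.neighbors, self.active = [], []

    def ping(self):
        # announce presence; collect pongs to build the active-node list
        self.active = [n for n in self.neighbors if n.pong()]

    def pong(self):
        return True  # acknowledge a received ping

    def query(self, keywords, features=None):
        # broadcast a query carrying keywords (and, optionally, visual
        # features); gather (node name, sorted hits) result messages
        results = []
        for n in self.active:
            hits = n.search_local(keywords)
            if hits:
                results.append((n.name, sorted(hits)))
        return results

    def search_local(self, keywords):
        # stand-in for the visual/keyword search engine on local storage
        return {img for img in self.images if any(k in img for k in keywords)}
```

In this sketch, a responding node returns its name with the matching file list, standing in for the search result message (IP address plus found file names and sizes) of step 6.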
-
FIG. 25 is a flowchart illustrating the method 2500 of the present invention. The method begins generally at step 2502. Thereafter, a new user (NU) enters the peer-to-peer (P2P) network in step 2504. The new user multicasts a “ping” (service request) signal to announce its presence in step 2506. The new user then waits to receive one or more “pong” (acknowledgement) signals from other users on the network, step 2508. The new user keeps track of the nodes that sent “pong” messages in order to retain a list of active nodes for subsequent connections, step 2510. The new user then initiates a search request by multicasting a query message to the network in step 2512. The source node (SN) 2524 receives the new user's search request and executes a “visual” search using the query parameters in the new user's query message, step 2526. The source node then routes the search results to the new user in step 2528. The new user receives the search result message that contains the source node's IP address as well as a list of names and sizes of found files, step 2514. Thereafter, the new user makes a connection to the source node using the source node's IP address and downloads multimedia files, in step 2516. A check is made to determine whether the new user wants another search request in step 2518. If so, execution loops back to step 2512. Otherwise, the user leaves the P2P network in step 2520 and the program terminates in step 2522. - 3. Editing
- [DS_3_Editing.doc]
- The present invention includes a method and system for editing video materials in which only the metadata of input videos is edited to create a new video, instead of actually editing the videos stored as computer files. The present invention can be applied not only to videos stored on CD-ROM, DVD, and hard disk, but also to streaming videos on a local area network (LAN) and wide area networks (WAN) such as the Internet. The present invention further includes a method of automatically generating edited metadata using the metadata of the input videos. The present invention can be used in a variety of systems related to video editing, browsing, and searching. This aspect of the present invention can also be used on stand-alone computers as well as those connected to a LAN or WAN such as the Internet.
- In order for the present invention to achieve these goals, the metadata of an input video file to be edited contain a URL of the video file and segment identifiers which enable one to uniquely identify the metadata of a segment, such as the time information, title, keywords, annotations, and key frames of the segment. A virtually edited metafile contains metadata copied from specific segments of several input metafiles, or contains only the URIs (Uniform Resource Identifiers) of these segments. In the latter case, each URI consists of both a URL of the input metafile and an identifier of the segment within the metafile.
- The significance and the practical application of the present invention are described in detail by referencing the illustrated figures.
-
FIG. 32 compares the former video editing concept 3200 with the concept of virtual editing in the present invention 3200′. In FIG. 32, it is assumed that the metadata used during the virtual editing is stored in a separate metafile. Referring to FIG. 32, the prior art method (FIG. 32 (a)) merely sends the various video files 3202 to the video editor 3206, where a user edits the videos to produce an edited video 3208. In contrast, the method of the present invention, as illustrated in FIG. 32 (b), utilizes metafiles 3204 of the videos 3202 and edits the metafiles 3204 in the virtual video editor 3206′ to produce a metafile 3210 of a virtually edited video. -
FIG. 33 is an example of the creation of a new video using the virtual editing of the present invention with the metafiles of three videos. Video 3340 consists of four segments, which appear as elements of metafile 3302 of video 3340. Segments of metafile 3302 are grouped into segment 5, segment 6, and segment 7 of metafile 3302. Similarly, video 2 (3350) has three segments in metafile 3304. As with metafile 3302, metafile 3304 groups the elements in a hierarchical structure (a and b into d, and c and d into e). Video 3 (3360), meanwhile, has five elements in metafile 3306 as illustrated in FIG. 33. As with the other two metafiles, metafile 3306 has its elements grouped in a hierarchical structure, namely, A, B, and C into F; and D and E into G, from which F and G are grouped into H as illustrated in FIG. 33. - The virtually edited
metadata 3308 is composed of segments that carry segment identifiers: segment 3310 is from segment 5 (3314) of metadata 3302, segment 3316 is from segment c (3320) of metadata 3304, and the remaining segments are from metadata 3306, as shown in FIG. 33. In order to form a hierarchical structure with the above segments, two composing segments (3380 and 3382) are created in metafile 3308, as shown in FIG. 33. - There are two kinds of segments within the metafile of the virtually edited video: a component segment, whose metadata has already been defined in an input video metafile, such as segments 3310 and 3316; and a composing segment, which is newly created to build the edited hierarchical structure, such as segments 3380 and 3382. -
FIG. 33 describes the process of generating the virtually edited metadata. Segment 5 (3314) of metafile 3302, the segment to be edited, is selected by browsing through metafile 3302. Composing segment 3382 is newly generated, and the selected segment 5 (3314) becomes its child node by generating a new segment 3310 and saving an identifier of segment 5 (3314) into the new segment. Therefore, the new segment 3310 becomes a component segment within the hierarchical structure being edited. Segment c (3320), another segment to be edited, is selected by browsing through metafile 3304. In order to make the selected segment c (3320) a child of segment 3382, a new segment 3316 is generated and an identifier of segment c (3320) is saved into the new segment. Suppose one browses through metafile 3306 and wants to make two non-consecutive segments A (3326) and C (3332) into one consecutive segment and give some title to the new segment. The composing segment 3382 then receives another newly created composing segment 3380 as its child node, and the title is written into the metadata of segment 3380. Segment 3380 takes the selected segments A (3326) and C (3332) as its children by generating two new segments and saving the identifiers of segments A and C into them. - Eventually, the edited metadata of
FIG. 33 must be transformed into video that is useful to the user. FIG. 34 illustrates the virtually edited metadata 3408 and its corresponding restructured video 3440. Specifically, segment 5 (3414) presents video segment 3446, and segments A (3426) and C (3432) present subsequent video segments of the restructured video 3440. - When metadata of a selected segment in an input metafile is copied to a component segment in a virtually edited metafile, the copy operation can be performed in one of the two ways described below. First, all the metadata belonging to the selected segment of an input metafile is copied to a component segment within the hierarchical structure being edited. This method is quite simple. Moreover, a user can freely modify or customize the copied metadata without affecting the input metafile. Second, only the URI of the selected segment of an input metafile is recorded into the component segment within the hierarchical structure being edited. Since the URI is composed of a URL of the input metafile and an identifier of the selected segment within the file, the segment within the input metafile can be accessed from a virtually edited metafile if the URI is given. With this method, a user cannot customize the metadata of the selected segment; users can only reference it as it is. Also, if the metadata of a referenced segment is modified, the modification is reflected in every virtually edited metafile referencing the segment, regardless of the user's intention.
- In both methods, for the playback of the virtually edited metafile, the URL of the input video file containing the copied or referenced segment has to be stored in the corresponding metadata. In a virtually edited metafile generated with the first method, if the video URLs of all the sibling nodes belonging to a composing segment are equal, the URL of the video file is stored in the composing segment having these nodes as children, and the URL of the video file is removed from the metadata of these nodes. This step guarantees that, if the metadata of a composing segment has the URL of a video file, all the segments belonging to the composing segment come from the same video file. When making a play list for playback of a composing segment, an efficient algorithm can be achieved using this characteristic: when inspecting a composing segment in order to make its play list, the inspection can stop, without inspecting all of its descendants, as soon as the segment has a URL of a video file.
-
FIG. 35 is a flowchart of the method of the present invention for virtual video editing based on metadata. The present invention can only be applied in the situation where the content-based, hierarchically structured metadata of the video is available, either within a metafile itself or in a database management system (DBMS). In the flowchart of FIG. 35, it is assumed that the metadata exists in the form of a metafile. Even if the metadata is stored in a DBMS, the method of the present invention can be applied as long as each segment can be uniquely identified by some type of key or identifier of a database object. - A detailed description of the method depicted in
FIG. 35 is as follows. The method begins generally at step 3502, where a metafile of an input video is loaded. Next, in step 3504, one or more segments are selected while browsing through the metafile. A check is made in step 3506 to determine if a composing segment should be created. If so, step 3508 is performed, where the composing segment is created in the hierarchical structure being edited within the composing buffer. Thereafter, or if the result of step 3506 is negative, step 3510 is performed, where a composing segment is specified from newly created or pre-existing ones and a component segment is created as a child node of the specified composing segment. Next, in step 3512, a check is made to determine whether a copy of the metadata is to be used, or a URI in its place. If a copy of the segment is used, then step 3516 is performed, where the metadata of the selected segment is copied to the newly created component segment. If the URI is to be used, then step 3514 is executed, where the URI of the selected segment is copied to the component segment. In either case, step 3518 is next performed, where the URL of the input video file is written to the component segment. Next, a check is made at step 3520 to determine if the URLs of all of the sibling nodes are identical. If so, step 3522 is performed, where the URL is written to the parent composing segment and the URLs of all of the child segments are deleted. Thereafter, in step 3524, a check is made to determine if another segment is to be selected. If so, execution loops back to step 3504. Otherwise, a check is made at step 3526 to determine if another metafile is to be input to the process. If so, execution loops back to step 3502. Otherwise, a virtually edited metafile is generated from the composing buffer in step 3528 and the method ends. -
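The flowchart's core steps can be sketched with simple dictionaries standing in for metadata nodes. The data shapes and function names below are our own illustrative assumptions, not the disclosure's implementation:

```python
def new_composing_segment(buffer, title):
    """Step 3508: create a composing segment in the hierarchy being edited."""
    seg = {"title": title, "children": []}
    buffer.append(seg)
    return seg

def add_component(composing, selected, input_metafile_url, video_url, by_uri):
    """Steps 3510-3518: attach a component segment as a child node, holding
    either a copy of the selected segment's metadata or only its URI."""
    if by_uri:  # step 3514: reference by URI
        component = {"href": "%s#id(%s)" % (input_metafile_url, selected["id"])}
    else:       # step 3516: copy the metadata
        component = dict(selected)
    component["video_url"] = video_url  # step 3518
    composing["children"].append(component)
    return component

buffer = []
root = new_composing_segment(buffer, "my highlights")
add_component(root, {"id": "seg_c", "title": "segment c"},
              "//www.video.server1/metafile2.xml",
              "//www.video.server1/video2", by_uri=True)
print(root["children"][0]["href"])
# → //www.video.server1/metafile2.xml#id(seg_c)
```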
FIGS. 36, 37, 38, 39, and 40 describe the preferred application of the present invention. Video 1 and its metafile, along with video 2 and its metafile (see FIG. 33), are stored as inputs on a computer with the domain name www.video.server1. Video 3 and its metafile (see FIG. 33) are stored on www.video.server2. FIG. 36 is a description of the metafile for video 1 (see FIG. 33) using extensible markup language (XML), the universal format for structured documents. The metafile of video 1 contains the URL of video 1, and every pre-defined segment contains several metadata items, including the time information of the segment. Each pre-defined segment also has its own segment identifier to uniquely distinguish it within a file. Video 2 and video 3 of FIG. 33 are described in XML in the same way in FIG. 37 and FIG. 38, respectively. -
FIGS. 39 and 40 are representations of the metafile in XML after virtually editing video 1, video 2, and video 3. Assume that the metafile is stored on www.video.server2. As indicated in FIG. 35, there are two ways of copying the metadata of an input metafile's selected segment to a component segment of a virtually edited metafile. FIG. 39 was composed by the first method, which is to copy all the metadata within a selected segment to the component segment. FIG. 40 was composed by the second method, which is to store only the URI of the selected segment in the component segment. In FIG. 40, the URI is composed of the input metafile's URL and the segment identifier within the file, according to the XLink and XPointer specifications. The “#” between the URL and the segment identifier indicates that the URI is composed of a URL and a segment identifier within XML. The id() function, which takes the segment identifier as its parameter, uniquely identifies the segment within the file. - To play a specific segment of the virtually edited metafile, a play list of the actual videos within the segment has to be created. The play list sequentially contains the URLs of the videos contained in the selected segment as well as the time information (for example, the starting frame number and duration). When the virtual video player receives the play list, it plays the segments arranged in the play list sequentially.
FIG. 41 is a representation, in XML, of the play list of the root segment in FIG. 39 and FIG. 40. -
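A play-list generator along these lines can be sketched as follows, using the element names from the XML examples in this text. The traversal details (in particular, how a leaf inherits the video URL from the nearest ancestor that carries a MediaURI, reflecting the URL-lifting described earlier) are our own assumptions:

```python
import xml.etree.ElementTree as ET

def make_play_list(segment):
    """Depth-first walk of a composing segment, emitting
    (video URL, start time, duration) entries in playback order.
    Leaf segments inherit the video URL from the nearest ancestor
    that carries its own <MediaURI>."""
    entries = []
    def visit(seg, inherited_url):
        url = seg.findtext("MediaURI", default=inherited_url)
        children = seg.findall("Segment")
        if not children:  # leaf: one play-list entry
            entries.append((url,
                            int(seg.findtext("StartTime")),
                            int(seg.findtext("MediaDuration"))))
        else:
            for child in children:
                visit(child, url)
    visit(segment, None)
    return entries

xml = """<Segment title="edited">
  <MediaURI>//www.video.server1/video2</MediaURI>
  <Segment><StartTime>230</StartTime><MediaDuration>150</MediaDuration></Segment>
  <Segment><StartTime>500</StartTime><MediaDuration>90</MediaDuration></Segment>
</Segment>"""
print(make_play_list(ET.fromstring(xml)))
# → [('//www.video.server1/video2', 230, 150), ('//www.video.server1/video2', 500, 90)]
```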
FIG. 42 is the block diagram of a virtual video editor supporting virtual video editing. In FIG. 42, the dotted line represents the flow of video data, the solid line the flow of metadata, and the bold solid line the flow of control signals. The major components of the virtual video editor are as follows. - The input video files (4208, 4210, 4214) and their metafiles (4204, 4206, 4212) reside on the local computer or on computers connected by a network. In
FIG. 42, video 1 (4208) and video 2 (4210) reside on the local computer, and video 3 (4214) on a computer connected by the network. Therefore, when a video file and its metafile are on a computer connected by the network, that video file and metafile are transferred to the virtual video editor 4202 through the network. This process is handled by the file controller 4222 and the network controller 4220. In other words, the file controller 4222 reads the video file and the metafile on the local computer, or the video file and the metafile transferred by the network controller 4220. The metafile read by the file controller is passed to the XML parser 4224. After the XML parser validates that the transferred metadata is well-formed according to XML syntax, the metadata is stored in the input buffer 4226. In this case, the metadata stored in the input buffer has the hierarchical structure described in the input metafile. - A user performs virtual video editing with the
structure manager 4228. First, by browsing and playing some segments of the input buffer through the display device 4240 using video player 4238, the user selects a video segment to be copied. The process of copying the metadata of the selected segment to the composing buffer is done by the structure manager 4228. That is, all the operations related to the creation of the edited hierarchical structure, as well as the management of the input buffer, such as the selection of a particular composing segment, the construction of a new composing segment or component segment, and the copying of metadata, are performed by the structure manager. - For example, assume that segment c (3320) of video 2 (3304) (see
FIG. 33 ) is selected by the editor. The URL of video 2 is www.video.server1/video2, and the URI of segment c (3320) in the metafile is www.video.server1/metafile2.xml#id(seg_c). By referring to FIG. 37, the metadata of segment 'seg_c' of video 2 is as follows.

<Segment id="seg_c" title="segment c" duration="150">
  <StartTime>230</StartTime>
  <MediaDuration>150</MediaDuration>
  <Keyframe>...</Keyframe>
  <Annotation>...</Annotation>
  ...
</Segment>

- There are two methods of copying the metadata to the component segment of the composing buffer, as described in
FIG. 35 . First, the selected metadata itself is copied to the component segment generated in the composing buffer (see FIG. 39).

<Segment id="seg_c" title="segment c" duration="150">
  <MediaURI>//www.video.server1/video2</MediaURI>
  <StartTime>230</StartTime>
  <MediaDuration>150</MediaDuration>
  <Keyframe>...</Keyframe>
  <Annotation>...</Annotation>
  ...
</Segment>

- Second, only the URI of the selected segment is copied to the component segment generated in the composing buffer (see
FIG. 40 ).

<Segment xlink:form="simple" show="embed" href="//www.video.server1/metafile2.xml#id(seg_c)">
  <MediaURI>//www.video.server1/video2</MediaURI>
</Segment>

- To indicate which input video is related to the copied metadata, the metadata of the newly created component segment contains the URL of the relevant video of the segment.
- A
play list generator 4236 is used to play segments in the hierarchical structure of the input buffer or composing buffer. Using the video URLs and the time information obtained from the metadata, the play list generator passes a play list, such as the one in FIG. 41, to video player 4238. The video player plays the segments defined in the play list sequentially. The video being played is shown on the display device 4240. When the editing is done, the hierarchical structure edited in the composing buffer is saved as metafile 4242 by the XML generator 4234. - 4. Transcoding
- 4.1 Perceptual Hint for Image Transcoding
- 4.1.1 Spatial Resolution Reduction Value
- The present invention also provides a novel scheme for transcoding an image to fit the size of the respective client display when an image is transmitted to a variety of client devices with different display sizes. First, the method of perceptual hints for each image block is introduced, and then an image transcoding algorithm is presented as well as an embodiment in the form of a system that incorporates the algorithm to produce the desired result. The perceptual hint provides the information on the minimum allowable spatial resolution reduction for a given semantically important block in an image. The image transcoding algorithm selects the best image representation to meet the client capabilities while delivering the largest content value. The content value is defined as a quantitative measure of the information on importance and spatial resolution for the transcoded version of an image.
- A spatial resolution reduction (SRR) value is determined by the author or publisher, or by an image analysis algorithm, and can also be updated after each user interaction. SRR specifies a scale factor for the maximum spatial resolution reduction of each semantically important block within an image. A block is defined as a spatial segment/region within an image that often corresponds to the area of an image that depicts a semantic object such as a car, bridge, face, and so forth. The SRR value represents the information on the minimum allowable spatial resolution, namely, the width and height in pixels at which users can still perceptually recognize each block, according to the author's expectation. The SRR value for each block can be used as a threshold that determines whether the block is to be sub-sampled or dropped when the block is transcoded.
- Consider n blocks of users' interest within an image I_A. If one denotes the i-th block as B_i, I_A = {B_i}, i = 1, . . . , n, then the SRR value r_i of B_i is modeled as follows:
- The SRR value ranges from 0 to 1 where 0.5 indicates that the resolution can be reduced by half and 1 indicates the resolution cannot be reduced. For a 100×100 block whose SRR value is 0.7, for example, the author of the block of information can indicate that the resolution of the block could be reduced up to the size of 70×70 (thus, minimum allowable resolution) without degrading the perceptibility of users. This value can then be used to determine the acceptable boundaries of resolutions that can be viewed by a given device over the system of the present invention illustrated in
FIG. 53 . - The SRR value also provides a quantitative measure of how much the important blocks in an image can be compressed to reduce the overall data size of the compressed image while preserving the image fidelity that the author intended.
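A minimal sketch of how an SRR value bounds downscaling, following the 100×100 block example above (the helper name is our own, not from this disclosure):

```python
def min_allowable_size(width, height, srr):
    """Minimum allowable resolution of a block given its SRR value.

    SRR ranges from 0 to 1: 0.5 means the resolution may be reduced
    by half, 1 means no reduction is allowed.
    """
    if not 0.0 <= srr <= 1.0:
        raise ValueError("SRR must lie in [0, 1]")
    return int(width * srr), int(height * srr)

# The 100x100 block with SRR 0.7 from the text:
print(min_allowable_size(100, 100, 0.7))  # → (70, 70)
```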
- 4.1.2 Transcoding Hint for Each Image Block
- The SRR value can best be used together with the importance value in J. R. Smith, R. Mohan, and C.-S. Li, “Content-based Transcoding of Images in the Internet,” in Proc. IEEE Intern. Conf. on Image Processing, October 1998; and S. Paek and J. R. Smith, “Detecting Image Purpose in World-Wide Web Documents,” in Proc. SPIE/IS&T Photonics West, Document Recognition, January 1998. Both the SRR value (r_i) and the importance value (s_i) are associated with each B_i. Thus:
I_A = {B_i} = {(r_i, s_i)}, i = 1, . . . , n.
- 4.1.3.1 Content Value Function V
- Image transcoding can be viewed in a sense as adapting the content to meet resource constraints. Rakesh Mohan, et al., modeled the content adaptation process as a resource allocation in a generalized rate-distortion framework. See, e.g., R. Mohan, J. R. Smith and C.-S. Li, “Multimedia Content Customization for Universal Access,” in Multimedia Storage and Archiving Systems, Boston, Mass.: SPIE, Vol. 3527, November 1998; R. Mohan, J. R. Smith and C.-S. Li, “Adapting Multimedia Internet Content for Universal Access,” IEEE Trans. on Multimedia, Vol. 1, No. 1, pp. 104-14, March 1999; and R. Mohan, J. R. Smith and C.-S. Li, “Adapting Content to Content Resources in the Internet,” in Proc. IEEE Intern. Conf. on Multimedia Comp. and Systems ICMCS99, Florence, June 1999. This framework has been built on the Shannon's rate-distortion (R-D) theory that determines the minimum bit-rate R needed to represent a source with desired distortion D, or alternately, given a bit-rate R, the distortion D in the compressed version of the source. See, C. E. Shannon, “A Mathematical Theory of Communications,” Bell Syst. Tech. J., Vol. 27, pp. 379-423, 1948. They generalized the rate-distortion theory to a value-resource framework by considering different versions of a content item in an InfoPyramid as analogous to compressions, and different client resources as analogous to the bit-rates, respectively. However, the value-resource framework does not provide the quantitative information on the allowable factor with which blocks can be compressed while preserving the minimum fidelity that an author or a publisher intended. In other words, it does not provide the quantified measure of perceptibility indicating the degree of allowable transcoding. For example, it is difficult to measure the loss of perceptibility when an image is transcoded to a set of a cropped and/or scaled ones.
- To overcome this problem, an objective measure of fidelity is introduced in the present invention that models the human perceptual system that is called a content value function V for any transcoding configuration C:
C={I, r},
where I ⊂ {1, 2, . . . , n} is a set of indices of the blocks to be contained in the transcoded image and r is the SRR factor of the transcoded image. The content value function V can be defined as: - The above definition of V provides a measure of fidelity that is applicable to the transcoding of an image at different resolutions and different sub-image modalities. In other words, V quantifies how much importance and perceptual information the transcoded version of an image retains. V takes a value from 0 to 1, where 1 indicates that all of the important blocks are perceptible in the transcoded version of the image and 0 indicates that none are perceptible. The value function is assumed to have the following property:
- Property 1: The value V is monotonically increasing in proportion to r and I. Thus:
For a fixed I, V(I, r_1) ≤ V(I, r_2) if r_1 < r_2; (1.1)
For a fixed r, V(I_1, r) ≤ V(I_2, r) if I_1 ⊂ I_2. (1.2) - 4.1.4 Content Adaptation Algorithm
- Denoting the width and height of the client display size by W and H, respectively, the content adaptation is modeled as the following resource allocation problem: maximize V(I, r) such that r(x_u − x_l) ≤ W and r(y_u − y_l) ≤ H,
where the transcoded image is represented by a rectangular bounding box whose lower and upper bound points are (x_l, y_l) and (x_u, y_u), respectively. - Lemma 1: For any I, the maximum resolution factor is given by r_max^I = min(W/(x_u − x_l), H/(y_u − y_l)).
- Lemma 1 says that only those configurations C = {I, r} with r ≤ r_max^I are feasible. Combined with Property 1.1, this implies that for a given I, the maximum value is attainable when C = {I, r_max^I}. Therefore, other feasible configurations C = {I, r} with r < r_max^I do not need to be searched. One thus has a naive algorithm for finding an optimal solution: for all possible I ⊂ {1, 2, . . . , n}, calculate r_max^I by the maximal resolution factor above, and then V(I, r_max^I) by the content value function defined in subsection 4.1.3.1, to find an optimal configuration C_opt. - The algorithm can be realized by considering a graph
R = [r_ij], 1 ≤ i, j ≤ n,
and noting that an I corresponds to a complete subgraph (clique) of R; then r_max^I is the minimum edge or node value in I. - Assume I to be a clique of degree K (K ≥ 2). It is easily shown that among the subcliques, denoted by S, of I, there are at least 2^K − 2 cliques whose r_max^S is equal to r_max^I, which, according to Property 1.2, need not be examined to find the maximum value of V. Therefore, only maximal cliques will be searched. Initially, r is set to r_max^R so that all of the blocks can be contained in the transcoded image. Then r is increased discretely, and for each given r, only the maximal cliques are examined. A minimum heap H is maintained in order to store and track maximal cliques with r_max as the sorting criterion. The following pseudo-code is illustrative of finding the optimal configuration:
Enqueue R into H
WHILE H is not empty
    Dequeue I from H
    Calculate V(I, r_max^I)
    Enqueue maximal cliques inducible from I after removing the critical (minimum) edge or node
END_WHILE
Print optimal configuration that maximizes V.
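The naive enumeration described above (compute r_max^I via Lemma 1, then the content value, over candidate block sets I) can be sketched as follows. Since the exact definition of V is not reproduced in this text, a placeholder value function, summing the importances of blocks that remain perceptible at the chosen resolution factor, is assumed here:

```python
from itertools import combinations

def optimal_configuration(blocks, W, H):
    """Naive search for the transcoding configuration C = {I, r} that
    maximizes content value, per the enumeration described above.

    `blocks` maps index -> (x_l, y_l, x_u, y_u, srr, importance).
    The value function is an illustrative placeholder, not the
    patent's exact V.
    """
    best = (0.0, None, None)
    indices = list(blocks)
    for k in range(1, len(indices) + 1):
        for I in combinations(indices, k):
            x_l = min(blocks[i][0] for i in I)
            y_l = min(blocks[i][1] for i in I)
            x_u = max(blocks[i][2] for i in I)
            y_u = max(blocks[i][3] for i in I)
            # Lemma 1: maximum feasible resolution factor for this I.
            r_max = min(W / (x_u - x_l), H / (y_u - y_l), 1.0)
            # Placeholder V: total importance of blocks still
            # perceptible at r_max (i.e. r_max >= block SRR).
            v = sum(blocks[i][5] for i in I if r_max >= blocks[i][4])
            if v > best[0]:
                best = (v, set(I), r_max)
    return best

blocks = {
    1: (10, 10, 60, 60, 0.5, 0.6),   # (x_l, y_l, x_u, y_u, SRR, importance)
    2: (70, 20, 110, 90, 0.8, 0.4),
}
print(optimal_configuration(blocks, 80, 80))  # → (1.0, {1, 2}, 0.8)
```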
FIGS. 43 and 44 demonstrate the results of transcoding according to the method of the present invention. Specifically, FIG. 43 illustrates a comparison 4300 of a non-transformed resolution reduction scheme 4302 to a transcoded scheme 4304 of the present invention. Underneath each example is a content value parameter indicative of the “value” seen by the user. As shown in FIG. 43, the images for workstations are comparable; reducing the resolution of image 4310 yet again for a smaller display removes the resolution detail and thus brings the content value to 0, while the transcoding method of the present invention preserves the resolution of the areas of interest 4330 in the image 4320 while removing (cropping) relatively extraneous information and thus commands a higher content value of 0.53. This same result is illustrated for the remaining images in the respective examples. The areas of interest 4330 can be specified by the author or an image analysis algorithm, or they may be identified by adaptive techniques through user feedback as explained elsewhere within this disclosure. - Similarly,
FIG. 44 illustrates a comparison 4400 of a non-transformed resolution reduction scheme 4402 to a transcoded scheme 4404 of the present invention. Underneath each example is a content value parameter indicative of the “value” seen by the user. As shown in FIG. 44, the images for workstations are comparable; reducing the resolution of image 4410 yet again for a smaller display brings the resolution detail and thus the content value to 0, while the transcoding method of the present invention preserves the resolution of the area of interest 4430 in the image 4420 while removing (cropping) relatively extraneous information and thus commands a higher content value of 1.0. This same result is illustrated for images 4422 and 4424, the respective examples employing the method of the present invention. - As described above, this disclosure has provided a novel scheme for transcoding an image to fit the size of the respective client display when an image is transmitted to a variety of client devices with different display sizes. First, the notion of a perceptual hint for each image block was introduced, and then an optimal image transcoding algorithm was presented.
- 4.2 Video Transcoding Scheme
- The method of the present invention further provides a scheme to transcode video with a variety of client devices having different display sizes. A general overview of the scheme is illustrated in
FIG. 45 . Generally, the content transcoder 4502 contains various modules that take data from a content database 4504, modify the content, and forward the modified content to the Internet for viewing by various devices. More specifically, the system 4500 has content database 4504 that maintains content information as well as (optionally) publisher and author preferences. Upon a request, either from the Internet or from a client device such as television 4516 (or another transmitting device), a signal is received by the policy engine 4506 that resides within the content transcoder 4502. The policy engine 4506 is operative with the content database 4504 and can receive policy information from the database 4504 as illustrated in FIG. 45. Content information is retrieved from the database 4504 by the content analyzer 4508, which then forwards the content to the content selection module 4510, which is also operative with the policy engine 4506. Based upon policy and information from the content analysis and manipulation library 4512, specific content is selected and forwarded to the content manipulation module 4514, which modifies the content for viewing by the specific requesting device. It should be noted that the content analysis and manipulation library 4512 is operative with most of the main modules, specifically the content analyzer 4508 as well as the content selection module 4510 and the content manipulation module 4514. Typically, the output information from the content transcoder is forwarded to the Internet for eventual receipt and display on, for example, personal computer 4524 for the enjoyment of user 4526, personal data appliance 4522, laptop 4520, mobile telephone 4518, and television 4516. - The
policy engine module 4506 gathers the capabilities of the client, the network conditions, and the transcoding preferences of the user as well as of the publisher and/or author. This information is used to define the transcoding options for the client. The system then selects the output versions of the content and uses a library of content analysis and manipulation routines to generate the optimal content to be delivered to the client device. - The
content analyzer 4508 analyzes the video, namely the scenes of video frames, to find their type and purpose, the motion vector direction, faces/text, and so on. Based on this information, the content selection module 4510 and the manipulation module 4514 transcode the video by adaptively selecting the attention area, defined by the position and size of, for example, a rectangular window in the video, that is intended to fit the size of the respective client display. The system 4500 will select a dynamically transcoded (for example, scaled and/or cropped) area in the video without degrading the perceptibility of users. This system also has a manual editing routine through which the publisher and author can manually alter/adjust the position and size of the transcoded area. -
FIG. 46 illustrates an example of a focus-of-attention area 4604 within the video frame 4602 that is defined by an adaptive rectangular window in the figure. The adaptive window is represented by its position and size as well as by the spatial resolution (width and height in pixels). Given an input video, a simplified transcoding process can be summarized as: -
- 1. Perform a scene analysis within the entire frame or certain slices of the frame;
- 2. Determine the window size and position and adjust accordingly; and
- 3. Transcode the video according to the determined window.
- Given the display size of the client device, the scene (or content) analysis adaptively determines the window position as well as the spatial resolution for each frame/clip of the video. The information on the gradient of the edges in the image can be used to intelligently determine the minimum allowable spatial resolution given the window position and size. The video is then fast transcoded by performing the cropping and scaling operations in the compressed domain such as DCT in case of MPEG-1/2.
- The present invention also enables the author or publisher to dictate the default window size. That size represents the maximum spatial resolution of area that users can perceptually recognize according to the author's expectation. Furthermore, the default window position is defined as the central point of the frame. For example, one can assume that this default window size is to contain the central 64% area by eliminating 10% background from each of the four edges, assuming no resolution reduction. The default window can be varied or updated after the scene analysis. The content/scene analyzer module analyzes the video frames to adaptively track the attention area. The following are heuristic examples of how to identify the attention area. These examples include frame scene types (e.g., background), synthetic graphics, complex, etc., that can help to adjust the window position and size.
- 4.2.1 Landscape or Background
- Computers have difficulty finding perceptually outstanding objects. But certain types of objects can be identified by text and face detection or by object segmentation, where the objects are defined as spatial region(s) within a frame that may correspond to regions depicting different semantic objects such as cars, bridges, faces, embedded text, and so forth. For example, in the case that no objects (especially faces and text) larger than a specific threshold value exist within the frame, one can define this specific frame as landscape or background, and one may use the default window size and position.
- 4.2.2 Synthetic Graphics
- One may also adjust the window to display the whole text. The text detection algorithm can determine the window size.
- 4.2.3 Complex
- In the case of recognized (synthetic or natural) objects whose size is larger than a specific threshold value within the frame, one may initially select the most important object among the objects and include this object in the window. The factors that have been found to influence visual attention include the contrast, shape, size, and location of the objects. For example, the importance of an object can be measured as follows:
-
- 1. Important objects are in general in high contrast with their background;
- 2. The bigger the size of an object is, the more important it is;
- 3. A thin object has high shape importance while a rounder object will have a lower one; and
- 4. The importance of an object is inversely proportional to the distance from the center of the object to the center of the frame.
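The four factors above can be combined into a simple scoring function. This is a sketch only: the weights, formulas, and data shapes are our own illustrative assumptions, not taken from this disclosure:

```python
from math import hypot

def object_importance(obj, frame_w, frame_h):
    """Heuristic importance score in [0, 1] combining contrast, size,
    shape (thinness), and centrality, per the four factors above."""
    x, y, w, h = obj["bbox"]              # top-left corner plus size
    contrast = obj["contrast"]            # 0..1, vs. local background
    size = (w * h) / (frame_w * frame_h)  # bigger -> more important
    thinness = 1 - min(w, h) / max(w, h)  # thin -> higher shape score
    cx, cy = x + w / 2, y + h / 2
    dist = hypot(cx - frame_w / 2, cy - frame_h / 2)
    centrality = 1 - dist / hypot(frame_w / 2, frame_h / 2)
    # Illustrative weights (assumed, not from the disclosure):
    return 0.4 * contrast + 0.3 * size + 0.1 * thinness + 0.2 * centrality

face = {"bbox": (130, 60, 60, 80), "contrast": 0.9}
caption = {"bbox": (0, 200, 320, 40), "contrast": 0.6}
print(object_importance(face, 320, 240) > object_importance(caption, 320, 240))
```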
At a highly semantic level, the criteria for adjusting the window are, for example: - 1. Frame with text at the bottom such as in news; and
- 2. Frame/scene where two people are talking to each other. For example, person A is on the left side of the frame and the other person is on the right side. Given the size of the adaptive window, one cannot include both in the given window size unless the resolution is reduced further. In this case, one has to include only one person. - 5. Visual Rhythm
- 5. Visual Rhythm
- The visual rhythm of a video is a single image, that is, a two-dimensional abstraction of the entire three-dimensional content of the video, constructed by sampling a certain group of pixels from each frame of the image sequence and accumulating the samples along time. Each vertical line in the visual rhythm of a video consists of a small number of pixels sampled from a corresponding frame of the video according to a specific sampling strategy.
FIG. 26 shows several different sampling strategies 2600, such as horizontal sampling 2603, vertical sampling 2605, and diagonal sampling 2607. For example, the diagonal sampling strategy 2607 samples pixels at regular intervals along a diagonal line of each frame of a video. The sampling strategies illustrated in FIG. 26 are only a partial list of all realizable sampling strategies for visual rhythm, which is utilized in many useful applications such as shot detection and caption text detection. - The sampling strategies must be carefully chosen so that the visual rhythm retains the edit effects that characterize shot changes. Diagonal sampling provides the best visual features for distinguishing the various video editing effects on the visual rhythm. All visual rhythms presented hereafter are assumed to be constructed using the diagonal sampling strategy for shot detection, but the present invention can easily be applied to any sampling strategy.
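As a concrete illustration of diagonal sampling, the sketch below builds a visual rhythm image in which each video frame contributes one column: the pixels taken along its main diagonal. The grayscale-array frame representation and the rounding of sample coordinates are assumptions for illustration:

```python
import numpy as np

def diagonal_visual_rhythm(frames):
    """Build a visual rhythm image by diagonal sampling.

    frames: iterable of 2-D grayscale arrays (H x W), one per video frame.
    Each frame contributes one column of min(H, W) pixels sampled along
    the main diagonal.  Returns an array of shape (num_samples, num_frames),
    so time runs left to right as in the figures.
    """
    columns = []
    for frame in frames:
        h, w = frame.shape
        n = min(h, w)                      # number of diagonal samples
        ys = (np.arange(n) * h) // n       # row index along the diagonal
        xs = (np.arange(n) * w) // n       # column index along the diagonal
        columns.append(frame[ys, xs])
    return np.stack(columns, axis=1)
```

A shot change then appears as a discontinuity between adjacent columns, while a stationary caption appears as a horizontal line of similar pixels across many columns.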
- The construction of visual rhythm is, however, a very time-consuming process when conventional video decoders are used, because such decoders are designed to decode all pixels composing a frame, while visual rhythm requires only a few pixels of each frame. Therefore, one needs an efficient method to construct visual rhythm as quickly as possible from compressed video. Such a method greatly reduces the time required by the shot detection process, as well as by the caption text detection process or any other application derived from it.
- In video terminology, a compression method that employs only spatial redundancy is referred to as intraframe coding, and frames coded in such a way are defined as intra-coded frames. Most video coders adopt block-based coding, either in the spatial or transform domain, for intraframe coding to reduce spatial redundancy. For example, MPEG adopts the discrete cosine transform (DCT) of 8×8 blocks, into which 64 neighboring pixels are exclusively grouped. Therefore, whatever compression scheme (DCT, discrete wavelet transform, vector quantization, etc.) is adopted for a given block, one need only decompress a small number of blocks in an intra-coded frame, instead of decoding all the blocks composing the frame, when only a few of the pixels are needed. This applies similarly to JPEG coding of individual images. In order to achieve optimum compression, most video coders also use a method that exploits the temporal redundancy between frames, referred to as interframe coding (predictive, interpolative), by tracking the N×M block in the reference picture that best matches (according to a given criterion) the characteristics of the block in the current picture; in the specific case of the MPEG compression standard, N=M=16, commonly referred to as a macroblock. However, the present invention is not restricted to this rectangular geometry, but assumes that the geometry of the matching block in the reference picture need not be the same as the geometry of the block in the current picture, since objects in the real world undergo scale changes as well as rotation and warping. An efficient way to decode only the actual group of pixels needed for constructing the visual rhythm of such hybrid (intraframe and interframe) coded frames proceeds as follows:
-
- 1. Out of the blocks composing a given frame sequence, decode only the blocks needed to decode those blocks containing at least one pixel selected by a predetermined sampling strategy for constructing visual rhythm; and
- 2. Obtain the pixel values for constructing visual rhythm from the decoded blocks.
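The two-step procedure above amounts to mapping the sampled pixel coordinates to block indices and decoding only those blocks. The sketch below does this for diagonal sampling, widening the block set by the motion search range p for frames that may be referenced by others; the 8×8 block size and the square search window are illustrative assumptions:

```python
def blocks_to_decode(frame_w, frame_h, block=8, search_range=0):
    """Return the set of (block_row, block_col) indices that must be
    decoded to recover the diagonal samples of one frame.

    search_range: motion search range p.  For frames that other frames
    may reference, the strip around each sampled pixel is widened by p
    pixels on each side, so that motion-compensated blocks in dependent
    frames can still be reconstructed.  For frames that are never
    referenced (e.g., B-pictures), use search_range=0.
    """
    n = min(frame_w, frame_h)
    needed = set()
    for z in range(n):
        x = (z * frame_w) // n
        y = (z * frame_h) // n
        # Widen by the motion search range around the sampled pixel.
        for dy in range(-search_range, search_range + 1):
            for dx in range(-search_range, search_range + 1):
                xx, yy = x + dx, y + dy
                if 0 <= xx < frame_w and 0 <= yy < frame_h:
                    needed.add((yy // block, xx // block))
    return needed
```

For a 64×64 frame this yields the 8 diagonal blocks when no frame references it, versus a diagonal band of blocks (still far fewer than the full 64) when the search range p=16 must be honored, mirroring the shaded regions of FIGS. 23 and 24.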
- For example, define three different types of pictures using the MPEG-1 terminology. Intra-pictures (I-pictures) are compressed using intraframe coding; that is, they do not reference any other pictures in the coded bit stream. Referring to
FIG. 22, predicted pictures (P-pictures) 2204 and 2202 are coded using motion-compensated prediction from a past I-picture 2206 or P-picture 2204, respectively. Bidirectionally predicted pictures (B-pictures) 2210 are coded using motion-compensated prediction from past and/or future I-pictures 2206 or P-pictures. - Many video coding applications restrict the search to a [−p, p−1] region around the original location of the block, due to the computation-intensive operations required to find the N×M pixel region in the reference picture that best matches (according to a given criterion) the characteristics of the N×M pixel region in the current picture. This implies that one need only decompress the blocks within the [−p, p−1] region around the original location of the blocks containing the pixels to be sampled for constructing visual rhythm in pictures that may be referenced by other picture types for motion compensation. For pictures that cannot be referenced by other picture types for motion compensation, one only needs to decompress the blocks containing the pixels sampled for the visual rhythm.
- For example,
FIG. 23 and FIG. 24 illustrate, as shaded blocks, the blocks that need to be decompressed for the construction of visual rhythm in frames that can be referenced by other frames for motion compensation and in frames that cannot be referenced by other frames, respectively. To construct visual rhythm by sampling the diagonal pixels located on line 2308 of a frame 2302, one only needs to decompress the shaded blocks in FIG. 23, which lie between the lines 2304 and 2310 (separated by value 2306, the search range p of motion prediction). For frames not referenced by other frames (B-pictures), one simply needs to decompress the blocks located along the diagonal line 2404 of the frame 2402, as illustrated in FIG. 24. - This approach allows one to obtain the required group of pixels without decoding unnecessary blocks, and guarantees that the pixel values for constructing visual rhythm can be obtained from the decoded blocks without fully decoding all the blocks composing each frame of the sequence.
- For some compression schemes using the discrete cosine transform (DCT) for intraframe coding, such as Motion-JPEG and MPEG, or other transform-domain compression schemes such as the discrete wavelet transform, it is possible to further reduce the time for constructing visual rhythm. For example, a DCT block of N×N pixels is transformed into a frequency-domain representation consisting of one DC and (N×N−1) AC coefficients. The single DC coefficient is N times the average of all N×N pixel values. This means that the DC coefficient of a DCT block can serve as the pixel value of a pixel included in the block when accurate pixel values are not required. Extraction of a DC coefficient from a DCT block can be performed quickly because it does not require fully decoding the DCT block. In the present invention, after identifying the shaded blocks illustrated in FIGS. 23 and 24, the extraction of DC coefficients from those blocks can be utilized instead of fully decoding the blocks and obtaining the values of the pixels selected by the predetermined sampling strategy for constructing visual rhythm. The same approach can be applied to any given compression scheme by utilizing any coefficients readily available from the compressed representation. - Fast Text Detection
- For the design of an efficient real-time caption text locator, resort is made to a portion of the original video called a partial video. The partial video must retain most, if not all, of the caption text information. The visual rhythm, as defined below, satisfies this requirement. Let ƒDC(x, y, t) be the pixel value at location (x, y) of an arbitrary DC image that consists of the DC coefficients of the original frame t. Using the sequence of DC images of a video, called the DC sequence, the visual rhythm VR of the video V is defined as follows:
VR = {ƒVR(z, t)} = {ƒDC(x(z), y(z), t)}
where x(z) and y(z) are one-dimensional functions of the independent variable z. Thus, the visual rhythm is a two-dimensional image consisting of DC coefficients sampled from three-dimensional data (the DC sequence). Visual rhythm is also an important visual feature that can be utilized to detect scene changes. - The sampling strategies, x(z) and y(z), must be carefully chosen for the visual rhythm to retain caption text information. One sets x(z) and y(z) as
where W and H are the width and the height of the DC sequence, respectively. - The sampling strategies above are due partially, if not entirely, to empirical observations that portions of caption text generally tend to appear in these particular regions.
FIG. 26 illustrates a set of sampling strategies for constructing visual rhythm from a set of frames making up a video stream. Specifically, the frame sequence 2602 utilizes a single horizontal sampling 2603 across the middle of the frame. Alternatively, the frame sequence 2604 utilizes vertical sampling 2605 from the top to the bottom of the frame, midway between the left and right sides. Finally, the frame sequence 2606 utilizes diagonal sampling 2607 from one corner of the frame to the catty-corner. It will be understood that the scanning techniques noted above can be mixed and matched (e.g., combining vertical and diagonal) and that multiple scans can take place (e.g., multiple horizontal scans, or cross-diagonal scans) to enhance the search, albeit with a potential performance loss due to the extra computational overhead. However, the sampling strategies can be set in a flexible manner for text detection in specific video materials where the approximate regions of caption text are known a priori. -
FIG. 27(a) shows an example of visual rhythm when the diagonals of a frame are sampled. Referring to FIG. 27(c), frame 2714 is one of a set of frames used to construct binarized visual rhythm 2712, where only the pixels 2718 corresponding to caption text are represented in white. A caption 2716 is embedded in the frame 2714 and in the subsequent set of frames used to construct the binarized visual rhythm 2712, so that a "caption line" 2718 is formed within the binarized visual rhythm 2712. FIG. 27(a) and FIG. 27(b) illustrate the visual rhythm 2702 of video content (FIG. 27(a)) and its corresponding binarized visual rhythm 2708, where pixels corresponding to caption 2710 are represented in white (FIG. 27(b)). Caption text embedded in zone 2706 of the visual rhythm illustrated in FIG. 27(a) shows that captions possess certain properties, such as in region 2704. This region 2704 of FIG. 27(a) can be separated and represented in white 2710, as in FIG. 27(b), to form binarized visual rhythm 2708. Once the binarized visual rhythm 2708 is obtained, only a portion of the content of the entire frame need be scanned in order to extract the textual information and create appropriate multimedia bookmarks according to the method of the present invention. As illustrated in FIG. 28, the method of the present invention similarly enables one to locate the caption text 2804 of a frame 2802, as well as multiple captions of a frame 2806, extract the text, and obtain the binarized results 2804′, 2808′, 2810′, and 2812′ for subsequent processing: text recognition, indexing, storage, and retrieval. - Caption Frame Detection
- The caption frame detection stage searches for caption frames, which are defined herein as video or image frames that contain one or more caption texts. The caption frame detection algorithm is based on the following characteristics of caption text within video:
-
- 1. Characters in a single caption text tend to have similar color;
- 2. Caption text tends to retain its size and font over multiple frames;
- 3. Caption text is either stationary or linearly moving;
- 4. Caption text contrasts with its background; and
- 5. Caption text remains in the scene for a number of consecutive frames.
- It is preferable to restrict oneself to locating only stationary caption text, because stationary text is more often an important carrier of information and hence more suitable for indexing and retrieval than moving caption text. Therefore, for the purposes of this disclosure, caption text mentioned in the rest of this disclosure refers to stationary caption text.
- With the above characteristics of video, one can observe that pixels corresponding to caption text, sampled from portions of the DC sequence, manifest themselves as long horizontal lines 2704 in high contrast with their background on the visual rhythm 2702. Hence, horizontal lines on the visual rhythm in high contrast with their background are mostly due to caption text, and they provide clues as to when each caption text appears within the video. Thus, visual rhythm serves as an important visual feature for detecting caption frames. - First of all, to detect caption frames, horizontal edge detection is performed on the visual rhythm using the Prewitt edge operator, with convolution kernels
applied to the visual rhythm to obtain VRedge(z, t) as follows: - To obtain caption lines, defined as horizontal lines on the visual rhythm possibly formed by portions of caption text, edge values VRedge(z, t) greater than a threshold τ=150 are retained and connected in the horizontal direction. Caption lines with lengths shorter than the frame length corresponding to a specific amount of time are neglected, since caption text usually remains in the scene for a number of consecutive frames. Through several experiments on various types of video materials, the shortest captions appear to remain active for at least two seconds, which translates into a caption line with a frame length of 60 if the video is digitized at 30 frames per second. Thus, caption lines with lengths of less than 2 seconds can be eliminated. The resulting set of caption lines with their temporal durations appears in the form:
LINEk: (zk, tk start, tk end), k = 1, . . . , NLINE
where (zk, tk start, tk end) denotes the z coordinate and the beginning and end frames of the occurrence of caption line LINEk on the visual rhythm, respectively, and NLINE is the total number of caption lines. The caption lines are ordered by increasing starting frame number:
t1 start ≦ t2 start ≦ . . . ≦ tNLINE start
FIG. 27(b) shows VRBinarized(z, t), the binarized visual rhythm representing, in white 2710, the caption lines possibly formed due to caption text in the visual rhythm of FIG. 27(a). - Frames that do not fall within the temporal duration of any of the resulting caption lines can be assumed to contain no caption text, and are thus omitted as caption frame candidates.
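The caption frame detection steps described above can be sketched end to end: horizontal Prewitt edge detection on the visual rhythm, thresholding at τ=150, horizontal run connection, and elimination of runs shorter than two seconds. The kernel shown is the standard horizontal Prewitt operator, which the disclosure names without reproducing; the padding behavior and the gap-free run connection are assumptions:

```python
import numpy as np

def prewitt_horizontal(vr):
    """Edge response of the horizontal Prewitt kernel
    [[-1,-1,-1],[0,0,0],[1,1,1]] over the visual rhythm."""
    v = np.pad(vr.astype(float), 1, mode="edge")
    # Sum of the 3-pixel neighborhood in the row below minus the row above.
    below = v[2:, :-2] + v[2:, 1:-1] + v[2:, 2:]
    above = v[:-2, :-2] + v[:-2, 1:-1] + v[:-2, 2:]
    return np.abs(below - above)

def detect_caption_lines(vr, tau=150, fps=30, min_seconds=2.0):
    """Return caption lines as (z, t_start, t_end) triples.

    vr: 2-D visual rhythm, rows = sampling position z, columns = frame t.
    Edge values above tau are connected horizontally; runs shorter than
    min_seconds (2 s = 60 frames at 30 fps, per the text) are dropped.
    """
    binary = prewitt_horizontal(vr) > tau
    min_len = int(min_seconds * fps)
    lines = []
    for z in range(binary.shape[0]):
        row = binary[z]
        n = len(row)
        t = 0
        while t < n:
            if row[t]:
                start = t
                while t < n and row[t]:
                    t += 1
                if t - start >= min_len:       # keep runs >= 2 seconds
                    lines.append((z, start, t - 1))
            else:
                t += 1
    return lines
```

Each returned triple corresponds to one caption line (zk, tk start, tk end) in the notation above.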
- Caption Text Localization
- The caption text localization stage seeks to spatially localize caption text within the caption frame, along with its temporal duration within the video.
- Let ƒDC(x, y, t) be the pixel value at (x, y) of the DC image of frame t. Given the sampling strategy in equation (2) for the visual rhythm, caption line LINEk is formed due to a portion of caption text located at (x, y) = (x(zk), y(zk)) in the DC sequence between tk start and tk end.
- Furthermore, if a portion of caption text is located at (x, y) = (x(zk), y(zk)) within a DC image, one can assume that other portions of the caption text appear along y = y(zk), because caption text is usually horizontally aligned. Therefore, a caption line can be used to approximate the location of caption text within the frame, enabling the algorithm to focus on a specific area of the frame.
- Thus, from the above observations, for each LINEk it is possible to simply segment the caption text region located along y = y(zk) in a DC image between tk start and tk end, and to assume that this segmented region appears throughout the temporal duration of caption line LINEk.
- To localize caption text candidate regions for caption line LINEk, it is preferable to cluster pixels with values ƒVR(zk, t) ± δ (where δ = 10), starting from the pixels of the horizontal scanline y = y(zk) with value ƒVR(zk, t), using a 4-connected clustering algorithm in the DC image of frame t, where t = (tk start + tk end)/2. This is partially because the characters in a single caption text tend to have similar color and to be horizontally aligned. Each of the clustered regions records the leftmost, rightmost, top, and bottom locations of the pixels that are merged together.
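One plausible realization of the 4-connected clustering step is a breadth-first flood fill that grows a region from each seed pixel on the scanline and records its bounding box; the data representation here is an illustrative assumption:

```python
from collections import deque

def cluster_similar_pixels(image, seeds, delta=10):
    """4-connected clustering of pixels within +/- delta of a seed
    pixel's value, returning bounding boxes of the clusters.

    image: 2-D sequence of pixel values (rows of columns);
    seeds: list of (x, y) start points on the scanline y = y(z_k).
    Returns (leftmost, top, rightmost, bottom) per cluster.
    """
    h, w = len(image), len(image[0])
    visited = [[False] * w for _ in range(h)]
    boxes = []
    for sx, sy in seeds:
        if visited[sy][sx]:
            continue                     # already absorbed by a cluster
        ref = image[sy][sx]
        queue = deque([(sx, sy)])
        visited[sy][sx] = True
        left = right = sx
        top = bottom = sy
        while queue:
            x, y = queue.popleft()
            left, right = min(left, x), max(right, x)
            top, bottom = min(top, y), max(bottom, y)
            # Grow into the 4-connected neighbors with similar values.
            for nx, ny in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
                if (0 <= nx < w and 0 <= ny < h and not visited[ny][nx]
                        and abs(image[ny][nx] - ref) <= delta):
                    visited[ny][nx] = True
                    queue.append((nx, ny))
        boxes.append((left, top, right, bottom))
    return boxes
```

The resulting bounding boxes are the text candidate regions that the next stage verifies and merges.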
- Once the clustered regions have been obtained for LINEk, one needs to merge the regions corresponding to portions of a caption text to form a bounding box around the caption text. It is preferable to verify whether each region is formed by caption text, based upon heuristics obtained through empirical observations of text across a range of text sources. Because the focus is on finding caption text, a clustered region should have similar clustered regions nearby that belong to the same caption text. Such a heuristic can be described using connectability, which is defined as follows:
-
- Let A and B be different text candidate regions. A and B are connectable if they are of similar height and horizontally aligned, and there is a path between A and B.
- Here, two regions are considered to be of similar height if the height of the shorter region is at least 40% of the height of the taller one. To determine horizontal alignment, the regions are projected onto the Y-axis. If the overlap of the projections of two regions is at least 50% of the shorter projection, they are considered to be horizontally aligned. In addition, it is clear that regions corresponding to the same caption text should be close to each other. By empirical observation, the spacing between the characters and words of a caption text is usually less than three times the height of the tallest character, and so is the width of a character in most fonts. Therefore, the following criterion is optionally used to merge regions corresponding to portions of caption text to obtain a bounding box around the caption text:
-
- Two regions, A and B, are merged if they are connectable and there is a path between A and B whose length is less than 3 times the height of the taller region.
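The connectability and merging rules above can be sketched as follows. The 40%, 50%, and 3× thresholds come from the text, while measuring the path between regions as the horizontal gap between their bounding boxes is an assumption:

```python
def similar_height(a, b):
    """a, b: boxes as (left, top, right, bottom).  The shorter height
    must be at least 40% of the taller one."""
    ha, hb = a[3] - a[1] + 1, b[3] - b[1] + 1
    return min(ha, hb) >= 0.4 * max(ha, hb)

def horizontally_aligned(a, b):
    """Overlap of the Y-axis projections must be at least 50% of the
    shorter projection."""
    overlap = min(a[3], b[3]) - max(a[1], b[1]) + 1
    shorter = min(a[3] - a[1], b[3] - b[1]) + 1
    return overlap >= 0.5 * shorter

def mergeable(a, b):
    """Connectable regions whose gap is less than 3x the taller
    region's height are merged."""
    taller = max(a[3] - a[1], b[3] - b[1]) + 1
    gap = max(a[0], b[0]) - min(a[2], b[2])   # horizontal box distance
    return (similar_height(a, b) and horizontally_aligned(a, b)
            and gap < 3 * taller)

def merge_regions(boxes):
    """Greedily merge mergeable boxes into caption-text bounding boxes."""
    boxes = list(boxes)
    changed = True
    while changed:
        changed = False
        for i in range(len(boxes)):
            for j in range(i + 1, len(boxes)):
                if mergeable(boxes[i], boxes[j]):
                    a, b = boxes[i], boxes[j]
                    boxes[i] = (min(a[0], b[0]), min(a[1], b[1]),
                                max(a[2], b[2]), max(a[3], b[3]))
                    del boxes[j]
                    changed = True
                    break
            if changed:
                break
    return boxes
```

Character boxes on the same text line thus collapse into a single caption bounding box, while distant regions remain separate.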
- Moreover, the aspect-ratio constraint can be enforced on the final merged regions:
where Width and Height are the width and height of the final caption text region. - The caption text region is expected to meet the above constraint; otherwise, it is removed as a text region. The final caption text region takes the temporal duration of its corresponding caption line.
- The above procedure is iterated to obtain a bounding box around the caption text for each caption line LINEk, in increasing order of k (k = 1, . . . , NLINE). However, since several caption lines are usually formed by the same caption text, the caption text localization process is omitted for a caption line LINEk if any caption text region obtained beforehand lies on the horizontal scanline y = y(zk). The usefulness of this text region extraction step is that it is inexpensive and fast, robustly supplying bounding boxes around caption text along with their temporal information.
- The present invention, therefore, is well adapted to carry out the objects and attain both the ends and the advantages mentioned, as well as other benefits inherent therein. While the present invention has been depicted, described, and defined by reference to particular embodiments of the invention, such references do not imply a limitation on the invention, and no such limitation is to be inferred. The invention is capable of considerable modification, alteration, and equivalents in form and/or function, as will occur to those of ordinary skill in the pertinent arts. The depicted and described embodiments of the invention are exemplary only, and are not exhaustive of the scope of the invention. Consequently, the invention is intended to be limited only by the spirit and scope of the appended claims, giving full cognizance to equivalents in all respects.
Claims (13)
1-45. (canceled)
46. A method for virtual editing multimedia files, the method comprising:
providing one or more video files;
creating a metadata file for each of the video files, each of the metadata files having at least one segment to be edited; and
creating a single edited metafile containing the segments to be edited from each of the metadata files;
wherein when the edited metadata file is accessed, the user is able to play the segments to be edited in the edited order.
47. A method for virtual editing multimedia files, the method comprising:
providing one or more video files;
creating a metadata file for each of the video files, each of the metadata files having at least one segment to be edited; and
creating a single edited metafile containing links to the segments to be edited from each of the metadata files in an edited order;
wherein when the edited metadata file is accessed, the user is able to play the segments to be edited in the edited order.
48. A method for editing a multimedia file comprising:
providing a metafile, the metafile having at least one segment that is selectable;
selecting a segment in the metafile;
determining if a composing segment should be created, and if the composing segment should be created, then creating a composing segment in a hierarchical structure;
specifying the composing segment as a child of a parent composing segment;
determining if metadata is to be copied or if a URI is to be used;
if the metadata is to be copied, then copying metadata of the selected segment to the component segment;
if the URI is to be used, then writing a URI of the selected segment to the component segment;
writing a URL of an input video file to the component segment;
determining if all URLs of any sibling files are the same; and
if the URL is the same as any of the sibling's URLs, then writing the URL to the parent composing segment and deleting the URLs of all sibling segments.
49. The method of claim 48 , the method further comprising:
determining if another segment is to be selected; and
if another segment is to be selected, then performing the step of selecting a segment in a metafile.
50. The method of claim 49 , the method further comprising:
determining if another metafile is to be browsed; and
if another metafile is to be browsed, then performing the step of providing a metafile.
51. The method of claim 46 wherein the metafile is an XML file.
52. The method of claim 47 wherein the metafile is an XML file.
53. The method of claim 48 wherein the metafile is an XML file.
54. A virtual video editor comprising:
a network controller, the network controller constructed and arranged to access remote metafiles and remote video files;
a file controller, the file controller in operative connection to the network controller, the file controller constructed and arranged to access local metafiles and local video files, and to access the remote metafiles and the remote video files via the network controller;
a parser, the parser constructed and arranged to receive information about the files from the file controller;
an input buffer, the input buffer constructed and arranged to receive parser information from the parser;
a structure manager, the structure manager constructed and arranged to provide structure data to the input buffer;
a composing buffer, the composing buffer constructed and arranged to receive input information from the input buffer and structure information from the structure manager to generate composing information; and
a generator, the generator constructed and arranged to receive the composing information from the composing buffer;
wherein the generator generates output information in a pre-selected format.
55. The virtual video editor of claim 54 , the editor further comprising:
a playlist generator, the playlist generator constructed and arranged to receive structure information from the structure manager in order to generate playlist information; and
a video player, the video player constructed and arranged to receive the playlist information from the playlist generator and file information from the file controller in order to generate display information.
56. The virtual video editor of claim 55 , the editor further having a display device constructed and arranged to receive the display information from the video player and to display the display information to a user.
57-83. (canceled)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US11/423,134 US20070033515A1 (en) | 2000-07-24 | 2006-06-08 | System And Method For Arranging Segments Of A Multimedia File |
Applications Claiming Priority (7)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US22139400P | 2000-07-24 | 2000-07-24 | |
US22184300P | 2000-07-28 | 2000-07-28 | |
US22237300P | 2000-07-31 | 2000-07-31 | |
US27190801P | 2001-02-27 | 2001-02-27 | |
US29172801P | 2001-05-17 | 2001-05-17 | |
US09/911,293 US7624337B2 (en) | 2000-07-24 | 2001-07-23 | System and method for indexing, searching, identifying, and editing portions of electronic multimedia files |
US11/423,134 US20070033515A1 (en) | 2000-07-24 | 2006-06-08 | System And Method For Arranging Segments Of A Multimedia File |
Related Parent Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US09/911,293 Division US7624337B2 (en) | 2000-07-24 | 2001-07-23 | System and method for indexing, searching, identifying, and editing portions of electronic multimedia files |
Publications (1)
Publication Number | Publication Date |
---|---|
US20070033515A1 true US20070033515A1 (en) | 2007-02-08 |
Family
ID=27539825
Family Applications (9)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US09/911,293 Expired - Fee Related US7624337B2 (en) | 2000-07-24 | 2001-07-23 | System and method for indexing, searching, identifying, and editing portions of electronic multimedia files |
US11/423,143 Abandoned US20070033533A1 (en) | 2000-07-24 | 2006-06-08 | Method For Verifying Inclusion Of Attachments To Electronic Mail Messages |
US11/423,138 Abandoned US20070033170A1 (en) | 2000-07-24 | 2006-06-08 | Method For Searching For Relevant Multimedia Content |
US11/423,136 Abandoned US20070033521A1 (en) | 2000-07-24 | 2006-06-08 | System And Method For Transcoding A Multimedia File To Accommodate A Client Display |
US11/423,140 Abandoned US20070033292A1 (en) | 2000-07-24 | 2006-06-08 | Method For Sending Multimedia Bookmarks Over A Network |
US11/423,134 Abandoned US20070033515A1 (en) | 2000-07-24 | 2006-06-08 | System And Method For Arranging Segments Of A Multimedia File |
US11/504,058 Expired - Fee Related US7823055B2 (en) | 2000-07-24 | 2006-08-14 | System and method for indexing, searching, identifying, and editing multimedia files |
US11/581,740 Abandoned US20070038612A1 (en) | 2000-07-24 | 2006-10-16 | System and method for indexing, searching, identifying, and editing multimedia files |
US12/605,874 Abandoned US20110093492A1 (en) | 2000-07-24 | 2009-10-26 | System and Method for Indexing, Searching, Identifying, and Editing Multimedia Files |
Family Applications Before (5)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US09/911,293 Expired - Fee Related US7624337B2 (en) | 2000-07-24 | 2001-07-23 | System and method for indexing, searching, identifying, and editing portions of electronic multimedia files |
US11/423,143 Abandoned US20070033533A1 (en) | 2000-07-24 | 2006-06-08 | Method For Verifying Inclusion Of Attachments To Electronic Mail Messages |
US11/423,138 Abandoned US20070033170A1 (en) | 2000-07-24 | 2006-06-08 | Method For Searching For Relevant Multimedia Content |
US11/423,136 Abandoned US20070033521A1 (en) | 2000-07-24 | 2006-06-08 | System And Method For Transcoding A Multimedia File To Accommodate A Client Display |
US11/423,140 Abandoned US20070033292A1 (en) | 2000-07-24 | 2006-06-08 | Method For Sending Multimedia Bookmarks Over A Network |
Family Applications After (3)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US11/504,058 Expired - Fee Related US7823055B2 (en) | 2000-07-24 | 2006-08-14 | System and method for indexing, searching, identifying, and editing multimedia files |
US11/581,740 Abandoned US20070038612A1 (en) | 2000-07-24 | 2006-10-16 | System and method for indexing, searching, identifying, and editing multimedia files |
US12/605,874 Abandoned US20110093492A1 (en) | 2000-07-24 | 2009-10-26 | System and Method for Indexing, Searching, Identifying, and Editing Multimedia Files |
Country Status (4)
Country | Link |
---|---|
US (9) | US7624337B2 (en) |
KR (3) | KR20040041082A (en) |
AU (1) | AU2001283004A1 (en) |
WO (1) | WO2002008948A2 (en) |
Cited By (50)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20030236716A1 (en) * | 2002-06-25 | 2003-12-25 | Manico Joseph A. | Software and system for customizing a presentation of digital images |
US20040044745A1 (en) * | 2002-08-30 | 2004-03-04 | Fujitsu Limited | Method, apparatus, and computer program for servicing viewing record of contents |
US20040158676A1 (en) * | 2001-01-03 | 2004-08-12 | Yehoshaphat Kasmirsky | Content-based storage management |
US20050010553A1 (en) * | 2000-10-30 | 2005-01-13 | Microsoft Corporation | Semi-automatic annotation of multimedia objects |
US20050055344A1 (en) * | 2000-10-30 | 2005-03-10 | Microsoft Corporation | Image retrieval systems and methods with semantic and feature based relevance feedback |
US20050102256A1 (en) * | 2003-11-07 | 2005-05-12 | Ibm Corporation | Single pass workload directed clustering of XML documents |
US20070050226A1 (en) * | 2005-08-31 | 2007-03-01 | Soichiro Iga | Information display system, information display apparatus, and information display method |
US20070204238A1 (en) * | 2006-02-27 | 2007-08-30 | Microsoft Corporation | Smart Video Presentation |
US20070273754A1 (en) * | 2004-07-14 | 2007-11-29 | Ectus Limited | Method and System for Correlating Content with Linear Media |
US20090132924A1 (en) * | 2007-11-15 | 2009-05-21 | Yojak Harshad Vasa | System and method to create highlight portions of media content |
US20090158203A1 (en) * | 2007-12-14 | 2009-06-18 | Apple Inc. | Scrolling displayed objects using a 3D remote controller in a media system |
WO2009094635A1 (en) * | 2008-01-25 | 2009-07-30 | Visual Information Technologies, Inc. | Scalable architecture for dynamic visualization of multimedia information |
US20090265649A1 (en) * | 2006-12-06 | 2009-10-22 | Pumpone, Llc | System and method for management and distribution of multimedia presentations |
US20100046909A1 (en) * | 2005-12-08 | 2010-02-25 | Louis Chevallier | Method for Identifying a Document Recorded by a Display, Selection of Key Images and an Associated Receptor |
US20100153847A1 (en) * | 2008-12-17 | 2010-06-17 | Sony Computer Entertainment America Inc. | User deformation of movie character images |
US20100275123A1 (en) * | 2009-04-22 | 2010-10-28 | Microsoft Corporation | Media Timeline Interaction |
US20100318600A1 (en) * | 2009-06-15 | 2010-12-16 | David Furbeck | Methods and apparatus to facilitate client controlled sessionless adaptation |
US8195734B1 (en) | 2006-11-27 | 2012-06-05 | The Research Foundation Of State University Of New York | Combining multiple clusterings by soft correspondence |
US20130097643A1 (en) * | 2011-10-17 | 2013-04-18 | Microsoft Corporation | Interactive video |
US20130185398A1 (en) * | 2010-10-06 | 2013-07-18 | Industry-University Cooperation Foundation Korea Aerospace University | Apparatus and method for providing streaming content |
US20130226930A1 (en) * | 2012-02-29 | 2013-08-29 | Telefonaktiebolaget L M Ericsson (Publ) | Apparatus and Methods For Indexing Multimedia Content |
US8639086B2 (en) | 2009-01-06 | 2014-01-28 | Adobe Systems Incorporated | Rendering of video based on overlaying of bitmapped images |
US8650489B1 (en) * | 2007-04-20 | 2014-02-11 | Adobe Systems Incorporated | Event processing in a content editor |
US20140089806A1 (en) * | 2012-09-25 | 2014-03-27 | John C. Weast | Techniques for enhanced content seek |
US20140281013A1 (en) * | 2010-10-06 | 2014-09-18 | Electronics And Telecommunications Research Institute | Apparatus and method for providing streaming content |
US8918311B1 (en) * | 2012-03-21 | 2014-12-23 | 3Play Media, Inc. | Intelligent caption systems and methods |
US9110562B1 (en) * | 2012-07-26 | 2015-08-18 | Google Inc. | Snapping a pointing-indicator to a scene boundary of a video |
US20150264149A1 (en) * | 2012-12-07 | 2015-09-17 | Huawei Technologies Co., Ltd. | Multimedia Redirection Method, Multimedia Server, and Computer System |
US20150365736A1 (en) * | 2014-06-13 | 2015-12-17 | Hulu, LLC | Video Delivery System Configured to Seek in a Video Using Different Modes |
US9253484B2 (en) | 2013-03-06 | 2016-02-02 | Disney Enterprises, Inc. | Key frame aligned transcoding using statistics file |
US9336302B1 (en) | 2012-07-20 | 2016-05-10 | Zuci Realty Llc | Insight and algorithmic clustering for automated synthesis |
US9336685B2 (en) * | 2013-08-12 | 2016-05-10 | Curious.Com, Inc. | Video lesson builder system and method |
US9411422B1 (en) * | 2013-12-13 | 2016-08-09 | Audible, Inc. | User interaction with content markers |
US9456170B1 (en) | 2013-10-08 | 2016-09-27 | 3Play Media, Inc. | Automated caption positioning systems and methods |
US9633015B2 (en) | 2012-07-26 | 2017-04-25 | Telefonaktiebolaget Lm Ericsson (Publ) | Apparatus and methods for user generated content indexing |
US9704111B1 (en) | 2011-09-27 | 2017-07-11 | 3Play Media, Inc. | Electronic transcription job market |
US9854260B2 (en) | 2013-03-06 | 2017-12-26 | Disney Enterprises, Inc. | Key frame aligned transcoding using key frame list file |
US10277660B1 (en) | 2010-09-06 | 2019-04-30 | Ideahub Inc. | Apparatus and method for providing streaming content |
US10289810B2 (en) | 2013-08-29 | 2019-05-14 | Telefonaktiebolaget Lm Ericsson (Publ) | Method, content owner device, computer program, and computer program product for distributing content items to authorized users |
US10311038B2 (en) | 2013-08-29 | 2019-06-04 | Telefonaktiebolaget Lm Ericsson (Publ) | Methods, computer program, computer program product and indexing systems for indexing or updating index |
US10324612B2 (en) | 2007-12-14 | 2019-06-18 | Apple Inc. | Scroll bar with video region in a media system |
US10362130B2 (en) | 2010-07-20 | 2019-07-23 | Ideahub Inc. | Apparatus and method for providing streaming contents |
US10445367B2 (en) | 2013-05-14 | 2019-10-15 | Telefonaktiebolaget Lm Ericsson (Publ) | Search engine for textual content and non-textual content |
US10491748B1 (en) | 2006-04-03 | 2019-11-26 | Wai Wu | Intelligent communication routing system and method |
USD893612S1 (en) * | 2016-11-18 | 2020-08-18 | International Business Machines Corporation | Training card |
WO2020214404A1 (en) * | 2019-04-19 | 2020-10-22 | Microsoft Technology Licensing, Llc | Previewing video content referenced by typed hyperlinks in comments |
US11205103B2 (en) | 2016-12-09 | 2021-12-21 | The Research Foundation for the State University | Semisupervised autoencoder for sentiment analysis |
US11678031B2 (en) | 2019-04-19 | 2023-06-13 | Microsoft Technology Licensing, Llc | Authoring comments including typed hyperlinks that reference video content |
US11735186B2 (en) | 2021-09-07 | 2023-08-22 | 3Play Media, Inc. | Hybrid live captioning systems and methods |
US11785194B2 (en) | 2019-04-19 | 2023-10-10 | Microsoft Technology Licensing, Llc | Contextually-aware control of a user interface displaying a video and related user text |
Families Citing this family (1358)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US8352400B2 (en) | 1991-12-23 | 2013-01-08 | Hoffberg Steven M | Adaptive pattern recognition based controller apparatus and method and human-factored interface therefore |
US6769128B1 (en) | 1995-06-07 | 2004-07-27 | United Video Properties, Inc. | Electronic television program guide schedule system and method with data feed access |
US9630443B2 (en) | 1995-07-27 | 2017-04-25 | Digimarc Corporation | Printer driver separately applying watermark and information |
US7685426B2 (en) * | 1996-05-07 | 2010-03-23 | Digimarc Corporation | Managing and indexing content on a network with image bookmarks and digital watermarks |
US6601103B1 (en) * | 1996-08-22 | 2003-07-29 | Intel Corporation | Method and apparatus for providing personalized supplemental programming |
US20020120925A1 (en) * | 2000-03-28 | 2002-08-29 | Logan James D. | Audio and video program recording, editing and playback systems using metadata |
US20030093790A1 (en) * | 2000-03-28 | 2003-05-15 | Logan James D. | Audio and video program recording, editing and playback systems using metadata |
US7756721B1 (en) * | 1997-03-14 | 2010-07-13 | Best Doctors, Inc. | Health care management system |
US6735253B1 (en) * | 1997-05-16 | 2004-05-11 | The Trustees Of Columbia University In The City Of New York | Methods and architecture for indexing and editing compressed video over the world wide web |
AU733993B2 (en) | 1997-07-21 | 2001-05-31 | Rovi Guides, Inc. | Systems and methods for displaying and recording control interfaces |
US7596755B2 (en) * | 1997-12-22 | 2009-09-29 | Ricoh Company, Ltd. | Multimedia visualization and integration environment |
US7162052B2 (en) | 1998-04-16 | 2007-01-09 | Digimarc Corporation | Steganographically encoding specular surfaces |
CN1867068A (en) | 1998-07-14 | 2006-11-22 | 联合视频制品公司 | Client-server based interactive television program guide system with remote server recording |
US6898762B2 (en) | 1998-08-21 | 2005-05-24 | United Video Properties, Inc. | Client-server electronic program guide |
US8645838B2 (en) * | 1998-10-01 | 2014-02-04 | Digimarc Corporation | Method for enhancing content using persistent content identification |
US7143434B1 (en) | 1998-11-06 | 2006-11-28 | Seungyup Paek | Video description system and method |
US6453348B1 (en) | 1998-11-06 | 2002-09-17 | Ameritech Corporation | Extranet architecture |
US6859799B1 (en) | 1998-11-30 | 2005-02-22 | Gemstar Development Corporation | Search engine for video and graphics |
US7966078B2 (en) | 1999-02-01 | 2011-06-21 | Steven Hoffberg | Network media appliance system and method |
US7992163B1 (en) | 1999-06-11 | 2011-08-02 | Jerding Dean F | Video-on-demand navigational system |
US7010801B1 (en) | 1999-06-11 | 2006-03-07 | Scientific-Atlanta, Inc. | Video on demand system with parameter-controlled bandwidth deallocation |
US6817028B1 (en) | 1999-06-11 | 2004-11-09 | Scientific-Atlanta, Inc. | Reduced screen control system for interactive program guide |
US7908172B2 (en) | 2000-03-09 | 2011-03-15 | Impulse Radio Inc | System and method for generating multimedia accompaniments to broadcast data |
WO2003009592A1 (en) * | 2001-07-17 | 2003-01-30 | Impulse Radio, Inc. | System and method for transmitting digital multimedia data with analog broadcast data. |
US7975277B1 (en) | 2000-04-03 | 2011-07-05 | Jerding Dean F | System for providing alternative services |
US7200857B1 (en) | 2000-06-09 | 2007-04-03 | Scientific-Atlanta, Inc. | Synchronized video-on-demand supplemental commentary |
US8516525B1 (en) | 2000-06-09 | 2013-08-20 | Dean F. Jerding | Integrated searching system for interactive media guide |
US9602862B2 (en) * | 2000-04-16 | 2017-03-21 | The Directv Group, Inc. | Accessing programs using networked digital video recording devices |
CA2378342A1 (en) * | 2000-04-20 | 2001-11-01 | General Electric Company | Method and system for graphically identifying replacement parts for generally complex equipment |
US7934232B1 (en) | 2000-05-04 | 2011-04-26 | Jerding Dean F | Navigation paradigm for access to television services |
US8028314B1 (en) | 2000-05-26 | 2011-09-27 | Sharp Laboratories Of America, Inc. | Audiovisual information management system |
US8069259B2 (en) | 2000-06-09 | 2011-11-29 | Rodriguez Arturo A | Managing removal of media titles from a list |
US9038108B2 (en) * | 2000-06-28 | 2015-05-19 | Verizon Patent And Licensing Inc. | Method and system for providing end user community functionality for publication and delivery of digital media content |
US8126313B2 (en) * | 2000-06-28 | 2012-02-28 | Verizon Business Network Services Inc. | Method and system for providing a personal video recorder utilizing network-based digital media content |
GB0015896D0 (en) * | 2000-06-28 | 2000-08-23 | Twi Interactive Inc | Multimedia publishing system |
US7962370B2 (en) | 2000-06-29 | 2011-06-14 | Rodriguez Arturo A | Methods in a media service system for transaction processing |
KR100617237B1 (en) * | 2000-07-31 | 2006-08-31 | 엘지전자 주식회사 | Method for generating multimedia event using Short Message Service |
IE20010743A1 (en) * | 2000-08-04 | 2002-04-17 | Mobileaware Technologies Ltd | An e-business mobility platform |
US7165092B2 (en) | 2000-08-14 | 2007-01-16 | Imagitas, Inc. | System and method for sharing information among provider systems |
US7092370B2 (en) * | 2000-08-17 | 2006-08-15 | Roamware, Inc. | Method and system for wireless voice channel/data channel integration |
US7630959B2 (en) * | 2000-09-06 | 2009-12-08 | Imagitas, Inc. | System and method for processing database queries |
US8595372B2 (en) | 2000-09-12 | 2013-11-26 | Wag Acquisition, Llc | Streaming media buffering system |
US6766376B2 (en) | 2000-09-12 | 2004-07-20 | Sn Acquisition, L.L.C | Streaming media buffering system |
US7716358B2 (en) | 2000-09-12 | 2010-05-11 | Wag Acquisition, Llc | Streaming media buffering system |
US8205237B2 (en) | 2000-09-14 | 2012-06-19 | Cox Ingemar J | Identifying works, using a sub-linear time search, such as an approximate nearest neighbor search, for initiating a work-based action, such as an action on the internet |
US8020183B2 (en) | 2000-09-14 | 2011-09-13 | Sharp Laboratories Of America, Inc. | Audiovisual management system |
US7103906B1 (en) | 2000-09-29 | 2006-09-05 | International Business Machines Corporation | User controlled multi-device media-on-demand system |
US6774908B2 (en) * | 2000-10-03 | 2004-08-10 | Creative Frontier Inc. | System and method for tracking an object in a video and linking information thereto |
US8316450B2 (en) * | 2000-10-10 | 2012-11-20 | Addn Click, Inc. | System for inserting/overlaying markers, data packets and objects relative to viewable content and enabling live social networking, N-dimensional virtual environments and/or other value derivable from the content |
CN101715109A (en) | 2000-10-11 | 2010-05-26 | 联合视频制品公司 | Systems and methods for providing storage of data on servers in an on-demand media delivery system |
US7340759B1 (en) | 2000-11-10 | 2008-03-04 | Scientific-Atlanta, Inc. | Systems and methods for adaptive pricing in a digital broadband delivery system |
FI114364B (en) * | 2000-11-22 | 2004-09-30 | Nokia Corp | Data transfer |
EP1209614A1 (en) * | 2000-11-28 | 2002-05-29 | Koninklijke Philips Electronics N.V. | Methods for partionning a set of objects and method for searching in a partition of a set of objects |
GB0029893D0 (en) * | 2000-12-07 | 2001-01-24 | Sony Uk Ltd | Video information retrieval |
US7266704B2 (en) * | 2000-12-18 | 2007-09-04 | Digimarc Corporation | User-friendly rights management systems and methods |
US8055899B2 (en) | 2000-12-18 | 2011-11-08 | Digimarc Corporation | Systems and methods using digital watermarking and identifier extraction to provide promotional opportunities |
WO2002052565A1 (en) * | 2000-12-22 | 2002-07-04 | Muvee Technologies Pte Ltd | System and method for media production |
US20070300258A1 (en) * | 2001-01-29 | 2007-12-27 | O'connor Daniel | Methods and systems for providing media assets over a network |
US20020162118A1 (en) * | 2001-01-30 | 2002-10-31 | Levy Kenneth L. | Efficient interactive TV |
US20030192060A1 (en) * | 2001-01-30 | 2003-10-09 | Levy Kenneth L. | Digital watermarking and television services |
US20050183017A1 (en) * | 2001-01-31 | 2005-08-18 | Microsoft Corporation | Seekbar in taskbar player visualization mode |
WO2002063625A2 (en) * | 2001-02-08 | 2002-08-15 | Newsplayer International Ltd | Media editing method and software therefor |
US6971060B1 (en) * | 2001-02-09 | 2005-11-29 | Openwave Systems Inc. | Signal-processing based approach to translation of web pages into wireless pages |
US20030038796A1 (en) * | 2001-02-15 | 2003-02-27 | Van Beek Petrus J.L. | Segmentation metadata for audio-visual content |
CN101257609B (en) | 2001-02-21 | 2014-03-19 | 联合视频制品公司 | Systems and method for interactive program guides with personal video recording features |
SE520533C2 (en) * | 2001-03-13 | 2003-07-22 | Picsearch Ab | Method, computer programs and systems for indexing digitized devices |
US7216289B2 (en) * | 2001-03-16 | 2007-05-08 | Microsoft Corporation | Method and apparatus for synchronizing multiple versions of digital data |
US7380250B2 (en) * | 2001-03-16 | 2008-05-27 | Microsoft Corporation | Method and system for interacting with devices having different capabilities |
US20040019658A1 (en) * | 2001-03-26 | 2004-01-29 | Microsoft Corporation | Metadata retrieval protocols and namespace identifiers |
US6520032B2 (en) * | 2001-03-27 | 2003-02-18 | Trw Vehicle Safety Systems Inc. | Seat belt tension sensing apparatus |
US7143353B2 (en) * | 2001-03-30 | 2006-11-28 | Koninklijke Philips Electronics, N.V. | Streaming video bookmarks |
US7248715B2 (en) * | 2001-04-06 | 2007-07-24 | Digimarc Corporation | Digitally watermarking physical media |
US7181506B1 (en) * | 2001-04-06 | 2007-02-20 | Mcafee, Inc. | System and method to securely confirm performance of task by a peer in a peer-to-peer network environment |
US20030053656A1 (en) * | 2001-04-06 | 2003-03-20 | Levy Kenneth L. | Digitally watermarking physical media |
US7062555B1 (en) | 2001-04-06 | 2006-06-13 | Networks Associates Technology, Inc. | System and method for automatic selection of service provider for efficient use of bandwidth and resources in a peer-to-peer network environment |
US7280738B2 (en) * | 2001-04-09 | 2007-10-09 | International Business Machines Corporation | Method and system for specifying a selection of content segments stored in different formats |
US6741996B1 (en) * | 2001-04-18 | 2004-05-25 | Microsoft Corporation | Managing user clips |
US20020156832A1 (en) * | 2001-04-18 | 2002-10-24 | International Business Machines Corporation | Method and apparatus for dynamic bookmarks with attributes |
US7904814B2 (en) | 2001-04-19 | 2011-03-08 | Sharp Laboratories Of America, Inc. | System for presenting audio-video content |
US8401336B2 (en) | 2001-05-04 | 2013-03-19 | Legend3D, Inc. | System and method for rapid image sequence depth enhancement with augmented computer-generated elements |
US9286941B2 (en) | 2001-05-04 | 2016-03-15 | Legend3D, Inc. | Image sequence enhancement and motion picture project management system |
US8897596B1 (en) | 2001-05-04 | 2014-11-25 | Legend3D, Inc. | System and method for rapid image sequence depth enhancement with translucent elements |
US6735578B2 (en) * | 2001-05-10 | 2004-05-11 | Honeywell International Inc. | Indexing of knowledge base in multilayer self-organizing maps with hessian and perturbation induced fast learning |
CA2386303C (en) * | 2001-05-14 | 2005-07-05 | At&T Corp. | Method for content-based non-linear control of multimedia playback |
US7620363B2 (en) | 2001-05-16 | 2009-11-17 | Aol Llc | Proximity synchronization of audio content among multiple playback and storage devices |
US7890661B2 (en) * | 2001-05-16 | 2011-02-15 | Aol Inc. | Proximity synchronizing audio gateway device |
US8732232B2 (en) * | 2001-05-16 | 2014-05-20 | Facebook, Inc. | Proximity synchronizing audio playback device |
FR2825556A1 (en) * | 2001-05-31 | 2002-12-06 | Koninkl Philips Electronics Nv | GENERATION OF A DESCRIPTION IN A TAGGING LANGUAGE OF A STRUCTURE OF MULTIMEDIA CONTENT |
US7493397B1 (en) * | 2001-06-06 | 2009-02-17 | Microsoft Corporation | Providing remote processing services over a distributed communications network |
US6870956B2 (en) * | 2001-06-14 | 2005-03-22 | Microsoft Corporation | Method and apparatus for shot detection |
US7970260B2 (en) * | 2001-06-27 | 2011-06-28 | Verizon Business Global Llc | Digital media asset management system and method for supporting multiple users |
US8990214B2 (en) * | 2001-06-27 | 2015-03-24 | Verizon Patent And Licensing Inc. | Method and system for providing distributed editing and storage of digital media over a network |
US20060236221A1 (en) * | 2001-06-27 | 2006-10-19 | Mci, Llc. | Method and system for providing digital media management using templates and profiles |
US8972862B2 (en) | 2001-06-27 | 2015-03-03 | Verizon Patent And Licensing Inc. | Method and system for providing remote digital media ingest with centralized editorial control |
US20070089151A1 (en) * | 2001-06-27 | 2007-04-19 | Mci, Llc. | Method and system for delivery of digital media experience via common instant communication clients |
US8006262B2 (en) | 2001-06-29 | 2011-08-23 | Rodriguez Arturo A | Graphic user interfaces for purchasable and recordable media (PRM) downloads |
US7526788B2 (en) | 2001-06-29 | 2009-04-28 | Scientific-Atlanta, Inc. | Graphic user interface alternate download options for unavailable PRM content |
US7512964B2 (en) | 2001-06-29 | 2009-03-31 | Cisco Technology | System and method for archiving multiple downloaded recordable media content |
US7496945B2 (en) | 2001-06-29 | 2009-02-24 | Cisco Technology, Inc. | Interactive program guide for bidirectional services |
US7594218B1 (en) * | 2001-07-24 | 2009-09-22 | Adobe Systems Incorporated | System and method for providing audio in a media file |
GB2381086A (en) * | 2001-07-30 | 2003-04-23 | Tentendigital Ltd | Learning content management system |
US7296231B2 (en) * | 2001-08-09 | 2007-11-13 | Eastman Kodak Company | Video structuring by probabilistic merging of video segments |
US7016885B1 (en) * | 2001-08-28 | 2006-03-21 | University Of Central Florida Research Foundation, Inc. | Self-designing intelligent signal processing system capable of evolutional learning for classification/recognition of one and multidimensional signals |
GB0121170D0 (en) * | 2001-08-31 | 2001-10-24 | Nokia Corp | Improvements in and relating to content selection |
US6928405B2 (en) * | 2001-09-05 | 2005-08-09 | Inventec Corporation | Method of adding audio data to an information title of a document |
US7143102B2 (en) * | 2001-09-28 | 2006-11-28 | Sigmatel, Inc. | Autogenerated play lists from search criteria |
US7328344B2 (en) | 2001-09-28 | 2008-02-05 | Imagitas, Inc. | Authority-neutral certification for multiple-authority PKI environments |
US7415539B2 (en) * | 2001-09-28 | 2008-08-19 | Siebel Systems, Inc. | Method and apparatus for detecting insufficient memory for data extraction processes |
US7257649B2 (en) * | 2001-09-28 | 2007-08-14 | Siebel Systems, Inc. | Method and system for transferring information during server synchronization with a computing device |
US7474698B2 (en) | 2001-10-19 | 2009-01-06 | Sharp Laboratories Of America, Inc. | Identification of replay segments |
US7192235B2 (en) * | 2001-11-01 | 2007-03-20 | Palm, Inc. | Temporary messaging address system and method |
US20030098869A1 (en) * | 2001-11-09 | 2003-05-29 | Arnold Glenn Christopher | Real time interactive video system |
US6859803B2 (en) * | 2001-11-13 | 2005-02-22 | Koninklijke Philips Electronics N.V. | Apparatus and method for program selection utilizing exclusive and inclusive metadata searches |
US7320137B1 (en) | 2001-12-06 | 2008-01-15 | Digeo, Inc. | Method and system for distributing personalized editions of media programs using bookmarks |
US7032177B2 (en) * | 2001-12-27 | 2006-04-18 | Digeo, Inc. | Method and system for distributing personalized editions of media programs using bookmarks |
AU2002351310A1 (en) | 2001-12-06 | 2003-06-23 | The Trustees Of Columbia University In The City Of New York | System and method for extracting text captions from video and generating video summaries |
US8799975B2 (en) * | 2001-12-06 | 2014-08-05 | Sony Corporation | System and method for providing content associated with a television broadcast |
JP4000844B2 (en) * | 2001-12-11 | 2007-10-31 | 日本電気株式会社 | Content distribution system, content distribution system distribution server and display terminal, and content distribution program |
JP3733061B2 (en) * | 2001-12-18 | 2006-01-11 | 三洋電機株式会社 | Image recording device |
CN1620695A (en) * | 2001-12-25 | 2005-05-25 | 松下电器产业株式会社 | Reproducing device, computer readable program and reproducing method |
KR100493674B1 (en) * | 2001-12-29 | 2005-06-03 | 엘지전자 주식회사 | Multimedia data searching and browsing system |
US20030140093A1 (en) * | 2002-01-23 | 2003-07-24 | Factor Cory L. | Method and apparatus for providing content over a distributed network |
US20070113250A1 (en) * | 2002-01-29 | 2007-05-17 | Logan James D | On demand fantasy sports systems and methods |
US6899475B2 (en) * | 2002-01-30 | 2005-05-31 | Digimarc Corporation | Watermarking a page description language file |
US7287222B2 (en) * | 2002-01-31 | 2007-10-23 | Canon Kabushiki Kaisha | Information processing apparatus and method that determines effectiveness of metadata for editing information content |
US7142225B1 (en) * | 2002-01-31 | 2006-11-28 | Microsoft Corporation | Lossless manipulation of media objects |
US7334251B2 (en) | 2002-02-11 | 2008-02-19 | Scientific-Atlanta, Inc. | Management of television advertising |
US9479550B2 (en) * | 2002-02-12 | 2016-10-25 | Google Technology Holdings LLC | System for providing continuity of broadcast between clients and method therefor |
WO2003075184A1 (en) * | 2002-03-06 | 2003-09-12 | Chung-Tae Kim | Methods for constructing multimedia database and providing multimedia-search service and apparatus therefor |
TWI247295B (en) * | 2002-03-09 | 2006-01-11 | Samsung Electronics Co Ltd | Reproducing method and apparatus for interactive mode using markup documents |
US6976026B1 (en) * | 2002-03-14 | 2005-12-13 | Microsoft Corporation | Distributing limited storage among a collection of media objects |
JP4199671B2 (en) * | 2002-03-15 | 2008-12-17 | 富士通株式会社 | Regional information retrieval method and regional information retrieval apparatus |
US8214741B2 (en) | 2002-03-19 | 2012-07-03 | Sharp Laboratories Of America, Inc. | Synchronization of video and data |
US20030182139A1 (en) * | 2002-03-22 | 2003-09-25 | Microsoft Corporation | Storage, retrieval, and display of contextual art with digital media files |
US7143139B2 (en) | 2002-03-27 | 2006-11-28 | International Business Machines Corporation | Broadcast tiers in decentralized networks |
US7181536B2 (en) * | 2002-03-27 | 2007-02-20 | International Business Machines Corporation | Interminable peer relationships in transient communities |
US7251689B2 (en) | 2002-03-27 | 2007-07-31 | International Business Machines Corporation | Managing storage resources in decentralized networks |
US7177929B2 (en) * | 2002-03-27 | 2007-02-13 | International Business Machines Corporation | Persisting node reputations in transient network communities |
US7069318B2 (en) | 2002-03-27 | 2006-06-27 | International Business Machines Corporation | Content tracking in transient network communities |
US7039701B2 (en) * | 2002-03-27 | 2006-05-02 | International Business Machines Corporation | Providing management functions in decentralized networks |
KR20020057837A (en) * | 2002-03-29 | 2002-07-12 | 문의선 | Streaming service method and system |
US20030187820A1 (en) * | 2002-03-29 | 2003-10-02 | Michael Kohut | Media management system and process |
US8214655B2 (en) | 2002-03-29 | 2012-07-03 | Kabushiki Kaisha Toshiba | Data structure of multimedia file format, encrypting method and device thereof, and decrypting method and device thereof |
CN1452079A (en) * | 2002-04-16 | 2003-10-29 | 霍树亚 | Electronic information term selected transaction and transaction term transmission controlling system |
DE10218812A1 (en) * | 2002-04-26 | 2003-11-20 | Siemens Ag | Generic stream description |
AU2003231102A1 (en) * | 2002-04-26 | 2003-11-10 | Electronics And Telecommunications Research Institute | Method and system for optimal video transcoding based on utility function descriptors |
AU2003241340A1 (en) * | 2002-04-30 | 2003-11-17 | University Of Southern California | Preparing and presenting content |
US7200611B2 (en) | 2002-05-13 | 2007-04-03 | Microsoft Corporation | TV program database |
JP3747884B2 (en) * | 2002-05-23 | 2006-02-22 | ソニー株式会社 | Content recording / reproducing apparatus, content recording / reproducing method, and computer program |
JP4065142B2 (en) * | 2002-05-31 | 2008-03-19 | 松下電器産業株式会社 | Authoring apparatus and authoring method |
US7379654B2 (en) * | 2002-06-19 | 2008-05-27 | Microsoft Corporation | Programmable video recorder backing store for non-byte stream formats |
US7219308B2 (en) * | 2002-06-21 | 2007-05-15 | Microsoft Corporation | User interface for media player program |
JP2004030327A (en) * | 2002-06-26 | 2004-01-29 | Sony Corp | Device and method for providing contents-related information, electronic bulletin board system and computer program |
US20040002993A1 (en) * | 2002-06-26 | 2004-01-01 | Microsoft Corporation | User feedback processing of metadata associated with digital media files |
GB2391150B (en) * | 2002-07-19 | 2005-10-26 | Autodesk Canada Inc | Editing image data |
US20040021684A1 (en) * | 2002-07-23 | 2004-02-05 | Dominick B. Millner | Method and system for an interactive video system |
US20040019520A1 (en) * | 2002-07-24 | 2004-01-29 | Guglielmucci Luis Felipe | Business model for the sale of recorded media through the Internet and other distribution channels adapted to the acoustic print and/or replay system set up of the customer |
US20040019527A1 (en) * | 2002-07-24 | 2004-01-29 | Guglielmucci Luis Felipe | System for the sale of recorded media through the internet adapted to the acoustic print and replay system set up of the customer |
US7149755B2 (en) * | 2002-07-29 | 2006-12-12 | Hewlett-Packard Development Company, Lp. | Presenting a collection of media objects |
US7136866B2 (en) * | 2002-08-15 | 2006-11-14 | Microsoft Corporation | Media identifier registry |
US7290057B2 (en) * | 2002-08-20 | 2007-10-30 | Microsoft Corporation | Media streaming of web content data |
US20040059836A1 (en) * | 2002-09-23 | 2004-03-25 | Peter Spaepen | Method for generating and displaying a digital datafile containing video data |
US7240075B1 (en) * | 2002-09-24 | 2007-07-03 | Exphand, Inc. | Interactive generating query related to telestrator data designating at least a portion of the still image frame and data identifying a user is generated from the user designating a selected region on the display screen, transmitting the query to the remote information system |
FR2845179B1 (en) * | 2002-09-27 | 2004-11-05 | Thomson Licensing Sa | METHOD FOR GROUPING IMAGES OF A VIDEO SEQUENCE |
US7657907B2 (en) | 2002-09-30 | 2010-02-02 | Sharp Laboratories Of America, Inc. | Automatic user profiling |
US7574653B2 (en) * | 2002-10-11 | 2009-08-11 | Microsoft Corporation | Adaptive image formatting control |
US20040073950A1 (en) * | 2002-10-15 | 2004-04-15 | Koninklijke Philips Electronics N.V. | Method and apparatus for user-selective execution and recording of interactive audio/video components |
US7904936B2 (en) * | 2002-10-18 | 2011-03-08 | Time Warner Interactive Video Group, Inc. | Technique for resegmenting assets containing programming content delivered through a communications network |
CA2409114A1 (en) * | 2002-10-22 | 2004-04-22 | N-Liter Inc. | Method for information retrieval |
US7116716B2 (en) | 2002-11-01 | 2006-10-03 | Microsoft Corporation | Systems and methods for generating a motion attention model |
US7375731B2 (en) * | 2002-11-01 | 2008-05-20 | Mitsubishi Electric Research Laboratories, Inc. | Video mining using unsupervised clustering of video content |
US7926080B2 (en) | 2002-11-07 | 2011-04-12 | Microsoft Corporation | Trick mode support for VOD with long intra-frame intervals |
US7158957B2 (en) * | 2002-11-21 | 2007-01-02 | Honeywell International Inc. | Supervised self organizing maps with fuzzy error correction |
US7197503B2 (en) * | 2002-11-26 | 2007-03-27 | Honeywell International Inc. | Intelligent retrieval and classification of information from a product manual |
GB2395805A (en) | 2002-11-27 | 2004-06-02 | Sony Uk Ltd | Information retrieval |
US8204353B2 (en) * | 2002-11-27 | 2012-06-19 | The Nielsen Company (Us), Llc | Apparatus and methods for tracking and analyzing digital recording device event sequences |
GB2395808A (en) * | 2002-11-27 | 2004-06-02 | Sony Uk Ltd | Information retrieval |
US9396473B2 (en) | 2002-11-27 | 2016-07-19 | Accenture Global Services Limited | Searching within a contact center portal |
US20040111750A1 (en) * | 2002-12-05 | 2004-06-10 | Stuckman Bruce E. | DSL video service with automatic program selector |
US7870593B2 (en) * | 2002-12-05 | 2011-01-11 | Att Knowledge Ventures, L.P. | DSL video service with storage |
US8086093B2 (en) * | 2002-12-05 | 2011-12-27 | At&T Ip I, Lp | DSL video service with memory manager |
US20040111754A1 (en) * | 2002-12-05 | 2004-06-10 | Bushey Robert R. | System and method for delivering media content |
US20040111748A1 (en) * | 2002-12-05 | 2004-06-10 | Bushey Robert R. | System and method for search, selection and delivery of media content |
KR100511785B1 (en) * | 2002-12-20 | 2005-08-31 | 한국전자통신연구원 | A System and A Method for Authoring Multimedia Content Description Metadata |
ATE341381T1 (en) * | 2002-12-24 | 2006-10-15 | Koninkl Philips Electronics Nv | METHOD AND SYSTEM FOR MARKING A SOUND SIGNAL WITH METADATA |
EP1586045A1 (en) * | 2002-12-27 | 2005-10-19 | Nielsen Media Research, Inc. | Methods and apparatus for transcoding metadata |
US7082572B2 (en) * | 2002-12-30 | 2006-07-25 | The Board Of Trustees Of The Leland Stanford Junior University | Methods and apparatus for interactive map-based analysis of digital video content |
US20040125114A1 (en) * | 2002-12-31 | 2004-07-01 | Hauke Schmidt | Multiresolution image synthesis for navigation |
US7131059B2 (en) * | 2002-12-31 | 2006-10-31 | Hewlett-Packard Development Company, L.P. | Scalably presenting a collection of media objects |
US7676820B2 (en) * | 2003-01-06 | 2010-03-09 | Koninklijke Philips Electronics N.V. | Method and apparatus for similar video content hopping |
US7111000B2 (en) * | 2003-01-06 | 2006-09-19 | Microsoft Corporation | Retrieval of structured documents |
US7593915B2 (en) * | 2003-01-07 | 2009-09-22 | Accenture Global Services Gmbh | Customized multi-media services |
US8225194B2 (en) | 2003-01-09 | 2012-07-17 | Kaleidescape, Inc. | Bookmarks and watchpoints for selection and presentation of media streams |
US7197698B2 (en) * | 2003-01-21 | 2007-03-27 | Canon Kabushiki Kaisha | Information processing method and apparatus |
JP2004228721A (en) * | 2003-01-21 | 2004-08-12 | Hitachi Ltd | Contents display apparatus and method |
US7493646B2 (en) | 2003-01-30 | 2009-02-17 | United Video Properties, Inc. | Interactive television systems with digital video recording and adjustable reminders |
US20040152055A1 (en) * | 2003-01-30 | 2004-08-05 | Gliessner Michael J.G. | Video based language learning system |
US7913279B2 (en) * | 2003-01-31 | 2011-03-22 | Microsoft Corporation | Global listings format (GLF) for multimedia programming content and electronic program guide (EPG) information |
US20040151311A1 (en) * | 2003-02-04 | 2004-08-05 | Max Hamberg | Encrypted photo archive |
US7164798B2 (en) * | 2003-02-18 | 2007-01-16 | Microsoft Corporation | Learning-based automatic commercial content detection |
US7260261B2 (en) * | 2003-02-20 | 2007-08-21 | Microsoft Corporation | Systems and methods for enhanced image adaptation |
WO2004077793A1 (en) * | 2003-02-28 | 2004-09-10 | Matsushita Electric Industrial Co., Ltd. | System and method for content history log collection for digital rights management |
US20050177847A1 (en) * | 2003-03-07 | 2005-08-11 | Richard Konig | Determining channel associated with video stream |
US7694318B2 (en) * | 2003-03-07 | 2010-04-06 | Technology, Patents & Licensing, Inc. | Video detection and insertion |
US20050149968A1 (en) * | 2003-03-07 | 2005-07-07 | Richard Konig | Ending advertisement insertion |
US7738704B2 (en) * | 2003-03-07 | 2010-06-15 | Technology, Patents And Licensing, Inc. | Detecting known video entities utilizing fingerprints |
US7809154B2 (en) | 2003-03-07 | 2010-10-05 | Technology, Patents & Licensing, Inc. | Video entity recognition in compressed digital video streams |
US20040181545A1 (en) * | 2003-03-10 | 2004-09-16 | Yining Deng | Generating and rendering annotated video files |
US20040181550A1 (en) * | 2003-03-13 | 2004-09-16 | Ville Warsta | System and method for efficient adaptation of multimedia message content |
US7835504B1 (en) | 2003-03-16 | 2010-11-16 | Palm, Inc. | Telephone number parsing and linking |
US7231229B1 (en) | 2003-03-16 | 2007-06-12 | Palm, Inc. | Communication device interface |
US8832758B2 (en) * | 2003-03-17 | 2014-09-09 | Qwest Communications International Inc. | Methods and systems for providing video on demand |
US7886333B2 (en) * | 2003-03-19 | 2011-02-08 | Panasonic Corporation | In-vehicle recording/reproduction device, recording/reproduction device, recording/reproduction system, and recording/reproduction method |
US20040230328A1 (en) * | 2003-03-21 | 2004-11-18 | Steve Armstrong | Remote data visualization within an asset data system for a process plant |
US7885963B2 (en) * | 2003-03-24 | 2011-02-08 | Microsoft Corporation | Free text and attribute searching of electronic program guide (EPG) data |
EP1463258A1 (en) * | 2003-03-28 | 2004-09-29 | Mobile Integrated Solutions Limited | A system and method for transferring data over a wireless communications network |
US7526565B2 (en) * | 2003-04-03 | 2009-04-28 | International Business Machines Corporation | Multiple description hinting and switching for adaptive media services |
US7519685B2 (en) * | 2003-04-04 | 2009-04-14 | Panasonic Corporation | Contents linkage information delivery system |
US20040199491A1 (en) * | 2003-04-04 | 2004-10-07 | Nikhil Bhatt | Domain specific search engine |
US8392834B2 (en) * | 2003-04-09 | 2013-03-05 | Hewlett-Packard Development Company, L.P. | Systems and methods of authoring a multimedia file |
EP1469476A1 (en) * | 2003-04-16 | 2004-10-20 | Accenture Global Services GmbH | Controlled multi-media program review |
US8572104B2 (en) | 2003-04-18 | 2013-10-29 | Kaleidescape, Inc. | Sales of collections excluding those already purchased |
US20050050103A1 (en) * | 2003-07-15 | 2005-03-03 | Kaleidescape | Displaying and presenting multiple media streams from multiple DVD sets |
JP4611285B2 (en) | 2003-04-29 | 2011-01-12 | エルジー エレクトロニクス インコーポレイティド | RECORDING MEDIUM HAVING DATA STRUCTURE FOR MANAGING GRAPHIC DATA REPRODUCTION, RECORDING AND REPRODUCING METHOD AND APPARATUS THEREFOR |
US7552387B2 (en) * | 2003-04-30 | 2009-06-23 | Hewlett-Packard Development Company, L.P. | Methods and systems for video content browsing |
JP2004336343A (en) * | 2003-05-07 | 2004-11-25 | Canon Inc | Image processing system |
JP4661047B2 (en) * | 2003-05-30 | 2011-03-30 | ソニー株式会社 | Information processing apparatus, information processing method, and computer program |
US20040254960A1 (en) * | 2003-06-10 | 2004-12-16 | Scaturro Paul E. | System and method for delivering video and music files over network |
US20060156355A1 (en) * | 2003-06-11 | 2006-07-13 | Masahiro Kawasaki | Reproduction apparatus, program, integrated circuit |
US8069255B2 (en) * | 2003-06-18 | 2011-11-29 | AT&T Intellectual Property I, L.P. | Apparatus and method for aggregating disparate storage on consumer electronics devices |
US8014557B2 (en) * | 2003-06-23 | 2011-09-06 | Digimarc Corporation | Watermarking electronic text documents |
US7757182B2 (en) | 2003-06-25 | 2010-07-13 | Microsoft Corporation | Taskbar media player |
US7512884B2 (en) | 2003-06-25 | 2009-03-31 | Microsoft Corporation | System and method for switching of media presentation |
US7555540B2 (en) * | 2003-06-25 | 2009-06-30 | Microsoft Corporation | Media foundation media processor |
US7734568B2 (en) * | 2003-06-26 | 2010-06-08 | Microsoft Corporation | DVD metadata wizard |
US7434170B2 (en) * | 2003-07-09 | 2008-10-07 | Microsoft Corporation | Drag and drop metadata editing |
US7293227B2 (en) * | 2003-07-18 | 2007-11-06 | Microsoft Corporation | Associating image files with media content |
US20050015405A1 (en) * | 2003-07-18 | 2005-01-20 | Microsoft Corporation | Multi-valued properties |
US20050015389A1 (en) * | 2003-07-18 | 2005-01-20 | Microsoft Corporation | Intelligent metadata attribute resolution |
US7392477B2 (en) * | 2003-07-18 | 2008-06-24 | Microsoft Corporation | Resolving metadata matched to media content |
US7468735B2 (en) * | 2003-07-24 | 2008-12-23 | Sony Corporation | Transitioning between two high resolution images in a slideshow |
US20050071881A1 (en) * | 2003-09-30 | 2005-03-31 | Deshpande Sachin G. | Systems and methods for playlist creation and playback |
US7400761B2 (en) * | 2003-09-30 | 2008-07-15 | Microsoft Corporation | Contrast-based image attention analysis framework |
CN100421095C (en) * | 2003-09-30 | 2008-09-24 | 索尼株式会社 | Content acquisition method |
JP2007513398A (en) * | 2003-09-30 | 2007-05-24 | コーニンクレッカ フィリップス エレクトロニクス エヌ ヴィ | Method and apparatus for identifying high-level structure of program |
US7352373B2 (en) * | 2003-09-30 | 2008-04-01 | Sharp Laboratories Of America, Inc. | Systems and methods for multi-dimensional dither structure creation and application |
KR100969966B1 (en) * | 2003-10-06 | 2010-07-15 | 디즈니엔터프라이지즈,인크. | System and method of playback and feature control for video players |
US8321534B1 (en) * | 2003-10-15 | 2012-11-27 | Radix Holdings, Llc | System and method for synchronization based on preferences |
US7471827B2 (en) * | 2003-10-16 | 2008-12-30 | Microsoft Corporation | Automatic browsing path generation to present image areas with high attention value as a function of space and time |
US20050144305A1 (en) * | 2003-10-21 | 2005-06-30 | The Board Of Trustees Operating Michigan State University | Systems and methods for identifying, segmenting, collecting, annotating, and publishing multimedia materials |
GB2407445B (en) * | 2003-10-24 | 2005-12-28 | Motorola Inc | A method and apparatus for selecting highlights from a recorded medium |
US7693855B2 (en) * | 2003-10-28 | 2010-04-06 | Media Cybernetics, Inc. | Method, system, and computer program product for managing data associated with a document stored in an electronic form |
US7693899B2 (en) * | 2003-10-28 | 2010-04-06 | Media Cybernetics, Inc. | Method, system, and computer program product for constructing a query with a graphical user interface |
US8650596B2 (en) * | 2003-11-03 | 2014-02-11 | Microsoft Corporation | Multi-axis television navigation |
GB0325673D0 (en) * | 2003-11-04 | 2003-12-10 | Koninkl Philips Electronics Nv | Virtual content directory service |
JP2005151362A (en) * | 2003-11-18 | 2005-06-09 | Pioneer Electronic Corp | Information processor, information editing apparatus, information processing system, reproducing apparatus, reproducing system, method thereof, program thereof, and recording medium recording this program |
US7673062B2 (en) * | 2003-11-18 | 2010-03-02 | Yahoo! Inc. | Method and apparatus for assisting with playback of remotely stored media files |
JP2005151363A (en) * | 2003-11-18 | 2005-06-09 | Pioneer Electronic Corp | Information processor, information editing apparatus, information processing system, reproducing apparatus, reproducing system, method thereof, program thereof, and recording medium recording this program |
KR100595616B1 (en) * | 2003-11-24 | 2006-06-30 | 엘지전자 주식회사 | Motion estimation method for digital video trans-coding |
KR100986417B1 (en) | 2003-11-27 | 2010-10-08 | 엘지전자 주식회사 | Multimedia message control method of mobile terminal |
US7523096B2 (en) | 2003-12-03 | 2009-04-21 | Google Inc. | Methods and systems for personalized network searching |
US7705859B2 (en) * | 2003-12-03 | 2010-04-27 | Sony Corporation | Transitioning between two high resolution video sources |
US20050122345A1 (en) * | 2003-12-05 | 2005-06-09 | Kirn Kevin N. | System and method for media-enabled messaging having publish-and-send feature |
US7519274B2 (en) | 2003-12-08 | 2009-04-14 | Divx, Inc. | File format for multiple track digital data |
US8472792B2 (en) | 2003-12-08 | 2013-06-25 | Divx, Llc | Multimedia distribution system |
CN1627293A (en) * | 2003-12-09 | 2005-06-15 | 皇家飞利浦电子股份有限公司 | Electronic bookmark |
US7333985B2 (en) * | 2003-12-15 | 2008-02-19 | Microsoft Corporation | Dynamic content clustering |
US20050132264A1 (en) * | 2003-12-15 | 2005-06-16 | Joshi Ajit P. | System and method for intelligent transcoding |
US8693043B2 (en) * | 2003-12-19 | 2014-04-08 | Kofax, Inc. | Automatic document separation |
WO2005064927A1 (en) * | 2003-12-25 | 2005-07-14 | Matsushita Electric Industrial Co., Ltd. | Television broadcast reception device, television broadcast reception method, and television broadcast reception program |
JP2005190088A (en) * | 2003-12-25 | 2005-07-14 | Matsushita Electric Ind Co Ltd | E-mail processor and e-mail processing system |
CA2454290C (en) * | 2003-12-29 | 2013-05-21 | Ibm Canada Limited-Ibm Canada Limitee | Graphical user interface (gui) script generation and documentation |
GB2409737A (en) * | 2003-12-31 | 2005-07-06 | Nokia Corp | Bookmarking digital content |
US7672864B2 (en) | 2004-01-09 | 2010-03-02 | Ricoh Company Ltd. | Generating and displaying level-of-interest values |
KR100597398B1 (en) * | 2004-01-15 | 2006-07-06 | 삼성전자주식회사 | Apparatus and method for searching for video clip |
US8161388B2 (en) | 2004-01-21 | 2012-04-17 | Rodriguez Arturo A | Interactive discovery of display device characteristics |
EP1557837A1 (en) * | 2004-01-26 | 2005-07-27 | Sony International (Europe) GmbH | Redundancy elimination in a content-adaptive video preview system |
GB2429597B (en) * | 2004-02-06 | 2009-09-23 | Agency Science Tech & Res | Automatic video event detection and indexing |
US8356317B2 (en) * | 2004-03-04 | 2013-01-15 | Sharp Laboratories Of America, Inc. | Presence based technology |
US8949899B2 (en) | 2005-03-04 | 2015-02-03 | Sharp Laboratories Of America, Inc. | Collaborative recommendation system |
JP4295644B2 (en) * | 2004-03-08 | 2009-07-15 | 京セラ株式会社 | Mobile terminal, broadcast recording / playback method for mobile terminal, and broadcast recording / playback program |
US8782654B2 (en) | 2004-03-13 | 2014-07-15 | Adaptive Computing Enterprises, Inc. | Co-allocating a reservation spanning different compute resources types |
WO2005089241A2 (en) | 2004-03-13 | 2005-09-29 | Cluster Resources, Inc. | System and method for providing object triggers |
US7983835B2 (en) | 2004-11-03 | 2011-07-19 | Lagassey Paul J | Modular intelligent transportation system |
WO2005091175A1 (en) * | 2004-03-15 | 2005-09-29 | Yahoo! Inc. | Search systems and methods with integration of user annotations |
KR20050094557A (en) * | 2004-03-23 | 2005-09-28 | 김정태 | System for extracting optional area in static contents |
WO2005089061A2 (en) * | 2004-03-23 | 2005-09-29 | Nds Limited | Optimally adapting multimedia content for mobile subscriber device playback |
US8688248B2 (en) * | 2004-04-19 | 2014-04-01 | Shazam Investments Limited | Method and system for content sampling and identification |
US7962938B2 (en) * | 2004-04-27 | 2011-06-14 | Microsoft Corporation | Specialized media presentation via an electronic program guide (EPG) |
US7889760B2 (en) * | 2004-04-30 | 2011-02-15 | Microsoft Corporation | Systems and methods for sending binary, file contents, and other information, across SIP info and text communication channels |
WO2005106699A1 (en) | 2004-05-03 | 2005-11-10 | Lg Electronics Inc. | Method and apparatus for managing bookmark information for content stored in a networked media server |
US7890604B2 (en) | 2004-05-07 | 2011-02-15 | Microsoft Corporation | Client-side callbacks to server events |
US7457516B2 (en) * | 2004-05-07 | 2008-11-25 | Intervideo Inc. | Video editing system and method of computer system |
US20050251380A1 (en) * | 2004-05-10 | 2005-11-10 | Simon Calvert | Designer regions and Interactive control designers |
US9026578B2 (en) * | 2004-05-14 | 2015-05-05 | Microsoft Corporation | Systems and methods for persisting data between web pages |
US8065600B2 (en) * | 2004-05-14 | 2011-11-22 | Microsoft Corporation | Systems and methods for defining web content navigation |
US9219729B2 (en) | 2004-05-19 | 2015-12-22 | Philip Drope | Multimedia network system with content importation, content exportation, and integrated content management |
US20070266388A1 (en) | 2004-06-18 | 2007-11-15 | Cluster Resources, Inc. | System and method for providing advanced reservations in a compute environment |
US7437358B2 (en) * | 2004-06-25 | 2008-10-14 | Apple Inc. | Methods and systems for managing data |
US7774326B2 (en) | 2004-06-25 | 2010-08-10 | Apple Inc. | Methods and systems for managing data |
US7730012B2 (en) | 2004-06-25 | 2010-06-01 | Apple Inc. | Methods and systems for managing data |
KR20060001554A (en) * | 2004-06-30 | 2006-01-06 | 엘지전자 주식회사 | System for managing contents using bookmark |
JP4251634B2 (en) * | 2004-06-30 | 2009-04-08 | 株式会社東芝 | Multimedia data reproducing apparatus and multimedia data reproducing method |
AU2005269957B2 (en) | 2004-07-02 | 2011-09-22 | The Nielsen Company (Us), Llc | Methods and apparatus for identifying viewing information associated with a digital media device |
JP4552540B2 (en) * | 2004-07-09 | 2010-09-29 | ソニー株式会社 | Content recording apparatus, content reproducing apparatus, content recording method, content reproducing method, and program |
US9053754B2 (en) | 2004-07-28 | 2015-06-09 | Microsoft Technology Licensing, Llc | Thumbnail generation and presentation for recorded TV programs |
US7590997B2 (en) | 2004-07-30 | 2009-09-15 | Broadband Itv, Inc. | System and method for managing, converting and displaying video content on a video-on-demand platform, including ads used for drill-down navigation and consumer-generated classified ads |
US20110030013A1 (en) * | 2004-07-30 | 2011-02-03 | Diaz Perez Milton | Converting, navigating and displaying video content uploaded from the internet to a digital TV video-on-demand platform |
US7631336B2 (en) | 2004-07-30 | 2009-12-08 | Broadband Itv, Inc. | Method for converting, navigating and displaying video content uploaded from the internet to a digital TV video-on-demand platform |
US9344765B2 (en) | 2004-07-30 | 2016-05-17 | Broadband Itv, Inc. | Dynamic adjustment of electronic program guide displays based on viewer preferences for minimizing navigation in VOD program selection |
US11259059B2 (en) | 2004-07-30 | 2022-02-22 | Broadband Itv, Inc. | System for addressing on-demand TV program content on TV services platform of a digital TV services provider |
US9584868B2 (en) | 2004-07-30 | 2017-02-28 | Broadband Itv, Inc. | Dynamic adjustment of electronic program guide displays based on viewer preferences for minimizing navigation in VOD program selection |
JP4626210B2 (en) * | 2004-07-30 | 2011-02-02 | ソニー株式会社 | Content providing system, content providing server, information processing apparatus, and computer program |
US7986372B2 (en) * | 2004-08-02 | 2011-07-26 | Microsoft Corporation | Systems and methods for smart media content thumbnail extraction |
US7487072B2 (en) * | 2004-08-04 | 2009-02-03 | International Business Machines Corporation | Method and system for querying multimedia data where adjusting the conversion of the current portion of the multimedia data signal based on the comparing at least one set of confidence values to the threshold |
US8176490B1 (en) | 2004-08-20 | 2012-05-08 | Adaptive Computing Enterprises, Inc. | System and method of interfacing a workload manager and scheduler with an identity manager |
US20060064719A1 (en) * | 2004-09-17 | 2006-03-23 | Youden John J | Simultaneous video input display and selection system and method |
US8086575B2 (en) | 2004-09-23 | 2011-12-27 | Rovi Solutions Corporation | Methods and apparatus for integrating disparate media formats in a networked media system |
US20060067654A1 (en) * | 2004-09-24 | 2006-03-30 | Magix Ag | Graphical user interface adaptable to multiple display devices |
US20060112056A1 (en) * | 2004-09-27 | 2006-05-25 | Accenture Global Services Gmbh | Problem solving graphical toolbar |
US9373029B2 (en) | 2007-07-11 | 2016-06-21 | Ricoh Co., Ltd. | Invisible junction feature recognition for document security or annotation |
US7702673B2 (en) * | 2004-10-01 | 2010-04-20 | Ricoh Co., Ltd. | System and methods for creation and use of a mixed media environment |
US8989431B1 (en) | 2007-07-11 | 2015-03-24 | Ricoh Co., Ltd. | Ad hoc paper-based networking with mixed media reality |
US9495385B2 (en) | 2004-10-01 | 2016-11-15 | Ricoh Co., Ltd. | Mixed media reality recognition using multiple specialized indexes |
US9171202B2 (en) | 2005-08-23 | 2015-10-27 | Ricoh Co., Ltd. | Data organization and access for mixed media document system |
US9384619B2 (en) | 2006-07-31 | 2016-07-05 | Ricoh Co., Ltd. | Searching media content for objects specified using identifiers |
US7812986B2 (en) | 2005-08-23 | 2010-10-12 | Ricoh Co. Ltd. | System and methods for use of voice mail and email in a mixed media environment |
US9530050B1 (en) | 2007-07-11 | 2016-12-27 | Ricoh Co., Ltd. | Document annotation sharing |
US9405751B2 (en) | 2005-08-23 | 2016-08-02 | Ricoh Co., Ltd. | Database for mixed media document system |
JP2007065928A (en) * | 2005-08-30 | 2007-03-15 | Toshiba Corp | Information storage medium, information processing method, information transfer method, information reproduction method, information reproduction device, information recording method, information recording device, and program |
US20060083194A1 (en) * | 2004-10-19 | 2006-04-20 | Ardian Dhrimaj | System and method rendering audio/image data on remote devices |
US7797720B2 (en) * | 2004-10-22 | 2010-09-14 | Microsoft Corporation | Advanced trick mode |
JP4243862B2 (en) * | 2004-10-26 | 2009-03-25 | ソニー株式会社 | Content utilization apparatus and content utilization method |
US20060093320A1 (en) * | 2004-10-29 | 2006-05-04 | Hallberg Bryan S | Operation modes for a personal video recorder using dynamically generated time stamps |
US20060095338A1 (en) * | 2004-11-02 | 2006-05-04 | Microsoft Corporation | Strategies for gifting resources |
US8271980B2 (en) | 2004-11-08 | 2012-09-18 | Adaptive Computing Enterprises, Inc. | System and method of providing system jobs within a compute environment |
US7933338B1 (en) | 2004-11-10 | 2011-04-26 | Google Inc. | Ranking video articles |
US20060106869A1 (en) * | 2004-11-17 | 2006-05-18 | Ulead Systems, Inc. | Multimedia enhancement system using the multimedia database |
US20060174290A1 (en) * | 2004-11-23 | 2006-08-03 | Garwin Richard L | Enhanced program viewing method |
US9066063B2 (en) * | 2004-11-23 | 2015-06-23 | International Business Machines Corporation | Enhanced program viewing method |
US20060227721A1 (en) * | 2004-11-24 | 2006-10-12 | Junichi Hirai | Content transmission device and content transmission method |
KR100713517B1 (en) * | 2004-11-26 | 2007-05-02 | 삼성전자주식회사 | PVR By Using MetaData and Its Recording Control Method |
JP2006154262A (en) * | 2004-11-29 | 2006-06-15 | Kyocera Corp | Portable terminal, and method for controlling the same, and program |
JP4637557B2 (en) * | 2004-12-06 | 2011-02-23 | 京セラ株式会社 | Mobile terminal, mobile terminal control method and program |
KR100739770B1 (en) * | 2004-12-11 | 2007-07-13 | 삼성전자주식회사 | Storage medium including meta data capable of applying to multi-angle title and apparatus and method thereof |
US9420021B2 (en) | 2004-12-13 | 2016-08-16 | Nokia Technologies Oy | Media device and method of enhancing use of media device |
US7660431B2 (en) * | 2004-12-16 | 2010-02-09 | Motorola, Inc. | Image recognition facilitation using remotely sourced content |
JP2006174309A (en) * | 2004-12-17 | 2006-06-29 | Ricoh Co Ltd | Animation reproducing apparatus, program, and record medium |
JP4218758B2 (en) * | 2004-12-21 | 2009-02-04 | インターナショナル・ビジネス・マシーンズ・コーポレーション | Subtitle generating apparatus, subtitle generating method, and program |
US7272592B2 (en) | 2004-12-30 | 2007-09-18 | Microsoft Corporation | Updating metadata stored in a read-only media file |
FI20041689A0 (en) * | 2004-12-30 | 2004-12-30 | Nokia Corp | Marking and / or splitting of media stream into a cellular network terminal |
KR100782810B1 (en) * | 2005-01-07 | 2007-12-06 | 삼성전자주식회사 | Apparatus and method of reproducing an storage medium having metadata for providing enhanced search |
US8842977B2 (en) * | 2005-01-07 | 2014-09-23 | Samsung Electronics Co., Ltd. | Storage medium storing metadata for providing enhanced search function |
JP4595555B2 (en) * | 2005-01-20 | 2010-12-08 | ソニー株式会社 | Content playback apparatus and content playback method |
JP4247626B2 (en) * | 2005-01-20 | 2009-04-02 | ソニー株式会社 | Playback apparatus and playback method |
US20060162546A1 (en) * | 2005-01-21 | 2006-07-27 | Sanden Corporation | Sealing member of a compressor |
US20080141135A1 (en) * | 2005-01-24 | 2008-06-12 | Fitphonic Systems, Llc | Interactive Audio/Video Instruction System |
TW200704183A (en) * | 2005-01-27 | 2007-01-16 | Matrix Tv | Dynamic mosaic extended electronic programming guide for television program selection and display |
US8021277B2 (en) | 2005-02-02 | 2011-09-20 | Mad Dogg Athletics, Inc. | Programmed exercise bicycle with computer aided guidance |
JP3789463B1 (en) * | 2005-02-07 | 2006-06-21 | 三菱電機株式会社 | Recommended program extracting apparatus and recommended program extracting method |
US8407201B2 (en) * | 2005-02-15 | 2013-03-26 | Hewlett-Packard Development Company, L.P. | Digital image search and retrieval system |
US7805679B2 (en) * | 2005-02-24 | 2010-09-28 | Fujifilm Corporation | Apparatus and method for generating slide show and program therefor |
US20060195859A1 (en) * | 2005-02-25 | 2006-08-31 | Richard Konig | Detecting known video entities taking into account regions of disinterest |
US20060195860A1 (en) * | 2005-02-25 | 2006-08-31 | Eldering Charles A | Acting on known video entities detected utilizing fingerprinting |
KR100798551B1 (en) * | 2005-03-01 | 2008-01-28 | 비브콤 인코포레이티드 | Method for localizing a frame and presenting segmentation information for audio-visual programs |
KR100825191B1 (en) * | 2005-03-03 | 2008-04-24 | 비브콤 인코포레이티드 | Fast metadata generation using indexing audio-visual programs and graphical user interface, and reusing segmentation metadata |
WO2006096612A2 (en) | 2005-03-04 | 2006-09-14 | The Trustees Of Columbia University In The City Of New York | System and method for motion estimation and mode decision for low-complexity h.264 decoder |
US8219635B2 (en) * | 2005-03-09 | 2012-07-10 | Vudu, Inc. | Continuous data feeding in a distributed environment |
US8904463B2 (en) | 2005-03-09 | 2014-12-02 | Vudu, Inc. | Live video broadcasting on distributed networks |
US20080022343A1 (en) * | 2006-07-24 | 2008-01-24 | Vvond, Inc. | Multiple audio streams |
US9176955B2 (en) * | 2005-03-09 | 2015-11-03 | Vvond, Inc. | Method and apparatus for sharing media files among network nodes |
US8863143B2 (en) | 2006-03-16 | 2014-10-14 | Adaptive Computing Enterprises, Inc. | System and method for managing a hybrid compute environment |
US8631130B2 (en) | 2005-03-16 | 2014-01-14 | Adaptive Computing Enterprises, Inc. | Reserving resources in an on-demand compute environment from a local compute environment |
US9231886B2 (en) | 2005-03-16 | 2016-01-05 | Adaptive Computing Enterprises, Inc. | Simple integration of an on-demand compute environment |
US7756388B2 (en) | 2005-03-21 | 2010-07-13 | Microsoft Corporation | Media item subgroup generation from a library |
US9769354B2 (en) | 2005-03-24 | 2017-09-19 | Kofax, Inc. | Systems and methods of processing scanned data |
US9137417B2 (en) | 2005-03-24 | 2015-09-15 | Kofax, Inc. | Systems and methods for processing video data |
TW200724142A (en) * | 2005-03-25 | 2007-07-01 | Glaxo Group Ltd | Novel compounds |
JP2006268800A (en) * | 2005-03-25 | 2006-10-05 | Fuji Xerox Co Ltd | Apparatus and method for minutes creation support, and program |
US20060218187A1 (en) * | 2005-03-25 | 2006-09-28 | Microsoft Corporation | Methods, systems, and computer-readable media for generating an ordered list of one or more media items |
JP4741267B2 (en) * | 2005-03-28 | 2011-08-03 | ソニー株式会社 | Content recommendation system, communication terminal, and content recommendation method |
US7647346B2 (en) | 2005-03-29 | 2010-01-12 | Microsoft Corporation | Automatic rules-based device synchronization |
US9165042B2 (en) * | 2005-03-31 | 2015-10-20 | International Business Machines Corporation | System and method for efficiently performing similarity searches of structural data |
US7363298B2 (en) * | 2005-04-01 | 2008-04-22 | Microsoft Corporation | Optimized cache efficiency behavior |
US7533091B2 (en) | 2005-04-06 | 2009-05-12 | Microsoft Corporation | Methods, systems, and computer-readable media for generating a suggested list of media items based upon a seed |
CA2603577A1 (en) | 2005-04-07 | 2006-10-12 | Cluster Resources, Inc. | On-demand access to compute resources |
US9973817B1 (en) | 2005-04-08 | 2018-05-15 | Rovi Guides, Inc. | System and method for providing a list of video-on-demand programs |
BRPI0612974A2 (en) * | 2005-04-18 | 2010-12-14 | Clearplay Inc | computer program product, computer data signal embedded in a streaming media, method for associating a multimedia presentation with content filter information and multimedia player |
US10210159B2 (en) * | 2005-04-21 | 2019-02-19 | Oath Inc. | Media object metadata association and ranking |
US8732175B2 (en) * | 2005-04-21 | 2014-05-20 | Yahoo! Inc. | Interestingness ranking of media objects |
US20060242198A1 (en) * | 2005-04-22 | 2006-10-26 | Microsoft Corporation | Methods, computer-readable media, and data structures for building an authoritative database of digital audio identifier elements and identifying media items |
US7526930B2 (en) * | 2005-04-22 | 2009-05-05 | Schlumberger Technology Corporation | Method system and program storage device for synchronizing displays relative to a point in time |
US7647128B2 (en) * | 2005-04-22 | 2010-01-12 | Microsoft Corporation | Methods, computer-readable media, and data structures for building an authoritative database of digital audio identifier elements and identifying media items |
WO2006110975A1 (en) * | 2005-04-22 | 2006-10-26 | Logovision Wireless Inc. | Multimedia system for mobile client platforms |
US20060239563A1 (en) * | 2005-04-25 | 2006-10-26 | Nokia Corporation | Method and device for compressed domain video editing |
JP2006311462A (en) * | 2005-05-02 | 2006-11-09 | Toshiba Corp | Apparatus and method for retrieval contents |
US7690011B2 (en) * | 2005-05-02 | 2010-03-30 | Technology, Patents & Licensing, Inc. | Video stream modification to defeat detection |
US8145528B2 (en) | 2005-05-23 | 2012-03-27 | Open Text S.A. | Movie advertising placement optimization based on behavior and content analysis |
US9648281B2 (en) | 2005-05-23 | 2017-05-09 | Open Text Sa Ulc | System and method for movie segment bookmarking and sharing |
US8141111B2 (en) | 2005-05-23 | 2012-03-20 | Open Text S.A. | Movie advertising playback techniques |
EP2309737A1 (en) | 2005-05-23 | 2011-04-13 | Thomas S. Gilley | Distributed scalable media environment |
US20060271855A1 (en) * | 2005-05-27 | 2006-11-30 | Microsoft Corporation | Operating system shell management of video files |
US8244796B1 (en) | 2005-05-31 | 2012-08-14 | Adobe Systems Incorporated | Method and apparatus for customizing presentation of notification lists |
US20060277177A1 (en) * | 2005-06-02 | 2006-12-07 | Lunt Tracy T | Identifying electronic files in accordance with a derivative attribute based upon a predetermined relevance criterion |
US20060277154A1 (en) * | 2005-06-02 | 2006-12-07 | Lunt Tracy T | Data structure generated in accordance with a method for identifying electronic files using derivative attributes created from native file attributes |
US20060277207A1 (en) * | 2005-06-06 | 2006-12-07 | Ure Michael J | Enterprise business intelligence using email analytics |
US8099511B1 (en) * | 2005-06-11 | 2012-01-17 | Vudu, Inc. | Instantaneous media-on-demand |
JP2008545992A (en) | 2005-06-13 | 2008-12-18 | ノキア コーポレイション | Support for assisted satellite positioning |
US20060287994A1 (en) * | 2005-06-15 | 2006-12-21 | George David A | Method and apparatus for creating searches in peer-to-peer networks |
KR100724984B1 (en) * | 2005-06-16 | 2007-06-04 | 삼성전자주식회사 | Method for playing digital multimedia broadcasting variously and apparatus thereof |
US7890513B2 (en) * | 2005-06-20 | 2011-02-15 | Microsoft Corporation | Providing community-based media item ratings to users |
JP2007004896A (en) * | 2005-06-23 | 2007-01-11 | Toshiba Corp | Information storage medium, information transfer method, information reproducing method, and information recording method |
US8171394B2 (en) * | 2005-06-24 | 2012-05-01 | Microsoft Corporation | Methods and systems for providing a customized user interface for viewing and editing meta-data |
US7877420B2 (en) * | 2005-06-24 | 2011-01-25 | Microsoft Corporation | Methods and systems for incorporating meta-data in document content |
US7703040B2 (en) * | 2005-06-29 | 2010-04-20 | Microsoft Corporation | Local search engine user interface |
JP2007011928A (en) * | 2005-07-04 | 2007-01-18 | Sony Corp | Content provision system, content provision device, content distribution server, content reception terminal and content provision method |
US7580932B2 (en) * | 2005-07-15 | 2009-08-25 | Microsoft Corporation | User interface for establishing a filtering engine |
JP5133508B2 (en) | 2005-07-21 | 2013-01-30 | ソニー株式会社 | Content providing system, content providing device, content distribution server, content receiving terminal, and content providing method |
KR100690819B1 (en) * | 2005-07-21 | 2007-03-09 | 엘지전자 주식회사 | Mobile terminal having bookmark function for contents service and operation method thereof |
EP1911263A4 (en) * | 2005-07-22 | 2011-03-30 | Kangaroo Media Inc | System and methods for enhancing the experience of spectators attending a live sporting event |
US20070027857A1 (en) * | 2005-07-28 | 2007-02-01 | Li Deng | System and method for searching multimedia and download the search result to mobile devices |
US7831913B2 (en) * | 2005-07-29 | 2010-11-09 | Microsoft Corporation | Selection-based item tagging |
WO2007015228A1 (en) * | 2005-08-02 | 2007-02-08 | Mobixell Networks | Content distribution and tracking |
KR100678954B1 (en) * | 2005-08-08 | 2007-02-06 | 삼성전자주식회사 | Method for using paused time information of media contents in upnp environment |
US7680824B2 (en) | 2005-08-11 | 2010-03-16 | Microsoft Corporation | Single action media playlist generation |
US7681238B2 (en) * | 2005-08-11 | 2010-03-16 | Microsoft Corporation | Remotely accessing protected files via streaming |
US7831605B2 (en) | 2005-08-12 | 2010-11-09 | Microsoft Corporation | Media player service library |
US20070048713A1 (en) * | 2005-08-12 | 2007-03-01 | Microsoft Corporation | Media player service library |
US7236559B2 (en) * | 2005-08-17 | 2007-06-26 | General Electric Company | Dual energy scanning protocols for motion mitigation and material differentiation |
US8189472B2 (en) | 2005-09-07 | 2012-05-29 | Mcdonald James F | Optimizing bandwidth utilization to a subscriber premises |
US9401080B2 (en) | 2005-09-07 | 2016-07-26 | Verizon Patent And Licensing Inc. | Method and apparatus for synchronizing video frames |
US9076311B2 (en) * | 2005-09-07 | 2015-07-07 | Verizon Patent And Licensing Inc. | Method and apparatus for providing remote workflow management |
US20070107012A1 (en) * | 2005-09-07 | 2007-05-10 | Verizon Business Network Services Inc. | Method and apparatus for providing on-demand resource allocation |
US8631226B2 (en) * | 2005-09-07 | 2014-01-14 | Verizon Patent And Licensing Inc. | Method and system for video monitoring |
US7469257B2 (en) * | 2005-09-08 | 2008-12-23 | Microsoft Corporation | Generating and monitoring a multimedia database |
KR100694152B1 (en) * | 2005-09-14 | 2007-03-12 | Samsung Electronics Co., Ltd. | Method and apparatus for managing multimedia contents stored in the digital multimedia device |
US8024768B2 (en) * | 2005-09-15 | 2011-09-20 | Penthera Partners, Inc. | Broadcasting video content to devices having different video presentation capabilities |
US7707485B2 (en) * | 2005-09-28 | 2010-04-27 | Vixs Systems, Inc. | System and method for dynamic transrating based on content |
US7412534B2 (en) * | 2005-09-30 | 2008-08-12 | Yahoo! Inc. | Subscription control panel |
FR2891678A1 (en) * | 2005-09-30 | 2007-04-06 | France Telecom | Multimedia e.g. audio, digital document e.g. journal, delivering system, has streaming server triggering transmission of any document of media content server to user's terminal from document portion marked based on current index value |
US20070077921A1 (en) * | 2005-09-30 | 2007-04-05 | Yahoo! Inc. | Pushing podcasts to mobile devices |
US20070078897A1 (en) * | 2005-09-30 | 2007-04-05 | Yahoo! Inc. | Filemarking pre-existing media files using location tags |
US20070078898A1 (en) * | 2005-09-30 | 2007-04-05 | Yahoo! Inc. | Server-based system and method for retrieving tagged portions of media files |
US20070078714A1 (en) * | 2005-09-30 | 2007-04-05 | Yahoo! Inc. | Automatically matching advertisements to media files |
US20070078832A1 (en) * | 2005-09-30 | 2007-04-05 | Yahoo! Inc. | Method and system for using smart tags and a recommendation engine using smart tags |
US20070078883A1 (en) * | 2005-09-30 | 2007-04-05 | Yahoo! Inc. | Using location tags to render tagged portions of media files |
US8108378B2 (en) * | 2005-09-30 | 2012-01-31 | Yahoo! Inc. | Podcast search engine |
US20070078896A1 (en) * | 2005-09-30 | 2007-04-05 | Yahoo! Inc. | Identifying portions within media files with location tags |
US20070078712A1 (en) * | 2005-09-30 | 2007-04-05 | Yahoo! Inc. | Systems for inserting advertisements into a podcast |
US20070078876A1 (en) * | 2005-09-30 | 2007-04-05 | Yahoo! Inc. | Generating a stream of media data containing portions of media files using location tags |
US8346789B2 (en) * | 2005-10-03 | 2013-01-01 | Intel Corporation | System and method for generating homogeneous metadata from pre-existing metadata |
US20070106627A1 (en) * | 2005-10-05 | 2007-05-10 | Mohit Srivastava | Social discovery systems and methods |
US20090148129A1 (en) * | 2005-10-11 | 2009-06-11 | Hiroyuki Hayashi | Audio visual device |
NO327155B1 (en) | 2005-10-19 | 2009-05-04 | Fast Search & Transfer Asa | Procedure for displaying video data within result presentations in systems for accessing and searching for information |
US8326775B2 (en) * | 2005-10-26 | 2012-12-04 | Cortica Ltd. | Signature generation for multimedia deep-content-classification by a large-scale matching system and method thereof |
US7688686B2 (en) * | 2005-10-27 | 2010-03-30 | Microsoft Corporation | Enhanced table of contents (TOC) identifiers |
US7773813B2 (en) | 2005-10-31 | 2010-08-10 | Microsoft Corporation | Capture-intention detection for video content analysis |
US8180826B2 (en) * | 2005-10-31 | 2012-05-15 | Microsoft Corporation | Media sharing and authoring on the web |
US8196032B2 (en) * | 2005-11-01 | 2012-06-05 | Microsoft Corporation | Template-based multimedia authoring and sharing |
KR100661179B1 (en) * | 2005-11-01 | 2006-12-26 | Samsung Electronics Co., Ltd. | Interface apparatus for list play |
US20070118873A1 (en) * | 2005-11-09 | 2007-05-24 | Bbnt Solutions Llc | Methods and apparatus for merging media content |
US7801910B2 (en) * | 2005-11-09 | 2010-09-21 | Ramp Holdings, Inc. | Method and apparatus for timed tagging of media content |
US20070106685A1 (en) * | 2005-11-09 | 2007-05-10 | Podzinger Corp. | Method and apparatus for updating speech recognition databases and reindexing audio and video content using the same |
US9697231B2 (en) * | 2005-11-09 | 2017-07-04 | Cxense Asa | Methods and apparatus for providing virtual media channels based on media search |
US9697230B2 (en) * | 2005-11-09 | 2017-07-04 | Cxense Asa | Methods and apparatus for dynamic presentation of advertising, factual, and informational content using enhanced metadata in search-driven media applications |
US8751502B2 (en) * | 2005-11-29 | 2014-06-10 | Aol Inc. | Visually-represented results to search queries in rich media content |
US8132103B1 (en) * | 2006-07-19 | 2012-03-06 | Aol Inc. | Audio and/or video scene detection and retrieval |
US9247175B2 (en) * | 2005-11-30 | 2016-01-26 | Broadcom Corporation | Parallel television remote control |
US20070124331A1 (en) * | 2005-11-30 | 2007-05-31 | Sony Ericsson Mobile Communications Ab | Method and apparatus for the seamless delivery of content |
CA2671705A1 (en) * | 2005-12-06 | 2007-06-14 | Pumpone, Llc | System and method for delivery and utilization of content-based products |
KR100776293B1 (en) * | 2005-12-06 | 2007-11-15 | LG Electronics Inc. | Mobile communication terminal and operational method |
JP4894252B2 (en) * | 2005-12-09 | 2012-03-14 | Sony Corporation | Data display device, data display method, and data display program |
JP4437548B2 (en) * | 2005-12-09 | 2010-03-24 | Sony Corporation | Music content display device, music content display method, and music content display program |
US9319720B2 (en) | 2005-12-13 | 2016-04-19 | Audio Pod Inc. | System and method for rendering digital content using time offsets |
WO2007068119A1 (en) | 2005-12-13 | 2007-06-21 | Audio Pod Inc. | Segmentation and transmission of audio streams |
US8533199B2 (en) * | 2005-12-14 | 2013-09-10 | Unifi Scientific Advances, Inc | Intelligent bookmarks and information management system based on the same |
US20070156627A1 (en) * | 2005-12-15 | 2007-07-05 | General Instrument Corporation | Method and apparatus for creating and using electronic content bookmarks |
US20070157072A1 (en) * | 2005-12-29 | 2007-07-05 | Sony Ericsson Mobile Communications Ab | Portable content sharing |
US7599918B2 (en) * | 2005-12-29 | 2009-10-06 | Microsoft Corporation | Dynamic search with implicit user intention mining |
US8607287B2 (en) | 2005-12-29 | 2013-12-10 | United Video Properties, Inc. | Interactive media guidance system having multiple devices |
US9681105B2 (en) | 2005-12-29 | 2017-06-13 | Rovi Guides, Inc. | Interactive media guidance system having multiple devices |
US7509588B2 (en) | 2005-12-30 | 2009-03-24 | Apple Inc. | Portable electronic device with interface reconfiguration mode |
US7685210B2 (en) * | 2005-12-30 | 2010-03-23 | Microsoft Corporation | Media discovery and curation of playlists |
US8032840B2 (en) * | 2006-01-10 | 2011-10-04 | Nokia Corporation | Apparatus, method and computer program product for generating a thumbnail representation of a video sequence |
IL173222A0 (en) * | 2006-01-18 | 2006-06-11 | Clip In Touch Internat Ltd | Apparatus and method for creating and transmitting unique dynamically personalized multimedia messages |
US10418065B1 (en) * | 2006-01-21 | 2019-09-17 | Advanced Anti-Terror Technologies, Inc. | Intellimark customizations for media content streaming and sharing |
US20070174276A1 (en) * | 2006-01-24 | 2007-07-26 | Sbc Knowledge Ventures, L.P. | Thematic grouping of program segments |
US20070174246A1 (en) * | 2006-01-25 | 2007-07-26 | Sigurdsson Johann T | Multiple client search method and system |
KR100772865B1 (en) * | 2006-01-31 | 2007-11-02 | Samsung Electronics Co., Ltd. | Method for recovering av session and control point for the same |
US8145656B2 (en) * | 2006-02-07 | 2012-03-27 | Mobixell Networks Ltd. | Matching of modified visual and audio media |
US7734579B2 (en) * | 2006-02-08 | 2010-06-08 | At&T Intellectual Property I, L.P. | Processing program content material |
KR100782836B1 (en) * | 2006-02-08 | 2007-12-06 | Samsung Electronics Co., Ltd. | Method, apparatus and storage medium for managing contents and adaptive contents playback method using the same |
US8868547B2 (en) * | 2006-02-16 | 2014-10-21 | Dell Products L.P. | Programming content on a device |
US7653342B2 (en) * | 2006-02-16 | 2010-01-26 | Dell Products L.P. | Providing content to a device when lost a connection to the broadcasting station |
US20070198472A1 (en) * | 2006-02-17 | 2007-08-23 | Ford Motor Company | Multimedia system for a vehicle |
JP4811046B2 (en) * | 2006-02-17 | 2011-11-09 | ソニー株式会社 | Content playback apparatus, audio playback device, and content playback method |
KR101187787B1 (en) * | 2006-02-18 | 2012-10-05 | Samsung Electronics Co., Ltd. | Method and apparatus for searching moving picture using key frame |
KR20080096761A (en) * | 2006-02-28 | 2008-11-03 | 샌디스크 아이엘 엘티디 | Bookmarked synchronization of files |
US8689253B2 (en) | 2006-03-03 | 2014-04-01 | Sharp Laboratories Of America, Inc. | Method and system for configuring media-playing sets |
TW200735665A (en) * | 2006-03-03 | 2007-09-16 | Hon Hai Prec Ind Co Ltd | System and method for processing streaming data |
WO2007098615A1 (en) * | 2006-03-03 | 2007-09-07 | Christian Germano Cotichini | Legacy application modernization by capturing, processing and analysing business processes |
WO2007102107A1 (en) * | 2006-03-06 | 2007-09-13 | Koninklijke Philips Electronics N.V. | Method of setting one or more playback markers for media playback and media player for performing the same |
US7739280B2 (en) | 2006-03-06 | 2010-06-15 | Veveo, Inc. | Methods and systems for selecting and presenting content based on user preference information extracted from an aggregate preference signature |
US7515710B2 (en) | 2006-03-14 | 2009-04-07 | Divx, Inc. | Federated digital rights management scheme including trusted systems |
US8316394B2 (en) | 2006-03-24 | 2012-11-20 | United Video Properties, Inc. | Interactive media guidance application with intelligent navigation and display features |
US9812169B2 (en) * | 2006-03-28 | 2017-11-07 | Hewlett-Packard Development Company, L.P. | Operational system and architectural model for improved manipulation of video and time media data from networked time-based media |
CA2647617A1 (en) * | 2006-03-28 | 2007-11-08 | Motionbox, Inc. | System and method for enabling social browsing of networked time-based media |
US8849945B1 (en) * | 2006-03-28 | 2014-09-30 | Amazon Technologies, Inc. | Annotating content with interactive objects for transactions |
US20090129740A1 (en) * | 2006-03-28 | 2009-05-21 | O'brien Christopher J | System for individual and group editing of networked time-based media |
WO2007112445A2 (en) * | 2006-03-28 | 2007-10-04 | Motionbox, Inc. | A system and data model for shared viewing and editing of time-based media |
US20100169786A1 (en) * | 2006-03-29 | 2010-07-01 | O'brien Christopher J | system, method, and apparatus for visual browsing, deep tagging, and synchronized commenting |
US8285595B2 (en) | 2006-03-29 | 2012-10-09 | Napo Enterprises, Llc | System and method for refining media recommendations |
US8190625B1 (en) | 2006-03-29 | 2012-05-29 | A9.Com, Inc. | Method and system for robust hyperlinking |
US20070244856A1 (en) * | 2006-04-14 | 2007-10-18 | Microsoft Corporation | Media Search Scope Expansion |
US20070244903A1 (en) * | 2006-04-18 | 2007-10-18 | Ratliff Emily J | Collectively managing media bookmarks |
US7913157B1 (en) * | 2006-04-18 | 2011-03-22 | Overcast Media Incorporated | Method and system for the authoring and playback of independent, synchronized media through the use of a relative virtual time code |
US8219553B2 (en) * | 2006-04-26 | 2012-07-10 | At&T Intellectual Property I, Lp | Methods, systems, and computer program products for managing audio and/or video information via a web broadcast |
JP2007300497A (en) * | 2006-05-01 | 2007-11-15 | Canon Inc | Program searching apparatus, and control method of program searching apparatus |
US20070256096A1 (en) * | 2006-05-01 | 2007-11-01 | Sbc Knowledge Ventures L.P. | System and method for pushing conditional message data between a client device and a server device in an internet protocol television network |
US8788588B2 (en) * | 2006-05-03 | 2014-07-22 | Samsung Electronics Co., Ltd. | Method of providing service for user search, and apparatus, server, and system for the same |
US20070260634A1 (en) * | 2006-05-04 | 2007-11-08 | Nokia Corporation | Apparatus, system, method, and computer program product for synchronizing the presentation of media content |
WO2007131230A2 (en) * | 2006-05-07 | 2007-11-15 | Wellcomemat, Llc | Methods and systems for online video-based property commerce |
US8015237B2 (en) * | 2006-05-15 | 2011-09-06 | Apple Inc. | Processing of metadata content and media content received by a media distribution system |
WO2007136103A1 (en) * | 2006-05-18 | 2007-11-29 | Eisai R & D Management Co., Ltd. | Antitumor agent for thyroid cancer |
US8341112B2 (en) * | 2006-05-19 | 2012-12-25 | Microsoft Corporation | Annotation by search |
US9507778B2 (en) * | 2006-05-19 | 2016-11-29 | Yahoo! Inc. | Summarization of media object collections |
US20070268406A1 (en) * | 2006-05-22 | 2007-11-22 | Broadcom Corporation, A California Corporation | Video processing system that generates sub-frame metadata |
US8001143B1 (en) | 2006-05-31 | 2011-08-16 | Adobe Systems Incorporated | Aggregating characteristic information for digital content |
WO2007143592A2 (en) * | 2006-06-01 | 2007-12-13 | Divx, Inc. | Content description system |
US20070294240A1 (en) * | 2006-06-07 | 2007-12-20 | Microsoft Corporation | Intent based search |
US20070294292A1 (en) * | 2006-06-14 | 2007-12-20 | Microsoft Corporation | Advertising transfer and playback on portable devices |
US8660407B2 (en) * | 2006-06-14 | 2014-02-25 | Sony Corporation | Method and system for altering the presentation of recorded content |
US7945142B2 (en) * | 2006-06-15 | 2011-05-17 | Microsoft Corporation | Audio/visual editing tool |
US8903843B2 (en) * | 2006-06-21 | 2014-12-02 | Napo Enterprises, Llc | Historical media recommendation service |
JP4229144B2 (en) * | 2006-06-23 | 2009-02-25 | Sony Corporation | Information processing apparatus, information processing method, and computer program |
KR100795358B1 (en) * | 2006-06-26 | 2008-01-17 | Keimyung University Industry-Academic Cooperation Foundation | Music video service method and system and terminal |
US7917514B2 (en) * | 2006-06-28 | 2011-03-29 | Microsoft Corporation | Visual and multi-dimensional search |
US7958075B1 (en) * | 2006-06-29 | 2011-06-07 | At&T Intellectual Property Ii, Lp | Compressing rectilinear pictures and minimizing access control lists |
US7859543B2 (en) * | 2006-06-29 | 2010-12-28 | Apple Inc. | Displaying images |
US8805831B2 (en) | 2006-07-11 | 2014-08-12 | Napo Enterprises, Llc | Scoring and replaying media items |
US8327266B2 (en) | 2006-07-11 | 2012-12-04 | Napo Enterprises, Llc | Graphical user interface system for allowing management of a media item playlist based on a preference scoring system |
US7680959B2 (en) | 2006-07-11 | 2010-03-16 | Napo Enterprises, Llc | P2P network for providing real time media recommendations |
US7970922B2 (en) | 2006-07-11 | 2011-06-28 | Napo Enterprises, Llc | P2P real time media recommendations |
US9003056B2 (en) | 2006-07-11 | 2015-04-07 | Napo Enterprises, Llc | Maintaining a minimum level of real time media recommendations in the absence of online friends |
US8059646B2 (en) | 2006-07-11 | 2011-11-15 | Napo Enterprises, Llc | System and method for identifying music content in a P2P real time recommendation network |
JP5341755B2 (en) | 2006-07-17 | 2013-11-13 | Koninklijke Philips N.V. | Determining environmental parameter sets |
US8306396B2 (en) * | 2006-07-20 | 2012-11-06 | Carnegie Mellon University | Hardware-based, client-side, video compositing system |
US7783622B1 (en) | 2006-07-21 | 2010-08-24 | Aol Inc. | Identification of electronic content significant to a user |
US8364669B1 (en) | 2006-07-21 | 2013-01-29 | Aol Inc. | Popularity of content items |
US9256675B1 (en) | 2006-07-21 | 2016-02-09 | Aol Inc. | Electronic processing and presentation of search results |
US8874586B1 (en) | 2006-07-21 | 2014-10-28 | Aol Inc. | Authority management for electronic searches |
US7624416B1 (en) | 2006-07-21 | 2009-11-24 | Aol Llc | Identifying events of interest within video content |
US7624103B2 (en) * | 2006-07-21 | 2009-11-24 | Aol Llc | Culturally relevant search results |
US8065313B2 (en) * | 2006-07-24 | 2011-11-22 | Google Inc. | Method and apparatus for automatically annotating images |
US9176984B2 (en) | 2006-07-31 | 2015-11-03 | Ricoh Co., Ltd | Mixed media reality retrieval of differentially-weighted links |
US8201076B2 (en) | 2006-07-31 | 2012-06-12 | Ricoh Co., Ltd. | Capturing symbolic information from documents upon printing |
US9063952B2 (en) * | 2006-07-31 | 2015-06-23 | Ricoh Co., Ltd. | Mixed media reality recognition with image tracking |
US8489987B2 (en) | 2006-07-31 | 2013-07-16 | Ricoh Co., Ltd. | Monitoring and analyzing creation and usage of visual content using image and hotspot interaction |
US7769363B2 (en) * | 2006-08-01 | 2010-08-03 | Chew Gregory T H | User-initiated communications during multimedia content playback on a mobile communications device |
JP5257071B2 (en) * | 2006-08-03 | 2013-08-07 | NEC Corporation | Similarity calculation device and information retrieval device |
US8412021B2 (en) * | 2007-05-18 | 2013-04-02 | Fall Front Wireless Ny, Llc | Video player user interface |
US20100194756A1 (en) * | 2006-08-07 | 2010-08-05 | Max-Planck-Gesellschaft Zur Forderung Der Wissenschaften E.V., A Corporation Of Germany | Method for producing scaleable image matrices |
KR100754225B1 (en) | 2006-08-07 | 2007-09-03 | Samsung Electronics Co., Ltd. | Method and apparatus for recording and reproducing interactive service of digital broadcast |
US8620699B2 (en) | 2006-08-08 | 2013-12-31 | Napo Enterprises, Llc | Heavy influencer media recommendations |
US8090606B2 (en) | 2006-08-08 | 2012-01-03 | Napo Enterprises, Llc | Embedded media recommendations |
WO2008021459A2 (en) * | 2006-08-17 | 2008-02-21 | Anchorfree, Inc. | Software web crawler and method thereof |
US8296812B1 (en) | 2006-09-01 | 2012-10-23 | Vudu, Inc. | Streaming video using erasure encoding |
US7956849B2 (en) | 2006-09-06 | 2011-06-07 | Apple Inc. | Video manager for portable multifunction device |
US10313505B2 (en) | 2006-09-06 | 2019-06-04 | Apple Inc. | Portable multifunction device, method, and graphical user interface for configuring and displaying widgets |
US7864163B2 (en) | 2006-09-06 | 2011-01-04 | Apple Inc. | Portable electronic device, method, and graphical user interface for displaying structured electronic documents |
US8842074B2 (en) | 2006-09-06 | 2014-09-23 | Apple Inc. | Portable electronic device performing similar operations for different gestures |
US20080065693A1 (en) * | 2006-09-11 | 2008-03-13 | Bellsouth Intellectual Property Corporation | Presenting and linking segments of tagged media files in a media services network |
US8341152B1 (en) * | 2006-09-12 | 2012-12-25 | Creatier Interactive Llc | System and method for enabling objects within video to be searched on the internet or intranet |
WO2008032717A1 (en) * | 2006-09-12 | 2008-03-20 | Visionarts, Inc. | Method for storing and reading-out data handled by application operating on http client, data storage program, and data read-out program |
US7953713B2 (en) * | 2006-09-14 | 2011-05-31 | International Business Machines Corporation | System and method for representing and using tagged data in a management system |
US11303684B2 (en) * | 2006-09-14 | 2022-04-12 | Opentv, Inc. | Methods and systems for data transmission |
US20080071830A1 (en) * | 2006-09-14 | 2008-03-20 | Bray Pike | Method of indexing and streaming media files on a distributed network |
US8335873B2 (en) * | 2006-09-14 | 2012-12-18 | Opentv, Inc. | Method and systems for data transmission |
JP5003075B2 (en) * | 2006-09-21 | 2012-08-15 | Sony Corporation | Playback apparatus, playback method, and playback program |
US20080201201A1 (en) * | 2006-09-25 | 2008-08-21 | Sms.Ac | Methods and systems for finding, tagging, rating and suggesting content provided by networked application pods |
US20080086453A1 (en) * | 2006-10-05 | 2008-04-10 | Fabian-Baber, Inc. | Method and apparatus for correlating the results of a computer network text search with relevant multimedia files |
US20110161174A1 (en) * | 2006-10-11 | 2011-06-30 | Tagmotion Pty Limited | Method and apparatus for managing multimedia files |
US20080162568A1 (en) * | 2006-10-18 | 2008-07-03 | Huazhang Shen | System and method for estimating real life relationships and popularities among people based on large quantities of personal visual data |
KR101235341B1 (en) * | 2006-10-19 | 2013-02-19 | LG Electronics Inc. | Broadcast Terminal And Method Of Playing Broadcast Data Using Same |
US8520850B2 (en) | 2006-10-20 | 2013-08-27 | Time Warner Cable Enterprises Llc | Downloadable security and protection methods and apparatus |
US7631260B1 (en) * | 2006-10-23 | 2009-12-08 | Adobe Systems Inc. | Application modification based on feed content |
US10657168B2 (en) | 2006-10-24 | 2020-05-19 | Slacker, Inc. | Methods and systems for personalized rendering of digital media content |
KR100852526B1 (en) * | 2006-10-25 | 2008-08-14 | LG Electronics Inc. | Method and apparatus for controlling an saving information of an image display device |
EP1919216A1 (en) * | 2006-10-30 | 2008-05-07 | British Telecommunications Public Limited Company | Personalised media presentation |
KR20080038893A (en) | 2006-10-31 | Samsung Electronics Co., Ltd. | Moving picture file playback method and apparatus |
US8090694B2 (en) | 2006-11-02 | 2012-01-03 | At&T Intellectual Property I, L.P. | Index of locally recorded content |
US8296315B2 (en) * | 2006-11-03 | 2012-10-23 | Microsoft Corporation | Earmarking media documents |
US8269763B2 (en) * | 2006-11-03 | 2012-09-18 | Apple Inc. | Continuous random access points |
US8594702B2 (en) | 2006-11-06 | 2013-11-26 | Yahoo! Inc. | Context server for associating information based on context |
WO2008056284A1 (en) * | 2006-11-06 | 2008-05-15 | Nxp B.V. | System for encoding, transmitting and storing an audio and/or video signal, and a corresponding method |
US20080126919A1 (en) * | 2006-11-08 | 2008-05-29 | General Instrument Corporation | Method, Apparatus and System for Managing Access to Multimedia Content Using Dynamic Media Bookmarks |
US20080112690A1 (en) * | 2006-11-09 | 2008-05-15 | Sbc Knowledge Ventures, L.P. | Personalized local recorded content |
US8577204B2 (en) | 2006-11-13 | 2013-11-05 | Cyberlink Corp. | System and methods for remote manipulation of video over a network |
US9417758B2 (en) * | 2006-11-21 | 2016-08-16 | Daniel E. Tsai | AD-HOC web content player |
US9110903B2 (en) | 2006-11-22 | 2015-08-18 | Yahoo! Inc. | Method, system and apparatus for using user profile electronic device data in media delivery |
US8402356B2 (en) | 2006-11-22 | 2013-03-19 | Yahoo! Inc. | Methods, systems and apparatus for delivery of media |
US7921139B2 (en) * | 2006-12-01 | 2011-04-05 | Whitserve Llc | System for sequentially opening and displaying files in a directory |
US20080155627A1 (en) * | 2006-12-04 | 2008-06-26 | O'connor Daniel | Systems and methods of searching for and presenting video and audio |
US20080134088A1 (en) * | 2006-12-05 | 2008-06-05 | Palm, Inc. | Device for saving results of location based searches |
US20080134030A1 (en) * | 2006-12-05 | 2008-06-05 | Palm, Inc. | Device for providing location-based data |
US10416838B2 (en) * | 2006-12-11 | 2019-09-17 | Oath Inc. | Graphical messages |
US8874655B2 (en) | 2006-12-13 | 2014-10-28 | Napo Enterprises, Llc | Matching participants in a P2P recommendation network loosely coupled to a subscription service |
US20130166580A1 (en) * | 2006-12-13 | 2013-06-27 | Quickplay Media Inc. | Media Processor |
US9064010B2 (en) | 2006-12-13 | 2015-06-23 | Quickplay Media Inc. | Encoding and transcoding for mobile media |
US9571902B2 (en) | 2006-12-13 | 2017-02-14 | Quickplay Media Inc. | Time synchronizing of distinct video and data feeds that are delivered in a single mobile IP data network compatible stream |
KR100836197B1 (en) * | 2006-12-14 | 2008-06-09 | Samsung Electronics Co., Ltd. | Apparatus for detecting caption in moving picture and method of operating the apparatus |
US8732166B1 (en) * | 2006-12-14 | 2014-05-20 | Amazon Technologies, Inc. | Providing dynamically-generated bookmarks or other objects which encourage users to interact with a service |
US8161387B1 (en) | 2006-12-18 | 2012-04-17 | At&T Intellectual Property I, L. P. | Creation of a marked media module |
US8082504B1 (en) | 2006-12-18 | 2011-12-20 | At&T Intellectual Property I, L.P. | Creation of a reference point to mark a media presentation |
EP2503475A1 (en) * | 2006-12-19 | 2012-09-26 | Swisscom AG | Method and device for selective access to data elements in a data set |
KR100773441B1 (en) * | 2006-12-19 | 2007-11-05 | Samsung Electronics Co., Ltd. | Method and apparatus for searching stored files in mobile terminal |
US20080288869A1 (en) * | 2006-12-22 | 2008-11-20 | Apple Inc. | Boolean Search User Interface |
US8276098B2 (en) | 2006-12-22 | 2012-09-25 | Apple Inc. | Interactive image thumbnails |
US7954065B2 (en) * | 2006-12-22 | 2011-05-31 | Apple Inc. | Two-dimensional timeline display of media items |
US7559017B2 (en) * | 2006-12-22 | 2009-07-07 | Google Inc. | Annotation framework for video |
US9142253B2 (en) * | 2006-12-22 | 2015-09-22 | Apple Inc. | Associating keywords to media |
US20080162486A1 (en) * | 2006-12-27 | 2008-07-03 | Research In Motion Limited | Method and apparatus for storing data from a network address |
US20080159724A1 (en) * | 2006-12-27 | 2008-07-03 | Disney Enterprises, Inc. | Method and system for inputting and displaying commentary information with content |
US8099386B2 (en) * | 2006-12-27 | 2012-01-17 | Research In Motion Limited | Method and apparatus for synchronizing databases connected by wireless interface |
US8275741B2 (en) * | 2006-12-27 | 2012-09-25 | Research In Motion Limited | Method and apparatus for memory management in an electronic device |
US10156953B2 (en) * | 2006-12-27 | 2018-12-18 | Blackberry Limited | Method for presenting data on a small screen |
US8046803B1 (en) | 2006-12-28 | 2011-10-25 | Sprint Communications Company L.P. | Contextual multimedia metatagging |
US8769099B2 (en) | 2006-12-28 | 2014-07-01 | Yahoo! Inc. | Methods and systems for pre-caching information on a mobile computing device |
US20080162548A1 (en) * | 2006-12-29 | 2008-07-03 | Zahid Ahmed | Object oriented, semantically-rich universal item information model |
US9270963B2 (en) | 2007-01-03 | 2016-02-23 | Tivo Inc. | Program shortcuts |
ES2935410T3 (en) | 2007-01-05 | 2023-03-06 | Divx Llc | Video distribution system including progressive play |
US8214768B2 (en) * | 2007-01-05 | 2012-07-03 | Apple Inc. | Method, system, and graphical user interface for viewing multiple application windows |
US20080165148A1 (en) * | 2007-01-07 | 2008-07-10 | Richard Williamson | Portable Electronic Device, Method, and Graphical User Interface for Displaying Inline Multimedia Content |
US8519964B2 (en) | 2007-01-07 | 2013-08-27 | Apple Inc. | Portable multifunction device, method, and graphical user interface supporting user navigations of graphical objects on a touch screen display |
US9071729B2 (en) | 2007-01-09 | 2015-06-30 | Cox Communications, Inc. | Providing user communication |
US20080172704A1 (en) * | 2007-01-16 | 2008-07-17 | Montazemi Peyman T | Interactive audiovisual editing system |
US20090070185A1 (en) * | 2007-01-17 | 2009-03-12 | Concert Technology Corporation | System and method for recommending a digital media subscription service |
CN101227590B (en) * | 2007-01-19 | 2013-03-06 | Beijing Funshion Online Technology Co., Ltd. | P2P protocol-based media file order program control method and apparatus |
US8923747B2 (en) * | 2007-01-22 | 2014-12-30 | Jook, Inc. | Wireless sharing of audio files and information for streamlined purchasing |
US7949300B2 (en) * | 2007-01-22 | 2011-05-24 | Jook, Inc. | Wireless sharing of audio files and related information |
US20090063994A1 (en) * | 2007-01-23 | 2009-03-05 | Cox Communications, Inc. | Providing a Content Mark |
US9135334B2 (en) | 2007-01-23 | 2015-09-15 | Cox Communications, Inc. | Providing a social network |
US20090049473A1 (en) * | 2007-01-23 | 2009-02-19 | Cox Communications, Inc. | Providing a Video User Interface |
US8806532B2 (en) * | 2007-01-23 | 2014-08-12 | Cox Communications, Inc. | Providing a user interface |
US8869191B2 (en) | 2007-01-23 | 2014-10-21 | Cox Communications, Inc. | Providing a media guide including parental information |
US8789102B2 (en) * | 2007-01-23 | 2014-07-22 | Cox Communications, Inc. | Providing a customized user interface |
US20090313664A1 (en) * | 2007-01-23 | 2009-12-17 | Cox Communications, Inc. | Providing a Video User Interface |
US8621540B2 (en) | 2007-01-24 | 2013-12-31 | Time Warner Cable Enterprises Llc | Apparatus and methods for provisioning in a download-enabled system |
US20080177536A1 (en) * | 2007-01-24 | 2008-07-24 | Microsoft Corporation | A/v content editing |
US20080181513A1 (en) * | 2007-01-31 | 2008-07-31 | John Almeida | Method, apparatus and algorithm for indexing, searching, retrieval of digital stream by the use of summed partitions |
US20100118190A1 (en) * | 2007-02-06 | 2010-05-13 | Mobixell Networks | Converting images to moving picture format |
EP1959449A1 (en) * | 2007-02-13 | 2008-08-20 | British Telecommunications Public Limited Company | Analysing video material |
US8751475B2 (en) * | 2007-02-14 | 2014-06-10 | Microsoft Corporation | Providing additional information related to earmarks |
US8958483B2 (en) | 2007-02-27 | 2015-02-17 | Adobe Systems Incorporated | Audio/video content synchronization and display |
WO2008109889A1 (en) | 2007-03-08 | 2008-09-12 | Slacker, Inc. | System and method for personalizing playback content through interaction with a playback device |
US8179475B2 (en) * | 2007-03-09 | 2012-05-15 | Legend3D, Inc. | Apparatus and method for synchronizing a secondary audio track to the audio track of a video source |
US7801888B2 (en) | 2007-03-09 | 2010-09-21 | Microsoft Corporation | Media content search results ranked by popularity |
US8103646B2 (en) | 2007-03-13 | 2012-01-24 | Microsoft Corporation | Automatic tagging of content based on a corpus of previously tagged and untagged content |
KR101316743B1 (en) * | 2007-03-13 | 2013-10-08 | Samsung Electronics Co., Ltd. | Method for providing metadata on parts of video image, method for managing the provided metadata and apparatus using the methods |
US9967620B2 (en) | 2007-03-16 | 2018-05-08 | Adobe Systems Incorporated | Video highlights for streaming media |
US7465241B2 (en) * | 2007-03-23 | 2008-12-16 | Acushnet Company | Functionalized, crosslinked, rubber nanoparticles for use in golf ball castable thermoset layers |
US20100274820A1 (en) * | 2007-03-28 | 2010-10-28 | O'brien Christopher J | System and method for autogeneration of long term media data from networked time-based media |
US20080244750A1 (en) * | 2007-03-28 | 2008-10-02 | Benjamin Romero | Method and Apparatus Regarding Attachments to E-mails |
US20080240227A1 (en) * | 2007-03-30 | 2008-10-02 | Wan Wade K | Bitstream processing using marker codes with offset values |
US9071796B2 (en) * | 2007-03-30 | 2015-06-30 | Verizon Patent And Licensing Inc. | Managing multiple media content sources |
US9224427B2 (en) | 2007-04-02 | 2015-12-29 | Napo Enterprises LLC | Rating media item recommendations using recommendation paths and/or media item usage |
US20080250023A1 (en) * | 2007-04-03 | 2008-10-09 | Baker Peter N | System and method for bookmarking content with user feedback |
WO2008129600A1 (en) * | 2007-04-05 | 2008-10-30 | Sony Computer Entertainment Inc. | Content reproduction apparatus, content delivery apparatus, content delivery system, and method for generating metadata |
US8112720B2 (en) | 2007-04-05 | 2012-02-07 | Napo Enterprises, Llc | System and method for automatically and graphically associating programmatically-generated media item recommendations related to a user's socially recommended media items |
US9140552B2 (en) | 2008-07-02 | 2015-09-22 | Qualcomm Incorporated | User defined names for displaying monitored location |
US9031583B2 (en) | 2007-04-11 | 2015-05-12 | Qualcomm Incorporated | Notification on mobile device based on location of other mobile device |
US20080254811A1 (en) * | 2007-04-11 | 2008-10-16 | Palm, Inc. | System and method for monitoring locations of mobile devices |
US8929461B2 (en) * | 2007-04-17 | 2015-01-06 | Intel Corporation | Method and apparatus for caption detection |
US20080275732A1 (en) * | 2007-05-01 | 2008-11-06 | Best Doctors, Inc. | Using patterns of medical treatment codes to determine when further medical expertise is called for |
US10042898B2 (en) * | 2007-05-09 | 2018-08-07 | Illinois Institute Of Technology | Weighted metalabels for enhanced search in hierarchical abstract data organization systems |
EP1990607A1 (en) * | 2007-05-10 | 2008-11-12 | Leica Geosystems AG | Method of position determination for a geodetic surveying apparatus |
US8396881B2 (en) | 2007-05-17 | 2013-03-12 | Research In Motion Limited | Method and system for automatically generating web page transcoding instructions |
CA2689065C (en) * | 2007-05-30 | 2017-08-29 | Creatier Interactive, Llc | Method and system for enabling advertising and transaction within user generated video content |
US20080300989A1 (en) * | 2007-05-31 | 2008-12-04 | Eyewonder, Inc. | Systems and methods for generating, reviewing, editing, and transmitting an advertising unit in a single environment |
US20080301187A1 (en) * | 2007-06-01 | 2008-12-04 | Concert Technology Corporation | Enhanced media item playlist comprising presence information |
US8285776B2 (en) | 2007-06-01 | 2012-10-09 | Napo Enterprises, Llc | System and method for processing a received media item recommendation message comprising recommender presence information |
US20090049045A1 (en) | 2007-06-01 | 2009-02-19 | Concert Technology Corporation | Method and system for sorting media items in a playlist on a media device |
US9164993B2 (en) | 2007-06-01 | 2015-10-20 | Napo Enterprises, Llc | System and method for propagating a media item recommendation message comprising recommender presence information |
US8839141B2 (en) | 2007-06-01 | 2014-09-16 | Napo Enterprises, Llc | Method and system for visually indicating a replay status of media items on a media device |
US9037632B2 (en) | 2007-06-01 | 2015-05-19 | Napo Enterprises, Llc | System and method of generating a media item recommendation message with recommender presence information |
US8099315B2 (en) * | 2007-06-05 | 2012-01-17 | At&T Intellectual Property I, L.P. | Interest profiles for audio and/or video streams |
US8103150B2 (en) * | 2007-06-07 | 2012-01-24 | Cyberlink Corp. | System and method for video editing based on semantic data |
CN101321265B (en) * | 2007-06-07 | 2011-03-16 | 中兴通讯股份有限公司 | Method and system for implementing peer-to-peer network media order frame-across broadcast mode |
US9232042B2 (en) * | 2007-07-20 | 2016-01-05 | Broadcom Corporation | Method and system for utilizing and modifying user preference information to create context data tags in a wireless system |
US9509795B2 (en) * | 2007-07-20 | 2016-11-29 | Broadcom Corporation | Method and system for tagging data with context data tags in a wireless system |
US20080313541A1 (en) * | 2007-06-14 | 2008-12-18 | Yahoo! Inc. | Method and system for personalized segmentation and indexing of media |
EP2160734A4 (en) * | 2007-06-18 | 2010-08-25 | Synergy Sports Technology Llc | System and method for distributed and parallel video editing, tagging, and indexing |
US7797352B1 (en) | 2007-06-19 | 2010-09-14 | Adobe Systems Incorporated | Community based digital content auditing and streaming |
US9933937B2 (en) | 2007-06-20 | 2018-04-03 | Apple Inc. | Portable multifunction device, method, and graphical user interface for playing online videos |
US20090017827A1 (en) * | 2007-06-21 | 2009-01-15 | Mobixell Networks Ltd. | Convenient user response to wireless content messages |
US9654833B2 (en) | 2007-06-26 | 2017-05-16 | Broadband Itv, Inc. | Dynamic adjustment of electronic program guide displays based on viewer preferences for minimizing navigation in VOD program selection |
US11570521B2 (en) | 2007-06-26 | 2023-01-31 | Broadband Itv, Inc. | Dynamic adjustment of electronic program guide displays based on viewer preferences for minimizing navigation in VOD program selection |
US9794605B2 (en) * | 2007-06-28 | 2017-10-17 | Apple Inc. | Using time-stamped event entries to facilitate synchronizing data streams |
US9772751B2 (en) | 2007-06-29 | 2017-09-26 | Apple Inc. | Using gestures to slide between user interfaces |
US8503523B2 (en) * | 2007-06-29 | 2013-08-06 | Microsoft Corporation | Forming a representation of a video item and use thereof |
TWI423041B (en) * | 2007-07-09 | 2014-01-11 | Cyberlink Corp | Av playing method capable of improving multimedia interactive mechanism and related apparatus |
KR20090005845A (en) * | 2007-07-10 | 2009-01-14 | 삼성전자주식회사 | Method for controlling playing of media signal and apparatus thereof |
US8407750B2 (en) * | 2007-07-11 | 2013-03-26 | Hewlett-Packard Development Company, L.P. | Enabling users of peer to peer clients to socially interact while viewing videos |
US20090019492A1 (en) | 2007-07-11 | 2009-01-15 | United Video Properties, Inc. | Systems and methods for mirroring and transcoding media content |
US8170392B2 (en) * | 2007-11-21 | 2012-05-01 | Shlomo Selim Rakib | Method and apparatus for generation, distribution and display of interactive video content |
US8582956B1 (en) * | 2007-07-18 | 2013-11-12 | Adobe Systems Incorporated | System and method for previewing multimedia files |
KR100908890B1 (en) * | 2007-07-18 | 2009-07-23 | (주)엔써즈 | Method and apparatus for providing video data retrieval service using video data cluster |
US9009210B2 (en) * | 2007-08-15 | 2015-04-14 | Sony Corporation | Distribution of multimedia files using a transportation provider wireless device |
US8893203B2 (en) | 2007-08-17 | 2014-11-18 | Phoenix Myrrh Technology Pty Ltd. | Method and system for content delivery |
US8639681B1 (en) * | 2007-08-22 | 2014-01-28 | Adobe Systems Incorporated | Automatic link generation for video watch style |
US8260794B2 (en) * | 2007-08-30 | 2012-09-04 | International Business Machines Corporation | Creating playback definitions indicating segments of media content from multiple content files to render |
US8619038B2 (en) | 2007-09-04 | 2013-12-31 | Apple Inc. | Editing interface |
US8060407B1 (en) | 2007-09-04 | 2011-11-15 | Sprint Communications Company L.P. | Method for providing personalized, targeted advertisements during playback of media |
US11126321B2 (en) | 2007-09-04 | 2021-09-21 | Apple Inc. | Application menu user interface |
US9619143B2 (en) * | 2008-01-06 | 2017-04-11 | Apple Inc. | Device, method, and graphical user interface for viewing application launch icons |
KR20090031142A (en) * | 2007-09-21 | 2009-03-25 | 삼성전자주식회사 | A method for providing gui to display related contents when contents are made by user, and a multimedia apparatus thereof |
US8041773B2 (en) | 2007-09-24 | 2011-10-18 | The Research Foundation Of State University Of New York | Automatic clustering for self-organizing grids |
US7769767B2 (en) | 2007-09-27 | 2010-08-03 | Domingo Enterprises, Llc | System and method for filtering content on a mobile device based on contextual tagging |
KR20090034086A (en) * | 2007-10-02 | 2009-04-07 | 삼성전자주식회사 | Apparatus and method for generating a graphic user interface |
JP2011501847A (en) * | 2007-10-17 | 2011-01-13 | アイティーアイ・スコットランド・リミテッド | Computer-implemented method |
US20090113466A1 (en) * | 2007-10-30 | 2009-04-30 | Einat Amitay | System, Method and Computer Program Product for Evaluating Media Streams |
US7865522B2 (en) | 2007-11-07 | 2011-01-04 | Napo Enterprises, Llc | System and method for hyping media recommendations in a media recommendation system |
US9060034B2 (en) | 2007-11-09 | 2015-06-16 | Napo Enterprises, Llc | System and method of filtering recommenders in a media item recommendation system |
US8059865B2 (en) | 2007-11-09 | 2011-11-15 | The Nielsen Company (Us), Llc | Methods and apparatus to specify regions of interest in video frames |
JP2009122847A (en) * | 2007-11-13 | 2009-06-04 | Ricoh Co Ltd | File access device |
US20090133054A1 (en) * | 2007-11-16 | 2009-05-21 | Matthew Thomas Boggie | Presentation of auxiliary content via a content presentation device |
KR20100106327A (en) | 2007-11-16 | 2010-10-01 | 디브이엑스, 인크. | Hierarchical and reduced index structures for multimedia files |
US7444347B1 (en) * | 2007-11-16 | 2008-10-28 | International Business Machines Corporation | Systems, methods and computer products for compression of hierarchical identifiers |
US8165451B2 (en) | 2007-11-20 | 2012-04-24 | Echostar Technologies L.L.C. | Methods and apparatus for displaying information regarding interstitials of a video stream |
EP2061239B1 (en) * | 2007-11-19 | 2017-09-20 | EchoStar Technologies L.L.C. | Methods and apparatus for identifying video locations in a video stream using text data |
US8165450B2 (en) | 2007-11-19 | 2012-04-24 | Echostar Technologies L.L.C. | Methods and apparatus for filtering content in a video stream using text data |
US20110246471A1 (en) * | 2010-04-06 | 2011-10-06 | Selim Shlomo Rakib | Retrieving video annotation metadata using a p2p network |
US8630497B2 (en) * | 2007-11-27 | 2014-01-14 | Intelliview Technologies Inc. | Analyzing a segment of video |
US8195635B1 (en) * | 2007-12-06 | 2012-06-05 | Sprint Communications Company L.P. | Indicating related but absent media content |
US8069142B2 (en) * | 2007-12-06 | 2011-11-29 | Yahoo! Inc. | System and method for synchronizing data on a network |
US20090150784A1 (en) * | 2007-12-07 | 2009-06-11 | Microsoft Corporation | User interface for previewing video items |
US8671154B2 (en) | 2007-12-10 | 2014-03-11 | Yahoo! Inc. | System and method for contextual addressing of communications on a network |
US8307029B2 (en) * | 2007-12-10 | 2012-11-06 | Yahoo! Inc. | System and method for conditional delivery of messages |
KR20090063528A (en) * | 2007-12-14 | 2009-06-18 | 엘지전자 주식회사 | Mobile terminal and method of palying back data therein |
US8166168B2 (en) | 2007-12-17 | 2012-04-24 | Yahoo! Inc. | System and method for disambiguating non-unique identifiers using information obtained from disparate communication channels |
US9224150B2 (en) | 2007-12-18 | 2015-12-29 | Napo Enterprises, Llc | Identifying highly valued recommendations of users in a media recommendation network |
US9734507B2 (en) | 2007-12-20 | 2017-08-15 | Napo Enterprises, Llc | Method and system for simulating recommendations in a social network for an offline user |
US8396951B2 (en) | 2007-12-20 | 2013-03-12 | Napo Enterprises, Llc | Method and system for populating a content repository for an internet radio service based on a recommendation network |
US8316015B2 (en) | 2007-12-21 | 2012-11-20 | Lemi Technology, Llc | Tunersphere |
US20100214111A1 (en) * | 2007-12-21 | 2010-08-26 | Motorola, Inc. | Mobile virtual and augmented reality system |
US8060525B2 (en) * | 2007-12-21 | 2011-11-15 | Napo Enterprises, Llc | Method and system for generating media recommendations in a distributed environment based on tagging play history information with location information |
US8117193B2 (en) | 2007-12-21 | 2012-02-14 | Lemi Technology, Llc | Tunersphere |
US8875023B2 (en) * | 2007-12-27 | 2014-10-28 | Microsoft Corporation | Thumbnail navigation bar for video |
US9706345B2 (en) | 2008-01-04 | 2017-07-11 | Excalibur Ip, Llc | Interest mapping system |
US9626685B2 (en) | 2008-01-04 | 2017-04-18 | Excalibur Ip, Llc | Systems and methods of mapping attention |
US8762285B2 (en) | 2008-01-06 | 2014-06-24 | Yahoo! Inc. | System and method for message clustering |
US8724600B2 (en) * | 2008-01-07 | 2014-05-13 | Tymphany Hong Kong Limited | Systems and methods for providing a media playback in a networked environment |
JP5232478B2 (en) * | 2008-01-09 | 2013-07-10 | 任天堂株式会社 | Information processing program, information processing apparatus, information processing system, and information processing method |
US9235648B2 (en) * | 2008-01-16 | 2016-01-12 | International Business Machines Corporation | Automated surfacing of tagged content in vertical applications |
US10699242B2 (en) | 2008-01-16 | 2020-06-30 | International Business Machines Corporation | Automated surfacing of tagged content adjunct to vertical applications |
US20090182618A1 (en) | 2008-01-16 | 2009-07-16 | Yahoo! Inc. | System and Method for Word-of-Mouth Advertising |
US8238427B2 (en) * | 2008-01-17 | 2012-08-07 | Texas Instruments Incorporated | Rate distortion optimized adaptive intra refresh for video coding |
US8126858B1 (en) * | 2008-01-23 | 2012-02-28 | A9.Com, Inc. | System and method for delivering content to a communication device in a content delivery system |
US20090187588A1 (en) * | 2008-01-23 | 2009-07-23 | Microsoft Corporation | Distributed indexing of file content |
US20090240734A1 (en) * | 2008-01-24 | 2009-09-24 | Geoffrey Wayne Lloyd-Jones | System and methods for the creation, review and synchronization of digital media to digital audio data |
US8117283B2 (en) | 2008-02-04 | 2012-02-14 | Echostar Technologies L.L.C. | Providing remote access to segments of a transmitted program |
US8181197B2 (en) * | 2008-02-06 | 2012-05-15 | Google Inc. | System and method for voting on popular video intervals |
US9251899B2 (en) * | 2008-02-12 | 2016-02-02 | Virident Systems, Inc. | Methods for upgrading main memory in computer systems to two-dimensional memory modules and master memory controllers |
US8112702B2 (en) | 2008-02-19 | 2012-02-07 | Google Inc. | Annotating video intervals |
US7996431B2 (en) * | 2008-02-25 | 2011-08-09 | International Business Machines Corporation | Systems, methods and computer program products for generating metadata and visualizing media content |
US7996432B2 (en) * | 2008-02-25 | 2011-08-09 | International Business Machines Corporation | Systems, methods and computer program products for the creation of annotations for media content to enable the selective management and playback of media content |
US8027999B2 (en) * | 2008-02-25 | 2011-09-27 | International Business Machines Corporation | Systems, methods and computer program products for indexing, searching and visualizing media content |
US20090216743A1 (en) * | 2008-02-25 | 2009-08-27 | International Business Machines Corporation | Systems, Methods and Computer Program Products for the Use of Annotations for Media Content to Enable the Selective Management and Playback of Media Content |
WO2009109958A2 (en) * | 2008-03-03 | 2009-09-11 | Aliza Barak | Editing and embedding advertising multimedia content |
US8538811B2 (en) | 2008-03-03 | 2013-09-17 | Yahoo! Inc. | Method and apparatus for social network marketing with advocate referral |
US8560390B2 (en) | 2008-03-03 | 2013-10-15 | Yahoo! Inc. | Method and apparatus for social network marketing with brand referral |
US8554623B2 (en) | 2008-03-03 | 2013-10-08 | Yahoo! Inc. | Method and apparatus for social network marketing with consumer referral |
US20090307572A1 (en) * | 2008-03-04 | 2009-12-10 | Zak Zacharia | TV set and remote guide to represent a web site home page |
US10216761B2 (en) * | 2008-03-04 | 2019-02-26 | Oath Inc. | Generating congruous metadata for multimedia |
US20090228492A1 (en) * | 2008-03-10 | 2009-09-10 | Verizon Data Services Inc. | Apparatus, system, and method for tagging media content |
US8224890B1 (en) | 2008-03-13 | 2012-07-17 | Google Inc. | Reusing data in content files |
KR20090098247A (en) * | 2008-03-13 | 2009-09-17 | 삼성전자주식회사 | Image processing apparatus, image processing system having image processing apparatus and control method thereof |
US20090238538A1 (en) * | 2008-03-20 | 2009-09-24 | Fink Franklin E | System and method for automated compilation and editing of personalized videos including archived historical content and personal content |
US8606085B2 (en) | 2008-03-20 | 2013-12-10 | Dish Network L.L.C. | Method and apparatus for replacement of audio data in recorded audio/video stream |
US8725740B2 (en) * | 2008-03-24 | 2014-05-13 | Napo Enterprises, Llc | Active playlist having dynamic media item groups |
US7860866B2 (en) * | 2008-03-26 | 2010-12-28 | Microsoft Corporation | Heuristic event clustering of media using metadata |
US8589486B2 (en) | 2008-03-28 | 2013-11-19 | Yahoo! Inc. | System and method for addressing communications |
US8745133B2 (en) | 2008-03-28 | 2014-06-03 | Yahoo! Inc. | System and method for optimizing the storage of data |
US8271506B2 (en) | 2008-03-31 | 2012-09-18 | Yahoo! Inc. | System and method for modeling relationships between entities |
US8312376B2 (en) * | 2008-04-03 | 2012-11-13 | Microsoft Corporation | Bookmark interpretation service |
US8190604B2 (en) * | 2008-04-03 | 2012-05-29 | Microsoft Corporation | User intention modeling for interactive image retrieval |
US20110106784A1 (en) * | 2008-04-04 | 2011-05-05 | Merijn Camiel Terheggen | System and method for publishing media objects |
WO2009123594A1 (en) * | 2008-04-04 | 2009-10-08 | Fabian-Baber, Inc. | Correlating the results of a computer network text search with relevant multimedia files |
WO2009126785A2 (en) | 2008-04-10 | 2009-10-15 | The Trustees Of Columbia University In The City Of New York | Systems and methods for image archaeology |
US9697229B2 (en) * | 2008-04-11 | 2017-07-04 | Adobe Systems Incorporated | Methods and systems for creating and storing metadata |
US8079054B1 (en) * | 2008-04-14 | 2011-12-13 | Adobe Systems Incorporated | Location for secondary content based on data differential |
JP4453768B2 (en) * | 2008-04-15 | 2010-04-21 | ソニー株式会社 | Information processing apparatus and method, and program |
US8204883B1 (en) | 2008-04-17 | 2012-06-19 | Amazon Technologies, Inc. | Systems and methods of determining genre information |
US8484311B2 (en) | 2008-04-17 | 2013-07-09 | Eloy Technology, Llc | Pruning an aggregate media collection |
US8806530B1 (en) | 2008-04-22 | 2014-08-12 | Sprint Communications Company L.P. | Dual channel presence detection and content delivery system and method |
US20090288120A1 (en) * | 2008-05-15 | 2009-11-19 | Motorola, Inc. | System and Method for Creating Media Bookmarks from Secondary Device |
US8156520B2 (en) | 2008-05-30 | 2012-04-10 | EchoStar Technologies, L.L.C. | Methods and apparatus for presenting substitute content in an audio/video stream using text data |
US8566353B2 (en) | 2008-06-03 | 2013-10-22 | Google Inc. | Web-based system for collaborative generation of interactive videos |
JP4913777B2 (en) * | 2008-06-03 | 2012-04-11 | 株式会社シンメトリック | Web page distribution system |
US8542702B1 (en) * | 2008-06-03 | 2013-09-24 | At&T Intellectual Property I, L.P. | Marking and sending portions of data transmissions |
US8300953B2 (en) | 2008-06-05 | 2012-10-30 | Apple Inc. | Categorization of digital media based on media characteristics |
US20090307207A1 (en) * | 2008-06-09 | 2009-12-10 | Murray Thomas J | Creation of a multi-media presentation |
US8601526B2 (en) | 2008-06-13 | 2013-12-03 | United Video Properties, Inc. | Systems and methods for displaying media content and media guidance information |
US8364693B2 (en) | 2008-06-13 | 2013-01-29 | News Distribution Network, Inc. | Searching, sorting, and displaying video clips and sound files by relevance |
KR101032634B1 (en) * | 2008-06-17 | 2011-05-06 | 삼성전자주식회사 | Method and apparatus of playing a media file |
WO2009155281A1 (en) | 2008-06-17 | 2009-12-23 | The Trustees Of Columbia University In The City Of New York | System and method for dynamically and interactively searching media data |
US8775566B2 (en) * | 2008-06-21 | 2014-07-08 | Microsoft Corporation | File format for media distribution and presentation |
US8706406B2 (en) | 2008-06-27 | 2014-04-22 | Yahoo! Inc. | System and method for determination and display of personalized distance |
US8813107B2 (en) | 2008-06-27 | 2014-08-19 | Yahoo! Inc. | System and method for location based media delivery |
US8452855B2 (en) | 2008-06-27 | 2013-05-28 | Yahoo! Inc. | System and method for presentation of media related to a context |
US20110191684A1 (en) * | 2008-06-29 | 2011-08-04 | TV1.com Holdings, LLC | Method of Internet Video Access and Management |
US8259177B2 (en) * | 2008-06-30 | 2012-09-04 | Cisco Technology, Inc. | Video fingerprint systems and methods |
US20090327334A1 (en) * | 2008-06-30 | 2009-12-31 | Rodriguez Arturo A | Generating Measures of Video Sequences to Detect Unauthorized Use |
US8347408B2 (en) * | 2008-06-30 | 2013-01-01 | Cisco Technology, Inc. | Matching of unknown video content to protected video content |
KR101475939B1 (en) * | 2008-07-02 | 2014-12-23 | 삼성전자 주식회사 | Method of controlling image processing apparatus, image processing apparatus and image file |
US8269612B2 (en) | 2008-07-10 | 2012-09-18 | Black & Decker Inc. | Communication protocol for remotely controlled laser devices |
DE102008044635A1 (en) * | 2008-07-22 | 2010-02-04 | Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. | Apparatus and method for providing a television sequence |
JP2011529293A (en) * | 2008-07-23 | 2011-12-01 | エルティーユー テクノロジーズ エスエーエス | Frame-based video matching |
US8086700B2 (en) | 2008-07-29 | 2011-12-27 | Yahoo! Inc. | Region and duration uniform resource identifiers (URI) for media objects |
US10230803B2 (en) | 2008-07-30 | 2019-03-12 | Excalibur Ip, Llc | System and method for improved mapping and routing |
US7635280B1 (en) * | 2008-07-30 | 2009-12-22 | Apple Inc. | Type A USB receptacle with plug detection |
US8583668B2 (en) | 2008-07-30 | 2013-11-12 | Yahoo! Inc. | System and method for context enhanced mapping |
US10007668B2 (en) * | 2008-08-01 | 2018-06-26 | Vantrix Corporation | Method and system for triggering ingestion of remote content by a streaming server using uniform resource locator folder mapping |
US8990195B2 (en) * | 2008-08-06 | 2015-03-24 | Cyberlink Corp. | Systems and methods for searching media content based on an editing file |
EP2350771A4 (en) * | 2008-08-06 | 2013-08-28 | Ericsson Telefon Ab L M | Media bookmarks |
US8520979B2 (en) * | 2008-08-19 | 2013-08-27 | Digimarc Corporation | Methods and systems for content processing |
US8386506B2 (en) | 2008-08-21 | 2013-02-26 | Yahoo! Inc. | System and method for context enhanced messaging |
US8813152B2 (en) * | 2008-08-26 | 2014-08-19 | At&T Intellectual Property I, L.P. | Methods, apparatus, and computer program products for providing interactive services |
US20140177964A1 (en) * | 2008-08-27 | 2014-06-26 | Unicorn Media, Inc. | Video image search |
US8843974B2 (en) * | 2008-08-27 | 2014-09-23 | Albert John McGowan | Media playback system with multiple video formats |
KR101537592B1 (en) | 2008-09-03 | 2015-07-22 | 엘지전자 주식회사 | Mobile terminal and method for controlling the same |
US8949718B2 (en) | 2008-09-05 | 2015-02-03 | Lemi Technology, Llc | Visual audio links for digital audio content |
WO2010026496A1 (en) * | 2008-09-07 | 2010-03-11 | Sportvu Ltd. | Method and system for fusing video streams |
US20100064221A1 (en) * | 2008-09-11 | 2010-03-11 | At&T Intellectual Property I, L.P. | Method and apparatus to provide media content |
US20100070858A1 (en) * | 2008-09-12 | 2010-03-18 | At&T Intellectual Property I, L.P. | Interactive Media System and Method Using Context-Based Avatar Configuration |
US20100070537A1 (en) * | 2008-09-17 | 2010-03-18 | Eloy Technology, Llc | System and method for managing a personalized universal catalog of media items |
US8281027B2 (en) | 2008-09-19 | 2012-10-02 | Yahoo! Inc. | System and method for distributing media related to a location |
US9961399B2 (en) * | 2008-09-19 | 2018-05-01 | Verizon Patent And Licensing Inc. | Method and apparatus for organizing and bookmarking content |
US20100077292A1 (en) * | 2008-09-25 | 2010-03-25 | Harris Scott C | Automated feature-based to do list |
US20100076923A1 (en) * | 2008-09-25 | 2010-03-25 | Microsoft Corporation | Online multi-label active annotation of data files |
US8843375B1 (en) * | 2008-09-29 | 2014-09-23 | Apple Inc. | User interfaces for editing audio clips |
US8620861B1 (en) | 2008-09-30 | 2013-12-31 | Google Inc. | Preserving file metadata during atomic save operations |
US8108778B2 (en) | 2008-09-30 | 2012-01-31 | Yahoo! Inc. | System and method for context enhanced mapping within a user interface |
US9600484B2 (en) | 2008-09-30 | 2017-03-21 | Excalibur Ip, Llc | System and method for reporting and analysis of media consumption data |
US9934240B2 (en) * | 2008-09-30 | 2018-04-03 | Google Llc | On demand access to client cached files |
US8935355B2 (en) * | 2008-10-02 | 2015-01-13 | International Business Machines Corporation | Periodic shuffling of data fragments in a peer-to-peer data backup and archival network |
JP5231928B2 (en) * | 2008-10-07 | 2013-07-10 | 株式会社ソニー・コンピュータエンタテインメント | Information processing apparatus and information processing method |
US20100153848A1 (en) * | 2008-10-09 | 2010-06-17 | Pinaki Saha | Integrated branding, social bookmarking, and aggregation system for media content |
US8484227B2 (en) | 2008-10-15 | 2013-07-09 | Eloy Technology, Llc | Caching and synching process for a media sharing system |
US8880599B2 (en) * | 2008-10-15 | 2014-11-04 | Eloy Technology, Llc | Collection digest for a media sharing system |
WO2010043269A1 (en) * | 2008-10-17 | 2010-04-22 | Telefonaktiebolaget Lm Ericsson (Publ) | Method and apparatus for use in a packet switched television network |
KR101593991B1 (en) * | 2008-10-23 | 2016-02-17 | 삼성전자주식회사 | Method and apparatus for recommending content |
US20100104004A1 (en) * | 2008-10-24 | 2010-04-29 | Smita Wadhwa | Video encoding for mobile devices |
WO2010050984A1 (en) * | 2008-10-31 | 2010-05-06 | Hewlett-Packard Development Company, L.P. | Organizing video data |
JP5344715B2 (en) * | 2008-11-07 | 2013-11-20 | 国立大学法人北海道大学 | Content search apparatus and content search program |
US20100121891A1 (en) * | 2008-11-11 | 2010-05-13 | At&T Intellectual Property I, L.P. | Method and system for using play lists for multimedia content |
US9049477B2 (en) | 2008-11-13 | 2015-06-02 | At&T Intellectual Property I, Lp | Apparatus and method for managing media content |
KR101499978B1 (en) * | 2008-11-14 | 2015-03-06 | 주식회사 케이티 | Method and apparatus for displaying thumbnail images and related information |
US8060492B2 (en) * | 2008-11-18 | 2011-11-15 | Yahoo! Inc. | System and method for generation of URL based context queries |
US8024317B2 (en) | 2008-11-18 | 2011-09-20 | Yahoo! Inc. | System and method for deriving income from URL based context queries |
US8032508B2 (en) | 2008-11-18 | 2011-10-04 | Yahoo! Inc. | System and method for URL based query for retrieving data related to a context |
US9805123B2 (en) | 2008-11-18 | 2017-10-31 | Excalibur Ip, Llc | System and method for data privacy in URL based context queries |
US9357247B2 (en) | 2008-11-24 | 2016-05-31 | Time Warner Cable Enterprises Llc | Apparatus and methods for content delivery and message exchange across multiple content delivery networks |
US10063934B2 (en) | 2008-11-25 | 2018-08-28 | Rovi Technologies Corporation | Reducing unicast session duration with restart TV |
US9224172B2 (en) | 2008-12-02 | 2015-12-29 | Yahoo! Inc. | Customizable content for distribution in social networks |
US8055675B2 (en) | 2008-12-05 | 2011-11-08 | Yahoo! Inc. | System and method for context based query augmentation |
US20100145971A1 (en) * | 2008-12-08 | 2010-06-10 | Motorola, Inc. | Method and apparatus for generating a multimedia-based query |
US9865302B1 (en) * | 2008-12-15 | 2018-01-09 | Tata Communications (America) Inc. | Virtual video editing |
US8166016B2 (en) | 2008-12-19 | 2012-04-24 | Yahoo! Inc. | System and method for automated service recommendations |
US8671069B2 (en) | 2008-12-22 | 2014-03-11 | The Trustees Of Columbia University In The City Of New York | Rapid image annotation via brain state decoding and visual pattern mining |
US8588579B2 (en) | 2008-12-24 | 2013-11-19 | Echostar Technologies L.L.C. | Methods and apparatus for filtering and inserting content into a presentation stream using signature data |
US8407735B2 (en) | 2008-12-24 | 2013-03-26 | Echostar Technologies L.L.C. | Methods and apparatus for identifying segments of content in a presentation stream using signature data |
US8510771B2 (en) | 2008-12-24 | 2013-08-13 | Echostar Technologies L.L.C. | Methods and apparatus for filtering content from a presentation stream using signature data |
US8370737B2 (en) * | 2008-12-27 | 2013-02-05 | Flash Networks, Ltd | Method and system for inserting data in a web page that is transmitted to a handheld device |
US8583682B2 (en) * | 2008-12-30 | 2013-11-12 | Microsoft Corporation | Peer-to-peer web search using tagged resources |
US8578272B2 (en) | 2008-12-31 | 2013-11-05 | Apple Inc. | Real-time or near real-time streaming |
MX2011006973A (en) * | 2008-12-31 | 2011-12-06 | Apple Inc | Method for streaming multimedia data over a non-streaming protocol. |
US8260877B2 (en) | 2008-12-31 | 2012-09-04 | Apple Inc. | Variant streams for real-time or near real-time streaming to provide failover protection |
US8099476B2 (en) | 2008-12-31 | 2012-01-17 | Apple Inc. | Updatable real-time or near real-time streaming |
US8156089B2 (en) | 2008-12-31 | 2012-04-10 | Apple Inc. | Real-time or near real-time streaming with compressed playlists |
WO2010080911A1 (en) | 2009-01-07 | 2010-07-15 | Divx, Inc. | Singular, collective and automated creation of a media guide for online content |
US8331677B2 (en) * | 2009-01-08 | 2012-12-11 | Microsoft Corporation | Combined image and text document |
US8095546B1 (en) | 2009-01-09 | 2012-01-10 | Google Inc. | Book content item search |
US8316032B1 (en) * | 2009-01-09 | 2012-11-20 | Google Inc. | Book content item search |
US20100185518A1 (en) * | 2009-01-21 | 2010-07-22 | Yahoo! Inc. | Interest-based activity marketing |
US20130124242A1 (en) | 2009-01-28 | 2013-05-16 | Adobe Systems Incorporated | Video review workflow process |
JP2010177945A (en) * | 2009-01-28 | 2010-08-12 | Sony Corp | Information providing device, mobile communication device, information providing system, information providing method, and program |
US20100192183A1 (en) * | 2009-01-29 | 2010-07-29 | At&T Intellectual Property I, L.P. | Mobile Device Access to Multimedia Content Recorded at Customer Premises |
US8200602B2 (en) * | 2009-02-02 | 2012-06-12 | Napo Enterprises, Llc | System and method for creating thematic listening experiences in a networked peer media recommendation environment |
US9183881B2 (en) | 2009-02-02 | 2015-11-10 | Porto Technology, Llc | System and method for semantic trick play |
US8350871B2 (en) * | 2009-02-04 | 2013-01-08 | Motorola Mobility Llc | Method and apparatus for creating virtual graffiti in a mobile virtual and augmented reality system |
US8425325B2 (en) * | 2009-02-06 | 2013-04-23 | Apple Inc. | Automatically generating a book describing a user's videogame performance |
US20100205203A1 (en) * | 2009-02-09 | 2010-08-12 | Vitamin D, Inc. | Systems and methods for video analysis |
US20100201815A1 (en) * | 2009-02-09 | 2010-08-12 | Vitamin D, Inc. | Systems and methods for video monitoring |
WO2010090622A1 (en) * | 2009-02-09 | 2010-08-12 | Vitamin D, Inc. | Systems and methods for video analysis |
US8774516B2 (en) | 2009-02-10 | 2014-07-08 | Kofax, Inc. | Systems, methods and computer program products for determining document validity |
US9767354B2 (en) | 2009-02-10 | 2017-09-19 | Kofax, Inc. | Global geographic information retrieval, validation, and normalization |
US8958605B2 (en) | 2009-02-10 | 2015-02-17 | Kofax, Inc. | Systems, methods and computer program products for determining document validity |
US9576272B2 (en) | 2009-02-10 | 2017-02-21 | Kofax, Inc. | Systems, methods and computer program products for determining document validity |
US9349046B2 (en) | 2009-02-10 | 2016-05-24 | Kofax, Inc. | Smart optical input/output (I/O) extension for context-dependent workflows |
US20100211605A1 (en) * | 2009-02-17 | 2010-08-19 | Subhankar Ray | Apparatus and method for unified web-search, selective broadcasting, natural language processing utilities, analysis, synthesis, and other applications for text, images, audios and videos, initiated by one or more interactions from users |
US20100211988A1 (en) * | 2009-02-18 | 2010-08-19 | Microsoft Corporation | Managing resources to display media content |
US8527537B2 (en) | 2009-02-19 | 2013-09-03 | Hulu, LLC | Method and apparatus for providing community-based metadata |
US8782709B2 (en) * | 2009-02-19 | 2014-07-15 | Hulu, LLC | Method and apparatus for providing a program guide having search parameter aware thumbnails |
US20100215340A1 (en) * | 2009-02-20 | 2010-08-26 | Microsoft Corporation | Triggers For Launching Applications |
CN101811644B (en) * | 2009-02-25 | 2014-05-14 | 马尼托瓦克起重机有限公司 | Swing drive system for cranes |
US9069585B2 (en) * | 2009-03-02 | 2015-06-30 | Microsoft Corporation | Application tune manifests and tune state recovery |
EP2404278A4 (en) * | 2009-03-02 | 2013-10-02 | Kalooga Bv | System and method for publishing media objects |
US20100241689A1 (en) * | 2009-03-19 | 2010-09-23 | Yahoo! Inc. | Method and apparatus for associating advertising with computer enabled maps |
US8150967B2 (en) | 2009-03-24 | 2012-04-03 | Yahoo! Inc. | System and method for verified presence tracking |
EP2234024B1 (en) * | 2009-03-24 | 2012-10-03 | Sony Corporation | Context based video finder |
US8826117B1 (en) | 2009-03-25 | 2014-09-02 | Google Inc. | Web-based system for video editing |
US9215423B2 (en) | 2009-03-30 | 2015-12-15 | Time Warner Cable Enterprises Llc | Recommendation engine apparatus and methods |
US11076189B2 (en) | 2009-03-30 | 2021-07-27 | Time Warner Cable Enterprises Llc | Personal media channel apparatus and methods |
US8132200B1 (en) | 2009-03-30 | 2012-03-06 | Google Inc. | Intra-video ratings |
US8433136B2 (en) * | 2009-03-31 | 2013-04-30 | Microsoft Corporation | Tagging video using character recognition and propagation |
US8769589B2 (en) | 2009-03-31 | 2014-07-01 | At&T Intellectual Property I, L.P. | System and method to create a media content summary based on viewer annotations |
US8346800B2 (en) | 2009-04-02 | 2013-01-01 | Microsoft Corporation | Content-based information retrieval |
US20100301695A1 (en) * | 2009-04-03 | 2010-12-02 | Asmo Co., Ltd. | Rotor and Motor |
US8219539B2 (en) * | 2009-04-07 | 2012-07-10 | Microsoft Corporation | Search queries with shifting intent |
US8818172B2 (en) * | 2009-04-14 | 2014-08-26 | Avid Technology, Inc. | Multi-user remote video editing |
US8412729B2 (en) | 2009-04-22 | 2013-04-02 | Genarts, Inc. | Sharing of presets for visual effects or other computer-implemented effects |
US8631326B2 (en) * | 2009-04-30 | 2014-01-14 | Apple Inc. | Segmented timeline for a media-editing application |
US20100280913A1 (en) * | 2009-05-01 | 2010-11-04 | Yahoo! Inc. | Gift credit matching engine |
US10440329B2 (en) * | 2009-05-22 | 2019-10-08 | Immersive Media Company | Hybrid media viewing application including a region of interest within a wide field of view |
US8943408B2 (en) | 2009-05-27 | 2015-01-27 | Adobe Systems Incorporated | Text image review process |
WO2010138776A2 (en) | 2009-05-27 | 2010-12-02 | Spot411 Technologies, Inc. | Audio-based synchronization to media |
US8489774B2 (en) | 2009-05-27 | 2013-07-16 | Spot411 Technologies, Inc. | Synchronized delivery of interactive content |
US8943431B2 (en) | 2009-05-27 | 2015-01-27 | Adobe Systems Incorporated | Text operations in a bitmap-based document |
US8769396B2 (en) * | 2009-06-05 | 2014-07-01 | Microsoft Corporation | Calibration and annotation of video content |
US9602864B2 (en) | 2009-06-08 | 2017-03-21 | Time Warner Cable Enterprises Llc | Media bridge apparatus and methods |
US8126897B2 (en) * | 2009-06-10 | 2012-02-28 | International Business Machines Corporation | Unified inverted index for video passage retrieval |
US8437617B2 (en) | 2009-06-17 | 2013-05-07 | Echostar Technologies L.L.C. | Method and apparatus for modifying the presentation of content |
US20100325662A1 (en) * | 2009-06-19 | 2010-12-23 | Harold Cooper | System and method for navigating position within video files |
US8266091B1 (en) * | 2009-07-21 | 2012-09-11 | Symantec Corporation | Systems and methods for emulating the behavior of a user in a computer-human interaction environment |
US10223701B2 (en) | 2009-08-06 | 2019-03-05 | Excalibur Ip, Llc | System and method for verified monetization of commercial campaigns |
US8914342B2 (en) | 2009-08-12 | 2014-12-16 | Yahoo! Inc. | Personal data platform |
US8364611B2 (en) | 2009-08-13 | 2013-01-29 | Yahoo! Inc. | System and method for precaching information on a mobile device |
US8805862B2 (en) * | 2009-08-18 | 2014-08-12 | Industrial Technology Research Institute | Video search method using motion vectors and apparatus thereof |
TWI443534B (en) * | 2009-08-18 | 2014-07-01 | Ind Tech Res Inst | Video search method and apparatus using motion vectors |
US8682391B2 (en) * | 2009-08-27 | 2014-03-25 | Lg Electronics Inc. | Mobile terminal and controlling method thereof |
US8898575B2 (en) * | 2009-09-02 | 2014-11-25 | Yahoo! Inc. | Indicating unavailability of an uploaded video file that is being bitrate encoded |
US9166714B2 (en) | 2009-09-11 | 2015-10-20 | Veveo, Inc. | Method of and system for presenting enriched video viewing analytics |
WO2011034924A1 (en) * | 2009-09-15 | 2011-03-24 | Envysion, Inc. | Video streaming method and system |
US8634704B2 (en) * | 2009-09-16 | 2014-01-21 | At&T Intellectual Property I, L.P. | Apparatus and method for storing and providing a portion of media content to a communication device |
US9148624B2 (en) * | 2009-09-17 | 2015-09-29 | Verizon Patent And Licensing Inc. | System for and method of providing graphical contents during a communication session |
US9014546B2 (en) | 2009-09-23 | 2015-04-21 | Rovi Guides, Inc. | Systems and methods for automatically detecting users within detection regions of media devices |
US8705933B2 (en) * | 2009-09-25 | 2014-04-22 | Sony Corporation | Video bookmarking |
US20110078717A1 (en) * | 2009-09-29 | 2011-03-31 | Rovi Technologies Corporation | System for notifying a community of interested users about programs or segments |
US9438861B2 (en) | 2009-10-06 | 2016-09-06 | Microsoft Technology Licensing, Llc | Integrating continuous and sparse streaming data |
US8488873B2 (en) * | 2009-10-07 | 2013-07-16 | Apple Inc. | Method of computing global-to-local metrics for recognition |
US8135221B2 (en) * | 2009-10-07 | 2012-03-13 | Eastman Kodak Company | Video concept classification using audio-visual atoms |
US10063812B2 (en) * | 2009-10-07 | 2018-08-28 | DISH Technologies L.L.C. | Systems and methods for media format transcoding |
US8396055B2 (en) | 2009-10-20 | 2013-03-12 | Time Warner Cable Inc. | Methods and apparatus for enabling media functionality in a content-based network |
US8990104B1 (en) | 2009-10-27 | 2015-03-24 | Sprint Communications Company L.P. | Multimedia product placement marketplace |
US11720290B2 (en) | 2009-10-30 | 2023-08-08 | Iii Holdings 2, Llc | Memcached server functionality in a cluster of data processing nodes |
US10877695B2 (en) | 2009-10-30 | 2020-12-29 | Iii Holdings 2, Llc | Memcached server functionality in a cluster of data processing nodes |
US10264029B2 (en) | 2009-10-30 | 2019-04-16 | Time Warner Cable Enterprises Llc | Methods and apparatus for packetized content delivery over a content delivery network |
KR20110047768A (en) | 2009-10-30 | 2011-05-09 | 삼성전자주식회사 | Apparatus and method for displaying multimedia contents |
US20110113357A1 (en) * | 2009-11-12 | 2011-05-12 | International Business Machines Corporation | Manipulating results of a media archive search |
KR101750049B1 (en) * | 2009-11-13 | 2017-06-22 | 삼성전자주식회사 | Method and apparatus for adaptive streaming |
KR101786050B1 (en) * | 2009-11-13 | 2017-10-16 | 삼성전자 주식회사 | Method and apparatus for transmitting and receiving of data |
JP5421739B2 (en) * | 2009-11-19 | 2014-02-19 | 株式会社日立国際電気 | Moving picture coding apparatus, moving picture decoding apparatus, and moving picture coding method |
US8761512B1 (en) | 2009-12-03 | 2014-06-24 | Google Inc. | Query by image |
US8682145B2 (en) | 2009-12-04 | 2014-03-25 | Tivo Inc. | Recording system based on multimedia content fingerprints |
US9519728B2 (en) | 2009-12-04 | 2016-12-13 | Time Warner Cable Enterprises Llc | Apparatus and methods for monitoring and optimizing delivery of content in a network |
US8973049B2 (en) * | 2009-12-04 | 2015-03-03 | Cox Communications, Inc. | Content recommendations |
WO2011068668A1 (en) | 2009-12-04 | 2011-06-09 | Divx, Llc | Elementary bitstream cryptographic material transport systems and methods |
US8774527B1 (en) | 2009-12-07 | 2014-07-08 | Google Inc. | Matching an approximately located query image against a reference image set using cellular base station and wireless access point information |
US8189964B2 (en) * | 2009-12-07 | 2012-05-29 | Google Inc. | Matching an approximately located query image against a reference image set |
KR101310900B1 (en) * | 2009-12-17 | 2013-09-25 | 한국전자통신연구원 | Method of Providing Services Information, System Thereof and Method of Receiving Service Information |
IT1397439B1 (en) * | 2009-12-30 | 2013-01-10 | St Microelectronics Srl | PROCEDURE AND DEVICES FOR THE DISTRIBUTION OF MEDIAL CONTENT AND ITS COMPUTER PRODUCT |
US9300722B2 (en) * | 2010-01-05 | 2016-03-29 | Qualcomm Incorporated | Auto-trimming of media files |
US8438504B2 (en) | 2010-01-06 | 2013-05-07 | Apple Inc. | Device, method, and graphical user interface for navigating through multiple viewing areas |
US8736561B2 (en) | 2010-01-06 | 2014-05-27 | Apple Inc. | Device, method, and graphical user interface with content display modes and display rotation heuristics |
US8290926B2 (en) * | 2010-01-21 | 2012-10-16 | Microsoft Corporation | Scalable topical aggregation of data feeds |
US8244754B2 (en) * | 2010-02-01 | 2012-08-14 | International Business Machines Corporation | System and method for object searching in virtual worlds |
US8832749B2 (en) * | 2010-02-12 | 2014-09-09 | Cox Communications, Inc. | Personalizing TV content |
US9244965B2 (en) * | 2010-02-22 | 2016-01-26 | Thoughtwire Holdings Corp. | Method and system for sharing data between software systems |
US8489600B2 (en) | 2010-02-23 | 2013-07-16 | Nokia Corporation | Method and apparatus for segmenting and summarizing media content |
US9342661B2 (en) | 2010-03-02 | 2016-05-17 | Time Warner Cable Enterprises Llc | Apparatus and methods for rights-managed content and data delivery |
US8422859B2 (en) * | 2010-03-23 | 2013-04-16 | Vixs Systems Inc. | Audio-based chapter detection in multimedia stream |
US8811801B2 (en) * | 2010-03-25 | 2014-08-19 | Disney Enterprises, Inc. | Continuous freeze-frame video effect system and method |
US8463845B2 (en) | 2010-03-30 | 2013-06-11 | Itxc Ip Holdings S.A.R.L. | Multimedia editing systems and methods therefor |
US8788941B2 (en) | 2010-03-30 | 2014-07-22 | Itxc Ip Holdings S.A.R.L. | Navigable content source identification for multimedia editing systems and methods therefor |
US8806346B2 (en) | 2010-03-30 | 2014-08-12 | Itxc Ip Holdings S.A.R.L. | Configurable workflow editor for multimedia editing systems and methods therefor |
US9281012B2 (en) * | 2010-03-30 | 2016-03-08 | Itxc Ip Holdings S.A.R.L. | Metadata role-based view generation in multimedia editing systems and methods therefor |
US8560642B2 (en) | 2010-04-01 | 2013-10-15 | Apple Inc. | Real-time or near real-time streaming |
US8805963B2 (en) | 2010-04-01 | 2014-08-12 | Apple Inc. | Real-time or near real-time streaming |
GB201105502D0 (en) | 2010-04-01 | 2011-05-18 | Apple Inc | Real time or near real time streaming |
TWI451279B (en) | 2010-04-07 | 2014-09-01 | Apple Inc | Content access control for real-time or near real-time streaming |
US8346767B2 (en) | 2010-04-21 | 2013-01-01 | Microsoft Corporation | Image search result summarization with informative priors |
CN102236750B (en) * | 2010-04-29 | 2016-03-30 | 国际商业机器公司 | The method and apparatus of control of authority is carried out in cloud storage system |
US8830327B2 (en) * | 2010-05-13 | 2014-09-09 | Honeywell International Inc. | Surveillance system with direct database server storage |
US9258175B1 (en) | 2010-05-28 | 2016-02-09 | The Directv Group, Inc. | Method and system for sharing playlists for content stored within a network |
US8903798B2 (en) | 2010-05-28 | 2014-12-02 | Microsoft Corporation | Real-time annotation and enrichment of captured video |
US9703782B2 (en) | 2010-05-28 | 2017-07-11 | Microsoft Technology Licensing, Llc | Associating media with metadata of near-duplicates |
US8755921B2 (en) * | 2010-06-03 | 2014-06-17 | Google Inc. | Continuous audio interaction with interruptive audio |
US20110307917A1 (en) * | 2010-06-11 | 2011-12-15 | Brian Shuster | Method and apparatus for interactive mobile coupon/offer delivery, storage and redemption system |
US8347211B1 (en) * | 2010-06-22 | 2013-01-01 | Amazon Technologies, Inc. | Immersive multimedia views for items |
US8786597B2 (en) | 2010-06-30 | 2014-07-22 | International Business Machines Corporation | Management of a history of a meeting |
US9906838B2 (en) | 2010-07-12 | 2018-02-27 | Time Warner Cable Enterprises Llc | Apparatus and methods for content delivery and message exchange across multiple content delivery networks |
US8910046B2 (en) | 2010-07-15 | 2014-12-09 | Apple Inc. | Media-editing application with anchored timeline |
KR101775027B1 (en) * | 2010-07-21 | 2017-09-06 | 삼성전자주식회사 | Method and apparatus for sharing content |
AU2016250475B2 (en) * | 2010-07-21 | 2018-11-15 | Samsung Electronics Co., Ltd. | Method and apparatus for sharing content |
US8997136B2 (en) | 2010-07-22 | 2015-03-31 | Time Warner Cable Enterprises Llc | Apparatus and methods for packetized content delivery over a bandwidth-efficient network |
JP2012029241A (en) * | 2010-07-27 | 2012-02-09 | Toshiba Corp | Electronic apparatus |
US20120028712A1 (en) * | 2010-07-30 | 2012-02-02 | Britesmart Llc | Distributed cloud gaming method and system where interactivity and resources are securely shared among multiple users and networks |
KR20120060134A (en) * | 2010-08-16 | 2012-06-11 | 삼성전자주식회사 | Method and apparatus for reproducing advertisement |
US8416990B2 (en) * | 2010-08-17 | 2013-04-09 | Microsoft Corporation | Hierarchical video sub-volume search |
US8738656B2 (en) * | 2010-08-23 | 2014-05-27 | Hewlett-Packard Development Company, L.P. | Method and system for processing a group of resource identifiers |
US8789117B2 (en) | 2010-08-26 | 2014-07-22 | Cox Communications, Inc. | Content library |
US9167302B2 (en) | 2010-08-26 | 2015-10-20 | Cox Communications, Inc. | Playlist bookmarking |
US8806340B2 (en) * | 2010-09-01 | 2014-08-12 | Hulu, LLC | Method and apparatus for embedding media programs having custom user selectable thumbnails |
KR101657122B1 (en) * | 2010-09-15 | 2016-09-30 | 엘지전자 주식회사 | Mobile terminal and method for controlling the same |
JP4937394B2 (en) * | 2010-09-17 | 2012-05-23 | 株式会社東芝 | Stream file management apparatus and method |
US20120072845A1 (en) * | 2010-09-21 | 2012-03-22 | Avaya Inc. | System and method for classifying live media tags into types |
US8422782B1 (en) | 2010-09-30 | 2013-04-16 | A9.Com, Inc. | Contour detection and image classification |
US8463036B1 (en) | 2010-09-30 | 2013-06-11 | A9.Com, Inc. | Shape-based search of a collection of content |
US9400585B2 (en) * | 2010-10-05 | 2016-07-26 | Citrix Systems, Inc. | Display management for native user experiences |
WO2012050244A1 (en) * | 2010-10-11 | 2012-04-19 | 엘지전자 주식회사 | Image-monitoring device and method for searching for objects therefor |
EP2442270A1 (en) * | 2010-10-13 | 2012-04-18 | Sony Ericsson Mobile Communications AB | Image transmission |
US8989499B2 (en) * | 2010-10-20 | 2015-03-24 | Comcast Cable Communications, Llc | Detection of transitions between text and non-text frames in a video stream |
US8687941B2 (en) * | 2010-10-29 | 2014-04-01 | International Business Machines Corporation | Automatic static video summarization |
US9317533B2 (en) | 2010-11-02 | 2016-04-19 | Microsoft Technology Licensing, Inc. | Adaptive image retrieval database |
TWI415427B (en) | 2010-11-04 | 2013-11-11 | Ind Tech Res Inst | System and method for peer-to-peer live streaming |
US8971651B2 (en) | 2010-11-08 | 2015-03-03 | Sony Corporation | Videolens media engine |
US8875007B2 (en) * | 2010-11-08 | 2014-10-28 | Microsoft Corporation | Creating and modifying an image wiki page |
US8559682B2 (en) | 2010-11-09 | 2013-10-15 | Microsoft Corporation | Building a person profile database |
CN102065339B (en) * | 2010-11-09 | 2013-03-20 | 中国电信股份有限公司 | Method and system for playing audio and video media stream |
US8463045B2 (en) | 2010-11-10 | 2013-06-11 | Microsoft Corporation | Hierarchical sparse representation for image retrieval |
US8370338B2 (en) * | 2010-12-03 | 2013-02-05 | Xerox Corporation | Large-scale asymmetric comparison computation for binary embeddings |
US8763068B2 (en) * | 2010-12-09 | 2014-06-24 | Microsoft Corporation | Generation and provision of media metadata |
US20120159329A1 (en) * | 2010-12-16 | 2012-06-21 | Yahoo! Inc. | System for creating anchors for media content |
US20130334300A1 (en) * | 2011-01-03 | 2013-12-19 | Curt Evans | Text-synchronized media utilization and manipulation based on an embedded barcode |
US8914534B2 (en) | 2011-01-05 | 2014-12-16 | Sonic Ip, Inc. | Systems and methods for adaptive bitrate streaming of media stored in matroska container files using hypertext transfer protocol |
WO2012094564A1 (en) | 2011-01-06 | 2012-07-12 | Veveo, Inc. | Methods of and systems for content search based on environment sampling |
US8904445B2 (en) * | 2011-01-24 | 2014-12-02 | At&T Intellectual Property I, L.P. | Methods and apparatus to manage bandwidth allocations in media delivery networks |
US9099161B2 (en) | 2011-01-28 | 2015-08-04 | Apple Inc. | Media-editing application with multiple resolution modes |
US9792363B2 (en) * | 2011-02-01 | 2017-10-17 | Vdopia, INC. | Video display method |
US8730232B2 (en) | 2011-02-01 | 2014-05-20 | Legend3D, Inc. | Director-style based 2D to 3D movie conversion system and method |
US8843510B2 (en) | 2011-02-02 | 2014-09-23 | Echostar Technologies L.L.C. | Apparatus, systems and methods for production information metadata associated with media content |
US9313535B2 (en) * | 2011-02-03 | 2016-04-12 | Ericsson Ab | Generating montages of video segments responsive to viewing preferences associated with a video terminal |
US9602414B2 (en) | 2011-02-09 | 2017-03-21 | Time Warner Cable Enterprises Llc | Apparatus and methods for controlled bandwidth reclamation |
US8799300B2 (en) | 2011-02-10 | 2014-08-05 | Microsoft Corporation | Bookmarking segments of content |
WO2012108090A1 (en) * | 2011-02-10 | 2012-08-16 | 日本電気株式会社 | Inter-video correspondence display system and inter-video correspondence display method |
US20120210219A1 (en) | 2011-02-16 | 2012-08-16 | Giovanni Agnoli | Keywords and dynamic folder structures |
US9997196B2 (en) | 2011-02-16 | 2018-06-12 | Apple Inc. | Retiming media presentations |
US11747972B2 (en) | 2011-02-16 | 2023-09-05 | Apple Inc. | Media-editing application with novel editing tools |
US9288476B2 (en) | 2011-02-17 | 2016-03-15 | Legend3D, Inc. | System and method for real-time depth modification of stereo images of a virtual reality environment |
US9241147B2 (en) | 2013-05-01 | 2016-01-19 | Legend3D, Inc. | External depth map transformation method for conversion of two-dimensional images to stereoscopic images |
US9407904B2 (en) | 2013-05-01 | 2016-08-02 | Legend3D, Inc. | Method for creating 3D virtual reality from 2D images |
US9282321B2 (en) | 2011-02-17 | 2016-03-08 | Legend3D, Inc. | 3D model multi-reviewer system |
US8527359B1 (en) | 2011-02-23 | 2013-09-03 | Amazon Technologies, Inc. | Immersive multimedia views for items |
US8892681B2 (en) | 2011-03-03 | 2014-11-18 | At&T Intellectual Property I, L.P. | Peer to peer metadata distribution |
US8762332B2 (en) * | 2011-03-04 | 2014-06-24 | Scribble Technologies Inc. | Systems and method for facilitating the synchronization of data on multiple user computers |
WO2013106013A1 (en) * | 2011-03-31 | 2013-07-18 | Noah Spitzer-Williams | Bookmarking moments in a recorded video using a recorded human action |
JP5908577B2 (en) | 2011-04-08 | 2016-04-26 | レール・リキード−ソシエテ・アノニム・プール・レテュード・エ・レクスプロワタシオン・デ・プロセデ・ジョルジュ・クロード | Mixture of adsorbent and phase change material with compatible density |
US9031958B2 (en) | 2011-04-18 | 2015-05-12 | International Business Machines Corporation | File searching on mobile devices |
US8886009B2 (en) * | 2011-04-26 | 2014-11-11 | Sony Corporation | Creation of video bookmarks via scripted interactivity in advanced digital television |
US8995775B2 (en) * | 2011-05-02 | 2015-03-31 | Facebook, Inc. | Reducing photo-tagging spam |
US8725816B2 (en) | 2011-05-03 | 2014-05-13 | Vmtv, Inc. | Program guide based on sharing personal comments about multimedia content |
US9900662B2 (en) | 2011-05-03 | 2018-02-20 | Vmtv, Inc. | Social data associated with bookmarks to multimedia content |
US9319732B2 (en) | 2011-05-03 | 2016-04-19 | Vmtv, Inc. | Program guide based on sharing personal comments about multimedia content |
US9197593B2 (en) | 2011-05-03 | 2015-11-24 | Vmtv, Inc. | Social data associated with bookmarks to multimedia content |
US9678992B2 (en) | 2011-05-18 | 2017-06-13 | Microsoft Technology Licensing, Llc | Text to image translation |
US8843586B2 (en) | 2011-06-03 | 2014-09-23 | Apple Inc. | Playlists for real-time or near real-time streaming |
US10402442B2 (en) * | 2011-06-03 | 2019-09-03 | Microsoft Technology Licensing, Llc | Semantic search interface for data collections |
US8856283B2 (en) | 2011-06-03 | 2014-10-07 | Apple Inc. | Playlists for real-time or near real-time streaming |
KR101537342B1 (en) * | 2011-06-03 | 2015-07-20 | 주식회사 케이티 | System and method for providing the contents continuously service |
US20120315014A1 (en) * | 2011-06-10 | 2012-12-13 | Brian Shuster | Audio fingerprinting to bookmark a location within a video |
US8737820B2 (en) | 2011-06-17 | 2014-05-27 | Snapone, Inc. | Systems and methods for recording content within digital video |
KR101797507B1 (en) | 2011-06-20 | 2017-11-15 | 엘지전자 주식회사 | Media content transceiving method and transceiving apparatus using same |
JP5830784B2 (en) * | 2011-06-23 | 2015-12-09 | サイバーアイ・エンタテインメント株式会社 | Interest graph collection system by relevance search with image recognition system |
US8898139B1 (en) | 2011-06-24 | 2014-11-25 | Google Inc. | Systems and methods for dynamic visual search engine |
US8938393B2 (en) | 2011-06-28 | 2015-01-20 | Sony Corporation | Extended videolens media engine for audio recognition |
US9280273B2 (en) * | 2011-06-30 | 2016-03-08 | Nokia Technologies Oy | Method, apparatus, and computer program for displaying content items in display regions |
WO2013009544A2 (en) | 2011-07-12 | 2013-01-17 | Solutions Xyz, Llc | System and method for capturing and delivering video images |
US8635518B1 (en) * | 2011-07-21 | 2014-01-21 | Google Inc. | Methods and systems to copy web content selections |
US9058331B2 (en) | 2011-07-27 | 2015-06-16 | Ricoh Co., Ltd. | Generating a conversation in a social network based on visual search results |
US20130031589A1 (en) * | 2011-07-27 | 2013-01-31 | Xavier Casanova | Multiple resolution scannable video |
US20130036442A1 (en) * | 2011-08-05 | 2013-02-07 | Qualcomm Incorporated | System and method for visual selection of elements in video content |
US8577988B2 (en) * | 2011-08-24 | 2013-11-05 | Lg Electronics Inc. | Content device and control method thereof |
US9467708B2 (en) | 2011-08-30 | 2016-10-11 | Sonic Ip, Inc. | Selection of resolutions for seamless resolution switching of multimedia content |
US8787570B2 (en) | 2011-08-31 | 2014-07-22 | Sonic Ip, Inc. | Systems and methods for automatically genenrating top level index files |
US8964977B2 (en) | 2011-09-01 | 2015-02-24 | Sonic Ip, Inc. | Systems and methods for saving encoded media streamed using adaptive bitrate streaming |
US8909922B2 (en) | 2011-09-01 | 2014-12-09 | Sonic Ip, Inc. | Systems and methods for playing back alternative streams of protected content protected using common cryptographic information |
US9049465B2 (en) * | 2011-09-02 | 2015-06-02 | Electronics And Telecommunications Research Institute | Media sharing apparatus and method |
US8217945B1 (en) | 2011-09-02 | 2012-07-10 | Metric Insights, Inc. | Social annotation of a single evolving visual representation of a changing dataset |
ITTO20110823A1 (en) * | 2011-09-15 | 2013-03-16 | St Microelectronics Srl | SYSTEM AND PROCEDURE FOR SIZING MULTIMEDIA CONTENT, AND ITS COMPUTER PRODUCT |
US9536564B2 (en) | 2011-09-20 | 2017-01-03 | Apple Inc. | Role-facilitated editing operations |
US20130073960A1 (en) | 2011-09-20 | 2013-03-21 | Aaron M. Eppolito | Audio meters and parameter controls |
US9152700B2 (en) * | 2011-09-30 | 2015-10-06 | Google Inc. | Applying query based image relevance models |
BR112014007669B1 (en) | 2011-09-30 | 2021-03-02 | Huawei Technologies Co., Ltd | method and device for streaming streaming media |
EP2579189A1 (en) * | 2011-10-06 | 2013-04-10 | Thomson Licensing | Method and apparatus for generating an explanation for a recommendation |
EP2769291B1 (en) | 2011-10-18 | 2021-04-28 | Carnegie Mellon University | Method and apparatus for classifying touch events on a touch sensitive surface |
KR101491583B1 (en) * | 2011-11-01 | 2015-02-11 | 주식회사 케이티 | Device and method for providing interface customized in content |
US9015109B2 (en) | 2011-11-01 | 2015-04-21 | Lemi Technology, Llc | Systems, methods, and computer readable media for maintaining recommendations in a media recommendation system |
WO2013070802A1 (en) * | 2011-11-07 | 2013-05-16 | Finitiv Corporation | System and method for indexing and annotation of video content |
US20130125181A1 (en) * | 2011-11-15 | 2013-05-16 | Liquidus Marketing, Inc. | Dynamic Video Platform Technology |
US9612724B2 (en) | 2011-11-29 | 2017-04-04 | Citrix Systems, Inc. | Integrating native user interface components on a mobile device |
JP5906071B2 (en) * | 2011-12-01 | 2016-04-20 | キヤノン株式会社 | Information processing method, information processing apparatus, and storage medium |
US9565476B2 (en) * | 2011-12-02 | 2017-02-07 | Netzyn, Inc. | Video providing textual content system and method |
US20130198786A1 (en) * | 2011-12-07 | 2013-08-01 | Comcast Cable Communications, LLC. | Immersive Environment User Experience |
US20130147395A1 (en) | 2011-12-07 | 2013-06-13 | Comcast Cable Communications, Llc | Dynamic Ambient Lighting |
US8868481B2 (en) * | 2011-12-14 | 2014-10-21 | Google Inc. | Video recommendation based on video co-occurrence statistics |
WO2013096743A1 (en) * | 2011-12-22 | 2013-06-27 | Google Inc. | Sending snippets of media content to a computing device |
US8805418B2 (en) | 2011-12-23 | 2014-08-12 | United Video Properties, Inc. | Methods and systems for performing actions based on location-based rules |
KR20130076650A (en) * | 2011-12-28 | 2013-07-08 | 삼성전자주식회사 | Image processing apparatus, and control method thereof |
US9058515B1 (en) | 2012-01-12 | 2015-06-16 | Kofax, Inc. | Systems and methods for identification document processing and business workflow integration |
US8855375B2 (en) | 2012-01-12 | 2014-10-07 | Kofax, Inc. | Systems and methods for mobile image capture and processing |
US10146795B2 (en) | 2012-01-12 | 2018-12-04 | Kofax, Inc. | Systems and methods for mobile image capture and processing |
US9483794B2 (en) | 2012-01-12 | 2016-11-01 | Kofax, Inc. | Systems and methods for identification document processing and business workflow integration |
US9058580B1 (en) | 2012-01-12 | 2015-06-16 | Kofax, Inc. | Systems and methods for identification document processing and business workflow integration |
US9864817B2 (en) * | 2012-01-28 | 2018-01-09 | Microsoft Technology Licensing, Llc | Determination of relationships between collections of disparate media types |
US9159364B1 (en) * | 2012-01-30 | 2015-10-13 | Google Inc. | Aggregation of related media content |
US9908182B2 (en) | 2012-01-30 | 2018-03-06 | Black & Decker Inc. | Remote programming of a power tool |
US9239848B2 (en) | 2012-02-06 | 2016-01-19 | Microsoft Technology Licensing, Llc | System and method for semantically annotating images |
EP2800392A4 (en) * | 2012-02-10 | 2015-09-09 | Lg Electronics Inc | Image display apparatus and method for operating same |
WO2013128061A1 (en) * | 2012-02-27 | 2013-09-06 | Nokia Corporation | Media tagging |
WO2013134662A2 (en) * | 2012-03-08 | 2013-09-12 | Perwaiz Nihal | Systems and methods for creating a temporal content profile |
KR101952260B1 (en) * | 2012-04-03 | 2019-02-26 | 삼성전자주식회사 | Video display terminal and method for displaying a plurality of video thumbnail simultaneously |
US9467723B2 (en) | 2012-04-04 | 2016-10-11 | Time Warner Cable Enterprises Llc | Apparatus and methods for automated highlight reel creation in a content delivery network |
US20130283143A1 (en) * | 2012-04-24 | 2013-10-24 | Eric David Petajan | System for Annotating Media Content for Automatic Content Understanding |
KR101413988B1 (en) * | 2012-04-25 | 2014-07-01 | (주)이스트소프트 | System and method for separating and dividing documents |
US10140367B2 (en) | 2012-04-30 | 2018-11-27 | Mastercard International Incorporated | Apparatus, method and computer program product for characterizing an individual based on musical preferences |
US9781388B2 (en) * | 2012-05-28 | 2017-10-03 | Samsung Electronics Co., Ltd. | Method and system for enhancing user experience during an ongoing content viewing activity |
US8914452B2 (en) | 2012-05-31 | 2014-12-16 | International Business Machines Corporation | Automatically generating a personalized digest of meetings |
US9355381B2 (en) * | 2012-06-01 | 2016-05-31 | Senaya, Inc. | Asset tracking system with adjusted ping rate and ping period |
US9510141B2 (en) | 2012-06-04 | 2016-11-29 | Apple Inc. | App recommendation using crowd-sourced localized app usage data |
WO2013184383A2 (en) * | 2012-06-04 | 2013-12-12 | Apple Inc. | App recommendation using crowd-sourced localized app usage data |
US9281011B2 (en) * | 2012-06-13 | 2016-03-08 | Sonic Ip, Inc. | System and methods for encoding live multimedia content with synchronized audio data |
WO2014002004A1 (en) * | 2012-06-25 | 2014-01-03 | Batchu Sumana Krishnaiahsetty | A method for marking highlights in a multimedia file and an electronic device thereof |
WO2014026247A1 (en) * | 2012-08-16 | 2014-02-20 | Captioning Studio Technologies Pty Ltd | Method and system for providing relevant portions of multi-media based on text searching of multi-media |
US20140082645A1 (en) | 2012-09-14 | 2014-03-20 | Peter Stern | Apparatus and methods for providing enhanced or interactive features |
US9407961B2 (en) * | 2012-09-14 | 2016-08-02 | Intel Corporation | Media stream selective decode based on window visibility state |
US8914836B2 (en) | 2012-09-28 | 2014-12-16 | Sonic Ip, Inc. | Systems, methods, and computer program products for load adaptive streaming |
CN104904202A (en) * | 2012-09-28 | 2015-09-09 | 三星电子株式会社 | Video encoding method and apparatus for parallel processing using reference picture information, and video decoding method and apparatus for parallel processing using reference picture information |
US8908987B1 (en) * | 2012-10-01 | 2014-12-09 | Google Inc. | Providing image candidates based on diverse adjustments to an image |
CN103716703A (en) * | 2012-10-09 | 2014-04-09 | 腾讯科技(深圳)有限公司 | Video playing method and apparatus |
US9306989B1 (en) | 2012-10-16 | 2016-04-05 | Google Inc. | Linking social media and broadcast media |
US9007365B2 (en) | 2012-11-27 | 2015-04-14 | Legend3D, Inc. | Line depth augmentation system and method for conversion of 2D images to 3D images |
US9547937B2 (en) | 2012-11-30 | 2017-01-17 | Legend3D, Inc. | Three-dimensional annotation system and method |
IL223381B (en) | 2012-12-02 | 2018-01-31 | Berale Of Teldan Group Ltd | Automatic summarising of media content |
US9565472B2 (en) | 2012-12-10 | 2017-02-07 | Time Warner Cable Enterprises Llc | Apparatus and methods for content transfer protection |
JP6139872B2 (en) * | 2012-12-10 | 2017-05-31 | キヤノン株式会社 | Information processing apparatus and control method therefor, program, storage medium, and video processing system |
US20140172630A1 (en) * | 2012-12-14 | 2014-06-19 | Mastercard International Incorporated | Social media interface for use with a global shopping cart |
US9600351B2 (en) | 2012-12-14 | 2017-03-21 | Microsoft Technology Licensing, Llc | Inversion-of-control component service models for virtual environments |
US9191457B2 (en) | 2012-12-31 | 2015-11-17 | Sonic Ip, Inc. | Systems, methods, and media for controlling delivery of content |
US9313510B2 (en) | 2012-12-31 | 2016-04-12 | Sonic Ip, Inc. | Use of objective quality measures of streamed content to reduce streaming bandwidth |
US9973823B1 (en) * | 2013-01-16 | 2018-05-15 | The Directv Group, Inc. | Method and system for providing access to content data for previously broadcasted content |
US8914416B2 (en) | 2013-01-31 | 2014-12-16 | Hewlett-Packard Development Company, L.P. | Semantics graphs for enterprise communication networks |
US9704136B2 (en) | 2013-01-31 | 2017-07-11 | Hewlett Packard Enterprise Development Lp | Identifying subsets of signifiers to analyze |
US9355166B2 (en) | 2013-01-31 | 2016-05-31 | Hewlett Packard Enterprise Development Lp | Clustering signifiers in a semantics graph |
US9749710B2 (en) * | 2013-03-01 | 2017-08-29 | Excalibur Ip, Llc | Video analysis system |
US9380443B2 (en) | 2013-03-12 | 2016-06-28 | Comcast Cable Communications, Llc | Immersive positioning and pairing
US9355312B2 (en) | 2013-03-13 | 2016-05-31 | Kofax, Inc. | Systems and methods for classifying objects in digital images captured using mobile devices |
US9208536B2 (en) | 2013-09-27 | 2015-12-08 | Kofax, Inc. | Systems and methods for three dimensional geometric reconstruction of captured image data |
WO2014160426A1 (en) | 2013-03-13 | 2014-10-02 | Kofax, Inc. | Classifying objects in digital images captured using mobile devices |
US9952742B2 (en) * | 2013-03-14 | 2018-04-24 | Google Llc | Providing trending information to users |
US10275463B2 (en) | 2013-03-15 | 2019-04-30 | Slacker, Inc. | System and method for scoring and ranking digital content based on activity of network users |
US9007404B2 (en) | 2013-03-15 | 2015-04-14 | Legend3D, Inc. | Tilt-based look around effect image enhancement method |
US9906785B2 (en) | 2013-03-15 | 2018-02-27 | Sonic Ip, Inc. | Systems, methods, and media for transcoding video data according to encoding parameters indicated by received metadata |
US10397292B2 (en) | 2013-03-15 | 2019-08-27 | Divx, Llc | Systems, methods, and media for delivery of content |
US20140281994A1 (en) * | 2013-03-15 | 2014-09-18 | Xiaomi Inc. | Interactive method, terminal device and system for communicating multimedia information |
KR20140114766A (en) | 2013-03-19 | 2014-09-29 | 퀵소 코 | Method and device for sensing touch inputs |
US9013452B2 (en) | 2013-03-25 | 2015-04-21 | Qeexo, Co. | Method and system for activating different interactive functions using different types of finger contacts |
US9612689B2 (en) | 2015-02-02 | 2017-04-04 | Qeexo, Co. | Method and apparatus for classifying a touch event on a touchscreen as related to one of multiple function generating interaction layers and activating a function in the selected interaction layer |
US9055169B2 (en) | 2013-03-29 | 2015-06-09 | Hewlett-Packard Development Company, L.P. | Printing frames of a video |
US9958228B2 (en) | 2013-04-01 | 2018-05-01 | Yardarm Technologies, Inc. | Telematics sensors and camera activation in connection with firearm activity |
EP2988520B1 (en) * | 2013-04-17 | 2019-11-20 | Panasonic Intellectual Property Management Co., Ltd. | Video reception device, and information-display control method for video reception device |
US20140316841A1 (en) | 2013-04-23 | 2014-10-23 | Kofax, Inc. | Location-based workflows and services |
US10373470B2 (en) | 2013-04-29 | 2019-08-06 | Intelliview Technologies, Inc. | Object detection |
US9438878B2 (en) | 2013-05-01 | 2016-09-06 | Legend3D, Inc. | Method of converting 2D video to 3D video using 3D object models |
JP2016518790A (en) | 2013-05-03 | 2016-06-23 | コファックス, インコーポレイテッド | System and method for detecting and classifying objects in video captured using a mobile device |
US9852769B2 (en) | 2013-05-20 | 2017-12-26 | Intel Corporation | Elastic cloud video editing and multimedia search |
US9094737B2 (en) | 2013-05-30 | 2015-07-28 | Sonic Ip, Inc. | Network video streaming with trick play based on separate trick play files |
US9405853B2 (en) * | 2013-06-17 | 2016-08-02 | Hewlett Packard Enterprise Development Lp | Reading object queries |
US9442637B1 (en) * | 2013-06-17 | 2016-09-13 | Xdroid Kft | Hierarchical navigation and visualization system |
US10001904B1 (en) | 2013-06-26 | 2018-06-19 | R3 Collaboratives, Inc. | Categorized and tagged video annotation |
WO2014205756A1 (en) * | 2013-06-28 | 2014-12-31 | Microsoft Corporation | Selecting and editing visual elements with attribute groups |
US9967305B2 (en) | 2013-06-28 | 2018-05-08 | Divx, Llc | Systems, methods, and media for streaming media content |
US20150012840A1 (en) * | 2013-07-02 | 2015-01-08 | International Business Machines Corporation | Identification and Sharing of Selections within Streaming Content |
KR20150004623A (en) * | 2013-07-03 | 2015-01-13 | 삼성전자주식회사 | Apparatus and method for unified search of content in an electronic device
KR102114729B1 (en) | 2013-08-23 | 2020-05-26 | 삼성전자주식회사 | Method for displaying saved information and an electronic device thereof |
US9239991B2 (en) | 2013-09-05 | 2016-01-19 | General Electric Company | Services support system and method |
WO2015036648A1 (en) * | 2013-09-11 | 2015-03-19 | Nokia Technologies Oy | An apparatus for processing images and associated methods |
WO2015038338A1 (en) * | 2013-09-16 | 2015-03-19 | Thomson Licensing | Browsing videos by searching multiple user comments and overlaying those into the content |
WO2015042901A1 (en) | 2013-09-29 | 2015-04-02 | Microsoft Technology Licensing, Llc | Media presentation effects |
CN104572712A (en) * | 2013-10-18 | 2015-04-29 | 英业达科技有限公司 | Multimedia file browsing system and multimedia file browsing method |
US9684661B2 (en) * | 2013-10-24 | 2017-06-20 | Kim Marie Rees | Method for correlating data |
EP2869236A1 (en) * | 2013-10-31 | 2015-05-06 | Alcatel Lucent | Process for generating a video tag cloud representing objects appearing in a video content |
US9674563B2 (en) | 2013-11-04 | 2017-06-06 | Rovi Guides, Inc. | Systems and methods for recommending content |
WO2015073920A1 (en) | 2013-11-15 | 2015-05-21 | Kofax, Inc. | Systems and methods for generating composite images of long documents using mobile video data |
US20150170067A1 (en) * | 2013-12-17 | 2015-06-18 | International Business Machines Corporation | Determining analysis recommendations based on data analysis context |
US9491514B2 (en) * | 2013-12-19 | 2016-11-08 | Echostar Technologies L.L.C. | Media content bookmarking |
US9607015B2 (en) | 2013-12-20 | 2017-03-28 | Qualcomm Incorporated | Systems, methods, and apparatus for encoding object formations |
US9842111B2 (en) | 2013-12-22 | 2017-12-12 | Varonis Systems, Ltd. | On-demand indexing |
US10002191B2 (en) | 2013-12-31 | 2018-06-19 | Google Llc | Methods, systems, and media for generating search results based on contextual information |
US10079040B2 (en) | 2013-12-31 | 2018-09-18 | Disney Enterprises, Inc. | Systems and methods for video clip creation, curation, and interaction |
US9456237B2 (en) | 2013-12-31 | 2016-09-27 | Google Inc. | Methods, systems, and media for presenting supplemental information corresponding to on-demand media content |
DE112014006235T5 (en) | 2014-01-22 | 2016-10-13 | Apple Inc. | Coordinated handover of an audio data transmission
US20150221112A1 (en) * | 2014-02-04 | 2015-08-06 | Microsoft Corporation | Emotion Indicators in Content |
US10037380B2 (en) | 2014-02-14 | 2018-07-31 | Microsoft Technology Licensing, Llc | Browsing videos via a segment list |
KR102243653B1 (en) * | 2014-02-17 | 2021-04-23 | 엘지전자 주식회사 | Display device and method for controlling the same
KR101678389B1 (en) * | 2014-02-28 | 2016-11-22 | 엔트릭스 주식회사 | Method for providing media data based on cloud computing, apparatus and system |
CA2847707C (en) | 2014-03-28 | 2021-03-30 | Intelliview Technologies Inc. | Leak detection |
US9866878B2 (en) | 2014-04-05 | 2018-01-09 | Sonic Ip, Inc. | Systems and methods for encoding and playing back video at different frame rates using enhancement layers |
US10431259B2 (en) | 2014-04-23 | 2019-10-01 | Sony Corporation | Systems and methods for reviewing video content |
US20150312652A1 (en) * | 2014-04-24 | 2015-10-29 | Microsoft Corporation | Automatic generation of videos via a segment list |
US20180046630A1 (en) * | 2016-08-12 | 2018-02-15 | Invensys Systems, Inc. | Storing and identifying content through content descriptors in a historian system |
US9621940B2 (en) | 2014-05-29 | 2017-04-11 | Time Warner Cable Enterprises Llc | Apparatus and methods for recording, accessing, and delivering packetized content |
US9996898B2 (en) * | 2014-05-30 | 2018-06-12 | International Business Machines Corporation | Flexible control in resizing of visual displays |
US9913100B2 (en) | 2014-05-30 | 2018-03-06 | Apple Inc. | Techniques for generating maps of venues including buildings and floors |
US10108748B2 (en) | 2014-05-30 | 2018-10-23 | Apple Inc. | Most relevant application recommendation based on crowd-sourced application usage data |
US9766789B1 (en) | 2014-07-07 | 2017-09-19 | Cloneless Media, LLC | Media effects system |
US9402161B2 (en) | 2014-07-23 | 2016-07-26 | Apple Inc. | Providing personalized content based on historical interaction with a mobile device |
US9679609B2 (en) * | 2014-08-14 | 2017-06-13 | Utc Fire & Security Corporation | Systems and methods for cataloguing audio-visual data |
US10943357B2 (en) | 2014-08-19 | 2021-03-09 | Intelliview Technologies Inc. | Video based indoor leak detection |
US10102285B2 (en) | 2014-08-27 | 2018-10-16 | International Business Machines Corporation | Consolidating video search for an event |
US9870800B2 (en) | 2014-08-27 | 2018-01-16 | International Business Machines Corporation | Multi-source video input |
CN105468347B (en) * | 2014-09-05 | 2018-07-27 | 富泰华工业(深圳)有限公司 | System and method for pausing video playback
US9329715B2 (en) | 2014-09-11 | 2016-05-03 | Qeexo, Co. | Method and apparatus for differentiating touch screen users based on touch event analysis |
US11619983B2 (en) | 2014-09-15 | 2023-04-04 | Qeexo, Co. | Method and apparatus for resolving touch screen ambiguities |
US10606417B2 (en) | 2014-09-24 | 2020-03-31 | Qeexo, Co. | Method for improving accuracy of touch screen event analysis by use of spatiotemporal touch patterns |
US10282024B2 (en) | 2014-09-25 | 2019-05-07 | Qeexo, Co. | Classifying contacts or associations with a touch sensitive device |
US11051075B2 (en) | 2014-10-03 | 2021-06-29 | Dish Network L.L.C. | Systems and methods for providing bookmarking data |
US10140379B2 (en) | 2014-10-27 | 2018-11-27 | Chegg, Inc. | Automated lecture deconstruction |
US9760788B2 (en) | 2014-10-30 | 2017-09-12 | Kofax, Inc. | Mobile document detection and orientation based on reference object characteristics |
US20170244992A1 (en) * | 2014-10-30 | 2017-08-24 | Sharp Kabushiki Kaisha | Media playback communication |
US10430805B2 (en) | 2014-12-10 | 2019-10-01 | Samsung Electronics Co., Ltd. | Semantic enrichment of trajectory data |
CN105787402B (en) | 2014-12-16 | 2019-07-05 | 阿里巴巴集团控股有限公司 | Information display method and device
CN105893387B (en) * | 2015-01-04 | 2021-03-23 | 伊姆西Ip控股有限责任公司 | Intelligent multimedia processing method and system |
CN107111477B (en) | 2015-01-06 | 2021-05-14 | 帝威视有限公司 | System and method for encoding content and sharing content between devices |
KR101589180B1 (en) * | 2015-01-28 | 2016-02-02 | 엘지전자 주식회사 | Mobile terminal and method for controlling the same |
US10116676B2 (en) | 2015-02-13 | 2018-10-30 | Time Warner Cable Enterprises Llc | Apparatus and methods for data collection, analysis and service modification based on online activity |
US10440076B2 (en) | 2015-03-10 | 2019-10-08 | Mobitv, Inc. | Media seek mechanisms |
US10331304B2 (en) * | 2015-05-06 | 2019-06-25 | Microsoft Technology Licensing, Llc | Techniques to automatically generate bookmarks for media files |
EP3298797A4 (en) * | 2015-05-22 | 2019-01-30 | Playsight Interactive Ltd. | Event based video generation |
US9529500B1 (en) | 2015-06-05 | 2016-12-27 | Apple Inc. | Application recommendation based on detected triggering events |
US20160378863A1 (en) * | 2015-06-24 | 2016-12-29 | Google Inc. | Selecting representative video frames for videos |
JP6468105B2 (en) * | 2015-07-16 | 2019-02-13 | 富士ゼロックス株式会社 | Communication system, server device, client device, and program
US10242285B2 (en) | 2015-07-20 | 2019-03-26 | Kofax, Inc. | Iterative recognition-guided thresholding and data extraction |
US10642404B2 (en) | 2015-08-24 | 2020-05-05 | Qeexo, Co. | Touch sensitive device with multi-sensor stream synchronized data |
US10582235B2 (en) * | 2015-09-01 | 2020-03-03 | The Nielsen Company (Us), Llc | Methods and apparatus to monitor a media presentation |
KR102551239B1 (en) * | 2015-09-02 | 2023-07-05 | 인터디지털 씨이 페이튼트 홀딩스, 에스에이에스 | Method, apparatus and system for facilitating navigation in an extended scene |
US9609307B1 (en) | 2015-09-17 | 2017-03-28 | Legend3D, Inc. | Method of converting 2D video to 3D video using machine learning |
TWI607331B (en) * | 2015-09-23 | 2017-12-01 | 財團法人工業技術研究院 | Method and device for analyzing data |
US10395323B2 (en) | 2015-11-06 | 2019-08-27 | International Business Machines Corporation | Defect management |
CN105357562B (en) * | 2015-11-11 | 2017-10-24 | 腾讯科技(深圳)有限公司 | Information processing method and terminal
KR101929781B1 (en) | 2016-01-20 | 2018-12-17 | 엘지전자 주식회사 | Mobile terminal and method for controlling the same |
US9779293B2 (en) * | 2016-01-27 | 2017-10-03 | Honeywell International Inc. | Method and tool for post-mortem analysis of tripped field devices in process industry using optical character recognition and intelligent character recognition |
US10404758B2 (en) | 2016-02-26 | 2019-09-03 | Time Warner Cable Enterprises Llc | Apparatus and methods for centralized message exchange in a user premises device |
US9779296B1 (en) | 2016-04-01 | 2017-10-03 | Kofax, Inc. | Content-based detection and three dimensional geometric reconstruction of objects in image and video data |
GB2549117B (en) * | 2016-04-05 | 2021-01-06 | Intelligent Voice Ltd | A searchable media player |
US9858340B1 (en) * | 2016-04-11 | 2018-01-02 | Digital Reasoning Systems, Inc. | Systems and methods for queryable graph representations of videos |
US10231001B2 (en) | 2016-05-24 | 2019-03-12 | Divx, Llc | Systems and methods for providing audio content during trick-play playback |
KR101810321B1 (en) * | 2016-05-30 | 2017-12-20 | 라인 가부시키가이샤 | Method and system for providing digital content based on social |
US10389776B2 (en) * | 2016-07-29 | 2019-08-20 | International Business Machines Corporation | Media streaming using hybrid P2P and client-server distribution of content |
CN106897346A (en) | 2016-08-04 | 2017-06-27 | 阿里巴巴集团控股有限公司 | Method and device for data processing
CN106791892B (en) * | 2016-11-10 | 2020-05-12 | 广州华多网络科技有限公司 | Method, device and system for live broadcasting of wheelhouses |
US10482126B2 (en) * | 2016-11-30 | 2019-11-19 | Google Llc | Determination of similarity between videos using shot duration correlation |
US20180189143A1 (en) * | 2017-01-03 | 2018-07-05 | International Business Machines Corporation | Simultaneous compression of multiple stored videos |
US10769797B2 (en) * | 2017-01-05 | 2020-09-08 | Samsung Electronics Co., Ltd. | Virtual reality experience sharing |
US10498795B2 (en) | 2017-02-17 | 2019-12-03 | Divx, Llc | Systems and methods for adaptive switching between multiple content delivery networks during adaptive bitrate streaming |
US10638144B2 (en) * | 2017-03-15 | 2020-04-28 | Facebook, Inc. | Content-based transcoder |
EP3399438A1 (en) * | 2017-05-04 | 2018-11-07 | Buzzmusiq Inc. | Method for creating preview track and apparatus using same |
US11599263B2 (en) * | 2017-05-18 | 2023-03-07 | Sony Group Corporation | Information processing device, method, and program for generating a proxy image from a proxy file representing a moving image |
US10719715B2 (en) * | 2017-06-07 | 2020-07-21 | Silveredge Technologies Pvt. Ltd. | Method and system for adaptively switching detection strategies for watermarked and non-watermarked real-time televised advertisements |
WO2019008581A1 (en) | 2017-07-05 | 2019-01-10 | Cortica Ltd. | Driving policies determination |
WO2019012527A1 (en) | 2017-07-09 | 2019-01-17 | Cortica Ltd. | Deep learning networks orchestration |
CN107688792B (en) * | 2017-09-05 | 2020-06-05 | 语联网(武汉)信息技术有限公司 | Video translation method and system |
US10997620B2 (en) * | 2017-09-18 | 2021-05-04 | Vertigo Studios, Llc | Blockchain-enabled system for controlling advertiser access to personal user data |
US10587919B2 (en) | 2017-09-29 | 2020-03-10 | International Business Machines Corporation | Cognitive digital video filtering based on user preferences |
US11363352B2 (en) * | 2017-09-29 | 2022-06-14 | International Business Machines Corporation | Video content relationship mapping |
EP3698372A1 (en) | 2017-10-17 | 2020-08-26 | Verily Life Sciences LLC | Systems and methods for segmenting surgical videos |
US10349097B2 (en) * | 2017-10-27 | 2019-07-09 | Mti Film, Llc | Metadata editor for multimedia delivery |
US10803350B2 (en) | 2017-11-30 | 2020-10-13 | Kofax, Inc. | Object detection and image cropping using a multi-detector approach |
KR102462516B1 (en) | 2018-01-09 | 2022-11-03 | 삼성전자주식회사 | Display apparatus and Method for providing a content thereof |
CN111602105B (en) * | 2018-01-22 | 2023-09-01 | 苹果公司 | Method and apparatus for presenting synthetic reality accompanying content |
US10595088B2 (en) * | 2018-03-28 | 2020-03-17 | Neulion, Inc. | Systems and methods for bookmarking during live media streaming |
CN110545469B (en) * | 2018-05-29 | 2021-07-06 | 北京字节跳动网络技术有限公司 | Method, device and storage medium for web page playback of non-streaming media files
US11227197B2 (en) | 2018-08-02 | 2022-01-18 | International Business Machines Corporation | Semantic understanding of images based on vectorization |
US11009989B2 (en) | 2018-08-21 | 2021-05-18 | Qeexo, Co. | Recognizing and rejecting unintentional touch events associated with a touch sensitive device |
IL311652A (en) | 2018-09-18 | 2024-05-01 | Vertigo Studios Llc | Interoperable digital social recorder of multi-threaded smart routed media and crypto asset compliance and payment systems and methods |
US10839694B2 (en) | 2018-10-18 | 2020-11-17 | Cartica Ai Ltd | Blind spot alert |
US11181911B2 (en) | 2018-10-18 | 2021-11-23 | Cartica Ai Ltd | Control transfer of a vehicle |
US20200133308A1 (en) | 2018-10-18 | 2020-04-30 | Cartica Ai Ltd | Vehicle to vehicle (v2v) communication less truck platooning |
US11126870B2 (en) | 2018-10-18 | 2021-09-21 | Cartica Ai Ltd. | Method and system for obstacle detection |
US11270132B2 (en) | 2018-10-26 | 2022-03-08 | Cartica Ai Ltd | Vehicle to vehicle communication and signatures |
US10748038B1 (en) | 2019-03-31 | 2020-08-18 | Cortica Ltd. | Efficient calculation of a robust signature of a media unit |
US10789535B2 (en) | 2018-11-26 | 2020-09-29 | Cartica Ai Ltd | Detection of road elements |
US10936178B2 (en) * | 2019-01-07 | 2021-03-02 | MemoryWeb, LLC | Systems and methods for analyzing and organizing digital photos and videos |
US11643005B2 (en) | 2019-02-27 | 2023-05-09 | Autobrains Technologies Ltd | Adjusting adjustable headlights of a vehicle |
US10932009B2 (en) * | 2019-03-08 | 2021-02-23 | Fcb Worldwide, Inc. | Technologies for analyzing and searching for features in image data |
US11285963B2 (en) | 2019-03-10 | 2022-03-29 | Cartica Ai Ltd. | Driver-based prediction of dangerous events |
US11694088B2 (en) | 2019-03-13 | 2023-07-04 | Cortica Ltd. | Method for object detection using knowledge distillation |
CN109947986A (en) * | 2019-03-18 | 2019-06-28 | 东华大学 | Temporal localization method for infrared video based on a structured segmental convolutional neural network
US11132548B2 (en) | 2019-03-20 | 2021-09-28 | Cortica Ltd. | Determining object information that does not explicitly appear in a media unit signature |
WO2020190246A1 (en) * | 2019-03-21 | 2020-09-24 | Google Llc | Content encryption |
KR102065994B1 (en) * | 2019-03-22 | 2020-01-15 | 보보인터내셔널 주식회사 | Method of matching audio content with other audio content using sound triggers
US11348235B2 (en) | 2019-03-22 | 2022-05-31 | Verily Life Sciences Llc | Improving surgical video consumption by identifying useful segments in surgical videos |
US10958975B2 (en) | 2019-03-27 | 2021-03-23 | Rovi Guides, Inc. | Method and apparatus for identifying a single user requesting conflicting content and resolving said conflict |
US10897648B2 (en) * | 2019-03-27 | 2021-01-19 | Rovi Guides, Inc. | Method and apparatus for identifying a single user requesting conflicting content and resolving said conflict |
US12055408B2 (en) | 2019-03-28 | 2024-08-06 | Autobrains Technologies Ltd | Estimating a movement of a hybrid-behavior vehicle |
US10776669B1 (en) | 2019-03-31 | 2020-09-15 | Cortica Ltd. | Signature generation and object detection that refer to rare scenes |
US11222069B2 (en) | 2019-03-31 | 2022-01-11 | Cortica Ltd. | Low-power calculation of a signature of a media unit |
US10796444B1 (en) | 2019-03-31 | 2020-10-06 | Cortica Ltd | Configuring spanning elements of a signature generator |
US10789527B1 (en) | 2019-03-31 | 2020-09-29 | Cortica Ltd. | Method for object detection using shallow neural networks |
US10445915B1 (en) | 2019-04-09 | 2019-10-15 | Coupang Corp. | Systems and methods for efficient management and modification of images |
US10942603B2 (en) | 2019-05-06 | 2021-03-09 | Qeexo, Co. | Managing activity states of an application processor in relation to touch or hover interactions with a touch sensitive device |
US10997459B2 (en) * | 2019-05-23 | 2021-05-04 | Webkontrol, Inc. | Video content indexing and searching |
JP7299767B2 (en) * | 2019-06-20 | 2023-06-28 | 日本放送協会 | Stream comparator and program |
US11620389B2 (en) | 2019-06-24 | 2023-04-04 | University Of Maryland Baltimore County | Method and system for reducing false positives in static source code analysis reports using machine learning and classification techniques |
US11231815B2 (en) | 2019-06-28 | 2022-01-25 | Qeexo, Co. | Detecting object proximity using touch sensitive surface sensing and ultrasonic sensing |
WO2021009597A1 (en) * | 2019-07-12 | 2021-01-21 | Carrier Corporation | A system and a method for streaming videos by creating object urls at client |
US11238094B2 (en) | 2019-09-25 | 2022-02-01 | Rovi Guides, Inc. | Auto-populating image metadata |
CN111104457A (en) * | 2019-10-30 | 2020-05-05 | 武汉大学 | Massive space-time data management method based on distributed database |
CN110825891B (en) * | 2019-10-31 | 2023-11-14 | 北京小米移动软件有限公司 | Method and device for identifying multimedia information and storage medium |
US11593662B2 (en) | 2019-12-12 | 2023-02-28 | Autobrains Technologies Ltd | Unsupervised cluster generation |
US10748022B1 (en) | 2019-12-12 | 2020-08-18 | Cartica Ai Ltd | Crowd separation |
CN110968730B (en) * | 2019-12-16 | 2023-06-09 | Oppo(重庆)智能科技有限公司 | Audio mark processing method, device, computer equipment and storage medium |
US11429662B2 (en) | 2019-12-20 | 2022-08-30 | SWATCHBOOK, Inc. | Material search system for visual, structural, and semantic search using machine learning |
US11592423B2 (en) | 2020-01-29 | 2023-02-28 | Qeexo, Co. | Adaptive ultrasonic sensing techniques and systems to mitigate interference |
US11172269B2 (en) | 2020-03-04 | 2021-11-09 | Dish Network L.L.C. | Automated commercial content shifting in a video streaming system |
US11590988B2 (en) | 2020-03-19 | 2023-02-28 | Autobrains Technologies Ltd | Predictive turning assistant |
US11827215B2 (en) | 2020-03-31 | 2023-11-28 | AutoBrains Technologies Ltd. | Method for training a driving related object detector |
US11386151B2 (en) | 2020-04-11 | 2022-07-12 | Open Space Labs, Inc. | Image search in walkthrough videos |
CN112306595A (en) * | 2020-04-30 | 2021-02-02 | 北京字节跳动网络技术有限公司 | Interaction method and device and electronic equipment |
US11776578B2 (en) | 2020-06-02 | 2023-10-03 | Trapelo Corp. | Automatic modification of values of content elements in a video |
US11381797B2 (en) | 2020-07-16 | 2022-07-05 | Apple Inc. | Variable audio for audio-visual content |
US11756424B2 (en) | 2020-07-24 | 2023-09-12 | AutoBrains Technologies Ltd. | Parking assist |
US11675822B2 (en) * | 2020-07-27 | 2023-06-13 | International Business Machines Corporation | Computer generated data analysis and learning to derive multimedia factoids |
US11379432B2 (en) | 2020-08-28 | 2022-07-05 | Bank Of America Corporation | File management using a temporal database architecture |
CN112019851A (en) * | 2020-08-31 | 2020-12-01 | 佛山市南海区广工大数控装备协同创新研究院 | Shot transition detection method based on visual rhythm
US12033669B2 (en) | 2020-09-10 | 2024-07-09 | Adobe Inc. | Snap point video segmentation identifying selection snap points for a video |
US11631434B2 (en) * | 2020-09-10 | 2023-04-18 | Adobe Inc. | Selecting and performing operations on hierarchical clusters of video segments |
US11887371B2 (en) | 2020-09-10 | 2024-01-30 | Adobe Inc. | Thumbnail video segmentation identifying thumbnail locations for a video |
US11630562B2 (en) | 2020-09-10 | 2023-04-18 | Adobe Inc. | Interacting with hierarchical clusters of video segments using a video timeline |
US11880408B2 (en) * | 2020-09-10 | 2024-01-23 | Adobe Inc. | Interacting with hierarchical clusters of video segments using a metadata search |
US11995894B2 (en) * | 2020-09-10 | 2024-05-28 | Adobe Inc. | Interacting with hierarchical clusters of video segments using a metadata panel |
US11887629B2 (en) | 2020-09-10 | 2024-01-30 | Adobe Inc. | Interacting with semantic video segments through interactive tiles |
US11810358B2 (en) | 2020-09-10 | 2023-11-07 | Adobe Inc. | Video search segmentation |
US11450112B2 (en) | 2020-09-10 | 2022-09-20 | Adobe Inc. | Segmentation and hierarchical clustering of video |
US11455731B2 (en) | 2020-09-10 | 2022-09-27 | Adobe Inc. | Video segmentation based on detected video features using a graphical model |
US12049116B2 (en) | 2020-09-30 | 2024-07-30 | Autobrains Technologies Ltd | Configuring an active suspension |
CN112215223B (en) | 2020-10-16 | 2024-03-19 | 清华大学 | Multidirectional scene character recognition method and system based on multi-element attention mechanism |
EP3985669A1 (en) * | 2020-10-16 | 2022-04-20 | Moodagent A/S | Methods and systems for automatically matching audio content with visual input |
US20220167057A1 (en) * | 2020-11-23 | 2022-05-26 | Arris Enterprises Llc | Managing user uploaded content |
US11765428B2 (en) * | 2021-04-07 | 2023-09-19 | Idomoo Ltd | System and method for adapting video size
US11783001B2 (en) | 2021-07-08 | 2023-10-10 | Bank Of America Corporation | System and method for splitting a video stream using breakpoints based on recognizing workflow patterns |
US11921999B2 (en) * | 2021-07-27 | 2024-03-05 | Rovi Guides, Inc. | Methods and systems for populating data for content item |
US12110075B2 (en) | 2021-08-05 | 2024-10-08 | AutoBrains Technologies Ltd. | Providing a prediction of a radius of a motorcycle turn |
US11516270B1 (en) | 2021-08-20 | 2022-11-29 | T-Mobile Usa, Inc. | Network protocol for enabling enhanced features for media content |
CN114286169B (en) * | 2021-08-31 | 2023-06-20 | 腾讯科技(深圳)有限公司 | Video generation method, device, terminal, server and storage medium |
US11656926B1 (en) | 2022-01-26 | 2023-05-23 | Bank Of America Corporation | Systems and methods for automatically applying configuration changes to computing clusters |
WO2024107176A1 (en) * | 2022-11-15 | 2024-05-23 | Rakuten Symphony India Pte. Ltd. | HBase online merging
Citations (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5956716A (en) * | 1995-06-07 | 1999-09-21 | Intervu, Inc. | System and method for delivery of video data over a computer network |
US6128617A (en) * | 1997-11-24 | 2000-10-03 | Lowry Software, Incorporated | Data display software with actions and links integrated with information |
US6351776B1 (en) * | 1999-11-04 | 2002-02-26 | Xdrive, Inc. | Shared internet storage resource, user interface system, and method |
US6473804B1 (en) * | 1999-01-15 | 2002-10-29 | Grischa Corporation | System for indexical triggers in enhanced video productions by redirecting request to newly generated URI based on extracted parameter of first URI |
US6546405B2 (en) * | 1997-10-23 | 2003-04-08 | Microsoft Corporation | Annotating temporally-dimensioned multimedia content |
US6573907B1 (en) * | 1997-07-03 | 2003-06-03 | Obvious Technology | Network distribution and management of interactive video and multi-media containers |
US6591295B1 (en) * | 1999-11-05 | 2003-07-08 | Oracle International Corp. | Methods and apparatus for using multimedia data stored in a relational database in web applications |
US6598074B1 (en) * | 1999-09-23 | 2003-07-22 | Rocket Network, Inc. | System and method for enabling multimedia production collaboration over a network |
US6882793B1 (en) * | 2000-06-16 | 2005-04-19 | Yesvideo, Inc. | Video processing system |
US7096271B1 (en) * | 1998-09-15 | 2006-08-22 | Microsoft Corporation | Managing timeline modification and synchronization of multiple media streams in networked client/server systems |
Family Cites Families (63)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CA1206278A (en) * | 1981-10-16 | 1986-06-17 | David R. Stamp | Fluoroscopic examination of pipe girth welds |
FR2515331A1 (en) * | 1981-10-23 | 1983-04-29 | Creusot Loire | Device for securing a tube bundle, in particular for a steam generator
US5758180A (en) * | 1993-04-15 | 1998-05-26 | Sony Corporation | Block resizing function for multi-media editing which moves other blocks in response to the resize only as necessary |
US5553281A (en) * | 1994-03-21 | 1996-09-03 | Visual F/X, Inc. | Method for computer-assisted media processing |
JP3575063B2 (en) | 1994-07-04 | 2004-10-06 | ソニー株式会社 | Playback device and playback method |
WO1996017313A1 (en) | 1994-11-18 | 1996-06-06 | Oracle Corporation | Method and apparatus for indexing multimedia information streams |
US6477370B1 (en) * | 1995-09-19 | 2002-11-05 | Motient Service Inc. | Satellite trunked radio service system |
US5767893A (en) | 1995-10-11 | 1998-06-16 | International Business Machines Corporation | Method and apparatus for content based downloading of video programs |
US5966121A (en) * | 1995-10-12 | 1999-10-12 | Andersen Consulting Llp | Interactive hypervideo editing system and interface |
US5751280A (en) * | 1995-12-11 | 1998-05-12 | Silicon Graphics, Inc. | System and method for media stream synchronization with a base atom index file and an auxiliary atom index file |
US5884056A (en) * | 1995-12-28 | 1999-03-16 | International Business Machines Corporation | Method and system for video browsing on the world wide web |
US5911139A (en) * | 1996-03-29 | 1999-06-08 | Virage, Inc. | Visual image database search engine which allows for different schema |
US5918013A (en) * | 1996-06-03 | 1999-06-29 | Webtv Networks, Inc. | Method of transcoding documents in a network environment using a proxy server |
USRE39115E1 (en) * | 1996-07-05 | 2006-06-06 | Matsushita Electric Industrial Co., Ltd. | Method for display time stamping and synchronization of multiple video object planes |
US5969716A (en) * | 1996-08-06 | 1999-10-19 | Interval Research Corporation | Time-based media processing system |
US6233017B1 (en) * | 1996-09-16 | 2001-05-15 | Microsoft Corporation | Multimedia compression system with adaptive block sizes |
US5918237A (en) | 1996-09-30 | 1999-06-29 | At&T Corp. | System and method for providing multimedia bookmarks for hypertext markup language files |
US6139197A (en) * | 1997-03-04 | 2000-10-31 | Seeitfirst.Com | Method and system for automatically forwarding snapshots created from a compressed digital video stream |
US6654933B1 (en) * | 1999-09-21 | 2003-11-25 | Kasenna, Inc. | System and method for media stream indexing |
US6741655B1 (en) * | 1997-05-05 | 2004-05-25 | The Trustees Of Columbia University In The City Of New York | Algorithms and system for object-oriented content-based video search |
US6195458B1 (en) * | 1997-07-29 | 2001-02-27 | Eastman Kodak Company | Method for content-based temporal segmentation of video |
US6567980B1 (en) * | 1997-08-14 | 2003-05-20 | Virage, Inc. | Video cataloger system with hyperlinked output |
US6317170B1 (en) * | 1997-09-13 | 2001-11-13 | Samsung Electronics Co., Ltd. | Large screen compact image projection apparatus using a hybrid video laser color mixer |
US6429879B1 (en) * | 1997-09-30 | 2002-08-06 | Compaq Computer Corporation | Customization schemes for content presentation in a device with converged functionality |
US6349330B1 (en) | 1997-11-07 | 2002-02-19 | Eigden Video | Method and apparatus for generating a compact post-diagnostic case record for browsing and diagnostic viewing |
US6571054B1 (en) * | 1997-11-10 | 2003-05-27 | Nippon Telegraph And Telephone Corporation | Method for creating and utilizing electronic image book and recording medium having recorded therein a program for implementing the method |
US6064380A (en) * | 1997-11-17 | 2000-05-16 | International Business Machines Corporation | Bookmark for multi-media content |
US6208659B1 (en) * | 1997-12-22 | 2001-03-27 | Nortel Networks Limited | Data processing system and method for providing personal information in a communication network |
US6363380B1 (en) * | 1998-01-13 | 2002-03-26 | U.S. Philips Corporation | Multimedia computer system with story segmentation capability and operating program therefor including finite automaton video parser |
JP3597690B2 (en) | 1998-01-21 | 2004-12-08 | 株式会社東芝 | Digital information recording and playback system |
JPH11212884A (en) * | 1998-01-22 | 1999-08-06 | Internatl Business Mach Corp <Ibm> | Electronic mail transmission device and method |
US6216173B1 (en) * | 1998-02-03 | 2001-04-10 | Redbox Technologies Limited | Method and apparatus for content processing and routing |
US6275227B1 (en) * | 1998-02-09 | 2001-08-14 | International Business Machines Corporation | Computer system and method for controlling the same utilizing a user interface control integrated with multiple sets of instructional material therefor |
JP3579240B2 (en) * | 1998-02-13 | 2004-10-20 | 富士通株式会社 | E-mail device and computer-readable recording medium recording e-mail program |
US6278446B1 (en) | 1998-02-23 | 2001-08-21 | Siemens Corporate Research, Inc. | System for interactive organization and browsing of video |
US6219679B1 (en) | 1998-03-18 | 2001-04-17 | Nortel Networks Limited | Enhanced user-interactive information content bookmarking |
US6426778B1 (en) * | 1998-04-03 | 2002-07-30 | Avid Technology, Inc. | System and method for providing interactive components in motion video |
US6073133A (en) * | 1998-05-15 | 2000-06-06 | Micron Electronics Inc. | Electronic mail attachment verifier |
US6956593B1 (en) * | 1998-09-15 | 2005-10-18 | Microsoft Corporation | User interface for creating, viewing and temporally positioning annotations for media content |
US6725227B1 (en) * | 1998-10-02 | 2004-04-20 | Nec Corporation | Advanced web bookmark database system |
US7143434B1 (en) * | 1998-11-06 | 2006-11-28 | Seungyup Paek | Video description system and method |
US6564263B1 (en) * | 1998-12-04 | 2003-05-13 | International Business Machines Corporation | Multimedia content description framework |
US6492998B1 (en) | 1998-12-05 | 2002-12-10 | Lg Electronics Inc. | Contents-based video story browsing system |
KR100313713B1 (en) | 1998-12-18 | 2002-02-28 | 이계철 | Visual rate dynamic generation method using pixel sampling |
US6449392B1 (en) | 1999-01-14 | 2002-09-10 | Mitsubishi Electric Research Laboratories, Inc. | Methods of scene change detection and fade detection for indexing of video sequences |
US6651087B1 (en) * | 1999-01-28 | 2003-11-18 | Bellsouth Intellectual Property Corporation | Method and system for publishing an electronic file attached to an electronic mail message |
DE60040184D1 (en) | 1999-01-28 | 2008-10-23 | Toshiba Kawasaki Kk | A method of describing image information, retrieving and reproducing video data and apparatus for playing video data |
SG92628A1 (en) * | 1999-02-13 | 2002-11-19 | Newstakes Inc | A method and apparatus for converting video to multiple mark-up-language presentations |
US6904227B1 (en) | 1999-02-15 | 2005-06-07 | Nec Corporation | Device and method for editing video and/or audio data recorded in a disc storage medium |
US6356971B1 (en) * | 1999-03-04 | 2002-03-12 | Sony Corporation | System for managing multimedia discs, tracks and files on a standalone computer |
US6774917B1 (en) | 1999-03-11 | 2004-08-10 | Fuji Xerox Co., Ltd. | Methods and apparatuses for interactive similarity searching, retrieval, and browsing of video |
EP1953758B1 (en) * | 1999-03-30 | 2014-04-30 | TiVo, Inc. | Multimedia program bookmarking system |
US6873982B1 (en) * | 1999-07-16 | 2005-03-29 | International Business Machines Corporation | Ordering of database search results based on user feedback |
US6460038B1 (en) * | 1999-09-24 | 2002-10-01 | Clickmarks, Inc. | System, method, and article of manufacture for delivering information to a user through programmable network bookmarks |
US6624826B1 (en) | 1999-09-28 | 2003-09-23 | Ricoh Co., Ltd. | Method and apparatus for generating visual representations for audio documents |
US6549643B1 (en) | 1999-11-30 | 2003-04-15 | Siemens Corporate Research, Inc. | System and method for selecting key-frames of video data |
US7047305B1 (en) * | 1999-12-09 | 2006-05-16 | Vidiator Enterprises Inc. | Personal broadcasting system for audio and video data using a wide area network |
US6829428B1 (en) | 1999-12-28 | 2004-12-07 | Elias R. Quintos | Method for compact disc presentation of video movies |
US6757273B1 (en) * | 2000-02-07 | 2004-06-29 | Nokia Corporation | Apparatus, and associated method, for communicating streaming video in a radio communication system |
US6693959B1 (en) * | 2000-03-03 | 2004-02-17 | Ati International Srl | Method and apparatus for indexing and locating key frames in streaming and variable-frame-length data |
US6925602B1 (en) | 2000-03-20 | 2005-08-02 | Intel Corporation | Facilitating access to digital video |
US6859838B1 (en) * | 2000-05-04 | 2005-02-22 | On24, Inc. | Media player with programmable playlists |
US7016937B1 (en) * | 2000-05-04 | 2006-03-21 | Bellsouth Intellectual Property Corporation | Method and apparatus for generating reminders to transmit electronic mail attachments by parsing e-mail message text |
2001
- 2001-07-23 KR KR1020037001067A patent/KR20040041082A/en unknown
- 2001-07-23 WO PCT/US2001/023631 patent/WO2002008948A2/en not_active Application Discontinuation
- 2001-07-23 US US09/911,293 patent/US7624337B2/en not_active Expired - Fee Related
- 2001-07-23 AU AU2001283004A patent/AU2001283004A1/en not_active Abandoned

2004
- 2004-09-03 KR KR1020040070337A patent/KR100798538B1/en not_active IP Right Cessation

2006
- 2006-06-08 US US11/423,143 patent/US20070033533A1/en not_active Abandoned
- 2006-06-08 US US11/423,138 patent/US20070033170A1/en not_active Abandoned
- 2006-06-08 US US11/423,136 patent/US20070033521A1/en not_active Abandoned
- 2006-06-08 US US11/423,140 patent/US20070033292A1/en not_active Abandoned
- 2006-06-08 US US11/423,134 patent/US20070033515A1/en not_active Abandoned
- 2006-08-14 US US11/504,058 patent/US7823055B2/en not_active Expired - Fee Related
- 2006-10-16 US US11/581,740 patent/US20070038612A1/en not_active Abandoned

2007
- 2007-10-01 KR KR1020070098972A patent/KR100798570B1/en not_active IP Right Cessation

2009
- 2009-10-26 US US12/605,874 patent/US20110093492A1/en not_active Abandoned
Patent Citations (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5956716A (en) * | 1995-06-07 | 1999-09-21 | Intervu, Inc. | System and method for delivery of video data over a computer network |
US6573907B1 (en) * | 1997-07-03 | 2003-06-03 | Obvious Technology | Network distribution and management of interactive video and multi-media containers |
US6546405B2 (en) * | 1997-10-23 | 2003-04-08 | Microsoft Corporation | Annotating temporally-dimensioned multimedia content |
US6128617A (en) * | 1997-11-24 | 2000-10-03 | Lowry Software, Incorporated | Data display software with actions and links integrated with information |
US7096271B1 (en) * | 1998-09-15 | 2006-08-22 | Microsoft Corporation | Managing timeline modification and synchronization of multiple media streams in networked client/server systems |
US6473804B1 (en) * | 1999-01-15 | 2002-10-29 | Grischa Corporation | System for indexical triggers in enhanced video productions by redirecting request to newly generated URI based on extracted parameter of first URI |
US6598074B1 (en) * | 1999-09-23 | 2003-07-22 | Rocket Network, Inc. | System and method for enabling multimedia production collaboration over a network |
US6351776B1 (en) * | 1999-11-04 | 2002-02-26 | Xdrive, Inc. | Shared internet storage resource, user interface system, and method |
US6591295B1 (en) * | 1999-11-05 | 2003-07-08 | Oracle International Corp. | Methods and apparatus for using multimedia data stored in a relational database in web applications |
US6882793B1 (en) * | 2000-06-16 | 2005-04-19 | Yesvideo, Inc. | Video processing system |
Cited By (88)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20040193740A1 (en) * | 2000-02-14 | 2004-09-30 | Nice Systems Ltd. | Content-based storage management |
US7664794B2 (en) | 2000-02-14 | 2010-02-16 | Nice Systems Ltd. | Content-based storage management |
US8195616B2 (en) | 2000-02-14 | 2012-06-05 | Nice Systems Ltd. | Content-based storage management |
US7349895B2 (en) * | 2000-10-30 | 2008-03-25 | Microsoft Corporation | Semi-automatic annotation of multimedia objects |
US20050010553A1 (en) * | 2000-10-30 | 2005-01-13 | Microsoft Corporation | Semi-automatic annotation of multimedia objects |
US20050055344A1 (en) * | 2000-10-30 | 2005-03-10 | Microsoft Corporation | Image retrieval systems and methods with semantic and feature based relevance feedback |
US20050114325A1 (en) * | 2000-10-30 | 2005-05-26 | Microsoft Corporation | Semi-automatic annotation of multimedia objects |
US7627556B2 (en) | 2000-10-30 | 2009-12-01 | Microsoft Corporation | Semi-automatic annotation of multimedia objects |
US7499916B2 (en) | 2000-10-30 | 2009-03-03 | Microsoft Corporation | Image retrieval systems and methods with semantic and feature based relevance feedback |
US7529732B2 (en) | 2000-10-30 | 2009-05-05 | Microsoft Corporation | Image retrieval systems and methods with semantic and feature based relevance feedback |
US20040158676A1 (en) * | 2001-01-03 | 2004-08-12 | Yehoshaphat Kasmirsky | Content-based storage management |
US20030236716A1 (en) * | 2002-06-25 | 2003-12-25 | Manico Joseph A. | Software and system for customizing a presentation of digital images |
US7236960B2 (en) * | 2002-06-25 | 2007-06-26 | Eastman Kodak Company | Software and system for customizing a presentation of digital images |
US20040044745A1 (en) * | 2002-08-30 | 2004-03-04 | Fujitsu Limited | Method, apparatus, and computer program for servicing viewing record of contents |
US20050102256A1 (en) * | 2003-11-07 | 2005-05-12 | Ibm Corporation | Single pass workload directed clustering of XML documents |
US7512615B2 (en) * | 2003-11-07 | 2009-03-31 | International Business Machines Corporation | Single pass workload directed clustering of XML documents |
US20070273754A1 (en) * | 2004-07-14 | 2007-11-29 | Ectus Limited | Method and System for Correlating Content with Linear Media |
US8363084B2 (en) | 2004-07-14 | 2013-01-29 | Cisco Systems New Zealand Limited | Method and system for correlating content with linear media |
US20070050226A1 (en) * | 2005-08-31 | 2007-03-01 | Soichiro Iga | Information display system, information display apparatus, and information display method |
US8078988B2 (en) * | 2005-08-31 | 2011-12-13 | Ricoh Company, Ltd. | Information display system, apparatus and method of displaying electronic information according to schedule information |
US8818898B2 (en) | 2005-12-06 | 2014-08-26 | Pumpone, Llc | System and method for management and distribution of multimedia presentations |
US8195028B2 (en) * | 2005-12-08 | 2012-06-05 | Thomson Licensing | Method for identifying a document recorded by a display, selection of key images and an associated receptor |
US20100046909A1 (en) * | 2005-12-08 | 2010-02-25 | Louis Chevallier | Method for Identifying a Document Recorded by a Display, Selection of Key Images and an Associated Receptor |
US20070204238A1 (en) * | 2006-02-27 | 2007-08-30 | Microsoft Corporation | Smart Video Presentation |
US10491748B1 (en) | 2006-04-03 | 2019-11-26 | Wai Wu | Intelligent communication routing system and method |
US8195734B1 (en) | 2006-11-27 | 2012-06-05 | The Research Foundation Of State University Of New York | Combining multiple clusterings by soft correspondence |
US20090281909A1 (en) * | 2006-12-06 | 2009-11-12 | Pumpone, Llc | System and method for management and distribution of multimedia presentations |
US20090265649A1 (en) * | 2006-12-06 | 2009-10-22 | Pumpone, Llc | System and method for management and distribution of multimedia presentations |
US8650489B1 (en) * | 2007-04-20 | 2014-02-11 | Adobe Systems Incorporated | Event processing in a content editor |
US20090132924A1 (en) * | 2007-11-15 | 2009-05-21 | Yojak Harshad Vasa | System and method to create highlight portions of media content |
US10324612B2 (en) | 2007-12-14 | 2019-06-18 | Apple Inc. | Scroll bar with video region in a media system |
US20090158203A1 (en) * | 2007-12-14 | 2009-06-18 | Apple Inc. | Scrolling displayed objects using a 3D remote controller in a media system |
US20110035692A1 (en) * | 2008-01-25 | 2011-02-10 | Visual Information Technologies, Inc. | Scalable Architecture for Dynamic Visualization of Multimedia Information |
WO2009094635A1 (en) * | 2008-01-25 | 2009-07-30 | Visual Information Technologies, Inc. | Scalable architecture for dynamic visualization of multimedia information |
US20100153847A1 (en) * | 2008-12-17 | 2010-06-17 | Sony Computer Entertainment America Inc. | User deformation of movie character images |
US8639086B2 (en) | 2009-01-06 | 2014-01-28 | Adobe Systems Incorporated | Rendering of video based on overlaying of bitmapped images |
US8407596B2 (en) * | 2009-04-22 | 2013-03-26 | Microsoft Corporation | Media timeline interaction |
US20100275123A1 (en) * | 2009-04-22 | 2010-10-28 | Microsoft Corporation | Media Timeline Interaction |
US8392598B2 (en) | 2009-06-15 | 2013-03-05 | Research In Motion Limited | Methods and apparatus to facilitate client controlled sessionless adaptation |
US8244901B2 (en) * | 2009-06-15 | 2012-08-14 | Research In Motion Limited | Methods and apparatus to facilitate client controlled sessionless adaptation |
US20100318600A1 (en) * | 2009-06-15 | 2010-12-16 | David Furbeck | Methods and apparatus to facilitate client controlled sessionless adaptation |
AU2010260303B2 (en) * | 2009-06-15 | 2014-08-28 | Blackberry Limited | Methods and apparatus to facilitate client controlled sessionless adaptation |
US10819815B2 (en) | 2010-07-20 | 2020-10-27 | Ideahub Inc. | Apparatus and method for providing streaming content |
US10362130B2 (en) | 2010-07-20 | 2019-07-23 | Ideahub Inc. | Apparatus and method for providing streaming contents |
US10277660B1 (en) | 2010-09-06 | 2019-04-30 | Ideahub Inc. | Apparatus and method for providing streaming content |
US20140281013A1 (en) * | 2010-10-06 | 2014-09-18 | Electronics And Telecommunications Research Institute | Apparatus and method for providing streaming content |
US9986009B2 (en) | 2010-10-06 | 2018-05-29 | Electronics And Telecommunications Research Institute | Apparatus and method for providing streaming content |
US20130185398A1 (en) * | 2010-10-06 | 2013-07-18 | Industry-University Cooperation Foundation Korea Aerospace University | Apparatus and method for providing streaming content |
US8909805B2 (en) * | 2010-10-06 | 2014-12-09 | Electronics And Telecommunications Research Institute | Apparatus and method for providing streaming content |
US20170041371A9 (en) * | 2010-10-06 | 2017-02-09 | Electronics And Telecommunications Research Institute | Apparatus and method for providing streaming content |
US9369512B2 (en) * | 2010-10-06 | 2016-06-14 | Electronics And Telecommunications Research Institute | Apparatus and method for providing streaming content |
US10748532B1 (en) | 2011-09-27 | 2020-08-18 | 3Play Media, Inc. | Electronic transcription job market |
US9704111B1 (en) | 2011-09-27 | 2017-07-11 | 3Play Media, Inc. | Electronic transcription job market |
US11657341B2 (en) | 2011-09-27 | 2023-05-23 | 3Play Media, Inc. | Electronic transcription job market |
US20130097643A1 (en) * | 2011-10-17 | 2013-04-18 | Microsoft Corporation | Interactive video |
US9846696B2 (en) * | 2012-02-29 | 2017-12-19 | Telefonaktiebolaget Lm Ericsson (Publ) | Apparatus and methods for indexing multimedia content |
US20130226930A1 (en) * | 2012-02-29 | 2013-08-29 | Telefonaktiebolaget L M Ericsson (Publ) | Apparatus and Methods For Indexing Multimedia Content |
US8918311B1 (en) * | 2012-03-21 | 2014-12-23 | 3Play Media, Inc. | Intelligent caption systems and methods |
US9632997B1 (en) * | 2012-03-21 | 2017-04-25 | 3Play Media, Inc. | Intelligent caption systems and methods |
US9607023B1 (en) | 2012-07-20 | 2017-03-28 | Ool Llc | Insight and algorithmic clustering for automated synthesis |
US11216428B1 (en) | 2012-07-20 | 2022-01-04 | Ool Llc | Insight and algorithmic clustering for automated synthesis |
US9336302B1 (en) | 2012-07-20 | 2016-05-10 | Zuci Realty Llc | Insight and algorithmic clustering for automated synthesis |
US10318503B1 (en) | 2012-07-20 | 2019-06-11 | Ool Llc | Insight and algorithmic clustering for automated synthesis |
US9633015B2 (en) | 2012-07-26 | 2017-04-25 | Telefonaktiebolaget Lm Ericsson (Publ) | Apparatus and methods for user generated content indexing |
US10606461B1 (en) | 2012-07-26 | 2020-03-31 | Google Llc | Snapping a pointing-indicator to a scene boundary of a video |
US9110562B1 (en) * | 2012-07-26 | 2015-08-18 | Google Inc. | Snapping a pointing-indicator to a scene boundary of a video |
US20140089806A1 (en) * | 2012-09-25 | 2014-03-27 | John C. Weast | Techniques for enhanced content seek |
US9621665B2 (en) * | 2012-12-07 | 2017-04-11 | Huawei Technologies Co., Ltd. | Multimedia redirection method, multimedia server, and computer system |
US20150264149A1 (en) * | 2012-12-07 | 2015-09-17 | Huawei Technologies Co., Ltd. | Multimedia Redirection Method, Multimedia Server, and Computer System |
US9854260B2 (en) | 2013-03-06 | 2017-12-26 | Disney Enterprises, Inc. | Key frame aligned transcoding using key frame list file |
US9253484B2 (en) | 2013-03-06 | 2016-02-02 | Disney Enterprises, Inc. | Key frame aligned transcoding using statistics file |
US10445367B2 (en) | 2013-05-14 | 2019-10-15 | Telefonaktiebolaget Lm Ericsson (Publ) | Search engine for textual content and non-textual content |
US10222946B2 (en) * | 2013-08-12 | 2019-03-05 | Curious.Com, Inc. | Video lesson builder system and method |
US9336685B2 (en) * | 2013-08-12 | 2016-05-10 | Curious.Com, Inc. | Video lesson builder system and method |
US10311038B2 (en) | 2013-08-29 | 2019-06-04 | Telefonaktiebolaget Lm Ericsson (Publ) | Methods, computer program, computer program product and indexing systems for indexing or updating index |
US10289810B2 (en) | 2013-08-29 | 2019-05-14 | Telefonaktiebolaget Lm Ericsson (Publ) | Method, content owner device, computer program, and computer program product for distributing content items to authorized users |
US9456170B1 (en) | 2013-10-08 | 2016-09-27 | 3Play Media, Inc. | Automated caption positioning systems and methods |
US9411422B1 (en) * | 2013-12-13 | 2016-08-09 | Audible, Inc. | User interaction with content markers |
US9521470B2 (en) * | 2014-06-13 | 2016-12-13 | Hulu, LLC | Video delivery system configured to seek in a video using different modes |
US20150365736A1 (en) * | 2014-06-13 | 2015-12-17 | Hulu, LLC | Video Delivery System Configured to Seek in a Video Using Different Modes |
USD893612S1 (en) * | 2016-11-18 | 2020-08-18 | International Business Machines Corporation | Training card |
USD1025205S1 (en) | 2016-11-18 | 2024-04-30 | International Business Machines Corporation | Training card |
US11205103B2 (en) | 2016-12-09 | 2021-12-21 | The Research Foundation for the State University of New York | Semisupervised autoencoder for sentiment analysis
WO2020214404A1 (en) * | 2019-04-19 | 2020-10-22 | Microsoft Technology Licensing, Llc | Previewing video content referenced by typed hyperlinks in comments |
US11026000B2 (en) | 2019-04-19 | 2021-06-01 | Microsoft Technology Licensing, Llc | Previewing video content referenced by typed hyperlinks in comments |
US11678031B2 (en) | 2019-04-19 | 2023-06-13 | Microsoft Technology Licensing, Llc | Authoring comments including typed hyperlinks that reference video content |
US11785194B2 (en) | 2019-04-19 | 2023-10-10 | Microsoft Technology Licensing, Llc | Contextually-aware control of a user interface displaying a video and related user text |
US11735186B2 (en) | 2021-09-07 | 2023-08-22 | 3Play Media, Inc. | Hybrid live captioning systems and methods |
Also Published As
Publication number | Publication date |
---|---|
US20070038612A1 (en) | 2007-02-15 |
US20110093492A1 (en) | 2011-04-21 |
US20070033170A1 (en) | 2007-02-08 |
KR100798570B1 (en) | 2008-01-28 |
US20070033292A1 (en) | 2007-02-08 |
US7823055B2 (en) | 2010-10-26 |
WO2002008948A2 (en) | 2002-01-31 |
US20070033521A1 (en) | 2007-02-08 |
AU2001283004A1 (en) | 2002-02-05 |
KR20050002681A (en) | 2005-01-10 |
KR20040041082A (en) | 2004-05-13 |
US20070033533A1 (en) | 2007-02-08 |
US20070044010A1 (en) | 2007-02-22 |
KR100798538B1 (en) | 2008-01-28 |
US20020069218A1 (en) | 2002-06-06 |
WO2002008948A3 (en) | 2003-09-25 |
KR20070103728A (en) | 2007-10-24 |
US7624337B2 (en) | 2009-11-24 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US7823055B2 (en) | System and method for indexing, searching, identifying, and editing multimedia files | |
Tseng et al. | Using MPEG-7 and MPEG-21 for personalizing video | |
US7143353B2 (en) | Streaming video bookmarks | |
US7653635B1 (en) | Systems and methods for interoperable multimedia content descriptions | |
EP1125245B1 (en) | Image description system and method | |
Sun et al. | Video summarization using R-sequences | |
US20110064136A1 (en) | Methods and architecture for indexing and editing compressed video over the world wide web | |
KR20050099488A (en) | Method and apparatus for encoding and decoding of a video multimedia application format including both video and metadata | |
Smeaton | Indexing, browsing and searching of digital video | |
Zhang | Content-based video browsing and retrieval | |
England et al. | I/browse: The bellcore video library toolkit | |
Tseng et al. | Hierarchical video summarization based on context clustering | |
Chang et al. | Exploring image functionalities in WWW applications development of image/video search and editing engines | |
Smeaton | Indexing, browsing, and searching of digital video and digital audio information | |
Coden et al. | Multi-Search of Video Segments Indexed by Time-Aligned Annotations of Video Content | |
Zhang | Video content analysis and retrieval | |
Meessen et al. | Content browsing and semantic context viewing through JPEG 2000-based scalable video summary | |
Meng et al. | A distributed system for editing and browsing compressed video over the network | |
Ahanger et al. | Automatic digital video production concepts | |
Zhou | Intelligent systems for video analysis and access over the Internet | |
Lee et al. | Automatic video summary and description | |
Bolle et al. | Video query and retrieval | |
Day | MPEG-7: Applications for managing content | |
Izquierdo et al. | Bringing user satisfaction to media access: The IST BUSMAN Project | |
Doulamis et al. | Non-sequential multiscale content-based video decomposition |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
AS | Assignment | Owner name: VMARK, INC., CALIFORNIA; Free format text: ASSIGNMENT OF ASSIGNORS INTEREST; ASSIGNOR: VIVCOM, INC.; REEL/FRAME: 021098/0121; Effective date: 20051221 |
STCB | Information on status: application discontinuation | Free format text: ABANDONED -- FAILURE TO RESPOND TO AN OFFICE ACTION |