US20010042114A1 - Indexing multimedia communications - Google Patents

Indexing multimedia communications

Info

Publication number
US20010042114A1
Authority
US
United States
Prior art keywords
multimedia
multimedia data
data packets
video
method
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
US09/025,940
Other versions
US6377995B2
Inventor
Sanjay Agraharam
Robert E. Markowitz
Kenneth H. Rosen
David Hilton Shur
Joel A. Winthrop
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
AT&T Corp
Original Assignee
AT&T Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by AT&T Corp
Priority to US09/025,940 (granted as US6377995B2)
Assigned to AT&T CORP. Assignors: WINTHROP, JOEL A.; ROSEN, KENNETH H.; AGRAHARAM, SANJAY; SHUR, DAVID HILTON; MARKOWITZ, ROBERT EDWARD
Publication of US20010042114A1
Application granted
Publication of US6377995B2
Anticipated expiration
Application status: Expired - Lifetime

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L65/00 Network arrangements or protocols for real-time communications
    • H04L65/40 Services or applications
    • H04L65/4069 Services related to one way streaming
    • H04L65/4076 Multicast or broadcast
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/50 Information retrieval; Database structures therefor; File system structures therefor of still image data
    • G06F16/58 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F16/583 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
    • G06F16/5838 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content using colour
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L17/00 Speaker identification or verification
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L29/00 Arrangements, apparatus, circuits or systems, not covered by a single one of groups H04L1/00 - H04L27/00
    • H04L29/02 Communication control; Communication processing
    • H04L29/06 Communication control; Communication processing characterised by a protocol
    • H04L29/0602 Protocols characterised by their application
    • H04L29/06027 Protocols for multimedia communication
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L65/00 Network arrangements or protocols for real-time communications
    • H04L65/40 Services or applications
    • H04L65/403 Arrangements for multiparty communication, e.g. conference
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N7/00 Television systems
    • H04N7/14 Systems for two-way working
    • H04N7/141 Systems for two-way working between two video terminals, e.g. videophone
    • H04N7/142 Constructional details of the terminal equipment, e.g. arrangements of the camera and the display
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N7/00 Television systems
    • H04N7/14 Systems for two-way working
    • H04N7/141 Systems for two-way working between two video terminals, e.g. videophone
    • H04N7/147 Communication arrangements, e.g. identifying the communication as a video-communication, intermediate storage of the signals
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N7/00 Television systems
    • H04N7/14 Systems for two-way working
    • H04N7/15 Conference systems
    • H04N7/152 Multipoint control units therefor

Abstract

A network based platform uses face recognition, speech recognition, background change detection and key scene events to index multimedia communications. Before the multimedia communication begins, active participants register their speech and face models with a server. The process consists of creating a speech sample, capturing a sample image of the participant and storing the data in a database. The server provides an indexing function for the multimedia communication. During the multimedia communication, metadata including time stamping is retained along with the multimedia content. The time stamping information is used for synchronizing the multimedia elements. The multimedia communication is then processed through the server to identify the multimedia communication participants based on speaker and face recognition models. This allows the server to create an index table that becomes an index of the multimedia communication. In addition, through scene change detection and background recognition, certain backgrounds and key scene information can be used for indexing. Therefore, through this indexing apparatus and method, a specific participant can be recognized as speaking and the content that the participant discussed can also be used for indexing.

Description

    BACKGROUND OF THE INVENTION
  • 1. Field of Invention [0001]
  • This invention relates to multimedia communications. More particularly, this invention relates to a method and an apparatus for indexing multimedia communications. [0002]
  • 2. Description of Related Art [0003]
  • Multimedia communications used, for example, in a conference call, may be saved for future review by conference call participants or other interested parties. That is, the audio, video and data communications that comprise the conference call may be stored for future retrieval and review. An individual may desire to see the entire conference call replayed, or may want to review only selected portions of the conference call. The individual may want to have participants identified to determine what they said and when they said it or to determine who is saying what. For example, the individual may want to review only the audio from one particular conference call participant. [0004]
  • However, some conference calls include more than one participant at a given location or end point. The audio, for example, for all the participants at a given location may be recorded and retained for future review. When a large number of participants are involved in the conference call, separating out individual audio tracks, for example, is difficult due to limitations of current systems to differentiate between the participants. This situation can arise when there are a large number of participants at all the locations or when there are a large number of participants at one particular location. Therefore, a more efficient and reliable method for indexing multimedia communications is needed. [0005]
  • SUMMARY OF THE INVENTION
  • The invention provides a reliable and efficient method and apparatus for indexing multimedia communications so that selected portions of the multimedia communications can be efficiently retrieved and replayed. The invention uses distinctive features of the multimedia communications to achieve the indexing. For example, the invention provides a combination of face recognition and voice recognition features to identify particular participants to a multicast, multimedia conference call. Data related to the identities of the particular participants, or metadata, may be added, as part of a multimedia data packet extension header, to multimedia data packets containing the audio and video information corresponding to the particular participants, thus indexing the multimedia data packets. The multimedia data packets with the extension headers may then be stored in a database or retransmitted in near real-time (i.e., with some small delay). Then, multimedia data packets containing, for example, audio from a particular individual, can be readily and reliably retrieved from the database by specifying the particular individual. [0006]
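The retrieval use case described above can be sketched in outline. The following Python fragment is an illustrative assumption, not part of the patent: packets are modeled as dictionaries, and the identity metadata carried in each packet's extension header is used to pull out one participant's media from the stored conference.

```python
# Sketch of metadata-based retrieval: packets tagged with a participant's
# identity in an extension header can be filtered by that identity.
# The dict layout and participant names are invented for illustration.

def retrieve_by_participant(stored_packets, participant):
    """Return only the packets whose extension header names the participant."""
    return [p for p in stored_packets
            if p.get("ext_header", {}).get("speaker") == participant]

# A tiny stored conference: each indexed packet carries media plus metadata.
packets = [
    {"payload": b"audio-1", "ext_header": {"speaker": "P1"}},
    {"payload": b"audio-2", "ext_header": {"speaker": "P2"}},
    {"payload": b"audio-3", "ext_header": {"speaker": "P1"}},
    {"payload": b"video-1"},  # unindexed packet: no extension header
]

p1_audio = retrieve_by_participant(packets, "P1")
```

Specifying the participant thus returns exactly the indexed packets attributed to that individual.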
  • Other features, such as background detection and key scene changes can also be used to index the multimedia communications. Data related to these features is also added to multimedia data packet extension headers to allow reliable retrieval of data associated with these features. [0007]
  • In a preferred embodiment, the participants to the multimedia communications are connected via a local area network (LAN) to a multicast network. An index server within the multicast network receives the multimedia communications from different locations, manipulates/alters the communications and simultaneously broadcasts, or multicasts, the altered multimedia communications to all other locations involved in the multimedia communications. Alternately, the locations can be connected to the multicast network using plain old telephone service (POTS) lines with modems at the individual locations and at the multicast network, or using ISDN, xDSL, cable modem and frame relay, for example. [0008]
  • These and other features and advantages of the invention are described in or are apparent from the following detailed description of the preferred embodiments. [0009]
  • BRIEF DESCRIPTION OF THE DRAWINGS
  • The invention is described in detail with reference to the following drawings, in which like numerals refer to like elements, and wherein: [0010]
  • FIG. 1 is a block diagram of a multicast network according to the present invention; [0011]
  • FIG. 2 is a block diagram of a server platform of the invention; [0012]
  • FIG. 3 is a block diagram of representative equipment used by multimedia conference call participants; [0013]
  • FIG. 4 is an alternate equipment arrangement; [0014]
  • FIG. 5 is a logical diagram of an index server; [0015]
  • FIG. 6 is an alternate arrangement for a multicast network; [0016]
  • FIG. 7 is a logical diagram of an index used in the multicast network of FIG. 6; [0017]
  • FIG. 8 is a representation of a multimedia data packet; [0018]
  • FIG. 9 is a logical representation of an index table; [0019]
  • FIG. 10 is a flowchart representing the multicast operation; and [0020]
  • FIGS. 11A and 11B show a flowchart representing operation of the indexing process. [0021]
  • DETAILED DESCRIPTION OF PREFERRED EMBODIMENTS
  • Multimedia communications can be provided with metadata that can be recorded along with the multimedia communications and possibly rebroadcast along with the original communications in near real time. Metadata is simply information about data. That is, metadata is information that can be used to further define or characterize the data. A paradigm example of metadata is a time stamp. When multimedia communications are recorded, a time of production may be associated with the communications. A time stamp can be added to the multimedia communications to provide an indexing function. However, a time stamp alone is often not an adequate method for indexing multimedia communications. [0022]
  • One method for indexing multimedia communications is provided in U.S. Pat. No. 5,710,591, “Method and Apparatus for Recording and Indexing Audio and Multimedia Conference,” which is hereby incorporated by reference. However, when the multimedia communications originate from several different locations, a multicast feature must be included to provide optimum conference performance. [0023]
  • Accordingly, this invention provides a method and an apparatus to allow indexing of multicast, multimedia communications that is based on distinguishing features such as the identity of participants to the multimedia communications and other features associated with the locations where the multimedia communications originate. The participants in the multimedia communication can view a slightly time-delayed version of the original communication that now includes the metadata. The indexing then allows subsequent users of a multimedia communications service to easily and efficiently search for and replay audio, video and data communications of a particular individual, for example. [0024]
  • For example, the invention uses voice and image feature/object recognition in conjunction with RTP packet protocol information to identify specific speakers at a given location. Information that identifies the specific speakers is then inserted into RTP packets, and the RTP packets are rebroadcast/multicast to other participants in a multimedia communication. The thus-modified RTP packets may be stored and later retrieved on demand. [0025]
  • In one example of the invention, the indexing feature is provided for a multicast, multimedia conference call. However, the invention can be used with any multimedia communications. [0026]
  • FIG. 1 shows an example of an apparatus for completing the multicast, multimedia conference call. In FIG. 1, a multicast network 10 receives multimedia communications from locations, or sources, S1, S2 and S3, that are engaged in a multimedia conference call. Each of the sources S1, S2 and S3 includes one or more conference call participants. Thus, participants P1, P2, P3 and P4 are at the source S1; participants P5 and P6 are at the source S2; and participants P7 and P8 are at the source S3 in this example. However, the invention is not limited to three sources and eight participants, and any number of sources with any number of participants may be engaged in the multimedia conference call. [0027]
  • In FIG. 1, the sources S1, S2 and S3 connect to the multicast network 10 over a local area network such as an Ethernet or any local area network (e.g., ATM) capable of providing sufficient bandwidth. The multimedia conference call could also be completed over existing telephone lines using asymmetric digital subscriber line (ADSL) or integrated services digital network (ISDN) connections. The method for providing the multimedia conference call could also operate over the public Internet in conjunction with the Internet Protocol (IP). Finally, the method could also be applied to a public switched telephone network (PSTN), for example. The communications may use the Real Time Protocol (RTP), the Internet Protocol (IP) and the User Datagram Protocol (UDP). [0028]
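The RTP-over-UDP/IP transport mentioned above can be demonstrated in miniature. This Python sketch sends one RTP-style packet over UDP on the loopback interface; the 12-byte header layout follows RTP (RFC 3550), but the field values, port choice and payload are illustrative assumptions.

```python
# Carry a media payload over UDP on loopback, in the spirit of RTP/UDP/IP.
import socket
import struct

def make_rtp_packet(seq, timestamp, ssrc, payload):
    """Pack a minimal RTP-style header: version 2, no padding/extension,
    payload type 0, then sequence number, timestamp and source identifier."""
    header = struct.pack("!BBHII", 0x80, 0, seq, timestamp, ssrc)
    return header + payload

recv_sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
recv_sock.bind(("127.0.0.1", 0))          # let the OS pick a free port
send_sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)

packet = make_rtp_packet(seq=1, timestamp=160, ssrc=0x1234, payload=b"audio")
send_sock.sendto(packet, recv_sock.getsockname())
data, _ = recv_sock.recvfrom(2048)

version = data[0] >> 6                    # top two bits hold the RTP version
seq = struct.unpack("!H", data[2:4])[0]
send_sock.close()
recv_sock.close()
```

The same packing would apply unchanged whether the datagram crosses a LAN, a DSL link or the public Internet, which is why the patent can list those transports interchangeably.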
  • Communications from the sources S1, S2 and S3 are received at routers R located in the multicast network 10. The routers R ensure that the communications from each of the sources and from the multicast network 10 are sent to the desired address. The multicast network 10 connects to an index server 20. The index server 20 participates in the multimedia conference call among the sources S1, S2 and S3. [0029]
  • As shown in FIG. 2, the index server 20 may be a multimedia communications device such as a multicast server or a bridge. If a bridge is used, the bridge may repeatedly transmit the multimedia communications, one transmission for each source connected to the multimedia conference call. The multicast server may transmit the multimedia communications simultaneously to each of the sources. In the discussion that follows, the index server 20 participates in a multimedia conference call among the sources S1, S2 and S3. The index server 20 receives multimedia communications from all the sources and simultaneously retransmits the multimedia communications to all the sources. For example, the index server 20 receives multimedia including audio, video and data from sources S1, S2 and S3. The index server 20 then simultaneously retransmits, or multicasts, the multimedia communications from sources S2 and S3 to source S1, the multimedia communications from sources S1 and S2 to source S3, and the multimedia communications from sources S1 and S3 to source S2. [0030]
  • Also as shown in FIG. 2, the index server 20 includes a buffer 22 and a database 23. The buffer 22 temporarily stores data that is to be processed by an index process module 21. The buffer 22 is needed because of slight time delays between receipt of the multimedia communications and subsequent index processing. The buffer 22 is also necessary because feature recognition modules (to be described later) contained in the index process module 21 may require several milliseconds of data in order to correctly identify a distinguishing feature. However, the multimedia communications are received at the multicast server in multimedia data packets that may contain as few as 10 μsec of data. Thus, the buffer 22 may store the multimedia data packets temporarily until a sufficient amount of data is available to allow processing by the index process module 21. [0031]
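The buffering just described can be sketched as follows: packets carrying only microseconds of media are queued until enough has accumulated for the recognition modules to work on. The class name and the 50 μs threshold are illustrative assumptions, not values from the patent.

```python
# Sketch of the index server's buffer 22: hold incoming packets until the
# accumulated media duration reaches the amount the recognizers need.

class IndexBuffer:
    def __init__(self, threshold_us=50):
        self.threshold_us = threshold_us   # minimum data needed per analysis
        self._packets = []
        self._queued_us = 0

    def add(self, packet, duration_us):
        """Queue a packet; return a full batch for index processing, or None."""
        self._packets.append(packet)
        self._queued_us += duration_us
        if self._queued_us >= self.threshold_us:
            batch, self._packets, self._queued_us = self._packets, [], 0
            return batch
        return None                        # not enough data queued yet

buf = IndexBuffer(threshold_us=50)
results = [buf.add(f"pkt{i}", 10) for i in range(6)]  # six 10 μs packets
```

Only the fifth packet completes a batch, so the recognizers see a contiguous 50 μs run rather than isolated packets.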
  • The database 23 is used to store indexed multimedia communications for subsequent playback, and to store certain information related to the sources and to the participants in the multimedia communications. For example, all the communications received at the multicast network 10 include a time stamp. The time stamp, in addition to serving an indexing function, allows synchronization of the different multimedia communications. That is, the time stamp allows for synchronization of video and data communications from the source S1 and for synchronization of communications from the sources S1 and S2, for example. For indexing, the time stamp can be used to segment the multimedia conference call according to time, and an individual can replay selected portions of the multimedia conference call corresponding to a set time period. The time stamp information is stored in the database 23 along with the multimedia communications. [0032]
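Time-based segmentation of the stored call can be sketched directly: select every stream's packets for one window and order them by time stamp so they replay in synchrony. The dict layout and plain-seconds timestamps are illustrative assumptions.

```python
# Sketch of time-stamp segmentation: replay only a chosen window of the call.

def segment(packets, start, end):
    """Select packets whose time stamp falls in [start, end), ordered by time."""
    window = [p for p in packets if start <= p["ts"] < end]
    return sorted(window, key=lambda p: p["ts"])

packets = [
    {"ts": 0.0, "source": "S1", "payload": b"a"},
    {"ts": 1.0, "source": "S2", "payload": b"b"},
    {"ts": 2.0, "source": "S1", "payload": b"c"},
    {"ts": 3.0, "source": "S2", "payload": b"d"},
]

replay = segment(packets, 1.0, 3.0)   # one selected portion of the call
```

Because packets from all sources share the time axis, the same sort interleaves S1 and S2 correctly, which is the synchronization role the time stamp plays above.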
  • The index server 20 allows for indexing of the multimedia conference call. Specifically, the index server 20 may index the multimedia communications received from the sources S1, S2 and S3. The indexing adds metadata to the multimedia conference call data. The metadata could include information that identifies a particular participant, for example. In the preferred embodiment, speaker identification and face recognition software determines the identity of the participant. Once the participant is identified, the index server 20 creates an index table that becomes an index to the multimedia conference call. The indexing process will be described in more detail later. Furthermore, the invention is not limited to distinguishing between live persons. Any object may be distinguished according to the invention, including animals, plants and inanimate objects. [0033]
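One plausible shape for the index table mentioned above is a mapping from each recognized participant to the time stamps at which that participant was identified. That structure is an illustrative assumption; the patent describes the table only abstractly.

```python
# Sketch of the index table built by the index server: one entry per
# recognized participant, listing when that participant appears.
from collections import defaultdict

def build_index(identified_events):
    """Turn (time stamp, participant) recognizer output into an index table."""
    table = defaultdict(list)
    for ts, participant in identified_events:
        table[participant].append(ts)
    return dict(table)

# (time stamp, identified participant) pairs produced by the recognizers
events = [(0.0, "P1"), (0.5, "P1"), (1.0, "P2"), (1.5, "P1")]
index_table = build_index(events)
```

Looking up a participant in the table then yields the replay positions for that speaker, which is the retrieval the invention promises.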
  • FIGS. 3 and 4 show examples of communications devices available at the sources S1, S2 and S3. FIG. 3 shows a video phone 60 that is an example of a communication device that the participants may use to communicate multimedia information. FIG. 4 shows a computer 40 that is another example of a multimedia communications device that the participants may use in accordance with the invention. The computer 40 includes a data entry device such as a keyboard 41 and a mouse 42, a central processor unit (CPU) 43, a visual display 44, speakers 45, a video camera 46 and a microphone 47. The computer 40 connects to the multicast network 10 through LAN connectors 48. The computer 40 may be a personal computer, a portable computer, a workstation or a mainframe computer. The microphone 47 captures and transmits the audio portion of the multimedia conference call. An analog-to-digital converter or sound card (not shown) converts the analog speech into a digital representation. The video camera 46 captures and transmits an image of each participant. The video camera 46 may be an analog or a digital camera. If an analog camera is used, the video signal from the video camera 46 is first sent to a codec (not shown) for conversion to a digital signal before it is transmitted to the multicast network 10. Furthermore, the video camera 46 may be voice activated so that the video camera 46 slews, or rotates, to capture the image of a speaker. The speakers 45 provide audio signals to the participants. The display 44 may display images of the participants. The display 44 may also display information related to the multimedia call, such as a call label and toolbars that allow the participants to interact with the index server 20 in the multicast network 10. [0034]
  • The computer 40 is provided with a specific application program that allows the participants to interface with the index server 20 and other network components. The keyboard 41 and the mouse 42 function as data input devices that allow participants to send commands to the index server 20, for example. The computer 40 includes a packet assembler (not shown) that compresses and assembles the digital representations of the multimedia communications into discrete packets. The CPU 43 controls all functions of the computer 40. [0035]
  • As noted above, at least two ways are available to identify individual conference call participants. Face recognition software such as FACEIT® automatically detects, locates, extracts and identifies human faces from live video. FACEIT® requires a personal computer or similar device and a video camera, and compares faces recorded by the video camera to data stored in a database, such as the database 23, using statistical techniques. FACEIT® is described in detail at http://www.faceit.com. [0036]
  • FACEIT® uses an algorithm based on local feature analysis (LFA), a statistical pattern representation formalism that derives, from an ensemble of examples of patterns, a unique set of local building blocks that best represent new instances of these patterns. For example, starting with an ensemble of facial images, LFA derives the set of local features that are optimal for representing any new face. Equipped with these universal facial building blocks, FACEIT® automatically breaks down a face into its component features and compares these features to stored data, such as data stored in the database 23. Therefore, to use FACEIT®, each multimedia conference call participant's face must first be registered with the multimedia service so that the facial features can be stored in the database 23. Then, during subsequent multimedia conference calls, FACEIT® can be used to identify a face from all the faces being captured by the video cameras 46. Although the above discussion refers to FACEIT®, it should be understood that the present invention is not limited to use of this particular facial identification system. [0037]
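FACEIT®'s local feature analysis is proprietary, so as a stand-in the following sketch shows only the general shape of template matching that the paragraph implies: each registered face is reduced to a feature vector, and a probe face is assigned to the nearest enrolled vector if the distance clears a threshold. The vectors, threshold and names are invented for illustration and are not the patent's or FACEIT®'s actual method.

```python
# Illustrative face-template matching: nearest enrolled feature vector wins,
# subject to a distance threshold (otherwise no identification is made).
import math

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def identify_face(probe, enrolled, threshold=1.0):
    """Return the best-matching registered participant, or None."""
    name, dist = min(((n, euclidean(probe, v)) for n, v in enrolled.items()),
                     key=lambda t: t[1])
    return name if dist <= threshold else None

# Feature vectors stored at registration time (stand-ins for LFA features)
enrolled = {"P1": [0.1, 0.9, 0.3], "P2": [0.8, 0.2, 0.5]}

match = identify_face([0.15, 0.85, 0.3], enrolled)   # close to P1's template
no_match = identify_face([9.0, 9.0, 9.0], enrolled)  # resembles no one
```

The registration step the paragraph requires corresponds to populating `enrolled` before any call begins.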
  • In addition to face recognition, the preferred embodiment includes a speech recognition feature in which a participant is identified based on spectral information from the participant's voice. As with the face recognition feature, the multimedia conference call participant must first register a speech sample so that a voice model is stored in the database 23. The speech recognition feature requires an input, a processor and an output. The input may be a high quality microphone or microphone array for speech input and an analog-to-digital conversion board that produces digital speech signals representative of the analog speech input. The processor and output may be incorporated into the multicast network 10. A speech recognition system is described in detail in U.S. Pat. No. 5,666,466, which is hereby incorporated by reference. [0038]
  • By using both speech recognition and face recognition systems, the preferred embodiment can reliably and quickly identify a particular multimedia conference call participant and thereby allow precise indexing of the multimedia conference call. That is, in a multimedia conference call involving several different sources with numerous participants at each source, identifying a particular participant out of the total group of participants is difficult to achieve with current systems. However, this invention can locate an individual to a particular source, such as source S1, based on the source address and, in addition, applies speech and face recognition only among the specific individuals at that particular location. The index process module 21 then compares participant face and speech patterns contained in the database 23 to audio information and video information being received at the multicast network 10 during the multimedia conference call. By using both face recognition and speech recognition systems, the invention is much more likely to correctly identify the particular participant than a system that uses only speech recognition, for example. [0039]
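The two-cue identification described above can be sketched as follows: first narrow the candidate set to the participants registered at the packet's source address, then fuse face and speech confidence over just those candidates. The scores, roster, multiplicative fusion rule and 0.5 acceptance threshold are illustrative assumptions.

```python
# Sketch of fused face + speech identification restricted to one source's
# roster, per the source-address narrowing described in the text.

def identify(source_roster, face_scores, speech_scores, threshold=0.5):
    """Combine face and speech confidence; both cues must support the match."""
    best, best_score = None, 0.0
    for participant in source_roster:
        combined = (face_scores.get(participant, 0.0)
                    * speech_scores.get(participant, 0.0))
        if combined > best_score:
            best, best_score = participant, combined
    return best if best_score >= threshold else None

roster_s1 = ["P1", "P2", "P3", "P4"]     # participants located at source S1
face = {"P1": 0.9, "P2": 0.4}            # per-participant face confidence
speech = {"P1": 0.8, "P2": 0.9}          # per-participant voice confidence

speaker = identify(roster_s1, face, speech)
```

Multiplying the two scores means a participant favored by only one cue (like P2 here, strong on voice but weak on face) cannot outrank one supported by both, which is the robustness benefit the paragraph claims.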
  • Other metadata may be used in addition to face and speech recognition data to index the multimedia conference call. For example, if the background changes, the index process module 21 can detect this event and record the change as metadata. Thus, if the video camera 46 that is recording the video portion of the multimedia conference call at a source, such as source S1, slews or rotates so that the background changes from a blank wall to a blackboard, for example, the index process module 21 may detect this change and record a background change event. This change in background can then be used for subsequent searches of the multimedia conference call. For example, the individual reviewing the contents of a multimedia conference call may desire to retrieve those portions of the multicast, multimedia conference call in which a blackboard at the source S1 is displayed. As with face and speech recognition, the index process module 21 may have stored in the database 23 a representation of the various backgrounds at the sources S1, S2 and S3 that are intended to be specifically identified. Similarly, a change in key scene features may be detected and used to classify or index the multimedia conference call. Key scene feature changes include loss or interruption of a video signal, such as when a source, such as source S1, purposely goes off-line, for example. [0040]
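A minimal form of the background-change detection described above compares each frame to its predecessor and flags a change event when the mean absolute pixel difference exceeds a threshold. The tiny grayscale frames and the threshold value are illustrative assumptions; the patent does not specify the detection algorithm.

```python
# Sketch of background/key-scene change detection by frame differencing.

def mean_abs_diff(frame_a, frame_b):
    """Average absolute per-pixel difference between two grayscale frames."""
    pixels = [abs(a - b)
              for row_a, row_b in zip(frame_a, frame_b)
              for a, b in zip(row_a, row_b)]
    return sum(pixels) / len(pixels)

def scene_changes(frames, threshold=30):
    """Return the indices of frames where the scene changed markedly."""
    return [i for i in range(1, len(frames))
            if mean_abs_diff(frames[i - 1], frames[i]) > threshold]

wall = [[10, 10], [10, 10]]              # camera pointed at a blank wall
blackboard = [[200, 200], [200, 200]]    # camera slewed to a blackboard

changes = scene_changes([wall, wall, blackboard, blackboard])
```

Each flagged index would become a scene-change event recorded as metadata, making "show the portions where the blackboard is displayed" a simple lookup.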
  • FIG. 5 shows the index process module 21 in detail. A control module 31 controls the functions of the index process module 21. When multimedia communications are transmitted to the multicast network 10, a multicast module 32 in the index process module 21 receives and broadcasts the multimedia communications. In parallel with broadcasting the multimedia communications, a speech recognition module 33 compares data received by the multicast module 32 to speech models stored in the database 23 to determine if a match exists and outputs a speech identifier when the match exists. In addition, a face recognition module 34, which incorporates face recognition software such as FACEIT®, compares the data to face models stored in the database 23 and outputs a face identifier when a match is determined between video data comprising the multimedia communications and facial models stored in the database 23. Finally, a scene recognition module 35 compares scenes captured by the video camera 46 to scenes stored in the database 23 to determine if any background or key scene changes occurred. The scene recognition module 35 outputs a scene change identifier when a background change, for example, is detected. When any of the above recognition modules determines that a match exists, a header function module 36 receives the inputs from the speech, face and scene recognition modules 33, 34 and 35, respectively. The header function module 36, based on the inputs, creates a multimedia extension header, attaches the multimedia extension header to the multimedia data packet, and applies certain information, or data, to the multimedia extension header, based on the inputs received from the speech, face and scene modules 33, 34 and 35, respectively. The multimedia data packets with the modified headers are then retransmitted/multicast into the network. The multimedia extension header is described below. [0041]
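The header function module's job can be sketched in a few lines: it takes whatever identifiers the speech, face and scene modules produced for a packet and, if any exist, attaches them as an extension header and marks the packet as extended. The dict-based packet layout and field names are illustrative assumptions standing in for the binary header of FIG. 8.

```python
# Sketch of the header function module 36: attach recognizer outputs to a
# packet as an extension header, and flag the packet as carrying one.

def apply_extension(packet, speech_id=None, face_id=None, scene_id=None):
    """Attach any available identifiers as extension-header metadata."""
    ids = {k: v for k, v in
           [("speech", speech_id), ("face", face_id), ("scene", scene_id)]
           if v is not None}
    if ids:
        packet["extension"] = True        # flag: an extension header follows
        packet["ext_header"] = ids
    return packet

pkt = apply_extension({"payload": b"audio", "extension": False},
                      speech_id="P5", face_id="P5")
plain = apply_extension({"payload": b"video", "extension": False})
```

Packets for which no recognizer fired pass through untouched, so the multicast path is unaffected when there is nothing to index.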
  • As noted above, the present invention is able to provide indexing of multimedia communications because the multimedia data streams are divided into multimedia data packets, and each multimedia data packet contains at least one multimedia header. The conversion of the data into digital format is performed at the source, such as the source S1. Specifically, a processor, such as the computer 40, receives the audio and video data in an analog format. For example, with the audio data, analog audio waveforms are fed into a sound card that converts the data into digital form. A packet assembler (not shown) then assembles the digitized audio data into multimedia data packets. The multimedia data packets include a segment that contains the digital data and a segment, or multimedia header, that contains other information, such as the source IP address and the destination IP address, for example. The multimedia header is also used by the index process module 21 to hold the data needed to provide the indexing function. [0042]
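The packet assembler described above can be sketched as cutting the digitized stream into fixed-size payloads, each prefixed with a header holding the source and destination addresses and a sequence number. The chunk size, addresses and dict encoding are illustrative assumptions.

```python
# Sketch of the packet assembler: split a digitized media stream into
# packets, each carrying a header segment plus a data segment (payload).

def assemble(data, src, dst, chunk=4):
    """Cut the byte stream into packets with addressing and sequence info."""
    packets = []
    for seq, i in enumerate(range(0, len(data), chunk)):
        packets.append({"src": src, "dst": dst, "seq": seq,
                        "payload": data[i:i + chunk]})
    return packets

def reassemble(packets):
    """Concatenate payloads in sequence order to recover the media stream."""
    return b"".join(p["payload"]
                    for p in sorted(packets, key=lambda p: p["seq"]))

stream = b"digitized audio"
pkts = assemble(stream, src="10.0.0.1", dst="10.0.0.9")
```

Because every packet repeats the source address, the index server can locate each packet to a source before running recognition, as described earlier.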
  • FIG. 6 shows an alternate arrangement of an apparatus for indexing multimedia communications. In FIG. 6, source S4, with participants P10 and P12, connects to a multicast network 80 via a computer system 70 and a local indexer 72. Source S5, with participants P9 and P11, connects to the multicast network 80 via a computer system 71 and a local indexer 73. The computer systems 70 and 71, together with the local indexers 72 and 73, contain all the components and perform all the functions of the apparatus shown in FIG. 3. [0043]
  • Multimedia communications from the sources S4 and S5 are received at routers R located in the multicast network 80. The multicast network 80 connects to a server 81 that may be a multicast server or a bridge. Multimedia communications from the sources S4 and S5 may be stored in a database 82. [0044]
  • The apparatus of FIG. 6 differs from that of FIG. 1 in that the indexing function occurs locally at the sources S4 and S5. That is, face recognition and speech recognition functions, for example, are performed by the indexers 72 and 73 at the sources S4 and S5, respectively. [0045]
  • FIG. 7 is a logical diagram of the indexer 72. The description that follows applies to the indexer 72. However, the indexer 73 is identical to the indexer 72; hence the following description is equally applicable to both indexers. An interface module 90 receives multimedia data packets from the computer system 70. A buffer 91 temporarily stores the multimedia data packets received at the interface module 90. A database 92 stores face, speech and background models for the source S4. A speech recognition module 93 compares audio received by the interface module 90 to speech models stored in the database 92 to determine if a speech pattern match exists. The speech recognition module 93 outputs a speech identifier when the speech pattern match exists. A face recognition module 94 compares video data received at the interface module 90 to face models stored in the database 92 to determine if a face pattern match exists, and outputs a face identifier when the face pattern match exists. A scene recognition module 95 compares scenes captured by the computer system 70 to scene models stored in the database 92, and outputs a scene identifier when a scene match exists. A header function module 96 creates a multimedia extension header, attaches the multimedia extension header to the multimedia data packet, and applies specific data to the multimedia extension header, based on the inputs from the speech, face and scene recognition modules 93, 94 and 95, respectively. The local indexer 72 may also incorporate other recognition modules to identify additional distinguishing features for indexing the multimedia communications. [0046]
  • FIG. 8 shows a multimedia data packet 100. In FIG. 8, a data segment 110 (payload) contains the data, such as the audio data, for example, that was transmitted from the source S1. A multimedia data packet header segment 120 contains additional fields related to the multimedia communications. An extension field 121 may be used to indicate that the multimedia data packet 100 contains a multimedia extension header. A payload type field 122 indicates the type of data contained in the data segment 110. A source identifier field 123 indicates the source of the multimedia data packet. The header segment 120 may contain numerous additional fields. [0047]
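A minimal sketch of the packet layout of FIG. 8 follows. The patent does not give a concrete encoding, so the field names and types below are assumptions chosen to mirror the numbered fields (121, 122, 123, 110, 130) described above.

```python
# Hypothetical representation of the multimedia data packet 100 of FIG. 8.
from dataclasses import dataclass, field

@dataclass
class MultimediaPacket:
    payload_type: str          # field 122: e.g. "audio", "video", or "data"
    source_id: str             # field 123: originating source, e.g. "S1"
    payload: bytes             # data segment 110: the media data itself
    extension: bool = False    # field 121: set when an extension header follows
    ext_header: dict = field(default_factory=dict)  # segment 130, when present
```

A newly received packet carries no extension; the indexing step later sets the `extension` flag and fills `ext_header` with identifiers.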
  • Returning to the first embodiment, a header extension segment 130 may be added to allow additional information to be carried with the multimedia data packet header segment 120. When indexing a particular packet, the index process module 21 records the appropriate metadata in the header extension segment 130. In this case, a bit is placed in the extension field 121 to indicate the presence of the header extension segment 130. Once the multimedia data packet 100 arrives at the multicast network 10, the index process module 21 compares the data contained in the data segment 110 to face and speech models contained in the database 23. If a match is achieved, the control module 31 in the index process module 21 directs the header function module 36 to add to the multimedia data packet header segment 120, the header extension segment 130, which includes the speech or face identifiers, as appropriate. When the data in the data segment 110 indicates a background change or a key scene event, the index process module 21 adds a corresponding indication to the header extension segment 130. [0048]
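The step of attaching the header extension segment 130 and setting the extension bit 121 can be illustrated with the following sketch. The function name `attach_extension`, the dictionary packet layout, and the metadata keys are assumptions, not part of the patent.

```python
# Illustrative sketch: on a recognition match, a header extension segment
# carrying the identifiers is attached and the extension bit is set.
def attach_extension(packet, speech_id=None, face_id=None, scene_event=None):
    """Add a header extension segment (130) and set the extension bit (121)."""
    metadata = {}
    if speech_id is not None:
        metadata["speech_id"] = speech_id
    if face_id is not None:
        metadata["face_id"] = face_id
    if scene_event is not None:
        metadata["scene_event"] = scene_event
    if metadata:
        packet["extension_bit"] = 1     # field 121
        packet["ext_header"] = metadata  # segment 130
    return packet
```

Packets with no match are returned unmodified, matching the behavior described for unindexed packets.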
  • In the preferred embodiment, the multimedia data packet 100 is then stored in the database 23. Alternatively, the data in the data segment 110 may be separated from the multimedia data packet header segment 120. In that case, the index process module 21 creates a separate file that links the data in the multimedia data packet header segment 120 to the data contained in the data segment 110. [0049]
  • FIG. 9 is a logical representation of an index table for a multimedia conference call. In FIG. 9, the index process module 21 is receiving the multimedia conference call described above. When a participant such as participant P1 is speaking, the header function module 36 adds to the multimedia data packet a header extension segment 130 that includes an index/ID for participant P1. The index process module 21 then stores the resulting multimedia data packet in the database 23. Thus, as shown in FIG. 9, for participant P1 at location S1, the database 23 stores a speaker ID model VO-102 and a video model VI-356. Then, by specifying the participant P1 (associated with VO-102 and VI-356), the corresponding data packets can be retrieved from the database 23 and their contents reviewed. [0050]
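The index table of FIG. 9 can be sketched as a mapping from model identifiers (such as VO-102 and VI-356) to the packets tagged with them, so that specifying a participant's identifiers retrieves the corresponding packets. The helper names and dictionary layout below are hypothetical.

```python
# Illustrative sketch of the FIG. 9 index table: packets are indexed under the
# identifiers carried in their header extension segments, and a query by one
# or more identifiers returns the matching packets.
def build_index(packets):
    # Map each identifier found in a packet's extension header to that packet.
    table = {}
    for pkt in packets:
        for ident in pkt.get("ext_header", {}).values():
            table.setdefault(ident, []).append(pkt)
    return table

def retrieve(table, *identifiers):
    # Return packets carrying any of the given model IDs (e.g. VO-102, VI-356).
    result = []
    for ident in identifiers:
        result.extend(table.get(ident, []))
    return result
```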
  • To identify a particular participant based on information in a multimedia data packet, such as the multimedia data packet 100, the index process module 21 first retrieves the multimedia data packet 100 from the buffer 22. The index process module 21 reads the payload type field 122 in the data packet header segment 120 to determine if the multimedia data packet 100 contains audio data. If the multimedia data packet 100 contains audio data, the index process module 21 determines if there is a multimedia data packet from the same source with a corresponding time stamp that contains video data. With both video and audio multimedia data packets from the same source with approximately the same time stamp, the index process module 21 can then compare the audio and video data to the speech and face models contained in the database 23. For example, the face recognition module 34, containing FACEIT®, compares the digitized video image contained in the multimedia data packet 100 to the face models stored in the database 23. The speech recognition module 32 compares the digitized audio in the multimedia data packet 100 to the speech models stored in the database 23. By using both speech and face recognition features, the index process module 21 may more reliably identify a particular participant. [0051]
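The pairing of an audio packet with a video packet from the same source at approximately the same time stamp, followed by the dual speech/face check, can be sketched as below. The tolerance value, helper names, and the use of exact dictionary lookups in place of real recognition models are all assumptions for illustration.

```python
# Illustrative sketch of the dual-recognition identification described above:
# find the video packet that corresponds to an audio packet, then require the
# speech and face matches to agree before naming a participant.
def find_video_partner(audio_pkt, packets, tolerance=0.04):
    # Return a video packet from the same source whose time stamp is within
    # the tolerance (an assumed value) of the audio packet's time stamp.
    for pkt in packets:
        if (pkt["payload_type"] == "video"
                and pkt["source_id"] == audio_pkt["source_id"]
                and abs(pkt["timestamp"] - audio_pkt["timestamp"]) <= tolerance):
            return pkt
    return None

def identify_participant(audio_pkt, video_pkt, speech_models, face_models):
    # Both modalities must yield the same participant, mirroring the more
    # reliable combined speech-and-face check described in the text.
    speaker = speech_models.get(audio_pkt["payload"])
    face = face_models.get(video_pkt["payload"]) if video_pkt else None
    return speaker if speaker is not None and speaker == face else None
```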
  • FIG. 10 is a flowchart representing the process of indexing multimedia communications in accordance with the multicast network 10 shown in FIG. 1. The index process module 21 starts with step S100. In step S110, the index process module 21 queries all participants to register their face and speech features. If all participants are registered, the index process module 21 moves to step S150. Otherwise, the index process module 21 moves to step S120. In step S120, individual participants register their face and speech features with the index process module 21. The index process module 21 then moves to step S150. [0052]
  • In step S150, the index process module 21 stores the multimedia communications from the sources S1, S2 and S3 in the buffer 22. That is, the index process module 21 stores the multimedia data packets containing the audio, video and data communications from the sources S1, S2 and S3. The index process module 21 then moves to step S160 and the multimedia communications end. [0053]
  • FIGS. 11A and 11B show processing of the multimedia data packets which were stored in the buffer 22 (of FIG. 2) during the multimedia communications of FIG. 10, using the multicast network of FIG. 1. In FIG. 11A, the index process module 21 processes each multimedia data packet to identify participants by face and speech patterns. The index process module 21 starts at step S200. In step S210, the index process module 21 selects a multimedia data packet for indexing. The index process module 21 then moves to step S220. In step S220, the index process module 21 reads the payload type field 122 and source identifier field 123. The index process module 21 thus determines the source of the multimedia data packet and the type of data. If the multimedia data packet contains audio data, the index process module 21 moves to step S230. Otherwise the index process module 21 jumps to step S280. [0054]
  • In step S230, the index process module 21 notes the time stamp of the multimedia data packet and determines if there are any corresponding video multimedia data packets from the same source with approximately the same time stamp. If there are, the index process module 21 moves to step S240. Otherwise, the index process module 21 jumps to step S250. In step S240, the index process module 21 retrieves the corresponding video multimedia data packet identified in step S230. The index process module 21 then moves to step S250. [0055]
  • In step S250, the index process module 21 compares the audio and video data contained in the multimedia data packets to the face and speech models, stored in the database 23, for the source identified by the source identifier field of FIG. 8. The index process module 21 then moves to step S260. In step S260, the index process module 21 determines if there is a pattern match between the audio and video data contained in the multimedia data packets and the face and speech models. If there is a match, the index process module 21 moves to step S270. Otherwise the index process module 21 moves to step S300. [0056]
  • In step S270, the index process module 21 creates a variable length header extension segment and attaches the segment to the multimedia data packet header. The index process module 21 places a bit in the extension field 121 to indicate the existence of the header extension segment. The index process module 21 also populates the header extension segment with data to indicate the identity of the participant. The index process module 21 then stores the multimedia data packet in the database 23 (as detailed in FIG. 11B). The index process module 21 then moves to step S280. [0057]
  • In step S280, the index process module 21 determines if the multimedia data packet selected in step S210 contains video data. If the multimedia data packet contains video data, the index process module 21 moves to step S290. Otherwise the index process module 21 moves to step S300. In step S290, the index process module 21 determines if the multimedia data packet contains background change or key scene change data. If the multimedia data packet contains the data, the index process module 21 moves to step S310. Otherwise the index process module 21 moves to step S300. [0058]
  • In step S310, the index process module 21 creates a variable length header extension segment and attaches the segment to the multimedia data packet header. The index process module 21 places a bit in the extension field 121 to indicate the existence of the header extension segment. The index process module 21 also populates the header extension segment with data to indicate the identity of the change event. The index process module 21 then stores the multimedia data packet in the database 23. The index process module 21 then moves to step S320. [0059]
  • In step S300, the index process module 21 stores the multimedia data packet in the database 23 without the header extension segment and without the bit in the extension field. The index process module 21 then returns to step S210. [0060]
  • In step S320, the index process module 21 determines if all the multimedia data packets for the multimedia communication have been indexed. If all multimedia data packets have not been indexed, the index process module 21 returns to step S210. Otherwise the index process module 21 moves to step S330. In step S330, the index process module 21 creates an index table that associates the identity of each participant with corresponding multimedia data packets and indicates, for each multimedia data packet, the appropriate face and speech model for the participant. For multimedia data packets that contain background and scene change data, the index table includes a reference to the change event. The index process module 21 then moves to step S340 and ends processing of the multimedia data packets. [0061]
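The overall loop of FIGS. 11A and 11B, from selecting each buffered packet through building the final index table, can be condensed into the following sketch. The data layout, the `models` mapping, and the function name are assumptions; real recognition replaces the exact-match lookup used here.

```python
# Illustrative sketch of the flow of FIGS. 11A/11B: each buffered packet is
# examined, a header extension is attached on a model match, and an index
# table is built once all packets have been processed.
def index_communication(packets, models):
    index_table = {}
    for pkt in packets:                                          # step S210
        # Compare payload to the models for this source (steps S250/S260).
        ident = models.get((pkt["source_id"], pkt["payload"]))
        if ident is not None:                                    # steps S270/S310
            pkt["extension_bit"] = 1
            pkt["ext_header"] = {"id": ident}
            index_table.setdefault(ident, []).append(pkt)
        # Step S300: packets without a match are stored unmodified.
    return index_table                                           # step S330
```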
  • In the illustrated embodiments, suitably programmed general purpose computers control data processing in the multicast network 10 and at the sources. However, the processing functions could also be implemented using a single purpose integrated circuit (e.g., an ASIC) having a main or central processor section for overall, system-level control, and separate circuits dedicated to performing various specific computational and other processes under control of the central processor section. The processing can also be implemented using separate dedicated or programmable integrated electronic circuits or devices (e.g., hardwired electronic or logical devices). In general, any device or assembly of devices capable of implementing a finite state machine that can, in turn, implement the flowcharts of FIGS. 10, 11A and 11B can be used to control data processing. [0062]
  • The invention has been described with reference to the preferred embodiments thereof, which are illustrative and not limiting. Various changes may be made without departing from the spirit and scope of the invention as defined in the following claims. [0063]

Claims (21)

What is claimed is:
1. A method for indexing a multimedia communication, comprising:
receiving the multimedia communication, the multimedia communication including a plurality of multimedia data packets;
processing the plurality of multimedia data packets to identify distinguishing features; and
indexing the plurality of multimedia data packets based on the identified distinguishing features, wherein the processing step comprises associating each of the plurality of multimedia data packets with one of a plurality of objects within the multimedia communication.
2. The method of claim 1, wherein the distinguishing features are based on at least one of audio data and video data of the multimedia communication.
3. The method of claim 1, further comprising rebroadcasting the processed plurality of multimedia data packets.
4. The method of claim 1, wherein the objects include at least one of a person, an animal, a plant, and an inanimate object.
5. The method of claim 4, wherein a multimedia data packet of the plurality of multimedia data packets includes a payload having one of the audio and the video data that corresponds to an object, the associating step attaching a header identifier that identifies the object.
6. The method of claim 5, wherein the person is a speaker participating in the multimedia communication, audio speech patterns and video statistical sampling of a face of the speaker being a portion of the distinguishing features, the speaker being associated with the multimedia data packet if the portion of the distinguishing features is included in the payload of the multimedia data packet.
7. The method of claim 5, further comprising:
identifying background changes in the video data;
identifying key scene events based on the video data; and
attaching a second header identifier to each multimedia data packet containing the background change and the key scene event, the second header identifier identifying the multimedia data packet as containing a background change and a key scene event.
8. The method of claim 7, wherein the multimedia communication is a multicast multimedia communication, the rebroadcasting step including multicasting the processed plurality of multimedia data packets.
9. The method of claim 8, wherein a time stamp is provided to synchronize the audio and the video data.
10. The method of claim 9, further comprising storing the indexed plurality of multimedia data packets, wherein the indexed plurality of multimedia data packets can be searched to retrieve audio and video multimedia data packets corresponding to selected distinguishing features.
11. The method of claim 10, wherein the indexed plurality of multimedia data packets can be searched using key words.
12. The method of claim 11, wherein the multimedia communication is conducted using a local area network.
13. The method of claim 1, wherein the indexing and the processing steps are performed at a multicast network.
14. An apparatus for indexing a multimedia communication, comprising:
a server that receives multimedia communications in multimedia data packets including audio, visual and data communications and identifies distinguishing features in the multimedia communication based on audio and video recognition and a source of the multimedia communications;
a header function module connected to the server, the header function module entering metadata in a header segment corresponding to the multimedia data packets received by the server, the metadata being related to the distinguishing features; and
a storage device that stores the multimedia data packets.
15. The apparatus of claim 14, wherein the distinguishing features include audio voice and video face patterns of participants in the multimedia communications.
16. The apparatus of claim 15, wherein the metadata includes voice and face identifiers of the participants.
17. The apparatus of claim 16, wherein the server identifies background changes in the video multimedia data packets and wherein the header function module enters second metadata in the header segment corresponding to the multimedia data packets having background changes, the second metadata including scene identifiers.
18. The apparatus of claim 17, wherein the server is a multicast server.
19. The apparatus of claim 18, wherein the server comprises audio, video and data bridges.
20. A method of identifying participants to a multimedia communication, comprising:
comparing audio speech patterns for each participant to speech models;
comparing video face patterns for each participant to face models; and
determining an identity of a particular participant when both the audio speech patterns and the video face patterns match speech and face models for the particular participant.
21. The method of claim 20, further comprising creating an index of the participants based on identification of the speech and face patterns of the participants, the index being used to segment the multimedia communication.
US09/025,940 1998-02-19 1998-02-19 Indexing multimedia communications Expired - Lifetime US6377995B2 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US09/025,940 US6377995B2 (en) 1998-02-19 1998-02-19 Indexing multimedia communications


Publications (2)

Publication Number Publication Date
US20010042114A1 true US20010042114A1 (en) 2001-11-15
US6377995B2 US6377995B2 (en) 2002-04-23

Family

ID=21828895


Country Status (1)

Country Link
US (1) US6377995B2 (en)


Families Citing this family (74)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6564263B1 (en) * 1998-12-04 2003-05-13 International Business Machines Corporation Multimedia content description framework
JP2000354080A (en) * 1999-04-09 2000-12-19 Komatsu Ltd Method for controlling communication between electronic devices, construction machine using the same, and electronic circuit for construction machine
JP2001014052A (en) * 1999-06-25 2001-01-19 Toshiba Comput Eng Corp Individual authenticating method of computer system, computer system, and recording medium
GB2354105A (en) * 1999-09-08 2001-03-14 Sony Uk Ltd System and method for navigating source content
US7149359B1 (en) * 1999-12-16 2006-12-12 Microsoft Corporation Searching and recording media streams
US6704769B1 (en) * 2000-04-24 2004-03-09 Polycom, Inc. Media role management in a video conferencing network
US7853664B1 (en) 2000-07-31 2010-12-14 Landmark Digital Services Llc Method and system for purchasing pre-recorded music
US20020059629A1 (en) * 2000-08-21 2002-05-16 Markel Steven O. Detection and recognition of data receiver to facilitate proper transmission of enhanced data
JP2004509490A (en) * 2000-08-25 2004-03-25 インテロシティー ユーエスエイ,アイエヌシー.Intellocity Usa, Inc. Personal remote control
US20020057286A1 (en) * 2000-08-25 2002-05-16 Markel Steven O. Device independent video enhancement scripting language
US7716358B2 (en) 2000-09-12 2010-05-11 Wag Acquisition, Llc Streaming media buffering system
US6766376B2 (en) 2000-09-12 2004-07-20 Sn Acquisition, L.L.C Streaming media buffering system
US8595372B2 (en) * 2000-09-12 2013-11-26 Wag Acquisition, Llc Streaming media buffering system
EP1350392B1 (en) * 2000-10-24 2009-01-14 Aol Llc Method of sizing an embedded media player page
US8122236B2 (en) 2001-10-24 2012-02-21 Aol Inc. Method of disseminating advertisements using an embedded media player page
US7987280B1 (en) * 2000-10-27 2011-07-26 Realnetworks, Inc. System and method for locating and capturing desired media content from media broadcasts
FR2816157A1 (en) * 2000-10-31 2002-05-03 Thomson Multimedia Sa Method for processing video data distinees for viewing on screen and device embodying the METHOD
US20040064500A1 (en) * 2001-11-20 2004-04-01 Kolar Jennifer Lynn System and method for unified extraction of media objects
US6842761B2 (en) 2000-11-21 2005-01-11 America Online, Inc. Full-text relevancy ranking
US7170886B1 (en) 2001-04-26 2007-01-30 Cisco Technology, Inc. Devices, methods and software for generating indexing metatags in real time for a stream of digitally stored voice data
US6662176B2 (en) * 2001-05-07 2003-12-09 Hewlett-Packard Development Company, L.P. Database indexing and rolling storage method for time-stamped normalized event data
US20020169893A1 (en) * 2001-05-09 2002-11-14 Li-Han Chen System and method for computer data synchronization
US7016901B2 (en) * 2001-07-31 2006-03-21 Ideal Scanners & Systems, Inc. System and method for distributed database management of graphic information in electronic form
US20030046705A1 (en) * 2001-08-28 2003-03-06 Sears Michael E. System and method for enabling communication between video-enabled and non-video-enabled communication devices
US7728870B2 (en) * 2001-09-06 2010-06-01 Nice Systems Ltd Advanced quality management and recording solutions for walk-in environments
JP4549610B2 (en) * 2001-11-08 2010-09-22 ソニー株式会社 Communication system, a communication method, transmission apparatus and method, receiving apparatus and method, and program
JP2003162506A (en) * 2001-11-22 2003-06-06 Sony Corp Network information processing system, information- providing management apparatus, information-processing apparatus and information-processing method
US20040052418A1 (en) * 2002-04-05 2004-03-18 Bruno Delean Method and apparatus for probabilistic image analysis
US7369685B2 (en) * 2002-04-05 2008-05-06 Identix Corporation Vision-based operating method and system
US20030231746A1 (en) * 2002-06-14 2003-12-18 Hunter Karla Rae Teleconference speaker identification
US20040003394A1 (en) * 2002-07-01 2004-01-01 Arun Ramaswamy System for automatically matching video with ratings information
US7466334B1 (en) * 2002-09-17 2008-12-16 Commfore Corporation Method and system for recording and indexing audio and video conference calls allowing topic-based notification and navigation of recordings
US7158689B2 (en) * 2002-11-25 2007-01-02 Eastman Kodak Company Correlating captured images and timed event data
US7756923B2 (en) * 2002-12-11 2010-07-13 Siemens Enterprise Communications, Inc. System and method for intelligent multimedia conference collaboration summarization
US20050013589A1 (en) * 2003-07-14 2005-01-20 Microsoft Corporation Adding recording functionality to a media player
JP4551668B2 (en) * 2004-02-25 2010-09-29 パイオニア株式会社 Minutes file generation method, proceedings file management method, the conference server and the network conferencing system
US20050206721A1 (en) * 2004-03-22 2005-09-22 Dennis Bushmitch Method and apparatus for disseminating information associated with an active conference participant to other conference participants
US7355622B2 (en) * 2004-04-30 2008-04-08 Microsoft Corporation System and process for adding high frame-rate current speaker data to a low frame-rate video using delta frames
US7355623B2 (en) * 2004-04-30 2008-04-08 Microsoft Corporation System and process for adding high frame-rate current speaker data to a low frame-rate video using audio watermarking techniques
TWI254221B (en) * 2004-05-06 2006-05-01 Lite On It Corp Method and apparatus for indexing multimedia data
JP4649944B2 (en) * 2004-10-20 2011-03-16 富士ゼロックス株式会社 Moving image processing apparatus, a moving image processing method, and program
US7752532B2 (en) * 2005-03-10 2010-07-06 Qualcomm Incorporated Methods and apparatus for providing linear erasure codes
US20070168325A1 (en) * 2006-01-13 2007-07-19 Julian Bourne System and method for workflow processing using a portable knowledge format
US7489772B2 (en) 2005-12-30 2009-02-10 Nokia Corporation Network entity, method and computer program product for effectuating a conference session
US9633356B2 (en) * 2006-07-20 2017-04-25 Aol Inc. Targeted advertising for playlists based upon search queries
US8121198B2 (en) 2006-10-16 2012-02-21 Microsoft Corporation Embedding content-based searchable indexes in multimedia files
US7995106B2 (en) * 2007-03-05 2011-08-09 Fujifilm Corporation Imaging apparatus with human extraction and voice analysis and control method thereof
KR101378372B1 (en) * 2007-07-12 2014-03-27 삼성전자주식회사 Digital image processing apparatus, method for controlling the same, and recording medium storing program to implement the method
US8170342B2 (en) 2007-11-07 2012-05-01 Microsoft Corporation Image recognition of content
US20090123035A1 (en) * 2007-11-13 2009-05-14 Cisco Technology, Inc. Automated Video Presence Detection
US9465892B2 (en) 2007-12-03 2016-10-11 Yahoo! Inc. Associating metadata with media objects using time
US20090207316A1 (en) * 2008-02-19 2009-08-20 Sorenson Media, Inc. Methods for summarizing and auditing the content of digital video
US8301444B2 (en) * 2008-12-29 2012-10-30 At&T Intellectual Property I, L.P. Automated demographic analysis by analyzing voice activity
US8260877B2 (en) * 2008-12-31 2012-09-04 Apple Inc. Variant streams for real-time or near real-time streaming to provide failover protection
US8099473B2 (en) * 2008-12-31 2012-01-17 Apple Inc. Variant streams for real-time or near real-time streaming
US8156089B2 (en) 2008-12-31 2012-04-10 Apple, Inc. Real-time or near real-time streaming with compressed playlists
US8578272B2 (en) 2008-12-31 2013-11-05 Apple Inc. Real-time or near real-time streaming
JP5619775B2 (en) * 2009-01-30 2014-11-05 Thomson Licensing Method for controlling and requesting multimedia information from a display
US9489577B2 (en) * 2009-07-27 2016-11-08 Cxense Asa Visual similarity for video content
GB201105502D0 (en) 2010-04-01 2011-05-18 Apple Inc Real time or near real time streaming
US8805963B2 (en) 2010-04-01 2014-08-12 Apple Inc. Real-time or near real-time streaming
US8560642B2 (en) 2010-04-01 2013-10-15 Apple Inc. Real-time or near real-time streaming
CN102238179B (en) 2010-04-07 2014-12-10 苹果公司 Real-time or near real-time streaming
US8395653B2 (en) 2010-05-18 2013-03-12 Polycom, Inc. Videoconferencing endpoint having multiple voice-tracking cameras
US9723260B2 (en) * 2010-05-18 2017-08-01 Polycom, Inc. Voice tracking camera with speaker identification
US8630854B2 (en) * 2010-08-31 2014-01-14 Fujitsu Limited System and method for generating videoconference transcriptions
US8949871B2 (en) 2010-09-08 2015-02-03 Opentv, Inc. Smart media selection based on viewer user presence
US8791977B2 (en) 2010-10-05 2014-07-29 Fujitsu Limited Method and system for presenting metadata during a videoconference
US8959071B2 (en) 2010-11-08 2015-02-17 Sony Corporation Videolens media system for feature selection
US8843586B2 (en) 2011-06-03 2014-09-23 Apple Inc. Playlists for real-time or near real-time streaming
US8856283B2 (en) 2011-06-03 2014-10-07 Apple Inc. Playlists for real-time or near real-time streaming
US8938393B2 (en) * 2011-06-28 2015-01-20 Sony Corporation Extended videolens media engine for audio recognition
US10304458B1 (en) 2014-03-06 2019-05-28 Board of Trustees of the University of Alabama and the University of Alabama in Huntsville Systems and methods for transcribing videos using speaker identification
US9672829B2 (en) 2015-03-23 2017-06-06 International Business Machines Corporation Extracting and displaying key points of a video conference

Family Cites Families (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5835667A (en) * 1994-10-14 1998-11-10 Carnegie Mellon University Method and apparatus for creating a searchable digital video library and a system and method of using such a library
US5666466A (en) 1994-12-27 1997-09-09 Rutgers, The State University Of New Jersey Method and apparatus for speaker recognition using selected spectral information
US5710591A (en) 1995-06-27 1998-01-20 At&T Method and apparatus for recording and indexing an audio and multimedia conference
US5835153A (en) * 1995-12-22 1998-11-10 Cirrus Logic, Inc. Software teletext decoder architecture
US6038368A (en) * 1996-02-05 2000-03-14 Sony Corporation System for acquiring, reviewing, and editing sports video segments
US5894480A (en) * 1996-02-29 1999-04-13 Apple Computer, Inc. Method and apparatus for operating a multicast system on an unreliable network
US6035304A (en) * 1996-06-25 2000-03-07 Matsushita Electric Industrial Co., Ltd. System for storing and playing a multimedia application adding variety of services specific thereto
US5928330A (en) * 1996-09-06 1999-07-27 Motorola, Inc. System, device, and method for streaming a multimedia file
US5926624A (en) * 1996-09-12 1999-07-20 Audible, Inc. Digital information library and delivery system with logic for generating files targeted to the playback device

Cited By (130)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6584509B2 (en) * 1998-06-23 2003-06-24 Intel Corporation Recognizing audio and video streams over PPP links in the absence of an announcement protocol
US6707819B1 (en) * 1998-12-18 2004-03-16 At&T Corp. Method and apparatus for the encapsulation of control information in a real-time data stream
US6539055B1 (en) * 1999-12-03 2003-03-25 Intel Corporation Scene change detector for video data
US7136577B1 (en) * 2000-06-29 2006-11-14 Tandberg Telecom As RTP-formatted media clips
US20070014545A1 (en) * 2000-06-29 2007-01-18 Falco Michael A RTP-formatted media clips
US9014532B2 (en) * 2000-06-29 2015-04-21 Cisco Technology, Inc. RTP-formatted media clips
US20080212672A1 (en) * 2001-01-30 2008-09-04 Sang-Woo Ahn Method and apparatus for delivery of metadata synchronized to multimedia contents
US20040098398A1 (en) * 2001-01-30 2004-05-20 Sang-Woo Ahn Method and apparatus for delivery of metadata synchronized to multimedia contents
US7376155B2 (en) * 2001-01-30 2008-05-20 Electronics And Telecommunications Research Institute Method and apparatus for delivery of metadata synchronized to multimedia contents
US20100265942A1 (en) * 2001-03-14 2010-10-21 At&T Intellectual Property I, L.P. Receive Device for a Cable Data Service
US8855147B2 (en) 2001-03-14 2014-10-07 At&T Intellectual Property Ii, L.P. Devices and methods to communicate data streams
US7990977B2 (en) 2001-03-14 2011-08-02 At&T Intellectual Property I, L.P. Method, system, and device for sending data in a cable data service
US10009190B2 (en) 2001-03-14 2018-06-26 At&T Intellectual Property Ii, L.P. Data service including channel group
US8000331B2 (en) 2001-03-14 2011-08-16 At&T Intellectual Property Ii, L.P. Receive device for a cable data service
US20150172592A1 (en) * 2001-04-09 2015-06-18 Monitoring Technology Llc Data recording and playback system and method
US20020149672A1 (en) * 2001-04-13 2002-10-17 Clapp Craig S.K. Modular video conferencing system
US20030041162A1 (en) * 2001-08-27 2003-02-27 Hochmuth Roland M. System and method for communicating graphics images over a computer network
US7450149B2 (en) * 2002-03-25 2008-11-11 Polycom, Inc. Conferencing system with integrated audio driver and network interface device
US20040012669A1 (en) * 2002-03-25 2004-01-22 David Drell Conferencing system with integrated audio driver and network interface device
US20030187652A1 (en) * 2002-03-27 2003-10-02 Sony Corporation Content recognition system for indexing occurrences of objects within an audio/video data stream to generate an index database corresponding to the content data stream
WO2004046845A2 (en) 2002-11-20 2004-06-03 Nokia Corporation System and method for data transmission and reception
WO2004046845A3 (en) * 2002-11-20 2004-10-21 Nokia Corp System and method for data transmission and reception
US20060173921A1 (en) * 2002-11-20 2006-08-03 Esa Jalonen System and method for data transmission and reception
KR100746190B1 (en) * 2002-11-20 2007-08-03 노키아 인크 System and method for data transmission and reception
GB2415800A (en) * 2004-07-01 2006-01-04 School Pictures Internat Ltd Image correlation apparatus
US8752106B2 (en) 2004-09-23 2014-06-10 Smartvue Corporation Mesh networked video and sensor surveillance system and method for wireless mesh networked sensors
US20060072013A1 (en) * 2004-09-23 2006-04-06 Martin Renkis Wireless video surveillance system and method with two-way locking of input capture devices
US20060192675A1 (en) * 2004-09-23 2006-08-31 Renkis Martin A Enterprise video intelligence and analytics management system and method
US20060251259A1 (en) * 2004-09-23 2006-11-09 Martin Renkis Wireless surveillance system releasably mountable to track lighting
US20060143672A1 (en) * 2004-09-23 2006-06-29 Martin Renkis Wireless video surveillance processing negative motion
US8457314B2 (en) 2004-09-23 2013-06-04 Smartvue Corporation Wireless video surveillance system and method for self-configuring network
US20070009104A1 (en) * 2004-09-23 2007-01-11 Renkis Martin A Wireless smart camera system and method
US20060064477A1 (en) * 2004-09-23 2006-03-23 Renkis Martin A Mesh networked video and sensor surveillance system and method for wireless mesh networked sensors
US7821533B2 (en) 2004-09-23 2010-10-26 Smartvue Corporation Wireless video surveillance system and method with two-way locking of input capture devices
US20070064109A1 (en) * 2004-09-23 2007-03-22 Renkis Martin A Wireless video surveillance system and method for self-configuring network
US8750509B2 (en) 2004-09-23 2014-06-10 Smartvue Corporation Wireless surveillance system releasably mountable to track lighting
US20070199032A1 (en) * 2004-09-23 2007-08-23 Renkis Martin A Wireless surveillance system releasably mountable to track lighting
US7730534B2 (en) 2004-09-23 2010-06-01 Smartvue Corporation Enterprise video intelligence and analytics management system and method
US20060066729A1 (en) * 2004-09-24 2006-03-30 Martin Renkis Wireless video surveillance system and method with DVR-based querying
US20060066720A1 (en) * 2004-09-24 2006-03-30 Martin Renkis Wireless video surveillance system and method with external removable recording
US7719567B2 (en) 2004-09-24 2010-05-18 Smartvue Corporation Wireless video surveillance system and method with emergency video access
US20060070107A1 (en) * 2004-09-24 2006-03-30 Martin Renkis Wireless video surveillance system and method with remote viewing
US7508418B2 (en) 2004-09-24 2009-03-24 Smartvue Corporation Wireless video surveillance system and method with DVR-based querying
US7954129B2 (en) 2004-09-24 2011-05-31 Smartvue Corporation Wireless video surveillance system and method with remote viewing
US8208019B2 (en) 2004-09-24 2012-06-26 Martin Renkis Wireless video surveillance system and method with external removable recording
US20060072757A1 (en) * 2004-09-24 2006-04-06 Martin Renkis Wireless video surveillance system and method with emergency video access
US20060066721A1 (en) * 2004-09-25 2006-03-30 Martin Renkis Wireless video surveillance system and method with dual encoding
US7936370B2 (en) 2004-09-25 2011-05-03 Smartvue Corporation Wireless video surveillance system and method with dual encoding
US10198923B2 (en) 2004-09-30 2019-02-05 Sensormatic Electronics, LLC Wireless video surveillance system and method with input capture and data transmission prioritization and adjustment
US8199195B2 (en) 2004-09-30 2012-06-12 Martin Renkis Wireless video surveillance system and method with security key
US9407877B2 (en) 2004-09-30 2016-08-02 Kip Smrt P1 Lp Wireless video surveillance system and method with input capture and data transmission prioritization and adjustment
US9544547B2 (en) 2004-09-30 2017-01-10 Kip Smrt P1 Lp Monitoring smart devices on a wireless mesh communication network
US10152860B2 (en) 2004-09-30 2018-12-11 Sensormatic Electronics, LLC Monitoring smart devices on a wireless mesh communication network
US20060075235A1 (en) * 2004-09-30 2006-04-06 Martin Renkis Wireless video surveillance system and method with security key
US20060075065A1 (en) * 2004-09-30 2006-04-06 Renkis Martin A Wireless video surveillance system and method with single click-select actions
US20060071779A1 (en) * 2004-09-30 2006-04-06 Martin Renkis Wireless video surveillance system & method with input capture and data transmission prioritization and adjustment
US20060070108A1 (en) * 2004-09-30 2006-03-30 Martin Renkis Wireless video surveillance system & method with digital input recorder interface and setup
US20060070109A1 (en) * 2004-09-30 2006-03-30 Martin Renkis Wireless video surveillance system & method with rapid installation
US7728871B2 (en) 2004-09-30 2010-06-01 Smartvue Corporation Wireless video surveillance system & method with input capture and data transmission prioritization and adjustment
US7784080B2 (en) 2004-09-30 2010-08-24 Smartvue Corporation Wireless video surveillance system and method with single click-select actions
US8253796B2 (en) 2004-09-30 2012-08-28 Smartvue Corp. Wireless video surveillance system and method with rapid installation
US8126895B2 (en) * 2004-10-07 2012-02-28 Computer Associates Think, Inc. Method, apparatus, and computer program product for indexing, synchronizing and searching digital data
US20060080303A1 (en) * 2004-10-07 2006-04-13 Computer Associates Think, Inc. Method, apparatus, and computer program product for indexing, synchronizing and searching digital data
US10304301B2 (en) 2004-10-29 2019-05-28 Sensormatic Electronics, LLC Wireless environmental data capture system and method for mesh networking
US10194119B1 (en) 2004-10-29 2019-01-29 Sensormatic Electronics, LLC Wireless environmental data capture system and method for mesh networking
US20060095539A1 (en) * 2004-10-29 2006-05-04 Martin Renkis Wireless video surveillance system and method for mesh networking
US10115279B2 (en) 2004-10-29 2018-10-30 Sensormatic Electronics, LLC Surveillance monitoring systems and methods for remotely viewing data and controlling cameras
US7907164B2 (en) * 2005-05-02 2011-03-15 Lifesize Communications, Inc. Integrated videoconferencing system
US20070009114A1 (en) * 2005-05-02 2007-01-11 Kenoyer Michael L Integrated videoconferencing system
US20060282265A1 (en) * 2005-06-10 2006-12-14 Steve Grobman Methods and apparatus to perform enhanced speech to text processing
US20090275287A1 (en) * 2005-08-12 2009-11-05 Renkis Martin A Wireless video surveillance jamming and interface prevention
US7603087B1 (en) 2005-08-12 2009-10-13 Smartvue Corporation Wireless video surveillance jamming and interface prevention
US20070196032A1 (en) * 2006-02-17 2007-08-23 Sony Corporation Compressible earth mover's distance
US7602976B2 (en) 2006-02-17 2009-10-13 Sony Corporation Compressible earth mover's distance
US20070223682A1 (en) * 2006-03-23 2007-09-27 Nokia Corporation Electronic device for identifying a party
US20140221046A1 (en) * 2006-03-23 2014-08-07 Core Wireless Licensing S.A.R.L. Electronic device for identifying a party
US8724785B2 (en) * 2006-03-23 2014-05-13 Core Wireless Licensing S.A.R.L. Electronic device for identifying a party
US9083786B2 (en) * 2006-03-23 2015-07-14 Core Wireless Licensing S.A.R.L Electronic device for identifying a party
US7577684B2 (en) 2006-04-04 2009-08-18 Sony Corporation Fast generalized 2-Dimensional heap for Hausdorff and earth mover's distance
US20070233733A1 (en) * 2006-04-04 2007-10-04 Sony Corporation Fast generalized 2-Dimensional heap for Hausdorff and earth mover's distance
WO2007116281A1 (en) * 2006-04-10 2007-10-18 Nokia Corporation Method for utilizing speaker recognition in content management
US20080074554A1 (en) * 2006-04-21 2008-03-27 Lg Electronic Inc. Apparatus for transmitting broadcast signal, method thereof, method of producing broadcast signal and apparatus for receiving broadcast signal
US8531607B2 (en) * 2006-04-21 2013-09-10 Lg Electronics Inc. Apparatus for transmitting broadcast signal, method thereof, method of producing broadcast signal and apparatus for receiving broadcast signal
US20080005184A1 (en) * 2006-06-30 2008-01-03 Nokia Corporation Method and Apparatus for the Synchronization and Storage of Metadata
US7725431B2 (en) * 2006-06-30 2010-05-25 Nokia Corporation Method and apparatus for the synchronization and storage of metadata
US8395664B2 (en) 2006-09-13 2013-03-12 Smartvue Corp. Wireless surveillance system and method for 3-D visualization and user-controlled analytics of captured data
US20090225164A1 (en) * 2006-09-13 2009-09-10 Renkis Martin A Wireless smart camera system and method for 3-D visualization of surveillance
US9008117B2 (en) 2007-02-20 2015-04-14 The Invention Science Fund I, Llc Cross-media storage coordination
US20080201389A1 (en) * 2007-02-20 2008-08-21 Searete, Llc Cross-media storage coordination
US9008116B2 (en) 2007-02-20 2015-04-14 The Invention Science Fund I, Llc Cross-media communication coordination
US7860887B2 (en) 2007-02-20 2010-12-28 The Invention Science Fund I, Llc Cross-media storage coordination
US9760588B2 (en) 2007-02-20 2017-09-12 Invention Science Fund I, Llc Cross-media storage coordination
US20080198844A1 (en) * 2007-02-20 2008-08-21 Searete, Llc Cross-media communication coordination
US20080250066A1 (en) * 2007-04-05 2008-10-09 Sony Ericsson Mobile Communications Ab Apparatus and method for adding contact information into a contact list
WO2008122836A1 (en) * 2007-04-05 2008-10-16 Sony Ericsson Mobile Communications Ab Apparatus and method for adding contact information into a contact list during a call
WO2010071442A1 (en) * 2008-12-15 2010-06-24 Tandberg Telecom As Method for speeding up face detection
US8390669B2 (en) 2008-12-15 2013-03-05 Cisco Technology, Inc. Device and method for automatic participant identification in a recorded multimedia stream
EP2380349A4 (en) * 2008-12-15 2012-06-27 Cisco Systems Int Sarl Method for speeding up face detection
EP2380349A1 (en) * 2008-12-15 2011-10-26 Tandberg Telecom AS Method for speeding up face detection
US20100149305A1 (en) * 2008-12-15 2010-06-17 Tandberg Telecom As Device and method for automatic participant identification in a recorded multimedia stream
US8937889B2 (en) * 2009-12-22 2015-01-20 Motorola Solutions, Inc. Decoupled cascaded mixers architecture and related methods
US20120188914A1 (en) * 2009-12-22 2012-07-26 Motorola Solutions, Inc. Decoupled cascaded mixers architecture and related methods
US8773490B2 (en) * 2010-05-28 2014-07-08 Avaya Inc. Systems, methods, and media for identifying and selecting data images in a video stream
US20110292164A1 (en) * 2010-05-28 2011-12-01 Radvision Ltd. Systems, methods, and media for identifying and selecting data images in a video stream
EP2577964A4 (en) * 2010-05-28 2015-06-03 Avaya Inc Systems, methods, and media for identifying and selecting data images in a video stream
CN102402382A (en) * 2010-09-07 2012-04-04 索尼公司 Information processing device and information processing method
US8842890B2 (en) 2010-09-07 2014-09-23 Sony Corporation Method and device for detecting a gesture from a user and for performing desired processing in accordance with the detected gesture
EP2426621A3 (en) * 2010-09-07 2014-05-14 Sony Corporation Information processing device and information processing method
US20130147897A1 (en) * 2010-09-10 2013-06-13 Shigehiro Ichimura Mobile terminal, remote operation system, data transmission control method by mobile terminal, and non-transitory computer readable medium
US9313450B2 (en) * 2010-09-10 2016-04-12 Nec Corporation Mobile terminal, remote operation system, data transmission control method by mobile terminal, and non-transitory computer readable medium
US9055332B2 (en) 2010-10-26 2015-06-09 Google Inc. Lip synchronization in a video conference
EP2448265A1 (en) * 2010-10-26 2012-05-02 Google, Inc. Lip synchronization in a video conference
US8687076B2 (en) 2010-12-23 2014-04-01 Samsung Electronics Co., Ltd. Moving image photographing method and moving image photographing apparatus
GB2486793B (en) * 2010-12-23 2017-12-20 Samsung Electronics Co Ltd Moving image photographing method and moving image photographing apparatus
GB2486793A (en) * 2010-12-23 2012-06-27 Samsung Electronics Co Ltd Identifying a speaker via mouth movement and generating a still image
US20140129676A1 (en) * 2011-06-28 2014-05-08 Nokia Corporation Method and apparatus for live video sharing with multimodal modes
US9210302B1 (en) 2011-08-10 2015-12-08 Google Inc. System, method and apparatus for multipoint video transmission
US8972262B1 (en) * 2012-01-18 2015-03-03 Google Inc. Indexing and search of content in recorded group communications
WO2013133828A1 (en) * 2012-03-08 2013-09-12 Hewlett-Packard Development Company, L.P. Data sampling deduplication
US8917309B1 (en) 2012-03-08 2014-12-23 Google, Inc. Key frame distribution in video conferencing
US9872051B2 (en) * 2012-04-25 2018-01-16 Samsung Electronics Co., Ltd. Method and apparatus for transceiving data for multimedia transmission system
US20150089560A1 (en) * 2012-04-25 2015-03-26 Samsung Electronics Co., Ltd. Method and apparatus for transceiving data for multimedia transmission system
US10219012B2 (en) 2012-04-25 2019-02-26 Samsung Electronics Co., Ltd. Method and apparatus for transceiving data for multimedia transmission system
WO2013170212A1 (en) * 2012-05-11 2013-11-14 Cisco Technology, Inc. System and method for joint speaker and scene recognition in a video/audio processing environment
US20130300939A1 (en) * 2012-05-11 2013-11-14 Cisco Technology, Inc. System and method for joint speaker and scene recognition in a video/audio processing environment
EP2677743A1 (en) * 2012-06-19 2013-12-25 BlackBerry Limited Method and apparatus for identifying an active participant in a conferencing event
US9386273B1 (en) 2012-06-27 2016-07-05 Google Inc. Video multicast engine
US9058806B2 (en) 2012-09-10 2015-06-16 Cisco Technology, Inc. Speaker segmentation and recognition based on list of speakers
US8886011B2 (en) 2012-12-07 2014-11-11 Cisco Technology, Inc. System and method for question detection based video segmentation, search and collaboration in a video processing environment
US9609275B2 (en) 2015-07-08 2017-03-28 Google Inc. Single-stream transmission method for multi-user video conferencing

Also Published As

Publication number Publication date
US6377995B2 (en) 2002-04-23

Similar Documents

Publication Publication Date Title
Hardman et al. Successful multiparty audio communication over the Internet
US7474633B2 (en) Method for forwarding and storing session packets according to preset and/or dynamic rules
US5946386A (en) Call management system with call control from user workstation computers
US6963353B1 (en) Non-causal speaker selection for conference multicast
US6498791B2 (en) Systems and methods for multiple mode voice and data communications using intelligently bridged TDM and packet buses and methods for performing telephony and data functions using the same
US8559469B1 (en) System and method for voice transmission over network protocols
US8068592B2 (en) Intelligent switching system for voice and data
JP4157664B2 (en) Method for setting up a communication session, device, and communication system
US6850609B1 (en) Methods and apparatus for providing speech recording and speech transcription services
US7007098B1 (en) Methods of controlling video signals in a video conference
JP4205310B2 (en) Rule-based multimedia customer/company interaction network operating system
EP0721725B1 (en) Multimedia collaboration system
US7046780B2 (en) Efficient buffer allocation for current and predicted active speakers in voice conferencing systems
US7039675B1 (en) Data communication control apparatus and method adapted to control distribution of data corresponding to various types of a plurality of terminals
US6775247B1 (en) Reducing multipoint conferencing bandwidth
EP1247386B1 (en) Digital recording in an ip based distributed switching platform
US20040001479A1 (en) Systems and methods for voice and data communications including a network drop and insert interface for an external data routing resource
EP1628480A2 (en) Telecommunications system
US6226361B1 (en) Communication method, voice transmission apparatus and voice reception apparatus
CA2227173C (en) Method and apparatus for recording and indexing an audio and multimedia conference
US6100882A (en) Textual recording of contributions to audio conference using speech recognition
US8407287B2 (en) Systems, methods, and media for identifying and associating user devices with media cues
US6298129B1 (en) Teleconference recording and playback system and associated method
US7130403B2 (en) System and method for enhanced multimedia conference collaboration
US6038295A (en) Apparatus and method for recording, communicating and administering digital images

Legal Events

Date Code Title Description
AS Assignment

Owner name: AT&T CORP., NEW YORK

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:AGRAHARAM, SANJAY;MARKOWITZ, ROBERT EDWARD;ROSEN, KENNETH H.;AND OTHERS;REEL/FRAME:009028/0265;SIGNING DATES FROM 19980204 TO 19980217

STCF Information on status: patent grant

Free format text: PATENTED CASE

FPAY Fee payment

Year of fee payment: 4

FPAY Fee payment

Year of fee payment: 8

FPAY Fee payment

Year of fee payment: 12