MXPA98010440A - System and method for associating multimedia objects

System and method for associating multimedia objects

Info

Publication number
MXPA98010440A
MXPA/A/1998/010440A
Authority
MX
Mexico
Prior art keywords
video
cases
streams
audio
group
Prior art date
Application number
MXPA/A/1998/010440A
Other languages
Spanish (es)
Inventor
David Gray Boyer
Original Assignee
Bell Communications Research Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Bell Communications Research Inc
Publication of MXPA98010440A

Links

Abstract

A video conference system (30) and method using a central multimedia bridge (32) to combine multimedia signals from a plurality of conference participants (34-37) into an individual composite signal for each participant. The system gives each conference participant the ability to customize his or her individual display of the other participants, including keying selected portions of the display in and out, overlaying displayed images, and identifying individual images within a composite video stream for click-and-drag operations or the like. The system uses a chain of video composition modules that can be extended as necessary to combine video signal streams from any number of real-time conference participants. Multimedia association software is provided to associate different types of media to improve display and manipulation capabilities for multimedia applications. The system also allows each user to dynamically change who may receive the information he or she provides to the conference.

Description

SYSTEM AND METHOD FOR ASSOCIATING MULTIMEDIA OBJECTS

BACKGROUND OF THE INVENTION

Field of the Invention

The present invention relates to the association of multimedia objects. More specifically, the invention relates to a system and method for associating multimedia objects to improve display and manipulation capabilities for multimedia applications, such as, for example, real-time video conferencing.
DESCRIPTION OF THE RELATED ART

Video teleconferencing occurs when people in different locations send voice and video data to each other in order to simulate having all the participants present in a single room. Each person in a multi-point conference expects to see all or most of the other participants. Accordingly, the various video streams are presented to each participant in a spatially separated manner, either on separate screens or in separate areas of an individual video screen. Each of the video conference terminals sends a locally generated video image to each of the other participating terminals and receives a video image from each of the other participants. In the prior art, this meant that for a three-way conference, six video streams must be transmitted; for a five-way conference, twenty video streams must be transmitted; and for a conference of eight participants, fifty-six video streams must be transmitted. In general, if N people are holding a televideo conference, then N x (N - 1) transmission channels must be provided. Accordingly, the relatively large number of channels used for a video teleconference comprising multiple participants can be prohibitive with the prior art systems. In addition, participants must have a sufficient number of input channels, decoders, and translators (if different video formats are transmitted) to receive and display the multiple images of the different participants. Therefore, the required number of channels, decoders and/or translators also becomes prohibitive. With the prior art systems, the video conference participants were unable to customize their video screens by keying in or out portions of the displayed visual image, by overlapping the images of the various participants in a natural-looking way, or by placing and sizing images as they like. Also, participants were unable to associate the video images with other multimedia objects to improve the variety of conference functions that can be enjoyed.
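To make the channel count concrete, the fully meshed prior-art arrangement can be compared with the central-bridge approach introduced below, under which each participant sends one stream to the bridge and receives one composite stream back (the notation here is illustrative only, not from the patent):

\[
C_{\text{mesh}} = N(N-1), \qquad C_{\text{bridge}} \approx 2N,
\]

so that for \(N = 8\) participants the mesh requires 56 streams while the bridge requires only about 16.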
BRIEF DESCRIPTION OF THE INVENTION

It is an object of the present invention to provide a flexible, real-time video conferencing system for use by a plurality of users, in which the transmission bandwidth required of each user is minimized.
It is a further object of the present invention to provide a video conferencing system in which each participant receives only one video (and audio) stream, in the bandwidth, encoding and video standard he or she desires, from a central multimedia bridge. It is a further object of the present invention to provide a video conferencing service that gives each participant the ability to compose the video images of the other participants on a fully customized screen. It is a further object of the present invention to provide a priority-driven, indefinitely extendable video composition unit to combine any number of video signals into a single video stream. It is a further object of the present invention to provide a method for associating images of a video screen in a hierarchical manner and for associating multimedia objects together to improve video conferencing applications and other multimedia applications.
It is a further object of the present invention to allow each user to dynamically change who can receive the information he or she provides to the conference. It is a further object of the present invention to provide the ability for users to identify individual images in a composite video stream for click-and-drag operations or the like. Further objects, advantages and novel features of the invention will be set forth in the description that follows, and will become apparent to those skilled in the art upon reading this description or practicing the invention. The objects and advantages of the invention may be realized and attained by means of the appended claims. The present invention is a multi-point multimedia teleconference service with per-participant presentation control. An enhanced multimedia bridge provides feature-rich, client-controlled media mixing capabilities (primarily video and audio) for each participant. The multimedia bridge is a shared network resource that does not need to be owned by the users or co-located with them, but instead can be rented on a time-used basis. A "star" network topology is used to connect each user to the server(s). Also available at the central bridge location are encoders and decoders of different types, so that customers with different types and brands of equipment will be able to communicate with each other. The central combination eliminates the need for multiple input channels and multiple decoders at each participant's desk. Each user receives only one video stream, in the bandwidth, encoding and video standard he or she desires. All transcoding and standards conversions are accomplished in the multimedia bridge. The enhanced multimedia bridge gives each user the ability to compose a visual space for himself or herself that is different from the screens of the other conference participants. Because of this "personal" control characteristic, the present invention will be referred to as a personal presence system (PPS).
The software of the present invention controls and manages the multimedia bridge, sets up and coordinates the conference, and provides easy-to-use human interfaces. Each participant in a multimedia conference using the present invention can arrange the various video images on a screen in a way that is pleasing to him or her, and rearrange them at any time during the session. To arrange the screen, conference participants can move the video images, scale them up or down, and overlap them in a manner similar to a windowed workstation screen. A user can select any of the images that appear on his or her video screen for an operation on that image. The user's pointing device (for example, a mouse) can be used to move or resize the image, in a manner analogous to the "click and drag" operations supported by PC windowing environments. The present invention brings this unprecedented capability to the video workspace. Additionally, the various elements of each image, such as a person or a diagram, can be "keyed" into or out of the image so that the desired elements can be composited in a more natural way, not restricted by rectangular boundaries. The present invention also provides a presentation control capability that allows users to "associate" multimedia streams with each other, thereby allowing the creation of composite groups or sets of objects. The multimedia association feature can be used to provide synchronized reception of audio and video streams, for example by displaying image slides synchronized with recorded audio. A multimedia provider can use this feature to synchronize information from different servers, to deal with limitations on information storage capacity or with copyright restrictions on certain information. A user can associate different video images in order to compose a video scene. By associating the images sent by an array of cameras, a panoramic view can be generated, and panning of the panoramic view can be supported.
The association of different incoming images also allows a teleconference user to select a view of a subset of the other conferees, and provides a convenient way to access different conference images simply by panning left or right across the combined panoramic scene. In addition, a user can associate video and audio instances together so that when the size of the video instance changes, the volume of the audio instance changes, and when the location of the video instance changes, the stereo pan of the audio instance changes.

BRIEF DESCRIPTION OF THE DRAWINGS

The invention is better understood by reading the following detailed description of the preferred embodiments with reference to the drawing figures, in which like reference numerals refer to like elements throughout, and in which: Figure 1 is a schematic overview of the main components of the present invention; Figure 2 is a graphic diagram of a video conference session using the present invention; Figure 3 is a graphical view of a user station associated with the present invention; Figure 4 is an illustration of a sample video screen during a sample video session using the present invention; Figure 5 is a schematic diagram of an enhanced multimedia bridge used in the present invention; Figure 6 is a schematic diagram of the video portion of the multimedia bridge of Figure 5; Figure 7 is a schematic diagram of a video composition unit within the video bridge portion of Figure 6; Figure 8 is a schematic diagram of a video composition module within the video composition chain of Figure 7; Figure 9 is a block diagram of the software components used in the present invention; Figure 10 is an object model diagram of the client program shown in Figure 9; Figure 11 is an object model diagram of the service session program shown in Figure 9; Figure 12 is an object model diagram of the bridge manager program used in conjunction with the resource agent program shown in Figure 9; Figure 13 is a flow chart of a process for establishing a session with the multimedia bridge of the present invention; Figure 14 is a graphic diagram of a video image association using the present invention; Figure 15 is an object model diagram of the multimedia object association software architecture used in the present invention; Figure 16 is an object model diagram showing an example of the association of multimedia objects using video instance group objects; Figure 17 is an object model diagram showing an example of the association of video objects with associated video and audio instances; and Figure 18 is a graphical diagram illustrating a process for extracting a portion of the video screen using the present invention.
DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS

In describing the preferred embodiments of the present invention illustrated in the drawings, specific terminology is employed for the sake of clarity. However, the invention is not intended to be limited to the specific terminology so selected, and it is to be understood that each specific element includes all technical equivalents that operate in a similar manner to accomplish a similar purpose. With reference to Figure 1, a real-time video conferencing system 30 includes an enhanced multimedia bridge (AMB) 32 and a plurality of user stations 34-37 that are connected to the AMB 32. The connections between the user stations 34-37 and the AMB 32 can be any of a variety of conventional electrical/data connections, such as a telephone modem link, broadband ISDN, etc. Each of the user stations 34-37 transmits and receives video, audio and/or other data to and from the AMB 32. The AMB 32 is configured to interconnect with a variety of conventional communication links between the user stations 34-37 and the AMB 32 and is configured to send and receive data to and from each of the user stations 34-37. Figure 2 shows a video conference session using the present invention. Each of the user stations 34-37 may contain one or more users who have a video terminal to view the teleconference, audio input and output capabilities, and/or one or more video cameras. The data from the video cameras and the audio data from the users are transmitted from each of the user stations 34-37 to the AMB 32. The AMB 32 combines and manipulates the data in a manner described in more detail hereinafter and provides a return signal to each user at the user stations 34-37. With reference to Figure 3, the user station 34 of Figure 1 is shown in more detail. The user station 34 is illustrated as having an individual user 42, a video camera 44, and an individual display station 46. The camera 44 and the display station 46 are electrically connected to the communication channel that connects the user station 34 to the AMB 32. The display station 46 has a conventional screen 48 which displays images received from the video signals of the other user stations 35-37 in a manner described in more detail hereinafter. If the user station includes a television and a set-top box, the user 42 can control the display on the screen 48 with a remote control device 49. If the user station has a PC or workstation, the user can control the display with a mouse. Although the user station 34 is shown as having one user 42, one camera 44 and one display terminal 46, it is possible for the other user stations 35-37 to have more than one user and/or more than one camera. In addition, it is possible to use a variety of terminal devices, including standalone PCs, networked workstations, and even conventional television monitors with the control software (described below) located elsewhere. The end-user terminal application will run on a set-top box or a control PC. The specific configuration of the user station 34 shown in Figure 3 is for illustrative purposes only. With reference to Figure 4, the screen 48 of Figure 3 is shown in more detail. Screen 48 includes a direct window 52 showing the other participants 54-58 of the video conference. Separate video images of each of the participants 54-58 could be provided to the AMB 32 by separate video signals from the other user stations 35-37.
Alternatively, it is possible that some of the participants 54-56 are in the same location and are therefore captured by an individual video image signal; this will occur if the participants 54-56 are actually sitting together at a single user station in the manner shown in the window 52. However, it is also possible that the image of each of the participants 54-56 comes from a separate video camera. As will be discussed in more detail hereinafter, the AMB 32 may combine the images of the various participants 54-58 in the manner shown in the direct window 52 to present the user with an individual view of the teleconference participants, creating in this way the illusion that the participants are sitting together in the teleconference. With reference to Figure 5, the schematic diagram illustrates the overall hardware architecture of the AMB 32. The AMB 32 includes network interfaces 72, 78 to handle incoming and outgoing signals from and to the user stations 34-37. A demultiplexer 73 separates the incoming signals into data, video, audio and control signals, respectively, and routes the signals to the respective data, audio and video bridges and to a control unit 76. The control unit 76 controls the functions of each of the data, audio and video bridges, based on the control signals and instructions received from the user stations 34-37. A multiplexer unit 77 multiplexes the outgoing signals from each of the bridges and the control unit 76 and sends them through a network interface 78 back to the user stations 34-37.
With reference to Figure 6, a schematic diagram illustrates the video portion (AVB) 32a of the AMB 32. The AVB 32a receives the control signals C1, C2, ... CN from each of the N users. The AVB 32a also receives the video input signals VIN1, VIN2, ... VINK from each of the K cameras located at the user stations 34-37. It is noted that, as discussed above, the number of cameras is not necessarily equal to the number of users. The AVB 32a transmits video signals VOUT1, VOUT2, ... VOUTN to the N users. In the manner discussed in more detail below, each of the video output signals is controlled by the control inputs of the corresponding user. For example, the video output signal VOUT1 could represent the video image shown in the direct window 52 of Figure 4. The user who sees the direct window 52 can control the contents and presentation of the VOUT1 video signal by providing control signals C1 to the AVB 32a, in a manner discussed in more detail below. The video input signals from the cameras are provided to the video interface and normalization unit 72a. The video interface unit 72a handles, in a conventional manner, the various communication formats provided by the connections between the AMB 32 and the user stations 34-37. The unit 72a also normalizes the color components of the input video signals, so that each picture element ("pixel") of each of the video input signals has comparable red, green and blue components. The output signals of the video interface and normalization unit 72a are standardized video input signals. A video composition unit (VCU) 74 receives the standardized video signals from the cameras and combines the signals. Also input to the VCU 74 are control signals provided by a control unit 76, which processes the user control signals C1, C2, ... CN to control the contents and display of the output of the VCU 74. The operation of the VCU 74 and the control unit 76 is described in more detail herein. The output of the VCU 74 is a plurality of standardized video signals, each of which contains a video image similar to that shown in the direct window 52 of Figure 4. The video interface and denormalization unit 78a receives the outputs of the VCU 74 and provides the output signals VOUT1, VOUT2, ... VOUTN to each of the N users. The video interface and denormalization unit 78a denormalizes the video signals to provide an appropriate video output format according to what each user desires. With reference to Figure 7, a schematic diagram illustrates the VCU 74 in detail. In order to simplify the discussion of Figure 7, the control circuitry and control inputs of the VCU 74 are omitted from the schematic representation of Figure 7. The VCU 74 is comprised of a plurality of video composition chains (VCCs) 92-94; there is one VCC for each output VOUT1, VOUT2, ... VOUTN. That is, for a system that supports N users, the VCU 74 must have at least N VCCs 92-94. The VCCs 92-94 are comprised of a plurality of video composition module (VCM) units 96-107. VCC 92 includes VCMs 96-99, VCC 93 includes VCMs 100-103, and VCC 94 includes VCMs 104-107. Each of the VCMs 96-107 is identical to each of the other VCMs 96-107. Each of the VCMs 96-107 has an input A and an input B, each of which receives a separate video signal. Each of the VCMs 96-107 superimposes the video signal at input B on the video signal at input A, in the manner described in more detail later herein.
The output is the result of the superposition of signal B on signal A. The inputs to the VCCs 92-94 are provided by switches 112-114, respectively. The inputs to the switches are the video input signals from the cameras, VIN1, VIN2, ... VINK. Control signals (not shown in Figure 7) operate the switches 112-114 to provide particular ones of the video input signals to particular inputs of the VCMs 96-107 of the VCCs 92-94. The control signals of the switches 112-114 vary according to the control inputs provided by the users. For example, if the user receiving the signal VOUT1 wishes to see a particular subset of the video input signals, the user provides the appropriate control signals to the AVB 32a. The control logic circuit (not shown in Figure 7) operates the switch 112 so that the switch provides the required video input signals to the VCMs 96-99 of the VCC 92 that supplies VOUT1. For the VCU 74 shown in Figure 7, the VCCs 92-94 are illustrated as having four VCMs each: 96-99, 100-103 and 104-107, respectively. Therefore, each of the VCCs 92-94 is able to combine five separate video images. This can be illustrated by examining VCC 92, where VCM 96 receives two video inputs and combines these inputs to provide an output. The output of the VCM 96 is provided as input A to the VCM 97, which receives another video signal at its input B and combines it with input A to provide an output to the VCM 98, which receives the combined input at its input A and a new video signal at its input B, combines these signals, and provides an output to the input A of the VCM 99. The VCM 99 receives the combined signal at its input A and a new video signal at its input B, combines the signals, and provides the output VOUT1. It is possible to build video composition chains having any number of video composition modules different from those shown in Figure 7. The maximum number of images that can be overlaid is always one greater than the number of VCMs in the VCC. Although Figure 7 shows the VCCs 92-94 each with four VCMs (96-99, 100-103, 104-107, respectively) wired together, it is possible to configure the VCU 74 so that the connections between the VCMs are themselves switched. In this way, it would be possible for a user to request a particular number of VCMs from a set of available VCMs, which would then be switched together into a customized VCC. The particular switch arrangements used may be conventional, and the implementation of these switch arrays is within the ability of one of ordinary skill in the art.
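The chaining behavior is straightforward to express in software. The following sketch (a simplification with invented names; the actual VCMs are hardware modules) models each VCM as a two-input, priority-driven combiner and a VCC as a fold over the module chain, which is why a chain of M modules can overlay M + 1 source images:

```cpp
#include <cstdint>
#include <iostream>
#include <vector>

// One pixel as carried between modules: 24-bit color plus 8-bit priority.
struct Pixel {
    uint32_t rgb;       // packed 8:8:8 red/green/blue
    uint8_t  priority;  // higher value wins at the multiplexer
};

// A VCM overlays input B on input A, pixel by pixel, keeping whichever
// pixel carries the higher priority (ties go to A here, by assumption).
Pixel vcmCombine(const Pixel& a, const Pixel& b) {
    return (b.priority > a.priority) ? b : a;
}

// A VCC is a chain of VCMs: fold each new source into the running
// composite.  M modules can therefore overlay M + 1 images.
Pixel vccCompose(const std::vector<Pixel>& sources) {
    Pixel composite = sources.front();
    for (size_t i = 1; i < sources.size(); ++i)
        composite = vcmCombine(composite, sources[i]);   // one VCM per step
    return composite;
}

int main() {
    // One pixel position seen by five cameras; the chain needs four VCMs.
    std::vector<Pixel> sources = {
        {0x0000FF, 10}, {0x00FF00, 40}, {0xFF0000, 20},
        {0xFFFF00, 30}, {0x00FFFF, 25},
    };
    Pixel out = vccCompose(sources);
    std::cout << std::hex << out.rgb << "\n";  // prints ff00: green wins
    return 0;
}
```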
The video composition chains described in Figure 7 are shown as residing in a central network bridge. It should be understood that these parts of the invention could also be used within some user stations or similar terminal equipment for some of the same purposes described herein, and therefore these parts of the invention are not limited to use in a central installation. With reference to Figure 8, the schematic diagram illustrates in detail one of the VCMs, VCM 96 of Figure 7. As discussed above, the VCMs 96-107 of Figure 7 are essentially identical and differ only in terms of the inputs provided to them. The VCM 96 combines the video data from input A with the video data from input B. For each picture element position in the output frame, a picture element of data from either input A or input B is transferred to the output. The choice of which input is transferred to the output depends on the priority assigned to each picture element in each of the input video streams A and B. For the A inputs of the VCM 96 illustrated in Figure 8, each video picture element is shown as having 24 bits (8 bits each for red, green and blue) and 8 bits for priority. Accordingly, each picture element of input A is represented as a 32-bit value. Similarly, for the B inputs, each picture element is represented by a 24-bit video signal (8 bits each for red, green and blue) and an 8-bit priority. Therefore, just as with the A inputs, each picture element of the B inputs is represented by 32 bits. The bit values discussed herein and shown in the drawings are used for purposes of illustration only and should not be taken to limit the scope of the invention. All the bit values described for the inputs and outputs of the VCM 96 can be varied without changing the invention. For example, video inputs and outputs could be 18 to 30 bits, priority/key inputs and outputs could be 6 or 10 bits, and so on. The video A inputs are provided directly to the priority-driven multiplexer 122. The video B inputs, on the other hand, are first provided to a 512K by 32-bit frame memory 124 which stores the video data and the priority data for the input video signal B. Between the priority input B and the frame memory is a flexible masking and priority generation system, described in detail below, which alters the original priority value of input B. The frame memory 124 can be used to synchronize, offset, mirror and scale the video input B with respect to the video input A. The output of the frame memory 124 is provided to the priority-driven multiplexer 122. Accordingly, the priority-driven multiplexer 122 compares the priority of each picture element of input A with the priority of the corresponding picture element of input B from the frame memory 124 and transfers the picture element having the higher priority associated with it. The priority-driven multiplexer 122 also transfers the higher of the two priorities for each picture element position of input A and input B. An input address generator 126 receives the H, V and clock signals for the video input B. The input address generator 126 stores the 24-bit video portion of each picture element of input B in the frame memory 124 without making any significant modification to the video input data B.
That is, the address generator 126 stores the 24-bit video portion of each picture element of the video input B without applying any offset, resizing or other image modification to the video input B. Therefore, the video portion of the B inputs stored in the frame memory 124 is essentially identical to that provided to the VCM 96. The 8-bit priority portion of the video inputs B is provided to a priority B mask and selector 128. A priority generator 130 also provides inputs to the priority B mask and selector 128. The operation of the priority generator 130 is described in detail later. The priority B mask and selector 128 selects certain bits of the output of the priority generator 130 and of the input priority value, and provides this output to a priority lookup table (P-LUT) 132. The P-LUT 132 is a 256 x 8 RAM (or other compatible size) that maps the 8-bit input to it into an 8-bit priority value that is stored, on a per-pixel basis, in the frame memory 124. The values for the P-LUT 132 are provided to the VCM 96 in the manner discussed in more detail below. The sizes of the P-LUT 132 and the frame memory 124 can be varied for different maximum video frame sizes, such as HDTV, and for different numbers of priority stacking levels, such as 256 (P-LUT = 256 x 8) or 64 (P-LUT = 64 x 6), without changing the invention. The priority generator 130 generates a priority value for each of the picture elements of the video input B stored in the frame memory 124. One or more pixel value manipulator sections 134 provide a priority value for each of the picture elements according to the value of the 24-bit video signal. That is, the pixel value manipulator 134 alters the priority of each picture element according to the color and brightness of that input picture element. The pixel value manipulator 134 shown has three sections, marked A, B and C. Each section outputs one priority bit, where the output bit is equal to a "1" if the picture element falls within the specified color range and is equal to a "0" if the picture element falls outside the specified color range. For example, pixel value manipulator A has six threshold values T1-T6 that are loaded with constant values in a manner described in more detail below. Pixel value manipulator A examines each picture element from the input video image B and determines whether the red portion of the picture element is between the values T1 and T2, the green portion is between the values T3 and T4, and the blue portion is between the values T5 and T6. If all of these conditions are met, that is, if the picture element has red, green and blue values that lie between T1 and T2, T3 and T4, and T5 and T6, respectively, then pixel value manipulator A outputs a "1". Otherwise, pixel value manipulator A outputs a "0". The operations of pixel value manipulator B and pixel value manipulator C are similar. Thus, each of the pixel value manipulators of the pixel value manipulator unit 134 can separately and independently provide one priority bit according to the color value of the picture element of the input video image B.
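A pixel value manipulator section is essentially six threshold comparisons ANDed together. A minimal software model of manipulator A's range test (names and values invented for illustration; the patent implements this as comparator hardware):

```cpp
#include <cstdint>
#include <iostream>

// Constant thresholds T1..T6 loaded into one manipulator section.
struct RangeThresholds {
    uint8_t t1, t2;  // red   must lie in [t1, t2]
    uint8_t t3, t4;  // green must lie in [t3, t4]
    uint8_t t5, t6;  // blue  must lie in [t5, t6]
};

// Outputs the section's single priority bit: 1 if the pixel's color
// falls inside the specified range on all three components, else 0.
int manipulatorBit(const RangeThresholds& th,
                   uint8_t r, uint8_t g, uint8_t b) {
    return (r >= th.t1 && r <= th.t2 &&
            g >= th.t3 && g <= th.t4 &&
            b >= th.t5 && b <= th.t6) ? 1 : 0;
}

int main() {
    // Example: flag pixels close to a pure-blue backdrop.
    RangeThresholds blueBackdrop{0, 60, 0, 60, 180, 255};
    std::cout << manipulatorBit(blueBackdrop, 30, 20, 220) << "\n";   // 1
    std::cout << manipulatorBit(blueBackdrop, 200, 180, 90) << "\n";  // 0
    return 0;
}
```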
The pixel value manipulator 134 may be implemented in a conventional manner using digital comparator hardware. For some purposes, it may be more useful for the three video channels to carry information in formats other than RGB (red, green, blue), such as the conventional YIQ or YUV formats. These alternative encodings can be used equally well by the pixel value manipulator and do not alter its operation, other than by changing the color space and thresholds required. The priority generator 130 also contains one or more window generation sections 136. Each window generation section 136 consists of a window generation part A, a window generation part B, and a window generation part C. Each of the parts operates independently. Each window generation part processes the H, V and clock (CLK) portions of the video input B signal and outputs a "1" bit or a "0" bit depending on the horizontal and vertical location of each of the picture elements of the video input B. For example, window generation part A can have four separate values H1, H2, V1 and V2. If the position indicated by the H input for the input video signal B is between H1 and H2, and the position indicated by the V input is between V1 and V2, then window generation part A of the window generation section 136 outputs a "1" bit. Otherwise, the window generation part outputs a "0" bit. Each of the window generation parts, window generation part A, window generation part B, and window generation part C, operates independently of the others. The window generation section 136 may be implemented in a conventional manner using digital comparator hardware. Several window generators 136 and pixel value manipulators 134, each producing one bit, can thereby define different priorities for several objects of various colors in different parts of the image. The individual output bits are treated as an 8-bit word. This word is interpreted as a numerical value and is used to address the P-LUT 132. Depending on the contents of the memory of the P-LUT 132, any input can be transformed into any numerical priority output at the full video pixel clock rate. This transformation is necessary because the multiplexer 122 passes only the highest-priority input to each picture element position. The priority generator 130 need only assign different numerical priority values to different windows or objects within the input video frame B. The P-LUT 132 then allows the client to control the ordering of these priorities. For example, the client may request, by a graphical interaction at a user station 34-37, that a particular object or window appear in its composite scene. The human interface software and the hardware control software convert that request into a reassignment of the numerical priorities attached to that area of the image, increasing the priority of the requested object or decreasing the priorities of the occluded objects. The priority generator 130 is illustrated in Figure 8 as having a pixel value manipulator section 134 with three independent pixel value manipulator parts and a window generation section 136 with three separate, independent window generation parts. The number of window generators and pixel value manipulators can be varied without changing the invention.
In addition, the number of separate parts used for each of the sections 134, 136 is a design choice based on a variety of functional factors, including the number of bits used for the priority, the number of independent parts desired, and other criteria familiar to those skilled in the art. Accordingly, the invention can be practiced with one or more pixel value manipulator sections 134 having a number of parts different from three, and with one or more window generation sections 136 having a number of independent window generation parts different from three. The 6-bit output of the priority generator 130 is provided to the priority mask and selector 128, which is also provided with the input priority signal of the video input B. Conventional control registers (not shown) determine which of the 14 input bits provided to the priority mask and selector 128 will be provided to the priority lookup table 132. Although the output of the priority mask and selector 128 is shown as an 8-bit output and, similarly, the input to the priority lookup table 132 is shown as an 8-bit input, the invention can be practiced with any number of bits transferred by the priority mask and selector 128 and input to the priority lookup table 132. The number of selected bits is a design choice based on a variety of functional factors known to those skilled in the art, including the number of desired priority levels and the amount of priority control desired. As discussed above, the priority lookup table 132 is a 256 x 8 RAM that maps the 8 bits provided by the priority mask and selector 128 to an 8-bit value that is provided to the frame memory 124. Accordingly, the priority associated with each picture element stored in the frame memory 124 is provided by the priority lookup table 132.
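Putting the pieces together, the window bits and manipulator bits form a small word that, after masking, addresses the P-LUT to yield the stored priority. The sketch below (a simplified model of the path from pixel position to final priority, using a 3-bit window word, a pass-all mask, and an 8-bit LUT; the chroma bits from the pixel value manipulators are assumed 0 here) is illustrative rather than a literal register map:

```cpp
#include <array>
#include <cstdint>
#include <iostream>

// One window generation part: bit is 1 when the pixel position lies
// inside the rectangle [h1, h2] x [v1, v2].
struct WindowPart {
    int h1, h2, v1, v2;
    int bit(int h, int v) const {
        return (h >= h1 && h <= h2 && v >= v1 && v <= v2) ? 1 : 0;
    }
};

int main() {
    // Three independent window parts, as in window generation section 136.
    std::array<WindowPart, 3> windows = {{
        {0, 319, 0, 239},      // part A: upper-left quadrant
        {100, 200, 80, 160},   // part B: an inset region
        {0, 639, 0, 479},      // part C: the whole frame
    }};

    // P-LUT: maps the assembled word to an 8-bit priority.  Loaded by the
    // control software; here every "inside part B" pattern gets raised.
    std::array<uint8_t, 256> plut{};
    for (int w = 0; w < 256; ++w)
        plut[w] = (w & 0b010) ? 200 : 50;   // bit 1 == window part B

    int h = 150, v = 100;                   // a pixel inside part B
    int word = 0;
    for (int i = 0; i < 3; ++i)
        word |= windows[i].bit(h, v) << i;  // assemble the address word

    uint8_t mask = 0xFF;                    // mask/selector: pass all bits
    std::cout << int(plut[word & mask]) << "\n";  // prints 200
    return 0;
}
```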
The priority mask and selector 128, the priority generator 130 and the priority lookup table 132 operate together to provide the priority for each picture element of the video input B. As discussed in more detail below, the priority of the B video inputs can be altered in this way in order to provide a variety of effects. For example, if the video input B is provided in a window that has been cropped, the window generation section 136 can be set accordingly so that the picture elements outside the cropped window are given a low priority while the picture elements within the cropped window are given a relatively high priority. Similarly, the pixel value manipulator section 134 can be used to mask out one or more colors so that, for example, a video image of a teleconference participant showing the participant in front of a blue background can be provided as the video input B, and the pixel value manipulator section 134 can be adjusted to mask out the blue background by assigning a relatively low priority to the picture elements having a color corresponding to the blue background and a relatively high priority to the other picture elements of the video input image B. A read address generator 140 reads the input B data from the frame memory 124 and provides the data to the priority-driven multiplexer 122. In order to accommodate different video standards used for input A and for input B, the read address generator 140 reads the data at a rate corresponding to the rate of the data provided via the video input A. That is, the read address generator 140 synchronizes the inputs of the priority-driven multiplexer 122 so that the picture elements from the frame memory 124 arrive at the multiplexer 122 simultaneously with the corresponding picture elements from the video input A. The read address generator 140 also handles offsets between input A and input B and any scaling and/or mirroring of the video input B. The requested amount of offset in X and Y, the amount of magnification or reduction, and any flipping are provided to the VCM 96 in the manner described in more detail below. The read address generator 140 handles offsets by providing the picture element data from the frame memory 124 at a specified vertical and horizontal offset from the data of the video input A. For example, if the video image B is to be shifted horizontally by five picture elements from the video input A, then the read address generator 140 will wait five picture elements after the left edge of the video input A before providing the left edge of the video input B. Magnification and reduction of the video image B and flipping of the video image B are handled in a similar manner. It is noted that the application of an offset to a video image, magnification and reduction of a video image, and flipping of a video image are all known to one skilled in the art and will not be described in detail herein. A computer control interface 142 connects the VCM 96 to an external control device, such as the control unit 76 shown in Figures 5 and 6. The computer control interface 142 has an address input and a data input. The address input is shown as a 16-bit value and the data input is shown in Figure 8 as an 8-bit value.
However, it will be appreciated by one skilled in the art that the number of bits for the address and data inputs can be modified and is a design selection that depends on a variety of functional factors familiar to one skilled in the art. The address input is used to select different VCMs and the several registers within each VCM 96, and to load the priority lookup table 132. Different address inputs address different ones of these elements. The data input is the data that is provided to the various registers and to the lookup table 132. Consequently, a user wishing to provide values to the priority lookup table 132 simply provides the appropriate address for each of the 256 locations in the priority lookup table 132 and provides the data to be loaded into the priority lookup table 132. Similarly, the pixel value manipulator section 134 and/or the window generation section 136 may be loaded via the computer control interface 142 by providing the appropriate address for each of the elements of the pixel value manipulator 134 or the window generation section 136 and providing the desired data. The VCM 96 is otherwise accessed in a conventional manner and will not be discussed further herein. The following input parameters are provided to the VCM 96:

HBMAX: the number of pixels in a horizontal line of the video image B.
HP: the desired horizontal position of the video image B with respect to the video image A.
HS: the horizontal scale to be applied to the video image B. Scale is defined as the factor by which the video image B shrinks with respect to the video image A.
HF: a binary value that indicates whether horizontal flipping is applied to the video image; that is, when HF is equal to 1, the image is flipped to provide a mirror image.
VBMAX: the number of pixels in a vertical line of the video image B.
VP: the desired vertical position of the video image B with respect to the video image A.
VS: the vertical scale to be applied to the video image B. Scale is defined as the factor by which the video image B shrinks.
VF: a binary value that indicates whether vertical flipping is applied to the image (that is, whether or not the image is flipped top to bottom).
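The read address generator's job can be summarized as a coordinate transform from an output (stream A) pixel position back to a frame memory (stream B) address, applying offset, scale and flip. A rough software analogue using the parameters above (a formulation of ours; the actual hardware works line by line rather than by random access):

```cpp
#include <iostream>

struct VcmParams {
    int hbmax, vbmax;   // stored image B dimensions, in pixels
    int hp, vp;         // desired position of B within the A frame
    double hs, vs;      // shrink factors (2.0 means half size)
    bool hf, vf;        // horizontal / vertical flip
};

// Map an output pixel (ha, va) in stream A's coordinates to the frame
// memory pixel of image B that should appear there.  Returns false when
// the output pixel lies outside the (scaled, offset) image B, in which
// case the multiplexer would simply pass input A.
bool mapToFrameMemory(const VcmParams& p, int ha, int va, int& hb, int& vb) {
    hb = static_cast<int>((ha - p.hp) * p.hs);   // undo offset, then scale
    vb = static_cast<int>((va - p.vp) * p.vs);
    if (hb < 0 || hb >= p.hbmax || vb < 0 || vb >= p.vbmax) return false;
    if (p.hf) hb = p.hbmax - 1 - hb;             // mirror left-right
    if (p.vf) vb = p.vbmax - 1 - vb;             // flip top-bottom
    return true;
}

int main() {
    // Image B shifted 5 pixels right, shown at half size, mirrored.
    VcmParams p{640, 480, 5, 0, 2.0, 2.0, true, false};
    int hb, vb;
    if (mapToFrameMemory(p, 6, 10, hb, vb))
        std::cout << hb << "," << vb << "\n";  // prints 637,20
    return 0;
}
```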
Software Architecture

Figure 9 is a block diagram of the software components that support the operation of the present invention. The software provides a generic service platform to control network-based multimedia bridges. The AVB described above is an example of a video bridge that can be controlled by the service platform. Other video bridges, as well as audio bridges, can also be controlled by this service platform. A remote procedure call (RPC) mechanism of a distributed processing environment (DPE) can be used as the communication mechanism between the PPS clients and the PPS service session module. A PPS client program 200 provides an application programming interface (API) and keeps track of the media objects provided by the local user to the session and of the multimedia associations of the received media instances. A PPS service session program 201 keeps track of all users, media objects, instances in a session, and multimedia bridges. The PPS service session program 201 contacts the network connection manager (not part of PPS, but a necessary core component of any network management environment) to establish connections among all the participants.
A resource agent program 202 reserves the necessary hardware and contacts a network service manager 205 (not part of the PPS, discussed later) for billing. Finally, the resource manager program 203 configures the hardware and provides feedback, if any. In addition to the four software components mentioned above that are necessary to manage the PPS service (i.e., the PPS resource manager 203, the PPS resource agent 202, the PPS service session 201, and the client program 200), a connection manager (CM) 204 and a service manager 205 are part of the network management environment that supports the PPS service. The CM 204 is responsible for establishing and maintaining the network connectivity required by the user. The service manager 205 is responsible for providing operations support functionality for network services: it configures services, provides billing, monitors performance, monitors faults, reports events, etc.
PPS Client Program

The PPS client 200 communicates with an end-user application in the user stations 34-37 after connectivity is established through the CM 204. The user applications include, for example, applications that support multiparty conferencing, remote learning, remote surveillance, remote manufacturing, etc., by presenting an integrated view to the user through a graphical user interface (GUI) at the user stations 34-37. The client program 200 supports two primary command types: commands to establish or change the network connectivity for the session (commands sent to the CM 204), and commands directed at controlling the presentation that a user receives (signals sent to the PPS service session manager 201 and then on to the resource agent 202).
In Figure 10, the object model for the client program 200 is shown using Rumbaugh object modeling notation. A one-to-many association is represented by a line connecting two class boxes, with a "dot" at the "many" end of the line. An inheritance relationship is represented by a triangle at the intersection of multiple lines; the line connected to the apex of the triangle goes to the superclass. The PPS client program 200 keeps track of the media objects being supplied by the user whom the client represents. The media instances received by the user are represented by the media instance class 210, which is a superclass refined by the data instance 211, audio instance 212, and video instance 213 classes. Each media instance has a unique instID, which is generated by the PPS service session program 201, and an objectID, which identifies the object from which this instance was generated. The instID is the handle that gives the client program 200 access to the VCM 96-107 that is responsible for creating and controlling the video instance.
The video frame 215 contains a set of video frame points 216. Spatial associations (a type of multimedia association) can be constructed from video instances or from "smaller" spatial associations. The video frame 215 is needed to determine which video instance the user has selected, from the video stream he or she is receiving, for a presentation control action. The video frame 215 maps a location in the video display window to a specific video instance. It is necessary in order to support presentation control signaling that defines an action on a specific image, for example, resizing user C's instance. The PPS client program 200 will send a presentation control signal to the resource agent 202, which will cause the selected video instance (based on its media instID) to be displayed in the new way the user wants. An action on the spatial association 217, for example a move-association action, causes multiple network signals to be sent. For example, when a panning action is requested by the user, the client program 200 will send separate, correlated presentation control signals to each of the VCMs 96-107 that are affected by the change to the user's presentation. If two video images are combined together and a request for panning is made by the user, the user's view of the two images will be changed, and each of the VCMs 96-107 affected by the change has the origin of its displayed image changed.
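Because the user receives a single composite stream, hit-testing must be done client-side against the bookkeeping held in the video frame object. A plausible sketch of that mapping (structure and names invented; the patent gives no code) resolving a click location to the topmost video instance:

```cpp
#include <iostream>
#include <string>
#include <vector>

// Client-side record of one video instance within the composite stream.
struct VideoInstance {
    std::string mediaInstId;     // handle onto the VCM generating it
    int x, y, w, h;              // location and size in the display window
    int stackOrder;              // larger values sit on top
};

// The "video frame": maps a window location to the instance the user
// most likely meant, i.e. the topmost one containing the point.
const VideoInstance* hitTest(const std::vector<VideoInstance>& frame,
                             int px, int py) {
    const VideoInstance* best = nullptr;
    for (const auto& inst : frame) {
        bool inside = px >= inst.x && px < inst.x + inst.w &&
                      py >= inst.y && py < inst.y + inst.h;
        if (inside && (!best || inst.stackOrder > best->stackOrder))
            best = &inst;
    }
    return best;   // nullptr if the click hit no instance
}

int main() {
    std::vector<VideoInstance> frame = {
        {"vcm-97", 0, 0, 320, 240, 1},      // user B, underneath
        {"vcm-99", 100, 80, 320, 240, 2},   // user C, overlapping on top
    };
    if (const auto* hit = hitTest(frame, 150, 120))
        std::cout << "selected " << hit->mediaInstId << "\n";  // vcm-99
    return 0;
}
```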
Service Session

In Figure 11, the object model for the PPS service session program 201 is shown. The PPS service session base class has its own media object class 220, which is different from that associated with the PPS client 200, as well as media object instances 221, bridges 222, and clients 223. The client information, which includes what part of the video the user is receiving, the output port the user is served from, and the user's clientID, is stored for each client object 223.
The media object class 220 is used to keep track of all the multimedia objects that are available to the participants in the session. The attributes of these objects include what type of object it is (audio, video, data), the owner of the object, the access list for the object, etc. The owner of the object controls which users can be added to the object's access list. A chair 224 can also control the access list established by the owner of the object (bypassing the owner's settings). A media association class 225 records the association relationships between media objects for session-level associations. Access control of media objects allows a user, or another empowered individual, to determine which other users can receive the media objects that he or she "owns", on a per-media-stream basis. The media objects that a user sends from his or her site are typically media objects owned by that user. In the general case, however, ownership of a media object means that a user can control who can access that media object. In a chaired session, for example, the chair of the session can control access privileges from a different location. A teacher, for example, can control the students' access to one another during a test. The PPS service session 201 is responsible for keeping track of the access privileges on each media object. The access_list parameter of the media object class 220 keeps track of the access permissions of the users. Once the service session program has confirmed a user's right to access a media object, the user will receive an instance of that media object, and a media object instance 221 will be created to reflect the status of this instance in the PPS client. When permission to access a media object changes (for example, a user needs to prevent other users from seeing his or her image while initiating a private conversation), the users who are now restricted from receiving this media object are notified of the change and have their access to the media object terminated; that is, their instance(s) of the media object will be removed, as sketched below. The bridge class 222 is used to keep track of the resources that have been reserved for use by the participants in this session. When a session is created, a minimum set of resources can be set aside for the use of the session participants (for example, a number of video instances per user or per session). For example, the participants in a session can thus be assured that there are enough resources available so that each user can see (in the case of video) all the other participants. Although the resources have been reserved, they need not all be in use at any given time during the course of the session. The bridge class 222 also includes the port information and network address for each bridge so that the service session manager can send the correct signals to the correct bridge.
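The access-control bookkeeping described above amounts to a per-object owner, an access list, and a revocation path that tears down live instances. A hedged sketch of the idea (class and method names are ours, not the patent's):

```cpp
#include <iostream>
#include <set>
#include <string>
#include <utility>

// Session-level view of one media object and who may receive it.
class MediaObject {
public:
    explicit MediaObject(std::string owner) : owner_(std::move(owner)) {}

    // Only the owner (or a chair, bypassing the owner) may edit the list.
    bool grant(const std::string& requester, const std::string& user,
               bool isChair = false) {
        if (requester != owner_ && !isChair) return false;
        accessList_.insert(user);
        return true;
    }

    // Revocation: remove the permission; a true result tells the caller
    // that any live instance held by this user must be torn down.
    bool revoke(const std::string& requester, const std::string& user,
                bool isChair = false) {
        if (requester != owner_ && !isChair) return false;
        return accessList_.erase(user) > 0;
    }

    bool mayReceive(const std::string& user) const {
        return accessList_.count(user) > 0;
    }

private:
    std::string owner_;
    std::set<std::string> accessList_;
};

int main() {
    MediaObject aliceVideo("alice");
    aliceVideo.grant("alice", "bob");
    std::cout << aliceVideo.mayReceive("bob") << "\n";  // 1
    aliceVideo.revoke("alice", "bob");                  // private talk
    std::cout << aliceVideo.mayReceive("bob") << "\n";  // 0
    return 0;
}
```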
Resource Management

The resource agent 202 is a software component that represents the managed objects of a resource to the network in a vendor-independent manner. The managed objects represent the state and functionality of the resource. In general, any resource that provides a service in a network will provide two types of interfaces: a service interface, which is used by service clients, and a management interface, which is used by the management system to manage and control the functionality of the service. The PPS resource agent 202 supports management interfaces and a service interface. The first interface, which interacts with the network connection management software, presents a view of the network resource that allows the network connection management software to connect transport to the resource. The second interface supports the service management functionality (for example, operations support). The final interface supports service-specific signaling (presentation control signaling) and is necessary to control the resource during a session (it interfaces to the PPS service session manager 201).
The PPS resource agent 202 receives commands sent to it by the CM 204, the PPS service session manager 201, and the network service managers 205, and translates these commands into various proprietary commands. The commands and protocols supported by each resource agent interface can be different. For example, the interfaces that support the presentation control signaling of the service session manager can support the RPC protocol, while the CM interface and the service management interface can support a CMISE or SNMP interface. The video bridge manager (described below), which receives the vendor-specific proprietary commands from the resource agent 202, is responsible for the internal configuration of the resource. In Figure 12, the object model for a video resource management subsystem is shown. Similar object models (not shown) exist for the data and audio resource managers. The video bridge can be seen by the network as a black box that has input and output ports with certain capabilities (types of media supported, bandwidth, QoS, etc.). The management information base (MIB) 231 contains the managed objects that reflect the state and functionality of the AVB 32a in a form understood by the network management software. Communications between the video bridge manager (VBM) 230 and the resource agent 202 are via proprietary commands. The commands sent to the AMB 32, for example, use a command language carried over the computer interface protocol used to communicate with the AMB hardware. To communicate with a specific VCM 96-107, the VCM object 232 corresponding to the specific hardware VCM translates a command that it receives from the VBM 230 into the hardware-specific instructions for the type of VCM for which the command is intended. The state (windows, chroma manipulation information, priority information, etc.) of each VCM, i.e., the values stored in the registers of each VCM, is tracked by the VCM object 232. The VBM 230 (Figure 12) is also responsible for the internal configuration of the bridge.
The VBM will connect the correct VCMs 96-107 together into a VCC (with a corresponding VCC object 233) and connect the VCC to the output port 234 for a user. When a new session is requested, the VCM and port objects 232, 235 are interrogated to determine whether the AVB 32a has the resources required for the proposed session. If the state variable of a VCM object 232 or port object 235 is set to available, the corresponding VCM or port can be used for a new session. If the state is set to reserved or in use, then the port or VCM is not available. A VCM agent subsystem 206 (Figure 9) provides a single interface to the VBM 230 for controlling either a hardware VCM or a software VCM. The VCM agent subsystem 206 consists of a VCM base class that provides the interface definition and the basic functionality of a VCM. The VCM hardware provides write-only access, so it is the responsibility of the VCM agent 206 to store the state of each register in the hardware and to provide read access for the VBM 230.
There are two classes derived from the VCM base class, the softVCM 207 and the hardVCM 208. The interfaces of these classes differ only in the constructor. The softVCM 207 takes a string representing the name of the file to which commands written to a softVCM are sent. The hardVCM 208 takes an address value that is the address of the hardware. This design allows one to use a pointer of type VCM regardless of whether a hardware or a software implementation is being used.
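The design just described — a base class fixing the interface, two constructors differing by target, and write-only hardware shadowed in software — might look like the following sketch (the actual Bellcore class definitions are not reproduced in the patent, so all names here are illustrative):

```cpp
#include <cstdint>
#include <fstream>
#include <iostream>
#include <map>
#include <memory>
#include <string>

// Base class: one interface for both implementations.  Because the VCM
// hardware is write-only, the base shadows every register so the VBM
// can read current settings back.
class VCM {
public:
    virtual ~VCM() = default;
    void writeRegister(uint16_t addr, uint8_t data) {
        shadow_[addr] = data;   // keep the readable copy
        commit(addr, data);     // deliver to hardware or file
    }
    uint8_t readRegister(uint16_t addr) const {
        auto it = shadow_.find(addr);
        return it == shadow_.end() ? 0 : it->second;
    }
protected:
    virtual void commit(uint16_t addr, uint8_t data) = 0;
private:
    std::map<uint16_t, uint8_t> shadow_;
};

// Software VCM: constructor takes a file name; commands are written out.
class SoftVCM : public VCM {
public:
    explicit SoftVCM(const std::string& file) : out_(file) {}
protected:
    void commit(uint16_t addr, uint8_t data) override {
        out_ << addr << " " << int(data) << "\n";
    }
private:
    std::ofstream out_;
};

// Hardware VCM: constructor takes the device's hardware address.
class HardVCM : public VCM {
public:
    explicit HardVCM(uintptr_t base) : base_(base) {}
protected:
    void commit(uint16_t addr, uint8_t data) override {
        // A real driver would poke base_ + addr; simulated by printing.
        std::cout << "poke " << base_ + addr << " <- " << int(data) << "\n";
    }
private:
    uintptr_t base_;
};

int main() {
    // The VBM holds a VCM pointer and never cares which variant it has.
    std::unique_ptr<VCM> vcm = std::make_unique<HardVCM>(0x4000);
    vcm->writeRegister(0x10, 200);                      // e.g. a P-LUT entry
    std::cout << int(vcm->readRegister(0x10)) << "\n";  // 200, from shadow
    return 0;
}
```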
Session Initialization

Figure 13 shows the steps for initializing a session with the AVB 32a. Similar steps are required for the audio and data bridges. Initialization begins when an instance of the connection manager, or of the appropriate network management software, is created by a user request for a spontaneous session or by a reservation manager for a previously scheduled session. The session is then established by the following steps.
First, the PPS client program 200 uses the CM 204 interfaces to determine whether the necessary resources are available to support the requested session. The CM 204 requests resources from the video resource agents 202 which, in turn, communicate with the VBM 230 of the resource manager 203 to determine whether the necessary resources are available. Second, the CM 204 handles the session setup negotiations that are necessary among the participants. Some of the attributes that will be negotiated between a user and the connection manager include access bandwidth, quality of service, video rates, video quality, audio quality, the session objects that are to be sent to the AMB 32 (a user may request to transmit only audio data, or may also wish to send video), the session objects that are to be received (a user may be restricted to receiving only the audio data, or may wish to receive video objects as well), etc. Third, when the negotiation of the session is completed, the multimedia connections to each user are established by the network connection manager. This includes connecting each user's camera to the video bridge and connecting the video bridge output port designated for each user to the user's terminal. Fourth, the connection manager notifies the PPS client program that transport for the session has been established. The PPS client program then creates a PPS service session manager. The PPS client program passes the information and initial configuration for the resources included in the session to the PPS service session manager. This includes the information regarding which user camera connects to which input port of the video bridge and which output port is responsible for each user's combined video stream. Fifth, the PPS service session manager causes the bridge manager to update the port and VCM objects (Figure 12) to indicate which portion of the AVB 32a is being used by the current session, using the service interface of the resource agent(s). Sixth, the client objects 223 for the session participants are instantiated in the service session, and the VCM state information for each VCM is instantiated and initialized in the resource manager.
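In outline, the six steps reduce to the call sequence below. This is a readable stand-in for the flow of Figure 13, with every type and function name invented for illustration; none of them are actual PPS interfaces:

```cpp
#include <iostream>

// Minimal stand-ins so the flow of Figure 13 can be read as code.
struct SessionRequest { int participants; };

struct ConnectionManager {
    bool resourcesAvailable(const SessionRequest& r) {
        std::cout << "1. query resource agents / VBM for capacity\n";
        return r.participants <= 8;   // assumed bridge capacity
    }
    void negotiate()        { std::cout << "2. negotiate bandwidth, QoS, media\n"; }
    void connectTransport() {
        std::cout << "3. connect cameras to bridge inputs, "
                     "bridge outputs to terminals\n";
    }
};

struct ServiceSession {
    void reserveBridgeResources() {
        std::cout << "5. mark ports and VCMs in use via resource agent\n";
    }
    void instantiateClients() {
        std::cout << "6. create client objects, initialize VCM state\n";
    }
};

int main() {
    ConnectionManager cm;
    SessionRequest req{5};
    if (!cm.resourcesAvailable(req)) return 1;
    cm.negotiate();
    cm.connectTransport();
    std::cout << "4. CM notifies PPS client; client creates session mgr\n";
    ServiceSession session;
    session.reserveBridgeResources();
    session.instantiateClients();
    return 0;
}
```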
Session Operation
During a communication session, the user accesses the images, changes which images are received, and adds and drops media objects. These operations are provided through the client program's application programming interface (API) 200, which interacts with the various software components described above. The PPS client program has an API to allow application developers to access PPS programs. An application developer does not need to know the details of the network connection manager, nor are changes to the PPS network-based software needed to support a new application. A pseudocode listing of this API is shown in Appendix A. For a user to receive a case of a video object, the user must first request permission to access the object (API order access_obj_petition). If the user is given permission to receive a case of a video object, he specifies the initial location and size of the case and uses the API command recib_vídeo_inst. The details of this process are preferably hidden from the user by providing a single "receive image" menu item that, when invoked, calls the access_obj_petition API order and then the receive-video-case order. An initial location and size can be assigned automatically by the application. The initial location and size of the case are stored in the client video case 213 (Figure 10) so that the location, size and stacking order of each case in the combined video stream can be tracked. This is necessary because the video stream sent to a user is a combination of the images that the user is receiving, and there is no other way to determine where each single image lies in this composite video stream. Audio cases are received in a similar way. A presentation control order (move, resize, chromatically manipulate, push, jump, change volume, pan, bass, treble, etc.) causes the appropriate client case to be updated and results in a message sent to the service session, which passes the message on to the appropriate resource manager (the audio or video bridge administrator). The request must specify the name of the case that is to receive the order. The API command get_medios_cases allows an application developer to determine the current attribute settings for a media case. This removes the need for each application to track this information itself: if a user needs to change the volume attribute of an audio case he is receiving, he needs to know what the current setting is. For video cases, to better support click-and-drag video interfaces, selection-based presentation control commands are available. The user first clicks the location in the video window of the case on which he wants to perform a presentation control order. The API command collect_points() is called and returns a list of the cases located at the specified location. Since cases can overlap, the application developer is left to determine the best way to present this to the user (such as a list of the cases, cycling through and highlighting each case at the location, etc.). When the user selects the desired case, the API order select_point is called in order to select the case. Once the case has been selected, the various presentation control commands can be used on the selection by specifying the appropriate API command, for example, move_selection. Multiple cases can be selected in this way in order to group them together; a sketch of this pick-and-select flow is given below. For video cases that have been grouped together, the group is treated as if it were an individual case.
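The click-and-drag flow just described can be sketched in C++ as follows, using the pick and select orders from Appendix A; the stub bodies, the Point type and the handler signature are assumptions, since the patent specifies only the orders themselves.

    #include <cstddef>
    #include <vector>

    struct Point { int x, y; };

    // Stubs standing in for the PPS client API orders of Appendix A.
    void collect_points(int x, int y, std::vector<int>& point_ids) {
        (void)x; (void)y;
        point_ids.clear();        // the real call fills in the cases at (x, y)
    }
    void select_point(int point_id) { (void)point_id; }
    void move_selection(int x, int y) { (void)x; (void)y; }

    // First click picks the (possibly overlapping) cases under the cursor;
    // the application chooses one and a drag moves it to the drop location.
    void on_click_and_drag(Point press, Point release) {
        std::vector<int> ids;
        collect_points(press.x, press.y, ids);
        if (ids.empty())
            return;               // no case at this location
        // Overlapping cases: here we simply take the first returned case;
        // an application might instead list or highlight the candidates.
        select_point(ids.front());
        move_selection(release.x, release.y);
    }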
All presentation control orders can be made on the group as if it were an individual case. When a selection order is issued at a given location, if the video case at that location is in a group, the group information is returned to the application. The individual case information is not available to the application again until the group has been dissolved. When a presentation order for the group is issued, the PPS client program unrolls the group and sends individual orders to the components of the divided bridge (the VCMs) responsible for the generation of each case. In other words, a separate presentation control signal is sent, for each case in the group, to the VCM responsible for generating it. When a user no longer wishes to receive a case, the order caer_medios_instancia is issued and the video case object is deleted from the client program. When an object that is contributing to the session is disconnected, all the cases of that object need to be eliminated as well: the video/audio cases stop being sent to each of the users who were receiving them. When a user changes the access permissions of an object that he owns, the users who are no longer allowed access to the object have the cases of that object they were receiving removed. The service session updates the client programs appropriately; each client program checks whether the object has been removed from the session or whether the access rights for this user have been changed.
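The unrolling of a group order can be sketched as follows; the CaseRef structure, the per-case offsets and the message stub are assumptions, while the one-order-per-VCM fan-out is the behavior described above.

    #include <cstddef>
    #include <vector>

    // Each video case in a group is generated by one VCM of the divided
    // bridge; dx and dy are the case's offset from the group origin.
    struct CaseRef { int case_id; int vcm_id; int dx, dy; };

    // Stub for the per-case order that, in the real system, travels through
    // the service session to the bridge manager and on to the VCM.
    void send_move_to_vcm(int vcm_id, int case_id, int x, int y) {
        (void)vcm_id; (void)case_id; (void)x; (void)y;
    }

    // A single move order on a group becomes one order per member case,
    // each preserving that case's offset from the group origin.
    void move_group(const std::vector<CaseRef>& group,
                    int group_x, int group_y) {
        for (std::size_t i = 0; i < group.size(); ++i) {
            const CaseRef& c = group[i];
            send_move_to_vcm(c.vcm_id, c.case_id,
                             group_x + c.dx, group_y + c.dy);
        }
    }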
Control of Multimedia Object Association
A significant advantage of the present invention is that it provides the association of multimedia objects at the user session level. The user-controlled associations between video images and other multimedia objects allow the grouping of different multimedia streams in order to achieve a desired presentation objective. A user may wish to associate different video images together in order to compose a video scene. This can be used to associate the images sent by a camera array to give a panoramic view, to give a three-dimensional perspective, or to allow a user to group other users of a teleconference. The teleconferencing user can see a subset of the other conferees and conveniently access other conference images by panning left or right in the combined video scene. For example, in Figure 14, the associated video objects are presented to a user as a scene that can be panned. The images of a three-camera array are aligned to provide a larger single image (a composite view). The location of the group affects the location of the video cases in the group: if the group is moved, the video cases in the group have their locations offset from the new group location by a preset amount. The multimedia object association software of the present invention can also group objects of different types together. Audio and video objects can be associated in order to achieve a number of presentation objectives. For example, the volume of an audio object can be associated with the size of a video object. If the size of the video object is increased, the volume of the associated audio object is increased; if the size of the video object is decreased, the volume of the audio object is decreased. Two stereo audio streams can be associated with the location of a video object on the screen: as the video object moves to the right of the screen, the right channel of the audio becomes louder and the left channel becomes softer. A media association defines the relationship between groups of media objects or cases of the same or different media types, in order to maintain them as a group of objects. A media association has attributes that are used to control some or all of the attributes of the cases/objects in the association. A change in a media association attribute will cause the values of the specified attributes of the cases and objects in the association to be changed. Some associations will require an offset for each case attribute value, for example, the offset of a video case's location from the group's location. This is necessary because the value of the actual media case attribute reflects the absolute value for the attribute, for example, the location of the video case on the screen. Figure 15 presents the object model for multimedia association. A media association object 301 has a one-to-many relationship (a line between objects with a "dot" at the many end) with the media case/object Obj 302. That is, there are one or more media cases in a multimedia association. A media association also has one or more association attributes 303. Each of the association attributes 303, in turn, affects one or more attributes of each case in the association.
Each affected attribute (video case attribute 307, audio case attribute 308, data case attribute 309) is represented in the association by an association case attribute (association video case attribute 304, association audio case attribute 305, association data case attribute 306). An association case attribute defines the relationship of a case attribute (video case 310, audio case 311, data case 312) to the association. A location attribute for a video case (video case 310) needs to have its offset from the group location represented in the association; the association video case attribute 304 is used for this purpose. Each association case attribute affects one case attribute (a 1-to-1 relationship). The video case attribute 307 for the location will reflect the actual location of the video case 310 as it appears on the terminal; it is the absolute location of the video case, not the location relative to the group location as reflected by the association video case attribute 304 for the location.
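The relationships of Figure 15 can be rendered as a skeletal C++ sketch. The numeric attribute representation and all member names are assumptions, but the one-to-many and one-to-one links mirror the model described above.

    #include <cstddef>
    #include <vector>

    // An attribute of one media case, e.g. the absolute screen location
    // of a video case (video case attribute 307).
    struct CaseAttribute {
        double value;
    };

    // The per-case side of an association attribute (e.g. association
    // video case attribute 304): one-to-one with a case attribute,
    // holding the case's offset from the group value.
    struct AssociationCaseAttribute {
        CaseAttribute* target;
        double offset;                    // e.g. case location - group location
        void update_attribute(double group_value) {
            target->value = group_value + offset;  // absolute = group + offset
        }
    };

    // An association attribute (303), e.g. the group location:
    // one-to-many with the association case attributes it drives.
    struct AssociationAttribute {
        std::vector<AssociationCaseAttribute> members;
        void change_attribute(double new_value) {
            for (std::size_t i = 0; i < members.size(); ++i)
                members[i].update_attribute(new_value);
        }
    };

    // A media association (301): one or more association attributes
    // over one or more media cases.
    struct MediaAssociation {
        std::vector<AssociationAttribute> attributes;
    };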
Examples of User Level Association
The object models shown in Figures 16 and 17 give examples of association at the specific user level that are subsets of the general object model presented in Figure 15. In general, all operations that can be performed on a case can be performed on an association. For associations of video cases, a user can move, scale, prioritize, chromatically key, window, and flip the association. Only the components of the general object model of Figure 15 that are relevant to the examples are shown in Figures 16 and 17; however, fewer components are shown than are actually used in realizing the examples.
Example 1
The first example is the case where a user groups a number of video cases together into a video group in order to move all the cases as a scene and to scale the cases as a group (referred to as a spatial association). An object model for this example is shown in Figure 16. This media association includes only cases of video media. The media association attributes are the scale and location of the video cases (association scale attribute 313 / association video case scale attribute 315, and association location attribute 314 / association video case location attribute 316). In this example, a change in the location attribute of the association causes the location attribute of the video cases (video case location attribute 317) to be changed by the amount of the change to the association's location. Each video case has an association location attribute that records its offset from the group location. The member function update_attribute of the association video case location attribute, when called, adds its offset to the group's location and, in turn, calls the member function change_attribute of the video case location attribute 317. The actual location of the video case (video case 310) within the video frame is changed when the video case location attribute 317 is changed. A change in the association scale attribute 313 of the media association 301 causes a change in the scale of each of the video cases 310. When the scale of the group is increased, the scale of each of the video cases is increased by the same percentage. The location offset of each case will also increase by the same percentage as the change in the group's scale. The media association scale attribute 313 in this case will change the scale of each of the video cases as well as the value of each case's location offset from the group origin. The video case scale attribute 318 does not need an "offset attribute" in the association, as is required for the location attribute 317. The member function update_attribute() of the association scale attribute 313 causes the case scale attribute 318 to be changed by the same percentage as the group's scale was changed. The object model in Figure 16 reflecting this example shows only the components of the general object model (Figure 15) that are used in this example. The association attribute examples are represented separately; in the general model, the association attribute objects are represented in a one-to-many relationship with the media association 301.
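Example 1 can be condensed into the following runnable C++ sketch; the structure and member names are illustrative, but the offset and same-percentage behavior follows the description above.

    #include <cstddef>
    #include <vector>

    struct VideoCase { double x, y, scale; };  // absolute location and scale

    struct GroupMember {
        VideoCase* inst;
        double off_x, off_y;   // offset of the case from the group location
    };

    struct SpatialAssociation {
        double x, y, scale;    // group location and group scale
        std::vector<GroupMember> members;

        // Moving the group moves every case, preserving each offset.
        void move_to(double nx, double ny) {
            x = nx; y = ny;
            for (std::size_t i = 0; i < members.size(); ++i) {
                members[i].inst->x = x + members[i].off_x;
                members[i].inst->y = y + members[i].off_y;
            }
        }

        // Scaling the group scales each case and each offset by the same
        // percentage, so the scene grows or shrinks as one image.
        void scale_by(double factor) {     // e.g. 1.25 for +25 percent
            scale *= factor;
            for (std::size_t i = 0; i < members.size(); ++i) {
                members[i].inst->scale *= factor;
                members[i].off_x *= factor;
                members[i].off_y *= factor;
                members[i].inst->x = x + members[i].off_x;
                members[i].inst->y = y + members[i].off_y;
            }
        }
    };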
Example 2
In a second example, a user can associate video cases with audio cases so that scaling the group of video cases also changes the audio volume, and changing the location of the video cases also changes the audio's pan position and volume. An object model for this association is shown in Figure 17. In this case, if the scale of the video case group is increased, the audio volume is also increased, while if the scale is decreased, the audio volume is also decreased. Also, if the location of the grouped video cases changes, the stereo pan levels will change for the audio cases, and the total volume will change for the audio cases. When the video cases are moved to the middle of the screen, the volume becomes louder, and when the cases are moved to the edge of the screen or off the screen, the volumes of the audio cases become softer. The media association 301 has a scale attribute 318 that corresponds to the bounding box of the video cases in the association. When the scale of the media association is changed, it affects the scales of the video cases, as in the first example. There is also a media association audio volume level attribute (audio case volume attribute 321). Each audio case volume attribute 321 has a volume "offset" that is added to (or subtracted from) the group's audio volume to obtain the value for the audio volume of each audio case. When the audio volume of the associated group is changed, the member function update_vol is called and changes the volumes of all the appropriate audio cases by calling the member function update_attribute(vol) of the appropriate association volume case attribute object 320. When the scale of the video group is changed, the audio volume of each case in the association is changed by the same percentage as the video scale attribute is changed. The member function update_attribute(scale) of each association volume case attribute object 320 is called in order to achieve this; the volume levels of the audio cases will be changed by a predetermined percentage. A similar change occurs in the association volume case attribute object 320 when the location of the video cases changes.
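The audio/video coupling of Example 2 can be sketched as follows; the pan formula and all names are assumptions, while the offset-per-case volume and the same-percentage coupling to the video scale follow the description above.

    #include <cstddef>
    #include <vector>

    struct AudioCase { double volume, pan; };  // pan in [-1, 1], left to right

    struct VolumeMember {
        AudioCase* inst;
        double vol_offset;     // added to the group volume for this case
    };

    struct AudioVideoAssociation {
        double group_volume;
        std::vector<VolumeMember> members;

        // update_vol: push a new group volume to every audio case.
        void set_group_volume(double v) {
            group_volume = v;
            for (std::size_t i = 0; i < members.size(); ++i)
                members[i].inst->volume = group_volume + members[i].vol_offset;
        }

        // Scaling the video group changes each audio volume by the same
        // percentage as the video scale attribute.
        void on_video_scale_changed(double factor) {
            set_group_volume(group_volume * factor);
        }

        // Moving the video group pans the audio: cases toward the right
        // of the screen weight the right channel, and vice versa.
        void on_video_moved(double screen_x, double screen_width) {
            double pan = 2.0 * screen_x / screen_width - 1.0;
            for (std::size_t i = 0; i < members.size(); ++i)
                members[i].inst->pan = pan;
        }
    };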
Associations and Objects at the Session Level
The main use of session-level associations is for synchronization associations and presentation associations. The specification of such associations is provided by the multimedia association architecture; the tolerable distribution skew is also specified. This information is then used by the terminal equipment or by the network (assuming it has buffering capabilities) to ensure that media objects are distributed synchronously. Presentation associations cause all video cases in a group to jump to the foreground when any member of the group is selected. Such an association is usually created at the session level, perhaps by a user. The grouping information is then passed to the client programs of the interested users. There are many possibilities for the association of different media supported by the media association architecture: synchronization; grouping of video by role (administrators, teacher, etc.); images (graphics-vu); etc. Both the PPS client and the session manager may contain the same software to maintain associations: the session maintains the object associations, and each client program controls the associations present for that user.
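A presentation association of this kind reduces to very little code on the client side; the sketch below uses the increase_input_video order of Appendix A, with the stub body and the group representation assumed.

    #include <cstddef>
    #include <vector>

    void increase_input_video(int inst_id) { (void)inst_id; }  // Appendix A order, stubbed

    // Selecting any member of a presentation-associated group raises the
    // whole group to the foreground of the display.
    void on_member_selected(const std::vector<int>& group_members) {
        for (std::size_t i = 0; i < group_members.size(); ++i)
            increase_input_video(group_members[i]);
    }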
Basic Operation Characteristics
Presentation Control
Users of the present invention can arrange their own favorite view in the viewing window through a user-friendly human interface (click and drag). They can choose the images they need to see and arrange them to their liking. They can move the images to a desired position and scale the images to a desired size. They can flip images horizontally and/or vertically. The user can also cut out parts of the images by specifying areas of the window and/or chromatic manipulations. For example, Figure 18 shows an example of combining the actions of several window generators and keyers to define an irregular area of an image for extraction. In the example shown, three rectangular window generators 136 (Figure 8) define the windows A, B and C, within which the selection can be further refined by the settings of the image element value manipulators 134. Window A is a rough cut of an area that is to be removed from the final scene. To avoid the need for precise location of the bottom edge of window A, window B is used to broadly define a region where only the color of the man's suit will be removed. Since the irregular border between the figure of the map and the woman could not be defined with a rectangular window, window C defines another general area within which the particular color of the map is extracted to complete the separation. The video portion of the AVB 32a uses a multi-level priority overlay parameter to determine visibility on a pixel-by-pixel basis. Therefore, the cut-out views can still overlap each other and, as relative movements bring two people's bodies or heads into contact on the screen, the image with the highest priority will be seen to pass naturally in front of the other. The ability of each user to control the size, shape and orientation (facing direction) of each object added to the display window blends different formats seamlessly into a pleasing total image.
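The window-and-key extraction of Figure 18, combined with the multi-level priority overlay, can be sketched per pixel as follows; the per-channel tolerance test is an assumed stand-in for whatever matching the image element value manipulators actually perform.

    #include <cstddef>
    #include <cstdlib>
    #include <vector>

    struct RGB { int r, g, b; };

    // A rectangular window with a chroma key: pixels inside the rectangle
    // whose color is within tol of key are removed from the scene.
    struct KeyWindow {
        int x0, y0, x1, y1;
        RGB key;
        int tol;
    };

    bool keyed_out(int x, int y, RGB p, const std::vector<KeyWindow>& windows) {
        for (std::size_t i = 0; i < windows.size(); ++i) {
            const KeyWindow& w = windows[i];
            bool inside = x >= w.x0 && x < w.x1 && y >= w.y0 && y < w.y1;
            bool matches = std::abs(p.r - w.key.r) <= w.tol &&
                           std::abs(p.g - w.key.g) <= w.tol &&
                           std::abs(p.b - w.key.b) <= w.tol;
            if (inside && matches)
                return true;          // e.g. windows A, B or C remove it
        }
        return false;
    }

    // Multi-level priority overlay: of the cases whose pixel survived
    // keying at this location, the highest priority case is visible.
    int visible_case(const std::vector<int>& surviving_priorities) {
        int best = -1, best_priority = -1;
        for (std::size_t i = 0; i < surviving_priorities.size(); ++i)
            if (surviving_priorities[i] > best_priority) {
                best_priority = surviving_priorities[i];
                best = static_cast<int>(i);
            }
        return best;                  // index of the case seen in front
    }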
Association of Objects
The user-controlled associations between the video images and other multimedia objects allow the synchronization of different multimedia streams in order to achieve a presentation objective. The association of media streams with each other allows a user to create what could be called composite groups or objects. Objects can be grouped together for easy access and arrangement. This mechanism can be used to synchronize the distribution of different multimedia streams in order to achieve some presentation objective; for example, an image slide show could be synchronized to a recorded presentation from an audio server. A multimedia provider may need to synchronize information from different servers, since the provider may not have the capacity to store all the necessary information, or may not hold the copyright for certain information. Object association can also be used to generate a panoramic exposure effect to simulate the panning of a video camera, and to associate audio and video cases.
Access Control
An access control feature allows each user of the present invention to specify who will be able to access the multimedia objects he or she "owns". Private conversations are supported by allowing the owner of the media objects to change the access permissions of those objects. Participants in a private conversation can specify which other users can access their audio and video media objects, and participants not included in the private conversation are prevented from accessing media objects that are private to that conversation.
Modifications and variations of the above-described embodiments of the present invention are possible, as will be appreciated by those skilled in the art in view of the foregoing teachings. It is therefore to be understood that, within the scope of the appended claims and their equivalents, the invention may be practiced otherwise than as specifically described.
APPENDIX A: PSEUDOCODE FOR THE COMMUNICATION SESSION

// Create a new session
create_MBSPession(sessionname, chair_name, duration, client_list);
// End the session
MBSPextreme_session();

// The session chair adds a new user
add_user(client_name);

// The user or the chair drops a user from the session
fallen_user(client_name);

// A new media object is added by a user, with an initial list of the
// other users who can access this object (note that the clientID of the
// user is added to the order when it is sent to the service session by
// the PPS client)
add_media_object(access_list_name, media_type, media_object_name);
fall_media_object(media_type, media_obj_name);

// Request access permission to a media object
access_obj_petition(media_obj_name, media_type, media_inst_name);

// If access is granted, receive a video/audio case with specified
// presentation attributes
recib_vídeo_inst(instID, x_loc, y_loc, x_scale, y_scale);
recib_audio_inst(instID, vol, pan, bass, treble);
caer_medios_instancia(media_type, inst_name);

// Get a list of the other users that are participating in this session
get_clients(client_list);

// Get a list of the available media objects
get_medios(media_type, objeto_list);

// Get a list of the cases received by the given user, together with the
// presentation attributes of the cases
get_medios_cases(media_type, caso_list);

// Request to join a current session
join_PPSession(clientID);

// Presentation control commands

// Remove a color range from a video case
video_inst_key(instID, key_attributes);

// Move a video case to a new location
mov_vídeo_inst(instID, x, y);

// Raise the video case specified by instID to the front of the video
// display (the top of the image stacking order)
increase_input_video(instID);

// Lower the video case specified by instID to the back of the video
// display (the bottom of the image stacking order)
decrease_initial_video(instID);

// Give the video case a new size based on the new scale parameters
redimensionar_vídeo_inst(instID, x_scale, y_scale);

// Pick (location) oriented orders, as opposed to object (case ID)
// oriented commands: the user picks the location in the video display
// of the desired case, then selects one of the cases located at that
// location for further operations.

// Return a list of points (video cases and/or video groups) located
// at x, y
collect_points(x, y, points);

// Select the point in the list that has an ID of point_id
select_point(point_id);

// Move the selected point to the new location x, y
move_selection(x, y);

// Give the selected point (case or group) a new width and length
dimensionar_selection(width, length);

// Jump the selected point in front of the displayed points
jump_selection();

// Push the selected point to the back of the stacking order
push_selection();

// Chroma key the selected point with the color range given in the key
// parameter
key_selection(key_attributes);

deselect_articles();

// Create a group from the articles that have been selected
group_selections();
ungroup_selection();

// Audio presentation control orders
chng_audio_inst_pan(instID, pan);
chng_audio_inst_bass(instID, bass);
chng_audio_inst_treble(instID, treble);
chng_audio_inst_vol(instID, vol);

It is noted that, in relation to this date, the best method known to the applicant to carry out the present invention is that which is clear from the present description of the invention.
Having described the invention as above, the contents of the following are claimed as property:

Claims (15)

1. A method for controlling the presentation of a stream of media signals, characterized in that it comprises the steps of: providing a plurality of media signal streams, each of the streams comprising a plurality of media cases, wherein each media case is a distinct portion of the total information represented by the media stream; associating a plurality of cases from different media signal streams into a distinct group of media cases; and manipulating the distinct group of media cases as if it were a single stream of media signals.
2. The method for controlling the presentation of a stream of media signals according to claim 1, characterized in that the media cases comprise video cases, the method further comprising the step of displaying the video cases on a video display device.
3. The method for controlling the presentation of a stream of media signals according to claim 2, characterized in that the media cases comprise audio cases in addition to the video cases.
4. A method for allowing a viewer to control the presentation to the viewer of a plurality of discrete sources in a multi-point teleconferencing service, the method comprising the steps of: combining the images from the sources into composite streams; grouping together a subset of images from a plurality of the composite streams; manipulating the grouped images as if they were an individual stream; and displaying the manipulated images to the viewer.
5. A video conference system, wherein each individual participant can compose the video images that are to be displayed to that participant differently from the video images displayed to other participants, the system characterized in that it comprises: means for receiving a plurality of video signal streams from a plurality of participant stations, each video signal stream comprising a plurality of video cases, wherein each video case is a distinct image element of the video image represented by the video signal stream; means for combining the plurality of video signal streams into a plurality of composite video streams, each composite video stream containing selected portions of two or more of the video signal streams; means for transferring each of the composite video streams to a respective participant station; means controlled by a computer program for associating a plurality of cases from different video signal streams into a distinct group of video cases; and means controlled by a computer program for manipulating the distinct group of video cases as if it were a single video signal stream.
6. The video conference system according to claim 5, characterized in that the association means includes a means for scaling the group of video cases as a group.
7. The video conference system according to claim 5, characterized in that the association means includes a means for chromatic manipulation of the group of video cases as a group, whereby a color or range of illumination of the group can be removed .
8. The video conference system according to claim 5, characterized in that the association means includes a means for flipping the group of video cases as a group.
9. The video conference system according to claim 5, characterized in that the association means includes a means for changing the priority of the group of video cases as a group, whereby a stacking order of the associated group can change with respect to video cases not associated with the group.
10. A video conference system, characterized in that it comprises: means for receiving a plurality of video signal streams from a plurality of user stations, each video signal stream comprising one or more video cases; means for combining the plurality of video signal streams into a plurality of composite video streams, each composite video stream containing selected portions of two or more of the video signal streams; means for transferring each of the composite video streams to a respective user station; and means for associating a plurality of cases from different video signal streams into a group of video cases that can be manipulated as a group, the association means including a means for windowing the group of video cases as a group, whereby the portions of the associated group within a defined window can be removed.
11. The video conference system according to claim 5, characterized in that it further comprises: means for receiving a plurality of audio signal streams from the plurality of user stations, each audio signal stream comprising an audio case; means for combining the audio cases into a plurality of composite audio streams; and means for transferring the composite audio signal streams to the respective user stations.
12. The video conference system according to claim 11, characterized in that the association means includes a means for associating the group of video cases with the audio cases of the respective audio signal streams corresponding to the group of video cases.
13. A video conference system, characterized in that it comprises: means for receiving a plurality of video signal streams from a plurality of user stations, each video signal stream comprising one or more video cases; means for combining the plurality of video signal streams into a plurality of composite video streams, each composite video stream containing selected portions of two or more of the video signal streams; means for transferring each of the composite video streams to a respective user station; means for receiving a plurality of audio signal streams from the plurality of user stations, each audio signal stream comprising one audio case; means for combining the audio cases into a plurality of composite audio streams; means for transferring the composite audio signal streams to the respective user stations; and means for associating a plurality of cases from different video signal streams into a group of video cases that can be manipulated as a group; the association means including a means for associating the group of video cases with the audio cases of the respective audio signal streams corresponding to the group of video cases, and a means for associating a volume of the audio cases associated with the group of video cases with a size of the group, whereby the volume of the audio cases increases or decreases with a change in the group size.
14. A video conference system, wherein each individual participant can compose the video images that are to be displayed to that participant differently from the video images displayed to other participants, the system characterized in that it comprises: means for receiving a plurality of video signal streams from a plurality of participant stations, each video signal stream comprising a plurality of video cases, wherein each video case is a distinct image element of the video images represented by the video signal stream; means for combining the plurality of video signal streams into a plurality of composite video streams, each composite video stream containing selected portions of two or more of the video signal streams; means for transferring each of the composite video streams to a respective user station; means for receiving a plurality of audio signal streams from the plurality of participant stations, each audio signal stream comprising a plurality of audio cases, wherein each audio case is a distinct sound element; means for combining the plurality of audio signal streams into a plurality of composite audio streams; means for transferring each of the composite audio streams to a respective participant station; means controlled by a computer program for associating the video cases of a respective video signal stream with the audio cases of a respective audio signal stream into a distinct group of associated audio and video cases; and means controlled by a computer program for manipulating the distinct group of audio and video cases as if it were an individual stream.
15. A video conference system, characterized in that it comprises: means for receiving a plurality of video signal streams from a plurality of user stations, each video signal stream comprising one or more video cases; means for combining the plurality of video signal streams into a plurality of composite video streams, each composite video stream containing selected portions of two or more of the video signal streams; means for transferring each of the composite video streams to a respective user station; means for receiving a plurality of audio signal streams from the plurality of user stations, each audio signal stream comprising one or more audio cases; means for combining the plurality of audio signal streams into a plurality of composite audio streams; means for transferring each of the composite audio streams to the user stations; and means for associating the video cases of a respective video signal stream with the audio cases of a respective audio signal stream, wherein the association means includes a means for associating a volume of at least one selected audio case with a size of at least one selected video case, whereby the volume of the selected audio case is increased or decreased with a change in the size of the selected video case.
MXPA/A/1998/010440A 1996-06-21 1998-12-09 System and method for associating multimedia objects

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
USUS96/10656 1996-06-21

Publications (1)

Publication Number Publication Date
MXPA98010440A 1999-06-01
