US20030048353A1 - System and method for high resolution videoconferencing - Google Patents
- Publication number
- US20030048353A1 (application US10/214,976)
- Authority
- US
- United States
- Prior art keywords
- video
- image
- generating
- audio
- video stream
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Abandoned
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N7/00—Television systems
- H04N7/14—Systems for two-way working
- H04N7/141—Systems for two-way working between two video terminals, e.g. videophone
- H04N7/147—Communication arrangements, e.g. identifying the communication as a video-communication, intermediate storage of the signals
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N7/00—Television systems
- H04N7/14—Systems for two-way working
- H04N7/141—Systems for two-way working between two video terminals, e.g. videophone
- H04N7/142—Constructional details of the terminal equipment, e.g. arrangements of the camera and the display
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N7/00—Television systems
- H04N7/14—Systems for two-way working
- H04N7/15—Conference systems
Definitions
- the present invention relates generally to conferencing systems, and more particularly to a high resolution videoconferencing system.
- videoconferencing systems utilize video cameras to capture an image of the conference participants for transmission to a remote conferencing site.
- a conventional (stationary or movable) video camera can only capture one image or one view of a conferencing site at a certain point in time.
- a conventional video camera may be beneficially provided with a device for adjusting a rotational orientation of the camera.
- Positioning devices designed to rotate the camera about two orthogonal axes typically utilize two actuators: a first actuator rotates the camera about a vertical axis and a second actuator rotates the camera about a horizontal axis perpendicular to the camera's vertical axis.
- Rotation of the camera about the vertical axis is referred to as “panning”, while rotation about the horizontal axis is referred to as “tilting.”
- devices for rotating the camera about the horizontal and vertical axes are commonly referred to as “pan/tilt positioning devices.”
- a conventional video camera would require a set of zoom lenses for performing zooming functions, resulting in a “pan/tilt/zoom” (“PTZ”) camera.
- the present invention provides for a videoconferencing system comprising a transmitting station located at a first site, including a plurality of microphones for generating an audio signal in response to a sound source; an audio processing engine for generating a position signal that indicates the position of the sound source and for processing the audio signal; and a communication interface for transmitting the audio and position signals to a communication channel.
- the plurality of microphones of the videoconferencing system can be arranged in an n-fire configuration as well as a vertical array.
- the videoconferencing system may also comprise a receiving station located at a second site, including a communication interface for receiving the audio and position signals from the communication channel, a plurality of speakers for playing the audio signal, and an audio processing engine for selectively driving one of the speakers in response to the position signal to play the audio signal on the selected speaker.
- the position signal generated by the videoconferencing system is based upon magnitude differences of electric or current signals received from the plurality of microphones. If the position of the sound source changes, the audio processing engine generates a new position signal to reflect the position change.
- the transmitting station communication interface includes a communication processing engine for encoding and compressing the audio signal and the position signal, and a transceiver device for transmitting the audio and position signals through the communication channel.
- the receiving station communication interface includes a transceiver device, for receiving the audio and position signals through the communication channel, and a communication processing engine for decoding and decompressing the audio signal and the position signal.
- a videoconferencing system comprises a transmitting station located at a first site, including a high resolution video sensor for generating an image, a video memory for storing the high resolution image, and a data loading engine for loading image data from the video sensor to the video memory. Additionally, a Field Programmable Gate Array/Application Specific Integrated Circuit (FPGA/ASIC) is coupled to the video memory and data loading engine. The FPGA/ASIC defines a first image section and a second image section within the high resolution image stored in the video memory. Further, the FPGA/ASIC can scale the first image section into a first video stream with a first resolution and scale the second image section into a second video stream with a second resolution.
- a communication interface coupled to the FPGA/ASIC transmits the first video stream and the second video stream to a communication channel.
- the videoconferencing system may also comprise a receiving station located at a second site, including a communication interface for receiving the first video stream and the second video stream from the communication channel.
- the receiving station further includes a video processing engine, coupled to the communication interface, for processing the first video stream and the second video stream and for displaying the first video stream as a first image with a first resolution and the second video stream as a second image with a second resolution.
- the transmitting station communication interface in this embodiment comprises a communication processing engine for encoding and compressing the first and second video stream, and a transceiver device for transmitting the first and second video stream through the communication channel.
- the receiving station video processing engine of the present embodiment comprises a video memory for storing the first video stream and the second video stream, a data loading engine for loading the first video stream and the second video stream from the receiving station communication interface and an FPGA/ASIC for displaying the first and second image data stream based on the high resolution image stored in the video memory.
- a videoconferencing system comprises a receiving station located at a first site having a communication interface for receiving a video signal from a communication channel, a video processing engine for generating a video display output in response to the video signal, and a video display for displaying the video display output.
- the videoconferencing system may further comprise a transmitting station located at a second site, having a video camera for generating the video signal, a video processing engine for processing the video signal, a phase synchronization engine for synchronizing a phase between the video camera at the transmitting station and the video display output at the receiving station, and a communication interface for transmitting the video signal to the communication channel.
- FIG. 1 shows an exemplary videoconferencing system in accordance with the present invention
- FIG. 2 shows an exemplary conferencing station
- FIG. 3 is an exemplary block diagram illustrating the processing unit of FIG. 2 in greater detail
- FIG. 4 is an exemplary block diagram illustrating components in the video processing engine of FIG. 3;
- FIG. 5 is an exemplary section (or view) configuration in accordance with the present invention.
- FIG. 6 is a flowchart illustrating an exemplary process for transmitting audio in a videoconferencing system
- FIG. 7 is a flowchart illustrating an exemplary process for transmitting high resolution images in a videoconferencing system.
- FIG. 8 is a flowchart illustrating an exemplary process for transmitting a video signal in a videoconferencing system.
- FIG. 1 shows an exemplary videoconferencing system 100 in accordance with the present invention.
- the videoconferencing system 100 includes a first conferencing station 102 and a second conferencing station 104 .
- the first conferencing station 102 includes an audio input/output device 106 , a video display 108 and a video camera (or video sensor) 110 .
- the second conferencing station 104 includes an audio input/output device 112 , a video display 114 and a video camera (or a video sensor) 116 .
- the first conferencing station 102 communicates with the second conferencing station 104 through a communication channel 118 .
- the communication channel 118 can be an Internet, a LAN, a WAN, or any other type of network communication means.
- Although FIG. 1 shows only two conferencing stations 102 and 104, those skilled in the art will recognize that additional conferencing stations may be coupled to the videoconferencing system 100.
- FIG. 2 shows an exemplary conferencing station 200 , similar to the conferencing stations 102 and 104 of FIG. 1, in accordance with one embodiment of the present invention.
- the conferencing station 200 includes a display 202 , a high resolution conferencing bar 204 , and a video processing unit 206 .
- the display 202 is a High Definition (“HD”) monitor having a relatively large-size flat screen 208 with a 16:9 viewable area.
- other view area proportions and other types of displays 202 are contemplated and may be used.
- the high resolution video conferencing bar 204 contains multiple speakers 210 a to 210 d , a video sensor (e.g., a high resolution digital video image sensor such as a CMOS video sensor) 212 , and a plurality of microphones 214 .
- the speakers 210 a to 210 d preferably operate at frequencies above 250 Hz. However, the speakers 210 a to 210 d may operate at any other frequency compatible with various embodiments of the present invention.
- the conferencing bar 204 is approximately 36 inches wide by 2 inches high by 4 inches deep, although the conferencing bar 204 may have any other dimensions.
- the conferencing bar 204 is designed to sit atop the display 202 with a front portion 218 extending slightly below a front edge of the display 202 .
- the positioning of the conferencing bar 204 brings the speakers 210 a to 210 d , the video sensor 212 , and the plurality of microphones 214 closer to the screen 208 , and provides a positioning reference at the front edge of the display 202 .
- Other conferencing bar 204 positions may be utilized in keeping with the scope and objects of the present invention. Further, although only four speakers are shown in FIG. 2, more or fewer speakers may be utilized in the present invention.
- the video sensor 212 has the capability to output multiple images in real-time at a preferred resolution of 720i (i.e., 1280 × 720 interlaced at 60 fields per second) or higher, although other resolutions are contemplated by the present invention.
- with a field of view of approximately 65 degrees, the resolution of the video sensor 212 is sufficient to capture an entire conferencing site.
- a limited horizontal pan motor may be provided for a wider field of view (such as a 90 degree field of view).
- Providing this limited horizontal pan motor avoids the need for a costly and complicated full mechanical pan/tilt/zoom camera and lens system.
- a pure digital zoom may be provided with a fixed lens to accommodate up to an 8× or higher effective zoom while maintaining a minimum Full CIF (352 × 288) resolution image.
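As a rough check on the digital zoom figure above, the following sketch computes the smallest sensor resolution that could still yield a Full CIF crop at a given zoom factor. It assumes that an N× digital zoom means cropping a window 1/N of the frame in each dimension with no upsampling, which the source does not state explicitly; the function names are illustrative only.

```python
CIF_WIDTH, CIF_HEIGHT = 352, 288  # Full CIF, as stated in the text


def min_sensor_resolution(zoom, out_w=CIF_WIDTH, out_h=CIF_HEIGHT):
    """Smallest sensor size (w, h) whose 1/zoom crop is still out_w x out_h.

    Assumption (not from the source): an N-times digital zoom crops a window
    1/N of the frame in each dimension and never upsamples.
    """
    return out_w * zoom, out_h * zoom


def max_zoom(sensor_w, sensor_h, out_w=CIF_WIDTH, out_h=CIF_HEIGHT):
    """Largest crop-only zoom a sensor supports at the given output size."""
    return min(sensor_w / out_w, sensor_h / out_h)


w, h = min_sensor_resolution(8)
print(w, h, w * h)  # 2816 2304 6488064
print(round(max_zoom(3000, 2000), 2))
```

Under these assumptions an 8× crop-only zoom needs roughly 6.5 million pixels, which is the same order of magnitude as the 3,000 × 2,000 example resolution given later in the description.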
- the plurality of microphones 214 are located on both sides of the video sensor 212 on the conferencing bar 204 , and can be arranged in an n-fire configuration, as shown in FIG. 2, which provides a better forward directional feature.
- a vertical microphone array can be optionally arranged along a side of the display 202 to provide vertical positioning references.
- the conferencing bar 204 is coupled to the processing unit 206 via a high speed digital link 205 .
- the processing unit 206 may contain a sub-woofer device that, preferably, operates from 250 Hz down to 50-100 Hz frequencies.
- the processing unit 206 will be discussed in more detail in connection with FIG. 3. Although the processing unit 206 is shown as being separate from the conferencing bar 204, alternatively, the processing unit 206 may be encompassed within the conferencing bar 204.
- a smoked glass or similar covering can be installed in front of the video sensor 212 and/or other portions of the conferencing bar 204 so that the conference participants cannot view the video sensor 212, and/or the speakers 210a to 210d and the plurality of microphones 214.
- FIG. 3 is an exemplary block diagram illustrating the processing unit 206 of FIG. 2 in greater detail in accordance with one embodiment of the present invention.
- the processing unit 206 preferably includes a processing engine 302 , a communication interface 304 , and a sub-woofer device 306 .
- the processing engine 302 further comprises a phase synchronization engine 308 , a video processing engine 310 , and an audio processing engine 312 .
- the phase synchronization engine 308 is able to reduce or minimize negative impact caused by transmission delay.
- the video camera 110 (FIG. 1) at the local (or first) conferencing station 102 (FIG. 1) has an arbitrary phase relative to a video display output at a remote (or second) conferencing station 104 (FIG. 1).
- the video display output at the remote conferencing station 104 may be out of phase with the video camera 110 located at the local conferencing station 102 .
- participants at the remote conferencing station 104 may still see the user at the local conferencing station 102 in pause due to the transmission delay, even after the user has begun to speak again. If any of the participants at the remote conferencing station 104 interrupts the user at this moment, the remote participant and the user will talk over each other.
- the present invention synchronizes the phase between the video camera 110 located at the local conferencing station 102 and the video display output at the remote conferencing station 104 so that the transmission delay can be compensated for or reduced in the video display output.
- the video camera 110 at the local conferencing station 102 moves at a certain frequency and speed which causes phase shifting relative to the video display output at the remote conferencing station 104 .
- the movement of the video camera 110 at the local conferencing station 102 can be measured and used as a reference to synchronize the phase between the video camera 110 and the video display output.
- the phase synchronization engine 308 includes a memory device 314 for storing a phase synchronization module for performing the phase synchronization or locking function.
- the video processing engine 310 first receives a high resolution image from the video sensor 212 (or video camera 110) and stores the image into a video memory (not shown). The video processing engine 310 then, preferably, defines two image sections (views) within the high resolution image stored in the video memory, and generates two respective video streams for the two image sections (views). Alternatively, more or fewer image sections and corresponding video streams are contemplated. The video processing engine 310 then sends the two video streams to the communication interface 304. Conversely, to display a remote video signal from a remote site, the video processing engine 310 receives at least two video streams (i.e., Video Streams A and B) from the communication interface 304. The video processing engine 310 then processes the video streams A and B and displays two image views on the screen 208 for the two video streams A and B, respectively.
- each of the plurality of microphones 214 (FIG. 2) in the conferencing bar 204 receives a sound from an acoustic source (e.g., from a speaking participant) and converts the received sound to an electric or current signal. Because the plurality of microphones 214 are located at different positions in reference to the conferencing bar 204 and the acoustic source, the electric or current signals in the plurality of microphones 214 have different magnitudes. The magnitude differences in the electric or current signals indicate a position of the acoustic source. Upon receiving the electric or current signals from the plurality of microphones 214 , the audio processing engine 312 generates an audio signal and a position signal.
- the position signal may contain information indicating a speaker's position relative to the conferencing bar 204 . If the position of the acoustic source changes, the audio processing engine 312 generates a new position signal to reflect the position change. The audio processing engine 312 then sends the audio and position signals to the communication interface 304 .
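The source does not say how the magnitude differences are turned into a position. One simple possibility, shown here purely as an assumption, is an amplitude-weighted centroid of the microphone positions: louder microphones pull the estimate toward themselves. The microphone layout and the function name are hypothetical.

```python
def estimate_position(mic_positions, magnitudes):
    """Amplitude-weighted centroid of the microphone positions.

    mic_positions: horizontal offset of each microphone along the bar
                   (any unit; here inches from the bar's center).
    magnitudes:    non-negative signal magnitude measured at each microphone.
    Returns None when every magnitude is zero (no detectable source).
    This is an illustrative stand-in, not the method of the source.
    """
    if len(mic_positions) != len(magnitudes):
        raise ValueError("need one magnitude per microphone")
    total = sum(magnitudes)
    if total == 0:
        return None
    return sum(p * m for p, m in zip(mic_positions, magnitudes)) / total


# Hypothetical layout: four microphones, two on each side of the video sensor.
MICS = [-15.0, -5.0, 5.0, 15.0]

print(estimate_position(MICS, [0.2, 0.2, 0.2, 0.2]))  # source near the center
print(estimate_position(MICS, [0.1, 0.2, 0.5, 0.9]))  # source toward the right
```

Calling the function again whenever the measured magnitudes change would produce the "new position signal" that the text describes.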
- the audio processing engine 312 first receives the audio signal and position signal from the communication interface 304 .
- the audio processing engine 312 then drives one or more of the speakers 210 a to 210 d (FIG. 2) in the conferencing bar 204 according to the position signal, while the video processing engine 310 is displaying one or more views of an image on the screen 208 .
- the speakers 210 a to 210 d in the conferencing bar 204 are selected based on the position of the speaking participant displayed on the screen 208 . Because the screen 208 has a relatively large size, the present invention improves video conference by making it appear as if the sound is coming from the location of the speaking participant.
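The speaker selection described above can be sketched as a nearest-neighbor lookup. The sketch assumes the position signal has already been mapped to a horizontal offset in the same coordinate system as the speakers; the speaker offsets and the function name are hypothetical.

```python
def select_speaker(position, speaker_positions):
    """Return the index of the speaker nearest to the reported source position."""
    if not speaker_positions:
        raise ValueError("at least one speaker is required")
    return min(range(len(speaker_positions)),
               key=lambda i: abs(speaker_positions[i] - position))


# Hypothetical offsets (inches from the bar's center) for speakers 210a to 210d.
SPEAKERS = [-13.5, -4.5, 4.5, 13.5]

print(select_speaker(7.9, SPEAKERS))    # 2 (third speaker from the left)
print(select_speaker(-20.0, SPEAKERS))  # 0 (leftmost speaker)
```

Driving "one or more" speakers, as the text allows, could be approximated by weighting the two nearest speakers instead of picking a single index.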
- the speakers 210a to 210d in the speaker array of the conferencing bar 204 typically operate at frequencies above 250 Hz, because sounds in this frequency range have directional characteristics. Conversely, the sub-woofer device 306 (FIG. 3) installed within the video processing unit 206 preferably operates at frequencies from 250 Hz down to 50-100 Hz, because sounds in this frequency range are not directional.
- Although the present invention is described as including the sub-woofer device 306, those skilled in the art will recognize that the sub-woofer device 306 is not required for operation and function of the present invention.
- any frequency range of acoustics may be utilized in the present invention. For example, lower frequencies may be used for the speakers 210 a to 210 d in the speaker array of the conferencing bar 204 .
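The division of labor between the speaker array and the sub-woofer at 250 Hz can be illustrated with a simple two-way split. The sketch below uses a first-order low-pass filter and takes the remainder as the high band; the source does not describe any particular filter, so the filter choice, function names, and sample rate are all assumptions.

```python
import math


def split_bands(samples, sample_rate, cutoff_hz=250.0):
    """Split a signal into (low, high) bands around cutoff_hz.

    Uses a one-pole low-pass filter; the high band is the input minus the
    low band, so the two bands always sum back to the original signal.
    """
    rc = 1.0 / (2.0 * math.pi * cutoff_hz)
    dt = 1.0 / sample_rate
    alpha = dt / (rc + dt)
    low, high, state = [], [], 0.0
    for x in samples:
        state += alpha * (x - state)  # low-pass update
        low.append(state)             # would feed the sub-woofer
        high.append(x - state)        # would feed the speaker array
    return low, high


def energy(samples):
    return sum(s * s for s in samples)


def tone(freq_hz, sample_rate=16000, seconds=0.5):
    """Generate a unit-amplitude sine tone for testing."""
    n = int(sample_rate * seconds)
    return [math.sin(2.0 * math.pi * freq_hz * i / sample_rate) for i in range(n)]


low, high = split_bands(tone(60), 16000)
print(energy(low) > energy(high))   # a 60 Hz tone lands mostly in the low band
low, high = split_bands(tone(2000), 16000)
print(energy(high) > energy(low))   # a 2 kHz tone lands mostly in the high band
```

A first-order filter has a very gradual roll-off; a practical crossover would likely use a steeper design.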
- the communication interface 304 includes a transceiver device 316 and a communication processing engine 318 .
- the transmission of a communication signal containing an audio signal, a position signal, and two video streams A and B requires the communication processing engine 318 to receive the audio and position signals from the audio processing engine 312 and the two video streams A and B from the video processing engine 310 .
- the communication processing engine 318 encodes and compresses this communication signal and sends it to the transceiver device 316 .
- Upon receiving the communication signal, the transceiver device 316 forwards the communication signal to a remote site through the communication channel 118.
- the transceiver device 316 receives the communication signal from the communication channel 118 and forwards the communication signal to the communication processing engine 318 .
- the communication processing engine 318 then decompresses and decodes the communication signal to recover the audio signal, position signal, and two video data streams.
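The encode/compress and decompress/decode round trip described above can be sketched as follows. JSON and zlib are used only as stand-ins for whatever encoding and compression the communication processing engine actually applies (the source does not name any codec), and the field names are hypothetical.

```python
import json
import zlib


def pack_signal(audio, position, stream_a, stream_b):
    """Bundle audio, position, and two video streams into one compressed blob."""
    payload = {
        "audio": audio.hex(),        # bytes are hex-encoded so JSON can carry them
        "position": position,
        "video_a": stream_a.hex(),
        "video_b": stream_b.hex(),
    }
    return zlib.compress(json.dumps(payload).encode("utf-8"))


def unpack_signal(blob):
    """Reverse pack_signal: decompress, decode, and recover the four parts."""
    payload = json.loads(zlib.decompress(blob).decode("utf-8"))
    return (bytes.fromhex(payload["audio"]),
            payload["position"],
            bytes.fromhex(payload["video_a"]),
            bytes.fromhex(payload["video_b"]))


# Transmitting side builds the communication signal; receiving side recovers it.
blob = pack_signal(b"\x01\x02\x03", 7.5, b"frame-A", b"frame-B")
audio, position, video_a, video_b = unpack_signal(blob)
print(position, video_a, video_b)
```

A real system would use dedicated audio and video codecs and a streaming transport rather than a single self-contained blob.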
- FIG. 4 is an exemplary block diagram illustrating components of the video processing engine 310 of FIG. 3.
- the video processing engine 310 includes a data loading engine 402 coupled to the video sensor 212 (FIG. 2), a video memory 404 , and an FPGA/ASIC 406 .
- the data loading engine 402 receives video image data from the video sensor 212 and stores it into the video memory 404 , while the FPGA/ASIC 406 controls the data loading engine 402 and the video memory 404 .
- Because the video sensor 212 is, preferably, a high resolution digital image sensor, the video sensor 212 can generate a large amount of image data. For example, with a 3,000 × 2,000 resolution, the video sensor 212 generates 6,000,000 pixels for an image.
- the data loading engine 402 preferably has six parallel data channels 1-6.
- the FPGA/ASIC 406 is programmed to feed entire image pixels to the video memory 404 through these six parallel data channels 1 - 6 .
- the FPGA/ASIC 406 is also programmed to define at least two image sections (views) over the image stored in the video memory 404 with selectable resolutions, and to produce two video streams for the two image sections (views), respectively.
- Although the present embodiment contemplates utilizing six data channels, any number of data channels may be used by the present invention. Further, any number of image sections and corresponding video streams may be utilized in the present invention.
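To make the data volume concrete, the sketch below works through the pixel count and shows one way an image could be divided across parallel channels and reassembled. The source does not say how pixels are assigned to channels; the round-robin scheme and the function names here are assumptions.

```python
def split_into_channels(pixels, num_channels=6):
    """Deal pixels round-robin across num_channels parallel channels."""
    return [pixels[i::num_channels] for i in range(num_channels)]


def merge_channels(channels):
    """Reassemble the original pixel order from round-robin channels."""
    total = sum(len(c) for c in channels)
    n = len(channels)
    # Pixel i was dealt to channel i % n at position i // n.
    return [channels[i % n][i // n] for i in range(total)]


WIDTH, HEIGHT = 3000, 2000
print(WIDTH * HEIGHT)       # 6000000 pixels per image
print(WIDTH * HEIGHT // 6)  # 1000000 pixels per channel

frame = list(range(24))     # tiny stand-in for a full frame
channels = split_into_channels(frame)
print(merge_channels(channels) == frame)  # True
```

With six channels, each channel carries one sixth of the pixels, which is the motivation for loading the video memory in parallel.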
- FIG. 5 is an exemplary image section (or view) configuration in accordance with one embodiment of the present invention defined by the FPGA/ASIC 406 (FIG. 4) and viewed on the display 202 (FIG. 2).
- a large section A 502 defines an entire view of an image having a 700 × 400 resolution.
- a small section B 504 defines a view having a 300 × 200 resolution in which a speaking participant from a remote conferencing station is displayed.
- the FPGA/ASIC 406 scales the entire image down to a 700 × 400 resolution image to produce the video stream A (FIG. 3) for the large section A 502.
- the FPGA/ASIC 406 scales the section B 504 image down to 300 × 200 resolution to produce the video stream B (FIG. 3). Because the image stored in the video memory 404 has a relatively high resolution, the two scaled images still present good resolution quality. Those skilled in the art will recognize that other resolutions may be utilized in the present invention.
- the present invention has the ability to generate a whole image of a conferencing site while zooming a view from any arbitrary section of the whole image. Further, because at least two video streams are produced for an image, it is possible to transmit a wide angle high resolution image including all participants at a conferencing site (e.g., section A 502 ) along with an inset zoomed view (e.g., section B 504 ) showing a particular speaking participant. Alternatively, more or fewer streams may be produced from a single image and consequently more or fewer views displayed. Therefore, the present invention can be used to replace conventional mechanical pan/tilt/zoom cameras.
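The two-stream idea, a scaled-down whole image plus a scaled crop of one section, can be sketched in a few lines. Nearest-neighbor sampling is used here only because it is short; the source does not say which scaling method the FPGA/ASIC uses, and the function names and the small demo sizes are hypothetical.

```python
def crop(image, left, top, width, height):
    """Return the sub-image covering the given rectangle (rows of pixels)."""
    return [row[left:left + width] for row in image[top:top + height]]


def scale_nearest(image, out_w, out_h):
    """Resize an image to out_w x out_h using nearest-neighbor sampling."""
    in_h, in_w = len(image), len(image[0])
    return [[image[y * in_h // out_h][x * in_w // out_w] for x in range(out_w)]
            for y in range(out_h)]


def make_streams(image, section_b, size_a=(700, 400), size_b=(300, 200)):
    """Produce one frame for stream A (whole view) and one for stream B (inset)."""
    frame_a = scale_nearest(image, *size_a)
    frame_b = scale_nearest(crop(image, *section_b), *size_b)
    return frame_a, frame_b


# Small stand-in image: each pixel records its own (x, y) coordinate.
image = [[(x, y) for x in range(60)] for y in range(40)]
frame_a, frame_b = make_streams(image, section_b=(30, 10, 12, 8),
                                size_a=(14, 8), size_b=(6, 4))
print(len(frame_a[0]), len(frame_a))  # 14 8
print(frame_b[0][0])                  # (30, 10): top-left of the cropped section
```

Moving or resizing `section_b` between frames changes the inset view without any mechanical camera movement, which is the digital substitute for pan/tilt/zoom that the text describes.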
- a typical CMOS video sensor can effectively provide approximately a 65 degree view angle. In practice, a 90 degree view angle may be required. Therefore, a small, inexpensive pan motor can be used to move the CMOS video sensor in the horizontal direction. Because the movement and the resulting noise of the CMOS video sensor are relatively small, they are hardly noticeable to the conferencing participants. With the development of technology, the CMOS video sensor may be able to provide a cost effective 90 degree view angle.
- In FIG. 6, an exemplary flowchart 600 illustrating a process for transmitting audio data in a videoconferencing system is shown.
- an audio signal is generated at a transmitting station of a first site by the plurality of microphones 214 (FIG. 2) in response to an acoustic source by converting the received sound into an electric or current signal.
- a position signal is generated at step 620 that indicates a position of the acoustic source.
- the current signal will have a particular magnitude.
- the audio processing engine 312 (FIG. 3) determines the position signal based on the magnitude of the current signal.
- the audio and position signals are then transmitted to the communication interface 304 (FIG. 3) and then processed at step 630 by the communications processing engine 318 (FIG. 3).
- This processing can include compressing and encoding the audio and position signals for transmission.
- the audio and position signals are then transmitted through a communication channel such as an Internet, a LAN, a WAN, or any other type of network communication means at step 640 by a transceiver device.
- a transceiver device at a receiving station of a second site receives the audio and position signals.
- a communications processing engine processes the audio and position signals at step 660 , which may include decompressing and decoding the audio and position signals for playback.
- At step 670, based on the position signal, one or more speakers at the receiving station are driven to play the audio signal.
- the position signal generated by the audio processing engine creates a more realistic video conference situation because the playback of the audio signal on one of the speakers makes it appear as if the audio signal is coming from a location of the acoustic source.
- the system determines whether more video conferencing is occurring in step 680. If the conference continues, the system repeats steps 610 through 670.
- In FIG. 7, an exemplary flowchart 700 illustrating a process for transmitting high resolution images in a videoconferencing system is shown.
- a video camera or video sensor captures a high resolution image.
- the high resolution image is then loaded and stored from the video camera or video sensor to a video memory.
- the images are converted to video streams in step 720 .
- a first and a second image section are initially defined by the transmitting station video processing engine.
- the first and second image sections are scaled to a first video stream having a first resolution and a second video stream having a second resolution. Scaling is implemented by the FPGA/ASIC 406 (FIG.
- the video streams are processed by a transmitting station communication processing engine. This processing can include encoding and compressing of the streams for transmission. Typically, the video streams are encoded and compressed to allow for faster transmission of the video data.
- the processed video streams are sent to a receiving station through a communication channel in step 740 .
- the communication channel may be any packet-switched network, a circuit-switched network (such as an Asynchronous Transfer Mode (“ATM”) network), or any other network for carrying data including the well-known Internet.
- the communication channel may also be the Internet, an extranet, a local area network, or other networks known in the art.
- the video streams are then decoded and decompressed by the receiving station video processing engine and displayed on a video display of the receiving station at step 750 .
- the system determines whether more video conferencing is occurring in step 760. If the conference continues, the system repeats steps 710 through 750.
- In FIG. 8, an exemplary flowchart 800 illustrating an alternative process for transmitting a video signal in a videoconferencing system is shown.
- a video camera or video sensor captures a video image.
- the video signal is processed by a transmitting station communication engine at step 820 .
- This processing can include encoding and compressing the video signal.
- the video streams are encoded and compressed to allow for faster transmission of the video data.
- a phase synchronization engine synchronizes a phase between the video camera and a video display output. The synchronizing of the phase between the video camera and the video display output allows for a minimization of a negative impact that can be caused by transmission delay. Specifically, if the video camera is out of phase with the video display output, participants at a receiving station may still see a user in pause at the transmitting station, even after the user at the transmitting station has begun to speak again.
- the video signal is transmitted to the receiving station at step 840 via a communication channel.
- the communication channel may be any packet-switched network, a circuit-switched network (such as an Asynchronous Transfer Mode (“ATM”) network), or any other network for carrying data including the well-known Internet.
- the communication channel may also be the Internet, an extranet, a local area network, or other networks known in the art.
- the video signal is processed for display on the video display output by a receiving station communication processing engine. This processing can include decoding and decompressing the video signal.
- the video display output is generated in response to the decoded and decompressed video signal and displayed on a receiving station video display.
- the system determines whether more video conferencing is occurring in step 860. If the conference continues, the system repeats steps 810 through 850.
Landscapes
- Engineering & Computer Science (AREA)
- Multimedia (AREA)
- Signal Processing (AREA)
- Two-Way Televisions, Distribution Of Moving Picture Or The Like (AREA)
Abstract
A system and method for high resolution video conferencing is shown and described. A transmitting station and a receiving station including video cameras or sensors, a plurality of microphones and speakers, video, audio and communication processing engines are disclosed. Video is processed and transferred through the system allowing for multiple video streams to be produced and audio is processed and transferred through the system allowing for sound to be played back with an indication of position in relation to the videoconferencing system.
Description
- The present application claims the benefit of priority from U.S. Provisional Patent Application No. 60/310,742, entitled “High Resolution Video Conferencing Bar” filed on Aug. 7, 2001, which is herein incorporated by reference.
- 1. Field of the Invention
- The present invention relates generally to conferencing systems, and more particularly to a high resolution videoconferencing system.
- 2. Description of the Background Art
- Conventionally, videoconferencing systems utilize video cameras to capture an image of the conference participants for transmission to a remote conferencing site. A conventional (stationary or movable) video camera can only capture one image or one view of a conferencing site at a certain point in time. In order to capture different images or views of a conferencing site at different points in time, a conventional video camera may be beneficially provided with a device for adjusting a rotational orientation of the camera. Positioning devices designed to rotate the camera about two orthogonal axes typically utilize two actuators: a first actuator rotates the camera about a vertical axis and a second actuator rotates the camera about a horizontal axis perpendicular to the camera's vertical axis. Rotation of the camera about the vertical axis is referred to as “panning”, while rotation about the horizontal axis is referred to as “tilting.” As such, devices for rotating the camera about the horizontal and vertical axes are commonly referred to as “pan/tilt positioning devices.” Further, to capture an image or view that is of a particular interest, such as the image of a speaking conference participant, a conventional video camera would require a set of zoom lenses for performing zooming functions, resulting in a “pan/tilt/zoom” (“PTZ”) camera.
- Disadvantageously, conventional PTZ cameras have many shortcomings. First, movement of mechanical components in the positioning device can generate a substantial amount of noise. These movements and noise can be annoying and distracting to the conference participants. More importantly, the noise can interfere with acoustic localization techniques utilized to automatically orient the camera in a direction of the speaking participant. Secondly, the mechanical components in the positioning device may be susceptible to misalignment or breakage due to wear or rough handling, thereby rendering the positioning device partially or fully inoperative. A further disadvantage is the complexity of manufacturing the positioning device, which results in high manufacturing costs and, subsequently, high consumer prices.
- With the development of technology, sizes of display screens in videoconferencing systems are getting larger and larger. Consequently, positions of participant speakers on the display screen can change over a large span area. Disadvantageously, however, conventional videoconferencing systems are unable to adjust to a new participant speaker position as the position changes over the large span area.
- Therefore, there is a need for a videoconferencing system and method which captures multiple views of a conferencing site without involving a complex mechanical structure. There is another need for a videoconferencing system and method which adjusts acoustics relative to a speaker's position.
- The present invention provides for a videoconferencing system comprising a transmitting station located at a first site, including a plurality of microphones for generating an audio signal in response to a sound source; an audio processing engine for generating a position signal that indicates the position of the sound source and for processing the audio signal; and a communication interface for transmitting the audio and position signals to a communication channel. The plurality of microphones of the videoconferencing system can be arranged in an n-fire configuration as well as a vertical array. The videoconferencing system may also comprise a receiving station located at a second site, including a communication interface for receiving the audio and position signals from the communication channel, a plurality of speakers for playing the audio signal, and an audio processing engine for selectively driving one of the speakers in response to the position signal to play the audio signal on the selected speaker.
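The receiving-station behavior described above, selecting a speaker in response to the position signal, can be sketched as follows. This is a minimal illustration, not the disclosed implementation: the function name `select_speaker`, the normalized position range of 0.0 to 1.0, and the evenly spaced speaker row are all assumptions made for the example.

```python
def select_speaker(position, num_speakers=4):
    """Return the index of the speaker nearest to a sound-source position.

    `position` is a hypothetical normalized horizontal coordinate
    (0.0 = far left of the bar, 1.0 = far right). Speaker k is assumed
    to sit at k / (num_speakers - 1) along the bar.
    """
    if not 0.0 <= position <= 1.0:
        raise ValueError("position must lie in [0.0, 1.0]")
    if num_speakers < 1:
        raise ValueError("at least one speaker is required")
    # Round to the nearest speaker slot (adding 0.5 before truncating).
    return int(position * (num_speakers - 1) + 0.5)
```

With four speakers, a source near the left edge maps to speaker 0 and a source near the right edge maps to speaker 3; a real system might instead blend the signal across adjacent speakers.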
- The position signal generated by the videoconferencing system is based upon magnitude differences of electric or current signals received from the plurality of microphones. If the position of the sound source changes, the audio processing engine generates a new position signal to reflect the position change.
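One simple way to turn magnitude differences into a position is a magnitude-weighted centroid of the microphone locations. The text does not specify the computation, so the sketch below is only an assumed stand-in: `estimate_position`, the one-dimensional microphone coordinates, and the centroid formula are all hypothetical.

```python
def estimate_position(mic_positions, magnitudes):
    """Estimate a source position as the magnitude-weighted centroid
    of the microphone positions (an assumed, illustrative method).

    mic_positions: horizontal coordinates of the microphones.
    magnitudes:    non-negative signal magnitudes, one per microphone.
    """
    if len(mic_positions) != len(magnitudes):
        raise ValueError("one magnitude is required per microphone")
    total = sum(magnitudes)
    if total <= 0:
        raise ValueError("no signal received")
    # Louder microphones pull the estimate toward their own position.
    weighted = sum(p * m for p, m in zip(mic_positions, magnitudes))
    return weighted / total
```

Calling the function again whenever the magnitudes change yields a new estimate, which mirrors the regeneration of the position signal when the source moves.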
- The transmitting station communication interface includes a communication processing engine for encoding and compressing the audio signal and the position signal, and a transceiver device for transmitting the audio and position signals through the communication channel. Conversely, the receiving station communication interface includes a transceiver device, for receiving the audio and position signals through the communication channel, and a communication processing engine for decoding and decompressing the audio signal and the position signal.
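The encode/compress and decode/decompress roles of the two communication processing engines can be illustrated with a small round trip. The packet layout here (a 32-bit float position, a 32-bit length, then raw audio bytes, compressed with zlib) is purely an assumption for the example; the text does not name a codec or a wire format.

```python
import struct
import zlib

# Hypothetical header: position (float32) and audio length in bytes (uint32),
# both in network byte order.
HEADER = struct.Struct("!fI")

def pack_audio(position, audio_bytes):
    """Encode the position and audio into one payload and compress it."""
    payload = HEADER.pack(position, len(audio_bytes)) + audio_bytes
    return zlib.compress(payload)

def unpack_audio(packet):
    """Decompress a packet and recover the position and audio bytes."""
    payload = zlib.decompress(packet)
    position, length = HEADER.unpack_from(payload)
    audio = payload[HEADER.size:HEADER.size + length]
    return position, audio
```

A production system would use a dedicated audio codec rather than general-purpose compression; zlib is used here only because it is lossless and in the standard library.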
- In another embodiment, a videoconferencing system comprises a transmitting station located at a first site, including a high resolution video sensor for generating an image, a video memory for storing the high resolution image, and a data loading engine for loading image data from the video sensor to the video memory. Additionally, a Field Programmable Gate Array/Application Specific Integrated Circuit (FPGA/ASIC) is coupled to the video memory and data loading engine. The FPGA/ASIC defines a first image section and a second image section within the high resolution image stored in the video memory. Further, the FPGA/ASIC can scale the first image section into a first video stream with a first resolution and scale the second image section into a second video stream with a second resolution. A communication interface coupled to the FPGA/ASIC transmits the first video stream and the second video stream to a communication channel. The videoconferencing system may also comprise a receiving station located at a second site, including a communication interface for receiving the first video stream and the second video stream from the communication channel. The receiving station further includes a video processing engine, coupled to the communication interface, for processing the first video stream and the second video stream and for displaying the first video stream as a first image with a first resolution and the second video stream as a second image with a second resolution.
- The transmitting station communication interface in this embodiment comprises a communication processing engine for encoding and compressing the first and second video stream, and a transceiver device for transmitting the first and second video stream through the communication channel. Conversely, the receiving station video processing engine of the present embodiment comprises a video memory for storing the first video stream and the second video stream, a data loading engine for loading the first video stream and the second video stream from the receiving station communication interface and an FPGA/ASIC for displaying the first and second image data stream based on the high resolution image stored in the video memory.
- In yet another embodiment, a videoconferencing system comprises a receiving station located at a first site having a communication interface for receiving a video signal from a communication channel, a video processing engine for generating a video display output in response to the video signal, and a video display for displaying the video display output. The videoconferencing system may further comprise a transmitting station located at a second site, having a video camera for generating the video signal, a video processing engine for processing the video signal, a phase synchronization engine for synchronizing a phase between the video camera at the transmitting station and the video display output at the receiving station, and a communication interface for transmitting the video signal to the communication channel.
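The benefit of synchronizing the camera phase to the display output can be illustrated with simple frame-timing arithmetic. This is an interpretive sketch under stated assumptions, not the disclosed mechanism: it assumes the camera and display share one frame period, that the transit delay is known and constant, and that a received frame waits until the next display refresh. The function names and the millisecond values are hypothetical.

```python
def display_wait(capture_time, transit_delay, display_phase, period):
    """Time a frame waits between arriving and the next display refresh.

    Display refreshes are assumed to occur at display_phase + k * period.
    All arguments share one time unit (for example, milliseconds).
    """
    arrival = capture_time + transit_delay
    return (display_phase - arrival) % period

def locked_capture_phase(transit_delay, display_phase, period):
    """Capture phase at which frames arrive exactly on a display refresh."""
    return (display_phase - transit_delay) % period
```

For example, with a 20 ms frame period and a 45 ms transit delay, a frame captured at time 0 waits a further 15 ms for the next refresh, whereas a camera locked to a capture phase of 15 ms delivers frames that are displayed with no additional wait.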
- FIG. 1 shows an exemplary videoconferencing system in accordance with the present invention;
- FIG. 2 shows an exemplary conferencing station;
- FIG. 3 is an exemplary block diagram illustrating the processing unit of FIG. 2 in greater detail;
- FIG. 4 is an exemplary block diagram illustrating components in the video processing engine of FIG. 3;
- FIG. 5 is an exemplary section (or view) configuration in accordance with the present invention;
- FIG. 6 is a flowchart illustrating an exemplary process for transmitting audio in a videoconferencing system;
- FIG. 7 is a flowchart illustrating an exemplary process for transmitting high resolution images in a videoconferencing system; and
- FIG. 8 is a flowchart illustrating an exemplary process for transmitting a video signal in a videoconferencing system.
- FIG. 1 shows an exemplary videoconferencing system 100 in accordance with the present invention. The videoconferencing system 100 includes a first conferencing station 102 and a second conferencing station 104. The first conferencing station 102 includes an audio input/output device 106, a video display 108 and a video camera (or video sensor) 110. Similarly, the second conferencing station 104 includes an audio input/output device 112, a video display 114 and a video camera (or a video sensor) 116. The first conferencing station 102 communicates with the second conferencing station 104 through a communication channel 118. The communication channel 118 can be the Internet, a LAN, a WAN, or any other type of network communication means. Although FIG. 1 only shows two conferencing stations 102 and 104, additional conferencing stations may be coupled to the videoconferencing system 100.
- FIG. 2 shows an exemplary conferencing station 200, similar to the conferencing stations 102 and 104 of FIG. 1. The conferencing station 200 includes a display 202, a high resolution conferencing bar 204, and a video processing unit 206. Preferably, the display 202 is a High Definition (“HD”) monitor having a relatively large-size flat screen 208 with a 16:9 viewable area. Alternatively, other view area proportions and other types of displays 202 are contemplated and may be used.
- Preferably, the high resolution video conferencing bar 204 contains multiple speakers 210a to 210d, a video sensor (e.g., a high resolution digital video image sensor such as a CMOS video sensor) 212, and a plurality of microphones 214. The speakers 210a to 210d preferably operate at frequencies above 250 Hz. However, the speakers 210a to 210d may operate at any other frequency compatible with various embodiments of the present invention. In one embodiment, the conferencing bar 204 is approximately 36 inches wide by 2 inches high by 4 inches deep, although the conferencing bar 204 may have any other dimensions. Typically, the conferencing bar 204 is designed to sit atop the display 202 with a front portion 218 extending slightly below a front edge of the display 202. The positioning of the conferencing bar 204 brings the speakers 210a to 210d, the video sensor 212, and the plurality of microphones 214 closer to the screen 208, and provides a positioning reference at the front edge of the display 202. Other conferencing bar 204 positions may be utilized in keeping with the scope and objects of the present invention. Further, although only four speakers are shown in FIG. 2, more or fewer speakers may be utilized in the present invention.
- The video sensor 212 has the capability to output multiple images in real-time at a preferred resolution of 720i (i.e., 1280×720 interlaced at 60 fields per second) or higher, although other resolutions are contemplated by the present invention. The resolution of the video sensor 212 is sufficient, based on approximately a 65 degree field of view, to capture an entire conferencing site. For a wider field of view (such as a 90 degree field of view), a limited horizontal pan motor may be provided. Providing this limited horizontal pan motor avoids a costly and complicated full mechanical pan/tilt/zoom camera and lens system. Further, a pure digital zoom may be provided with a fixed lens to accommodate up to an 8× or higher effective zoom while maintaining a minimum Full CIF (352×288) resolution image.
- The plurality of microphones 214 are located on both sides of the video sensor 212 on the conferencing bar 204, and can be arranged in an n-fire configuration, as shown in FIG. 2, which provides a better forward directional feature. A vertical microphone array can be optionally arranged along a side of the display 202 to provide vertical positioning references.
- The conferencing bar 204 is coupled to the processing unit 206 via a high speed digital link 205. The processing unit 206 may contain a sub-woofer device that, preferably, operates from 250 Hz down to 50-100 Hz frequencies. The processing unit 206 will be discussed in more detail in connection with FIG. 3. Although the processing unit 206 is shown as being separate from the conferencing bar 204, alternatively, the processing unit 206 may be encompassed within the conferencing bar 204.
- Because conference participants may not feel comfortable in view of, or seeing the movement of, the video sensor 212, a smoked glass or similar covering can be installed in front of the video sensor 212 and/or other portions of the conferencing bar 204 so that the conference participants cannot view the video sensor 212, and/or the speakers 210a to 210d and the plurality of microphones 214.
- FIG. 3 is an exemplary block diagram illustrating the processing unit 206 of FIG. 2 in greater detail in accordance with one embodiment of the present invention. The processing unit 206 preferably includes a processing engine 302, a communication interface 304, and a sub-woofer device 306. The processing engine 302 further comprises a phase synchronization engine 308, a video processing engine 310, and an audio processing engine 312. The phase synchronization engine 308 is able to reduce or minimize negative impact caused by transmission delay. Specifically, the video camera 110 (FIG. 1) at the local (or first) conferencing station 102 (FIG. 1) has an arbitrary phase relative to a video display output at a remote (or second) conferencing station 104 (FIG. 1). Thus, the video display output at the remote conferencing station 104 may be out of phase with the video camera 110 located at the local conferencing station 102.
- Further, in transmitting a source video signal from the local conferencing station 102 to the remote conferencing station 104, there is a transmission delay between a time when a source video signal is being generated at the local conferencing station 102 and a time when the source video signal is displayed at the remote conferencing station 104. The transmission delay cannot be compensated for when the video display output at the remote conferencing station 104 is out of phase with the video camera 110 located at the local conferencing station 102. As a result, the transmission delay is added to the video display output at the remote conferencing station 104, which may generate a negative effect in an interactive video conference. For example, when a user at the local conferencing station 102 starts to speak after a pause, participants at the remote conferencing station 104 may still see the user in pause due to the transmission delay. If any of the participants at the remote conferencing station 104 interrupts the user at this moment, the remote participant and the user will talk over each other.
- Advantageously, the present invention synchronizes the phase between the video camera 110 located at the local conferencing station 102 and the video display output at the remote conferencing station 104 so that the transmission delay can be compensated for or reduced in the video display output. Specifically, during a video conference, the video camera 110 at the local conferencing station 102 moves at a certain frequency and speed which causes phase shifting relative to the video display output at the remote conferencing station 104. The movement of the video camera 110 at the local conferencing station 102 can be measured and used as a reference to synchronize the phase between the video camera 110 and the video display output. The phase synchronization engine 308 includes a memory device 314 for storing a phase synchronization module for performing the phase synchronization or locking function.
- In operation, to transmit a source video signal, the video processing engine 310 first receives a high resolution image from the video sensor 212 (or video camera 110) and stores the image into a video memory (not shown). The video processing engine 310 then, preferably, defines two image sections (views) within the high resolution image stored in the video memory, and generates two respective video streams for the two image sections (views). Alternatively, more or fewer image sections and corresponding video streams are contemplated. The video processing engine 310 then sends the two video streams to the communication interface 304. Conversely, to display a remote video signal from a remote site, the video processing engine 310 receives at least two video streams (i.e., Video Streams A and B) from the communication interface 304. The video processing engine 310 then processes the video streams A and B and displays two image views on the screen 208 for the two video streams A and B, respectively.
- To transmit a source audio signal, each of the plurality of microphones 214 (FIG. 2) in the conferencing bar 204 receives a sound from an acoustic source (e.g., from a speaking participant) and converts the received sound to an electric or current signal. Because the plurality of microphones 214 are located at different positions in reference to the conferencing bar 204 and the acoustic source, the electric or current signals in the plurality of microphones 214 have different magnitudes. The magnitude differences in the electric or current signals indicate a position of the acoustic source. Upon receiving the electric or current signals from the plurality of microphones 214, the audio processing engine 312 generates an audio signal and a position signal. The position signal may contain information indicating a speaker's position relative to the conferencing bar 204. If the position of the acoustic source changes, the audio processing engine 312 generates a new position signal to reflect the position change. The audio processing engine 312 then sends the audio and position signals to the communication interface 304.
- Conversely, to play a remote audio signal from a remote site, the audio processing engine 312 first receives the audio signal and position signal from the communication interface 304. The audio processing engine 312 then drives one or more of the speakers 210a to 210d (FIG. 2) in the conferencing bar 204 according to the position signal, while the video processing engine 310 is displaying one or more views of an image on the screen 208. The speakers 210a to 210d in the conferencing bar 204 are selected based on the position of the speaking participant displayed on the screen 208. Because the screen 208 has a relatively large size, the present invention improves the video conference by making it appear as if the sound is coming from the location of the speaking participant. It should be noted that the speakers 210a to 210d in the speaker array of the conferencing bar 204 operate, typically, at frequencies above 250 Hz, because the sounds within this frequency range have directional characteristics. Consequently, the sub-woofer device 306 (FIG. 3) installed within the video processing unit 206 operates, preferably, at frequencies from 250 Hz down to 50-100 Hz, because the sounds within this frequency range are not directional. Although the present invention is described as including the sub-woofer device 306, those skilled in the art will recognize that the sub-woofer device 306 is not required for operation and function of the present invention. Those skilled in the art will also recognize that any frequency range of acoustics may be utilized in the present invention. For example, lower frequencies may be used for the speakers 210a to 210d in the speaker array of the conferencing bar 204.
- The communication interface 304 includes a transceiver device 316 and a communication processing engine 318. The transmission of a communication signal containing an audio signal, a position signal, and two video streams A and B requires the communication processing engine 318 to receive the audio and position signals from the audio processing engine 312 and the two video streams A and B from the video processing engine 310. Subsequently, the communication processing engine 318 encodes and compresses this communication signal and sends it to the transceiver device 316. Upon receiving the communication signal, the transceiver device 316 forwards the communication signal to a remote site through the communication channel 118.
- Conversely, to receive a communication signal containing an audio signal, a position signal, and two video streams A and B, the transceiver device 316 receives the communication signal from the communication channel 118 and forwards the communication signal to the communication processing engine 318. The communication processing engine 318 then decompresses and decodes the communication signal to recover the audio signal, position signal, and two video data streams.
- FIG. 4 is an exemplary block diagram illustrating components of the video processing engine 310 of FIG. 3. The video processing engine 310 includes a data loading engine 402 coupled to the video sensor 212 (FIG. 2), a video memory 404, and an FPGA/ASIC 406. The data loading engine 402 receives video image data from the video sensor 212 and stores it into the video memory 404, while the FPGA/ASIC 406 controls the data loading engine 402 and the video memory 404. Because the video sensor 212 is, preferably, a high resolution digital image sensor, the video sensor 212 can generate a large amount of image data. For example, with a 3,000×2,000 resolution, the video sensor 212 generates 6,000,000 pixels for an image. To increase input bandwidth, the data loading engine 402, preferably, has six parallel data channels 1-6. The FPGA/ASIC 406 is programmed to feed entire image pixels to the video memory 404 through these six parallel data channels 1-6. The FPGA/ASIC 406 is also programmed to define at least two image sections (views) over the image stored in the video memory 404 with selectable resolutions, and to produce two video streams for the two image sections (views), respectively. Although the present embodiment contemplates utilizing six data channels, any number of data channels may be used by the present invention. Further, any number of image sections and corresponding video streams may be utilized in the present invention.
- FIG. 5 is an exemplary image section (or view) configuration in accordance with one embodiment of the present invention defined by the FPGA/ASIC 406 (FIG. 4) and viewed on the display 202 (FIG. 2). In FIG. 5, a large section A 502 defines an entire view of an image having a 700×400 resolution, while a small section B 504 defines a view having a 300×200 resolution in which a speaking participant from a remote conferencing station is displayed. Based on the image stored in the video memory 404 (FIG. 4), the FPGA/ASIC 406 scales the entire image down to a 700×400 resolution image to produce the video stream A (FIG. 3) for the large section A 502. Subsequently, the FPGA/ASIC 406 scales the section B 504 image down to 300×200 resolution to produce the video stream B (FIG. 3). Because the image stored in the video memory 404 has a relatively high resolution, the two scaled images still present good resolution quality. Those skilled in the art will recognize that other resolutions may be utilized in the present invention.
- Advantageously, the present invention has the ability to generate a whole image of a conferencing site while zooming a view from any arbitrary section of the whole image. Further, because at least two video streams are produced for an image, it is possible to transmit a wide angle high resolution image including all participants at a conferencing site (e.g., section A 502) along with an inset zoomed view (e.g., section B 504) showing a particular speaking participant. Alternatively, more or fewer streams may be produced from a single image and consequently more or fewer views displayed. Therefore, the present invention can be used to replace conventional mechanical pan/tilt/zoom cameras.
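The derivation of a whole-scene stream and a zoomed inset stream from one stored frame can be sketched in software. This is a minimal illustration only: the disclosure performs scaling in an FPGA/ASIC and does not state the scaling method, so the nearest-neighbour sampling, the function names, and the `(left, top, width, height)` section format are assumptions. The default output sizes follow the 700×400 and 300×200 figures given above.

```python
def crop(image, left, top, width, height):
    """Return the sub-image covering the given rectangle (rows of pixels)."""
    return [row[left:left + width] for row in image[top:top + height]]

def scale_nearest(image, out_w, out_h):
    """Resize an image by nearest-neighbour sampling (assumed method)."""
    in_h, in_w = len(image), len(image[0])
    return [
        [image[y * in_h // out_h][x * in_w // out_w] for x in range(out_w)]
        for y in range(out_h)
    ]

def make_streams(frame, section_b, size_a=(700, 400), size_b=(300, 200)):
    """Produce one frame each for stream A (whole view) and stream B (inset).

    section_b is a hypothetical (left, top, width, height) rectangle
    selecting the zoomed view within the stored frame.
    """
    frame_a = scale_nearest(frame, *size_a)
    frame_b = scale_nearest(crop(frame, *section_b), *size_b)
    return frame_a, frame_b
```

Moving `section_b` around the stored frame changes the inset view without any mechanical camera motion, which is the property the paragraph above relies on.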
- With current technology, a typical CMOS video sensor can effectively provide an approximately 65 degree view angle. In practice, a 90 degree view angle may be required. Therefore, a small, inexpensive pan motor can be used to move the CMOS video sensor in the horizontal direction. However, because the movement and the resulting noise of the CMOS video sensor are relatively small, such movement and resulting noise are hardly noticeable to the conferencing participants. With the development of technology, the CMOS video sensor may be able to provide a cost effective 90 degree view angle.
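The size of the pan movement implied by these figures follows from simple arithmetic. The sketch below assumes the sensor is centred on the scene and pans symmetrically; the function name is hypothetical.

```python
def pan_half_range(required_fov_deg, sensor_fov_deg):
    """Degrees the sensor must pan to each side of centre so that its
    field of view sweeps across the required field of view."""
    return max(0.0, (required_fov_deg - sensor_fov_deg) / 2.0)
```

For a 65 degree sensor and a 90 degree requirement this gives 12.5 degrees to each side, which is consistent with the description of the pan movement as relatively small.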
- In FIG. 6, an exemplary flowchart 600 illustrating a process for transmitting audio data in a videoconferencing system is shown. At step 610, an audio signal is generated at a transmitting station of a first site by the plurality of microphones 214 (FIG. 2) in response to an acoustic source by converting the received sound into an electric or current signal. Next, a position signal is generated at step 620 that indicates a position of the acoustic source. Depending upon the position of the acoustic source from the transmitting station, the current signal will have a particular magnitude. The audio processing engine 312 (FIG. 3) determines the position signal based on the magnitude of the current signal. The audio and position signals are then transmitted to the communication interface 304 (FIG. 3) and then processed at step 630 by the communications processing engine 318 (FIG. 3). This processing can include compressing and encoding the audio and position signals for transmission. The audio and position signals are then transmitted through a communication channel such as the Internet, a LAN, a WAN, or any other type of network communication means at step 640 by a transceiver device. In step 650, a transceiver device at a receiving station of a second site receives the audio and position signals. A communications processing engine processes the audio and position signals at step 660, which may include decompressing and decoding the audio and position signals for playback. Subsequently, at step 670, based on the position signal, one or more speakers at the receiving station are driven to play the audio signal. The position signal generated by the audio processing engine creates a more realistic video conference situation because the playback of the audio signal on one of the speakers makes it appear as if the audio signal is coming from a location of the acoustic source. The system then determines whether more video conferencing is occurring in step 680. If the conference continues, the system repeats steps 610 through 670.
- In FIG. 7, an exemplary flowchart 700 illustrating a process for transmitting high resolution images in a videoconferencing system is shown. At step 710, a video camera or video sensor captures a high resolution image. The high resolution image is then loaded and stored from the video camera or video sensor to a video memory. Next, the images are converted to video streams in step 720. Within the high resolution image stored in the video memory, a first and a second image section are initially defined by the transmitting station video processing engine. Subsequently, the first and second image sections are scaled to a first video stream having a first resolution and a second video stream having a second resolution. Scaling is implemented by the FPGA/ASIC 406 (FIG. 4) of the video processing engine 310 (FIG. 3), which scales the first image section to a first video stream having a 700×400 resolution and scales the second image section to a second video stream having a 300×200 resolution. Those skilled in the art will recognize that other resolutions may be utilized in the present invention, and that more or fewer than two image sections, and subsequently more or fewer than two video streams, can also be utilized.
- At step 730, the video streams are processed by a transmitting station communication processing engine. This processing can include encoding and compressing of the streams for transmission. Typically, the video streams are encoded and compressed to allow for faster transmission of the video data. Next, the processed video streams are sent to a receiving station through a communication channel in step 740. The communication channel may be any packet-switched network, a circuit-switched network (such as an Asynchronous Transfer Mode (“ATM”) network), or any other network for carrying data including the well-known Internet. The communication channel may also be the Internet, an extranet, a local area network, or other networks known in the art. The video streams are then decoded and decompressed by the receiving station video processing engine and displayed on a video display of the receiving station at step 750. The system then determines whether more video conferencing is occurring in step 760. If the conference continues, the system repeats steps 710 through 750. Although the transmission of audio, position, and video data are described in separate flowcharts and methods, the present invention contemplates the simultaneous or near simultaneous transmission of these data.
- In FIG. 8, an exemplary flowchart 800 illustrating an alternative process for transmitting a video signal in a videoconferencing system is shown. At step 810, a video camera or video sensor captures a video image. Next, the video signal is processed by a transmitting station communication engine at step 820. This processing can include encoding and compressing the video signal. Typically, the video streams are encoded and compressed to allow for faster transmission of the video data. At step 830, a phase synchronization engine synchronizes a phase between the video camera and a video display output. The synchronizing of the phase between the video camera and the video display output allows for a minimization of a negative impact that can be caused by transmission delay. Specifically, if the video camera is out of phase with the video display output, participants at a receiving station may still see a user in pause at the transmitting station, even after the user at the transmitting station has begun to speak again.
- Next, the video signal is transmitted to the receiving station at step 840 via a communication channel. The communication channel may be any packet-switched network, a circuit-switched network (such as an Asynchronous Transfer Mode (“ATM”) network), or any other network for carrying data including the well-known Internet. The communication channel may also be the Internet, an extranet, a local area network, or other networks known in the art. Subsequently, at step 850, the video signal is processed for display on the video display output by a receiving station communication processing engine. This processing can include decoding and decompressing the video signal. The video display output is generated in response to the decoded and decompressed video signal and displayed on a receiving station video display. The system then determines whether more video conferencing is occurring in step 860. If the conference continues, the system repeats steps 810 through 850.
- The invention has been described with reference to exemplary embodiments. Those skilled in the art will recognize that various features disclosed in connection with the embodiments may be used either individually or jointly, and that various modifications may be made and other embodiments can be used without departing from the broader scope of the invention. For example, it is to be appreciated that while the positioning apparatus of the present invention has been described with reference to a preferred implementation, those having ordinary skill in the art will recognize that the present invention may be beneficially utilized in any number of environments and implementations. Accordingly, the claims set forth below should be construed in view of the full breadth and spirit of the invention as disclosed herein.
Claims (22)
1. A videoconferencing device comprising:
a video sensor for capturing an image;
a plurality of microphones for generating an audio signal in response to an acoustic source; and
a processing engine coupled to the video sensor and the plurality of microphones for generating at least one video stream and a position signal indicating a position of the acoustic source.
2. The videoconferencing device of claim 1, further comprising a phase synchronization engine coupled to the video sensor for synchronizing a phase between the video sensor and a video display output.
3. The videoconferencing device of claim 1, further comprising a communication interface coupled to the processing engine for transmitting the audio signal, position signal, and at least one video stream to a remote videoconferencing device.
4. The videoconferencing device of claim 1, wherein the position signal is generated based upon magnitude differences of electric or current signals received from the plurality of microphones.
5. The videoconferencing system of claim 1, wherein the processing engine further comprises a video processing engine, the video processing engine defining a plurality of image sections and generating a respective plurality of video streams corresponding to the plurality of image sections.
6. The videoconferencing system of claim 1, wherein if the position of the sound source changes, the processing engine generates a new position signal to reflect a position change.
7. The videoconferencing device of claim 2, wherein the remote videoconferencing device selectively drives one or more speakers in response to the position signal to play the audio signal corresponding to the image of the at least one video stream.
8. The videoconferencing device of claim 1, wherein the plurality of microphones are arranged in an n-fire configuration.
9. The videoconferencing device of claim 1, wherein the plurality of microphones are arranged in a vertical array.
10. The videoconferencing device of claim 5, wherein the processing engine scales a first image section of the plurality of image sections into a first video stream having a first resolution and scales a second image section of the plurality of image sections into a second video stream having a second resolution.
11. The videoconferencing system of claim 1, further comprising a pan motor coupled to the video sensor for providing a larger degree view angle.
12. A method for transmitting conferencing data in a video conferencing system, comprising:
capturing an image with a video sensor and generating at least one video stream from the image;
capturing audio data with a plurality of microphones and generating an audio signal;
generating a position signal indicating a position of an acoustic source based upon magnitude differences of the audio data; and
transmitting the position signal, audio signals, and the at least one video stream via a communication channel.
13. The method of claim 12, further comprising selectively driving one or more speakers of a remote video conferencing system in response to the position signal to play the audio signal corresponding to the image of the at least one video stream.
14. The method of claim 12, further comprising synchronizing a phase between the video sensor and a video display output.
15. The method of claim 12, further comprising defining a plurality of image sections and generating a respective plurality of video streams corresponding to the plurality of image sections.
16. The method of claim 12, further comprising generating a new position signal to reflect a position change.
17. The method of claim 14, further comprising scaling a first image section of the plurality of image sections into a first video stream having a first resolution and scaling a second image section of the plurality of image sections into a second video stream having a second resolution.
18. A videoconferencing device comprising:
means for capturing an image and generating at least one video stream from the image;
means for capturing audio and generating an audio signal;
means for generating a position signal indicating a position of an acoustic source based upon magnitude differences of the audio data, the position signal selectively driving one or more speakers of a remote videoconferencing system in response to the position signal to play the audio signal corresponding to the image of the at least one video stream; and
means for transmitting the position signal, audio signals, and the at least one video stream via a communication channel.
19. An electronically-readable medium having embodied thereon a program, the program being executable by a machine to perform method steps for transmitting conferencing data, the method steps comprising:
capturing an image with a video sensor and generating at least one video stream from the image;
capturing audio data with a plurality of microphones and generating an audio signal;
generating a position signal indicating a position of an acoustic source based upon magnitude differences of the audio data; and
transmitting the position signal, audio signals, and the at least one video stream via a communication channel.
20. The electronically-readable medium of claim 19, wherein the method steps further comprise selectively driving one or more speakers of a remote videoconferencing system in response to the position signal to play the audio signal corresponding to the image of the at least one video stream.
21. The electronically-readable medium of claim 19, wherein the method steps further comprise defining a plurality of image sections and generating a respective plurality of video streams corresponding to the plurality of image sections.
22. The electronically-readable medium of claim 19, wherein the method steps further comprise scaling a first image section of the plurality of image sections into a first video stream having a first resolution and scaling a second image section of the plurality of image sections into a second video stream having a second resolution.
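As a non-authoritative illustration of the position signal recited in the claims above, the following Python sketch derives a position from magnitude differences across several microphones and then selects which remote speaker to drive. The claims do not specify a formula, so everything here is an assumption: the position is taken as the RMS-magnitude-weighted mean of the microphone positions on a normalized horizontal axis, and the speaker nearest that position is driven at full gain. The names `rms`, `position_signal`, and `speaker_gains` are hypothetical.

```python
import math


def rms(samples):
    """Root-mean-square magnitude of one microphone's audio samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def position_signal(mic_samples, mic_positions):
    """Assumed estimator: weight each microphone position by its signal
    magnitude, so a louder microphone pulls the estimate toward itself."""
    magnitudes = [rms(samples) for samples in mic_samples]
    total = sum(magnitudes)
    if total == 0:
        return None  # silence: no acoustic source to locate
    return sum(m * p for m, p in zip(magnitudes, mic_positions)) / total


def speaker_gains(position, speaker_positions):
    """Selectively drive the speaker closest to the reported position."""
    nearest = min(range(len(speaker_positions)),
                  key=lambda i: abs(speaker_positions[i] - position))
    return [1.0 if i == nearest else 0.0
            for i in range(len(speaker_positions))]


# Two microphones at the left (-1.0) and right (+1.0) edges of the scene;
# the left microphone picks up the talker more strongly.
left = [0.8, -0.8, 0.8, -0.8]
right = [0.2, -0.2, 0.2, -0.2]
pos = position_signal([left, right], [-1.0, 1.0])
gains = speaker_gains(pos, [-1.0, 1.0])
```

In this example the estimate falls left of center, so only the left speaker is driven. Recomputing `position_signal` on each new block of audio would yield a new position signal whenever the source moves.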
Priority Applications (4)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US10/214,976 US20030048353A1 (en) | 2001-08-07 | 2002-08-07 | System and method for high resolution videoconferencing |
US10/753,139 US20050042211A1 (en) | 2001-08-07 | 2004-01-07 | System and method for high resolution videoconferencing |
US10/814,364 US20040183897A1 (en) | 2001-08-07 | 2004-03-31 | System and method for high resolution videoconferencing |
US12/349,409 US8077194B2 (en) | 2001-08-07 | 2009-01-06 | System and method for high resolution videoconferencing |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US31074201P | 2001-08-07 | 2001-08-07 | |
US10/214,976 US20030048353A1 (en) | 2001-08-07 | 2002-08-07 | System and method for high resolution videoconferencing |
Related Child Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US10/753,139 Continuation US20050042211A1 (en) | 2001-08-07 | 2004-01-07 | System and method for high resolution videoconferencing |
US10/814,364 Continuation US20040183897A1 (en) | 2001-08-07 | 2004-03-31 | System and method for high resolution videoconferencing |
Publications (1)
Publication Number | Publication Date |
---|---|
US20030048353A1 (en) | 2003-03-13 |
Family ID=23203909
Family Applications (4)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US10/214,976 Abandoned US20030048353A1 (en) | 2001-08-07 | 2002-08-07 | System and method for high resolution videoconferencing |
US10/753,139 Abandoned US20050042211A1 (en) | 2001-08-07 | 2004-01-07 | System and method for high resolution videoconferencing |
US10/814,364 Abandoned US20040183897A1 (en) | 2001-08-07 | 2004-03-31 | System and method for high resolution videoconferencing |
US12/349,409 Expired - Lifetime US8077194B2 (en) | 2001-08-07 | 2009-01-06 | System and method for high resolution videoconferencing |
Family Applications After (3)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US10/753,139 Abandoned US20050042211A1 (en) | 2001-08-07 | 2004-01-07 | System and method for high resolution videoconferencing |
US10/814,364 Abandoned US20040183897A1 (en) | 2001-08-07 | 2004-03-31 | System and method for high resolution videoconferencing |
US12/349,409 Expired - Lifetime US8077194B2 (en) | 2001-08-07 | 2009-01-06 | System and method for high resolution videoconferencing |
Country Status (4)
Country | Link |
---|---|
US (4) | US20030048353A1 (en) |
EP (1) | EP1425909A4 (en) |
JP (1) | JP2004538724A (en) |
WO (1) | WO2003015407A1 (en) |
Cited By (44)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20020188731A1 (en) * | 2001-05-10 | 2002-12-12 | Sergey Potekhin | Control unit for multipoint multimedia/audio system |
US20040022272A1 (en) * | 2002-03-01 | 2004-02-05 | Jeffrey Rodman | System and method for communication channel and device control via an existing audio channel |
US6812956B2 (en) * | 2001-12-21 | 2004-11-02 | Applied Minds, Inc. | Method and apparatus for selection of signals in a teleconference |
US20040218099A1 (en) * | 2003-03-20 | 2004-11-04 | Washington Richard G. | Systems and methods for multi-stream image processing |
US20050212908A1 (en) * | 2001-12-31 | 2005-09-29 | Polycom, Inc. | Method and apparatus for combining speakerphone and video conference unit operations |
US20050213729A1 (en) * | 2000-12-26 | 2005-09-29 | Polycom,Inc. | Speakerphone using a secure audio connection to initiate a second secure connection |
US20050213725A1 (en) * | 2001-12-31 | 2005-09-29 | Polycom, Inc. | Speakerphone transmitting control information embedded in audio information through a conference bridge |
US20050213732A1 (en) * | 2001-12-31 | 2005-09-29 | Polycom, Inc. | Conference bridge which decodes and responds to control information embedded in audio information |
US20050213735A1 (en) * | 2000-12-26 | 2005-09-29 | Polycom, Inc. | Speakerphone transmitting URL information to a remote device |
US20050213739A1 (en) * | 2001-05-10 | 2005-09-29 | Polycom, Inc. | Conference endpoint controlling functions of a remote device |
US20050213727A1 (en) * | 2001-05-10 | 2005-09-29 | Polycom, Inc. | Speakerphone and conference bridge which request and perform polling operations |
US20050213730A1 (en) * | 2000-12-26 | 2005-09-29 | Polycom, Inc. | Conference endpoint instructing conference bridge to dial phone number |
US20050231586A1 (en) * | 2004-04-16 | 2005-10-20 | Jeffrey Rodman | Conference link between a speakerphone and a video conference unit |
US20050248652A1 (en) * | 2003-10-08 | 2005-11-10 | Cisco Technology, Inc., A California Corporation | System and method for performing distributed video conferencing |
US20050280701A1 (en) * | 2004-06-14 | 2005-12-22 | Wardell Patrick J | Method and system for associating positional audio to positional video |
US20060092269A1 (en) * | 2003-10-08 | 2006-05-04 | Cisco Technology, Inc. | Dynamically switched and static multiple video streams for a multimedia conference |
US20060277254A1 (en) * | 2005-05-02 | 2006-12-07 | Kenoyer Michael L | Multi-component videoconferencing system |
US20070024706A1 (en) * | 2005-08-01 | 2007-02-01 | Brannon Robert H Jr | Systems and methods for providing high-resolution regions-of-interest |
US20070024705A1 (en) * | 2005-08-01 | 2007-02-01 | Richter Roger K | Systems and methods for video stream selection |
US20070140456A1 (en) * | 2001-12-31 | 2007-06-21 | Polycom, Inc. | Method and apparatus for wideband conferencing |
US20070156924A1 (en) * | 2006-01-03 | 2007-07-05 | Cisco Technology, Inc. | Method and apparatus for transcoding and transrating in distributed video systems |
WO2008014697A1 (en) | 2006-07-25 | 2008-02-07 | Huawei Technologies Co., Ltd. | A method and an apparatus for obtaining acoustic source location information and a multimedia communication system |
WO2009043275A1 (en) * | 2007-09-28 | 2009-04-09 | Shenzhen Huawei Telecommunication Technologies Co., Ltd. | A method and a system of video communication and a device for video communication |
US7742588B2 (en) | 2001-12-31 | 2010-06-22 | Polycom, Inc. | Speakerphone establishing and using a second connection of graphics information |
US7796565B2 (en) | 2005-06-08 | 2010-09-14 | Polycom, Inc. | Mixed voice and spread spectrum data signaling with multiplexing multiple users with CDMA |
US7978838B2 (en) | 2001-12-31 | 2011-07-12 | Polycom, Inc. | Conference endpoint instructing conference bridge to mute participants |
US8102984B2 (en) | 2001-12-31 | 2012-01-24 | Polycom Inc. | Speakerphone and conference bridge which receive and provide participant monitoring information |
US8126029B2 (en) | 2005-06-08 | 2012-02-28 | Polycom, Inc. | Voice interference correction for mixed voice and spread spectrum data signaling |
US8144854B2 (en) | 2001-12-31 | 2012-03-27 | Polycom Inc. | Conference bridge which detects control information embedded in audio information to prioritize operations |
US8199791B2 (en) | 2005-06-08 | 2012-06-12 | Polycom, Inc. | Mixed voice and spread spectrum data signaling with enhanced concealment of data |
US8223942B2 (en) | 2001-12-31 | 2012-07-17 | Polycom, Inc. | Conference endpoint requesting and receiving billing information from a conference bridge |
US8705719B2 (en) | 2001-12-31 | 2014-04-22 | Polycom, Inc. | Speakerphone and conference bridge which receive and provide participant monitoring information |
WO2014200629A1 (en) * | 2013-06-11 | 2014-12-18 | New Vad, Llc | System and method for pc-based video conferencing and audio/video presentation |
EP2819400A1 (en) * | 2013-06-27 | 2014-12-31 | Samsung Electronics Co., Ltd. | Display apparatus and method for providing stereophonic sound service |
US8934381B2 (en) | 2001-12-31 | 2015-01-13 | Polycom, Inc. | Conference endpoint instructing a remote device to establish a new connection |
US8948059B2 (en) | 2000-12-26 | 2015-02-03 | Polycom, Inc. | Conference endpoint controlling audio volume of a remote device |
US8977683B2 (en) | 2000-12-26 | 2015-03-10 | Polycom, Inc. | Speakerphone transmitting password information to a remote device |
US20150378566A1 (en) * | 2014-06-27 | 2015-12-31 | Alcatel Lucent | Method, system and device for navigating in ultra high resolution video content by a client device |
US11601731B1 (en) * | 2022-08-25 | 2023-03-07 | Benjamin Slotznick | Computer program product and method for auto-focusing a camera on an in-person attendee who is speaking into a microphone at a hybrid meeting that is being streamed via a videoconferencing system to remote attendees |
US20230315380A1 (en) * | 2011-07-28 | 2023-10-05 | Apple Inc. | Devices with enhanced audio |
US11877058B1 (en) | 2022-08-25 | 2024-01-16 | Benjamin Slotznick | Computer program product and automated method for auto-focusing a camera on a person in a venue who is wearing, or carrying, or holding, or speaking into a microphone at the venue |
US11889188B1 (en) | 2022-08-25 | 2024-01-30 | Benjamin Slotznick | Computer program product and method for auto-focusing one or more cameras on selected persons in a venue who are performers of a performance occurring at the venue |
US11889187B1 (en) | 2022-08-25 | 2024-01-30 | Benjamin Slotznick | Computer program product and method for auto-focusing one or more lighting fixtures on selected persons in a venue who are performers of a performance occurring at the venue |
US11902659B1 (en) | 2022-08-25 | 2024-02-13 | Benjamin Slotznick | Computer program product and method for auto-focusing a lighting fixture on a person in a venue who is wearing, or carrying, or holding, or speaking into a microphone at the venue |
Families Citing this family (52)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
AU2002325878A1 (en) * | 2002-07-10 | 2004-02-02 | Telefonaktiebolaget Lm Ericsson (Publ) | Synchronous data transfer system for time-sensitive data in packet-switched networks |
US7525584B2 (en) * | 2004-01-05 | 2009-04-28 | Lifesize Communications, Inc. | Fast edge directed demosaicing |
US20060047749A1 (en) * | 2004-08-31 | 2006-03-02 | Robert Davis | Digital links for multi-media network conferencing |
US20060106929A1 (en) * | 2004-10-15 | 2006-05-18 | Kenoyer Michael L | Network conference communications |
US7473040B2 (en) * | 2004-10-15 | 2009-01-06 | Lifesize Communications, Inc. | High definition camera pan tilt mechanism |
US8149739B2 (en) * | 2004-10-15 | 2012-04-03 | Lifesize Communications, Inc. | Background call validation |
US7572073B2 (en) * | 2004-10-15 | 2009-08-11 | Lifesize Communications, Inc. | Camera support mechanism |
US8116500B2 (en) * | 2004-10-15 | 2012-02-14 | Lifesize Communications, Inc. | Microphone orientation and size in a speakerphone |
US7826624B2 (en) * | 2004-10-15 | 2010-11-02 | Lifesize Communications, Inc. | Speakerphone self calibration and beam forming |
US7667728B2 (en) * | 2004-10-15 | 2010-02-23 | Lifesize Communications, Inc. | Video and audio conferencing system with spatial audio |
US8054336B2 (en) * | 2004-10-15 | 2011-11-08 | Lifesize Communications, Inc. | High definition pan tilt zoom camera with embedded microphones and thin cable for data and power |
US7903137B2 (en) * | 2004-10-15 | 2011-03-08 | Lifesize Communications, Inc. | Videoconferencing echo cancellers |
US7760887B2 (en) * | 2004-10-15 | 2010-07-20 | Lifesize Communications, Inc. | Updating modeling information based on online data gathering |
US8477173B2 (en) * | 2004-10-15 | 2013-07-02 | Lifesize Communications, Inc. | High definition videoconferencing system |
US7720232B2 (en) * | 2004-10-15 | 2010-05-18 | Lifesize Communications, Inc. | Speakerphone |
US20060132595A1 (en) * | 2004-10-15 | 2006-06-22 | Kenoyer Michael L | Speakerphone supporting video and audio features |
US7545435B2 (en) * | 2004-10-15 | 2009-06-09 | Lifesize Communications, Inc. | Automatic backlight compensation and exposure control |
US7717629B2 (en) * | 2004-10-15 | 2010-05-18 | Lifesize Communications, Inc. | Coordinated camera pan tilt mechanism |
US7692683B2 (en) * | 2004-10-15 | 2010-04-06 | Lifesize Communications, Inc. | Video conferencing system transcoder |
US7970151B2 (en) * | 2004-10-15 | 2011-06-28 | Lifesize Communications, Inc. | Hybrid beamforming |
US7720236B2 (en) * | 2004-10-15 | 2010-05-18 | Lifesize Communications, Inc. | Updating modeling information based on offline calibration experiments |
US7864221B2 (en) * | 2004-10-15 | 2011-01-04 | Lifesize Communications, Inc. | White balance for video applications |
US7930409B2 (en) | 2005-02-23 | 2011-04-19 | Aol Inc. | Configuring output on a communication device |
US7593539B2 (en) | 2005-04-29 | 2009-09-22 | Lifesize Communications, Inc. | Microphone and speaker arrangement in speakerphone |
US7970150B2 (en) * | 2005-04-29 | 2011-06-28 | Lifesize Communications, Inc. | Tracking talkers using virtual broadside scan and directed beams |
US7991167B2 (en) * | 2005-04-29 | 2011-08-02 | Lifesize Communications, Inc. | Forming beams with nulls directed at noise sources |
US20070165106A1 (en) * | 2005-05-02 | 2007-07-19 | Groves Randall D | Distributed Videoconferencing Processing |
US20060248210A1 (en) * | 2005-05-02 | 2006-11-02 | Lifesize Communications, Inc. | Controlling video display mode in a video conferencing system |
JP2007019907A (en) | 2005-07-08 | 2007-01-25 | Yamaha Corp | Speech transmission system, and communication conference apparatus |
DE102005057406A1 (en) * | 2005-11-30 | 2007-06-06 | Valenzuela, Carlos Alberto, Dr.-Ing. | Method for recording a sound source with time-variable directional characteristics and for playback and system for carrying out the method |
US8311129B2 (en) * | 2005-12-16 | 2012-11-13 | Lifesize Communications, Inc. | Temporal video filtering |
US7667762B2 (en) * | 2006-08-01 | 2010-02-23 | Lifesize Communications, Inc. | Dual sensor video camera |
US8334891B2 (en) | 2007-03-05 | 2012-12-18 | Cisco Technology, Inc. | Multipoint conference video switching |
US8264521B2 (en) | 2007-04-30 | 2012-09-11 | Cisco Technology, Inc. | Media detection and packet distribution in a multipoint conference |
TWI381733B (en) * | 2007-06-11 | 2013-01-01 | Quanta Comp Inc | High definition video conference system |
US20080316295A1 (en) * | 2007-06-22 | 2008-12-25 | King Keith C | Virtual decoders |
US8139100B2 (en) | 2007-07-13 | 2012-03-20 | Lifesize Communications, Inc. | Virtual multiway scaler compensation |
US9661267B2 (en) * | 2007-09-20 | 2017-05-23 | Lifesize, Inc. | Videoconferencing system discovery |
WO2010024000A1 (en) * | 2008-08-26 | 2010-03-04 | シャープ株式会社 | Image display device and image display device drive method |
US8514265B2 (en) | 2008-10-02 | 2013-08-20 | Lifesize Communications, Inc. | Systems and methods for selecting videoconferencing endpoints for display in a composite video image |
US20100110160A1 (en) * | 2008-10-30 | 2010-05-06 | Brandt Matthew K | Videoconferencing Community with Live Images |
US8456510B2 (en) | 2009-03-04 | 2013-06-04 | Lifesize Communications, Inc. | Virtual distributed multipoint control unit |
US8643695B2 (en) | 2009-03-04 | 2014-02-04 | Lifesize Communications, Inc. | Videoconferencing endpoint extension |
US8305421B2 (en) * | 2009-06-29 | 2012-11-06 | Lifesize Communications, Inc. | Automatic determination of a configuration for a conference |
JP5325745B2 (en) * | 2009-11-02 | 2013-10-23 | 株式会社ソニー・コンピュータエンタテインメント | Moving image processing program, apparatus and method, and imaging apparatus equipped with moving image processing apparatus |
US8350891B2 (en) * | 2009-11-16 | 2013-01-08 | Lifesize Communications, Inc. | Determining a videoconference layout based on numbers of participants |
US8866968B2 (en) * | 2011-03-10 | 2014-10-21 | Panasonic Corporation | Video processing device, and video display system containing same |
US8937638B2 (en) * | 2012-08-10 | 2015-01-20 | Tellybean Oy | Method and apparatus for tracking active subject in video call service |
WO2014130977A1 (en) | 2013-02-25 | 2014-08-28 | Herold Williams | Nonlinear scaling in video conferencing |
US10427040B2 (en) * | 2015-06-03 | 2019-10-01 | Razer (Asia-Pacific) Pte. Ltd. | Haptics devices and methods for controlling a haptics device |
CN112055876A (en) * | 2018-04-27 | 2020-12-08 | 语享路有限责任公司 | Multi-party dialogue recording/outputting method using voice recognition technology and apparatus therefor |
US20240259451A1 (en) * | 2023-01-27 | 2024-08-01 | Zoom Video Communications, Inc. | Isolating videoconference streams |
Family Cites Families (28)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US3618035A (en) * | 1969-04-17 | 1971-11-02 | Bell Telephone Labor Inc | Video-telephone computer graphics system |
US4311874A (en) * | 1979-12-17 | 1982-01-19 | Bell Telephone Laboratories, Incorporated | Teleconference microphone arrays |
US4494144A (en) * | 1982-06-28 | 1985-01-15 | At&T Bell Laboratories | Reduced bandwidth video transmission |
JP3036088B2 (en) * | 1991-01-21 | 2000-04-24 | 日本電信電話株式会社 | Sound signal output method for displaying multiple image windows |
US5280540A (en) * | 1991-10-09 | 1994-01-18 | Bell Communications Research, Inc. | Video teleconferencing system employing aspect ratio transformation |
JPH05276510A (en) * | 1992-03-27 | 1993-10-22 | Canon Inc | Video conference system |
CA2122371C (en) * | 1992-08-27 | 1998-03-03 | Osamu Okada | Moving picture coding apparatus |
US5335011A (en) * | 1993-01-12 | 1994-08-02 | Bell Communications Research, Inc. | Sound localization system for teleconferencing using self-steering microphone arrays |
US5508734A (en) * | 1994-07-27 | 1996-04-16 | International Business Machines Corporation | Method and apparatus for hemispheric imaging which emphasizes peripheral content |
US5487665A (en) * | 1994-10-31 | 1996-01-30 | Mcdonnell Douglas Corporation | Video display system and method for generating and individually positioning high resolution inset images |
JPH08279999A (en) | 1995-02-22 | 1996-10-22 | Internatl Business Mach Corp <Ibm> | Video conference multimedia system |
DE19531222A1 (en) * | 1995-08-24 | 1997-02-27 | Siemens Ag | Speech signal control method for multi-point video conference system |
JPH09140000A (en) * | 1995-11-15 | 1997-05-27 | Nippon Telegr & Teleph Corp <Ntt> | Loud hearing aid for conference |
JPH1051755A (en) * | 1996-05-30 | 1998-02-20 | Fujitsu Ltd | Screen display controller for video conference terminal equipment |
JPH1042264A (en) | 1996-07-23 | 1998-02-13 | Nec Corp | Video conference system |
US5864681A (en) * | 1996-08-09 | 1999-01-26 | U.S. Robotics Access Corp. | Video encoder/decoder system |
EP0838950A1 (en) * | 1996-10-23 | 1998-04-29 | Alcatel | Terminal for video communication |
FR2761562B1 (en) * | 1997-03-27 | 2004-08-27 | France Telecom | VIDEO CONFERENCE SYSTEM |
US5900907A (en) * | 1997-10-17 | 1999-05-04 | Polycom, Inc. | Integrated videoconferencing unit |
US6489956B1 (en) * | 1998-02-17 | 2002-12-03 | Sun Microsystems, Inc. | Graphics system having a super-sampled sample buffer with generation of output pixels using selective adjustment of filtering for implementation of display effects |
JP4465880B2 (en) | 1998-10-09 | 2010-05-26 | ソニー株式会社 | Communication apparatus and method |
JP4244416B2 (en) * | 1998-10-30 | 2009-03-25 | ソニー株式会社 | Information processing apparatus and method, and recording medium |
JP2000287188A (en) * | 1999-04-01 | 2000-10-13 | Nippon Telegr & Teleph Corp <Ntt> | System and unit for inter-multi-point video audio communication |
US6208373B1 (en) * | 1999-08-02 | 2001-03-27 | Timothy Lo Fong | Method and apparatus for enabling a videoconferencing participant to appear focused on camera to corresponding users |
US6323893B1 (en) * | 1999-10-27 | 2001-11-27 | Tidenet, Inc. | Portable conference center |
US6894714B2 (en) * | 2000-12-05 | 2005-05-17 | Koninklijke Philips Electronics N.V. | Method and apparatus for predicting events in video conferencing and other applications |
US6577333B2 (en) * | 2000-12-12 | 2003-06-10 | Intel Corporation | Automatic multi-camera video composition |
US6677979B1 (en) * | 2001-06-12 | 2004-01-13 | Cisco Technology, Inc. | Method and apparatus for dual image video teleconferencing |

Filing Date | Office | Application Number | Publication | Active | Status |
---|---|---|---|---|---|
2002-08-07 | WO | PCT/US2002/025477 | WO2003015407A1 (en) | No | Application Discontinuation |
2002-08-07 | EP | EP02761322A | EP1425909A4 (en) | No | Withdrawn |
2002-08-07 | US | US10/214,976 | US20030048353A1 (en) | No | Abandoned |
2002-08-07 | JP | JP2003520192A | JP2004538724A (en) | Yes | Pending |
2004-01-07 | US | US10/753,139 | US20050042211A1 (en) | No | Abandoned |
2004-03-31 | US | US10/814,364 | US20040183897A1 (en) | No | Abandoned |
2009-01-06 | US | US12/349,409 | US8077194B2 (en) | No | Expired - Lifetime |
Cited By (77)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20050213735A1 (en) * | 2000-12-26 | 2005-09-29 | Polycom, Inc. | Speakerphone transmitting URL information to a remote device |
US8977683B2 (en) | 2000-12-26 | 2015-03-10 | Polycom, Inc. | Speakerphone transmitting password information to a remote device |
US9001702B2 (en) | 2000-12-26 | 2015-04-07 | Polycom, Inc. | Speakerphone using a secure audio connection to initiate a second secure connection |
US20050213730A1 (en) * | 2000-12-26 | 2005-09-29 | Polycom, Inc. | Conference endpoint instructing conference bridge to dial phone number |
US8948059B2 (en) | 2000-12-26 | 2015-02-03 | Polycom, Inc. | Conference endpoint controlling audio volume of a remote device |
US8964604B2 (en) | 2000-12-26 | 2015-02-24 | Polycom, Inc. | Conference endpoint instructing conference bridge to dial phone number |
US7864938B2 (en) | 2000-12-26 | 2011-01-04 | Polycom, Inc. | Speakerphone transmitting URL information to a remote device |
US20050213729A1 (en) * | 2000-12-26 | 2005-09-29 | Polycom,Inc. | Speakerphone using a secure audio connection to initiate a second secure connection |
US8976712B2 (en) | 2001-05-10 | 2015-03-10 | Polycom, Inc. | Speakerphone and conference bridge which request and perform polling operations |
US20020188731A1 (en) * | 2001-05-10 | 2002-12-12 | Sergey Potekhin | Control unit for multipoint multimedia/audio system |
US20050213739A1 (en) * | 2001-05-10 | 2005-09-29 | Polycom, Inc. | Conference endpoint controlling functions of a remote device |
US20050213727A1 (en) * | 2001-05-10 | 2005-09-29 | Polycom, Inc. | Speakerphone and conference bridge which request and perform polling operations |
US8934382B2 (en) | 2001-05-10 | 2015-01-13 | Polycom, Inc. | Conference endpoint controlling functions of a remote device |
US8805928B2 (en) | 2001-05-10 | 2014-08-12 | Polycom, Inc. | Control unit for multipoint multimedia/audio system |
US6812956B2 (en) * | 2001-12-21 | 2004-11-02 | Applied Minds, Inc. | Method and apparatus for selection of signals in a teleconference |
US20070140456A1 (en) * | 2001-12-31 | 2007-06-21 | Polycom, Inc. | Method and apparatus for wideband conferencing |
US8102984B2 (en) | 2001-12-31 | 2012-01-24 | Polycom Inc. | Speakerphone and conference bridge which receive and provide participant monitoring information |
US8885523B2 (en) | 2001-12-31 | 2014-11-11 | Polycom, Inc. | Speakerphone transmitting control information embedded in audio information through a conference bridge |
US7978838B2 (en) | 2001-12-31 | 2011-07-12 | Polycom, Inc. | Conference endpoint instructing conference bridge to mute participants |
US8705719B2 (en) | 2001-12-31 | 2014-04-22 | Polycom, Inc. | Speakerphone and conference bridge which receive and provide participant monitoring information |
US8023458B2 (en) | 2001-12-31 | 2011-09-20 | Polycom, Inc. | Method and apparatus for wideband conferencing |
US20050213732A1 (en) * | 2001-12-31 | 2005-09-29 | Polycom, Inc. | Conference bridge which decodes and responds to control information embedded in audio information |
US20050213725A1 (en) * | 2001-12-31 | 2005-09-29 | Polycom, Inc. | Speakerphone transmitting control information embedded in audio information through a conference bridge |
US8223942B2 (en) | 2001-12-31 | 2012-07-17 | Polycom, Inc. | Conference endpoint requesting and receiving billing information from a conference bridge |
US8582520B2 (en) | 2001-12-31 | 2013-11-12 | Polycom, Inc. | Method and apparatus for wideband conferencing |
US8144854B2 (en) | 2001-12-31 | 2012-03-27 | Polycom Inc. | Conference bridge which detects control information embedded in audio information to prioritize operations |
US20050212908A1 (en) * | 2001-12-31 | 2005-09-29 | Polycom, Inc. | Method and apparatus for combining speakerphone and video conference unit operations |
US8934381B2 (en) | 2001-12-31 | 2015-01-13 | Polycom, Inc. | Conference endpoint instructing a remote device to establish a new connection |
US7787605B2 (en) | 2001-12-31 | 2010-08-31 | Polycom, Inc. | Conference bridge which decodes and responds to control information embedded in audio information |
US8947487B2 (en) | 2001-12-31 | 2015-02-03 | Polycom, Inc. | Method and apparatus for combining speakerphone and video conference unit operations |
US7742588B2 (en) | 2001-12-31 | 2010-06-22 | Polycom, Inc. | Speakerphone establishing and using a second connection of graphics information |
US7821918B2 (en) | 2002-03-01 | 2010-10-26 | Polycom, Inc. | System and method for communication channel and device control via an existing audio channel |
US20040022272A1 (en) * | 2002-03-01 | 2004-02-05 | Jeffrey Rodman | System and method for communication channel and device control via an existing audio channel |
US7702015B2 (en) * | 2003-03-20 | 2010-04-20 | Ge Security, Inc. | Systems and methods for multi-resolution image processing |
US20040218099A1 (en) * | 2003-03-20 | 2004-11-04 | Washington Richard G. | Systems and methods for multi-stream image processing |
US20040223058A1 (en) * | 2003-03-20 | 2004-11-11 | Richter Roger K. | Systems and methods for multi-resolution image processing |
WO2004086748A3 (en) * | 2003-03-20 | 2008-04-10 | Covi Technologies Inc | Systems and methods for multi-resolution image processing |
US8681859B2 (en) * | 2003-03-20 | 2014-03-25 | Utc Fire & Security Americas Corporation, Inc. | Systems and methods for multi-stream image processing |
US7995652B2 (en) * | 2003-03-20 | 2011-08-09 | Utc Fire & Security Americas Corporation, Inc. | Systems and methods for multi-stream image processing |
US20110292287A1 (en) * | 2003-03-20 | 2011-12-01 | Utc Fire & Security Americas Corporation, Inc. | Systems and methods for multi-stream image processing |
US20050248652A1 (en) * | 2003-10-08 | 2005-11-10 | Cisco Technology, Inc., A California Corporation | System and method for performing distributed video conferencing |
US8081205B2 (en) | 2003-10-08 | 2011-12-20 | Cisco Technology, Inc. | Dynamically switched and static multiple video streams for a multimedia conference |
US7477282B2 (en) * | 2003-10-08 | 2009-01-13 | Cisco Technology, Inc. | System and method for performing distributed video conferencing |
US20060092269A1 (en) * | 2003-10-08 | 2006-05-04 | Cisco Technology, Inc. | Dynamically switched and static multiple video streams for a multimedia conference |
US8004556B2 (en) | 2004-04-16 | 2011-08-23 | Polycom, Inc. | Conference link between a speakerphone and a video conference unit |
US20080143819A1 (en) * | 2004-04-16 | 2008-06-19 | Polycom, Inc. | Conference link between a speakerphone and a video conference unit |
US7339605B2 (en) * | 2004-04-16 | 2008-03-04 | Polycom, Inc. | Conference link between a speakerphone and a video conference unit |
US20050231586A1 (en) * | 2004-04-16 | 2005-10-20 | Jeffrey Rodman | Conference link between a speakerphone and a video conference unit |
US20050280701A1 (en) * | 2004-06-14 | 2005-12-22 | Wardell Patrick J | Method and system for associating positional audio to positional video |
US20060277254A1 (en) * | 2005-05-02 | 2006-12-07 | Kenoyer Michael L | Multi-component videoconferencing system |
US7796565B2 (en) | 2005-06-08 | 2010-09-14 | Polycom, Inc. | Mixed voice and spread spectrum data signaling with multiplexing multiple users with CDMA |
US8126029B2 (en) | 2005-06-08 | 2012-02-28 | Polycom, Inc. | Voice interference correction for mixed voice and spread spectrum data signaling |
US8199791B2 (en) | 2005-06-08 | 2012-06-12 | Polycom, Inc. | Mixed voice and spread spectrum data signaling with enhanced concealment of data |
US20070024706A1 (en) * | 2005-08-01 | 2007-02-01 | Brannon Robert H Jr | Systems and methods for providing high-resolution regions-of-interest |
US20070024705A1 (en) * | 2005-08-01 | 2007-02-01 | Richter Roger K | Systems and methods for video stream selection |
US20070156924A1 (en) * | 2006-01-03 | 2007-07-05 | Cisco Technology, Inc. | Method and apparatus for transcoding and transrating in distributed video systems |
US8713105B2 (en) | 2006-01-03 | 2014-04-29 | Cisco Technology, Inc. | Method and apparatus for transcoding and transrating in distributed video systems |
CN100442837C (en) * | 2006-07-25 | 2008-12-10 | 华为技术有限公司 | Video frequency communication system with sound position information and its obtaining method |
WO2008014697A1 (en) | 2006-07-25 | 2008-02-07 | Huawei Technologies Co., Ltd. | A method and an apparatus for obtaining acoustic source location information and a multimedia communication system |
US8115799B2 (en) | 2006-07-25 | 2012-02-14 | Huawei Technologies Co., Ltd. | Method and apparatus for obtaining acoustic source location information and a multimedia communication system |
US20090128617A1 (en) * | 2006-07-25 | 2009-05-21 | Huawei Technologies Co., Ltd. | Method and apparatus for obtaining acoustic source location information and a multimedia communication system |
US20100182394A1 (en) * | 2007-09-28 | 2010-07-22 | Wuzhou Zhan | Method, system, and device of video communication |
US8259625B2 (en) * | 2007-09-28 | 2012-09-04 | Huawei Technologies Co., Ltd. | Method, system, and device of video communication |
WO2009043275A1 (en) * | 2007-09-28 | 2009-04-09 | Shenzhen Huawei Telecommunication Technologies Co., Ltd. | A method and a system of video communication and a device for video communication |
US20230315380A1 (en) * | 2011-07-28 | 2023-10-05 | Apple Inc. | Devices with enhanced audio |
US9667913B2 (en) | 2013-06-11 | 2017-05-30 | New Vad, Llc | System and method for PC-based video conferencing and audio/video presentation |
WO2014200629A1 (en) * | 2013-06-11 | 2014-12-18 | New Vad, Llc | System and method for pc-based video conferencing and audio/video presentation |
US10122963B2 (en) | 2013-06-11 | 2018-11-06 | Milestone Av Technologies Llc | Bidirectional audio/video: system and method for opportunistic scheduling and transmission |
US9307339B2 (en) * | 2013-06-27 | 2016-04-05 | Samsung Electronics Co., Ltd. | Display apparatus and method for providing stereophonic sound service |
EP2819400A1 (en) * | 2013-06-27 | 2014-12-31 | Samsung Electronics Co., Ltd. | Display apparatus and method for providing stereophonic sound service |
US20150378566A1 (en) * | 2014-06-27 | 2015-12-31 | Alcatel Lucent | Method, system and device for navigating in ultra high resolution video content by a client device |
US11601731B1 (en) * | 2022-08-25 | 2023-03-07 | Benjamin Slotznick | Computer program product and method for auto-focusing a camera on an in-person attendee who is speaking into a microphone at a hybrid meeting that is being streamed via a videoconferencing system to remote attendees |
US11750925B1 (en) | 2022-08-25 | 2023-09-05 | Benjamin Slotznick | Computer program product and method for auto-focusing a camera on an in-person attendee who is speaking into a microphone at a meeting |
US11877058B1 (en) | 2022-08-25 | 2024-01-16 | Benjamin Slotznick | Computer program product and automated method for auto-focusing a camera on a person in a venue who is wearing, or carrying, or holding, or speaking into a microphone at the venue |
US11889188B1 (en) | 2022-08-25 | 2024-01-30 | Benjamin Slotznick | Computer program product and method for auto-focusing one or more cameras on selected persons in a venue who are performers of a performance occurring at the venue |
US11889187B1 (en) | 2022-08-25 | 2024-01-30 | Benjamin Slotznick | Computer program product and method for auto-focusing one or more lighting fixtures on selected persons in a venue who are performers of a performance occurring at the venue |
US11902659B1 (en) | 2022-08-25 | 2024-02-13 | Benjamin Slotznick | Computer program product and method for auto-focusing a lighting fixture on a person in a venue who is wearing, or carrying, or holding, or speaking into a microphone at the venue |
Also Published As
Publication number | Publication date |
---|---|
EP1425909A4 (en) | 2006-10-18 |
US20050042211A1 (en) | 2005-02-24 |
US20090115838A1 (en) | 2009-05-07 |
US20040183897A1 (en) | 2004-09-23 |
EP1425909A1 (en) | 2004-06-09 |
US8077194B2 (en) | 2011-12-13 |
JP2004538724A (en) | 2004-12-24 |
WO2003015407A1 (en) | 2003-02-20 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US8077194B2 (en) | | System and method for high resolution videoconferencing |
US5757424A (en) | | High-resolution video conferencing system |
US8115799B2 (en) | | Method and apparatus for obtaining acoustic source location information and a multimedia communication system |
US20070171275A1 (en) | | Three Dimensional Videoconferencing |
US20020009137A1 (en) | | Three-dimensional video broadcasting system |
US8749609B2 (en) | | Apparatus, system and method for video call |
JP2001514826A (en) | | Method and apparatus for transmitting and displaying still images |
US7999842B1 (en) | | Continuously rotating video camera, method and user interface for using the same |
EP2293559A2 (en) | | Apparatus, system and method for video call |
WO2014192804A1 (en) | | Decoder and monitor system |
US20130088561A1 (en) | | Television system and control method thereof |
CN101047872A (en) | | Stereo audio vedio device for TV |
CN102202206A (en) | | Communication device |
JPH08336128A (en) | | Video viewing device |
WO2001069911A2 (en) | | Interactive multimedia transmission system |
JP2009118151A (en) | | Communication system, transmitter, relay device, receiver, and transmission program |
JPH09149391A (en) | | Television telephone device |
KR100641176B1 (en) | | Method for displaying of three dimensions picture in wireless terminal |
JPH07226958A (en) | | Stereoscopic video image display system |
JP2003008968A (en) | | Image pickup device |
JP5004680B2 (en) | | Image processing apparatus, image processing method, video conference system, video conference method, program, and recording medium |
JPH06276427A (en) | | Voice controller with motion picture |
WO2011087356A2 (en) | | Video conferencing using single panoramic camera |
KR101692190B1 (en) | | A Multi Image Transmitting/Receiving System |
JPH08317363A (en) | | Image transmitter |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| | AS | Assignment | Owner name: POLYCOM INC., CALIFORNIA. Free format text: ASSIGNMENT OF ASSIGNORS INTEREST; ASSIGNORS: KENOYER, MICHAEL; WASHINGTON, RICHARD; CHU, PETER; REEL/FRAME: 014470/0065. Effective date: 20040209 |
| | STCB | Information on status: application discontinuation | Free format text: ABANDONED -- FAILURE TO PAY ISSUE FEE |