WO2011091604A1 - Method, apparatus and system for video communication (视频通信的方法、装置和系统) - Google Patents

Method, apparatus and system for video communication (视频通信的方法、装置和系统)

Info

Publication number
WO2011091604A1
Authority
WO
WIPO (PCT)
Prior art keywords
video
local
unit
image
parameter
Prior art date
Application number
PCT/CN2010/070427
Other languages
English (en)
French (fr)
Inventor
刘源
赵光耀
王静
Original Assignee
华为终端有限公司 (Huawei Device Co., Ltd.)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 华为终端有限公司 (Huawei Device Co., Ltd.)
Priority to EP10844382.1A (EP2525574A4)
Priority to PCT/CN2010/070427
Publication of WO2011091604A1
Priority to US13/561,928 (US8890922B2)

Classifications

    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04N - PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N7/00 - Television systems
    • H04N7/14 - Systems for two-way working
    • H04N7/141 - Systems for two-way working between two video terminals, e.g. videophone
    • H04N7/142 - Constructional details of the terminal equipment, e.g. arrangements of the camera and the display
    • G - PHYSICS
    • G09 - EDUCATION; CRYPTOGRAPHY; DISPLAY; ADVERTISING; SEALS
    • G09G - ARRANGEMENTS OR CIRCUITS FOR CONTROL OF INDICATING DEVICES USING STATIC MEANS TO PRESENT VARIABLE INFORMATION
    • G09G5/00 - Control arrangements or circuits for visual indicators common to cathode-ray tube indicators and other visual indicators
    • G09G5/003 - Details of a display terminal, the details relating to the control arrangement of the display terminal and to the interfaces thereto
    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04N - PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N23/00 - Cameras or camera modules comprising electronic image sensors; Control thereof
    • H04N23/60 - Control of cameras or camera modules
    • H04N23/698 - Control of cameras or camera modules for achieving an enlarged field of view, e.g. panoramic image capture
    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04N - PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N5/00 - Details of television systems
    • H04N5/222 - Studio circuitry; Studio devices; Studio equipment
    • H04N5/262 - Studio circuits, e.g. for mixing, switching-over, change of character of image, other special effects; Cameras specially adapted for the electronic generation of special effects
    • H04N5/2625 - Studio circuits for obtaining an image which is composed of images from a temporal image sequence, e.g. for a stroboscopic effect
    • G - PHYSICS
    • G03 - PHOTOGRAPHY; CINEMATOGRAPHY; ANALOGOUS TECHNIQUES USING WAVES OTHER THAN OPTICAL WAVES; ELECTROGRAPHY; HOLOGRAPHY
    • G03B - APPARATUS OR ARRANGEMENTS FOR TAKING PHOTOGRAPHS OR FOR PROJECTING OR VIEWING THEM; APPARATUS OR ARRANGEMENTS EMPLOYING ANALOGOUS TECHNIQUES USING WAVES OTHER THAN OPTICAL WAVES; ACCESSORIES THEREFOR
    • G03B17/00 - Details of cameras or camera bodies; Accessories therefor
    • G03B17/02 - Bodies
    • G03B17/17 - Bodies with reflectors arranged in beam forming the photographic image, e.g. for reducing dimensions of camera
    • G - PHYSICS
    • G03 - PHOTOGRAPHY; CINEMATOGRAPHY; ANALOGOUS TECHNIQUES USING WAVES OTHER THAN OPTICAL WAVES; ELECTROGRAPHY; HOLOGRAPHY
    • G03B - APPARATUS OR ARRANGEMENTS FOR TAKING PHOTOGRAPHS OR FOR PROJECTING OR VIEWING THEM; APPARATUS OR ARRANGEMENTS EMPLOYING ANALOGOUS TECHNIQUES USING WAVES OTHER THAN OPTICAL WAVES; ACCESSORIES THEREFOR
    • G03B37/00 - Panoramic or wide-screen photography; Photographing extended surfaces, e.g. for surveying; Photographing internal surfaces, e.g. of pipe
    • G03B37/04 - Panoramic or wide-screen photography with cameras or projectors providing touching or overlapping fields of view
    • G - PHYSICS
    • G09 - EDUCATION; CRYPTOGRAPHY; DISPLAY; ADVERTISING; SEALS
    • G09G - ARRANGEMENTS OR CIRCUITS FOR CONTROL OF INDICATING DEVICES USING STATIC MEANS TO PRESENT VARIABLE INFORMATION
    • G09G2320/00 - Control of display operating conditions
    • G09G2320/06 - Adjustment of display parameters
    • G09G2320/0673 - Adjustment of display parameters for control of gamma adjustment, e.g. selecting another gamma curve
    • G - PHYSICS
    • G09 - EDUCATION; CRYPTOGRAPHY; DISPLAY; ADVERTISING; SEALS
    • G09G - ARRANGEMENTS OR CIRCUITS FOR CONTROL OF INDICATING DEVICES USING STATIC MEANS TO PRESENT VARIABLE INFORMATION
    • G09G2370/00 - Aspects of data communication
    • G09G2370/02 - Networking aspects
    • G09G2370/025 - LAN communication management
    • G - PHYSICS
    • G09 - EDUCATION; CRYPTOGRAPHY; DISPLAY; ADVERTISING; SEALS
    • G09G - ARRANGEMENTS OR CIRCUITS FOR CONTROL OF INDICATING DEVICES USING STATIC MEANS TO PRESENT VARIABLE INFORMATION
    • G09G2370/00 - Aspects of data communication
    • G09G2370/20 - Details of the management of multiple sources of image data
    • G - PHYSICS
    • G09 - EDUCATION; CRYPTOGRAPHY; DISPLAY; ADVERTISING; SEALS
    • G09G - ARRANGEMENTS OR CIRCUITS FOR CONTROL OF INDICATING DEVICES USING STATIC MEANS TO PRESENT VARIABLE INFORMATION
    • G09G5/00 - Control arrangements or circuits for visual indicators common to cathode-ray tube indicators and other visual indicators
    • G09G5/36 - Control arrangements characterised by the display of a graphic pattern, e.g. using an all-points-addressable [APA] memory
    • G09G5/37 - Details of the operation on graphic patterns
    • G09G5/377 - Details of the operation on graphic patterns for mixing or overlaying two or more graphic patterns

Definitions

  • The present invention relates to the field of communications and, in particular, to a method, apparatus, and system for video communication.
  • Background Art
  • The general layout of a prior-art telepresence conference system is shown in FIG. 1.
  • The system consists of three large-screen displays 1, 2, and 3 and three high-definition cameras 4, 5, and 6 that photograph the participants 14 to 19 seated at the conference table.
  • Each display shows a subset of the participants (for example, two participants per display), so that the three displays together present the complete conference scene.
  • Embodiments of the present invention provide a method, apparatus, and system for video communication that are capable of generating a wide-range, high-resolution panoramic video image and presenting it seamlessly, providing the user with a better immersive panoramic experience.
  • a video communication site including:
  • at least two local cameras, each pointed at part of the local users, for capturing local video images of those users; a local camera stitching fuser, configured to fuse the captured local video images according to the fusion parameter in the first video processing parameters to generate a panoramic video image, to encode the panoramic video image into a video code stream, and to transmit the video code stream to a remote video communication site;
  • a local display fuser, configured to decode at least two channels of video data from the video code stream received from the remote end, to fuse the decoded video data according to the fusion parameter in the second video processing parameters, and to output the fused video data to the local display devices; and at least two local display devices, configured to display the video data fused by the local display fuser.
  • a method of providing video communication including:
  • the panoramic video image is sent to a video encoder, which encodes it into a video code stream, and the video code stream is transmitted.
  • a method of video communication including:
  • a device for video communication including:
  • a first acquiring unit configured to acquire at least two local video images
  • a first merging unit configured to fuse at least two local video images acquired by the first acquiring unit according to a merging parameter in the first video processing parameter to generate a panoramic video image
  • a first sending unit configured to send the panoramic video image obtained by the first merging unit to a video encoder, where the panoramic video image is encoded into a video code stream and the video code stream is sent out.
  • a device for video communication including:
  • a second acquiring unit configured to acquire at least two pieces of video data decoded by the video decoder from the video code streams, where the video code streams are received by the video decoder from a remote video communication site; and a second merging unit configured to fuse the at least two pieces of video data acquired by the second acquiring unit according to the fusion parameter in the second video processing parameters;
  • an output unit configured to output the at least two pieces of video data fused by the second merging unit to the display device, which displays them.
  • a system for video communication includes at least two video communication sites.
  • at least one site of the at least two video communication sites serves as a receiving site, configured to separately decode at least two pieces of video data from the received video code streams, to fuse the decoded at least two pieces of video data according to the fusion parameter in the second video processing parameters,
  • and to output the fused at least two pieces of video data for display.
  • The method, device, and system for video communication fuse at least two captured video images into a panoramic video image at the transmitting end. The fused panoramic video image represents the positional relationship at the junctions between adjacent video images more accurately, so that the finally displayed image gives the user a more realistic panoramic experience; this solves the problem that adjacent video images captured by the cameras have overlapping or missing areas at their junctions and inconsistent brightness and color.
  • The video communication transmitting end in the embodiment of the invention encodes the fused panoramic video image into a video code stream and sends it to the video communication receiving end; the receiving end performs further fusion processing and outputs the result to the display devices.
  • The fusion processing performed at the receiving end enables multiple projection images to be presented seamlessly on the curved screen with only small differences in color and brightness between projection areas, which improves the visual continuity of the panoramic video image and gives the user a better immersive panoramic experience.
  • FIG. 1 is a schematic diagram of a system layout of video communication in the prior art
  • FIG. 2 is a top plan view of a layout of a video communication system according to an embodiment of the present invention
  • FIG. 3 is a top view II of a layout of a video communication system according to an embodiment of the present invention.
  • FIG. 4 is a side view of a layout of a video communication system according to an embodiment of the present invention.
  • FIG. 5 is a top view III of a layout of a video communication system according to an embodiment of the present invention.
  • FIG. 6 is a device connection diagram of a video communication system according to an embodiment of the present invention.
  • FIG. 7 is a flowchart of a method for video communication according to an embodiment of the present invention.
  • FIG. 8 is a flowchart of a method for video communication according to another embodiment of the present invention.
  • FIG. 9 is a flowchart of a video communication sending end method according to another embodiment of the present invention.
  • FIG. 10 is a flowchart of a video communication receiving end method according to another embodiment of the present invention.
  • FIG. 11 is a schematic diagram 1 of a common optical center camera according to an embodiment of the present invention.
  • FIG. 12 is a schematic diagram 2 of a common optical center camera according to an embodiment of the present invention.
  • FIG. 13 is a schematic structural diagram of a common optical center camera according to an embodiment of the present invention.
  • FIG. 14 is a schematic structural diagram of a configuration of a common optical center camera and an arc curtain according to an embodiment of the present invention
  • FIG. 15 is a flowchart of configuring the first video processing parameters at the transmitting end in a video communication method according to an embodiment of the present invention
  • FIG. 17 is a flowchart of configuring the second video processing parameters at the receiving end in a method for video communication according to an embodiment of the present invention
  • FIG. 18 is a schematic diagram of the luminance and color differences between projectors at the receiving end according to an embodiment of the present invention
  • FIG. 19 is a schematic structural diagram of a device for transmitting video communication at a transmitting end according to an embodiment of the present invention
  • FIG. 21 is a schematic structural diagram of a device for transmitting video communication at a transmitting end according to an embodiment of the present invention
  • FIG. 22 is a schematic structural diagram of a device for transmitting video communication at a transmitting end according to an embodiment of the present invention
  • FIG. 24 is a schematic structural diagram of a device for receiving video communication at a receiving end according to an embodiment of the present invention
  • FIG. 25 is a schematic structural diagram of a device for receiving video communication at a receiving end according to an embodiment of the present invention
  • FIG. 26 is a schematic structural diagram of a device for receiving video communication at a receiving end according to an embodiment of the present invention.
  • a video communication site including:
  • at least two local cameras, each pointed at part of the local users, for capturing local video images of those users.
  • The at least two local cameras may be a common optical center camera with three camera movements (sensor modules), whose capture times are synchronized by the same clock.
  • The first video processing parameters are obtained by the local camera stitching fuser itself, or are calculated by a PC and then sent to the local camera stitching fuser.
  • The PC is connected to the local camera stitching fuser.
  • The first video processing parameters include: a fusion parameter, a GAMMA correction parameter, a dead pixel compensation parameter, a transformation parameter, and a cropping region parameter.
  • a local display fuser, configured to decode at least two channels of video data from the video code stream received from the remote end, to fuse the decoded video data according to the fusion parameter in the second video processing parameters, and to output the fused video data to the local display devices.
  • the second video processing parameter is obtained by the local display fusion device, or is calculated by the PC, and sent to the local display fusion device.
  • the PC is coupled to the local display fuser.
  • The second video processing parameters include: a fusion parameter, a GAMMA correction parameter, a projection correction parameter, a transformation parameter, and a cropping region parameter.
  • At least two local display devices are configured to display the at least two channels of video data fused by the local display fuser.
  • The at least two local display devices may be projectors with a screen, or displays; the screen may be an arc screen, an elliptical screen, a parabolic screen, a folded screen, or a flat screen,
  • and the display is generally a high-definition flat panel display so as to obtain high-definition video images.
  • The video communication site acquires at least two local video images with at least two local cameras and fuses them into a panoramic video image. The fused panoramic video image represents
  • the positional relationship between adjacent video images more realistically, so the finally displayed image gives the user a more realistic panoramic experience, and the problems of overlapping or missing areas at the junctions of adjacent camera images and of inconsistent brightness and color are solved.
  • The fusion processing enables the multiple projection images to be presented seamlessly on the screen with only small differences in color and brightness between projection areas, which improves the visual continuity of the panoramic video image and gives the user a better immersive panoramic experience.
  • FIG. 2 is a top view of the overall layout of the video conference system according to the present invention, including: a common optical center camera 9 for collecting image data of the conference scene and producing three video images, where the capture times of the camera movements are synchronized by the same clock; a curved conference table 7 and several user seats 1 to 6, where a user denotes an individual or a group of individuals taking part in the video conference, participating either as a speaker or as a non-speaker; and an arc screen 8 with three projectors 10, 11, and 12 for displaying the three channels of video images processed by the display fuser together with shared data information. The shooting area of the camera is the union of the shooting areas of the three local camera movements, and its viewing angle range depends on the number of movements and the shooting angle of each.
  • In this embodiment, the angle of view of each movement is between 30 and 45 degrees, so the overall viewing angle ranges from 90 to 135 degrees.
  • For example, if each movement has a 35-degree angle of view,
  • the camera covers a circular arc of 105 degrees;
  • The projection screen is centered on the midpoint of the edge of the conference table, with a radius between 2500 mm and 7500 mm;
  • in this embodiment, the radius of the projection screen is 2700 mm.
  • The arc length is determined by the viewing angle range of the camera and the radius of the projection screen,
  • and the height of the projection screen depends on the arc length and the aspect ratio of the video image.
  • Here the arc length of the projection screen is about 4950 mm
  • and its height is approximately 900 mm.
  • These parameters are designed to ensure a life-size visual effect.
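A sketch of the arc-length arithmetic behind these figures, using the 105-degree viewing angle and the 2700 mm radius given above:

$$ L = r\,\theta = 2700\,\mathrm{mm} \times \frac{105\pi}{180} \approx 2700\,\mathrm{mm} \times 1.833 \approx 4950\,\mathrm{mm} $$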
  • An object at the desktop edge 101 is imaged at about 1:1 on the display screen; at position 100, being closer to the camera, the projected image is about 1.1:1, while at position 102 it is about 0.9:1.
  • The virtual optical center of the camera 9 and the center of the upper surface of the projection screen 8 lie on the same vertical line, about 100 mm apart; 13 is a rear projection box that houses the three projectors 10, 11, and 12.
  • The projectors project the image onto the arc screen 8 by rear projection.
  • The rear projection box can be designed as a dark room so that the image on the arc screen 8 is affected as little as possible by external light, giving a better projection effect; of course, besides rear projection, front projection can also be used for image display.
  • Numbers 14, 15, and 16 denote three microphones for collecting local audio signals; numbers 20, 21, and 22 denote three speakers for outputting the remote-site audio signals transmitted over the network.
  • Figure 3 shows another top view of the overall layout of the video conference system of the present invention. It differs from Figure 2 in that the conference room layout uses multiple rows of user seats (two rows are shown in Figure 3): one or more rows of conference tables and corresponding seats can be added before or after the original row of conference tables 7 of Figure 2.
  • In Figure 3 a row of conference tables 104 and seats 101 to 103 have been added. Because the seats at the rear conference table are farther from the display screen and are occluded by the front row of participants, the experience there degrades.
  • The rear-row conference table and seats can therefore be raised as a whole, forming a stepped conference room layout, and each rear-row seat should as far as possible be placed between two front-row participants. In this way the participants in the back row are not blocked by the front row, which improves the user experience.
  • Figure 4 is a side view of the overall layout of the video conference system of the present invention (taking one user side as an example).
  • The camera's optical center O is located 100 mm behind the screen and 100 mm below the upper edge of the active screen area.
  • The camera's vertical viewing angle is approximately 20 degrees. Since the camera cannot be placed at the user's horizontal line of sight, the camera's optical axis must be tilted downward by a predetermined angle, which is 8.5 degrees in the embodiment of the present invention.
  • The 300 mm strip at the near edge of the table is designed to be displayed with a height of 100 mm on the arc screen, so that the portrait can be displayed within a height range of about 800 mm.
  • With the human eye at the middle position, it can be calculated that the vertical eye-to-eye deviation angle is about 6.2 degrees, close to the 5-degree perception threshold for eye-contact angle deviation, so a relatively good eye-to-eye effect can be obtained.
  • The arc screen can display not only video data but also shared data information, and the arrangement can be configured flexibly according to the users' viewing positions.
  • Shared data information can include: shared text, images, and video information.
  • The shared data information may be stored locally in advance, or may be shared from the remote end via network transmission.
  • The shared data information can also be displayed on at least one separate display device, which can be placed at one end of the conference room or serve as an extension of the display devices used to show the remote conference site.
  • The arc screen can also be extended with an additional display device to show the shared data information.
  • For example, two projection display areas 4 and 5 can be added.
  • The original projection areas 1, 2, and 3 can also be configured to display shared data information.
  • For instance, projection area 2 can display either the remote site image or shared data information;
  • area 4 can be configured to display shared data information;
  • or areas 4 and 5 can be configured to display the remote venue image while area 2 displays shared data information. In this way, both parties obtain a consistent experience of the shared data information.
  • FIG. 6 shows a connection diagram of two video communication site devices provided by an embodiment of the present invention.
  • The common optical center camera 1 on the transmitting side collects the video images of the venue and outputs three video images (typically 1920x1080 high-definition video) to the camera stitching fuser of the transmitting end. Since the three original video images from the common optical center camera 1 cannot simply be joined into an ideal panoramic image of the conference, the camera stitching fuser processes the three video images according to the fusion parameter in the first video processing parameters to fuse them into a high-resolution panoramic video image with an aspect ratio of approximately 48:9.
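As a quick check on that figure (ignoring the overlap consumed by stitching, which is why the ratio is only approximate), three 1920x1080 movements side by side give

$$ \frac{3 \times 1920}{1080} = \frac{5760}{1080} = \frac{48}{9} $$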
  • The panoramic video image can be output in three channels to the three video communication terminals at the transmitting end; each video communication terminal encodes its video image, encapsulates the encoded video code stream into network data packets, and sends them over the network to the remote video communication site.
  • The network described here comprises devices, including hardware and any suitable control logic, for interconnecting the components coupled to it and supporting the communication between the sites shown in this embodiment.
  • The network may include a local area network (LAN), a metropolitan area network (MAN), a wide area network (WAN), any other public or private network, a local, regional, or global communication network, an intranet, other suitable wired or wireless communication links, or any combination of the preceding.
  • The network may include gateways, routers, hubs, switches, access points, base stations, and any other hardware or software, or combination thereof, implementing any suitable protocol or communication.
  • The receiving end receives the data packets from the network, decodes the three video code streams with the video decoding units in the three video communication terminals to obtain three channels of video data, and outputs them to the display fuser.
  • The display fuser fuses the three channels of video data according to the fusion parameter in the second video processing parameters and finally outputs them to the three projectors 2, which project them onto the curved screen to form a seamless panoramic video image with an aspect ratio of about 48:9.
  • 3 and 4 are two PC workstations, which together with the camera stitching fuser and the display fuser serve as calibration tools for the fusion processing that precedes image display.
  • The video encoder may be integrated in the camera stitching fuser; or, when the system includes a transmitting-end video communication terminal, the video encoder may instead be integrated in that terminal.
  • The video decoder may be integrated in the display fuser; or, when the system further includes a receiving-end video communication terminal, the video decoder may instead be integrated in that terminal.
  • Typically the camera stitching fuser is placed at the transmitting end and the display fuser at the receiving end.
  • However, the display fuser can also be placed at the transmitting end, connected after the camera stitching fuser, in which case the parameters it needs to process the video images can be obtained from the receiving end;
  • likewise, the camera stitching fuser can be placed at the receiving end, connected before the display fuser, in which case the parameters it needs to process the video images can be obtained from the transmitting end.
  • The system for video communication provided by the embodiment of the present invention fuses the at least two captured video images into a panoramic video image at the transmitting end. The fused panoramic video image represents
  • the positional relationship of the junction areas between adjacent video images more realistically, so the finally displayed image gives the user a more realistic panoramic experience, and the problems of overlapping or missing areas at the junctions of adjacent camera images and of inconsistent brightness and color are solved.
  • The transmitting end encodes the panoramic video image into a video code stream and sends it to the receiving end, which performs further fusion processing and outputs the result to the display devices.
  • The fusion processing at the receiving end enables the multiple projection images to be presented seamlessly on the screen with only small differences in color and brightness between projection areas, improving the visual continuity of the panoramic video image and giving the user a better immersive panoramic experience.
  • An embodiment of the present invention provides a method and device for video communication.
  • the method for video communication provided by the embodiment of the present invention includes:
  • Step 101 Obtain at least two local video images.
  • The acquired at least two local video images are captured by a common optical center camera.
  • In this embodiment, the user first logs in to the camera stitching fuser from a PC and sends an image collection command to it through the PC; on receiving the command, the camera stitching fuser
  • acquires at least two video images captured by the common optical center camera and saves them in its buffer.
  • Devices that capture the at least two video images are not limited to common optical center cameras and are not enumerated here.
  • Step 102 Fuse the at least two local video images according to the fusion parameter in the first video processing parameters to generate a panoramic video image.
  • The fusion parameter is part of the first video processing parameters, which are calculated from the acquired video images.
  • The at least two video images acquired in step 101 are transmitted to the PC, which calculates the required fusion parameter from them.
  • The PC sends the calculated fusion parameter to the camera stitching fuser, which configures the received fusion parameter as the parameter to be used in the working state and stitches the at least two video images into a panoramic video image according to it.
  • Alternatively, the fusion parameter can be calculated inside the camera stitching fuser, so that the entire process of calculation, configuration, and fusion is performed directly by the camera stitching fuser without interacting with a PC;
  • in that case the camera stitching fuser must obtain the corresponding parameters of each local camera in order to determine the fusion parameter and the related parameters for GAMMA correction, sensor dead pixel compensation, image transformation, and cropping, scaling, and segmentation. Alternatively again,
  • the whole process of calculation, configuration, and fusion can be performed by one or more PCs alone, without interacting with the camera stitching fuser.
  • The actual product implementation can be chosen according to the specific needs of the user and is not detailed here.
  • Step 103 Send the panoramic video image to a video encoder, encode it into a video code stream with the video encoder, and send the video code stream out.
  • The video encoder may be integrated in the video communication terminal at the transmitting end or in the camera stitching fuser.
  • The fused video image is encoded by the video communication terminal at the transmitting end, and the encoded video code stream is sent to the network.
  • The video communication receiver receives the video code stream from the network.
  • The encoded video code stream is transmitted to the receiving end through the network. In a point-to-multipoint video communication structure, the encoded video code stream may be sent to the corresponding multipoint communication server, undergo multipoint fusion processing there, and then be sent to the corresponding receiving ends.
  • The method for video communication provided by the embodiment of the present invention fuses the acquired at least two video images into one panoramic video image. The fused panoramic video image represents the positional relationship of the junction areas between adjacent video images more realistically, so the finally displayed image gives the user a more realistic panoramic experience; this solves the problem that adjacent video images captured by the cameras overlap or have missing areas at their junctions and are inconsistent in brightness and color.
  • Step 201 Acquire at least two channels of video data decoded by a video decoder from the video code streams, where the video code streams are received by the video decoder from a remote video communication site.
  • The transmitting end divides the video stream into at least two video code streams, which improves processing speed and reduces the error rate.
  • The video decoder receives the at least two video code streams from the network and decodes them separately to obtain at least two channels of video data.
  • The video decoder may be integrated in the video communication terminal at the receiving end, or may be integrated in the display fuser.
  • Step 202 Fuse the at least two channels of video data according to the fusion parameter in the second video processing parameters.
  • Differences between the display devices cause differences in color and brightness among the at least two channels of video data, and the fusion is performed to eliminate them. As with the image fusion at the transmitting end, the fusion parameter is first calculated by the PC and sent to the display fuser; the display fuser configures the received fusion parameter as the fusion parameter used in the working state and fuses the at least two channels of video data according to it. It can be understood that the calculation of the second video processing parameters in the embodiment of the present invention can also be completed inside the display fuser, that is, the entire process of calculation, configuration, and fusion is performed directly by the display fuser without interacting with a PC;
  • in that case the display fuser must obtain the corresponding parameters of each display device in order to determine the fusion parameter and the related GAMMA correction, projection correction, transformation, and cropping parameters;
  • alternatively, the calculation, configuration, and fusion can be performed by one or more PCs alone, without interacting with the display fuser.
  • The actual implementation can be chosen according to the specific needs of the user and is not detailed here.
  • Step 203 Output the fused at least two channels of video data to the display devices, which display them.
  • The display device includes a projector and a screen, or a display; the screen is not limited to an arc screen and may also be an elliptical screen, a parabolic screen, a folded screen, or a flat screen; the display is generally a high-definition flat panel display so as to obtain high-definition video images.
  • When the display device is a projector with an arc screen, an elliptical screen, or a parabolic screen,
  • the at least two channels of video data are subjected to projection correction to eliminate the influence of the screen's shape on the displayed image.
  • In the method for video communication provided by the embodiment of the present invention, after the fused panoramic video image is encoded into a video code stream and sent to the video communication receiving end, the receiving end performs further fusion processing and outputs the fused video
  • images to the display devices. The fusion processing performed at the receiving end enables the multiple projection images to be presented seamlessly on the curved screen with only small differences in color and brightness between projection areas, improving the visual continuity of the panoramic video image and giving the user a better immersive panoramic experience.
  • FIG. 9 and FIG. 10 show a method for video communication provided by another embodiment of the present invention.
  • Step 301 The sending end acquires at least two local video images.
  • The acquired at least two local video images are captured by a common optical center camera.
  • In this embodiment, the video images at the transmitting end are collected after the PC logs in to the camera stitching fuser and instructs it to acquire at least two video images through the common optical center camera.
  • Each video image is captured by one movement of the common optical center camera.
  • The common optical center camera includes three camera movements: a left one, a middle one, and a right one. The left movement captures the participants in user seats 1 and 2, the middle movement captures the participants in seats 3 and 4, and the right movement captures the participants in seats 5 and 6.
  • Thus the common optical center camera captures all participants, and the capture times of the three movements are synchronized.
  • Step 302 The sending end performs GAMMA correction on the at least two local video images according to the GAMMA correction parameter in the first video processing parameter.
  • The video processing parameters in this embodiment refer to the first video processing parameters. If a parameter is not configured, the received video image is passed through transparently, i.e., output directly without processing; if the parameters are configured, image processing is performed. When parameters are configured it is further determined which of them are configured: for example, if only the GAMMA parameter is configured and the sensor dead pixel compensation parameter is not, only GAMMA correction is performed.
  • The camera stitching fuser receives either video images already processed by the common optical center camera movements, or raw data from the camera sensors, such as charge-coupled device (CCD) or complementary metal-oxide-semiconductor (CMOS) sensors.
  • Step 303 The transmitting end performs sensor dead point compensation on the at least two local video images according to the dead pixel compensation parameter in the first video processing parameter.
  • The dead pixel compensation may interpolate the value at a dead pixel from the values of its neighboring pixels in the video image, as sketched below.
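A minimal sketch of such neighbour interpolation (the dead-pixel coordinate list and the 3x3 averaging are illustrative assumptions, not the patent's specified algorithm):

```python
import numpy as np

def compensate_dead_pixels(image, dead_pixels):
    """Replace each known dead pixel with the mean of its valid 3x3 neighbours."""
    out = image.astype(np.float32).copy()
    h, w = image.shape[:2]
    dead = set(dead_pixels)  # coordinates found during sensor calibration
    for (y, x) in dead_pixels:
        neighbours = [
            out[ny, nx]
            for ny in range(max(0, y - 1), min(h, y + 2))
            for nx in range(max(0, x - 1), min(w, x + 2))
            if (ny, nx) != (y, x) and (ny, nx) not in dead
        ]
        if neighbours:  # average the surrounding good pixels
            out[y, x] = np.mean(neighbours, axis=0)
    return out.astype(image.dtype)
```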
  • Step 304 The transmitting end transforms the at least two local video images according to the transform parameters in the first video processing parameter.
  • The transformation includes any one of, or a combination of: translation of the video image, rotation of the video image, homography transformation of the video image, and cylindrical transformation of the video image.
  • A three-dimensional point in space is projected onto the imaging plane of the common optical center camera; the coordinate transformation between the three-dimensional point and the image point is

$$ x = K\,[R \mid t]\,X, \qquad K = \begin{bmatrix} f_u & s & u_0 \\ 0 & f_v & v_0 \\ 0 & 0 & 1 \end{bmatrix} $$

  • where X is the homogeneous representation of the point in the world coordinate system; f_u and f_v are the equivalent focal lengths in the horizontal and vertical directions; s is the skew (distortion) coefficient of the image; (u_0, v_0) are the image principal point coordinates; R is the rotation matrix of the camera; and t is the camera translation vector.
  • K is called the internal parameter matrix of the camera, comprising the equivalent focal lengths in the horizontal and vertical directions, the distortion coefficient of the image, and the coordinates of the image principal point; R and t are called the external parameters of the camera.
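A small numeric sketch of this projection model in Python (all values illustrative):

```python
import numpy as np

# Intrinsic matrix K built from equivalent focal lengths (fu, fv),
# skew s and principal point (u0, v0) -- illustrative values only.
fu, fv, s, u0, v0 = 1200.0, 1200.0, 0.0, 960.0, 540.0
K = np.array([[fu, s, u0],
              [0., fv, v0],
              [0., 0., 1.]])

R = np.eye(3)    # camera rotation (extrinsic)
t = np.zeros(3)  # camera translation (extrinsic)

X = np.array([0.5, 0.2, 4.0, 1.0])       # homogeneous 3-D world point

P = K @ np.hstack([R, t.reshape(3, 1)])  # 3x4 projection matrix K[R|t]
x = P @ X
u, v = x[0] / x[2], x[1] / x[2]          # dehomogenise to pixel coordinates
print(u, v)                              # -> 1110.0 600.0
```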
  • After converting three-dimensional points to plane points, there are three methods for transforming the video images. Method 1: for the three video images with overlapping areas captured by the three camera movements of the common optical center camera, the imaging relationship of points on a space plane between two of the video images is

$$ x' = H\,x $$

  • where H is a 3x3 matrix with 8 degrees of freedom that represents the transformation between the two imaging planes, called the homography matrix;
  • x is the homogeneous representation of the image coordinates before the transformation
  • and x' is the homogeneous representation of the transformed image coordinates.
  • For a common optical center camera the translation t need not be considered, so H can be expressed as

$$ H = K_2 R_2 R_1^{-1} K_1^{-1} $$

  • The homography matrix H can be obtained by establishing at least eight equations from four point pairs. Once H is found, the two images can be stitched by a coordinate transformation that aligns the pixels of the overlapping area. There are several ways to compute H. One is manual: the user selects the coordinates of at least four points on the image before the transformation and of the corresponding four points on the transformed image; from these four point pairs, the relationship above yields a system of at least eight equations from which the homography matrix H is solved.
  • Alternatively, feature point extraction algorithms such as the scale-invariant feature transform (SIFT) can be used to find matching point pairs automatically.
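A minimal sketch of this alignment step with OpenCV (an illustrative implementation, not the patent's own: SIFT matching and RANSAC-based estimation of H are assumptions here):

```python
import cv2
import numpy as np

def stitch_pair(img_left, img_right):
    """Estimate the homography H from matched feature points and warp
    img_right into img_left's coordinate frame."""
    g1 = cv2.cvtColor(img_left, cv2.COLOR_BGR2GRAY)
    g2 = cv2.cvtColor(img_right, cv2.COLOR_BGR2GRAY)

    sift = cv2.SIFT_create()
    k1, d1 = sift.detectAndCompute(g1, None)
    k2, d2 = sift.detectAndCompute(g2, None)

    # Match descriptors and keep the best pairs; at least 4 point pairs
    # are needed to set up the 8 equations that determine H.
    matcher = cv2.BFMatcher(cv2.NORM_L2)
    matches = sorted(matcher.match(d2, d1), key=lambda m: m.distance)[:50]
    src = np.float32([k2[m.queryIdx].pt for m in matches]).reshape(-1, 1, 2)
    dst = np.float32([k1[m.trainIdx].pt for m in matches]).reshape(-1, 1, 2)

    H, _ = cv2.findHomography(src, dst, cv2.RANSAC, 5.0)
    h, w = img_left.shape[:2]
    # Warp the right image so the pixels of the overlap align with the left.
    return cv2.warpPerspective(img_right, H, (w * 2, h))
```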
  • Method 2: the video image is transformed by a combination of scaling, rotation, and translation:

$$ x' = \begin{bmatrix} SR & T \\ 0^{\top} & 1 \end{bmatrix} x $$

  • where S is an image scaling matrix,
  • R is a two-dimensional rotation matrix,
  • T is a translation vector,
  • x is the homogeneous representation of the image coordinates before the transformation,
  • and x' is the homogeneous representation of the transformed image coordinates.
  • Method 3: a cylindrical coordinate transformation converts the plane coordinates into cylindrical coordinates, and the images are stitched by translating them in cylindrical coordinates.
  • The transformation and inverse transformation of the cylindrical coordinates are

$$ x' = f \arctan\frac{x}{f}, \qquad y' = \frac{f\,y}{\sqrt{x^2 + f^2}} $$

$$ x = f \tan\frac{x'}{f}, \qquad y = \frac{y'\sqrt{x^2 + f^2}}{f} $$

  • where f is the equivalent focal length and the coordinates are measured from the principal point, as sketched below.
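A sketch of these two mappings (assuming f is the equivalent focal length in pixels and plane coordinates centred on the principal point):

```python
import numpy as np

def to_cylindrical(x, y, f):
    """Forward cylindrical transform: plane coords -> cylinder coords."""
    xc = f * np.arctan(x / f)
    yc = f * y / np.sqrt(x**2 + f**2)
    return xc, yc

def from_cylindrical(xc, yc, f):
    """Inverse transform: cylinder coords -> plane coords."""
    x = f * np.tan(xc / f)
    y = yc * np.sqrt(x**2 + f**2) / f
    return x, y
```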
  • Steps 302, 303, and 304 can use existing implementations, and changing the order of these three steps does not affect the effect achieved by the present invention.
  • Step 305 The transmitting end combines the at least two local video images according to the fusion parameter in the first video processing parameter to generate a panoramic video image.
  • An ideal seamless image is generally not obtained directly: because of exposure differences between the movements of the common optical center camera, the captured video images differ in brightness or chromaticity,
  • and the difference is especially noticeable at the seams between two video images, so multiple images need to be fused to eliminate the brightness or chromaticity differences between them.
  • Alpha fusion may be performed in the overlap region at the seam of the video images. The formula for alpha fusion is

$$ I = \alpha I_1 + (1 - \alpha) I_2, \qquad \alpha \in [0, 1] $$

  • where I_1 and I_2 are the pixel values of the two images in the overlap region and the weight α ramps across the overlap.
  • Alpha fusion generally blends only the brightness or chrominance differences at the seam of the video images; if the overall brightness or chromaticity of the video images differs greatly, alpha fusion cannot achieve good results.
  • In that case, Laplacian pyramid fusion, gradient-domain fusion, or Poisson fusion can be performed on the entire video image; the specific fusion principles are not described again here.
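Returning to the seam alpha fusion above, a minimal sketch with a linear ramp for α (the overlap width and ramp shape are illustrative choices; images are assumed H x W x C):

```python
import numpy as np

def alpha_blend_overlap(left, right, overlap):
    """Blend two horizontally adjacent images across `overlap` pixels
    with a linear alpha ramp: I = alpha * I_left + (1 - alpha) * I_right."""
    alpha = np.linspace(1.0, 0.0, overlap)[None, :, None]  # ramps 1 -> 0
    l_ov = left[:, -overlap:].astype(np.float32)
    r_ov = right[:, :overlap].astype(np.float32)
    seam = alpha * l_ov + (1.0 - alpha) * r_ov
    return np.hstack([left[:, :-overlap],
                      seam.astype(left.dtype),
                      right[:, overlap:]])
```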
  • Step 306 The transmitting end crops the panoramic video image to a first target ratio according to the cropping region parameter in the first video processing parameters.
  • The purpose of cropping the image is to remove portions that need not be displayed.
  • The first target ratio is chosen manually according to actual conditions.
  • The first video processing parameters are required in steps 302 to 306: before the video images are processed, the first video processing parameters must be configured, and the video images are processed according to the configured parameters.
  • The camera stitching fuser can work together with the PC (also referred to as a first processor): the first video processing parameters are calculated by the PC, configured by the camera stitching fuser,
  • and the images are processed according to those parameters. Besides this implementation, the at least two images may be collected by the camera stitching fuser and the video processing parameters calculated by the camera stitching fuser itself from the collected images; that is,
  • the camera stitching fuser directly completes the whole process of calculating, configuring, and processing images without interacting with the PC.
  • In that case the user can manually control the video processing parameters generated by the camera stitching fuser with a remote control, a mouse, or the like, or
  • an automatic processing algorithm in the camera stitching fuser can generate the required video processing parameters automatically. Alternatively, one or more PCs can perform the calculation, configuration, and image processing alone, without cooperating with the camera stitching fuser:
  • the user generates the required video processing parameters automatically with the calibration software on the PC and then performs the image processing directly on the PC according to those parameters, where the parameter calculation is done by a central processing unit (CPU) and the image processing may be done by a CPU or a graphics processing unit (GPU).
  • The PC described in the above embodiments is only one specific implementation.
  • The image processing described above can be completed by any device having a processor and audio/video input/output.
  • For example, a processor array can be set up in a telepresence management server, with the image processing of the collected images performed on the server side.
  • Step 401 Start the calibration software on a PC.
  • The calibration software has a GUI interface.
  • The GUI interface includes a menu bar, a toolbar, a tab bar, a display area, a status bar, and dialog boxes.
  • The menu bar lets the user select related commands and supports mouse and keyboard shortcut operation;
  • the toolbar lets the user quickly select commonly used commands;
  • the tab bar lists the open images and lets the user switch between them and close them; the display area shows the image the user is currently operating on, supports scroll bars, and lets the user drag to view image content that does not fit in the current window;
  • the status bar shows important current information such as the image size and the current mouse coordinates; dialog boxes are invoked from the menu bar or toolbar to perform complex tasks that require keyboard input.
  • With simple commands, the calibration software can generate the image transformation parameters, the alpha fusion parameters, and the GAMMA correction parameters required by the camera stitching fuser and the display fuser and transmit them to those devices.
  • Step 402 The PC logs in to the camera stitching fuser through the calibration software.
  • The calibration software may be installed locally on the PC as third-party software, or accessed through a built-in WEB page.
  • Step 403 The PC sends an image collection command to the camera stitching fuser.
  • Step 404 The camera stitching fuser obtains at least two images collected from the common optical center camera.
  • After the camera stitching fuser receives the video image collection command, the three video images collected from the common optical center camera are acquired and saved in the camera stitching fuser's buffer.
  • Step 405 The camera stitching fuser sends the at least two collected images to the PC.
  • The camera stitching fuser sends the three video images in its cache to the PC through a data transmission protocol.
  • Step 406 The PC calculates the first video processing parameters from the collected images.
  • The parameters calculated by the PC include one or more of: camera GAMMA correction parameters, camera sensor dead pixel compensation parameters, image transformation parameters, an image alpha fusion parameter table, and image cropping region parameters.
  • Step 407 The PC sends the calculated first video processing parameters to the camera stitching fuser.
  • The data transmission interface between the PC and the camera stitching fuser can use an interface such as Ethernet or USB.
  • The transport protocol can be the File Transfer Protocol (FTP), the Hypertext Transfer Protocol (HTTP), or a custom high-level protocol built on the Transmission Control Protocol (TCP) or the User Datagram Protocol (UDP).
  • The functions of the PC further include sending configuration commands to the camera stitching fuser.
  • Configuration commands can be transmitted in various ways, for example through a serial port, a parallel port, or a network interface. If they are transmitted through the network interface, the remote login protocol Telnet can be used, or a custom high-level protocol built on TCP or UDP.
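A minimal sketch of pushing a configuration command over such a custom TCP-based protocol (host, port, and command format are hypothetical; the patent does not specify them):

```python
import socket

def send_config(command: bytes, host="192.168.1.50", port=9000):
    """Send one configuration command to the camera stitching fuser and
    return its reply. Address and wire format are illustrative only."""
    with socket.create_connection((host, port), timeout=5.0) as sock:
        sock.sendall(command)
        return sock.recv(1024)  # e.g. an acknowledgement from the fuser

# send_config(b"SET GAMMA_TABLE ...")  # hypothetical command string
```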
  • Step 408 The camera stitching fuser configures the received first video processing parameters as the first video processing parameters used in the working state.
  • Step 307 The sending end zooms the size of the panoramic video image to a first target size.
  • The size of the fused video image may not match what is required, so the video image is scaled to the size required by the user.
  • the first target size is determined manually according to actual conditions.
  • Step 308 The transmitting end divides the panoramic video image into at least two pieces of video data.
  • In this embodiment, the panoramic video image is divided into three channels of video data and output to three video communication terminals, which improves data processing speed and reduces the error rate, as sketched below.
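A trivial sketch of this split (assuming, as above, an overlap-free 5760x1080 panorama divided into three 1920x1080 channels):

```python
import numpy as np

def split_panorama(panorama, channels=3):
    """Divide the fused panorama into equal-width vertical strips,
    one per video communication terminal / encoder."""
    h, w = panorama.shape[:2]
    step = w // channels
    return [panorama[:, i * step:(i + 1) * step] for i in range(channels)]

# e.g. a 1080x5760 panorama -> three 1080x1920 streams
strips = split_panorama(np.zeros((1080, 5760, 3), np.uint8))
```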
  • Step 309 The transmitting end encodes the at least two channels of video data into corresponding video code streams with at least two video encoders and sends the video code streams separately.
  • The video encoders in the three video communication terminals encode the three channels of video data to obtain three video code streams and send them into the network, from which the receiving end retrieves them.
  • The video encoder may be integrated in the video communication terminal or in the camera stitching fuser.
  • The transmitting end and the receiving end need to perform synchronized encoding and decoding.
  • A single video communication terminal can also be used to encode and send the three channels of video data.
  • The advantage of this method is that synchronization of the three channels of video data is relatively easy to implement, and the structure of the whole video communication system can be simplified;
  • however, this method requires the video communication terminal to have higher encoding processing capability.
  • In the embodiment of the present invention, a common optical center camera is used to capture the video images.
  • The principle and structure of the common optical center camera are described in detail below:
  • 1001 is a prismatic structure having three surfaces 1002, 1003, and 1004; these surfaces are flat mirrors, and three camera movements C01, C02, and C03 are placed under the mirror surfaces.
  • The principle of the virtual common optical center is explained taking camera C02 as an example.
  • L02 is the incident ray
  • and R02 is the reflected ray;
  • 1006 is the normal to the reflecting surface 1003, and the normal 1006 and the horizontal line 1010 form an included angle.
  • The camera captures a virtual image with a virtual optical center V02.
  • By adjustment, the virtual optical centers of cameras C01, C02, and C03 can be made to coincide at the same point, yielding a virtual common optical center camera whose three captured images can be stitched and fused to obtain images that stitch seamlessly at any depth.
  • FIG. 13 is a block diagram showing the construction of a common center camera used in an embodiment of the present invention.
  • C01, C02, and C03 are three HD camera movements that support 1920x1080 HD video output. To obtain a better vertical eye-to-eye effect, the mirrors are placed underneath and the camera movements on top for shooting.
  • Surfaces 1002, 1003, and 1004 are mirror surfaces, and the three camera movements can be adjusted independently to compensate for structural machining errors and errors in the camera movements themselves.
  • The adjustment freedom of each movement includes translation along and rotation about the three XYZ axes, with the camera movement as the coordinate origin. When shooting, the focal lengths of the cameras need to be adjusted to the same value to ensure that each camera covers the same angular range.
  • Fig. 14 shows the effect of mounting the common optical center camera 9 on the bracket 81 of the arc screen 8 according to an embodiment of the present invention.
  • To capture the required range of the desktop, the camera's optical axis must have a downward tilt angle, which can be adjusted by the device 91 mounted on the projection screen bracket; the angle is 8.5 degrees in this embodiment.
  • Step 310 The receiving end acquires the at least two video code streams sent by the video communication transmitting end from the network
  • and decodes at least two channels of video data from them with the video decoders of the receiving end.
  • The three video communication terminals at the receiving end acquire the three encoded video code streams from the network and decode them with their video decoders to obtain three channels of decoded video data.
  • The video decoder may be integrated in the video communication terminal of the receiving end, or may be integrated in the display fuser.
  • The three video code streams received from the network can also be decoded by a single video communication terminal at the receiving end, in which multiple decoders are set up to complete the decoding of the three video code streams.
  • Thus a single video communication terminal can be used to receive and decode the three video code streams.
  • The advantage of this method is that synchronization of the multiple channels of video data is relatively easy to implement, and the structure of the whole video communication system can be simplified; however, this method requires the video communication terminal to have higher decoding processing capability.
  • Step 311: The receiving end performs GAMMA correction on the at least two channels of video data according to the GAMMA correction parameter in the second video processing parameter.
  • Similar to the transmitting end, before processing the obtained video data the display fuser checks whether the video display parameters required to process the video data have been configured. If the parameters are not configured, the three channels of video data are passed through transparently, that is, the video data are not processed and are output directly to the display device; if the parameters are configured, video data processing is performed.
  • In this embodiment, the display fuser sends the three output channels of video data to three projectors. Because of differences inside each projector and differences between the projectors, there are brightness and color differences between the three channels of video data; therefore, projector GAMMA correction needs to be performed in the display fuser before the video data are displayed.
  • The brightness and color differences between projectors can be corrected by a capture-feedback method.
  • Template images of levels 0-255 for each of the three RGB color components are projected and compared with the RGB color components of the three channels of the panoramic image, so that brightness and color difference curves between the three projectors can be established. Assume P1 and P2 are two different projectors, as shown in Figure 18.
  • The abscissa of the curve is the color level of the R component of the template image, ranging from 0 to 255;
  • the ordinate is the color R component of two of the channels of video data, which can be regarded as a function f(R) of the R component of the template image. In this way, each projector can establish a curve for its color R component.
  • For each of the 0-255 levels, the difference Δf between the R components of the two projectors can be calculated; this difference can also be regarded as a function of the R component of the template image.
  • Taking the chromaticity curve of one projector's R component as the reference and adjusting the chromaticity curve of the other projector's R component, the color R components of the two panoramic image channels displayed by the two projectors can be made consistent (see the sketch below). The other two color components, G and B, are processed in the same way and are not described again here.
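  • A sketch of how such a matching correction could be derived, assuming each projector's response curve has already been measured at all 256 levels by the capture-feedback method (the synthetic curves below stand in for those measurements):

```python
import numpy as np

def matching_lut(f_ref, f_other):
    """Build a 256-entry lookup table so that the second projector reproduces
    the reference projector's response: f_other(lut[r]) ~= f_ref(r)."""
    lut = np.empty(256, dtype=np.uint8)
    for r in range(256):
        # pick the input level whose measured output is closest to the reference
        lut[r] = np.argmin(np.abs(f_other - f_ref[r]))
    return lut

# f_ref and f_other stand in for the capture-feedback measurements of the
# R component of projectors P1 and P2 (illustrative synthetic curves).
levels  = np.arange(256, dtype=np.float64)
f_ref   = 255.0 * (levels / 255.0) ** 2.2   # assumed response curve of P1
f_other = 255.0 * (levels / 255.0) ** 1.8   # assumed response curve of P2

lut = matching_lut(f_ref, f_other)
# corrected_r = lut[r_channel] would then be applied to P2's R component;
# the G and B components are handled with their own curves in the same way.
```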
  • The brightness and color differences inside a projector can also be corrected by a method similar to the one above. Taking one projector projecting one of the three channels of video data as an example, the channel of video data is first divided into blocks, and then a brightness and color difference curve is established for each block; the specific implementation follows the method for establishing the curves between projectors and is not repeated here.
  • Besides correcting the brightness and color differences between and inside the projectors, the projector's light-leak compensation must also be performed to obtain a better projection effect. Because a projector leaks light when projecting a pure black image, the projected image is not pure black but has a certain brightness, so the brightness of the three channels of video data in the overlapping areas will not be consistent with the brightness of the non-overlapping areas. The luminance difference between the overlapping and non-overlapping areas is obtained by calculation, and the calculated luminance value is added to the non-overlapping areas so that the overlapping and non-overlapping areas have uniform brightness.
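  • A minimal sketch of this light-leak compensation, assuming the overlap mask and the measured luminance difference are already known (both values below are illustrative):

```python
import numpy as np

def compensate_light_leak(frame, overlap_mask, leak_offset):
    """Raise the black level of non-overlapping pixels so they match the
    brighter overlap region, where two projectors' leaked light adds up.

    frame:        HxWx3 float image in [0, 255]
    overlap_mask: HxW bool array, True where projections overlap
    leak_offset:  measured luminance difference (overlap minus non-overlap)
    """
    out = frame.copy()
    out[~overlap_mask] += leak_offset       # lift only the non-overlap area
    return np.clip(out, 0.0, 255.0)

# Illustrative use: a 60-pixel-wide overlap band on the right edge of a frame.
frame = np.zeros((1080, 1920, 3), dtype=np.float64)
mask = np.zeros((1080, 1920), dtype=bool)
mask[:, -60:] = True
compensated = compensate_light_leak(frame, mask, leak_offset=4.0)
```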
  • Step 312: The receiving end performs projection correction on the at least two channels of video data according to the projection correction parameter in the second video processing parameter.
  • In this embodiment the display device may be a projector and a screen. When the screen is an arc screen, an elliptical screen, or a parabolic screen, the influence of the screen's shape on the displayed image must be considered: the video data deform when projected onto a screen of such a shape, so projection correction is performed (see the sketch below).
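  • The geometric corrections here adapt the same family of mappings used for image transformation at the transmitting end (step 304). As one illustrative member of that family, the sketch below applies the cylindrical mapping x' = s*atan(x/f), y' = s*y/sqrt(x^2 + f^2) with nearest-neighbor sampling; the focal-length and scale values are assumptions:

```python
import numpy as np

def cylindrical_warp(img, f, s):
    """Warp a plane image into cylindrical coordinates using
    x' = s*atan(x/f), y' = s*y/sqrt(x^2 + f^2), centered on the image."""
    h, w = img.shape[:2]
    out = np.zeros_like(img)
    yy, xx = np.mgrid[0:h, 0:w]
    xc, yc = xx - w / 2.0, yy - h / 2.0        # destination coords, centered
    # invert the mapping: x = f*tan(x'/s), y = (y'/s)*sqrt(x^2 + f^2)
    x = f * np.tan(xc / s)
    y = (yc / s) * np.sqrt(x ** 2 + f ** 2)
    src_x = np.round(x + w / 2.0).astype(int)
    src_y = np.round(y + h / 2.0).astype(int)
    valid = (src_x >= 0) & (src_x < w) & (src_y >= 0) & (src_y < h)
    out[yy[valid], xx[valid]] = img[src_y[valid], src_x[valid]]
    return out

# f and s are illustrative values; in practice they come from calibration.
frame = np.zeros((1080, 1920, 3), dtype=np.uint8)
warped = cylindrical_warp(frame, f=1500.0, s=1500.0)
```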
  • Step 313: The receiving end transforms the at least two channels of video data according to the transform parameter in the second video processing parameter.
  • The transform includes at least one of: translation of the video data, rotation of the video data, and homography transformation of the video data.
  • The video data transformation in this step can be used to compensate for image distortion and misalignment caused by inaccurate placement of the projectors.
  • For the specific transformation method, refer to the image transformation performed at the transmitting end (a sketch follows); the principle is not repeated here.
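  • A minimal sketch of applying such a transform as a 3x3 homography with OpenCV; the matrix below is an illustrative stand-in for the configured transform parameter (a pure translation or rotation is a special case of it):

```python
import numpy as np
import cv2

# Illustrative homography: a slight rotation plus a small shift, standing in
# for the transform parameter delivered in the second video processing
# parameter (these numbers are assumptions, not calibration output).
theta = np.deg2rad(0.5)
H = np.array([
    [np.cos(theta), -np.sin(theta), 12.0],   # rotation plus x-shift of 12 px
    [np.sin(theta),  np.cos(theta), -7.0],   # rotation plus y-shift of -7 px
    [0.0,            0.0,            1.0],
], dtype=np.float64)

frame = np.zeros((1080, 1920, 3), dtype=np.uint8)
h, w = frame.shape[:2]
aligned = cv2.warpPerspective(frame, H, (w, h))  # resample into corrected frame
```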
  • Step 314: The receiving end fuses the at least two channels of video data according to the fusion parameter in the second video processing parameter.
  • The purpose of image fusion is to ensure that the two channels of video data do not show an obvious brightness difference at the seam.
  • The specific method is to first create an overlapping area at the seam of the two channels of video data and then perform Alpha fusion in that overlapping area; preferably, nonlinear Alpha fusion is used, for example with the weights Alpha1(x) = (1/2 + (1/2)cos(θx))^γ and Alpha2(x) = (1/2 - (1/2)cos(θx))^γ, where θ is an angle value and γ is a GAMMA value; the best fusion effect can be obtained by adjusting θ and γ (see the sketch below).
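  • A sketch of that nonlinear blend across an overlap band, with x normalized to [0, 1] and θ = π so the weights fall from 1 to 0 across the seam; the normalization and the θ, γ values are illustrative assumptions:

```python
import numpy as np

def nonlinear_alpha_blend(left, right, overlap, gamma=1.0, theta=np.pi):
    """Blend the overlapping columns of two frames with the nonlinear weights
    Alpha1(x) = (1/2 + 1/2*cos(theta*x))**gamma and its complementary curve."""
    x = np.linspace(0.0, 1.0, overlap)                 # position across seam
    a1 = (0.5 + 0.5 * np.cos(theta * x)) ** gamma      # weight for left frame
    a2 = (0.5 - 0.5 * np.cos(theta * x)) ** gamma      # weight for right frame
    a1 = a1[np.newaxis, :, np.newaxis]                 # broadcast over rows/channels
    a2 = a2[np.newaxis, :, np.newaxis]
    seam = a1 * left[:, -overlap:] + a2 * right[:, :overlap]
    return np.concatenate([left[:, :-overlap], seam, right[:, overlap:]], axis=1)

left  = np.full((1080, 1920, 3), 200, dtype=np.float64)
right = np.full((1080, 1920, 3), 180, dtype=np.float64)
panorama = nonlinear_alpha_blend(left, right, overlap=64, gamma=1.2)
```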
  • Step 315: The receiving end crops the aspect ratio of the fused at least two channels of video data to a second target ratio according to the cropping area parameter in the second video processing parameter.
  • In this embodiment, to be compatible with the display mode of current video conference systems, the display fuser can be connected not only to projectors but also to three high-definition flat panel displays. A flat panel display has a bezel of some thickness and cannot display seamlessly, so the part of the image that falls behind the bezel needs to be cropped away.
  • The second target ratio is determined based on the bezel thickness of the display.
  • The display bezel width is defined by the user, and the calibration software on the PC converts the bezel width of the flat panel display from millimeters into pixels according to the size and resolution of the flat panel display (see the sketch below).
  • The display fuser crops each channel of video data according to the calculated bezel width to generate the cropped image.
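  • The millimeter-to-pixel conversion and the crop could look like the following sketch; the panel dimensions and bezel width are illustrative user-supplied values:

```python
def bezel_mm_to_px(bezel_mm, panel_width_mm, panel_width_px):
    """Convert a bezel width in millimeters to pixels using the panel's
    physical width and horizontal resolution."""
    return round(bezel_mm * panel_width_px / panel_width_mm)

def crop_bezel(frame, bezel_px):
    """Drop the columns that would fall behind the bezels between panels."""
    return frame[:, bezel_px:frame.shape[1] - bezel_px]

# Illustrative values: a 1920x1080 panel, 1018 mm wide, with 12 mm bezels.
bezel_px = bezel_mm_to_px(bezel_mm=12.0, panel_width_mm=1018.0, panel_width_px=1920)
# cropped = crop_bezel(channel, bezel_px) is applied to each channel of video data.
```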
  • Step 316: The receiving end scales the size of the fused at least two channels of video data to a second target size.
  • In this embodiment, after cropping, the video data may have become smaller; the video data is scaled to the size required for display.
  • Step 317: The receiving end outputs the fused at least two channels of video data to the display device, and the display device displays the fused at least two channels of video data.
  • In this embodiment, the display device includes a projector and a screen; the screen may be an arc screen, an elliptical screen, a parabolic screen, a folding screen, or a straight screen.
  • There are at least two display devices; in this embodiment, three projectors are provided to respectively project the three channels of video data.
  • The video data may also be a single channel, that is, the panoramic video image is not divided at the transmitting end; in that case only one projector or one display is needed to show the obtained video data, but the display effect is poorer.
  • Preferably, an arc screen is used as the projection screen in this embodiment.
  • The elliptical screen and the parabolic screen are similar to the arc screen, except that, because the geometry of the projection screen changes, the algorithm described in step 304 needs to be modified accordingly.
  • When a folding screen or a straight screen is used, the video data is not distorted by projection onto a plane, so no geometric correction of the video data is performed. Further, the transition between the facets of a folding screen can be made with an obtuse angle or with a rounded corner; the rounded transition looks more natural than the obtuse one. The larger the fillet radius, the better the transition effect, but the rounded transition area requires geometric correction of the video data.
  • When a straight screen is used, the image transformation is the simplest.
  • Further, the shape of the conference table can be modified according to the geometry of the screen to obtain a better presentation effect; for example, when a folding screen or a straight screen is used, the conference table can be changed into the form of a folding table.
  • Video communication terminals are used in steps 309 and 310. The structure of the video communication terminal is described in detail below:
  • Each component module of the video communication terminal includes: an audio codec, configured to encode or decode the received audio signal, where the codec standard may be G.711, G.722, G.723, G.728 or G.729; a video codec, configured to encode or decode the received video signal, where the coding standard may be H.261 or H.263; a system control unit, configured to provide the signaling for correct operation of the video communication terminal, the signaling including call control, capability exchange, command and indication signaling, and messages; and a formatting unit, configured to format the audio, video, data and control streams to be sent into messages output to the network interface, and to extract audio, video, data and control streams from messages received from the network interface. In addition, the formatting unit performs logical framing, sequence numbering, error detection and error correction for each media type (see the sketch below).
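  • As a rough illustration of the formatting unit's logical framing and sequence numbering, a frame could carry a media type, a sequence number, a timestamp and an error-detection checksum; the header layout below is an assumption made for the sketch, not a format defined by this embodiment:

```python
import struct
import zlib

def frame_media_unit(media_type: int, seq: int, timestamp_ms: int, payload: bytes) -> bytes:
    """Prepend a header (media type, sequence number, timestamp) and append a
    CRC32 so the receiver can detect errors and restore ordering."""
    header = struct.pack("!BHI", media_type, seq & 0xFFFF, timestamp_ms & 0xFFFFFFFF)
    body = header + payload
    return body + struct.pack("!I", zlib.crc32(body) & 0xFFFFFFFF)

packet = frame_media_unit(media_type=1, seq=42, timestamp_ms=1234567, payload=b"\x00" * 188)
```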
  • The second video processing parameter is required in steps 311 to 315. Before the video data is processed, the second video processing parameter needs to be configured, and the video data is then processed according to the configured parameters. The configuration method of the second video processing parameter is described in detail below:
  • In this embodiment the display fuser may be combined with a PC, also referred to as the second processor: the second video processing parameter is calculated by the PC, and the display fuser configures the parameter and processes the video data according to it. The second video processing parameter may also be calculated directly by the display fuser, that is, the display fuser performs the whole process of calculating and configuring the parameter and processing the video data without interacting with a PC; or the process of calculating, configuring and processing the video data may be performed by one or more PCs alone, without interacting with the display fuser.
  • When the display fuser and the PC jointly perform the calculation and configuration, the specific configuration method can be implemented by the steps shown in FIG. 17:
  • Step 501: Start the calibration software on the PC.
  • The calibration software is the same as the calibration software at the transmitting end and is not described again here.
  • Step 502: The PC logs in to the display fuser through the calibration software.
  • Step 503: The PC calculates the second video processing parameter.
  • In this embodiment, the parameters calculated by the PC include the projector GAMMA correction parameter, the video image projection correction parameter, the video image transformation parameter table, the video image Alpha fusion parameter table, and the image cropping area parameter.
  • Step 504: The PC sends the calculated second video processing parameter to the display fuser.
  • The data transmission interface between the PC and the display fuser may use Ethernet, USB, or a similar interface; the transmission protocol may be FTP, HTTP, or a custom high-level protocol over TCP or UDP. When the amount of calculated parameter data is large, it is transmitted to the display fuser through the data transmission protocol (see the transfer sketch below).
  • The functions of the PC also include sending configuration commands to the display fuser. As with the configuration commands the PC sends to the camera stitching fuser, these commands can be transmitted in various ways, for example through a serial port, a parallel port, or a network interface; if the network interface is used, the Telnet protocol, or the TCP or UDP protocol, can be used for transmission.
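  • A minimal sketch of such a custom length-prefixed transfer over TCP; the address, port and framing are assumptions made for illustration, not details of the embodiment:

```python
import socket
import struct

def send_parameters(host: str, port: int, blob: bytes) -> None:
    """Send a serialized parameter blob to the display fuser, prefixed with
    its length so the receiver knows how many bytes to expect."""
    with socket.create_connection((host, port)) as sock:
        sock.sendall(struct.pack("!I", len(blob)) + blob)

def recv_exact(conn: socket.socket, n: int) -> bytes:
    buf = b""
    while len(buf) < n:
        chunk = conn.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("peer closed before the blob was complete")
        buf += chunk
    return buf

def recv_parameters(conn: socket.socket) -> bytes:
    """Read one length-prefixed parameter blob from an accepted connection."""
    (length,) = struct.unpack("!I", recv_exact(conn, 4))
    return recv_exact(conn, length)

# send_parameters("192.0.2.10", 5000, serialized_tables) would push the
# calculated tables to the fuser; the address is a documentation example.
```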
  • Step 505: The display fuser configures the received second video processing parameter as the second video processing parameter used in the working state.
  • After the second video processing parameter is configured, steps 311 to 315 can be performed on the three channels of video data, and the processed video data is displayed as described in step 317. At this point, the method steps at the video communication receiving end are complete.
  • In the video communication method provided by the embodiments of the present invention, the transmitting end of the video communication fuses the acquired at least two channels of video images into a panoramic video image. The fused panoramic video image can represent more realistically the positional relationship of the junction areas between adjacent video images, so that the finally displayed image gives the user a more realistic panoramic experience, and the problems that adjacent video images captured by the cameras have overlapping or missing areas at the junctions and are inconsistent in brightness and color are solved.
  • After the video communication transmitting end encodes the panoramic video image into a video code stream and sends it to the video communication receiving end, the receiving end performs further fusion processing and then outputs the fused video images to the display device for display. The fusion processing performed at the receiving end enables multiple projected images to be presented seamlessly on the curved screen with small differences in color and brightness between the projection areas, which improves the visual continuity of the panoramic video image and can give the user a better immersive panoramic experience.
  • An embodiment of the present invention further provides a device for video communication, applied at the transmitting end of a video communication system and including:
  • a first acquiring unit 601, configured to acquire at least two channels of local video images;
  • a first fusion unit 602, configured to fuse the at least two channels of local video images acquired by the first acquiring unit 601 according to the fusion parameter in the first video processing parameter, to generate a panoramic video image;
  • a first sending unit 603, configured to send the panoramic video image obtained by the first fusion unit 602 to a video encoder, encode the panoramic video image into a video code stream by the video encoder, and send the video code stream to the remote video communication site.
  • The panoramic video image obtained by fusion can represent more realistically the positional relationship of the junction areas between adjacent video images, so that the finally displayed video image gives the user a more realistic panoramic experience, and the problems of overlapping or missing areas at the junctions of adjacent camera images and of inconsistent brightness and color are solved.
  • Further, the device for video communication includes:
  • a synchronization unit 604, configured to provide a synchronization clock so that the first acquiring unit 601 acquires the at least two channels of local video images under the calibration of the synchronization clock;
  • a first GAMMA correction unit 605, configured to perform GAMMA correction on the at least two channels of local video images acquired by the first acquiring unit 601, according to the GAMMA correction parameter in the first video processing parameter, before the images are fused.
  • Because the camera stitching fuser can receive video images already processed by the common-optical-center camera modules, and can also receive unprocessed video images sent directly by the common-optical-center camera sensors, such as CCD or CMOS sensors, it needs to perform GAMMA correction and sensor dead-pixel compensation on unprocessed video images to improve their display quality.
  • a dead-pixel compensation unit 606, configured to perform sensor dead-pixel compensation on the at least two channels of local video images acquired by the first acquiring unit 601, according to the dead-pixel compensation parameter in the first video processing parameter, before the images are fused.
  • The dead-pixel compensation can obtain the pixel value at a dead pixel by interpolating the values of its neighboring pixels on the video image, which eliminates dead pixels on the video image and improves the display quality (see the sketch below).
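  • A minimal sketch of that neighbor interpolation, assuming the dead-pixel coordinates come from the configured compensation parameter (the positions below are examples):

```python
import numpy as np

def compensate_dead_pixels(img, dead_pixels):
    """Replace each dead pixel with the mean of its valid 4-neighbors.

    img:         HxWx3 array from the sensor
    dead_pixels: iterable of (row, col) positions from the compensation parameter
    """
    h, w = img.shape[:2]
    out = img.copy()
    for r, c in dead_pixels:
        neighbors = [(r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)]
        samples = [img[i, j] for i, j in neighbors if 0 <= i < h and 0 <= j < w]
        out[r, c] = np.mean(samples, axis=0)   # interpolated replacement value
    return out

raw = np.random.randint(0, 256, (1080, 1920, 3), dtype=np.uint8)
fixed = compensate_dead_pixels(raw, dead_pixels=[(100, 200), (512, 1024)])
```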
  • a first transform unit 607, configured to transform the at least two channels of local video images according to the transform parameters in the first video processing parameter, before the images acquired by the first acquiring unit 601 are fused; the transform includes at least one of: translation of the video image, rotation of the video image, homography transformation of the video image, and cylindrical transformation of the video image.
  • a first cropping unit 608, configured to crop the aspect ratio of the panoramic video image obtained by the first fusion unit 602 to a first target ratio, according to the cropping area parameter in the first video processing parameter, after the at least two channels of local video images acquired by the first acquiring unit 601 are fused.
  • In this embodiment, the purpose of cropping the image is to eliminate the part of the image that does not need to be displayed; the first target ratio is determined manually according to the actual situation.
  • a first scaling unit 609, configured to scale the size of the panoramic video image obtained by the first fusion unit 602 to a first target size, after the at least two channels of local video images acquired by the first acquiring unit 601 are fused.
  • After cropping, the video image may have become smaller; it is scaled to the size the user requires. The first target size is determined manually according to the actual situation.
  • a splitting unit 610, configured to split the panoramic video image obtained by the first fusion unit 602 into at least two channels of video data, after the at least two channels of local video images acquired by the first acquiring unit 601 are fused.
  • In this embodiment, the fused panoramic video image is split into three channels of video data and output to three video communication terminals, which can increase the data processing speed and reduce the error rate (a splitting sketch follows).
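  • A one-call sketch of that split along the panorama's width, assuming three equal-width channels:

```python
import numpy as np

# A fused panorama of roughly 48:9 aspect ratio, split into three equal
# channels along its width for the three video communication terminals.
panorama = np.zeros((1080, 3 * 1920, 3), dtype=np.uint8)
left, center, right = np.array_split(panorama, 3, axis=1)
```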
  • Further, when the communication device does not need to interact with a PC, the communication device further includes:
  • a first collection unit 611, configured to collect at least two images;
  • a first calculating unit 612, configured to calculate the first video processing parameter according to the at least two images collected by the first collection unit 611;
  • a first configuration unit 613, configured to configure the first video processing parameter calculated by the first calculating unit 612 as the first video processing parameter used in the working state.
  • Further, when the communication device needs to interact with a PC, the communication device further includes:
  • a command receiving unit 614, configured to receive an image collection command sent by the first processor;
  • a second collection unit 615, configured to collect at least two images;
  • a second sending unit 616, configured to send the at least two images collected by the second collection unit 615 to the first processor;
  • a first parameter receiving unit 617, configured to receive the first video processing parameter calculated by the first processor according to the at least two images collected by the second collection unit 615;
  • a second configuration unit 618, configured to configure the first video processing parameter received by the first parameter receiving unit 617 as the first video processing parameter used in the working state.
  • An embodiment of the present invention further provides a device for video communication, applied at the receiving end of a video communication system and including:
  • a second acquiring unit 701, configured to acquire at least two channels of video data decoded by the video decoder from the video code streams, where the video code streams are received by the video decoder from a remote video communication site;
  • a second fusion unit 702, configured to fuse the at least two channels of video data acquired by the second acquiring unit 701 according to the fusion parameter in the second video processing parameter;
  • an output unit 703, configured to output the at least two channels of video data fused by the second fusion unit 702 to the display device, where the display device displays the fused at least two channels of video data.
  • The device for video communication receives and decodes the video code streams sent by the video communication transmitting end, the second fusion unit performs further fusion processing on the decoded video data, and the output unit outputs the fused video images to the display device for display. The fusion processing performed at the video communication receiving end enables multiple projected images to be presented seamlessly on the curved screen with small differences in color and brightness between the projection areas, which improves the visual continuity of the panoramic video image and can give the user a better immersive panoramic experience.
  • Further, the video communication device includes:
  • a second GAMMA correction unit 704, configured to perform GAMMA correction on the at least two channels of video data acquired by the second acquiring unit 701, according to the GAMMA correction parameter in the second video processing parameter, before the video data are fused.
  • Because of the brightness and color differences between and inside the projectors, GAMMA correction can be performed by the capture-feedback method to eliminate the influence of the projectors on the display effect of the video images.
  • a projection correction unit 707, configured to perform projection correction on the at least two channels of video data acquired by the second acquiring unit 701, according to the projection correction parameter in the second video processing parameter, before the video data are fused;
  • a second transform unit 708, configured to transform the at least two channels of video data according to the transform parameter in the second video processing parameter, before the video data acquired by the second acquiring unit 701 are fused; the transform includes at least one of: translation of the video data, rotation of the video data, and homography transformation of the video data.
  • The image transformation here can be used to compensate for image distortion and misalignment caused by inaccurate placement of the projectors.
  • a second cropping unit 705, configured to crop the aspect ratio of the at least two channels of video data fused by the second fusion unit 702 to a second target ratio, according to the cropping area parameter in the second video processing parameter, after the video data acquired by the second acquiring unit 701 are fused.
  • In this embodiment, to be compatible with the display mode of current video conference systems, the display fuser can be connected not only to projectors but also to three high-definition flat panel displays; since a flat panel display has a bezel of some thickness and cannot display seamlessly, the part of the image at the bezel needs to be cropped away. The second target ratio is determined based on the bezel thickness of the display.
  • a second scaling unit 706, configured to scale the size of the at least two channels of video data fused by the second fusion unit 702 to a second target size, after the video data acquired by the second acquiring unit 701 are fused; in this embodiment, the video data may have become smaller after cropping, and it is scaled to the size required for display.
  • Further, when the communication device does not need to interact with a PC, the communication device further includes:
  • a second calculating unit 709, configured to calculate the second video processing parameter.
  • The second video processing parameter includes the projector GAMMA correction parameter, the video image projection correction parameter, the video image transformation parameter table, the video image Alpha fusion parameter table, and the image cropping area parameter.
  • a third configuration unit 710, configured to configure the second video processing parameter calculated by the second calculating unit 709 as the second video processing parameter used in the working state.
  • Further, when the communication device needs to interact with a PC, the communication device further includes:
  • a second parameter receiving unit 711, configured to receive the second video processing parameter calculated by the second processor; in this embodiment, the parameters calculated by the second processor include the projector GAMMA correction parameter, the video image projection correction parameter, the video image transformation parameter table, the video image Alpha fusion parameter table, and the image cropping area parameter;
  • a fourth configuration unit 712, configured to configure the second video processing parameter received by the second parameter receiving unit 711 as the second video processing parameter used in the working state.
  • The technical solutions provided by the embodiments of the present invention can be applied in technical fields of video communication such as video conferencing.


Description

视频通信的方法、 装置和系统 技术领域
本发明涉及通信领域, 尤其涉及一种视频通信的方法、 装置和系统。 背景技术
现有技术中的远程呈现会议系统的大致布局如图 1 所示。 该系统包含三 个大屏幕显示器 1、 2、 3 , 三个高清摄像机 4、 5、 6, 用来拍摄会议桌 8前面 坐着的参会者 14至 19。每个显示器显示一部分参会者, 例如每个显示器显示 两个参会者, 3个显示器显示的内容构成一个完整的会议场景。
在实现本发明的过程中, 发明人发现现有技术中至少存在如下问题: 由 于多个摄像机之间和多个显示设备之间存在差异, 使得每幅图像之间会存在 亮度和颜色差异; 并且每两幅图像在交接处会存在图像过渡差异, 例如, 在 图 1 中, 摄像机 4和 5由于安放位置的限制, 造成相邻两个摄像机的拍摄区 域产生缺失或重叠, 例如摄像机 4会拍摄到 16的一部分图像, 或者 5会拍摄 到 15的一部分图像, 或者摄像机 4和 5都没有拍摄到 15和 16中间部分的区 域, 因而导致最终图像显示效果不理想, 不能给予用户一个无缝式的全景体 验。 发明内容
本发明的实施例提供一种视频通信的方法、 装置和系统, 能够生成大范 围、 高分辨率的全景视频图像并进行无缝呈现, 给用户提供一个较好的沉浸 式全景体验。
为达到上述目的, 本发明的实施例釆用如下技术方案:
一种视频通信站点, 包括:
至少两个本地摄像机, 用于指向至少两个本地用户部分, 拍摄至少两路 本地用户部分的本地视频图像; 本地摄像拼接融合器, 用于根据第一视频处理参数中的融合参数, 对所 述拍摄得到的至少两路本地用户部分的本地视频图像进行融合, 生成全景视 频图像; 并将所述全景视频图像编码成视频码流, 将所述视频码流发送给远 端的视频通信站点;
本地显示融合器, 用于从远端接收到的视频码流中分别解码得到至少两 路视频数据; 根据第二视频处理参数中的融合参数, 对所述解码得到的至少 两路视频数据进行融合; 将融合后的至少两路视频数据输出给本地显示设备; 至少两个本地显示设备, 用于显示经过所述本地显示融合器融合后的至 少两路视频数据。
提供一种视频通信的方法, 包括:
获取至少两路本地视频图像;
根据第一视频处理参数中的融合参数, 对所述至少两路本地视频图像进 行融合, 生成全景视频图像;
将所述全景视频图像发送给视频编码器, 通过所述视频编码器将所述全 景视频图像编码成视频码流, 并将所述视频码流发送出去。
还提供一种视频通信的方法, 包括:
获取视频解码器从视频码流中解码出的至少两路视频数据, 所述视频码 流由所述视频解码器从远端的视频通信站点接收得到;
根据第二视频处理参数中的融合参数, 对所述至少两路视频数据进行融 合;
将所述融合后的至少两路视频数据输出给显示设备, 由所述显示设备显 示所述融合后的至少两路视频数据。
还提供一种视频通信的装置, 包括:
第一获取单元, 用于获取至少两路本地视频图像;
第一融合单元, 用于根据第一视频处理参数中的融合参数, 对由所述第 一获取单元获取的至少两路本地视频图像进行融合, 生成全景视频图像; 第一发送单元, 用于将由所述第一融合单元获得的全景视频图像发送给 视频编码器, 通过所述视频编码器将所述全景视频图像编码成视频码流, 并 将所述视频码流发送出去。
还提供一种视频通信的装置, 包括:
第二获取单元, 用于获取视频解码器从视频码流中解码出的至少两路视 频数据, 所述视频码流由所述视频解码器从远端的视频通信站点接收得到; 第二融合单元, 用于根据第二视频处理参数中的融合参数, 对由所述第 二获取单元获取的至少两路视频数据进行融合;
输出单元, 用于将由所述第二融合单元融合后的至少两路视频数据输出 给显示设备, 由所述显示设备显示所述融合后的至少两路视频数据。
一种视频通信的系统, 包括至少两个视频通信站点。 所述至少两个视频 通信站点的其中一个站点, 用于拍摄至少两路本地用户部分的本地视频图像; 根据第一视频处理参数中的融合参数, 对所述拍摄得到的至少两路本地用户 部分的本地视频图像进行融合, 生成全景视频图像; 并将所述全景视频图像 编码成视频码流, 将所述视频码流通过网络发送出去; 所述至少两个视频通 信站点的至少一个站点, 作为接收站点, 用于从接收到的视频码流中分别解 码得到至少两路视频数据; 根据第二视频处理参数中的融合参数, 对所述解 码得到的至少两路视频数据进行融合; 将融合后的至少两路视频数据输出显 示。
本发明实施例提供的视频通信的方法、 装置和系统, 在视频通信的发送 端将所获取的至少两路视频图像融合为全景视频图像, 融合得到的全景视频 图像能够更真实地表现相邻视频图像之间交接区域位置关系, 使得最后显示 出的图像给用户更真实的全景式体验, 解决了摄像机拍摄的相邻视频图像在 交接处存在重叠或缺失区域、 并且亮度和颜色不一致的问题; 本发明实施例 中的视频通信发送端将融合后的全景视频图像编码成视频码流发送给视频通 信接收端后, 视频通信接收端对其进行进一步地融合处理, 再将融合处理后 的视频图像输出到显示设备进行显示, 视频通信接收端进行的融合处理能够 使得多个投影图像在弧形幕上无缝呈现, 并且各个投影区域在颜色和亮度方 面差异较小, 提高了全景视频图像的视觉连续性, 能够给用户更好的沉浸式 全景体验。 附图说明
为了更清楚地说明本发明实施例或现有技术中的技术方案, 下面将对实 施例或现有技术描述中所需要使用的附图作简单地介绍, 显而易见地, 下面 描述中的附图是本发明的一些实施例, 对于本领域普通技术人员来讲, 在不 付出创造性劳动的前提下, 还可以根据这些附图获得其他的附图。
图 1为现有技术中视频通信的系统布局示意图;
图 2为本发明实施例提供的视频通信系统布局的俯视图一;
图 3为本发明实施例提供的视频通信系统布局的俯视图二;
图 4为本发明实施例提供的视频通信系统布局的侧视图;
图 5为本发明实施例提供的视频通信系统布局的俯视图三;
图 6为本发明实施例提供的视频通信系统的设备连接图;
图 7为本发明实施例提供的视频通信的方法流程图;
图 8为本发明另一实施例提供的视频通信的方法流程图;
图 9为本发明又一实施例提供的视频通信发送端方法流程图;
图 10为本发明又一实施例提供的视频通信接收端方法流程图; 图 11为本发明实施例提供的共光心摄像机原理图一;
图 12为本发明实施例提供的共光心摄像机原理图二;
图 13为本发明实施例提供的共光心摄像机结构示意图;
图 14为本发明实施例提供的共光心摄像机和弧面幕的安装结构图; 图 15为本发明实施例提供的视频通信的方法中发送端的第一视频处理参 数的配置流程图; 图 17为本发明实施例提供的视频通信的方法中接收端的第二视频处理参 数的配置流程图;
图 18为本发明实施例提供的接收端投影仪之间的亮度和颜色差异曲线; 图 19为本发明实施例提供的发送端视频通信的装置结构示意图一; 图 20为本发明实施例提供的发送端视频通信的装置结构示意图二; 图 21为本发明实施例提供的发送端视频通信的装置结构示意图三; 图 22为本发明实施例提供的发送端视频通信的装置结构示意图四; 图 23为本发明实施例提供的接收端视频通信的装置结构示意图一; 图 24为本发明实施例提供的接收端视频通信的装置结构示意图二; 图 25为本发明实施例提供的接收端视频通信的装置结构示意图三; 图 26为本发明实施例提供的接收端视频通信的装置结构示意图四。 具体实施方式
为使本发明实施例的目的、 技术方案和优点更加清楚, 下面将结合本发 明实施例中的附图, 对本发明实施例中的技术方案进行清楚、 完整地描述, 显然, 所描述的实施例是本发明一部分实施例, 而不是全部的实施例。 基于 本发明中的实施例, 本领域普通技术人员在没有付出创造性劳动前提下所获 得的所有其他实施例, 都属于本发明保护的范围。
下面的实施例均以本方法应用于视频会议场景为例进行说明。
一种视频通信站点, 包括:
至少两个本地摄像机, 用于指向至少两个本地用户部分, 拍摄至少两路 本地用户部分的本地视频图像。 本发明实施例中, 所述至少两个本地摄像机 为一个具有三个机芯的共光心摄像机, 由同一个同步时钟来实现每个机芯之 间拍摄时间的同步。
本地摄像拼接融合器, 用于根据第一视频处理参数中的融合参数, 对所 述拍摄得到的至少两路本地用户部分的本地视频图像进行融合, 生成全景视 频图像; 并将所述全景视频图像编码成视频码流, 将所述视频码流发送给远 端的视频通信站点。 在本发明实施例中, 所述第一视频处理参数由所述本地 摄像拼接融合器计算获得, 或者由 PC机计算后, 发送给所述本地摄像拼接融 合器获得。 所述 PC机与所述本地摄像拼接融合器相连。 所述第一视频处理参 数包括: 融合参数、 GAMMA校正参数、 坏点补偿参数、 变换参数、 剪裁区 域参数。
本地显示融合器, 用于从远端接收到的视频码流中分别解码得到至少两 路视频数据; 根据第二视频处理参数中的融合参数, 对所述解码得到的至少 两路视频数据进行融合; 将融合后的至少两路视频数据输出给本地显示设备。 在本发明实施例中, 所述第二视频处理参数由所述本地显示融合器计算获得, 或者由 PC机计算后, 发送给所述本地显示融合器获得。 所述 PC机与所述本 地显示融合器相连。 所述第二视频处理参数包括: 融合参数、 GAMMA校正 参数、 投影校正参数、 变换参数、 剪裁区域参数。
至少两个本地显示设备, 用于显示经过所述本地显示融合器融合后的至 少两路视频数据。 在本发明实施例中, 所述至少两个本地显示设备可以是投 影机和屏幕, 或者显示器; 其中, 所述屏幕可以为弧面幕, 也可以为椭圓幕, 或抛物幕, 或折面幕, 或直幕, 所述显示器一般为高清平板显示器, 以获得 高清的视频图像。
本发明实施例提供的视频通信站点, 通过至少两个本地摄像机获取至少 两路本地视频图像, 将所获取的至少两路本地视频图像融合为全景视频图像, 融合得到的全景视频图像能够更真实地表现相邻视频图像之间交接区域位置 关系, 使得最后显示出的图像给用户更真实的全景式体验, 解决了摄像机拍 摄的相邻视频图像在交接处存在重叠或缺失区域、 并且亮度和颜色不一致的 问题。 并且, 由本地显示融合器接收远端的视频码流, 对其进行进一步地融 合处理, 再将融合处理后的视频数据输出到显示设备进行显示, 所述融合处 理能够使得多个投影图像在屏幕上无缝呈现, 并且各个投影区域在颜色和亮 度方面差异较小, 提高了全景视频图像的视觉连续性, 能够给用户更好的沉 浸式全景体验。
图 2显示了本发明所述的视频会议系统整体布局的俯视图, 包括: 一个共 光心摄像机 9, 用于釆集会议场景的图像数据, 获得三个视频图像, 其中, 所 述共光心摄像机的每个机芯的拍摄时间是同步的, 由同一个同步时钟来实现 这种同步; 一个弧形会议桌 7和多个用户座椅 1 ~ 6 , 其中, 用户表示出席视频 会议的一个或多个个人或者个人的群组, 在视频会议期间, 用户作为讲话者 来参与会话或者作为非讲话者参与; 一个弧面幕 8 , 三个投影仪 10, 11 , 12, 用于显示由显示融合器处理过的三路视频图像和共享数据信息; 摄像机的拍 摄区域是所述的三个本地摄像机的拍摄区域的并集, 摄像机的视角范围与摄 像机的个数和每个摄像机的拍摄视角有关。 由于本发明实施例中有 3个摄像 机, 每个摄像机的视角在 30度〜 45度之间, 因此所述视角范围为 90度〜 135度之 间, 优选地, 摄像机的视角选择 35度, 那么摄像机的视角范围为一个 105度的 圓弧; 投影屏幕以会议桌边缘中点为圓心, 半径在 2500毫米到 7500毫米之间 取值, 优选地, 投影屏幕的半径取为 2700毫米; 投影屏幕的弧长根据所述摄 像机的视角范围和投影屏幕的半径确定, 投影屏幕的高度与投影屏幕的弧长 和视频图像的比例有关, 优选地, 投影屏幕的弧长的取值大约为 4950毫米, 高的取值大约为 900毫米。 上述参数的设计保证了真人大小的视觉效果, 图中 桌面边沿 101处在显示屏上获得约 1 : 1的图像; 因为距离摄像机较近, 100处可 获得约 1.1 :1的投影图像, 反之 102处获得约 0.9:1的投影图像。 摄像机 9的虚拟 光心和投影屏幕 8的上表面中心位于同一垂直线上, 并且所述垂直线的距离约 为 100毫米; 13为背投箱体, 容纳了 3个投影仪 10 , 11 , 12。 其中投影仪釆用 背投的方式将图像投射在弧面幕 8上, 该背投箱体可以设计为暗室, 使得弧面 幕 8上的图像尽可能的少受外界光线的影响, 以获得更好的投影效果; 当然, 除了背投方式外, 还可以釆用前投的方式来实现图像的显示。 标号为 14, 15 , 16为三个麦克风, 用于釆集本地音频信号; 标号为 20, 21 , 22为三个扬声器, 用于输出通过网络传输过来的远端会场音频信号。 图 3显示了本发明所述的视频会议系统整体布局的另一个俯视图。 与图 2 的不同之处在于, 该会议室布局方案釆用了多排用户座位设置(图 3中显示为 两排用户座位)。 在图 2原有的一排会议桌 7的前后, 还可以增加一排或多排会 议桌和相应的座位。 例如, 图 3中增加了一排会议桌 104 , 并增加了座椅 101 ~ 103。 由于后排会议桌 7的座椅距离显示屏较远, 而且有前面一排参会者的遮 挡, 因而体验会变差。 为了解决这个问题, 可以将后排的会议桌和座椅整体 提升一个高度, 形成阶梯状的会议室布局, 并且在设计座位时尽量使后排的 参会者位于前排两个参会者的中间。 这样后排的参会者就不会被前排遮挡, 可以改善用户体验。
图 4显示了本发明所述的视频会议系统整体布局的侧面图 (以一个用户的 侧面为例)。摄像机光心 0位于屏幕后方 100毫米处, 并距离有效屏幕上边缘的 下方 100毫米。 摄像机的拍摄垂直视角约为 20度。 由于摄像机无法放置在用户 的水平视线位置上, 因此摄像机光轴需要向下倾斜一个预定角度, 在本发明 实施例中此角度为 8.5度。 为了看到桌面, 在设计时使桌边 300毫米的弧带在弧 面幕上显示 100毫米的高度, 这样人像可呈现的高度范围约为 800毫米。 人眼 在中间位置, 可以计算出垂直眼对眼偏差角度约为 6.2度, 接近眼神接触角度 偏差可觉察门限 5度, 因此可以获得较好的眼对眼的效果。
可选的, 在本实施例中, 弧面幕不仅可以显示视频数据, 也可以显示共 享数据信息, 并可以根据用户的观看位置进行灵活的配置。 例如: 可以通过 在某一个弧面幕的显示区域划出一个空间, 用来显示该共享数据信息。 共享 数据信息可以包括: 分享的文字、 图片以及视频信息。 所述的共享数据信息 源可以预先存储在本端, 也可以通过网络传输的方式由远端进行分享。 当然, 该共享数据信息还可以通过在另设的至少一个显示装置显示, 该至少一个显 示装置可以布置在会议室的一端, 也可以布置在所述至少一个用来显示远端 会场的显示装置的延伸处。 如图 5 所示, 可以对弧面幕进行扩展, 增加另一 显示设备来显示共享数据信息。 例如可以增加 2个投影显示区域 4, 5。 原来 的投影区域 1、 2、 3 也可以配置成显示共享数据信息。 当用户坐在位置 102 看投影区域 2时, 投影区域 2可以显示远端会场图像, 也可以显示共享数据 信息。 当用户坐在位置 102看投影区域 2和 4时, 2可以配置成显示远端会场 图像, 4可以配置成显示共享数据信息。 当用户坐在位置 102看投影区域 2、 4和 5时, 4和 5可以配置成显示远端会场图像, 2可以配置成显示的共享数 据信息。 这样参会双方都能获得共同看共享数据信息的体验。
图 6显示了本发明实施例提供的两个视频通信站点设备连接图。 发送端的 共光心摄像机 1釆集会场视频图像, 输出三路视频图像(一般釆用 1920x 1080 的高清视频图像)给发送端的摄像拼接融合器。 由于共光心摄像机 1釆集的原 始的三个视频图像无法简单地组合成一个理想的会场全景图像, 因此摄像拼 接融合器需要对这三路视频图像进行处理, 根据第一视频处理参数中的融合 参数, 对所述三路视频图像进行融合, 生成大约为 48: 9的高分辨率的会场全 景视频图像。 全景视频图像可以通过三路方式输出到发送端的三台视频通信 终端, 视频通信终端分别对每路视频图像进行编码, 并将编码后得到的视频 码流封装成网络数据包, 通过网络发送给远端的视频通信站点。
其中, 在本发明的实施例中, 所釆用的网络具体体现于网络中的设备, 包括硬件和任何适当的控制逻辑, 用于互联与网络相耦合的各个元件并辅助 如本实施例所示的各个站点之间的通信。 网络可以包括局域网 (LAN )、 城域 网 (MAN )、 广域网 (WAN )、 任何其他公共或私有网络、 局部、 区域或全球 通信网络、 企业内部互联网络、 其他合适的有线或无线通信链路、 或者前面 各项的任意组合。 网络可以包括网关、 路由器、 集线器、 交换机、 接入点、 基站、 和任何其他硬件、 软件, 或者可以实现任何合适的协议或通信的前面 各项的组合。
接收端接收网络中数据包, 并使用三台视频通信终端中的视频解码单元 对三路视频码流进行解码, 获得三路视频数据, 然后输出给显示融合器。 显 示融合器根据第二视频处理参数中的融合参数, 对所述三路视频数据进行融 合, 最后输出给三台投影仪 2 , 投影到弧面幕上形成一个约为 48: 9的高分辨 率的全景无缝视频图像。 其中, 3和 4为两个 PC工作站, 作为调校工具, 分别 用于和摄像拼接融合器、 显示融合器一起完成图像显示前的融合处理。
需要说明的是, 所述视频编码器可以集成在所述摄像拼接融合器中; 或 者当所述系统包括发送端视频通信终端时, 所述视频编码器也可以集成在所 述发送端视频通信终端中。 所述视频解码器可以集成在所述显示融合器中; 或者当所述系统还包括接收端视频通信终端时, 所述视频解码器也可以集成 在所述接收端视频通信终端中。
本实施例中将摄像拼接融合器置于发送端, 将显示融合器置于接收端。 事实上, 所述显示融合器也可以置于发送端, 与所述摄像拼接融合器相连, 并置于摄像拼接融合器之后, 显示融合器处理视频图像时所需的参数可从接 收端获取; 所述摄像拼接融合器也可以置于接收端, 与所述显示融合器相连, 并置于显示融合器之前, 摄像拼接融合器处理视频图像时所需的参数可从发 送端获取。
本发明实施例提供的视频通信的系统, 在视频通信的发送端将所获取的 至少两路视频图像融合为全景视频图像, 融合得到的全景视频图像能够更真 实地表现相邻视频图像之间交接区域位置关系, 使得最后显示出的图像给用 户更真实的全景式体验, 解决了摄像机拍摄的相邻视频图像在交接处存在重 叠或缺失区域、 并且亮度和颜色不一致的问题。 本发明实施例中的视频通信 发送端将全景视频图像编码成视频码流发送给视频通信接收端后, 视频通信 接收端对其进行进一步地融合处理, 再将融合处理后的视频图像输出到显示 设备进行显示, 视频通信接收端进行的融合处理能够使得多个投影图像在屏 幕上无缝呈现, 并且各个投影区域在颜色和亮度方面差异较小, 提高了全景 视频图像的视觉连续性, 能够给用户更好的沉浸式全景体验。
由于不同设备之间的工作状态会存在差异, 即使在同一工作环境下, 也 会由于其器件的性质不同, 导致输出的结果不同。 为了解决现有技术中存在 的利用多个摄像机拍摄多幅图像, 造成了每幅图像之间存在亮度和颜色差异、 并且在图像交接区域处的图像显示效果不理想的问题, 本发明实施例提供一 种视频通信的方法和装置。
如图 7所示, 本发明实施例提供的视频通信的方法, 包括:
步骤 101 , 获取至少两路本地视频图像;
本实施例中, 所述获取到的至少两路本地视频图像由共光心摄像机拍摄 得到。 具体地, 用户首先通过一台 PC机登录到摄像拼接融合器上, 并通过 PC机向摄像拼接融合器发出图像釆集命令; 摄像拼接融合器接到釆集图像的 命令后, 从共光心摄像机中获取由所述摄像机拍摄到的至少两路视频图像, 并将其保存在摄像拼接融合器的緩存中。 当然, 拍摄得到至少两路视频图像 的设备不局限于共光心摄像机, 此处不再——列举。
步骤 102,根据第一视频处理参数中的融合参数,对所述至少两路本地视 频图像进行融合, 生成全景视频图像;
在本实施例中, 融合参数是第一视频处理参数中的一部分参数, 第一视 频处理参数是根据所获取到的视频图像。 具体地, 将在步骤 101 中获取的至 少两路视频图像传输给所述 PC机, 由 PC机根据所述视频图像来计算所需要 的融合参数。 PC机将计算出的融合参数传输给摄像拼接融合器, 由摄像拼接 融合器将接收到的融合参数配置为工作状态所要使用的参数, 并根据所配置 的融合参数将所述至少两路视频图像拼接为一路全景视频图像。 可以理解的 是, 除了上述实现方法外, 也可以在摄像拼接融合器中计算融合参数, 直接 由摄像拼接融合器完成计算、 配置、 融合等全部的过程, 而不需要与 PC机进 行交互, 这样就需要摄像拼接融合器能够获取到各个本地摄像机的相应参数, 进而确定融合参数以及进行 GAMMA校正、 传感器坏点补偿、 图像处理的相 关变换以及剪裁、 缩放、 分割的相关参数; 或者, 也可以单独由一台或多台 PC机完成计算、 配置、 融合的过程, 而不需要与摄像拼接融合器进行交互。 实际的产品实现方式可由用户的具体需求确定, 此处不再赘述。 步骤 103 , 将所述全景视频图像发送给视频编码器, 通过所述视频编码器 将所述全景视频图像编码成视频码流, 并将视频码流发送出去。
所述视频编码器可以集成在发送端的视频通信终端中, 也可以集成在所 述摄像拼接融合器中。 在本实施例中, 由发送端的视频通信终端对融合后的 全景视频图像进行编码, 并将编码后的视频码流发送到网络。 视频通信接收 端从网络中接收所述视频码流。
在点对点( Point To Point )的视频通信结构中, 编码后的视频码流通过网 络发送到接收端, 而对于点对多点 ( Point To Multi-Point ) 的视频通信的结构 中, 编码后的视频码流可能会发送给相应的多点通信的服务器, 由多点通信 服务器进行多点融合处理后, 在发给相应的接收端。
本发明实施例提供的视频通信的方法, 将所获取的至少两路视频图像融 合为一路全景视频图像, 融合得到的全景视频图像能够更真实地表现相邻视 频图像之间交接区域位置关系, 使得最后显示出的图像给用户更真实的全景 式体验, 解决了摄像机拍摄的相邻视频图像在交接处存在重叠和缺失区域、 并且亮度和颜色不一致的问题。
如图 8所示, 本发明另一个实施例还提供一种视频通信的方法, 包括: 步骤 201 , 获取视频解码器从视频码流中解码出的至少两路视频数据, 所 述视频码流由所述视频解码器从远端的视频通信站点接收得到;
在本实施例中, 发送端将视频码流分割为至少两路视频码流发送, 可以 提高处理速度, 减小出错率。 在接收端, 由视频解码器从网络中接收所述至 少两路视频码流, 并分别对所述视频码流进行解码, 获得至少两路视频数据。 其中, 所述视频解码器可以集成在接收端的视频通信终端中, 也可以集成在 显示融合器中。
步骤 202,根据第二视频处理参数中的融合参数,对所述至少两路视频数 据进行融合。
在本实施例中, 由于显示设备之间的差异, 使得所述至少两路视频数据 之间存在颜色和亮度的差异, 对其进行融合, 以消除此差异。 与发送端的图 像融合操作类似, 首先由 PC机计算出融合参数, 将其发送给显示融合器; 显 示融合器将接收到的融合参数配置为工作状态使用的融合参数, 并根据配置 的融合参数对所述至少两路视频数据进行融合。 可以理解的是, 本发明实施 例中的计算第二视频处理参数的过程也可以在显示融合器中完成, 即直接由 显示融合器完成计算、 配置、 融合的全部过程, 而不需要与 PC机进行交互, 这就需要摄像拼接融合器能够获取到各个本地摄像机的相应参数, 进而确定 融合参数以及进行 GAMMA校正、 传感器坏点补偿、 图像处理的相关变换以 及剪裁、 缩放、 分割的相关参数; 或者, 也可以单独由一台或多台 PC机完成 计算、 配置、 融合的过程, 而不需要与显示融合器进行交互。 实际实现方式 可视用户的具体需求确定, 此处不再赘述。
步骤 203 , 将所述融合后的至少两路视频数据输出给显示设备, 由所述显 示设备显示所述融合后的至少两路视频数据。
在本发明的实施例中, 所述显示设备包括投影仪和屏幕, 或者显示器; 其中, 所述屏幕不限于弧面幕, 还可以为椭圓幕, 或抛物幕, 或折面幕, 或 直幕, 所述显示器一般为高清平板显示器, 以获得高清的视频图像。 并且, 当显示设备为投影仪和弧面幕, 或者投影仪和椭圓幕, 或者投影仪和抛物幕 时, 在对所述至少两路视频数据进行融合之前, 还要根据投影校正参数, 对 所述至少两路视频数据进行投影校正, 以消除屏幕形状的变换对于图像显示 效果的影响。
本发明实施例提供的视频通信的方法, 将融合后的全景视频图像编码成 视频码流发送给视频通信接收端后, 视频通信接收端对其进行进一步地融合 处理, 再将融合处理后的视频图像输出到显示设备进行显示, 视频通信接收 端进行的融合处理能够使得多个投影图像在弧形幕上无缝呈现, 并且各个投 影区域在颜色和亮度方面差异较小, 提高了全景视频图像的视觉连续性, 能 够给用户更好的沉浸式全景体验。 为了使本领域技术人员能够更清楚地理解本发明实施例提供的技术方 说明。
如图 9和图 10所示, 本发明又一实施例提供的视频通信的方法。
下面具体描述视频通信发送端的方法流程, 如图 9所示, 包括以下步骤: 步骤 301 , 发送端获取至少两路本地视频图像;
在本实施例中, 所述获取到的至少两路本地视频图像由共光心摄像机拍 摄得到。 发送端的视频图像釆集是由 PC机登录到摄像拼接融合器后, 向摄像 拼接融合器发出图像釆集的命令, 摄像拼接融合器通过共光心摄像机拍摄至 少两路视频图像而获得的, 在本实施例中, 共光心摄像机机芯有三个, 釆集 三路视频会议场景, 每路视频图像的分辨率为 16: 9或者 4: 3。 如图 2的实 施例所示, 每路视频图像由共光心摄像机的一个拍摄得到, 在图 2 中, 共光 心摄像机包括三个摄像机, 分别为: 左摄像机、 中摄像机以及右摄像机, 其 中, 左摄像机拍摄到用户座椅编号为 1、 2的与会者, 中摄像机拍摄到用户座 椅编号为 3、 4的与会者, 右摄像机拍摄到用户座椅编号为 5、 6的与会者, 所述的共光心摄像机能够拍摄到所有参会人员, 并且, 以上三个摄像机的拍 摄时间是同步的。
步骤 302, 发送端根据第一视频处理参数中的 GAMMA校正参数, 对所 述至少两路本地视频图像进行 GAMMA校正;
在本实施例中, 摄像拼接融合器在对所获取的图像进行处理之前, 需要 检查处理图像所需要的视频处理参数是否已经配置。 所述的视频处理参数在 本实施例之中, 指代的是第一视频处理参数。 如果所述参数没有配置, 对所 接收到的视频图像进行透传, 即不对所述视频图像进行处理, 直接输出; 如 果所述参数已经配置, 进行图像处理。 在参数已经配置的情况下, 还需要具 体判断配置了哪些参数, 例如, 如果只配置了 GAMMA参数, 而没有配置传 感器坏点补偿参数, 则只进行 GAMMA校正操作。 由于摄像拼接融合器可以 接收共光心摄像机机芯处理后的视频图像, 也可以接收共光心摄像机传感器, 如电荷耦合器件 (Charge Coupled Device, CCD )或互补金属氧化物半导体 ( Complementary Metal Oxide Semiconductor, CMOS )传感器送出的没有经过 处理的视频图像, 当摄像拼接融合器接收的是没有经过处理的视频图像时, 需要对所述视频图像进行 GAMMA校正和传感器坏点补偿处理。
步骤 303 ,发送端根据第一视频处理参数中的坏点补偿参数, 对所述至少 两路本地视频图像进行传感器坏点补偿;
在本实施例中, 如果输出视频图像的传感器存在坏点, 导致所述视频图 像上也存在坏点, 坏点补偿处理可以根据视频图像上坏点的邻近像素值差值 得到坏点处的像素值, 由于具体的像素插值属于图像处理中的现有技术, 可 以包括许多方式, 在此不再赘述。
步骤 304,发送端根据第一视频处理参数中的变换参数,对所述至少两路 本地视频图像进行变换;
在本实施例中, 所述变换包括: 视频图像的平移、 视频图像的旋转、 视 频图像的单应性变换和视频图像的柱面变换中的任意一种及其组合。
在本实施例中, 首先根据射影几何原理, 将空间中的三维点投影到共光 心摄像机的成像平面上, 所述三维点与平面点之间的坐标变换关系为:
= K[R I t]X 式子 ( 1 )
K 式子 ( 2 )
0 0 1
其中 为平面坐标的齐次表示, X 为世界坐标系的齐次表示, Λ和/ ^为 水平和垂直方向上的等效焦距, s为图像的畸变系数, MQ , I¾为图像主点坐标; R 为摄像机的旋转矩阵, t为摄像机平移向量。 其中 ,Κ称为摄像机的内参, 包括 水平和垂直方向上的等效焦距、 图像的畸变系数、 图像主点坐标; R和 t称为 摄像机的外参。 将三维点转换为平面点之后, 可以有下述三种方法进行视频图像的变换: 方法一, 对于共光心摄像机中的三个摄像机机芯拍摄的具有重叠区域的 三个视频图像, 空间中某个平面上的点在其中两个视频图像上的成像关系为: κ
χ' = Hx = κ 〃 h22 式子 (3)
h 其中 Η为一个 3χ3的矩阵, 自由度为 8, 其代表了两个成像平面之间的变 换关系, 称之为单应性矩阵。 X为变换前图像坐标的齐次表示, χ'为变换后图 像坐标的齐次表示。
对于共光心摄像机, 不考虑参数 t,因此 Η可以表示为:
H-K^ K 1 式子 (4)
假设已知变换前和变换后图像上的一个点对坐标(X, y)和( x、 y' ), 可以得到两个方程: χ = hux + hn + h = x + h22 + h23 式子 (5)
h3lx + h32y + h33 h3lx + h32y + h33 由于 H的自由度为 8, 因此最少只要通过 4对点对建立 8个方程就可以求 出单应性矩阵 H。求出单应性矩阵 H后,可以通过一个坐标变换将两个图像拼 接到一起, 将重叠区域的像素对齐。 H的计算方法有多种, 一种是手动的方 法, 由用户至少选择变换前图像上的 4个点的坐标, 以及该 4个点在变换后 图像上的坐标。 根据这 4个点对的坐标我们可以利用式子(5)建立包括至少 8个方程的方程组, 求解出单应性矩阵 H。 另一种方法是自动的方法, 该方法 要求两个图像之间具有较大的重叠区域。 可以通过特征点提取算法, 例如尺 度不变性变换算法( Scale-invariant feature transform, SIFT ), 在重叠区域进行 特征点提取, 找到多个特征点, 建立特征点之间的匹配关系, 再利用式子(5) 建立包括至少 8个方程的方程组, 通过迭代优化算法求出两个图像之间的单 应性矩阵11。
方法二, 由于方法一中求单应性矩阵 H比较复杂, 对于图像变化较小的 情况, 也可以利用仿射变换来模拟单应性变换。 可以釆用下面的变换公式:
χ' = S[R I T]x 式子 (6 ) s =
Figure imgf000019_0001
其中 S为一个图像缩放矩阵, R为二维旋转矩阵, T为平移向量。 X为变 换前图像坐标的齐次表示, x'为变换后图像坐标的齐次表示。
方法三, 利用柱面坐标变换将平面坐标转换为柱面坐标, 在柱面坐标下 通过对图像的平移来进行图像拼接。 柱面坐标的变换和反变换为:
X = 5 tan"1― γ' = s . ^ = 式子 ( 7 ) x = f tan— y = /^-sec— 式子 ( 8 )
s s s
需要说明的是, 步骤 302、 303以及 304是现有的实现步骤, 更换上述三个 步骤的顺序也不影响本发明的达到的效果。
步骤 305 ,发送端根据第一视频处理参数中的融合参数,对所述至少两路 本地视频图像进行融合, 生成全景视频图像;
在本实施例中, 经过步骤 304 图像变换后, 一般还是无法得到比较理想 的无缝图像, 还必须考虑共光心摄像机机芯拍摄的视频图像由于曝光或者视 频图像之间的颜色差异导致的视频图像在亮度或色度上的差异, 所述差异在 两个视频图像的接缝处尤其明显, 因此需要对多个图像进行融合, 以消除不 同图像间亮度或色度等的差异。 在本实施例中, 可以在所述视频图像接缝处 的重叠区域进行 Alpha融合, 该 Alpha融合的公式为:
/(X, y) = , (x, y)I ( , y) + 2 ( , y)I2 (x, y) 式子 ( 9 ) 其中 c¾ (x, y)为视频图像 1像素(X, y)的 Alpha值, (x, y)为视频图像 1像 素(x, _y)的颜色值; 和 /2 (x, _y)为视频图像 2像素(x, _y)的 Alpha值和颜 色值。 对于简单的线性 Alpha融合, ^, + ^^, 二:!。 所述 Alpha融合一 般只能对视频图像接缝处的亮度或色度差异进行融合, 如果视频图像之间本 身的亮度或色度差异较大, 使用所述 Alpha 融合不能获得良好的效果, 这时 可以在整个视频图像上进行拉普拉斯金字塔融合, 或者梯度阔值融合, 或者 泊松融合, 此处不再对其具体的融合原理进行赘述。
步骤 306,发送端根据第一视频处理参数中的剪裁区域参数, 将所述全景 视频图像的比例剪裁为第一目标比例;
本实施例中, 对于图像进行剪裁的目的是消除图像中不需要显示的部分。 所述第一目标比例根据实际情况人为确定。
在步骤 302〜步骤 306中需要用到第一视频处理参数, 在对视频图像进行 处理之前, 需要先配置第一视频处理参数, 根据所配置的参数对视频图像进 行处理。 下面对于所述第一视频处理参数的配置方法进行详细描述:
本实施例中可以将摄像拼接融合器与 PC机进行结合, 所述 PC机也称为 第一处理机, 由所述 PC机计算第一视频处理参数, 由摄像拼接融合器配置第 一视频处理参数, 并根据所述参数处理图像; 除了上述实现方法外, 也可以 由摄像拼接融合器釆集至少两幅图像, 由摄像拼接融合器根据所釆集到的图 像计算视频处理参数, 即直接由摄像拼接融合器完成计算、 配置、 处理图像 的过程, 而不需要与 PC机进行交互, 例如, 用户可以通过遥控器、 鼠标等方 法手工控制摄像拼接融合器生成所需要的视频处理参数, 也可以釆用自动算 法由摄像拼接融合器自动生成所需要的视频处理参数; 或者, 也可以单独由 一台或多台 PC机完成计算、 配置、 处理图像的过程, 而不需要与摄像拼接融 合器进行交互, 例如, 用户通过所述 PC机上的调校软件自动生成所需要的视 频处理参数, 再根据所述参数直接在 PC机上进行图像处理, 其中, 计算参数 部分由中央处理器( Central Processing Unit, CPU ) 完成, 处理图像部分可由 CPU或图形处理器 (Graphic Processing Unit, GPU)完成。 如果单台 PC机无法 完成所述图像处理, 可以釆用多台 PC机进行联网分布式计算, 所述多台 PC 机之间通过高速以太网进行互联。 上述多种方法的实际实现方式可视用户的 具体需求确定, 此处不再赘述。 需要说明的是, 上述实施例所说明的 PC机只是具体的一种实现方式, 事 实上, 釆用具有音视频输入输出设备的具有处理器的装置就能够完成上述的 图像处理。 随着云计算技术的发展, 针对本实施例还可以通过在远程呈现管 理服务器设置处理器阵列, 统一由服务器侧完成对釆集的图像进行相应的图 像处理。
当釆用摄像拼接融合器和 PC机(也称第一处理机 )共同完成视频处理参 数的计算和配置时, 具体的配置方法可以通过如图 15中所示的步骤来实现: 步骤 401 , 启动 PC机上的调校软件;
本实施例中, 所述调校软件具有 GUI界面。 如图 16所示, 所述 GUI界 面包括菜单栏, 工具栏, Tab栏, 显示区, 状态栏和对话框。 其中, 菜单栏用 于用户选择相关的命令, 支持鼠标和键盘快捷键操作; 工具栏用于用户快速 选择常用的命令; Tab栏用于列出打开的图像, 用户可以在打开的图像之间进 行切换和关闭打开的图像; 显示区用于显示用户操作的当前图像, 支持滚动 条, 无法在当前窗口显示的图像内容用户可以拖动滚动条进行查看。 用户可 以利用鼠标和键盘在显示区域对需要拼接的图像进行交互式操作, 如对图像 调整变换和融合参数, 实时查看效果等; 状态栏用于显示一些当前重要的信 息, 如图像大小, 当前鼠标坐标等; 对话框由菜单栏或工具栏激发, 用于完 成需要用户键盘输入等的复杂的工作任务等。 用户得到了满意的图像拼接融 合效果后, 可以通过一个简单的命令生成摄像拼接融合器和显示融合器所需 的图像变换参数、 Alpha融合和 GAMMA校正等参数, 并传输给摄像拼接融 合器和显示融合器。
步骤 402, PC机通过所述调校软件登录到摄像拼接融合器上;
所述调校软件可以作为第三方软件本地化安装在所述的 PC机上,也可以 通过内置于所述的 PC机中的 WEB页面来访问服务器运行。
步骤 403 , PC机向摄像拼接融合器发送图像釆集命令;
步骤 404, 摄像拼接融合器从共光心摄像机中获取釆集到的至少两幅图 像;
在本实施例中, 摄像拼接融合器接到视频图像釆集命令后, 从共光心摄 像机中获取釆集到的 3个视频图像, 并保存在摄像拼接融合器的緩存中。
步骤 405,摄像拼接融合器将所釆集到的至少两幅图像发送给所述 PC机; 在本实施例中, 摄像拼接融合器通过数据传输协议将緩存中的 3 个视频 图像发送给 PC机。
步骤 406, PC机根据所釆集到的图像计算第一视频处理参数;
在本实施例中, PC机计算出的参数包括摄像机 GAMMA校正参数、 摄 像机传感器坏点的补偿参数、 图像变换参数、 图像 Alpha融合参数表和图像 剪裁区域参数的一种或多种。
步骤 407, PC机将计算出的第一视频处理参数发送给摄像拼接融合器; 在本实施例中, PC机和摄像拼接融合器之间的数据传输接口可以釆用以 太网、 USB 等接口方式, 传输协议可以釆用文件传输协议 (File Transfer Protocol, FTP ), 超文本传输协议( Hypertext Transfer Protocol, HTTP )、 或使 用传输控制协议 (Transmission Control Protocol, TCP), 用户数据 4艮协议 ( User Datagram Protocol , UDP ) 自定义高层传输协议进行传输。 当计算得到的参数 数据量较大时, 通过数据传输协议传输给摄像拼接融合器。
在本实施例中, PC机的功能还包括向摄像拼接融合器发送配置命令。 配 置命令可以通过多种方式进行传输, 例如通过串行端口、 并行端口或网络接 口等进行传输。 如果通过网络接口传输, 可以釆用远程登录协议 (Telnet, Teletype network ), 或者是使用 TCP协议、 UDP协议自定义高层传输协议进 行传输。
步骤 408,摄像拼接融合器将接收到的第一视频处理参数配置为工作状态 使用的第一视频处理参数。
步骤 307 , 发送端将所述全景视频图像的大小缩放至第一目标大小; 本实施例中, 步骤 306对视频图像进行剪裁后, 视频图像的大小可能会 变小, 对视频图像进行缩放, 使其尺寸达到用户所需要的大小。 所述第一目 标大小根据实际情况人为确定。
步骤 308, 发送端将所述全景视频图像分割为至少两路视频数据; 本实施例中, 将全景视频图像分割为三路视频数据输出给三个视频通信 终端, 能够提高数据处理速度, 降低出错率。 当然, 也可以不分割全景视频 图像, 直接对其编码发送, 但此种方法的视频图像显示效果较差。
步骤 309,发送端通过至少两个视频编码器将所述至少两路视频数据分别 编码成对应的视频码流, 并将所述至少两路视频数据对应的视频码流分别发 送出去;
本实施例中, 由三个视频通信终端中的视频编码器对所述三路视频数据 分别编码, 获得三路视频码流, 并将这三路视频码流发送到网络中, 由接收 端从网络中接收。 其中, 所述视频编码器可以集成在视频通信终端中, 也可 以集成在摄像拼接融合器中。 为了保证端到端的同步, 发送端和接收端需要 进行同步的编码和解码。 为了防止网络抖动等因素带来的编解码不同步, 还 需要在视频码流中进行标记, 例如, 在视频码流的数据包上打上时间戳, 以 保证接收端的视频解码器能够按照正确的顺序解码。
本实施例中, 也可以釆用一台视频通信终端对三路视频数据进行编码和 发送, 这种方法的优点是三路视频数据的同步比较容易实现, 整个视频通信 系统的结构也可以得到优化, 但是, 这种方法要求视频通信终端具有更高的 编码处理能力。
在步骤 301 中用到共光心摄像机拍摄视频图像, 下面对于共光心摄像机 的原理及结构进行详细描述:
如图 11所示, 1001为一个棱台结构, 具有 3个表面 1002 , 1003 , 1004 , 这 些表面为平面镜面, 镜面的下方放置 3个摄像机 C01 , C02和 C03。 以其中的一 个摄像机 C02为例说明虚拟共光心原理。 如图 12所示, L02为入射光线, R02 为反射光线, 垂直于反射面 1003的法线为 1006 , 法线 1006和水平线 1010的夹 角为 (9 = 45° , 反射点到摄像机 C02的实际光心 02的垂直距离为 d。 根据光线反 射原理, 摄像机会拍摄到一个虚像, 该虚像有一个虚拟光心 V02。 通过设计 镜面的角度和摄像机的摆放位置, 可以使摄像机 C01 , C02和 C03的虚拟光心 位于同一点, 从而得到虚拟共光心摄像机拍摄得到的 3个图像, 对这 3个图像 进行拼接融合处理, 可以得到在任意深度上都是无缝拼接的图像。
图 13显示了本发明实施例所用共光心摄像机的结构图。 C01 , C02和 C03 为 3台高清摄像机机芯, 支持 1920x 1080的高清视频输出。 为了获得更好的垂 直眼对眼效果, 把反射镜置于下方, 把摄像机机芯置于上方进行拍摄。 表面 1002, 1003和 1004为反射镜面, 3个摄像机机芯可以独立进行调解, 用于补偿 结构加工误差和摄像机机芯本身的误差, 机芯的调节自由度包括以摄像机机 芯为坐标原点的 XYZ轴 3个方向上的平移和旋转。 在拍摄时, 需要将摄像机的 焦距调成相同的值, 以保证每个摄像机拍摄的视角范围一致。
图 14显示了本发明实施例所述共光心摄像机 9安装在弧面幕 8支架 81上的 效果图。 为了拍摄到一定范围的桌面, 摄像机光轴必须有一个拍摄的下倾角 度, 该角度通过摄像机安装在投影幕支架上的装置 91可以进行调整, 在本实 施例中取 8.5度。
下面具体描述视频通信接收端的方法流程,如图 10所示, 包括以下步骤: 步骤 310,接收端从网络获取视频通信发送端发送的所述至少两路视频码 流, 并通过所述接收端的视频解码器从至少两路视频码流中分别解码出至少 两路视频数据;
本实施例中, 接收端的三个视频通信终端从网络中获取三路编码后的视 频码流, 由视频通信终端中的视频解码器分别对其进行解码, 获得三路已进 行处理的视频数据。 其中, 所述视频解码器可以集成在所述接收端的视频通 信终端中, 也可以集成在显示融合器中。
可以理解, 也可以通过接收端的单一的视频通信终端完成从网络侧接收 的三路视频码流进行解码, 可以通过在该单一视频通信终端设置多个解码器 完成对该三路视频码流的解码处理。
本实施例中, 也可以釆用一台视频通信终端对三路视频码流进行接收和 解码, 这种方法的优点是多路视频数据的同步比较容易实现, 整个视频通信 系统的结构也可以得到优化, 但是, 这种方法要求视频通信终端具有更高的 解码处理能力。
步骤 311 , 接收端根据第二视频处理参数中的 GAMMA校正参数, 对所 述至少两路视频数据进行 GAMMA校正;
与发送端类似, 显示融合器在对所获得的视频数据进行处理之前, 还要 检查处理视频数据所需要的视频显示参数是否已经配置。 如果所述参数没有 配置, 对所获得的三路视频数据进行透传, 即不对所述视频数据进行处理, 直接输出到显示设备显示; 如果所述参数已经配置, 进行视频数据处理。
在本实施例中, 显示融合器将输出的三路视频数据送入三个投影仪中显 示, 由于投影仪的内部差异以及投影仪之间的差异会导致所述三路视频数据 之间存在亮度和颜色差异, 因此, 在显示所述视频数据之前, 需要在显示融 合器中对其进行投影仪的 GAMMA校正。
对于投影仪之间的亮度和颜色差异,可以通过拍摄反馈的方法进行校正。 投影 RGB三个颜色分量的 0— 255级的模板图像, 与所述三路全景图像的 RGB 颜色分量相比较, 可以建立三个投影仪之间的亮度和颜色差异曲线。 下面详 细地描述如何进行投影仪之间的 GAMMA校正: 假设 P1和 P2为两台不同的投 影仪, 如图 18所示, 横坐标为模板图像 R分量的颜色级别, 范围是 0— 255; 纵 坐标为所述视频数据的其中两路数据的颜色 R分量,可以认为是所述模板图像 R分量的函数 f(R)。 这样每台投影仪都可以建立一条颜色 R分量的曲线, 对于 0-255的每个级别, 都可以计算得到两台投影仪的 R分量的差值 Δ/,该变量也 可以看作是所述模板图像 R分量的函数。 这样就可以以一个投影仪颜色 R分量 的色度曲线为基准, 通过调整另一个投影仪颜色 R分量的色度曲线, 使两台投 影仪的所要显示的所述两路全景图像的颜色 R分量一致。 另外两个颜色分量 G 和 B的处理方法相同, 在此不再赘述。
对于投影仪内部的亮度和颜色差异, 也可以通过与上述方法类似的方法 校正。 以一台投影仪投影所述三路视频数据中的一路视频数据为例, 首先将 所述一路视频数据进行分块, 然后对每个分块数据建立亮度和颜色差异曲线, 具体的实现方法参见投影仪之间的亮度和颜色差异曲线建立方法, 此处不再 赘述。
除了校正投影仪之间和投影仪内部的亮度和颜色差异外, 要获得更好的 投影效果, 还必须进行投影仪的漏光补偿。 由于投影仪在投影纯黑图像时会 有光线泄露, 导致投影得到的图像并不是纯黑的, 而是有一定的亮度, 所以 所述三路视频数据在重叠区域的亮度和非重叠区域的亮度会不一致。 通过计 算得到所述重叠区域和所述非重叠区域之间的亮度差异, 给所述非重叠区域 加上计算出来的亮度值, 使所述重叠区域和所述非重叠区域的亮度一致。
步骤 312,接收端根据第二视频处理参数中的投影校正参数, 对所述至少 两路视频数据进行投影校正;
在本实施例中, 由于显示设备可以为投影仪和屏幕, 当屏幕是弧面幕, 或者是椭圓幕, 或者是抛物幕时, 要考虑屏幕的形状对于显示图像的影响。 即所述视频数据投影到上述形状的屏幕上时会产生变形, 因此要进行投影校 正。
步骤 313 ,接收端根据第二视频处理参数中的变换参数,对所述至少两路 视频数据进行变换;
本实施例中, 所述变换包括: 视频数据的平移、 视频数据的旋转和视频 数据的单应性变换中的至少一种变换。 此步骤中的视频数据变换可以用于补 偿由于投影仪安放位置不准确造成的图像变形和不对齐。 具体的变换方法可 以参见发送端对于图像进行变换的方法, 具体的原理此处不再赘述。
步骤 314,接收端根据第二视频处理参数中的融合参数,对所述至少两路 视频数据进行融合; 在本实施例中,图像融合的目的是使两个视频数据在接缝处不会有明显的 亮度差异。 具体方法是, 首先在两个视频数据的接缝处制造重叠区域, 然后 在所述重叠区域进行 Alpha融合,优选地,釆用非线性 Alpha融合的方法,例如, 非线性 Alpha融合所釆用的公式为:
1 1 丄 1 1 丄
Alpha(x) = (- + - cos θχ)γ Alpha(x) = (- - - cos θχ)γ 式子 ( 10 ) 其中 (9为角度值, y为 GAMMA值, 通过调整 >和 y可以获得最佳的融合 效果。
步骤 315,接收端根据第二视频处理参数中的剪裁区域参数, 将所述融合 后的至少两路视频数据的比例剪裁为第二目标比例;
在本实施例中, 为了兼容目前视频会议系统的显示方式,显示融合器除了 接投影仪外, 还可以接三台高清平板显示器, 而所述平板显示器具有边框厚 度, 无法做到无缝显示, 因此需要将图像位于边框厚度的部分剪裁掉。 所述 第二目标比例根据所述显示器的边框厚度来确定。 由用户定义显示器边框宽 度, PC机上的调校软件根据所述平板显示器的大小和分辨率, 将以毫米为单 位的平板显示器边框宽度换算为以像素单位的宽度。 显示融合器根据计算得 到的显示器边框宽度对每路视频数据进行剪裁, 生成剪裁后的图像。
步骤 316,接收端将所述融合后的至少两路视频数据的大小缩放至第二目 标大小;
本实施例中, 对视频数据进行剪裁后, 视频数据的大小可能会变小, 对 视频数据进行缩放, 使其尺寸达到显示时所需要的大小。
步骤 317,接收端将所述融合后的至少两路视频数据输出给显示设备, 由 显示设备显示所述融合后的至少两路视频数据。
在本实施例中, 所述显示设备包括投影仪和屏幕, 所述屏幕可以釆用弧 面幕, 或者椭圓幕, 或者抛物幕, 或者折面幕, 或者直幕。 所述显示设备至 少为两个, 在本实施例中, 设置三个投影仪来分别投影显示三路视频数据。 当然, 视频数据也可以为一路, 即在发送端, 不分割全景视频图像, 此时, 可以只釆用一个投影仪或者一个显示器显示所得到的一路视频数据, 但显示 效果较差。
优选地, 在本实施例中釆用弧面幕作为投影幕。 其中, 所述椭圓幕和抛 物幕与所述弧面幕类似, 不同之处在于, 由于投影幕的几何形状发生了变化, 需要对步骤 304 中所述的算法进行相应的修改; 当使用折面幕或者直幕作为 屏幕时, 由于视频数据在平面投影上没有畸变, 不用进行视频数据的几何校 正处理, 进一步地, 所述折面幕之间可以釆用钝角进行过渡, 也可以釆用圓 角进行过渡, 釆用圓角过渡比釆用钝角过渡更自然。 圓角半径越大, 过渡的 效果越好, 但过渡的圓角部分需要对视频数据进行几何校正处理; 当使用直 幕作为屏幕时, 对图像的变换处理最简单。 进一步地, 根据屏幕的几何形状, 可以对会议桌的形状进行相应的修改以获得更好的呈现效果, 例如在釆用折 面幕或者直幕时, 可以将会议桌改为折面桌的形式。
在步骤 309和步骤 310中用到了视频通信终端, 下面对视频通信终端的 结构进行详细描述:
当所述视频编、 解码器集成在视频通信终端中时, 所述视频通信终端的 各组成模块包括: 音频编解码器, 用于对所接收到的音频信号进行编码或者 解码, 编解码标准可以釆用 G.711,或 G.722, 或 G.723,或 G.728,或 G.729; 视频 编解码器, 用于对所接收到的视频信号进行编码或解码, 编码标准可以釆用 H.261,或 H.263; 系统控制单元, 用于对视频通信终端的正确操作提供信令, 所述信令包括呼叫控制, 能力交换, 命令和指示的信令以及消息; 格式化单 元, 用于对待发送的音频、 视频、 数据和控制流进行格式化, 形成消息输出 到网络接口, 或者从网络接口接收到的消息中提取音频、 视频、 数据和控制 流。 另外, 该单元还对每一种媒体类型, 完成逻辑成帧、 顺序编号、 差错检 测和差错纠正。
在步骤 311〜步骤 315中需要用到第二视频处理参数, 在对视频数据进行 处理之前, 需要先配置第二视频处理参数, 根据所配置的参数对视频数据进 行处理。 下面对于所述第二视频处理参数的配置方法进行详细描述:
本实施例中可以将显示融合器与 PC机进行结合, 所述 PC机也称为第二 处理机, 由所述 PC机计算第二视频处理参数, 由显示融合器配置第二视频处 理参数, 并根据所述参数处理视频数据; 也可以由显示融合器直接计算第二 视频处理参数, 即直接由显示融合器完成计算、 配置、 处理视频数据的过程, 而不需要与 PC机进行交互;或者,也可以单独由一台或多台 PC机完成计算、 配置、 处理视频数据的过程, 而不需要与显示融合器进行交互。
当釆用显示融合器和 PC机(也称第二处理机 )共同完成视频显示参数的 计算和配置时, 具体的配置方法可以通过如图 17中所示的步骤来实现:
步骤 501 , 启动 PC机上的调校软件;
所述调校软件与发送端调校软件相同, 此处不再进行赘述。
步骤 502, PC机通过所述调校软件登录到显示融合器上;
步骤 503 , PC机计算出第二视频处理参数;
在本实施例中, PC机计算出的参数包括投影仪 GAMMA校正参数、 视频 图像投影校正参数、 视频图像变换参数表、 视频图像 Alpha融合参数表和图像 剪裁区域参数。
步骤 504, PC机将计算出的第二视频处理参数发送给显示融合器; 在本实施例中, PC机和显示融合器之间的数据传输接口可以釆用以太网、 USB等接口方式, 传输协议可以釆用 FTP协议、 HTTP协议、 自定义的 TCP协 议或 UDP协议进行传输。 当计算得到的参数数据量较大时, 通过数据传输协 议传输给显示拼接融合器。
在本实施例中, PC机的功能还包括向显示融合器发送配置命令。 与 PC 机向摄像拼接融合器发送配置命令类似, 所述配置命令可以通过多种方式进 行传输, 例如通过串行端口、 并行端口或网络接口等进行传输。 如果通过网 络接口传输, 可以釆用 Telnet协议, 或者是 TCP协议、 UDP协议进行传输。 步骤 505 ,显示融合器将接收到的第二视频处理参数配置为工作状态使用 的第二视频处理参数。
本实施例中, 对第二视频处理参数进行配置后, 就可以对所述三路视频 数据进行步骤 311 315 的处理了, 并显示处理后的视频数据, 即如步骤 317 所述。 到此, 视频通信接收端的方法步骤完成。
本发明实施例提供的视频通信的方法, 在视频通信的发送端将所获取的 至少两路视频图像融合为全景视频图像, 融合得到的全景视频图像能够更真 实地表现相邻视频图像之间交接区域位置关系, 使得最后显示出的图像给用 户更真实的全景式体验, 解决了摄像机拍摄的相邻视频图像在交接处存在重 叠或缺失区域、 并且亮度和颜色不一致的问题。 本发明实施例中的视频通信 发送端将全景视频图像编码成视频码流发送给视频通信接收端后, 视频通信 接收端对其进行进一步地融合处理, 再将融合处理后的视频图像输出到显示 设备进行显示, 视频通信接收端进行的融合处理能够使得多个投影图像在弧 形幕上无缝呈现, 并且各个投影区域在颜色和亮度方面差异较小, 提高了全 景视频图像的视觉连续性, 能够给用户更好的沉浸式全景体验。
如图 19所示, 本发明实施例还提供一种视频通信的装置, 所述视频通信 装置应用于视频通信系统的发送端, 包括:
第一获取单元 601 , 用于获取至少两路本地视频图像;
第一融合单元 602, 用于根据第一视频处理参数中的融合参数,对由所述 第一获取单元 601 获取的至少两路本地视频图像进行融合, 生成全景视频图 像;
第一发送单元 603 ,用于将由所述第一融合单元 602获得的全景视频图像 发送给视频编码器, 通过所述视频编码器将所述全景视频图像编码成视频码 流, 并将所述视频码流发送给远端的视频通信站点。
本发明实施例提供的视频通信的装置, 由第一融合单元将由第一获取单 元所获取的至少两路视频图像融合为全景视频图像, 融合得到的全景视频图 像能够更真实地表现相邻视频图像之间交接区域位置关系, 使得最后显示出 的视频图像给用户更真实的全景式体验, 解决了摄像机拍摄的相邻视频图像 在交接处存在重叠或缺失区域、 并且亮度和颜色不一致的问题。
进一步地, 如图 20所示, 所述视频通信的装置还包括:
同步单元 604, 用于提供同步时钟,使得第一获取单元 601在同步时钟的 校准下, 进行至少两路本地视频图像的获取。
第一 GAMMA校正单元 605 , 用于在对由所述第一获取单元 601获取的 至少两路本地视频图像进行融合之前, 根据所述第一视频处理参数中的 GAMMA校正参数, 对由所述第一获取单元 601获取的至少两路本地视频图 像进行 GAMMA校正;
本实施例中, 由于摄像拼接融合器可以接收共光心摄像机机芯处理后的 视频图像,也可以接收共光心摄像机传感器,如 CCD或 CMOS传感器送出的 没有经过处理的视频图像, 当摄像拼接融合器接收的是没有经过处理的视频 图像时, 需要对所述视频图像进行 GAMMA校正和传感器坏点补偿处理, 以 提高视频图像的显示质量。
坏点补偿单元 606,用于在对由所述第一获取单元 601获取的至少两路本 地视频图像进行融合之前, 根据所述第一视频处理参数中的坏点补偿参数, 对由所述第一获取单元 601 获取的至少两路本地视频图像进行传感器坏点补 偿;
在本实施例中, 如果输出视频图像的传感器存在坏点, 导致所述视频图 像上也存在坏点, 坏点补偿处理可以根据视频图像上坏点的邻近像素值差值 得到坏点处的像素值, 消除视频图像上的坏点, 提高视频图像显示质量。
第一变换单元 607 ,用于在对由所述第一获取单元 601获取的至少两路本 地视频图像进行融合之前, 根据所述第一视频处理参数中的变换参数, 对所 述至少两路本地视频图像进行变换; 所述变换包括: 视频图像的平移、 视频 图像的旋转、 视频图像的单应性变换和视频图像的柱面变换中的至少一种变 换。
第一剪裁单元 608 ,用于在对由所述第一获取单元 601获取的至少两路本 地视频图像进行融合之后, 根据所述第一视频处理参数中的剪裁区域参数, 将由所述第一融合单元 602获得的全景视频图像的比例剪裁为第一目标比例; 本实施例中, 对于图像进行剪裁的目的是消除图像中不需要显示的部分。 所述第一目标比例根据实际情况人为确定。
第一缩放单元 609 ,用于在对由所述第一获取单元 601获取的至少两路本 地视频图像进行融合之后, 将由所述第一融合单元 602获得的全景视频图像 的大小缩放至第一目标大小;
本实施例中, 对视频图像进行剪裁后, 视频图像的大小可能会变小, 对 视频图像进行缩放, 使其尺寸达到用户所需要的大小。 所述第一目标大小根 据实际情况人为确定。
分割单元 610 ,用于在对由所述第一获取单元 601获取的至少两路本地视 频图像进行融合之后, 将由所述第一融合单元 602获得的全景视频图像分割 为至少两路视频数据。
本实施例中, 将融合后的全景视频图像分割为三路视频数据输出给三个 视频通信终端, 能够提高数据处理速度, 降低出错率。
进一步地, 如图 21所示, 当所述通信装置不需要与 PC机交互时, 所述 通信装置还包括:
第一釆集单元 611 , 用于釆集至少两幅图像;
第一计算单元 612 ,用于根据由所述第一釆集单元 611釆集的至少两幅图 像计算出第一视频处理参数;
第一配置单元 613 ,用于将由所述第一计算单元 612计算出的第一视频处 理参数配置为工作状态使用的第一视频处理参数。
进一步地, 如图 22所示, 当所述通信装置需要与 PC机交互时, 所述通 信装置还包括: 接收命令单元 614, 用于接收第一处理机发送的图像釆集命令;
第二釆集单元 615 , 用于釆集至少两幅图像;
第二发送单元 616 ,用于将由所述第二釆集单元 615釆集到的至少两幅图 像发送给所述第一处理机;
第一接收参数单元 617 ,用于接收由所述第一处理机根据由所述第二釆集 单元 615釆集的至少两幅图像计算出的第一视频处理参数;
第二配置单元 618 ,用于将由所述第一接收参数单元 617接收到的第一视 频处理参数配置为工作状态使用的第一视频处理参数。
本发明实施例提供的视频通信的装置具体实现方法可以参见本发明实施 例提供的视频通信的方法所述, 此处不再赘述。
如图 23所示, 本发明实施例还提供一种视频通信的装置, 所述视频通信 装置应用于视频通信系统的接收端, 包括:
第二获取单元 701 ,用于获取视频解码器从视频码流中解码出的至少两路 视频数据, 所述视频码流由所述视频解码器从远端的视频通信站点接收得到; 第二融合单元 702, 用于根据第二视频处理参数中的融合参数,对由所述 第二获取单元 701获取的至少两路视频数据进行融合;
输出单元 703 ,用于将由所述第二融合单元 702融合后的至少两路视频数 据输出给显示设备, 由所述显示设备显示所述融合后的至少两路视频数据。
本发明实施例提供的视频通信的装置, 将由视频通信发送端发送的视频 码流进行接收解码, 由第二融合单元对解码后获得的视频数据进行进一步地 融合处理, 再由输出单元将融合处理后的视频图像输出到显示设备进行显示, 视频通信接收端进行的融合处理能够多个投影图像在弧形幕上无缝呈现, 并 且各个投影区域在颜色和亮度方面差异较小, 提高了全景视频图像的视觉连 续性, 能够给用户更好的沉浸式全景体验。
进一步地, 如图 24所示, 所述视频通信装置还包括:
第二 GAMMA校正单元 704, 用于在对由所述第二获取单元 701获取的 至少两路视频数据进行融合之前, 根据所述第二视频处理参数中的 GAMMA 校正;
本实施例中, 由于投影仪之间和投影仪内部的亮度和颜色差异, 可以通 过拍摄反馈的方法进行 GAMMA校正,以消除投影仪对于视频图像显示效果 的影响。
投影校正单元 707 ,用于在对由所述第二获取单元 701获取的至少两路视 频数据进行融合之前, 根据所述第二视频处理参数中的投影校正参数, 对由 所述第二获取单元 701获取的至少两路视频数据进行投影校正;
第二变换单元 708 ,用于在对由所述第二获取单元 701获取的至少两路视 频数据进行融合之前, 根据所述第二视频处理参数中的变换参数, 对所述至 少两路视频数据进行变换; 所述变换包括: 视频数据的平移、 视频数据的旋 转和视频数据的单应性变换中的至少一种变换。
此步骤中的图像变换可以用于补偿由于投影仪安放位置不准确造成的图 像变形和未对齐。
第二剪裁单元 705 ,用于在对由所述第二获取单元 701获取的至少两路视 频数据进行融合之后, 根据所述第二视频处理参数中的剪裁区域参数, 将由 所述第二融合单元 702 融合后的至少两路视频数据的比例剪裁为第二目标比 例;
在本实施例中, 为了兼容目前视频会议系统的显示方式, 显示融合器除 了接投影仪外, 还可以接三台高清平板显示器, 而所述平板显示器具有边框 厚度, 无法做到无缝显示, 因此需要将图像位于边框厚度的部分剪裁掉。 所 述第二目标比例根据所述显示器的边框厚度来确定。
第二缩放单元 706 ,用于在对由所述第二获取单元 701获取的至少两路视 频数据进行融合之后, 将由所述第二融合单元 702 融合后的至少两路视频数 据的大小缩放至第二目标大小; 本实施例中, 对视频数据进行剪裁后, 视频数据的大小可能会变小, 对 视频数据进行缩放, 使其尺寸达到显示时所需要的大小。
进一步地, 如图 25所示, 当所述通信装置不需要与 PC机交互时, 所述 通信装置还包括:
第二计算单元 709, 用于计算出第二视频处理参数;
所述第二视频处理参数包括投影仪 GAMMA校正参数、视频图像投影校正 参数、视频图像变换参数表、视频图像 Alpha融合参数表和图像剪裁区域参数。
第三配置单元 710,用于将由所述第二计算单元 709计算出的第二视频处 理参数配置为工作状态使用的第二视频处理参数。
进一步地, 如图 26所示, 当所述通信装置需要与 PC机交互时, 所述通 信装置还包括:
第二接收参数单元 711 , 用于接收由第二处理机计算出的第二视频处理参 数; 在本实施例中, 第二处理机计算出的参数包括投影仪 GAMMA校正参数、 视频图像投影校正参数、 视频图像变换参数表、 视频图像 Alpha融合参数表和 图像剪裁区域参数。
第四配置单元 712,用于将由所述第二接收参数单元 711接收到的第二视 频处理参数配置为工作状态使用的第二视频处理参数。
本发明实施例提供的视频通信的装置具体实现方法可以参见本发明实施 例提供的视频通信的方法所述, 此处不再赘述。
本发明实施例提供的技术方案可应用在视频会议等视频通信的技术领域 中。
本领域普通技术人员可以理解实现上述实施例方法中的全部或部分步骤 是可以通过程序来指令相关的硬件完成, 所述的程序可以存储于计算机可读 存储介质中, 如 ROM/RAM、 磁碟或光盘等。
以上所述, 仅为本发明的具体实施方式, 但本发明的保护范围并不局限于此, 任何熟悉本技术领域的技术人员在本发明揭露的技术范围内, 可轻易想到变 化或替换, 都应涵盖在本发明的保护范围之内。 因此, 本发明的保护范围应 所述以权利要求的保护范围为准。

Claims

权 利 要求 书
1、 一种视频通信站点, 其特征在于, 包括:
至少两个本地摄像机, 用于指向至少两个本地用户部分, 拍摄至少两路本 地用户部分的本地视频图像;
本地摄像拼接融合器, 用于根据第一视频处理参数中的融合参数, 对所述 拍摄得到的至少两路本地用户部分的本地视频图像进行融合, 生成全景视频图 像; 并将所述全景视频图像编码成视频码流, 将所述视频码流发送给远端的视 频通信站点;
本地显示融合器, 用于从远端接收到的视频码流中分别解码得到至少两路 视频数据; 根据第二视频处理参数中的融合参数, 对所述解码得到的至少两路 视频数据进行融合; 将融合后的至少两路视频数据输出给本地显示设备;
至少两个本地显示设备, 用于显示经过所述本地显示融合器融合后的至少 两路视频数据。
2、 根据权利要求 1所述的视频通信站点, 其特征在于, 所述本地摄像拼接 融合器还用于: 对所述至少两路本地用户部分的本地视频图像执行至少一种如 下操作: GAMMA校正、 传感器坏点补偿、 图像处理的相关变换;
所述本地摄像拼接融合器还用于: 对所述全景视频图像执行至少一种如下 操作: 剪裁、 缩放、 分割。
3、 根据权利要求 1所述的视频通信站点, 其特征在于, 所述本地显示融合 器还用于对所述解码得到的至少两路视频数据执行至少一种如下操作:
GAMMA校正、 投影校正、 变换;
所述本地显示融合器还用于对所述融合后的至少两路视频数据执行至少一 种如下操作: 剪裁、 缩放。
4、 根据权利要求 1所述的视频通信站点, 其特征在于, 所述至少两个本地 摄像机的拍摄区域为所述至少两个本地摄像机的拍摄范围的并集, 所述至少两 个本地摄像机的拍摄区域覆盖所述至少两个本地用户部分。
5、 根据权利要求 4所述的视频通信站点, 其特征在于,
所述至少两个本地摄像机为共光心摄像机;
所述至少两个本地显示设备包括: 投影仪和投影屏幕, 或者显示器。
6、 根据权利要求 5所述的视频通信站点, 其特征在于, 当所述至少两个本 地显示设备为投影仪和投影屏幕时, 所述投影屏幕的位置以所述会议桌的边缘 的中线位置的点为圓心布局; 所述共光心摄像机位于所述投影屏幕弧长的中线 位置。
7、 根据权利要求 1所述的视频通信站点, 其特征在于, 所述至少两个本地 显示设备中包括至少一个用于显示共享数据信息的显示设备; 或者,
所述系统还包括另一本地显示设备, 用于显示共享数据信息。
8、 一种视频通信的方法, 其特征在于, 包括:
获取至少两路本地视频图像;
根据第一视频处理参数中的融合参数, 对所述至少两路本地视频图像进行 融合, 生成全景视频图像;
将所述全景视频图像发送给视频编码器, 通过所述视频编码器将所述全景 视频图像编码成视频码流, 并将所述视频码流发送出去。
9、 根据权利要求 8所述的视频通信的方法, 其特征在于, 所述获取至少两 路本地视频图像包括: 通过至少两个本地摄像机根据同步时钟获取至少两路本 地视频图像。
10、 根据权利要求 8所述的视频通信的方法, 其特征在于, 所述方法还包 括:
在对所述至少两路本地视频图像进行融合之前, 根据所述第一视频处理参 数中的 GAMMA校正参数, 对所述至少两路本地视频图像进行 GAMMA校正; 和 /或
在对所述至少两路本地视频图像进行融合之前, 根据所述第一视频处理参 数中的坏点补偿参数, 对所述至少两路本地视频图像进行传感器坏点补偿。
11、 根据权利要求 8所述的视频通信的方法, 其特征在于, 所述方法还包 括:
在对所述至少两路本地视频图像进行融合之前, 根据所述第一视频处理参 数中的变换参数, 对所述至少两路本地视频图像进行图像处理的相关变换; 所 述图像处理的相关变换包括: 视频图像的平移、 视频图像的旋转、 视频图像的 单应性变换和视频图像的柱面变换中的至少一种变换。
12、 根据权利要求 8所述的视频通信的方法, 其特征在于, 所述方法还包 括:
在对所述至少两路本地视频图像进行融合之后, 根据所述第一视频处理参 数中的剪裁区域参数, 将所述全景视频图像的比例剪裁为第一目标比例; 和 /或 在对所述至少两路本地视频图像进行融合之后, 将所述全景视频图像的大 小缩放至第一目标大小。
13、 根据权利要求 8所述的视频通信的方法, 其特征在于, 所述方法还包 括:
在对所述至少两路本地视频图像进行融合之后, 将所述全景视频图像分割 为至少两路视频数据;
所述通过视频编码器将所述全景视频图像编码成视频码流, 并将所述视频 码流发送出去为: 通过至少两个视频编码器将所述至少两路视频数据分别编码 成对应的视频码流, 并将所述至少两路视频数据对应的视频码流分别发送出去。
14、根据权利要求 8至 12中任意一项所述的视频通信的方法,其特征在于, 还包括: 配置第一视频处理参数;
所述配置第一视频处理参数包括:
釆集至少两幅图像;
根据所述至少两幅图像计算出第一视频处理参数;
将计算出的第一视频处理参数配置为工作状态使用的第一视频处理参数。
15、根据权利要求 8至 12中任意一项所述的视频通信的方法,其特征在于, 所述配置第一视频处理参数的步骤包括:
接收第一处理机发送的图像釆集命令;
釆集至少两幅图像;
将所釆集到的至少两幅图像发送给所述第一处理机;
接收由所述第一处理机根据所述至少两幅图像计算出的第一视频处理参 数;
将接收到的第一视频处理参数配置为工作状态使用的第一视频处理参数。
16、 一种视频通信的方法, 其特征在于, 包括:
获取视频解码器从视频码流中解码出的至少两路视频数据, 所述视频码流 由所述视频解码器从远端的视频通信站点接收得到;
根据第二视频处理参数中的融合参数, 对所述至少两路视频数据进行融合; 将所述融合后的至少两路视频数据输出给显示设备, 由所述显示设备显示 所述融合后的至少两路视频数据。
17、 根据权利要求 16所述的视频通信的方法, 其特征在于, 所述方法还包 括:
在对所述至少两路视频数据进行融合之前, 根据所述第二视频处理参数中 的 GAMMA校正参数, 对所述至少两路视频数据进行 GAMMA校正。
18、 根据权利要求 16所述的视频通信的方法, 其特征在于, 所述方法还包 括:
在对所述至少两路视频数据进行融合之前, 根据所述第二视频处理参数中 的投影校正参数, 对所述至少两路视频数据进行投影校正。
19、 根据权利要求 16所述的视频通信的方法, 其特征在于, 所述方法还包 括:
在对所述至少两路视频数据进行融合之前, 根据所述第二视频处理参数中 的变换参数, 对所述至少两路视频数据进行变换; 所述变换包括: 视频数据的 平移、 视频数据的旋转、 视频数据的单应性变换和视频图像的柱面变换中的中 的至少一种变换。
20、 根据权利要求 16所述的视频通信的方法, 其特征在于, 所述方法还包 括:
在对所述至少两路视频数据进行融合之后, 根据所述第二视频处理参数中 的剪裁区域参数, 将所述融合后的至少两路视频数据的比例剪裁为第二目标比 例; 和 /或
在对所述至少两路视频数据进行融合之后, 将所述融合后的至少两路视频 数据的大小缩放至第二目标大小。
21、 根据权利要求 16至 20中任意一项所述的视频通信的方法, 其特征在 于, 还包括: 配置第二视频处理参数;
所述配置第二视频处理参数包括:
计算出第二视频处理参数, 或接收由第二处理机计算出的第二视频处理参 数;
将接收到的第二视频处理参数配置为工作状态使用的第二视频处理参数。
22、 一种视频通信的装置, 其特征在于, 包括:
第一获取单元, 用于获取至少两路本地视频图像;
第一融合单元, 用于根据第一视频处理参数中的融合参数, 对由所述第一 获取单元获取的至少两路本地视频图像进行融合, 生成全景视频图像;
第一发送单元, 用于将由所述第一融合单元获得的全景视频图像发送给视 频编码器, 通过所述视频编码器将所述全景视频图像编码成视频码流, 并将所 述视频码流发送出去。
23. The video communication apparatus according to claim 22, further comprising: a synchronization unit, configured to provide a synchronization clock, so that the first acquisition unit acquires the at least two local video images under the calibration of the synchronization clock.
24. The video communication apparatus according to claim 22, further comprising:
a first GAMMA correction unit, configured to perform, before the at least two local video images acquired by the first acquisition unit are fused, GAMMA correction on them according to a GAMMA correction parameter in the first video processing parameter; and/or
a dead-pixel compensation unit, configured to perform, before the at least two local video images acquired by the first acquisition unit are fused, sensor dead-pixel compensation on them according to a dead-pixel compensation parameter in the first video processing parameter.
25. The video communication apparatus according to claim 22, further comprising:
a first transformation unit, configured to perform, before the at least two local video images acquired by the first acquisition unit are fused, an image-processing-related transformation on them according to a transformation parameter in the first video processing parameter, the image-processing-related transformation including at least one of: translation of a video image, rotation of a video image, homography transformation of a video image, and cylindrical transformation of a video image.
26. The video communication apparatus according to claim 22, further comprising:
a first cropping unit, configured to crop, after the at least two local video images acquired by the first acquisition unit are fused, the panoramic video image obtained by the first fusion unit to a first target aspect ratio according to a crop-region parameter in the first video processing parameter; and/or
a first scaling unit, configured to scale, after the at least two local video images acquired by the first acquisition unit are fused, the panoramic video image obtained by the first fusion unit to a first target size.
27. The video communication apparatus according to claim 22, further comprising:
a segmentation unit, configured to segment, after the at least two local video images acquired by the first acquisition unit are fused, the panoramic video image obtained by the first fusion unit into at least two streams of video data.
28. The video communication apparatus according to any one of claims 22 to 26, further comprising:
a first capture unit, configured to capture at least two images;
a first computing unit, configured to compute the first video processing parameter from the at least two images captured by the first capture unit; and
a first configuration unit, configured to configure the first video processing parameter computed by the first computing unit as the first video processing parameter used in the working state.
29. The video communication apparatus according to any one of claims 22 to 26, further comprising:
a command receiving unit, configured to receive an image capture command sent by a first processor;
a second capture unit, configured to capture at least two images;
a second sending unit, configured to send the at least two images captured by the second capture unit to the first processor;
a first parameter receiving unit, configured to receive the first video processing parameter computed by the first processor from the at least two images captured by the second capture unit; and
a second configuration unit, configured to configure the first video processing parameter received by the first parameter receiving unit as the first video processing parameter used in the working state.
30. A video communication apparatus, comprising:
a second acquisition unit, configured to acquire at least two streams of video data decoded by a video decoder from a video bitstream, the video bitstream being received by the video decoder from a remote video communication site;
a second fusion unit, configured to fuse the at least two streams of video data acquired by the second acquisition unit according to a fusion parameter in a second video processing parameter; and
an output unit, configured to output the at least two streams of video data fused by the second fusion unit to a display device, which displays the fused at least two streams of video data.
31. The video communication apparatus according to claim 30, further comprising:
a second GAMMA correction unit, configured to perform, before the at least two streams of video data acquired by the second acquisition unit are fused, GAMMA correction on them according to a GAMMA correction parameter in the second video processing parameter.
32. The video communication apparatus according to claim 30, further comprising:
a projection correction unit, configured to perform, before the at least two streams of video data acquired by the second acquisition unit are fused, projection correction on them according to a projection correction parameter in the second video processing parameter.
33. The video communication apparatus according to claim 30, further comprising:
a second transformation unit, configured to transform, before the at least two streams of video data acquired by the second acquisition unit are fused, the at least two streams of video data according to a transformation parameter in the second video processing parameter, the transformation including at least one of: translation of video data, rotation of video data, homography transformation of video data, and cylindrical transformation of a video image.
34. The video communication apparatus according to claim 30, further comprising:
a second cropping unit, configured to crop, after the at least two streams of video data acquired by the second acquisition unit are fused, the at least two streams of video data fused by the second fusion unit to a second target aspect ratio according to a crop-region parameter in the second video processing parameter; and/or
a second scaling unit, configured to scale, after the at least two streams of video data acquired by the second acquisition unit are fused, the at least two streams of video data fused by the second fusion unit to a second target size.
35. The video communication apparatus according to any one of claims 30 to 34, further comprising: a second computing unit, configured to compute the second video processing parameter; and
a third configuration unit, configured to configure the second video processing parameter computed by the second computing unit as the second video processing parameter used in the working state.
36. The video communication apparatus according to any one of claims 30 to 34, further comprising:
a second parameter receiving unit, configured to receive the second video processing parameter computed by a second processor; and
a fourth configuration unit, configured to configure the second video processing parameter received by the second parameter receiving unit as the second video processing parameter used in the working state.
37. A video communication system, comprising at least two video communication sites according to any one of claims 1 to 7;
wherein one of the at least two video communication sites is configured to capture at least two local video images of local user parts, fuse the captured at least two local video images of the local user parts according to a fusion parameter in a first video processing parameter to generate a panoramic video image, encode the panoramic video image into a video bitstream, and send out the video bitstream over a network; and
at least one of the at least two video communication sites, as a receiving site, is configured to decode at least two streams of video data from the received video bitstream, fuse the decoded at least two streams of video data according to a fusion parameter in a second video processing parameter, and output the fused at least two streams of video data for display.
PCT/CN2010/070427 2010-01-29 2010-01-29 Method, device and system for video communication WO2011091604A1 (zh)

Priority Applications (3)

Application Number Priority Date Filing Date Title
EP10844382.1A EP2525574A4 (en) 2010-01-29 2010-01-29 METHOD, DEVICE AND SYSTEM FOR VIDEO COMMUNICATION
PCT/CN2010/070427 WO2011091604A1 (zh) 2010-01-29 2010-01-29 Method, device and system for video communication
US13/561,928 US8890922B2 (en) 2010-01-29 2012-07-30 Video communication method, device and system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2010/070427 WO2011091604A1 (zh) 2010-01-29 2010-01-29 Method, device and system for video communication

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US13/561,928 Continuation US8890922B2 (en) 2010-01-29 2012-07-30 Video communication method, device and system

Publications (1)

Publication Number Publication Date
WO2011091604A1 true WO2011091604A1 (zh) 2011-08-04

Family

ID=44318629

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2010/070427 WO2011091604A1 (zh) 2010-01-29 2010-01-29 Method, device and system for video communication

Country Status (3)

Country Link
US (1) US8890922B2 (zh)
EP (1) EP2525574A4 (zh)
WO (1) WO2011091604A1 (zh)

Families Citing this family (30)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
TWI510082B (zh) * 2012-06-06 2015-11-21 Etron Technology Inc Image capture method for image recognition and system thereof
US9507750B2 (en) 2012-10-12 2016-11-29 A9.Com, Inc. Dynamic search partitioning
US9357165B2 (en) 2012-11-16 2016-05-31 At&T Intellectual Property I, Lp Method and apparatus for providing video conferencing
US9055216B1 (en) * 2012-11-19 2015-06-09 A9.Com, Inc. Using sensor data to enhance image data
EP3020183B1 (en) * 2013-07-11 2019-12-11 Harman International Industries, Inc. System and method for digital audio conference workflow management
GB2517730A (en) 2013-08-29 2015-03-04 Mediaproduccion S L A method and system for producing a video production
US9363476B2 (en) 2013-09-20 2016-06-07 Microsoft Technology Licensing, Llc Configuration of a touch screen display with conferencing
US20150085060A1 (en) 2013-09-20 2015-03-26 Microsoft Corporation User experience for conferencing with a touch screen display
CN104836977B (zh) * 2014-02-10 2018-04-24 阿里巴巴集团控股有限公司 Video communication method and system for instant messaging
GB2528060B (en) * 2014-07-08 2016-08-03 Ibm Peer to peer audio video device communication
US10395403B1 (en) * 2014-12-22 2019-08-27 Altia Systems, Inc. Cylindrical panorama
US11856297B1 (en) * 2014-12-31 2023-12-26 Gn Audio A/S Cylindrical panorama hardware
CN105898184A (zh) * 2016-04-26 2016-08-24 乐视控股(北京)有限公司 Video call method and device
JP6742444B2 (ja) * 2016-06-07 2020-08-19 ヴィズビット インコーポレイテッド Virtual reality 360-degree video camera system for live streaming
CN106231233B (zh) * 2016-08-05 2019-12-20 北京邮电大学 Weight-based real-time screen fusion method
CN106791656B (zh) * 2016-12-23 2023-09-01 北京汉邦高科数字技术股份有限公司 Universally adjustable binocular panoramic camera and working method thereof
CN108270993A (zh) * 2016-12-30 2018-07-10 核动力运行研究所 Multi-camera video inspection device for bolt holes of a reactor pressure vessel
TWI672677B (zh) * 2017-03-31 2019-09-21 鈺立微電子股份有限公司 Depth map generation device for fusing multiple depth maps
CN108289002B (zh) * 2017-11-06 2019-09-27 诸暨市青辰科技服务有限公司 Safety radio dedicated to the elderly
US10509968B2 (en) * 2018-01-30 2019-12-17 National Chung Shan Institute Of Science And Technology Data fusion based safety surveillance system and method
CN108449360B (zh) * 2018-04-17 2021-06-18 广州视源电子科技股份有限公司 Intelligent interactive all-in-one machine
CN108762706B (zh) * 2018-05-29 2021-07-27 海信视像科技股份有限公司 Image processing method and device
CN112911159B (zh) * 2018-08-27 2023-04-18 深圳市大疆创新科技有限公司 Image presentation method, image acquisition device, and terminal apparatus
CN109803099B (zh) * 2018-12-24 2021-10-22 南京巨鲨显示科技有限公司 Dynamic management method for the display layers of a video stitcher
CN110545376B (zh) * 2019-08-29 2021-06-25 上海商汤智能科技有限公司 Communication method and apparatus, electronic device, and storage medium
CN112866634A (zh) * 2020-12-31 2021-05-28 上海远动科技有限公司 Data interaction system between an industrial control platform and a video surveillance platform
CN112954234A (zh) * 2021-01-28 2021-06-11 天翼物联科技有限公司 Multi-video fusion method, system, apparatus, and medium
CN115552467A (zh) 2021-05-21 2022-12-30 商汤国际私人有限公司 Edge computing method and apparatus, edge device, and storage medium
WO2022243736A1 (en) * 2021-05-21 2022-11-24 Sensetime International Pte. Ltd. Edge computing method and apparatus, edge device and storage medium
CN113938617A (zh) * 2021-09-06 2022-01-14 杭州联吉技术有限公司 Multi-channel video display method, device, network camera, and storage medium

Family Cites Families (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20020180727A1 (en) * 2000-11-22 2002-12-05 Guckenberger Ronald James Shadow buffer control module method and software construct for adjusting per pixel raster images attributes to screen space and projector features for digital warp, intensity transforms, color matching, soft-edge blending, and filtering for multiple projectors and laser projectors
US7680192B2 (en) * 2003-07-14 2010-03-16 Arecont Vision, Llc. Multi-sensor panoramic network camera
CN1725851 (zh) 2004-07-20 2006-01-25 赵亮 Control system and control method for video conferencing
US7925391B2 (en) * 2005-06-02 2011-04-12 The Boeing Company Systems and methods for remote display of an enhanced image
US7973823B2 (en) * 2006-12-29 2011-07-05 Nokia Corporation Method and system for image pre-processing
BRPI0821283A2 (pt) 2008-03-17 2015-06-16 Hewlett Packard Development Co Method for representing video image streams and endpoint client management system
US8502857B2 (en) * 2008-11-21 2013-08-06 Polycom, Inc. System and method for combining a plurality of video stream generated in a videoconference
KR101269900B1 (ko) * 2008-12-22 2013-05-31 한국전자통신연구원 Method and apparatus for implementing motion-control camera effects based on multiple images
US8212855B2 (en) * 2009-04-29 2012-07-03 Embarq Holdings Company, Llc Video conferencing eyewear

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH07135646A (ja) * 1993-11-11 1995-05-23 Nec Eng Ltd Video conference system
CN1972431A (zh) * 2006-12-14 2007-05-30 北京中星微电子有限公司 Video conference image processing system
WO2008115416A1 (en) * 2007-03-16 2008-09-25 Kollmorgen Corporation System for panoramic image processing
CN101483758A (zh) * 2008-01-11 2009-07-15 天地阳光通信科技(北京)有限公司 System integrating a video surveillance system with a video conference system
CN101534413A (zh) * 2009-04-14 2009-09-16 深圳华为通信技术有限公司 Telepresence system, apparatus and method

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
See also references of EP2525574A4 *

Cited By (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP2733934A4 (en) * 2011-11-08 2015-04-15 Huawei Tech Co Ltd METHOD AND TERMINAL FOR TRANSMITTING INFORMATION
US9357173B2 (en) 2011-11-08 2016-05-31 Huawei Technologies Co., Ltd. Method and terminal for transmitting information
EP2733934A1 (en) * 2011-11-08 2014-05-21 Huawei Technologies Co., Ltd Method and terminal for transmitting information
US9088696B2 (en) 2011-11-08 2015-07-21 Huawei Technologies Co., Ltd. Method and terminal for transmitting information
CN104350745A (zh) * 2012-07-04 2015-02-11 英特尔公司 Panorama-based 3D video coding
WO2014005297A1 (en) * 2012-07-04 2014-01-09 Intel Corporation Panorama based 3d video coding
CN104350745B (zh) * 2012-07-04 2018-12-11 英特尔公司 Panorama-based 3D video coding
EP2852157A4 (en) * 2012-07-23 2015-07-22 Zte Corp VIDEO PROCESSING, DEVICE AND SYSTEM
CN103581609A (zh) * 2012-07-23 2014-02-12 中兴通讯股份有限公司 Video processing method, device, and system
US9497390B2 (en) 2012-07-23 2016-11-15 Zte Corporation Video processing method, apparatus, and system
CN103581609B (zh) * 2012-07-23 2018-09-28 中兴通讯股份有限公司 Video processing method, device, and system
CN104574339A (zh) * 2015-02-09 2015-04-29 上海安威士科技股份有限公司 Multi-scale cylindrical projection panorama generation method for video surveillance
WO2018188609A1 (zh) * 2017-04-11 2018-10-18 中兴通讯股份有限公司 Photographing apparatus, method and device

Also Published As

Publication number Publication date
EP2525574A1 (en) 2012-11-21
US20120287222A1 (en) 2012-11-15
US8890922B2 (en) 2014-11-18
EP2525574A4 (en) 2013-07-10

Similar Documents

Publication Publication Date Title
WO2011091604A1 (zh) Method, device and system for video communication
WO2014036741A1 (zh) Image processing method and image processing device
US9270941B1 (en) Smart video conferencing system
EP2368364B1 (en) Multiple video camera processing for teleconferencing
US7202887B2 (en) Method and apparatus maintaining eye contact in video delivery systems using view morphing
EP2047422B1 (en) Method and system for producing seamless composite images having non-uniform resolution from a multi-imager
US20220070371A1 (en) Merging webcam signals from multiple cameras
KR20100085188A (ko) Three-dimensional video communication terminal, system, and method
WO2010118685A1 (zh) Telepresence system, device and method
BRPI0924076B1 (pt) Telepresence system and telepresence method
US9143727B2 (en) Dual-axis image equalization in video conferencing
CN103905741A (zh) Real-time generation and multi-channel synchronized playback system for ultra-high-definition panoramic video
WO2011029398A1 (zh) Image processing method and device
US8149260B2 (en) Methods and systems for producing seamless composite images without requiring overlap of source images
JPWO2014192804A1 (ja) Decoder and surveillance system
TWI616102B (zh) Video image generation system and video image generation method thereof
WO2013067898A1 (zh) Method and terminal for transmitting information
WO2011011917A1 (zh) Video communication method, apparatus and system
JP6004978B2 (ja) Subject image extraction device and subject image extraction/composition device
WO2013060295A1 (zh) Video processing method and system
EP3033883B1 (en) Video feedback of presenter image for optimizing participant image alignment in a videoconference
JP2008301399A (ja) Video conference device, video conference method, video conference system, computer program, and recording medium
KR20170059310A (ko) Telepresence video transmitting apparatus, telepresence video receiving apparatus, and telepresence video providing system
KR101219457B1 (ko) Video conference terminal device and image display method in the device
JP2015186177A (ja) Video distribution system and video distribution method

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application (Ref document number: 10844382; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: DE)
REEP Request for entry into the european phase (Ref document number: 2010844382; Country of ref document: EP)
WWE Wipo information: entry into national phase (Ref document number: 2010844382; Country of ref document: EP)