WO2012031566A1 - Method and apparatus for adjusting participant image display in a multi-screen video conference - Google Patents

Method and apparatus for adjusting participant image display in a multi-screen video conference

Info

Publication number
WO2012031566A1
Authority
WO
WIPO (PCT)
Prior art keywords
screen
participant
image
site
displayed
Prior art date
Application number
PCT/CN2011/079523
Other languages
English (en)
French (fr)
Inventor
吴姣黎
陈显义
宋文
Original Assignee
华为终端有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 华为终端有限公司
Publication of WO2012031566A1

Links

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N7/00 Television systems
    • H04N7/14 Systems for two-way working
    • H04N7/15 Conference systems
    • H04N7/152 Multipoint control units therefor
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N7/00 Television systems
    • H04N7/14 Systems for two-way working
    • H04N7/141 Systems for two-way working between two video terminals, e.g. videophone
    • H04N7/147 Communication arrangements, e.g. identifying the communication as a video-communication, intermediate storage of the signals

Definitions

  • The present invention relates to the field of communications technologies, and in particular to a method and an apparatus for adjusting participant image display in a multi-screen video conference.
  • The video conferencing service is a multimedia communication service that uses video terminals and a communication network to hold a conference, and can simultaneously provide image, voice, and data interaction between two or more locations.
  • The terminal at a conference site compresses and encodes the image signal captured by the local camera and the voice signal of the participants picked up by the microphone in the participant area, and transmits the result to the remote conference site through the transmission network.
  • The terminal also receives the digital signal transmitted from the remote site through the transmission network and decodes it to obtain the images and voice signals of the participants at the remote site.
  • Conference sites have evolved from one camera, one display, and one participant area to multiple cameras, multiple displays, and multiple participant areas within the same site, where the cameras, displays, and participant areas are associated through physical or logical relationships.
  • In the existing approach, a multipoint control server (for example, an MCU, Multipoint Control Unit) in the communication network identifies the speaker whose voice is currently the loudest and switches the images of all participants at that speaker's site to the target sites.
  • A target site is any site other than the site where the loudest speaker is located.
  • As a result, a target site can only display the images of participants from a single site, namely the site where the participant with the loudest voice is located. If the participants currently taking part in the discussion are at different sites, the participants at the target site cannot see the images of all of them.
  • Summary of the invention
  • the embodiment of the invention provides a method and a device for adjusting the image display of a participant in a multi-screen video conference, which can flexibly perform on-screen voice-activated switching to improve the experience of the participant.
  • A method for adjusting participant image display in a multi-screen video conference includes:
  • determining, in sequence, a predetermined number of participants to be displayed, in descending order of the volume of the participants in the current conference;
  • controlling the images displayed on the screens whose images need to be switched to switch to the images of the predetermined number of participants to be displayed.
  • A network side media processing device includes:
  • a participant selection unit, configured to determine, in sequence, a predetermined number of participants to be displayed, in descending order of the volume of the participants in the current conference;
  • a screen selection unit, configured to determine a predetermined number of screens corresponding to currently displayed participants at the first site as the screens whose images need to be switched;
  • a first control switching unit, configured to control the images displayed on the screens whose images need to be switched to switch to the images of the predetermined number of participants to be displayed.
  • The embodiments of the present invention determine a predetermined number of screens corresponding to currently displayed participants at the first site as the screens whose images need to be switched, and then switch the images on those screens to the images of the participants to be displayed, who are determined in descending order of the volume of each participant in the conference.
  • Because the participants to be displayed are selected in order of volume in the current conference, participants who are taking part in the discussion but are located at different sites can be displayed. Participants at the first site can therefore see the images of those taking part in the discussion, which improves the participants' experience.
  • FIG. 1 is a schematic structural view of a multi-screen conference site.
  • FIG. 2A is a flowchart of a method for adjusting participant image display in a multi-screen video conference according to an embodiment of the present invention.
  • FIG. 2B is a flowchart of a method for adjusting participant image display in a multi-screen video conference according to another embodiment of the present invention.
  • FIG. 2C is a flowchart of a method for adjusting participant image display in a multi-screen video conference according to another embodiment of the present invention.
  • FIG. 2D is a flowchart of a method for adjusting participant image display in a multi-screen video conference according to another embodiment of the present invention.
  • FIG. 3 is a flowchart of a method for adjusting participant image display based on a recent speaker list according to an embodiment of the present invention.
  • FIG. 4 is a flowchart of another method for adjusting participant image display based on a recent speaker list according to an embodiment of the present invention.
  • FIG. 5 is a flowchart of still another method for adjusting participant image display based on a recent speaker list according to an embodiment of the present invention.
  • FIG. 6A is a schematic diagram of the screen images of a three-screen site switched by the method of FIG. 3, 4 or 5 according to an embodiment of the present invention.
  • FIG. 6B is a schematic diagram of the screen images of a two-screen site switched by the method of FIG. 3, 4 or 5 according to an embodiment of the present invention.
  • FIG. 6C is a schematic diagram of the screen images of a three-screen site switched by a method that specifies the screen for displaying the image of the loudest speaker according to an embodiment of the present invention.
  • FIG. 6D is a schematic diagram of the screen images of a two-screen site switched by a method that specifies the screen for displaying the image of the loudest speaker according to an embodiment of the present invention.
  • FIG. 7 is a flowchart of a method for adjusting participant image display that takes into account the position of the screens in a conference site according to an embodiment of the present invention.
  • FIG. 8 is a schematic diagram of a conference site superimposing a multi-picture image on the image of the loudest speaker according to an embodiment of the present invention.
  • FIG. 9 is a schematic diagram of a playback device in a conference site playing a mix (the voices of several participants at remote sites) according to an embodiment of the present invention.
  • FIG. 10 is a schematic diagram of displaying multiple pictures simultaneously on the screen that displays the image of the loudest participant according to an embodiment of the present invention.
  • FIG. 11 is a structural diagram of a network side media processing device according to an embodiment of the present invention.
  • FIG. 12 and FIG. 13 are structural diagrams of the screen selection unit.
  • FIG. 14 is a block diagram of the video source control unit.
  • An embodiment of the present invention provides a method for adjusting participant image display in a multi-screen video conference. The method specifically includes:
  • In descending order of the volume of the participants in the current conference, starting from the participant with the highest volume, a predetermined number of participants to be displayed are determined in sequence.
  • The volume of a participant may be obtained by counting the volume energy of the participant's speech over a period of time. This period may be the time immediately before the moment at which the image needs to be adjusted, and its duration may be set by the user.
  • The predetermined number may be one, in which case the determined participant is the one with the loudest voice. Alternatively, the predetermined number may be more than one; the specific number may be set by the network side media processing device, or set by a terminal and sent to the network side media processing device, for example by the terminal of the chair site.
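  • The volume-based selection described above can be sketched as follows. This is a minimal illustration only: the text does not define how the volume energy is computed, so the sum of squared sample amplitudes over the window is an assumption, and the names `volume_energy`, `loudest_participants`, and the participant identifiers are hypothetical.

```python
from typing import Dict, List, Sequence


def volume_energy(samples: Sequence[float]) -> float:
    """Assumed energy measure: sum of squared amplitudes in the window."""
    return sum(s * s for s in samples)


def loudest_participants(windows: Dict[str, Sequence[float]], count: int) -> List[str]:
    """Return up to `count` participant ids, ordered from loudest to quietest."""
    ranked = sorted(windows, key=lambda pid: volume_energy(windows[pid]), reverse=True)
    return ranked[:count]


# Hypothetical audio windows captured just before the image adjustment.
windows = {
    "site1/p1": [0.1, -0.2, 0.1],
    "site2/p2": [0.6, -0.5, 0.7],
    "site3/p1": [0.3, 0.3, -0.4],
}
print(loudest_participants(windows, 2))
```

  • With a predetermined number of one, the same function returns only the participant with the loudest voice.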
  • A predetermined number of screens corresponding to currently displayed participants at the first site are determined as the screens whose images need to be switched, according to the ranking of the participants currently displayed on the screens of the first site.
  • The ranking of the participants currently displayed on the screens of the first site is produced according to a sorting condition, which includes at least one of the following: the voice volume of the currently displayed participant, how recent the currently displayed participant's speaking time point is, the speech duration of the currently displayed participant, the number of times the currently displayed participant has spoken, and whether the screen corresponding to the currently displayed participant is the main screen.
  • The ranking may be produced in one of the following ways: by voice volume, from loudest to quietest; by speaking time point, from most recent to least recent; by speech duration, from longest to shortest; or by number of speeches, from most to fewest. In addition, whether the screen corresponding to a currently displayed participant is the main screen can be used as an additional sorting condition.
  • Currently displayed participants shown on the main screen rank ahead of currently displayed participants shown on non-main screens.
  • In general, the participant with the quietest voice is not taking part in the discussion, whereas a participant with a loud voice is. So that the screens of participants who are not taking part in the discussion are selected as the screens to be switched, the voice volume of the currently displayed participant is used as one of the sorting conditions.
  • In a video conference, a participant who spoke recently is relatively likely to speak again, and a participant who spoke longer ago is relatively unlikely to speak again, so how recent the currently displayed participant's speaking time point is serves as one of the sorting conditions.
  • Likewise, a participant who spoke for a long time is relatively likely to speak again, and a participant who spoke only briefly is relatively unlikely to speak again, so the speech duration of the currently displayed participant is used as one of the sorting conditions.
  • A person who speaks often has a higher probability of speaking again, so, to better estimate the probability that a participant will speak, the number of times the participant has spoken can be used as one of the sorting conditions.
  • Where a site has a middle screen, the middle screen is the main screen; otherwise, the two screens adjacent to the central axis are the main screens. The main screen generally presents the images of the conference chair and other participants.
  • Therefore, so that participants presented on the main screen are taken into account, whether the screen corresponding to a participant currently displayed at the first site is the main screen can be used as a sorting condition.
  • When several sorting conditions are considered together, each can be given a weight according to its importance (for example, the weights assigned to all the sorting conditions may be normalized so that their sum is 1).
  • A factor for each sorting condition is defined according to its characteristics, and the weighted sum of these factors is calculated as the sort reference value.
  • For example, the weight of the participant's voice volume is 0.1, the weight of the speaking time point is 0.4, the weight of the speech duration is 0.2, the weight of the number of speeches is 0.2, and the weight of the participant's screen is 0.1, so that the weights of all these factors sum to 1.
  • Each of these factors has its own value range.
  • The participant's voice volume ranges from 1 to 10: the louder the voice, the larger the value. The voice volume is the volume of each participant's most recent speech.
  • The speaking time point ranges from 1 to 1000 and is the time point of each participant's most recent speech, where it can be assumed that the value is 1 at the start of the meeting and increases by 1 every minute.
  • The speech duration ranges from 1 to 500, in minutes. It can be the duration of the participant's most recent speech, or the accumulated duration of the participant's speeches in a specific time period, such as the total duration of the participant's speeches within one hour.
  • The number of speeches ranges from 1 to 100 and can be the number of speeches in a specific time period, such as the number of speeches within one hour.
  • The participant's screen value is 0 or 1: when the participant's screen is the main screen the value is 1, otherwise it is 0.
  • For example, the middle screen is the main screen; where there are two middle screens, both can be considered main screens.
  • Participant's sort reference value = participant's voice volume × voice volume weight + participant's speaking time point × speaking time point weight + participant's speech duration × speech duration weight + participant's number of speeches × number-of-speeches weight + participant's screen value × participant's screen weight.
  • The participants are sorted in descending order of sort reference value, and the screens corresponding to the last predetermined number of participants in the sorting result are selected as the screens whose images need to be switched.
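  • The weighted ranking can be illustrated with a short sketch. It uses the example weights and value ranges given in the text; the class `Displayed`, the function names, and the sample values are hypothetical, and taking the lowest-ranked participants' screens is an assumption consistent with selecting the last currently displayed participant.

```python
from dataclasses import dataclass
from typing import List

# Example weights from the text; they sum to 1.
WEIGHTS = {"volume": 0.1, "time_point": 0.4, "duration": 0.2,
           "count": 0.2, "main_screen": 0.1}


@dataclass
class Displayed:
    screen: int        # screen number at the first site
    volume: int        # 1..10, volume of the most recent speech
    time_point: int    # 1..1000, minutes since the meeting started
    duration: int      # 1..500, minutes of speech
    count: int         # 1..100, number of speeches
    main_screen: int   # 1 if shown on the main screen, else 0


def sort_reference_value(p: Displayed) -> float:
    """Weighted sum of the five sorting factors."""
    return (p.volume * WEIGHTS["volume"]
            + p.time_point * WEIGHTS["time_point"]
            + p.duration * WEIGHTS["duration"]
            + p.count * WEIGHTS["count"]
            + p.main_screen * WEIGHTS["main_screen"])


def screens_to_switch(displayed: List[Displayed], count: int) -> List[int]:
    """Screens of the `count` lowest-ranked currently displayed participants."""
    if count <= 0:
        return []
    ranked = sorted(displayed, key=sort_reference_value, reverse=True)
    return [p.screen for p in ranked[-count:]]


displayed = [
    Displayed(screen=1, volume=3, time_point=10, duration=2, count=1, main_screen=0),
    Displayed(screen=2, volume=8, time_point=40, duration=5, count=6, main_screen=1),
    Displayed(screen=3, volume=5, time_point=25, duration=3, count=2, main_screen=0),
]
print(screens_to_switch(displayed, 1))
```

  • In this sample the participant on screen 1 spoke least recently and least often, so screen 1 is the first candidate for switching.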
  • Alternatively, when sorting the participants currently displayed on the screens of the first site, only the voice volume of each participant may be considered, in which case the participants are sorted from loudest to quietest; or only the speaking time point of each participant may be considered, in which case the participants are sorted from most recent to least recent; or only the speech duration of each participant may be considered, in which case the participants are sorted from longest to shortest.
  • Alternatively, only the voice volume and the speaking time point of each participant may be considered, regardless of other conditions, with one weight assigned to the voice volume and a weight of 0.6 assigned to the speaking time point. It can be assumed that the voice volume ranges from 1 to 10 (the louder the voice, the larger the value) and is the volume of each participant's most recent speech, and that the speaking time point ranges from 1 to 1000 and is the time point of each participant's most recent speech. The participants are then sorted by the resulting weighted value.
  • Alternatively, only the speech duration and the speaking time point of each participant may be considered, regardless of other conditions. None of these choices affects the implementation of the present invention.
  • the step is to select the participant with the loudest voice and the participant with the loudest voice, and determine the participant and the voice with the loudest voice.
  • the screen corresponding to the large participant is used as the screen for switching images.
  • Step 201A may be performed before step 202A, step 202A may be performed before step 201A, or the two steps may be performed simultaneously.
  • The predetermined number may be specified in advance by the participants at the first site, by the administrator of the conference management station, or by a conference terminal of the conference, or may be preset by the multimedia control server.
  • Step 203A may be implemented as follows: according to the ranking of the participants currently displayed on the screens of the first site, select the last-ranked currently displayed participant and determine whether that participant's screen is the first specific screen. If it is not, the screen whose image needs to be switched is the screen of the last-ranked currently displayed participant. If it is, select the currently displayed participant ranked immediately before the last one, and determine that participant's screen as the screen whose image needs to be switched.
  • Here, the first specific screen is the screen symmetric to the second specific screen about the screen center line; the second specific screen is the screen of the first site on which the image of the loudest speaker achieves an eye-to-eye effect; and the screen center line is the center line of the screens arranged in sequence at the first site.
  • By default, the participant image of area 1 captured by camera 1 at one site is presented on screen 1 or screen 3 of another site (if no image processing is used, the area 1 participant image is presented by default on screen 3 of the other site; if image processing is applied to the captured image, it is presented by default on screen 1 of the other site). The participant image of area 2 captured by camera 2 at one site is presented by default on screen 2 of the other site, and the participant image of area 3 captured by camera 3 is presented by default on screen 1 or screen 3 of the other site (analogous to the presentation of the area 1 participant image).
  • When the participant images of one site are displayed on the default screens of another site, the participants shown and the participants at the other site achieve an eye-to-eye effect.
  • FIG. 1 shows the default presentation of the participants of site 1 at site 2 when image processing is not used. It is assumed that at both sites the participant in area 1 is participant 1, the participant in area 2 is participant 2, and the participant in area 3 is participant 3.
  • Without image processing, assume that participant 1 at site 1 currently has the loudest voice. The second specific screen is then screen 3 at site 2, and the screen symmetric to screen 3 at site 2 about the screen center line is screen 1 at site 2. Screen 1 at site 2 is therefore the first specific screen; that is, the image of participant 1 at site 1 cannot be displayed on screen 1 at site 2.
  • With image processing, assume again that participant 1 at site 1 currently has the loudest voice. The second specific screen is then screen 1 at site 2, and the screen symmetric to screen 1 at site 2 about the screen center line is screen 3 at site 2. Screen 3 at site 2 is therefore the first specific screen; that is, the image of participant 1 at site 1 cannot be displayed on screen 3 at site 2.
  • It should be noted that, at a site with an odd number of screens, if the screen corresponding to the loudest speaker's image is the middle screen, the first specific screen does not exist, and the screen whose image needs to be switched can be determined directly as the screen of the last-ranked participant.
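  • The symmetric-screen rule above can be sketched as follows. The sketch assumes that the screens of the first site are numbered 1 to N in sequence, so that the screen symmetric to screen s about the center line is N + 1 - s; the function names are hypothetical, and the ranked list is assumed to hold at least two screens whenever a first specific screen exists.

```python
from typing import List, Optional


def first_specific_screen(second_specific: int, num_screens: int) -> Optional[int]:
    """Mirror of the second specific screen about the screen center line.

    Returns None when the second specific screen is the middle screen,
    because it is then its own mirror image.
    """
    mirror = num_screens + 1 - second_specific
    return None if mirror == second_specific else mirror


def screen_to_switch(ranked_screens: List[int], second_specific: int,
                     num_screens: int) -> int:
    """Pick the screen to switch from screens ranked best-first.

    Uses the last-ranked screen unless it is the first specific screen,
    in which case the screen ranked immediately before it is used.
    """
    blocked = first_specific_screen(second_specific, num_screens)
    last = ranked_screens[-1]
    if last != blocked:
        return last
    return ranked_screens[-2]


# Three-screen site, second specific screen is screen 3 (no image processing).
print(first_specific_screen(3, 3))
print(screen_to_switch([2, 3, 1], second_specific=3, num_screens=3))
```

  • For the three-screen example in the text, screen 3 mirrors to screen 1 and screen 1 mirrors to screen 3, while the middle screen has no mirror.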
  • When the participant determined in step 201A is the participant with the loudest voice and that participant is already displayed on a screen of the first site, step 202A and step 203A are no longer performed.
  • the screen of the first site in the foregoing method embodiment is a screen capable of image switching in the first site, and the screen that can switch images in the first site is all screens in the first site or screens other than the predetermined screen.
  • the predetermined screen is a predetermined screen that cannot switch images, such as a screen for displaying conference data (i.e., a secondary stream screen), or a screen for displaying a conference chairperson, or a screen for displaying a plurality of screens.
  • The network side media processing device may be a multipoint control server (for example, an MCU), a terminal device having the foregoing media control function (for example, a video conference terminal that integrates the media control function), or another network device.
  • Alternatively, step 201A is performed by the network side media processing device and step 202A is performed by the terminal of the first site. Specifically, the terminal of the first site selects a predetermined number of participants according to the ranking of the participants currently displayed on the screens of the first site, determines the screens corresponding to the selected participants as the screens whose images need to be switched, and then notifies the network side media processing device of the numbers of the selected screens. In this case, the predetermined number can be specified in advance by the participants at the first site.
  • The foregoing assumes that the predetermined number is less than or equal to the number of screens at the first site whose images can be switched. If the predetermined number is greater than that number, then, in descending order of the volume of the participants in the current conference and starting from the participant with the highest volume, as many participants to be displayed are selected as there are screens at the first site whose images can be switched, and the images displayed on those screens are switched to the images of the selected participants.
  • Where a specific participant and a specific screen are excluded from switching, step 201A determines the predetermined number of participants to be displayed from among the participants other than the specific participant, in descending order of volume and starting from the participant with the highest volume, and step 202A determines the screens whose images need to be switched from among the screens of the first site other than the specific screen.
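  • The two adjustments above (limiting the selection to the number of switchable screens, and leaving out specific participants) can be sketched as follows. The function name, parameter names, and sample volumes are hypothetical; the volumes are assumed to be already measured per participant.

```python
from typing import Dict, Iterable, List


def participants_to_display(volumes: Dict[str, float], predetermined: int,
                            switchable_screens: int,
                            excluded: Iterable[str] = ()) -> List[str]:
    """Select participants loudest-first.

    The count is capped at the number of screens whose images can be
    switched, and participants listed in `excluded` are never selected.
    """
    skip = set(excluded)
    candidates = [pid for pid in volumes if pid not in skip]
    candidates.sort(key=lambda pid: volumes[pid], reverse=True)
    limit = min(predetermined, switchable_screens)
    return candidates[:limit]


volumes = {"a": 5.0, "b": 9.0, "c": 7.0, "d": 1.0}
# Three participants requested, but only two screens can switch images.
print(participants_to_display(volumes, predetermined=3, switchable_screens=2))
# Participant "b" is fixed on a specific screen and is left out.
print(participants_to_display(volumes, 2, 3, excluded=["b"]))
```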
  • The embodiment of the present invention determines a predetermined number of screens corresponding to currently displayed participants at the first site as the screens whose images need to be switched, and then switches those screens to the images of the participants to be displayed, who are determined in descending order of the volume of each participant in the conference.
  • Because the participants to be displayed are determined in descending order of the volume of the participants in the current conference, participants who are currently taking part in the discussion but are located at different sites can be displayed.
  • Participants at the first site can therefore see the images of those taking part in the discussion, which improves the participants' experience.
  • an embodiment of the present invention provides a method for adjusting an image of a participant in a multi-screen video conference.
  • In this embodiment, the network side media processing device is specifically an MCU. The MCU first selects the participants with loud voices in the current conference, then selects the screens at the first site whose images need to be switched, and then switches the images displayed on those screens to the images of the selected loud participants.
  • the method specifically includes:
  • Each site sends the collected voice of the participant and the captured image of the participant to the MCU.
  • MCU initiates voice-activated switching.
  • the MCU initiates voice-activated switching in this step, that is, the MCU can perform voice-activated switching.
  • 203B: The MCU selects a predetermined number of participants to be displayed, in descending order of the volume of the participants in the current conference, starting from the participant with the highest volume.
  • The MCU's selection of a predetermined number of participants to be displayed indicates that the MCU is about to start voice-activated switching.
  • The predetermined number may be one or more. When it is more than one, the specific number may be set by the MCU, or set by a terminal and sent to the MCU, for example by the terminal of the chair site.
  • The MCU sorts the participants currently displayed on the screens of the first site according to the sorting condition and obtains the sorting result.
  • The sorting may be performed on demand, for example when the MCU starts voice-activated switching.
  • The specific sorting manner is the same as described for step 202A and is not repeated here.
  • the MCU selects a predetermined number of currently displayed participants according to the ranking result of the currently displayed participants of the screen of the first site, and determines a screen corresponding to the selected currently displayed participant as a screen for switching images.
  • the MCU controls the image displayed on the screen that needs to switch the image to be switched to the predetermined number of images of the participant to be displayed.
  • When the screens whose images need to be switched include at least two screens, and at least two of the participants to be displayed are located at the same second site, the images displayed by the at least two screens are switched to the images of those at least two participants such that the order of orientation of their images at the first site is the same as the order of their physical positions at the second site.
  • For example, the orientation order of the screens at the first site that display the participant images of area 1 and area 2 of the second site matches the orientation order of area 1 and area 2.
  • With this switching mode, the images of the at least two participants to be displayed keep, after switching, the same order as their physical positions at their original site, so that the participants displayed at the first site better preserve the positional relationship of the original site.
  • For example, the screens displaying the participant images of area 1 and area 2 of site A may be screen 1 and screen 2 of site B;
  • or they may be screen 2 and screen 3 of site B;
  • or they may be screen 1 and screen 3 of site B.
  • That is, the screens displaying the participant images of areas 1 and 2 of site A are ordered in the direction 1/2/3/4/5 (under the default correspondence described above, the number of the screen displaying the area 1 participant image must be smaller than the number of the screen displaying the area 2 participant image).
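  • One way to keep the position order when switching is to pair the selected screens and the selected participants after sorting both. The sketch below assumes, as in the example above, that a lower area number must land on a lower screen number; the function name and the participant labels are hypothetical.

```python
from typing import Dict, List


def assign_in_position_order(screens: List[int],
                             area_by_participant: Dict[str, int]) -> Dict[int, str]:
    """Map screens to participants so that area order follows screen order.

    `screens` are the screens selected for switching at the first site;
    `area_by_participant` gives each selected participant's area number
    at the original site.
    """
    ordered_screens = sorted(screens)
    ordered_participants = sorted(area_by_participant,
                                  key=lambda pid: area_by_participant[pid])
    return dict(zip(ordered_screens, ordered_participants))


# Screens 3 and 1 of site B were selected; two participants come from site A.
print(assign_in_position_order([3, 1], {"A/area2": 2, "A/area1": 1}))
```

  • Whichever pair of screens is selected (1 and 2, 2 and 3, or 1 and 3), the area 1 participant is placed on the lower-numbered screen.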
  • The MCU selects the screens whose images need to be switched according to the ranking of the participants displayed on the screens of the first site, and then switches those screens to the images of the participants selected in descending order of the volume of each participant in the conference.
  • Because the ranking is based on at least one of the voice volume, speaking time point, and speech duration of the participants displayed on the screens of the first site, the images of the participants who are currently speaking can be displayed on the screens of the first site.
  • Participants at the first site can therefore see the images of those currently taking part in the discussion, which improves the participants' experience.
  • an embodiment of the present invention provides a method for adjusting an image of a participant in a multi-screen video conference.
  • In this method, the network-side media processing device is an MCU. The MCU first selects the screen in the first site that needs to switch images, then selects the participant to be displayed in the current conference, and then switches the image displayed on the selected screen to the image of the loud participant to be displayed.
  • the method includes:
  • Each participant will send the collected voice of the participant and the captured image of the participant to the MCU.
  • MCU initiates voice-activated switching.
  • In this step, starting voice-activated switching means that the MCU is enabled to perform voice-activated switching.
  • 203C. The MCU sorts the participants currently displayed on the screens of the first site according to the sorting condition, and obtains the sorting result of the participants currently displayed on the screens of the first site.
  • For the specific sorting manner and sorting time, refer to the corresponding description of step 204B; details are not repeated here.
  • The MCU selects a predetermined number of currently displayed participants according to that sorting result, and determines the screens corresponding to the selected participants as the screens that need to switch images.
  • The MCU selects a predetermined number of participants to be displayed, in order of the volume of the participants in the current conference from largest to smallest, starting from the participant with the highest volume.
  • the MCU selects a predetermined number of participants to be displayed to indicate that the MCU is to start voice switching.
  • The predetermined number may be one or more. When the predetermined number is more than one, it is configurable: it may be set by a terminal, for example the terminal of a conference site, and sent to the MCU, that is, to the network-side media processing device.
  • The MCU controls the images displayed on the screens that need to switch images to be switched to the images of the predetermined number of participants to be displayed.
  • The MCU selects the screen that needs to switch images according to the ranking result of the participants currently displayed on the screens in the first site, and then switches that screen to the image of the participant selected according to the order of the volume of each participant in the conference. Because the ranking result is obtained by sorting according to at least one of the voice size, the speaking time point, and the speaking duration of the participants displayed on the screens in the first site, the images of the participants who are currently speaking can be displayed on the screens of the first site, so that the participants in the first site can see the images of the participants currently taking part in the discussion, which improves the participants' experience.
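  • The steps of this embodiment can be sketched as follows. This is only a minimal illustration under stated assumptions: the function `plan_switch`, the dictionaries it takes, and the choice of voice size as the only sorting condition are hypothetical, and skipping participants who are already displayed is an added assumption rather than something the embodiment states.

```python
def plan_switch(displayed, volumes, count):
    """displayed: {screen_no: participant_id} for the first site.
    volumes: {participant_id: current voice level} for the whole conference.
    Returns {screen_no: participant_id_to_show} for the screens to switch."""
    # Rank the currently displayed participants, quietest first; the
    # quietest `count` of them give up their screens.
    by_quietest = sorted(displayed, key=lambda s: volumes.get(displayed[s], 0.0))
    screens_to_switch = by_quietest[:count]
    # Candidates to be displayed: loudest first. Skipping anyone already
    # shown is an assumption of this sketch.
    shown = set(displayed.values())
    candidates = sorted(
        (p for p in volumes if p not in shown),
        key=lambda p: volumes[p],
        reverse=True,
    )[:count]
    return dict(zip(screens_to_switch, candidates))


displayed = {1: "E1", 2: "J1", 3: "G2"}
volumes = {"E1": 5, "J1": 4, "G2": 1, "K1": 9, "F2": 7}
print(plan_switch(displayed, volumes, 1))
```

  • With a predetermined number of 1, the quietest displayed participant (on screen 3) gives way to the loudest participant in the conference who is not yet displayed.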
  • an embodiment of the present invention provides a method for adjusting an image of a participant in a multi-screen video conference.
  • This method differs from the above two embodiments in that the terminal of the first site selects the screen that needs to switch images according to the sorting result of the participants currently displayed on the screens of the first site and then notifies the MCU, and the MCU controls the switching of the images displayed on the screens in the first site. The method specifically includes:
  • Each site sends the collected voice of the participant and the captured image of the participant to the MCU.
  • MCU initiates voice-activated switching.
  • the terminal of the first site sorts the currently displayed participants of the screen of the first site according to the sorting condition, and obtains the sorting result of the currently displayed participant of the screen of the first site.
  • For the specific sorting manner and sorting time, refer to the corresponding description of step 204B; details are not repeated here.
  • the terminal of the first site selects a predetermined number of currently displayed participants according to the ranking result of the participant currently displayed on the screen of the first site, and determines a screen corresponding to the selected participant as a screen for switching the image.
  • the terminal of the first site sends the number of the screen in the first site that needs to switch images to the MCU.
  • 206D. The MCU determines the predetermined number of participants to be displayed, in order of the volume of the participants in the current conference from largest to smallest, starting from the participant with the highest volume.
  • The predetermined number may be one or more; when it is more than one, it may be set by the terminal and sent to the MCU.
  • The MCU controls the images displayed on the screens that need to switch images to be switched to the images of the predetermined number of participants to be displayed.
  • The terminal of the first site selects the screen that needs to switch images according to the ranking result of the participants displayed on the screens in the first site, and the MCU then controls that screen to switch to the image of the participant selected according to the volume of each participant in the conference. As a result, the images of the participants who are currently speaking can be displayed on the screens of the first site, which enables the participants in the first site to see the images of the participants taking part in the discussion, thereby improving the participants' experience.
  • an embodiment of the present invention provides a method for adjusting an image of a participant in a multi-screen video conference.
  • In this method, the network-side media processing device is an MCU. The MCU first selects the image of the participant with the highest current voice as the image to be displayed, and then selects the screen that needs to switch images according to the voice size of the participants displayed on the screens in the first site. The method specifically includes:
  • Each site sends the collected voices of the participants and the images of the captured participants to the MCU.
  • the MCU initiates voice switching.
  • the MCU determines the participant with the highest current voice, and the participant with the highest voice is the participant to be displayed.
  • the MCU determines whether the switching condition is met. If yes, execute 305. If no, end the process.
  • Specifically, the MCU may determine whether the voice of the participant with the highest voice has lasted for a preset period of time; if so, the switching condition is met; otherwise, the switching condition is not met.
  • The MCU determines whether any of the participants currently displayed on the screens that can switch images in the first site is in the recent speaker list. If not, execute 306; if yes, execute 307.
  • The MCU determines, according to the voice size of the participants currently displayed on the screens that can switch images in the first site, that the screen showing the image of the participant with the smallest voice is the screen that needs to switch images, controls the image displayed on that screen to be switched from the image of the participant with the smallest voice to the image of the participant with the highest current voice, and ends the process.
  • the screen that can switch images in the first conference site is all screens in the first conference site or screens other than the predetermined screen, and the predetermined screen is a preset screen that cannot perform image switching.
  • the predetermined screen is a predetermined screen that cannot switch images, such as a screen for displaying conference data, or a screen for displaying a conference chairperson, or a screen for displaying a plurality of screens.
  • the multi-screen image can be used as the participant image with the smallest voice, so that the multi-screen image can be switched when the image is switched for the first time after the voice-activated switching is started.
  • The MCU determines whether all of the participants currently displayed on the screens that can switch images in the first site belong to the recent speaker list. If yes, execute 308; if no, execute 309.
  • 308. The MCU determines, according to the ranking result of the participants in the recent speaker list, that the screen showing the lowest-ranked participant is the screen that needs to switch images, controls the image displayed on that screen to switch to the image of the participant with the loudest voice, and ends the process.
  • The participants in the recent speaker list are sorted in the same manner as the currently displayed participants of the first site described above; details are not repeated here.
  • the latest speaker list may also be a list of images, that is, a list of images of participants who have recently spoken.
  • the MCU selects the participant with the smallest voice from among the currently displayed participants that are not in the latest speaker list, and uses the screen where the selected participant is located as the screen that needs to switch the image, and the MCU controls to switch the image displayed on the screen to The image of the participant with the loudest voice.
  • the participant with the smallest voice can be selected from the currently displayed participants that are not in the latest speaker list, and the screen where the participant with the smallest voice is located is a screen that needs to switch images, and the image that controls the screen display is switched to The image of the participant with the loudest voice.
  • This embodiment selects the participant to be switched out from among the currently displayed participants that do not belong to the recent speaker list, or, according to the ranking result of the participants in the recent speaker list, selects the lowest-ranked participant as the participant to be switched out.
  • This voice-activated switching method prevents the image of a participant who speaks frequently from being switched out, so that the users in the conference site can see the images of the participants taking part in the discussion, which improves the participants' experience. Further, as long as the voice of the loudest speaker satisfies the switching condition, the image of that speaker can be switched into the conference site, so that the users in the conference site can immediately see the image of the participant with the loudest voice, which further improves the participants' experience.
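  • The screen selection logic of steps 305 to 309 can be sketched as follows. The function `choose_screen` and its arguments are hypothetical names, the recent speaker list is assumed to be ordered from the most recent speaker to the least recent one, and at least one switchable screen is assumed to exist; none of these details is fixed by the embodiment.

```python
def choose_screen(displayed, volumes, recent_speakers):
    """displayed: {screen_no: participant_id} on switchable screens only.
    volumes: {participant_id: voice level}.
    recent_speakers: list ordered from most recent to least recent speaker
    (an assumed ranking; the embodiment allows other sorting conditions).
    Returns the number of the screen whose image should be replaced."""
    in_list = {s: p for s, p in displayed.items() if p in recent_speakers}
    not_in_list = {s: p for s, p in displayed.items() if p not in recent_speakers}
    if not_in_list:
        # Steps 306 and 309: replace the quietest participant who is
        # outside the recent speaker list.
        return min(not_in_list, key=lambda s: volumes.get(not_in_list[s], 0.0))
    # Step 308: every displayed participant spoke recently, so replace
    # the one ranked last in the recent speaker list.
    return max(in_list, key=lambda s: recent_speakers.index(in_list[s]))


displayed = {1: "F2", 2: "C2", 3: "K1"}
volumes = {"F2": 3, "C2": 2, "K1": 1}
print(choose_screen(displayed, volumes, ["K1", "C2", "F2"]))
```

  • Steps 306 and 309 collapse into one branch here because, when no displayed participant is in the list, the quietest participant outside the list is simply the quietest displayed participant.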
  • an embodiment of the present invention provides a method for adjusting an image of a participant in a multi-screen video conference.
  • This method differs from the embodiment shown in FIG. 3 in that the MCU first selects the screen that needs to switch images according to the voice size of the participants displayed on the screens in the first site, and then selects the participant with the highest current voice.
  • the method includes:
  • Each site sends the collected participant's voice and the obtained participant's image to the MCU.
  • the MCU initiates voice switching.
  • The MCU periodically determines whether any of the participants currently displayed on the screens that can switch images in the first site is in the recent speaker list. If not, execute 404; if yes, execute 405.
  • the cycle time can be preset, for example, one cycle is 2s, so step 403 is executed every two seconds.
  • the MCU selects, according to the voice size of the participant currently displayed on the screen of the first site that can switch images, the screen where the image of the participant with the smallest voice is located as the screen that needs to switch the image.
  • the definition of the screen that can switch the image in the first site is the same as the description of the corresponding part in the embodiment shown in FIG. 3, and details are not described herein again.
  • The MCU determines whether all of the participants currently displayed on the screens that can switch images in the first site belong to the recent speaker list. If yes, execute 406; if no, execute 407.
  • The MCU determines, according to the ranking result of the participants in the recent speaker list, that the screen showing the lowest-ranked participant is the screen that needs to switch images.
  • The participants in the recent speaker list are sorted in the same manner as the currently displayed participants of the first site described above; details are not repeated here.
  • the most recent speaker list may also be a list of images, that is, a list of images of participants who have recently spoken.
  • the MCU selects the participant with the smallest voice from among the currently displayed participants that are not in the latest speaker list, and uses the screen where the selected participant is located as the screen that needs to switch the image.
  • the MCU determines the speaker with the highest current voice, and the participant with the highest voice is the participant to be displayed.
  • the MCU determines whether the switching condition is met. If yes, execute 410. If no, the processing is not performed, and the process returns to step 403.
  • the MCU controls the image displayed on the screen of the image to be switched to the image of the participant with the loudest voice.
  • This embodiment selects the participant to be switched out from among the currently displayed participants that do not belong to the recent speaker list, or, according to the ranking result of the participants in the recent speaker list, selects the lowest-ranked participant as the participant to be switched out.
  • This voice-activated switching method prevents the image of a participant who speaks frequently from being switched out, so that the users in the conference site can see the images of the participants taking part in the discussion, which improves the participants' experience.
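  • The switching condition checked before the image is switched (whether the voice of the loudest participant has lasted for a preset period) can be sketched as a small state tracker. The class name `SwitchCondition`, the timestamps in seconds, and the idea of calling `update` once per cycle are assumptions of this sketch, not details taken from the embodiment.

```python
class SwitchCondition:
    """Tracks how long the same participant has been the loudest."""

    def __init__(self, hold_seconds):
        self.hold_seconds = hold_seconds
        self.current = None  # participant who is loudest right now
        self.since = None    # time at which that participant became loudest

    def update(self, loudest, now):
        """Feed the loudest participant observed at time `now` (seconds).
        Returns True once that participant has led for hold_seconds."""
        if loudest != self.current:
            # A different participant took over; restart the timer.
            self.current, self.since = loudest, now
        return now - self.since >= self.hold_seconds


condition = SwitchCondition(hold_seconds=2.0)
for t, speaker in [(0.0, "K1"), (1.0, "K1"), (2.0, "K1"), (2.5, "C3")]:
    print(t, speaker, condition.update(speaker, t))
```

  • Restarting the timer whenever the loudest participant changes keeps a brief interjection from triggering a switch.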
  • an embodiment of the present invention provides a method for adjusting an image of a participant in a multi-screen video conference.
  • This method differs from the embodiments shown in FIG. 3 and FIG. 4 in that the terminal of the first site selects the screen that needs to switch images according to the voice size of the participants displayed on the screens in the first site, and then notifies the MCU.
  • the method specifically includes:
  • Each venue sends the voice of the participant and the image of the participant to the MCU.
  • the MCU initiates voice switching.
  • The terminal of the first site determines whether any of the participants currently displayed on the screens that can switch images in the first site is in the recent speaker list. If not, execute 504; if yes, execute 505.
  • The cycle time can be preset; for example, if one cycle is 2 s, step 503 is executed every two seconds.
  • The terminal of the first site determines, according to the voice size of the participants currently displayed on the screens that can switch images in the first site, that the screen showing the image of the participant with the smallest voice is the screen that needs to switch images.
  • The definition of the screens that can switch images in the first site is the same as in the corresponding part of the embodiment shown in FIG. 3; details are not repeated here.
  • The terminal of the first site determines whether all of the participants currently displayed on the screens that can switch images in the first site belong to the recent speaker list. If yes, execute 506; if no, execute 507.
  • The terminal of the first site determines, according to the ranking result of the participants in the recent speaker list, that the screen showing the lowest-ranked participant is the screen that needs to switch images.
  • The participants in the recent speaker list are sorted in the same manner as the currently displayed participants of the first site described above; details are not repeated here.
  • the most recent speaker list may also be a list of images, that is, a list of images of participants who have recently spoken.
  • the terminal of the first site selects the participant with the smallest voice from among the currently displayed participants that are not in the latest speaker list, and uses the screen where the selected participant is located as the screen that needs to switch the image.
  • the terminal of the first site sends the number of the screen that needs to switch the image to the MCU.
  • The MCU determines the speaker with the highest current voice; that speaker is the participant to be displayed.
  • The MCU determines whether the switching condition is met. If yes, execute 511; if no, no processing is performed and the process ends.
  • The MCU controls the image displayed on the screen that needs to switch images to be switched to the image of the participant with the loudest voice.
  • When the recent speaker list is taken into account, the participant to be switched out is selected from among the participants that do not belong to the recent speaker list, or the lowest-ranked participant is selected according to the ranking result of the participants in the recent speaker list.
  • This voice-activated switching method prevents the image of a participant who speaks frequently from being switched out, so that the users in the site can see the images of the participants taking part in the discussion, thereby improving the participants' experience.
  • Because the terminal of the first site selects the screen that needs to switch images, the workload of the MCU and the requirements on the MCU are reduced.
  • For details on how to sort the participants, refer to the detailed description of step 202A; details are not repeated here.
  • the conference chair image can be controlled to be always in the speaker image list, and the multi-screen image is always in the speaker image list.
  • The conference chairperson image may enter the recent speaker list at the beginning of the conference, or may be placed in the recent speaker list after the conference chairperson speaks. Specifically, if the speaker with the loudest voice is the conference chairperson, the conference chairperson image is placed in the recent speaker list.
  • The speaker with the highest current voice can be placed in the recent speaker list. Specifically, that speaker may be placed in the recent speaker list after the image of the speaker has been switched onto a screen for display, or may be placed in the recent speaker list before the switch.
  • A screen that can achieve an eye-to-eye effect with the participant with the highest current voice is selected to display the image of that participant, or a screen adjacent to the screen that can achieve the eye-to-eye effect is selected to display the image. For example, if the participant with the highest current voice is the participant on the left side of site A, and the screen that can achieve the eye-to-eye effect with that participant is the screen on the left side of site B, the screen on the left side of site B is selected as the screen that needs to switch images, or the middle screen of site B is selected as the screen that needs to switch images.
  • Alternatively, a screen near the screen displaying a participant image of the same site is selected to display the image of the participant with the highest current voice.
  • The image on the first specific screen, or on the screen outside the first specific screen, is not switched.
  • The screen outside the first specific screen is the screen on the side of the first specific screen facing away from the geometric center line. For example, in a five-screen site, if the first specific screen is screen 4, the screen outside the first specific screen is screen 5; if the first specific screen is screen 2, the screen outside the first specific screen is screen 3.
  • In one case, each camera captures a group of participants, and that group of participants shares one or more MICs (microphones). The sound of this group of MICs represents one orientation of the site's sound (for example, the left, middle, or right position). Each site sends the sound of the MICs of the different orientations to the MCU, and during voice-activated switching the MCU switches to display the image corresponding to the MIC group with the loudest sound (this group of MICs corresponds to one orientation in one site).
  • In another case, multiple cameras capture the image of a group of participants or even of the entire site, and that group of participants shares one set of MICs. The sound of this set of MICs represents one sound orientation or the sound of the entire site (for example, with a single-channel voice protocol it represents the entire site). Each site sends the sound of the MICs of the different orientations to the MCU, and during voice-activated switching the MCU switches to display the image corresponding to the MIC group with the loudest sound (this group of MICs corresponds to one orientation in a site or to one site), that is, the image of a group of participants or of the entire site captured by the multiple cameras.
  • In both cases, another processing method is possible: each site selects the loudest few sounds from the orientation sounds corresponding to its MIC groups, that is, selects the sounds of several MIC groups, and sends the selected sounds to the MCU; the MCU then selects the MIC group with the loudest sound in the entire conference and switches to the corresponding image.
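  • The two-stage selection just described (each site preselects its loudest MIC groups, then the MCU picks the loudest group in the whole conference) can be sketched as follows. The function `loudest_group`, the orientation labels, and the numeric sound levels are illustrative assumptions.

```python
def loudest_group(site_levels, top_k=1):
    """site_levels: {site: {orientation: sound level of that MIC group}}.
    Each site forwards only its top_k loudest MIC groups; the MCU then
    picks the loudest forwarded group. Returns (site, orientation)."""
    forwarded = []
    for site, groups in site_levels.items():
        # Site-side preselection of the loudest orientations.
        best = sorted(groups.items(), key=lambda kv: kv[1], reverse=True)[:top_k]
        forwarded.extend((level, site, orientation) for orientation, level in best)
    # MCU-side selection across the whole conference.
    level, site, orientation = max(forwarded)
    return site, orientation


levels = {
    "A": {"left": 3, "middle": 7, "right": 1},
    "B": {"left": 9, "right": 2},
}
print(loudest_group(levels))
```

  • Preselecting at the site reduces what has to be sent to the MCU, at the cost of the MCU only seeing each site's loudest few orientations.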
  • As shown in FIG. 6A, the method for adjusting the display of participant images in a multi-screen video conference provided by the embodiment of the present invention is described in detail below by taking a three-screen conference site as an example.
  • Site A, site B, site C, and site D are all three-screen sites.
  • Site E, site F, and site G are all two-screen sites.
  • Site J and site K are single-screen sites.
  • Screens 1, 2, and 3 of site A respectively display the image captured by camera E1 in site E, the image captured by camera J1 in site J, and the image captured by camera G2 in site G. After voice-activated switching is started, the voices of the current participants change continuously, and the image switching process of site A includes:
  • The participant in the image captured by camera G2 currently has the smallest voice and the participant in the image captured by camera K1 has the largest voice; the image displayed on the screen of site A is switched from the image captured by camera G2 to the image captured by camera K1, and the participant captured by camera K1 is placed in the recent speaker list;
  • The participant in the image captured by camera F2 currently has the smallest voice and the participant in the image captured by camera K1 has the largest voice; since the image captured by camera K1 is already displayed on screen 3, no processing is performed;
  • The participant in the image captured by camera K1 currently has the smallest voice, the participant in the image captured by camera F2 has the second smallest voice, and the participant in the image captured by camera C3 has the largest voice. Because the recent speaker list is ordered by speaking time point from most recent to least recent and the participant captured by camera F2 is in its last position, the image displayed on screen 1 is switched from the image captured by camera F2 to the image captured by camera C3. Since cameras C2 and C3 belong to the same site, the screens displaying their images are then exchanged, so that screen 1 displays the image captured by camera C2 and screen 2 displays the image captured by camera C3.
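  • The final exchange in this example, where images from cameras of the same site are rearranged so that their screen order matches their camera order, can be sketched as follows. The function `keep_site_order` and the camera naming convention (a site letter followed by a camera index, such as "C2") are assumptions of this sketch.

```python
import re
from collections import defaultdict


def keep_site_order(screens):
    """screens: {screen_no: camera_id}, with camera ids such as "C2"
    (site C, camera 2). Returns a new mapping in which cameras of the
    same site appear on their screens in ascending camera order."""
    by_site = defaultdict(list)
    for screen_no, camera in screens.items():
        site, index = re.fullmatch(r"([A-Z]+)(\d+)", camera).groups()
        by_site[site].append((screen_no, int(index)))
    result = dict(screens)
    for site, entries in by_site.items():
        # Reuse the same screens, but hand them out in camera order.
        slots = sorted(screen_no for screen_no, _ in entries)
        cameras = sorted(index for _, index in entries)
        for slot, index in zip(slots, cameras):
            result[slot] = f"{site}{index}"
    return result


# Screen 1 has just been switched to C3 while screen 2 already shows C2.
print(keep_site_order({1: "C3", 2: "C2", 3: "K1"}))  # {1: 'C2', 2: 'C3', 3: 'K1'}
```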
  • Site A, site B, site C, and site D are all three-screen sites.
  • Site E, site F, and site G are all two-screen sites.
  • Site J and site K are single-screen sites.
  • Screens 1 and 2 of site E respectively display the image captured by camera E2 in site E and the image captured by camera J1 in site J. After voice-activated switching is started, the voices of the current participants change continuously, and the image switching process of site E includes:
  • 1) The participant in the image captured by camera J1 currently has the smallest voice and the participant in the image captured by camera F2 has the largest voice; the image displayed on screen 2 is switched from the image captured by camera J1 to the image captured by camera F2, and the participant captured by camera F2 is placed in the recent speaker list;
  • The participant in the image captured by camera E2 currently has the smallest voice and the participant in the image captured by camera C2 has the largest voice; the image displayed on screen 1 is switched from the image captured by camera E2 to the image captured by camera C2, and the participant captured by camera C2 is placed in the recent speaker list;
  • The participant in the image captured by camera C2 currently has the smallest voice, the participant in the image captured by camera K1 has the largest voice, and the participant captured by camera C2 is in the recent speaker list; the image displayed on screen 1 is switched from the image captured by camera C2 to the image captured by camera K1, and the participant captured by camera K1 is placed in the recent speaker list while the participant captured by camera C2 is deleted from it;
  • The participant in the image captured by camera F2 currently has the smallest voice and the participant in the image captured by camera K1 has the largest voice; since the image captured by camera K1 is already displayed on the screen, no processing is performed;
  • The participant in the image captured by camera K1 currently has the smallest voice and the participant in the image captured by camera C3 has the largest voice; the image displayed on screen 1 is switched from the image captured by camera K1 to the image captured by camera C3.
  • For a single-screen site, the image displayed on the screen is switched from the original image to the image of the participant with the largest current voice.
  • an embodiment of the present invention provides a method for adjusting an image display of a participant in a multi-screen video conference.
  • This method differs from the embodiments shown in FIG. 3, FIG. 4, and FIG. 5 in that, while considering the ranking of the participants currently displayed on the screens that can switch images in the first site, the MCU also considers the physical position of the screens in the first site. The method specifically includes:
  • Each site sends the voice of the participant and the image of the participant to the MCU.
  • the MCU initiates voice switching.
  • the MCU determines the participant with the highest current voice, and the participant with the highest voice is the participant to be displayed.
  • the MCU determines whether the switching condition is met. If yes, execute 705. If no, end the process.
  • Specifically, the MCU may determine whether the voice of the participant with the highest voice has lasted for a preset period of time; if so, the switching condition is met; otherwise, the switching condition is not met.
  • The MCU selects the last-ranked participant according to the ranking result of the participants currently displayed on the screens of the first site.
  • The MCU sorts the participants currently displayed on the screens of the first site according to the sorting condition, and obtains the sorting result of the participants currently displayed on the screens of the first site.
  • For the specific sorting manner and sorting time, refer to the corresponding descriptions of step 204B and step 202A; details are not repeated here.
  • the MCU determines whether the screen where the last participant is located is the first specific screen. If not, execute 707; if yes, execute 708.
  • For a description of the first specific screen, refer to the related description in step 202A; details are not repeated here.
  • The MCU determines that the screen that needs to switch images is the screen where the last-ranked participant is located.
  • The MCU selects the participant ranked immediately before the last-ranked participant, and determines that the screen that needs to switch images is the screen where that participant is located.
  • the MCU controls the screen that needs to switch images to switch to the participant image with the largest current voice.
  • In step 706, the MCU determines whether the screen where the last-ranked participant is located is the first specific screen. When the first site has four screens, five screens, or more, this step determines whether the screen where the last-ranked participant is located is the first specific screen or the screen outside the first specific screen.
  • The screen outside the first specific screen is the screen on the side of the first specific screen facing away from the center line of the screens. For example, in a five-screen site, if the first specific screen is screen 4, the screen outside the first specific screen is screen 5; if the first specific screen is screen 3, the screen outside the first specific screen is screen 4.
  • When the first site has five screens, after the participant ranked immediately before the last-ranked participant is selected, it is further determined whether the screen where that participant is located is the first specific screen or the screen outside the first specific screen. If not, the screen that needs to switch images is determined to be the screen of that participant; if yes, the participant ranked third from last is found according to the sorting result, and the screen that needs to switch images is determined to be the screen where that participant is located. For example, for a five-screen site, assume that the first specific screen is screen 4: when the last-ranked participant is located on screen 4, the participant ranked immediately before the last-ranked participant is looked up; if that participant is located on screen 5, the participant ranked third from last is looked up, and the screen that needs to switch images is determined to be the screen where that participant's image is located.
  • While considering the ranking of the participants displayed on the screens that can switch images in the first site, the MCU also considers the physical position of the screens in the first site. This avoids switching the image of the participant with the largest voice onto a screen on which the eye-to-eye effect cannot be achieved, and thus improves the participants' experience.
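  • The selection described in this embodiment (walk the ranking from the last position upward and skip participants shown on the first specific screen or on the screen outside it) can be sketched as follows. The function `screen_to_switch` and the `protected` set are hypothetical names; which screens are protected is supplied by the caller and is not computed here.

```python
def screen_to_switch(ranking, screen_of, protected):
    """ranking: participants currently displayed, best-ranked first.
    screen_of: {participant_id: screen_no}.
    protected: screen numbers that must keep their image (the first
    specific screen and, in larger sites, the screen outside it).
    Returns the screen of the lowest-ranked participant shown on an
    unprotected screen, or None if every screen is protected."""
    for participant in reversed(ranking):
        if screen_of[participant] not in protected:
            return screen_of[participant]
    return None


# Five-screen site with screen 4 as the first specific screen and
# screen 5 as the screen outside it.
ranking = ["a", "b", "c", "d", "e"]
screen_of = {"a": 1, "b": 2, "c": 3, "d": 5, "e": 4}
print(screen_to_switch(ranking, screen_of, protected={4, 5}))  # 3
```

  • In this usage example the last-ranked participant is on screen 4 and the one before is on screen 5, so the participant ranked third from last, on screen 3, is the one replaced.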
  • The solution is also applicable to the scenario in which the MCU first selects the screen that needs to switch images and then selects the participant with the highest voice, and likewise to the scenario in which the terminal of the first site selects the screen that needs to switch images.
  • The MCU can perform image switching on the screens that need to switch images in each site according to the solution provided in the foregoing embodiments. Alternatively, if the conference has a chairperson, the MCU first selects, in the chair site, the screen that needs to switch images according to the sorting result of the participants currently displayed on each screen in the chair site, and controls the image displayed on that screen to switch to the image of the participant to be displayed. Then, according to the position of the selected screen in the chair site and the positions of the screens in the other sites, the MCU controls the image of the participant to be displayed to be switched onto the corresponding screen in each of the other sites, where the corresponding screen in another site has the same number as the selected screen.
  • A three-screen site can designate screen 3 to display the participant who is currently the loudest.
  • The designated screen 3 displays the image of the participant who is currently the loudest; as shown in FIG. 6D, the designated screen 2 displays the image of the participant who is currently the loudest.
  • The screen designated to display the loudest participant can be changed according to policy. A single-screen site can show either the image of the participant who is currently the loudest or a multi-picture image (images of several participants shown in several sub-pictures), in which the image of the loudest participant is one of the sub-pictures. So that the participant who is currently the loudest and the participants in the local site achieve a better eye-to-eye effect, the image of the loudest participant can always be displayed on the main screen.
  • The site adjusts the camera to face the participants in the site and sends the image to the far end. A three-screen site can also designate the left screen to display the multi-picture image, the middle screen to display the conference chair, and the right screen to display the participant who is currently the loudest.
  • The method may further include: the MCU controls the panoramic image of the site of the loudest participant to be image-processed and then superimposed on a partial area of the image of the participant who is currently the loudest. Specifically, the MCU shrinks the panoramic image of that site and superimposes the reduced panoramic image on a partial area of the image of the loudest participant.
  • the F site is a site with three cameras, three screens, and three regions.
  • the three cameras respectively capture the image of the participant in the corresponding area, and the terminal in the F site transmits the image of the participant in each area.
  • The MCU controls screen 1 of site A (assumed to be a three-screen site) to display the participant image captured by camera F1 (assuming that this participant is the loudest). Assume that the three screens in site A respectively display the participant images captured by camera F1, camera C2, and camera G2 (see FIG. 8).
  • The MCU stitches the three participant images captured by the three cameras (F1, F2, F3) in site F into a panoramic image, shrinks the panoramic image, and controls screen 1 in site A to superimpose the reduced panoramic image on the participant image captured by camera F1. The site name may also be superimposed on the panoramic image, or on another area of the participant image captured by camera F1.
  • The site sends voice data carrying orientation information; that is, the voice data sent to the MCU carries the correspondence between the voice data and the camera video data. When processing the data, the MCU matches the image and audio presented at the destination site according to the number of screens, the number of speakers, and so on at the destination site, so that the sound is played from the speaker near the screen on which the corresponding image is displayed.
  • the adjacent participant is the participant adjacent to the participant.
  • The MCU can mix the voice of the participant into the channel corresponding to the adjacent participant, so that the voices of the participant and the adjacent participant are played together from the playback device corresponding to the screen that displays the adjacent participant's image, as shown in FIG. 9.
  • The participant images captured by cameras F2, F3, G2, and C2 are displayed on the four screens of the four-screen site B. Assume that the cameras of the four-screen site F are ordered F1, F2, F3, and F4. If the participant captured by camera F1 is speaking, the MCU mixes the voices of the participant captured by F1 and the participant captured by F2 (that is, the participant adjacent to the one captured by F1) and plays the mix from the playback device corresponding to the screen displaying the image captured by camera F2, so that the participants at site B, hearing both voices from that playback device, can tell that the two participants are adjacent. If the participant captured by camera F4 is speaking, the MCU mixes the voices of the participants captured by F3 and F4 (that is, the participant adjacent to the one captured by F4) and plays the mix from the playback device corresponding to the screen displaying the image captured by camera F3, so that the participants at site B can likewise tell that the two participants are adjacent. In this way, the participants at site B can determine the physical positional relationship of the sound sources from the sound emitted by the playback devices.
  • When the voice of the participant captured by camera F1 becomes loud, the image of that participant needs to be displayed, and the sound is played from the playback device corresponding to the screen on which the image is displayed. For example, if the image of the participant captured by camera F1 is switched to screen 4, the sound for that image should be played from the playback device corresponding to screen 4.
  • When the participant image captured by camera F1 is switched to screen 4, a sound transition may be used so that the sound for the image does not jump abruptly from the playback device corresponding to screen 1 to the playback device corresponding to screen 4. For example, the sound is first attenuated by 3 dB when played on the playback device corresponding to screen 1 and also attenuated by 3 dB when played on the playback device corresponding to screen 4, so that the loudness heard by the participants matches the actual loudness; then the sound of the playback device corresponding to screen 1 is gradually reduced and the sound of the playback device corresponding to screen 4 is gradually increased, until the sound has moved to the playback device corresponding to screen 4. The attenuation values used during the transition can be determined according to the relative positions of the two screens.
  • Specifically, the video source may be configured in the following ways:
  • First mode: when voice-activated switching starts, the same video source is configured on the screen in each site that corresponds to the image of a given participant. For example, with three three-screen sites (site 1, site 2, and site 3), the image of the participant in area 1 of site 1 achieves the eye-to-eye effect when displayed on screen 3 of each site, so screen 3 of each site is configured with the same video source. Likewise, screen 2 of each site is configured with the same video source, and so is screen 1 of each site. Then, when voice-activated switching occurs and the MCU selects the screen that needs to switch images for each site, the image of the loudest participant can be switched to the same-numbered screen in every site. That is, when the sites have the same number of screens, the same video source is configured for the screens with the same number in each site.
  • Second mode: obtain the image of the participant who is currently the loudest and determine whether the second specific screen in the site can display that image. If so, control the second specific screen to display the image of the loudest participant. If not, examine the other screens in the site in order of increasing physical distance from the second specific screen, determining for each whether it can display the image of the loudest participant, until such a screen is found, and control the screen found to display the image of the participant who is currently the loudest. The second specific screen is the screen on which the eye-to-eye effect with the loudest participant can be achieved. For a description of the second specific screen, refer to the corresponding description of the first embodiment; details are not repeated here.
  • The site in this mode refers to any site in the video conference. Processing every site in this manner ensures that same-numbered screens in sites with the same number of screens have the same video source. With this method, the image of the loudest participant is first switched to the corresponding screen according to the second mode above; once same-numbered screens in sites with the same number of screens have the same video source, subsequent switching follows the schemes described in the embodiments shown in FIG. 2B, FIG. 2C, FIG. 2D, FIG. 3, FIG. 4, FIG. 5, and FIG.
  • Determining whether the second specific screen in the site can display the image of the loudest participant may be: determining whether the second specific screen is currently displaying the conference chair image, and if so, the second specific screen cannot display the image of the participant who is currently the loudest; determining whether the second specific screen is currently displaying the multi-picture image, and if so, the second specific screen cannot display that image; determining whether the second specific screen is currently displaying a participant in the recent speaker list, and if so, the second specific screen cannot display that image. When the image currently displayed on the second specific screen is neither the multi-picture image, nor the conference chair image, nor the image of a participant in the recent speaker list, the image of the loudest participant can be displayed on the second specific screen.
  • Determining, in order of increasing physical distance from the second specific screen, whether each of the other screens in the site can display the image of the loudest participant may be: taking the other screens in order of increasing physical distance from the second specific screen, determining in turn whether the screen is currently displaying the conference chair image, and if so, that screen cannot display the image of the participant who is currently the loudest; or determining in turn whether the screen is currently displaying the multi-picture image, and if so, that screen cannot display that image; or determining in turn whether the screen is currently displaying the image of a participant in the recent speaker list, and if so, that screen cannot display that image. Only when the image currently displayed on the screen being examined is neither the multi-picture image, nor the conference chair image, nor the image of a participant in the recent speaker list can the image of the loudest participant be displayed on that screen.
  • Third mode: if the conference has a chairperson, first select a screen in the chair site according to the loudness of the participants shown on each screen in the chair site, using the screen-selection scheme of the embodiments shown in FIG. 3, 4, 5, and 7, and switch that screen in the chair site to the image of the loudest participant. Then, according to the position of the selected screen in the chair site and the positions of the screens in the other sites, control the corresponding screen in each other site to switch to the image of the loudest participant. The corresponding screen in another site is the screen whose physical position within that site's screen group is the same as the physical position of the selected screen within the chair site's screen group; alternatively, it is the screen with the same number as the selected screen.
  • Alternatively, a screen may first be selected in any one site according to the loudness of the participants shown on each screen in that site, using the screen-selection scheme of the embodiments shown in FIG. 3, 4, 5, and 7. That screen is controlled to switch to the image of the loudest participant, and then, in the same manner as above, the corresponding screens in the other sites are controlled to switch to the image of the loudest participant. This ensures that same-numbered screens in sites with the same number of screens have the same video source.
  • The participant who is currently the loudest is switched to the corresponding screen. For example, with three three-screen sites, when the voice of the loudest participant satisfies the switching condition, the image of that participant is switched to the left screen of the three sites; the voices of the participants change constantly, and the image is switched when the voice of the loudest participant meets the switching condition.
  • The method may further include: the MCU controls the image of the loudest participant to replace one of the pictures in the multi-picture image, so that the image of the loudest participant is shown in the multi-picture image. In this way, while the image of the loudest participant is displayed full screen on one screen of a site, it is simultaneously displayed in the multi-picture image in the same site.
  • An embodiment of the present invention provides a network-side media processing device, including: a participant selection unit 100, configured to determine a predetermined number of participants to be displayed in sequence, starting from the loudest participant, in descending order of the volume of the participants in the current conference;
  • the screen selection unit 300 is configured to determine a predetermined number of screens corresponding to the currently displayed participants in the first site as a screen for switching images.
  • the first control switching unit 400 is configured to control an image displayed by the screen on which the image needs to be switched to be switched to the predetermined number of images of the participant to be displayed.
  • the device also includes:
  • The sorting unit 200 is configured to sort the participants currently displayed on the screens of the first site according to a sorting condition to obtain the sorting result of those participants, where the sorting condition is one of the following:
  • the loudness of the currently displayed participants, how recently they spoke, how long they spoke, the number of times the participants currently displayed on the screens of the first site have spoken, and whether the screen corresponding to a participant currently displayed on the screens of the first site is the main screen.
  • the screen selection unit 300 is specifically configured to determine, according to the ranking result of the participant currently displayed on the screen of the first site, a predetermined number of screens corresponding to the currently displayed participant in the first site as the screen for switching the image.
  • The predetermined number may be one. Referring to FIG. 12, the screen selection unit 300 includes: a determining subunit 3001, configured to determine whether the participants displayed on the screens capable of switching images in the first site belong to the recent speaker list; and a first screen selection subunit 3002, configured to, when some of the participants displayed on the screens capable of switching images in the first site are in the recent speaker list, select, from the displayed participants who are not in the recent speaker list, the screen of the currently displayed participant ranked last in the sorting result as the screen that needs to switch images. For how the participants in the recent speaker list are sorted, refer to the corresponding description in the method embodiment; details are not repeated here.
  • the predetermined number is one; referring to FIG.
  • The screen selection unit 300 includes: a first selection subunit 3004, configured to select, according to the sorting result of the participants currently displayed on the screens of the first site, the screen of the currently displayed participant ranked last; a specific screen determining subunit 3005, configured to determine whether the screen of the currently displayed participant ranked last is the first specific screen; a second selection subunit 3006, configured to select, when the determination result of the specific screen determining subunit 3005 is yes, the screen of the currently displayed participant ranked immediately before the last one; and a determining subunit 3007, configured to determine, when the determination result of the specific screen determining subunit 3005 is no, that the screen that needs to switch images is the screen selected by the first selection subunit 3004, and, when the determination result is yes, that the screen that needs to switch images is the screen selected by the second selection subunit 3006.
  • The first control switching unit 400 is specifically configured to: when at least two of the predetermined number of images of participants to be displayed come from the second site, control at least two of the screens that need to switch images to switch to the images of the at least two participants to be displayed, so that the directional order of the images of the at least two participants displayed in the first site is the same as the order of their physical positions in the second site.
  • The device further includes: a control superimposing unit 500, configured to control the panoramic image of the site of the loudest participant to be displayed to be image-processed and then superimposed on a partial area of the image of that participant; specifically, to control the panoramic image of the site of the participant to be displayed who is currently the loudest to be shrunk and then superimposed on a partial area of the image of that participant.
  • The device further includes: a video source control unit 600, configured to control same-numbered screens in sites with the same number of screens to have the same video source.
  • The video source control unit 600 may specifically include: a first determining subunit 6001, configured to determine whether the second specific screen in the first site can display the image of the participant to be displayed who is currently the loudest; a second determining subunit 6002, configured to determine, when the determination result of the first determining subunit 6001 is no, the screen in the first site that is physically closest to the second specific screen and can display the image of the loudest participant to be displayed; and a control display subunit 6003, configured to control the second specific screen to display the image of the loudest participant when the determination result of the first determining subunit is yes, and to control the screen found by the second determining subunit to display the image of the loudest participant to be displayed when the determination result is no.
  • The device further includes: a second control switching unit 700, configured to control the image displayed on the corresponding screen of each site other than the first site to switch to the images of the predetermined number of participants to be displayed, where the corresponding screen of another site has the same number as the selected screen of the first site that needs to switch images.
  • The apparatus further includes: a multi-picture image control display unit 800, configured to stitch the image of the loudest participant to be displayed together with several other images into a multi-picture image and control another screen of the first site to display the multi-picture image, where the other screen is one or more screens of the first site other than the selected screens that need to switch images.
  • A predetermined number of screens are selected from the screens of the first site as the screens that need to switch images, and those screens are then switched to the images of the predetermined number of participants. This avoids the prior-art limitation that the image captured by a given camera can only be displayed on a specific screen of the remote site (that is, the screen that corresponds to the image by default). Voice-activated switching can be performed per screen, so that users in the site see the images of the participants taking part in the discussion, which improves the participants' experience.
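The sound transition described in the list above (both playback devices start about 3 dB down, then the old device fades out while the new one fades in) can be sketched as follows. This is only an illustration: the constant-power sine/cosine law, the function names `transition_gains` and `to_db`, and the step count are assumptions, since the text only says that the attenuation values may depend on the relative positions of the two screens.

```python
import math


def transition_gains(steps: int) -> list[tuple[float, float]]:
    """Return (old_device_gain, new_device_gain) pairs for each step of the move.

    Assumption: a constant-power law, so the summed power of the two
    playback devices stays equal to the original power at every step.
    """
    if steps < 1:
        raise ValueError("steps must be at least 1")
    gains = []
    for i in range(steps + 1):
        # The angle runs from 45 degrees (both devices about 3 dB down)
        # to 90 degrees (old device silent, new device at full level).
        theta = math.pi / 4 * (1 + i / steps)
        gains.append((math.cos(theta), math.sin(theta)))
    return gains


def to_db(gain: float) -> float:
    """Convert a linear amplitude gain to decibels."""
    return 20 * math.log10(gain) if gain > 0 else -math.inf


if __name__ == "__main__":
    for old_gain, new_gain in transition_gains(steps=4):
        print(f"screen 1 device: {to_db(old_gain):8.2f} dB   "
              f"screen 4 device: {to_db(new_gain):8.2f} dB")
```

A linear ramp would also move the sound, but it would let the perceived loudness dip in the middle of the move; the constant-power law is one common way to keep the heard level steady, which is the stated goal of the 3 dB starting point.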

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Two-Way Televisions, Distribution Of Moving Picture Or The Like (AREA)
  • Telephonic Communication Services (AREA)

Description

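Method and apparatus for adjusting the display of participant images in a multi-screen video conference
This application claims priority to Chinese Patent Application No. 201010279924.1, filed with the Chinese Patent Office on September 9, 2010 and entitled "Method and apparatus for adjusting the display of participant images in a multi-screen video conference", which is incorporated herein by reference in its entirety.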
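Technical Field
The present invention relates to the field of communications technologies, and in particular, to a method and apparatus for adjusting the display of participant images in a multi-screen video conference.
Background
A video conferencing service is a multimedia communication service that uses video terminals and a communication network to hold a conference, and enables the simultaneous exchange of images, voice, and data between two or more locations. A terminal in a site compresses and encodes the image signal captured by the local camera and the participants' sound signals picked up by the microphones in the participant area, and sends them over the transmission network to the remote site. At the same time, it receives the digital signal from the remote site over the transmission network and decodes it to obtain the images and signals of the participants at the remote site. As video conferencing has developed, sites have evolved from one camera, one display, and one participant area to multiple cameras, multiple displays, and multiple participant areas, which are associated with one another within the same site through physical or logical relationships.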
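The prior art provides a per-site voice-activated switching method, in which a multipoint control server in the communication network (a Multipoint Control Unit, MCU, is used as an example) identifies the speaker who is currently the loudest and switches the images of all participants in that speaker's site to the target sites, where the target sites are all sites in the conference other than the site of the loudest speaker.
The prior art has the following disadvantage:
In the prior art, a target site can only display the images of participants from a single site, namely the site of the loudest participant. Consequently, if the participants currently taking part in the discussion are at different sites, the participants in the target site cannot see the images of those currently taking part in the discussion.
Summary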
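Embodiments of the present invention provide a method and apparatus for adjusting the display of participant images in a multi-screen video conference, which enable flexible per-screen voice-activated switching and improve the participants' experience.
In view of this, embodiments of the present invention provide:
A method for adjusting the display of participant images in a multi-screen video conference, including:
determining a predetermined number of participants to be displayed in sequence, starting from the loudest participant, in descending order of the volume of the participants in the current conference;
determining the screens corresponding to a predetermined number of currently displayed participants in a first site as the screens that need to switch images; and
controlling the images displayed on the screens that need to switch images to switch to the images of the predetermined number of participants to be displayed.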
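A network-side media processing device, including:
a participant selection unit, configured to determine a predetermined number of participants to be displayed in sequence, starting from the loudest participant, in descending order of the volume of the participants in the current conference;
a screen selection unit, configured to determine the screens corresponding to a predetermined number of currently displayed participants in a first site as the screens that need to switch images; and
a first control switching unit, configured to control the images displayed on the screens that need to switch images to switch to the images of the predetermined number of participants to be displayed.
In the embodiments of the present invention, the screens corresponding to a predetermined number of currently displayed participants in the first site are determined as the screens that need to switch images, and the images on those screens are then switched to the images of the participants to be displayed, who are determined in descending order of the volume of the participants in the conference. Because the participants to be displayed are selected in descending order of volume in the current conference, participants who are currently taking part in the discussion and are located at different sites can be displayed, so that the participants in the first site see the images of the participants taking part in the discussion, which improves the participants' experience.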
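Brief Description of the Drawings
To describe the technical solutions of the embodiments of the present invention more clearly, the accompanying drawings needed for the embodiments are briefly introduced below. Evidently, the drawings described below show only some embodiments of the present invention, and a person of ordinary skill in the art may derive other drawings from them without creative effort.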
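FIG. 1 is a schematic structural diagram of a multi-screen site;
FIG. 2A is a flowchart of a method for adjusting the display of participant images in a multi-screen video conference according to an embodiment of the present invention;
FIG. 2B is a flowchart of a method for adjusting the display of participant images in a multi-screen video conference according to another embodiment of the present invention;
FIG. 2C is a flowchart of a method for adjusting the display of participant images in a multi-screen video conference according to still another embodiment of the present invention;
FIG. 2D is a flowchart of a method for adjusting the display of participant images in a multi-screen video conference according to yet another embodiment of the present invention;
FIG. 3 is a flowchart of a method for adjusting the display of participant images based on a recent speaker list according to an embodiment of the present invention;
FIG. 4 is a flowchart of another method for adjusting the display of participant images based on a recent speaker list according to an embodiment of the present invention;
FIG. 5 is a flowchart of still another method for adjusting the display of participant images based on a recent speaker list according to an embodiment of the present invention;
FIG. 6A is a schematic diagram of switching the images on the screens of a three-screen site by using the method of FIG. 3, 4, or 5 according to an embodiment of the present invention;
FIG. 6B is a schematic diagram of switching the images on the screens of a two-screen site by using the method of FIG. 3, 4, or 5 according to an embodiment of the present invention;
FIG. 6C is a schematic diagram of switching the images on the screens of a three-screen site by designating the screen that displays the image of the loudest speaker according to an embodiment of the present invention;
FIG. 6D is a schematic diagram of switching the images on the screens of a two-screen site by designating the screen that displays the image of the loudest speaker according to an embodiment of the present invention;
FIG. 7 is a flowchart of a method for adjusting the display of participant images that takes the positions of the screens in the site into account according to an embodiment of the present invention;
FIG. 8 is a schematic diagram of a site superimposing a multi-picture image on the image of the loudest speaker according to an embodiment of the present invention;
FIG. 9 is a schematic diagram of a playback device in a site playing mixed audio (the voices of multiple participants at the remote site) according to an embodiment of the present invention;
FIG. 10 is a schematic diagram of displaying a multi-picture image while displaying the image of the loudest participant according to an embodiment of the present invention;
FIG. 11 is a structural diagram of a network-side media processing device according to an embodiment of the present invention;
FIG. 12 and FIG. 13 are structural diagrams of the screen selection unit;
FIG. 14 is a structural diagram of the video source control unit.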
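Detailed Description
To make the objectives, technical solutions, and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments are described clearly and completely below with reference to the accompanying drawings. Evidently, the described embodiments are some rather than all of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present invention without creative effort fall within the protection scope of the present invention.
Referring to FIG. 2A, an embodiment of the present invention provides a method for adjusting the display of participant images in a multi-screen video conference. The method specifically includes:
201A. Determine a predetermined number of participants to be displayed in sequence, starting from the loudest participant, in descending order of the volume of the participants in the current conference.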
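Regarding the descending order of the participants' volume: when the display of participant images needs to be adjusted, the volume energy values of each participant's speech over a period of time are collected. The period may be a period preceding the moment at which the participant images need to be adjusted, and its length may be set by the user. The predetermined number may be one, in which case the participant determined is the loudest participant. Alternatively, the predetermined number may be more than one, in which case it may be set by the network-side media processing device, or set by a terminal and sent to the network-side media processing device; for example, the terminal of the chair site sets it and sends it to the network-side media processing device.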
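Step 201A can be sketched as follows. This is a minimal illustration only: the text does not say how the energy values are aggregated over the period, so summing them is an assumption, and the function name `loudest_participants` and the participant labels are hypothetical.

```python
def loudest_participants(energy_by_participant: dict[str, list[float]],
                         count: int) -> list[str]:
    """Pick `count` participants in descending order of speech energy.

    `energy_by_participant` maps a participant label to the volume energy
    values measured during the period before the adjustment. Summing the
    values is an assumed aggregation, not one stated in the text.
    """
    totals = {pid: sum(values) for pid, values in energy_by_participant.items()}
    ranked = sorted(totals, key=lambda pid: totals[pid], reverse=True)
    return ranked[:max(count, 0)]


if __name__ == "__main__":
    window = {"A1": [1.0, 2.0], "B2": [5.0, 5.0], "C1": [3.0, 3.0]}
    print(loudest_participants(window, count=2))
```

With a predetermined number of one, the result is just the participant who is currently the loudest.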
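202A. Determine the screens corresponding to a predetermined number of currently displayed participants in the first site as the screens that need to switch images.
Specifically, the screens corresponding to a predetermined number of currently displayed participants in the first site may be determined as the screens that need to switch images according to the user's own selection, according to a designation made by the administrator during the conference, or according to the sorting result of the participants currently displayed on the screens of the first site. The sorting result is obtained under sorting conditions that include one of the following: the loudness of the currently displayed participants, how recently the currently displayed participants spoke, how long the currently displayed participants spoke, the number of times the participants currently displayed on the screens of the first site have spoken, and whether the screen corresponding to a participant currently displayed in the first site is the main screen. The sorting may be performed in one of the following ways: by loudness, from loudest to quietest; by time of speaking, from most recent to least recent; by speaking duration, from longest to shortest; or by number of times spoken, from most to fewest. In addition, whether the screen corresponding to a currently displayed participant is the main screen may serve as an additional sorting condition: currently displayed participants of the first site whose screen is the main screen are ranked ahead of those whose screen is not the main screen.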
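In a video conference, the quietest participants are generally not taking part in the discussion, while the louder ones are. So that the screen of a participant who is not taking part in the discussion can be selected as the screen to be switched, the loudness of the currently displayed participants is used as one sorting condition. A participant who spoke recently is generally more likely to speak again than one who spoke longer ago, so how recently the currently displayed participants spoke is used as one sorting condition. A participant who spoke for a long time is generally more likely to speak again than one who spoke briefly, so the speaking duration of the currently displayed participants is used as one sorting condition. A person who speaks often is generally more likely to speak again, so, to better estimate the probability that a participant will speak, the number of times a participant has spoken can be used as one sorting condition. In addition, in a site with an odd number of display screens, the middle screen is the main screen; in a site with an even number of display screens, the two screens adjacent to the central axis are the main screens. The main screen generally presents the main participants of the conference, such as the conference chair. Therefore, to better account for the participants presented on the main screen, whether the screen corresponding to a participant currently displayed in the first site is the main screen can be used as a sorting condition.
For the different sorting conditions, weights may be set according to their relative importance (as an example, the weights assigned to all the sorting conditions may be normalized to sum to 1, although the weights may also be designed not to sum to 1), a value range is defined for the factor of each sorting condition according to its characteristics, and the weighted sum of these factors is then computed as the sorting reference value.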
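For example, assume that the weight of participant loudness is 0.1, the weight of how recently the participant spoke is 0.4, the weight of speaking duration is 0.2, the weight of the number of times spoken is 0.2, and the weight of whether the participant's screen is the main screen is 0.1, so that the weights of all these factors sum to 1. Each factor has its own value. The loudness takes a value from 1 to 10, larger for a louder voice and smaller for a quieter one, where each participant's loudness is the loudness at that participant's most recent time of speaking. The time of speaking takes a value from 1 to 1000 and is the time at which the participant most recently spoke, where the start of the conference may be recorded as 1 and the value is incremented by 1 for every minute that passes. The speaking duration takes a value from 1 to 500, in minutes, and may be the duration of the participant's most recent speech or the accumulated speaking time within a specific period, such as the participant's total speaking time within one hour. The number of times spoken takes a value from 1 to 100 and may be the number within a specific period, such as within one hour, or the total number counted from the start of the conference. The screen factor takes the value 0 or 1: it is 1 if the participant's screen is the main screen and 0 otherwise; for a three-screen or five-screen site the middle screen is the main screen, and for a four-screen site the middle two screens may be regarded as the main screens. The sorting reference value of each participant is then calculated according to the following formula:
Sorting reference value of a participant = participant's loudness × loudness weight + participant's time of speaking × time-of-speaking weight + participant's speaking duration × speaking-duration weight + participant's number of times spoken × number-of-times weight + participant's screen value × screen weight.
The participants are then sorted in descending order of the sorting reference value, and the screens corresponding to the predetermined number of participants ranked last are selected as the screens that need to switch images.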
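The weighted ranking just described can be sketched as follows, using the example weights from the text (0.1, 0.4, 0.2, 0.2, 0.1). The class name `Displayed`, its field names, and the sample values are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass

# Example weights taken from the text; they sum to 1.
WEIGHTS = {
    "loudness": 0.1,
    "last_spoke": 0.4,
    "duration": 0.2,
    "count": 0.2,
    "main_screen": 0.1,
}


@dataclass
class Displayed:
    """A participant currently shown on a screen of the first site."""
    screen: int
    loudness: float     # 1..10, larger means louder
    last_spoke: float   # 1..1000, minutes since conference start plus 1
    duration: float     # 1..500, minutes
    count: float        # 1..100, number of times spoken
    main_screen: int    # 1 if shown on the main screen, else 0


def reference_value(p: Displayed) -> float:
    """Weighted sum of the five factors (the sorting reference value)."""
    return (p.loudness * WEIGHTS["loudness"]
            + p.last_spoke * WEIGHTS["last_spoke"]
            + p.duration * WEIGHTS["duration"]
            + p.count * WEIGHTS["count"]
            + p.main_screen * WEIGHTS["main_screen"])


def screens_to_switch(displayed: list[Displayed], n: int) -> list[int]:
    """Return the screens of the n participants ranked last."""
    if n <= 0:
        return []
    ranked = sorted(displayed, key=reference_value, reverse=True)
    return [p.screen for p in ranked[-n:]]


if __name__ == "__main__":
    shown = [
        Displayed(screen=1, loudness=8, last_spoke=50, duration=10, count=5, main_screen=0),
        Displayed(screen=2, loudness=3, last_spoke=10, duration=2, count=1, main_screen=1),
        Displayed(screen=3, loudness=5, last_spoke=40, duration=5, count=3, main_screen=0),
    ]
    print(screens_to_switch(shown, n=1))
```

Because the time-of-speaking factor ranges up to 1000 while loudness only ranges up to 10, the unnormalized value ranges make recency dominate the result far more than the weights alone suggest; scaling each factor to a common range before weighting would be one way to change that, though the text does not require it.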
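It should be noted that, when the participants currently displayed on the screens of the first site are sorted, only the loudness of the participants may be considered, in which case they are sorted from loudest to quietest; or only how recently each participant spoke may be considered, in which case they are sorted from most recent to least recent; or only the speaking duration may be considered, in which case they are sorted from longest to shortest. It is also possible to consider only the loudness and how recently each participant spoke, without the other conditions. Assume that the weight of loudness is 0.4 and the weight of the time of speaking is 0.6. The loudness may take a value from 1 to 10, larger for a louder voice and smaller for a quieter one, where each participant's loudness is the loudness at that participant's most recent time of speaking; the time of speaking takes a value from 1 to 1000 and is the time at which the participant most recently spoke. The sorting reference value of each participant is then calculated as: sorting reference value of a participant = participant's loudness × loudness weight + participant's time of speaking × time-of-speaking weight, and the participants are sorted in descending order of the sorting reference value. Alternatively, only the speaking duration and how recently each participant spoke may be considered, without the other conditions; this does not affect the implementation of the present invention.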
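203A. Control the images displayed on the screens that need to switch images to switch to the images of the predetermined number of participants to be displayed.
Assume that the predetermined number is two and that the sorting condition is sorting by loudness in descending order. This step then selects the loudest participant and the second-loudest participant, and determines the screens corresponding to the loudest participant and the second-loudest participant as the screens that need to switch images.
It should be noted that steps 201A and 202A need not be performed in a particular order: step 201A may be performed before step 202A, step 202A may be performed before step 201A, or the two may be performed simultaneously. The predetermined number may be specified in advance by a participant in the first site, by the administrator of the conference management console, or by a participant at the chair terminal of the conference, or may be preset by the multimedia control server.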
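It should be noted that the predetermined number may be one or more. When the predetermined number is one, the participant who is currently the loudest is selected in step 201A, and step 203A may then be implemented as follows: according to the sorting result of the participants currently displayed on the screens of the first site, select the currently displayed participant ranked last and determine whether that participant's screen is the first specific screen. If not, the screen that needs to switch images is the screen of the currently displayed participant ranked last. If so, select the currently displayed participant ranked immediately before the last one, and the screen that needs to switch images is that participant's screen. The first specific screen and the second specific screen are symmetric about the screen centre line; the second specific screen is the screen of the first site on which the eye-to-eye effect with the image of the loudest speaker can be achieved; and the screen centre line is the geometric centre line of the screen group formed by the screens of the first site connected in sequence.
Because the second specific screen is the screen of the first site on which the eye-to-eye effect with the image of the loudest speaker can be achieved, and the first specific screen is the screen symmetric to it about the screen centre line, displaying the image of the loudest speaker on the first specific screen would not give a good eye-to-eye effect between that speaker and the participants in the first site. Therefore, when the screen of the participant ranked last is the first specific screen, the screen of the participant ranked immediately before the last one is selected as the screen that needs to switch images.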
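To make the above description clearer, a three-screen site is used as an example. Assume there are two three-screen sites. The image of the participant in area 1 captured by camera 1 in one site is presented by default on screen 1 or screen 3 of the other site (if no mirroring is applied to the image, the image of the participant in area 1 of one site is presented by default on screen 3 of the other site; if mirroring is applied to the captured image, it is presented by default on screen 1 of the other site). The image of the participant in area 2 captured by camera 2 in one site is presented by default on screen 2 of the other site, and the image of the participant in area 3 captured by camera 3 is presented by default on screen 1 or screen 3 of the other site (in a manner similar to the way the image of the participant in area 1 of the same site is presented in the other site). When the image of a participant of one site is presented on its default screen in the other site, that participant and the participants in the other site achieve the eye-to-eye effect. FIG. 1 shows how the participants in site 1 are presented by default in site 2 when mirroring is not used, assuming that in both sites the participant in area 1 is participant 1, the participant in area 2 is participant 2, and the participant in area 3 is participant 3.
With the technical solution provided by this embodiment, assume that participant 1 in site 1 is currently the loudest participant. The second specific screen is then screen 3 in site 2, and the screen symmetric to screen 3 about the screen centre line is screen 1 in site 2, so screen 1 in site 2 is the first specific screen; that is, the image of participant 1 in site 1 cannot be displayed on screen 1 in site 2. When mirroring is used and participant 1 in site 1 is currently the loudest participant, the second specific screen is screen 1 in site 2, and the screen symmetric to it about the screen centre line is screen 3 in site 2, so screen 3 in site 2 is the first specific screen; that is, the image of participant 1 in site 1 cannot be displayed on screen 3 in site 2. It should be noted that, in a site with an odd number of screens, if the screen corresponding to the image of the loudest speaker is the middle screen, there is no first specific screen, and the screen that needs to switch images can be determined directly as the screen of the participant ranked last.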
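The rule for avoiding the first specific screen can be sketched as follows. The sketch assumes that screens are numbered consecutively from 1 to n across the screen group, so that the screen symmetric to screen k about the centre line is n + 1 - k (consistent with the three-screen example, where screen 3 mirrors screen 1). The function names are hypothetical.

```python
from typing import Optional


def first_specific_screen(screen_count: int, second_specific: int) -> Optional[int]:
    """Return the screen symmetric to the second specific screen, if any.

    Assumes screens are numbered 1..screen_count in physical order. When the
    second specific screen is the middle screen it mirrors onto itself, so
    there is no first specific screen.
    """
    mirrored = screen_count + 1 - second_specific
    return None if mirrored == second_specific else mirrored


def screen_to_switch(ranked_screens: list[int], screen_count: int,
                     second_specific: int) -> int:
    """Pick the screen to switch when the predetermined number is one.

    `ranked_screens` lists the screens of the currently displayed
    participants from first-ranked to last-ranked.
    """
    avoid = first_specific_screen(screen_count, second_specific)
    last = ranked_screens[-1]
    if last == avoid and len(ranked_screens) > 1:
        # The last-ranked participant sits on the first specific screen,
        # so fall back to the participant ranked immediately before.
        return ranked_screens[-2]
    return last


if __name__ == "__main__":
    # No mirroring: the loudest participant's eye-to-eye screen is screen 3,
    # so screen 1 must be avoided.
    print(screen_to_switch([2, 3, 1], screen_count=3, second_specific=3))
```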
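It should be noted that, when the predetermined number is one, the participant determined in step 201A is the loudest participant; if that participant is already displayed on a screen of the first site, steps 202A and 203A are not performed.
In the foregoing method embodiment, the screens of the first site are the screens in the first site that can switch images, which are either all the screens in the first site or all screens other than predetermined screens. A predetermined screen is a screen designated in advance as unable to switch images, such as a screen that displays conference data (an auxiliary-stream screen), a screen designated to display the conference chair, or a screen designated to display the multi-picture image.
It should be noted that the foregoing steps may all be performed by the network-side media processing device, which may be a multipoint control server (for example, an MCU), a terminal device that has the foregoing media control function (for example, a video conference terminal with an integrated media control function), or another network device. Alternatively, step 201A is performed by the network-side media processing device and step 202A by the terminal of the first site. Specifically, the terminal of the first site selects a predetermined number of participants according to the sorting result of the participants currently displayed on the screens of the first site, determines the screens corresponding to the selected participants as the screens that need to switch images, and then notifies the network-side media processing device of the numbers of the selected screens. In this case, the predetermined number may be specified in advance by a participant in the first site.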
需要说明的是, 该实施例是以假定预定个数小于或者等于第一会场能切换 图像的屏幕个数, 如果预定个数大于第一会场能切换图像的屏幕个数, 则按照 当前会议中与会者音量从大到小的顺序, 从音量最大的与会者开始, 选择与第 一会场能切换图像的屏幕数目相同的待显示的与会者, 控制第一会场能切换图 像的屏幕所显示的图像切换为所选择的待显示与会者的图像。
另外, 如果会议中规定某一会场的一个特定与会者在第一会场的某一特定 屏幕上显示时,则步骤 201A需要对除所述特定与会者以外的与会者按照音量从 大到小的顺序, 从音量最大的与会者开始, 依次确定预定个数的待显示的与会 者,且在步骤 202A中需要在除上述特定屏幕以外的第一会场能切换图像的的屏 幕中确定需要切换图像的屏幕。
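The selection of participants to be displayed, including the cap by the number of switchable screens and the exclusion of a participant pinned to a specific screen, might be sketched as follows. The function name, the `pinned` parameter, and the dictionary of volumes are hypothetical and serve only to illustrate the rule.

```python
def select_participants_to_display(volumes, predetermined_count,
                                   switchable_screen_count, pinned=()):
    """Pick the loudest participants, loudest first.

    volumes: dict mapping participant id -> current volume (assumed input).
    switchable_screen_count: screens of the first site that may be switched;
        the caller is assumed to have excluded any screen reserved for a
        pinned participant already.
    pinned: participants bound to a specific screen, never selected here.
    """
    candidates = {p: v for p, v in volumes.items() if p not in pinned}
    # Never select more participants than there are switchable screens.
    limit = min(predetermined_count, switchable_screen_count)
    ranked = sorted(candidates, key=candidates.get, reverse=True)
    return ranked[:limit]
```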
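In this embodiment of the present invention, the screens corresponding to a predetermined number of currently displayed participants at the first site are determined as the screens whose images need to be switched, and those screens are then switched to the images of the participants to be displayed, who are determined in descending order of participant volume in the conference. Because the participants to be displayed are determined in descending order of their current volume in the conference, the participants who are currently taking part in the discussion, even when located at different sites, can be displayed, so that the participants at the first site see the images of those taking part in the discussion, which improves the participant experience.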
参阅图 2B, 本发明实施例提供一种多屏视频会议中对与会者图像显示进行 调整的方法, 该方法中网络侧媒体处理设备具体为 MCU, MCU先选择当前会 议中声音较大的与会者, 再选择第一会场中需要切换图像的屏幕, 然后控制需 要切换图像的屏幕所显示的图像切换为声音较大的待显示的与会者的图像, 该 方法具体包括:
201B、 各个会场将采集到的与会者的声音和拍摄得到的与会者的图像都发 给 MCU。
202B、 MCU启动声控切换。
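In this step, the MCU starting voice-activated switching means that the MCU is now able to perform voice-activated switching.
203B. In descending order of participant volume in the current conference, starting from the loudest participant, the MCU selects a predetermined number of participants to be displayed in turn.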
该步骤中 MCU选择预定个数的待显示的与会者表示 MCU要开始声控切换 了。
其中, 预定个数可以为 1个或者为多个, 当预定个数为多个, 具体可以是 的, 还可以是由终端设置并发送给 MCU的, 比如, 主席会场的终端设置后发送 给网络侧媒体处理设备。
204B、 MCU按照排序条件对第一会场的屏幕当前显示的与会者进行排序, 得到第一会场的屏幕当前显示的与会者的排序结果。
具体的, 可以是在周期时间到达时进行排序, 或者随机进行排序, 或者按 需进行排序, 其中,按需进行排序可以是在 MCU要开始进行声控切换的时候进 行排序。
其中, 具体的排序方式与步骤 202A中的相应描述相同, 在此不再赘述。
205B、 MCU根据第一会场的屏幕当前显示的与会者的排序结果, 选择预定 个数的当前显示的与会者, 确定所选择的当前显示的与会者所对应的屏幕作为 需要切换图像的屏幕。
206B、 MCU控制所述需要切换图像的屏幕所显示的图像切换为所述预定个 数的待显示与会者的图像。
其中, 当所述预定个数的待显示的与会者的图像中存在至少两个待显示的 与会者的图像来自于同一会场 (假定为第二会场) 时, 控制所述需要切换图像 的屏幕中至少两个屏幕所显示的图像切换为所述至少两个待显示的与会者的图 像, 使得在所述第一会场中显示的所述至少两个待显示的与会者的图像的方向 顺序与所述至少两个待显示的与会者在所述第二会场中的物理位置的顺序相 同。 其中, 第一会场中显示第二会场的区域 1对应的与会者的图像、 区域 2对 应的与会者的图像的方向顺序为该会场中显示第二会场的区域 1与会者图像的 屏幕、 区域 2与会者图像的屏幕的方向顺序。
采用这种图像切换方式, 使得切换后的至少两个待显示的与会者的图像, 能够保持该至少两个待显示的与会者在原会场的物理位置的顺序相同, 使得在 第一会场显示的至少两个待显示的与会者能够更好的保持在原会场的物理位置 不变。
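An example follows. Assume two five-screen sites (site A and site B). By default, the participant in region 1 of site A corresponds to screen 1, and the participants in regions 2/3/4/5 correspond to screens 2/3/4/5 respectively. If the images of the participants in region 1 and region 2 of site A are both displayed at site B, the MCU can adjust the images displayed on the screens of the first site so that its screens display them in ways that include, but are not limited to, the following: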
1 )、 显示 A会场的区域 1的与会者图像、 区域 2的与会者图像的屏幕分别 为 B会场的屏幕 1、 屏幕 2。
2 )、 显示 A会场的区域 1的与会者图像、 区域 2的与会者图像的屏幕分别 为 B会场的屏幕 2、 屏幕 3。
3 )、 显示 A会场的区域 1的与会者图像、 区域 2的与会者图像的屏幕分别 为 B会场的屏幕 1、 屏幕 3。
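In other words, the direction order of the screens that display the images of the participants in regions 1 and 2 of site A follows the 1/2/3/4/5 direction (that is, under the default correspondence described above, the number of the screen showing the region 1 participant is always lower than the number of the screen showing the region 2 participant).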
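The order-preserving assignment in the example above can be sketched as a pairing of sorted region numbers with sorted screen numbers. The function name and the assumption that a lower region number maps to a lower screen number (the default correspondence of the example, without mirroring) are illustrative only.

```python
def assign_preserving_order(screens_to_switch, source_regions):
    """Map each source region to a target screen so that a lower region number
    always lands on a lower screen number.

    Both arguments are collections of 1-based numbers of equal length.
    Returns a dict region -> screen.
    """
    if len(screens_to_switch) != len(source_regions):
        raise ValueError("need exactly one screen per region")
    return dict(zip(sorted(source_regions), sorted(screens_to_switch)))
```

With regions 1 and 2 of site A and screens 3 and 1 of site B selected for switching, the sketch places region 1 on screen 1 and region 2 on screen 3, which matches option 3) in the list above.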
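In this embodiment of the present invention, the MCU selects the screens whose images need to be switched according to the ranking of the participants displayed on the screens of the first site, and then switches those screens to the images of the participants selected in descending order of participant volume in the conference. Because the ranking is based on at least one of the volume, the recency of speech, and the speaking duration of the participants displayed on the screens of the first site, the images of the participants who keep speaking can all be displayed on the screens of the first site, so that the participants at the first site see the images of those currently taking part in the discussion, which improves the participant experience.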
参阅图 2C, 本发明实施例提供一种多屏视频会议中对与会者图像显示进行 调整的方法, 该方法中网络侧媒体处理设备为 MCU, MCU先选择第一会场中 需要切换图像的屏幕, 再选择当前会议中声音较大的待显示的与会者, 然后控 制需要切换图像的屏幕所显示的图像切换为声音较大的待显示的与会者的图 像, 该方法具体包括:
201C、 各个会场将采集到的与会者的声音和拍摄得到的与会者的图像都发 给 MCU。
202C、 MCU启动声控切换。 其中, 该步骤中 MCU启动声控切换是指 MCU可以进行声控切换了。
203C、 MCU按照排序条件对第一会场的屏幕当前显示的与会者进行排序, 得到第一会场的屏幕当前显示的与会者的排序结果。
其中, 具体的排序方式和排序时间可以参考步骤 204B的相应描述, 在此不 再赘述。
204C、 MCU根据第一会场的屏幕当前显示的与会者的排序结果, 选择预定 个数的当前显示的与会者, 确定所选择的预定个数的当前显示的与会者所对应 的屏幕作为需要切换图像的屏幕。
205C、 MCU按照当前会议中与会者音量从大到小的顺序, 从音量最大的与 会者开始, 依次选择预定个数的待显示的与会者。
该步骤中 MCU选择预定个数的待显示的与会者表示 MCU要开始进行声控 切换了。 其中, 预定个数可以为 1个或者为多个, 当预定个数为多个, 具体可 设置的, 还可以是由终端设置并发送给 MCU的, 比如, 主席会场的终端设置后 发送给网络侧媒体处理设备。
206C、 MCU控制所述需要切换图像的屏幕所显示的图像切换为所述预定个 数的待显示的与会者的图像。
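In this embodiment of the present invention, the MCU selects the screens whose images need to be switched according to the ranking of the participants currently displayed on the screens of the first site, and then switches those screens to the images of the participants to be displayed, who are selected in descending order of participant volume in the conference. Because the ranking is based on at least one of the volume, the recency of speech, and the speaking duration of the participants displayed on the screens of the first site, the images of the participants who keep speaking can all be displayed on the screens of the first site, so that the participants at the first site see the images of those currently taking part in the discussion, which improves the participant experience.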
参阅图 2D, 本发明实施例提供一种多屏视频会议中对与会者图像显示进行 调整的方法, 该方法与上述两个实施例的区别在于: 第一会场的终端根据第一 会场的屏幕当前显示的与会者的排序结果, 选择需要切换图像的屏幕然后通知 MCU, 由 MCU控制第一会场中屏幕显示图像的切换, 该方法具体包括:
201D、 各个会场将采集到的与会者的声音和拍摄得到的与会者的图像都发 给 MCU。
202D、 MCU启动声控切换。
203D、 第一会场的终端按照排序条件对第一会场的屏幕当前显示的与会者 进行排序, 得到第一会场的屏幕当前显示的与会者的排序结果。
其中, 具体的排序方式和排序时间可以参考步骤 204B的相应描述, 在此不 再赘述。
204D、第一会场的终端根据第一会场的屏幕当前显示的与会者的排序结果 , 选择预定个数的当前显示的与会者, 确定所选择的与会者所对应的屏幕作为需 要切换图像的屏幕。
205D、 第一会场的终端向 MCU发送第一会场中需要切换图像的屏幕的编 号。
206D、 MCU按照当前会议中与会者音量从大到小的顺序, 从音量最大的与 会者开始, 依次确定预定个数的待显示的与会者。
其中, 预定个数可以为 1个或者为多个, 当预定个数为多个, 具体可以是 由终端设置并发送给 MCU。
207D、 MCU控制所述需要切换图像的屏幕所显示的图像切换为预定个数的 待显示的与会者的图像。
本发明实施例中第一会场的终端根据第一会场中屏幕显示的与会者的排序 结果, 选择需要切换图像的屏幕, 然后由 MCU控制需要切换图像的屏幕切换为 根据会议中各与会者音量从大到小的顺序而选择出的与会者图像, 由于排序结 果是根据第一会场中屏幕显示的与会者声音大小、 发言时间点远近、 发言时长 中至少一个条件进行排序的排序结果, 所以能够保证当前不断讲话的与会者的 图像都可能在第一会场的屏幕中显示, 能够使第一会场中的与会者看到参与讨 论的与会者图像, 提高与会者的体验。
参阅图 3 ,本发明实施例提供一种多屏视频会议中对与会者图像显示进行调 整的方法, 该方法中网络侧媒体处理设备为 MCU, MCU先选择当前声音最大 的与会者对应的图像作为待显示的图像, 然后根据第一会场中屏幕显示的与会 者的声音大小, 选择需要切换图像的屏幕, 该方法具体包括:
301、 各个会场将采集到的与会者的声音和拍摄得到的与会者的图像都发给 MCU。
302、 MCU启动声控切换。
303、 MCU确定当前声音最大的与会者, 该声音最大的与会者为待显示的 与会者。
304、 MCU判断是否满足切换条件, 如果是, 执行 305 , 如果否, 结束本流 程。
具体的, 可以是判断当前声音最大的与会者的声音是否持续一个预设时间 段, 如果是, 则满足切换条件, 否则不具备切换条件。
305、 MCU判断第一会场中能切换图像的屏幕当前显示的与会者是否有最 近发言者列表中的与会者, 如果否, 则执行 306, 如果是, 则执行 307。
306、 MCU根据第一会场的能切换图像的屏幕当前显示的与会者的声音大 小, 确定声音最小的与会者的图像所在的屏幕为需要切换图像的屏幕, 控制该 屏幕显示的图像从声音最小的与会者图像切换为当前声音最大的与会者的图 像, 结束本流程。
其中, 第一会场中能切换图像的屏幕为第一会场中所有的屏幕或者除预定 屏幕以外的屏幕, 所述预定屏幕为预置的不能进行图像切换的屏幕。 所述预定 屏幕为预定的不能切换图像的屏幕, 比如显示会议数据资料的屏幕, 或者指定 显示会议主席的屏幕, 或者指定显示多画面的屏幕。
需要说明的是, 本实施例及后续各实施例中, 可以将多画面图像作为声音 最小的与会者图像, 这样在声控切换启动后, 第一次进行图像切换时就可以将 该多画面图像切换为当前声音最大的与会者图像。
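307. The MCU determines whether all participants currently displayed on the switchable screens of the first site belong to the recent-speaker list. If so, step 308 is performed; if not, step 309 is performed.
308. According to the ranking of the participants in the recent-speaker list, the MCU selects the screen showing the lower-ranked participant as the screen whose image needs to be switched, controls that screen to switch to the image of the loudest participant, and ends the procedure.
The participants in the recent-speaker list are ranked in the same way as described above for the participants currently displayed on the screens of the first site, so the details are not repeated here. The recent-speaker list may also be an image list, that is, a list of the images of the participants who have spoken recently.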
309、 MCU从不属于最近发言者列表的当前显示的与会者中选择声音最小 的与会者, 将所选择的与会者所在的屏幕作为需要切换图像的屏幕, MCU控制 将该屏幕显示的图像切换为声音最大的与会者的图像。
具体的, 可以从不属于最近发言者列表的当前显示的与会者中选择声音最 小的与会者, 则该声音最小的与会者所在的屏幕为需要切换图像的屏幕, 控制 该屏幕显示的图像切换为声音最大的与会者的图像。
本发明实施例在考虑最近发言列表时, 从不属于最近发言者列表的与会者 中选择待切换的与会者, 或者, 根据最近发言者列表中与会者的排序结果, 选 择排序结果靠后的与会者作为待切换图像, 这种声控切换方法, 能够避免最近 经常发言的与会者被切换掉, 使会场中的用户能够看到参与讨论的与会者图像, 提高与会者的体验; 进一步, 只要声音最大的发言者的声音满足切换条件, 则 可以将声音最大的发言者的图像切换到会场中, 使会场中的用户即时看到声音 最大的与会者的图像, 提高与会者的体验。
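The decision logic of steps 304 to 309 can be condensed into the following sketch. The data structures (a dict from screen number to displayed participant, a dict of volumes, and a recent-speaker list ordered best-ranked first) and the function names are assumptions chosen for the illustration; the original text does not prescribe any representation.

```python
def meets_switch_condition(loudest_since, now, hold_seconds):
    """Step 304: the loudest participant has stayed loudest for a preset time."""
    return now - loudest_since >= hold_seconds


def choose_screen(displayed, volumes, recent_speakers):
    """Return the screen whose image should be replaced by the loudest speaker.

    displayed: dict screen -> participant shown (switchable screens only).
    volumes: dict participant -> current volume.
    recent_speakers: list of participants, best-ranked first.
    """
    outsiders = {s: p for s, p in displayed.items() if p not in recent_speakers}
    if outsiders:
        # Steps 306 and 309: replace the quietest displayed participant
        # that is not in the recent-speaker list.
        return min(outsiders, key=lambda s: volumes.get(outsiders[s], 0.0))
    # Step 308: everyone displayed is in the list; replace the lowest-ranked.
    return max(displayed, key=lambda s: recent_speakers.index(displayed[s]))
```

When no displayed participant is in the list, `outsiders` contains every screen, so the first branch also covers step 306.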
参阅图 4,本发明实施例提供一种多屏视频会议中对与会者图像显示进行调 整的方法, 该方法中与图 3所示实施例的区别在于: MCU先根据第一会场中屏 幕显示的与会者的声音大小, 选择需要切换图像的屏幕, 然后再选择当前声音 最大的与会者, 该方法具体包括:
401、 各个会场将采集得到的与会者的声音和获取得到的与会者的图像都发 给 MCU。
402、 MCU启动声控切换。
403、 周期时间到达时, MCU判断第一会场中能切换图像的屏幕当前显示 的与会者是否有最近发言者列表中的与会者, 如果否, 则执行 404, 如果是, 则 执行 405。
具体的, 可以预先设定周期时间, 比如一个周期为 2s, 这样每隔两秒就会 执行步骤 403。
404、 MCU根据第一会场的能切换图像的屏幕当前显示的与会者的声音大 小, 选择声音最小的与会者的图像所在的屏幕作为需要切换图像的屏幕。
其中, 第一会场中能切换图像的屏幕的定义与图 3所示实施例相应部分的 描述相同, 在此不再赘述。
405、 MCU判断第一会场能切换图像的屏幕当前显示的与会者是否都属于 最近发言者列表, 如果是, 执行 406, 如果否, 则执行 407。
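406. According to the ranking of the participants in the recent-speaker list, the MCU selects the screen showing the lower-ranked participant as the screen whose image needs to be switched.
The participants in the recent-speaker list are ranked in the same way as described above for the participants currently displayed on the screens of the first site, so the details are not repeated here. The recent-speaker list may also be an image list, that is, a list of the images of the participants who have spoken recently.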
407、 MCU从不属于最近发言者列表的当前显示的与会者中选择声音最小 的与会者, 将所选择的与会者所在的屏幕作为需要切换图像的屏幕。
408、 MCU确定当前声音最大的发言者, 该声音最大的与会者为待显示的 与会者。
409、 MCU判断是否具备切换条件, 如果是, 执行 410, 如果否, 不进行处 理, 返回执行步骤 403。
410、 MCU控制需要切换图像的屏幕显示的图像切换为声音最大的与会者 的图像。
本发明实施例在考虑最近发言列表时, 在不属于最近发言者列表的当前显 示的与会者中选择待切换的与会者, 或者, 根据最近发言者列表中与会者的排 序结果, 选择排序结果靠后的与会者作为待切换的与会者, 这种声控切换方法, 能够避免最近经常发言的与会者图像被切换掉, 使会场中的用户能够看到参与 讨论的与会者图像, 提高与会者的体验。
参阅图 5 ,本发明实施例提供一种多屏视频会议中对与会者图像显示进行调 整的方法, 该方法中与图 3、 图 4所示实施例的区别在于: 第一会场的终端根据 第一会场中屏幕显示的与会者的声音大小, 选择需要切换图像的屏幕然后通知 MCU, 该方法具体包括:
501、 各个会场将与会者的声音和与会者的图像都发给 MCU。
502、 MCU启动声控切换。
503、 周期时间到达时, 第一会场的终端判断第一会场中能切换图像的屏幕 当前显示的与会者是否有最近发言者列表中的与会者, 如果否, 则执行 504, 如 果是, 则执行 505。
具体的, 可以预先设定周期时间, 比如一个周期为 2s, 这样每隔两秒就会 执行步骤 503。
504、 第一会场的终端根据第一会场能切换图像的屏幕当前显示的与会者的 声音大小, 选择声音最小的与会者的图像所在的屏幕作为需要切换图像的屏幕。
其中, 第一会场中能切换图像的屏幕的定义与图 3所示实施例相应部分的 描述相同, 在此不再赘述。
505、 第一会场的终端判断第一会场能切换图像的屏幕当前显示的与会者是 否都属于最近发言者列表, 如果是, 执行 506, 如果否, 则执行 507。
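506. According to the ranking of the participants in the recent-speaker list, the terminal of the first site selects the screen showing the lower-ranked participant as the screen whose image needs to be switched.
The participants in the recent-speaker list are ranked in the same way as described above for the participants currently displayed on the screens of the first site, so the details are not repeated here. The recent-speaker list may also be an image list, that is, a list of the images of the participants who have spoken recently.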
507、 第一会场的终端从不属于最近发言者列表的当前显示的与会者中选择 声音最小的与会者, 将所选择的与会者所在的屏幕作为需要切换图像的屏幕。
508、 第一会场的终端向 MCU发送需要切换图像的屏幕的编号。
509、 MCU确定当前声音最大的发言者, 该声音最大的发言者为待显示的 与会者。
510、 MCU判断是否具备切换条件, 如果是, 执行 511 , 如果否, 不进行处 理, 结束本流程。
511、 MCU控制需要切换图像的屏幕显示的图像切换为声音最大的与会者 的图像。
本发明实施例在考虑最近发言列表时, 在不属于最近发言者列表的与会者 中选择待切换的与会者, 或者, 根据最近发言者列表中与会者的排序结果, 选 择排序结果靠后的与会者作为待切换的与会者, 这种声控切换方法, 能够避免 最近经常发言的与会者图像被切换掉, 使会场中的用户能够看到参与讨论的与 会者图像, 提高与会者的体验。 进一步, 由第一会场的终端选择需要切换图像 的屏幕, 减少了 MCU的工作, 降低了对 MCU的要求。
如下对最近发言者列表进行详细介绍:
1、 关于与会者的排序方式参见步骤 202A的详细描述, 在此不再赘述。
2、 当最近发言者列表为图像列表时, 可以控制会议主席图像一直位于发言 者图像列表中, 多画面图像一直位于发言者图像列表中。 其中, 会议主席图像 可以在会议一开始就进入最近发言者列表, 也可以在会议主席讲话后切入最近 发言者列表中, 具体的, 若当前声音最大的发言者为会议主席时, 将该会议主 席图像放入最近发言者列表中。
3、 关于最近发言者列表的更新, 有如下几种更新方式:
1 )、 可以将当前声音最大的发言者放入最近发言者列表中, 具体的, 可以 在将当前声音最大的发言者图像切换到屏幕上显示之后, 将当前声音最大的发 言者放入最近发言者列表, 也可以在切换之前, 将当前声音最大的发言者放入 最近发言者列表。
2 )、 在启动声控切换时, 将会场中当前各屏幕显示的与会者放入最近发言 者列表中。
3 )、 当最近发言者列表中与会者的个数大于会场中屏幕的个数时, 根据最 近发言者列表的排序结果, 将在最近发言者列表中的排序位数超过会场中屏幕 的个数的与会者删除; 或者, 当最近发言者列表中与会者的个数大于会场中屏 幕的个数时, 清空最近发言者列表。
4 )、 当最近发言者列表中有预定时间段内没有发言的与会者时, 将所述预 定时间段内没有发言的与会者从最近发言者列表中删除。
5 )、 最近发言者列表中与会者的个数大于会场中除特定屏幕以外的屏幕个 数时, 将在最近发言者列表中的排序位数超过除特定屏幕以外的屏幕个数的与 会者删除, 或者, 将最近发言者列表清空, 其中特定屏幕是不能进行图像切换 的屏幕, 比如专用于显示会议辅助资料的屏幕等。
4、 当已确定最近发言者列表中的与会者所在的屏幕需要进行图像切换时, 也可以采用下面这几种特殊策略:
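First, select the screen that achieves an eye-to-eye effect with the current loudest participant to display that participant's image, or select a screen adjacent to it. For example, if the current loudest participant is on the left side of site A, and the screen that achieves an eye-to-eye effect with that participant is assumed to be the left screen of site B, then the left screen of site B, or alternatively the middle screen of site B, is selected as the screen whose image needs to be switched.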
第二、 如果当前声音最大与会者与最近发言者列表中的某个与会者为同一 会场中的发言者时, 选择在同一会场的与会者图像所在屏幕的附近屏幕显示该 当前声音最大的与会者图像。
第三、 优先切换主屏幕的图像。
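Fourth, do not switch the image on the first specific screen of the local site or on the screens outside it. For a description of the first specific screen, see the related description in step 202A of the first embodiment; it is not repeated here. The screens outside the first specific screen are those on the side of the first specific screen facing away from the geometric centre line. For example, at a five-screen site, if the first specific screen is screen 4, the screen outside it is screen 5; if the first specific screen is screen 2, the screen outside it is screen 1.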
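Fifth, switch out the image of the quietest participant in the recent-speaker list.
5. At a multi-screen site, each camera captures one group of participants, and that group shares one or more MICs (microphones). The sound of this group of MICs represents one sound position of the site (for example the left among left, middle, and right). Each site sends the sound of the MICs at its different positions to the MCU, and during voice-activated switching the MCU switches the display to the image corresponding to the loudest group of MICs (which corresponds to one position at one site). Alternatively, several cameras capture one group of participants or even the whole site, and that group shares one group of MICs whose sound represents one sound position or the whole site (with a mono audio protocol, for instance, it represents the whole site). Each site sends the sound of the MICs at its different positions to the MCU, and during voice-activated switching the MCU switches the display to the image corresponding to the loudest group of MICs (the image of the group of participants captured by the several cameras, or the image of the whole site). In both cases another approach is also possible: each site selects the loudest few of the positional sounds corresponding to its own groups of MICs, that is, the sound of several groups of MICs, and sends the selected sounds to the MCU; the MCU then selects the loudest group of MICs in the whole conference and switches the display to its corresponding image.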
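The update rules for the recent-speaker list described in item 3 above (adding the loudest speaker, dropping participants who have been silent for a preset period, and trimming the list to the number of switchable screens) could be combined as in the sketch below. The ordering by recency of speech, the parameter names, and the timestamp dictionary are assumptions; the original allows several ranking criteria and also permits emptying the list instead of trimming it.

```python
def update_recent_speakers(recent, speaker, last_spoke, now,
                           max_size, idle_limit):
    """Return a new recent-speaker list, most recent speaker first.

    recent: current list, most recent first.
    speaker: the current loudest speaker being added.
    last_spoke: dict participant -> time of last speech (assumed bookkeeping).
    max_size: number of switchable screens at the site.
    idle_limit: participants silent for longer than this are dropped.
    """
    # Update method 1): put the current loudest speaker into the list.
    updated = [speaker] + [p for p in recent if p != speaker]
    # Update method 4): remove participants who have not spoken recently.
    updated = [p for p in updated
               if p == speaker or now - last_spoke.get(p, now) <= idle_limit]
    # Update methods 3) and 5): keep no more entries than switchable screens.
    return updated[:max_size]
```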
为了使本发明上述实施例更加清楚明白,参阅图 6A,如下以三屏会场为例, 详细说明本发明实施例提供的多屏视频会议中对与会者图像显示进行调整的方 法, 图中, A会场、 B会场、 C会场、 D会场都为 3屏会场, E会场、 F会场、 G会场都为 2屏会场, J会场、 K会场都为单屏会场, 具体的,在启动声控切换之 前, A会场的屏幕 1、 2、 3分别显示 E会场中摄像机 E1拍摄的图像, J会场中 摄像机 J1拍摄的图像, G会场中摄像机 G2拍摄的图像; 启动声控切换之后, 当前与会者声音不断变化, 则会场 A的图像切换过程包括:
1 )当前摄像机 E1拍摄的图像中的与会者声音最小, 摄像机 F2拍摄的图像 中的与会者声音最大,则控制 A会场的屏幕 1显示的图像从摄像机 E1拍摄的图 像切换为摄像机 F2拍摄的图像, 将摄像机 F2拍摄的与会者放入最近发言者列 表中;
2 )、 然后, 摄像机 F2拍摄的图像中的与会者声音最小, 摄像机 J1拍摄的 图像中的与会者声音次小, 摄像机 C2拍摄的图像的与会者声音最大, 由于摄像 机 F2拍摄的与会者已经在最近发言者列表中, 所以选择声音次小的与会者的图 像进行切换, 此时, 控制 A会场的屏幕 2显示的图像从摄像机 J1拍摄的图像切 换成摄像机 C2拍摄的图像, 将摄像机 C2拍摄的与会者放入最近发言者列表;
3 )、 然后, 摄像机 G2拍摄的图像中与会者声音最小, 摄像机 K1拍摄的图 像中与会者声音最大, 控制 A会场的屏幕 3显示的图像从摄像机 G2拍摄的图 像切换成摄像机 K1拍摄的图像, 将摄像机 K1拍摄的与会者放入最近发言者列 表中;
4 )、 然后, 摄像机 F2拍摄的图像中与会者声音最小, 摄像机 K1拍摄的图 像中与会者声音最大, 由于摄像机 K1拍摄的图像已经在屏幕 3上显示, 所以不 做处理;
5 )、 然后, 当前摄像机 K1拍摄的图像的与会者声音最小, 摄像机 F2拍摄 的图像的与会者声音次小, 摄像机 C3拍摄的图像的与会者声音最大, 由于按照 发言时间点从近到远的顺序, 摄像机 F2拍摄的与会者在最近发言者列表的最后 位置, 因此, 控制屏幕 1显示的图像从摄像机 F2拍摄的图像切换为摄像机 C3 拍摄的图像, 由于摄像机 C2和 C3都属于同一会场, 调换摄像机 C2和 C3显示 的屏幕, 控制屏幕 1显示摄像机 C2拍摄的图像, 控制屏幕 2显示摄像机 C3拍 摄的图像。
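Referring to FIG. 6B, the method for adjusting participant image display in a multi-screen video conference provided by the embodiments of the present invention is described in detail below using a two-screen site as an example. In the figure, sites A, B, C, and D are three-screen sites, sites E, F, and G are two-screen sites, and sites J and K are single-screen sites. Before voice-activated switching is started, screens 1 and 2 of site E display the image captured by camera E2 at site E and the image captured by camera J1 at site J respectively. After voice-activated switching is started, the participants' volumes keep changing, and the image switching process at site E includes:
1) The participant in the image captured by camera J1 is currently the quietest and the participant in the image captured by camera F2 is the loudest; screen 2 is controlled to switch from the image captured by camera J1 to the image captured by camera F2, and the participant captured by camera F2 is put into the recent-speaker list;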
2 )然后, 摄像机 E2拍摄的图像与会者声音最小, 摄像机 C2拍摄的图像与 会者声音最大, 控制屏幕 1显示的图像从摄像机 E2拍摄的图像切换成摄像机 C2拍摄的图像, 将摄像机 C2拍摄的与会者放入最近发言者列表;
3 )然后, 摄像机 C2拍摄的图像与会者声音最小,摄像机 K1拍摄的图像与 会者声音最大, 按照最近发言者列表中与会者声音从大到小的顺序, 则摄像机 C2拍摄的与会者位于最近发言者列表的最后位置, 因此, 控制屏幕 1显示的图 像从摄像机 C2拍摄的图像切换成摄像机 K1拍摄的图像, 将摄像机 K1拍摄的 与会者放入最近发言者列表, 同时从最近发言者列表中删除摄像机 C2拍摄的与 会者;
4 ) 然后, 摄像机 F2拍摄的图像中与会者声音最小, 摄像机 K1拍摄的图 像中与会者声音最大, 由于摄像机 K1拍摄的图像已在屏幕中显示, 所以不做处 理。
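5) The participant in the image captured by camera K1 is the quietest and the participant in the image captured by camera C3 is the loudest; screen 1 is controlled to switch from the image captured by camera K1 to the image captured by camera C3.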
对于一屏会场, 则控制该一屏会场中的屏幕显示的图像从原来的图像切换 为当前声音最大的图像。
参阅图 7,本发明实施例提供一种多屏视频会议中对与会者图像显示进行调 整的方法, 该方法与上述图 3、 4、 5所示实施例的区别在于: MCU在考虑第一 会场中能切换图像的屏幕当前显示的与会者的排序的同时, 考虑了第一会场中 屏幕的物理位置, 该方法具体包括:
701、 各个会场将与会者的声音和与会者的图像都发给 MCU。
702、 MCU启动声控切换。
703、 MCU确定当前声音最大的与会者, 该声音最大的与会者为待显示的 与会者。
704、 MCU判断是否满足切换条件, 如果是, 执行 705 , 如果否, 结束本流 程。
具体的, 可以是判断当前声音最大的与会者的声音是否持续一个预设时间 段, 如果是, 则满足切换条件, 否则不具备切换条件。
705、 MCU根据第一会场的屏幕当前显示的与会者的排序结果, 选择排在 最后的与会者。
在该步骤之前, MCU会按照排序条件对第一会场的屏幕当前显示的与会者 进行排序, 得到第一会场的屏幕当前显示的与会者的排序结果。 其中, 具体的 排序方式和排序时间参见步骤 204B和步骤 202A的相应描述, 在此不再赘述。
706、 MCU判断所述排在最后的与会者所在的屏幕是否是第一特定屏幕, 如果否, 执行 707; 如果是, 执行 708。
其中, 关于第一特定屏幕的描述请参见步骤 202A中的相关描述, 在此不再 赘述。
707、 MCU确定需要切换图像的屏幕为所述排在最后的与会者所在的屏幕。
708、 MCU选择所述排在最后的与会者的前一个与会者, 确定需要切换图 像的屏幕为所述排在最后的与会者的前一个与会者所在的屏幕。
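709. The MCU controls the screen whose image needs to be switched to switch to the image of the current loudest participant.
When the first site has three screens or fewer, step 706 determines whether the screen showing the last-ranked participant is the first specific screen. When the first site has four, five, or more screens, that step determines whether the screen showing the last-ranked participant is the first specific screen or a screen outside the first specific screen, where the screens outside the first specific screen are those on the side of the first specific screen facing away from the screen centre line. For example, at a five-screen site whose first specific screen is screen 4, the screen outside it is screen 5; at a four-screen site whose first specific screen is screen 3, the screen outside it is screen 4. Furthermore, when the first site has five screens, after the participant ranked immediately before the last-ranked participant is found in step 708, the MCU goes on to determine whether the screen showing that participant is the first specific screen or a screen outside it. If not, that screen is determined as the screen whose image needs to be switched; if so, the participant ranked third from last is looked up according to the ranking, and the screen showing that participant is determined as the screen whose image needs to be switched. For example, at a five-screen site whose first specific screen is assumed to be screen 4, when the last-ranked participant is on screen 4, the participant ranked immediately before is looked up; if that participant is on screen 5, the participant ranked third from last is looked up, and the screen showing that participant's image is determined as the screen whose image needs to be switched.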
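The extended check for sites with four or more screens can be sketched as a walk backwards through the ranking that skips the first specific screen and every screen outside it. Generalising the walk to any number of blocked screens is an assumption of this sketch (the text above spells it out only up to the third-from-last participant at a five-screen site), and the function names are hypothetical.

```python
def blocked_screens(first_specific, screen_count):
    """First specific screen plus the screens on its side away from the centre line."""
    if first_specific is None:
        return set()
    if screen_count <= 3:
        return {first_specific}
    centre = (screen_count + 1) / 2
    if first_specific > centre:
        return set(range(first_specific, screen_count + 1))
    return set(range(1, first_specific + 1))


def pick_screen(ranked_screens, first_specific, screen_count):
    """ranked_screens: screens ordered by the ranking of their displayed
    participants, best-ranked first. Returns the screen to switch."""
    blocked = blocked_screens(first_specific, screen_count)
    for screen in reversed(ranked_screens):
        if screen not in blocked:
            return screen
    # Every screen is blocked (not expected in practice): fall back to the last.
    return ranked_screens[-1]
```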
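In this embodiment of the present invention, the MCU considers the physical positions of the screens at the first site as well as the ranking of the participants displayed on its switchable screens. This prevents the image of the loudest participant from being switched to a screen on which an eye-to-eye effect cannot be achieved, which improves the participant experience.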
需要说明的是, 该方案也适用于 MCU先选择需要切换屏幕的场景,再选择 声音最大的与会者的场景, 同样适用于由第一会场的终端选择需要切换屏幕的 场景。
需要说明的是, MCU可以按照上述实施例提供的方案将各会场中需要切换 图像的屏幕进行图像切换; 或者, 如果会议存在主席, 则先按照主席会场中各 屏幕当前显示的与会者的排序结果, 在主席会场中选择需要切换图像的屏幕, 控制所述需要切换图像的屏幕所显示的图像切换为待显示与会者的图像, 然后, 根据所选屏幕在主席会场中的位置及其他会场中的屏幕在相应会场中的位置, 控制待显示的与会者图像切换到其他会场中的相应屏幕显示; 其中, 所述其他 会场中的相应屏幕与所选屏幕具有相同的编号。 当会议中不存在主席时, 则可 以先按照一个会场中各屏幕当前显示的与会者的排序, 选择需要切换图像的屏 幕, 控制所选屏幕的图像切换为待显示的与会者的图像, 然后, 按照与上面相 同的方式, 控制待显示的与会者图像切换到其他会场中的相应屏幕显示。
可选的, 也可以指定当前声音最大的与会者始终在远端会场特定的屏幕上 显示, 比如一个三屏会场, 可以指定屏幕 3显示当前声音最大的与会者。 如图 6C所示, 指定屏幕 3显示当前声音最大的与会者的图像; 如图 6D所示, 指定 屏幕 2显示当前声音最大的与会者的图像。
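Specifically, the screen designated to display the loudest participant may be changed as the policy requires. A single-screen site may show the image of the current loudest participant, or a multi-picture image (whose sub-pictures can show the images of several participants) in which the image of the current loudest participant is one of the sub-pictures. To achieve a better eye-to-eye effect between the current loudest participant and the participants at the local site, the image of the current loudest participant may always be displayed on the main screen. Further, a site adjusts its camera to face the front of its own participants and sends that image to the remote end. For a three-screen site, it is also possible to designate the left screen to display the multi-picture image, the middle screen to display the conference chair, and the right screen to display the current loudest participant.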
为了能在该声音最大的与会者图像上叠加显示声音最大的与会者所在会场 的全景图像, 所以该方法还可以包括: MCU控制当前声音最大的与会者的会场 全景图像经过图像处理后, 叠加到当前声音最大的与会者的图像的部分区域上 显示。 具体的, MCU将当前声音最大的与会者的会场全景图像缩小, 并将缩小 后的会场全景图像叠加到当前声音最大的与会者的图像的部分区域上显示。 如 下举实例进行说明,假定 F会场是具有 3个摄像机、 3个屏幕、 3个区域的会场, 这三个摄像机分别拍摄对应区域的与会者图像, F会场中的终端将各区域与会者 图像传给 MCU, 假定当前摄像机 F1拍摄的与会者的声音最大, 采用前面介绍 的技术方案, MCU控制 A会场 (假定为三屏会场) 的屏幕 1显示摄像机 F1拍 摄的与会者图像(假定该与会者为声音最大的与会者 ), 此时假定 A会场中的三 个屏幕分别显示摄像机 F1拍摄的与会者图像,摄像机 C2拍摄的与会者图像, 摄 像机 G2拍摄的与会者图像(参阅图 8 )。则,该 MCU将 F会场中三个摄像机 (F1、 F2、 F3)拍摄的与会者图像(3个与会者图像)进行拼接成一个全景图像, 将该 全景图像缩小后, 控制 A会场中的屏幕 1将缩小后的全景图像叠加到摄像机 F1 拍摄的与会者图像上显示, 也可以将会场名叠加到该全景图像上显示, 或者, 将会场名叠加到摄像机 F1拍摄的与会者图像的其他区域上显示。
在上述本发明实施例提供的技术方案中, 可以通过如下方式保证声音和图 像的良好同步:
1 )、 多声道技术, 即语音声道数和摄像机一样, 即可以实现每路摄像机的 活动视频都有自己的对应方位的声道语音数据;
2 )、 带方位信息的语音数据, 即会场把发给 MCU的语音数据中携带该语音 数据与摄像机视频数据的对应关系; MCU在处理这些数据时, 根据目的会场的 屏幕数量、 音箱个数等, 把目的会场观看的图像和音频对应起来, 使声音在其 图像所显示的屏幕附近的音箱播放。
当某个多屏会场只有一个或者几个摄像机所拍摄的与会者图像被远端会场 某个或者某几个屏幕显示出来, 而该会场其他摄像机所拍摄区域中的与会者也 在讲话时(比如已关闭声控切换或者该与会者的声音不足以产生图像切换), 控 制该与会者的声音在显示相邻与会者图像的屏幕所对应的放音设备中播出。 其 中, 相邻与会者是同该与会者相邻的与会者。 具体的, MCU可以将该与会者的 声音混音到相邻与会者对应的声道中, 这样, 就可以将该与会者和相邻与会者 的声音同时在显示相邻与会者图像的屏幕所对应的放音设备中播出如图 9所示。
在四屏的 B会场的四个屏幕上分别显示摄像机 F2拍摄的与会者图像、摄像 机 F3拍摄的与会者图像、 摄像机 G2拍摄的与会者图像、 摄像机 C2拍摄的与 会者图像。 假定四屏的 F会场的摄像机排序为 Fl、 F2、 F3和 F4, 如果摄像机 F1拍摄的与会者在讲话,则 MCU控制 F1拍摄的与会者和 F2拍摄的与会者(即 与 F1拍摄的与会者相邻的与会者) 的声音进行混音, 并从显示摄像机 F2拍摄 的与会者图像的屏幕所对应的放音设备中播出,这样, B会场的与会者通过该放 音设备听到了这两个与会者的声音, 就能确定这两个与会者相邻; 如果摄像机 F4拍摄的与会者在讲话,则 MCU控制 F3拍摄的与会者和 F4拍摄的与会者(即 与 F4拍摄的与会者相邻的与会者) 的声音进行混音, 并从显示摄像机 F3拍摄 的与会者图像的屏幕所对应的放音设备中播出, B会场的与会者通过该放音设备 听到了这两个与会者的声音, 就能确定这两个与会者相邻。 这样, B会场的与会 者通过放音设备放出的声音就能确定声源的物理位置关系。
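The choice of playback device for a speaking participant whose image is not displayed can be sketched as a lookup of the nearest displayed neighbour. The function name, the list of cameras in physical order, and the order in which the two neighbours are tried are assumptions made for this illustration.

```python
def playback_screen_for_hidden_speaker(speaker_camera, site_cameras,
                                       screen_of_camera):
    """Return the remote screen whose loudspeaker should carry the audio of a
    speaker who is not displayed, or None if no adjacent camera is displayed.

    site_cameras: cameras of the speaker's site in physical order.
    screen_of_camera: dict camera -> remote screen currently showing it.
    """
    index = site_cameras.index(speaker_camera)
    # Try the left neighbour first, then the right one (arbitrary choice).
    for neighbour_index in (index - 1, index + 1):
        if 0 <= neighbour_index < len(site_cameras):
            neighbour = site_cameras[neighbour_index]
            if neighbour in screen_of_camera:
                return screen_of_camera[neighbour]
    return None
```

In the example above, a speaker captured by F1 is mixed into the channel of the screen showing F2, and a speaker captured by F4 into the channel of the screen showing F3.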
进一步, 如果摄像机 F1拍摄的与会者的声音变大, 则需要将该摄像机 F1 拍摄的与会者图像显示出来, 其声音也跟随显示该图像的屏幕所对应的放音设 备中播放, 比如摄像机 F1拍摄的与会者图像被切换到屏幕 4显示, 该图像的声 音应该从屏幕 4所对应的放音设备中播出。
进一步, 比如摄像机 F1拍摄的与会者图像被切换到屏幕 4显示, 为了不使 该图像的声音突然从屏幕 1所对应的放音设备跳跃到屏幕 4所对应的放音设备, 可以采用声音过渡的方法, 比如先使该图像的声音在屏幕 1所对应的放音设备 播放时衰减 3db, 在屏幕 4所对应的放音设备播放时也衰减 3db, 这样与会者听 到的该图像的声音大小和实际声音大小相同, 再逐步把屏幕 1所对应的放音设 备的声音衰减下去, 屏幕 4所对应的放音设备的声音逐步增大, 声音就过渡的 到了屏幕 4所对应的放音设备中。 其中, 过渡过程中用于调节的衰减值可根据 两个屏幕之间的位置相对关系决定。
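One way to realise such a transition is an equal-power cross-fade, in which both loudspeakers are about 3 dB down at the midpoint so that the perceived loudness stays roughly constant. The text above does not specify the fade curve, so the cosine and sine law used here is an assumption, as are the function names.

```python
import math


def crossfade_gains(progress):
    """Linear gains (old_device_gain, new_device_gain) for progress in [0, 1].

    progress = 0.5 is the point where both devices play about 3 dB down;
    progress = 1.0 means the sound has fully moved to the new device.
    """
    progress = min(max(progress, 0.0), 1.0)
    angle = progress * math.pi / 2
    return math.cos(angle), math.sin(angle)


def to_db(gain):
    """Convert a linear amplitude gain to decibels."""
    return float("-inf") if gain <= 0 else 20 * math.log10(gain)
```

Following the description above, a transition from the loudspeaker of screen 1 to that of screen 4 would start at `progress = 0.5` and ramp up to `1.0`; how fast it ramps could depend on the distance between the two screens.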
为了保证当前声音最大的与会者在各会场中具有相同屏幕编号的屏幕中显 示,则 MCU需要控制具有相同屏幕个数的各会场同一编号的屏幕具有相同的视 频源, 具体的, 可以有如下几种方式:
第一种方式: 在启动声控切换开始时, 在会场中某一与会者图像所对应的 各会场中的屏幕上配置相同的视频源。 比如, 三个三屏会场, 会场 1、 会场 2和 会场 3 ,会场 1中区域 1的与会者图像在各会场 3号屏幕上显示时能达到眼对眼 的效果, 所以各会场 3号屏幕配置相同的视频源。 同理, 各会场的 2号屏幕也 配置相同的视频源, 各会场的 1号屏幕也配置相同的视频源, 这样后续在声控 切换时, MCU针对各会场选择的待切换的图像都是相同的, 所以每次声控切换 时都保证了声音最大的与会者的图像能够切换到各个会场中同一编号的屏幕上 显示。 即在各会场具有相同屏幕数目时, 则为各会场中相同屏幕号的屏幕配置 相同的视频源。
第二种方式: 获取当前声音最大的与会者的图像, 判断会场中第二特定屏 幕是否能显示所述声音最大的与会者的图像, 如果是, 控制所述第二特定屏幕 显示所述声音最大的与会者的图像; 如果否, 按照所述会场中其他屏幕到所述 第二特定屏幕的物理距离由近到远的顺序, 依次判断其他屏幕是否能显示所述 声音最大的与会者的图像, 直到找到能显示所述声音最大的与会者的图像的屏 幕为止, 控制找到的屏幕显示所述当前声音最大的与会者的图像, 其中, 所述 第二特定屏幕是能和声音最大的与会者达到眼对眼效果的屏幕。 其中, 对第二 特定屏幕的举例说明请参见第一个实施例的相应描述, 在此不再赘述。
其中, 该方式中的会场是指视频会议中的任意一个会场, 对任意一个会场 都采用上述方式进行处理, 就能保证各会场同一编号的屏幕具有相同的视频源。 如果采用这种方式, 则可以是在启动声控切换开始时, 先按照上述第二种方式, 将声音最大的与会者图像切换到相应的屏幕上显示, 保证具有相同屏幕个数的 各会场的同一编号的屏幕具有相同的视频源之后, 再按照图 2B、 图 2C、 图 2D、 图 3、 图 4、 图 5、 图 7所示实施例所述的方案进行切换。
其中, 判断会场中第二特定屏幕是否能显示所述声音最大的与会者的图像 具体可以是: 判断会场中第二特定屏幕当前是否正在显示会议主席图像, 如果 是, 则表示第二特定屏幕不能显示当前声音最大的与会者的图像; 判断会场中 第二特定屏幕当前是否正在显示多画面图像, 如果是, 则表示第二特定屏幕不 能显示当前声音最大的与会者的图像; 判断会场中第二特定屏幕当前是否正在 显示最近发言者列表中的与会者, 如果是, 则表示第二特定屏幕不能显示当前 声音最大的与会者的图像; 当会场中第二特定屏幕当前显示的图像既不是多画 面图像, 也不是会议主席图像, 也不是最近发言者列表中的与会者图像时, 则 可以在该第二特定屏幕上显示该声音最大的与会者的图像。
其中, 按照所述会场中其他屏幕到所述第二特定屏幕的物理距离由近到远 的顺序, 依次判断其他屏幕是否能显示所述声音最大的与会者的图像具体可以 是: 按照所述会场中其他屏幕到所述第二特定屏幕的物理距离由近到远的顺序, 依次判断其他屏幕当前是否正在显示会议主席图像, 如果是, 则表示该屏幕不 能显示当前声音最大的与会者的图像; 或者, 依次判断其他屏幕当前是否正在 显示多画面图像, 如果是, 则表示该屏幕不能显示当前声音最大的与会者的图 像; 或者, 依次判断其他屏幕当前是否正在显示最近发言者列表中的图像, 如 果是, 则表示该屏幕不能显示当前声音最大的与会者的图像; 只有所判断的屏 幕当前显示的图像既不是多画面图像, 也不是会议主席图像, 也不是最近发言 者列表中的图像时, 则可以在该屏幕上显示该声音最大的与会者的图像。
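The second approach (try the second specific screen first, then the other screens from nearest to farthest) can be sketched as follows. Approximating physical distance by the difference in screen numbers, breaking ties toward the lower screen number, and the string labels `"chair"` and `"multi_picture"` are all assumptions of this sketch.

```python
def can_show_loudest(content, recent_speakers):
    """A screen is unavailable if it shows the chair, a multi-picture image,
    or a participant from the recent-speaker list."""
    if content in ("chair", "multi_picture"):
        return False
    return content not in recent_speakers


def screen_for_loudest(contents, second_specific, recent_speakers):
    """contents: dict screen -> what the screen currently shows.
    Returns the chosen screen, or None if no screen is available."""
    by_distance = sorted(contents, key=lambda s: (abs(s - second_specific), s))
    for screen in by_distance:
        if can_show_loudest(contents[screen], recent_speakers):
            return screen
    return None
```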
第三种方式: 如果会议存在主席, 则先按照主席会场中各屏幕显示的与会 者图像中与会者声音的大小, 采用图 3、 4、 5、 7所示实施例中选择需要切换图 像的屏幕的方案, 选择一个屏幕, 将主席会场中该屏幕的图像切换为该声音最 大的与会者图像; 然后, 根据所选屏幕在主席会场中的位置及其他会场中的屏 幕在相应会场中的位置, 控制声音最大的与会者图像切换到其他会场中的相应 屏幕显示; 其中, 所述其他会场中的相应屏幕在其他会场中屏幕组中的物理位 置与所选屏幕在主席会场中屏幕组的物理位置相同; 或者, 所述其他会场中的 相应屏幕与所选屏幕具有相同的编号。 当会议中不存在主席时, 则可以先按照 一个会场中各屏幕显示的与会者图像中与会者声音的大小, 采用图 3、 4、 5、 7 所示实施例中选择需要切换图像的屏幕的方案, 选择一个屏幕, 控制该屏幕的 图像切换为该声音最大的与会者图像, 然后, 按照与上面相同的方式, 控制声 音最大的与会者图像切换到其他会场中的相应屏幕上显示, 这样可以保证具有 相同屏幕个数的各会场的同一编号的屏幕具有相同的视频源。
第四种方式: 按照各屏幕在会场中的排序, 将当前声音最大的与会者切换 到相应的屏幕上, 比如有三个三屏会场, 在启动声控切换后, 当声音最大的与 会者的声音满足切换条件时, 将该声音最大的与会者图像切换到这三个会场中 的左屏幕上显示; 各与会者的声音在不断变化, 此时又有声音最大的与会者的 声音满足切换条件, 则将该声音最大的与会者的图像切换到这三个会场中的中 屏上显示; 再有声音最大的与会者的声音满足切换条件时, 则将该声音最大的 与会者的图像切换到这三个会场中的右屏上显示, 这样可以满足三个三屏会场 中具有同一编号的屏幕具有相同的视频源。
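The fourth approach amounts to a rotation over the screen positions that is shared by all sites with the same number of screens. A minimal sketch is shown below; wrapping back to the first screen after the last one is an assumption, since the text above describes only the first three switches.

```python
from itertools import cycle


def rotating_targets(screen_count):
    """Yield target screen numbers 1, 2, ..., screen_count, then start over.

    Every site with `screen_count` screens consumes the same sequence, so the
    same-numbered screens end up with the same video source.
    """
    return cycle(range(1, screen_count + 1))
```

For three-screen sites, successive calls to `next()` on the returned iterator give the left, middle, and right screens in turn.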
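Optionally, so that the image of the loudest participant is shown in the multi-picture image at the same time as it is displayed full-screen on one screen, the method may further include: the MCU replaces one picture of the multi-picture image with the image of the loudest participant, so that this image appears in the multi-picture image. In this way, while one screen at a site displays the loudest participant's image full-screen, the same image is also shown in the multi-picture image. Specifically, assume the first site is a three-screen site in which screen 1 displays the image of the participant captured by camera F1, screen 2 displays the image of the participant captured by camera C2, and screen 3 displays the multi-picture image. The participant captured by camera C2 is currently the loudest; the MCU combines that participant's image with several other images into a multi-picture image and controls screen 3 to display it, as shown in FIG. 10.
Referring to FIG. 11, an embodiment of the present invention provides a network-side media processing device, which includes:
a participant selection unit 100, configured to determine, in descending order of participant volume in the current conference and starting from the loudest participant, a predetermined number of participants to be displayed in turn;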
屏幕选择单元 300 ,用于确定第一会场中预定个数的当前显示的与会者对应 的屏幕作为需要切换图像的屏幕。
第一控制切换单元 400 ,用于控制所述需要切换图像的屏幕所显示的图像切 换为所述预定个数的待显示与会者的图像。
该设备还包括:
排序单元 200,用于按照排序条件对第一会场的屏幕当前显示的与会者进行 排序, 得到所述第一会场的屏幕当前显示的与会者的排序结果, 所述排序条件 为如下条件之一: 当前显示的与会者的声音大小、 发言时间点远近、 发言时长、 第一会场的屏幕当前显示的与会者的发言次数和第一会场的屏幕当前显示的与 会者所对应的屏幕是否为主屏。 其中, 第一会场的屏幕当前显示的与会者的具 体排序方式请参见方法实施例的相应描述, 在此不再赘述。
屏幕选择单元 300具体用于根据第一会场的屏幕当前显示的与会者的排序 结果, 确定第一会场中预定个数的当前显示的与会者对应的屏幕作为需要切换 图像的屏幕。
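The predetermined number may be one. Referring to FIG. 12, the screen selection unit 300 includes: a determining subunit 3001, configured to determine whether the participants displayed on the switchable screens of the first site belong to the recent-speaker list; a first screen selection subunit 3002, configured to, when some of the participants displayed on the switchable screens of the first site belong to the recent-speaker list, select the image of the quietest currently displayed participant from among those not in the recent-speaker list, and take the screen showing the selected image as the screen whose image needs to be switched; and a second screen selection subunit 3003, configured to, when all participants displayed on the switchable screens of the first site are in the recent-speaker list, select, according to the ranking of the participants in the recent-speaker list, the screen showing the lower-ranked currently displayed participant as the screen whose image needs to be switched. For the ranking of the participants in the recent-speaker list, see the corresponding description in the method embodiments; it is not repeated here.
Alternatively, the predetermined number is one. Referring to FIG. 13, the screen selection unit 300 includes: a first selection subunit 3004, configured to select, according to the ranking of the participants currently displayed on the screens of the first site, the screen showing the last-ranked currently displayed participant; a specific-screen determining subunit 3005, configured to determine whether the screen showing the last-ranked currently displayed participant is the first specific screen; a second selection subunit 3006, configured to, when the determination result of the specific-screen determining subunit 3005 is yes, select the screen showing the currently displayed participant ranked immediately before the last-ranked one; and a determining subunit 3007, configured to, when the determination result of the specific-screen determining subunit 3005 is no, determine the screen selected by the first selection subunit 3004 as the screen whose image needs to be switched, and, when the determination result is yes, determine the screen selected by the second selection subunit 3006 as the screen whose image needs to be switched. For the definitions and examples of the first specific screen and the second specific screen, see the corresponding description in the method embodiments; they are not repeated here.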
当所述预定个数为多个时, 第一控制切换单元 400具体用于当所述预定个 数的待显示的与会者的图像中存在至少两个待显示的与会者的图像来自于第二 会场时, 控制所述需要切换图像的屏幕中至少两个屏幕所显示的图像切换为所 述至少两个待显示与会者的图像, 使得在所述第一会场中显示的所述至少两个 待显示的与会者的图像的方向顺序与所述至少两个待显示的与会者在所述第二 会场中的物理位置的顺序相同。
为了在显示声音最大的与会者图像的同时显示该与会者的全景图像, 该装 置还包括: 控制叠加单元 500, 用于控制当前声音最大的待显示的与会者所在会 场的全景图像经过图像处理后, 叠加到当前声音最大的待显示的与会者的图像 的部分区域上显示, 具体的, 可以是控制当前声音最大的待显示的与会者所在 会场的全景图像经过缩小处理后, 叠加到当前声音最大的待显示的与会者的图 像的部分区域上显示。
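To ensure that the image of the loudest participant is switched to screens with the same screen number at every site, the apparatus further includes a video source control unit 600, configured to control the screens with the same number at sites having the same number of screens to have the same video source. Referring to FIG. 14, the video source control unit 600 may specifically include: a first determining subunit 6001, configured to determine whether the second specific screen at the first site can display the image of the current loudest participant to be displayed; a second determining subunit 6002, configured to, when the determination result of the first determining subunit 6001 is no, determine the screen at the first site that is physically closest to the second specific screen and can display the image of the loudest participant to be displayed; and a display control subunit 6003, configured to, when the determination result of the first determining subunit is yes, control the second specific screen to display the image of the loudest participant, and, when the determination result is no, control the screen found by the second determining subunit to display the image of the loudest participant to be displayed. For the definition and examples of the second specific screen, see the corresponding description in the method embodiments; they are not repeated here.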
为了保证将声音最大的与会者图像切换到各会场相同屏幕标号的屏幕上, 也可以先将声音最大的与会者图像切换到一个会场的相应屏幕上显示, 然后对 其他会场采用相同的切换方式进行切换, 比如, 先将声音最大的待显示的与会 者图像切换到第一会场的相应屏幕上显示, 此时, 该装置还包括: 第二控制切 换单元 700 ,还用于控制除第一会场以外的其他会场的相应屏幕显示的图像切换 为预定个数的待显示的与会者的图像; 其中, 所述其他会场的相应屏幕与所选 择的第一会场中需要切换图像的屏幕具有相同的编号。
为了在显示声音最大的与会者图像的同时在多画面中显示该声音最大的与 会者, 该装置还包括: 多画面图像控制显示单元 800, 用于将声音最大的待显示 的与会者的图像与其他多个图像拼接成多画面图像, 控制所述第一会场的其他 屏幕显示所述多画面图像, 所述其他屏幕为所述第一会场中除所选择的需要切 换图像的屏幕以外的一个或者多个屏幕。
本发明实施例根据第一会场中屏幕显示的与会者的声音大小, 从第一会场 的屏幕中选择预定个数的屏幕作为需要切换图像的屏幕, 然后将需要切换图像 的屏幕切换为预定个数的与会者图像, 避免了像现有技术那样某一摄像机拍摄 的图像只能在远方会场的特定屏幕(即该图像所缺省对应的屏幕)上显示, 这 种按屏幕声控切换, 能够使会场中的用户看到参与讨论的与会者图像, 提高与 会者的体验。
以上对本发明所提供的一种多屏视频会议中对与会者图像显示进行调整的 方法及装置进行了详细介绍, 对于本领域的一般技术人员, 依据本发明实施例 的思想, 在具体实施方式及应用范围上均会有改变之处, 综上所述, 本说明书 内容不应理解为对本发明的限制。

Claims

1、 一种多屏视频会议中对与会者图像显示进行调整的方法, 其特征在于, 包括:
按照当前会议中与会者音量从大到小的顺序, 从音量最大的与会者开始, 依次确定预定个数的待显示的与会者;
确定第一会场中预定个数的当前显示的与会者对应的屏幕作为需要切换图 像的屏幕;
控制所述需要切换图像的屏幕所显示的图像切换为所述预定个数的待显示 与会者的图像。
2、 根据权利要求 1所述的方法, 其特征在于, 所述确定第一会场中预定个 数的当前显示的与会者对应的屏幕作为需要切换图像的屏幕, 具体为:
根据第一会场的屏幕当前显示的与会者的排序结果, 确定第一会场中预定 个数的当前显示的与会者对应的屏幕作为需要切换图像的屏幕。
3、 根据权利要求 2所述的方法, 其特征在于, 所述第一会场的屏幕当前显 示的与会者的排序结果是按照如下排序条件进行的, 所述排序条件包括如下条 件之一: 当前显示的与会者的声音大小、 当前显示的与会者的发言时间点远近、 当前显示的与会者的发言时长、 第一会场的屏幕当前显示的与会者的发言次数 和第一会场的屏幕当前显示的与会者所对应的屏幕是否为主屏。
4、 根据权利要求 3所述方法, 其特征在于, 其中, 所述排序结果是按照如 下方式之一进行排序:
the currently displayed participants ordered by volume from loudest to quietest;
当前显示的与会者的发言时间点按照从近到远的顺序;
当前显示的与会者的发言时长按照从长到短的顺序;
第一会场的屏幕当前显示的与会者的发言次数按照从多到少的顺序。
5、 根据权利要求 4所述的方法, 其特征在于,
所述预定个数为 1个;
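determining, according to the ranking of the participants currently displayed on the screens of the first site, the screens corresponding to the predetermined number of currently displayed participants at the first site as the screens whose images need to be switched includes:
determining, according to the ranking of the participants currently displayed on the screens of the first site, whether the screen showing the last-ranked currently displayed participant is a first specific screen; if not, determining that the screen whose image needs to be switched is the screen showing the last-ranked currently displayed participant; if so, determining that the screen whose image needs to be switched is the screen showing the currently displayed participant ranked immediately before the last-ranked one; wherein the first specific screen and a second specific screen are symmetric about the screen centre line, the second specific screen is the screen of the first site that achieves an eye-to-eye effect with the image of the loudest speaker, and the screen centre line is the geometric centre line of the screen group formed by connecting the screens of the first site in sequence.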
6、 根据权利要求 4所述的方法, 其特征在于,
控制所述需要切换图像的屏幕所显示的图像切换为所述预定个数的待显示 的与会者的图像包括:
当所述预定个数的待显示的与会者的图像中存在至少两个待显示的与会者 的图像来自于第二会场时, 控制所述需要切换图像的屏幕中至少两个屏幕所显 示的图像切换为所述至少两个待显示的与会者的图像, 使得在所述第一会场中 显示的所述至少两个待显示的与会者的图像的方向顺序与所述至少两个待显示 的与会者在所述第二会场中的物理位置的顺序相同。
7、 根据权利要求 4所述的方法, 其特征在于, 该方法还包括:
控制当前声音最大的待显示的与会者所在会场的全景图像经过图像处理 后, 叠加到当前声音最大的待显示的与会者的图像的部分区域上显示。
8、 根据权利要求 4所述的方法, 其特征在于, 在按照当前会议中与会者音 量从大到小的顺序, 从音量最大的与会者开始, 依次确定预定个数的待显示的 与会者之前, 该方法还包括:
控制具有相同屏幕个数的各会场同一编号的屏幕具有相同的视频源。
9、 根据权利要求 8所述的方法, 其特征在于,
所述控制具有相同屏幕个数的各会场同一编号的屏幕具有相同的视频源包 括: 获取当前声音最大的待显示的与会者的图像, 判断所述第一会场中第二特 定屏幕是否能显示所述声音最大的待显示的与会者的图像, 如果是, 控制所述 第二特定屏幕显示所述声音最大的待显示的与会者的图像; 如果否, 则确定所 述第一会场中离所述第二特定屏幕的物理距离最近, 且能显示所述声音最大的 待显示的与会者的图像的屏幕, 控制所述确定的屏幕显示所述当前声音最大的 待显示的与会者的图像, 其中, 所述第二特定屏幕是能和声音最大的发言者图 像达到眼对眼效果的第一会场的屏幕。
10、 根据权利要求 4或 8所述的方法, 其特征在于,
在控制所述需要切换图像的屏幕所显示的图像切换为所述预定个数的待显 示与会者的图像之后, 该方法还包括:
控制除所述第一会场以外的其他会场的相应屏幕显示的图像切换为预定个 数的待显示的与会者的图像; 其中, 所述其他会场的相应屏幕与所选择的第一 会场中需要切换图像的屏幕具有相同的编号。
11、 一种网络侧媒体处理设备, 其特征在于, 包括:
与会者选择单元, 用于按照当前会议中与会者音量从大到小的顺序, 从音 量最大的与会者开始, 依次确定预定个数的待显示的与会者;
屏幕选择单元, 用于确定第一会场中预定个数的当前显示的与会者对应的 屏幕作为需要切换图像的屏幕;
第一控制切换单元, 用于控制所述需要切换图像的屏幕所显示的图像切换 为所述预定个数的待显示与会者的图像。
12、 根据权利要求 11所述的设备, 其特征在于, 所述屏幕选择单元具体用 于: 根据第一会场的屏幕当前显示的与会者的排序结果, 确定第一会场中预定 个数的当前显示的与会者对应的屏幕作为需要切换图像的屏幕。
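13. The device according to claim 12, wherein the device further includes:
a ranking unit, configured to rank the participants currently displayed on the screens of the first site according to a ranking condition, to obtain the ranking of the participants currently displayed on the screens of the first site, the ranking condition being one of the following: the volume of the currently displayed participants, the recency of their speech, their speaking duration, the number of times the participants currently displayed on the screens of the first site have spoken, and whether the screen corresponding to a participant currently displayed on the screens of the first site is the main screen.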
14、 根据权利要求 13所述的设备, 其特征在于, 所述排序结果是按照如下 方式之一进行排序:
the currently displayed participants ordered by volume from loudest to quietest;
当前显示的与会者的发言时间点按照从近到远的顺序;
当前显示的与会者的发言时长按照从长到短的顺序;
第一会场的屏幕当前显示的与会者的发言次数按照从多到少的顺序。
15、 根据权利要求 14所述的设备, 其特征在于,
所述预定个数为 1个;
所述屏幕选择单元包括:
第一选择子单元, 用于根据第一会场的屏幕当前显示的与会者的排序结果, 选择排在最后的当前显示的与会者所在的屏幕;
特定屏幕判断子单元, 用于判断所述排在最后的当前显示的与会者所在的 屏幕是否是第一特定屏幕, 其中, 所述第一特定屏幕与第二特定屏幕关于屏幕 中心线对称, 所述第二特定屏幕是能和声音最大的发言者图像达到眼对眼效果 的第一会场的屏幕, 屏幕中心线为所述第一会场中各屏幕依次连接所形成的屏 幕组的几何中心线;
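a second selection subunit, configured to, when the determination result of the specific-screen determining subunit is yes, select the screen showing the currently displayed participant ranked immediately before the last-ranked currently displayed participant;
a determining subunit, configured to, when the determination result of the specific-screen determining subunit is no, determine the screen selected by the first selection subunit as the screen whose image needs to be switched, and, when the determination result of the specific-screen determining subunit is yes, determine the screen selected by the second selection subunit as the screen whose image needs to be switched.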
16、 根据权利要求 14所述的设备, 其特征在于,
所述第一控制切换单元具体用于当所述预定个数的待显示的与会者的图像 中存在至少两个待显示的与会者的图像来自于第二会场时, 控制所述需要切换 图像的屏幕中至少两个屏幕所显示的图像切换为所述至少两个待显示与会者的 图像, 使得在所述第一会场中显示的所述至少两个待显示的与会者的图像的方 向顺序与所述至少两个待显示的与会者在所述第二会场中的物理位置的顺序相 同。
17、 根据权利要求 14所述的设备, 其特征在于, 还包括:
控制叠加单元, 用于控制当前声音最大的待显示的与会者所在会场的全景 图像经过图像处理后, 叠加到当前声音最大的待显示的与会者的图像的部分区 域上显示。
18、 根据权利要求 14所述的设备, 其特征在于, 还包括:
视频源控制单元, 用于控制具有相同屏幕个数的各会场同一编号的屏幕具 有相同的视频源。
19、 根据权利要求 18所述的设备, 其特征在于,
视频源控制单元包括:
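a first determining subunit, configured to determine whether a second specific screen at the first site can display the image of the current loudest participant to be displayed, the second specific screen being the screen of the first site that achieves an eye-to-eye effect with the image of the loudest speaker to be displayed;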
第二确定子单元, 用于在所述第一确定子单元的判断结果为否时, 则确定 所述第一会场中离所述第二特定屏幕的物理距离最近, 且能显示所述声音最大 的待显示的与会者的图像的屏幕;
控制显示子单元, 用于在所述第一确定子单元的判断结果为是时, 控制所 述第二特定屏幕显示所述声音最大的待显示的与会者的图像; 在所述第一确定 子单元的判断结果为否时, 控制所述第二确定子单元所确定的屏幕显示所述声 音最大的待显示的与会者的图像。
20、 根据权利要求 14或 18所述的设备, 其特征在于, 还包括:
第二控制切换单元, 用于控制除所述第一会场以外的其他会场的相应屏幕 显示的图像切换为预定个数的待显示的与会者的图像; 其中, 所述其他会场的 相应屏幕与所选择的第一会场中需要切换图像的屏幕具有相同的编号。
21、 根据权利要求 11-19任一权利要求所述的设备, 其特征在于, 所述网络 侧媒体处理设备为: 多点控制单元。
PCT/CN2011/079523 2010-09-09 2011-09-09 多屏视频会议中对与会者图像显示进行调整的方法及装置 WO2012031566A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201010279924.1 2010-09-09
CN201010279924.1A CN102404542B (zh) 2010-09-09 2010-09-09 多屏视频会议中对与会者图像显示进行调整的方法及装置

Publications (1)

Publication Number Publication Date
WO2012031566A1 true WO2012031566A1 (zh) 2012-03-15

Family

ID=45810135

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2011/079523 WO2012031566A1 (zh) 2010-09-09 2011-09-09 多屏视频会议中对与会者图像显示进行调整的方法及装置

Country Status (2)

Country Link
CN (1) CN102404542B (zh)
WO (1) WO2012031566A1 (zh)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109617859A (zh) * 2018-11-13 2019-04-12 视联动力信息技术股份有限公司 一种分屏模式的实现方法和装置

Families Citing this family (16)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8830295B2 (en) * 2012-05-23 2014-09-09 Google Inc. Multimedia conference endpoint transfer system
CN102833520A (zh) * 2012-08-16 2012-12-19 华为技术有限公司 一种视频会议信号处理的方法、视频会议服务器及系统
NO336217B1 (no) * 2012-12-21 2015-06-15 Pexip AS Fremgangsmåte, datamaskinprogram og system for håndtering av mediestrømmer i videokonferanser.
CN103281492A (zh) * 2013-05-23 2013-09-04 深圳锐取信息技术股份有限公司 视频画面切换方法、系统、录播服务器及视频录播系统
CN103337242B (zh) * 2013-05-29 2016-04-13 华为技术有限公司 一种语音控制方法和控制设备
KR101685466B1 (ko) * 2014-08-28 2016-12-12 삼성에스디에스 주식회사 다자간 영상 회의 서비스의 참여자 확장 방법
CN104934037B (zh) * 2015-06-02 2019-06-25 阔地教育科技有限公司 一种直录播互动系统中的音频处理方法及装置
CN107690056A (zh) * 2016-08-05 2018-02-13 鸿富锦精密工业(深圳)有限公司 视频会议控制系统及方法
CN107396036A (zh) * 2017-09-07 2017-11-24 北京小米移动软件有限公司 视频会议中视频处理方法及终端
CN107682752B (zh) * 2017-10-12 2020-07-28 广州视源电子科技股份有限公司 视频画面显示的方法、装置、系统、终端设备及存储介质
CN108712577B (zh) * 2018-08-28 2021-03-12 维沃移动通信有限公司 一种通话模式切换方法及终端设备
JP7230394B2 (ja) * 2018-09-25 2023-03-01 京セラドキュメントソリューションズ株式会社 テレビ会議装置及びテレビ会議プログラム
CN109547732A (zh) * 2018-12-19 2019-03-29 深圳银澎云计算有限公司 一种音视频处理方法、装置、服务器及视频会议系统
CN109861986B (zh) * 2019-01-03 2021-03-23 视联动力信息技术股份有限公司 一种图像调度方法及装置
CN109976689B (zh) * 2019-03-05 2022-03-04 河南泰德信息技术有限公司 一种利用智能手机对分布式拼接处理器进行快速配置的方法与装置
CN110769189B (zh) * 2019-10-15 2021-02-12 广州国音智能科技有限公司 视频会议切换方法、装置及可读存储介质

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2001016558A (ja) * 1999-06-29 2001-01-19 Canon Inc 通信システム及び方法並びに端末装置
CN101442654A (zh) * 2008-12-26 2009-05-27 深圳华为通信技术有限公司 视频通信中视频对象切换的方法、装置及系统
CN101583011A (zh) * 2009-05-27 2009-11-18 深圳华为通信技术有限公司 视频会议控制方法、系统、视频会议网络设备和会场
US20090322854A1 (en) * 2008-06-25 2009-12-31 Global Ip Solutions, Inc. Video selector

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20060055771A1 (en) * 2004-08-24 2006-03-16 Kies Jonathan K System and method for optimizing audio and video data transmission in a wireless system
CN101753803B (zh) * 2008-12-18 2011-08-10 华为技术有限公司 画面显示的控制方法、系统和多媒体资源功能处理器

Also Published As

Publication number Publication date
CN102404542B (zh) 2014-06-04
CN102404542A (zh) 2012-04-04

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 11823089

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 11823089

Country of ref document: EP

Kind code of ref document: A1