CN113473066A - Video conference picture adjusting method - Google Patents

Info

Publication number
CN113473066A
Authority
CN
China
Prior art keywords
video conference
speaker
information
camera
adjusting
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110499454.8A
Other languages
Chinese (zh)
Inventor
孔尧
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Mingwork Information Technology Co ltd
Original Assignee
Shanghai Mingwork Information Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2021-05-10
Filing date
2021-05-10
Publication date
2021-10-01
Application filed by Shanghai Mingwork Information Technology Co ltd
Priority to CN202110499454.8A
Publication of CN113473066A
Legal status: Pending

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N7/00 Television systems
    • H04N7/14 Systems for two-way working
    • H04N7/15 Conference systems

Abstract

The invention relates to a video conference picture adjusting method comprising the following steps: S1, initiating a video conference call; S2, acquiring the complexity and on-site noise of the speakers and the conference room in the video conference, and adjusting the pickup parameters of the sound pickup in the on-site conference room; S3, acquiring the face information, voice and voiceprint information of the video conference speaker and the on-site seat direction information, so as to adjust the angle of the camera in the video conference room and to focus and aim it at the speaker; S4, when the voice, voiceprint information or sound position information of the speaker changes, monitoring whether the speaker's mouth is moving; if so, keeping the camera unchanged; if not, returning to step S3. The invention overcomes the shortcomings of the prior art: a video conference can be organized quickly and a clear picture of the speaker can be locked quickly, which improves the realism of the video conference and the user experience.

Description

Video conference picture adjusting method
Technical Field
The invention relates to the technical field of video conferences, in particular to a video conference picture adjusting method.
Background
A video conference (videoconference) is a communication method in which user terminals at two or more locations hold a conference over a transmission channel, using video technology and equipment to transmit sound and images in real time. It can also be used to transmit still images, documents, faxes and the like. Participants can express their views via the television while observing the other party's image, movements and expressions. They can also show real television images of objects, drawings and documents, or display text and pictures written on a blackboard or whiteboard, so that participants at different locations feel as if they were talking face to face. A video conference can therefore effectively replace an on-site meeting.
As technology has developed, one-to-one video calls can no longer meet the needs of multi-person conferences. In a multi-person conference the speaker cannot be captured quickly, and the picture displayed while the speaker is talking is too small, so participants on the other side cannot accurately perceive the speaker's tone and mood or accurately understand what is being said.
Disclosure of Invention
The technical problem to be solved by the invention is to overcome the shortcomings of the prior art by providing a video conference picture adjusting method that can quickly organize a video conference, quickly lock a clear picture of the speaker, and improve the realism and user experience of the video conference.
In order to solve the technical problems, the technical scheme provided by the invention is as follows: a video conference picture adjusting method comprises the following steps:
S1, initiating a video conference call;
S2, acquiring the complexity and on-site noise of the speakers and the conference room in the video conference, and adjusting the pickup parameters of the sound pickup in the on-site conference room;
S3, acquiring the face information, voice and voiceprint information of the video conference speaker and the on-site seat direction information, so as to adjust the angle of the camera in the video conference room and to focus and aim it at the speaker;
S4, when the voice, voiceprint information or sound position information of the speaker changes, monitoring whether the speaker's mouth is moving; if so, keeping the camera unchanged; if not, returning to step S3.
Further, in step S1, the conference initiator initiates the video conference call using the participant ID information in the video conference system.
Further, the participant ID information can be obtained through a short-message link sent in advance, through third-party software, or from face and ID information recorded in advance when the participant enters the conference room.
Further, when the video conference call is initiated in step S1 and a call is placed to all members of the participant group, the participants already present are determined from the face information, seat information and participant ID information acquired on site, and the participants not at the conference site are actively screened out and called.
Further, step S2 also includes identifying the speaker and the speaker's position from the sound source position and the voiceprint, and adjusting the camera angle to aim at the speaker.
Further, adjusting the angle and focal length of the camera in the video conference room in step S3 comprises: focusing and zooming the camera so that the speaker is shown at the largest size in the picture, and adjusting the focal length of the camera, based on the recognized still image of the speaker, so that the speaker's head, upper body, hands and arms can be recognized.
Further, in step S3, when the video conference system detects that a speaker's volume and speaking time are below set thresholds, the camera position is not adjusted and the camera stays on the previous speaker.
Further, when the video conference system detects that no one has spoken for a period of time, or that the speaking time is too short, or that sound is coming from too many positions, the camera is switched to panoramic mode so that the conference scene with all participants is captured.
Further, "no one has spoken for a period of time" specifically means a period of more than 60 seconds;
"the speaking time is too short" specifically means a speaking time of less than 5 seconds;
"sound is coming from too many positions" specifically means that more than 3 speaker voiceprints are detected.
Further, the invention provides a system that uses the video conference picture adjusting method described above.
The invention has the following advantages: participants can be gathered quickly using the participant information stored in the video conference system. During the conference, the speaker's position is locked quickly and the picture of the speaker is focused and enlarged, so the speaker's demeanor and expression can be seen clearly. This improves the realism of the video conference and the user experience, reduces the amount of manual operation required of participants, and preserves the integrity and continuity of the conference so that it proceeds smoothly.
Drawings
Fig. 1 is a schematic flowchart of a video conference picture adjustment method according to an embodiment of the present application.
Detailed Description
The present invention will be described in further detail with reference to examples.
As shown in Fig. 1, the video conference picture adjusting method of the invention is implemented as follows:
S1, initiating a video conference call;
S2, acquiring the complexity and on-site noise of the speakers and the conference room in the video conference, and adjusting the pickup parameters of the sound pickup in the on-site conference room;
S3, acquiring the face information, voice and voiceprint information of the video conference speaker and the on-site seat direction information, so as to adjust the angle of the camera in the video conference room and to focus and aim it at the speaker;
S4, when the voice, voiceprint information or sound position information of the speaker changes, monitoring whether the speaker's mouth is moving; if so, keeping the camera unchanged; if not, returning to step S3.
Scenario: when a visitor makes a reservation or enters through the access-control gate, the visitor's face and ID information are recorded (the account information used when reserving the meeting through DingTalk or another app, or the employee ID and face information, or participant ID and face information that the host has obtained by other means).
When the meeting starts, the face images captured by the video conference camera are automatically matched with the IDs, so that seat, face image and ID account are labeled consistently.
S1: When the meeting starts, the conference initiator first calls the members of the participant group through DingTalk. Participants already in the conference room do not need to be called: when a call to all members is initiated, the participants on site are identified from the face information, seat information and DingTalk ID information acquired on site, the participants not in the conference room are actively screened out, and a call is placed to those off-site group members, who then wait to join the conference.
S2: After the conference is successfully joined, the pickup parameters of the sound pickup in the on-site conference room are adjusted according to the complexity, noise and other conditions of the conference speakers and the conference room. This mitigates ambient noise and ensures faithful transmission of the conference audio. Each device is connected by a network cable that carries both power and data, which ensures low latency.
S3: If the initiator chose in the first step to call the participants, the conference room camera acquires the face information, voice and voiceprint information of the on-site participants together with the on-site seat information. Combining these with the DingTalk (or other app) ID information and face information, the system confirms that ID, face, sound direction and voiceprint agree and identifies who is speaking. The camera angle is then adjusted to focus and aim at the speaker (the position is determined by the microphone working together with the camera, and the speaker is located from the sound source position so that the camera angle can be adjusted toward the speaker).
S4: With the camera positioned as adjusted (that is, the speaker has been determined from the sound and voiceprint and the camera has been turned toward the speaker), the system confirms the speaker again using the speaker's mouth movement captured by the camera over a period of time, and then adjusts the camera focal length to focus on and enlarge the speaker (the speaker is shown at the largest size in the picture, and the focal length is adjusted, based on the recognized still image of the speaker, to the position that best captures the speaker's head, upper body, hand and arm movements). If the mouth of the speaker identified by the camera has not moved for a period of time (more than 5 s), or the sound pickup of the on-site conference equipment detects other sound information, the process re-enters S3.
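As an illustration only, the S4 check can be sketched in Python as follows. The function name `s4_decision`, its parameters and the returned labels are hypothetical and do not come from the application; only the 5-second mouth-inactivity limit is taken from the embodiment. The sketch follows one possible reading of the text, in which a change in sound is tolerated as long as the locked speaker's mouth is still moving.

```python
MOUTH_IDLE_LIMIT_S = 5.0  # "more than 5 s" in the embodiment


def s4_decision(sound_changed: bool, mouth_moving: bool, mouth_idle_s: float) -> str:
    """Return "hold" to keep the camera on the speaker, or "reidentify" to go back to S3.

    sound_changed: voice, voiceprint or sound position differs from the locked speaker.
    mouth_moving:  the camera currently sees the locked speaker's mouth moving.
    mouth_idle_s:  seconds since the locked speaker's mouth last moved.
    """
    if mouth_idle_s > MOUTH_IDLE_LIMIT_S:
        return "reidentify"  # the locked speaker appears to have stopped talking
    if sound_changed and not mouth_moving:
        return "reidentify"  # the new sound probably comes from someone else
    return "hold"


if __name__ == "__main__":
    print(s4_decision(sound_changed=True, mouth_moving=True, mouth_idle_s=0.0))
    print(s4_decision(sound_changed=True, mouth_moving=False, mouth_idle_s=1.0))
```

A real system would derive `mouth_moving` from face landmarks and `sound_changed` from the microphone array; both are treated here as already computed inputs.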
Scenario: when a video conference has been initiated and a multi-person discussion begins, the position of the sound source obtained by the intelligent conference equipment and the movable camera is used to identify and acquire the face information and the voiceprint information corresponding to the labeled face ID, and the camera is adjusted according to the speaking time to focus on the speaker's facial features.
Preferably, in step S1, the conference initiator initiates the video conference call using the participant ID information in the video conference system.
Preferably, when the conference initiator initiates the video conference call using the participant ID information in the video conference system, the participant ID information can be obtained through a short-message link sent in advance, through third-party software, or from face and ID information recorded in advance when the participant enters the conference room.
Preferably, when the video conference call is initiated in step S1 and a call is placed to all members of the participant group, the participants already present are determined from the face information, seat information and participant ID information acquired on site, and the participants not at the conference site are actively screened out and called.
Here, "participants already present" specifically means every person whose face information appears in the camera device before the conference start time set by the main participant.
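The screening of absent participants described above amounts to a set difference between the group roster and the IDs recognized on site. A minimal Python sketch follows; the function name `participants_to_call` and the ID strings are hypothetical, and face recognition itself is assumed to have already produced the list of on-site IDs.

```python
from typing import Iterable, List


def participants_to_call(group_ids: Iterable[str], onsite_ids: Iterable[str]) -> List[str]:
    """Return the group members not recognized on site, keeping the roster order."""
    onsite = set(onsite_ids)
    return [pid for pid in group_ids if pid not in onsite]


if __name__ == "__main__":
    group = ["u01", "u02", "u03", "u04"]   # hypothetical participant IDs
    recognized_on_site = ["u02", "u04"]    # IDs matched from faces seen before the start time
    print(participants_to_call(group, recognized_on_site))
```

Only the IDs returned by the function would receive a call; participants already seated in the conference room are skipped.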
Preferably, step S2 also includes identifying the speaker and the speaker's position from the sound source position and the voiceprint, and adjusting the camera angle to aim at the speaker.
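One way to picture this combination of voiceprint and sound source position is the Python sketch below. It assumes that each participant record already carries a seat direction measured from the camera, and that a separate component has produced a voiceprint match and a sound source azimuth. The names `Participant`, `locate_speaker` and the 15-degree tolerance are illustrative assumptions, not values from the application.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Participant:
    participant_id: str
    seat_azimuth_deg: float  # direction of the seat as seen from the camera


def angular_diff(a_deg: float, b_deg: float) -> float:
    """Smallest absolute difference between two angles, in degrees."""
    d = abs(a_deg - b_deg) % 360.0
    return min(d, 360.0 - d)


def locate_speaker(roster: List[Participant], voiceprint_id: str,
                   source_azimuth_deg: float,
                   tolerance_deg: float = 15.0) -> Optional[Participant]:
    """Return the participant whose ID matches the voiceprint and whose seat
    direction agrees with the sound source direction, or None if they disagree."""
    for p in roster:
        if (p.participant_id == voiceprint_id
                and angular_diff(p.seat_azimuth_deg, source_azimuth_deg) <= tolerance_deg):
            return p
    return None


if __name__ == "__main__":
    roster = [Participant("u01", 30.0), Participant("u02", 300.0)]
    speaker = locate_speaker(roster, "u02", 310.0)
    if speaker is not None:
        print("pan camera to", speaker.seat_azimuth_deg, "degrees")
```

When the voiceprint and the sound direction disagree, the function returns `None`, which a caller could treat as a reason to leave the camera where it is.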
Preferably, in step S3, the angle and focal length of the camera in the video conference room are adjusted as follows: the camera is focused and zoomed so that the speaker is shown at the largest size in the picture, and the focal length of the camera is adjusted, based on the recognized still image of the speaker, so that the speaker's head, upper body, hands and arms can be recognized.
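The application does not say how the focal length is chosen. As a rough sketch under stated assumptions, a zoom factor can be derived from the height of the detected face so that head and upper body fill most of the frame. The function `zoom_factor`, the body-to-face ratio of 3, the 80% target and the 10x zoom limit are all hypothetical values chosen for illustration.

```python
def zoom_factor(face_height_frac: float,
                body_to_face_ratio: float = 3.0,
                target_frac: float = 0.8,
                max_zoom: float = 10.0) -> float:
    """Zoom needed so that head plus upper body fills target_frac of the frame height.

    face_height_frac: detected face height as a fraction of the current frame height.
    body_to_face_ratio: assumed height of head plus upper body, in face heights.
    """
    if not 0.0 < face_height_frac <= 1.0:
        raise ValueError("face_height_frac must be in (0, 1]")
    wanted = target_frac / (face_height_frac * body_to_face_ratio)
    # Never zoom out below the wide shot and never exceed the lens limit.
    return min(max(wanted, 1.0), max_zoom)


if __name__ == "__main__":
    for frac in (0.05, 0.1, 0.5):
        print(frac, round(zoom_factor(frac), 2))
```

Clamping the result keeps the camera from zooming out past its wide shot for a nearby speaker or past its optical limit for a distant one.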
Preferably, in step S3, when the video conference system detects that a speaker's volume and speaking time are below set thresholds, the camera position is not adjusted and the camera stays on the previous speaker.
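This hold rule can be sketched in Python as a simple guard. The application leaves the thresholds unspecified, so the values of -30 dB and 2 seconds below are placeholders, and the function name `next_camera_target` is hypothetical. The sketch assumes that both conditions must hold (quiet and brief) before the new sound is ignored, which is one reading of the sentence above.

```python
def next_camera_target(current_target: str, candidate: str,
                       volume_db: float, speaking_s: float,
                       min_volume_db: float = -30.0,
                       min_speaking_s: float = 2.0) -> str:
    """Return the participant the camera should point at next."""
    too_quiet = volume_db < min_volume_db
    too_brief = speaking_s < min_speaking_s
    if too_quiet and too_brief:
        return current_target  # ignore the interjection, stay on the previous speaker
    return candidate


if __name__ == "__main__":
    print(next_camera_target("u01", "u02", volume_db=-40.0, speaking_s=1.0))
    print(next_camera_target("u01", "u02", volume_db=-20.0, speaking_s=6.0))
```

A guard of this kind keeps short, quiet interjections from causing the camera to jump between participants.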
Preferably, when the video conference system detects that no one has spoken for a period of time, or that the speaking time is too short, or that sound is coming from too many positions, the camera is switched to panoramic mode so that the conference scene with all participants is captured. Once sound and voiceprint information has been acquired continuously for a period of time, specifically more than 120 seconds, the speaker is locked again and the camera is switched back to speaker-focus mode.
Preferably, "no one has spoken for a period of time" specifically means a period of more than 60 seconds;
"the speaking time is too short" specifically means a speaking time of less than 5 seconds;
"sound is coming from too many positions" specifically means that more than 3 speaker voiceprints are detected.
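The panoramic-mode conditions and their numeric limits can be collected into a small Python sketch. The limits of 60 seconds, 5 seconds, 3 voiceprints and 120 seconds come from the text above; the function and constant names are hypothetical, and how the inputs are measured is left open.

```python
from typing import Optional

NO_SPEECH_LIMIT_S = 60.0     # no one has spoken for more than 60 s
SHORT_SPEECH_LIMIT_S = 5.0   # speaking time shorter than 5 s
MAX_VOICEPRINTS = 3          # more than 3 voiceprints detected
RELOCK_AFTER_S = 120.0       # relock after more than 120 s of sound and voiceprint data


def should_use_panorama(silence_s: float, last_speech_s: Optional[float],
                        voiceprint_count: int) -> bool:
    """Return True when any of the three panoramic-mode conditions is met.

    last_speech_s is None when no utterance has been measured yet.
    """
    if silence_s > NO_SPEECH_LIMIT_S:
        return True
    if last_speech_s is not None and last_speech_s < SHORT_SPEECH_LIMIT_S:
        return True
    return voiceprint_count > MAX_VOICEPRINTS


def should_relock_speaker(sustained_speech_s: float) -> bool:
    """Return True when the camera should leave panoramic mode and focus on a speaker."""
    return sustained_speech_s > RELOCK_AFTER_S


if __name__ == "__main__":
    print(should_use_panorama(silence_s=75.0, last_speech_s=None, voiceprint_count=0))
    print(should_relock_speaker(sustained_speech_s=130.0))
```

Keeping the limits as named constants makes it easy to tune them for a particular room without touching the decision logic.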
Preferably, the invention also comprises a system using the video conference picture adjusting method.
Although the invention has been described in detail hereinabove with respect to a general description and specific embodiments thereof, it will be apparent to those skilled in the art that modifications or improvements may be made thereto based on the invention. Accordingly, such modifications and improvements are intended to be within the scope of the invention as claimed.

Claims (10)

1. A video conference picture adjusting method, characterized by comprising the following steps:
S1, initiating a video conference call;
S2, acquiring the complexity and on-site noise of the speakers and the conference room in the video conference, and adjusting the pickup parameters of the sound pickup in the on-site conference room;
S3, acquiring the face information, voice and voiceprint information of the video conference speaker and the on-site seat direction information, so as to adjust the angle of the camera in the video conference room and to focus and aim it at the speaker;
S4, when the voice, voiceprint information or sound position information of the speaker changes, monitoring whether the speaker's mouth is moving; if so, keeping the camera unchanged; if not, returning to step S3.
2. The video conference picture adjusting method according to claim 1, characterized in that, in step S1, the conference initiator initiates the video conference call using the participant ID information in the video conference system.
3. The video conference picture adjusting method according to claim 1, characterized in that, where the conference initiator initiates the video conference call using the participant ID information in the video conference system, the participant ID information can be obtained through a short-message link sent in advance, through third-party software, or from face and ID information recorded in advance when the participant enters the conference room.
4. The video conference picture adjusting method according to claim 1, characterized in that, when the video conference call is initiated in step S1 and a call is placed to all members of the participant group, the participants already present are determined from the face information, seat information and participant ID information acquired on site, and the participants not at the conference site are actively screened out and called.
5. The video conference picture adjusting method according to claim 1, characterized in that step S2 further includes identifying the speaker and the speaker's position from the sound source position and the voiceprint, and adjusting the camera angle to aim at the speaker.
6. The video conference picture adjusting method according to claim 1, characterized in that adjusting the angle and focal length of the camera in the video conference room in step S3 comprises: focusing and zooming the camera so that the speaker is shown at the largest size in the picture, and adjusting the focal length of the camera, based on the recognized still image of the speaker, so that the speaker's head, upper body, hands and arms can be recognized.
7. The video conference picture adjusting method according to claim 1, characterized in that, in step S3, when the video conference system detects that a speaker's volume and speaking time are below set thresholds, the camera position is not adjusted and the camera stays on the previous speaker.
8. The video conference picture adjusting method according to claim 1, characterized in that, when the video conference system detects that no one has spoken for a period of time, or that the speaking time is too short, or that sound is coming from too many positions, the camera is switched to panoramic mode so that the conference scene with all participants is captured.
9. The video conference picture adjusting method according to claim 8, characterized in that:
"no one has spoken for a period of time" specifically means a period of more than 60 seconds;
"the speaking time is too short" specifically means a speaking time of less than 5 seconds;
"sound is coming from too many positions" specifically means that more than 3 speaker voiceprints are detected.
10. A video conference picture adjusting system, characterized in that it uses the video conference picture adjusting method according to claim 1.
CN202110499454.8A 2021-05-10 2021-05-10 Video conference picture adjusting method Pending CN113473066A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110499454.8A CN113473066A (en) 2021-05-10 2021-05-10 Video conference picture adjusting method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110499454.8A CN113473066A (en) 2021-05-10 2021-05-10 Video conference picture adjusting method

Publications (1)

Publication Number Publication Date
CN113473066A true CN113473066A (en) 2021-10-01

Family

ID=77870738

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110499454.8A Pending CN113473066A (en) 2021-05-10 2021-05-10 Video conference picture adjusting method

Country Status (1)

Country Link
CN (1) CN113473066A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
TWI799165B (en) * 2022-03-04 2023-04-11 圓展科技股份有限公司 System and method for capturing sounding target

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105915798A (en) * 2016-06-02 2016-08-31 北京小米移动软件有限公司 Camera control method in video conference and control device thereof
EP3101839A1 (en) * 2015-06-03 2016-12-07 Thomson Licensing Method and apparatus for isolating an active participant in a group of participants using light field information
CN107333090A (en) * 2016-04-29 2017-11-07 中国电信股份有限公司 Videoconference data processing method and platform
KR20190066659A (en) * 2017-12-06 2019-06-14 서울과학기술대학교 산학협력단 System and method for speech recognition in video conference based on 360 omni-directional
CN111191205A (en) * 2019-12-17 2020-05-22 中移(杭州)信息技术有限公司 Method for managing teleconference, server, and computer-readable storage medium
CN111263106A (en) * 2020-02-25 2020-06-09 厦门亿联网络技术股份有限公司 Picture tracking method and device for video conference
CN111586341A (en) * 2020-05-20 2020-08-25 深圳随锐云网科技有限公司 Shooting method and picture display method of video conference shooting device
US20200358982A1 (en) * 2019-05-08 2020-11-12 Optoma Corporation Video conference system, video conference apparatus, and video conference method

Similar Documents

Publication Publication Date Title
US9179098B2 (en) Video conferencing
US8154578B2 (en) Multi-camera residential communication system
US8159519B2 (en) Personal controls for personal video communications
US8253770B2 (en) Residential video communication system
US8154583B2 (en) Eye gazing imaging for video communications
US8063929B2 (en) Managing scene transitions for video communication
US8730295B2 (en) Audio processing for video conferencing
CN109413359B (en) Camera tracking method, device and equipment
US20110216153A1 (en) Digital conferencing for mobile devices
US20040021764A1 (en) Visual teleconferencing apparatus
US20060001737A1 (en) Video conference arrangement
US20140063176A1 (en) Adjusting video layout
CN1863301A (en) Method and system for a videoconferencing
US20120327176A1 (en) Video Call Privacy Control
WO2005002201A2 (en) Visual teleconferencing apparatus
JPH08163522A (en) Video conference system and terminal equipment
CN113905204B (en) Image display method, device, equipment and storage medium
CN113473066A (en) Video conference picture adjusting method
CN116614598A (en) Video conference picture adjusting method, device, electronic equipment and medium
EP3884461B1 (en) Selective distortion or deformation correction in images from a camera with a wide angle lens
CN113676693B (en) Picture presentation method, video conference system, and readable storage medium
JP6544209B2 (en) INFORMATION PROCESSING APPARATUS, CONFERENCE SYSTEM, INFORMATION PROCESSING METHOD, AND PROGRAM
JP2007251355A (en) Relaying apparatus for interactive system, interactive system, and interactive method
TWI248021B (en) Method and system for correcting out-of-focus eyesight of attendant images in video conferencing
JP2006339869A (en) Apparatus for integrating video signal and voice signal

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination