CN111986703A - Video conference method and system, and computer readable storage medium - Google Patents

Publication number
CN111986703A
Authority
CN
China
Prior art keywords
current speaker
speaking
video conference
video
video conferencing
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010845755.7A
Other languages
Chinese (zh)
Other versions
CN111986703B (en)
Inventor
李璐
冯文澜
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Suirui Technology Group Co Ltd
Original Assignee
Suirui Technology Group Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Suirui Technology Group Co Ltd
Priority to CN202010845755.7A
Publication of CN111986703A
Application granted
Publication of CN111986703B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/48 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use
    • G10L25/51 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination
    • G10L25/63 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination for estimating an emotional state
    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01H MEASUREMENT OF MECHANICAL VIBRATIONS OR ULTRASONIC, SONIC OR INFRASONIC WAVES
    • G01H17/00 Measuring mechanical vibrations or ultrasonic, sonic or infrasonic waves, not provided for in the preceding groups
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16 Human faces, e.g. facial parts, sketches or expressions
    • G06V40/168 Feature extraction; Face representation
    • G06V40/169 Holistic features and representations, i.e. based on the facial image taken as a whole
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16 Human faces, e.g. facial parts, sketches or expressions
    • G06V40/174 Facial expression recognition
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/03 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters
    • G10L25/21 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters the extracted parameters being power information

Abstract

The invention discloses a video conference method and system, and a computer-readable storage medium. The video conference method comprises the following steps: confirming a current speaker during the video conference; starting expression recognition on the current speaker in the video stream, and starting an intelligent microphone detection function when the expression of the current speaker is abnormal; and, when it is detected that the speaking decibel level of the current speaker is higher than a first threshold and/or the speaking frequency is higher than a second threshold, sending a prompt to the client of the current speaker. The video conference method and system can monitor the emotions of participants in a video conference and thereby assist in managing the conference.

Description

Video conference method and system, and computer readable storage medium
Technical Field
The present invention relates to the field of video communication technologies, and in particular, to a video conference method and system, and a computer-readable storage medium.
Background
With the development of internet technology, video conferences are being used ever more widely.
In the course of implementing the invention, the inventors found that participants in a video conference sometimes become overly agitated or lose control of their emotions, so that the conference cannot proceed normally or is even interrupted. This seriously affects the efficiency and effectiveness of the conference, and no effective solution is currently available.
The information disclosed in this background section is only for enhancement of understanding of the general background of the invention and should not be taken as an acknowledgement or any form of suggestion that this information forms the prior art already known to a person skilled in the art.
Disclosure of Invention
The invention aims to provide a video conference method and system, and a computer-readable storage medium, which can monitor the emotions of participants in a video conference and assist in managing the conference.
To achieve the above object, the present invention provides a video conference method, which includes: confirming a current speaker during the video conference; starting expression recognition on the current speaker in the video stream, and starting an intelligent microphone detection function when the expression of the current speaker is abnormal; and, when it is detected that the speaking decibel level of the current speaker is higher than a first threshold and/or the speaking frequency is higher than a second threshold, sending a prompt to the client of the current speaker.
In an embodiment of the present invention, confirming the current speaker during the video conference includes: confirming the current speaker through changes in microphone pickup.
In an embodiment of the present invention, the video conference method further includes: when it is detected that the speaking decibel level of the current speaker is higher than the first threshold multiple times, or the speaking frequency is higher than the second threshold multiple times, sending an emotional overstimulation prompt to the client of the current speaker and providing an option to turn off the audio or the camera.
In an embodiment of the present invention, the video conference method further includes: presetting and storing a plurality of keywords before the video conference starts, and assigning a level to each keyword in advance; performing voice recognition on the current speaker during the video conference; when the speech content of the current speaker includes one of the keywords, popping up a warning of the corresponding level on the client of the current speaker according to the level of the keyword; and, when the keyword in the speech content of the current speaker is of the highest level, muting the current speaker.
In an embodiment of the present invention, the video conference method further includes: turning off the intelligent microphone detection function when, within a period of time, the speaking decibel level of the current speaker is not detected to be higher than the first threshold, the speaking frequency is not detected to be higher than the second threshold, and none of the keywords is detected in the speech content of the current speaker.
In an embodiment of the present invention, the video conference method further includes: during the video conference, performing face recognition on each person in the video stream and repeatedly recording the face contour position of each person; when the overlap between two consecutively recorded face contour positions of a person is lower than a third threshold, recording one large-amplitude movement; and, when the number of large-amplitude movements accumulated by a person within a period of time is detected to exceed a fourth threshold, sending a prompt to that person's client asking them to pay attention.
In an embodiment of the present invention, the video conference method further includes: during the video conference, performing face recognition on each person in the video stream, and, when the face information of a person cannot be detected within a period of time, sending a prompt to that person's client asking them to pay attention.
In an embodiment of the present invention, the video conference method further includes: when it is detected that the speaking decibel level of the current speaker is higher than the first threshold and/or the speaking frequency is higher than the second threshold, sending a prompt, by voice or text, to the clients of the participants in the video conference asking them to pay attention to the key speech content.
Based on the same inventive concept, the present invention also provides a video conference system, which includes: a current speaker confirmation module, an expression recognition module, a microphone intelligent detection module and a first reminding module. The current speaker confirmation module is used for confirming a current speaker during the video conference. The expression recognition module is coupled with the current speaker confirmation module and is used for starting expression recognition on the current speaker in the video stream. The microphone intelligent detection module is coupled with the expression recognition module and is used for starting microphone intelligent detection when the expression recognition module judges that the expression of the current speaker is abnormal. The first reminding module is coupled with the microphone intelligent detection module and the current speaker confirmation module and is used for sending a reminder to the client of the current speaker when it is detected that the speaking decibel level of the current speaker is higher than a first threshold and/or the speaking frequency is higher than a second threshold.
In one embodiment, the video conferencing system further comprises: a keyword module and a voice recognition module. The keyword module is used for presetting and storing a plurality of keywords before the video conference starts and assigning a level to each keyword in advance. The voice recognition module is used for performing voice recognition on the current speaker during the video conference; if it recognizes that the speech content of the current speaker includes one of the keywords, a warning of the corresponding level is popped up on the client of the current speaker according to the level of the keyword; and if the keyword recognized in the speech content of the current speaker is of the highest level, the current speaker is muted.
In one embodiment, the video conferencing system further comprises: a closing module, which is used for turning off the intelligent microphone detection function when, within a period of time, the speaking decibel level of the current speaker is not detected to be higher than the first threshold, the speaking frequency is not detected to be higher than the second threshold, and none of the keywords is detected in the speech content of the current speaker.
In one embodiment, the video conferencing system further comprises: a face recognition module, a large-amplitude movement recording module, a second reminding module and a third reminding module. The face recognition module is used for performing face recognition on each person in the video stream during the video conference and repeatedly recording the face contour position of each person. The large-amplitude movement recording module is coupled with the face recognition module and is used for recording one large-amplitude movement when the overlap between two consecutively recorded face contour positions of a person is lower than a third threshold. The second reminding module is coupled with the large-amplitude movement recording module and the face recognition module and is used for sending a reminder, by voice or text, to a person's client asking them to pay attention when it is detected that the number of large-amplitude movements accumulated by that person within a period of time exceeds a fourth threshold. The third reminding module is coupled with the face recognition module and is used for sending a reminder to a person's client asking them to pay attention when the face recognition module cannot detect that person's face information for a continuous period of time.
In one embodiment, the video conferencing system further comprises: a fourth reminding module, which is coupled with the microphone intelligent detection module and is used for sending a reminder to the clients of the participants in the video conference, asking them to pay attention to the key speech content, when the microphone intelligent detection module detects that the speaking decibel level of the current speaker is higher than the first threshold and/or the speaking frequency is higher than the second threshold.
Based on the same inventive concept, the present invention also provides a computer-readable storage medium for performing the video conference method according to any one of claims 1 to 8.
Compared with the prior art, the video conference method and system of the invention first perform expression recognition on the speaker; once the expression is preliminarily judged to be abnormal, intelligent microphone detection is triggered and the speaker's voice is then evaluated. Judging whether the speaker has lost emotional control from both the expression and the voice allows the speaker's emotion to be determined more accurately. Further, when the emotion and the voice of the speaker are abnormal, a prompt is issued to guide the speaker to calm down, which assists in managing the conference and can improve the efficiency of the video conference.
Drawings
FIG. 1 shows a video conference method according to an embodiment of the present invention;
FIG. 2 shows a video conference system according to an embodiment of the present invention.
Detailed Description
The following detailed description of the present invention is provided in conjunction with the accompanying drawings, but it should be understood that the scope of the present invention is not limited to the specific embodiments.
Throughout the specification and claims, unless explicitly stated otherwise, the word "comprise", or variations such as "comprises" or "comprising", will be understood to imply the inclusion of a stated element or component but not the exclusion of any other element or component.
FIG. 1 shows a video conference method according to an embodiment of the present invention. The video conference method includes steps S1 to S3.
The current speaker is confirmed in step S1. The current speaker during the video conference can be confirmed through the change of the microphone pickup.
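The source does not say how the "change of the microphone pickup" is evaluated. A minimal sketch, assuming the current speaker is simply the participant whose latest audio frame has the highest RMS level above a noise floor; the names `rms_level_db`, `pick_current_speaker` and the `NOISE_FLOOR_DB` value are hypothetical and not part of the patent:

```python
import math
from typing import Dict, Optional, Sequence

# Hypothetical noise floor in dBFS; the source gives no value.
NOISE_FLOOR_DB = -40.0


def rms_level_db(samples: Sequence[float]) -> float:
    """Return the RMS level of normalized samples (-1.0..1.0) in dBFS."""
    if not samples:
        return float("-inf")
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    return 20.0 * math.log10(rms) if rms > 0 else float("-inf")


def pick_current_speaker(frames: Dict[str, Sequence[float]]) -> Optional[str]:
    """Pick the participant whose microphone frame is loudest above the floor."""
    best_id, best_db = None, NOISE_FLOOR_DB
    for participant_id, samples in frames.items():
        level = rms_level_db(samples)
        if level > best_db:
            best_id, best_db = participant_id, level
    return best_id
```

A real system would likely smooth the levels over several frames so that a brief noise does not switch the speaker; that refinement is left out here.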
In step S2, the intelligent microphone detection function is started once an abnormal expression is detected. Specifically, expression recognition is started for the current speaker in the video stream, and if the expression of the current speaker is abnormal (for example, anger, irritability or excitement), the intelligent microphone detection function is started.
In step S3, an emotion reminder is issued. When it is detected that the speaking decibel level of the current speaker is higher than a first threshold and/or the speaking frequency is higher than a second threshold, a reminder is sent to the client of the current speaker by voice or text. Specifically, when it is detected that the speaking decibel level of the current speaker exceeds the speaker's normal level by more than 10% multiple times (for example, exceeds 70 decibels twice), or that the speaking frequency exceeds the speaker's normal frequency by more than 10% multiple times, an emotional overstimulation prompt is sent to the client of the current speaker, asking whether the speaker needs to pause before continuing to participate in the conference. An option to turn off the audio and/or the camera can also be provided, so that speakers who have lost their composure can turn off the audio and/or camera themselves.
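The threshold logic of step S3 can be sketched as follows. This is an illustration under assumptions, not the patent's implementation: the thresholds are taken as 10% above a per-speaker baseline and the repeat limit as 2, following the examples in the text; the unit of "speaking frequency" is left open because the source does not define it; `SpeakerBaseline`, `EmotionMonitor` and the returned labels are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class SpeakerBaseline:
    normal_db: float    # speaker's usual speaking level
    normal_freq: float  # speaker's usual "speaking frequency" (unit left open)


class EmotionMonitor:
    """Counts threshold crossings and decides which prompt to send."""

    def __init__(self, baseline: SpeakerBaseline, margin: float = 0.10,
                 repeat_limit: int = 2) -> None:
        # margin=10% and repeat_limit=2 follow the examples in the text.
        self.db_threshold = baseline.normal_db * (1 + margin)
        self.freq_threshold = baseline.normal_freq * (1 + margin)
        self.repeat_limit = repeat_limit
        self.db_hits = 0
        self.freq_hits = 0

    def observe(self, db: float, freq: float) -> str:
        """Return 'none', 'remind' or 'overstimulated' for one measurement."""
        db_high = db > self.db_threshold
        freq_high = freq > self.freq_threshold
        if not (db_high or freq_high):
            return "none"
        if db_high:
            self.db_hits += 1
        if freq_high:
            self.freq_hits += 1
        repeated = ((db_high and self.db_hits >= self.repeat_limit) or
                    (freq_high and self.freq_hits >= self.repeat_limit))
        # On 'overstimulated' the client would also offer to turn off
        # the audio and/or camera.
        return "overstimulated" if repeated else "remind"
```

The counters are never reset in this sketch; a deployment would probably reset them per meeting or after a calm interval.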
In this embodiment, expression recognition is first performed on the speaker; once the expression is preliminarily judged to be abnormal, intelligent microphone detection is triggered and the speaker's voice is then evaluated. Judging whether the speaker has lost emotional control from both the expression and the voice allows the speaker's emotion to be determined more accurately. Further, when the emotion and the voice of the speaker are abnormal, a prompt is issued to guide the speaker to calm down, which assists in managing the conference and can improve conference efficiency. In addition, the method monitors only the speakers, so it requires few resources.
Preferably, in order to improve the conference quality, the video conference method of an embodiment further includes: presetting and storing a plurality of keywords before the video conference starts, and assigning a level to each keyword in advance; performing voice recognition on the current speaker during the video conference; if the speech content of the current speaker includes one of the keywords, popping up a warning of the corresponding level on the client of the current speaker according to the level of the keyword; and if the keyword in the speech content of the current speaker is of the highest level, muting the current speaker. The keywords can be honorific names, special events, taboo names, uncivil language and the like, and the keyword levels can be set as level one, level two, level three and so on. For example, if uncivil language is level one, a warning is popped up directly and the speaker is muted; if personal privacy is level two, a warning is given, the intelligent microphone detection function is triggered, and corresponding processing is carried out according to the result; and if a form of address is level three, a prompt is given asking the speaker to mind their manners.
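The keyword grading above can be sketched as a lookup table plus a plain substring match on the recognized text. Everything concrete here is an assumption for illustration: the sample keywords, the action labels and the function name `check_transcript` are hypothetical, and level 1 is treated as the most severe, following the text's example.

```python
from typing import Dict, Optional, Tuple

# Hypothetical keyword table; level 1 is the most severe.
KEYWORD_LEVELS: Dict[str, int] = {
    "idiot": 1,         # uncivil language -> warn and mute
    "home address": 2,  # personal privacy -> warn and start mic detection
    "old man": 3,       # improper form of address -> courtesy reminder
}

ACTIONS: Dict[int, str] = {
    1: "warn_and_mute",
    2: "warn_and_start_mic_detection",
    3: "courtesy_reminder",
}


def check_transcript(transcript: str) -> Optional[Tuple[str, int, str]]:
    """Return (keyword, level, action) for the most severe keyword found."""
    text = transcript.lower()
    hits = [(level, kw) for kw, level in KEYWORD_LEVELS.items() if kw in text]
    if not hits:
        return None
    level, keyword = min(hits)  # lowest level number = most severe
    return keyword, level, ACTIONS[level]
```

Substring matching is the simplest possible choice; matching on recognized word boundaries would reduce false positives.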
Preferably, in order to save network resources, the video conference method further comprises: turning off the intelligent microphone detection function when, within a period of time, the speaking decibel level of the current speaker is not detected to be higher than the first threshold, the speaking frequency is not detected to be higher than the second threshold, and none of the keywords is detected in the speech content of the current speaker.
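The on/off behavior of the intelligent microphone detection function can be sketched as a small gate: it opens when an abnormal expression is reported and closes once nothing has triggered for a quiet period. The class name `MicDetectionGate`, its methods and the 60-second default are hypothetical; the source only says "a period of time".

```python
class MicDetectionGate:
    """Opens detection on an abnormal expression; closes after a quiet period."""

    def __init__(self, quiet_seconds: float = 60.0) -> None:
        # quiet_seconds is an assumed value, not taken from the source.
        self.quiet_seconds = quiet_seconds
        self.active = False
        self._last_event = 0.0

    def on_abnormal_expression(self, now: float) -> None:
        """Start (or keep alive) detection when the expression is abnormal."""
        self.active = True
        self._last_event = now

    def on_trigger(self, now: float) -> None:
        """Call when a threshold is exceeded or a keyword is recognized."""
        if self.active:
            self._last_event = now

    def tick(self, now: float) -> bool:
        """Close detection after quiet_seconds without events; return state."""
        if self.active and now - self._last_event >= self.quiet_seconds:
            self.active = False
        return self.active
```

Timestamps are passed in explicitly (in seconds) so the gate can be driven by any clock and tested without waiting.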
Preferably, in order to further improve the effect of the conference, the video conference method further includes: during the video conference, performing face recognition on each person in the video stream and repeatedly recording the face contour position of each person; when the overlap between two consecutively recorded face contour positions of a person is lower than a third threshold, recording one large-amplitude movement; and when the number of large-amplitude movements accumulated by a person within a period of time is detected to exceed a fourth threshold, sending a prompt to that person's client, by voice or text, asking them to pay attention.
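A minimal sketch of the large-amplitude movement count, assuming the face contour position is approximated by an axis-aligned bounding box and the "overlapped part" is measured as the fraction of the previous box still covered by the current one (the source does not define the measure). The defaults of 65%, 6 movements and 60 seconds follow the example values given for the system embodiment; `overlap_ratio` and `MovementTracker` are hypothetical names.

```python
from collections import deque
from typing import Deque, Optional, Tuple

Box = Tuple[float, float, float, float]  # (left, top, right, bottom)


def overlap_ratio(prev: Box, curr: Box) -> float:
    """Fraction of the previous box's area still covered by the current box."""
    width = min(prev[2], curr[2]) - max(prev[0], curr[0])
    height = min(prev[3], curr[3]) - max(prev[1], curr[1])
    prev_area = (prev[2] - prev[0]) * (prev[3] - prev[1])
    if width <= 0 or height <= 0 or prev_area <= 0:
        return 0.0
    return (width * height) / prev_area


class MovementTracker:
    """Counts large movements of one person inside a sliding time window."""

    def __init__(self, overlap_threshold: float = 0.65, max_moves: int = 6,
                 window_seconds: float = 60.0) -> None:
        self.overlap_threshold = overlap_threshold
        self.max_moves = max_moves
        self.window_seconds = window_seconds
        self._prev: Optional[Box] = None
        self._moves: Deque[float] = deque()

    def update(self, box: Box, now: float) -> bool:
        """Record one sample; return True when an attention prompt is due."""
        if (self._prev is not None and
                overlap_ratio(self._prev, box) < self.overlap_threshold):
            self._moves.append(now)
        self._prev = box
        # Drop movements that have fallen out of the time window.
        while self._moves and now - self._moves[0] > self.window_seconds:
            self._moves.popleft()
        return len(self._moves) > self.max_moves
```

One tracker instance would be kept per participant and fed once per sampling interval (the text's example is every 1 second).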
Preferably, in order to further improve the effect of the conference, the video conference method further includes: in the video conference process, face recognition is carried out on each person in a video stream, and when face information of a person cannot be detected within a continuous period of time, a prompt asking for attention is sent to a client of the person in a voice or text mode.
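The absence check can be sketched as a per-person timer that fires once when no face has been detected for a continuous period. The 10-second default follows the example given for the system embodiment; the class name `AbsenceTracker` and the choice to prompt only once per absence are assumptions.

```python
from typing import Optional


class AbsenceTracker:
    """Signals when a person's face has been missing for too long."""

    def __init__(self, absent_seconds: float = 10.0) -> None:
        self.absent_seconds = absent_seconds
        self._last_seen: Optional[float] = None  # last sighting or tracking start
        self._prompted = False

    def update(self, face_detected: bool, now: float) -> bool:
        """Return True once per absence when the prompt should be sent."""
        if face_detected or self._last_seen is None:
            self._last_seen = now
            self._prompted = False
            return False
        if not self._prompted and now - self._last_seen >= self.absent_seconds:
            self._prompted = True  # avoid repeating the prompt every sample
            return True
        return False
```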
Preferably, the video conference method further comprises: when it is detected that the speaking decibel level of the current speaker is higher than the first threshold and/or the speaking frequency is higher than the second threshold, sending a prompt, by voice or text, to the clients of the participants in the video conference asking them to pay attention to the key speech content. This embodiment can be used in the field of remote teaching: during a lecture, if the lecturer's voice becomes louder, the system sends a reminder to the students' clients, telling them that this is a key point and asking them to listen carefully. The reminder may take the form of prominent on-screen text, two or three screen flashes, a short distinctive "beep" sound from the system, and so on, so as to avoid an overly long system reminder that would drown out the teacher's voice.
Based on the same inventive concept, the present embodiment further provides a video conference system. As shown in FIG. 2, the system includes: a current speaker confirmation module 10, an expression recognition module 11, a microphone intelligent detection module 12 and a first reminding module 13.
The current speaker confirmation module 10 is configured to confirm a current speaker during a video conference, and in particular, the present embodiment confirms the current speaker during the video conference through changes of microphone sound pickup.
The expression recognition module 11 is coupled to the current speaker confirmation module 10 and configured to initiate expression recognition for the current speaker in the video stream.
The microphone intelligent detection module 12 is coupled with the expression recognition module 11, and is configured to start microphone intelligent detection when the expression recognition module 11 determines that the expression of the current speaker is abnormal.
The first reminding module 13 is coupled to the microphone intelligent detection module 12 and the current speaker confirmation module 10, and is configured to send a reminder to the client of the current speaker by voice or text when it is detected that the speaking decibel level of the current speaker is higher than the first threshold and/or the speaking frequency is higher than the second threshold. The first reminding module 13 is also used for sending an emotional overstimulation reminder to the client of the current speaker and providing an option to turn off the audio or the camera when it is detected that the speaking decibel level of the current speaker is higher than the first threshold multiple times or the speaking frequency is higher than the second threshold multiple times.
Preferably, in order to improve the conference quality, in an embodiment, the video conference system further includes: a keyword module and a voice recognition module.
The keyword module is used for presetting and storing a plurality of keywords before the video conference is started, and pre-distributing the grade of each keyword.
The voice recognition module is used for performing voice recognition on the current speaker during the video conference; if it recognizes that the speech content of the current speaker includes one of the keywords, a warning of the corresponding level is popped up on the client of the current speaker according to the level of the keyword; and if the keyword recognized in the speech content of the current speaker is of the highest level, the current speaker is muted.
Preferably, in order to save network resources, in an embodiment, the video conference system further includes: a closing module, which is used for turning off the intelligent microphone detection function when, within a period of time, the speaking decibel level of the current speaker is not detected to be higher than the first threshold, the speaking frequency is not detected to be higher than the second threshold, and none of the keywords is detected in the speech content of the current speaker.
Preferably, in order to further improve the effect of the conference, in an embodiment, the video conference system further includes: a face recognition module, a large-amplitude movement recording module, a second reminding module and a third reminding module.
The face recognition module is used for carrying out face recognition on each person in the video stream in the video conference process, and continuously recording the face contour position of each person for multiple times (for example, recording every 1 second).
The large-amplitude movement recording module is coupled with the face recognition module and is used for recording a large-amplitude movement when the coincidence part of the face contour positions recorded by a person twice in succession is lower than a third threshold (such as 65%).
The second reminding module is coupled with the large-amplitude movement recording module and the face recognition module, and is used for sending a reminder, by voice or text, to a person's client asking them to pay attention when the number of large-amplitude movements by that person within a certain period of time exceeds a fourth threshold (for example, more than 6 large-amplitude movements within 1 minute).
The third reminding module is coupled with the face recognition module, and is used for sending a reminding asking for attention to the client of a person in a voice or text mode when the face recognition module cannot detect the face information of the person within a continuous period of time (such as within 10 seconds).
In one embodiment, the video conferencing system further comprises: a fourth reminding module, coupled to the microphone intelligent detection module 12 and configured to send a reminder, by voice or text, to the clients of the participants in the video conference asking them to pay attention to the key speech content when the microphone intelligent detection module 12 detects that the speaking decibel level of the current speaker is higher than the first threshold and/or the speaking frequency is higher than the second threshold. This embodiment can be used in the field of remote teaching: during a lecture, if the lecturer's voice becomes louder, the system sends a reminder to the students' clients, telling them that this is a key point and asking them to listen carefully. The reminder may take the form of prominent on-screen text, two or three screen flashes, a short distinctive "beep" sound from the system, and so on, so as to avoid an overly long system reminder that would drown out the teacher's voice.
Based on the same inventive concept, the present embodiment also provides a computer-readable storage medium for executing the video conference method according to any one of the above embodiments.
In summary, the video conference method and system of the embodiment first perform expression recognition on the speaker; once the expression is preliminarily judged to be abnormal, intelligent microphone detection is triggered and the speaker's voice is then evaluated. Judging whether the speaker has lost emotional control from both the expression and the voice allows the speaker's emotion to be determined more accurately. Further, when the emotion and the voice of the speaker are abnormal, a prompt is issued to guide the speaker to calm down, which assists in managing the conference and can improve the efficiency of the video conference.
As will be appreciated by one skilled in the art, embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
The foregoing descriptions of specific exemplary embodiments of the present invention have been presented for purposes of illustration and description. It is not intended to limit the invention to the precise form disclosed, and obviously many modifications and variations are possible in light of the above teaching. The exemplary embodiments were chosen and described in order to explain certain principles of the invention and its practical application to enable one skilled in the art to make and use various exemplary embodiments of the invention and various alternatives and modifications as are suited to the particular use contemplated. It is intended that the scope of the invention be defined by the claims and their equivalents.

Claims (10)

1. A video conferencing method, comprising:
confirming a current speaker during a video conference;
starting expression recognition on the current speaker in a video stream, and starting an intelligent microphone detection function when the expression of the current speaker is abnormal; and
sending a prompt to the client of the current speaker when it is detected that the speaking decibel level of the current speaker is higher than a first threshold and/or the speaking frequency is higher than a second threshold.
2. The video conferencing method of claim 1, wherein confirming the current speaker during the video conference comprises:
confirming the current speaker during the video conference according to changes in microphone pickup.
3. The video conferencing method of claim 1, further comprising:
when it is detected multiple times that the speaking decibel level of the current speaker is higher than the first threshold, or multiple times that the speaking frequency is higher than the second threshold, sending a prompt indicating emotional agitation to the client of the current speaker, and providing an option to turn off the audio or the camera.
4. The video conferencing method of claim 1, further comprising:
presetting and storing a plurality of keywords before the video conference starts, and assigning a level to each keyword in advance;
performing speech recognition on the current speaker during the video conference;
when the speech content of the current speaker includes one of the keywords, popping up a warning of the corresponding level on the client of the current speaker according to the level of that keyword; and
when the keyword in the speech content of the current speaker is at the highest level, prohibiting the current speaker from speaking.
5. The video conferencing method of claim 4, further comprising:
turning off the intelligent microphone detection function when, within a period of time, the speaking decibel level of the current speaker is not detected to be higher than the first threshold, the speaking frequency is not detected to be higher than the second threshold, and none of the keywords is detected in the speech content of the current speaker.
6. The video conferencing method of claim 1, further comprising:
during the video conference, performing face recognition on each person in the video stream, and repeatedly and continuously recording the face contour position of each person;
when the overlap between two consecutively recorded face contour positions of a person is lower than a third threshold, recording one large-amplitude movement; and
when it is detected that the number of large-amplitude movements accumulated by a person within a period of time exceeds a fourth threshold, sending, to the client of that person, a reminder to pay attention.
7. The video conferencing method of claim 1, further comprising:
during the video conference, performing face recognition on each person in the video stream, and when no face information of a person can be detected within a period of time, sending, to the client of that person, a reminder to pay attention.
8. The video conferencing method of claim 1, further comprising:
when it is detected that the speaking decibel level of the current speaker is higher than the first threshold and/or the speaking frequency is higher than the second threshold, sending, to the clients of the participants in the video conference, a prompt to pay attention to important speech content.
9. A video conferencing system, comprising:
a current speaker confirmation module, configured to confirm a current speaker during a video conference;
an expression recognition module, coupled with the current speaker confirmation module and configured to start expression recognition on the current speaker in a video stream;
an intelligent microphone detection module, coupled with the expression recognition module and configured to start intelligent microphone detection when the expression recognition module determines that the expression of the current speaker is abnormal; and
a reminding module, coupled with the intelligent microphone detection module and the current speaker confirmation module and configured to send a reminder to the client of the current speaker when it is detected that the speaking decibel level of the current speaker is higher than a first threshold and/or the speaking frequency is higher than a second threshold.
10. A computer-readable storage medium for performing the video conferencing method of any of claims 1-8.
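
By way of illustration only, and without forming part of the claims, the detection logic recited in claims 1, 3, 5 and 6 might be sketched as follows. This is a minimal sketch under stated assumptions: the class and function names, the default threshold values, the unit of the speaking frequency, the reading of "multiple times" as a fixed count, and the use of intersection-over-union as the overlap measure are all hypothetical choices that the application does not specify. The keyword condition of claim 5 is omitted for brevity.

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple

Box = Tuple[float, float, float, float]  # hypothetical face box: (x1, y1, x2, y2)


@dataclass
class MicMonitor:
    """Per-speaker state for the intelligent microphone detection function."""
    db_threshold: float = 75.0    # "first threshold"; value is an assumption
    freq_threshold: float = 5.0   # "second threshold"; value and unit are assumptions
    repeat_limit: int = 3         # assumed reading of "multiple times" (claim 3)
    quiet_timeout: float = 60.0   # assumed "period of time" in seconds (claim 5)
    active: bool = False
    db_count: int = 0
    freq_count: int = 0
    last_exceed: float = 0.0

    def on_expression(self, abnormal: bool, now: float) -> None:
        """Claim 1: an abnormal expression starts microphone detection."""
        if abnormal and not self.active:
            self.active = True
            self.db_count = self.freq_count = 0
            self.last_exceed = now

    def on_audio(self, decibels: float, frequency: float, now: float) -> Optional[str]:
        """Return the prompt to send to the speaker's client, or None."""
        if not self.active:
            return None
        loud = decibels > self.db_threshold
        fast = frequency > self.freq_threshold
        if loud or fast:
            self.db_count += loud
            self.freq_count += fast
            self.last_exceed = now
            # Claim 3: repeated exceedance of either threshold escalates the prompt.
            if max(self.db_count, self.freq_count) >= self.repeat_limit:
                return "agitation_prompt_with_mute_option"
            return "reminder"  # claim 1
        # Claim 5 (keyword condition omitted): stop detection after a quiet period.
        if now - self.last_exceed >= self.quiet_timeout:
            self.active = False
        return None


def iou(a: Box, b: Box) -> float:
    """Intersection-over-union of two boxes, used here as the overlap measure."""
    ix = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def count_large_movements(boxes: Sequence[Box], overlap_threshold: float) -> int:
    """Claim 6: count consecutive recordings whose overlap is below the third threshold."""
    return sum(1 for prev, cur in zip(boxes, boxes[1:]) if iou(prev, cur) < overlap_threshold)
```

In such a sketch, a caller would feed `on_expression` from the expression recognition step and `on_audio` from the microphone, and would compare the result of `count_large_movements` over a time window against the fourth threshold before sending the reminder of claim 6.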
CN202010845755.7A 2020-08-20 2020-08-20 Video conference method and system, and computer readable storage medium Active CN111986703B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010845755.7A CN111986703B (en) 2020-08-20 2020-08-20 Video conference method and system, and computer readable storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010845755.7A CN111986703B (en) 2020-08-20 2020-08-20 Video conference method and system, and computer readable storage medium

Publications (2)

Publication Number Publication Date
CN111986703A (en) 2020-11-24
CN111986703B (en) 2023-05-26

Family

ID=73442448

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010845755.7A Active CN111986703B (en) 2020-08-20 2020-08-20 Video conference method and system, and computer readable storage medium

Country Status (1)

Country Link
CN (1) CN111986703B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116996801A (en) * 2023-09-25 2023-11-03 福州天地众和信息技术有限公司 Intelligent conference debugging speaking system with wired and wireless access AI

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1422494A (en) * 2000-12-05 2003-06-04 皇家菲利浦电子有限公司 Method and apparatus for predicting events in video conferencing and other applications
US20050110867A1 (en) * 2003-11-26 2005-05-26 Karsten Schulz Video conferencing system with physical cues
JP2005269207A (en) * 2004-03-18 2005-09-29 Nippon Telegr & Teleph Corp <Ntt> Information delivery method and communication apparatus for realizing this method, and program therefor
WO2009009966A1 (en) * 2007-07-17 2009-01-22 Huawei Technologies Co., Ltd. A method, device and system for displaying a speaker in videoconference
CN102843543A (en) * 2012-09-17 2012-12-26 华为技术有限公司 Video conferencing reminding method, device and video conferencing system
CN107636684A (en) * 2015-03-18 2018-01-26 阿凡达合并第二附属有限责任公司 Emotion identification in video conference
WO2018076615A1 (en) * 2016-10-27 2018-05-03 中兴通讯股份有限公司 Information transmitting method and apparatus
JP2019176375A (en) * 2018-03-29 2019-10-10 株式会社アドバンスト・メディア Moving image output apparatus, moving image output method, and moving image output program
US20200184203A1 (en) * 2018-12-05 2020-06-11 International Business Machines Corporation Automatically suggesting behavioral adjustments during video conferences

Patent Citations (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1422494A (en) * 2000-12-05 2003-06-04 皇家菲利浦电子有限公司 Method and apparatus for predicting events in video conferencing and other applications
US20050110867A1 (en) * 2003-11-26 2005-05-26 Karsten Schulz Video conferencing system with physical cues
JP2005269207A (en) * 2004-03-18 2005-09-29 Nippon Telegr & Teleph Corp <Ntt> Information delivery method and communication apparatus for realizing this method, and program therefor
WO2009009966A1 (en) * 2007-07-17 2009-01-22 Huawei Technologies Co., Ltd. A method, device and system for displaying a speaker in videoconference
CN102843543A (en) * 2012-09-17 2012-12-26 华为技术有限公司 Video conferencing reminding method, device and video conferencing system
WO2014040429A1 (en) * 2012-09-17 2014-03-20 华为技术有限公司 Video conference alerting method and device and video conference system
CN107636684A (en) * 2015-03-18 2018-01-26 阿凡达合并第二附属有限责任公司 Emotion identification in video conference
WO2018076615A1 (en) * 2016-10-27 2018-05-03 中兴通讯股份有限公司 Information transmitting method and apparatus
CN107993674A (en) * 2016-10-27 2018-05-04 中兴通讯股份有限公司 Emotion control method and device
JP2019176375A (en) * 2018-03-29 2019-10-10 株式会社アドバンスト・メディア Moving image output apparatus, moving image output method, and moving image output program
US20200184203A1 (en) * 2018-12-05 2020-06-11 International Business Machines Corporation Automatically suggesting behavioral adjustments during video conferences

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116996801A (en) * 2023-09-25 2023-11-03 福州天地众和信息技术有限公司 Intelligent conference debugging speaking system with wired and wireless access AI
CN116996801B (en) * 2023-09-25 2023-12-12 福州天地众和信息技术有限公司 Intelligent conference debugging speaking system with wired and wireless access AI

Also Published As

Publication number Publication date
CN111986703B (en) 2023-05-26

Similar Documents

Publication Publication Date Title
US11386381B2 (en) Meeting management
US8649494B2 (en) Participant alerts during multi-person teleconferences
US10269374B2 (en) Rating speech effectiveness based on speaking mode
US7933226B2 (en) System and method for providing communication channels that each comprise at least one property dynamically changeable during social interactions
Roger et al. The development of a comprehensive system for classifying interruptions
US9666209B2 (en) Prevention of unintended distribution of audio information
US20210327436A1 (en) Voice Interaction Method, Device, and System
US20170154637A1 (en) Communication pattern monitoring and behavioral cues
US10652396B2 (en) Stream server that modifies a stream according to detected characteristics
WO2014040429A1 (en) Video conference alerting method and device and video conference system
US20220131979A1 (en) Methods and systems for automatic queuing in conference calls
US20210306457A1 (en) Method and apparatus for behavioral analysis of a conversation
JP7463469B2 (en) Automated Call System
US20220139388A1 (en) Voice Filtering Other Speakers From Calls And Audio Messages
CN113779208A (en) Method and device for man-machine conversation
US11164577B2 (en) Conversation aware meeting prompts
CN111986703A (en) Video conference method and system, and computer readable storage medium
CN110865789A (en) Method and system for intelligently starting microphone based on voice recognition
JP2000333150A (en) Video conference system
US20210327416A1 (en) Voice data capture
US20240155058A1 (en) Method for dynamically adjusting a volume level in an online meeting
Heldner et al. Interruption impossible
JP2023088360A (en) Video call device, video call method, and control program of video call device
WO2023163895A1 (en) Systems and methods for improved group communication sessions
CN116566758A (en) Audio and video conference quick response, teaching questioning response method and electronic equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant