CN107333090B - Video conference data processing method and platform - Google Patents

Video conference data processing method and platform

Info

Publication number
CN107333090B
CN107333090B (application CN201610283899.1A)
Authority
CN
China
Prior art keywords
video
speaker
voiceprint
identity information
information
Prior art date
Legal status
Active
Application number
CN201610283899.1A
Other languages
Chinese (zh)
Other versions
CN107333090A (en)
Inventor
赵婧 (Zhao Jing)
曹宁 (Cao Ning)
徐晓微 (Xu Xiaowei)
Current Assignee
China Telecom Corp Ltd
Original Assignee
China Telecom Corp Ltd
Priority date
Filing date
Publication date
Application filed by China Telecom Corp Ltd filed Critical China Telecom Corp Ltd
Priority to CN201610283899.1A
Publication of CN107333090A
Application granted
Publication of CN107333090B

Classifications

    • H — ELECTRICITY
    • H04 — ELECTRIC COMMUNICATION TECHNIQUE
    • H04N — PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 7/00 — Television systems
    • H04N 7/14 — Systems for two-way working
    • H04N 7/15 — Conference systems
    • H04N 7/157 — Conference systems defining a virtual conference space and using avatars or agents

Abstract

The invention provides a video conference data processing method and platform, relating to the technical field of video conferences. The method comprises: acquiring voiceprint information of a speaker; recognizing the voiceprint information to determine the speaker's identity information; and compositing the speaker's identity information with the video picture so that the composited picture can be displayed. With this method, the speaker's identity can be recognized from the participants' voices and shown to the participants in the video picture, making it easy for participants to identify who is speaking and improving the user experience of the video conference.

Description

Video conference data processing method and platform
Technical Field
The invention relates to the technical field of video conferences, in particular to a video conference data processing method and a video conference data processing platform.
Background
At present, video conferences generally distinguish participants by microphone activation: whichever sound source is active is assumed to belong to the current speaker.
However, in many cases, especially in large conferences, several people share one microphone, or several participants use the same sound source. Distinguishing sound sources alone then cannot identify the speaker, which greatly affects the conference: participants cannot match the speech content to the person speaking, a large gap opens between the video conference and an in-person meeting, and the user friendliness of the video conference drops sharply.
Disclosure of Invention
It is an object of the present invention to propose a solution that makes it easy for video conference users to identify the speaker.
According to one aspect of the present invention, a video conference data processing method is provided, comprising: acquiring voiceprint information of a speaker; recognizing the voiceprint information to determine the speaker's identity information; and compositing the speaker's identity information with the video picture so that the composited picture can be displayed.
Optionally, the video picture is a virtual reality video picture, and the video terminal is a virtual reality video display terminal.
Optionally, recognizing the voiceprint information and determining the identity information of the speaker comprises: performing feature matching according to the voiceprint information, and identifying the voiceprint features matched with the voiceprint information; and searching identity information of the speaker corresponding to the matched voiceprint characteristics.
Optionally, the video-synthesizing and displaying the identity information of the speaker with the video picture includes: carrying out video synthesis on the identity information of the speaker and the video picture; and sending the video picture after the video synthesis to a video terminal for display.
Optionally, the method further comprises: extracting voiceprint features of the participants' voices based on the recorded voices to generate a voiceprint library; and associating the voiceprint features of each participant's voice with that participant's identity information.
Optionally, the method further comprises: acquiring the facial features of the speaker according to the association between the voiceprint features and the facial features; locating the speaker in the video picture according to the speaker's facial features; and compositing the speaker's location marker with the video picture.
Optionally, the method further comprises: extracting facial features of the participants; facial features of the participant are associated with voiceprint features.
With this method, the speaker's identity can be recognized from the participants' voices and shown to the participants in the video picture, making it easy for participants to identify who is speaking and improving the user experience of the video conference.
According to another aspect of the present invention, there is provided a video conference platform comprising: the voiceprint information extraction module is used for acquiring the voiceprint information of the speaker; the identity information determining module is used for identifying the voiceprint information and determining the identity information of the speaker; and the video synthesis module is used for carrying out video synthesis on the identity information of the speaker and the video picture so as to display the video picture after video synthesis.
Optionally, the video picture is a virtual reality video picture, and the video terminal is a virtual reality video display terminal.
Optionally, the identity information determination module includes: the voiceprint matching unit is used for carrying out feature matching according to the voiceprint information and identifying matched voiceprint features; and the identity information acquisition unit is used for acquiring the identity information of the speaker corresponding to the matched voiceprint characteristics.
Optionally, the video composition module comprises: the video synthesis unit is used for carrying out video synthesis on the identity information of the speaker and the video picture; and the video sending unit is used for sending the video picture after the video synthesis to the video terminal for displaying.
Optionally, the method further comprises: the voiceprint feature extraction module is used for extracting voiceprint features of the voices of the participants based on the recorded voices of the participants to generate a voiceprint library; and the identity information correlation module is used for correlating the voiceprint characteristics of the sound of the participant with the identity information of the participant.
Optionally, the method further comprises: the facial feature acquisition module, used for acquiring the facial features of the speaker according to the association between the voiceprint features and the facial features; and the facial feature positioning module, used for locating the speaker in the video picture according to the speaker's facial features. The video synthesis module is also used for compositing the speaker's location marker with the video picture.
Optionally, the method further comprises: the facial feature extraction module is used for extracting facial features of the participants; and the facial feature association module is used for associating the facial features of the participants with the voiceprint features.
The platform can recognize the speaker's identity from the participants' voices and show it to the participants in the video picture, making it easy for participants to identify who is speaking and improving the user experience of the video conference.
Drawings
The accompanying drawings, which are included to provide a further understanding of the invention and are incorporated in and constitute a part of this application, illustrate embodiment(s) of the invention and together with the description serve to explain the invention without limiting the invention. In the drawings:
fig. 1 is a flowchart of an embodiment of a video conference data processing method according to the present invention.
Fig. 2 is a flowchart of another embodiment of a video conference data processing method according to the present invention.
Fig. 3 is a flowchart of a video conference data processing method according to another embodiment of the present invention.
Fig. 4 is a flowchart of a video conference data processing method according to still another embodiment of the present invention.
Fig. 5 is a schematic diagram of one embodiment of a video conferencing platform of the present invention.
Fig. 6 is a schematic diagram of another embodiment of a video conferencing platform of the present invention.
Fig. 7 is a schematic diagram of yet another embodiment of a video conferencing platform of the present invention.
Fig. 8 is a schematic diagram of yet another embodiment of a video conferencing platform of the present invention.
Detailed Description
The technical solution of the present invention is described in further detail below with reference to the accompanying drawings and embodiments.
A flow diagram of one embodiment of a video conference data processing method of the present invention is shown in fig. 1.
In step 101, voiceprint information of a speaker is acquired. In one embodiment, the sound collected from the microphone may be subjected to audio data processing to obtain voiceprint information of the speaker.
In step 102, the voiceprint information is recognized and the identity information of the speaker is determined. In one embodiment, the speaker's voiceprint information can be feature-matched against the participants' voiceprint features to find the matching voiceprint feature, from which the identity information of the speaker can be determined.
In step 103, the identity information of the speaker is video-composited with the video frame. The synthesized video picture has identity information of the speaker. The video pictures after the synthesis processing can be sent to each terminal of the conference, so that the participants can know the identity information of the speaker while watching the video pictures.
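Steps 101 to 103 can be sketched as a minimal pipeline. Everything below is an illustrative assumption rather than the patent's actual implementation: the names (VOICEPRINT_DB, match_voiceprint, overlay_identity), the toy three-dimensional "voiceprint" vectors, and the participant names are all hypothetical, and a real system would use a proper speaker-recognition model.

```python
from typing import Dict, List, Optional

# Hypothetical enrolled "voiceprint library": participant name -> feature vector.
# Real voiceprints would be high-dimensional embeddings, not 3-vectors.
VOICEPRINT_DB: Dict[str, List[float]] = {
    "Zhang Wei": [0.9, 0.1, 0.3],
    "Li Na":     [0.2, 0.8, 0.5],
}

def match_voiceprint(features: List[float]) -> Optional[str]:
    """Step 102: nearest-neighbour match against enrolled voiceprints."""
    if not VOICEPRINT_DB:
        return None
    def dist(a: List[float], b: List[float]) -> float:
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(VOICEPRINT_DB, key=lambda name: dist(VOICEPRINT_DB[name], features))

def overlay_identity(frame: dict, identity: str) -> dict:
    """Step 103: composite the speaker's identity into the video picture
    (modelled here as attaching a caption to a frame dict)."""
    out = dict(frame)
    out["caption"] = identity
    return out

# Step 101 would extract `features` from the microphone audio; mocked here.
speaker = match_voiceprint([0.85, 0.15, 0.35])
print(overlay_identity({"pixels": "..."}, speaker)["caption"])  # -> Zhang Wei
```

The nearest-neighbour search stands in for whatever matching the platform actually performs; the point is only that recognition maps a live voiceprint to an enrolled identity before composition.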
With this method, the speaker's identity can be recognized from the participants' voices and shown to the participants in the video picture, making it easy for participants to identify who is speaking and improving the user experience of the video conference.
In one embodiment, the video pictures can be virtual reality video pictures: a virtual reality scene can be created at each meeting place through virtual reality video display terminals, or the atmosphere of an in-person meeting can be created by having participants wear virtual reality display glasses, improving the meeting experience. Such a method further improves the effect of the video conference.
A flow chart of another embodiment of the videoconference data processing method of the present invention is shown in fig. 2.
In step 201, voiceprint information of a speaker is acquired. In one embodiment, the sound collected from the microphone may be subjected to audio data processing to obtain voiceprint information of the speaker.
In step 202, feature matching is performed based on the voiceprint information, identifying voiceprint features that match the voiceprint information.
In step 203, the identity information corresponding to the matched voiceprint feature is obtained. The speaker's identity information may include the speaker's name, title, affiliation, relation to the conference, and so on; with this information, the speaker's identity and position can be placed more intuitively. It may also include contact details such as the speaker's telephone number, so that participants can contact each other directly after the meeting.
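The identity record obtained in step 203 could look like the following sketch; the field names and the caption format are assumptions for illustration, not the patent's data model.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class SpeakerIdentity:
    """Identity information looked up from the matched voiceprint (step 203)."""
    name: str
    title: str
    affiliation: str
    conference_role: str          # e.g. "keynote speaker"
    phone: Optional[str] = None   # enables direct contact after the meeting

    def caption(self) -> str:
        """Text to composite into the video picture (step 204)."""
        return f"{self.name} | {self.title} | {self.affiliation}"

s = SpeakerIdentity("Zhang Wei", "CTO", "Example Corp", "keynote speaker")
print(s.caption())  # -> Zhang Wei | CTO | Example Corp
```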
In step 204, the acquired identity information is video-synthesized with the video frame. In one embodiment, the identity information of the speaker may be displayed at a predetermined location of the virtual reality video frame.
In step 205, the video-combined screen is transmitted to the video terminal and displayed.
No participant can know every other participant, especially in a large conference, and knowing the sound source alone does not reveal the speaker's name, title, or conference-related information. With this method, the speaker can be recognized from the participants' voices, the speaker's identity information looked up, and that information shown to the participants in the video picture, so that participants better understand who is speaking and the speaker's background, improving the user experience of the video conference.
In one embodiment, a voiceprint library including voiceprint characteristics of the participant is established, and voiceprint information is identified based on the voiceprint library. A flow chart of yet another embodiment of the videoconference data processing method of the present invention is shown in fig. 3.
In step 301, voiceprint features of the participant's voice are extracted based on the recorded participant's voice, and a voiceprint library is generated. In one embodiment, each participant may be asked to enter a voice before the meeting begins. In another embodiment, only the voices of the participants who have not extracted the voiceprint feature may be entered.
In step 302, the voiceprint features of each participant are associated with that participant's identity information. In one embodiment, the identity information of each participant may be entered in advance, and the voiceprint features and identity information may be associated during voice entry or voiceprint feature extraction. In one embodiment, the identity information may include the participant's name, title, relation to the conference, contact details, and the like.
In step 303, during the conference, the voice of the speaker is collected and voiceprint information is extracted.
In step 304, the voiceprint information of the speaker is matched with the voiceprint features in the voiceprint library, the matched voiceprint features are determined, and then the identity information associated with the voiceprint features is obtained.
In step 305, the identity information of the speaker is video-composited with the video frame. The synthesized video picture has identity information of the speaker. The video pictures after the synthesis processing can be sent to each terminal of the conference, so that the participants can know the identity information of the speaker while watching the video pictures.
With this method, a voiceprint library containing the participants' voiceprint features can be generated and voiceprint information recognized against it, so that the speaker's voiceprint can be identified quickly and effectively and the speaker's identity information determined from the association between voiceprint features and identity information, improving operating efficiency and easing deployment.
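Steps 301 to 304 amount to an enrol-then-match loop, sketched below under strong simplifying assumptions: the "feature extractor" is a stand-in average rather than a real voiceprint algorithm, and all names and numbers are hypothetical.

```python
from typing import Dict, List, Tuple

def extract_feature(samples: List[float]) -> float:
    # Placeholder for real voiceprint feature extraction (e.g. MFCC + model).
    return sum(samples) / len(samples)

def build_voiceprint_library(
    recordings: Dict[str, List[float]]
) -> List[Tuple[float, str]]:
    """Steps 301-302: (feature, identity) pairs form the voiceprint library."""
    return [(extract_feature(v), name) for name, v in recordings.items()]

library = build_voiceprint_library({
    "Li Na":     [0.2, 0.4, 0.6],   # enrolled feature: 0.4
    "Wang Fang": [0.7, 0.9, 0.8],   # enrolled feature: 0.8
})

def identify(sample_feature: float) -> str:
    """Steps 303-304: match the live voiceprint against the library."""
    return min(library, key=lambda fv: abs(fv[0] - sample_feature))[1]

print(identify(0.41))  # -> Li Na
```

Enrolment happens once before the conference; only `identify` runs per utterance, which is why a precomputed library makes recognition fast.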
In one embodiment, when voiceprint features are extracted from the participants' voices to generate the voiceprint library, the recorded features can be stored in groups according to the meeting place of each participant. When recognizing a speaker's voiceprint, the meeting place of the speaker can first be determined from the sound source, and the voiceprint then matched only against the features in that meeting place's group. This greatly reduces the computation needed for voiceprint recognition and improves operating efficiency.
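The venue-grouped storage described above can be sketched as a two-level lookup; the venue labels, participant names, and scalar "features" are illustrative assumptions.

```python
from typing import Dict, List, Tuple

# meeting place -> list of (enrolled feature, participant name)
GROUPED_LIBRARY: Dict[str, List[Tuple[float, str]]] = {
    "Meeting place A": [(0.2, "Li Na"), (0.5, "Zhao Lei")],
    "Meeting place B": [(0.3, "Wang Fang"), (0.8, "Chen Jie")],
}

def identify_in_venue(venue: str, feature: float) -> str:
    """Match only within the speaker's meeting place, so the search scans
    one group instead of the whole voiceprint library."""
    group = GROUPED_LIBRARY[venue]
    return min(group, key=lambda fv: abs(fv[0] - feature))[1]

# The sound source tells us which meeting place the speaker is in:
print(identify_in_venue("Meeting place B", 0.78))  # -> Chen Jie
```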
In one embodiment, the speaker can be positioned and labeled according to the video picture, so that the participants can see the speaker more intuitively, and the user experience is further improved.
A flow chart of yet another embodiment of the video conference data processing method of the present invention is shown in fig. 4.
In step 401, voiceprint features of the participant's voice are extracted based on the recorded participant's voice, and a voiceprint library is generated. In one embodiment, each participant may be asked to enter a voice before the meeting begins. In another embodiment, only the voices of the participants who have not extracted the voiceprint feature may be entered.
In step 402, the voiceprint features of each participant are associated with that participant's identity information. In one embodiment, the identity information of each participant can be entered in advance, and the voiceprint features and identity information associated during voice entry or voiceprint feature extraction. In one embodiment, the identity information may include the participant's name, title, relation to the conference, contact details, and the like.
In step 403, facial features of the participants are extracted and associated with their voiceprint features. The facial features can be collected from photos uploaded by the participants, or captured while the participants' voices are being recorded.
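The enrolment record produced by steps 401 to 403 could link the three kinds of data per participant as below; the record shape, the scalar voiceprint stand-in, and the face identifier are all hypothetical.

```python
from typing import Dict, NamedTuple

class Enrolment(NamedTuple):
    """One participant's enrolment: voiceprint, facial features, and identity
    are keyed together so either biometric can reach the other two."""
    voiceprint: float   # stand-in for a real voiceprint feature vector
    face_id: str        # stand-in for extracted facial features
    title: str

ENROLMENTS: Dict[str, Enrolment] = {}

def enrol(name: str, voiceprint: float, face_id: str, title: str) -> None:
    ENROLMENTS[name] = Enrolment(voiceprint, face_id, title)

enrol("Wang Fang", 0.8, "face_042", "Project Lead")
print(ENROLMENTS["Wang Fang"].face_id)  # -> face_042
```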
In step 404, during the conference, the voice of the speaker is collected and voiceprint information is extracted.
In step 405, the voiceprint information of the speaker is matched with the voiceprint features in the voiceprint library to determine the matched voiceprint features.
In step 406, identity information and facial features associated with the voiceprint feature are obtained.
In step 407, the speaker is located in the video picture and a location marker is added; the location marker and the speaker's identity information are composited with the video picture, and the composited picture is transmitted to each terminal.
With this method, the participants' facial features can be collected, the speaker determined using the voiceprint features as the key, and the speaker located and marked in the video picture. Participants thus both learn the speaker's identity information and see the speaker directly, making the video conference more user-friendly and further improving the user experience. Especially in a virtual reality video conference scene, the speaker can be located quickly, approaching the effect of face-to-face communication.
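Steps 405 to 407 can be sketched as a locate-and-composite step; detected faces are mocked as bounding boxes keyed by a face identifier, and all names here are assumptions (a real system would run a face detector and matcher).

```python
from typing import Dict, Optional, Tuple

Box = Tuple[int, int, int, int]  # x, y, width, height

def locate_speaker(detected_faces: Dict[str, Box], face_id: str) -> Optional[Box]:
    """Steps 405-406: the matched voiceprint yields `face_id`; look it up
    among the faces detected in the current video picture."""
    return detected_faces.get(face_id)

def composite(frame: dict, identity: str, box: Optional[Box]) -> dict:
    """Step 407: attach the identity caption and the location marker."""
    out = dict(frame)
    out["caption"] = identity
    out["marker"] = box  # a highlight drawn around the speaker, if located
    return out

faces = {"face_042": (120, 80, 64, 64), "face_051": (300, 90, 60, 60)}
frame = composite({"pixels": "..."}, "Wang Fang", locate_speaker(faces, "face_042"))
print(frame["marker"])  # -> (120, 80, 64, 64)
```

Returning `None` when the face is not found lets the terminal still show the identity caption even when the speaker cannot be localized in the picture.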
A schematic diagram of one embodiment of a video conferencing platform of the present invention is shown in fig. 5. The voiceprint information extraction module 501 can obtain the voiceprint information of the speaker. In one embodiment, the voiceprint information extraction module 501 can perform audio data processing on the sound collected from the microphone to obtain the voiceprint information of the speaker. The identity information determination module 502 can recognize the voiceprint information and determine the identity information of the speaker. In one embodiment, the identity information determination module 502 may feature-match the speaker's voiceprint information against the participants' voiceprint features to determine the matching voiceprint feature, thereby determining the identity information of the speaker. The video synthesis module 503 is configured to composite the speaker's identity information with the video picture, so that the composited picture carries the speaker's identity information. The composited picture can be sent to each terminal of the conference, so that participants learn the speaker's identity while watching the video.
The video conference platform can identify the identity of the speaker according to the voice of the participants and display the identity of the speaker to the participants through the video pictures, so that the participants can conveniently identify the identity of the speaker, and the user experience of the video conference is improved.
In one embodiment, the video pictures can be virtual reality video pictures: a virtual reality scene can be created at each meeting place through virtual reality video display terminals, or the atmosphere of an in-person meeting can be created by having participants wear virtual reality display glasses, improving the meeting experience.
A schematic diagram of another embodiment of the video conferencing platform of the present invention is shown in fig. 6. The voiceprint information extraction module 61 is configured to obtain the voiceprint information of the speaker. The identity information determining module 62 comprises a voiceprint matching unit 621 and an identity information obtaining unit 622: the voiceprint matching unit 621 performs feature matching according to the voiceprint information to identify the matching voiceprint feature, and the identity information obtaining unit 622 obtains the identity information corresponding to that feature. The speaker's identity information may include the speaker's name, title, and conference-related information, which helps place the speaker's identity and position more intuitively; it may also include contact details such as the speaker's telephone number, so that participants can contact each other directly after the meeting. The video synthesizing module 63 comprises a video synthesizing unit 631 and a video sending unit 632: the video synthesizing unit 631 composites the obtained identity information with the video picture, and the video sending unit 632 sends the composited picture to the video terminals for display to the participants.
The platform can identify the speaker according to the voice of the participants, inquire the relevant identity information of the speaker, and display the identity information to the participants through the video pictures, so that the participants can know the identity of the speaker and the background information of the speaker better, and the user experience of the video conference is improved.
A schematic diagram of yet another embodiment of a video conferencing platform of the present invention is shown in fig. 7. The structures and functions of the voiceprint information extraction module 701, the identity information determination module 702, and the video composition module 703 are similar to those in the embodiment of fig. 5. The video conferencing platform also includes a voiceprint feature extraction module 704 and an identity information association module 705. The voiceprint feature extraction module 704 extracts voiceprint features from the recorded voices of the participants to generate a voiceprint library; the identity information association module 705 associates each participant's voiceprint features with that participant's identity information. In one embodiment, the identity information of each participant can be entered in advance, and the voiceprint features and identity information associated during voice entry or voiceprint feature extraction. In one embodiment, the identity information may include the participant's name, title, relation to the conference, contact details, and the like.
The platform can generate a voiceprint library containing the participants' voiceprint features and recognize voiceprint information against it, so that the speaker's voiceprint can be identified quickly and effectively and the speaker's identity information determined from the association between voiceprint features and identity information, improving operating efficiency and easing deployment.
In one embodiment, when the voiceprint feature extraction module 704 extracts voiceprint features from the participants' voices to generate the voiceprint library, the recorded features can be stored in groups according to the meeting place of each participant. When recognizing a speaker's voiceprint, the platform can first determine the speaker's meeting place from the sound source and then match the voiceprint only against the features in that meeting place's group. This greatly reduces the computation needed for voiceprint recognition and improves operating efficiency.
In one embodiment, the video conference platform can also position and mark the speaker according to the video picture, so that the participants can see the speaker more intuitively, and the user experience is further improved.
A schematic diagram of yet another embodiment of a video conferencing platform of the present invention is shown in fig. 8. The voiceprint feature extraction module 804 extracts voiceprint features from the recorded voices of the participants and generates a voiceprint library. The identity information association module 805 associates each participant's voiceprint features with that participant's identity information. The facial feature extraction module 806 extracts the participants' facial features, either from photos uploaded by the participants or while the participants' voices are being recorded. The facial feature association module 807 associates the participants' facial features with their voiceprint features.
The voiceprint information extraction module 801 extracts voiceprint information from the speaker's voice collected during the conference. The identity information determination module 802 matches the speaker's voiceprint information against the voiceprint features in the voiceprint library, determines the matching voiceprint feature, and obtains the identity information associated with it. The facial feature acquisition module 808 obtains the facial feature information associated with the matched voiceprint feature. The facial feature positioning module 809 locates the speaker in the video picture according to the acquired facial feature information. The video composition module 803 composites the speaker's location marker and identity information with the video picture, and the composited picture is transmitted to each terminal.
The platform can analyze the participants' facial features, determine the speaker using the voiceprint features as the key, and locate and mark the speaker in the video picture. Participants thus both learn the speaker's identity information and see the speaker directly, making the video conference more user-friendly and further improving the user experience. Especially in a virtual reality video conference scene, the speaker can be located quickly, approaching the effect of face-to-face communication.
Finally, it should be noted that the above examples are only intended to illustrate the technical solution of the present invention, not to limit it. Although the present invention has been described in detail with reference to preferred embodiments, those skilled in the art will understand that modifications may be made to specific embodiments, or equivalent substitutions made for some technical features, without departing from the spirit and scope of the invention as defined by the appended claims.

Claims (8)

1. A method for processing videoconference data, comprising:
extracting voiceprint characteristics of the voice of the participants based on the recorded voice of the participants to generate a voiceprint library;
associating the voiceprint characteristics of the participant's voice with the identity information of the participant;
acquiring voiceprint information of a speaker;
recognizing the voiceprint information and determining the identity information of the speaker;
acquiring the facial features of the speaker according to the association between the voiceprint features and the facial features;
locating the speaker in a video picture according to the facial features of the speaker;
and compositing the identity information of the speaker and the location marker of the speaker with the video picture to display the composited video picture, wherein a virtual reality scene is created at a meeting place through a virtual reality video display terminal, or the atmosphere of a live meeting is created by having participants wear virtual reality video display glasses; the video picture is a virtual reality video picture, and the video terminal is a virtual reality video display terminal.
2. The method of claim 1, wherein the identifying the voiceprint information and determining the identity information of the speaker comprises:
performing feature matching according to the voiceprint information, and identifying voiceprint features matched with the voiceprint information;
and acquiring the identity information of the speaker corresponding to the matched voiceprint characteristics.
3. The method of claim 2, wherein synthesizing the identity information of the speaker with the video picture and displaying it comprises:
synthesizing the identity information of the speaker with a video picture;
and sending the synthesized video picture to a video terminal for display.
4. The method of claim 1, further comprising:
extracting the facial features of the participants;
and associating the facial features of each participant with that participant's voiceprint features.
5. A video conference platform, comprising:
a voiceprint feature extraction module for extracting voiceprint features from the recorded voices of participants to generate a voiceprint library;
an identity information association module for associating the voiceprint features of each participant's voice with the identity information of that participant;
a voiceprint information extraction module for acquiring voiceprint information of a speaker;
an identity information determination module for recognizing the voiceprint information and determining the identity information of the speaker;
a facial feature acquisition module for acquiring the facial features of the speaker according to the association relationship between voiceprint features and facial features;
a facial feature positioning module for locating the speaker in a video picture according to the facial features of the speaker;
and a video synthesis module for synthesizing the identity information of the speaker and a location marker of the speaker with the video picture so as to display the synthesized video picture, wherein a virtual reality scene is built at the meeting site through a virtual reality video display terminal, or the meeting site is presented by the participants wearing virtual reality display glasses, the video picture being a virtual reality video picture and the video terminal being a virtual reality video display terminal.
6. The platform of claim 5, wherein the identity information determination module comprises:
a voiceprint matching unit for performing feature matching on the voiceprint information and identifying the matched voiceprint features;
and an identity information acquisition unit for acquiring the identity information of the speaker corresponding to the matched voiceprint features.
7. The platform of claim 6, wherein the video synthesis module comprises:
a video synthesis unit for synthesizing the identity information of the speaker with a video picture;
and a video sending unit for sending the synthesized video picture to the video terminal for display.
8. The platform of claim 5, further comprising:
a facial feature extraction module for extracting the facial features of the participants;
and a facial feature association module for associating the facial features of each participant with that participant's voiceprint features.
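The flow of claims 1–2 and 4 — enrolling each participant's voiceprint features together with identity information and facial features, then identifying a speaker by feature matching against the library — can be sketched in Python. This is a minimal illustration only: the class and method names, the use of cosine similarity as the matcher, and the acceptance threshold are all assumptions, since the patent does not specify a particular feature representation or matching algorithm.

```python
import math

def cosine_similarity(a, b):
    """Similarity between two feature vectors (1.0 = identical direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

class VoiceprintLibrary:
    """Associates each participant's voiceprint feature vector with
    identity information and, optionally, facial features (claims 1 and 4)."""

    def __init__(self):
        self.entries = []  # list of (voiceprint, identity_info, face_features)

    def enroll(self, voiceprint, identity_info, face_features=None):
        # Build the library from recorded participant voices.
        self.entries.append((voiceprint, identity_info, face_features))

    def identify(self, voiceprint, threshold=0.8):
        """Feature matching per claim 2: return the identity information and
        facial features of the best-matching enrolled voiceprint, or
        (None, None) if no match clears the threshold."""
        best = max(self.entries,
                   key=lambda e: cosine_similarity(e[0], voiceprint),
                   default=None)
        if best and cosine_similarity(best[0], voiceprint) >= threshold:
            return best[1], best[2]
        return None, None
```

A downstream video synthesis module would then use the returned facial features to locate the speaker in the video picture and overlay the identity information and a location marker, as in claims 1 and 3.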
CN201610283899.1A 2016-04-29 2016-04-29 Video conference data processing method and platform Active CN107333090B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610283899.1A CN107333090B (en) 2016-04-29 2016-04-29 Video conference data processing method and platform

Publications (2)

Publication Number Publication Date
CN107333090A CN107333090A (en) 2017-11-07
CN107333090B true CN107333090B (en) 2020-04-07

Family

ID=60192620

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610283899.1A Active CN107333090B (en) 2016-04-29 2016-04-29 Video conference data processing method and platform

Country Status (1)

Country Link
CN (1) CN107333090B (en)

Families Citing this family (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108922538B (en) * 2018-05-29 2023-04-07 平安科技(深圳)有限公司 Conference information recording method, conference information recording device, computer equipment and storage medium
CN109561273A (en) * 2018-10-23 2019-04-02 视联动力信息技术股份有限公司 The method and apparatus for identifying video conference spokesman
CN109194906B (en) * 2018-11-06 2020-09-11 苏州科达科技股份有限公司 Video conference authentication system, method, device and storage medium
CN111182256A (en) * 2018-11-09 2020-05-19 中移(杭州)信息技术有限公司 Information processing method and server
CN112004046A (en) * 2019-05-27 2020-11-27 中兴通讯股份有限公司 Image processing method and device based on video conference
CN110996021A (en) * 2019-11-30 2020-04-10 咪咕文化科技有限公司 Director switching method, electronic device and computer readable storage medium
CN111651632A (en) * 2020-04-23 2020-09-11 深圳英飞拓智能技术有限公司 Method and device for outputting voice and video of speaker in video conference
CN111818294A (en) * 2020-08-03 2020-10-23 上海依图信息技术有限公司 Method, medium and electronic device for multi-person conference real-time display combined with audio and video
CN113014857A (en) * 2021-02-25 2021-06-22 游密科技(深圳)有限公司 Control method and device for video conference display, electronic equipment and storage medium
CN113473066A (en) * 2021-05-10 2021-10-01 上海明我信息技术有限公司 Video conference picture adjusting method
CN113542604A (en) * 2021-07-12 2021-10-22 口碑(上海)信息技术有限公司 Video focusing method and device

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103680497A (en) * 2012-08-31 2014-03-26 百度在线网络技术(北京)有限公司 Voice recognition system and voice recognition method based on video
CN104427292A (en) * 2013-08-22 2015-03-18 中兴通讯股份有限公司 Method and device for extracting a conference summary
CN104639777A (en) * 2013-11-14 2015-05-20 中兴通讯股份有限公司 Conference control method, conference control device and conference system

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7668304B2 (en) * 2006-01-25 2010-02-23 Avaya Inc. Display hierarchy of participants during phone call
CN102263772A (en) * 2010-05-28 2011-11-30 经典时空科技(北京)有限公司 Virtual conference system based on three-dimensional technology
CN103888714B (en) * 2014-03-21 2017-04-26 国家电网公司 3D scene network video conference system based on virtual reality
CN104580986A (en) * 2015-02-15 2015-04-29 王生安 Video communication system combining virtual reality glasses

Similar Documents

Publication Publication Date Title
CN107333090B (en) Video conference data processing method and platform
US9064160B2 (en) Meeting room participant recogniser
CN107911646B (en) Method and device for sharing conference and generating conference record
WO2018107605A1 (en) System and method for converting audio/video data into written records
US7920158B1 (en) Individual participant identification in shared video resources
US9282284B2 (en) Method and system for facial recognition for a videoconference
KR101636716B1 (en) Apparatus of video conference for distinguish speaker from participants and method of the same
CN110139062B (en) Video conference record creating method and device and terminal equipment
CN112653902B (en) Speaker recognition method and device and electronic equipment
CN107527623B (en) Screen transmission method and device, electronic equipment and computer readable storage medium
US20080235724A1 (en) Face Annotation In Streaming Video
CN106331293A (en) Incoming call information processing method and device
JP2007241130A (en) System and device using voiceprint recognition
CN112532931A (en) Video processing method and device and electronic equipment
CN114240342A (en) Conference control method and device
US20160260435A1 (en) Assigning voice characteristics to a contact information record of a person
CN112151041B (en) Recording method, device, equipment and storage medium based on recorder program
CN114257778A (en) Teleconference system and multi-microphone voice recognition playing method
CN113784058A (en) Image generation method and device, storage medium and electronic equipment
CN116472705A (en) Conference content display method, conference system and conference equipment
CN111160051B (en) Data processing method, device, electronic equipment and storage medium
CN112532912A (en) Video processing method and device and electronic equipment
CN114762039A (en) Conference data processing method and related equipment
CN114764690A (en) Method, device and system for intelligently conducting conference summary
CN112671632A (en) Intelligent earphone system based on face recognition and information interaction and/or social contact method thereof

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant