CN110600036A - Conference picture switching device and method based on voice recognition - Google Patents

Conference picture switching device and method based on voice recognition

Info

Publication number
CN110600036A
Authority
CN
China
Prior art keywords
voice recognition
conference
method based
semantic analysis
switching
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910907963.2A
Other languages
Chinese (zh)
Inventor
陈洪浩
冯文澜
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Suirui Technology Group Co Ltd
Original Assignee
Suirui Technology Group Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Suirui Technology Group Co Ltd
Priority to CN201910907963.2A
Publication of CN110600036A
Legal status: Pending

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/26 Speech to text systems
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N7/00 Television systems
    • H04N7/14 Systems for two-way working
    • H04N7/15 Conference systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Two-Way Televisions, Distribution Of Moving Picture Or The Like (AREA)

Abstract

The invention discloses a conference picture switching device and method based on voice recognition. The method comprises the following steps: step one, generating a voice recognition library based on sign-in information and platform address book information; step two, performing conference voice recognition based on the voice recognition library to find a matching result; step three, performing semantic analysis on the matching result; and step four, switching pictures according to the semantic analysis result. The method makes conference picture switching more intelligent and gives the experience of a real face-to-face meeting. With the added semantic recognition function, the participant the user wants to see can be judged accurately, and specific voice operations can also be added, which improves the conference experience.

Description

Conference picture switching device and method based on voice recognition
Technical Field
The present invention relates to the field of wireless communication, and more particularly, to a conference screen switching device and method based on voice recognition.
Background
During a video conference, especially a multi-party conference, the displayed participant picture often needs to be switched to ensure a good conference effect.
In the prior art, conference pictures are switched either manually or by voice excitation. With manual switching, a leader or another member must first request the switch and an operator then performs it; the operator has to search for the target conference picture during the operation, so efficiency is low and the experience is poor. With voice excitation, the picture is switched to a party after that party speaks and the voice is detected. In practice, however, if the person being addressed has temporarily left or did not hear the call, the switching function is never triggered. The misjudgment rate is also high, and active switching is not supported: the other participants can only wait and keep asking, so both the effect and the experience are poor.
The information disclosed in this background section is only for enhancement of understanding of the general background of the invention and should not be taken as an acknowledgement or any form of suggestion that this information forms the prior art already known to a person skilled in the art.
Disclosure of Invention
The invention aims to provide a conference picture switching device and method based on voice recognition that make conference picture switching more intelligent and give the experience of a real face-to-face meeting. The invention can not only judge accurately which participant the user wants to see, but also support specific voice operations, thereby improving the conference experience.
To achieve the above object, the present invention provides a conference screen switching apparatus based on voice recognition, comprising: a voice recognition library generation module, which generates a voice recognition library based on check-in information and platform address book information; a voice recognition module, which performs conference voice recognition based on the voice recognition library to find a matching result; a semantic analysis module, which performs semantic analysis on the matching result of the voice recognition module; and a picture switching module, which switches pictures according to the semantic analysis result.
The invention also provides a conference picture switching method based on voice recognition, comprising the following steps: step one, generating a voice recognition library based on sign-in information and platform address book information; step two, performing conference voice recognition based on the voice recognition library to find a matching result; step three, performing semantic analysis on the matching result; and step four, switching pictures according to the semantic analysis result.
In a preferred embodiment, step one specifically includes: a sign-in step, a step of acquiring information on the participating members from the platform, and a step of generating a voice recognition detection list.
In a preferred embodiment, the platform address book information includes: the names, nicknames, and remark names of the participants.
In a preferred embodiment, the check-in information is acquired by: face recognition, manual check-in, card-swiping check-in, or automatic terminal check-in.
In a preferred embodiment, step two specifically includes: a voice recognition step, a matching step based on the voice recognition library, and a matching judgment step.
In a preferred embodiment, step three specifically includes: a semantic learning and editing step, a semantic analysis and record generation step, and a step of judging whether the semantic scene is satisfied.
In a preferred embodiment, the semantic analysis determines whether the main sentence calls a person, talks about a person, or commands an operation.
In a preferred embodiment, step four specifically includes: a step of consulting the display strategy and a step of switching the conference picture.
In a preferred embodiment, the display strategy for switching the picture is as follows: the picture in which the person occupies the larger proportion of the frame is displayed preferentially, and if the proportions are equivalent, the picture showing the person's front is displayed preferentially.
Compared with the prior art, the conference picture switching method based on voice recognition integrates the sign-in information and the address book information into the voice recognition library, which makes conference picture switching more intelligent and gives the experience of a real face-to-face meeting. With the added semantic recognition function, the participant the user wants to see can be judged accurately, and specific voice operations can also be added, which improves the conference experience.
Drawings
Fig. 1 is a flowchart of a conference screen switching method based on voice recognition according to an embodiment of the present invention.
Detailed Description
The following detailed description of the present invention is provided in conjunction with the accompanying drawings, but it should be understood that the scope of the present invention is not limited to the specific embodiments.
Throughout the specification and claims, unless explicitly stated otherwise, the word "comprise", or variations such as "comprises" or "comprising", will be understood to imply the inclusion of a stated element or component but not the exclusion of any other element or component.
The conference picture switching device based on voice recognition according to the preferred embodiment of the present invention comprises a voice recognition library generation module, a voice recognition module, a semantic analysis module, and a picture switching module. The voice recognition library generation module generates a voice recognition library based on the sign-in information and the platform address book information; the voice recognition module performs conference voice recognition based on the voice recognition library to find a matching result; the semantic analysis module performs semantic analysis on the matching result of the voice recognition module; and the picture switching module switches pictures according to the semantic analysis result.
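As a rough illustration of how the four modules could be chained, the following Python sketch wires them together as plain callables. The function name `make_switcher` and all four callable signatures are assumptions made for this example; the patent does not specify any programming interface.

```python
from typing import Callable, Optional


def make_switcher(
    build_library: Callable[[], dict],
    recognize: Callable[[str, dict], list],
    analyze: Callable[[list], Optional[str]],
    switch: Callable[[str], None],
) -> Callable[[str], Optional[str]]:
    """Chain the four modules; every signature here is a hypothetical interface."""
    # Library generation module: run once, before utterances are processed.
    library = build_library()

    def handle_utterance(utterance: str) -> Optional[str]:
        # Voice recognition module: find matches against the library.
        matches = recognize(utterance, library)
        # Semantic analysis module: decide whether a switch target results.
        target = analyze(matches) if matches else None
        # Picture switching module: act only when a target was decided.
        if target is not None:
            switch(target)
        return target

    return handle_utterance


# Usage with trivial stand-ins for each module.
switched = []
handler = make_switcher(
    build_library=lambda: {"Li IV": "beijing-system"},
    recognize=lambda text, lib: [lib[name] for name in lib if name in text],
    analyze=lambda matches: matches[0],
    switch=switched.append,
)
first = handler("Li IV, what do you think?")
second = handler("no names in this sentence")
```

The stand-ins are deliberately naive; each would be replaced by the corresponding module described in the steps that follow.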
As shown in fig. 1, the main flow of the switching method of the conference picture switching apparatus based on voice recognition according to the preferred embodiment of the present invention is as follows: a name is recognized by voice, the name is matched to its corresponding terminal (or conference room system), and the display is switched to the camera picture of that terminal (or conference room system). The method specifically comprises the following steps:
First, a voice recognition library is generated based on sign-in information and platform address book information.
This step mainly involves the platform address book and the sign-in function. The platform address book information includes the names, nicknames, and remark names of the participants. The check-in information can be acquired by face recognition, manual check-in, card-swiping check-in, or automatic terminal check-in, and is used to determine information such as which members are present in a conference room and where that conference room is located. The conference picture switching method combines this information to form a voice recognition library and a query table.
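A minimal sketch of this step in Python is given below. The `Contact` and `CheckIn` record layouts, the function name `build_recognition_library`, the sample participant "Wang V", and the choice to leave members who have not checked in out of the library are all assumptions for illustration; the patent only states that the two information sources are combined into a library and a query table.

```python
from dataclasses import dataclass, field


@dataclass
class Contact:
    """One platform address-book entry (hypothetical schema)."""
    user_id: str
    name: str
    nicknames: list = field(default_factory=list)
    remark_names: list = field(default_factory=list)


@dataclass
class CheckIn:
    """One sign-in record (hypothetical schema)."""
    user_id: str
    room: str      # conference room or site where the member signed in
    terminal: str  # terminal or room system whose camera shows the member


def build_recognition_library(contacts, check_ins):
    """Return (library, query_table).

    library:     spoken alias -> user_id, for members who have checked in
    query_table: user_id -> list of (room, terminal) where the member is present
    """
    query_table = {}
    for record in check_ins:
        query_table.setdefault(record.user_id, []).append((record.room, record.terminal))

    library = {}
    for contact in contacts:
        if contact.user_id not in query_table:
            continue  # assumption: absent members are not added to the detection list
        for alias in [contact.name, *contact.nicknames, *contact.remark_names]:
            library[alias] = contact.user_id
    return library, query_table


contacts = [
    Contact("u1", "Zhang III", nicknames=["Old Zhang"]),
    Contact("u2", "Li IV", remark_names=["Manager Li"]),
    Contact("u3", "Wang V"),  # hypothetical member who never checks in
]
check_ins = [
    CheckIn("u1", "Room 1", "room1-system"),
    CheckIn("u1", "Room 1", "zhang-notebook"),
    CheckIn("u2", "Beijing", "beijing-system"),
]
library, query_table = build_recognition_library(contacts, check_ins)
```

Keeping the alias map and the location table separate means a recognized nickname resolves first to a member and only then to the terminals that can show that member.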
Second, conference voice recognition is performed based on the voice recognition library to find a matching result.
For example, in the utterance "... next, the floor is handed over to Zhang III. Zhang III?", the name "Zhang III" is found as a match (that is, a match is found in the course of calling him).
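The matching step can be sketched as a search of the recognized text for every alias in the library. This Python example assumes the speech recognizer has already produced a text transcript; the function name `find_name_matches` and the longest-alias-first rule are illustrative choices, not details from the patent.

```python
def find_name_matches(transcript, library):
    """Return sorted (position, alias, user_id) tuples for aliases found in the transcript."""
    matches = []
    # Try longer aliases first so a long alias wins over a shorter one it contains.
    for alias in sorted(library, key=len, reverse=True):
        start = 0
        while (pos := transcript.find(alias, start)) != -1:
            end = pos + len(alias)
            # Skip hits that overlap an already accepted match.
            overlaps = any(p < end and pos < p + len(a) for p, a, _ in matches)
            if not overlaps:
                matches.append((pos, alias, library[alias]))
            start = end
    return sorted(matches)


library = {"Zhang III": "u1", "Zhang": "u9", "Li IV": "u2"}
transcript = "Next, the floor is handed over to Zhang III. Zhang III?"
matches = find_name_matches(transcript, library)
```

Both occurrences of the name are reported, so the later semantic analysis can judge each one separately.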
Third, semantic analysis is performed on the matching result.
Semantic analysis determines whether the main sentence calls a person, talks about a person, or commands an operation. If the utterance is judged to be a call instruction, the designated meeting place is found from the query table and the picture is switched. For example, in "... next, the floor is handed over to Zhang III. Zhang III?", the name "Zhang III" appears twice. Both occurrences are recognized by voice recognition, and because a name was recognized, both are passed to the semantic analysis step, which decides whether each is a call instruction based on the pauses before and after the name, the coherence of the sentence, or other prior-art means. Switching is therefore imperceptible during the conference. For example, if Zhang III wants Li IV to give an opinion, he only needs to say "Li IV, what do you think?"; the picture then switches to Li IV, and everyone can see Li IV's video picture while waiting for the answer, which better reproduces a real face-to-face meeting. Because the method adds a semantic recognition function, it can not only judge accurately which participant the user wants to see, but also support specific voice operations (command operations), such as "switch to the Beijing conference site".
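One very simple way to approximate this decision is a rule-based classifier that looks at the pauses around the name and at a command phrase. The Python sketch below is only that: the `NameHit` record, the 0.5 second threshold, and the "switch to" trigger phrase are assumptions, and a real system would likely use the coherence analysis or other prior-art means mentioned above.

```python
from dataclasses import dataclass


@dataclass
class NameHit:
    """One recognized name with its timing context (hypothetical record)."""
    text: str            # sentence fragment that contains the name
    target: str          # matched name or site
    pause_before: float  # silence before the name, in seconds
    pause_after: float   # silence after the name, in seconds


COMMAND_PREFIXES = ("switch to",)  # assumed trigger phrase for command operations
PAUSE_THRESHOLD = 0.5              # assumed value; would need tuning on real audio


def classify(hit: NameHit) -> str:
    """Return 'command', 'call', or 'mention' for one recognized name."""
    if hit.text.lower().strip().startswith(COMMAND_PREFIXES):
        return "command"
    # A name set off by pauses on both sides is treated as addressing the person.
    if hit.pause_before >= PAUSE_THRESHOLD and hit.pause_after >= PAUSE_THRESHOLD:
        return "call"
    return "mention"


def should_switch(hit: NameHit) -> bool:
    """Only calls and commands trigger a picture switch; mentions do not."""
    return classify(hit) in ("call", "command")


mention = NameHit("the floor is handed over to Zhang III", "Zhang III", 0.05, 0.1)
call = NameHit("Zhang III", "Zhang III", 0.8, 1.0)
command = NameHit("Switch to the Beijing conference site", "Beijing", 0.0, 0.0)
```

In the two-occurrence example, the first "Zhang III" sits inside a fluent sentence and is classified as a mention, while the second stands alone between pauses and is classified as a call.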
Finally, the picture is switched according to the semantic analysis result.
The picture is switched according to a display policy: for example, with a single window the picture is switched directly; with multiple windows the picture in the large window is switched; and where several large windows exist, the picture in the window arranged first is switched. For example, Zhang III may have signed in to meeting room 1 while also joining the conference from a notebook computer and a mobile phone in the same room, so that two or more cameras are pointed at him, such as the meeting room 1 system cameras (a conference room system may have more than two cameras) and the notebook terminal camera. In this case a priority algorithm can be applied: the picture in which the person occupies the larger proportion of the frame is displayed first (in the example above, this is certainly the notebook terminal camera), and if the proportions are roughly equal, the picture showing the person's front is displayed first. The method can also be combined with voice excitation to select the device with the shortest sound-pickup distance.
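The priority rule for several cameras showing the same person can be sketched as follows. The `CameraView` record, the 0.05 tolerance that defines "roughly equal", and the function name `pick_view` are assumptions for illustration; how the person's proportion of the frame and the frontal orientation are measured is outside the scope of this sketch.

```python
from dataclasses import dataclass


@dataclass
class CameraView:
    """One candidate camera picture of the target person (hypothetical record)."""
    camera_id: str
    person_ratio: float  # fraction of the frame occupied by the person, 0.0 to 1.0
    frontal: bool        # True if the person is seen from the front


RATIO_TOLERANCE = 0.05  # assumed margin within which ratios count as roughly equal


def pick_view(views):
    """Pick the view to display: largest person ratio, frontal view breaks near-ties."""
    if not views:
        raise ValueError("no candidate views")
    best_ratio = max(v.person_ratio for v in views)
    # Keep only the views whose ratio is roughly equal to the best one.
    contenders = [v for v in views if best_ratio - v.person_ratio <= RATIO_TOLERANCE]
    # Among the contenders prefer a frontal view, then the larger ratio.
    return max(contenders, key=lambda v: (v.frontal, v.person_ratio))


room_views = [
    CameraView("room1-cam-a", 0.10, True),
    CameraView("room1-cam-b", 0.12, False),
    CameraView("zhang-notebook", 0.45, False),
]
near_tie = [
    CameraView("side-cam", 0.40, False),
    CameraView("front-cam", 0.38, True),
]
chosen_room = pick_view(room_views).camera_id
chosen_tie = pick_view(near_tie).camera_id
```

With the meeting room example, the notebook camera wins outright on proportion; the frontal rule only matters when two pictures show the person at about the same size.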
In conclusion, the conference picture switching method based on voice recognition integrates the sign-in information and the address book information into the voice recognition library, which makes conference picture switching more intelligent and gives the experience of a real face-to-face meeting. With the added semantic recognition function, the participant the user wants to see can be judged accurately, and specific voice operations can also be added, which improves the conference experience.
As will be appreciated by one skilled in the art, embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
The foregoing descriptions of specific exemplary embodiments of the present invention have been presented for purposes of illustration and description. It is not intended to limit the invention to the precise form disclosed, and obviously many modifications and variations are possible in light of the above teaching. The exemplary embodiments were chosen and described in order to explain certain principles of the invention and its practical application to enable one skilled in the art to make and use various exemplary embodiments of the invention and various alternatives and modifications as are suited to the particular use contemplated. It is intended that the scope of the invention be defined by the claims and their equivalents.

Claims (10)

1. A conference screen switching apparatus based on voice recognition, comprising:
a voice recognition library generation module, which generates a voice recognition library based on check-in information and platform address book information;
a voice recognition module, which performs conference voice recognition based on the voice recognition library to find a matching result;
a semantic analysis module, which performs semantic analysis on the matching result of the voice recognition module; and
a picture switching module, which switches pictures according to the semantic analysis result.
2. A switching method of the conference screen switching apparatus based on voice recognition as claimed in claim 1, comprising the steps of:
step one, generating a voice recognition library based on sign-in information and platform address book information;
step two, performing conference voice recognition based on the voice recognition library to find a matching result;
step three, performing semantic analysis on the matching result; and
step four, switching pictures according to the semantic analysis result.
3. The conference screen switching method based on voice recognition according to claim 2, wherein step one specifically comprises: a sign-in step, a step of acquiring information on the participating members from the platform, and a step of generating a voice recognition detection list.
4. The conference screen switching method based on voice recognition according to claim 2, wherein the platform address book information includes: the names, nicknames, and remark names of the participants.
5. The conference screen switching method based on voice recognition according to claim 2, wherein the check-in information is acquired by: face recognition, manual check-in, card-swiping check-in, or automatic terminal check-in.
6. The conference screen switching method based on voice recognition according to claim 3, wherein step two specifically comprises: a voice recognition step, a matching step based on the voice recognition library, and a matching judgment step.
7. The conference screen switching method based on voice recognition according to claim 6, wherein step three specifically comprises: a semantic learning and editing step, a semantic analysis and record generation step, and a step of judging whether the semantic scene is satisfied.
8. The conference screen switching method based on voice recognition according to claim 2, wherein the semantic analysis determines whether a main sentence calls a person, talks about a person, or commands an operation.
9. The conference screen switching method based on voice recognition according to claim 7, wherein step four specifically comprises: a step of consulting the display strategy and a step of switching the conference picture.
10. The conference screen switching method based on voice recognition according to claim 2, wherein the display strategy for switching the picture is: the picture in which the person occupies the larger proportion of the frame is displayed preferentially, and if the proportions are equivalent, the picture showing the person's front is displayed preferentially.
CN201910907963.2A 2019-09-24 2019-09-24 Conference picture switching device and method based on voice recognition Pending CN110600036A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910907963.2A CN110600036A (en) 2019-09-24 2019-09-24 Conference picture switching device and method based on voice recognition

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910907963.2A CN110600036A (en) 2019-09-24 2019-09-24 Conference picture switching device and method based on voice recognition

Publications (1)

Publication Number Publication Date
CN110600036A true CN110600036A (en) 2019-12-20

Family

ID=68862930

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910907963.2A Pending CN110600036A (en) 2019-09-24 2019-09-24 Conference picture switching device and method based on voice recognition

Country Status (1)

Country Link
CN (1) CN110600036A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111405232B (en) * 2020-03-05 2021-08-06 深圳震有科技股份有限公司 Video conference speaker picture switching processing method and device, equipment and medium

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101510990A (en) * 2009-02-27 2009-08-19 深圳华为通信技术有限公司 Method and system for processing remote presentation conference user signal
CN102131071A (en) * 2010-01-18 2011-07-20 华为终端有限公司 Method and device for video screen switching
CN102638671A (en) * 2011-02-15 2012-08-15 华为终端有限公司 Method and device for processing conference information in video conference
CN105608754A (en) * 2014-11-12 2016-05-25 中兴通讯股份有限公司 Video conference signing method, video conference signing apparatus and video conference signing system
CN106231259A (en) * 2016-07-29 2016-12-14 北京小米移动软件有限公司 The display packing of monitored picture, video player and server
CN107277427A (en) * 2017-05-16 2017-10-20 广州视源电子科技股份有限公司 Automatically select method, device and the audio-visual system of camera picture

Similar Documents

Publication Publication Date Title
US9064160B2 (en) Meeting room participant recogniser
US20220254158A1 (en) Learning situation analysis method, electronic device, and storage medium
EP3125154A1 (en) Photo sharing method and device
US20180227341A1 (en) Communication Device and Method
CN107644646B (en) Voice processing method and device for voice processing
CN112653902B (en) Speaker recognition method and device and electronic equipment
US20160359941A1 (en) Automated video editing based on activity in video conference
US9641801B2 (en) Method, apparatus, and system for presenting communication information in video communication
EP3701715B1 (en) Electronic apparatus and method for controlling thereof
CN111258528B (en) Voice user interface display method and conference terminal
WO2020119032A1 (en) Biometric feature-based sound source tracking method, apparatus, device, and storage medium
CN106331293A (en) Incoming call information processing method and device
CN110769189B (en) Video conference switching method and device and readable storage medium
KR20140078258A (en) Apparatus and method for controlling mobile device by conversation recognition, and apparatus for providing information by conversation recognition during a meeting
CN110673811B (en) Panoramic picture display method and device based on sound information positioning and storage medium
CN110351513B (en) Court trial recording method and device, computer equipment and storage medium
CN117897930A (en) Streaming data processing for hybrid online conferencing
CN106326804B (en) Recording control method and device
CN114240342A (en) Conference control method and device
CN110600036A (en) Conference picture switching device and method based on voice recognition
CN110865789A (en) Method and system for intelligently starting microphone based on voice recognition
CN115623133A (en) Online conference method and device, electronic equipment and readable storage medium
WO2021217897A1 (en) Positioning method, terminal device and conference system
CN116193179A (en) Conference recording method, terminal equipment and conference recording system
CN112954260B (en) Automatic meeting place lens switching method and system in video conference process

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20191220