CN111432157B - Conference processing method, device, equipment and storage medium based on video networking

Info

Publication number: CN111432157B
Application number: CN202010100395.8A
Authority: CN (China)
Other versions: CN111432157A (application publication; original language Chinese)
Prior art keywords: video, conference, audio data, voice, information
Inventors: 彭宇龙, 郭少森, 安君超, 王艳辉
Assignee (current and original): Visionvera Information Technology Co Ltd
Legal status: Active (granted)
Legal events: application filed by Visionvera Information Technology Co Ltd with priority to CN202010100395.8A; publication of CN111432157A; application granted; publication of CN111432157B

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N7/00: Television systems
    • H04N7/14: Systems for two-way working
    • H04N7/15: Conference systems
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00: Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/30: Authentication, i.e. establishing the identity or authorisation of security principals
    • G06F21/31: User authentication
    • G06F21/32: User authentication using biometric data, e.g. fingerprints, iris scans or voiceprints
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04J: MULTIPLEX COMMUNICATION
    • H04J3/00: Time-division multiplex systems
    • H04J3/02: Details
    • H04J3/06: Synchronising arrangements
    • H04J3/0635: Clock or time synchronisation in a network
    • H04J3/0638: Clock or time synchronisation among nodes; Internode synchronisation
    • H04J3/0644: External master-clock
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L9/00: Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols
    • H04L9/32: Cryptographic mechanisms or cryptographic arrangements including means for verifying the identity or authority of a user of the system or for message authentication, e.g. authorization, entity authentication, data integrity or data verification, non-repudiation, key authentication or verification of credentials
    • H04L9/3226: Using a predetermined code, e.g. password, passphrase or PIN
    • H04L9/3231: Biological data, e.g. fingerprint, voice or retina
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N7/00: Television systems
    • H04N7/14: Systems for two-way working
    • H04N7/15: Conference systems
    • H04N7/152: Multipoint control units therefor
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N7/00: Television systems
    • H04N7/14: Systems for two-way working
    • H04N7/15: Conference systems
    • H04N7/155: Conference systems involving storage of or access to video conference sessions
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F2221/00: Indexing scheme relating to security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F2221/21: Indexing scheme relating to G06F21/00 and subgroups addressing additional information or applications relating to security arrangements
    • G06F2221/2151: Time stamp
    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D30/00: Reducing energy consumption in communication networks
    • Y02D30/70: Reducing energy consumption in communication networks in wireless communication networks


Abstract

Embodiments of the invention provide a conference processing method, apparatus, device, and storage medium based on a video network. A video networking terminal is deployed in the video network and is communicatively connected to a voice processing server, a GPS module, and a voice acquisition device. Applied to the video networking terminal, the method includes: during a video conference, when a start-up operation of the voice acquisition device is detected, obtaining the operation time of the start-up operation and the biometric information collected by the voice acquisition device on the basis of that operation; sending the biometric information to the voice processing server, and calibrating the operation time through the GPS module; obtaining audio data collected by the started voice acquisition device, and adding a timestamp to the audio data based on the calibrated operation time; and sending the audio data to the voice processing server, so that the voice processing server generates a conference file based on the audio data, the biometric information, and the timestamp.

Description

Conference processing method, device, equipment and storage medium based on video networking
Technical Field
The present invention relates to the field of data processing technologies, and in particular, to a conference processing method, apparatus, device, and storage medium based on a video network.
Background
With the development of the video network, many users now hold large-scale high-definition video conferences over it. During a video conference, the conference content often needs to be recorded; for example, a speaker's statements are recorded for archiving. In the related art, however, speakers' statements are not recorded accurately, so the resulting conference record cannot faithfully restore the video conference scene.
Disclosure of Invention
In view of the above problems, embodiments of the present invention provide a conference processing method, apparatus, device, and storage medium based on a video network that overcome, or at least partially solve, the above problems.
In a first aspect, the embodiments of the present invention disclose a conference processing method based on a video network, wherein a video networking terminal is deployed in the video network and is communicatively connected to a voice processing server, a GPS module, and a voice acquisition device. Applied to the video networking terminal, the method includes:
during a video conference, when a start-up operation of the voice acquisition device is detected, obtaining the operation time of the start-up operation and the biometric information collected by the voice acquisition device on the basis of the start-up operation;
sending the biometric information to the voice processing server, and calibrating the operation time through the GPS module;
obtaining audio data collected by the started voice acquisition device, and adding a timestamp to the audio data based on the calibrated operation time;
and sending the audio data to the voice processing server, so that the voice processing server generates text information corresponding to the voice acquisition device based on the audio data and the biometric information, and stores the text information as a conference file corresponding to the video networking terminal based on the timestamp.
Optionally, calibrating the operation time through the GPS module includes:
obtaining GPS time through the GPS module, and calibrating the operation time so that it is synchronized with the GPS time;
and adding a timestamp to the audio data based on the calibrated operation time includes:
determining, from the calibrated operation time, the collection time at which the audio data is collected, and adding a timestamp corresponding to that collection time to the audio data.
Optionally, after sending the biometric information to the voice processing server, the method further includes:
receiving a control instruction sent by the voice processing server, the control instruction being generated by the voice processing server when it determines that the biometric information matches none of the user biometrics in a preset biometric library;
and closing the voice acquisition device based on the control instruction.
Optionally, multiple voice acquisition devices are connected to the video networking terminal; the biometric information is collected by at least some of the started devices among them, and the audio data is likewise collected by at least some of the started devices.
In a second aspect, the embodiments of the present invention disclose a conference processing method based on a video network, wherein a video networking terminal is deployed in the video network and is communicatively connected to a voice processing server, a GPS module, and a voice acquisition device. Applied to the voice processing server, the method includes:
receiving, during a video conference, biometric information sent by the video networking terminal, the biometric information being collected by the voice acquisition device on the basis of a start-up operation performed by a user;
receiving audio data sent by the video networking terminal, the audio data being collected by the started voice acquisition device and timestamped by the video networking terminal based on the calibrated operation time, where the calibrated operation time is the operation time of the start-up operation after the video networking terminal has calibrated it through the GPS module;
generating text information based on the biometric information and the audio data;
and storing the text information as a conference file corresponding to the video networking terminal based on the timestamp.
Optionally, there are multiple voice acquisition devices, and generating text information based on the biometric information and the audio data includes:
generating, based on the received pieces of biometric information and audio data, multiple pieces of text information respectively corresponding to the started voice acquisition devices, where the pieces of audio data and biometric information are respectively collected by the started voice acquisition devices;
and storing the text information as a conference file corresponding to the video networking terminal based on the timestamp includes:
storing the pieces of text information as the conference file corresponding to the video networking terminal in the order of the timestamps of the respective pieces of audio data.
Optionally, generating text information based on the biometric information and the audio data includes:
comparing the biometric information with the user biometrics in a preset biometric library and, when the biometric information is determined to match one of them, obtaining the identity information of the user corresponding to the successfully matched user biometric;
recognizing the audio data to obtain the text corresponding to the audio data;
and adding the identity information to the text to obtain the text information.
The method further includes:
generating a control instruction when the biometric information is determined to match none of the user biometrics, and sending the control instruction to the video networking terminal.
Optionally, after generating text information based on the biometric information and the audio data, the method further includes:
converting the text information into subtitle information, and sending the subtitle information to the conference terminals participating in the video conference so that they display it.
After storing the text information as a conference file corresponding to the video networking terminal based on the timestamp, the method further includes:
when the video conference ends, obtaining pre-stored conference data corresponding to the video conference;
storing the conference data into the conference file to obtain a detailed conference record;
and sending the detailed conference record to a web page communicatively connected to the voice processing server so that the web page displays it.
In a third aspect, the embodiments of the present invention disclose a conference processing apparatus based on a video network, wherein a video networking terminal is deployed in the video network and is communicatively connected to a voice processing server, a GPS module, and a voice acquisition device. Applied to the video networking terminal, the apparatus includes:
a start-up information obtaining module, configured to, when a start-up operation of the voice acquisition device is detected during a video conference, obtain the operation time of the start-up operation and the biometric information collected by the voice acquisition device on the basis of the start-up operation;
an information calibration module, configured to send the biometric information to the voice processing server and calibrate the operation time through the GPS module;
an audio data obtaining module, configured to obtain audio data collected by the started voice acquisition device and add a timestamp to the audio data based on the calibrated operation time;
and an audio data sending module, configured to send the audio data to the voice processing server, so that the voice processing server generates text information corresponding to the voice acquisition device based on the audio data and the biometric information, and stores the text information as a conference file corresponding to the video networking terminal based on the timestamp.
In a fourth aspect, the embodiments of the present invention disclose a conference processing apparatus based on a video network, wherein a video networking terminal is deployed in the video network and is communicatively connected to a voice processing server, a GPS module, and a voice acquisition device. Applied to the voice processing server, the apparatus includes:
a first information receiving module, configured to receive, during a video conference, biometric information sent by the video networking terminal, the biometric information being collected by the voice acquisition device on the basis of a start-up operation performed by a user;
a second information receiving module, configured to receive audio data sent by the video networking terminal, the audio data being collected by the started voice acquisition device and timestamped by the video networking terminal based on the calibrated operation time, where the calibrated operation time is the operation time of the start-up operation after the video networking terminal has calibrated it through the GPS module;
a text information generating module, configured to generate text information based on the biometric information and the audio data;
and a conference file generating module, configured to store the text information as a conference file corresponding to the video networking terminal based on the timestamp.
The embodiments of the invention also disclose an electronic device, comprising:
one or more processors; and
one or more machine-readable media having instructions stored thereon that, when executed by the one or more processors, cause the device to perform the video-networking-based conference processing method of the first or second aspect of the invention.
The embodiments of the invention also disclose a computer-readable storage medium storing a computer program that causes a processor to execute the video-networking-based conference processing method of the first or second aspect of the invention.
The embodiments of the invention have the following advantages:
In the embodiments of the invention, when the video networking terminal detects, during a video conference, that the voice acquisition device has been started, it can obtain the operation time of the start-up operation and the biometric information collected by the voice acquisition device on the basis of that operation. It then sends the biometric information to the voice processing server and calibrates the operation time with GPS time. When it obtains audio data collected by the voice acquisition device, it can add a timestamp to the audio data based on the calibrated operation time. After the audio data is sent to the voice processing server, the server can generate text information from the audio data and the biometric information and store the text information as a conference file according to the timestamp. With these embodiments, on the one hand, the timestamp is added based on the calibrated operation time, so the timestamps of the audio data are grounded in GPS time, which reduces timing discrepancies in the text information and improves the accuracy of the conference record. On the other hand, the voice acquisition device can also collect biometric information, so the identity information of the participants can be recorded in the text information, making the conference record more detailed.
Drawings
In order to illustrate the technical solutions of the embodiments of the present application more clearly, the drawings needed for describing the embodiments are briefly introduced below. Obviously, the drawings described below show only some embodiments of the present application, and those skilled in the art may derive other drawings from them without inventive effort.
FIG. 1 is a schematic networking diagram of a video network of the present invention;
FIG. 2 is a schematic diagram of a hardware architecture of a node server according to the present invention;
fig. 3 is a schematic diagram of a hardware structure of an access switch of the present invention;
fig. 4 is a schematic diagram of a hardware structure of an ethernet protocol conversion gateway according to the present invention;
FIG. 5 is a diagram of an application scenario for a video conference in an embodiment of the present invention;
FIG. 6 is a flowchart illustrating the steps of a method for video networking based conferencing in accordance with an embodiment of the present invention;
fig. 7 is a diagram illustrating an application scenario of another video conference in an embodiment of the present invention;
FIG. 8 is a flowchart illustrating the steps of a video networking based conference processing method according to another embodiment of the present invention;
FIG. 9 is a schematic flowchart illustrating a video networking-based conference processing method in an embodiment of the present invention;
Fig. 10 is a schematic structural diagram of a conference processing apparatus based on video networking according to an embodiment of the present invention;
fig. 11 is a schematic structural diagram of a conference processing apparatus based on a video network in another embodiment of the present invention.
Detailed Description
In order to make the aforementioned objects, features and advantages of the present invention comprehensible, embodiments accompanied with figures are described in further detail below.
To better understand the embodiments of the present invention, the video network is described below.
the video networking is an important milestone for network development, is a real-time network, can realize high-definition video real-time transmission, and pushes a plurality of internet applications to high-definition video, and high-definition faces each other.
The video networking adopts a real-time high-definition video exchange technology, can integrate required services such as dozens of services of video, voice, pictures, characters, communication, data and the like on a system platform on a network platform, such as high-definition video conference, video monitoring, intelligent monitoring analysis, emergency command, digital broadcast television, delayed television, network teaching, live broadcast, VOD on demand, television mail, personal Video Recorder (PVR), intranet (self-office) channels, intelligent video broadcast control, information distribution and the like, and realizes high-definition quality video broadcast through a television or a computer.
Some of the technologies applied in the video network are as follows:
Network Technology
The network technology innovation of the video network improves on traditional Ethernet to cope with the potentially enormous video traffic on the network. Unlike pure network packet switching or network circuit switching, the video network technology adopts packet switching to satisfy streaming requirements. It has the flexibility, simplicity, and low cost of packet switching while retaining the quality and security guarantees of circuit switching, achieving seamless, network-wide switched virtual circuits and a unified data format.
Switching Technology
The video network retains the two advantages of Ethernet, asynchrony and packet switching, while eliminating Ethernet's defects on the premise of full compatibility. It provides end-to-end seamless connection across the whole network, reaches the user terminal directly, and directly carries IP data packets. User data requires no format conversion anywhere in the network. As a higher-level form of Ethernet, the video network is a real-time exchange platform that enables network-wide, large-scale, real-time transmission of high-definition video, which the current internet cannot achieve, pushing many network video applications toward high definition and unification.
Server Technology
The server technology of the video network and the unified video platform differs from that of traditional servers. Its streaming-media transmission is built on a connection-oriented basis, its data-processing capability is independent of traffic and communication time, and a single network layer can carry both signaling and data transmission. For voice and video services, streaming-media processing on the video network and the unified video platform is far simpler than general data processing, and efficiency is improved by more than a hundredfold over a traditional server.
Storage Technology
To accommodate ultra-large media content and ultra-high traffic, the ultra-high-speed storage technology of the unified video platform adopts an advanced real-time operating system. Program information in a server instruction is mapped to specific hard disk space, and the media content no longer passes through the server but is sent instantly and directly to the user terminal, with a user waiting time of less than 0.2 seconds. Optimized sector allocation greatly reduces the mechanical seek movement of the hard disk head; resource consumption is only 20% of an IP internet system of the same grade, yet it generates 3 times the concurrent traffic of a traditional hard disk array, for an overall efficiency improvement of more than 10 times.
Network Security Technology
The structural design of the video network, through mechanisms such as independent permission control for each service and complete isolation of devices and user data, structurally eliminates the network security problems that trouble the internet. It generally needs no antivirus programs or firewalls, is immune to hacker and virus attacks, and provides users with a structurally worry-free secure network.
Service Innovation Technology
The unified video platform integrates services with transmission: whether for a single user, a private-network user, or an entire network, connection is established automatically on demand. User terminals, set-top boxes, or PCs connect directly to the unified video platform to obtain a rich variety of multimedia video services. The unified video platform replaces traditional, complex application programming with a menu-style configuration table, so complex applications can be realized with very little code, enabling unlimited new service innovation.
Networking of the video network is as follows:
the video network is a centralized control network structure, and the network can be a tree network, a star network, a ring network and the like, but on the basis of the centralized control node, the whole network is controlled by the centralized control node in the network.
As shown in fig. 1, the video network is divided into two parts, an access network and a metropolitan network.
The devices of the access network part can be mainly classified into 3 types: node server, access switch, terminal (including various set-top boxes, coding boards, memories, etc.). The node server is connected to an access switch, which may be connected to a plurality of terminals and may be connected to an ethernet network.
The node server is a node which plays a centralized control function in the access network and can control the access switch and the terminal. The node server can be directly connected with the access switch or directly connected with the terminal.
Similarly, devices on the metro network part can be classified into 3 types: a metropolitan area server, a node switch and a node server. The metro server is connected to a node switch, which may be connected to a plurality of node servers.
The node server here is the same node server as in the access network part; that is, the node server belongs to both the access network and the metropolitan area network.
The metropolitan area server is a node which plays a centralized control function in the metropolitan area network and can control a node switch and a node server. The metropolitan area server can be directly connected with the node switch or directly connected with the node server.
Therefore, the whole video network is a network structure with layered centralized control, and the network controlled by the node server and the metropolitan area server can be in various structures such as tree, star and ring.
The access network part can form a unified video platform (the part in a dotted circle), and a plurality of unified video platforms can form a video network; each unified video platform may be interconnected via metropolitan area and wide area video networking.
1. Video networking device classification
1.1 Devices in the video network of the embodiment of the present invention can mainly be classified into 3 types: servers, switches (including Ethernet protocol conversion gateways), and terminals (including various set-top boxes, encoding boards, memories, etc.). The video network as a whole can be divided into a metropolitan area network (or national network, global network, etc.) and an access network.
1.2 The devices of the access network part can mainly be classified into 3 types: node servers, access switches (including Ethernet protocol conversion gateways), and terminals (including various set-top boxes, encoding boards, memories, etc.).
The specific hardware structure of each access network device is as follows:
a node server:
As shown in fig. 2, the node server mainly includes a network interface module 201, a switching engine module 202, a CPU module 203, and a disk array module 204.
Packets coming from the network interface module 201, the CPU module 203, and the disk array module 204 all enter the switching engine module 202. The switching engine module 202 looks up the address table 205 for each incoming packet to obtain its direction information, and stores the packet in the queue of the corresponding packet buffer 206 according to that direction information; if the queue of the packet buffer 206 is nearly full, the packet is discarded. The switching engine module 202 polls all packet buffer queues and forwards when the following conditions are met: 1) the port send buffer is not full; 2) the queue packet counter is greater than zero. The disk array module 204 mainly implements control over the hard disk, including initialization, reading, and writing; the CPU module 203 is mainly responsible for protocol processing with the access switches and terminals (not shown in the figure), for configuring the address table 205 (including the downlink protocol packet address table, the uplink protocol packet address table, and the data packet address table), and for configuring the disk array module 204.
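The forwarding discipline just described can be summarized in a short sketch. This is an illustration of the described behavior, not code from the patent; the queue limit and all names are assumptions, and Python stands in for whatever the node server actually runs.

```python
from collections import deque

QUEUE_LIMIT = 1024  # "nearly full" threshold; the text gives no concrete number

class Port:
    """Toy output port with a bounded send buffer."""
    def __init__(self, capacity=64):
        self.buffer, self.capacity = deque(), capacity
    def send_buffer_full(self):
        return len(self.buffer) >= self.capacity
    def send(self, packet):
        self.buffer.append(packet)

class SwitchingEngine:
    def __init__(self, address_table, ports):
        self.address_table = address_table          # destination key -> port id (address table 205)
        self.ports = ports                          # port id -> Port
        self.queues = {p: deque() for p in ports}   # per-direction packet buffer (206)

    def ingress(self, packet, dest_key):
        port_id = self.address_table.get(dest_key)  # obtain the packet's direction
        if port_id is None or len(self.queues[port_id]) >= QUEUE_LIMIT:
            return                                  # unroutable, or queue nearly full: discard
        self.queues[port_id].append(packet)

    def poll(self):
        # Forward only when 1) the port send buffer is not full and
        # 2) the queue packet counter is greater than zero.
        for port_id, queue in self.queues.items():
            port = self.ports[port_id]
            while queue and not port.send_buffer_full():
                port.send(queue.popleft())

engine = SwitchingEngine({"terminal-7": 0}, {0: Port()})
engine.ingress(b"packet", "terminal-7")
engine.poll()
assert engine.ports[0].buffer                       # the packet was forwarded
```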
The access switch:
As shown in fig. 3, the access switch mainly includes a network interface module (a downlink network interface module 301 and an uplink network interface module 302), a switching engine module 303, and a CPU module 304.
A packet (uplink data) coming from the downlink network interface module 301 enters the packet detection module 305. The packet detection module 305 checks whether the destination address (DA), source address (SA), packet type, and packet length of the packet meet the requirements; if so, it allocates a corresponding stream identifier (stream-id) and the packet enters the switching engine module 303; otherwise the packet is discarded. A packet (downlink data) coming from the uplink network interface module 302 enters the switching engine module 303, as do packets from the CPU module 304. The switching engine module 303 looks up the address table 306 for each incoming packet to obtain its direction information. If a packet entering the switching engine module 303 is going from a downlink network interface to an uplink network interface, it is stored in the queue of the corresponding packet buffer 307 in association with its stream-id, and discarded if that queue is nearly full. If a packet entering the switching engine module 303 is not going from a downlink network interface to an uplink network interface, it is stored in the queue of the corresponding packet buffer 307 according to its direction information, and discarded if that queue is nearly full.
The switching engine module 303 polls all packet buffer queues, distinguishing two cases:
If the queue is going from a downlink network interface to an uplink network interface, it forwards when the following conditions are met: 1) the port send buffer is not full; 2) the queue packet counter is greater than zero; 3) a token generated by the code-rate control module has been obtained.
If the queue is not going from a downlink network interface to an uplink network interface, it forwards when the following conditions are met: 1) the port send buffer is not full; 2) the queue packet counter is greater than zero.
The code-rate control module 308 is configured by the CPU module 304 and, at programmable intervals, generates tokens for all packet buffer queues going from downlink network interfaces to uplink network interfaces, in order to control the rate of uplink forwarding.
The CPU module 304 is mainly responsible for protocol processing with the node server, configuration of the address table 306, and configuration of the code rate control module 308.
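The token mechanism in condition 3) can be sketched as follows. This is an assumed reading of the description (one token per programmable interval per upstream-bound queue), with illustrative names and interval units:

```python
class RateControl:
    """Sketch of code-rate control module 308: the CPU configures an
    interval; each interval, one token is issued per downlink-to-uplink
    queue, and such a queue may only forward while it holds a token."""
    def __init__(self, interval_ticks, queue_ids):
        self.interval = interval_ticks              # programmable interval (units assumed)
        self.tokens = {q: 0 for q in queue_ids}
        self._tick = 0

    def tick(self):
        self._tick += 1
        if self._tick % self.interval == 0:
            for q in self.tokens:
                self.tokens[q] += 1                 # issue one token per queue

    def try_consume(self, queue_id):
        """Condition 3): a queue forwards only if it can consume a token."""
        if self.tokens[queue_id] > 0:
            self.tokens[queue_id] -= 1
            return True
        return False

rc = RateControl(interval_ticks=4, queue_ids=["uplink-q"])
for _ in range(4):
    rc.tick()
assert rc.try_consume("uplink-q") and not rc.try_consume("uplink-q")
```

Combined with the two buffer conditions above, this caps uplink forwarding at one packet per interval per queue.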
Ethernet protocol conversion gateway
As shown in fig. 4, the apparatus mainly includes a network interface module (a downlink network interface module 401 and an uplink network interface module 402), a switching engine module 403, a CPU module 404, a packet detection module 405, a rate control module 408, an address table 406, a packet buffer 407, a MAC adding module 409, and a MAC deleting module 410.
A data packet coming from the downlink network interface module 401 enters the packet detection module 405. The packet detection module 405 checks whether the Ethernet MAC DA, Ethernet MAC SA, Ethernet length or frame type, video network destination address DA, video network source address SA, video network packet type, and packet length of the packet meet the requirements; if so, it allocates a corresponding stream identifier (stream-id), the MAC DA, MAC SA, and length or frame type (2 bytes) are stripped by the MAC deletion module 410, and the packet enters the corresponding receive buffer; otherwise the packet is discarded.
The downlink network interface module 401 monitors the send buffer of its port. If a packet is present, it learns the Ethernet MAC DA of the corresponding terminal from the packet's video network destination address DA, prepends the terminal's Ethernet MAC DA, the MAC SA of the Ethernet protocol gateway, and the Ethernet length or frame type, and sends the packet.
The other modules in the ethernet protocol gateway function similarly to the access switch.
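The MAC deletion and MAC adding modules amount to stripping and re-prepending a 14-byte Ethernet header around the video-network packet. A hedged sketch, assuming standard Ethernet framing and an illustrative DA-to-MAC lookup table (the patent specifies neither):

```python
ETH_HEADER_LEN = 14                                  # 6-byte MAC DA + 6-byte MAC SA + 2-byte length/type

GATEWAY_MAC = bytes.fromhex("020000000001")          # assumed MAC SA of the protocol gateway
V2V_DA_TO_MAC = {bytes(8): bytes.fromhex("0a0000000002")}  # video-network DA -> terminal MAC (illustrative)

def strip_mac(frame: bytes) -> bytes:
    """MAC deletion module 410: Ethernet frame -> bare video-network packet."""
    return frame[ETH_HEADER_LEN:]

def add_mac(v2v_packet: bytes, eth_type: bytes = b"\x08\x00") -> bytes:
    """MAC adding module 409: look up the terminal's Ethernet MAC from the
    packet's video-network DA and prepend a fresh Ethernet header."""
    terminal_mac = V2V_DA_TO_MAC[v2v_packet[:8]]     # DA is the first 8 bytes
    return terminal_mac + GATEWAY_MAC + eth_type + v2v_packet

frame = add_mac(bytes(8) + bytes(8) + b"\x00\x00" + b"data")
assert strip_mac(frame)[:8] == bytes(8)              # round-trips to the same packet
```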
A terminal:
the system mainly comprises a network interface module, a service processing module and a CPU module; for example, the set-top box mainly comprises a network interface module, a video and audio coding and decoding engine module and a CPU module; the coding board mainly comprises a network interface module, a video and audio coding engine module and a CPU module; the memory mainly comprises a network interface module, a CPU module and a disk array module.
1.3 The devices of the metropolitan area network part can mainly be classified into 3 types: node servers, node switches, and metropolitan area servers. The node switch mainly comprises a network interface module, a switching engine module, and a CPU module; the metropolitan area server mainly comprises a network interface module, a switching engine module, and a CPU module.
2. Video networking packet definition
2.1 Access network packet definition
The data packet of the access network mainly includes the following parts: destination address (DA), source address (SA), reserved bytes, payload (PDU), and CRC, as shown in the following table:
DA | SA | Reserved | Payload | CRC
wherein:
The destination address (DA) consists of 8 bytes. The first byte indicates the type of the data packet (protocol packet, multicast data packet, unicast data packet, etc.), with at most 256 possibilities; the second through sixth bytes are the metropolitan area network address; and the seventh and eighth bytes are the access network address.
The source address (SA) also consists of 8 bytes and is defined in the same way as the destination address (DA).
The reserved field consists of 2 bytes.
The payload has a different length depending on the datagram type: 64 bytes for the various protocol packets and 32 + 1024 = 1056 bytes for unicast/multicast data packets, though it is certainly not limited to these 2 types.
The CRC consists of 4 bytes, and its calculation follows the standard Ethernet CRC algorithm.
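The layout above is concrete enough to sketch in code. The following is an illustration only; the byte order of the CRC and exactly what it covers are assumptions, and zlib.crc32 is used because it implements the standard Ethernet CRC-32 polynomial.

```python
import zlib

def pack_access(da: bytes, sa: bytes, payload: bytes) -> bytes:
    """Build an access-network packet: DA(8) | SA(8) | Reserved(2) | Payload | CRC(4)."""
    assert len(da) == 8 and len(sa) == 8
    body = da + sa + b"\x00\x00" + payload
    return body + zlib.crc32(body).to_bytes(4, "big")   # CRC coverage/order assumed

def parse_access(packet: bytes) -> dict:
    body, crc = packet[:-4], int.from_bytes(packet[-4:], "big")
    if zlib.crc32(body) != crc:
        raise ValueError("bad CRC")
    da = body[:8]
    return {
        "type": da[0],            # first byte: packet type (protocol / multicast / unicast, ...)
        "metro_addr": da[1:6],    # second through sixth bytes
        "access_addr": da[6:8],   # seventh and eighth bytes
        "sa": body[8:16],
        "payload": body[18:],     # after the 2 reserved bytes
    }

pkt = pack_access(b"\x02" + bytes(7), b"\x01" + bytes(7), bytes(64))  # a 64-byte protocol packet
assert parse_access(pkt)["type"] == 0x02
```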
2.2 Metropolitan area network packet definition
The topology of the metropolitan area network is a graph, and there may be 2 or even more connections between two devices; that is, there may be more than 2 connections between a node switch and a node server, between two node switches, etc. However, the metro network address of a metro network device is unique, so to describe the connection relationships between metro network devices accurately, the embodiment of the present invention introduces a parameter, the label, which uniquely identifies a connection of a metropolitan area network device.
In this specification, the label is defined similarly to the label of MPLS (Multi-Protocol Label Switching). Suppose there are two connections between device A and device B; then a packet going from device A to device B has 2 labels, and a packet going from device B to device A likewise has 2 labels. Labels are divided into incoming labels and outgoing labels: supposing the label of a packet entering device A (its incoming label) is 0x0000, the label of the packet when it leaves device A (its outgoing label) may become 0x0001. The network-access process of the metro network is centrally controlled; that is, both address allocation and label allocation in the metro network are directed by the metropolitan area server, while the node switches and node servers execute passively. This differs from MPLS, where label allocation results from mutual negotiation between the switch and the server.
As shown in the following table, the data packet of the metro network mainly includes the following parts:
DA | SA | Reserved | Label | Payload | CRC
That is: destination address (DA), source address (SA), reserved bytes (Reserved), label, payload (PDU), and CRC. The format of the label may be defined as follows: the label is 32 bits, with the upper 16 bits reserved and only the lower 16 bits used; it is located between the reserved bytes and the payload of the packet.
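Label rewriting at a metro device can then be sketched as below. The in-label-to-out-label table is illustrative; per the text, real tables are assigned centrally by the metropolitan area server rather than negotiated.

```python
LABEL_MASK = 0xFFFF                          # only the lower 16 bits of the 32-bit field are used

class MetroDevice:
    def __init__(self, label_table):
        self.label_table = label_table       # in-label -> out-label, assigned by the metro server

    def forward(self, packet: bytes) -> bytes:
        # The label sits between the 2 reserved bytes and the payload: offset 8 + 8 + 2 = 18.
        field = int.from_bytes(packet[18:22], "big")
        out_label = self.label_table[field & LABEL_MASK]
        new_field = (field & ~LABEL_MASK) | out_label
        return packet[:18] + new_field.to_bytes(4, "big") + packet[22:]

# The example from the text: incoming label 0x0000 leaves device A as 0x0001.
pkt = bytes(8) + bytes(8) + bytes(2) + (0x0000).to_bytes(4, "big") + b"payload"
out = MetroDevice({0x0000: 0x0001}).forward(pkt)
assert int.from_bytes(out[18:22], "big") & LABEL_MASK == 0x0001
```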
Based on the above characteristics of the video network, one of the core concepts of the embodiments of the invention is proposed: during a video conference, when the video networking terminal detects that a voice acquisition device has been started, it calibrates the operation time of the start-up, adds timestamps to the audio data the voice acquisition device subsequently collects, and sends both the biometric information collected at start-up and the collected audio data to the voice processing server, so that the voice processing server generates a conference record based on the biometric information, the audio data, and the timestamps carried in the audio data.
Referring to fig. 5, an application scenario of a video conference in an embodiment of the present invention is shown. As shown in fig. 5, the video conference includes multiple video networking terminals participating in the conference; fig. 5 takes three video networking terminals 5011, 5012, and 5013 as an example. Each video networking terminal is deployed in the video network and may be communicatively connected to the voice processing server as well as to a voice acquisition device and a GPS module. In the video conference, the voice processing server 502 may transmit the video data and audio data of video networking terminal 5011 to video networking terminals 5012 and 5013, and likewise transmit the video data and audio data of video networking terminal 5012 to video networking terminals 5011 and 5013, thereby implementing the video conference.
The voice processing server 502 may be located in the internet or in the video network. When it is located in the internet, it may communicate with each video networking terminal through a protocol conversion gateway, which converts information based on the video network protocol sent by a video networking terminal into data based on the internet protocol, so that the voice processing server 502 can receive and parse it. When the voice processing server 502 is located in the video network, it can communicate with each video networking terminal using the video network protocol.
The voice acquisition device may be a microphone; it may be externally connected to the video networking terminal or built into it as a module. Similarly, the GPS module may be built into the video networking terminal or located in a cloud server, with the video networking terminal communicating with the GPS module in the server. The GPS module is used to provide GPS time for the video networking terminal.
The video networking terminal is a terminal that communicates using the video network protocol and is used in the video network. It may be a set-top box, which mainly comprises a network interface module, a video/audio codec engine module, and a CPU module; these modules cooperate to support the video networking terminal's operation in the video network.
The embodiment takes a video network terminal in a video conference as an example, and explains the conference processing method based on the video network.
Based on the above application scenario, referring to fig. 6, a flowchart illustrating steps of a conference processing method based on video networking in an embodiment is shown, where the method may be applied to a video networking terminal, and specifically may include the following steps:
step S601: in the process of carrying out the video conference, when the starting operation of the voice acquisition equipment is detected, the operation time of the starting operation is obtained, and the biological characteristic information acquired by the voice acquisition equipment based on the starting operation is obtained.
In this embodiment, the starting operation of the voice collecting device may refer to an operation that a user starts a switch of the voice collecting device to power on the voice collecting device.
Specifically, when the terminal of the video network performs the opening operation on the voice acquisition device, the operation time corresponding to the opening operation may be obtained, and the operation time may be the time when the voice acquisition device is activated to start working.
In this embodiment, the biometric information collected based on the opening operation may refer to that the voice collecting device obtains biometric information entered by the user when performing the opening operation. That is, the starting of the voice collecting device may be accompanied with the collection of the biometric information, and in practice, the starting operation of the voice collecting device may be completed at the same time when the voice collecting device collects the biometric information of the user. For example, can install the fingerprint collection device on the pronunciation collection equipment, when the user pressed the finger on the fingerprint collection device, just realized simultaneously to the start-up of pronunciation collection equipment and input the fingerprint information of oneself into pronunciation collection equipment.
In practice, the biometric information collected by the voice collecting device may be face information, fingerprint information or iris information. Accordingly, a camera can be installed on the voice acquisition equipment for acquiring face information or iris information.
Step S602: send the biometric information to the voice processing server, and calibrate the operation time through the GPS module.
In this embodiment, the video networking terminal may send the biometric information to the voice processing server, which recognizes it. Meanwhile, the operation time can be calibrated through the GPS module.
Recognizing the biometric information may mean using it to determine whether the user who performed the start-up operation is a user allowed to participate in the video conference.
Calibrating the operation time through the GPS module may be carried out in the following embodiments:
In embodiment A, the operation time may be the device time of the voice acquisition device when it is started, and calibration may mean calibrating the voice acquisition device's device time through the GPS module so that it is synchronized with GPS time.
In another embodiment B, the operation time may be the device time of the video networking terminal when the voice acquisition device is started, and calibration may mean calibrating the video networking terminal's device time through the GPS module so that it is synchronized with GPS time.
Step S603: obtain the audio data collected by the started voice acquisition device, and add a timestamp to the audio data based on the calibrated operation time.
In this embodiment, the started voice acquisition device can collect the audio data of a speaker in the video conference, and the video networking terminal can then obtain the audio data collected by the device.
Under embodiment A above, since the device time of the voice acquisition device has been calibrated, the timestamp added to the audio data may be the voice acquisition device's device time at the moment the audio data is collected.
Under embodiment B above, since the device time of the video networking terminal has been calibrated, the timestamp added to the audio data may be the video networking terminal's device time at the moment the audio data is collected.
In either embodiment A or B, because the timestamps in the audio data are synchronized with GPS time, timing discrepancies in the audio data are reduced; and since the audio data is the speech of speakers in the video conference, the recorded speaking times become more accurate.
Step S604: send the audio data to the voice processing server, so that the voice processing server generates text information corresponding to the voice acquisition device based on the audio data and the biometric information, and stores the text information as a conference file corresponding to the video networking terminal based on the timestamp.
In this embodiment, during the video conference the voice acquisition device is used to collect the biometric information and the audio data, while the video networking terminal can capture the participants' video data through other devices, such as a camera, and send that video data to the voice processing server throughout the conference so that the server forwards it to the other participating terminals. In this way, the video data and the audio data are transmitted separately, and the voice processing server can process the audio data independently, improving the efficiency of generating text information.
In this embodiment, the video networking terminal may send the audio data to the voice processing server, which can recognize the audio data as text and add the user identity information, obtained by recognizing the biometric information, into the text to form text information; this text information corresponds to the voice acquisition device. The text information can then be stored as a conference file according to the timestamp, where the conference file may include the text information and the timestamp. When the conference file is later used or viewed, the timestamp makes it possible to determine when, in the video conference, the audio corresponding to the text information was spoken.
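A minimal sketch of this server-side assembly step, under an assumed data model (the patent does not prescribe one): each recognized utterance becomes an entry of timestamp, identity, and text, and the conference file keeps entries in timestamp order.

```python
from dataclasses import dataclass, field

@dataclass(order=True)
class ConferenceEntry:
    timestamp: float                        # GPS-calibrated collection time (epoch seconds, assumed)
    identity: str = field(compare=False)    # from the biometric match
    text: str = field(compare=False)        # from speech recognition of the audio data

class ConferenceFile:
    """Conference file for one video networking terminal."""
    def __init__(self, terminal_id: str):
        self.terminal_id = terminal_id
        self.entries: list[ConferenceEntry] = []

    def add(self, timestamp: float, identity: str, text: str) -> None:
        self.entries.append(ConferenceEntry(timestamp, identity, text))
        self.entries.sort()                 # stored in timestamp order

    def render(self) -> str:
        return "\n".join(f"[{e.timestamp:.3f}] {e.identity}: {e.text}" for e in self.entries)

f = ConferenceFile("terminal-5011")
f.add(20.0, "User B", "Second remark")
f.add(10.0, "User A", "First remark")
assert f.render().startswith("[10.000] User A")
```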
With this embodiment, the video networking terminal calibrates, through the GPS module, the operation time at which it detects the voice acquisition device being started, and adds timestamps to the audio data collected by the device based on the calibrated time, so the timestamps are synchronized with GPS time and their accuracy improves. The video networking terminal also obtains the biometric information collected by the voice acquisition device, so the finally generated conference file can include the identity information of the user recognized from that biometric information, improving the completeness of the conference record.
In an embodiment, calibrating the operation time through the GPS module may specifically include the following step:
Step S6021: obtain GPS time through the GPS module, and calibrate the operation time so that it is synchronized with the GPS time.
In this embodiment, when the voice acquisition device is detected to have been started, GPS time may be obtained through the GPS module and used to replace the operation time, thereby calibrating it. This ensures that the operation time is synchronized with GPS time and that the voice acquisition device has an accurate start time in the video conference.
Correspondingly, in step S603, adding a timestamp to the audio data based on the calibrated operation time may specifically be the following step:
Step S6031: determine, from the calibrated operation time, the collection time at which the audio data is collected, and add a timestamp corresponding to that collection time to the audio data.
In this embodiment, when adding a timestamp to audio data, the time at which the audio data starts to be collected can be determined from the calibrated operation time, and a timestamp corresponding to that time can be added to the audio data.
Illustratively, suppose the operation time is 11:11:59 on October 1, 2019, and the GPS time obtained through the GPS module when the voice acquisition device is detected to be started is 11:12:11 on October 1, 2019; the calibrated operation time is then 11:12:11 on October 1, 2019. If, counting from this time, the audio data is collected at 11:22:32 on October 1, 2019, a timestamp corresponding to 11:22:32 on October 1, 2019 is added to the audio data.
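The calibration arithmetic of this example can be sketched as follows, with datetime standing in for the GPS module and device clock. This mirrors the described behavior (replace the operation time with GPS time, then stamp audio in GPS-synchronized time) rather than any code in the patent.

```python
from datetime import datetime

class GpsClock:
    def __init__(self, operation_time: datetime, gps_time: datetime):
        # Calibration: the operation time is replaced by GPS time, and the
        # device-to-GPS offset is remembered for later samples.
        self.offset = gps_time - operation_time
        self.operation_time = gps_time

    def stamp(self, device_time: datetime) -> datetime:
        """Timestamp for audio collected at the given (uncalibrated) device time."""
        return device_time + self.offset

clock = GpsClock(datetime(2019, 10, 1, 11, 11, 59),   # operation time
                 datetime(2019, 10, 1, 11, 12, 11))   # GPS time at detection
assert clock.operation_time == datetime(2019, 10, 1, 11, 12, 11)
# Audio whose device-clock capture time is 11:22:20 (12 s behind GPS) is
# stamped with the calibrated time 11:22:32, as in the example above.
assert clock.stamp(datetime(2019, 10, 1, 11, 22, 20)) == datetime(2019, 10, 1, 11, 22, 32)
```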
In one embodiment, after sending the biometric information to the voice processing server, the video networking terminal may further perform the following steps:
Step S605: receive a control instruction sent by the voice processing server.
The control instruction is generated by the voice processing server when it determines that the biometric information matches none of the user biometrics in a preset biometric library.
In this embodiment, the biometric information may be used to identify the user who started the voice acquisition device. Specifically, the voice processing server may compare the biometric information against the user biometrics in the biometric library and determine the user's identity information from the comparison result. The user biometrics stored in the library may be those of the users who need to participate in the video conference, entered by the administrator when the conference was reserved before it started. For example, if 10 users need to participate in a video conference, the biometric features of each of the 10 users can be entered into the biometric library when the conference is reserved.
In specific implementation, if the biometric information matches none of the user biometrics, the user currently starting the voice acquisition device is not an expected participant of the video conference, and a control instruction may be generated.
Step S606: close the voice acquisition device based on the control instruction.
On receiving the control instruction, the video networking terminal can control the voice acquisition device to shut down. Specifically, it can cut off the device's power supply according to the control instruction, so that the device cannot power on and work, or it can control the device to stop collecting audio data according to the control instruction.
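The verify-then-shut-off exchange can be sketched end to end. match_score is a placeholder for a real fingerprint/face/iris matcher, and the message format and threshold are assumptions; the patent fixes only the behavior (a match yields an identity, a mismatch yields a control instruction that closes the device).

```python
MATCH_THRESHOLD = 0.9                                 # assumed

def match_score(sample: bytes, enrolled: bytes) -> float:
    """Placeholder similarity metric; a real matcher goes here."""
    return 1.0 if sample == enrolled else 0.0

class VoiceProcessingServer:
    def __init__(self, feature_library: dict[str, bytes]):
        self.feature_library = feature_library        # identity -> feature entered at reservation

    def verify(self, biometric: bytes) -> dict:
        for identity, enrolled in self.feature_library.items():
            if match_score(biometric, enrolled) >= MATCH_THRESHOLD:
                return {"status": "ok", "identity": identity}
        return {"status": "control", "action": "close_device"}   # no match: control instruction

class VideoNetworkingTerminal:
    def __init__(self):
        self.device_powered = True

    def on_server_message(self, message: dict) -> None:
        if message.get("action") == "close_device":
            self.device_powered = False               # cut power, or stop collecting audio

server = VoiceProcessingServer({"user-1": b"enrolled-print"})
terminal = VideoNetworkingTerminal()
terminal.on_server_message(server.verify(b"unknown-print"))
assert terminal.device_powered is False
```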
Referring to fig. 7, another application scenario diagram of a video conference is shown. As shown in fig. 7, 702 represents the voice processing server, and each video network terminal may be connected to multiple voice capture devices; thus, in a video conference, multiple users may share one video network terminal to participate, with each voice capture device corresponding to one participant.
For example, suppose the video conference spans three locations (location A, location B, and location C). Video network terminals 7011, 7012, and 7013 can be allocated to locations A, B, and C respectively, and the video network terminal at each location can be connected to multiple voice capture devices. For example, the video network terminal 7011 at location A is connected to 3 voice capture devices 703, so that 3 participants can join the video conference through terminal 7011, each participant using one of the devices 703 to speak during the conference. Terminal 7011 can then transmit the audio data of the 3 participants, as they speak, to the participants at locations B and C.
Based on this application scenario, the video network terminal may perform the above steps S601 to S604 for each voice acquisition device. Specifically, the conference processing method in this application scenario includes the following steps:
Step S601': during the video conference, when turn-on operations on several of the voice acquisition devices are detected, obtaining the operation time of the turn-on operation for each turned-on voice acquisition device, and the biometric information that each turned-on device collects based on its turn-on operation.
This step is illustrated with example H: the video network terminal is connected to three voice acquisition devices, and the video conference is held on October 1, 2019. The three devices are No. 1, No. 2, and No. 3. When the video network terminal detects that device No. 1 is turned on, the corresponding operation time 1 (i.e., the turn-on time) is 11:10:21; when it detects that device No. 2 is turned on, the corresponding operation time 2 is 11:11:01. Device No. 1 collects biometric information 1 at 11:10:21, and device No. 2 collects biometric information 2 at 11:11:01.
Step S602': sending the plurality of biometric information items to the voice processing server, and calibrating each operation time through the GPS module.
Taking example H again, this step is explained as follows: the video network terminal may send biometric information 1 and biometric information 2 to the voice processing server as each is obtained. When operation time 1 is determined, if the GPS time obtained by the GPS module is 11:10:24, operation time 1 is calibrated to 11:10:24. Likewise, when operation time 2 is determined, if the GPS time obtained by the GPS module is 11:11:05, operation time 2 is calibrated to 11:11:05.
After calibration, the times of the turned-on voice capture devices are all synchronized with GPS time, which reduces the time differences between the devices.
In step S602', the process of step S602 may be performed for each turned-on voice capture device; for relevant details, reference may be made to the description of step S602, which is not repeated here.
Step S603': obtaining the audio data collected by each turned-on voice acquisition device, and adding a timestamp to each piece of audio data based on the corresponding calibrated operation time.
This step is illustrated with example H: starting from the calibrated operation time 1 (11:10:24), the video network terminal determines that the user of device No. 1 starts speaking at 11:20:05, and adds a timestamp corresponding to 11:20:05 to audio data 1 collected by device No. 1. Likewise, starting from the calibrated operation time 2 (11:11:05), it determines that the user of device No. 2 speaks at 11:20:06, and adds a timestamp corresponding to 11:20:06 to audio data 2 collected by device No. 2.
As can be seen from step S602' and this example, after GPS calibration the audio data collected by each voice capture device uses GPS time as its reference value, so the timing between pieces of audio data is more accurate and the time differences caused by inconsistent devices are reduced; the conference record is therefore more accurate, and the speaking order of the video conference can be restored.
Step S604': sending the audio data to the voice processing server, so that the voice processing server generates text information corresponding to each turned-on voice acquisition device based on the received audio data and biometric information, and stores the text information into the conference file corresponding to the video network terminal in the order of the timestamps of the audio data.
Taking example H again: the video network terminal may send audio data 1 collected by device No. 1 and audio data 2 collected by device No. 2 to the voice processing server, so that the server generates text information 1 corresponding to device No. 1 based on audio data 1 and biometric information 1, and text information 2 corresponding to device No. 2 based on audio data 2 and biometric information 2. Text information 2 is then stored after text information 1, following the order of the timestamps of audio data 1 and audio data 2, to obtain the conference file. In this way, text information 2 is arranged after text information 1 in the conference file, accurately restoring the speaking order of the speakers in the video conference.
Based on the application scenario shown in fig. 5, in a further embodiment, a video network based conference processing method is described in detail from a voice processing server side, and referring to fig. 8, a flowchart illustrating steps of the video network based conference processing method in a further embodiment of the present invention is shown, where the method may be specifically applied to a voice processing server, and may include the following steps:
Step S801: receiving the biometric information sent by the video network terminal during the video conference.
The biometric information is collected by the voice acquisition device based on a turn-on operation performed by a user.
The process of step S801 is similar to the process of step S601, and reference may be made to the description of step S601 for relevant points, which is not described herein again.
Step S802: receiving the audio data sent by the video network terminal.
The audio data is collected by the turned-on voice collecting device, with a timestamp added by the video network terminal based on the calibrated operation time; the calibrated operation time is the operation time of the turn-on operation after the video network terminal calibrates it through the GPS module.
The process of step S802 is similar to the process of step S602; for relevant points, reference may be made to the description of step S602, which is not repeated here.
Step S803: generating text information based on the biometric information and the audio data.
In this embodiment, the voice processing server may recognize the audio data as text and add the user identity information, obtained by recognizing the biometric information, to that text to form the text information; the text information thus corresponds to the voice acquisition device.
Step S804: storing the text information as a conference file corresponding to the video network terminal based on the timestamp.
In this embodiment, the voice processing server may store the text information as a conference file according to the timestamp; the conference file may include the text information and the timestamp, and constitutes the record of the video network terminal during the video conference.
In one implementation, taking the application scenario shown in fig. 7 as an example, the video network terminal may be connected to a plurality of voice collecting devices, so the voice processing server may receive a plurality of audio data items and a plurality of biometric information items sent by the terminal. In this case, step S803 may include the following step:
Step S803': generating a plurality of text information items respectively corresponding to the plurality of turned-on voice capture devices, based on the received plurality of biometric information items and plurality of audio data items; the audio data items and biometric information items are respectively collected by the turned-on voice capture devices.
In this embodiment, a plurality of users can share one video network terminal to participate, with each user corresponding to one voice acquisition device, so the voice processing server can obtain the audio data and biometric information collected by each turned-on device. When generating text information, the server can therefore generate, for each voice acquisition device, the text information from the audio data and biometric information that device collected, obtaining text information corresponding to each voice acquisition device.
Accordingly, step S804 may be the following steps:
Step S804': storing the plurality of text information items into the conference file corresponding to the video network terminal in the order of the timestamps of the respective audio data.
In this embodiment, the order of the timestamps means that text information whose timestamp is earlier is stored before text information whose timestamp is later. In this way, the order of the text information items in the conference file is consistent with the order of the timestamps of the audio data, so the text of a user who spoke first is arranged before the text of a user who spoke later, and the video conference can be accurately restored.
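To make the ordering concrete, here is a minimal sketch of step S804' under these assumptions: each recognized utterance arrives as a (speaker, text, timestamp) record, and the conference file is a plain-text rendering. The names (TextEntry, build_conference_file) are illustrative only.

```python
from dataclasses import dataclass
from datetime import datetime

@dataclass
class TextEntry:
    speaker: str         # identity recovered from the biometric information
    text: str            # recognized text for one piece of audio data
    timestamp: datetime  # timestamp the terminal added to the audio data

def build_conference_file(entries: list[TextEntry]) -> str:
    # Earlier timestamps are stored first, restoring the speaking order.
    ordered = sorted(entries, key=lambda e: e.timestamp)
    return "\n".join(
        f"[{e.timestamp:%H:%M:%S}] {e.speaker}: {e.text}" for e in ordered
    )

# Mirroring example H: audio data 1 precedes audio data 2 by one second.
entries = [
    TextEntry("User 2", "I agree.", datetime(2019, 10, 1, 11, 20, 6)),
    TextEntry("User 1", "Shall we begin?", datetime(2019, 10, 1, 11, 20, 5)),
]
print(build_conference_file(entries))  # User 1's line is written first
```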
In one embodiment, the text information may be generated by the following steps:
Step S8031: comparing the biometric information with a plurality of user biometrics in a preset biometric library, and, when the biometric information is determined to match one of the user biometrics, obtaining the identity information of the user corresponding to the successfully matched biometric.
In this embodiment, the user biometrics stored in the biometric library may be those of the users who need to participate in the video conference, entered by an administrator when the video conference is reserved, before the conference starts. Upon receiving the biometric information, the voice processing server can compare it with the stored user biometrics; when it successfully matches one of them, the identity information of the corresponding user can be taken as the identity information of the user of the voice acquisition device.
Step S8032: recognizing the audio data to obtain the text corresponding to the audio data.
In this embodiment, the audio data may be recognized as text using a speech recognition technology such as ASR (Automatic Speech Recognition).
Step S8033: adding the identity information to the text to obtain the text information.
In this embodiment, the obtained identity information may be added to the text, specifically before the first character of the text or after its last character, to obtain the text information.
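Steps S8031 to S8033 can be sketched as below. The matching and recognition functions are placeholders (a real system would call a biometric SDK and an ASR engine); nothing here is the patent's actual API, and the threshold and names are assumptions.

```python
def match_score(probe, enrolled) -> float:
    # Placeholder similarity; a real system would use a biometric SDK.
    return 1.0 if probe == enrolled else 0.0

def recognize_speech(audio_data: bytes) -> str:
    # Placeholder ASR; a real system would call a speech engine.
    return "<transcribed speech>"

def find_identity(biometric_info, feature_library: dict, threshold: float = 0.9):
    # Step S8031: compare against every enrolled user biometric.
    for identity, enrolled in feature_library.items():
        if match_score(biometric_info, enrolled) >= threshold:
            return identity
    return None  # no match: the server would generate a control instruction

def make_text_information(biometric_info, audio_data, feature_library) -> str:
    identity = find_identity(biometric_info, feature_library)
    if identity is None:
        raise PermissionError("speaker is not a reserved participant")
    text = recognize_speech(audio_data)  # step S8032
    return f"{identity}: {text}"         # step S8033: identity prepended

library = {"User 1": b"fingerprint-1", "User 2": b"fingerprint-2"}
print(make_text_information(b"fingerprint-1", b"\x00\x01", library))
```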
In one implementation scenario, the voice processing server may also receive a face image sent by the video network terminal. The face image may be that of a participant in the video conference, captured by the video network terminal when the video conference starts.
Background staff of the voice processing server can record the identity information of the corresponding participant from the face picture and store the picture and the identity information in association. Then, when recognizing the biometric information yields the corresponding identity information, the associated face picture is obtained as well; when the identity information is added to the text, the face picture can be added too, producing more detailed text information.
Accordingly, in this embodiment, when the biometric information is compared with the plurality of user biometrics in the preset biometric library, the following step may also be performed according to the comparison result:
Step S805: when it is determined that the biometric information does not match any of the user biometrics, generating a control instruction and sending it to the video network terminal.
In this embodiment, if the comparison between the biometric information and every one of the user biometrics fails, it indicates that the user was not reserved for the video conference; a control instruction can then be generated so that the video network terminal turns off the voice acquisition device.
In one implementation, when it is determined that the biometric information does not match any of the user biometrics, alarm information can also be generated and sent to both the video network terminal and the chairman terminal of the video conference, so that both display the alarm information and the users on site can confirm the situation in time.
In one embodiment, after generating the text message, the speech processing server may further perform the following steps:
Step S806: converting the text information into caption information, and sending the caption information to the conference terminals participating in the video conference so that they display it.
In this embodiment, the voice processing server may further convert the text information into caption information. Specifically, the identity information corresponding to the biometric information contained in the text information may be placed at the beginning of the caption, and the text corresponding to the audio data placed after the identity information. In this way, the caption information includes both the identity of the speaker and the content of the speech.
In specific implementation, after obtaining the caption information, the voice processing server may push it to the conference terminals participating in the video conference, such as the other video network terminals shown in fig. 5. The captions are then displayed on those terminals, so that participants can read in real time the text of what a speaker is saying, improving the conference experience of the video conference.
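A minimal sketch of step S806 follows, assuming the caption is plain text with the identity first, and with send_to standing in for whatever transport the video network actually uses; both names are hypothetical.

```python
def to_caption(identity: str, text: str) -> str:
    # Identity leads the caption, followed by the recognized speech.
    return f"{identity}: {text}"

def push_captions(terminals, identity: str, text: str, send_to) -> None:
    caption = to_caption(identity, text)
    for terminal in terminals:
        send_to(terminal, caption)  # each conference terminal displays it

# Usage with a stand-in transport that just prints:
push_captions(
    ["terminal-B", "terminal-C"],
    "User 1",
    "Shall we begin?",
    send_to=lambda t, c: print(f"{t} <- {c}"),
)
```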
Correspondingly, after obtaining the conference file, the voice processing server may also perform the following steps:
Step S807: when the video conference ends, obtaining pre-stored conference data corresponding to the video conference.
In this embodiment, when the video conference ends, the voice processing server may obtain the conference data pre-entered by the administrator; the conference data may include the conference name, the number of participants, the participant list, the conference start time, the conference end time, and so on.
Step S808: storing the conference data into the conference file to obtain a detailed conference record.
In this embodiment, since multiple video network terminals may participate in one video conference, the voice processing server may obtain, when the conference ends, a conference file for each participating video network terminal. These conference files can then be merged into a total conference file, and the conference data stored into it to obtain the detailed conference record.
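Steps S807 and S808 might be sketched like this: the per-terminal conference files are merged, and the administrator's pre-stored conference data is written ahead of them. The field names are assumptions for illustration only.

```python
def build_detail_record(conference_data: dict, terminal_files: list[str]) -> str:
    # Conference data (name, participants, start/end time, ...) first,
    # then the merged conference files from every participating terminal.
    header = "\n".join(f"{key}: {value}" for key, value in conference_data.items())
    body = "\n".join(terminal_files)  # a real merge would re-sort by timestamp
    return f"{header}\n---\n{body}"

record = build_detail_record(
    {
        "conference name": "Weekly sync",
        "number of participants": 3,
        "start time": "2019-10-01 11:00:00",
        "end time": "2019-10-01 12:30:00",
    },
    ["[11:20:05] User 1: Shall we begin?", "[11:20:06] User 2: I agree."],
)
print(record)
```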
In one embodiment, after obtaining the detailed conference record, the voice processing server may also push it to a front-end web page; the web page can publish the record so that users may download it there.
Based on the application scenario shown in fig. 5, and referring to fig. 9, a complete flowchart of the video-network-based conference processing method in one example is shown; the method is described in full from both the video network terminal side and the voice processing server side, as shown in fig. 9.
The video network terminal is communicatively connected to a plurality of voice acquisition devices; in fig. 9 the voice acquisition device is a microphone, MCU1 is the microprocessor in the video network terminal, MCU2 is the microprocessor in the voice processing server, and the biometric information is fingerprint data. The method specifically includes the following steps:
Step S001: during the video conference, when the video network terminal detects a turn-on operation on a voice acquisition device, the operation time of the turn-on operation is obtained, and the voice acquisition device collects biometric information based on the turn-on operation.
The process of step S001 is similar to the process of step S601, and reference may be made to the description of step S601 for relevant points, which is not described herein again.
Step S002: and the video network terminal sends the biological characteristic information to the voice processing server and calibrates the operation time through the GPS module.
The process of step S002 is similar to the process of step S602; for relevant points, reference may be made to the description of step S602, which is not repeated here.
Step S003: the voice processing server receives the biological characteristic information and determines identity information of a user using the voice acquisition device based on the biological characteristic information.
The process of step S003 is similar to the process of step S8031; for relevant points, reference may be made to the description of step S8031, which is not repeated here.
Step S004: the video network terminal obtains the audio data collected by the turned-on voice collecting device and adds a timestamp to the audio data based on the calibrated operation time.
The process of step S004 is similar to the process of step S603, and reference may be made to the description of step S603 for relevant parts, which are not described herein again.
Step S005: the video network terminal sends the audio data to the voice processing server.
The process of step S005 is similar to the process of step S604, and reference may be made to the description of step S604 for relevant points, which is not described herein again.
Step S006: the voice processing server generates the text corresponding to the audio data and adds the identity information to the text to obtain the text information, such as text data A and text data B in fig. 9.
The process of step S006 is similar to the process of step S803, and reference may be made to the description of step S803 for relevant points, which are not described herein again.
Step S007: the voice processing server stores the text information as the conference file corresponding to the video network terminal based on the timestamps, with the text data sorted by timestamp as shown in fig. 9.
The process of step S007 is similar to the process of step S804, and reference may be made to the description of step S804 for relevant points, which are not described herein again.
Step S008: the voice processing server obtains the conference data, stores it together with the text information to obtain the complete conference record, and displays the record. As shown in fig. 9, the conference information is integrated with the background information (i.e., the conference data) to obtain the complete conference record.
It should be noted that, for simplicity of description, the method embodiments are described as a series of combinations of acts, but those skilled in the art will recognize that the embodiments are not limited by the order of acts described, as some steps may occur in other orders or concurrently in accordance with the embodiments. Further, those skilled in the art will appreciate that the embodiments described in the specification are preferred embodiments, and the acts involved are not necessarily required by the invention.
Referring to fig. 10, a schematic structural diagram of a conference processing apparatus based on a video network according to an embodiment of the present invention is shown, where a video network terminal is deployed in the video network, the video network terminal is in communication connection with a voice processing server, the video network terminal is further in communication connection with a GPS module and a voice acquisition device, and the apparatus is applied to the video network terminal, and may specifically include the following modules:
a start information obtaining module 1001 configured to, when a start operation on the voice collecting device is detected in a process of performing a video conference, obtain operation time of the start operation and biometric information collected by the voice collecting device based on the start operation;
the information calibration module 1002 is configured to send the biometric information to the voice processing server, and calibrate the operation time through the GPS module;
an audio data obtaining module 1003, configured to obtain audio data collected by the started voice collecting device, and add a timestamp to the audio data based on the calibrated operation time;
an audio data sending module 1004, configured to send the audio data to the voice processing server, so that the voice processing server generates text information corresponding to the voice collecting device based on the audio data and the biometric information, and stores the text information as a conference file corresponding to the video network terminal based on the timestamp.
Optionally, the information calibration module is specifically configured to acquire GPS time through the GPS module, and calibrate the operation time to a time synchronized with the GPS time;
the audio data obtaining module may include the following units:
and the time stamp adding unit is used for determining the acquisition time when the audio data is acquired from the calibrated operation time, and adding the time stamp corresponding to the acquisition time to the audio data.
Optionally, the apparatus may further include the following modules:
the control instruction receiving module is used for receiving a control instruction sent by the voice processing server, and the control instruction is generated by the voice processing server when the fact that the biological characteristic information is not matched with the biological characteristics of a plurality of users in a preset biological characteristic library is determined;
and the control module is used for closing the voice acquisition equipment based on the control instruction.
Optionally, the number of the voice collecting devices connected to the video network terminal is multiple, the biometric information is information respectively collected by at least some of the turned-on voice collecting devices among the plurality of voice collecting devices, and the audio data is respectively collected by the at least some turned-on voice collecting devices.
Referring to fig. 11, a schematic structural diagram of another conference processing apparatus based on a video network according to an embodiment of the present invention is shown, where a video network terminal is deployed in the video network, the video network terminal is communicatively connected to a voice processing server, and the video network terminal is further communicatively connected to a GPS module and a voice collecting device; the apparatus is applied to the voice processing server and may include the following modules:
a first information receiving module 1101, configured to receive biometric information sent by the video networking terminal in a video conference process, where the biometric information is acquired by the voice acquisition device based on an opening operation performed by a user;
a second information receiving module 1102, configured to receive audio data sent by the video networking terminal, where the audio data is collected by the started voice collecting device, and a timestamp is added to the audio data by the video networking terminal based on the calibrated operation time; the calibrated operation time is the time after the operation time when the video network terminal starts the operation is calibrated through the GPS module;
a text information generating module 1103, configured to generate text information based on the biometric information and the audio data;
and a conference file generating module 1104, configured to store the text information as a conference file corresponding to the video network terminal based on the timestamp.
Optionally, the number of the voice collecting devices is multiple; the text information generating module 1103 is specifically configured to generate, based on the received plurality of biometric information items and plurality of audio data items, a plurality of text information items respectively corresponding to the plurality of turned-on voice collecting devices, where the audio data items and biometric information items are respectively collected by the turned-on voice collecting devices;
the conference file generating module 1104 is specifically configured to store the plurality of text messages as the conference files corresponding to the video networking terminals according to the sequence of the timestamps of the plurality of audio data.
Optionally, the text information generating module 1103 may include the following units:
the identity information determining unit is used for comparing the biological characteristic information with a plurality of user biological characteristics in a preset biological characteristic library and acquiring identity information of a user corresponding to the user biological characteristics which are successfully compared when the biological characteristic information is determined to be matched with the user biological characteristics in the plurality of user biological characteristics;
the audio identification unit is used for identifying the audio data to obtain a text corresponding to the audio data;
the generating unit is used for adding the identity information to the text to obtain text information;
the apparatus may further include the following modules:
and the control instruction generating module is used for generating a control instruction when the biological characteristic information is determined not to be matched with the biological characteristics of the plurality of users, and sending the control instruction to the video network terminal.
Optionally, the apparatus may further include the following modules:
the conversion module is used for converting the text information into subtitle information and sending the subtitle information to a conference terminal participating in the video conference so as to enable the conference terminal to display the subtitle information;
the conference data acquisition module is used for acquiring pre-stored conference data corresponding to the video conference when the video conference is finished;
a conference record obtaining module, configured to store the conference data in the conference file to obtain a conference detailed record;
and the conference record sending module is used for sending the detailed conference record to a web page in communication connection with the voice processing server, so that the web page displays the detailed conference record.
For the embodiment of the video-network-based conference processing apparatus, since it is basically similar to the embodiment of the video-network-based conference processing method, the description is relatively simple; for relevant points, refer to the partial description of the method embodiment.
An embodiment of the present invention further provides an electronic device, including:
one or more processors; and
one or more machine-readable media having instructions stored thereon that, when executed by the one or more processors, cause the apparatus to perform one or more video networking-based conference processing methods according to embodiments of the invention.
Embodiments of the present invention further provide a computer-readable storage medium, where a stored computer program enables a processor to execute a conference processing method based on video networking according to an embodiment of the present invention.
The embodiments in the present specification are described in a progressive manner, each embodiment focuses on differences from other embodiments, and the same and similar parts among the embodiments are referred to each other.
As will be appreciated by one of skill in the art, embodiments of the present invention may be provided as a method, apparatus, or computer program product. Accordingly, embodiments of the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, embodiments of the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
Embodiments of the present invention are described with reference to flowchart illustrations and/or block diagrams of methods, terminal devices (systems), and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing terminal to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing terminal, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing terminal to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing terminal to cause a series of operational steps to be performed on the computer or other programmable terminal to produce a computer implemented process such that the instructions which execute on the computer or other programmable terminal provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
While preferred embodiments of the present invention have been described, additional variations and modifications of these embodiments may occur to those skilled in the art once they learn of the basic inventive concepts. Therefore, it is intended that the appended claims be interpreted as including preferred embodiments and all such alterations and modifications as fall within the scope of the embodiments of the invention.
Finally, it should also be noted that, in this document, relational terms such as first and second are used solely to distinguish one entity or action from another, without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or terminal that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or terminal. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of additional identical elements in the process, method, article, or terminal device that comprises the element.
The conference processing method, apparatus, device, and storage medium based on video networking provided by the present invention have been introduced in detail above. Specific examples are used herein to explain the principle and implementation of the invention, and the description of the above embodiments is only intended to help understand the method and its core idea. Meanwhile, for those skilled in the art, there may be variations in the specific embodiments and the application scope according to the idea of the invention. In summary, the content of this specification should not be construed as limiting the invention.

Claims (12)

1. The conference processing method based on the video network is characterized in that a video network terminal is deployed in the video network, the video network terminal is in communication connection with a voice processing server, the video network terminal is also in communication connection with a GPS module and a voice acquisition device, and the method is applied to the video network terminal and comprises the following steps:
in the process of carrying out a video conference, when the starting operation of the voice acquisition equipment is detected, acquiring the operation time of the starting operation and the biological characteristic information acquired by the voice acquisition equipment based on the starting operation, wherein the operation time is the equipment time of the voice acquisition equipment when the voice acquisition equipment is started, or the operation time is the equipment time of the video network terminal when the voice acquisition equipment is started;
sending the biological characteristic information to the voice processing server, and calibrating the operation time through the GPS module;
acquiring audio data acquired by the started voice acquisition equipment, and adding a timestamp to the audio data based on the calibrated operation time;
and sending the audio data to the voice processing server so that the voice processing server generates text information corresponding to the voice acquisition equipment based on the audio data and the biological characteristic information, and storing the text information as a conference file corresponding to the video network terminal based on the timestamp.
2. The method of claim 1, wherein calibrating the operating time via the GPS module comprises:
acquiring GPS time through the GPS module, and calibrating the operation time to be time synchronous with the GPS time;
adding a timestamp to the audio data based on the calibrated operating time, comprising:
and determining the acquisition time when the audio data is acquired from the calibrated operation time, and adding a time stamp corresponding to the acquisition time to the audio data.
3. The method of claim 1, wherein after sending the biometric information to the voice processing server, the method further comprises:
receiving a control instruction sent by the voice processing server, wherein the control instruction is generated by the voice processing server when the fact that the biological characteristic information is not matched with the biological characteristics of a plurality of users in a preset biological characteristic library is determined;
and closing the voice acquisition equipment based on the control instruction.
4. The method according to claim 1, wherein the number of the voice collecting devices connected to the terminal of the video network is multiple, and the biometric information is information respectively collected by at least some of the voice collecting devices which are turned on; the audio data are respectively acquired by at least part of the voice acquisition devices which are turned on.
5. A conference processing method based on video networking is characterized in that a video networking terminal is deployed in the video networking, the video networking terminal is in communication connection with a voice processing server, and the video networking terminal is also in communication connection with a GPS module and a voice acquisition device; the method is applied to the voice processing server and comprises the following steps:
receiving biological characteristic information sent by the video networking terminal in the video conference process, wherein the biological characteristic information is acquired by the voice acquisition equipment based on the starting operation performed by a user;
receiving audio data sent by the video networking terminal, wherein the audio data is collected by the started voice collecting equipment and added with a timestamp by the video networking terminal based on the calibrated operation time; the calibrated operation time is the time after the operation time of the video network terminal when the starting operation is calibrated through the GPS module, wherein the operation time is the equipment time of the voice acquisition equipment when the voice acquisition equipment is started, or the operation time is the equipment time of the video network terminal when the voice acquisition equipment is started;
generating text information based on the biometric information and the audio data;
and storing the text information as a conference file corresponding to the video network terminal based on the timestamp.
6. The method according to claim 5, wherein the number of the voice collecting devices is plural; generating text information based on the biometric information and the audio data, including:
generating a plurality of text messages respectively corresponding to the plurality of activated voice acquisition devices based on the received plurality of biometric information and the plurality of audio data; wherein the plurality of audio data and the plurality of biometric information are respectively acquired by a plurality of voice acquisition devices that are activated;
based on the timestamp, storing the text information as a conference file corresponding to the video networking terminal, including:
and storing the plurality of text messages as conference files corresponding to the video networking terminals according to the sequence of the respective timestamps of the plurality of audio data.
7. The method of claim 5, wherein generating textual information based on the biometric information and the audio data comprises:
comparing the biological characteristic information with a plurality of user biological characteristics in a preset biological characteristic library, and acquiring identity information of a user corresponding to the user biological characteristics which are successfully compared when the biological characteristic information is determined to be matched with the user biological characteristics in the plurality of user biological characteristics;
identifying the audio data to obtain a text corresponding to the audio data;
adding the identity information to the text to obtain text information;
the method further comprises the following steps:
and when the biological characteristic information is determined not to be matched with the biological characteristics of the plurality of users, generating a control instruction, and sending the control instruction to the video network terminal.
8. The method of claim 5, wherein after generating text information based on the biometric information and the audio data, the method further comprises:
converting the text information into subtitle information, and sending the subtitle information to a conference terminal participating in the video conference so as to enable the conference terminal to display the subtitle information;
after storing the text information as a conference file corresponding to the video network terminal based on the timestamp, the method further comprises:
when a video conference is finished, acquiring pre-stored conference data corresponding to the video conference;
storing the conference data into the conference file to obtain a conference detailed record;
and sending the meeting detailed record to a web page in communication connection with the voice server so that the web page displays the meeting detailed record.
9. A conference processing apparatus based on the video network, characterized in that a video network terminal is deployed in the video network, the video network terminal is in communication connection with a voice processing server, and the video network terminal is further in communication connection with a GPS module and a voice acquisition device; the apparatus is applied to the video network terminal and comprises:
the starting information obtaining module is used for obtaining the operation time of the starting operation when the starting operation of the voice acquisition equipment is detected in the process of carrying out the video conference and the biological characteristic information acquired by the voice acquisition equipment based on the starting operation, wherein the operation time is the equipment time of the voice acquisition equipment when the voice acquisition equipment is started or the operation time is the equipment time of the video network terminal when the voice acquisition equipment is started;
the information calibration module is used for sending the biological characteristic information to the voice processing server and calibrating the operation time through the GPS module;
the audio data acquisition module is used for acquiring the audio data acquired by the started voice acquisition equipment and adding a timestamp to the audio data based on the calibrated operation time;
and the audio data sending module is used for sending the audio data to the voice processing server so that the voice processing server generates text information corresponding to the voice acquisition equipment based on the audio data and the biological characteristic information, and stores the text information as a conference file corresponding to the video network terminal based on the timestamp.
10. A conference processing device based on a video network is characterized in that a video network terminal is deployed in the video network, the video network terminal is in communication connection with a voice processing server, and the video network terminal is also in communication connection with a GPS module and a voice acquisition device; the device is applied to the voice processing server and comprises:
the first information receiving module is used for receiving biological characteristic information sent by the video networking terminal in the video conference process, and the biological characteristic information is acquired by the voice acquisition equipment based on the starting operation of a user;
the second information receiving module is used for receiving audio data sent by the video network terminal, the audio data is collected by the started voice collecting equipment, and a timestamp is added into the audio data by the video network terminal based on the calibrated operation time; the calibrated operation time is the time after the operation time of the video network terminal for starting operation is calibrated through the GPS module, wherein the operation time is the equipment time of the voice acquisition equipment when the voice acquisition equipment is started, or the operation time is the equipment time of the video network terminal when the voice acquisition equipment is started;
the text information generating module is used for generating text information based on the biological characteristic information and the audio data;
and the conference file generation module is used for storing the text information into a conference file corresponding to the video network terminal based on the timestamp.
11. An electronic device, comprising:
one or more processors; and
one or more machine readable media having instructions stored thereon that, when executed by the one or more processors, cause the device to perform the video networking based conferencing processing method of any of claims 1-4 or 5-8.
12. A computer-readable storage medium storing a computer program for causing a processor to execute the video network-based conference processing method according to any one of claims 1 to 4 or 5 to 8.
CN202010100395.8A 2020-02-18 2020-02-18 Conference processing method, device, equipment and storage medium based on video networking Active CN111432157B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010100395.8A CN111432157B (en) 2020-02-18 2020-02-18 Conference processing method, device, equipment and storage medium based on video networking

Publications (2)

Publication Number Publication Date
CN111432157A CN111432157A (en) 2020-07-17
CN111432157B true CN111432157B (en) 2023-04-07

Family

ID=71547815

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010100395.8A Active CN111432157B (en) 2020-02-18 2020-02-18 Conference processing method, device, equipment and storage medium based on video networking

Country Status (1)

Country Link
CN (1) CN111432157B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114912463B (en) * 2022-07-13 2022-10-25 南昌航天广信科技有限责任公司 Conference automatic recording method, system, readable storage medium and computer equipment

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP2566144A1 (en) * 2011-09-01 2013-03-06 Research In Motion Limited Conferenced voice to text transcription
CN108922538A (en) * 2018-05-29 2018-11-30 平安科技(深圳)有限公司 Conferencing information recording method, device, computer equipment and storage medium
CN108986826A (en) * 2018-08-14 2018-12-11 中国平安人寿保险股份有限公司 Automatically generate method, electronic device and the readable storage medium storing program for executing of minutes
CN109068089A (en) * 2018-09-30 2018-12-21 视联动力信息技术股份有限公司 A kind of conferencing data generation method and device
CN109361825A (en) * 2018-11-12 2019-02-19 平安科技(深圳)有限公司 Meeting summary recording method, terminal and computer storage medium
CN109856675A (en) * 2019-03-06 2019-06-07 合肥国为电子有限公司 Fine motion acquires equipment, wireless remote-measuring system and data quality monitoring method
CN110602432A (en) * 2019-08-23 2019-12-20 苏州米龙信息科技有限公司 Conference system based on biological recognition and conference data transmission method

Also Published As

Publication number Publication date
CN111432157A (en) 2020-07-17

Similar Documents

Publication Publication Date Title
CN108574688B (en) Method and device for displaying participant information
CN110166728B (en) Video networking conference opening method and device
CN109803111B (en) Method and device for watching video conference after meeting
CN108965224B (en) Video-on-demand method and device
CN110049271B (en) Video networking conference information display method and device
CN110493554B (en) Method and system for switching speaking terminal
CN110557597A (en) video conference sign-in method, server, electronic equipment and storage medium
CN109547728B (en) Recorded broadcast source conference entering and conference recorded broadcast method and system
CN110572607A (en) Video conference method, system and device and storage medium
CN108616487B (en) Audio mixing method and device based on video networking
CN108965220B (en) Method and system for synchronizing conference control right
CN108881948B (en) Method and system for video inspection network polling monitoring video
CN109743524B (en) Data processing method of video network and video network system
CN110049273B (en) Video networking-based conference recording method and transfer server
CN109040656B (en) Video conference processing method and system
CN109788235B (en) Video networking-based conference recording information processing method and system
CN111541859A (en) Video conference processing method and device, electronic equipment and storage medium
CN110719425A (en) Video data playing method and device
CN111131747B (en) Method and device for determining data channel state, electronic equipment and storage medium
CN109905616B (en) Method and device for switching video pictures
CN110505433B (en) Data processing method and video networking video conference platform
CN115311706A (en) Personnel identification method, device, terminal equipment and storage medium
CN111432157B (en) Conference processing method, device, equipment and storage medium based on video networking
CN110798648A (en) Video conference processing method and system
CN110049275B (en) Information processing method and device in video conference and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant