CN112399022A - Data processing method, device, equipment and storage medium - Google Patents

Info

Publication number
CN112399022A
Authority
CN
China
Prior art keywords
conference
voice
notification
voice data
sound box
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910697173.6A
Other languages
Chinese (zh)
Inventor
黄沛雄
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Alibaba Group Holding Ltd
Original Assignee
Alibaba Group Holding Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Alibaba Group Holding Ltd
Priority to CN201910697173.6A
Publication of CN112399022A
Legal status: Pending

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04M TELEPHONIC COMMUNICATION
    • H04M3/00 Automatic or semi-automatic exchanges
    • H04M3/42 Systems providing special services or facilities to subscribers
    • H04M3/56 Arrangements for connecting several subscribers to a common circuit, i.e. affording conference facilities
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/22 Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L17/00 Speaker identification or verification
    • G10L17/22 Interactive procedures; Man-machine interfaces
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L65/00 Network arrangements, protocols or services for supporting real-time applications in data packet communication
    • H04L65/1066 Session management
    • H04L65/1069 Session establishment or de-establishment
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L65/00 Network arrangements, protocols or services for supporting real-time applications in data packet communication
    • H04L65/40 Support for services or applications
    • H04L65/403 Arrangements for multi-party communication, e.g. for conferences
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/22 Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G10L2015/223 Execution procedure of a spoken command

Abstract

The embodiments of the application provide a data processing method, a data processing apparatus, a data processing device, and a storage medium, so that a conference can be initiated more quickly. The method comprises the following steps: a first sound box device performs voice recognition on received first voice data, determines a corresponding conference event, and sends the conference event to a server; the server determines, according to the conference event, the participating users of the conference and, when it determines that a second sound box device of a participating user is online, sends a conference notification to that second sound box device; the second sound box device outputs corresponding conference information and a conference password according to the conference notification, receives second voice data, recognizes the second voice data to determine a participation notification, and sends the participation notification to the server; and the server determines a corresponding conference initiation result according to the participation notification and sends the conference initiation result to the first sound box device and the second sound box device. Participants can thus join the conference conveniently, and processing efficiency is improved.

Description

Data processing method, device, equipment and storage medium
Technical Field
The present application relates to the field of computer technologies, and in particular, to a data processing method and apparatus, an electronic device, and a storage medium.
Background
A teleconference is a form of meeting held over the telephone, which removes the constraints on place and time imposed by a traditional in-person meeting.
At present, a teleconference is usually set up by dialing the fixed-line or mobile phone of each user. The number of every participating user has to be dialed separately, which makes the operation very cumbersome.
Disclosure of Invention
The embodiment of the application provides a data processing method, so that a conference can be initiated more quickly.
Correspondingly, the embodiment of the application also provides a data processing device, an electronic device and a storage medium, which are used for ensuring the implementation and application of the method.
In order to solve the above problem, an embodiment of the present application discloses a data processing method, including: a first sound box device performs voice recognition on received first voice data, determines a corresponding conference event, and sends the conference event to a server; the server determines, according to the conference event, the participating users of the conference and, when it determines that a second sound box device of a participating user is online, sends a conference notification to that second sound box device; the second sound box device outputs corresponding conference information and a conference password according to the conference notification, receives second voice data, recognizes the second voice data to determine a participation notification, and sends the participation notification to the server; and the server determines a corresponding conference initiation result according to the participation notification and sends the conference initiation result to the first sound box device and the second sound box device.
The embodiment of the application also discloses a data processing method, which comprises the following steps: receiving a conference notification, wherein the conference notification is determined according to a conference event identified by first voice data, and the conference notification is sent out when a server determines that sound box equipment corresponding to a participant is on line; outputting corresponding conference information and conference passwords according to the conference notification; receiving second voice data, recognizing the second voice data and determining a participation notice; and sending the participation notification so that the server side determines a corresponding conference initiating result according to the participation notification.
The embodiment of the application also discloses a data processing method, which comprises the following steps: receiving a conference event of first sound box equipment, wherein the conference event is identified and determined according to first voice data; determining the conference users participating in the conference according to the conference event; under the condition that the second sound box equipment of the participating user is determined to be online, sending a conference notification to the second sound box equipment so that the second sound box equipment outputs corresponding conference information and a conference password according to the conference notification; receiving a participant notification returned by the second sound box device, wherein the participant notification is identified and determined according to second voice data; and determining a corresponding conference initiating result according to the conference participating notification, and sending the conference initiating result to the first sound box device and the second sound box device.
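The server-side steps above can be sketched as follows. This is a minimal illustration, not the implementation described in the application: the `ConferenceEvent` and `Server` classes, the message dictionaries, the in-memory `devices` registry, the `online` presence set, and the `outbox` list that stands in for network sends are all assumptions introduced for the example.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class ConferenceEvent:
    # Hypothetical shape of the event sent by the first sound box device.
    conference_id: str
    initiator_device: str
    participants: list[str]  # user names recognised from the first voice data


@dataclass
class Server:
    devices: dict[str, str]  # assumed registry: user name -> sound box device id
    online: set[str]         # assumed presence set: device ids currently online
    outbox: list[tuple[str, dict]] = field(default_factory=list)  # stands in for network sends
    pending: dict[str, ConferenceEvent] = field(default_factory=dict)

    def handle_conference_event(self, event: ConferenceEvent) -> list[str]:
        """Send a conference notification to each participant's device that is online."""
        self.pending[event.conference_id] = event
        notified = []
        for user in event.participants:
            device = self.devices.get(user)
            if device is not None and device in self.online:
                message = {"type": "conference_notification",
                           "conference_id": event.conference_id}
                self.outbox.append((device, message))
                notified.append(device)
        return notified

    def handle_participation(self, conference_id: str, device: str, accepted: bool) -> dict:
        """Turn a participation notification into a conference initiation result."""
        event = self.pending[conference_id]
        result = {"type": "conference_result", "conference_id": conference_id,
                  "device": device, "joined": accepted}
        # The result is sent to both the initiating device and the responding device.
        for target in (event.initiator_device, device):
            self.outbox.append((target, result))
        return result
```

In this sketch an offline device is simply skipped; how offline participants are handled is outside the scope of the example.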
The embodiment of the application also discloses a data processing method, which comprises the following steps: receiving voice data; carrying out voice recognition on the received voice data, and determining a corresponding conference event; sending the conference event so that the server side sends a conference notification to the online sound box equipment of the participating user according to the conference event, and the online sound box equipment of the participating user outputs corresponding conference information and a conference password according to the conference notification; and receiving a conference initiating result, wherein the conference initiating result is determined according to the conference participation notification returned by the online sound box equipment of the conference participating user.
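On the initiating device, determining a conference event from recognized speech might look like the sketch below. It assumes speech recognition has already produced a text transcript; the command pattern, the function name `parse_conference_event`, and the event dictionary layout are hypothetical and chosen only for illustration.

```python
import re
from typing import Optional

# Hypothetical command pattern, e.g. "start a conference with Alice, Bob and Carol".
_COMMAND = re.compile(
    r"\b(?:start|initiate) a (?:conference|meeting) with (?P<names>.+)",
    re.IGNORECASE,
)


def parse_conference_event(transcript: str, device_id: str) -> Optional[dict]:
    """Return a conference event for a recognised command, or None otherwise."""
    match = _COMMAND.search(transcript)
    if match is None:
        return None
    # Split the name list on commas and on the standalone word "and".
    names = re.split(r",|\band\b", match.group("names"))
    participants = [n.strip(" .").lower() for n in names if n.strip(" .")]
    if not participants:
        return None
    return {
        "type": "conference_event",
        "initiator_device": device_id,
        "participants": participants,
    }
```

A transcript that does not match the pattern yields `None`, so the device would send nothing to the server for ordinary, non-conference speech.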
The embodiment of the application also discloses a data processing method, which comprises the following steps: a first sound box device performs voice recognition on received first voice data, determines a corresponding conference event, and sends the conference event to a server; the server determines, according to the conference event, the participating users of the conference and, when it determines that a second sound box device of a participating user is online, sends a conference notification to that second sound box device; the second sound box device outputs corresponding conference information according to the conference notification, receives second voice data, and sends a participation notification to the server when the conference password is recognized in the second voice data; and the server determines a corresponding conference initiation result according to the participation notification and sends the conference initiation result to the first sound box device and the second sound box device.
The embodiment of the application also discloses a data processing method, which comprises the following steps: the first equipment carries out voice recognition on the received first voice data and determines a corresponding conference event; and sending the conference event to a server; the server side determines a conference user participating in the conference according to the conference event, and sends a conference notice to second equipment of the conference user under the condition that the second equipment is determined to be online; the second equipment outputs corresponding conference information according to the conference notification, receives second voice data, identifies the second voice data, determines a participation notification and sends the participation notification to a server; and the server determines a corresponding conference initiating result according to the conference participating notification, and sends the conference initiating result to the first equipment and the second equipment.
The embodiment of the application also discloses a data processing method, which comprises the following steps: receiving a conference notification, wherein the conference notification is determined according to a conference event recognized from first voice data and is sent by the server when it determines that the sound box device of the participating user is online; outputting corresponding conference information according to the conference notification; receiving second voice data; and sending a participation notification to the server when the conference password is recognized in the second voice data.
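The password check on the receiving device can be illustrated with the sketch below. It assumes the second voice data has already been transcribed to text and that the conference password is a spoken phrase; the function names and the notification layout are hypothetical, and a whole-word phrase match is only one possible way to decide that the password was recognized.

```python
import re
from typing import Optional


def _normalise(text: str) -> str:
    """Lower-case the text and reduce it to words separated by single spaces."""
    return " ".join(re.findall(r"[a-z0-9]+", text.lower()))


def build_participation_notification(
    transcript: str, password: str, conference_id: str
) -> Optional[dict]:
    """Return a participation notification if the transcript contains the password."""
    spoken = f" {_normalise(transcript)} "
    expected = f" {_normalise(password)} "
    # Padding with spaces makes the match whole-word: "rejoin" does not match "join".
    if expected.strip() and expected in spoken:
        return {"type": "participation_notification", "conference_id": conference_id}
    return None
```

If the password is not found, the function returns `None` and no participation notification is sent.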
The embodiment of the present application further discloses a data processing system, the system includes: the system comprises a first sound box device, a second sound box device and a server side; the first sound box equipment is used for carrying out voice recognition on the received first voice data and determining a corresponding conference event; and sending the conference event to a server; the server side determines a conference user participating in a conference according to the conference event, and sends a conference notice to a second sound box device of the conference user under the condition that the second sound box device is determined to be on line; determining a corresponding conference initiating result according to the conference participating notification, and sending the conference initiating result to the first loudspeaker box device and the second loudspeaker box device; the second sound box equipment outputs corresponding conference information and conference passwords according to the conference notification; and receiving second voice data, recognizing the second voice data, determining a participation notice, and sending the participation notice to a server.
The embodiment of the application also discloses a data processing device, which comprises: the conference notification module is used for receiving a conference notification, the conference notification is determined according to the conference event identified by the first voice data, and the conference notification is sent out when the server determines that the sound box equipment corresponding to the participant is on line; outputting corresponding conference information and conference passwords according to the conference notification; the conference participating processing module is used for receiving second voice data, recognizing the second voice data and determining a conference participating notification; and the conference feedback module is used for receiving the conference initiating result corresponding to the conference notification.
The embodiment of the application also discloses a data processing device, which comprises: the conference determining module is used for receiving a conference event of the first loudspeaker box device, and the conference event is identified and determined according to the first voice data; determining the conference users participating in the conference according to the conference event; the notification module is used for sending a conference notification to the second sound box device under the condition that the second sound box device of the conference participating user is determined to be on line, so that the second sound box device outputs corresponding conference information and a conference password according to the conference notification; the result determining module is used for receiving a conference participation notification returned by the second sound box device, and the conference participation notification is determined according to second voice data identification; and determining a corresponding conference initiating result according to the conference participating notification.
The embodiment of the application also discloses a data processing device, which comprises: the conference initiating module is used for receiving voice data; carrying out voice recognition on the received voice data, and determining a corresponding conference event; the sending module is used for sending the conference event so that the server side sends a conference notice to the online sound box equipment of the participating user according to the conference event, and the online sound box equipment of the participating user outputs corresponding conference information and a conference password according to the conference notice; and the result determining module is used for receiving a conference initiating result, and the conference initiating result is determined according to the conference participation notification returned by the online sound box equipment of the conference participating user.
The embodiment of the present application further discloses a data processing system, the system includes: the system comprises a first sound box device, a second sound box device and a server side; the first sound box equipment is used for carrying out voice recognition on the received first voice data and determining a corresponding conference event; and sending the conference event to a server; the server determines a conference user participating in a conference according to the conference event, and sends a conference notification to a second sound box device of the conference user under the condition that the second sound box device is determined to be on line; determining a corresponding conference initiating result according to the conference participating notification, and sending the conference initiating result to the first loudspeaker box device and the second loudspeaker box device; the second sound box equipment outputs corresponding conference information according to the conference notification; and receiving second voice data, and sending the participation notification to a server side under the condition that the conference password is recognized by the second voice data.
The embodiment of the present application further discloses a data processing system, the system includes: the system comprises a first device, a second device and a server; the first equipment is used for carrying out voice recognition on the received first voice data and determining a corresponding conference event; and sending the conference event to a server; the server determines the conference users participating in the conference according to the conference event, and sends a conference notice to the second equipment under the condition that the second equipment of the conference users is determined to be online; determining a corresponding conference initiating result according to the conference participating notification, and sending the conference initiating result to the first equipment and the second equipment; and the second equipment outputs corresponding conference information according to the conference notice, receives second voice data, identifies the second voice data, determines a participation notice and sends the participation notice to a server.
The embodiment of the application also discloses a data processing device, which comprises: a module for receiving a conference notification, wherein the conference notification is determined according to a conference event recognized from first voice data and is sent by the server when it determines that the sound box device of the participating user is online; a notification output module for outputting corresponding conference information according to the conference notification; and a voice participation module for receiving second voice data and sending a participation notification to the server when the conference password is recognized in the second voice data.
The embodiment of the application also discloses an electronic device, which comprises: a processor; and a memory having executable code stored thereon, which when executed, causes the processor to perform a data processing method as described in one or more of the embodiments of the present application.
One or more machine-readable media having stored thereon executable code that, when executed, causes a processor to perform a data processing method as described in one or more of the embodiments of the present application are also disclosed.
The embodiment of the application also discloses a data processing method, which comprises the following steps: the first voice equipment performs voice recognition on the received first voice data and determines a corresponding conference event; and sending the conference event to a server; the server side determines a conference user participating in the conference according to the conference event, and sends a conference notification to a second voice device of the conference user under the condition that the second voice device is determined to be on line; the second voice equipment outputs corresponding conference information and conference passwords according to the conference notification; receiving second voice data, recognizing the second voice data, determining a participation notice, and sending the participation notice to a server; and the server determines a corresponding conference initiating result according to the conference participating notification, and sends the conference initiating result to the first voice equipment and the second voice equipment.
The embodiment of the present application further discloses a data processing system, the system includes: first voice equipment, server and second voice equipment, wherein: the first voice equipment performs voice recognition on the received first voice data and determines a corresponding conference event; and sending the conference event to a server; the server side determines a conference user participating in a conference according to the conference event, and sends a conference notification to a second voice device of the conference user under the condition that the second voice device is determined to be on line; determining a corresponding conference initiating result according to the conference participating notification, and sending the conference initiating result to the first voice equipment and the second voice equipment; the second voice equipment outputs corresponding conference information and conference passwords according to the conference notification; and receiving second voice data, recognizing the second voice data, determining a participation notice, and sending the participation notice to a server.
The embodiment of the application also discloses a data processing method, which comprises the following steps: the first voice equipment performs voice recognition on the received first voice data and determines a corresponding conference event; sending the conference event to a server, wherein the conference event comprises a conference reservation instruction; the server side determines a conference user participating in a conference according to the conference event, determines that the conference condition corresponding to the conference reservation instruction is met, and sends a conference notice to a second voice device of the conference user under the condition that the second voice device is determined to be on-line; the second voice equipment outputs corresponding conference information and conference passwords according to the conference notification; receiving second voice data, recognizing the second voice data, determining a participation notice, and sending the participation notice to a server; and the server determines a corresponding conference initiating result according to the conference participating notification, and sends the conference initiating result to the first voice equipment and the second voice equipment.
The embodiment of the present application further discloses a data processing system, the system includes: first voice equipment, server and second voice equipment, wherein: the first voice equipment is used for carrying out voice recognition on the received first voice data and determining a corresponding conference event; sending the conference event to a server, wherein the conference event comprises a conference reservation instruction; the server side determines the conference users participating in the conference according to the conference event, determines that the conference conditions corresponding to the conference reservation instruction are met, and sends a conference notice to the second voice equipment under the condition that the second voice equipment of the conference users is determined to be on-line; determining a corresponding conference initiating result according to the conference participating notification, and sending the conference initiating result to the first voice equipment and the second voice equipment; the second voice equipment outputs corresponding conference information and conference passwords according to the conference notification; and receiving second voice data, recognizing the second voice data, determining a participation notice, and sending the participation notice to a server.
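For the reservation variant, the application only states that the server determines that the conference condition corresponding to the conference reservation instruction is met. The sketch below assumes, purely for illustration, that this condition is a scheduled start time that the server checks periodically; the `Reservation` class, the `lead_time` parameter, and the `due_reservations` function are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Reservation:
    conference_id: str
    start_time: datetime   # assumed to be carried by the conference reservation instruction
    notified: bool = False


def due_reservations(reservations, now, lead_time=timedelta(0)):
    """Return reservations whose start time, minus an optional lead time, has been reached."""
    due = []
    for reservation in reservations:
        if not reservation.notified and now >= reservation.start_time - lead_time:
            reservation.notified = True  # each reservation triggers notifications only once
            due.append(reservation)
    return due
```

For each reservation returned, the server would then proceed as in the immediate case: check which participant devices are online and send them the conference notification.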
Compared with the prior art, the embodiment of the application has the following advantages:
In the embodiments of the application, the first sound box device can initiate a conference by voice and determine a conference event. The server can determine the participating users based on the conference event and, when it determines that the second sound box device of a participating user is online, send a conference notification to that device. The second sound box device outputs corresponding conference information and a conference password according to the conference notification, receives second voice data, recognizes the second voice data to determine a participation notification, and sends the participation notification to the server. The server determines a corresponding conference initiation result according to the participation notification and sends it to the first sound box device and the second sound box device. A conference initiated by voice can therefore automatically call the second sound box devices that are online, the second sound box devices can join the conference conveniently, and processing efficiency is improved.
Drawings
FIG. 1 is a schematic diagram of a conference system according to an embodiment of the present application;
FIG. 2 is a schematic diagram of a conference scenario in an embodiment of the present application;
FIG. 3 is an interaction diagram of a conferencing system according to an embodiment of the present application;
FIG. 4 is a schematic diagram of call routing according to an embodiment of the present application;
FIG. 5 is a flow chart of steps of an embodiment of a data processing method of the present application;
FIG. 6 is a flow chart of steps in another data processing method embodiment of the present application;
FIG. 7 is a flow chart of steps in yet another data processing method embodiment of the present application;
FIG. 8 is a flow chart of steps in an alternative embodiment of a data processing method of the present application;
FIG. 9 is an interaction diagram of another conferencing system according to an embodiment of the present application;
FIG. 10 is an interaction diagram of yet another conferencing system in accordance with an embodiment of the subject application;
FIG. 11 is an interaction diagram of an embodiment of a data processing method based on a speech device according to an embodiment of the present application;
FIG. 12 is an interaction diagram of an embodiment of a processing method for conference reservation initiation according to an embodiment of the present application;
FIG. 13 is a block diagram of an embodiment of a data processing apparatus of the present application;
FIG. 14 is a block diagram of another data processing apparatus embodiment of the present application;
FIG. 15 is a block diagram of an embodiment of a data processing apparatus of the present application;
FIG. 16 is a block diagram of a further data processing apparatus embodiment of the present application;
FIG. 17 is a schematic structural diagram of an apparatus according to an embodiment of the present application.
Detailed Description
In order to make the aforementioned objects, features and advantages of the present application more comprehensible, the present application is described in further detail with reference to the accompanying drawings and the detailed description.
The embodiments of the application can be applied to a conference system, in which the users' devices are interconnected over a network, the telephone system, or the like, and hold a conference by voice and/or video. In one example, shown in FIG. 1, the conference system comprises a server 10 and a plurality of user devices 20. Each user device 20 includes a sound box device 201 and may further include a terminal device 202.
The sound box device, also called a smart speaker, is an upgraded form of the ordinary speaker. Besides the usual audio output components such as a power amplifier and a loudspeaker, it may include audio input components such as a microphone, image acquisition components such as a camera, display components such as a screen, and a wireless network module. The wireless network module may include a network module such as a Wi-Fi chip, a mobile network module such as a 4G or 5G module, a Bluetooth module such as a Bluetooth chip, or a module for another wireless connection technology. In addition to basic audio output, the sound box device can therefore serve as a voice-driven entry point to the network, connecting to and interacting with the network and other devices. The terminal device may be a mobile phone, a tablet computer, a notebook computer, or the like. The server may be any device, apparatus, or software module that provides the service, such as a single server, a server cluster, a cloud device, and/or a virtual machine.
Referring to fig. 2, a conference scenario diagram according to an embodiment of the present application is shown.
In the embodiments of the present application, at the initiating end of a conference, the conference-initiating user can initiate the conference by voice. The first speaker device receives first voice data of the conference-initiating user, determines a conference event by recognizing and analyzing the first voice data, and sends the conference event to the server so as to initiate the conference. The server can query conference information based on the conference event and determine the participating users at the receiving end of the conference and the second speaker devices of those users. Second speaker devices of one or more receiving ends may be determined, and the server detects whether each second speaker device is online. When the second speaker device of a participating user is determined to be online, the server sends a conference notification to that second speaker device.
After receiving the conference notification, the second speaker device of each participating user can determine conference information and a conference password based on the conference notification and then output them, for example by outputting the corresponding voice data through the audio output unit. Correspondingly, the participating user can join the conference by voice: the participating user speaks content such as the conference password, and the second speaker device receives this second voice data and determines a participation notification by recognizing and analyzing it. For example, the user's identity can be authenticated by a verification method such as voiceprint verification, and the participating user is allowed to join the conference after the authentication passes. The participation notification can then be sent to the server, and the server determines a corresponding conference initiation result according to the participation notification and feeds the result back to the first speaker device at the initiating end, the second speaker device of at least one receiving end, and so on.
Conference initiation, user notification, access and other processes are carried out in combination with the server, and can be implemented according to the following embodiments:
Referring to fig. 3, an interaction diagram of a conference system according to an embodiment of the present application is shown. The first device is the device of the user initiating the conference, and a second device is the device of a participating user, that is, a user taking part in the conference. One conference may involve a plurality of second devices, one for each participating user of the initiated conference.
Step 302, the first device performs recognition analysis on the received first voice data to determine a corresponding conference event. For example, the first speaker device performs voice recognition on the received first voice data, and determines a corresponding conference event.
Step 304, the first device sends the conference event to the server.
The first device at the initiating end may be a speaker device, a terminal device or another device with voice input/output and network connection and interaction functions, so that a conference can be initiated based on the recognition and analysis of voice data. For example, when a user says "initiate a conference" or "initiate an XX conference", a voice input unit such as a microphone of the first device (for example a speaker device) receives the corresponding voice data, and the first device then recognizes and analyzes the voice data to obtain a corresponding conference event, where the conference event is used to instruct the server to initiate a conference.
The speaker device can detect user voice in real time, perform voice recognition on the received voice data to obtain a recognition result, and then perform semantic analysis on the recognition result. The semantic analysis can determine that a certain conference needs to be initiated, and the recognition result may also include conference-related information, from which a corresponding conference event is generated.
In an optional embodiment, performing voice recognition on the received first voice data and determining a corresponding conference event includes: performing text recognition on the first voice data to obtain first text information; and identifying a set word in the first text information and determining a conference event according to the set word. That is, voice recognition is performed on the first voice data to obtain the corresponding first text information, which is then analyzed to determine whether it contains a set word such as "meeting"; once a set word is detected, a corresponding conference event can be generated. In some examples, parameter information related to the conference event may also be determined from the first text information.
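The set-word step can be illustrated with a minimal sketch. It assumes the speech recognizer has already produced the first text information; the set-word list, the `ConferenceEvent` type and the "initiate the <name> conference" phrasing are hypothetical and are not specified by the application.

```python
import re
from dataclasses import dataclass, field
from typing import Dict, Optional

# Hypothetical set words; the description only gives "meeting" as an example.
SET_WORDS = ("conference", "meeting")


@dataclass
class ConferenceEvent:
    """Illustrative conference event sent from the first device to the server."""
    set_word: str
    params: Dict[str, str] = field(default_factory=dict)


def text_to_conference_event(first_text: str) -> Optional[ConferenceEvent]:
    """Return a conference event if the recognized text contains a set word."""
    lowered = first_text.lower()
    for word in SET_WORDS:
        if word in lowered:
            params: Dict[str, str] = {}
            # Assumed phrasing: "initiate a/an/the <name> conference".
            match = re.search(r"initiate (?:an?|the) (.+?) " + word, lowered)
            if match:
                params["name"] = match.group(1)
            return ConferenceEvent(set_word=word, params=params)
    return None  # no set word, so no conference event is generated
```

With this sketch, a bare "initiate a conference" yields an event with no parameters, while text without any set word yields no event at all.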
In the embodiment of the present application, each conference may store, at a server, conference parameters of the conference, where the conference parameters refer to data that the conference corresponds to and can be referred to, and the conference parameters include at least one of: a time parameter, a location parameter, a user parameter, and a content parameter. Conference parameters such as a conference name, a conference identification, etc. may also be included. The conference name refers to the name of the conference and can be set when the conference is scheduled; the conference identification refers to a character string which uniquely identifies one conference, and the conference identification of each conference is different; the time parameter may record time information of the conference, such as an initiation time, an expected duration of the conference, etc., and the user parameter may record a user participating in the conference, such as a user initiating the conference, at least one user participating in the conference, etc. The location parameter can record the location information of the conference participating users, such as the geographical location or the network address of the equipment; the content parameters may record content related to the meeting, such as a brief introduction to the meeting, an issue, etc.
When speaking to initiate a conference, the user may simply state that a conference is to be initiated, or may indicate specifically which conference is to be initiated, for example by stating at least one of the time, the name and the participating users, so that the device can identify the required conference more easily. For example, in a speaker device scenario, the speaker device can recognize the set word, the parameter information of the conference and the like from the voice data and add the parameter information to the conference event, so that the server can readily identify the conference to be initiated. In some other embodiments, all or part of the conference parameters of the conference may also be stored in the initiating device, so that the set word in the first text information can be matched against the conference parameters, the matched conference determined, the corresponding conference identifier obtained, and the conference identifier added to the conference event. The conference event may then be sent to the server so that the server calls the devices of the other participating users.
In the embodiments of the present application, a conference also has a corresponding conference password, which can be understood as a key with which a user joins the conference. The conference passwords of different participating users of one conference may be the same or different. When initiating the conference, the user may speak the conference password to help identify the conference; alternatively, after the device recognizes the set word of the conference, it may prompt the user for the conference password, receive voice data that includes the conference password, recognize the conference password from that voice data and add it to the conference event, so that the conference is determined more accurately.
In some other embodiments, in order to improve the accuracy of initiating the conference, voiceprint recognition can be performed on the first voice data to obtain corresponding voiceprint information, and voiceprint verification is performed according to the voiceprint information. In the embodiments of the present application, voiceprint information of each user participating in the conference can be determined in advance, and whether the conference is allowed to be initiated is determined through voiceprint verification: voiceprint information is identified from the first voice data and then verified. In some examples, the conference information is sent to the user device bound to each participating user, and each user device can store the voiceprint information of its bound user. When a participating user initiates a conference, voiceprint verification can then be performed in the first device by matching the identified voiceprint information against the voiceprint information of the bound user, and the conference event is sent to the server only after the voiceprint verification passes. In other examples, the initiating device may add the identified voiceprint information to the conference event, so that voiceprint verification is performed at the server to determine whether to initiate the conference.
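As an illustration of matching spoken parameters against stored conference parameters, the following sketch models a stored conference and returns its identifier only when exactly one conference matches. The `Conference` fields, the keys of the `spoken` dictionary and the rule of rejecting ambiguous matches are assumptions made for this example.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Conference:
    """Illustrative stored conference parameters (field names are assumed)."""
    conference_id: str          # unique conference identification
    name: str                   # conference name
    start_time: str             # time parameter, e.g. "2019-07-30T10:00"
    participants: List[str] = field(default_factory=list)  # user parameter
    locations: Dict[str, str] = field(default_factory=dict)  # location parameter
    content: str = ""           # content parameter, e.g. a brief introduction


def match_conference(stored: List[Conference],
                     spoken: Dict[str, str]) -> Optional[str]:
    """Return the identifier of the single conference matching every spoken parameter."""

    def matches(conf: Conference) -> bool:
        if "name" in spoken and spoken["name"].lower() != conf.name.lower():
            return False
        if "time" in spoken and spoken["time"] != conf.start_time:
            return False
        if "user" in spoken and spoken["user"] not in conf.participants:
            return False
        return True

    candidates = [conf for conf in stored if matches(conf)]
    # Ambiguous or missing matches return None so the device can ask for more detail.
    return candidates[0].conference_id if len(candidates) == 1 else None
```

Returning `None` for an ambiguous match is one possible design; a device could equally list the candidates and ask the user to choose.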
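The application does not specify how voiceprint information is represented or compared. As a minimal sketch, assume each voiceprint is a fixed-length numeric feature vector and that verification compares the observed vector with the enrolled one by cosine similarity against a threshold; both the representation and the threshold value of 0.8 are assumptions for illustration.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length feature vectors."""
    if len(a) != len(b):
        raise ValueError("voiceprint vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # an all-zero vector cannot match anything
    return dot / (norm_a * norm_b)


def verify_voiceprint(observed: Sequence[float],
                      enrolled: Sequence[float],
                      threshold: float = 0.8) -> bool:
    """Pass verification when the observed voiceprint is close to the enrolled one."""
    return cosine_similarity(observed, enrolled) >= threshold
```

The same check could run on the first device against the bound user's stored voiceprint, or on the server against the voiceprint carried in the conference event.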
Step 306, the server determines the participating users of the conference according to the conference event.
Step 308, the server sends a conference notification to the second device when the second device of a participating user is determined to be online. The server can detect the connection state of each second device, where the connection state is online or offline.
After receiving the conference event, the server can determine the corresponding conference according to the conference parameters in the conference event, and query the conference information of the conference, and can determine the participating users participating in the conference and the second devices of the participating users according to the user parameters, and send a conference notification to the second devices of the participating users, such as sound box devices, and the like, wherein the conference notification is used for notifying the participating users to join the conference. In one example, the first device of the user initiating the conference may automatically access the conference, and in some other examples, the user initiating the conference also needs to access the conference based on the conference notification as a participant user, which may be specifically set according to the requirement.
In an optional embodiment, determining the participating users of the conference according to the conference event includes: determining conference parameters according to the conference event, where the conference parameters include a time parameter; and querying corresponding conference information according to the conference parameters, and acquiring from the conference information the participating users of the conference and the second speaker devices of those users.
In some optional embodiments, the server may also verify the conference initiation. The server may obtain voiceprint information or a conference password from the conference event, and determine the initiating user according to the device information of the first device that sent the conference event. It then obtains the stored voiceprint information of the initiating user and matches it against the voiceprint information in the conference event, or matches the stored conference password of the conference against the conference password in the conference event. After the verification passes, the server may send a conference notification to the second devices of the participating users.
The conference notification can carry information such as conference parameters of the conference and conference passwords of the participating users, conference information can be generated based on the parameter information of each conference parameter, and the conference passwords of different participating users can be different. In other embodiments, the conference password may be issued to the participating users when setting up the conference, so that the conference notification may only carry the conference parameters and no longer carry the conference password, and the conference password is issued when the user requests, which is not limited in the embodiments of the present application.
In the embodiment of the present application, the conference notification may be sent in various manners such as sending a call, a request, or a push message, where the call refers to establishment of connection between users in communication, such as a call for a fixed phone/a mobile phone, and a request for voice and/or video communication corresponding to a network phone, and the call may be specifically set according to requirements.
For a fixed phone or mobile phone, the landline or mobile phone number of the participating user is determined and dialed. For a network phone, the information of the second device of the participating user, such as a speaker device, is determined, and the conference notification is sent, for example, according to the Media Access Control (MAC) address of the device. Before the conference notification is sent, whether the device is online can be determined based on device information such as the MAC address. If the device is online, the conference notification is sent; if the device is offline, a conference initiation result indicating call failure can be generated, or prompt information can be generated and sent to other devices bound to the participating user, for example as a short message or push message to a mobile phone, to prompt the user to participate.
Therefore, when calling the user devices of the participating users, the server can send the conference notification to routers in different places according to the region where each user device is located, and each router then distributes the conference notification to the second devices, such as speaker devices, of the participating users. As shown in fig. 4, in this example the server determines that the second devices of the participating users are distributed in Shenzhen, Beijing and Hangzhou, and sends conference notifications to the routers in Shenzhen, Beijing and Hangzhou respectively, where each conference notification may carry the MAC addresses of the corresponding devices. Router A in Shenzhen can thus distribute the conference notification to speaker devices A1, A2 … An, router B in Beijing to speaker devices B1, B2 … Bn, and router C in Hangzhou to speaker devices C1, C2 … Cn, where n is a positive integer. Even if the participating users are distributed in different regions, they can be called through the routers and can conveniently participate in the conference.
Step 310, the second device outputs corresponding conference information and a conference password according to the conference notification.
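The online check and fallback described here can be sketched as follows. The transport is abstracted behind callables, and the `SecondDevice` type, the outcome labels and the prompt wording are hypothetical names introduced only for this example.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class SecondDevice:
    """Illustrative record of a participating user's devices."""
    user: str
    mac: str             # MAC address of the user's speaker device
    fallback_phone: str  # another device bound to the same user


def dispatch_notifications(
    devices: List[SecondDevice],
    notification: Dict[str, str],
    is_online: Callable[[str], bool],
    send: Callable[[str, Dict[str, str]], None],
    send_prompt: Callable[[str, str], None],
) -> Dict[str, str]:
    """Notify online devices; for offline ones record a call failure and prompt a bound device."""
    outcome: Dict[str, str] = {}
    for device in devices:
        if is_online(device.mac):
            send(device.mac, notification)
            outcome[device.user] = "notified"
        else:
            # Assumed fallback: a short text prompt to another bound device.
            send_prompt(device.fallback_phone,
                        "You have a conference: " + notification["name"])
            outcome[device.user] = "call_failed"
    return outcome
```

Passing the online check and the senders as parameters keeps the sketch independent of any particular network or push service.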
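The regional distribution in this example amounts to grouping device addresses by region so that one notification per router can carry the MAC addresses it should serve. A minimal sketch, with the function name and the input shape (pairs of MAC address and region) assumed for illustration:

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def plan_router_distribution(devices: List[Tuple[str, str]]) -> Dict[str, List[str]]:
    """Group device MAC addresses by region, one group per regional router."""
    plan: Dict[str, List[str]] = defaultdict(list)
    for mac, region in devices:
        plan[region].append(mac)
    return dict(plan)
```

For instance, devices A1 and A2 in Shenzhen and B1 in Beijing would produce one group for the Shenzhen router and one for the Beijing router.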
In this embodiment of the application, the second device may receive the conference notification, and then may play information corresponding to the conference notification through the audio output unit, that is, prompt the user that there is a conference, whether to participate, and the like. The conference notification may carry conference parameters and conference passwords, such as parameter information of a conference name, time, and the like, generate conference information based on the conference parameters, and output the conference information and the conference passwords through the audio output unit, so that a participating user can know a conference and determine how to join the conference.
The user can speak the information required to join the conference, such as whether to participate and the conference password. A user who intends to participate can directly speak the conference password, which is then verified so that the user can join the conference; if the user agrees to participate but does not speak the conference password, the second device can guide the user to speak it. In an optional embodiment, outputting the corresponding conference information and conference password according to the conference notification includes: acquiring conference parameters from the conference notification, and determining conference information and a conference password according to the conference parameters; and performing voice conversion on the conference information and the conference password and outputting them by voice. The conference parameters are obtained from the conference notification and the conference information is determined from them. In some examples the conference parameters also include a password parameter, so that the conference password can likewise be determined from the conference parameters; in other examples the conference password may be a random character string or random number generated by the server for the conference and carried in the conference notification. The conference information and the conference password are usually in a non-voice form such as text, so they are converted to corresponding voice data and then output by voice through the audio output unit.
In a further optional embodiment, the obtaining conference parameters from the conference notification, and determining conference information and a conference password according to the conference parameters includes: determining conference parameters according to the conference notification, wherein the conference parameters comprise time parameters; and inquiring according to the conference parameters, and determining corresponding conference information and conference passwords. In still other embodiments, the meeting notification may carry a time parameter, as well as meeting parameters such as a location parameter, a user parameter, a content parameter, and the like, to facilitate determination of meeting information for output to the participating users.
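Before voice conversion, the second device needs a text form of the conference information and the conference password. The sketch below builds such an announcement from a conference notification; the notification layout (`params`, `password`) and the wording are assumptions, and the text-to-speech step itself is left out.

```python
from typing import Any, Dict


def build_announcement(notification: Dict[str, Any]) -> str:
    """Compose the text that would be handed to a text-to-speech engine."""
    params = notification["params"]
    parts = ["You are invited to the conference " + params["name"]]
    if "time" in params:
        parts.append("starting at " + params["time"])
    text = ", ".join(parts) + "."
    password = notification.get("password")
    if password:
        # Spell the password character by character so it is read out clearly.
        text += " The conference password is " + " ".join(password) + "."
    return text
```

When the notification carries no password, for example because the password was issued earlier, the announcement contains only the conference information.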
Step 312, the second device receives second voice data, recognizes the second voice data, and determines a participation notification.
Step 314, the second device sends the participation notification to the server.
A second device such as a speaker device can receive voice data in real time, recognize the received second voice data to obtain a recognition result, and perform semantic analysis on the recognition result to determine whether the user will participate in the conference and whether the recognition result contains a conference password. If a conference password is present, a participation notification is generated according to the recognized conference password. If no conference password is present, the device can prompt the user by voice to provide it, recognize the conference password from the user's spoken reply and add it to the participation notification. Alternatively, the device can generate the participation notification only after verifying that the conference password is correct. The participation notification is then sent.
Recognizing the second voice data and determining a participation notification includes: performing text recognition on the second voice data to obtain second text information; performing voiceprint recognition on the second voice data when the second text information contains a conference password; performing voiceprint verification according to the recognized voiceprint information; and determining the corresponding participation notification after the voiceprint verification passes. That is, text recognition is performed on the received second voice data to obtain the corresponding second text information, and the second text information is analyzed to determine whether it contains the conference password. When it does, voiceprint verification can be performed on the participating user in order to protect the conference content: voiceprint recognition is performed on the second voice data to obtain the corresponding voiceprint information, and voiceprint verification based on that information determines whether the user is allowed to participate in the conference. The second device may store the voiceprint information of the participating user, or obtain it from the server, and match the recognized voiceprint information against the stored voiceprint information to decide whether the verification passes. Alternatively, the second device may send the recognized voiceprint information to the server, which matches it against the voiceprint information of the corresponding participating user and determines the voiceprint verification result. In either case, the result indicates whether the voiceprint belongs to a participating user, so that participating users are allowed to join the conference while users who are not participating users are refused.
In other embodiments, sometimes the user may not output the conference password in reply and the conference password may not be recognized from the second voice data. And outputting third voice data under the condition that the second text information does not contain the conference password, wherein the third voice data is used for guiding the conference participating user to input the conference password.
Some users may forget the conference password or find it inconvenient to look up, so that the conference password is not spoken when the user replies to the second device to join the conference, and accordingly no conference password is recognized in the second voice data. The second device may then output third voice data for guiding the participating user to input the conference password, such as "please read the conference password", or "please speak the following conference password: 123456", where "123456" is an example conference password that the device provides so that the user can repeat it and join the conference. In other examples, although the user does not speak the conference password, the user's voiceprint verification passes; once the user's legitimate identity is confirmed in this way, the conference password issued to the second device may be acquired directly and added to the participation notification, or the participation notification may be generated directly.
In an optional embodiment of the present application, speech recognition is performed on the second voice data to determine a corresponding recognition result; the recognition result is analyzed to judge whether a conference password is present; if a conference password is present, it is acquired; if no conference password is present, prompt information is generated to prompt the user to provide the conference password, so that the conference password can be recognized from subsequently received voice data; or, if no conference password is present but the voiceprint verification passes, the conference password of the participating user corresponding to the voiceprint information can be acquired.
Step 316, the server determines a corresponding conference initiation result according to the participation notification.
After receiving the participation notification from the second device, the server determines the conference parameters, conference password and the like from the participation notification and determines that the second device of the participating user is allowed to access the conference, thereby generating a corresponding conference initiation result, such as the participating users who have joined and their number.
When the second speaker device of a participating user is determined to be offline, a conference initiation result indicating that the conference failed for that participating user is generated. Other devices of the participating user can also be queried and prompt information sent to them, so that the participating user can still take part in the conference. Offline refers to a situation in which the device and the server cannot interact, which may have various causes, for example the device is not powered on, or is powered on but not connected to the network.
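The branches described for the second device can be summarized in a small decision function. This is a sketch under stated assumptions: the second text information and the voiceprint verification result are already available, the password is located by simple substring search after removing whitespace, and the `Decision` type, the action labels and the `allow_voiceprint_only` switch are hypothetical.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass
class Decision:
    """Illustrative outcome: 'notify', 'prompt' or 'reject'."""
    action: str
    password: Optional[str] = None


def decide_participation(second_text: str,
                         issued_password: str,
                         voiceprint_ok: bool,
                         allow_voiceprint_only: bool = False) -> Decision:
    """Decide what the second device does with the recognized reply."""
    # Remove whitespace so a password spoken as "1 2 3 4 5 6" still matches.
    compact = re.sub(r"\s+", "", second_text)
    if issued_password in compact:
        # Password present: the voiceprint check decides whether to join.
        if voiceprint_ok:
            return Decision("notify", issued_password)
        return Decision("reject")
    if voiceprint_ok and allow_voiceprint_only:
        # No password spoken, but identity confirmed: reuse the issued password.
        return Decision("notify", issued_password)
    # Otherwise guide the user to speak the conference password.
    return Decision("prompt")
```

A `"notify"` decision corresponds to generating the participation notification, and `"prompt"` corresponds to outputting the third voice data.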
Step 318A, the server sends the conference initiation result to the first device.
Step 318B, the server sends the conference initiation result to the second device.
The conference notification can be sent to one or more online second devices, the conference initiating result is determined through the conference initiating notification of each second device, or the conference initiating result failing to participate is generated aiming at the offline second devices, so that the conference initiating result of the conference can be determined according to the offline and online conditions of the devices and the responses to the conference initiating notification, for example, the accessed participant user and the like are determined to be fed back based on the accessed participant device, and the accuracy is high.
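One way to picture the conference initiation result is as a summary built from each invited user's online state and the set of users whose participation notifications arrived. The field names and the extra "no_response" bucket for online users who did not reply are assumptions made for this sketch.

```python
from typing import Dict, Iterable, List, Union


def build_initiation_result(
    online_state: Dict[str, bool],
    joined: Iterable[str],
) -> Dict[str, Union[List[str], int]]:
    """Summarize who joined, who failed because the device was offline, and who did not reply."""
    joined_set = set(joined)
    joined_users: List[str] = []
    failed_offline: List[str] = []
    no_response: List[str] = []
    for user in sorted(online_state):
        if not online_state[user]:
            failed_offline.append(user)   # device offline: participation failed
        elif user in joined_set:
            joined_users.append(user)     # participation notification received
        else:
            no_response.append(user)      # online but no participation notification
    return {
        "joined": joined_users,
        "failed_offline": failed_offline,
        "no_response": no_response,
        "joined_count": len(joined_users),
    }
```

The resulting dictionary is the kind of payload that could be fed back to the first device and to the second devices.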
In this application embodiment, first equipment and second equipment all can be audio amplifier equipment, and this audio amplifier equipment still can include display element such as show, image acquisition parts such as camera etc. to based on display element, first audio amplifier equipment and second audio amplifier equipment can support video conference, make user's accessible audio amplifier equipment, server etc. carry out video conference, the transmission of the meeting content of being convenient for. Correspondingly, the display component of the sound box device can also display the corresponding conference information, the conference password and the like, for example, the display of the sound box device can display the conference content, the summary, the subject and the like, and can also display the content of the presentation, the file and the like in the conference process.
In the embodiment of the application, a user can carry out a conference through voice equipment such as a sound box device and the like, the conference can be carried out through voice interaction, the voice equipment at any end can receive voice data of the user and serve as conference voice data, recognized text information is uploaded to a server after voice recognition processing is carried out, text information recognized by other voice equipment ends can be received from the server and is output after being converted into the voice data, and therefore the purpose that the user at each end carries out the conference through the voice equipment and a server is achieved, the server and each voice equipment end interact the text information to achieve the voice interaction, the data volume can be reduced, and interaction efficiency is improved.
In the process, the service end and each voice equipment end interact text information corresponding to the conference voice, and correspondingly, the text information can be further sorted to automatically generate a conference record. The device may generate a conference record according to the voice text information corresponding to the conference voice in the conference process, where the voice text information includes text information obtained by recognizing conference voice data received by the local terminal and/or text information obtained by recognizing conference voice data transmitted by another device terminal transmitted by the server terminal. That is, the text information converted from the voice data received by the local terminal and the text information corresponding to the voice data of the other device terminal received from the server terminal may be arranged into a meeting record, for example, the meeting record is arranged according to a time sequence, the text information corresponding to each piece of voice data of the user is arranged into the meeting record according to a time sequence, and the meeting record may further identify the information to which each piece of text corresponds, such as determining the affiliated user based on voiceprint recognition, or marking the affiliated meeting terminal according to the device terminal, and the like. In other scenarios, each device side may also only sort the received voice data, generate a conference record according to the time information, and then upload the conference record to the server side, and the server side sorts the conference record transmitted by each device side, the time information of each record, the owner information, and the like into one conference record, which may be specifically set according to the requirements.
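Arranging recognized text into a time-ordered conference record can be sketched as a merge of local and remote entries. The `Utterance` type, the use of seconds since the start of the conference as the timestamp, and the line format are assumptions for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Utterance:
    """One piece of recognized conference text with its source."""
    timestamp: float  # seconds since the conference started (assumed unit)
    speaker: str      # user or device end the text belongs to
    text: str


def build_conference_record(local: List[Utterance],
                            remote: List[Utterance]) -> List[str]:
    """Merge local and remote recognized text into one record in time order."""
    merged = sorted([*local, *remote], key=lambda u: u.timestamp)
    return [f"{u.timestamp:.0f}s {u.speaker}: {u.text}" for u in merged]
```

The same merge could run on the server when each device end uploads only its own entries.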
In the embodiment of the application, besides the audio input/output unit, some voice devices may further include a display unit, such as a television serving as the voice device, and also such as an intelligent sound box with a display screen, so that for the voice device with the display unit, the conference record may be displayed in the display unit of the voice device, such as being displayed in the display unit along with the progress of the conference, and such as being displayed after the conference is ended. And an editing function can be provided for the voice equipment, so that a user can indicate a part with problems through voice in the process of viewing the conference recording, and then mark and modify the part, thereby improving the efficiency of the conference recording. In addition, after the conference is finished, the server or the voice device can also automatically send the conference record to each user in the conference or send the conference record to a designated user.
The embodiments of the application can be applied to conferences held across different locations. In some examples, a participating device such as a sound box device may be shared by multiple users. For example, in a three-site conference among Beijing, Hangzhou and Shenzhen, the Beijing sound box device B2 is placed in a conference room, so that the users of a team can join the conference through the sound box device B2 in that room. Each second device may also recognize the number of users participating in the conference from the voice data, for example by counting distinct voiceprints, which is not limited in the embodiments of the present application. Accordingly, the conference initiation result may also carry the number of users recorded in each participation notification, so that it can be determined whether the users participating in the conference match the conference information.
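As a rough illustration of the user-count check, the sketch below counts distinct voiceprints on a shared device and compares the total reported across participation notifications with the number expected from the conference information. The `user_count` field and both function names are hypothetical; voiceprints are represented here simply as already-assigned identifiers.

```python
def users_on_device(voiceprint_ids):
    """Number of distinct speakers heard by one shared device, assuming
    each piece of voice data has already been mapped to a voiceprint id."""
    return len(set(voiceprint_ids))


def participants_match(participation_notifications, expected_users):
    """True if the user counts carried in the participation notifications
    add up to the number of users listed in the conference information."""
    total = sum(n["user_count"] for n in participation_notifications)
    return total == expected_users


beijing = {"device": "B2", "user_count": users_on_device(["vp1", "vp2", "vp1"])}
hangzhou = {"device": "H1", "user_count": 1}
```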
In the embodiment of the application, each speaker device may further correspond to an application program in the terminal device, the application program may store conference information, the conference information may be written and stored by a user who establishes a conference and synchronized to the server, and the conference information stored by the server may also be issued to the terminal device where each application program is located, so that each participating user's terminal device may have conference information, thereby facilitating determination of the participating user, the subject of the conference, the conference password, and the like.
Therefore, the embodiment of the application can conveniently initiate and answer the conference based on the voice call, and the conference establishing efficiency and accuracy are improved.
Because the sound box device is wireless, the wiring design for conference equipment can be omitted, which saves cost. Multiple users can also participate in the conference through a single sound box device, which makes it convenient for team users to join. In addition, signal input can be recognized based on the far-field pickup function of the sound box device, which improves efficiency.
On the basis of the embodiment, the data processing method can be applied to sound box equipment, so that the conference can be initiated and joined quickly.
Referring to FIG. 5, a flow chart of steps of an embodiment of a data processing method of the present application is shown.
Step 502, receiving voice data.
Step 504, performing voice recognition on the received voice data, and determining a corresponding conference event.
In one example, performing voice recognition on the received voice data and determining the corresponding conference event includes: performing text recognition on the voice data to obtain text information; and identifying a set word in the text information and determining a conference event according to the set word. To distinguish between pieces of voice data, the voice data received by the first device may be referred to as first voice data. Voice recognition is performed on the first voice data to obtain the corresponding first text information, and the first text information is then analyzed to determine whether it contains a set word, such as "meeting". Once a set word is detected, a corresponding conference event can be generated. In some examples, parameter information related to the conference event may also be determined from the first text information. The user can therefore indicate that a conference should be initiated by speaking the set word, the conference-related information and so on, so that the conference can be initiated quickly.
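A minimal sketch of the set-word step, operating on text that has already been recognized from the first voice data. The set-word list, the event dictionary layout and the "N o'clock" time pattern are all assumptions chosen for illustration; the application does not specify them.

```python
import re
from typing import Optional

# Hypothetical set words; a real deployment would configure its own.
SET_WORDS = ("meeting", "conference")


def determine_conference_event(text: str) -> Optional[dict]:
    """Return a conference event if the text contains a set word, else None.

    A time parameter is extracted when the text contains "<N> o'clock".
    """
    lowered = text.lower()
    matched = next((w for w in SET_WORDS if w in lowered), None)
    if matched is None:
        return None
    event = {"type": "initiate_conference", "set_word": matched}
    time_match = re.search(r"\b(\d{1,2})\s*o'clock\b", lowered)
    if time_match:
        event["time"] = int(time_match.group(1))
    return event
```

For example, `determine_conference_event("Start the 3 o'clock meeting")` yields an event carrying the set word "meeting" and the time parameter 3, while an utterance without a set word yields no event.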
In some embodiments, voiceprint information can also be extracted from the voice data, voiceprint check is performed according to the voiceprint information, and the conference event is sent when the voiceprint check is passed. In one example, the voiceprint check may be performed after a setting word is recognized from the first voice data, and the step of determining the conference event according to the setting word may be performed after the voiceprint check is passed. In another example, a voiceprint check may be performed after a set word is recognized from the first voice data and a conference event is generated. Taking the determination of the conference event and then the verification of the voiceprint as an example:
Step 506, extracting voiceprint information from the voice data.
Step 508, performing voiceprint verification according to the voiceprint information. The voiceprint information extracted from the first voice data is matched against the stored voiceprint information of the participating users to determine whether it passes the voiceprint check.
Step 510, sending the conference event to the server when the voiceprint check is passed.
According to the conference event, the server sends a conference notification to the online devices of the participating users, such as sound box devices. According to the conference notification, each online device outputs the corresponding conference information and may also output the conference password.
Step 512, receiving a conference initiating result.
After the conference initiation result is received, various conference information such as the participating users and the number of participants can be determined. For a sound box device with a display component, the conference information, the conference password, the conference initiation result and the like can also be displayed on the sound box device, for example the conference name, summary, discussion topic and number of participants. The sound box device may also have both a display component and a camera, so that a video conference can be held through the sound box device.
In an optional embodiment of the present application, voiceprint recognition can further be performed on the voice data to obtain the corresponding voiceprint information, and voiceprint verification is performed according to that voiceprint information. The identity of the user is determined through the voiceprint check, which in turn determines whether the user has the right to initiate conferences in general and whether the user may initiate a particular conference.
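The voiceprint gate of steps 506 to 510 might look like the sketch below. The application does not say how voiceprints are compared; representing a voiceprint as a numeric vector and matching by cosine similarity against a threshold is an assumption made purely for this example, as are the function names and the `send` callback.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def voiceprint_check(voiceprint, enrolled, threshold=0.8):
    """Return the best-matching enrolled user, or None if no stored
    voiceprint reaches the (assumed) similarity threshold."""
    best_user, best_score = None, threshold
    for user, reference in enrolled.items():
        score = cosine_similarity(voiceprint, reference)
        if score >= best_score:
            best_user, best_score = user, score
    return best_user


def send_event_if_verified(event, voiceprint, enrolled, send):
    """Step 510: forward the conference event only when the check passes."""
    user = voiceprint_check(voiceprint, enrolled)
    if user is None:
        return False
    send({**event, "initiator": user})
    return True
```

Returning the matched user rather than a bare boolean lets the caller go on to check whether that user is entitled to initiate the particular conference.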
In other embodiments, the user may also initiate the conference in the application program of the terminal device by means of a click, a gesture, a voice, and the like, which is not limited in this application embodiment.
On the basis of the embodiment, the data processing method is provided, and can be applied to the server side, so that the conference, the access of the participating users and the like can be initiated quickly, the accuracy of the conference is improved, and the safety of the conference content is ensured.
Referring to FIG. 6, a flow chart of steps of another data processing method embodiment of the present application is shown.
At step 602, a conference event is received. The server may obtain the conference event from the first device, such as the first speaker device, and the process of determining the conference event is as described in the above embodiment, which is not described again.
Step 604, determining the participating users of the conference according to the conference event. Determining the participating users according to the conference event includes: determining conference parameters according to the conference event, where the conference parameters include a time parameter; and querying the corresponding conference information according to the conference parameters, and obtaining from the conference information the users participating in the conference and the second sound box devices of those users.
The server may also determine the connection state of a second sound box device based on the media access control (MAC) address of that device. When the second sound box device of a participating user is determined to be offline, a conference initiation result indicating failure is generated for that participating user.
Step 606, under the condition that it is determined that the second sound box device of the participating user is online, sending a conference notification to the second sound box device, so that the second sound box device outputs corresponding conference information and a conference password according to the conference notification.
Step 608, receiving a participant notification returned by the second sound box device, where the participant notification is determined according to the second voice data recognition.
In some examples, the participation notification may carry verification information such as a conference password and voiceprint information, which is used to check conference access and to determine whether the participation notification is acceptable. Checking the participation notification based on the verification information includes at least one of the following: comparing the conference password with the stored conference password to obtain a password verification result; and matching the voiceprint information with the voiceprint information of the corresponding participating user to obtain a voiceprint verification result. In other words, the validity of the response can be verified through the conference password, and the identity of the responding user can be verified through the voiceprint information, which determines whether the user has the right to join or to initiate the conference. Only devices of users whose response is valid and/or whose identity is confirmed are allowed to access the conference, which improves the security of the conference.
Step 610, determining a corresponding conference initiation result according to the participation notification, and sending the conference initiation result to the first sound box device and the second sound box device.
The conference initiation result is determined based on the participation notification. For example, for a participating user whose participation notification passes verification, a result such as successful joining, successful participation or successful conference initiation can be generated; if the second device of a participating user is offline, a conference initiation result indicating call failure is generated for that user. The conference initiation result can then be fed back to the first user and the second user, so that the participating users and the initiating user can learn the initiation state of the conference promptly, which improves conference efficiency.
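The server-side flow of steps 604 to 610 can be sketched roughly as follows. The in-memory `conferences` table keyed by time, the set of online MAC addresses, the result strings and the callbacks are all hypothetical stand-ins. The sketch also requires both the password and the voiceprint to pass, which is only one of the combinations the text allows ("at least one of").

```python
def handle_conference_event(event, conferences, online_macs, send_notification):
    """Steps 604-606: look up the conference by its time parameter and
    notify each participating user whose device is online."""
    info = conferences[event["time"]]
    results = {}
    for user, mac in info["participants"].items():
        if mac in online_macs:
            send_notification(mac, {"time": event["time"],
                                    "password": info["password"]})
            results[user] = "notified"
        else:
            results[user] = "call_failed"  # device offline
    return results


def check_participation(notification, info, voiceprints_match):
    """Steps 608-610: verify a participation notification.

    voiceprints_match(user, voiceprint) is a caller-supplied predicate.
    """
    password_ok = notification.get("password") == info["password"]
    voiceprint_ok = voiceprints_match(notification["user"],
                                      notification.get("voiceprint"))
    return "joined" if password_ok and voiceprint_ok else "rejected"
```

The per-user results from both functions would then be combined into the conference initiation result returned to the first and second devices.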
On the basis of the embodiment, the data processing method can be applied to second equipment such as sound box equipment, so that a conference can be initiated and verified quickly through voice access, the accuracy of the conference is improved, and the safety of conference contents is ensured.
Referring to FIG. 7, a flowchart illustrating steps of yet another data processing method embodiment of the present application is shown.
At step 702, a meeting notification is received. The conference notification is determined according to the conference event identified by the first voice data, the conference notification is sent by the server side under the condition that the sound box equipment corresponding to the participant user is determined to be on-line, and the second equipment in the on-line state can receive the conference notification. In the embodiment of the application, the second devices such as the sound box device and the like with audio input and output functions can also play the prompt information corresponding to the conference notification through voice.
Step 704, outputting the corresponding meeting information and the meeting password according to the meeting notification.
Outputting the corresponding conference information and conference password according to the conference notification includes: acquiring conference parameters from the conference notification, and determining the conference information and the conference password according to the conference parameters; and converting the conference information and the conference password to speech and outputting them by voice. Optionally, acquiring the conference parameters from the conference notification and determining the conference information and the conference password according to them includes: determining conference parameters according to the conference notification, where the conference parameters include a time parameter; and querying according to the conference parameters to determine the corresponding conference information and conference password. The conference parameters further include at least one of: location parameters, user parameters, content parameters.
In other examples, only the conference information is output according to the conference notification, and the conference password is not output at that point. The conference password is then output only when it is needed, for example when guiding or prompting the participating user for the conference password, or it is used only for verification.
Step 706, receiving the second voice data, recognizing the second voice data and determining the participation notification.
Wherein the recognizing the second voice data and determining a participant notification comprises: performing text recognition on the second voice data to obtain text information; under the condition that the text information contains a conference password, performing voiceprint recognition on the voice data; performing voiceprint verification according to the recognized voiceprint information; and after the voiceprint check is passed, determining the corresponding participant notification. And under the condition that the text information does not contain the conference password, outputting third voice data, wherein the third voice data is used for guiding the conference participating user to input the conference password.
The conference password identified in the voice data can be matched with the conference password carried in the conference notification, so that the accuracy of the password provided by the user is verified. Therefore, the conference password can verify the validity of the conference, and the security of the conference is improved. Voiceprint verification can be performed based on voiceprint information in the voice data. The identity of the user is determined through voiceprint verification, whether the user has the right to join the conference or not can be determined based on the identity of the user, and the safety of the conference is improved. In other embodiments, the voiceprint information, the conference password, and the like can be added to the conference participation notification, so that the server can perform authentication based on the voiceprint information, such as authentication of the user identity, and the like.
Step 708, sending the participation notification.
Step 710, receiving a conference initiating result.
The second device may include a sound box device having a display component, in which case outputting the corresponding conference information and conference password includes displaying them through the display component. This makes it easy for the user to learn the conference information and the conference password and to decide whether to participate in the conference. The sound box device may also have both a display component and a camera, so that a video conference can be held through the sound box device.
On the basis of the above embodiment, the second device on the participating end may also respond as follows.
Referring to FIG. 8, a flowchart illustrating steps of yet another alternative embodiment of a data processing method of the present application is shown.
At step 802, a meeting notification is received.
Step 804, determining the corresponding conference information and conference password according to the conference notification. Conference parameters, which include a time parameter, can be determined from the conference notification; a query is then made according to the conference parameters to determine the corresponding conference information and conference password.
Step 806, outputting the corresponding conference information and conference password. They can be output through the audio output unit alone, or through the audio output unit and the display component together.
Step 808, receiving voice data.
Step 810, performing text recognition on the voice data to obtain text information.
Step 812, determining whether the text information includes a conference password. If the conference password is not included, go to step 814, and if the conference password is included, go to step 816.
Step 814, outputting third voice data, where the third voice data is used to guide the participating user to input a conference password.
Step 816, performing voiceprint recognition on the voice data.
Step 818, performing a voiceprint check based on the recognized voiceprint information. If the check passes, go to step 820; otherwise, end the process.
Step 820, determining the corresponding participation notification.
Step 822, sending the participation notification.
Step 824, receiving a corresponding conference initiation result.
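The branching in steps 810 to 822 can be summarized in a small decision function. This is a sketch only: it assumes text recognition and the voiceprint check have already produced `text` and `voiceprint_ok`, and the action names, prompt wording and notification layout are hypothetical.

```python
def handle_response(text, voiceprint_ok, conference_password):
    """Decide what the second device does with one recognized utterance.

    Returns an (action, payload) pair:
      ("prompt", message)     - step 814, ask the user for the password
      ("end", None)           - step 818 failed, stop processing
      ("send", notification)  - steps 820-822, send participation notification
    """
    if conference_password.lower() not in text.lower():
        return ("prompt", "Please say the conference password.")
    if not voiceprint_ok:
        return ("end", None)
    notification = {"type": "participation", "password": conference_password}
    return ("send", notification)
```

After a "prompt" action the device would return to step 808 and wait for further voice data, so the function can simply be called again on the next utterance.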
According to the conference initiating method and device, the conference can be initiated based on the voice, then the conference notification is sent to the second devices of the conference participating users participating in the conference, and the conference participating users can participate in the conference based on the voice response, so that the users can be allowed to access based on the conference participation notification, the conference can be initiated and participated conveniently, and the processing efficiency is improved.
On the basis of the above embodiments, the embodiments of the present application further provide a data processing method, which can perform a conference based on the first sound box device, the second sound box device, and the server.
Referring to fig. 9, an interaction diagram of another conferencing system of an embodiment of the application is shown.
Step 902, the first sound box device performs voice recognition on the received first voice data, and determines a corresponding conference event.
Step 904, the first sound box device sends the conference event to the server.
Step 906, the server determines the participating users of the conference according to the conference event.
Step 908, the server sends a conference notification to the second sound box device when determining that the second sound box device of the participating user is online.
Step 910, the second sound box device outputs corresponding meeting information according to the meeting notification.
Step 912, the second sound box device receives second voice data and performs recognition processing.
Step 914A, when the conference password is not recognized from the second voice data, the second sound box device outputs third voice data, where the third voice data is used to guide the participating user to input the conference password.
Step 914B, when the conference password is recognized from the second voice data, the second sound box device sends the participation notification to the server. Sending the participation notification to the server in this case includes: performing text recognition on the second voice data to obtain second text information; performing voiceprint recognition on the voice data when the second text information contains the conference password; performing voiceprint verification according to the recognized voiceprint information; and, after the voiceprint check is passed, determining the corresponding participation notification and sending it.
Step 916, the server determines a corresponding conference initiating result according to the conference participating notification.
Step 918A, the server sends the conference initiating result to the first sound box device.
Step 918B, the server sends the conference initiating result to the second sound box device.
During the conference interaction, the server and each sound box device exchange the text information corresponding to the conference voice, and this text information can be further organized to automatically generate a conference record.
Therefore, the conference can be initiated and accessed based on the first sound box device, the second sound box device and the server side, the conference can be accessed through the conference password, and the security of the conference is improved.
On the basis of the foregoing embodiments, an embodiment of the present application further provides a data processing method, which can perform a conference based on the first device, the second device, and the server.
Referring to fig. 10, an interaction diagram of another conference system according to an embodiment of the present application is shown.
Step 1002, the first device performs voice recognition on the received first voice data, and determines a corresponding conference event.
Step 1004, the first device sends the conference event to the server.
Step 1006, the server determines the participating users participating in the conference according to the conference event.
Step 1008, the server sends a meeting notification to the second device of the meeting participating user when determining that the second device is online.
Step 1010, the second device outputs corresponding meeting information according to the meeting notification.
At step 1012, the second device receives second voice data.
Step 1014, the second device recognizes the second voice data and determines the participation notification.
Step 1016, the second device sends the participation notification to the server.
Step 1018, the server determines a corresponding conference initiation result according to the participation notification.
Step 1020A, the server sends the conference initiation result to the first device.
Step 1020B, the server sends the conference initiation result to the second device.
For the above example, the second device may perform the following steps:
Receiving a conference notification, where the conference notification is determined according to a conference event recognized from first voice data and is sent when the server determines that the sound box device corresponding to the participating user is online; outputting the corresponding conference information according to the conference notification; receiving second voice data; and sending the participation notification to the server when the conference password is recognized from the second voice data.
The steps further include: outputting third voice data when the conference password is not recognized from the second voice data, where the third voice data is used to guide the participating user to input the conference password.
In an optional embodiment, sending the participation notification to the server when the conference password is recognized from the second voice data includes: performing text recognition on the second voice data to obtain second text information; performing voiceprint recognition on the voice data when the second text information contains the conference password; performing voiceprint verification according to the recognized voiceprint information; and, after the voiceprint check is passed, determining the corresponding participation notification and sending it.
In another optional embodiment, the outputting the corresponding meeting information according to the meeting notification includes: and acquiring meeting information from the meeting notice, and outputting the meeting information in a voice mode.
In some other optional embodiments, the outputting the corresponding meeting information according to the meeting notification includes: determining conference parameters according to the conference notification, wherein the conference parameters comprise time parameters; and inquiring according to the conference parameters, and determining corresponding conference information and conference passwords. Wherein the conference parameters further comprise at least one of: location parameters, user parameters, content parameters.
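The parameter-based query can be illustrated with the sketch below, in which the time parameter is required and the location, user and content parameters narrow the result when present. The record fields (`time`, `location`, `users`, `topic`, `password`) and the in-memory list are assumptions for the example; the application does not define a storage format.

```python
def query_conference(conferences, time, location=None, user=None, content=None):
    """Return the stored conferences matching the conference parameters.

    Only `time` is mandatory; each optional parameter filters further.
    """
    matches = []
    for conf in conferences:
        if conf["time"] != time:
            continue
        if location is not None and conf["location"] != location:
            continue
        if user is not None and user not in conf["users"]:
            continue
        if content is not None and content.lower() not in conf["topic"].lower():
            continue
        matches.append(conf)
    return matches


stored = [
    {"time": 15, "location": "Beijing", "users": ["alice", "bob"],
     "topic": "Release planning", "password": "blue sky"},
    {"time": 15, "location": "Hangzhou", "users": ["carol"],
     "topic": "Budget review", "password": "green tea"},
]
```

When the time parameter alone matches more than one conference, as with the two sample records here, the additional parameters are what make the lookup unambiguous.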
During the conference interaction, the server and each device exchange the text information corresponding to the conference voice, and this text information can be further organized to automatically generate a conference record.
In an embodiment of the application, the second device may comprise a speaker device. The speaker device has a display component; the outputting the corresponding conference information comprises: and displaying the corresponding meeting information through the display component. The sound box device is provided with a display component and a camera so as to support the second device to carry out video conference.
On the basis of the above embodiments, the present application further provides a data processing method, which can perform conference interaction based on a voice device and a server, where the voice device may include various devices supporting voice input and output functions, such as a sound box device, an intercom, a television, a mobile phone, and a tablet computer.
Referring to fig. 11, an interaction diagram of an embodiment of a data processing method based on a speech device according to an embodiment of the present application is shown.
Step 1102, the first voice device performs voice recognition on the received first voice data, and determines a corresponding conference event.
Step 1104, the first voice device sends the conference event to the server.
Step 1106, the server determines the participating users of the conference according to the conference event.
Step 1108, the server sends a conference notification to the second voice device of the participating user under the condition that the second voice device is determined to be online.
Step 1110, the second voice device outputs corresponding conference information according to the conference notification.
Step 1112, the second voice device receives second voice data and performs recognition processing.
Step 1114, the second voice device sends the participation notification to the server.
When the conference password is not recognized from the second voice data, the second voice device outputs third voice data, where the third voice data is used to guide the participating user to input the conference password. When the conference password is recognized from the second voice data, the second voice device sends the participation notification to the server. Sending the participation notification to the server in this case includes: performing text recognition on the second voice data to obtain second text information; performing voiceprint recognition on the voice data when the second text information contains the conference password; performing voiceprint verification according to the recognized voiceprint information; and, after the voiceprint check is passed, determining the corresponding participation notification and sending it.
Step 1116, the server determines a corresponding conference initiating result according to the conference participating notification.
Step 1118A, the server sends the conference initiation result to the first voice device.
Step 1118B, the server sends the conference initiation result to the second voice device.
Wherein, the outputting the corresponding conference information and the conference password according to the conference notification comprises: acquiring conference parameters from the conference notification, and determining conference information and a conference password according to the conference parameters; and carrying out voice conversion on the conference information and the conference password, and outputting the conference information and the conference password in a voice output mode. The acquiring conference parameters from the conference notification, and determining conference information and a conference password according to the conference parameters includes: determining conference parameters according to the conference notification, wherein the conference parameters comprise time parameters; and inquiring according to the conference parameters, and determining corresponding conference information and conference passwords. The conference parameters further include at least one of: location parameters, user parameters, content parameters.
The performing voice recognition on the received first voice data and determining a corresponding conference event includes: performing text recognition on the first voice data to obtain first text information; and identifying a set word in the first text information, and determining a conference event according to the set word.
The recognizing the second voice data and determining the participation notice comprises the following steps: performing text recognition on the second voice data to obtain second text information; under the condition that the second text information contains a conference password, performing voiceprint recognition on the voice data; performing voiceprint verification according to the recognized voiceprint information; and after the voiceprint check is passed, determining the corresponding participant notification.
In some optional embodiments, in a case where the second text information does not contain a conference password, third voice data for guiding the participating user to enter the conference password is output.
Determining the participating users of the conference according to the conference event includes: determining conference parameters according to the conference event, where the conference parameters include a time parameter; and querying the corresponding conference information according to the conference parameters, and obtaining from the conference information the users participating in the conference and the second sound box devices of those users.
The server can also determine the connection state of the second sound box device based on the media access control address MAC of the second sound box device.
The second speaker device has a display component; the outputting of the corresponding conference information and the conference password comprises: and displaying the corresponding conference information and the conference password through the display component.
The first sound box device and the second sound box device are both provided with a display component and a camera so that a video conference can be carried out between the first sound box device and the second sound box device.
Further comprising: and under the condition that the second loudspeaker box equipment of the participating user is determined to be offline, generating a conference initiating result aiming at the conference failure of the participating user.
The voice device can also generate a conference record from the voice text information corresponding to the conference voice during the conference, wherein the voice text information comprises text information obtained by recognizing conference voice data received at the local end and/or text information obtained by recognizing conference voice data of other devices forwarded by the server, and the voice device comprises the first sound box device and/or the second sound box device; and upload the conference record to the server.
The voice device has a display section; the method further comprises the following steps: displaying the meeting record through the display component.
The steps in the embodiments of the present application are similar to the corresponding steps in the embodiments described above, and therefore are not described in detail.
On the basis of the above embodiments, the present application further provides another data processing method, which can perform conference interaction based on a voice device and a server, where the voice device may include various devices supporting voice input and output functions, such as a sound box device, an intercom, a television, a mobile phone, and a tablet computer. And the user can reserve to initiate the appointed conference in the process of initiating the conference, and the initiating process is executed after the corresponding conference condition is reached.
Referring to fig. 12, an interaction diagram of an embodiment of a processing method for conference reservation initiation according to the embodiment of the present application is shown.
Step 1202, the first voice device performs voice recognition on the received first voice data, and determines a corresponding conference event. The conference event carries a conference reservation instruction, which indicates that a conference is to be initiated by reservation. The conference reservation instruction can correspond to reservation parameters, conference conditions and the like, for example, initiating the conference at 12 o'clock, or initiating the conference when the number of online devices reaches a threshold value. The server can then carry out the conference initiation process based on the conference reservation instruction.
Step 1204, the first voice device sends the conference event to a server.
Step 1206, the server determines the users participating in the conference according to the conference event.
Step 1208, the server determines that the conference condition corresponding to the conference reservation instruction is met.
Step 1210, the server sends a conference notification to the second voice device of the participating user under the condition that the second voice device is determined to be online.
After receiving the conference event, the server can determine the participating users and their second voice devices based on the conference event. The server can also obtain the conference reservation instruction from the conference event and determine from it the conference condition, that is, the condition for initiating the conference, for example initiating the conference at a specified time, or initiating the conference when the number of online second voice devices reaches a threshold value. Once the server determines that the conference condition is met, it sends a conference notification to each participating user's second voice device that is online, thereby initiating the reserved conference.
Step 1212, the second voice device outputs corresponding conference information according to the conference notification.
Step 1214, the second voice device receives the second voice data and performs recognition processing.
In step 1216, the second voice device sends the participation notification to the server.
If no conference password is recognized from the second voice data, the second voice device outputs third voice data, which guides the participating user to input the conference password. If the conference password is recognized from the second voice data, the second voice device sends the participation notification to the server. Sending the participation notification to the server when the conference password is recognized from the second voice data comprises: performing text recognition on the second voice data to obtain second text information; performing voiceprint recognition on the second voice data when the second text information contains the conference password; performing voiceprint verification according to the recognized voiceprint information; and, after the voiceprint verification passes, determining the corresponding participation notification and sending it.
Step 1218, the server determines a corresponding conference initiating result according to the conference participating notification.
Step 1220A, the server sends the conference initiation result to the first voice device.
Step 1220B, the server sends the conference initiation result to the second voice device.
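The condition check of step 1208 can be illustrated with a minimal sketch. It assumes the reservation instruction carries at most two conditions, a start time and a minimum number of online devices; the names `ReservationCondition` and `condition_met` are hypothetical and do not come from the application.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class ReservationCondition:
    """Hypothetical container for the conditions of a reservation instruction."""
    start_time: Optional[datetime] = None       # initiate at a specified time
    min_online_devices: Optional[int] = None    # initiate once enough devices are online


def condition_met(cond: ReservationCondition, now: datetime, online_count: int) -> bool:
    """Return True when every condition that was set is satisfied."""
    if cond.start_time is not None and now < cond.start_time:
        return False
    if cond.min_online_devices is not None and online_count < cond.min_online_devices:
        return False
    return True
```

Under these assumptions the server would evaluate `condition_met` periodically, or whenever a device comes online, and proceed to step 1210 once it returns True.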
Wherein, the outputting the corresponding conference information and the conference password according to the conference notification comprises: acquiring conference parameters from the conference notification, and determining conference information and a conference password according to the conference parameters; and carrying out voice conversion on the conference information and the conference password, and outputting the conference information and the conference password in a voice output mode. The acquiring conference parameters from the conference notification, and determining conference information and a conference password according to the conference parameters includes: determining conference parameters according to the conference notification, wherein the conference parameters comprise time parameters; and inquiring according to the conference parameters, and determining corresponding conference information and conference passwords. The conference parameters further include at least one of: location parameters, user parameters, content parameters.
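As a rough illustration of the output step described above, the following sketch extracts conference parameters from a notification, looks up the conference information and password, and builds the sentence to be handed to a text-to-speech engine. The in-memory table `CONFERENCES`, the field names and the sample values are assumptions made for the example only; the actual query target and speech synthesis are not specified by the application.

```python
from typing import Dict, Tuple

# Hypothetical store mapping (time, location) parameters to conference details.
CONFERENCES: Dict[Tuple[str, str], Dict[str, str]] = {
    ("2019-07-30 14:00", "Room A"): {"name": "Weekly sync", "password": "4821"},
}


def resolve_notification(notification: Dict[str, str]) -> Tuple[Dict[str, str], str]:
    """Read the conference parameters and query conference info and password."""
    time_param = notification["time"]              # the time parameter is always present
    location = notification.get("location", "")    # optional location parameter
    record = CONFERENCES[(time_param, location)]
    info = {"name": record["name"], "time": time_param, "location": location}
    return info, record["password"]


def to_speech_text(info: Dict[str, str], password: str) -> str:
    """Build the sentence that a text-to-speech engine would read aloud."""
    return (
        f"Conference {info['name']} starts at {info['time']} in {info['location']}. "
        f"The conference password is {password}."
    )
```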
The performing voice recognition on the received first voice data and determining a corresponding conference event includes: performing text recognition on the first voice data to obtain first text information; and identifying a set word in the first text information, and determining a conference event according to the set word.
The recognizing the second voice data and determining the participation notice comprises the following steps: performing text recognition on the second voice data to obtain second text information; under the condition that the second text information contains a conference password, performing voiceprint recognition on the voice data; performing voiceprint verification according to the recognized voiceprint information; and after the voiceprint check is passed, determining the corresponding participant notification.
In some optional embodiments, in a case where the second text information does not contain a conference password, third voice data for guiding the participating user to enter the conference password is output.
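The handling of the second voice data described above can be sketched as follows. The speech recognizer and the voiceprint verifier are passed in as callables because the application does not name concrete engines; the `rejected` outcome for a failed voiceprint check is an assumption, since the text only describes the passing case.

```python
from typing import Callable, Dict, Optional


def handle_second_voice(
    voice_data: bytes,
    conference_password: str,
    transcribe: Callable[[bytes], str],
    verify_voiceprint: Callable[[bytes], Optional[str]],
) -> Dict[str, str]:
    """Return a participation notification, or a prompt asking for the password."""
    text = transcribe(voice_data)  # text recognition -> second text information
    if conference_password not in text:
        # No password heard: output third voice data that guides the user.
        return {"type": "prompt", "speech": "Please say the conference password."}
    user_id = verify_voiceprint(voice_data)  # voiceprint recognition and verification
    if user_id is None:
        # Assumed behaviour: a failed check yields no participation notification.
        return {"type": "rejected", "reason": "voiceprint check failed"}
    return {"type": "participation_notification", "user": user_id}
```

Checking the password before the voiceprint follows the order given in the text and avoids running voiceprint recognition on speech that is not a join attempt.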
The determining of the conference users participating in the conference according to the conference event comprises the following steps: determining conference parameters according to the conference event, wherein the conference parameters comprise time parameters; and querying the corresponding conference information according to the conference parameters, and obtaining from the conference information the users participating in the conference and the second sound box devices of those users.
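A minimal sketch of this lookup, assuming the conference information is indexed by the time parameter and that each participating user is bound to one device identifier. The table `PARTICIPANTS_BY_TIME` and its contents are hypothetical.

```python
from typing import Dict, List, Tuple

# Hypothetical conference information: start time -> {user: device identifier}.
PARTICIPANTS_BY_TIME: Dict[str, Dict[str, str]] = {
    "2019-07-30 14:00": {
        "alice": "AA:BB:CC:00:00:01",
        "bob": "AA:BB:CC:00:00:02",
    },
}


def find_participants(event: Dict[str, str]) -> List[Tuple[str, str]]:
    """Map a conference event to (user, device) pairs via its time parameter."""
    participants = PARTICIPANTS_BY_TIME.get(event["time"])
    if participants is None:
        return []  # no conference information matches the parameters
    return sorted(participants.items())
```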
The server can also determine the connection state of the second sound box device based on the media access control address MAC of the second sound box device.
The second speaker device has a display component; the outputting of the corresponding conference information and the conference password comprises: and displaying the corresponding conference information and the conference password through the display component.
The first sound box device and the second sound box device are both provided with a display component and a camera so that a video conference can be carried out between the first sound box device and the second sound box device.
Further comprising: and under the condition that the second loudspeaker box equipment of the participating user is determined to be offline, generating a conference initiating result aiming at the conference failure of the participating user.
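The MAC-based connection check and the offline failure result described above could look like the sketch below. The application only states that the connection state is determined from the MAC address; the heartbeat mechanism, the timeout and the names `DeviceRegistry` and `notify_or_fail` are assumptions.

```python
import time
from typing import Dict, Optional


class DeviceRegistry:
    """Tracks the last heartbeat per MAC address (the heartbeat is an assumption)."""

    def __init__(self, timeout_s: float = 30.0) -> None:
        self.timeout_s = timeout_s
        self._last_seen: Dict[str, float] = {}

    def heartbeat(self, mac: str, now: Optional[float] = None) -> None:
        """Record that the device with this MAC address was seen."""
        self._last_seen[mac.lower()] = time.time() if now is None else now

    def is_online(self, mac: str, now: Optional[float] = None) -> bool:
        """A device counts as online if it was seen within the timeout."""
        now = time.time() if now is None else now
        seen = self._last_seen.get(mac.lower())
        return seen is not None and now - seen <= self.timeout_s


def notify_or_fail(registry: DeviceRegistry, user: str, mac: str, now: float) -> Dict[str, str]:
    """Send a conference notification if online, else produce a failure result."""
    if registry.is_online(mac, now):
        return {"user": user, "action": "conference_notification"}
    return {"user": user, "action": "initiation_result", "status": "failed_offline"}
```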
The voice device can also generate a conference record from the voice text information corresponding to the conference voice during the conference, wherein the voice text information comprises text information obtained by recognizing conference voice data received at the local end and/or text information obtained by recognizing conference voice data of other devices forwarded by the server, and the voice device comprises the first sound box device and/or the second sound box device; and upload the conference record to the server.
The voice device has a display section; the method further comprises the following steps: displaying the meeting record through the display component.
The steps in the embodiments of the present application are similar to the corresponding steps in the embodiments described above, and therefore are not described in detail.
According to the embodiment of the application, the adopted software and hardware equipment support the encryption function, so that the communication data can be encrypted in the whole process of initiating the conference, starting the conference and carrying out the conference until finishing the conference, and the encrypted communication data can be transmitted after encryption processing of conference events, conference notifications, participant notifications, conference initiating results, voice data in the conference process, voice recognition text data and the like, so that the security of the conference can be improved, and the method and the device are suitable for various conference scenes. The embodiment of the present application does not limit the encryption manner of software and hardware.
In the conference scenario provided by the embodiments of the application, the user only needs to speak conference-related content, such as initiating or joining a conference, to initiate or access the conference accordingly, which simplifies user operation, improves conference efficiency and provides a better user experience. The conference scenario can also be extended to other scenarios in which one voice device serves different users. In a family scenario, for example, family members in different regions can hold remote voice chats through voice devices such as sound box devices. Because one voice device serves different users, group chat through hardware can be realized with voice devices such as sound box devices, which makes interaction convenient. In addition, the user can set a schedule on a device such as a mobile phone and be reminded automatically, based on the schedule, to initiate a conference or other group chat.
Because the sound box device is wireless, the wiring design of conference equipment can be omitted, which saves cost; multiple users can also join a conference through one sound box device, which makes access convenient for team users. Signal input can be recognized based on the far-field sound pickup function of the sound box device, which improves efficiency.
In the embodiments of the application, the sound box device may include a display, through which the conference information of the conference can be shown, for example the attendees and the conference name, as well as presentations and files played during the conference. Conference-related information can thus be obtained conveniently from the display of the sound box device to assist the conference. In some other embodiments, the sound box device may include both a display and a camera, so that a video conference can be held based on the sound box device: video is captured through the camera, and video can also be played through the sound box device.
It should be noted that, for simplicity of description, the method embodiments are described as a series of acts or a combination of acts, but those skilled in the art will recognize that the embodiments are not limited by the order of acts described, as some steps may occur in other orders or concurrently depending on the embodiments. Further, those skilled in the art will also appreciate that the embodiments described in the specification are preferred embodiments, and that the acts involved are not necessarily required by the embodiments of the application.
On the basis of the above embodiments, the present embodiment further provides a data processing apparatus, which is applied to electronic devices such as a server.
Referring to fig. 13, a block diagram of a data processing apparatus according to an embodiment of the present application is shown, which may specifically include the following modules:
a conference determining module 1302, configured to receive a conference event from the first sound box device, where the conference event is determined by recognizing first voice data; and to determine the users participating in the conference according to the conference event.
A notification module 1304, configured to send a conference notification to the second sound box device when it is determined that the second sound box device of the participant is online, so that the second sound box device outputs corresponding conference information and a conference password according to the conference notification.
A result determining module 1306, configured to receive a conference participation notification returned by the second sound box device, where the conference participation notification is determined according to second voice data recognition; and determining a corresponding conference initiating result according to the conference participating notification.
A feedback module 1308, configured to send the conference initiation result to the first sound box device and the second sound box device.
The conference determining module 1302 is configured to determine conference parameters according to the conference event, where the conference parameters include a time parameter; and to query the corresponding conference information according to the conference parameters, and obtain from the conference information the users participating in the conference and the second sound box devices of those users.
The notification module 1304 is further configured to determine a connection state of the second sound box device based on the media access control (MAC) address of the second sound box device, and to generate a conference initiation result indicating conference failure for a participating user when the second sound box device of that user is determined to be offline.
The first speaker device and the second speaker device both have a display component and a camera, and the feedback module 1308 is further configured to forward video data of the video conference between the first speaker device and the second speaker device.
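The work of the result determining module 1306 and the feedback module 1308 can be pictured with the following sketch. It assumes the conference initiation result simply lists which invited users returned a participation notification, and that the result is fed back to the first device and to the second devices that joined; both choices, and the function names, are illustrative assumptions.

```python
from typing import Dict, Iterable, List


def determine_result(invited: Iterable[str], joined: Iterable[str]) -> Dict[str, List[str]]:
    """Split invited users into those who joined and those who did not."""
    joined_set = set(joined)
    invited_list = list(invited)
    return {
        "joined": [u for u in invited_list if u in joined_set],
        "missing": [u for u in invited_list if u not in joined_set],
    }


def feedback_targets(
    first_device: str,
    second_devices: Dict[str, str],
    result: Dict[str, List[str]],
) -> List[str]:
    """List the devices that should receive the conference initiation result."""
    return [first_device] + [second_devices[u] for u in result["joined"]]
```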
On the basis of the above embodiments, the present embodiment further provides a data processing apparatus, which is applied to a second device such as a sound box device.
Referring to fig. 14, a block diagram of another data processing apparatus according to another embodiment of the present application is shown, which may specifically include the following modules:
a conference notification module 1402, configured to receive a conference notification, where the conference notification is determined according to a conference event identified by the first voice data, and the conference notification is sent by the server when it is determined that the speaker device corresponding to the participant user is online; and outputting corresponding conference information and a conference password according to the conference notification.
A participant processing module 1404 configured to receive the second voice data, recognize the second voice data, and determine a participant notification.
The participation feedback module 1406 is configured to send the participation notification, so that the server determines a corresponding conference initiation result according to the participation notification.
The conference notification module 1402 is configured to obtain conference parameters from the conference notification, and determine conference information and a conference password according to the conference parameters; and carrying out voice conversion on the conference information and the conference password, and outputting the conference information and the conference password in a voice output mode. Further, the meeting notification module 1402 is configured to determine meeting parameters according to the meeting notification, where the meeting parameters include a time parameter; and inquiring according to the conference parameters, and determining corresponding conference information and conference passwords. The conference parameters further include at least one of: location parameters, user parameters, content parameters.
The participant processing module 1404 is configured to perform text recognition on the second voice data to obtain text information; to perform voiceprint recognition on the second voice data when the text information contains the conference password; to perform voiceprint verification according to the recognized voiceprint information; and, after the voiceprint verification passes, to determine the corresponding participation notification.
The participant processing module 1404 is further configured to output third voice data when the text information does not contain the conference password, where the third voice data is used to guide the participating user to input the conference password.
The speaker device has a display component; the meeting notification module 1402 is configured to display the corresponding meeting information and the meeting password through the display component.
The sound box device is provided with a display component and a camera so as to carry out video conference through the sound box device.
The participation feedback module 1406 is further configured to generate a conference record from the voice text information corresponding to the conference voice during the conference, where the voice text information includes text information obtained by recognizing conference voice data received at the local end and/or text information obtained by recognizing conference voice data of other devices forwarded by the server; and to upload the conference record to the server.
The participation feedback module 1406 is further configured to display the conference record through the display component.
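The generation of a conference record from locally recognized text and from text recognized for other devices can be sketched as a merge ordered by time. The `Utterance` fields, in particular the timestamp, are assumptions; the application does not define the record format.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class Utterance:
    timestamp: float  # seconds since the conference started (assumed field)
    speaker: str
    text: str


def build_record(local: Iterable[Utterance], remote: Iterable[Utterance]) -> List[str]:
    """Merge local and remote recognized text into one time-ordered record."""
    merged = sorted(list(local) + list(remote), key=lambda u: u.timestamp)
    return [f"[{u.timestamp:.1f}] {u.speaker}: {u.text}" for u in merged]
```

The resulting list of lines could then be uploaded to the server or shown on the display component.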
On the basis of the above embodiments, the present embodiment further provides a data processing apparatus, which is applied to a first device such as a sound box device.
Referring to fig. 15, a block diagram of a structure of another data processing apparatus embodiment of the present application is shown, which may specifically include the following modules:
a conference initiating module 1502 for receiving voice data; and carrying out voice recognition on the received voice data, and determining a corresponding conference event.
A sending module 1504, configured to send the conference event, so that the server sends a conference notification to the online sound box device of the participating user according to the conference event, and the online sound box device of the participating user outputs corresponding conference information and a conference password according to the conference notification.
The result determination module 1506 is configured to receive a conference initiation result, where the conference initiation result is determined according to a participation notification returned by the online sound box device of the participating user.
The conference initiating module 1502 is configured to perform text recognition on the voice data to obtain text information; and identifying a set word in the text information, and determining a conference event according to the set word.
The conference initiating module 1502 is further configured to extract voiceprint information from the voice data, and perform voiceprint verification according to the voiceprint information; and under the condition that the voiceprint check is passed, executing the step of determining the conference event according to the setting words.
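A minimal sketch of how the conference initiating module 1502 might combine the voiceprint check with set-word detection. It assumes voiceprints are fixed-length feature vectors compared by cosine similarity against an enrolled vector, and that set words are plain phrases mapped to event types; the phrases, the threshold and the function names are all hypothetical.

```python
import math
from typing import Dict, Optional, Sequence

# Hypothetical set words mapped to conference event types.
SET_WORDS: Dict[str, str] = {
    "start a meeting": "initiate_conference",
    "book a meeting": "reserve_conference",
}


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two vectors; 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def detect_event(
    text: str,
    voiceprint: Sequence[float],
    enrolled: Sequence[float],
    threshold: float = 0.8,
) -> Optional[str]:
    """Return an event type only if the voiceprint passes and a set word is found."""
    if cosine(voiceprint, enrolled) < threshold:
        return None  # voiceprint check failed, so no conference event is determined
    lowered = text.lower()
    for phrase, event in SET_WORDS.items():
        if phrase in lowered:
            return event
    return None
```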
In an embodiment of the application, the first device comprises a speaker device. The speaker device has a display component; the result determination module 1506 is further configured to display the corresponding meeting information and the meeting password via the display component.
The sound box device is provided with a display component and a camera so as to carry out video conference through the sound box device.
The sending module 1504 is further configured to generate a conference record according to voice text information corresponding to conference voice in a conference process, where the voice text information includes text information obtained by recognizing conference voice data received by a local terminal and/or text information obtained by recognizing conference voice data transmitted by other devices at a server terminal; and uploading the conference record to a server.
The sending module 1504 is applied to a sound box device with a display component, and is further used for displaying the conference record through the display component.
On the basis of the above embodiments, the embodiment of the present application further provides a data processing system, which may be a conference system or the like, and which includes a first device, a second device, and a server. The first device comprises a first sound box device, the second device comprises a second sound box device, and the server may be any device, apparatus or software module capable of providing services, such as a single server, a server cluster, a cloud device and/or a virtual machine.
In an optional embodiment, the first device is configured to perform voice recognition on the received first voice data, and determine a corresponding conference event; and sending the conference event to a server.
The server is used for determining a conference user participating in a conference according to the conference event and sending a conference notification to a second sound box device of the conference user under the condition that the second sound box device is determined to be on line; and determining a corresponding conference initiating result according to the conference participating notification, and sending the conference initiating result to the first sound box device and the second sound box device.
The second device is used for outputting corresponding conference information and a conference password according to the conference notification; and receiving second voice data, recognizing the second voice data, determining a participation notice, and sending the participation notice to a server.
The second device is used for acquiring conference parameters from the conference notification and determining conference information and a conference password according to the conference parameters; and carrying out voice conversion on the conference information and the conference password, and outputting the conference information and the conference password in a voice output mode. Further, the second device determines a conference parameter according to the conference notification, wherein the conference parameter includes a time parameter; and inquiring according to the conference parameters, and determining corresponding conference information and conference passwords. The conference parameters further include at least one of: location parameters, user parameters, content parameters.
The first device performs text recognition on the first voice data to obtain first text information, identifies a set word in the first text information, and determines a conference event according to the set word. The second device performs text recognition on the second voice data to obtain second text information; performs voiceprint recognition on the second voice data when the second text information contains the conference password; performs voiceprint verification according to the recognized voiceprint information; and, after the voiceprint verification passes, determines the corresponding participation notification. When the second text information does not contain the conference password, the second device outputs third voice data, which guides the participating user to input the conference password.
The server determines conference parameters according to the conference event, wherein the conference parameters comprise time parameters; and queries the corresponding conference information according to the conference parameters, and obtains from the conference information the users participating in the conference and the second sound box devices of those users.
And the server side also determines the connection state of the second sound box device based on the media access control address MAC of the second sound box device. And under the condition that the second loudspeaker box equipment of the participating user is determined to be offline, generating a conference initiating result aiming at the conference failure of the participating user.
The second speaker device has a display component; the second speaker device may display the corresponding conference information and the conference password through the display part.
The first sound box device and the second sound box device are both provided with a display component and a camera so that a video conference can be carried out between the first sound box device and the second sound box device.
The first sound box equipment generates a conference record according to voice text information corresponding to conference voice in a conference process, wherein the voice text information comprises text information obtained by recognizing conference voice data received by a local terminal and/or text information obtained by recognizing conference voice data transmitted by a second sound box equipment terminal through a server terminal; and uploading the conference record to a server. The first speaker device has a display component; the first sound box device further displays the conference record through the display component.
The second sound box device generates a conference record from the voice text information corresponding to the conference voice during the conference, wherein the voice text information comprises text information obtained by recognizing conference voice data received at the local end and/or text information obtained by recognizing conference voice data of other sound box devices forwarded by the server; and uploads the conference record to the server. The second sound box device has a display component, and also displays the conference record through the display component. The other sound box devices include the first sound box device and other second sound box devices.
In another alternative embodiment:
the first sound box equipment is used for carrying out voice recognition on the received first voice data and determining a corresponding conference event; and sending the conference event to a server.
The server determines a conference user participating in a conference according to the conference event, and sends a conference notification to a second sound box device of the conference user under the condition that the second sound box device is determined to be on line; and determining a corresponding conference initiating result according to the conference participating notification, and sending the conference initiating result to the first sound box device and the second sound box device.
The second sound box equipment outputs corresponding conference information according to the conference notification; and receiving second voice data, and sending the participation notification to a server side under the condition that the conference password is recognized by the second voice data.
The second sound box device performs text recognition on the second voice data to obtain second text information; performs voiceprint recognition on the second voice data when the second text information contains the conference password; performs voiceprint verification according to the recognized voiceprint information; and, after the voiceprint verification passes, determines the corresponding participation notification and sends it.
And the second sound box equipment also outputs third voice data under the condition that the conference password is not recognized by the second voice data, wherein the third voice data is used for guiding the conference user to input the conference password.
The first sound box equipment generates a conference record according to voice text information corresponding to conference voice in a conference process, wherein the voice text information comprises text information obtained by recognizing conference voice data received by a local terminal and/or text information obtained by recognizing conference voice data transmitted by a second sound box equipment terminal through a server terminal; and uploading the conference record to a server. The first speaker device has a display component; the first sound box device further displays the conference record through the display component.
The second sound box device generates a conference record from the voice text information corresponding to the conference voice during the conference, wherein the voice text information comprises text information obtained by recognizing conference voice data received at the local end and/or text information obtained by recognizing conference voice data of other sound box devices forwarded by the server; and uploads the conference record to the server. The second sound box device has a display component, and also displays the conference record through the display component. The other sound box devices include the first sound box device and other second sound box devices.
In other alternative embodiments:
The first device performs voice recognition on the received first voice data, determines a corresponding conference event, and sends the conference event to a server.
The server determines the participating users of the conference according to the conference event, and sends a conference notification to the second device of a participating user when that second device is determined to be online; it then determines a corresponding conference initiation result according to the participation notification, and sends the conference initiation result to the first device and the second device.
The second device outputs corresponding conference information according to the conference notification, receives second voice data, recognizes the second voice data, determines a participation notification, and sends the participation notification to the server.
In the foregoing embodiments, the functions of the server, the first device, and the second device may also be implemented in a data processing apparatus; for details, refer to the discussion of the foregoing embodiments.
On the basis of the above embodiments, an embodiment of the present application further provides a data processing apparatus, which is applied to a second device, such as a sound box device, and which provides convenient conference access by outputting the conference notification by voice and joining the conference by voice.
Referring to fig. 16, a block diagram of a further data processing apparatus according to another embodiment of the present application is shown, which may specifically include the following modules:
a notification obtaining module 1602, configured to receive a conference notification, where the conference notification is determined according to the conference event recognized from the first voice data, and is sent by the server when the sound box device corresponding to the participating user is determined to be online.
The notification output module 1604 is configured to output corresponding meeting information according to the meeting notification.
A voice participating module 1606, configured to receive the second voice data, and to send the participation notification to the server when the conference password is recognized in the second voice data.
The voice participating module 1606 is configured to output third voice data when the conference password is not recognized in the second voice data, where the third voice data is used for guiding the participating user to input the conference password.
The voice participating module 1606 is configured to perform text recognition on the second voice data to obtain second text information; when the second text information contains the conference password, to perform voiceprint recognition on the second voice data; to perform voiceprint verification according to the recognized voiceprint information; and, after the voiceprint verification passes, to determine the corresponding participation notification and send the participation notification.
In an example, the notification output module 1604 is configured to obtain meeting information from the meeting notification, and output the meeting information in a voice manner.
In another example, the notification output module 1604 is configured to determine meeting parameters according to the meeting notification, where the meeting parameters include a time parameter; and inquiring according to the conference parameters, and determining corresponding conference information and conference passwords. The conference parameters further include at least one of: location parameters, user parameters, content parameters.
The speaker device has a display component; the notification output module 1604 is configured to display the corresponding meeting information through the display component.
The speaker device has a display part and a camera to perform a video conference.
The voice participating module 1606 is further configured to generate a conference record according to voice text information corresponding to conference voice during the conference, where the voice text information includes text information obtained by recognizing conference voice data received locally and/or text information obtained by recognizing conference voice data of other devices forwarded by the server; and to upload the conference record to the server.
The method is applied to a voice device with a display component; the voice participating module 1606 is further configured to display the conference record through the display component.
On the basis of the above embodiments, an embodiment of the present application further provides a data processing system, which may be a conference system, a group chat system, or the like, and which includes a first voice device, a second voice device, and a server. The voice devices include various devices that support voice input and output, such as sound box devices, interphones, televisions, mobile phones and tablet computers; the server may include various devices or software modules that provide services, such as a single server, a server cluster, a cloud device and/or a virtual machine.
The first voice equipment performs voice recognition on the received first voice data and determines a corresponding conference event; and sending the conference event to a server.
The server side determines a conference user participating in a conference according to the conference event, and sends a conference notification to a second voice device of the conference user under the condition that the second voice device is determined to be on line; and determining a corresponding conference initiating result according to the conference participating notification, and sending the conference initiating result to the first voice equipment and the second voice equipment.
The second voice equipment outputs corresponding conference information and conference passwords according to the conference notification; and receiving second voice data, recognizing the second voice data, determining a participation notice, and sending the participation notice to a server.
The second voice equipment is used for acquiring conference parameters from the conference notification and determining conference information and a conference password according to the conference parameters; and carrying out voice conversion on the conference information and the conference password, and outputting the conference information and the conference password in a voice output mode. Further, the second device determines a conference parameter according to the conference notification, wherein the conference parameter includes a time parameter; and inquiring according to the conference parameters, and determining corresponding conference information and conference passwords. The conference parameters further include at least one of: location parameters, user parameters, content parameters.
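The parameter-based lookup described above can be illustrated with a small sketch. The record layout (keys such as `time`, `location`, `info`, `password`) and the function name `query_conference` are assumptions made for illustration; the application does not specify how conference information is stored or queried.

```python
from typing import Dict, List, Optional


def query_conference(meetings: List[Dict[str, str]], time: str,
                     **optional: str) -> Optional[Dict[str, str]]:
    """Return the first meeting matching the required time parameter and
    any optional parameters (e.g. location, user, content)."""
    for meeting in meetings:
        # The time parameter is always required by the description.
        if meeting.get("time") != time:
            continue
        # Optional parameters narrow the match when they are supplied.
        if all(meeting.get(key) == value for key, value in optional.items()):
            return meeting
    return None


# Hypothetical stored conference information.
MEETINGS = [
    {"time": "2019-07-30T10:00", "location": "Room A",
     "info": "Design review", "password": "4321"},
    {"time": "2019-07-30T10:00", "location": "Room B",
     "info": "Standup", "password": "8765"},
]
```

Here the time parameter alone may match several conferences, which is why the optional location, user or content parameters are useful for disambiguation.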
The first voice device performs text recognition on the first voice data to obtain first text information, identifies a set word in the first text information, and determines a conference event according to the set word. The second voice device performs text recognition on the second voice data to obtain second text information; when the second text information contains the conference password, it performs voiceprint recognition on the second voice data and performs voiceprint verification according to the recognized voiceprint information; after the voiceprint verification passes, it determines the corresponding participation notification. When the second text information does not contain the conference password, it outputs third voice data, where the third voice data is used for guiding the participating user to input the conference password.
The server determines conference parameters according to the conference event, where the conference parameters include a time parameter; it queries the corresponding conference information according to the conference parameters, and obtains from the conference information the participating users of the conference and the second sound box devices of those users.
The server also determines the connection state of the second sound box device based on the media access control (MAC) address of the second sound box device. When the second sound box device of a participating user is determined to be offline, the server generates a conference initiation result indicating that the conference failed for that participating user.
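The set-word step on the first voice device can be sketched as a simple phrase lookup over the recognized text. The phrases and event names in `SET_WORDS` are hypothetical examples; the application does not list the actual set words.

```python
from typing import Dict, Optional

# Hypothetical set words mapped to conference event types.
SET_WORDS: Dict[str, str] = {
    "start a meeting": "initiate_conference",
    "schedule a meeting": "reserve_conference",
    "end the meeting": "end_conference",
}


def determine_conference_event(first_text: str) -> Optional[Dict[str, str]]:
    """Return a conference event if the text contains a set word, else None."""
    normalized = first_text.lower()
    for phrase, event_type in SET_WORDS.items():
        if phrase in normalized:
            return {"event": event_type, "utterance": first_text}
    return None
```

Utterances without any set word yield no conference event, so ordinary speech is not sent to the server as a conference request.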
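The online check on the server can be sketched as follows, assuming the server keeps a set of MAC addresses of currently connected devices. The function name `dispatch_notifications`, the dictionary layout and the result fields are illustrative assumptions, not details from the application.

```python
from typing import Dict, List, Set, Tuple


def dispatch_notifications(
    participants: Dict[str, str], online_macs: Set[str]
) -> Tuple[List[str], List[Dict[str, str]]]:
    """Split participants into those to notify and those whose device is offline.

    participants maps a user name to the MAC address of that user's device;
    online_macs holds lower-case MAC addresses of connected devices.
    """
    notified: List[str] = []
    failures: List[Dict[str, str]] = []
    for user, mac in participants.items():
        if mac.lower() in online_macs:
            # Device is online: a conference notification would be sent here.
            notified.append(user)
        else:
            # Device is offline: record a failed initiation result for this user.
            failures.append({"user": user, "result": "conference_failed",
                             "reason": "device offline"})
    return notified, failures
```

The failure entries correspond to the per-user conference initiation result that the server returns when a participating user's device cannot be reached.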
The second voice device has a display section; the second voice device may display the corresponding conference information and the conference password through the display part.
The first voice device and the second voice device are both provided with a display component and a camera so as to carry out video conference between the first voice device and the second voice device.
During a conference, the first voice device generates a conference record according to the voice text information corresponding to the conference voice, where the voice text information includes text information obtained by recognizing conference voice data received locally and/or text information obtained by recognizing conference voice data transmitted by the second voice device; and uploads the conference record to the server. The first voice device has a display component; the first voice device also displays the conference record through the display component.
During a conference, the second voice device generates a conference record according to the voice text information corresponding to the conference voice, where the voice text information includes text information obtained by recognizing conference voice data received locally and/or text information obtained by recognizing conference voice data of other voice devices forwarded by the server; and uploads the conference record to the server. The second voice device has a display component; the second voice device also displays the conference record through the display component. The other voice devices include the first voice device and other second voice devices.
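Generating a conference record from locally recognized text and text recognized from remote voice data can be sketched as a merge ordered by time. The `Utterance` type, the timestamp field and the output line format are assumptions for illustration only.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class Utterance:
    """Hypothetical unit of recognized conference speech."""
    timestamp: float  # seconds since the conference started
    speaker: str
    text: str


def build_conference_record(local: Iterable[Utterance],
                            remote: Iterable[Utterance]) -> List[str]:
    """Merge local and remote recognized text into one time-ordered record."""
    merged = sorted([*local, *remote], key=lambda u: u.timestamp)
    return [f"[{u.timestamp:.1f}s] {u.speaker}: {u.text}" for u in merged]
```

The resulting list of lines is what a device could upload to the server or show on its display component.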
On the basis of the above embodiments, an embodiment of the present application further provides a data processing system, which may be a conference system, a group chat system, or the like, and which includes a first voice device, a second voice device, and a server. The voice devices include various devices that support voice input and output, such as sound box devices, interphones, televisions, mobile phones and tablet computers; the server may include various devices or software modules that provide services, such as a single server, a server cluster, a cloud device and/or a virtual machine. When initiating a conference, the user can also reserve a designated conference in advance, and the initiation process is executed once the corresponding conference condition is met.
The first voice equipment is used for carrying out voice recognition on the received first voice data and determining a corresponding conference event; sending the conference event to a server, wherein the conference event comprises a conference reservation instruction;
the server side determines the conference users participating in the conference according to the conference event, determines that the conference conditions corresponding to the conference reservation instruction are met, and sends a conference notice to the second voice equipment under the condition that the second voice equipment of the conference users is determined to be on-line; determining a corresponding conference initiating result according to the conference participating notification, and sending the conference initiating result to the first voice equipment and the second voice equipment;
the second voice equipment outputs corresponding conference information and conference passwords according to the conference notification; and receiving second voice data, recognizing the second voice data, determining a participation notice, and sending the participation notice to a server.
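The reservation check on the server can be sketched under the assumption that the conference condition is a scheduled start time, optionally with a lead time for sending notifications early. The application does not define the condition, so `reservation_due` and its parameters are hypothetical.

```python
from datetime import datetime, timedelta


def reservation_due(scheduled_start: datetime, now: datetime,
                    lead_time: timedelta = timedelta(0)) -> bool:
    """Return True once the reserved conference should be initiated.

    The condition is assumed to be purely time based: notifications go out
    when the current time reaches the scheduled start minus the lead time.
    """
    return now >= scheduled_start - lead_time
```

A server loop could poll this check for each pending reservation and, once it returns True, proceed with the online check and conference notification described above.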
The second voice equipment is used for acquiring conference parameters from the conference notification and determining conference information and a conference password according to the conference parameters; and carrying out voice conversion on the conference information and the conference password, and outputting the conference information and the conference password in a voice output mode. Further, the second device determines a conference parameter according to the conference notification, wherein the conference parameter includes a time parameter; and inquiring according to the conference parameters, and determining corresponding conference information and conference passwords. The conference parameters further include at least one of: location parameters, user parameters, content parameters.
The first voice device performs text recognition on the first voice data to obtain first text information, identifies a set word in the first text information, and determines a conference event according to the set word. The second voice device performs text recognition on the second voice data to obtain second text information; when the second text information contains the conference password, it performs voiceprint recognition on the second voice data and performs voiceprint verification according to the recognized voiceprint information; after the voiceprint verification passes, it determines the corresponding participation notification. When the second text information does not contain the conference password, it outputs third voice data, where the third voice data is used for guiding the participating user to input the conference password.
The server determines conference parameters according to the conference event, where the conference parameters include a time parameter; it queries the corresponding conference information according to the conference parameters, and obtains from the conference information the participating users of the conference and the second sound box devices of those users.
The server also determines the connection state of the second sound box device based on the media access control (MAC) address of the second sound box device. When the second sound box device of a participating user is determined to be offline, the server generates a conference initiation result indicating that the conference failed for that participating user.
The second voice device has a display section; the second voice device may display the corresponding conference information and the conference password through the display part.
The first voice device and the second voice device are both provided with a display component and a camera so as to carry out video conference between the first voice device and the second voice device.
During a conference, a voice device generates a conference record according to the voice text information corresponding to the conference voice, where the voice text information includes text information obtained by recognizing conference voice data received locally and/or text information obtained by recognizing conference voice data transmitted by other devices through the server, and the voice device includes the first voice device and/or the second voice device; and uploads the conference record to the server. Where the voice device has a display component, it further displays the conference record through the display component.
During a conference, the first voice device generates a conference record according to the voice text information corresponding to the conference voice, where the voice text information includes text information obtained by recognizing conference voice data received locally and/or text information obtained by recognizing conference voice data transmitted by the second voice device; and uploads the conference record to the server. The first voice device has a display component; the first voice device also displays the conference record through the display component.
During a conference, the second voice device generates a conference record according to the voice text information corresponding to the conference voice, where the voice text information includes text information obtained by recognizing conference voice data received locally and/or text information obtained by recognizing conference voice data of other voice devices forwarded by the server; and uploads the conference record to the server. The second voice device has a display component; the second voice device also displays the conference record through the display component. The other voice devices include the first voice device and other second voice devices.
In the embodiments of the present application, the software and hardware used support encryption, so communication data can be encrypted throughout the whole process, from initiating the conference, through starting and holding the conference, to ending the conference. Conference events, conference notifications, participation notifications, conference initiation results, voice data produced during the conference, voice recognition text data and the like can be encrypted before transmission, which improves the security of the conference and makes the approach suitable for various conference scenarios. The embodiments of the present application do not limit the encryption method used by the software and hardware.
In the conference scenario provided by the embodiments of the present application, the user only needs to speak the relevant conference content, such as initiating or joining a conference, to initiate or access the conference accordingly, which simplifies user operation, improves conference efficiency and provides a better user experience. In addition, the conference scenario can be extended to other scenarios in which one voice device serves different users. For example, in a family scenario, family members in different regions can hold remote voice chats through voice devices such as sound box devices. Because one voice device serves different users, hardware-based group chat among users can be realized with voice devices such as sound box devices, making interaction convenient. In addition, the user can set a schedule on a device such as a mobile phone, and be automatically reminded, based on the schedule, to initiate a group chat such as a conference.
According to the embodiments of the present application, a conference can be initiated by voice, the devices of the participating users are then called, and each user can join the conference by voice response; a user is allowed to access the conference after the conference password of the participation notification is verified, so the conference can be initiated and joined conveniently and processing efficiency is improved.
Because the sound box device is wireless, the wiring design of conference equipment can be omitted, which saves cost; several users can also join a conference through a single sound box device, which makes access convenient for team users. In addition, input signals can be recognized based on the far-field sound pickup function of the sound box device, which improves efficiency.
The present application further provides a non-transitory readable storage medium in which one or more modules (programs) are stored; when the one or more modules are applied to a device, the device can execute the instructions of the method steps in this application.
Embodiments of the present application provide one or more machine-readable media having instructions stored thereon, which when executed by one or more processors, cause an electronic device to perform the methods as described in one or more of the above embodiments. In the embodiment of the present application, the electronic device includes various types of devices such as a terminal device and a server (cluster).
Embodiments of the present disclosure may be implemented as an apparatus, which may include electronic devices such as a terminal device, a server (cluster), etc., using any suitable hardware, firmware, software, or any combination thereof, to perform a desired configuration. Fig. 17 schematically illustrates an example apparatus 1700 that may be used to implement various embodiments described herein.
For one embodiment, fig. 17 illustrates an example apparatus 1700 having one or more processors 1702, a control module (chipset) 1704 coupled to at least one of the processor(s) 1702, a memory 1706 coupled to the control module 1704, a non-volatile memory (NVM)/storage 1708 coupled to the control module 1704, one or more input/output devices 1710 coupled to the control module 1704, and a network interface 1712 coupled to the control module 1704.
The processor 1702 may include one or more single-core or multi-core processors, and the processor 1702 may include any combination of general-purpose processors or special-purpose processors (e.g., graphics processors, application processors, baseband processors, etc.). In some embodiments, the apparatus 1700 can be used as a terminal device, a server (cluster), or the like in this embodiment.
In some embodiments, the apparatus 1700 may include one or more computer-readable media (e.g., the memory 1706 or the NVM/storage 1708) having instructions 1714 and one or more processors 1702 in combination with the one or more computer-readable media and configured to execute the instructions 1714 to implement modules to perform the actions described in this disclosure.
For one embodiment, the control module 1704 may include any suitable interface controller to provide any suitable interface to at least one of the processor(s) 1702 and/or any suitable device or component in communication with the control module 1704.
The control module 1704 may include a memory controller module to provide an interface to the memory 1706. The memory controller module may be a hardware module, a software module, and/or a firmware module.
The memory 1706 may be used, for example, to load and store data and/or instructions 1714 for the apparatus 1700. For one embodiment, memory 1706 may include any suitable volatile memory, such as suitable DRAM. In some embodiments, the memory 1706 may comprise a double data rate type four synchronous dynamic random access memory (DDR4 SDRAM).
For one embodiment, the control module 1704 may include one or more input/output controllers to provide an interface to the NVM/storage device 1708 and input/output device(s) 1710.
For example, NVM/storage 1708 may be used to store data and/or instructions 1714. The NVM/storage 1708 may include any suitable non-volatile memory (e.g., flash memory) and/or may include any suitable non-volatile storage device(s) (e.g., one or more hard disk drive(s) (HDD (s)), one or more Compact Disc (CD) drive(s), and/or one or more Digital Versatile Disc (DVD) drive (s)).
The NVM/storage 1708 may include storage resources that are physically part of the device on which the apparatus 1700 is installed, or it may be accessible by the device and may not necessarily be part of the device. For example, the NVM/storage 1708 may be accessible over a network via input/output device(s) 1710.
Input/output device(s) 1710 may provide an interface for the apparatus 1700 to communicate with any other suitable device, and the input/output device(s) 1710 may include communication components, audio components, sensor components, and so forth. The network interface 1712 may provide an interface for the apparatus 1700 to communicate over one or more networks, and the apparatus 1700 may communicate wirelessly with one or more components of a wireless network according to any of one or more wireless network standards and/or protocols, for example by accessing a wireless network based on a communication standard such as WiFi, 2G, 3G, 4G or 5G, or a combination thereof.
For one embodiment, at least one of the processor(s) 1702 may be packaged together with logic for one or more controllers (e.g., memory controller modules) of the control module 1704. For one embodiment, at least one of the processor(s) 1702 may be packaged together with logic for one or more controllers of the control module 1704 to form a System In Package (SiP). For one embodiment, at least one of the processor(s) 1702 may be integrated on the same die with the logic of one or more controllers of the control module 1704. For one embodiment, at least one of the processor(s) 1702 may be integrated on the same die with logic for one or more controllers of control module 1704 to form a system on a chip (SoC).
In various embodiments, apparatus 1700 may be, but is not limited to being: a server, a desktop computing device, or a mobile computing device (e.g., a laptop computing device, a handheld computing device, a tablet, a netbook, etc.), among other terminal devices. In various embodiments, apparatus 1700 may have more or fewer components and/or different architectures. For example, in some embodiments, device 1700 includes one or more cameras, a keyboard, a Liquid Crystal Display (LCD) screen (including a touch screen display), a non-volatile memory port, multiple antennas, a graphics chip, an Application Specific Integrated Circuit (ASIC), and speakers.
The detection device may adopt a main control chip as a processor or a control module, the sensor data, the position information and the like are stored in a memory or an NVM/storage device, the sensor group may serve as an input/output device, and the communication interface may include a network interface.
Since the device embodiments are basically similar to the method embodiments, they are described briefly; for relevant details, refer to the corresponding parts of the description of the method embodiments.
The embodiments in the present specification are described in a progressive manner, each embodiment focuses on differences from other embodiments, and the same and similar parts among the embodiments are referred to each other.
Embodiments of the present application are described with reference to flowchart illustrations and/or block diagrams of methods, terminal devices (systems), and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing terminal to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing terminal, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing terminal to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing terminal to cause a series of operational steps to be performed on the computer or other programmable terminal to produce a computer implemented process such that the instructions which execute on the computer or other programmable terminal provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
While preferred embodiments of the present application have been described, additional variations and modifications of these embodiments may occur to those skilled in the art once they learn of the basic inventive concepts. Therefore, it is intended that the appended claims be interpreted as including the preferred embodiment and all such alterations and modifications as fall within the true scope of the embodiments of the application.
Finally, it should also be noted that, herein, relational terms such as first and second, and the like may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or terminal that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or terminal. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of other like elements in a process, method, article, or terminal that comprises the element.
The foregoing detailed description has provided a data processing method and apparatus, an electronic device and a storage medium, and the principles and embodiments of the present application are described herein using specific examples, which are merely used to help understand the method and its core ideas of the present application; meanwhile, for a person skilled in the art, according to the idea of the present application, there may be variations in the specific embodiments and the application scope, and in summary, the content of the present specification should not be construed as a limitation to the present application.

Claims (72)

1. A method of data processing, the method comprising:
the first sound box equipment performs voice recognition on the received first voice data and determines a corresponding conference event; and sending the conference event to a server;
the server side determines a conference user participating in the conference according to the conference event, and sends a conference notice to a second sound box device of the conference user under the condition that the second sound box device is determined to be on line;
the second sound box equipment outputs corresponding conference information and conference passwords according to the conference notification; receiving second voice data, recognizing the second voice data, determining a participation notice, and sending the participation notice to a server;
and the server determines a corresponding conference initiating result according to the conference participating notification, and sends the conference initiating result to the first sound box device and the second sound box device.
2. The method of claim 1, wherein outputting the corresponding meeting information and the meeting password according to the meeting notification comprises:
acquiring conference parameters from the conference notification, and determining conference information and a conference password according to the conference parameters;
and carrying out voice conversion on the conference information and the conference password, and outputting the conference information and the conference password in a voice output mode.
3. The method of claim 2, wherein obtaining meeting parameters from the meeting notification, and determining meeting information and a meeting password according to the meeting parameters comprises:
determining conference parameters according to the conference notification, wherein the conference parameters comprise time parameters;
and inquiring according to the conference parameters, and determining corresponding conference information and conference passwords.
4. The method of claim 3, wherein the conference parameters further comprise at least one of: location parameters, user parameters, content parameters.
5. The method of claim 1, wherein performing speech recognition on the received first speech data to determine the corresponding conference event comprises:
performing text recognition on the first voice data to obtain first text information;
and identifying a set word in the first text information, and determining a conference event according to the set word.
6. The method of claim 1, wherein recognizing the second voice data and determining the participation notification comprises:
performing text recognition on the second voice data to obtain second text information;
when the second text information contains the conference password, performing voiceprint recognition on the second voice data;
performing voiceprint verification according to the recognized voiceprint information;
and after the voiceprint verification passes, determining the corresponding participation notification.
7. The method of claim 6, further comprising:
and under the condition that the second text information does not contain the conference password, outputting third voice data, wherein the third voice data is used for guiding the conference participating user to input the conference password.
8. The method of claim 1, wherein said determining participant users to attend a conference based on said conference event comprises:
determining conference parameters according to the conference event, wherein the conference parameters comprise time parameters;
and querying corresponding conference information according to the conference parameters, and obtaining, from the conference information, the participating users who are to attend the conference and the second sound box devices of the participating users.
9. The method of claim 8, further comprising:
and determining the connection state of the second sound box device based on the media access control (MAC) address of the second sound box device.
10. The method of claim 1, wherein the second sound box device has a display component; the outputting of the corresponding conference information and the conference password comprises:
and displaying the corresponding conference information and the conference password through the display component.
11. The method of claim 1, wherein the first and second sound box devices each have a display component and a camera for conducting a video conference between the first and second sound box devices.
12. The method of claim 1, further comprising:
and when the second sound box device of the participating user is determined to be offline, generating a conference initiating result indicating that the conference failed for the participating user.
13. The method of claim 1, further comprising:
the sound box equipment generates a conference record according to voice text information corresponding to conference voice in a conference process, wherein the voice text information comprises text information obtained by recognizing conference voice data received by a local terminal and/or text information obtained by recognizing conference voice data transmitted by other equipment terminals and transmitted by a server terminal, and the sound box equipment comprises first sound box equipment and/or second sound box equipment;
and uploading the conference record to a server.
14. The method of claim 13, wherein the sound box device has a display component; the method further comprises the following steps:
displaying the meeting record through the display component.
15. A method of data processing, the method comprising:
receiving a conference notification, wherein the conference notification is determined according to a conference event identified by first voice data, and the conference notification is sent out when a server determines that sound box equipment corresponding to a participant is on line;
outputting corresponding conference information and conference passwords according to the conference notification;
receiving second voice data, recognizing the second voice data and determining a participation notice;
and sending the participation notification so that the server side determines a corresponding conference initiating result according to the participation notification.
16. The method of claim 15, wherein outputting the corresponding meeting information and meeting password according to the meeting notification comprises:
acquiring conference parameters from the conference notification, and determining conference information and a conference password according to the conference parameters;
and carrying out voice conversion on the conference information and the conference password, and outputting the conference information and the conference password in a voice output mode.
17. The method of claim 16, wherein obtaining meeting parameters from the meeting notification, and determining meeting information and a meeting password according to the meeting parameters comprises:
determining conference parameters according to the conference notification, wherein the conference parameters comprise time parameters;
and inquiring according to the conference parameters, and determining corresponding conference information and conference passwords.
18. The method of claim 17, wherein the conference parameters further comprise at least one of: location parameters, user parameters, content parameters.
19. The method of claim 15, wherein the recognizing the second speech data and determining a participant notification comprises:
performing text recognition on the second voice data to obtain corresponding text information;
under the condition that the text information contains a conference password, performing voiceprint recognition on the voice data;
performing voiceprint verification according to the recognized voiceprint information;
and after the voiceprint check is passed, determining the corresponding participant notification.
20. The method of claim 19, further comprising:
and under the condition that the text information does not contain the conference password, outputting third voice data, wherein the third voice data is used for guiding the conference participating user to input the conference password.
21. A method according to any of claims 15-20, applied to a sound box device.
22. The method of claim 21, wherein the sound box device has a display component; the outputting of the corresponding conference information and the conference password comprises:
and displaying the corresponding conference information and the conference password through the display component.
23. The method of claim 21, wherein the sound box device has a display component and a camera to enable video conferencing through the sound box device.
24. The method of claim 15, further comprising:
generating a conference record according to voice text information corresponding to conference voice in a conference process, wherein the voice text information comprises text information obtained by recognizing conference voice data received by a local terminal and/or text information obtained by recognizing conference voice data transmitted by other equipment terminals through a server terminal;
and uploading the conference record to a server.
25. The method of claim 24, applied to a sound box having a display component, the method further comprising:
displaying the meeting record through the display component.
26. A method of data processing, the method comprising:
receiving a conference event of first sound box equipment, wherein the conference event is identified and determined according to first voice data;
determining the conference users participating in the conference according to the conference event;
under the condition that the second sound box equipment of the participating user is determined to be online, sending a conference notification to the second sound box equipment so that the second sound box equipment outputs corresponding conference information and a conference password according to the conference notification;
receiving a participant notification returned by the second sound box device, wherein the participant notification is identified and determined according to second voice data;
and determining a corresponding conference initiating result according to the conference participating notification, and sending the conference initiating result to the first sound box device and the second sound box device.
27. The method of claim 26, wherein said determining participant users to attend a conference based on said conference event comprises:
determining conference parameters according to the conference event, wherein the conference parameters comprise time parameters;
and querying corresponding conference information according to the conference parameters, and obtaining, from the conference information, the participating users who are to attend the conference and the second sound box devices of the participating users.
28. The method of claim 27, further comprising:
and determining the connection state of the second sound box device based on the media access control (MAC) address of the second sound box device.
29. The method of claim 26, further comprising:
and when the second sound box device of the participating user is determined to be offline, generating a conference initiating result indicating that the conference failed for the participating user.
30. The method of claim 26, wherein the first and second sound box devices each have a display component and a camera; the method further comprises the following steps:
and forwarding the video data of the video conference between the first sound box device and the second sound box device.
31. A method of data processing, the method comprising:
receiving voice data;
carrying out voice recognition on the received voice data, and determining a corresponding conference event;
sending the conference event so that the server side sends a conference notification to the online sound box equipment of the participating user according to the conference event, and the online sound box equipment of the participating user outputs corresponding conference information and a conference password according to the conference notification;
and receiving a conference initiating result, wherein the conference initiating result is determined according to the conference participation notification returned by the online sound box equipment of the conference participating user.
32. The method of claim 31, wherein performing speech recognition on the received speech data to determine the corresponding conference event comprises:
performing text recognition on the voice data to obtain text information;
and identifying a set word in the text information, and determining a conference event according to the set word.
33. The method of claim 32, further comprising:
extracting voiceprint information from the voice data, and carrying out voiceprint verification according to the voiceprint information;
and under the condition that the voiceprint check is passed, executing the step of determining the conference event according to the setting words.
34. A method according to any of claims 31-33, applied to a sound box apparatus.
35. The method of claim 34, wherein the sound box device has a display component; the method further comprises the following steps:
and displaying the corresponding conference information and the conference password through the display component.
36. The method of claim 34, wherein the sound box device has a display component and a camera to enable video conferencing through the sound box device.
37. The method of claim 31, further comprising:
generating a conference record according to voice text information corresponding to conference voice in a conference process, wherein the voice text information comprises text information obtained by recognizing conference voice data received by a local terminal and/or text information obtained by recognizing conference voice data transmitted by other equipment terminals through a server terminal;
and uploading the conference record to a server.
38. The method of claim 37, applied to a sound box having a display component, the method further comprising:
displaying the meeting record through the display component.
39. A method of data processing, the method comprising:
the first sound box equipment performs voice recognition on the received first voice data and determines a corresponding conference event; and sending the conference event to a server;
the server side determines a conference user participating in the conference according to the conference event, and sends a conference notice to a second sound box device of the conference user under the condition that the second sound box device is determined to be on line;
the second sound box device outputs corresponding conference information according to the conference notification, receives second voice data, and sends a participation notification to the server when the conference password is recognized in the second voice data;
and the server determines a corresponding conference initiating result according to the conference participating notification, and sends the conference initiating result to the first sound box device and the second sound box device.
40. The method of claim 39, wherein the sending the notification of participation to a server if the conference password is recognized by the second voice data comprises:
performing text recognition on the second voice data to obtain second text information;
under the condition that the second text information contains a conference password, performing voiceprint recognition on the voice data;
performing voiceprint verification according to the recognized voiceprint information;
and after the voiceprint verification passes, determining a corresponding participation notification and sending the participation notification.
41. The method of claim 39, further comprising:
and under the condition that the conference password is not recognized by the second voice data, outputting third voice data, wherein the third voice data is used for guiding the conference participating user to input the conference password.
42. A method of data processing, the method comprising:
the first equipment carries out voice recognition on the received first voice data and determines a corresponding conference event; and sending the conference event to a server;
the server side determines a conference user participating in the conference according to the conference event, and sends a conference notice to second equipment of the conference user under the condition that the second equipment is determined to be online;
the second equipment outputs corresponding conference information according to the conference notification, receives second voice data, identifies the second voice data, determines a participation notification and sends the participation notification to a server;
and the server determines a corresponding conference initiating result according to the conference participating notification, and sends the conference initiating result to the first equipment and the second equipment.
43. A method of data processing, the method comprising:
receiving a conference notification, wherein the conference notification is determined according to a conference event identified by first voice data, and the conference notification is sent out when a server determines that sound box equipment corresponding to a participant is on line;
outputting corresponding meeting information according to the meeting notice;
receiving second voice data;
and sending the participation notice to a server side under the condition that the conference password is recognized by the second voice data.
44. The method of claim 43, further comprising:
and when the conference password is not recognized in the second voice data, outputting third voice data, wherein the third voice data is used for guiding the participating user to input the conference password.
45. The method of claim 43, wherein the sending the notification of participation to a server if the conference password is recognized by the second voice data comprises:
performing text recognition on the second voice data to obtain second text information;
under the condition that the second text information contains a conference password, performing voiceprint recognition on the voice data;
performing voiceprint verification according to the recognized voiceprint information;
and after the voiceprint verification passes, determining a corresponding participation notification and sending the participation notification.
46. The method of claim 43, wherein outputting the corresponding meeting information according to the meeting notification comprises:
and acquiring meeting information from the meeting notice, and outputting the meeting information in a voice mode.
47. The method of claim 43, wherein outputting the corresponding meeting information according to the meeting notification comprises:
determining conference parameters according to the conference notification, wherein the conference parameters comprise time parameters;
and inquiring according to the conference parameters, and determining corresponding conference information and conference passwords.
48. The method of claim 47, wherein the conference parameters further comprise at least one of: location parameters, user parameters, content parameters.
49. A method according to any of claims 43 to 47, applied to sound box apparatus.
50. The method of claim 49, wherein the sound box device has a display component; the outputting the corresponding conference information comprises:
and displaying the corresponding meeting information through the display component.
51. The method of claim 49, wherein the sound box device has a display component and a camera for conducting a video conference.
52. The method of claim 43, further comprising:
generating a conference record according to voice text information corresponding to conference voice in a conference process, wherein the voice text information comprises text information obtained by recognizing conference voice data received by a local terminal and/or text information obtained by recognizing conference voice data transmitted by other equipment terminals through a server terminal;
and uploading the conference record to a server.
53. The method of claim 52, applied to a voice device having a display component; the method further comprises the following steps:
displaying the meeting record through the display component.
54. A data processing system, characterized in that the system comprises: the system comprises a first sound box device, a second sound box device and a server side;
the first sound box equipment is used for carrying out voice recognition on the received first voice data and determining a corresponding conference event; and sending the conference event to a server;
the server is used for determining, according to the conference event, a participating user who is to attend the conference, and sending a conference notification to a second sound box device of the participating user when the second sound box device is determined to be online; and for determining a corresponding conference initiating result according to the participation notification, and sending the conference initiating result to the first sound box device and the second sound box device;
the second sound box equipment outputs corresponding conference information and conference passwords according to the conference notification; and receiving second voice data, recognizing the second voice data, determining a participation notice, and sending the participation notice to a server.
55. A data processing apparatus, characterized in that the apparatus comprises:
the conference notification module is used for receiving a conference notification, the conference notification is determined according to the conference event identified by the first voice data, and the conference notification is sent out when the server determines that the sound box equipment corresponding to the participant is on line; outputting corresponding conference information and conference passwords according to the conference notification;
the conference participating processing module is used for receiving second voice data, recognizing the second voice data and determining a conference participating notification;
and the conference feedback module is used for receiving the conference initiating result corresponding to the conference notification.
56. A data processing apparatus, characterized in that the apparatus comprises:
the conference determining module is used for receiving a conference event from the first sound box device, wherein the conference event is identified and determined according to the first voice data; and determining, according to the conference event, the participating users who are to attend the conference;
the notification module is used for sending a conference notification to the second sound box device under the condition that the second sound box device of the conference participating user is determined to be on line, so that the second sound box device outputs corresponding conference information and a conference password according to the conference notification;
the result determining module is used for receiving a conference participation notification returned by the second sound box device, and the conference participation notification is determined according to second voice data identification; and determining a corresponding conference initiating result according to the conference participating notification.
57. A data processing apparatus, characterized in that the apparatus comprises:
the conference initiating module is used for receiving voice data; carrying out voice recognition on the received voice data, and determining a corresponding conference event;
the sending module is used for sending the conference event so that the server side sends a conference notice to the online sound box equipment of the participating user according to the conference event, and the online sound box equipment of the participating user outputs corresponding conference information and a conference password according to the conference notice;
and the result determining module is used for receiving a conference initiating result, and the conference initiating result is determined according to the conference participation notification returned by the online sound box equipment of the conference participating user.
58. A data processing system, characterized in that the system comprises: the system comprises a first sound box device, a second sound box device and a server side;
the first sound box equipment is used for carrying out voice recognition on the received first voice data and determining a corresponding conference event; and sending the conference event to a server;
the server is used for determining, according to the conference event, a participating user who is to attend the conference, and sending a conference notification to a second sound box device of the participating user when the second sound box device is determined to be online; and for determining a corresponding conference initiating result according to the participation notification, and sending the conference initiating result to the first sound box device and the second sound box device;
the second sound box equipment outputs corresponding conference information according to the conference notification; and receiving second voice data, and sending the participation notification to a server side under the condition that the conference password is recognized by the second voice data.
59. A data processing system, characterized in that the system comprises: the system comprises a first device, a second device and a server;
the first equipment is used for carrying out voice recognition on the received first voice data and determining a corresponding conference event; and sending the conference event to a server;
the server determines the conference users participating in the conference according to the conference event, and sends a conference notice to the second equipment under the condition that the second equipment of the conference users is determined to be online; determining a corresponding conference initiating result according to the conference participating notification, and sending the conference initiating result to the first equipment and the second equipment;
and the second equipment outputs corresponding conference information according to the conference notice, receives second voice data, identifies the second voice data, determines a participation notice and sends the participation notice to a server.
60. A data processing apparatus, characterized in that the apparatus comprises:
a module used for receiving a conference notification, wherein the conference notification is determined according to the conference event identified from the first voice data, and the conference notification is sent by the server when the sound box device corresponding to the participating user is determined to be online;
the notification output module is used for outputting corresponding conference information according to the conference notification;
the voice participating module is used for receiving second voice data; and sending the participation notice to a server side under the condition that the conference password is recognized by the second voice data.
61. An electronic device, comprising: a processor; and
memory having stored thereon executable code which, when executed, causes the processor to perform a data processing method as claimed in one or more of claims 15-25.
62. One or more machine readable media having executable code stored thereon that, when executed, causes a processor to perform a data processing method as recited in one or more of claims 15-25.
63. An electronic device, comprising: a processor; and
memory having stored thereon executable code which, when executed, causes the processor to perform a data processing method as claimed in one or more of claims 26-30.
64. One or more machine readable media having executable code stored thereon that, when executed, causes a processor to perform a data processing method as recited in one or more of claims 26-30.
65. An electronic device, comprising: a processor; and
memory having stored thereon executable code which, when executed, causes the processor to perform a data processing method as claimed in one or more of claims 31-38.
66. One or more machine readable media having executable code stored thereon that, when executed, causes a processor to perform a data processing method as recited in one or more of claims 31-38.
67. An electronic device, comprising: a processor; and
memory having stored thereon executable code which, when executed, causes the processor to perform a data processing method as claimed in one or more of claims 43-53.
68. One or more machine readable media having executable code stored thereon that, when executed, causes a processor to perform a data processing method as recited in one or more of claims 43-53.
69. A method of data processing, the method comprising:
the first voice equipment performs voice recognition on the received first voice data and determines a corresponding conference event; and sending the conference event to a server;
the server side determines a conference user participating in the conference according to the conference event, and sends a conference notification to a second voice device of the conference user under the condition that the second voice device is determined to be on line;
the second voice equipment outputs corresponding conference information and conference passwords according to the conference notification; receiving second voice data, recognizing the second voice data, determining a participation notice, and sending the participation notice to a server;
and the server determines a corresponding conference initiating result according to the conference participating notification, and sends the conference initiating result to the first voice equipment and the second voice equipment.
70. A data processing system, characterized in that the system comprises: first voice equipment, server and second voice equipment, wherein:
the first voice equipment performs voice recognition on the received first voice data and determines a corresponding conference event; and sending the conference event to a server;
the server side determines a conference user participating in a conference according to the conference event, and sends a conference notification to a second voice device of the conference user under the condition that the second voice device is determined to be on line; determining a corresponding conference initiating result according to the conference participating notification, and sending the conference initiating result to the first voice equipment and the second voice equipment;
the second voice equipment outputs corresponding conference information and conference passwords according to the conference notification; and receiving second voice data, recognizing the second voice data, determining a participation notice, and sending the participation notice to a server.
71. A method of data processing, the method comprising:
the first voice equipment performs voice recognition on the received first voice data and determines a corresponding conference event; sending the conference event to a server, wherein the conference event comprises a conference reservation instruction;
the server side determines a conference user participating in a conference according to the conference event, determines that the conference condition corresponding to the conference reservation instruction is met, and sends a conference notice to a second voice device of the conference user under the condition that the second voice device is determined to be on-line;
the second voice equipment outputs corresponding conference information and conference passwords according to the conference notification; receiving second voice data, recognizing the second voice data, determining a participation notice, and sending the participation notice to a server;
and the server determines a corresponding conference initiating result according to the conference participating notification, and sends the conference initiating result to the first voice equipment and the second voice equipment.
72. A data processing system, characterized in that the system comprises a first voice device, a server, and a second voice device, wherein:
the first voice device performs voice recognition on received first voice data and determines a corresponding conference event, and sends the conference event to the server, wherein the conference event comprises a conference reservation instruction;
the server determines the conference users participating in a conference according to the conference event, determines that the conference condition corresponding to the conference reservation instruction is met, and sends a conference notification to the second voice device of a conference user when that second voice device is determined to be online; and determines a corresponding conference initiation result according to the participation notification, and sends the conference initiation result to the first voice device and the second voice device;
the second voice device outputs corresponding conference information and a conference password according to the conference notification; and receives second voice data, recognizes the second voice data, determines a participation notification, and sends the participation notification to the server.
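The behaviour of the second voice device in the claims above can be illustrated with a short sketch. Everything named here is an assumption made for illustration: `ConferenceNotification`, `announce`, `participation_from_text`, and the keyword lists are hypothetical, and speech recognition is treated as already done, so the function receives recognized text rather than audio.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ConferenceNotification:
    """Hypothetical notification payload sent by the server."""
    conference_id: str
    topic: str
    password: str


def announce(note: ConferenceNotification) -> str:
    """Build the prompt the second voice device would output."""
    return f"Conference '{note.topic}' is starting. Password: {note.password}. Will you join?"


# Illustrative keyword lists; a real device would rely on its speech recognizer.
ACCEPT_WORDS = {"yes", "join", "ok"}
DECLINE_WORDS = {"no", "decline", "later"}


def participation_from_text(note: ConferenceNotification, recognized_text: str) -> Optional[dict]:
    """Map recognized second voice data to a participation notification.

    Returns None when the reply matches neither keyword list.
    Punctuation is not stripped, so input is assumed to be plain words.
    """
    words = set(recognized_text.lower().split())
    if words & ACCEPT_WORDS:
        joins = True
    elif words & DECLINE_WORDS:
        joins = False
    else:
        return None
    return {"conference_id": note.conference_id, "joins": joins}
```

The returned dictionary stands in for the participation notification that the device would send to the server; its fields are placeholders, since the claims do not specify a message format.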
CN201910697173.6A 2019-07-30 2019-07-30 Data processing method, device, equipment and storage medium Pending CN112399022A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910697173.6A CN112399022A (en) 2019-07-30 2019-07-30 Data processing method, device, equipment and storage medium

Publications (1)

Publication Number Publication Date
CN112399022A true CN112399022A (en) 2021-02-23

Family

ID=74601248

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910697173.6A Pending CN112399022A (en) 2019-07-30 2019-07-30 Data processing method, device, equipment and storage medium

Country Status (1)

Country Link
CN (1) CN112399022A (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101257395A (en) * 2007-02-27 2008-09-03 中国移动通信集团公司 System and method for supporting multimedia conference booking
CN104144154A (en) * 2013-05-10 2014-11-12 华为技术有限公司 Method, device and system for initiating appointment conference
CN208862988U (en) * 2018-08-24 2019-05-14 深圳市冠旭电子股份有限公司 A kind of intelligent sound box and video conferencing system
CN109951519A (en) * 2019-01-22 2019-06-28 视联动力信息技术股份有限公司 A kind of control method and device of convention business

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115086887A (en) * 2022-05-11 2022-09-20 山东工商学院 Instant messaging system and method based on 5G local area network
CN115086887B (en) * 2022-05-11 2023-11-24 山东工商学院 Instant messaging system based on 5G local area network
CN116052666A (en) * 2023-02-21 2023-05-02 之江实验室 Voice message processing method, device, system, electronic device and storage medium

Similar Documents

Publication Publication Date Title
US9438993B2 (en) Methods and devices to generate multiple-channel audio recordings
US10574827B1 (en) Method and apparatus of processing user data of a multi-speaker conference call
US8600025B2 (en) System and method for merging voice calls based on topics
US10732924B2 (en) Teleconference recording management system
WO2016127691A1 (en) Method and apparatus for broadcasting dynamic information in multimedia conference
US20160165044A1 (en) System and method for call authentication
WO2019184650A1 (en) Subtitle generation method and terminal
CN109361527B (en) Voice conference recording method and system
CN112350834B (en) AI voice conference system with screen and method
CN112399022A (en) Data processing method, device, equipment and storage medium
NO341316B1 (en) Method and system for associating an external device to a video conferencing session.
US20140205116A1 (en) System, device, and method for establishing a microphone array using computing devices
KR101622661B1 (en) Method and system for making automatically minutes file of remote meeting
EP2775694B1 (en) Methods and devices to generate multiple-channel audio recordings with location-based registration
US9525979B2 (en) Remote control of separate audio streams with audio authentication
CN112689115A (en) Multi-party conference system and control method
US11089164B2 (en) Teleconference recording management system
US10721558B2 (en) Audio recording system and method
CN113542661A (en) Video conference voice recognition method and system
CN107888790A (en) The way of recording and device of videoconference
US20190333517A1 (en) Transcription of communications
WO2017078003A1 (en) Content reproduction system, authentication method, and medium
JP7393000B2 (en) Teleconferencing devices, systems, methods and programs
CN113139392B (en) Conference summary generation method, device and storage medium
US20240129432A1 (en) Systems and methods for enabling a smart search and the sharing of results during a conference

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20210223
