CN112968826B - Voice interaction method and device and electronic equipment - Google Patents

Info

Publication number
CN112968826B
Authority
CN
China
Prior art keywords
voice channel
control
user
open
displayed
Legal status
Active
Application number
CN202110165972.6A
Other languages
Chinese (zh)
Other versions
CN112968826A (en
Inventor
程斌
Current Assignee
Beijing ByteDance Network Technology Co Ltd
Original Assignee
Beijing ByteDance Network Technology Co Ltd
Application filed by Beijing ByteDance Network Technology Co Ltd
Publication of CN112968826A
Application granted
Publication of CN112968826B

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L51/00: User-to-user messaging in packet-switching networks, transmitted according to store-and-forward or real-time protocols, e.g. e-mail
    • H04L51/04: Real-time or near real-time messaging, e.g. instant messaging [IM]
    • H04L51/046: Interoperability with other network applications or services
    • H04L51/07: User-to-user messaging in packet-switching networks, transmitted according to store-and-forward or real-time protocols, e.g. e-mail characterised by the inclusion of specific contents
    • H04L51/10: Multimedia information
    • H04L51/18: Commands or executable codes

Abstract

Embodiments of the invention disclose a voice interaction method, a voice interaction device and electronic equipment. One embodiment of the method comprises: in response to detecting a trigger operation on a preset voice channel opening control in an instant messaging application, sending an open voice channel establishment request to a server, so that the server performs the following step: in response to determining to establish the open voice channel corresponding to the open voice channel establishment request, establishing the open voice channel to receive an audio stream of a first object that initiated the open voice channel establishment request. A new voice interaction mode is thereby provided.

Description

Voice interaction method and device and electronic equipment
Citation of related application
The present application claims priority from Chinese patent application No. 202010081146.9, entitled "voice interaction method, apparatus and electronic device", filed on February 5, 2020, which application is incorporated herein by reference in its entirety.
Technical Field
The embodiment of the disclosure relates to the technical field of computers, in particular to a voice interaction method, a voice interaction device and electronic equipment.
Background
The development of Internet technology has made communication between users increasingly convenient: users can exchange information, such as text, voice and video, over the Internet at any time.
Multiple users can also collaborate online over the Internet, so that a task can be completed jointly and remotely.
Disclosure of Invention
This disclosure is provided in part to introduce concepts in a simplified form that are further described below in the detailed description. This disclosure is not intended to identify key features or essential features of the claimed subject matter, nor is it intended to be used to limit the scope of the claimed subject matter.
The embodiments of the disclosure provide a voice interaction method, a voice interaction device and electronic equipment, which provide a new voice interaction mode.
In a first aspect, an embodiment of the present disclosure provides a voice interaction method, applied to a terminal device, where the method includes: in response to detecting a trigger operation on a preset voice channel opening control in an instant messaging application, sending an open voice channel establishment request to a server, so that the server performs the following step: in response to determining to establish the open voice channel corresponding to the open voice channel establishment request, establishing the open voice channel to receive an audio stream of a first object that initiated the open voice channel establishment request.
In a second aspect, an embodiment of the present disclosure provides a voice interaction apparatus, applied to a terminal device, including: a sending unit configured to, in response to detecting a trigger operation on a preset voice channel opening control in an instant messaging application, send an open voice channel establishment request to a server, so that the server performs the following step: in response to determining to establish the open voice channel corresponding to the open voice channel establishment request, establishing the open voice channel to receive an audio stream of a first object that initiated the open voice channel establishment request.
In a third aspect, an embodiment of the present disclosure provides an electronic device, including: one or more processors; and a storage means for storing one or more programs which, when executed by the one or more processors, cause the one or more processors to implement the voice interaction method as described in the first aspect.
In a fourth aspect, embodiments of the present disclosure provide a computer-readable medium, on which a computer program is stored, which program, when being executed by a processor, implements the steps of the voice interaction method as described in the first aspect.
According to the voice interaction method, the voice interaction device and the electronic equipment provided by the embodiments of the disclosure, the first client, in response to detecting a trigger operation on the preset voice channel opening control in the instant messaging application, sends an open voice channel establishment request to the server. After the server establishes the voice channel with the first client on which the first object is logged in, the server starts to receive the audio stream of the first object. Once a peer user (also called an associated user) connects to the server, the audio stream of the first object can be pushed to the associated user quickly, which improves the efficiency of voice interaction between the first object and the associated user.
Drawings
The above and other features, advantages, and aspects of embodiments of the present disclosure will become more apparent by reference to the following detailed description when taken in conjunction with the accompanying drawings. The same or similar reference numbers will be used throughout the drawings to refer to the same or like elements. It should be understood that the figures are schematic and that elements and components are not necessarily drawn to scale.
FIG. 1 is a flow chart of some embodiments of a voice interaction method according to the present disclosure;
FIG. 2 is a schematic structural diagram of some embodiments of a voice interaction device according to the present disclosure;
FIG. 3 is an exemplary system architecture in which the voice interaction method of some embodiments of the present disclosure may be applied;
FIG. 4 is a schematic diagram of a basic structure of an electronic device provided according to an embodiment of the present disclosure.
Detailed Description
Embodiments of the present disclosure will be described in more detail below with reference to the accompanying drawings. While certain embodiments of the present disclosure have been shown in the accompanying drawings, it is to be understood that the present disclosure may be embodied in various forms and should not be construed as limited to the embodiments set forth herein, but are provided to provide a more thorough and complete understanding of the present disclosure. It should be understood that the drawings and embodiments of the present disclosure are for illustration purposes only and are not intended to limit the scope of the present disclosure.
It should be understood that the various steps recited in the method embodiments of the present disclosure may be performed in a different order and/or performed in parallel. Furthermore, method embodiments may include additional steps and/or omit performing the illustrated steps. The scope of the present disclosure is not limited in this respect.
The term "including" and variations thereof as used herein are open-ended, i.e., "including, but not limited to". The term "based on" means "based at least in part on". The term "some embodiments" means "at least some embodiments"; the term "another embodiment" means "at least one additional embodiment". Related definitions of other terms will be given in the description below.
It should be noted that the terms "first," "second," and the like in this disclosure are merely used to distinguish between different devices, modules, or units and are not used to define an order or interdependence of functions performed by the devices, modules, or units.
It should be noted that the modifiers "one" and "a plurality" in this disclosure are intended to be illustrative rather than limiting, and those of ordinary skill in the art will appreciate that they should be understood as "one or more" unless the context clearly indicates otherwise.
The names of messages or information interacted between the various devices in the embodiments of the present disclosure are for illustrative purposes only and are not intended to limit the scope of such messages or information.
The voice channel in the present disclosure, sometimes referred to as an open voice channel, may be a voice channel that can be joined without an authorization operation (such as confirmation) by a predetermined user such as the voice channel initiator or an administrator, or may be a voice channel that can be joined only after the predetermined user performs the authorization operation.
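As a non-limiting illustration, the two access modes described above might be sketched as follows. The class `VoiceChannel`, the flag `requires_approval` and the method names are hypothetical names introduced for this sketch and do not appear in the disclosure.

```python
from dataclasses import dataclass, field


@dataclass
class VoiceChannel:
    """Hypothetical model of a voice channel with an optional approval gate."""
    initiator: str
    requires_approval: bool = False           # assumption: a per-channel switch
    approved: set = field(default_factory=set)
    members: set = field(default_factory=set)

    def approve(self, user: str) -> None:
        # Authorization operation performed by the initiator or an administrator.
        self.approved.add(user)

    def join(self, user: str) -> bool:
        # An open channel admits anyone; a gated channel admits approved users only.
        if self.requires_approval and user not in self.approved:
            return False
        self.members.add(user)
        return True
```

In this sketch a channel created with the default flag admits any user immediately, while a gated channel rejects a user until `approve` has been called for that user.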
Referring to fig. 1, a flow of some embodiments of a voice interaction method according to the present disclosure is shown. The voice interaction method is applied to the terminal equipment. The voice interaction method as shown in fig. 1 comprises the following steps:
Step 101: in response to detecting a trigger operation on a preset voice channel opening control in the instant messaging application, send an open voice channel establishment request to a server.
In this embodiment, the server may perform the following step: in response to determining to establish the open voice channel corresponding to the open voice channel establishment request, establishing the open voice channel to receive an audio stream of a first object that initiated the open voice channel establishment request.
In this embodiment, the instant messaging application may be a messaging application established based on an instant messaging mechanism. The functions of the instant messaging application may include office functions, i.e., may be used for on-line office. In other words, remote office can be performed using an instant messaging application.
In this embodiment, the voice channel opening control may be preset, and the voice channel opening control may be used to trigger sending an open voice channel establishment request.
In this embodiment, after receiving the open voice channel establishment request, the server may determine whether to establish an open voice channel corresponding to the open voice channel establishment request; if the establishment is determined, the open voice channel is established.
In this embodiment, the server may receive, using the established open voice channel, an audio stream of the first object that initiates the open voice channel establishment request.
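A minimal sketch of the server-side handling described above is given below, under stated assumptions. The class `ChannelServer`, the capacity check used to decide whether to establish a channel, and the channel identifier format are all hypothetical; the disclosure does not specify how the server makes that decision.

```python
import itertools


class ChannelServer:
    """Hypothetical server that establishes open voice channels on request."""

    def __init__(self, max_channels: int = 100):
        self.max_channels = max_channels      # assumption: capacity-based decision
        self.channels = {}                    # channel_id -> channel record
        self._ids = itertools.count(1)

    def handle_establish_request(self, initiator_id: str):
        # Decide whether to establish the channel; return None if refused.
        if len(self.channels) >= self.max_channels:
            return None
        channel_id = f"ch-{next(self._ids)}"
        self.channels[channel_id] = {"initiator": initiator_id, "frames": []}
        return channel_id

    def receive_audio(self, channel_id: str, sender_id: str, frame: bytes) -> None:
        # Audio from the initiator is accepted as soon as the channel exists,
        # whether or not any peer has connected yet.
        channel = self.channels[channel_id]
        if sender_id != channel["initiator"]:
            raise PermissionError("only the initiator is handled in this sketch")
        channel["frames"].append(frame)
```

The point of the sketch is the ordering: the channel is created and starts accepting the initiator's audio before any other party is involved.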
It should be noted that, after the server establishes the voice channel with the first client on which the first object is logged in, the server starts to receive the audio stream of the first object. Once a peer user (also referred to as an associated user) connects to the server, the server can push the audio stream of the first object to the associated user quickly, which improves the efficiency of voice interaction between the first object and the associated user.
It should be noted that, after the voice channel is established between the server and the first object, the voice channel is in an accessible state (able to receive and transmit the audio stream in real time) regardless of whether any peer has connected. The first object is therefore reachable at any time, and one or more peers can interact by voice with the first object immediately after connecting. The step of arranging a call in advance is avoided, which improves the efficiency of voice interaction.
It should be noted that, in the related art, a voice interaction channel is established only after a call-and-answer step between the users of the channel. This raises the threshold for voice interaction and hinders remote office collaboration between users. In the present disclosure, the voice interaction method is implemented in an instant messaging application, so that a face-to-face conversation between users can be simulated online, which facilitates remote office collaboration between users.
In some embodiments, the server may maintain a set of users connected to the open voice channel, and forward an audio stream sent by any user in the set to the other users in the set.
In some embodiments, in response to receiving a connection request for the open voice channel from an associated user, the server may establish a voice channel between the associated user and the first object, and either forward the audio stream of one of the associated user and the first object to the other, or send the audio stream of the first object received within a preset period of time to the associated user. If several persons A to D are in the same voice channel, the audio stream of A is forwarded to the members other than A (i.e., B to D); in other words, the associated user is not limited to a single user.
Thus, the speed at which the associated user receives the audio stream of the first object may be increased.
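The forwarding behavior described above can be sketched as follows, purely for illustration. The class `OpenChannel`, the per-user outbox lists and the `replay_window` parameter are hypothetical. The sketch combines the two alternatives (live forwarding, and sending the first object's recently received audio to a newly connected user) in one class, which the disclosure does not require.

```python
from collections import deque


class OpenChannel:
    """Hypothetical fan-out of audio frames among users of one open channel."""

    def __init__(self, initiator: str, replay_window: float = 5.0):
        self.initiator = initiator
        self.replay_window = replay_window    # assumption: seconds of audio kept
        self.outboxes = {initiator: []}       # user -> frames queued for delivery
        self.recent = deque()                 # (timestamp, frame) from initiator

    def attach(self, user: str, now: float) -> None:
        # A newly connected user first receives the initiator's recent audio.
        self.outboxes[user] = [
            frame for ts, frame in self.recent if now - ts <= self.replay_window
        ]

    def publish(self, sender: str, frame: bytes, now: float) -> None:
        if sender == self.initiator:
            self.recent.append((now, frame))
            while self.recent and now - self.recent[0][0] > self.replay_window:
                self.recent.popleft()
        # Forward to every connected user except the sender.
        for user, outbox in self.outboxes.items():
            if user != sender:
                outbox.append(frame)
```

With users A to D attached, a frame published by A lands in the outboxes of B, C and D only, matching the example in the text.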
In some embodiments, the server may send a connection entry establishment indication to an associated object of the first object in response to determining to establish the open voice channel. Here, the connection entry establishment indication indicates that a connection entry control is to be displayed on the association page of the second client on which the associated object is logged in.
In some embodiments, the associated object is all or part of the users who have previously established a friend relationship with the first object.
In some embodiments, the portion of users is determined by the first object selection.
It should be noted that having the first object select only some of the users narrows the notification range of the first object, reduces the disturbance to the first object, and also reduces the communication load on the established open voice channel.
In some embodiments, the method further comprises: displaying an associated user selection control; and, according to a trigger operation on the associated user selection control, sending the user identifier of each selected associated user to the server.
Here, the associated user selection control may be presented at any time, for example, before or after the first object triggers the voice channel opening control. As an example, the associated user selection control may be displayed in the general settings.
In some embodiments, displaying the associated user selection control includes: in response to detecting a trigger operation on the voice channel opening control, displaying the associated user selection control.
It should be noted that displaying the associated user selection control after the trigger operation on the voice channel opening control is detected lets the first object conveniently select, according to the interaction needs of this particular channel opening, the associated users with whom voice interaction is expected or likely.
In some embodiments, the associated object comprises at least one of: an individual user, a group, or a sub-group within a group, where the sub-group is established based on a preset theme. As an example, a sub-group within a group may be a group established temporarily around a topic.
It should be noted that individual users, groups, and sub-groups within groups may all serve as associated objects, so that a wider variety of users can be covered as associated users and the range of associated users of the first object is enlarged.
In some embodiments, the initiator client may send at least the group identification to the server, which determines the group member based on the group identification, and then pushes a connection portal display notification to the client of the group member.
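As an illustrative sketch only, resolving group members from a group identifier and pushing a connection entry display notification might look as follows. The function `notify_group`, the `groups` mapping, the `push` callback and the notification fields are hypothetical, and excluding the initiator from the recipients is an assumption of the sketch.

```python
def notify_group(group_id, initiator, groups, push):
    """Resolve group members and push a connection entry notification to each.

    groups: hypothetical mapping of group_id -> list of member ids
    push:   hypothetical callable (member_id, notification_dict) -> None
    """
    # Assumption: the initiator does not need a notification about its own channel.
    recipients = [m for m in groups.get(group_id, []) if m != initiator]
    for member in recipients:
        push(member, {
            "type": "connection_entry",
            "initiator": initiator,
            "group_id": group_id,
        })
    return recipients
```

A member client receiving such a notification could then decide when to display the entry, for example at login or when a conversation with the initiator is opened.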
In some embodiments, the group member client displays the connection entry based on the notification at a suitable time (e.g., when the user logs in, or when the user opens a conversation with the initiator).
In some embodiments, the displaying may include silently displaying the connection entry control for a preset period of time.
It should be noted that displaying the connection entry control silently reduces the disturbance to the associated user. Combined with the first object being reachable at any time, silent display reduces the disturbance to most users: the first object being in a connectable state is in effect a notification to many users, and if the ring or vibration prompts of the related art were applied to all of them, they could disturb the many users who do not intend to interact by voice with the first object.
In some embodiments, silently displaying the connection entry control for the preset period of time may include: displaying the control silently for the entire time it is displayed; or starting to display the control silently once a non-silent display parameter meets a preset condition.
Here, the non-silent display parameter, measured from the time the connection entry control starts to be displayed, may be a number of rings, a ringing duration, or a duration of animated display. In other words, the control may ring a few times, ring for a period of time, or be displayed with animation for a period of time, after which silent display of the connection entry control begins.
Prompting in a non-silent mode before switching to silent display ensures that the associated user is effectively reminded.
In some embodiments, the server sends a direct connection notification to an associated object of the first object in response to determining to establish the open voice channel, where the direct connection notification indicates that the associated object can connect to the open voice channel without confirmation by the first object.
It should be noted that the direct connection notification informs the associated object that the first object is in a state in which it can be reached at any time, so that the associated user can conveniently interact by voice with the first object through the open voice channel.
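The choice between silent and non-silent display described above can be sketched as a small decision function. The function name `display_mode` and its parameters are hypothetical, and treating "no limits configured" as "silent throughout" is an assumption of the sketch.

```python
def display_mode(rings_played, seconds_shown, max_rings=None, max_seconds=None):
    """Return "silent" or "non_silent" for the connection entry control.

    rings_played / seconds_shown are counted from the moment the control is
    first displayed. max_rings and max_seconds are hypothetical preset limits.
    """
    if max_rings is None and max_seconds is None:
        return "silent"                       # silent for the whole display
    if max_rings is not None and rings_played >= max_rings:
        return "silent"                       # ring count limit reached
    if max_seconds is not None and seconds_shown >= max_seconds:
        return "silent"                       # duration limit reached
    return "non_silent"
```

For example, with `max_rings=3` the control would prompt audibly for the first three rings and be displayed silently afterwards.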
In some embodiments, the voice channel opening controls may include personal voice channel opening controls. Here, the above-mentioned personal voice channel opening control corresponds to a personal user identification.
In some embodiments, a voice channel control is displayed on an associated page of a single user, the voice channel control corresponding to a user identification of the single user.
Here, the above-mentioned association page of the single user may refer to a homepage of the single user. And, the step 101 may include: and responding to the triggering operation of the voice channel control, and sending an open voice channel establishment request to a server, wherein the establishment request carries the user identification of the single user.
It should be noted that an individual user may use the personal voice channel opening control to establish an open voice channel with that individual user as the initiator. In other words, the personal voice channel opening control is used to establish an open voice channel whose initiator is the corresponding individual user.
In some embodiments, the voice channel opening controls may include group voice channel opening controls. Here, the group voice channel opening control corresponds to a group identifier.
In some embodiments, a voice channel control may be displayed on an association page of group users. Here, the association page of group users may include, but is not limited to, at least one of: a dialog box of a group, a description page of a group, etc. The description page of the group may include various group-related information such as group member avatars, the group name, a control for opening the group chat log, and the like.
In some embodiments, the step 101 may include: and responding to the triggering operation of the voice channel control, and sending an open voice channel establishment request to a server, wherein the establishment request carries a group identifier or an identifier of a group member in a preset range.
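As a non-limiting sketch, the establishment request built by the client for the personal and group cases described above might be assembled as follows. The function `build_establish_request`, the field names and the `kind` values are hypothetical; the disclosure only states which identifier the request carries.

```python
def build_establish_request(kind, user_id=None, group_id=None, member_ids=None):
    """Build a hypothetical open voice channel establishment request payload."""
    if kind == "personal":
        if user_id is None:
            raise ValueError("a personal request carries the user identifier")
        return {"action": "establish_open_voice_channel", "user_id": user_id}
    if kind == "group":
        # A group request carries either the group identifier or the
        # identifiers of group members within a preset range.
        if group_id is not None:
            return {"action": "establish_open_voice_channel", "group_id": group_id}
        if member_ids:
            return {"action": "establish_open_voice_channel",
                    "member_ids": list(member_ids)}
        raise ValueError("a group request carries a group or member identifiers")
    raise ValueError(f"unknown control kind: {kind}")
```

The client would call this when the corresponding voice channel control is triggered and send the resulting payload to the server.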
It should be noted that, the members in the group may trigger the voice channel control displayed on the associated page of the group user, so that an open voice channel for the group may be initiated, and the members in the group may connect to the open voice channel of the group through the connection entry control displayed on the associated page of the group.
In some embodiments, the association page includes at least one connection entry control.
In some embodiments, the association page including at least one connection entry control comprises: displaying, in the group chat page, the voice channel connection entry control of a group user who has opened a personal voice channel in the display area associated with that group user; or displaying a connection entry control of the group voice channel in a preset area of the group chat page; or displaying in aggregate, on the association page, the voice channel connection entry controls of users within a preset range.
By way of example, the display area associated with a group user who has opened a personal voice channel may be the vicinity of that group user's avatar. Displaying the voice channel connection entry control in this area of the group chat page lets a group member who needs to interact with that group user quickly establish a voice channel with the group user through the connection entry.
It should be noted that, as an example, the preset area of the group chat page may be any area of the group chat page, including but not limited to the top, bottom, or middle. Displaying the connection entry control of the group voice channel in the preset area of the group chat page lets group users conveniently join the group voice channel through the connection entry control.
In some embodiments, the preset range includes all friends of the user, or pre-selected individual users and/or group users; the aggregate display includes displaying a user identifier and a voice connection entry control.
As an example, when all voice channels are displayed together on a unified association page, each entry may show the initiator's user avatar, or a group avatar, a topic summary, or the like.
It should be noted that the association page on which the voice channel connection entry controls of users within the preset range are displayed in aggregate may be an independent page or a partial area of another page (such as a partial area of the main page). Displaying the voice channel connection entry controls together on the association page lets a user see at a glance which friends have opened voice channels, and select from among them the friends with whom voice interaction is desired in order to initiate contact quickly.
In some embodiments, the second client logged in by the association object displays the connection entry control at a preset position of the association page.
In some embodiments, the preset locations include at least one of, but are not limited to: the top, bottom, sides and middle of the associated page.
When the preset position includes the top, displaying the connection entry control at the top reminds the user more prominently.
In some embodiments, the association page includes a preset display range corresponding to the association object in a single chat page of the first object and the association object or an information flow list page.
In some embodiments, the method further comprises: in response to determining that the information to be displayed at the preset position includes the connection entry control and that there are at least two items of information to be displayed, obtaining the priority of each item, and processing each item according to its priority and a preset display mode, where the connection entry control has the highest priority.
It should be noted that, when there are at least two items of information to be displayed at the preset position, the items may be prioritized and then processed (i.e., each item is displayed or not displayed) according to the priority and the preset display mode.
In some embodiments, the preset display mode includes at least one of the following: sorting the items of information to be displayed in order of priority from high to low, and displaying them in the sorted order; displaying the connection entry control, which has the highest priority, and hiding the other items of information to be displayed; and, for each item of information to be displayed other than the connection entry control, displaying a corresponding deletion control and, in response to detecting a trigger operation on a deletion control, ceasing to display the item corresponding to that deletion control.
When the items are sorted by priority and displayed in the sorted order, the order may run from high to low or from low to high. Displaying every item in order ensures that none is omitted.
Displaying the connection entry control, which has the highest priority, and hiding the other items eliminates interference from the other items with the connection entry control, so that the user is clearly prompted to join the open voice channel.
Here, the deletion control may be displayed in association with each piece of the other pieces of information to be displayed, and whether to display the other pieces of information to be displayed may be selected by the user.
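The three preset display modes described above can be sketched as follows, for illustration only. The function `arrange`, the item fields (`id`, `kind`, `priority`), the mode names and the convention that a larger number means a higher priority are all hypothetical.

```python
ENTRY = "connection_entry"  # hypothetical kind tag for the connection entry control


def arrange(items, mode, dismissed=()):
    """Return the ids of the items to display at the preset position, in order.

    items:     list of dicts with keys "id", "kind" and "priority"
    mode:      "sorted", "entry_only" or "dismissible"
    dismissed: ids whose deletion control has been triggered by the user
    """
    # The connection entry control always ranks first; others by priority.
    ranked = sorted(items, key=lambda it: (it["kind"] != ENTRY, -it["priority"]))
    if mode == "sorted":
        return [it["id"] for it in ranked]
    if mode == "entry_only":
        return [it["id"] for it in ranked if it["kind"] == ENTRY]
    if mode == "dismissible":
        return [it["id"] for it in ranked
                if it["kind"] == ENTRY or it["id"] not in dismissed]
    raise ValueError(f"unknown display mode: {mode}")
```

In the "dismissible" mode the connection entry control itself has no deletion control, so it stays visible regardless of the `dismissed` set.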
In some embodiments, the method further comprises: displaying a control panel of the open voice channel. The control panel may be presented in any practical manner. Displaying the control panel tells the user that the open voice channel has been opened and that other people can contact the user in real time.
In some embodiments, displaying the control panel of the open voice channel comprises: in response to determining that the number of connections of the open voice channel is greater than a first number threshold, displaying the control panel in a folded state. Here, the first number threshold may be any non-negative number determined in advance.
It should be noted that a control panel in the folded state shows less information and occupies a smaller area, which reduces the consumption of computing resources and speeds up loading of the control panel.
In some embodiments, the control panel is displayed as a floating panel. A floating control panel remains visible whichever page the user opens, so that the user can operate it conveniently.
In some embodiments, the method comprises: responsive to detecting a trigger operation for an expansion control in the control panel in the collapsed state, the control panel in the expanded state is presented; in response to detecting a trigger operation for a fold control in the control panel in the unfolded state, the control panel in the folded state is presented.
It should be noted that the control panel can switch between a folded state, which shows less information, and an unfolded state, which shows more. The user can therefore choose the folded or unfolded state according to how much information is needed, balancing computing resources and screen area against richness of information.
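A minimal sketch of the folded and unfolded states of the control panel is shown below. The class `ControlPanel` and its method names are hypothetical; the initial state follows the rule stated earlier that the panel is shown folded when the number of connections exceeds the first number threshold.

```python
class ControlPanel:
    """Hypothetical state holder for the open voice channel control panel."""

    def __init__(self, connection_count: int, first_threshold: int):
        # Folded initially when the connection count exceeds the threshold.
        self.folded = connection_count > first_threshold

    def on_expand_clicked(self) -> None:
        # Trigger operation on the expansion control of a folded panel.
        if self.folded:
            self.folded = False

    def on_fold_clicked(self) -> None:
        # Trigger operation on the fold control of an unfolded panel.
        if not self.folded:
            self.folded = True
```

Each handler is a no-op when the panel is already in the target state, so repeated clicks do not toggle the panel back.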
In some embodiments, the control panel comprises at least one of: the microphone turns off the control and the voice channel shares the control.
In some embodiments, the control panel may include at least one of, but is not limited to: the leave control can be clicked by the user to quickly leave the current channel; the microphone is closed, and the microphone is closed; setting a control, and rapidly switching the audio equipment; returning to the chat group control, namely returning to the corresponding initiation group; looking at the current audio channel on-line people list: a person only displayed on the audio channel will display the microphone disabled state and the current speaker state.
It should be noted that, when the microphone is in the off state, the speaking forbidden can be temporarily canceled by additionally pressing a preset shortcut key (for example, ctrl+shift+v); any user can share his own joining voice channel.
A control panel in a folded state, which may include at least one of the following; the microphone turns off the control, sets the control, displays the current speaker (avatar + name), and the user leaving the channel control can leave the current channel quickly after clicking.
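The folded and unfolded panel behaviour described above can be sketched as a small state holder. This is only an illustrative sketch: the class name `ControlPanel`, the item labels, and the threshold value `5` are assumptions introduced here, not details given in the disclosure.

```python
from dataclasses import dataclass

# Hypothetical value; the text only requires a non-negative number fixed in advance.
FIRST_NUMBER_THRESHOLD = 5

# Illustrative labels for the controls listed in the text.
UNFOLDED_ITEMS = ["leave", "microphone_off", "settings",
                  "return_to_chat_group", "online_user_list"]
FOLDED_ITEMS = ["microphone_off", "settings", "current_speaker", "leave"]


@dataclass
class ControlPanel:
    folded: bool

    @classmethod
    def for_connection_count(cls, connection_count: int) -> "ControlPanel":
        # Start folded when the channel has more connections than the threshold.
        return cls(folded=connection_count > FIRST_NUMBER_THRESHOLD)

    def trigger_expand(self) -> None:
        # Trigger operation on the expansion control of the folded panel.
        self.folded = False

    def trigger_fold(self) -> None:
        # Trigger operation on the fold control of the unfolded panel.
        self.folded = True

    def visible_items(self) -> list:
        # The folded state shows fewer items than the unfolded state.
        return list(FOLDED_ITEMS if self.folded else UNFOLDED_ITEMS)
```

A panel created for a busy channel starts folded and shows the richer item list only after the user triggers the expansion control.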
In some embodiments, the method further comprises: in response to detecting triggering operation for the voice channel sharing control, displaying a sharing object selection control, wherein the sharing object selection control is used for selecting a sharing object; and determining a sharing object according to the triggering operation of the sharing object selection control, and sending voice entrance information to the client of the instant messaging application of the selected sharing object, wherein the voice entrance information is used for representing the link address of the voice channel.
Optionally, the voice entrance information is presented in the form of a voice channel invitation card.
In some embodiments, the sharing object sends the voice channel invitation card to a third object to cause the third object to connect to the open voice channel through the voice channel invitation card.
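As a rough illustration of the sharing flow above, the voice entrance information can be modelled as a card that carries the link address of the voice channel and can be passed on unchanged to a third object. The names `VoiceChannelInvitationCard`, `share_channel`, and `forward_card`, and the example link format, are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VoiceChannelInvitationCard:
    channel_link: str  # link address of the voice channel
    shared_by: str     # identifier of the user who sent this card


def share_channel(channel_link: str, sharer: str, recipients: list) -> dict:
    # One card per selected sharing object, keyed by recipient identifier.
    return {r: VoiceChannelInvitationCard(channel_link, sharer) for r in recipients}


def forward_card(card: VoiceChannelInvitationCard, forwarder: str) -> VoiceChannelInvitationCard:
    # A sharing object passes the same channel link on to a third object.
    return VoiceChannelInvitationCard(card.channel_link, forwarder)
```

Because the forwarded card keeps the original link address, the third object connects to the same open voice channel as the sharing object.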
In some embodiments, the control panel includes a list of user identifications indicating users connected to the open voice channel.
In some embodiments, the number of associated users connected to the open voice channel is less than a second number threshold.
It should be noted that, by controlling the number of associated users connected to the open voice channel, it is possible to ensure that the audio stream data in the open voice channel is not excessively large, so that the pressure of the server can be reduced.
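A minimal sketch of such a limit follows, assuming the server simply refuses a connection that would bring the number of associated users up to the second number threshold. The class name and the default value `50` are assumptions for illustration.

```python
class OpenVoiceChannel:
    def __init__(self, second_number_threshold: int = 50):
        # Hypothetical default; the text does not give a value.
        self.second_number_threshold = second_number_threshold
        self.associated_users = set()

    def try_connect(self, user_id: str) -> bool:
        # An already connected user stays connected.
        if user_id in self.associated_users:
            return True
        # Keep the connected count strictly below the threshold.
        if len(self.associated_users) + 1 >= self.second_number_threshold:
            return False
        self.associated_users.add(user_id)
        return True
```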
In some embodiments, the open voice channel establishment request further includes a device identification; the server determines whether the device identification is already participating in a voice channel and, in response to determining that the device identification is not participating in a voice channel, establishes an open voice channel for the device identification.
In some embodiments, the server determines not to establish an open voice channel for the device identification in response to determining that the device identification has participated in the voice channel.
It should be noted that, through the verification based on the device identification, a single user using a single device participates in only one open voice channel, which avoids the confusion that may arise when the user is connected to a plurality of open voice channels at the same time.
In some embodiments, the server determines, in response to receiving a connection request sent by an associated user, whether the number of connections of the open voice channel is greater than a third number threshold; and, in response to determining that the number of connections of the open voice channel is greater than the third number threshold, connects the associated user who sent the connection request to the open voice channel with the microphone turned off.
It should be noted that, when the number of connected users is greater than the preset third number threshold (for example, greater than 10), the associated user connects to the open voice channel with the microphone muted. Speaking permission is opened to the user only after the server has adjusted the data stream, which reduces the computing pressure on the server when users connect to the open voice channel.
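The device-identification check can be sketched as a server-side registry that refuses a second channel for a device that is already participating in one. The names `ChannelRegistry`, `handle_establish_request`, and `release`, and the channel id format, are assumptions; in particular, the release step is not described in the text and is included only so the sketch is complete.

```python
class ChannelRegistry:
    def __init__(self):
        self._device_to_channel = {}  # device identification -> channel id
        self._next_id = 1

    def handle_establish_request(self, user_id: str, device_id: str):
        # Refuse if this device identification already participates in a channel.
        if device_id in self._device_to_channel:
            return None
        channel_id = f"channel-{self._next_id}"
        self._next_id += 1
        self._device_to_channel[device_id] = channel_id
        return channel_id

    def release(self, device_id: str) -> None:
        # Assumed cleanup step when the device leaves its channel.
        self._device_to_channel.pop(device_id, None)
```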
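The muted-join rule can be illustrated with a short function. The threshold `10` is the example value from the text; the function name `connect` and the dictionary layout are assumptions made for this sketch.

```python
THIRD_NUMBER_THRESHOLD = 10  # example value mentioned in the text


def connect(channel_members: dict, user_id: str) -> bool:
    """Connect a user and return whether the microphone starts on."""
    # The check uses the number of connections at the time of the request.
    muted = len(channel_members) > THIRD_NUMBER_THRESHOLD
    channel_members[user_id] = {"microphone_on": not muted}
    return not muted
```

With this rule the first eleven users join with the microphone on, and any later user joins muted until speaking permission is opened.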
In some embodiments, a user of the instant messaging client initiates a voice channel establishment request (the user may be referred to as an initiator) by triggering a preset control or the like, the client of the initiator sends the voice channel establishment request to the server, the server establishes a voice channel between the client and the server after receiving the voice channel establishment request, and when the voice channel is established, the client of the initiator can push audio stream data collected by the client to the server, so that the server can push an audio stream of the initiator to the client of the receiver in time after connecting the client of the receiver to the voice channel, thereby reducing delay.
In an embodiment, the client of the initiator may send the audio stream to the server after the voice channel is established successfully, or may send the audio stream to the server when the client of the initiator is in a specific mode such as non-mute.
In an embodiment, the client of the initiator and the server have already established a voice channel (referred to as the first voice channel). When the receiver needs to communicate by voice with the initiator, the server therefore only needs to establish a voice channel between the receiver and the server (referred to as the second voice channel). By associating the first voice channel with the second voice channel, voice interaction between the initiator and the receiver is achieved without the initiator remaining in a state of waiting for the receiver to accept a voice invitation. (In the related art, if the receiver does not accept the voice invitation, the initiator cannot establish an effective voice channel, that is, cannot send audio stream data to the server, which is equivalent to always being in a to-be-connected state.) The voice channel establishment process between the initiator and the server is thus decoupled from that of the receiver: a long time may pass between the initiating operation of the initiator and the joining operation of the receiver without occupying excessive server resources or causing excessive disturbance to the receiver, and the initiator can remain in a state in which voice communication is possible. Compared with the related art, the above technical scheme is more closely integrated with the service, is more practical for instant messaging (IM) communication scenarios, and avoids unnecessary occupation of communication resources.
In one embodiment, when a voice channel control is displayed on an associated page of a single user, the voice channel control corresponds to a user identification of the single user; in response to a triggering operation on the voice channel control, an open voice channel establishment request is sent to the server, wherein the establishment request carries the user identification or device identification information of the single user. The associated page of the single user includes, but is not limited to, a preset area of a client page of the single user, a personal information identification page of the single user, a setting page, and the like.
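The decoupling of the two channel establishment steps can be sketched as follows. This is a simplified in-memory model under stated assumptions: the class `VoiceServer`, its method names, and the use of Python lists as receiver inboxes are all hypothetical, and only the initiator-to-receiver direction is shown.

```python
class VoiceServer:
    def __init__(self):
        # initiator id -> {receiver id: inbox of forwarded frames}
        self._second_channels = {}

    def establish_first_channel(self, initiator: str) -> None:
        # First voice channel: initiator <-> server, no receiver required.
        self._second_channels[initiator] = {}

    def push_audio(self, initiator: str, frame: bytes) -> int:
        # The initiator may push audio even when nobody has joined yet.
        inboxes = self._second_channels[initiator]
        for inbox in inboxes.values():
            inbox.append(frame)
        return len(inboxes)  # number of receivers the frame was forwarded to

    def establish_second_channel(self, receiver: str, initiator: str) -> list:
        # Second voice channel: receiver <-> server, associated with the first.
        inbox = []
        self._second_channels[initiator][receiver] = inbox
        return inbox
```

The initiator's stream is already flowing to the server when the receiver joins, so the server can start forwarding with the next frame rather than waiting for a fresh call setup.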
In an embodiment, when a voice channel control is displayed on an associated page of a group user, in response to a triggering operation on the voice channel control, the client may send an open voice channel establishment request to the server, where the establishment request carries a group identifier or an identifier of a group member in a predetermined range. Wherein, the associated page of the group user comprises but is not limited to a group chat page, a group setting page and the like. The group users in the preset range can be users screened by a voice channel initiator through selection or preset and the like, or can be users related to the topic (such as users participating in the topic) determined based on the preset topic selected by the initiator, so that the user range is intelligently reduced, clients of the users in the range can generate connection entry controls corresponding to voice channels, the users in the range can communicate quickly, and the interference to other users is avoided while the communication efficiency and the information confidentiality are improved.
In an embodiment, after the initiator establishes the voice channel, the connection entry control may be displayed in the range of the receiver of the voice channel, for example, the connection entry control is displayed in the associated page of the client of each receiver, so that the receiver can access the voice channel of the initiator at any time, thereby improving the communication efficiency.
In an embodiment, when the receiver is a single user, the association page may include a single chat page of the initiator and the receiver, or a preset display range of information flows of the initiator and the receiver in an information flow list page (for example, chat cards of the initiator and the receiver in an information flow list).
In one embodiment, more than one connection entry control may be displayed on the associated page.
In an embodiment, the voice channel connection entry control of the group user can be displayed in the associated display area (such as the area near the head of the group user in the chat page, the area near the speech message of the group user, or the area near the group user identifier in the group setting page) of the group user with the personal voice channel opened, so that voice communication with the group member can be initiated conveniently and quickly, and simultaneously, the voice communication and the image-text communication can be switched quickly, so that the communication efficiency is improved.
In an embodiment, a connection entry control of a group voice channel can be displayed in a preset area of the group chat page, so that a group member can conveniently open the voice channel of the group, and the group member can quickly communicate with the group member by triggering the connection entry control of the voice channel.
In an embodiment, the voice channel connection entry controls of the users in the preset range can be displayed in an aggregation manner on the associated pages, for example, connection entry controls of all individual users who have opened the individual voice channels are displayed in an aggregation manner, the connection entry controls can be displayed corresponding to the identifications of the user head portraits of the individual users, and for example, connection entry controls of the individual users who have opened the individual voice channels in the group are displayed in a preset page of the group or in a preset area of the page.
In some embodiments, displaying the connection entry control on the associated page may include silently displaying the connection entry control within a preset period of time, thereby reducing disturbance. For example, the control is displayed silently throughout the entire process of displaying the connection entry control. For another example, the connection entry control starts to be displayed silently once a non-silent display parameter meets a preset condition, so that the control still serves as a light prompt while disturbing the user less. The non-silent display parameter may include, but is not limited to, the number of rings, the ringing duration, or the duration of graphic display, counted from a preset point in time (e.g., the time at which the connection entry control starts to be displayed).
By silently displaying the connection entry control of the voice channel, an associated object of the initiator can conveniently trigger the control so that a voice channel between the initiator and the associated object is established, while disturbance to both parties is avoided. (In the related art, when the initiator of a voice communication sends a voice invitation and the receiver fails to accept it, the client of the receiver remains in a normal reminding state such as ringing.) In addition, silent display does not require the user to separately set the user equipment to a silent mode or the like, which simplifies user operation.
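The two silent-display variants can be sketched as a small policy object. The class name `SilentDisplayPolicy` and the limits of 3 rings and 10 seconds are illustrative assumptions; the text does not specify concrete preset conditions.

```python
from dataclasses import dataclass


@dataclass
class SilentDisplayPolicy:
    always_silent: bool = False       # variant 1: silent for the whole display
    max_rings: int = 3                # hypothetical preset condition
    max_ring_seconds: float = 10.0    # hypothetical preset condition

    def is_silent(self, rings_so_far: int, seconds_since_first_display: float) -> bool:
        if self.always_silent:
            return True
        # Variant 2: switch to silent display once a non-silent display
        # parameter, counted from the start of display, meets its condition.
        return (rings_so_far >= self.max_rings
                or seconds_since_first_display >= self.max_ring_seconds)
```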
After the open voice channel is established, the server may feed back feedback information that the open voice channel is successfully established to the first client used by the first object a. The client used by the first object may display the identifier corresponding to the open voice channel. The first client may send an audio stream to the server.
The second object (here, the second object may be one user, or may be more than two users) may actively or passively learn the open voice channel.
In some embodiments, the second object may actively learn of the open voice channel. Specifically, the second object can learn of the open voice channel by viewing the information of the first object or by establishing a one-to-one instant messaging connection with the first object. That is, the second object actively finds out about the open voice channel corresponding to the first object.
As an example, the second object may learn the above-mentioned open voice channel of the first object by looking at the identity of the first object in its contact list, by browsing to a connection entry control provided at a preset location of the first object identity. The connection entry control may be, for example, a generic identifier of an open voice channel (where the identifier may be any character, graphic, etc.).
As another example, the second object may learn the above-described open voice channel of the first object by opening a personal page of the first object, by browsing to a connection entry control provided at a preset location of the personal page of the first object.
As yet another example, the connection entry control is not displayed at the preset position of the identifier of the first object in the contact list of the second object, but may be displayed at the top of the instant messaging window when the second object initiates one-to-one instant messaging with the first object. In some application scenarios, the width of the display area of the connection entry control is approximately equal to the width of the instant messaging window. The color of the connection entry control may be different from the background color of the instant messaging window.
The second object may join the open voice channel. After the second object joins the open voice channel, the server may implement voice interactive transmission between the first object and the second object. When the second objects are two users, the server can also realize the voice interactive transmission between the second objects.
In some other embodiments, the second object may passively learn of the open voice channel corresponding to the first object. That is, the second object learns of the open voice channel when the first object actively sends it information sharing the open voice channel.
In some application scenarios, the first object A may issue, in the client it uses, an instruction to share the open voice channel with a single target associated object B (here, the second object is a single object). For example, the first object A may select, in the client it uses, one target associated object from multiple associated objects that have an association relationship with the first object, as the sharing target associated user.
According to the instruction of the user, the client used by the first object a may send, through the server, information for sharing the open voice channel to the sharing target associated user.
The information for sharing the open voice channel includes a connection entry control of the open voice channel. The second client used by the sharing target associated user B may display information for sharing the open voice channel.
Displaying the information sharing the open voice channel includes displaying a connection entry control in the session window of the first object and the second object displayed in the second client. For example, the connection entry control is displayed in the session window as a single message. The connection entry control may be, for example, an audio channel card. The audio channel card may display an identification of the first object (e.g., an avatar of the first object) and an identification of the open audio channel.
For another example, a first client used by the first object may send the connection entry control to a second client used by the second object via the server. The connection entry control may also be displayed at a preset position in the current display page of the second client. The preset position may be, for example, the top of the page currently displayed by the second client. The current display page may be any page.
For another example, a first client used by the first object A may send the connection entry control to a second client used by the second object via the server. The connection entry control may be displayed at a preset location of the user identifier of the first object A displayed by the second client, or may be displayed in the personal page of the first object A displayed by the second client.
The second object B may establish audio communication with the first object A by clicking on the connection entry control. The server pushes the audio stream of the first object A to the second object B and pushes the audio stream of the second object B to the first object A.
(1) An embodiment when the recipient is a single user.
After the open voice channel corresponding to the first object a is established, the server may feed back feedback information that the open voice channel is successfully established to the first client used by the first object a. The client used by the first object a may display the identifier corresponding to the open voice channel. The first client may send an audio stream to the server.
The first object A may issue, in the client it uses, an instruction to share the open voice channel with a single target associated object B (here, it is assumed that the second object is a single object). For example, the first object A may select, in the client it uses, one target associated object from multiple associated objects that have an association relationship with the first object, as the sharing target associated user.
According to the instruction of the user, the client used by the first object a may send, through the server, information for sharing the open voice channel to the sharing target associated user.
The information for sharing the open voice channel includes connection entry control information of the open voice channel. The second client used by the sharing target associated user B may display information for sharing the open voice channel.
In some application scenarios, displaying information sharing the open voice channel includes displaying a connection entry control in a session window of the first object and the second object displayed in the second client. For example, the connection entry control is displayed in the session window as a single message. The connection entry control may be, for example, an audio channel card. The audio channel card may display an identification of the first object (e.g., a head portrait of the first object) and an identification of the open audio channel.
In some other application scenarios, the first client used by the first object may send the connection entry control to the second client used by the second object through the server. The connection entry control may also be displayed at a preset position in the current display page of the second client. The preset position may be, for example, the top of the current display page of the second client. The current display page may be any page.
In still other application scenarios, a first client used by the first object A may send the connection entry control to a second client used by the second object via the server. The connection entry control may be displayed at a preset location of the user identifier of the first object A displayed by the second client, or may be displayed in the personal page of the first object A displayed by the second client.
The second object B may trigger the connection entry control by means of click touch, gesture touch, voice control, etc. to establish audio communication with the first object A. The server pushes the audio stream of the first object A to the second object B and pushes the audio stream of the second object B to the first object A.
After the server receives the connection request sent by the client of the second object B based on the trigger, the server may push the audio stream sent by the client of the first object A, which is connected to the open voice channel, to the client of the second object B, so as to implement interaction between the users connected to the open voice channel.
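The server-side forwarding between connected users can be sketched as below. This is only a minimal in-memory model: the class `ChannelForwarder`, its methods, and the representation of each client as a list of received `(sender, frame)` pairs are assumptions for illustration.

```python
class ChannelForwarder:
    def __init__(self):
        self._inboxes = {}  # user id -> list of (sender, frame) received

    def connect(self, user_id: str) -> list:
        # Called when the server accepts a connection request for the channel.
        inbox = []
        self._inboxes[user_id] = inbox
        return inbox

    def push(self, sender: str, frame: bytes) -> None:
        if sender not in self._inboxes:
            raise ValueError(f"{sender} is not connected to the channel")
        # Forward the sender's audio to every other connected user.
        for user_id, inbox in self._inboxes.items():
            if user_id != sender:
                inbox.append((sender, frame))
```

With A and B connected, a frame pushed by either side appears only in the other side's inbox, which matches the two-way push described above.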
(2) An embodiment when the recipient is a plurality of users.
The first object may issue, in the first client it uses, an instruction to share the open voice channel with multiple target associated objects. For example, the first object may select, in the client it uses, at least two target associated objects from multiple associated objects that have an association relationship with the first object, as sharing target associated users (here, the second object is multiple objects).
According to the instruction of the user, the client used by the first object a may send, to the at least two sharing target associated users, for example, the user C and the user D, information for sharing the open voice channel through the server.
The information for sharing the open voice channel includes connection entry control information of the open voice channel. Information for sharing the open voice channel can be shown in the used second client of each sharing target associated user (second object).
In some application scenarios, the first client used by the first object may send, through the server, the connection entry control to the session window of the first object and the respective sharing target associated user in the second client used by each of the at least two sharing target associated users. The connection entry control may include an audio channel card. In these application scenarios, the connection entry control may be displayed in the session window in the second client in the form of a single message. The audio channel card may display an identification of the first object (e.g., an avatar of the first object) and an identification of the open audio channel.
In other application scenarios, the first client used by the first object a may send, through the server, the connection entry control information to the second client used by each sharing target associated user. Optionally, the connection entry control may also be displayed at a preset position in the current display page of each of the second clients. The preset position may be, for example, the top of the display page currently displayed by the second client.
And each sharing target associated user can realize audio communication with the first object or realize audio interactive communication with the first object and other sharing target associated users by clicking the connection entry control.
In the related art, when an initiator sends an audio invitation to a receiver, the audio invitation is displayed full-screen on the current page of the receiver's client or in a floating window above the current page, and the receiver establishes voice communication with the initiator by clicking an accept option displayed on the audio invitation. Such an audio invitation affects the work the receiver is currently performing and disturbs the receiver. In addition, the receiver must either click the accept option to establish voice communication with the initiator or click the reject option to end the voice invitation. If the receiver operates by mistake, a result contrary to the receiver's true intention may occur. In the present application, the initiator sends the connection entry control information to the receiver through the server, and the connection entry control can be displayed as a single message in the session window of the initiator and the receiver, or at a preset position of the current window of the client, which reduces the impact on the work being performed by the receiver. In addition, the receiver establishes voice communication with the initiator by triggering the entry control, for example by clicking, which reduces mistaken acceptance or mistaken rejection caused by misoperation.
(3) Embodiments when the recipient is a group
The first object A may issue, in the first client it uses, an instruction to share the open voice channel with a group to which the first object belongs. For example, the first object may select at least one group, from the plurality of joined groups displayed in the client it uses, as the sharing target object. The sharing target object may be all users included in the selected group, that is, the second object (here, the second object includes at least one object). In some application scenarios, the first object may also select only some of the users in the selected group as the sharing target object, that is, the second object.
According to the instruction of the user, the client used by the first object can send information for sharing the open voice channel to the sharing target object through the server. In practice, the server may send the information for sharing the open voice channel to the second client used by each user located in the sharing target object.
The information for sharing the open voice channel includes connection entry control information of the open voice channel. The information for sharing the open voice channel can be displayed in the second client used by each user in the sharing target object.
In some application scenarios, the first client used by the first object may send the connection entry control through a server to the session window of the group in the second client used by each user located in the sharing target object. The connection entry controls may include audio channel cards, or button controls, etc. In these application scenarios, the connection entry control may be displayed in the form of a single message in a group session window displayed by the client sharing the target object.
In some other application scenarios, the first client used by the first object may send the connection entry control to the second client used by each user located in the sharing target object through a server. Optionally, the connection entry control may be displayed at a preset position in the current display page of each of the second clients. The preset position may be, for example, the top of the current display page of the second client. The current display page may be, for example, the group session page.
Each user in the group can trigger the connection entry control in a second client used by the user by clicking or the like, so that audio communication with other users in the sharing target object is realized.
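Resolving the sharing target object for a group can be sketched as a simple set computation. The function name `resolve_share_targets` and its parameters are hypothetical, and excluding the sharer from the targets is an assumption of this sketch rather than something the text states.

```python
def resolve_share_targets(groups: dict, selected_group_ids: list,
                          sharer: str, only_members=None) -> list:
    """Return the users whose second clients should receive the entry control."""
    targets = set()
    for group_id in selected_group_ids:
        targets.update(groups[group_id])   # all users of each selected group
    if only_members is not None:
        targets &= set(only_members)       # optional subset chosen by the sharer
    targets.discard(sharer)                # assumption: the sharer needs no entry
    return sorted(targets)
```

The server would then send the information sharing the open voice channel to the second client of each returned user.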
(4) Embodiments when the recipient is a topic-related user in the group
In some application scenarios, a group may include at least one topic group. In these application scenarios, the first object may issue, in the client it uses, an instruction to share the open voice channel with at least one topic group in the group. For example, among the topic groups G1, G2, and G3 of a group that the first object A has joined, as displayed in the client it uses, the first object A may select each user in at least one topic group (e.g., G1) as the sharing target object. The sharing target object is the second object.
According to the instruction of the user, the first client used by the first object can send information for sharing the open voice channel to at least one other user corresponding to the topic group through the server. In practice, the server may send the information sharing the open voice channel to a second client used by each user located in the at least one topic group.
The information for sharing the open voice channel includes connection entry control information of the open voice channel. Information sharing the open voice channel may be presented in a second client used by each user in the topic group.
In some application scenarios, the first client used by the first object may send the connection entry control information through a server to the topic grouping session window in the second client used by each user located in the topic grouping. The connection entry control may include an audio channel card. In these application scenarios, the connection entry control described above may be displayed in the topic grouping session window in the form of a single message.
In some other application scenarios, the first client used by the first object may send the connection entry control to the second client used by each user located in the topic group through a server. The connection entry control may be displayed at a preset position in the current display page of each of the second clients. The preset position may be, for example, the top of the current display page of the second client.
Each user located in the topic group can communicate with other users in the group through an established audio channel by clicking on the connection entry control in the second client that he uses.
With further reference to Fig. 2, as an implementation of the method shown in the foregoing figures, the present disclosure provides some embodiments of a voice interaction apparatus. The apparatus embodiments correspond to the method embodiments shown in Fig. 1, and the apparatus is applicable to various electronic devices.
As shown in Fig. 2, the voice interaction apparatus of the present embodiment includes a sending unit 201. The sending unit is configured to send, in response to detecting a trigger operation on a preset voice channel open control in an instant messaging application, an open voice channel establishment request to a server, so that the server performs the following step: in response to determining to establish the open voice channel corresponding to the open voice channel establishment request, establishing the open voice channel to receive an audio stream of a first object initiating the open voice channel establishment request.
In this embodiment, the specific processing of the sending unit 201 of the voice interaction device and the technical effects thereof may refer to the description of step 101 in the corresponding embodiment of fig. 1, and are not repeated here.
In some embodiments, the server establishes a voice channel between the associated user and the first object to forward an audio stream of one of the associated user and the first object to the other of the associated user and the first object in response to receiving an attach request for the open voice channel by the associated user.
In some embodiments, in response to determining to establish the open voice channel, the server sends a connection entry establishment instruction to the associated object of the first object, where the connection entry establishment instruction is used to display a connection entry control on an associated page of a second client on which the associated object is logged in.
In some embodiments, the connection entry control is displayed silently at the second client within a preset period of time.
Silently displaying the connection entry control within the preset period of time includes either of the following:
displaying the control silently throughout the process of displaying the connection entry control;
starting to display the connection entry control silently in response to the value of a preset non-silent display parameter meeting a preset condition.
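As a non-limiting illustration, the two silent display options above might be sketched as follows. The disclosure does not say what the non-silent display parameter is; this sketch assumes it is the elapsed non-silent display time, and the names `SilentMode`, `is_silent`, and `max_non_silent_seconds` are hypothetical.

```python
from enum import Enum


class SilentMode(Enum):
    """Hypothetical names for the two silent display options."""
    ALWAYS_SILENT = 1            # silent for the whole display process
    SILENT_AFTER_CONDITION = 2   # silent once a preset condition is met


def is_silent(mode, non_silent_seconds, max_non_silent_seconds=5.0):
    """Return True if the connection entry control should be shown silently.

    Assumption: the non-silent display parameter is the time (in seconds)
    the control has already been displayed non-silently, and the preset
    condition is that this time reaches max_non_silent_seconds.
    """
    if mode is SilentMode.ALWAYS_SILENT:
        return True
    return non_silent_seconds >= max_non_silent_seconds
```

Under these assumptions, a client would call `is_silent` each time it refreshes the control and suppress any sound or vibration prompt once the function returns `True`.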
The second client stops displaying the connection entry control in response to determining that the first object closes the open voice channel.
The server sends a direct connection notification to an associated object of a first object in response to determining to establish the open voice channel, wherein the direct connection notification is to indicate that the associated object is to connect to the open voice channel without confirmation by the first object.
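The server-side behavior described above (establishing the open voice channel, issuing connection entry establishment instructions, and connecting an associated user without confirmation by the first object) might be sketched as follows. This is a minimal in-memory illustration under stated assumptions, not the disclosed implementation; `VoiceServer`, `OpenVoiceChannel`, the message fields, and the `associations` mapping are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class OpenVoiceChannel:
    """Hypothetical server-side record of one open voice channel."""
    owner_id: str                               # the first object
    members: set = field(default_factory=set)   # users currently connected


class VoiceServer:
    """Illustrative in-memory server; every name here is an assumption."""

    def __init__(self, associations):
        # Maps a user id to the ids of that user's associated objects.
        self.associations = associations
        self.channels = {}

    def handle_establish_request(self, first_object_id):
        """Create the channel and return one connection entry
        establishment instruction per associated object."""
        channel = OpenVoiceChannel(owner_id=first_object_id)
        channel.members.add(first_object_id)
        self.channels[first_object_id] = channel
        return [
            {"type": "connection_entry_establishment",
             "to": peer,
             "channel_owner": first_object_id}
            for peer in sorted(self.associations.get(first_object_id, ()))
        ]

    def handle_attach_request(self, channel_owner_id, user_id):
        """Connect an associated user directly, with no confirmation
        step by the first object. Returns False if the channel does not
        exist or the user is not an associated object."""
        channel = self.channels.get(channel_owner_id)
        if channel is None:
            return False
        if user_id not in self.associations.get(channel_owner_id, ()):
            return False
        channel.members.add(user_id)
        return True
```

Audio forwarding between connected members is omitted; the sketch only tracks who is connected to which channel.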
In some embodiments, the voice channel opening control comprises a personal voice channel opening control, wherein the personal voice channel opening control corresponds to a personal user identification.
In some embodiments, sending the open voice channel establishment request to the server in response to detecting the triggering operation on the preset voice channel opening control in the instant messaging application includes: in response to a triggering operation on the personal voice channel opening control, sending the open voice channel establishment request to the server, where the establishment request carries the user identification of the single user sending the open voice channel establishment request.
In some embodiments, the voice channel opening control includes a group voice channel opening control.
In some embodiments, the voice channel opening control is displayed on an associated page of a group user, and sending the open voice channel establishment request to the server in response to detecting the triggering operation on the preset voice channel opening control in the instant messaging application includes: in response to a triggering operation on the voice channel opening control, sending the open voice channel establishment request to the server, where the establishment request carries a group identifier, or the identifiers of group members within a preset range in the group represented by the group identifier, and the group is the group displayed in association with the voice channel opening control.
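The difference between the request sent from a personal control and the request sent from a group control might be illustrated as follows. The field names and the `build_establish_request` helper are hypothetical, and including the sender's user identification in group requests is an assumption of this sketch.

```python
def build_establish_request(user_id, group_id=None, member_ids=None):
    """Build a hypothetical open voice channel establishment request.

    - Personal control: the request carries only the sender's user id.
    - Group control: the request carries either the group identifier or
      the identifiers of the group members within a preset range.
    """
    request = {"type": "open_voice_channel_establish", "user_id": user_id}
    if member_ids is not None:
        # Members within a preset range of the group.
        request["member_ids"] = list(member_ids)
    elif group_id is not None:
        # The whole group, identified by its group identifier.
        request["group_id"] = group_id
    return request
```

For example, `build_establish_request("u1")` models a personal channel, while `build_establish_request("u1", group_id="g7")` models a group channel.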
In some embodiments, the associated object is all or part of the users who have previously established a friend relationship with the first object.
In some embodiments, the portion of users is determined by the first object selection.
In some embodiments, the apparatus is further configured to: display an associated user selection control; and send, according to a triggering operation on the associated user selection control, the user identification of the selected associated user to the server.
In some embodiments, displaying the associated user selection control includes: displaying the associated user selection control in response to detecting the triggering operation on the voice channel opening control.
In some embodiments, the association object comprises at least one of: individual users, groups, subgroups of groups, wherein the subgroups of groups are established based on a preset theme.
In some embodiments, the associated page includes at least one connection entry control.
The at least one connection entry control included in the associated page includes at least one of the following: a voice channel connection entry control of a group user, displayed in the group chat page in the display area associated with the group user who has opened a personal voice channel; a connection entry control of the group voice channel, displayed in a preset area of the group chat page; voice channel connection entry controls of users within a predetermined range, displayed in aggregate on the associated page.
In some embodiments, the voice channel connection entry controls of users within the predetermined range are displayed in aggregate on the associated page, wherein: the predetermined range includes all friends of the user, or pre-selected individual users and/or group users; and the aggregate display includes displaying the user identifications within the predetermined range and the corresponding voice channel connection entry controls.
In some embodiments, the second client logged in by the association object displays the connection entry control at a preset position of the association page.
In some embodiments, the preset location includes a top of the associated page.
In some embodiments, the association page includes a preset display range corresponding to the first object and the associated object in a single chat page of the associated object and/or an information flow list page of the instant messaging application.
In some embodiments, the apparatus is further configured for:
in response to the information to be displayed at the preset position including the connection entry control and the number of pieces of information to be displayed being at least two, acquiring the priority of each piece of information to be displayed, and processing each piece of information to be displayed in a preset display mode according to the priority.
In some embodiments, the preset display mode includes at least one of the following: sorting the pieces of information to be displayed in descending order of priority and displaying them in that order; displaying the connection entry control with the highest priority and hiding the other information to be displayed; displaying each piece of information to be displayed together with a corresponding deletion control for each piece other than the connection entry control, and, in response to detecting a triggering operation on a deletion control, stopping the display of the information corresponding to the triggered deletion control.
In some embodiments, the connection entry control has a higher priority than other information to be displayed.
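The three preset display modes might be sketched as follows. This is only an illustration: `DisplayItem`, the convention that a larger number means a higher priority, and the function names are assumptions not taken from the disclosure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DisplayItem:
    """Hypothetical piece of information competing for the preset position."""
    label: str
    priority: int                     # assumption: larger means higher priority
    is_connection_entry: bool = False


def sort_by_priority(items):
    """Mode 1: display everything, highest priority first."""
    return sorted(items, key=lambda item: item.priority, reverse=True)


def show_only_connection_entry(items):
    """Mode 2: display the highest-priority connection entry control
    and hide all other information."""
    entries = [item for item in items if item.is_connection_entry]
    return [max(entries, key=lambda item: item.priority)] if entries else []


def delete_item(items, label):
    """Mode 3: everything is displayed, and a deletion control exists only
    for items that are not connection entry controls. Triggering it stops
    the display of that item; connection entry controls cannot be deleted."""
    return [item for item in items
            if item.is_connection_entry or item.label != label]
```

Giving the connection entry control the largest priority value reproduces the embodiment in which it outranks the other information to be displayed.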
In some embodiments, the apparatus is further configured for: displaying a control panel of the open voice channel at the instant messaging client of an object that has joined the voice channel.
In some embodiments, displaying the control panel of the open voice channel comprises:
displaying the control panel in a folded state in response to determining that the number of connections of the open voice channel is greater than a first number threshold.
In some embodiments, the control panel is displayed in a floating manner.
In some embodiments, the apparatus is further configured for: in response to detecting a triggering operation on an unfold control in the control panel in the folded state, presenting the control panel in the unfolded state; and in response to detecting a triggering operation on a fold control in the control panel in the unfolded state, presenting the control panel in the folded state.
In some embodiments, the control panel comprises at least one of: a microphone-off control and a voice channel sharing control.
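The folded and unfolded states of the control panel might be modeled as a small state holder, as sketched below. The class name, the default threshold of 5, and the choice to start unfolded when the threshold is not exceeded are assumptions for illustration only.

```python
class ControlPanel:
    """Hypothetical client-side state of the open voice channel control panel."""

    def __init__(self, connection_count, first_threshold=5):
        # The panel starts folded when the number of connections exceeds
        # the first number threshold (the default value is an assumption).
        self.folded = connection_count > first_threshold

    def on_unfold_control(self):
        """Triggering the unfold control only has an effect when folded."""
        if self.folded:
            self.folded = False

    def on_fold_control(self):
        """Triggering the fold control only has an effect when unfolded."""
        if not self.folded:
            self.folded = True
```

A client would render either the compact or the full panel depending on the `folded` attribute after each triggering operation.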
In some embodiments, the control panel includes a voice channel sharing control, and the apparatus is further configured for: in response to detecting a triggering operation on the voice channel sharing control, displaying a sharing object selection control, where the sharing object selection control is used for selecting a sharing object; and determining the sharing object according to a triggering operation on the sharing object selection control, and sending voice entry information to the instant messaging client of the selected sharing object, where the voice entry information represents the link address of the voice channel.
In some embodiments, the voice entry information is presented in the form of a voice channel invitation card.
In some embodiments, the sharing object sends the voice channel invitation card to a third object to cause the third object to connect to the open voice channel through the voice channel invitation card.
In some embodiments, the control panel includes a list of user identifications indicating users connected to the open voice channel.
In some embodiments, the number of associated users connected to the open voice channel is less than a second number threshold.
In some embodiments, the open voice channel establishment request further includes a device identification. The server determines whether the device indicated by the received device identification is already participating in a voice channel and, in response to determining that it is not, establishes an open voice channel for the device identification.
In some embodiments, in response to determining that the device indicated by the device identification is already participating in a voice channel, the server determines not to establish an open voice channel for the device identification.
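The device identification check might be sketched as follows. `ChannelRegistry` and its methods are hypothetical names, and tracking participation in an in-memory set is an assumption made for brevity.

```python
class ChannelRegistry:
    """Hypothetical server-side record of devices already in a voice channel."""

    def __init__(self):
        self._active_devices = set()

    def try_establish(self, device_id):
        """Establish an open voice channel for the device only if it is
        not already participating in one. Returns True on success."""
        if device_id in self._active_devices:
            return False   # already participating: do not establish
        self._active_devices.add(device_id)
        return True

    def release(self, device_id):
        """Mark the device as no longer participating in a voice channel."""
        self._active_devices.discard(device_id)
```

With this check, a second establishment request from the same device is rejected until the device leaves its current voice channel.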
In some embodiments, in response to receiving a connection request sent by an associated user, the server determines whether the number of connections of the open voice channel is greater than a third number threshold; and, in response to the number of connections of the open voice channel being greater than the third number threshold, connects the associated user who sent the connection request to the open voice channel with the microphone turned off.
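The microphone-off rule for crowded channels might be expressed as a single decision function, as sketched below. The function and field names are hypothetical, and connecting with the microphone on when the threshold is not exceeded is an assumption, since the text only specifies the case above the threshold.

```python
def connect_associated_user(connected_count, third_threshold):
    """Decide the initial microphone state for a newly connecting user.

    If the channel already has more connections than the third number
    threshold, the user is connected with the microphone off. Otherwise
    the microphone is assumed to start on (not specified by the text).
    """
    microphone_on = connected_count <= third_threshold
    return {"connected": True, "microphone_on": microphone_on}
```

For instance, with a threshold of 10, the 12th participant would join muted and could later turn the microphone on from the control panel.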
With further reference to fig. 3, fig. 3 illustrates an exemplary system architecture to which the voice interaction method or voice interaction device of some embodiments of the present disclosure may be applied.
As shown in fig. 3, the system architecture may include first terminal devices 301 and 302, second terminal devices 303 and 304, and a server 305. The first terminal devices 301 and 302 are connected to the server 305, and the server 305 is connected to the second terminal devices 303 and 304, through a network. The network may include various connection types, such as wired or wireless communication links or fiber optic cables.
A user may use the first terminal devices 301, 302 to interact with the server 305 over the network to receive or send messages and the like. Various communication client applications, such as an instant messaging application, may be installed on the first terminal devices 301, 302.
The first terminal devices 301, 302 and the second terminal devices 303, 304 may be hardware or software. When they are hardware, they may be various electronic devices having a display screen and supporting information presentation, including but not limited to smart phones, tablet computers, electronic book readers, laptop computers, desktop computers, and the like. When they are software, they can be installed in the electronic devices listed above, and may be implemented as a plurality of pieces of software or software modules (for example, for providing distributed services) or as a single piece of software or software module. No specific limitation is made here.
The server 305 may be a server providing various services, such as a background message processing server providing support for messages presented on the first terminal device 301, 302 and the second terminal device 303, 304. The background message processing server can forward the session message sent by the user.
It should be noted that, the voice interaction method provided by the embodiment of the present disclosure may be executed by the first terminal device and the second terminal device. Accordingly, the voice interaction device can be arranged in the first terminal equipment and the second terminal equipment.
The server may be hardware or software. When the server is hardware, the server may be implemented as a distributed server cluster formed by a plurality of servers, or may be implemented as a single server. When the server is software, it may be implemented as a plurality of software or software modules, for example, for providing distributed services, or as a single software or software module. The present invention is not particularly limited herein.
It should be understood that the number of terminal devices, networks and servers in fig. 3 is merely illustrative. There may be any number of terminal devices, networks, and servers, as desired for implementation.
Referring now to fig. 4, a schematic diagram of an electronic device (e.g., the terminal device of fig. 3) suitable for use in implementing some embodiments of the present disclosure is shown. Terminal devices in some embodiments of the present disclosure may include, but are not limited to, mobile terminals such as mobile phones, notebook computers, digital broadcast receivers, PDAs (personal digital assistants), PADs (tablet computers), PMPs (portable multimedia players), car terminals (e.g., car navigation terminals), and the like, as well as stationary terminals such as digital TVs, desktop computers, and the like. The terminal device/server illustrated in fig. 4 is merely an example, and should not impose any limitations on the functionality and scope of use of embodiments of the present disclosure.
As shown in fig. 4, the electronic device may include a processing means (e.g., a central processor, a graphics processor, etc.) 401, which may perform various appropriate actions and processes according to a program stored in a Read Only Memory (ROM) 402 or a program loaded from a storage means 408 into a Random Access Memory (RAM) 403. In the RAM 403, various programs and data necessary for the operation of the electronic device 400 are also stored. The processing device 401, the ROM 402, and the RAM 403 are connected to each other by a bus 404. An input/output (I/O) interface 405 is also connected to bus 404.
In general, the following devices may be connected to the I/O interface 405: input devices 406 including, for example, a touch screen, touchpad, keyboard, mouse, camera, microphone, accelerometer, gyroscope, etc.; an output device 407 including, for example, a Liquid Crystal Display (LCD), a speaker, a vibrator, and the like; storage 408 including, for example, magnetic tape, hard disk, memory card, etc.; and a communication device 409. The communication means 409 may allow the electronic device to communicate with other devices wirelessly or by wire to exchange data. While fig. 4 shows an electronic device having various means, it is to be understood that not all of the illustrated means are required to be implemented or provided. More or fewer devices may be implemented or provided instead. Each block shown in fig. 4 may represent one device or a plurality of devices as needed.
In particular, according to some embodiments of the present disclosure, the processes described above with reference to flowcharts may be implemented as computer software programs. For example, some embodiments of the present disclosure include a computer program product comprising a computer program embodied on a computer readable medium, the computer program comprising program code for performing the method shown in the flow chart. In such embodiments, the computer program may be downloaded and installed from a network via communications device 409, or from storage 408, or from ROM 402. The above-described functions defined in the methods of some embodiments of the present disclosure are performed when the computer program is executed by the processing device 401.
It should be noted that the computer readable medium according to some embodiments of the present disclosure may be a computer readable signal medium or a computer readable storage medium, or any combination of the two. The computer readable storage medium can be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or a combination of any of the foregoing. More specific examples of the computer-readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In some embodiments of the present disclosure, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. In some embodiments of the present disclosure, however, the computer-readable signal medium may comprise a data signal propagated in baseband or as part of a carrier wave, with the computer-readable program code embodied therein. Such a propagated data signal may take any of a variety of forms, including, but not limited to, electro-magnetic, optical, or any suitable combination of the foregoing. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. 
Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: electrical wires, fiber optic cables, RF (radio frequency), and the like, or any suitable combination of the foregoing.
In some implementations, the clients and servers may communicate using any currently known or future developed network protocol, such as HTTP (HyperText Transfer Protocol), and may be interconnected with any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include a local area network ("LAN"), a wide area network ("WAN"), an internetwork (e.g., the Internet), and peer-to-peer networks (e.g., ad hoc peer-to-peer networks), as well as any currently known or future developed networks.
The computer readable medium may be contained in the electronic device; or may exist alone without being incorporated into the electronic device. The computer readable medium carries one or more programs which, when executed by the electronic device, cause the electronic device to: in response to detecting triggering operation of a preset voice channel opening control in the instant messaging application, sending an opening voice channel establishment request to a server, so that the server executes the following steps: and in response to determining to establish the open voice channel corresponding to the open voice channel establishment request, establishing the open voice channel to receive an audio stream of a first object initiating the open voice channel establishment request.
Computer program code for carrying out operations of some embodiments of the present disclosure may be written in one or more programming languages, including object-oriented programming languages such as Java, Smalltalk, and C++, and conventional procedural programming languages such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any kind of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or may be connected to an external computer (for example, through the Internet using an Internet service provider).
The flowcharts and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The units described in some embodiments of the present disclosure may be implemented by means of software, or may be implemented by means of hardware. In some cases, the names of the units do not constitute a limitation on the units themselves; for example, an aggregate unit may also be described as "a unit that aggregates session messages in an aggregate queue".
The functions described above herein may be performed, at least in part, by one or more hardware logic components. For example, without limitation, exemplary types of hardware logic components that may be used include: a Field Programmable Gate Array (FPGA), an Application Specific Integrated Circuit (ASIC), an Application Specific Standard Product (ASSP), a system on a chip (SOC), a Complex Programmable Logic Device (CPLD), and the like.
In the context of this disclosure, a machine-readable medium may be a tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. The machine-readable medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
The foregoing description covers only the preferred embodiments of the present disclosure and the principles of the technology employed. It will be appreciated by those skilled in the art that the scope of the invention in the embodiments of the present disclosure is not limited to the specific combination of the above technical features, but also encompasses other technical solutions formed by any combination of the above technical features or their equivalents without departing from the spirit of the invention, for example, technical solutions formed by mutually substituting the above features with (but not limited to) features having similar functions disclosed in the embodiments of the present disclosure.

Claims (38)

1. A method of voice interaction, comprising:
in response to detecting triggering operation of a preset voice channel opening control in the instant messaging application, sending an opening voice channel establishment request to a server, so that the server executes the following steps:
responsive to determining to establish an open voice channel corresponding to the open voice channel establishment request, establishing the open voice channel to receive an audio stream of a first object initiating the open voice channel establishment request; the open voice channel is used for receiving an audio stream of a first object initiating the open voice channel establishment request in real time;
the server, in response to determining to establish the open voice channel, sends a connection entry establishment instruction to an associated object of the first object, wherein the connection entry establishment instruction is used for displaying a connection entry control on an associated page of a second client on which the associated object is logged in;
the connection entry control is displayed silently at the second client within a preset period of time;
silently displaying the connection entry control within the preset period of time includes either of the following:
displaying the control silently throughout the process of displaying the connection entry control;
starting to display the connection entry control silently in response to the value of a preset non-silent display parameter meeting a preset condition.
2. The method of claim 1, wherein a server establishes a voice channel between an associated user and the first object to forward an audio stream of one of the associated user and the first object to the other of the associated user and the first object in response to receiving an attach request for the open voice channel by the associated user.
3. The method of claim 1, wherein the second client ceases to display the connection entry control in response to determining that the first object closes the open voice channel.
4. The method of claim 1, wherein the server sends a direct connection notification to an associated object of a first object in response to determining to establish the open voice channel, wherein the direct connection notification is to indicate that the associated object is to connect to the open voice channel without confirmation via the first object.
5. The method of claim 1, wherein the voice channel opening control comprises a personal voice channel opening control, wherein the personal voice channel opening control corresponds to a personal user identification.
6. The method of claim 5, wherein
sending the open voice channel establishment request to the server in response to detecting the triggering operation on the preset voice channel opening control in the instant messaging application includes:
in response to a triggering operation on the personal voice channel opening control, sending the open voice channel establishment request to the server, wherein the establishment request carries the user identification of the single user sending the open voice channel establishment request.
7. The method of claim 1, wherein the voice channel opening control comprises a group voice channel opening control.
8. The method of claim 7, wherein the voice channel opening control is displayed on an associated page of the group user; and
sending the open voice channel establishment request to the server in response to detecting the triggering operation on the preset voice channel opening control in the instant messaging application includes:
in response to a triggering operation on the voice channel opening control, sending the open voice channel establishment request to the server, wherein the establishment request carries a group identifier, or the identifiers of group members within a preset range in the group represented by the group identifier, and the group is the group displayed in association with the voice channel opening control.
9. The method of claim 1, wherein the associated object is all or some of the users who have previously established a friend relationship with the first object.
10. The method of claim 9, wherein the portion of users is determined by the first object selection.
11. The method according to claim 10, wherein the method further comprises:
displaying an associated user selection control;
sending, according to a triggering operation on the associated user selection control, the user identification of the selected associated user to the server.
12. The method of claim 11, wherein displaying the associated user selection control comprises:
displaying the associated user selection control in response to detecting the triggering operation on the voice channel opening control.
13. The method of claim 1, wherein the association object comprises at least one of: individual users, groups, subgroups of groups, wherein the subgroups of groups are established based on a preset theme.
14. The method of claim 1, wherein the association page includes at least one connection entry control.
15. The method of claim 14, wherein the at least one connection entry control included with the association page comprises at least one of:
a voice channel connection entry control of a group user, displayed in the group chat page in the display area associated with the group user who has opened a personal voice channel;
a connection entry control of the group voice channel, displayed in a preset area of the group chat page;
voice channel connection entry controls of users within a predetermined range, displayed in aggregate on the associated page.
16. The method of claim 15, wherein the voice channel connection entry controls of users within the predetermined range are displayed in aggregate on the associated page;
wherein: the predetermined range includes all friends of the user, or pre-selected individual users and/or group users; and the aggregate display includes displaying the user identifications within the predetermined range and the corresponding voice channel connection entry controls.
17. The method according to claim 1 or 2, wherein the connection entry control is displayed at a preset position of the associated page at a second client on which the associated object is logged.
18. The method of claim 17, wherein the preset location comprises a top of the associated page.
19. The method of claim 17, wherein the association page includes a preset display range corresponding to the first object and the associated object in a single chat page of the associated object and/or an information flow list page of the instant messaging application.
20. The method of claim 17, wherein the method further comprises:
in response to the information to be displayed at the preset position including the connection entry control and the number of pieces of information to be displayed being at least two, acquiring the priority of each piece of information to be displayed, and processing each piece of information to be displayed in a preset display mode according to the priority.
21. The method of claim 20, wherein the preset display mode includes at least one of:
sequencing the information to be displayed according to the order of the priority from high to low, and displaying the information to be displayed according to the sequence;
displaying the connection entry control with the highest priority and hiding other information to be displayed except the connection entry control;
displaying each piece of information to be displayed, displaying corresponding deletion controls for other information to be displayed except the connection entry controls, and stopping displaying the information to be displayed corresponding to the deletion controls triggered by the triggering operation in response to detecting the triggering operation for the deletion controls.
22. The method of claim 20, wherein the connection entry control has a higher priority than other information to be displayed.
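The display modes of claims 20 to 22 can be illustrated with a minimal Python sketch. The names `Item`, `arrange`, and `dismiss`, the mode strings, and the numeric priorities are hypothetical and are not taken from the patent; the sketch only assumes that each piece of information carries a comparable priority.

```python
from dataclasses import dataclass


@dataclass
class Item:
    """One piece of information competing for the preset position (hypothetical)."""
    name: str
    priority: int
    is_entry_control: bool = False


def arrange(items, mode):
    """Return the items to show, in display order, for the given mode."""
    ordered = sorted(items, key=lambda i: i.priority, reverse=True)
    if mode == "sorted":
        # Mode 1: show everything, highest priority first.
        return ordered
    if mode == "entry_only":
        # Mode 2: show only the connection entry control, hide the rest.
        return [i for i in ordered if i.is_entry_control]
    raise ValueError(f"unknown mode: {mode}")


def dismiss(items, name):
    """Mode 3: a deletion control removes one item, never the entry control."""
    return [i for i in items if i.is_entry_control or i.name != name]
```

For example, with a connection entry control of priority 5 and two other notices of priority 3 and 1, `arrange(items, "sorted")` would place the entry control first, and `dismiss` would remove a notice while leaving the entry control in place.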
23. The method according to claim 1, wherein the method further comprises:
displaying a control panel of the open voice channel on the instant messaging client of an object that has joined the voice channel.
24. The method of claim 23, wherein displaying the control panel of the open voice channel comprises:
in response to determining that the number of connections of the open voice channel is greater than a first number threshold, displaying the control panel in a folded state.
25. The method of claim 23, wherein the control panel is displayed in a floating (suspended) manner.
26. The method according to claim 23, wherein the method further comprises:
in response to detecting a trigger operation on an expand control in the control panel in the folded state, displaying the control panel in the unfolded state;
in response to detecting a trigger operation on a fold control in the control panel in the unfolded state, displaying the control panel in the folded state.
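The fold and unfold behaviour of claims 24 and 26 amounts to a two-state toggle. The following is a minimal sketch under stated assumptions: the class name `ControlPanel`, its method names, and the example threshold value are hypothetical, and the patent does not specify how the panel is rendered.

```python
class ControlPanel:
    """Hypothetical model of the open voice channel control panel."""

    def __init__(self, connection_count: int, first_threshold: int):
        # Claim 24: start folded when the connection count exceeds the threshold.
        self.folded = connection_count > first_threshold

    def on_expand(self) -> None:
        """Trigger on the expand control; only meaningful in the folded state."""
        if self.folded:
            self.folded = False

    def on_fold(self) -> None:
        """Trigger on the fold control; only meaningful in the unfolded state."""
        if not self.folded:
            self.folded = True
```

A panel created with 8 connections and a threshold of 5 would start folded; triggering the expand control unfolds it, and triggering the fold control folds it again.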
27. The method of claim 23, wherein the control panel comprises at least one of: a microphone-off control and a voice channel sharing control.
28. The method of claim 27, wherein the control panel includes a voice channel sharing control;
the method further comprises the steps of:
in response to detecting a trigger operation on the voice channel sharing control, displaying a sharing object selection control, wherein the sharing object selection control is used for selecting a sharing object;
determining the sharing object according to a trigger operation on the sharing object selection control, and sending voice entrance information to the instant messaging client of the selected sharing object, wherein the voice entrance information represents the link address of the voice channel.
29. The method of claim 28, wherein the voice entrance information is in the form of a voice channel invitation card.
30. The method of claim 29, wherein the sharing object sends the voice channel invitation card to a third object to cause the third object to connect to the open voice channel via the voice channel invitation card.
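The sharing flow of claims 28 to 30 can be sketched as building a card that carries the channel's link address and delivering it to each selected recipient. Everything named here (`InvitationCard`, `share_channel`, the `send` callback, and the example link string) is a hypothetical illustration; the patent does not define the card's fields or the transport.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InvitationCard:
    """Hypothetical voice channel invitation card carrying the link address."""
    channel_link: str
    shared_by: str


def share_channel(channel_link, sharer, selected, send):
    """Send one invitation card to every selected sharing object.

    `send(recipient, card)` stands in for whatever delivery mechanism the
    instant messaging application uses; it is an assumption of this sketch.
    """
    card = InvitationCard(channel_link, sharer)
    for recipient in selected:
        send(recipient, card)
    return card
```

Because the card is an immutable value, a recipient could pass the same card on to a third object with another `send` call, which corresponds to the forwarding described in claim 30.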
31. The method of claim 23, wherein the control panel includes a list of user identifications, the user identifications in the list indicating users who are connected to the open voice channel.
32. The method of claim 1, wherein the number of associated users connected to the open voice channel is less than a second number threshold.
33. The method of claim 1, wherein the open voice channel establishment request further comprises a device identification, the server determines whether the received device identification is already participating in a voice channel, and the open voice channel is established for the device identification in response to determining that the device identification is not participating in a voice channel.
34. The method of claim 33, wherein the server determines not to establish an open voice channel for the device identification in response to determining that the device identification is already participating in a voice channel.
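The server-side check of claims 33 and 34 can be sketched as a lookup keyed by device identification. The class `ChannelServer`, the method `handle_establish`, and the `ch-N` channel naming are hypothetical; the sketch keeps state in memory only and ignores channel teardown.

```python
class ChannelServer:
    """Hypothetical in-memory server that tracks one voice channel per device."""

    def __init__(self):
        self.device_channel = {}  # device identification -> channel id
        self._next_id = 1

    def handle_establish(self, user_id: str, device_id: str):
        """Return a new channel id, or None if the device is already in a channel."""
        if device_id in self.device_channel:
            # Claim 34: do not establish a second channel for this device.
            return None
        channel_id = f"ch-{self._next_id}"
        self._next_id += 1
        # Claim 33: the device is not in any channel, so establish one.
        self.device_channel[device_id] = channel_id
        return channel_id
```

A second establishment request from the same device returns `None`, while a request from a different device gets its own channel.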
35. The method of claim 1, wherein the server, in response to receiving a connection request sent by an associated user, determines whether the number of connections of the open voice channel is greater than a third number threshold; and, in response to determining that the number of connections of the open voice channel is greater than the third number threshold, connects the associated user who sent the connection request to the open voice channel with the microphone off.
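The threshold rule of claim 35 can be sketched as follows. The function name `join`, the dictionary representation of channel members, and the choice to leave the microphone on below the threshold are assumptions of this sketch; the claim itself only specifies the microphone-off case.

```python
def join(members: dict, user_id: str, third_threshold: int) -> bool:
    """Connect a user and return whether their microphone starts on.

    `members` maps user id -> microphone-on flag for users already connected.
    """
    # Claim 35: if the channel already has more connections than the
    # threshold, the newcomer joins with the microphone off.
    mic_on = not (len(members) > third_threshold)
    members[user_id] = mic_on
    return mic_on
```

With a threshold of 2, the first three users to join keep their microphones on (0, 1, and 2 existing connections are not greater than 2), and the fourth joins muted.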
36. A voice interaction device, comprising:
a sending unit configured to, in response to detecting a trigger operation on a preset voice channel opening control in an instant messaging application, send an open voice channel establishment request to a server, so that the server performs the following steps:
in response to determining to establish an open voice channel corresponding to the open voice channel establishment request, establishing the open voice channel, wherein the open voice channel is used for receiving, in real time, an audio stream of a first object that initiated the open voice channel establishment request;
in response to determining to establish the open voice channel, sending a connection entry establishment instruction to an associated object of the first object, wherein the connection entry establishment instruction is used for displaying a connection entry control on an associated page of a second client to which the associated object is logged in;
wherein the connection entry control is displayed silently at the second client within a preset time period;
and the silent display of the connection entry control within the preset time period includes any one of the following:
displaying the connection entry control silently throughout the process of displaying the connection entry control;
starting to display the connection entry control silently in response to a preset parameter value of a non-silent display parameter meeting a preset condition.
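The two silent display alternatives at the end of claim 36 can be sketched as a single predicate. The claim does not say what the non-silent display parameter is; this sketch assumes, purely for illustration, that it is the number of seconds the control has been shown non-silently and that the preset condition is reaching a limit. The function name `is_silent` and the mode strings are likewise hypothetical.

```python
def is_silent(mode: str, non_silent_seconds: float = 0.0,
              limit_seconds: float = 5.0) -> bool:
    """Decide whether the connection entry control is shown silently right now."""
    if mode == "always":
        # Alternative 1: silent for the entire time the control is displayed.
        return True
    if mode == "after_condition":
        # Alternative 2: switch to silent once the assumed non-silent
        # display parameter (elapsed seconds) meets the preset condition.
        return non_silent_seconds >= limit_seconds
    raise ValueError(f"unknown mode: {mode}")
```

Under these assumptions, a control in `"after_condition"` mode would be shown non-silently for the first five seconds and silently afterwards.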
37. An electronic device, comprising:
one or more processors;
a storage device storing one or more programs,
wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method of any one of claims 1-35.
38. A computer-readable medium storing a computer program, wherein the program, when executed by a processor, implements the method according to any one of claims 1-35.
CN202110165972.6A 2020-02-05 2021-02-04 Voice interaction method and device and electronic equipment Active CN112968826B (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN2020100811469 2020-02-05
CN202010081146 2020-02-05

Publications (2)

Publication Number Publication Date
CN112968826A CN112968826A (en) 2021-06-15
CN112968826B CN112968826B (en) 2023-08-08

Family

ID=76274911

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110165972.6A Active CN112968826B (en) 2020-02-05 2021-02-04 Voice interaction method and device and electronic equipment

Country Status (1)

Country Link
CN (1) CN112968826B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115706714A (en) * 2021-08-06 2023-02-17 北京字跳网络技术有限公司 Interaction method, electronic device and storage medium

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102780653A (en) * 2012-08-09 2012-11-14 上海量明科技发展有限公司 Method, client and system for fast communication in instant messaging
CN103347003A (en) * 2013-06-19 2013-10-09 腾讯科技(深圳)有限公司 Voice interconnection method, device and system
CN105610788A (en) * 2015-12-17 2016-05-25 小米科技有限责任公司 Method and device for establishing call
CN107124661A (en) * 2017-04-07 2017-09-01 广州市百果园网络科技有限公司 Communication means, apparatus and system in direct broadcast band
CN109120947A (en) * 2018-09-05 2019-01-01 北京优酷科技有限公司 A kind of the voice private chat method and client of direct broadcasting room
WO2019020061A1 (en) * 2017-07-26 2019-01-31 腾讯科技(深圳)有限公司 Video dialogue processing method, video client, video server, and computer readable storage medium
CN110534108A (en) * 2019-09-25 2019-12-03 北京猎户星空科技有限公司 A kind of voice interactive method and device

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10789041B2 (en) * 2014-09-12 2020-09-29 Apple Inc. Dynamic thresholds for always listening speech trigger


Also Published As

Publication number Publication date
CN112968826A (en) 2021-06-15

Similar Documents

Publication Publication Date Title
KR101077739B1 (en) User initiated invite for automatic conference participation by invitee
EP3114832B1 (en) Displaying video call data
US9569752B2 (en) Providing parameterized actionable communication messages via an electronic communication
US9420038B2 (en) Method and apparatus providing synchronization and control for server-based multi-screen videoconferencing
US8477176B1 (en) System and method for automatically suggesting or inviting a party to join a multimedia communications session
WO2021231108A1 (en) Simulating real-life social dynamics in a large group video chat
JP2018508051A (en) Group icon composition method and apparatus for messenger service
US20170289070A1 (en) Making a Dialogue Available To an Autonomous Software Agent
US20100153858A1 (en) Uniform virtual environments
US20170289069A1 (en) Selecting an Autonomous Software Agent
US20150032809A1 (en) Conference Session Handoff Between Devices
WO2017172652A1 (en) Supplying context data to a servicing entity
WO2023040791A1 (en) Interaction method and apparatus, electronic device and storage medium
JP2023539103A (en) Screen sharing methods, devices and electronic devices
US11757948B2 (en) Communication method and apparatus, and electronic device
WO2023124767A1 (en) Prompt method and apparatus based on document sharing, device, and medium
US9013539B1 (en) Video conferencing system and method
CN110505072B (en) Method, terminal device and computer readable medium for backing up chat records
CN114650264A (en) Unread message display method and device, electronic equipment and storage medium
US10372324B2 (en) Synchronous communication system and method
CN112968826B (en) Voice interaction method and device and electronic equipment
CN112818303B (en) Interaction method and device and electronic equipment
US9137029B1 (en) State and availability monitoring for customer support services for multimedia conferences
KR20160085302A (en) Synchronous communication system and method
CN112306595A (en) Interaction method and device and electronic equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant