CN113037924B - Voice transmission method, device, electronic equipment and readable storage medium - Google Patents

Voice transmission method, device, electronic equipment and readable storage medium

Info

Publication number
CN113037924B
Authority
CN
China
Prior art keywords
information
input
voice
user
objects
Prior art date
Legal status
Active
Application number
CN202110112724.5A
Other languages
Chinese (zh)
Other versions
CN113037924A (en)
Inventor
张孝东
Current Assignee
Vivo Mobile Communication Co Ltd
Original Assignee
Vivo Mobile Communication Co Ltd
Priority date
Filing date
Publication date
Application filed by Vivo Mobile Communication Co Ltd
Priority to CN202110112724.5A
Publication of CN113037924A
Application granted
Publication of CN113037924B
Status: Active

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L 12/00 Data switching networks
    • H04L 12/02 Details
    • H04L 12/16 Arrangements for providing special services to substations
    • H04L 12/18 Arrangements for providing special services to substations for broadcast or conference, e.g. multicast
    • H04L 12/185 Arrangements for providing special services to substations for broadcast or conference, e.g. multicast with management of multicast group membership
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L 51/00 User-to-user messaging in packet-switching networks, transmitted according to store-and-forward or real-time protocols, e.g. e-mail
    • H04L 51/04 Real-time or near real-time messaging, e.g. instant messaging [IM]
    • H04L 51/043 Real-time or near real-time messaging, e.g. instant messaging [IM] using or handling presence information
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L 51/00 User-to-user messaging in packet-switching networks, transmitted according to store-and-forward or real-time protocols, e.g. e-mail
    • H04L 51/04 Real-time or near real-time messaging, e.g. instant messaging [IM]
    • H04L 51/046 Interoperability with other network applications or services
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L 51/00 User-to-user messaging in packet-switching networks, transmitted according to store-and-forward or real-time protocols, e.g. e-mail
    • H04L 51/07 User-to-user messaging in packet-switching networks, transmitted according to store-and-forward or real-time protocols, e.g. e-mail characterised by the inclusion of specific contents
    • H04L 51/10 Multimedia information

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Multimedia (AREA)
  • User Interface Of Digital Computer (AREA)

Abstract

The application discloses a voice sending method, a voice sending apparatus and an electronic device, and belongs to the field of communication technology. It can solve the problem that, when sending a message to some members of a group chat or to other people, the user has to tag recipients one by one or copy and then paste the message before sending it, which adds steps to sending the message and thus reduces the efficiency with which the user uses the electronic device. The method comprises the following steps: while a session interface of a group session is displayed, the electronic device receives a first input from a user, the first input being used to input a first voice; in response to the first input, the electronic device displays object identifiers of X objects; the electronic device receives a second input from the user on a target object identifier among the X object identifiers; and in response to the second input, the electronic device sends target information.

Description

Voice transmission method, device, electronic equipment and readable storage medium
Technical Field
The application belongs to the technical field of communication, and particularly relates to a voice sending method, a voice sending device and electronic equipment.
Background
With the development of communication technology, sending chat messages at any time through electronic devices (e.g., mobile phones, tablet computers) has become a communication method commonly used by users.
To facilitate communication, a user can take part in a group chat through the electronic device, so that a single transmission delivers a message to the electronic devices of everyone in the group chat at the same time. When the user needs to remind some members of the group chat, or other people outside the group chat, to view a message, the user can tag the identifiers of those members one by one while sending the message, or copy the message and send it to the other people through the electronic device. When the electronic devices of the tagged members, or of the other people, receive the message, they can remind the tagged users to pay attention to the message in a specific manner.
However, when tagging members or sending the message to other people, the user has to tag recipients one by one or copy and then paste the message, which adds steps to sending the message, lengthens the time the user spends sending it, and reduces the efficiency with which the user uses the electronic device.
Disclosure of Invention
Embodiments of the application aim to provide a voice sending method, a voice sending apparatus and an electronic device, so as to solve the problem that, when tagging members or sending a message to other people, the user has to tag recipients one by one or copy and then paste the message, which adds steps to sending the message, lengthens the time the user spends sending it, and reduces the efficiency with which the user uses the electronic device.
In order to solve the technical problem, the present application is implemented as follows:
in a first aspect, an embodiment of the present application provides a voice sending method, the method comprising: while a session interface of a group session is displayed, an electronic device receives a first input from a user, the first input being used to input a first voice; in response to the first input, the electronic device displays object identifiers of X objects; the electronic device receives a second input from the user on a target object identifier among the X object identifiers; and in response to the second input, the electronic device sends target information. The target information includes first information corresponding to the first voice and second information, the second information being used to indicate a target object corresponding to the target object identifier; the object identifiers of the X objects are object identifiers of N objects in the group corresponding to the group session and/or object identifiers of M objects outside the group corresponding to the group session; and X, N and M are positive integers.
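To make the sequence of the first aspect easier to follow, the plain-Kotlin sketch below models it as ordinary data and function calls. It is an illustrative assumption only: the type and member names (ObjectIdentifier, TargetInformation, VoiceSendingFlow, onFirstInput, onSecondInput) are not terminology from the application, and a real implementation would sit on top of the device's messaging and speech facilities.

```kotlin
// Minimal sketch of the first-aspect flow: first input (voice) -> display X object
// identifiers -> second input (selection) -> send target information.
// All names here are illustrative assumptions, not terms defined by the application.

data class ObjectIdentifier(val objectId: String, val displayName: String, val inGroup: Boolean)

data class TargetInformation(
    val firstInformation: String,          // information corresponding to the first voice
    val secondInformation: List<String>    // indicates the target object(s)
)

class VoiceSendingFlow(
    private val groupMembers: List<ObjectIdentifier>,     // N in-group objects
    private val outsideContacts: List<ObjectIdentifier>   // M objects outside the group
) {
    // Steps 301/302: on the first input, return the X object identifiers to display.
    fun onFirstInput(): List<ObjectIdentifier> = groupMembers + outsideContacts

    // Steps 303/304: on the second input, assemble the target information to be sent.
    fun onSecondInput(firstInformation: String,
                      selected: List<ObjectIdentifier>): TargetInformation =
        TargetInformation(firstInformation, selected.map { "@" + it.displayName })
}

fun main() {
    val flow = VoiceSendingFlow(
        groupMembers = listOf(
            ObjectIdentifier("B", "User B", true),
            ObjectIdentifier("C", "User C", true),
            ObjectIdentifier("D", "User D", true)
        ),
        outsideContacts = listOf(ObjectIdentifier("E", "User E", false))
    )
    val shown = flow.onFirstInput()   // X = N + M identifiers displayed to the user
    val target = flow.onSecondInput(
        "I'll get there tomorrow",
        shown.filter { it.objectId in setOf("B", "C") }
    )
    println(target)   // the second information carries the selected identifiers
}
```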
In a second aspect, an embodiment of the present application provides a voice sending apparatus, the apparatus comprising a receiving module, a display module and a sending module. The receiving module is configured to receive a first input from a user while a session interface of a group session is displayed, the first input being used to input a first voice; the display module is configured to display object identifiers of X objects in response to the first input received by the receiving module; the receiving module is further configured to receive a second input from the user on a target object identifier among the X object identifiers; and the sending module is configured to send target information in response to the second input received by the receiving module. The target information includes first information corresponding to the first voice and second information, the second information being used to indicate the target object corresponding to the target object identifier; the object identifiers of the X objects are object identifiers of N objects in the group corresponding to the group session and/or object identifiers of M objects outside the group corresponding to the group session; and X, N and M are positive integers.
In a third aspect, an embodiment of the present application provides an electronic device, which includes a processor, a memory, and a program or instructions stored in the memory and executable on the processor, and when executed by the processor, the program or instructions implement the steps of the method according to the first aspect.
In a fourth aspect, embodiments of the present application provide a readable storage medium, on which a program or instructions are stored, which when executed by a processor implement the steps of the method according to the first aspect.
In a fifth aspect, an embodiment of the present application provides a chip, where the chip includes a processor and a communication interface, where the communication interface is coupled to the processor, and the processor is configured to execute a program or instructions to implement the method according to the first aspect.
In the embodiments of the application, the electronic device may receive a first input in which a user inputs a first voice, display the object identifiers of N objects in the group corresponding to the group session and/or of M objects outside that group (i.e., X objects in total), and then receive a second input from the user on a target object identifier among the X object identifiers; the electronic device can then send, in the group session, target information that includes first information corresponding to the first voice and second information, where the second information may be used to indicate the target object corresponding to the target object identifier. In this way, a single input is enough for the electronic device to send target information that directs the target object to view the information on its electronic device, so the user does not need to tag the target object's information separately; the steps of tagging the target object information are simplified, and the efficiency with which the user prompts the target object to view the information is improved.
Drawings
Fig. 1 is a flowchart of a speech sending method according to an embodiment of the present application;
fig. 2 is a schematic view of an interface applied to a voice sending method according to an embodiment of the present application;
fig. 3 is a second schematic diagram of an interface applied to a voice transmission method according to an embodiment of the present application;
fig. 4 is a third schematic diagram of an interface applied to a voice sending method according to an embodiment of the present application;
fig. 5 is a fourth schematic diagram of an interface applied to a voice sending method according to an embodiment of the present application;
fig. 6 is a fifth schematic diagram of an interface applied to a voice sending method according to an embodiment of the present application;
fig. 7 is a schematic structural diagram of a voice sending apparatus according to an embodiment of the present application;
fig. 8 is a schematic structural diagram of an electronic device according to an embodiment of the present application;
fig. 9 is a second schematic structural diagram of an electronic device according to an embodiment of the present application.
Detailed Description
The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are some, but not all, of the embodiments of the present application. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
The terms "first", "second" and the like in the description and claims of the present application are used to distinguish between similar objects and are not necessarily used to describe a particular order or sequence. It should be understood that the terms so used are interchangeable under appropriate circumstances, so that the embodiments of the application can be practiced in orders other than those illustrated or described herein. Moreover, "first", "second" and the like are generally used to distinguish one class of objects from another and do not limit the number of objects; for example, the first object may be one or more than one. In addition, "and/or" in the description and claims denotes at least one of the connected objects, and the character "/" generally indicates that the objects before and after it are in an "or" relationship.
The following describes the speech transmission method provided by the embodiment of the present application in detail through a specific embodiment and an application scenario thereof with reference to the accompanying drawings.
The voice sending method provided by the embodiments of the application can be applied to a scenario of sending a message in a group session and prompting a specific object to view the message.
Consider a scenario in which a message is sent in a group session and specific objects are prompted to view it. Suppose user A sends message 1 in group 1 of a chat application, where group 1 contains 4 users (user A, user B, user C and user D), and user B and user C need to be specifically prompted to view message 1. After entering message 1, user A inputs "@" to open a display interface for selecting a specific object; if user B is selected, user B's identifier is appended to message 1. User A then inputs "@" again to open the selection interface once more; if user C is selected, user C's identifier is appended after user B's identifier. After that, when the electronic device receives the user's click input on the send control, it sends to the session interface of group 1 a message 1 to which the identifiers of user B and user C have been added, where these identifiers can be used to send special reminder instructions to user B's and user C's electronic devices, so as to remind user B and user C to view message 1. As this process shows, whenever user A needs to specifically remind some users in the group chat to view a message, the user identifiers have to be added manually, and only one group member's identifier can be added at a time; this adds steps to sending the message, lengthens the time user A spends sending it, and reduces the efficiency with which user A uses the electronic device.
With the voice sending method provided by the embodiments of the application, while the electronic device displays the session interface of group 1, it can receive user A's first input of message 1, then display the user identifiers of the other 3 users in group 1, and then receive user A's selection input on the identifiers of user B and user C among those 3 identifiers. The electronic device can thus send, in group 1, target information containing message 1 together with the identifiers of user B and user C, where those identifiers can be used to indicate user B's and user C's electronic devices. In this way, a single input is enough for the electronic device to send target information that directs user B and user C in group 1 to view the information, so the user does not need to tag user B's and user C's identifiers separately; the tagging steps are simplified, and the efficiency with which the user prompts user B and user C to view the information in the group session is improved.
The present embodiment provides a voice transmission method. As shown in fig. 1, the voice transmission method includes the following steps 301 to 304:
step 301: the voice transmission device receives a first input of a user while displaying a conversation interface of the group conversation.
In an embodiment of the present application, the first input is used to input a first voice.
In an embodiment of the present application, the first input may be a touch input, for example, a click input; the input may also be a voice input, or may also be an input of a specific gesture, which is not limited in this embodiment of the present application.
In this embodiment, the group session may be a group session in any application having a group chat function in the electronic device. For example, a group session in a chat application, a group session in a shopping application.
In the embodiment of the present application, the group session includes more than 2 objects capable of sending information, for example, one group session may include 3 chat accounts.
In this embodiment, the session interface is an interface for displaying information to be sent in a group session.
Step 302: in response to the first input, the voice transmission apparatus displays object identifications of the X objects.
In this embodiment, the object identifiers of the X objects are object identifiers of N objects in a group corresponding to the group session and/or object identifiers of M objects other than the group corresponding to the group session, and X, N, and M are positive integers.
In this embodiment of the application, the N objects may be the accounts of the N electronic devices included in the group session, and the M objects may be the accounts of other electronic devices in the application corresponding to the group session, excluding the N accounts already in the group session.
It can be understood that, after receiving the first input, the electronic device may display only the object identifiers of the N objects, only the object identifiers of the M objects, or the object identifiers of the N objects and of the M objects at the same time. Further, to help the user tell which group an object identifier belongs to, when the object identifiers of the N objects and of the M objects are displayed at the same time, the electronic device may additionally display, on the object identifiers of the N objects, identification information indicating that they belong to the group session; this identification information may be picture information, text information or another type of information, which is not limited in the embodiments of the application.
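Purely as an illustrative sketch of the display choice just described (only the N in-group identifiers, only the M outside identifiers, or both, with a membership badge added in the combined case), the following could be one way to assemble the list; the names DisplayedIdentifier, DisplayMode and the text badge are assumptions, not part of the application.

```kotlin
// Sketch of how the displayed identifier list could be assembled; all names are assumed.

data class DisplayedIdentifier(val objectId: String, val label: String, val badge: String?)

enum class DisplayMode { GROUP_ONLY, OUTSIDE_ONLY, BOTH }

fun identifiersToDisplay(
    groupIds: List<Pair<String, String>>,     // N in-group (id, label) pairs
    outsideIds: List<Pair<String, String>>,   // M outside (id, label) pairs
    mode: DisplayMode
): List<DisplayedIdentifier> = when (mode) {
    DisplayMode.GROUP_ONLY ->
        groupIds.map { DisplayedIdentifier(it.first, it.second, null) }
    DisplayMode.OUTSIDE_ONLY ->
        outsideIds.map { DisplayedIdentifier(it.first, it.second, null) }
    // When both sets are shown, badge the in-group identifiers so the user can tell
    // them apart; the badge could just as well be picture information instead of text.
    DisplayMode.BOTH ->
        groupIds.map { DisplayedIdentifier(it.first, it.second, "in group") } +
        outsideIds.map { DisplayedIdentifier(it.first, it.second, null) }
}
```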
In this embodiment, the object identifier may be information related to the object; generally, it may include image information and text information of the object. For example, when the object is a user account in a group session, the object identifier may be the avatar information, the name information or the account information of that user account, where the avatar information is image information and the name information and account information are text information.
In this embodiment of the present application, the object identifier may be a user account identifier corresponding to any user account of the electronic device in the group session.
In the embodiment of the present application, the objects and the object identifiers are in a one-to-one correspondence relationship.
In this embodiment, the object identifier may be a mark formed from any piece of the information related to the object. For example, when the object is a user account, the object identifier may be the avatar information of the user account, and the avatar information may be used to indicate the user account.
Step 303: the voice transmission apparatus receives a second input of a target object identification among the X object identifications from the user.
In this embodiment of the application, the second input may be used to select a target object identifier, and thereby select the target object, and may also be used to trigger the sending of information.
In an embodiment of the present application, the second input may be a touch input, for example, a click input or a slide input; the input may also be a voice input, or may also be an input of a specific gesture, which is not limited in the embodiment of the present application.
In this embodiment of the present application, the target object identifier may be any one or more of the X object identifiers, which is not limited in this embodiment of the present application.
Step 304: in response to the second input, the voice transmission apparatus transmits the target information.
In an embodiment of the present application, the target information includes first information corresponding to the first voice and second information, and the second information is used to indicate the target object corresponding to the target object identifier.
In the embodiment of the present application, the first information may be text information, voice information, or multimedia information, for example, information including a music file or information including an image file.
It can be understood that, after the electronic device receives the user's first input, it may recognize the first voice, convert it into text information and display that text on the display screen of the electronic device; alternatively, directly after recognition, if the first voice contains multimedia information (e.g., a song name), first information containing a multimedia link (e.g., a song link) may be generated for subsequent use by the user.
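As a hedged sketch of this optional step, the snippet below assumes a transcript has already been produced by whatever speech recognizer the device uses and that a song-name-to-link table is available; both the recognizer and the lookup table are assumptions made for illustration, not components specified by the application.

```kotlin
// Illustrative only: the transcript and the song table stand in for a real speech
// recognizer and a real media-lookup service, neither of which is specified here.

data class FirstInformation(val text: String, val mediaLink: String? = null)

fun buildFirstInformation(
    recognizedText: String,            // transcript of the first voice (assumed available)
    knownSongs: Map<String, String>    // assumed song-name -> link table
): FirstInformation {
    val match = knownSongs.entries.firstOrNull { recognizedText.contains(it.key) }
    return FirstInformation(text = recognizedText, mediaLink = match?.value)
}

fun main() {
    val songs = mapOf("Example Song" to "https://example.com/example-song")  // hypothetical link
    println(buildFirstInformation("Have you heard Example Song?", songs))
}
```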
In one example, when the first information is voice information, the first input may be an input to a voice control on a session interface in the user electronic device.
In the embodiment of the present application, the second information may be information added to the first information, or may be information transmitted separately from the first information.
In this embodiment of the application, the second information may be information that can prompt a user of the target object to view the first information. For example, the second information may trigger the electronic device of the target object to perform a strong alert, reminding the user of the target object to view the first information.
In this embodiment, the second information may include target object information of the target object.
In one example, when the first information is text information, the second information may be information that is added to the first information and then transmitted together with the first information.
In one example, when the first information is voice information, the second information may be information separately transmitted from the first information.
It is understood that the second information may be displayed directly in the electronic device (that is, it is viewable by the user). For example, if the first information is sent to the group session, the second information may be prompt information instructing the target object to view the information; if the first information is sent directly to the session interface between the target object's account and the account on the electronic device, the second information reminding the target object to view the information is attached at the same time to that session interface.
In this embodiment, the electronic device may send the target message in the group session, or send the target message outside the group session.
In one example, when the object identifier selected by the user belongs to a user inside the group session, the electronic device may send the target information in the group session and display it on the session interface of the group session.
In another example, when the object identifier selected by the user belongs to a user outside the group session, the electronic device may send the target information in a session between the electronic device and the other user's electronic device, and display it on the session interface with that electronic device.
It is understood that, after the electronic device sends the target information to the session with another user's electronic device, it may pop up query information asking whether the user wants to jump to the session interface with that electronic device. The query information may be displayed in a floating window of the electronic device, and the floating window may also contain a selection control; the user can operate this control to decide whether to jump to the session interface with the other user's electronic device.
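The routing just described (send inside the group session when the selected identifier belongs to a group member; otherwise send in the one-to-one session and offer to jump there) could be sketched as below; the Delivery type and the offerJump flag are illustrative assumptions rather than elements defined by the application.

```kotlin
// Sketch of delivery routing for the target information; all names are illustrative.

sealed class Delivery {
    data class InGroup(val groupId: String) : Delivery()
    data class Private(val peerId: String, val offerJump: Boolean) : Delivery()
}

fun routeTargetInformation(
    selectedId: String,
    groupId: String,
    groupMemberIds: Set<String>
): Delivery =
    if (selectedId in groupMemberIds) {
        // Target belongs to the group: send and display in the group session itself.
        Delivery.InGroup(groupId)
    } else {
        // Target is outside the group: send in the one-to-one session, then ask the user
        // (e.g. via a floating window with a selection control) whether to jump there.
        Delivery.Private(peerId = selectedId, offerJump = true)
    }
```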
Example 1: take the first information as text information and the group session as a group chat session containing 4 user accounts in the chat application as an example. As shown in fig. 2 (a), in a case where the display interface of the electronic device displays a session interface 31 of a group chat session, the electronic device receives a first input that a user inputs voice information (i.e., the first voice) to the voice control 32, and upon receiving the first input, the electronic device recognizes the voice information as text information 33 (i.e., the first information), and then, as shown in fig. 2 (b), the electronic device may display 3 avatar identifications (i.e., object identifications of the N objects) corresponding to 3 user accounts on the session interface 31. At this time, the electronic device receives a sliding selection input (i.e., the second input) of the user on two identifiers (i.e., the target object identifier) of the 3 avatar identifiers, and then, as shown in (C) of fig. 2, the electronic device adds two pieces of selected user account information, i.e., the user account information of the user B and the user account information of the user C, behind the text information, and after the user releases the sliding selection input on the display screen, the electronic device sends the text information and the user account information behind the text information, and then the electronic device of the user B and the electronic device of the user C are triggered to perform strong reminding, so as to remind the user B and the user C of viewing the information.
Example 2: take the first information as voice information, the group session is a group chat session including 4 user accounts in the chat application, and the chat application further includes accounts of 3 other users as an example. As shown in fig. 3 (a), when the display interface of the electronic device displays the session interface 31 of the group chat session, the electronic device inputs a first input of voice information (i.e., the first voice) to the voice control 32, and upon receiving the first input, the electronic device may display 3 avatar identifications corresponding to 3 user accounts in the group chat session (i.e., object identifications of the N objects) and 3 avatar identifications corresponding to 3 user accounts other than the group chat session on the session interface 31 as shown in fig. 3 (b). At this time, when the electronic device receives a user's press selection input (i.e., the second input) for the identifier of the user E (i.e., the target object identifier), as shown in fig. 3 (c), after the electronic device has transmitted the voice information to the electronic device of the user E, the electronic device pops up the inquiry window 34 on the display interface of the current group chat session to inquire whether the user needs to jump to the chat session interface for chatting with the user E.
With the voice sending apparatus provided by the embodiment of the present application, while a session interface of a group session is displayed, the apparatus may receive a first input in which a user inputs a first voice, display the object identifiers of N objects in the group corresponding to the group session and/or of M objects outside that group (i.e., X objects in total), and then receive a second input from the user on a target object identifier among the X object identifiers; the apparatus then sends, in the group session, target information that includes first information corresponding to the first voice and second information, where the second information may be used to indicate the target object corresponding to the target object identifier. In this way, a single input is enough for the voice sending apparatus to send target information that directs the target object to view the information on its electronic device, so the user does not need to tag the target object's information separately; the tagging steps are simplified, and the efficiency with which the user prompts the target object to view the information is improved.
Optionally, in this embodiment of the application, the first input is a voice input, and the first information is a target voice. On this basis, step 302 of the voice sending method provided in the embodiment of the present application may include the following step 302a:
step 302a: the voice transmission apparatus displays object identifications of the X objects in a process in which the user inputs the target voice.
Optionally, in this embodiment of the application, the voice input is a pressing input of the voice control by a user.
For example, when the voice sending apparatus receives the press input, a microphone of the electronic device may be used to receive voice, and when the voice sending apparatus receives a release of the voice control from the user on the display screen, the voice sending apparatus stops receiving voice.
For example, the target voice may be the voice information that the voice transmission apparatus receives from the moment it receives the press input until the user releases the press input.
In one example, the voice transmitting apparatus may switch the current conversation interface to the voice input interface upon receiving a voice input of the user.
For example, the above X objects and the object identifiers of the X objects may refer to the foregoing description, and are not described herein again.
In one example, the object identifications of the above X objects may be displayed in a voice input interface.
For example, the object identifiers of the X objects may be displayed in various ways.
In an example, the object identifiers of the X objects may be displayed in the form of an overall session identifier that contains the X object identifiers. In example 1, as shown in fig. 4, after the voice transmission apparatus receives the voice input, the overall session identifier 42 containing all the object identifiers is displayed on the voice input interface 41.
It should be noted that the electronic device may display the overall session identifier at a smaller size while no selection input from the user on the session identifier of the group session has been received, and enlarge the session identifier of the group session when a selection input from the user on a target object identifier within the overall session identifier is received. In this way, when the user does not need to select a target object, the session identifier of the group session occupies less area of the voice input interface, and the fluency of the voice input interface is improved.
In an example, the object identifiers of the X objects may be displayed directly in the form of X individual object identifiers. In example 2, as shown in fig. 5, after the voice transmission apparatus receives the voice input, the object identifiers 43, 44 and 45 of the 3 objects are displayed on the voice input interface 41.
In an embodiment of the present application, the second input includes: a slide input in which, while pressing the voice control, the user slides to the target object identifier.
In one example, when the object identifiers of the X objects are displayed in the form of a session identifier of the group session, the voice transmission apparatus may receive the user's selection input of a target object identifier on that session identifier, and thereby select the target object. With reference to example 1, as shown in fig. 4, after the voice input interface 41 displays the session identifier 42 of the group session, the voice sending apparatus may receive the user's sliding input on the target object identifier within the session identifier (the sliding trajectory is shown as 46), and thereby select the target object.
In one example, when the object identifiers of the X objects are displayed directly as individual object identifiers, the voice transmission apparatus may receive the user's selection input on the target object identifier, and thereby select the target object. With reference to example 2, as shown in fig. 5, after the voice input interface 41 displays the object identifiers 43, 44 and 45 of the 3 objects, the voice sending apparatus may receive the user's selection input on the target object identifier 43, and thereby select the target object.
Example 3: take the first information as voice information and the group session as a group chat session containing 4 user accounts in the chat application as an example. As shown in fig. 6 (a), when the display interface of the electronic device displays the session interface 31 of the group chat session, and after the electronic device receives a press input to the voice control 51 from the user, as shown in fig. 6 (b), the electronic device may switch the session interface to the voice input interface 52, and display 3 avatar identifications (i.e., object identifications of the N objects) corresponding to 3 user accounts on the voice input interface 52 in an arc arrangement on the session interface 31. At this time, the electronic device receives a sliding selection input (i.e., the second input) of the user on two identifiers (i.e., the target object identifier) that slide to the 3 avatar identifiers, and then after the electronic device inputs the target voice information, the electronic device separately adds information including the selected two pieces of user account information, i.e., information including the user account information of the user B and the user account information of the user C, and after the user releases the sliding selection input on the display screen, as shown in (C) of fig. 6, the electronic device sends the target voice information and the information including the user account information of the user B and the user account information of the user C (i.e., the second information), and then the electronic device of the user B and the electronic device of the user C are triggered to strongly remind the user B and the user C to view the information.
Therefore, when the user inputs a voice message in the group session and needs to prompt some of the users in the group session to view the information, the prompt information can be attached to the voice message through a single input; the steps the user performs after inputting the voice message are simplified, and the efficiency of voice input is improved.
Optionally, in this embodiment, after step 301, the voice sending method provided in this embodiment may further include step 305:
step 305: and the voice sending device cancels the display of the object identifications of the N objects when the voice input is finished.
For example, after the voice sending apparatus finishes receiving the voice input, the voice sending apparatus switches the voice input interface back to the conversation interface, and cancels the display of the object identifiers of the N objects.
Optionally, in this embodiment of the application, the first input is used to input first voice information; the first voice information includes second voice information, the second voice information is used to indicate the X objects, and the X objects are determined according to the second voice information. On this basis, step 304 may include the following steps 306 and 307:
step 306: and responding to the first input, the voice sending device removes the second voice information in the first voice information to obtain target voice information.
For example, the X objects may refer to the foregoing description, and details are not repeated here.
For example, the first input may be a voice input to the electronic device by the user during a press input to the voice control.
For example, the target voice information may be the voice information that the user actually needs to send, whether to an object in the group session or to an object elsewhere in the application corresponding to the group session, and which the target object is prompted to view.
For example, the electronic device may recognize the second voice information in the first voice information.
In an example, the electronic device may be configured with a voice identifier. After recognizing the voice identifier, the electronic device is triggered to use voice recognition technology to find, in the voice information, content that matches an object identifier, and once such content is found it can identify the second voice information within the first voice information. For example, suppose the electronic device uses "@" as the voice identifier and the user inputs the voice information "I'll get there tomorrow, @B, @C". After receiving this voice information, each "@" triggers the electronic device's voice recognition to compare the content following it with the object identifiers, that is, to search all user account identifiers for ones matching "B" and "C"; once they are found, "@B, @C" is identified as the information indicating the target objects among the objects in the group session.
In the embodiment of the present application, after the second voice information in the first voice information is recognized, the second voice information may be removed from the first voice information, so as to obtain the target voice information.
Step 307: the voice transmission device transmits the target information.
Illustratively, the target information includes the target voice information and second information indicating the target object.
In the embodiment of the present application, the second information is the same as the content described in the foregoing, and is not described again here.
Further, after the electronic device recognizes the second voice information, it converts the corresponding content of the second voice information into the second information. The second information may be text content.
Example 4: take the first information as voice information and the group session as a group chat session containing 4 user accounts in the chat application as an example. In the case that a display interface of the electronic device displays a session interface of a group chat session, after the electronic device receives a press input of a user to a voice control, the user inputs voice information 'I tomorrow to get there, @ B, @ C' (namely the first voice information), the electronic device removes the 'I tomorrow to get there, @ B, @ C' (namely the second voice information) in the 'B, @ C' (namely the second voice information) after receiving the voice information, and obtains 'I tomorrow to get you there' as target voice information, and then after releasing the press input of the voice control, the electronic device sends the voice information 'I tomorrow to get you there' and '@ B, @ C' in the group session.
In this way, through voice recognition technology the voice sending apparatus can directly identify the target objects indicated in the voice information as the user inputs it, remove the voice information containing the target object indications from the input voice information, extract the target voice information, and finally send the target voice information together with the target object information used to prompt the target objects to view it. A single input is therefore enough for the voice sending apparatus to send target information that directs the target objects in the group session to view the information, so the user does not need to tag the target object information separately; the tagging steps are simplified, and the efficiency with which the user prompts the target objects to view the information in the group session is improved.
In the voice transmission method provided in the embodiment of the present application, the execution subject may be a voice transmission apparatus, or a control module in the voice transmission apparatus for executing the voice transmission method. The voice transmitting apparatus provided in the embodiment of the present application will be described with reference to an example in which a voice transmitting apparatus executes a voice transmitting method.
Fig. 7 is a schematic diagram of a possible structure of a voice sending apparatus according to an embodiment of the present application. As shown in fig. 7, the voice sending apparatus 600 includes a receiving module 601, a display module 602 and a sending module 603. The receiving module 601 is configured to receive a first input from a user while a session interface of a group session is displayed, the first input being used to input a first voice; the display module 602 is configured to display object identifiers of X objects in response to the first input received by the receiving module 601; the receiving module 601 is further configured to receive a second input from the user on a target object identifier among the X object identifiers; and the sending module 603 is configured to send target information in response to the second input received by the receiving module 601. The target information includes first information corresponding to the first voice and second information, the second information being used to indicate the target object corresponding to the target object identifier; the object identifiers of the X objects are object identifiers of N objects in the group corresponding to the group session and/or object identifiers of M objects outside the group corresponding to the group session; and X, N and M are positive integers.
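To make the module split of fig. 7 concrete, the sketch below arranges the same responsibilities as interfaces plus a coordinating class; the interface names mirror the modules above, but the method signatures are assumptions made for illustration.

```kotlin
// Illustrative decomposition mirroring fig. 7; the signatures are assumed, not specified.

interface ReceivingModule {                  // corresponds to module 601
    fun receiveFirstInput(): String          // the first voice (represented here as text)
    fun receiveSecondInput(shown: List<String>): List<String>  // selected target identifiers
}

interface DisplayModule {                    // corresponds to module 602
    fun displayObjectIdentifiers(identifiers: List<String>)
}

interface SendingModule {                    // corresponds to module 603
    fun sendTargetInformation(firstInformation: String, secondInformation: List<String>)
}

class VoiceSendingApparatus(
    private val receiving: ReceivingModule,
    private val display: DisplayModule,
    private val sending: SendingModule,
    private val xObjectIdentifiers: List<String>   // the X identifiers to offer the user
) {
    fun run() {
        val firstVoice = receiving.receiveFirstInput()                  // step 301
        display.displayObjectIdentifiers(xObjectIdentifiers)            // step 302
        val targets = receiving.receiveSecondInput(xObjectIdentifiers)  // step 303
        sending.sendTargetInformation(firstVoice, targets)              // step 304
    }
}
```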
With the voice sending apparatus provided by the embodiment of the present application, while a session interface of a group session is displayed, the apparatus may receive a first input in which a user inputs a first voice, display the object identifiers of N objects in the group corresponding to the group session and/or of M objects outside that group (i.e., X objects in total), and then receive a second input from the user on a target object identifier among the X object identifiers; the apparatus then sends, in the group session, target information that includes first information corresponding to the first voice and second information, where the second information may be used to indicate the target object corresponding to the target object identifier. In this way, a single input is enough for the voice sending apparatus to send target information that directs the target object to view the information on its electronic device, so the user does not need to tag the target object's information separately; the tagging steps are simplified, and the efficiency with which the user prompts the target object to view the information is improved.
Optionally, in this embodiment of the application, the first input is a voice input, and the first information is a target voice; the display module 602 is specifically configured to display object identifiers of X objects in a process of inputting a target voice by a user.
Optionally, in this embodiment of the application, the voice input is a pressing input of the voice control by a user; wherein the second input comprises: and in the process that the user presses the voice control, the user slides to the sliding input of the target object identifier.
Optionally, in this embodiment of the application, the apparatus 600 further includes an executing module 604; the executing module 604 is configured to cancel displaying the object identifiers of the X objects when the voice input is ended.
Optionally, in this embodiment of the application, the first input is used to input first voice information; the first voice information includes second voice information indicating the X objects in the group session, where the X objects are determined according to the second voice information. The apparatus 600 further includes an executing module 604. The executing module 604 is configured to remove the second voice information from the first voice information in response to the first input received by the receiving module 601, to obtain target voice information; the sending module 603 is specifically configured to send target information, where the target information includes the target voice information and second information indicating the target object.
The voice transmission device in the embodiment of the present application may be a device, and may also be a component, an integrated circuit, or a chip in a terminal. The device may be a mobile electronic device or a non-mobile electronic device. By way of example, the mobile electronic device may be a mobile phone, a tablet computer, a notebook computer, a palmtop computer, a vehicle-mounted electronic device, a wearable device, an ultra-mobile personal computer (UMPC), a netbook or a Personal Digital Assistant (PDA), and the like; the non-mobile electronic device may be a server, a Network Attached Storage (NAS), a Personal Computer (PC), a Television (TV), a teller machine or a self-service machine, and the like; the embodiments of the present application are not specifically limited in this regard.
The voice transmission apparatus in the embodiment of the present application may be an apparatus having an operating system. The operating system may be an Android operating system, an iOS operating system, or another possible operating system, which is not specifically limited in the embodiments of the present application.
The voice sending apparatus provided in the embodiment of the present application can implement each process implemented by the method embodiments of fig. 1 to fig. 6, and is not described here again to avoid repetition.
It should be noted that, as shown in fig. 7, modules that the voice transmission apparatus 600 necessarily includes are illustrated with solid-line boxes, such as the receiving module 601; modules that the voice transmission apparatus 600 may or may not include are illustrated with dashed-line boxes, such as the executing module 604.
Optionally, as shown in fig. 8, an embodiment of the present application further provides an electronic device 800, which includes a processor 801, a memory 802, and a program or an instruction stored in the memory 802 and executable on the processor 801. The program or instruction, when executed by the processor 801, implements each process of the foregoing voice sending method embodiment and can achieve the same technical effect; to avoid repetition, details are not repeated here.
It should be noted that the electronic device in the embodiment of the present application includes the mobile electronic device and the non-mobile electronic device described above.
Fig. 9 is a schematic diagram of a hardware structure of an electronic device implementing the embodiment of the present application.
The electronic device 100 includes, but is not limited to: a radio frequency unit 101, a network module 102, an audio output unit 103, an input unit 104, a sensor 105, a display unit 106, a user input unit 107, an interface unit 108, a memory 109, and a processor 110. The user input unit 107 includes a touch panel 1071 and other input devices 1072; the display unit 106 includes a display panel 1061; the input unit 104 includes a graphics processing unit 1041 and a microphone 1042; and the memory 109 may be used to store software programs (e.g., an operating system, application programs needed for at least one function) and various data.
Those skilled in the art will appreciate that the electronic device 100 may further comprise a power supply (e.g., a battery) for supplying power to the various components, and the power supply may be logically connected to the processor 110 via a power management system, so as to manage charging, discharging, and power consumption through the power management system. The electronic device structure shown in fig. 9 does not constitute a limitation of the electronic device; the electronic device may include more or fewer components than those shown, or combine certain components, or arrange the components differently, which is not described again here.
The user input unit 107 is configured to receive a first input from a user while a session interface of a group session is displayed, the first input being used to input a first voice; the display unit 106 is configured to display object identifiers of X objects in response to the first input received by the user input unit 107; the user input unit 107 is further configured to receive a second input from the user on a target object identifier among the X object identifiers; and the radio frequency unit 101 is configured to send target information in response to the second input received by the user input unit 107. The target information includes first information corresponding to the first voice and second information, the second information being used to indicate the target object corresponding to the target object identifier; the object identifiers of the X objects are object identifiers of N objects in the group corresponding to the group session and/or object identifiers of M objects outside the group corresponding to the group session; and X, N and M are positive integers.
In this embodiment of the present application, while a session interface of a group session is displayed, the electronic device may receive a first input in which a user inputs a first voice, display the object identifiers of N objects in the group corresponding to the group session and/or of M objects outside that group (i.e., X objects in total), and then receive a second input from the user on a target object identifier among the X object identifiers; the electronic device can then send, in the group session, target information that includes first information corresponding to the first voice and second information, where the second information may be used to indicate the target object corresponding to the target object identifier. In this way, a single input is enough for the electronic device to send target information that directs the target object to view the information, so the user does not need to tag the target object's information separately; the tagging steps are simplified, and the efficiency with which the user prompts the target object to view the information is improved.
Optionally, the first input is a voice input, and the first information is a target voice; the display unit 106 is specifically configured to display the object identifiers of the X objects in a process of inputting the target voice by the user.
Optionally, the processor 110 is configured to cancel displaying the object identifiers of the X objects when the voice input is finished.
Optionally, the first input is used to input first voice information; the first voice information includes second voice information, the second voice information is used to indicate the X objects, and the X objects are determined according to the second voice information. The processor 110 is configured to remove the second voice information from the first voice information in response to the first input received by the user input unit 107, to obtain target voice information; and the radio frequency unit 101 is configured to send target information, where the target information includes the target voice information and second information indicating the target object.
It should be understood that, in the embodiment of the present application, the input Unit 104 may include a Graphics Processing Unit (GPU) 1041 and a microphone 1042, and the Graphics Processing Unit 1041 processes image data of a still picture or a video obtained by an image capturing device (such as a camera) in a video capturing mode or an image capturing mode. The display unit 106 may include a display panel 1061, and the display panel 1061 may be configured in the form of a liquid crystal display, an organic light emitting diode, or the like. The user input unit 107 includes a touch panel 1071 and other input devices 1072. The touch panel 1071 is also referred to as a touch screen. The touch panel 1071 may include two parts of a touch detection device and a touch controller. Other input devices 1072 may include, but are not limited to, a physical keyboard, function keys (e.g., volume control keys, switch keys, etc.), a trackball, a mouse, and a joystick, which are not described in detail herein. The memory 109 may be used to store software programs as well as various data including, but not limited to, application programs and an operating system. The processor 110 may integrate an application processor, which primarily handles operating systems, user interfaces, applications, etc., and a modem processor, which primarily handles wireless communications. It will be appreciated that the modem processor described above may not be integrated into the processor 110.
An embodiment of the present application further provides a readable storage medium. A program or instructions are stored on the readable storage medium, and when the program or instructions are executed by a processor, the processes of the foregoing voice transmission method embodiment are implemented, achieving the same technical effects; to avoid repetition, details are not repeated here.
The processor is the processor in the electronic device described in the above embodiment. The readable storage medium includes a computer readable storage medium, such as a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk or an optical disk, and so on.
An embodiment of the present application further provides a chip. The chip includes a processor and a communication interface, the communication interface is coupled to the processor, and the processor is configured to run a program or instructions to implement the processes of the foregoing voice transmission method embodiment, achieving the same technical effects; to avoid repetition, details are not repeated here.
It should be understood that the chip mentioned in the embodiments of the present application may also be referred to as a system-level chip, a system chip, a chip system, or a system-on-chip.
It should be noted that, in this document, the terms "comprises", "comprising", or any other variation thereof are intended to cover a non-exclusive inclusion, so that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to such a process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of another identical element in the process, method, article, or apparatus that comprises that element. Further, it should be noted that the scope of the methods and apparatuses in the embodiments of the present application is not limited to performing the functions in the order illustrated or discussed; it may also include performing the functions in a substantially simultaneous manner or in a reverse order, depending on the functions involved. For example, the described methods may be performed in an order different from that described, and various steps may be added, omitted, or combined. In addition, features described with reference to certain examples may be combined in other examples.
Through the above description of the embodiments, those skilled in the art will clearly understand that the methods of the above embodiments can be implemented by software plus a necessary general-purpose hardware platform, and certainly can also be implemented by hardware, but in many cases the former is the preferable implementation. Based on such understanding, the technical solutions of the present application, or the portions thereof that contribute to the prior art, may be embodied in the form of a software product, which is stored in a storage medium (such as a ROM/RAM, a magnetic disk, or an optical disk) and includes instructions for enabling a terminal (which may be a mobile phone, a computer, a server, an air conditioner, or a network device) to execute the methods according to the embodiments of the present application.
While the present embodiments have been described with reference to the accompanying drawings, it is to be understood that the invention is not limited to the precise embodiments described above, which are meant to be illustrative and not restrictive, and that various changes may be made therein by those skilled in the art without departing from the spirit and scope of the invention as defined by the appended claims.

Claims (10)

1. A voice transmission method, the method comprising:
in a case that a session interface of a group session is displayed, receiving a first input of a user, wherein the first input is used to input first voice information, the first voice information comprises second voice information, the second voice information is used to indicate X objects, and the X objects are determined according to the second voice information;
in response to the first input, displaying object identifiers of the X objects;
receiving a second input, from the user, of a target object identifier among the object identifiers of the X objects;
sending target information in response to the second input;
wherein the object identifiers of the X objects are the object identifiers of N objects in a group corresponding to the group session and/or the object identifiers of M objects outside the group corresponding to the group session, and X, N and M are positive integers;
wherein the sending target information in response to the second input comprises:
in response to the second input, removing the second voice information from the first voice information to obtain target voice information;
when the object identifiers of the X objects are the object identifiers of the N objects in the group corresponding to the group session, sending the target information in the group session; and when the object identifiers of the X objects are the object identifiers of the M objects outside the group corresponding to the group session, sending the target information in the session corresponding to the target object identifier; wherein the target information comprises the target voice information and second information, the second information is text information, the second information is used to indicate the target object corresponding to the target object identifier, and the second information is used to remind the user of the target object to view the target voice information.
2. The method of claim 1, wherein the first input is a voice input;
wherein the displaying object identifiers of the X objects comprises:
displaying the object identifiers of the X objects while the user is inputting the first voice information.
3. The method of claim 2, wherein the voice input is a press input by the user on a voice control;
wherein the second input comprises: a sliding input in which the user, while pressing the voice control, slides to the target object identifier.
4. The method of claim 2, wherein after receiving the first input from the user, the method further comprises:
canceling display of the object identifiers of the X objects when the voice input ends.
5. A voice sending apparatus, characterized by comprising a receiving module, a display module, a sending module and an execution module;
the receiving module is configured to receive a first input of a user under a condition that a session interface of a group session is displayed, where the first input is used to input first voice information, the first voice information includes second voice information, the second voice information is used to indicate X objects, and the X objects are determined according to the second voice information;
the display module is configured to display the object identifiers of the X objects in response to the first input received by the receiving module;
the receiving module is further configured to receive a second input, by the user, of a target object identifier among the object identifiers of the X objects;
the sending module is configured to send target information in response to the second input received by the receiving module;
wherein the object identifiers of the X objects are the object identifiers of N objects in a group corresponding to the group session and/or the object identifiers of M objects outside the group corresponding to the group session, and X, N and M are positive integers;
the execution module is configured to remove the second voice information from the first voice information in response to the second input received by the receiving module, so as to obtain target voice information;
the sending module is configured to send the target information in the group session when the object identifiers of the X objects are the object identifiers of the N objects in the group corresponding to the group session, and to send the target information in the session corresponding to the target object identifier when the object identifiers of the X objects are the object identifiers of the M objects outside the group corresponding to the group session; wherein the target information comprises the target voice information and second information, the second information is text information, the second information is used to indicate the target object corresponding to the target object identifier, and the second information is used to remind the user of the target object to view the target voice information.
6. The apparatus of claim 5, wherein the first input is a voice input;
wherein the display module is specifically configured to display the object identifiers of the X objects while the user is inputting the first voice information.
7. The apparatus of claim 6, wherein the voice input is a press input by the user on a voice control;
wherein the second input comprises: a sliding input in which the user, while pressing the voice control, slides to the target object identifier.
8. The apparatus of claim 6, further comprising an execution module;
and the execution module is used for canceling the display of the object identifications of the X objects when the voice input is finished.
9. An electronic device, comprising a processor, a memory, and a program or instructions stored on the memory and executable on the processor, wherein the program or instructions, when executed by the processor, implement the steps of the voice transmission method according to any one of claims 1-4.
10. A readable storage medium, characterized in that a program or instructions are stored thereon, and the program or instructions, when executed by a processor, implement the steps of the voice transmission method according to any one of claims 1-4.
CN202110112724.5A 2021-01-27 2021-01-27 Voice transmission method, device, electronic equipment and readable storage medium Active CN113037924B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110112724.5A CN113037924B (en) 2021-01-27 2021-01-27 Voice transmission method, device, electronic equipment and readable storage medium

Publications (2)

Publication Number Publication Date
CN113037924A (en) 2021-06-25
CN113037924B (en) 2022-11-25

Family

ID=76459449

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110112724.5A Active CN113037924B (en) 2021-01-27 2021-01-27 Voice transmission method, device, electronic equipment and readable storage medium

Country Status (1)

Country Link
CN (1) CN113037924B (en)

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113676395B (en) * 2021-08-23 2023-11-07 维沃移动通信(杭州)有限公司 Information processing method, related device and readable storage medium
CN113778314A (en) * 2021-08-24 2021-12-10 维沃移动通信(杭州)有限公司 Operation execution method and device
CN114489420A (en) * 2022-01-14 2022-05-13 维沃移动通信有限公司 Voice information sending method and device and electronic equipment
CN114979050B (en) * 2022-05-13 2024-02-27 维沃移动通信(深圳)有限公司 Voice generation method, voice generation device and electronic equipment
CN115424603A (en) * 2022-08-30 2022-12-02 维沃移动通信有限公司 Voice generation method and device, electronic equipment and storage medium

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7036080B1 (en) * 2001-11-30 2006-04-25 Sap Labs, Inc. Method and apparatus for implementing a speech interface for a GUI
CN107767864A (en) * 2016-08-23 2018-03-06 阿里巴巴集团控股有限公司 Method, apparatus and mobile terminal based on voice sharing information
CN110610249A (en) * 2018-06-15 2019-12-24 阿里巴巴集团控股有限公司 Information processing method, information display method, device and service terminal
CN111884908A (en) * 2020-06-23 2020-11-03 维沃移动通信有限公司 Contact person identification display method and device and electronic equipment

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN100493045C (en) * 2005-09-30 2009-05-27 腾讯科技(深圳)有限公司 Group message transmitting method, transmitting client and system
CN111124221B (en) * 2019-12-03 2022-02-25 维沃移动通信有限公司 File sending method and terminal equipment

Also Published As

Publication number Publication date
CN113037924A (en) 2021-06-25

Similar Documents

Publication Publication Date Title
CN113037924B (en) Voice transmission method, device, electronic equipment and readable storage medium
CN111984115A (en) Message sending method and device and electronic equipment
CN113141293B (en) Message display method and device and electronic equipment
EP3866410B1 (en) Message display method, apparatus, and device
CN111984162A (en) Page display method and device and electronic equipment
CN113141294B (en) Message prompting method and device and electronic equipment
WO2022247864A1 (en) Message reminding method and apparatus, and electronic device
CN113285866B (en) Information sending method and device and electronic equipment
CN112272138B (en) Group joining method and device and electronic equipment
US20230328018A1 (en) Sending method and apparatus, and electronic device
CN111857504A (en) Information display method and device, electronic equipment and storage medium
CN111884908A (en) Contact person identification display method and device and electronic equipment
WO2023024970A1 (en) Information processing method, related device, and readable storage medium
CN112947807A (en) Display method and device and electronic equipment
CN112511412A (en) Information sending method and device, electronic equipment and readable storage medium
CN114827068A (en) Message sending method and device, electronic equipment and readable storage medium
CN112929254B (en) Message processing method and device and electronic equipment
CN112181351A (en) Voice input method and device and electronic equipment
CN112422735A (en) Information prompting method and device and electronic equipment
WO2023078301A1 (en) Chat information sending method and apparatus
WO2023072265A1 (en) Message correcting method and apparatus, and electronic device
WO2022253132A1 (en) Information display method and apparatus, and electronic device
WO2023011368A1 (en) Method and apparatus for prompting unread message, electronic device and medium
CN113824627B (en) Group chat message display method and device, electronic equipment and storage medium
CN112269510B (en) Information processing method and device and electronic equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant