CN115242747A - Voice message processing method and device, electronic equipment and readable storage medium

Info

Publication number: CN115242747A
Application number: CN202210860396.1A
Authority: CN
Inventors: 徐珑嘉
Original assignee: Vivo Mobile Communication Co Ltd
Current assignee: Vivo Mobile Communication Co Ltd
Priority date: 2022-07-21
Filing date: 2022-07-21
Publication date: 2022-10-25

Abstract

The application discloses a voice message processing method and apparatus, an electronic device, and a readable storage medium, which belong to the technical field of communication. The voice message processing method comprises: initiating voice recording requests to at least two contacts of a voice recording initiator; and, when a target voice message is obtained, sharing the target voice message with the at least two contacts; wherein the target voice message is determined based on voice messages recorded by the at least two contacts in response to the voice recording requests.

Description

Voice message processing method and device, electronic equipment and readable storage medium

Technical Field

The application belongs to the technical field of communication, and particularly relates to a voice message processing method and device, electronic equipment and a readable storage medium.

Background

Voice session messages are quick and easy to send, so more and more people use them when chatting. However, when several people chat by sending voice session messages, those messages are easily scattered among other session messages in the session interface, which makes it hard to take in the information in one place. A user has to scroll up through the session record step by step to listen to each voice session message, which slows down reading and reduces the efficiency with which the user obtains information.

Disclosure of Invention

An embodiment of the present application aims to provide a voice message processing method, an apparatus, an electronic device, and a readable storage medium, which can solve the problem that the efficiency of acquiring information from a voice message by a user is low.

In a first aspect, an embodiment of the present application provides a voice message processing method, where the method includes: initiating voice recording requests to at least two contacts of a voice recording initiator; and, when a target voice message is obtained, sharing the target voice message with the at least two contacts; wherein the target voice message is determined based on voice messages recorded by the at least two contacts in response to the voice recording requests.

In a second aspect, an embodiment of the present application provides a voice message processing apparatus, where the apparatus includes a sending unit configured to send voice recording requests to at least two contacts of a voice recording initiator; the sending unit is further configured to share a target voice message with the at least two contacts when the target voice message is obtained; wherein the target voice message is determined based on voice messages recorded by the at least two contacts in response to the voice recording requests.

In a third aspect, embodiments of the present application provide an electronic device, which includes a processor and a memory, where the memory stores a program or instructions executable on the processor, and the program or instructions, when executed by the processor, implement the steps of the voice message processing method according to the first aspect.

In a fourth aspect, the present application provides a readable storage medium, on which a program or instructions are stored, and when executed by a processor, the program or instructions implement the steps of the voice message processing method according to the first aspect.

In a fifth aspect, embodiments of the present application provide a chip, where the chip includes a processor and a communication interface, where the communication interface is coupled to the processor, and the processor is configured to execute a program or instructions to implement the steps of the voice message processing method according to the first aspect.

In a sixth aspect, the present application provides a computer program product, which is stored in a storage medium and executed by at least one processor to implement the steps of the voice message processing method according to the first aspect.

According to the voice message processing method, the voice recording initiator sends voice recording requests to at least two contacts, the target voice message is determined from the voice messages that the at least two contacts record in response to the requests they receive, and the target voice message is shared with each contact. As a result, the contacts participating in the conversation do not need to scroll up through the session record step by step to listen to the voice messages sent by the others; they can take in all the voice messages in one place through the target voice message alone, which speeds up reading and improves the efficiency with which the user obtains information from voice messages.

Drawings

Fig. 1 is a schematic flowchart of a voice message processing method according to an embodiment of the present application;

Fig. 2 is a first operation interface diagram of the voice message processing method according to an embodiment of the present application;

Fig. 3 is a second operation interface diagram of the voice message processing method according to an embodiment of the present application;

Fig. 4 is a third operation interface diagram of the voice message processing method according to an embodiment of the present application;

Fig. 5 is a fourth operation interface diagram of the voice message processing method according to an embodiment of the present application;

Fig. 6 is a fifth operation interface diagram of the voice message processing method according to an embodiment of the present application;

Fig. 7 is a sixth operation interface diagram of the voice message processing method according to an embodiment of the present application;

Fig. 8 is a seventh operation interface diagram of the voice message processing method according to an embodiment of the present application;

Fig. 9 is an eighth operation interface diagram of the voice message processing method according to an embodiment of the present application;

Fig. 10 is a ninth operation interface diagram of the voice message processing method according to an embodiment of the present application;

Fig. 11 is a tenth operation interface diagram of the voice message processing method according to an embodiment of the present application;

Fig. 12 is an eleventh operation interface diagram of the voice message processing method according to an embodiment of the present application;

Fig. 13 is a twelfth operation interface diagram of the voice message processing method according to an embodiment of the present application;

Fig. 14 is a block diagram illustrating a structure of a voice message processing apparatus according to an embodiment of the present application;

Fig. 15 is a block diagram of an electronic device according to an embodiment of the present application;

Fig. 16 is a schematic hardware structure diagram of an electronic device according to an embodiment of the present application.

Detailed Description

The technical solutions in the embodiments of the present application will be described below clearly with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are some, but not all, embodiments of the present application. All other embodiments that can be derived by one of ordinary skill in the art from the embodiments given herein are intended to be within the scope of the present disclosure.

The terms "first", "second", and the like in the description and claims of the present application are used to distinguish between similar objects and not necessarily to describe a particular order or sequence. It should be understood that terms so used are interchangeable under appropriate circumstances, so that the embodiments of the application can be implemented in orders other than those illustrated or described herein. Objects distinguished by "first", "second", and the like are generally of one class, and their number is not limited; for example, a first object may be one object or at least two objects. In addition, "and/or" in the specification and claims denotes at least one of the connected objects, and the character "/" generally indicates an "or" relationship between the objects before and after it.

An embodiment of the first aspect of the present application provides a voice message processing method. The execution subject of the method may be a voice message processing apparatus; it may be determined according to actual use requirements and is not limited by the embodiments of the present application. To describe the method more clearly, the following method embodiments take a voice message processing apparatus as the execution subject by way of example.

The following describes the voice message processing method provided by the embodiment of the present application in detail through a specific embodiment and an application scenario thereof with reference to the accompanying drawings.

As shown in fig. 1, an embodiment of the present application provides a voice message processing method, which may include the following steps S102 and S104:

S102: Initiate voice recording requests to at least two contacts of the voice recording initiator.

The voice message processing method provided by the embodiment of the application is applicable to the electronic device of the voice recording initiator. The at least two contacts are the contacts that the electronic device determines, in response to a selection input of the user (i.e., the voice recording initiator), to participate in recording the voice session message.

Specifically, in an actual application process, the electronic device receives a first input of the user and, in response, displays a contact list. It then receives a selection input of the user for at least two contacts in the contact list and, in response, determines those contacts to be the session members participating in recording the voice session message.

As shown in fig. 2, before selecting the at least two contacts, the user may record an initial voice message 202. The initial voice message 202 is not sent immediately after it is recorded; instead, it is displayed in a session interface 204 of the voice recording initiator. In this case, the first input may specifically be a touch input of the user on the initial voice message 202.

Alternatively, the voice recording initiator may not record an initial voice message, in which case the first input may be a touch input of the user directly on the session interface or on a touch button in it.

That is, the first input may be a touch input of the user on the initial voice message, on the session interface, or on a touch button in the session interface, and the touch input may be a single-click input, a double-click input, a long-press input, a sliding input along a preset direction, or the like. The user may choose the specific form of the first input according to actual conditions, which is not specifically limited here.

Further, when the contact list is displayed, information such as the nicknames and avatars of the contacts can be displayed in it, and the selection input can be a sliding input in which the user drags a contact's avatar into the session interface. In response to the selection input, the electronic device displays the avatars 206 of the at least two contacts in the session interface 204 in the order in which the user selected them, as shown in fig. 3.

In an actual application process, at least two operation controls may also be displayed in the contact list, in one-to-one correspondence with the at least two contacts. An operation control can be an add control (such as a touch button in the form of "+"); in response to a click input of the user on the add control in front of a contact, the electronic device displays the avatars of the selected contacts in the session interface in the order of selection.

Further, an operation control may also be a selection control (e.g., a touch button in the form of "o"). In response to a click input of the user on the selection control in front of a contact, the electronic device displays the avatars of the selected contacts in the session interface in the order of selection, and may display in each selection control, as a number, the position at which the user selected the corresponding contact.

In addition, when the user selects the at least two contacts, the voice recording duration of each contact can be limited; for example, the first contact may be limited to 5 seconds and the second contact to 10 seconds.
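By way of illustration only, the selection step described above can be modelled as a small data structure that remembers the order of selection, the number shown in each selection control, and any per-contact duration limit. This is a minimal sketch and not the implementation of the application; the names `ContactSelection`, `select`, `order_badges`, and `limit_for` are hypothetical, and the 5-second and 10-second limits simply reuse the example given above.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class ContactSelection:
    """Hypothetical model of the initiator's selection (names are assumptions)."""

    # Contacts in the order in which the user selected them.
    order: List[str] = field(default_factory=list)
    # Optional recording duration limit per contact, in seconds.
    limits_s: Dict[str, float] = field(default_factory=dict)

    def select(self, contact: str, limit_s: Optional[float] = None) -> None:
        # Ignore a repeated selection so that each contact appears only once.
        if contact in self.order:
            return
        self.order.append(contact)
        if limit_s is not None:
            self.limits_s[contact] = limit_s

    def order_badges(self) -> Dict[str, int]:
        # 1-based number that could be shown inside each selection control.
        return {contact: i for i, contact in enumerate(self.order, start=1)}

    def limit_for(self, contact: str) -> Optional[float]:
        # None means that no duration limit was set for this contact.
        return self.limits_s.get(contact)


selection = ContactSelection()
selection.select("first_contact", limit_s=5)
selection.select("second_contact", limit_s=10)
print(selection.order_badges())
```

Keeping the order in a list rather than a set preserves the selection sequence, which the chain recording mode described later relies on.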

Further, the voice recording request instructs the contact to record a voice message. In an actual application process, the voice recording request can act as a voice recording switch: through a touch input (such as a long-press input) on the received voice recording request, the contact starts the voice recording function of the contact's terminal device and records the corresponding voice message.

When the user has recorded an initial voice message, the voice recording request includes the initial voice message, so that the contact records a voice message in response to its specific content. When the user has not recorded an initial voice message, the voice recording request is sent to the contact in the form of a blank message.

In addition, in an actual application process, as shown in fig. 4, the voice recording request 208 received by the contact may further include a text prompt message 210 (for example, the text "please supplement the recorded voice") to remind the contact to feed back a corresponding voice message in time.
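As a non-limiting illustration, the two forms of the voice recording request (carrying an initial voice message, or blank) can be sketched as follows. The class and function names (`VoiceMessage`, `VoiceRecordingRequest`, `build_request`) and the field layout are assumptions made for this sketch; only the prompt text is taken from the description.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class VoiceMessage:
    sender: str
    audio: bytes  # placeholder for encoded audio data


@dataclass
class VoiceRecordingRequest:
    initiator: str
    # Voice messages carried by the request; empty for a blank request.
    messages: List[VoiceMessage] = field(default_factory=list)
    # Text prompt shown to the contact who receives the request.
    prompt: str = "please supplement the recorded voice"

    @property
    def is_blank(self) -> bool:
        return not self.messages


def build_request(
    initiator: str, initial: Optional[VoiceMessage] = None
) -> VoiceRecordingRequest:
    # Include the initial voice message only if the initiator recorded one.
    messages = [initial] if initial is not None else []
    return VoiceRecordingRequest(initiator=initiator, messages=messages)


blank = build_request("initiator")
with_initial = build_request("initiator", VoiceMessage("initiator", b"topic"))
print(blank.is_blank, with_initial.is_blank)
```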

S104: Share the target voice message with the at least two contacts when the target voice message is obtained.

The target voice message is determined based on the voice messages recorded by the at least two contacts in response to the voice recording request.

Further, the target voice message includes the voice messages recorded by the at least two contacts of the voice recording initiator.

Specifically, the target voice message may be a complete voice message generated in real time as the at least two contacts record their voice messages in sequence. Alternatively, the target voice message may be a complete voice message obtained after the at least two contacts each feed back their recorded voice messages to the voice recording initiator and the voice recording initiator aggregates them. On this basis, once the electronic device of the voice recording initiator obtains the target voice message, it sends the target voice message to each contact participating in the voice session, so that each participating contact can view the target voice message and obtain the corresponding information.

Specifically, in an actual application process, the user can choose a voice recording mode when selecting the at least two contacts, and the way the target voice message is generated depends on the mode chosen. The voice recording modes include a multi-person chain recording mode and a multi-person simultaneous recording mode. Here, "chain" denotes a sequential structure in which each node follows the node before it.

When the user selects the multi-person simultaneous recording mode, the electronic device of the voice recording initiator sends a voice recording request to the terminal devices of the at least two contacts at the same time. The at least two contacts then record voice messages in response to the requests they receive and send the recorded voice messages to the electronic device of the voice recording initiator, which determines the target voice message from them.

It should be noted that, as shown in fig. 10, when a contact is required to record multiple voice messages, the electronic device of the voice recording initiator may send at least two voice recording requests 208 to that contact's terminal device at the same time. That is, the electronic device of the voice recording initiator sends each contact's terminal device as many voice recording requests 208 as the number of voice messages that contact needs to record.
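The request fan-out of the simultaneous mode can be illustrated with a short sketch. The function name `fan_out_requests` and the representation of a request as a `(contact, index)` pair are assumptions made for illustration; actual delivery to terminal devices is outside the scope of the sketch.

```python
from typing import Dict, List, Tuple


def fan_out_requests(messages_needed: Dict[str, int]) -> List[Tuple[str, int]]:
    """Return one (contact, request_index) pair per voice message requested.

    `messages_needed` maps each contact to the number of voice messages
    that the contact is asked to record (hypothetical input format).
    """
    if len(messages_needed) < 2:
        raise ValueError("the method addresses at least two contacts")
    requests: List[Tuple[str, int]] = []
    for contact, count in messages_needed.items():
        if count < 1:
            raise ValueError(f"{contact} must record at least one message")
        # A contact who records several messages receives several requests.
        for index in range(1, count + 1):
            requests.append((contact, index))
    return requests


requests = fan_out_requests({"contact_a": 1, "contact_b": 2})
print(requests)
```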

Further, when the user selects the multi-person chain recording mode, the electronic device of the voice recording initiator sends a voice recording request to a first terminal corresponding to a first contact of the at least two contacts.

If the voice recording request includes an initial voice message recorded by the voice recording initiator, the first contact can play the initial voice message, then start the voice recording function of the first terminal through a touch input on the voice recording request and record a voice message in response to the content of the initial voice message. The first terminal then generates a new voice recording request from the voice message recorded by the first contact and the initial voice message, and sends it to a second terminal corresponding to a second contact of the at least two contacts.

Further, the second contact records a new voice message according to the received voice recording request; the second terminal generates a new voice recording request from the voice message recorded by the second contact, the voice message recorded by the first contact, and the initial voice message, and sends it to a third terminal corresponding to a third contact of the at least two contacts. These steps repeat until all of the at least two contacts have recorded. The terminal device of the last contact to record then generates the target voice message from the voice messages recorded by all the contacts and sends it to the electronic device of the voice recording initiator.
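The chain recording flow above can be simulated in a few lines. This is a sketch under stated assumptions: the names `ChainRequest` and `run_chain` are hypothetical, recording is represented by a caller-supplied function rather than a microphone, and the target voice message is represented as the ordered list of clips rather than as spliced audio.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Sequence, Tuple

Clip = Tuple[str, bytes]  # (speaker, audio placeholder)


@dataclass
class ChainRequest:
    # Clips accumulated so far: the initial message (if any) plus earlier recordings.
    clips: List[Clip] = field(default_factory=list)


def run_chain(
    order: Sequence[str],
    record: Callable[[str, ChainRequest], bytes],
    initial: Optional[Clip] = None,
) -> List[Clip]:
    """Pass the request along `order`; each terminal appends its contact's clip."""
    request = ChainRequest(clips=[initial] if initial is not None else [])
    for contact in order:
        audio = record(contact, request)
        # The contact's terminal generates a new request that carries every
        # earlier clip plus the one just recorded, then forwards it.
        request = ChainRequest(clips=request.clips + [(contact, audio)])
    # After the last contact records, the accumulated clips form the target message.
    return request.clips


def fake_record(contact: str, request: ChainRequest) -> bytes:
    # Stand-in for real recording: note how many clips the contact received.
    return f"{contact} heard {len(request.clips)}".encode()


target = run_chain(["a", "b", "c"], fake_record, initial=("initiator", b"topic"))
print(target)
```

In this sketch the request received by each contact grows by one clip per hop, so the last terminal holds every recording and can produce the target voice message without a further round trip to the earlier contacts.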

As shown in fig. 5 and fig. 6, at least two contact identifiers 212 may be displayed in the voice recording request 208, in one-to-one correspondence with the voice messages contained in the voice recording request 208. When playing the voice messages in the voice recording request 208, the contact can, through a touch input on a contact identifier 212, play the corresponding voice message or convert it to text. A contact identifier may specifically be the contact's avatar or another symbol associated with the contact, which is not limited here.

Further, when the contact plays a voice message in the voice recording request, as shown in fig. 5, a voice progress bar 214 may be displayed in the session interface 204, and the contact can adjust the playback position through the voice progress bar 214.

Further, as shown in fig. 7, when the contact records a new voice message 216 according to the received voice recording request 208, an edit control 218 is displayed around the new voice message 216; the edit control 218 may specifically include a determination control 220 and a cancellation control 222. The contact re-records the voice message through a touch input on the cancellation control 222, or confirms that the recording is complete through a touch input on the determination control 220.

In addition, the waiting time for each contact to record a voice message (i.e., the time from a first time point, at which the contact's terminal device receives the voice recording request, to a second time point, at which the contact confirms that the recording is complete) must not exceed a preset time threshold. When a contact's waiting time exceeds the time threshold, the text prompt information (such as the text "please supplement the recorded voice") in the voice recording request received by that contact automatically disappears; that is, the contact can no longer start the voice recording function of the terminal device through a touch input on the voice recording request. The voice recording request is then automatically sent on to the terminal device of the next contact, which improves the efficiency of chain voice recording and saves time.
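The timeout rule can be illustrated as follows. The sketch assumes that waiting times are supplied as data (seconds from receipt of the request to confirmation of the recording, or `None` if the contact never confirms); the function name `chain_with_timeout` and the 60-second threshold in the usage example are hypothetical and do not come from the application.

```python
from typing import Dict, List, Optional, Sequence, Tuple


def chain_with_timeout(
    order: Sequence[str],
    waits_s: Dict[str, Optional[float]],
    threshold_s: float,
) -> Tuple[List[str], List[str]]:
    """Split `order` into contacts who recorded in time and contacts skipped."""
    recorded: List[str] = []
    skipped: List[str] = []
    for contact in order:
        wait = waits_s.get(contact)
        if wait is None or wait > threshold_s:
            # The waiting time exceeded the threshold: the request moves on
            # to the next contact without this contact's recording.
            skipped.append(contact)
        else:
            recorded.append(contact)
    return recorded, skipped


recorded, skipped = chain_with_timeout(
    ["a", "b", "c"], {"a": 20.0, "b": 95.0, "c": 60.0}, threshold_s=60.0
)
print(recorded, skipped)
```

A waiting time exactly equal to the threshold is treated as in time, following the wording that the waiting time must not exceed the threshold.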

With the voice message processing method provided by the embodiment of the application, when several people hold a conversation by sending voice messages, the electronic device of the voice recording initiator sends voice recording requests to at least two contacts of the voice recording initiator, so that the contacts record voice messages in response to the requests they receive. The target voice message is determined from the voice messages so recorded and, once obtained, is shared with the at least two contacts. In other words, the voice messages recorded by the at least two contacts can be spliced and integrated into a target voice message that contains the voice message recorded by each contact, and the target voice message is shared with each contact. As a result, the contacts participating in the conversation do not need to scroll up through the session record step by step to listen to the voice messages sent by the others; they can take in all the voice messages in one place through the target voice message alone, which speeds up reading and improves the efficiency with which the user obtains information.

In this embodiment, before the step S102, the voice message processing method may further include the following steps S100 and S101, and on this basis, the step S102 may specifically include the following step S102a:

S100: Determine the voice recording order of the at least two contacts.

The voice recording order can be the order in which the user selected the at least two contacts.

In an actual application process, as shown in fig. 3, the avatars 206 of the at least two contacts may be displayed in the session interface 204 in the order in which the user selected them, to represent the voice recording order of the at least two contacts.

S101: Determine the first-order contact according to the voice recording order of the at least two contacts.

Specifically, for chain recording, the voice recording order of the at least two contacts can be determined from the order in which the user selected them, and the contact that comes first in that order is determined to be the first-order contact.

S102a: Initiate a voice recording request to the first-order contact.

When the first-order contact has finished recording, the voice message recorded by the first-order contact and the voice recording request are passed to the next-order contact, so that the next-order contact responds to the voice recording request and records a voice message in light of the voice message recorded by the first-order contact. When the contacts in every position of the order have finished recording, the target voice message is obtained.

Further, for the at least two contacts, the voice recording request received by each contact includes the voice messages recorded by the contacts who recorded before that contact. For example, for the K-th contact (K is a positive integer greater than or equal to 2), the received voice recording request includes the voice messages recorded by the first K-1 contacts.

Specifically, the user can choose a voice recording mode when selecting the at least two contacts. When the user selects the multi-person chain recording mode, the electronic device of the voice recording initiator sends a voice recording request, according to the order in which the user selected the at least two contacts, to the terminal device of the contact selected first, referred to as the first terminal.

If the voice recording request includes an initial voice message recorded by the voice recording initiator, the first contact can play the initial voice message, then start the voice recording function of the first terminal through a touch input on the voice recording request and record a voice message in response to the content of the initial voice message. The first terminal then generates a new voice recording request from the voice message recorded by the first contact and the initial voice message, and sends it to the terminal device of the contact selected second, referred to as the second terminal.

Further, the second contact records a new voice message according to the received voice recording request; the second terminal generates a new voice recording request from the voice message recorded by the second contact, the voice message recorded by the first contact, and the initial voice message, and sends it to the terminal device of the contact selected third, referred to as the third terminal. These steps repeat until all of the at least two contacts have recorded. The terminal device of the last contact to record then generates the target voice message from the voice messages recorded by all the contacts and sends it to the electronic device of the voice recording initiator.

According to the embodiment provided by the application, the electronic device of the voice recording initiator can send the voice recording request to the first-order contact according to the preset voice recording order. The voice recording request causes the at least two contacts to pass the request on, in a chain, to the next-order contact according to the preset voice recording order until all of the at least two contacts have recorded. It should be noted that, starting from the second contact, the voice recording request received by each contact includes the voice messages recorded by all previous contacts; that is, the voice recording request received by the K-th contact (K is a positive integer greater than or equal to 2) contains the voice messages recorded by the first K-1 contacts. Accordingly, after the last contact finishes recording, a complete voice message including all the voice messages recorded by the at least two contacts, namely the target voice message, is obtained. Therefore, when several people hold a conversation by sending voice messages, the session members do not need to scroll up through the session record step by step to listen to the voice messages sent by the at least two contacts; they can take in the messages in one place through the target voice message alone, which speeds up reading and improves the efficiency with which the user obtains information.

In addition, it should be noted that, in an actual application process, the order in which the at least two contacts record may be left unrestricted after the user selects them. In this case, the electronic device of the voice recording initiator sends the initial voice recording request to a randomly chosen one of the at least two contacts, and the voice recording request and the recorded voice messages are then passed among the at least two contacts at random and without repetition until all of them have recorded. The terminal device of the last contact to record generates the target voice message from the voice messages recorded by the at least two contacts and sends it to the voice recording initiator.
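Random, non-repeating forwarding can be sketched by drawing the next recipient from the contacts who have not yet recorded. The function name `random_chain_order` is hypothetical, and the use of a seeded `random.Random` instance is an assumption made only so that the example is reproducible.

```python
import random
from typing import List, Optional, Sequence


def random_chain_order(
    contacts: Sequence[str], rng: Optional[random.Random] = None
) -> List[str]:
    """Return a recording order in which every contact appears exactly once."""
    rng = rng or random.Random()
    remaining = list(contacts)
    order: List[str] = []
    while remaining:
        # Pick the next recipient among the contacts who have not recorded yet.
        index = rng.randrange(len(remaining))
        order.append(remaining.pop(index))
    return order


order = random_chain_order(["a", "b", "c", "d"], random.Random(0))
print(order)
```

Removing each chosen contact from `remaining` is what guarantees that no contact receives the request twice; the last element of the returned order corresponds to the contact whose terminal generates the target voice message.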

In this embodiment of the present application, the voice message processing method may further include the following step S106:

S106: Display the voice recording progress while the at least two contacts record voice messages.

Specifically, while the at least two contacts record voice messages, their recording progress is dynamically displayed on the session interface of the electronic device of the voice recording initiator, so that the voice recording initiator can clearly follow it.

Illustratively, as shown in fig. 8, a recording progress bar 224 is displayed on the session interface 204 of the electronic device of the voice recording initiator and indicates the recording progress of the at least two contacts. Moreover, as shown in fig. 8, according to that progress, the contact identifiers 212 (such as a contact's avatar or another symbol associated with the contact) of the contacts who have finished recording and of the contact who is currently recording may be displayed synchronously on the session interface 204, so that the voice recording initiator can clearly follow the recording progress of the at least two contacts.
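One possible way to derive the data behind such a progress display is sketched below. The function name `recording_progress`, the status labels, and the choice to express progress as the fraction of contacts who have finished are all assumptions; the application does not specify how the progress bar value is computed.

```python
from typing import Dict, List, Optional, Sequence, Set


def recording_progress(
    order: Sequence[str],
    finished: Set[str],
    current: Optional[str] = None,
) -> Dict[str, object]:
    """Summarise recording state for a progress bar and contact identifiers."""
    if not order:
        raise ValueError("order must contain at least one contact")
    statuses: Dict[str, str] = {}
    for contact in order:
        if contact in finished:
            statuses[contact] = "finished"
        elif contact == current:
            statuses[contact] = "recording"
        else:
            statuses[contact] = "pending"
    done = sum(1 for status in statuses.values() if status == "finished")
    # Identifiers to show: contacts who have finished or are recording now.
    shown: List[str] = [c for c in order if statuses[c] != "pending"]
    return {
        "fraction": done / len(order),
        "statuses": statuses,
        "identifiers": shown,
    }


progress = recording_progress(["a", "b", "c"], finished={"a"}, current="b")
print(progress["identifiers"], progress["statuses"])
```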

According to the embodiment provided by the application, while the at least two contacts record voice messages, their recording progress is dynamically displayed on the session interface of the electronic device of the voice recording initiator, so that the voice recording initiator can clearly follow the recording progress of the at least two contacts.

In this embodiment, the step of generating the target voice message may specifically include the following steps S108 and S110:

S108: receiving the voice messages respectively recorded by the at least two contacts.

Specifically, when the user selects the at least two contacts, the user can select a voice recording mode. Under the condition that the user selects the multi-person simultaneous recording mode, the electronic equipment of the voice recording initiator simultaneously sends the voice recording request to the terminal equipment of at least two contacts, so that each contact can simultaneously record voice messages according to the received voice recording request, and the recording time of the voice messages is saved.

It should be noted that, when a certain contact is required to record multiple voice messages, the electronic device of the voice recording initiator may send at least two voice recording requests to the terminal device of the contact at the same time. That is, the electronic device of the voice recording initiator sends the voice recording requests of corresponding quantity to the terminal devices of at least two contacts according to the quantity of the voice messages to be recorded by each contact.
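The rule that each contact receives as many voice recording requests as the number of voice messages it must record can be sketched as below. The request fields `to` and `slot` are hypothetical names chosen for this example.

```python
def build_requests(messages_needed):
    """Return one request per voice message that each contact must record.

    `messages_needed` maps a contact to the number of voice messages that
    contact is expected to record (an assumed input format).
    """
    requests = []
    for contact, count in messages_needed.items():
        for slot in range(count):
            # One request per required message; `slot` numbers them per contact.
            requests.append({"to": contact, "slot": slot})
    return requests
```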

On the basis, the terminal equipment of the voice recording initiator can respectively receive the voice message fed back by each contact person in the at least two contact persons according to the received voice recording request, so that the voice message fed back by each contact person is processed subsequently to obtain the target voice message.

S110: generating the target voice message according to the voice messages respectively recorded by the at least two contacts.

Specifically, in the mode of recording multiple persons simultaneously, at least two contacts record corresponding voice messages according to respective received voice recording requests, and send the recorded voice messages to the electronic equipment of the voice recording initiator, and the electronic equipment of the voice recording initiator receives the voice messages respectively fed back by the at least two contacts and collects the received voice messages, so that the target voice message is obtained.

According to the embodiment provided by the application, the electronic equipment of the voice recording initiator can simultaneously send the voice recording requests to at least two contacts, so that each contact can simultaneously record voice messages according to the received voice recording requests. On the basis, the electronic equipment of the voice recording initiator receives the voice messages fed back by each contact according to the received voice recording requests, and summarizes the received voice messages, so that the target voice message is obtained. Therefore, when a plurality of people carry out conversation by sending the voice message, the time for the plurality of people to record the voice message is saved, and the voice message recorded by the plurality of people is integrated, so that conversation members do not need to turn up the conversation record step by step to read a plurality of voice messages sent by at least two contacts, the messages can be read in a centralized manner only through the target voice message, the message reading rate of a user is improved, and the information obtaining efficiency of the user is improved.

In the embodiment of the present application, the following S109 may further be performed before the foregoing S110, and on this basis, the foregoing S110 may specifically include the following S110a:

S109: determining a voice synthesis sequence of the at least two contacts.

The voice synthesis sequence may be a sequence in which the user selects at least two contacts, and the voice synthesis sequence may also be a sequence in which the voice recording initiator receives voice messages sent by the at least two contacts. For the specific content of the above speech synthesis sequence, the user may set according to the actual situation, and is not limited specifically here.

S110a: performing voice synthesis processing on the voice messages respectively recorded by the at least two contacts according to the voice synthesis sequence of the at least two contacts, to obtain the target voice message.

Specifically, in the mode of simultaneous recording by multiple persons, the electronic device of the voice recording initiator performs automatic splicing and integration on the received voice messages recorded by at least two contacts according to a preset voice synthesis sequence, so as to obtain the target voice message.

The voice synthesis sequence may also be a sequence in which the electronic device of the voice recording initiator receives voice messages recorded by at least two contacts, and the sequence is not limited specifically herein.

According to the embodiment provided by the application, the electronic equipment of the voice recording initiator automatically splices and integrates the voice messages recorded by at least two contacts according to the preset voice synthesis sequence, so as to obtain the target voice message. Therefore, the determining efficiency of the target voice message is improved, and the efficiency of recording voices by multiple persons is improved.
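A minimal sketch of the automatic splicing is given below. It assumes, purely for illustration, that each voice message is a `bytes` object of raw audio in a common format, so that splicing reduces to concatenation; the function names and the `mode` values are hypothetical.

```python
def synthesis_order(selected, arrivals, mode="selection"):
    """Return the contacts in the order in which their messages are spliced.

    `selected` lists contacts in the order the user selected them;
    `arrivals` lists contacts in the order their messages were received.
    """
    if mode == "selection":
        return list(selected)
    if mode == "arrival":
        return list(arrivals)
    raise ValueError(f"unknown mode: {mode}")


def splice(messages, order):
    """Concatenate per-contact audio following `order` (assumes raw bytes)."""
    return b"".join(messages[contact] for contact in order)
```

For example, `splice(messages, synthesis_order(selected, arrivals, "arrival"))` would place the voice messages in the order in which the voice recording initiator received them.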

In addition, in an actual application process, after receiving the voice messages sent by the at least two contacts, as shown in fig. 9, the electronic device of the voice recording initiator may display the voice messages 226 sent by the at least two contacts on the session interface 204, and the voice recording initiator may perform manual concatenation on the voice messages recorded by the at least two contacts through a dragging operation on the voice messages 226 displayed in the session interface 204, so as to obtain the target voice message 228.

In the embodiment of the present application, the S109 may specifically include the following S109a, and on this basis, the S110a may specifically include the following S110a1:

S109a: for each contact, determining the voice synthesis sequence of the contact on at least one of at least two audio tracks.

Specifically, as shown in fig. 11, in the multi-person simultaneous recording mode, when the voice recording initiator selects at least two contacts, at least two tracks 230 may be set according to actual application scenarios (such as song harmony, symphony music performance, band performance, and the like), and the contact identifier 212 of the corresponding contact may be selected and added to a different track 230.

On this basis, the electronic device of the voice recording initiator determines the voice synthesis sequence of each contact on each audio track according to the arrangement sequence of the contact identifiers 212 in each audio track 230 or according to the sequence of the contact identifiers 212 in each audio track 230 selected by the user.

S110a1: performing voice synthesis processing on the voice messages respectively recorded by the at least two contacts according to the voice synthesis sequence of the at least two contacts on each audio track, to obtain the target voice message.

Specifically, according to the voice synthesis sequence of each contact on each audio track, the electronic device of the voice recording initiator first integrates the voice messages located at the same order position in different audio tracks, to obtain at least two voice segments.

It should be noted that, when the voice messages located at the same order position in different audio tracks are integrated, they are assigned the same playing time period rather than being spliced one after another. That is, during playback, the voice messages at the same order position in different audio tracks are played simultaneously.

Further, after the at least two voice segments are obtained, they are spliced and integrated according to the preset voice synthesis sequence, so as to obtain the target voice message. When the target voice message is played, the voice messages within the same voice segment are played simultaneously.

In the embodiment provided by the application, when the electronic device of the voice recording initiator splices and integrates the voice messages recorded by at least two contacts according to a preset voice synthesis sequence to obtain the target voice message, the voice synthesis sequence of each contact on at least one audio track of at least two audio tracks is specifically determined, and then the voice synthesis processing is performed on the voice messages recorded by the at least two contacts respectively according to the voice synthesis sequence of the at least two contacts on each audio track to obtain the target voice message. Therefore, the voice messages of different audio tracks are spliced according to the corresponding sequence, so that the target voice message of a plurality of voice messages containing different audio tracks is obtained, the speed of reading the message by a user is improved, the efficiency of obtaining information by the user is improved, and meanwhile, the application scenes (such as song harmony, symphony music performance, band performance and the like) of audio recording are increased.
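The multi-track integration can be illustrated by computing a playback timeline. In the sketch below, each track is a list of `(contact, duration_seconds)` pairs in synthesis order; voice messages at the same order position start together, and the segments are then placed one after another. The assumption that a segment lasts as long as its longest voice message is made for this example only and is not stated in the application, and `build_timeline` is a hypothetical name.

```python
from itertools import zip_longest


def build_timeline(tracks):
    """Return (contact, track_index, start, end) entries for the target message.

    Messages at the same order position in different tracks share a start
    time; consecutive segments are spliced one after another.
    """
    timeline = []
    start = 0.0
    for segment in zip_longest(*tracks):
        # Keep only the tracks that actually have a message at this position.
        present = [(i, item) for i, item in enumerate(segment) if item is not None]
        for i, (contact, duration) in present:
            timeline.append((contact, i, start, start + duration))
        # Assumed rule: the next segment starts when the longest message ends.
        start += max(duration for _, (_, duration) in present)
    return timeline
```

With two tracks `[("a", 3.0), ("b", 2.0)]` and `[("c", 4.0)]`, contacts a and c start together at 0.0, and contact b starts at 4.0 once the first segment has ended.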

In this embodiment of the present application, the voice message processing method may further include the following S112 and S114:

S112: displaying the contact identifiers of the at least two contacts when the target voice message is acquired.

Each contact identifier corresponds to, and is used to indicate, the voice message recorded by the corresponding contact.

Specifically, after the electronic device of the voice recording initiator acquires the target voice message, at least two contact identifiers may be displayed on the session interface. The contact identifier may be specifically an icon of the contact or other symbolic identifier associated with the contact, and is not specifically limited herein.

S114: in response to an input on a target contact identifier among the at least two contact identifiers, adjusting the voice message recorded by the target contact corresponding to the target contact identifier.

The input of the target contact person identifier is a touch input of the user to the target contact person identifier, and the touch input may be a single-click input, a double-click input, a long-press input, a sliding input along a preset direction, and the like. The specific form of the input is selected by the user according to the actual situation, and is not limited in particular.

Specifically, when the at least two contact identifiers are displayed on the session interface, the electronic device of the voice recording initiator, in response to an input by the voice recording initiator that adjusts the position of the target contact identifier among the at least two contact identifiers, adjusts the playing order, within the target voice message, of the voice message recorded by the target contact. In addition, in response to an input by the voice recording initiator that deletes the target contact identifier, the electronic device of the voice recording initiator may also delete the voice message recorded by the target contact from the target voice message.

After the voice recording initiator deletes the voice message recorded by the target contact person in the target voice message by deleting the target contact person identifier, the electronic equipment of the voice recording initiator automatically reissues a voice recording request to the target contact person, and adds the voice message fed back again by the target contact person to the target position of the target voice message. The target location may be an original location of the deleted voice message in the target voice message, or may also be a last location in the target voice message, which is not limited herein.

Illustratively, as shown in fig. 12 (a), 4 contact identifications 212 are displayed in the target voice message 228 in the session interface 204, and the electronic device of the voice recording initiator deletes the target contact identification 232 displayed in the target voice message 228 and deletes one voice message corresponding to the target contact identification 232 in the target voice message 228 in response to a slide input by the voice recording initiator to move the target contact identification 232 out of the target voice message 228, as shown in fig. 12 (b).

Illustratively, as shown in fig. 13 (a), 4 contact identifiers 212 are displayed in the target voice message 228 in the session interface 204. In response to a sliding input by the voice recording initiator, the electronic device of the voice recording initiator moves the first target contact identifier 234 to the second position among the 4 contact identifiers 212 and moves the second target contact identifier 236 to the fourth position. As shown in fig. 13 (b), the first target contact identifier 234 is then displayed in the second position instead of the fourth, and the second target contact identifier 236 is displayed in the fourth position instead of the second.
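The adjustments illustrated in fig. 12 and fig. 13 amount to edits on the playing order of the voice messages. The sketch below models that order as a list of contact identifiers; the function names are hypothetical, and the handling of touch input and audio data is left out.

```python
def swap_identifiers(order, first, second):
    """Exchange the positions of two contact identifiers (as in fig. 13)."""
    result = list(order)
    i, j = result.index(first), result.index(second)
    result[i], result[j] = result[j], result[i]
    return result


def delete_identifier(order, contact):
    """Remove a contact identifier and hence its voice message (as in fig. 12)."""
    return [c for c in order if c != contact]


def reinsert(order, contact, position=None):
    """Add a re-recorded message at its original position, or at the end."""
    result = list(order)
    result.insert(len(result) if position is None else position, contact)
    return result
```

Rebuilding the target voice message after any of these operations would then only require splicing the voice messages again in the new order.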

In addition, in an actual application process, the voice recording initiator may, through a touch input on a contact identifier, play the corresponding voice message or convert the corresponding voice message to text.

According to the embodiment provided by the application, after the target voice message is determined, the contact person identifiers of at least two contact persons can be displayed on the session interface, and on the basis, a user (namely, a voice recording initiator) can adjust the voice message recorded by the target contact person corresponding to the target contact person identifier through touch input of the target contact person identifier in the at least two contact person identifiers. Therefore, the voice messages recorded by different contacts in the target voice message are associated with the corresponding contact identifications, the user can adjust the voice content of the obtained target voice message by simply inputting the contact identifications, and the convenience of adjusting the target voice message is improved.

In the voice message processing method provided by the embodiment of the first aspect of the present application, the execution subject may be a voice message processing apparatus. In the embodiment of the present application, a voice message processing apparatus executes the voice message processing method as an example, and the voice message processing apparatus provided in the second aspect of the present application is described.

As shown in fig. 14, an embodiment of the present application provides a voice message processing apparatus 1400, which may include the following sending unit 1402.

The sending unit 1402 is configured to initiate a voice recording request to at least two contacts of a voice recording initiator;

the sending unit 1402 is further configured to share a target voice message with the at least two contacts when the target voice message is obtained, wherein the target voice message is determined based on the voice messages recorded by the at least two contacts in response to the voice recording request.

With the voice message processing apparatus 1400 provided in this embodiment of the present application, when multiple persons perform a conversation by sending voice messages, for an electronic device of a voice recording initiator, a voice recording request is sent to at least two contacts of the voice recording initiator, so that the at least two contacts record corresponding voice messages according to the respective received voice recording requests. On the basis, the target voice message is determined according to the voice messages recorded by the at least two contacts of the voice recording initiator according to the received voice recording requests, and then the target voice message is shared with the at least two contacts under the condition that the target voice message is obtained. That is to say, when a plurality of persons carry out a conversation by sending a voice message, the voice messages recorded by at least two contacts of the voice recording initiator can be spliced and integrated, so that a target voice message containing the voice message recorded by each contact is obtained, and the target voice message is shared with each contact. Therefore, each contact person participating in the conversation does not need to gradually turn up the conversation record to read the voice messages sent by other contact persons, the voice messages can be read in a centralized manner only through the target voice message, the speed of reading the voice messages by the user is improved, and the efficiency of obtaining information by the user is improved.

In this embodiment, the voice message processing apparatus 1400 further includes a processing unit 1404, and the processing unit 1404 is configured to: determine a voice recording sequence of the at least two contacts; and determine a first-order contact according to the voice recording sequence of the at least two contacts. The sending unit 1402 is specifically configured to: initiate a voice recording request to the first-order contact; when the first-order contact finishes recording, transmit the voice message recorded by the first-order contact together with the voice recording request to the next-order contact, so that the next-order contact responds to the voice recording request and records a voice message on the basis of the voice message recorded by the first-order contact; and acquire the target voice message when the contacts at every position in the voice recording sequence have finished recording.

In this embodiment, the voice message processing apparatus 1400 further includes: the display unit 1406 is configured to display a voice recording progress when at least two contacts record a voice message.

In this embodiment, the voice message processing apparatus 1400 further includes a receiving unit 1408, where the receiving unit 1408 is configured to: receiving voice messages recorded by at least two contacts respectively; the processing unit 1404 is specifically configured to: and generating a target voice message according to the voice messages recorded by the at least two contacts respectively.

According to the embodiment provided by the application, the electronic equipment of the voice recording initiator can simultaneously send the voice recording requests to at least two contacts, so that each contact can simultaneously record the voice message according to the received voice recording requests. On the basis, the electronic equipment of the voice recording initiator receives the voice messages fed back by each contact according to the received voice recording requests, and summarizes the received voice messages, so that the target voice message is obtained. Therefore, when a plurality of people carry out conversation by sending the voice message, the time for the plurality of people to record the voice message is saved, and the voice message recorded by the plurality of people is integrated, so that conversation members do not need to turn up the conversation record step by step to read a plurality of voice messages sent by at least two contacts, the messages can be read in a centralized manner only through the target voice message, the message reading rate of a user is improved, and the information obtaining efficiency of the user is improved.

In the embodiment of the present application, the processing unit 1404 is specifically configured to: determining a speech synthesis order of at least two contacts; and according to the voice synthesis sequence of the at least two contacts, performing voice synthesis processing on the voice messages recorded by the at least two contacts respectively to obtain the target voice message.

In addition, in an actual application process, after receiving the voice messages sent by the at least two contacts, the electronic device of the voice recording initiator may display the voice messages sent by the at least two contacts on the session interface, and the voice recording initiator may manually splice the voice messages recorded by the at least two contacts through a dragging operation of the voice messages displayed in the session interface, so as to obtain the target voice message.

In the embodiment of the present application, the processing unit 1404 is specifically configured to: for each contact, determining a speech synthesis order of the contact on at least one of the at least two audio tracks; and carrying out voice synthesis processing on the voice messages recorded by the at least two contacts respectively according to the voice synthesis sequence of the at least two contacts on each audio track to obtain the target voice message.

In the embodiment provided by the application, when the electronic device of the voice recording initiator splices and integrates voice messages recorded by at least two contacts according to a preset voice synthesis sequence to obtain the target voice message, a voice synthesis sequence of each contact on at least one audio track of at least two audio tracks is specifically determined, and then voice synthesis processing is performed on the voice messages respectively recorded by the at least two contacts according to the voice synthesis sequence of the at least two contacts on each audio track to obtain the target voice message. Therefore, the voice messages of different audio tracks are spliced according to the corresponding sequence, so that the target voice message of a plurality of voice messages containing different audio tracks is obtained, the speed of reading the message by a user is improved, the efficiency of obtaining information by the user is improved, and meanwhile, the application scenes (such as song harmony, symphony music performance, band performance and the like) of audio recording are increased.

In this embodiment, the display unit 1406 is specifically configured to: displaying contact person identifications of at least two contact persons under the condition of acquiring the target voice message; the processing unit 1404 is specifically configured to: and responding to the input of the target contact person identification in the at least two contact person identifications, and adjusting the voice message recorded by the target contact person corresponding to the target contact person identification.

In the above embodiment provided by the application, after the target voice message is determined, the contact person identifiers of at least two contact persons may be displayed on the session interface, and on this basis, a user (i.e., a voice recording initiator) may adjust the voice message recorded by the target contact person corresponding to the target contact person identifier by touch input of the target contact person identifier of the at least two contact person identifiers. Therefore, the voice messages recorded by different contacts in the target voice message are associated with the corresponding contact identifications, the user can adjust the voice content of the obtained target voice message by simply inputting the contact identifications, and the convenience of adjusting the target voice message is improved.

The voice message processing apparatus 1400 in the embodiment of the present application may be an electronic device, or may be a component in an electronic device, such as an integrated circuit or a chip. The electronic device may be a terminal, or may be a device other than a terminal. For example, the electronic device may be a mobile phone, a tablet computer, a notebook computer, a palmtop computer, a vehicle-mounted electronic device, a Mobile Internet Device (MID), an Augmented Reality (AR)/Virtual Reality (VR) device, a robot, a wearable device, an Ultra-Mobile Personal Computer (UMPC), a netbook, or a Personal Digital Assistant (PDA), and may also be a server, a Network Attached Storage (NAS), a Personal Computer (PC), a Television (TV), a teller machine, a self-service machine, or the like, which is not specifically limited in the embodiments of the present application.

The voice message processing apparatus 1400 in the embodiment of the present application may be an apparatus having an operating system. The operating system may be an Android operating system (Android), an iOS operating system, or other possible operating systems, which is not specifically limited in the embodiments of the present application.

The voice message processing apparatus 1400 provided in the embodiment of the second aspect of the present application can implement each process implemented in the embodiment of the method in fig. 1, and is not described herein again to avoid repetition.

Optionally, as shown in fig. 15, an electronic device 1500 according to an embodiment of the present application is further provided, and includes a processor 1502 and a memory 1504, where the memory 1504 stores a program or an instruction that can be executed on the processor 1502, and when the program or the instruction is executed by the processor 1502, the steps of the voice message processing method embodiment according to the first aspect described above are implemented, and the same technical effects can be achieved, and are not described again to avoid repetition.

It should be noted that the electronic devices in the embodiments of the present application include the mobile electronic device and the non-mobile electronic device described above.

Fig. 16 is a schematic hardware structure diagram of an electronic device implementing an embodiment of the present application.

The electronic device 1600 includes, but is not limited to: radio frequency unit 1601, network module 1602, audio output unit 1603, input unit 1604, sensor 1605, display unit 1606, user input unit 1607, interface unit 1608, memory 1609, and processor 1610.

Those skilled in the art will appreciate that the electronic device 1600 may further include a power supply (e.g., a battery) for supplying power to the various components, and the power supply may be logically connected to the processor 1610 through a power management system, so that charging, discharging, and power consumption are managed through the power management system. The electronic device structure shown in fig. 16 does not constitute a limitation of the electronic device, and the electronic device may include more or fewer components than those shown, or combine some components, or arrange the components differently, and details are not described herein again.

The electronic device 1600 of this embodiment of the present application may be configured to implement the steps of the foregoing voice message processing method embodiment of the first aspect.

Wherein, the processor 1610 is configured to initiate a voice recording request to at least two contacts of a voice recording initiator.

The processor 1610 is further configured to share the target voice message with at least two contacts if the target voice message is acquired.

In the embodiment of the application, when a plurality of people carry out a conversation by sending voice messages, the electronic equipment of the voice recording initiator sends a voice recording request to at least two contacts of the voice recording initiator, so that the at least two contacts record corresponding voice messages according to the received voice recording requests. On this basis, the target voice message is determined according to the voice messages recorded by the at least two contacts of the voice recording initiator according to the received voice recording requests, and the target voice message is then shared with the at least two contacts once it is obtained. That is to say, when a plurality of persons carry out a conversation by sending voice messages, the voice messages recorded by the at least two contacts of the voice recording initiator can be spliced and integrated to obtain a target voice message containing the voice message recorded by each contact, and the target voice message is shared with each contact. Therefore, each contact participating in the conversation does not need to scroll back through the conversation record step by step to listen to the voice messages sent by other contacts, and can listen to them in one place through the target voice message alone, which improves the speed at which the user reads voice messages and the efficiency with which the user obtains information.

Optionally, the processor 1610 is specifically configured to: determine a voice recording sequence of the at least two contacts; determine a first-order contact according to the voice recording sequence of the at least two contacts; initiate a voice recording request to the first-order contact; when the first-order contact finishes recording, transmit the voice message recorded by the first-order contact together with the voice recording request to the next-order contact, so that the next-order contact responds to the voice recording request and records a voice message on the basis of the voice message recorded by the first-order contact; and obtain the target voice message when the contacts at every position in the voice recording sequence have finished recording.

According to the embodiment provided by the application, the electronic equipment of the voice recording initiator can send the voice recording request to the first-order contact in a preset voice recording sequence, wherein the voice recording request is used for controlling the at least two contacts to pass the voice recording request on to the next-order contact, in a chained manner and according to the preset voice recording sequence, until all of the at least two contacts have finished recording. It should be noted that, starting from the second contact, the voice recording request received by each contact includes the voice messages recorded by all previous contacts. Namely, the voice recording request received by the Kth contact (K is a positive integer greater than or equal to 2) contains the voice messages fed back by the first K-1 contacts. That is, after the last contact completes the recording, a complete voice message including all the voice messages recorded by the at least two contacts, that is, the target voice message, can be obtained. Therefore, when a plurality of persons carry out a conversation by sending voice messages, conversation members do not need to scroll back through the conversation record step by step to listen to the voice messages sent by the at least two contacts, and can listen to them in one place through the target voice message alone, which improves the message reading rate of the user and the efficiency with which the user obtains information.
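The chained mode, in which the request received by the Kth contact carries the voice messages of the first K-1 contacts, can be sketched as follows. The function `chained_recording`, the request fields `to` and `previous`, and the `record` callback are assumptions made for this example; the hand-off between terminal devices is simulated by a simple loop.

```python
def chained_recording(order, record):
    """Relay a recording request along `order`, accumulating voice messages.

    `record(request)` is a hypothetical callback returning the voice message
    recorded by `request["to"]` after hearing `request["previous"]`.
    """
    recorded = []
    for contact in order:
        # The Kth contact's request contains the first K-1 recorded messages.
        request = {"to": contact, "previous": list(recorded)}
        recorded.append(record(request))
    # After the last contact records, `recorded` holds every voice message,
    # which together form the target voice message.
    return recorded
```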

Optionally, the display unit 1606 is configured to: and displaying the voice recording progress under the condition that at least two contacts record the voice message.

According to the embodiment provided by the application, while the at least two contacts are recording voice messages, their recording progress is dynamically displayed on the session interface of the electronic device of the voice recording initiator, so that the voice recording initiator can clearly follow the recording progress of the at least two contacts.
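As a rough illustration of how such a progress state might be tracked and rendered, the sketch below keeps one status per contact and derives a summary string. The `Status` values, the function name `progress_text` and the text format are assumptions made for this example and are not specified by the application.

```python
from enum import Enum
from typing import Dict


class Status(Enum):
    """Hypothetical per-contact recording states."""
    PENDING = "pending"
    RECORDING = "recording"
    DONE = "done"


def progress_text(states: Dict[str, Status]) -> str:
    """Summarize how many contacts have finished recording."""
    done = sum(1 for s in states.values() if s is Status.DONE)
    return f"{done}/{len(states)} recorded"
```

A session interface could call `progress_text` each time a contact's status changes and refresh the displayed label.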

Optionally, the processor 1610 is specifically configured to: receive the voice messages respectively recorded by the at least two contacts; and generate the target voice message according to the voice messages respectively recorded by the at least two contacts.

According to the embodiment provided by the application, the electronic device of the voice recording initiator can send the voice recording request to the at least two contacts simultaneously, so that each contact can record a voice message at the same time in response to the received voice recording request. On this basis, the electronic device of the voice recording initiator receives the voice message fed back by each contact in response to the voice recording request and combines the received voice messages to obtain the target voice message. Therefore, when a plurality of persons converse by sending voice messages, the time needed for them to record the voice messages is reduced, and the recorded voice messages are integrated, so that conversation members do not need to scroll back through the conversation history to read the voice messages sent by the at least two contacts one by one; the messages can be read in a centralized manner through the target voice message alone, which improves the user's message reading rate and the efficiency with which the user obtains information.
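The simultaneous variant can be sketched as follows. This is only an illustration under stated assumptions: the functions `collect_parallel` and `summarize` are hypothetical, each contact is modeled as a callable returning its clip as `bytes`, threads stand in for the remote devices, and the clips are joined in a caller-supplied order.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List


def collect_parallel(contacts: List[str],
                     record: Dict[str, Callable[[], bytes]]) -> Dict[str, bytes]:
    """Send the request to every contact at once and gather the replies."""
    with ThreadPoolExecutor(max_workers=len(contacts)) as pool:
        # All recordings start without waiting for one another.
        futures = {c: pool.submit(record[c]) for c in contacts}
        # result() blocks until the corresponding contact has replied.
        return {c: f.result() for c, f in futures.items()}


def summarize(clips: Dict[str, bytes], order: List[str]) -> bytes:
    """Assumption: the target voice message joins the clips in the given order."""
    return b"".join(clips[c] for c in order)
```

Compared with the chained variant, the total waiting time here is bounded by the slowest contact rather than by the sum of all recording times.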

Optionally, the processor 1610 is specifically configured to: determine a speech synthesis order of the at least two contacts; and perform, according to the speech synthesis order of the at least two contacts, voice synthesis processing on the voice messages respectively recorded by the at least two contacts to obtain the target voice message.

Optionally, the processor 1610 is specifically configured to: determine, for each contact, a speech synthesis order of the contact on at least one of at least two audio tracks; and perform, according to the speech synthesis order of the at least two contacts on each audio track, voice synthesis processing on the voice messages respectively recorded by the at least two contacts to obtain the target voice message.

In the embodiment provided by the application, when the electronic device of the voice recording initiator splices and integrates the voice messages recorded by the at least two contacts according to a preset speech synthesis order to obtain the target voice message, it specifically determines, for each contact, a speech synthesis order on at least one of at least two audio tracks, and then performs voice synthesis processing on the voice messages respectively recorded by the at least two contacts according to the speech synthesis order of the at least two contacts on each audio track. In this way, the voice messages on different audio tracks are spliced in their corresponding orders, yielding a target voice message that contains multiple voice messages on different audio tracks. This improves the user's message reading rate and information obtaining efficiency, and also broadens the application scenarios of audio recording (for example, song harmonies, symphonic performances, and band performances).
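One possible reading of the multi-track synthesis is sketched below. The application does not specify how tracks are combined, so the sketch makes explicit assumptions: clips are lists of float samples, each track is the concatenation of its contacts' clips in that track's order, and the tracks are mixed by summing samples position by position, with shorter tracks padded by silence. The name `synthesize` is hypothetical.

```python
from itertools import zip_longest
from typing import Dict, List


def synthesize(clips: Dict[str, List[float]],
               track_orders: List[List[str]]) -> List[float]:
    """Splice clips per track in order, then mix the tracks sample by sample."""
    tracks: List[List[float]] = []
    for order in track_orders:
        track: List[float] = []
        for contact in order:
            # Contacts on the same track are placed one after another.
            track.extend(clips[contact])
        tracks.append(track)
    # Assumption: mixing is a plain sum; shorter tracks are padded with 0.0.
    return [sum(samples) for samples in zip_longest(*tracks, fillvalue=0.0)]
```

For example, with `track_orders = [["a", "b"], ["c"]]`, contacts a and b are heard in succession while contact c plays alongside them, as in a lead vocal over an accompaniment.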

Optionally, the display unit 1606 is further configured to: display contact identifiers of the at least two contacts when the target voice message is obtained; and the processor 1610 is further configured to: in response to an input on a target contact identifier among the at least two contact identifiers, adjust the voice message recorded by the target contact corresponding to the target contact identifier.

According to the embodiment provided by the application, after the target voice message is determined, the contact person identifiers of at least two contact persons can be displayed on the session interface, and on the basis, a user (namely, a voice recording initiator) can adjust the voice message recorded by the target contact person corresponding to the target contact person identifier through touch input of the target contact person identifier in the at least two contact person identifiers. Therefore, the voice messages recorded by different contacts in the target voice message are associated with the corresponding contact identifications, the user can adjust the voice content of the obtained target voice message by simply inputting the contact identifications, and convenience in adjusting the target voice message is improved.
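The association between contact identifiers and the portions of the target voice message can be sketched as follows. The kind of adjustment is not fixed by this passage, so the sketch treats it as an arbitrary function applied to the target contact's samples; `Segment`, `adjust_segment` and `render` are hypothetical names introduced only for illustration.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Segment:
    """Hypothetical portion of the target voice message tied to one contact."""
    contact_id: str
    samples: List[float]


def adjust_segment(segments: List[Segment], target_id: str,
                   adjust: Callable[[List[float]], List[float]]) -> List[Segment]:
    """Return new segments with `adjust` applied only to the target contact."""
    return [
        Segment(s.contact_id, adjust(s.samples)) if s.contact_id == target_id else s
        for s in segments
    ]


def render(segments: List[Segment]) -> List[float]:
    """Rebuild the target voice message from its segments, in order."""
    out: List[float] = []
    for s in segments:
        out.extend(s.samples)
    return out
```

A touch input on a contact identifier would then map to a call such as `adjust_segment(segments, "a", lambda xs: [x * 0.5 for x in xs])`, which in this sketch halves the amplitude of contact a's portion and leaves the others untouched.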

It should be understood that in the embodiment of the present application, the input unit 1604 may include a graphics processing unit (GPU) 16041 and a microphone 16042, and the graphics processing unit 16041 processes image data of still pictures or videos obtained by an image capturing device (such as a camera) in a video capturing mode or an image capturing mode. The display unit 1606 may include a display panel 16061, and the display panel 16061 may be configured in the form of a liquid crystal display, an organic light emitting diode, or the like. The user input unit 1607 includes at least one of a touch panel 16071 and other input devices 16072. The touch panel 16071 is also called a touch screen. The touch panel 16071 may include two parts: a touch detection device and a touch controller. The other input devices 16072 may include, but are not limited to, a physical keyboard, function keys (e.g., volume control keys, switch keys, etc.), a trackball, a mouse, and a joystick, which are not described in detail herein.

The memory 1609 may be used to store software programs as well as various data. The memory 1609 may mainly include a first storage area storing programs or instructions and a second storage area storing data, wherein the first storage area may store an operating system, application programs or instructions required for at least one function (such as a sound playing function, an image playing function, etc.), and the like. Further, memory 1609 may include volatile memory or nonvolatile memory, or memory 1609 may include both volatile and nonvolatile memory.

The non-volatile memory may be a Read-Only Memory (ROM), a Programmable ROM (PROM), an Erasable PROM (EPROM), an Electrically Erasable PROM (EEPROM), or a flash memory. The volatile memory may be a Random Access Memory (RAM), a Static RAM (SRAM), a Dynamic RAM (DRAM), a Synchronous DRAM (SDRAM), a Double Data Rate SDRAM (DDR SDRAM), an Enhanced SDRAM (ESDRAM), a Synchronous Link DRAM (SLDRAM), or a Direct Rambus RAM (DRRAM). The memory 1609 in the embodiments of the present application includes, but is not limited to, these and any other suitable types of memory.

The processor 1610 may include one or more processing units; optionally, the processor 1610 integrates an application processor, which primarily handles operations involving the operating system, user interface, and applications, and a modem processor, such as a baseband processor, which primarily handles wireless communication signals. It is to be appreciated that the modem processor described above may alternatively not be integrated into the processor 1610.

An embodiment of the present application further provides a readable storage medium, where a program or an instruction is stored on the readable storage medium, and when the program or the instruction is executed by a processor, the program or the instruction implements each process of the foregoing embodiment of the method for processing a voice message in the first aspect, and can achieve the same technical effect, and in order to avoid repetition, details are not repeated here.

The processor is the processor in the electronic device in the above embodiment. The readable storage medium includes a computer-readable storage medium, such as a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disk.

An embodiment of the present application further provides a chip, where the chip includes a processor and a communication interface, the communication interface is coupled to the processor, and the processor is configured to execute a program or an instruction to implement each process of the foregoing first aspect voice message processing method embodiment, and can achieve the same technical effect, and in order to avoid repetition, the details are not repeated here.

It should be understood that the chip mentioned in the embodiments of the present application may also be referred to as a system-level chip, a system chip, a chip system, or a system-on-chip.

Embodiments of the present application provide a computer program product, where the program product is stored in a storage medium, and the program product is executed by at least one processor to implement the processes of the foregoing first aspect voice message processing method embodiment, and can achieve the same technical effects, and in order to avoid repetition, details are not repeated here.

It should be noted that, in this document, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of another like element in a process, method, article, or apparatus that comprises the element.

Further, it should be noted that the scope of the methods and apparatus of the embodiments of the present application is not limited to performing the functions in the order illustrated or discussed, but may include performing the functions in a substantially simultaneous manner or in a reverse order based on the functions involved, e.g., the methods described may be performed in an order different than that described, and various steps may be added, omitted, or combined. In addition, features described with reference to certain examples may be combined in other examples.

Through the above description of the embodiments, those skilled in the art will clearly understand that the method of the above embodiments can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware, but in many cases, the former is a better implementation manner. Based on such understanding, the technical solutions of the present application may be embodied in the form of a computer software product, which is stored in a storage medium (such as ROM/RAM, magnetic disk, optical disk) and includes instructions for enabling a terminal (such as a mobile phone, a computer, a server, or a network device) to execute the method of the embodiments of the present application.

While the present embodiments have been described with reference to the accompanying drawings, it is to be understood that the invention is not limited to the precise embodiments described above, which are meant to be illustrative and not restrictive, and that various changes may be made therein by those skilled in the art without departing from the spirit and scope of the invention as defined by the appended claims.

Claims

1. A method for processing a voice message, the method comprising:

initiating voice recording requests to at least two contacts of a voice recording initiator;

under the condition that the target voice message is obtained, the target voice message is shared to the at least two contacts;

wherein the target voice message is determined based on voice messages recorded by the at least two contacts in response to the voice recording request.

2. The method for processing the voice message according to claim 1, wherein before the initiating of voice recording requests to at least two contacts of the voice recording initiator, the method further comprises:

determining a voice recording sequence of the at least two contacts;

determining a first-order contact according to the voice recording sequence of the at least two contacts;

the initiating of voice recording requests to at least two contacts of the voice recording initiator comprises:

initiating a voice recording request to the first-order contact;

when the first-order contact has finished recording, transmitting the voice message recorded by the first-order contact and the voice recording request to a next-order contact, so that the next-order contact responds to the voice recording request and records a voice message according to the voice message recorded by the first-order contact;

and acquiring the target voice message when every contact in the voice recording sequence has finished recording.

3. The method of claim 2, further comprising:

displaying the voice recording progress while the at least two contacts are recording voice messages.

4. The voice message processing method of claim 1, wherein the generating of the target voice message comprises:

receiving the voice messages recorded by the at least two contacts respectively;

and generating the target voice message according to the voice messages recorded by the at least two contacts respectively.

5. The method of claim 4, wherein the step of generating the target voice message according to the respective recorded voice messages of the at least two contacts is preceded by the step of:

determining a speech synthesis order of the at least two contacts;

generating the target voice message according to the voice messages recorded by the at least two contacts respectively comprises the following steps:

performing, according to the speech synthesis order of the at least two contacts, voice synthesis processing on the voice messages recorded by the at least two contacts respectively to obtain the target voice message.

6. The method of claim 5, wherein the determining the speech synthesis order of the at least two contacts comprises:

for each contact, determining a speech synthesis order of the contact on at least one of at least two audio tracks;

the performing of voice synthesis processing on the voice messages recorded by the at least two contacts respectively according to the speech synthesis order of the at least two contacts to obtain the target voice message comprises:

performing, according to the speech synthesis order of the at least two contacts on each audio track, voice synthesis processing on the voice messages recorded by the at least two contacts respectively to obtain the target voice message.

7. A voice message processing method according to any of claims 1 to 6, characterized in that the method further comprises:

displaying contact identifiers of the at least two contacts when the target voice message is obtained;

and in response to an input on a target contact identifier among the at least two contact identifiers, adjusting the voice message recorded by the target contact corresponding to the target contact identifier.

8. A voice message processing apparatus, characterized in that the apparatus comprises:

the sending unit is used for sending voice recording requests to at least two contacts of a voice recording initiator;

the sending unit is further configured to share the target voice message to the at least two contacts when the target voice message is acquired;

9. An electronic device comprising a processor and a memory, the memory storing a program or instructions executable on the processor, the program or instructions when executed by the processor implementing the steps of the voice message processing method of any one of claims 1 to 7.

10. A readable storage medium, characterized in that the readable storage medium has stored thereon a program or instructions which, when executed by a processor, implement the steps of the voice message processing method according to any one of claims 1 to 7.