CN111399796B - Voice message aggregation method and device, electronic equipment and storage medium - Google Patents

Voice message aggregation method and device, electronic equipment and storage medium Download PDF

Info

Publication number
CN111399796B
CN111399796B (application CN202010153199.7A)
Authority
CN
China
Prior art keywords
target voice
voice messages
voice message
message
aggregated
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010153199.7A
Other languages
Chinese (zh)
Other versions
CN111399796A (en)
Inventor
刘硕
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Dajia Internet Information Technology Co Ltd
Original Assignee
Beijing Dajia Internet Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Dajia Internet Information Technology Co Ltd filed Critical Beijing Dajia Internet Information Technology Co Ltd
Priority to CN202010153199.7A priority Critical patent/CN111399796B/en
Publication of CN111399796A publication Critical patent/CN111399796A/en
Application granted granted Critical
Publication of CN111399796B publication Critical patent/CN111399796B/en

Classifications

    • G06F 3/16 — Sound input; sound output
        • G06F 3/167 — Audio in a user interface, e.g. using voice commands for navigating, audio feedback
    • G06F 16/60 — Information retrieval; database and file system structures therefor, of audio data
        • G06F 16/61 — Indexing; data structures therefor; storage structures
        • G06F 16/65 — Clustering; classification
    • G06F 3/048 — Interaction techniques based on graphical user interfaces [GUI]
        • G06F 3/0484 — GUI techniques for the control of specific functions or operations, e.g. selecting or manipulating an object, an image or a displayed text element, setting a parameter value or selecting a range

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Human Computer Interaction (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Health & Medical Sciences (AREA)
  • Software Systems (AREA)
  • User Interface Of Digital Computer (AREA)

Abstract

The present disclosure relates to a voice message aggregation method and apparatus, an electronic device, and a storage medium, and belongs to the field of computer technology. The method includes: acquiring a plurality of selected target voice messages from a displayed interactive interface; aggregating the plurality of target voice messages when an aggregation operation on them is detected; and saving the aggregated voice message obtained by the aggregation. This expands the ways in which multiple voice messages can be handled: after a user selects a plurality of target voice messages, the selected messages can be aggregated into a single aggregated voice message and stored, which enriches the available functions and diversifies how voice messages are processed.

Description

Voice message aggregation method and device, electronic equipment and storage medium
Technical Field
The present disclosure relates to the field of computer technologies, and in particular, to a method and an apparatus for aggregating voice messages, an electronic device, and a storage medium.
Background
With the rapid development of computer technology and the wide adoption of instant messaging, users' demand for interaction has grown steadily. Sending a voice message is simple and convenient, and a voice message can also convey the sender's emotion, so voice messages have become a common form of interaction.
In the related art, multiple received voice messages are displayed in an interactive interface; a user can listen to any voice message, save individual messages, and later replay each saved message. However, each voice message in the interactive interface can only be saved independently, so the interface is limited to a single function.
Disclosure of Invention
The present disclosure provides a voice message aggregation method, apparatus, electronic device, and storage medium, which can overcome the single-function limitation of the related art.
According to a first aspect of the embodiments of the present disclosure, there is provided a voice message aggregation method, the method including:
acquiring a plurality of selected target voice messages from the displayed interactive interface;
when the aggregation operation of the target voice messages is detected, aggregating the target voice messages;
and saving the aggregated voice message obtained by aggregation.
In one possible implementation manner, the obtaining, from the displayed interactive interface, the selected multiple target voice messages includes:
when a first trigger operation on a first target voice message among the plurality of target voice messages is detected, displaying a favorite option in the interactive interface;
and when a selection operation on the favorite option is detected, acquiring the plurality of selected target voice messages.
In another possible implementation manner, the saving of the aggregated voice message obtained by aggregation includes:
and storing the aggregated voice message in a favorite list corresponding to the logged-in user identifier.
In another possible implementation, the method further includes:
displaying the favorite list, wherein the favorite list comprises the aggregated voice message;
and when the playing operation of the aggregated voice message is detected, playing the aggregated voice message.
In another possible implementation, the method further includes:
displaying a playing time axis of the aggregated voice message, wherein the playing time axis comprises a start time point of each target voice message in the plurality of target voice messages;
and when a trigger operation on any start time point is detected, playing the aggregated voice message from that start time point.
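The start time points on the playback timeline follow directly from the segment durations. The sketch below is illustrative only (names such as `start_time_points` are not from the disclosure); it assumes durations in seconds are available for each target voice message:

```python
# Hypothetical sketch: derive the start time point of each target voice
# message inside the aggregated message from the segment durations.
def start_time_points(durations):
    """Offset (seconds) at which each segment begins in the aggregate."""
    points, offset = [], 0.0
    for d in durations:
        points.append(offset)
        offset += d
    return points

# Three messages of 3 s, 5 s and 2 s: tapping the second marker on the
# timeline would seek playback to 3.0 s into the aggregated message.
print(start_time_points([3.0, 5.0, 2.0]))  # → [0.0, 3.0, 8.0]
```

A tap on any marker then simply seeks the player to the corresponding offset.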
In another possible implementation, the method further includes:
displaying a playing time axis of the aggregated voice message, wherein the playing time axis comprises a playing progress bar;
when the dragging operation of the playing progress bar is detected, acquiring the termination position of the dragging operation;
and playing the aggregated voice message from the time point corresponding to the termination position.
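Mapping the drag's termination position to a playback time point is a linear interpolation along the progress bar. The following sketch is an assumption about how a client might do this (the function and parameter names are illustrative):

```python
# Illustrative sketch: convert the termination position of a drag on the
# playing progress bar into a playback offset, clamped to the bar.
def drag_to_time(end_x, bar_x, bar_width, total_duration):
    """Map the drag's end x-coordinate to seconds into the aggregate."""
    ratio = (end_x - bar_x) / bar_width
    ratio = min(max(ratio, 0.0), 1.0)  # keep within the bar's bounds
    return ratio * total_duration

# A drag ending halfway along a 200 px bar of a 60 s aggregate → 30 s.
print(drag_to_time(150, 50, 200, 60.0))  # → 30.0
```

Clamping ensures that a drag released beyond either end of the bar resolves to the start or the end of the aggregated voice message.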
In another possible implementation manner, the obtaining, from the displayed interactive interface, the selected multiple target voice messages includes:
when a first trigger operation on a first target voice message in the plurality of target voice messages is detected, displaying a selection option of each voice message in the interactive interface;
and acquiring a plurality of target voice messages corresponding to the selected selection options.
In another possible implementation manner, when the first trigger operation on the first target voice message in the plurality of target voice messages is detected, displaying a selection option of each voice message in the interactive interface, where the displaying includes:
displaying a multi-selection option in the interactive interface when a first trigger operation on the first target voice message is detected;
and when the selection operation of the multi-selection option is detected, displaying the selection option of each voice message in the interactive interface.
In another possible implementation manner, the interactive interface includes an aggregation option, and when an aggregation operation on the target voice messages is detected, aggregating the target voice messages includes:
when the selection operation of the aggregation option is detected, the target voice messages are aggregated.
In another possible implementation manner, the aggregating the plurality of target voice messages includes:
and aggregating the plurality of target voice messages according to the sequence of the plurality of target voice messages.
In another possible implementation manner, the aggregating the plurality of target voice messages according to the order of the plurality of target voice messages includes:
aggregating the plurality of target voice messages in order of their message identifiers, from first to last; or,
aggregating the plurality of target voice messages in order of the times at which they were selected, from earliest to latest; or,
aggregating the plurality of target voice messages in order of their sending times, from earliest to latest.
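The three ordering rules above can be sketched as sorting by different keys before concatenation. This is a minimal illustration, not the disclosed implementation; the `VoiceMessage` structure and its field names are assumptions:

```python
from dataclasses import dataclass

# Hypothetical message structure; field names are illustrative.
@dataclass
class VoiceMessage:
    message_id: int       # identifier assigned when the message was created
    selected_time: float  # when the user selected it for aggregation
    send_time: float      # when the message was originally sent
    audio: bytes          # raw audio payload

_SORT_KEYS = {
    "message_id": lambda m: m.message_id,
    "selected_time": lambda m: m.selected_time,
    "send_time": lambda m: m.send_time,
}

def aggregate(messages, order_by="send_time"):
    """Concatenate the audio of the messages in the chosen order."""
    ordered = sorted(messages, key=_SORT_KEYS[order_by])
    return b"".join(m.audio for m in ordered)
```

For example, two messages sent at times 10 and 20 but selected in the reverse order would aggregate as "sent order" under `order_by="send_time"` and as "selection order" under `order_by="selected_time"`.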
In another possible implementation, the method further comprises at least one of:
aggregating the durations of the plurality of target voice messages to obtain the duration of the aggregated voice message;
and aggregating the capacities of the plurality of target voice messages to obtain the capacity of the aggregated voice message.
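Aggregating the durations and capacities amounts to summing them over the target voice messages. A trivial sketch, with illustrative names and units:

```python
# Illustrative sketch: the duration and capacity (size) of the aggregated
# voice message are the sums over the aggregated segments.
def aggregate_metadata(durations_s, sizes_bytes):
    """Return (total duration in seconds, total size in bytes)."""
    return sum(durations_s), sum(sizes_bytes)

print(aggregate_metadata([3.0, 5.0, 2.0], [4800, 8000, 3200]))
# → (10.0, 16000)
```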
In another possible implementation manner, the aggregating the plurality of target voice messages includes:
adding a preset prompt message between any two adjacent target voice messages in the plurality of target voice messages;
and aggregating the target voice messages added with the preset prompt message.
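Inserting a preset prompt between adjacent messages can be sketched as splicing a separator into the concatenation. The example below is an assumption about the mechanism, with `prompt` standing in for a short prompt tone or announcement clip:

```python
# Illustrative sketch: splice a preset prompt message between any two
# adjacent segments, then concatenate the result.
def aggregate_with_prompt(segments, prompt):
    """Concatenate segments, inserting `prompt` between adjacent ones."""
    out = []
    for i, seg in enumerate(segments):
        if i > 0:
            out.append(prompt)  # only between adjacent messages, not at ends
        out.append(seg)
    return b"".join(out)

print(aggregate_with_prompt([b"one", b"two", b"three"], b"|ding|"))
# → b"one|ding|two|ding|three"
```

The prompt lets a listener hear where one original voice message ends and the next begins inside the aggregated message.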
According to a second aspect of the embodiments of the present disclosure, there is provided a voice message aggregation apparatus, the apparatus including:
an acquisition unit configured to acquire the plurality of selected target voice messages from the displayed interactive interface;
an aggregation unit configured to aggregate the plurality of target voice messages when an aggregation operation on the plurality of target voice messages is detected;
and the storage unit is configured to store the aggregated voice message.
In one possible implementation manner, the obtaining unit includes:
the display subunit is configured to display a favorite option in the interactive interface when a first trigger operation on a first target voice message in the plurality of target voice messages is detected;
an acquisition subunit configured to acquire the selected plurality of target voice messages when a selection operation of the favorite option is detected.
In another possible implementation manner, the saving unit is configured to save the aggregated voice message in a favorite list corresponding to the logged-in user identifier.
In another possible implementation manner, the apparatus further includes:
a display unit configured to display the favorite list including the aggregated voice message;
a playing unit configured to play the aggregated voice message when a playing operation of the aggregated voice message is detected.
In another possible implementation manner, the apparatus further includes:
the display unit is configured to display a play time axis of the aggregated voice message, wherein the play time axis includes a start time point of each target voice message in the plurality of target voice messages;
the playing unit is configured to play the aggregated voice message from a start time point when a trigger operation on that start time point is detected.
In another possible implementation manner, the apparatus further includes:
the display unit is configured to display a playing time axis of the aggregated voice message, and the playing time axis comprises a playing progress bar;
the acquisition unit is configured to acquire an end position of a drag operation when the drag operation on the play progress bar is detected;
the playing unit is configured to play the aggregated voice message from a time point corresponding to the termination position.
In another possible implementation manner, the obtaining unit includes:
the display subunit is configured to display a selection option of each voice message in the interactive interface when a first trigger operation on a first target voice message in the plurality of target voice messages is detected;
and the acquisition subunit is configured to acquire a plurality of target voice messages corresponding to the selected selection options.
In another possible implementation manner, the display subunit is configured to display a multiple-choice option in the interactive interface when a first trigger operation on the first target voice message is detected;
the display subunit is configured to display the selection option of each voice message in the interactive interface when the selection operation of the multi-selection option is detected.
In another possible implementation manner, the interactive interface includes an aggregation option, and the aggregating unit is configured to aggregate the plurality of target voice messages when a selection operation of the aggregation option is detected.
In another possible implementation manner, the aggregating unit is configured to aggregate the plurality of target voice messages according to an order of the plurality of target voice messages.
In another possible implementation manner, the aggregation unit is configured to aggregate the plurality of target voice messages in order of their message identifiers, from first to last; or,
the aggregation unit is configured to aggregate the plurality of target voice messages in order of the times at which they were selected, from earliest to latest; or,
the aggregation unit is configured to aggregate the plurality of target voice messages in order of their sending times, from earliest to latest.
In another possible implementation, the aggregation unit is configured to perform at least one of:
aggregating the durations of the plurality of target voice messages to obtain the duration of the aggregated voice message;
and aggregating the capacities of the plurality of target voice messages to obtain the capacity of the aggregated voice message.
In another possible implementation manner, the aggregation unit includes:
an adding subunit configured to add a preset prompt message between any two adjacent target voice messages in the plurality of target voice messages;
and an aggregation subunit configured to aggregate the plurality of target voice messages to which the preset prompt messages have been added.
According to a third aspect of the embodiments of the present disclosure, there is provided an electronic apparatus including:
one or more processors;
a memory, which may be volatile or non-volatile, for storing instructions executable by the one or more processors;
wherein the one or more processors are configured to perform the voice message aggregation method as described in the first aspect.
According to a fourth aspect provided by embodiments of the present disclosure, there is provided a non-transitory computer-readable storage medium, wherein instructions of the storage medium, when executed by a processor of an electronic device, enable the electronic device to perform the voice message aggregation method according to the first aspect.
According to a fifth aspect of embodiments of the present disclosure, there is provided a computer program product, wherein the instructions of the computer program product, when executed by a processor of an electronic device, enable the electronic device to perform the voice message aggregation method according to the first aspect.
The technical scheme provided by the embodiment of the disclosure can have the following beneficial effects:
the method, the device, the electronic equipment and the storage medium provided by the embodiment of the disclosure expand a mode of aggregating a plurality of voice messages, acquire a plurality of selected target voice messages from a displayed interactive interface, aggregate the plurality of target voice messages when the aggregation operation of the plurality of target voice messages is detected, store an aggregated voice message obtained by aggregation, aggregate the plurality of selected voice messages into an aggregated voice message after a user selects the plurality of target voice messages, store the aggregated voice message, enrich functions and expand a processing mode of the voice message.
In addition, according to the method provided by the embodiments of the present application, when a first trigger operation on a first target voice message among a plurality of target voice messages is detected, a favorite option is displayed in the interactive interface; when a trigger operation on the favorite option is detected, the plurality of selected target voice messages is acquired; when an aggregation operation on the plurality of target voice messages is detected, they are aggregated, and the resulting aggregated voice message is saved in the favorite list corresponding to the logged-in user identifier. This introduces a new way of saving voice messages to favorites: a plurality of target voice messages can be aggregated into a single aggregated voice message and stored in the favorite list, which improves the efficiency of saving multiple target voice messages and diversifies the available functions.
In addition, the method provided by the embodiments of the present application displays a playing time axis of the aggregated voice message; when a trigger operation on the start time point of any target voice message in the playing time axis is detected, playback jumps to that start time point, so that a user can quickly locate the start of any target voice message. This improves positioning accuracy and the efficiency of playing voice messages.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the disclosure.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present disclosure and together with the description, serve to explain the principles of the disclosure.
Fig. 1 is a flow chart illustrating a method of voice message aggregation in accordance with an example embodiment.
Fig. 2 is a flow chart illustrating a method of voice message aggregation in accordance with an example embodiment.
FIG. 3 is a schematic diagram illustrating the structure of an interactive interface in accordance with an exemplary embodiment.
FIG. 4 is a schematic diagram illustrating the structure of another interactive interface in accordance with an exemplary embodiment.
FIG. 5 is a schematic diagram illustrating the structure of another interactive interface in accordance with an exemplary embodiment.
FIG. 6 is a schematic diagram illustrating the structure of another interactive interface in accordance with an exemplary embodiment.
Fig. 7 is a flow chart illustrating a method of voice message aggregation in accordance with an example embodiment.
Fig. 8 is a schematic structural diagram illustrating a voice message aggregation apparatus according to an exemplary embodiment.
Fig. 9 is a schematic structural diagram illustrating another voice message aggregation apparatus according to an exemplary embodiment.
Fig. 10 is a block diagram illustrating a terminal according to an example embodiment.
Fig. 11 is a schematic diagram illustrating a configuration of a server according to an example embodiment.
Detailed Description
Reference will now be made in detail to the exemplary embodiments, examples of which are illustrated in the accompanying drawings. When the following description refers to the accompanying drawings, like numbers in different drawings represent the same or similar elements unless otherwise indicated. The implementations described in the exemplary embodiments below are not intended to represent all implementations consistent with the present disclosure. Rather, they are merely examples of apparatus and methods consistent with certain aspects of the present disclosure, as detailed in the appended claims.
The embodiment of the disclosure provides a voice message aggregation method, which can aggregate a plurality of target voice messages into one aggregated voice message, and store the obtained aggregated voice message, and can be applied to various scenes:
for example, the method provided by the embodiment of the present disclosure is applied to an instant messaging scenario, and when a user interacts with other users through voice messages, by using the method provided by the embodiment of the present disclosure, a plurality of target voice messages can be aggregated into one aggregated voice message, and the aggregated voice message is stored.
Or, the method provided by the embodiment of the present disclosure is applied to a voice recording scene, and after a user records a plurality of voice messages, the method provided by the embodiment of the present disclosure is adopted, so that a plurality of target voice messages can be aggregated into one aggregated voice message, and the aggregated voice message is stored.
The voice message aggregation method provided by the embodiment of the disclosure is applied to electronic equipment, and the electronic equipment may include a terminal and may further include a server.
When the electronic equipment comprises a terminal, the terminal is used for acquiring a plurality of selected target voice messages from a displayed interactive interface, and when the aggregation operation of the plurality of target voice messages is detected, the plurality of target voice messages are aggregated, and the aggregated voice message obtained by aggregation is stored.
Or when the electronic equipment comprises a terminal and a server, the terminal is used for acquiring a plurality of selected target voice messages from a displayed interactive interface, when the aggregation operation of the plurality of target voice messages is detected, the plurality of target voice messages are sent to the server, the server aggregates the plurality of target voice messages, and an aggregated voice message obtained by aggregation is stored.
The terminal may be a mobile phone, a tablet computer, a personal computer, or another terminal; the server may be a single server, a server cluster composed of a plurality of servers, or a cloud computing service center.
Fig. 1 is a flowchart illustrating a voice message aggregation method according to an exemplary embodiment, applied to an electronic device, and referring to fig. 1, the method includes:
in step 101, the selected target voice messages are obtained from the displayed interactive interface.
In step 102, when an aggregation operation on a plurality of target voice messages is detected, the plurality of target voice messages are aggregated.
In step 103, an aggregated voice message obtained by aggregation is saved.
The method provided by the embodiments of the present disclosure expands the ways in which multiple voice messages can be aggregated. A plurality of selected target voice messages is acquired from a displayed interactive interface; when an aggregation operation on the plurality of target voice messages is detected, they are aggregated, and the resulting aggregated voice message is saved. After a user selects a plurality of target voice messages, the selected messages can be aggregated into a single aggregated voice message and stored, which enriches the available functions and expands how voice messages are processed.
In one possible implementation manner, the obtaining of the selected multiple target voice messages from the displayed interactive interface includes:
when a first trigger operation on a first target voice message among the plurality of target voice messages is detected, displaying a favorite option in the interactive interface;
and when a selection operation on the favorite option is detected, acquiring the plurality of selected target voice messages.
In another possible implementation manner, saving an aggregated voice message obtained by aggregation includes:
and storing the aggregated voice message in a favorite list corresponding to the logged-in user identifier.
In another possible implementation, the method further includes:
displaying a favorite list, wherein the favorite list comprises an aggregation voice message;
and when the playing operation of the aggregated voice message is detected, playing the aggregated voice message.
In another possible implementation, the method further includes:
displaying a playing time axis of the aggregated voice message, wherein the playing time axis comprises a start time point of each target voice message in the plurality of target voice messages;
and when a trigger operation on any start time point is detected, playing the aggregated voice message from that start time point.
In another possible implementation, the method further includes:
displaying a playing time axis of the aggregated voice message, wherein the playing time axis comprises a playing progress bar;
when the dragging operation of the playing progress bar is detected, acquiring the termination position of the dragging operation;
and starting from the time point corresponding to the termination position, playing the aggregated voice message.
In another possible implementation manner, the obtaining of the selected multiple target voice messages from the displayed interactive interface includes:
when a first trigger operation on a first target voice message in a plurality of target voice messages is detected, displaying a selection option of each voice message in an interactive interface;
and acquiring a plurality of target voice messages corresponding to the selected selection options.
In another possible implementation manner, when a first trigger operation on a first target voice message among the plurality of target voice messages is detected, displaying a selection option of each voice message in the interactive interface includes:
when a first trigger operation on a first target voice message is detected, displaying a multi-selection option in an interactive interface;
and when the selection operation of the multi-selection option is detected, displaying the selection option of each voice message in the interactive interface.
In another possible implementation manner, the interactive interface includes an aggregation option, and when an aggregation operation on a plurality of target voice messages is detected, aggregating the plurality of target voice messages includes:
and when the selection operation of the aggregation option is detected, aggregating the target voice messages.
In another possible implementation, aggregating a plurality of targeted voice messages includes:
and aggregating the plurality of target voice messages according to the sequence of the plurality of target voice messages.
In another possible implementation manner, aggregating a plurality of target voice messages according to an order of the plurality of target voice messages includes:
aggregating the plurality of target voice messages in order of their message identifiers, from first to last; or,
aggregating the plurality of target voice messages in order of the times at which they were selected, from earliest to latest; or,
aggregating the plurality of target voice messages in order of their sending times, from earliest to latest.
In another possible implementation, the method further comprises at least one of:
aggregating the time lengths of the plurality of target voice messages to obtain the time length of the aggregated voice message;
and aggregating the capacities of the plurality of target voice messages to obtain the capacity of the aggregated voice message.
In another possible implementation, aggregating a plurality of targeted voice messages includes:
adding a preset prompt message between any two adjacent target voice messages in the plurality of target voice messages;
and aggregating the target voice messages added with the preset prompt message.
Fig. 2 is a flowchart illustrating a voice message aggregation method according to an exemplary embodiment, applied to an electronic device, and referring to fig. 2, the method includes:
in step 201, when a first trigger operation on a first target voice message in a plurality of target voice messages is detected, a favorite option is displayed in an interactive interface.
In the embodiment of the present disclosure, the electronic device may log in based on a user identifier, where the user identifier is used to determine a unique user, and may be a nickname, an account, a mobile phone number, or another identifier that can determine the unique user. Each user identifier may establish an association relationship with other user identifiers, and of any two user identifiers establishing an association relationship, one of the user identifiers may be referred to as an associated user identifier of the other user identifier.
In addition, each user identifier may also establish an association relationship with a group identifier, indicating that the user identifier is added to the group corresponding to the group identifier, and the group identifier may also be referred to as an associated user identifier of the user identifier. The group identifier is used to determine a unique group, and may be a group name, a group account, or another identifier that can determine a unique group.
The electronic device can display an interactive interface corresponding to any associated user identifier, send a message to the associated user identifier through the interactive interface or display messages sent by the associated user identifier, and thereby interact with the associated user identifier through the sending and receiving of messages.
For example, the electronic device installs an instant messaging application, logs in the instant messaging application based on the user identifier, establishes an association relationship with other user identifiers through the instant messaging application, and sends a message to the associated user identifier or receives a message sent by the associated user identifier, thereby realizing interaction with the associated user identifier.
The message may be any one of a text message, a voice message, a picture message and a video message, and may also be other forms of messages.
In a possible implementation manner, after the electronic device logs in based on the user identifier, an information display interface is displayed, where the information display interface includes a plurality of interaction entries, and by triggering an interaction entry, the electronic device may display an interaction interface of an associated user identifier corresponding to the interaction entry, where the interaction interface includes a voice message sent by the associated user identifier.
The trigger operation may be a click operation, a long-time press operation, a leftward sliding operation, or a rightward sliding operation, or may also be other forms of trigger operations, which is not specifically limited in this disclosure.
The user may perform a play operation, a delete operation, a save operation, a collect operation, or other operations based on any of the voice messages currently displayed.
In the embodiment of the disclosure, a plurality of target voice messages are displayed in an interactive interface, and when a first trigger operation on a first target voice message in the plurality of target voice messages is detected, a favorite option is displayed in the interactive interface.
The first target voice message is any one of a plurality of target voice messages. In addition, the first trigger operation may be a long press operation, a double click operation, or other operations.
For example, as shown in fig. 3, a voice message 1, a voice message 2, a voice message 3, and a voice message 4 are displayed in the interactive interface, and when a long-press operation on the voice message 1 is detected, a favorite option is displayed in the interactive interface.
In another embodiment, in the interactive interface, if a second trigger operation on a first target voice message of the plurality of target voice messages is detected, the first target voice message is played. Wherein, the second trigger operation is a single-click operation or other operations.
In step 202, when a selection operation of the favorite option is detected, a plurality of selected target voice messages are acquired.
The interactive interface displays a plurality of voice messages. When a user wants to collect a plurality of target voice messages, the user triggers a selection operation on the favorite option, and when the selection operation on the favorite option is detected, the selected plurality of target voice messages are acquired.
And the selected target voice messages are the target voice messages selected by the user in the interactive interface.
Optionally, after the user triggers the selection operation on the favorite option, at least one voice message is displayed in the interactive interface. The user can trigger a sliding operation in the interactive interface so that the interactive interface displays other voice messages, and by triggering selection operations on a plurality of target voice messages, the terminal detects the selection operations and acquires the selected plurality of target voice messages.
In a possible implementation manner, when a selection operation on the favorite option is detected, a selection option of each voice message is displayed in the interactive interface, and the target voice messages corresponding to the plurality of selected selection options are acquired.
For example, as shown in fig. 4, a selection option is displayed on the left side of each target voice message, and when the first selection option is selected, a selection mark indicating that voice message 1 is the selected target voice message is displayed in the first selection option.
It should be noted that the embodiment of the present application is described only by taking as an example the case of displaying a favorite option when a first trigger operation on the first target voice message is detected. In another embodiment, when a first trigger operation on a first target voice message in a plurality of target voice messages is detected, a selection option of each voice message is displayed in the interactive interface, and the plurality of target voice messages corresponding to the plurality of selected selection options are acquired.
Optionally, when a first trigger operation on the first target voice message is detected, a multiple selection option is displayed in the interactive interface, and when a selection operation on the multiple selection option is detected, a selection option of each voice message is displayed in the interactive interface.
The first trigger operation may be a long press operation, a double-click operation, or other operations.
For example, as shown in fig. 5, a voice message 1, a voice message 2, a voice message 3, and a voice message 4 are displayed in the interactive interface, and when a long-press operation on the voice message 1 is detected, a multi-selection option is displayed in the interactive interface. And when the selection operation of the multiple selection options is detected, displaying the selection options as shown in fig. 4, and further acquiring a plurality of target voice messages corresponding to the selected multiple selection options.
In step 203, when the aggregation operation of the plurality of target voice messages is detected, the plurality of target voice messages are aggregated.
After the selected plurality of target voice messages are acquired, when an aggregation operation on the plurality of target voice messages is detected, the plurality of target voice messages are aggregated.
And the aggregation operation is used for indicating that the selected target voice messages are aggregated so as to aggregate the target voice messages into one aggregated voice message.
In one possible implementation manner, an aggregation option is displayed in the interactive interface, the aggregation option is used for indicating that a plurality of target voice messages are aggregated, and when a selection operation of the aggregation option is detected, the plurality of target voice messages are aggregated.
Wherein the aggregation option may be "merge", "aggregate", or other options.
For example, as shown in fig. 6, in the interactive interface, a plurality of selected target voice messages are displayed, and a "merge" option is displayed in the interactive interface, when a selection operation on the "merge" option is detected, it is determined that an aggregation operation on the plurality of target voice messages is detected, and the plurality of target voice messages are aggregated.
Optionally, when aggregating a plurality of target voice messages, the plurality of target voice messages are aggregated in the order of the plurality of target voice messages.
In one possible implementation, the plurality of target voice messages are aggregated in order of their message identifications going from first to last.
When each target voice message is received, a message identifier is added to it. The message identifiers are assigned in the order in which the target voice messages are received: the later a target voice message is received, the larger its message identifier. The message identifiers of the plurality of target voice messages can therefore represent the order of the target voice messages, so the plurality of target voice messages can be sorted by their message identifiers and aggregated in order of their message identifiers from first to last.
Alternatively, the message identifications of the plurality of target voice messages may be seqids (a kind of sequence identifications), or other types of identifications.
For example, the target voice messages include target voice messages with message identifications 10001, 10002, and 10003, and when an aggregation operation for the target voice messages is detected, the target voice messages are aggregated according to the order of the message identifications 10001, 10002, and 10003.
By aggregating the plurality of target voice messages in order of their message identifications from first to last, the method and the device can ensure that the order within the aggregated voice message is the same as the order of the original plurality of target voice messages, improving the accuracy of aggregating the plurality of target voice messages.
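The identifier-ordered aggregation described above can be sketched as follows. This is a hypothetical Python illustration, not the patented implementation: the dict keys `seq_id` and `audio`, and the simple byte-concatenation standing in for audio merging, are all assumptions.

```python
# Hypothetical sketch: sort the selected target voice messages by the
# seqId assigned when each message was received, then join the audio
# payloads so the aggregated message preserves the original order.
def aggregate_by_seq_id(messages):
    ordered = sorted(messages, key=lambda m: m["seq_id"])
    return b"".join(m["audio"] for m in ordered)

# Messages selected out of order still aggregate in receive order.
merged = aggregate_by_seq_id([
    {"seq_id": 10002, "audio": b"B"},
    {"seq_id": 10001, "audio": b"A"},
    {"seq_id": 10003, "audio": b"C"},
])
# merged == b"ABC"
```

The same pattern applies to the other orderings below: only the sort key changes (selected time or sending time instead of the message identifier).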
In another possible implementation, the plurality of target voice messages are aggregated in an order from early to late of the selected time of the plurality of target voice messages.
When the user selects the target voice messages, the plurality of target voice messages need to be selected in sequence. In the process of selecting the target voice messages, the selected time of each target voice message is recorded, so that the plurality of target voice messages can be aggregated in order of their selected time from early to late.
Optionally, because aggregation uses the order of the selected time of the plurality of target voice messages, before the plurality of target voice messages are selected, a prompt message is displayed in the interactive interface, instructing the user to select the messages in the intended aggregation order.
For example, before the plurality of target voice messages are selected, "please select voice messages in order from first to last" is displayed in the interactive interface, so that the user selects the voice messages in the order in which they should be aggregated.
By displaying this prompt information in the interactive interface, the method and the device can avoid the situation in which the subsequently aggregated voice message is out of order, that is, different in order from the original plurality of target voice messages, which would affect playback of the aggregated voice message.
In another possible implementation, the multiple target voice messages are aggregated in an order from early to late in their transmission time.
Each target voice message in the plurality of target voice messages has a sending time, and the sending times of the plurality of target voice messages represent the order of the contents carried by the voice messages. Aggregating the plurality of target voice messages in order of their sending time from early to late therefore ensures that the order within the aggregated voice message is the same as that of the original plurality of target voice messages.
It should be noted that each target voice message in the plurality of target voice messages has description information, and the description information includes duration, capacity, encoding mode, and the like of the target voice message.
In another possible implementation manner, in the process of aggregating a plurality of target voice messages, not only the target voice messages are aggregated, but also description information of the plurality of target voice messages is aggregated.
Optionally, the durations of the multiple target voice messages are aggregated to obtain the duration of the aggregated voice message, and the capacities of the multiple target voice messages are aggregated to obtain the capacity of the aggregated voice message. The duration and the capacity of the aggregated voice message can be used as the description information of the aggregated voice message.
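Aggregating the description information amounts to summing the per-message fields. A minimal sketch, assuming each message's description is a dict with illustrative keys `duration` (in seconds) and `size_in_bytes`:

```python
# Sum the durations and capacities of the target voice messages to
# obtain the description information of the aggregated voice message.
def aggregate_description(descriptions):
    return {
        "duration": sum(d["duration"] for d in descriptions),
        "size_in_bytes": sum(d["size_in_bytes"] for d in descriptions),
    }

combined = aggregate_description([
    {"duration": 3.0, "size_in_bytes": 12000},
    {"duration": 5.0, "size_in_bytes": 20000},
])
# combined == {"duration": 8.0, "size_in_bytes": 32000}
```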
In one possible implementation manner, a plurality of target voice messages are aggregated into an aggregated voice message set, and formats of the plurality of target voice messages in the aggregated voice message set are not changed. And when the aggregated voice message is played subsequently, sequentially playing the plurality of target voice messages in the aggregated voice message set.
For example, the format of each target voice message is as follows:

message AudioMeta {
  AudioFormat format = 1;     // format, e.g. mp3/aac
  string codec = 2;           // codec, e.g. H64
  double duration = 3;        // play duration, unit: s
  uint64 size_in_bytes = 4;   // file size
  string uri = 5;             // download address
}

The format of the aggregated voice message obtained by aggregating a plurality of target voice messages is as follows:

message AudioMetaComb {
  AudioFormat format = 1;     // format, e.g. mp3/aac
  string codec = 2;           // codec, e.g. H64
  double duration = 3;        // play duration, unit: s
  uint64 size_in_bytes = 4;   // file size
  string uri = 5;             // download address
  repeated AudioMeta meta_list = 6;  // original metadata set of the target voice messages
}
In another possible implementation manner, a preset prompt message is added between any two adjacent target voice messages in the plurality of target voice messages, and the plurality of target voice messages with the preset prompt message added are aggregated.
That is, when the plurality of target voice messages are aggregated, a preset prompt message is first added between any two adjacent target voice messages, and then the plurality of target voice messages with the preset prompt message added are aggregated. When the resulting aggregated voice message is played subsequently, each time the preset prompt message is played, the user knows that the next original target voice message is about to be played.
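The interleaving step can be sketched as follows. This is a hypothetical illustration: audio segments and the prompt are represented as byte strings, and simple concatenation stands in for audio merging.

```python
# Add the preset prompt message between any two adjacent target voice
# messages, then join the result for aggregation.
def interleave_prompt(segments, prompt):
    out = []
    for i, seg in enumerate(segments):
        if i > 0:          # no prompt before the first message
            out.append(prompt)
        out.append(seg)
    return b"".join(out)

result = interleave_prompt([b"A", b"B", b"C"], b"<ding>")
# result == b"A<ding>B<ding>C"
```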
The preset prompting message is selected by the user, or the preset prompting message is set by the terminal, or other modes are adopted for setting.
In a possible implementation manner, a setting interface is displayed, the setting interface includes a prompt information setting option, when a selection operation on the prompt information setting option is detected, a prompt information setting list is displayed, the prompt information setting list includes at least one prompt message, and when a selection operation on any prompt message is detected, a prompt message corresponding to the selection operation is determined as a preset prompt message.
The method provided by the embodiment of the application adds the preset prompt message between the plurality of target voice messages, ensuring that when the aggregated voice message is played subsequently, the user is prompted before each next target voice message, making it convenient for the user to know that the next target voice message is about to be played.
In step 204, an aggregated voice message obtained by aggregation is saved in the favorite list corresponding to the logged-in user identifier.
After the plurality of target voice messages are aggregated, an aggregated voice message is obtained, and the aggregated voice message is stored in the favorite list corresponding to the logged-in user identifier. The aggregated voice message can then be found in the favorite list and played.
In one possible implementation manner, the obtained aggregated voice message is sent to a server, and the server stores the aggregated voice message into a favorite list corresponding to the logged-in user identifier.
It should be noted that the embodiment of the present application is described only by taking as an example the case of storing the aggregated voice message in the favorite list corresponding to the logged-in user identifier. In another embodiment, step 204 is optional: the aggregated voice message obtained by aggregation may also be saved directly, without saving it in the favorite list corresponding to the logged-in user identifier.
In one possible implementation, the obtained aggregated voice message is saved in a saving folder corresponding to the user identifier. Or sending the aggregated voice message to a server, and storing the aggregated voice message into a storage folder corresponding to the user identifier by the server.
Optionally, in the interactive interface, the display of the plurality of target voice messages is cancelled, and the obtained aggregated voice message is directly displayed.
Optionally, in the interactive interface, the obtained aggregated voice message is displayed below the last target voice message in the plurality of target voice messages.
In step 205, a favorites list is displayed.
Wherein the favorite list comprises aggregated voice messages. In addition, other voice messages may be included in the favorites list, and the other voice messages may be aggregated voice messages, original voice messages, text messages, and the like.
When a user needs to play a certain aggregated voice message, the operation of displaying the favorite list is triggered, and then the aggregated voice message can be selected from the favorite list for playing.
Optionally, when each voice message is displayed in the favorite list, the type of the voice message is marked in each voice message to indicate whether the voice message belongs to the original voice message or the aggregated voice message.
For example, the favorite list includes voice message 1, voice message 2, and voice message 3, with "original voice message" displayed after voice message 1, and "aggregated voice message" displayed after voice message 2 and after voice message 3.
In one possible implementation, a main interface is displayed, the main interface includes a favorite list option, and when a selection operation on the favorite list option is detected, the favorite list is displayed.
In another possible implementation, a favorites list option is displayed in the interactive interface, and the favorites list is displayed when a selection operation of the favorites list option is detected.
In step 206, when a play operation for the aggregated voice message is detected, the aggregated voice message is played.
Wherein the aggregated voice message is any aggregated voice message in the favorite list. The play operation may be a single click operation, a double click operation, a long press operation, or the like.
Optionally, the aggregated voice message is stored locally in the terminal, and when a play operation on the aggregated voice message is detected, the aggregated voice message is directly obtained and played.
Optionally, the aggregated voice message is stored in the server, when a playing operation of the aggregated voice message is detected, an acquisition request of the aggregated voice message is sent to the server, and when the aggregated voice message sent by the server is received, the aggregated voice message is played.
In one possible implementation manner, when a playing operation on the aggregated voice message is detected, a voice playing interface is displayed, and the aggregated voice message is played in the voice playing interface.
In another possible implementation manner, when the aggregated voice message is played, a playing time axis of the aggregated voice message is displayed, the playing time axis includes a starting time point of each target voice message in the plurality of target voice messages, and when a trigger operation for any starting time point is detected, the aggregated voice message is played from any starting time point.
Wherein the playing time axis includes the total duration of the voice message. The trigger operation may be a single click operation, a sliding operation, or other operations.
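The start time point of each original message on the aggregated timeline follows directly from the per-message durations described earlier. A minimal sketch of the computation (the function name is illustrative):

```python
# The start time of target voice message i on the aggregated play
# time axis is the total duration of messages 0..i-1 (a prefix sum).
def start_time_points(durations):
    starts, t = [], 0.0
    for d in durations:
        starts.append(t)
        t += d
    return starts

points = start_time_points([3.0, 5.0, 2.0])
# points == [0.0, 3.0, 8.0]; the total duration of the time axis
# would be 10.0 s.
```

A trigger operation on, say, the third start time point would then seek the player to 8.0 s into the aggregated voice message.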
In another possible implementation manner, a playing time axis of the aggregated voice message is displayed, the playing time axis includes a playing progress bar, when a dragging operation on the playing progress bar is detected, a termination position of the dragging operation is acquired, and the aggregated voice message is played from a time point corresponding to the termination position.
The playing progress bar is used for indicating the progress of playing the aggregated voice message, and the playing progress bar can also be dragged, so that the aggregated voice message is played from the end position of the dragging.
It should be noted that, after steps 201-204 are executed, the embodiment of the present application has aggregated the plurality of target voice messages and stored the obtained aggregated voice message in the favorite list. Subsequently, steps 205 and 206 may be performed one or more times to play the aggregated voice message in the favorite list.
The method provided by the embodiment of the application expands the ways in which a plurality of voice messages can be aggregated. A plurality of selected target voice messages are acquired from a displayed interactive interface; when an aggregation operation on the plurality of target voice messages is detected, the plurality of target voice messages are aggregated; and the aggregated voice message obtained by aggregation is stored. After a user selects a plurality of target voice messages, the selected voice messages can thus be aggregated into one aggregated voice message and stored, which enriches the functions and expands the ways of processing voice messages.
In addition, according to the method provided by the embodiment of the application, when a first trigger operation on a first target voice message in a plurality of target voice messages is detected, a favorite option is displayed in the interactive interface; when the trigger operation on the favorite option is detected, the selected plurality of target voice messages are acquired; when the aggregation operation on the plurality of target voice messages is detected, the plurality of target voice messages are aggregated; and the aggregated voice message obtained by aggregation is stored in the favorite list corresponding to the logged-in user identifier. This provides a new way of collecting voice messages: a plurality of target voice messages can be aggregated into one aggregated voice message and stored in a favorite list, improving the efficiency of collecting a plurality of target voice messages and diversifying the functions.
In addition, the method provided by the embodiment of the application displays the play time axis of the aggregated voice message and, when a trigger operation on the start time point of any target voice message in the play time axis is detected, skips to playing the aggregated voice message from that start time point, so that the user can quickly locate the start time point of any target voice message, improving the accuracy of locating and the efficiency of playing the voice message.
Fig. 7 is a flow chart illustrating a voice message aggregation method according to an example embodiment. The method is applied to an electronic device and, referring to fig. 7, includes:
in step 701, an interactive interface is displayed.
The interactive interface comprises a plurality of target voice messages.
In step 702, when a first trigger operation on a first target voice message in a plurality of target voice messages is detected, a multi-selection option is displayed in an interactive interface.
In step 703, when a selection operation on the multiple selection options is detected, the selection options of each voice message are displayed in the interactive interface, and multiple target voice messages corresponding to the multiple selected selection options are obtained.
In step 704, when an aggregation operation for a plurality of target voice messages is detected, the plurality of target voice messages are aggregated.
In step 705, an aggregated voice message obtained by aggregation is saved.
In step 706, an interactive interface is displayed, including the aggregated voice message.
In step 707, when a play operation for the aggregated voice message is detected, the aggregated voice message is played.
Steps 701-707 in the present application are similar to those in the above embodiments, and are not described herein again.
Fig. 8 is a schematic structural diagram illustrating a voice message aggregation apparatus according to an exemplary embodiment. Referring to fig. 8, the apparatus includes:
an obtaining unit 801 configured to obtain a plurality of selected target voice messages from the displayed interactive interface;
an aggregation unit 802 configured to aggregate a plurality of target voice messages when an aggregation operation on the plurality of target voice messages is detected;
a saving unit 803 configured to save the aggregated one voice message.
The device provided by the embodiment of the disclosure expands the ways in which a plurality of voice messages can be aggregated. A plurality of selected target voice messages are acquired from a displayed interactive interface; when an aggregation operation on the plurality of target voice messages is detected, the plurality of target voice messages are aggregated; and the aggregated voice message obtained by aggregation is stored. After a user selects a plurality of target voice messages, the selected voice messages can thus be aggregated into one aggregated voice message and stored, which enriches the functions, expands the ways of processing voice messages, and diversifies the functions.
In one possible implementation, referring to fig. 9, the obtaining unit 801 includes:
the display subunit 8011 is configured to, when a first trigger operation on a first target voice message of the plurality of target voice messages is detected, display favorite options in the interactive interface;
a retrieving sub-unit 8012 configured to retrieve the selected plurality of target voice messages when a selecting operation of the favorite option is detected.
In another possible implementation manner, the saving unit 803 is configured to save the aggregated voice message in a favorite list corresponding to the logged-in user identifier.
In another possible implementation, referring to fig. 9, the apparatus further includes:
a display unit 804 configured to display a favorite list including an aggregated voice message;
a playing unit 805 configured to play the aggregated voice message when a playing operation for the aggregated voice message is detected.
In another possible implementation manner, the apparatus further includes:
a display unit 804 configured to display a play time axis of the aggregated voice message, the play time axis including a start time point of each of the plurality of target voice messages;
a playing unit 805 configured to play the aggregated voice message from any starting time point when a trigger operation for any starting time point is detected.
In another possible implementation manner, the apparatus further includes:
a display unit 804 configured to display a play time axis of the aggregated voice message, the play time axis including a play progress bar;
an acquisition unit 801 configured to acquire, when a drag operation on the play progress bar is detected, an end position of the drag operation;
and a playing unit 805 configured to play the aggregated voice message from a time point corresponding to the termination position.
In another possible implementation, referring to fig. 9, the obtaining unit 801 includes:
the display sub-unit 8011 configured to display a selection option of each voice message in the interactive interface when a first trigger operation on a first target voice message of the plurality of target voice messages is detected;
the acquiring sub-unit 8012 is configured to acquire a plurality of target voice messages corresponding to the selected selection options.
In another possible implementation, the display subunit 8011 is configured to display a multi-selection option in the interactive interface when a first trigger operation on a first target voice message is detected;
the display sub-unit 8011 is configured to display a selection option of each voice message in the interactive interface when a selection operation of a multi-selection option is detected.
In another possible implementation manner, the interactive interface includes an aggregation option, and the aggregation unit 802 is configured to aggregate the plurality of target voice messages when a selection operation on the aggregation option is detected.
In another possible implementation manner, the aggregating unit 802 is configured to aggregate the multiple target voice messages according to an order of the multiple target voice messages.
In another possible implementation manner, the aggregating unit 802 is configured to aggregate the plurality of target voice messages according to a sequence of message identifications of the plurality of target voice messages from first to last; or the like, or, alternatively,
an aggregation unit 802 configured to aggregate the plurality of target voice messages in an order from early to late of the selected time of the plurality of target voice messages; or the like, or, alternatively,
an aggregation unit 802 configured to aggregate the plurality of target voice messages in an order from early to late of transmission time of the plurality of target voice messages.
In another possible implementation, the aggregation unit 802 is configured to perform at least one of the following:
aggregating the time lengths of the plurality of target voice messages to obtain the time length of the aggregated voice message;
and aggregating the capacities of the plurality of target voice messages to obtain the capacity of the aggregated voice message.
In another possible implementation, referring to fig. 9, the aggregation unit 802 includes:
an adding subunit 8021, configured to add a preset prompting message between any two adjacent target voice messages in the plurality of target voice messages;
an aggregation subunit 8022 configured to aggregate the plurality of target voice messages to which the preset alert message has been added.
With regard to the apparatus in the above-described embodiment, the specific manner in which each module performs the operation has been described in detail in the embodiment related to the method, and will not be elaborated here.
Fig. 10 is a block diagram illustrating a terminal according to an example embodiment. The terminal 1000 can be a portable mobile terminal such as: a smart phone, a tablet computer, an MP3 player (Moving Picture Experts Group Audio Layer III), an MP4 player (Moving Picture Experts Group Audio Layer IV), a notebook computer, or a desktop computer. Terminal 1000 may also be referred to by other names such as user equipment, portable terminal, laptop terminal, desktop terminal, etc.
In general, terminal 1000 can include: one or more processors 1001 and one or more memories 1002.
The processor 1001 may include one or more processing cores, such as a 4-core processor or an 8-core processor. The processor 1001 may be implemented in at least one hardware form of a DSP (Digital Signal Processor), an FPGA (Field-Programmable Gate Array), or a PLA (Programmable Logic Array). The processor 1001 may also include a main processor and a coprocessor. The main processor, also referred to as a CPU (Central Processing Unit), is a processor for processing data in an awake state; the coprocessor is a low-power processor for processing data in a standby state. In some embodiments, the processor 1001 may be integrated with a GPU (Graphics Processing Unit), which is responsible for rendering and drawing the content to be displayed on the display screen. In some embodiments, the processor 1001 may further include an AI (Artificial Intelligence) processor for processing computing operations related to machine learning.
The memory 1002 may include one or more computer-readable storage media, which may be non-transitory. The memory 1002 may also include volatile or non-volatile memory, such as one or more magnetic disk storage devices or flash memory storage devices. In some embodiments, a non-transitory computer-readable storage medium in the memory 1002 is used to store at least one instruction, which is executed by the processor 1001 to implement the voice message aggregation method provided by the method embodiments herein.
In some embodiments, terminal 1000 can also optionally include: a peripheral interface 1003 and at least one peripheral. The processor 1001, memory 1002 and peripheral interface 1003 may be connected by a bus or signal line. Various peripheral devices may be connected to peripheral interface 1003 via a bus, signal line, or circuit board. Specifically, the peripheral device includes: at least one of radio frequency circuitry 1004, touch screen display 1005, camera 1006, audio circuitry 1007, positioning components 1008, and power supply 1009.
The peripheral interface 1003 may be used to connect at least one peripheral related to I/O (Input/Output) to the processor 1001 and the memory 1002. In some embodiments, processor 1001, memory 1002, and peripheral interface 1003 are integrated on the same chip or circuit board; in some other embodiments, any one or two of the processor 1001, the memory 1002, and the peripheral interface 1003 may be implemented on separate chips or circuit boards, which are not limited by this embodiment.
The radio frequency circuit 1004 is used for receiving and transmitting RF (Radio Frequency) signals, also called electromagnetic signals. The radio frequency circuit 1004 communicates with communication networks and other communication devices via electromagnetic signals, converting electrical signals into electromagnetic signals for transmission, or converting received electromagnetic signals into electrical signals. Optionally, the radio frequency circuit 1004 includes: an antenna system, an RF transceiver, one or more amplifiers, a tuner, an oscillator, a digital signal processor, a codec chipset, a subscriber identity module card, and so on. The radio frequency circuit 1004 may communicate with other terminals via at least one wireless communication protocol. The wireless communication protocols include, but are not limited to: metropolitan area networks, mobile communication networks of various generations (2G, 3G, 4G, and 5G), wireless local area networks, and/or WiFi (Wireless Fidelity) networks. In some embodiments, the radio frequency circuit 1004 may further include NFC (Near Field Communication) related circuits, which is not limited in this application.
The display screen 1005 is used to display a UI (User Interface). The UI may include graphics, text, icons, video, and any combination thereof. When the display screen 1005 is a touch display screen, it also has the ability to capture touch signals on or over its surface. A touch signal may be input to the processor 1001 as a control signal for processing. At this point, the display screen 1005 may also be used to provide virtual buttons and/or a virtual keyboard, also referred to as soft buttons and/or a soft keyboard. In some embodiments, there may be one display screen 1005, disposed on the front panel of the terminal 1000; in other embodiments, there may be at least two display screens 1005, respectively disposed on different surfaces of the terminal 1000 or in a folded design; in still other embodiments, the display screen 1005 may be a flexible display disposed on a curved or folded surface of the terminal 1000. The display screen 1005 may even be arranged in a non-rectangular irregular shape, i.e., a shaped screen. The display screen 1005 may be made of materials such as an LCD (Liquid Crystal Display) or an OLED (Organic Light-Emitting Diode).
The camera assembly 1006 is used to capture images or video. Optionally, the camera assembly 1006 includes a front camera and a rear camera. Generally, the front camera is disposed on the front panel of the terminal, and the rear camera is disposed on the rear surface of the terminal. In some embodiments, there are at least two rear cameras, each being any one of a main camera, a depth-of-field camera, a wide-angle camera, and a telephoto camera, so that the main camera and the depth-of-field camera can be fused to realize a background blurring function, and the main camera and the wide-angle camera can be fused to realize panoramic shooting, VR (Virtual Reality) shooting, or other fused shooting functions. In some embodiments, the camera assembly 1006 may also include a flash. The flash may be a single-color-temperature flash or a dual-color-temperature flash. A dual-color-temperature flash is a combination of a warm-light flash and a cold-light flash and can be used for light compensation at different color temperatures.
The audio circuit 1007 may include a microphone and a speaker. The microphone is used for collecting sound waves of the user and the environment, converting the sound waves into electrical signals, and inputting the electrical signals to the processor 1001 for processing or to the radio frequency circuit 1004 for voice communication. For stereo sound collection or noise reduction, multiple microphones may be provided, each at a different location on the terminal 1000. The microphone may also be an array microphone or an omnidirectional pickup microphone. The speaker is used to convert electrical signals from the processor 1001 or the radio frequency circuit 1004 into sound waves. The speaker may be a conventional membrane speaker or a piezoelectric ceramic speaker. A piezoelectric ceramic speaker can convert an electrical signal into sound waves audible to humans, or into sound waves inaudible to humans for purposes such as distance measurement. In some embodiments, the audio circuit 1007 may also include a headphone jack.
The positioning component 1008 is used to locate the current geographic location of the terminal 1000 for navigation or LBS (Location Based Service). The positioning component 1008 may be based on the GPS (Global Positioning System) of the United States, the BeiDou system of China, the GLONASS system of Russia, or the Galileo system of the European Union.
Power supply 1009 is used to supply power to various components in terminal 1000. The power source 1009 may be alternating current, direct current, disposable batteries, or rechargeable batteries. When the power source 1009 includes a rechargeable battery, the rechargeable battery may support wired charging or wireless charging. The rechargeable battery may also be used to support fast charge technology.
In some embodiments, terminal 1000 can also include one or more sensors 1010. The one or more sensors 1010 include, but are not limited to: acceleration sensor 1011, gyro sensor 1012, pressure sensor 1013, fingerprint sensor 1014, optical sensor 1015, and proximity sensor 1016.
Acceleration sensor 1011 can detect acceleration magnitudes on three coordinate axes of a coordinate system established with terminal 1000. For example, the acceleration sensor 1011 may be used to detect components of the gravitational acceleration in three coordinate axes. The processor 1001 may control the touch display screen 1005 to display a user interface in a landscape view or a portrait view according to the gravitational acceleration signal collected by the acceleration sensor 1011. The acceleration sensor 1011 may also be used for acquisition of motion data of a game or a user.
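The landscape/portrait decision from the gravity components can be sketched as follows. This is a simplified illustration, not code from the patent: it considers only the x and y axes and ignores hysteresis and the z axis.

```python
# Illustrative sketch (not from the patent): choose landscape vs. portrait
# from the gravity components reported by an accelerometer. When gravity lies
# mostly along the device's x-axis the device is held sideways (landscape);
# when mostly along the y-axis it is upright (portrait).
def orientation_from_gravity(gx, gy):
    return "landscape" if abs(gx) > abs(gy) else "portrait"

print(orientation_from_gravity(9.8, 0.5))  # landscape
print(orientation_from_gravity(0.3, 9.7))  # portrait
```

A production implementation would also debounce rapid flips near the 45-degree boundary, which is why real systems add hysteresis.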
The gyro sensor 1012 may detect a body direction and a rotation angle of the terminal 1000, and the gyro sensor 1012 and the acceleration sensor 1011 may cooperate to acquire a 3D motion of the user on the terminal 1000. From the data collected by the gyro sensor 1012, the processor 1001 may implement the following functions: motion sensing (such as changing the UI according to a user's tilting operation), image stabilization at the time of photographing, game control, and inertial navigation.
Pressure sensor 1013 may be disposed on a side frame of terminal 1000 and/or on a lower layer of touch display 1005. When pressure sensor 1013 is disposed on a side frame of terminal 1000, a user's grip signal on terminal 1000 can be detected, and processor 1001 performs left-right hand recognition or shortcut operation according to the grip signal collected by pressure sensor 1013. When the pressure sensor 1013 is disposed at a lower layer of the touch display screen 1005, the processor 1001 controls the operability control on the UI interface according to the pressure operation of the user on the touch display screen 1005. The operability control comprises at least one of a button control, a scroll bar control, an icon control and a menu control.
The fingerprint sensor 1014 is used to collect the user's fingerprint, and the processor 1001 identifies the user according to the fingerprint collected by the fingerprint sensor 1014, or the fingerprint sensor 1014 itself identifies the user according to the collected fingerprint. Upon identifying the user's identity as trusted, the processor 1001 authorizes the user to perform relevant sensitive operations, including unlocking the screen, viewing encrypted information, downloading software, making payments, changing settings, and so on. The fingerprint sensor 1014 may be disposed on the front, back, or side of the terminal 1000. When a physical key or vendor logo is provided on the terminal 1000, the fingerprint sensor 1014 may be integrated with the physical key or vendor logo.
The optical sensor 1015 is used to collect the ambient light intensity. In one embodiment, the processor 1001 may control the display brightness of the touch display screen 1005 according to the intensity of the ambient light collected by the optical sensor 1015. Specifically, when the ambient light intensity is high, the display brightness of the touch display screen 1005 is increased; when the ambient light intensity is low, the display brightness of the touch display screen 1005 is turned down. In another embodiment, the processor 1001 may also dynamically adjust the shooting parameters of the camera assembly 1006 according to the intensity of the ambient light collected by the optical sensor 1015.
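The ambient-light-to-brightness mapping described above can be sketched as follows. The lux thresholds and the linear mapping are assumptions for illustration, not values from the patent.

```python
# Illustrative sketch (not from the patent): map ambient light intensity (lux)
# to a display brightness level in [0.1, 1.0]. Brighter surroundings raise the
# screen brightness; darker surroundings lower it. Thresholds are assumed.
def brightness_for_ambient(lux, lo=10.0, hi=1000.0):
    """Clamped linear mapping from lux to a brightness fraction."""
    if lux <= lo:
        return 0.1
    if lux >= hi:
        return 1.0
    return 0.1 + 0.9 * (lux - lo) / (hi - lo)

print(brightness_for_ambient(5))     # 0.1
print(brightness_for_ambient(1000))  # 1.0
```

Real devices typically use a perceptual (roughly logarithmic) curve rather than a linear one; the linear form is kept here only for clarity.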
The proximity sensor 1016, also known as a distance sensor, is typically disposed on the front panel of the terminal 1000. The proximity sensor 1016 is used to measure the distance between the user and the front face of the terminal 1000. In one embodiment, when the proximity sensor 1016 detects that the distance between the user and the front face of the terminal 1000 is gradually decreasing, the processor 1001 controls the touch display screen 1005 to switch from the bright-screen state to the screen-off state; when the proximity sensor 1016 detects that the distance is gradually increasing, the processor 1001 controls the touch display screen 1005 to switch from the screen-off state to the bright-screen state.
Those skilled in the art will appreciate that the configuration shown in FIG. 10 is not intended to be limiting and that terminal 1000 can include more or fewer components than shown, or some components can be combined, or a different arrangement of components can be employed.
Fig. 11 is a schematic structural diagram of a server according to an exemplary embodiment. The server 1100 may vary considerably depending on configuration or performance, and may include one or more processors (CPUs) 1101 and one or more memories 1102, where the memory 1102 stores at least one instruction that is loaded and executed by the processor 1101 to implement the methods provided by the above method embodiments. Of course, the server may also have components such as a wired or wireless network interface, a keyboard, and an input/output interface for input and output, and may further include other components for implementing device functions, which are not described herein again.
The server 1100 may be used to perform the steps performed by the server in the voice message aggregation method described above.
In an exemplary embodiment, there is also provided a non-transitory computer readable storage medium, wherein instructions of the storage medium, when executed by a processor of an electronic device, enable the electronic device to perform the steps performed by a terminal or a server in the voice message aggregation method.
In an exemplary embodiment, there is also provided a computer program product, wherein instructions of the computer program product, when executed by a processor of an electronic device, enable the electronic device to perform the steps performed by the terminal or the server in the voice message aggregation method.
Other embodiments of the disclosure will be apparent to those skilled in the art from consideration of the specification and practice of the disclosure herein. This application is intended to cover any variations, uses, or adaptations of the disclosure following, in general, the principles of the disclosure and including such departures from the present disclosure as come within known or customary practice within the art to which the disclosure pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the disclosure being indicated by the following claims.
It will be understood that the present disclosure is not limited to the precise arrangements described above and shown in the drawings and that various modifications and changes may be made without departing from the scope thereof. The scope of the present disclosure is limited only by the appended claims.

Claims (20)

1. A method for voice message aggregation, the method comprising:
acquiring a plurality of selected target voice messages from the displayed interactive interface;
when the aggregation operation of the target voice messages is detected, aggregating the target voice messages;
storing the aggregated voice message in a favorite list corresponding to the logged-in user identifier;
displaying the favorite list; when a playing operation on any aggregated voice message in the favorite list is detected, playing the aggregated voice message and displaying a play time axis of the aggregated voice message, wherein the play time axis comprises a start time point of each target voice message in the plurality of target voice messages;
when the trigger operation of any starting time point is detected, the aggregated voice message is played from any starting time point;
the step of obtaining the selected multiple target voice messages from the displayed interactive interface comprises the following steps:
when a first trigger operation on a first target voice message in the plurality of target voice messages is detected, displaying a favorite option in the interactive interface;
and when a selection operation on the favorite option is detected, acquiring the selected plurality of target voice messages.
2. The method of claim 1, wherein the play timeline includes a play progress bar, the method further comprising:
when the dragging operation of the playing progress bar is detected, acquiring the termination position of the dragging operation;
and playing the aggregated voice message from the time point corresponding to the termination position.
3. The method according to claim 1, wherein the obtaining the selected target voice messages from the displayed interactive interface comprises:
when a first trigger operation on a first target voice message in the plurality of target voice messages is detected, displaying a selection option of each voice message in the interactive interface;
and acquiring a plurality of target voice messages corresponding to the selected selection options.
4. The method of claim 3, wherein displaying a selection option for each voice message in the interactive interface upon detecting a first trigger operation on a first target voice message of the plurality of target voice messages comprises:
displaying a multi-selection option in the interactive interface when a first trigger operation on the first target voice message is detected;
and when the selection operation of the multi-selection option is detected, displaying the selection option of each voice message in the interactive interface.
5. The method of claim 1, wherein the interactive interface comprises an aggregation option, and wherein aggregating the plurality of target voice messages when an aggregation operation for the plurality of target voice messages is detected comprises:
when the selection operation of the aggregation option is detected, the target voice messages are aggregated.
6. The method of claim 1, wherein said aggregating the plurality of targeted voice messages comprises:
and aggregating the plurality of target voice messages according to the sequence of the plurality of target voice messages.
7. The method of claim 6, wherein said aggregating the plurality of target voice messages in the order of the plurality of target voice messages comprises:
aggregating the plurality of target voice messages in order of their message identifications, from first to last; or,
aggregating the plurality of target voice messages in order of the times at which they were selected, from earliest to latest; or,
aggregating the plurality of target voice messages in order of their transmission times, from earliest to latest.
8. The method of claim 1, further comprising at least one of:
aggregating the durations of the plurality of target voice messages to obtain the duration of the aggregated voice message;
and aggregating the capacities of the plurality of target voice messages to obtain the capacity of the aggregated voice message.
9. The method of claim 1, wherein said aggregating the plurality of targeted voice messages comprises:
adding a preset prompt message between any two adjacent target voice messages in the plurality of target voice messages;
and aggregating the target voice messages added with the preset prompt message.
10. An apparatus for voice message aggregation, the apparatus comprising:
an acquisition unit configured to acquire the selected plurality of target voice messages from the displayed interactive interface;
an aggregation unit configured to aggregate the plurality of target voice messages when an aggregation operation on the plurality of target voice messages is detected;
the storage unit is configured to store the aggregated voice message in a favorite list corresponding to the logged-in user identifier;
the device further comprises:
a display unit configured to display the favorite list;
a playing unit configured to, when a playing operation on any aggregated voice message in the favorite list is detected, play the aggregated voice message;
the display unit is configured to display a play time axis of the aggregated voice message, wherein the play time axis includes a start time point of each target voice message in the plurality of target voice messages;
the playing unit is configured to play the aggregated voice message from any starting time point when a trigger operation on the any starting time point is detected;
the acquisition unit includes:
the display subunit is configured to display a favorite option in the interactive interface when a first trigger operation on a first target voice message in the plurality of target voice messages is detected;
an acquisition subunit configured to acquire the selected plurality of target voice messages when a selection operation of the favorite option is detected.
11. The apparatus of claim 10, wherein the playback timeline includes a playback progress bar;
the acquisition unit is configured to acquire an end position of a drag operation when the drag operation on the play progress bar is detected;
the playing unit is configured to play the aggregated voice message from a time point corresponding to the termination position.
12. The apparatus of claim 10,
the display subunit is configured to display a selection option of each voice message in the interactive interface when a first trigger operation on a first target voice message in the plurality of target voice messages is detected;
the acquisition subunit is configured to acquire a plurality of target voice messages corresponding to the selected selection options.
13. The apparatus according to claim 12, wherein the display subunit is configured to display a multiple-choice option in the interactive interface when a first trigger operation on the first target voice message is detected;
the display subunit is configured to display the selection option of each voice message in the interactive interface when the selection operation of the multi-selection option is detected.
14. The apparatus of claim 10, wherein the interactive interface comprises an aggregation option, and wherein the aggregation unit is configured to aggregate the plurality of targeted voice messages when a selection operation of the aggregation option is detected.
15. The apparatus of claim 10, wherein the aggregating unit is configured to aggregate the plurality of target voice messages in an order of the plurality of target voice messages.
16. The apparatus of claim 15,
the aggregation unit is configured to aggregate the plurality of target voice messages in order of their message identifications, from first to last; or,
the aggregation unit is configured to aggregate the plurality of target voice messages in order of the times at which they were selected, from earliest to latest; or,
the aggregation unit is configured to aggregate the plurality of target voice messages in order of their transmission times, from earliest to latest.
17. The apparatus of claim 10, wherein the aggregation unit is configured to perform at least one of:
aggregating the durations of the plurality of target voice messages to obtain the duration of the aggregated voice message;
and aggregating the capacities of the plurality of target voice messages to obtain the capacity of the aggregated voice message.
18. The apparatus of claim 10, wherein the aggregation unit comprises:
the adding subunit is configured to add a preset prompting message between any two adjacent target voice messages in the plurality of target voice messages;
and the aggregation subunit is configured to aggregate the target voice messages added with the preset prompt messages.
19. An electronic device, characterized in that the electronic device comprises:
one or more processors;
volatile or non-volatile memory for storing instructions executable by the one or more processors;
wherein the one or more processors are configured to perform the voice message aggregation method of any one of claims 1-9.
20. A non-transitory computer readable storage medium having instructions stored thereon that, when executed by a processor of an electronic device, enable the electronic device to perform the voice message aggregation method of any one of claims 1-9.
CN202010153199.7A 2020-03-06 2020-03-06 Voice message aggregation method and device, electronic equipment and storage medium Active CN111399796B (en)

Publications (2)

Publication Number Publication Date
CN111399796A CN111399796A (en) 2020-07-10
CN111399796B true CN111399796B (en) 2022-08-05




Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant