WO2019205119A1

WO2019205119A1 - Voice playback method and device, and client

Info

Publication number: WO2019205119A1
Application number: PCT/CN2018/085027
Authority: WO
Inventors: 陈飞; 杨磊; 廖彬彬
Original assignee: 海能达通信股份有限公司
Priority date: 2018-04-28
Filing date: 2018-04-28
Publication date: 2019-10-31

Abstract

The present application provides a voice playback method and device, and a client. The voice playback method comprises: obtaining a voice packet from a server; generating a voice playback progress bar according to the voice of each voice object, a voice object identifier, the voice start time, and the voice end time in the voice packet, and marking voice playback segments of each voice object on the voice playback progress bar; and selecting at least one voice playback segment from the voice playback segments of each voice object marked on the voice playback progress bar for voice playback. In the present application, by means of the manners above, the voice playback speed of the specified object is improved, and thus the voice playback efficiency is improved.

Description

Voice playing method, device and client

Technical field

The present application relates to the field of voice processing technologies, and in particular, to a voice playing method, apparatus, and client.

Background art

With the development of electronic computers and computer networks, voice playback functions (which can be understood as pre-storing voices and then playing back the stored voices when needed) have been realized and are widely used in fields such as communication, education, and scientific research.

However, when the voice of a specified object in a certain segment of speech needs to be found using the voice playback function, the speech has to be listened to from beginning to end. As a result, the voice of the specified object cannot be located quickly, and voice playback efficiency is low.

Summary of the invention

To solve the above technical problem, the embodiment of the present application provides a voice playing method, device, and client, so as to improve the playback speed of a specified object, thereby improving the efficiency of voice playback. The technical solution is as follows:

A voice playing method includes:

obtaining a voice packet from a server;

generating a voice playback progress bar according to the voice, voice object identity, voice start time, and voice end time of each voice object in the voice packet, and marking the voice play segment of each of the voice objects on the voice playback progress bar; and

selecting at least one voice play segment from the voice play segments of the voice objects marked on the voice playback progress bar for voice playback.

Preferably, marking the voice play segment of each of the voice objects on the voice playback progress bar includes:

marking the voice play segment of each of the voice objects in a different color on the voice playback progress bar, and marking a voice object identifier on the voice play segment of each of the voice objects;

or, marking the voice play segment of each of the voice objects in a different shading pattern on the voice playback progress bar, and marking a voice object identifier on the voice play segment of each of the voice objects.

Preferably, selecting at least one voice play segment from the voice play segments of the voice objects marked on the voice playback progress bar for voice playback includes:

playing, in sequence, the voice play segments of the respective voice objects marked on the voice playback progress bar.

Preferably, before selecting at least one voice play segment from the voice play segments of the voice objects marked on the voice playback progress bar for voice playback, the method further includes:

displaying in gray the unselected voice play segments among the voice play segments of the respective voice objects marked on the voice playback progress bar.

Preferably, the obtaining a voice packet of each voice object from the server includes:

Sending a voice packet request to the server;

Receiving a voice packet returned by the server in response to the voice packet request.

Preferably, before the obtaining the voice packet from the server, the method further includes:

Obtaining voices of each of the voice objects, and recording start and end times of voices of the respective voice objects;

The voice, the voice object identity, the start time and the end time of each voice object are encapsulated into voice packets, and sent to the server for storage.

A voice playback device includes:

a first obtaining module, configured to obtain a voice packet from a server;

a generating module, configured to generate a voice playback progress bar according to voice, voice object identity, voice start time, and voice end time of each voice object in the voice packet;

a marking module, configured to mark a voice playing segment of each of the voice objects on the voice playback progress bar;

a playing module, configured to select at least one voice play segment from the voice play segments of the voice objects marked on the voice playback progress bar for voice playback.

Preferably, the marking module comprises:

a first marking unit, configured to mark a voice playing segment of each of the voice objects in different colors on the voice playback progress bar, and mark a voice object identifier on a voice playing segment of each of the voice objects;

or a second marking unit, configured to mark the voice play segment of each of the voice objects in a different shading pattern on the voice playback progress bar, and mark a voice object identifier on the voice play segment of each of the voice objects.

Preferably, the playing module includes:

a first playing unit, configured to play, in sequence, the voice play segments of each of the voice objects marked on the voice playback progress bar.

Preferably, the marking module further comprises:

a display unit, configured to display in gray the unselected voice play segments among the voice play segments of the voice objects marked on the voice playback progress bar.

Preferably, the first acquiring module includes:

a sending unit, configured to send a voice packet request to the server;

a receiving unit, configured to receive a voice packet returned by the server in response to the voice packet request.

Preferably, the device further comprises:

a second acquiring module, configured to acquire voices of each of the voice objects, and record start time and end time of voices of each of the voice objects;

a sending module, configured to encapsulate the voice, the voice object identity, the start time, and the end time of each voice object into a voice packet, and send the voice packet to the server for storage.

A client comprising: a processor, a memory, and a data bus, wherein the processor and the memory communicate via the data bus;

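the memory is configured to store a program;

the processor is configured to execute the program;

the program, when executed by the processor, implements the following method steps:

obtaining a voice packet from a server;

generating a voice playback progress bar according to the voice, voice object identity, voice start time, and voice end time of each voice object in the voice packet, and marking the voice play segment of each of the voice objects on the voice playback progress bar; and

selecting at least one voice play segment from the voice play segments of the voice objects marked on the voice playback progress bar for voice playback.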

Compared with the prior art, the beneficial effects of the present application are:

In the present application, the voice play segments of the respective voice objects are marked on the voice playback progress bar, and at least one voice play segment is selected from the marked voice play segments for voice playback. The selected voice play segment can be a voice play segment of a specified object, so the voice of the specified object can be played directly without listening from the beginning of the recording. This improves the playback speed of the voice of the specified object, which in turn improves the efficiency of voice playback.

DRAWINGS

In order to illustrate the technical solutions in the embodiments of the present application more clearly, the drawings used in the description of the embodiments are briefly described below. The drawings in the following description show only some embodiments of the present application; those of ordinary skill in the art may derive other drawings from them without creative effort.

FIG. 1 is a flow chart of a voice playing method provided by the present application;

FIG. 2 is a schematic structural diagram of a voice playback progress bar provided by the present application;

FIG. 3 is another flow chart of a voice playing method provided by the present application;

FIG. 4 is still another flow chart of the voice playing method provided by the present application;

FIG. 5 is a schematic diagram of a logical structure of a voice playback apparatus provided by the present application.

Detailed description

The technical solutions in the embodiments of the present application are described clearly and completely below with reference to the drawings in the embodiments of the present application. The described embodiments are only a part of the embodiments of the present application, not all of them. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present application without creative effort fall within the protection scope of the present application.

The embodiment of the present application discloses a voice playing method, which obtains a voice packet from a server; generates a voice playback progress bar according to the voice, voice object identity, voice start time, and voice end time of each voice object in the voice packet; marks the voice play segment of each of the voice objects on the voice playback progress bar; and plays the voice corresponding to the marked voice play segments, thereby achieving voice playback.

The voice playing method disclosed in the embodiment of the present application is described below. Referring to FIG. 1, the method may include:

Step S11: Obtain a voice packet from the server.

In this embodiment, the server is configured to store voice packets, which can reduce the occupation of the client memory and reduce the running load of the client.

The voice packet may include the voice, the voice object identity, the voice start time, and the voice end time of at least one voice object. The voice object identity is used to identify the voice object.
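The contents of a voice packet listed above can be pictured as a small record type. The following is only a minimal sketch: the class names `VoiceEntry` and `VoicePacket`, the field names, and the use of seconds for times are assumptions made for illustration, since the application does not specify a concrete layout or audio encoding.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class VoiceEntry:
    """One voice object's contribution to a voice packet (hypothetical layout)."""
    object_id: str      # voice object identity, e.g. "A"
    start: float        # voice start time, in seconds (assumed unit)
    end: float          # voice end time, in seconds (assumed unit)
    audio: bytes = b""  # the recorded voice; encoding is not specified in the source

    def __post_init__(self) -> None:
        if self.end < self.start:
            raise ValueError("voice end time precedes voice start time")


@dataclass
class VoicePacket:
    """A voice packet holding the entries of at least one voice object."""
    entries: List[VoiceEntry]


packet = VoicePacket(entries=[
    VoiceEntry("A", 0.0, 12.5),
    VoiceEntry("B", 12.5, 20.0),
    VoiceEntry("A", 20.0, 31.0),
])
# Distinct voice objects present in the packet.
speakers = sorted({e.object_id for e in packet.entries})
```

Keeping one entry per utterance, rather than one per voice object, lets the same voice object appear several times along the time axis.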

Step S12: Generate a voice playback progress bar according to the voice, voice object identity, voice start time, and voice end time of each voice object in the voice packet, and mark the voice play segment of each voice object on the voice playback progress bar.

Generating a voice playback progress bar according to the voice object identity, the voice start time, and the voice end time in the voice packet can be understood as: generating, according to the voice object identity, the voice start time, and the voice end time of each voice object in the voice packet, a voice playback progress bar in the form of a time axis.

After the voice playback progress bar is generated, the voice play segments of the voice objects are marked on it, so that the voice time segments of the voice objects are displayed intuitively. This makes it possible to quickly locate the position of the voice of the voice object of interest and improves the efficiency of voice playback.
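One way to picture step S12 is to map each voice object's start and end times onto a time axis normalized to the width of the bar. The sketch below assumes times in seconds; the names `Segment` and `build_progress_bar` are hypothetical and are not taken from the application.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Segment:
    object_id: str  # voice object identity
    start: float    # voice start time, seconds (assumed unit)
    end: float      # voice end time, seconds (assumed unit)


def build_progress_bar(segments: List[Segment]) -> List[Tuple[str, float, float]]:
    """Return (object_id, left, right) per segment, with left/right as
    fractions of the bar width, ordered along the time axis."""
    if not segments:
        return []
    t0 = min(s.start for s in segments)
    t1 = max(s.end for s in segments)
    span = (t1 - t0) or 1.0  # avoid division by zero for a zero-length recording
    ordered = sorted(segments, key=lambda s: s.start)
    return [(s.object_id, (s.start - t0) / span, (s.end - t0) / span) for s in ordered]


bar = build_progress_bar([
    Segment("B", 10.0, 20.0),
    Segment("A", 0.0, 10.0),
    Segment("C", 20.0, 40.0),
])
# bar == [("A", 0.0, 0.25), ("B", 0.25, 0.5), ("C", 0.5, 1.0)]
```

A user interface would multiply the fractions by the pixel width of the bar to draw each voice play segment.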

Step S13: Select at least one voice play segment from the voice play segments of each of the voice objects marked on the voice playback progress bar to perform voice play.

In another embodiment of the present application, the voice play segment of each of the voice objects is marked on the voice playback progress bar, and specifically includes:

A voice play segment of each of the voice objects is marked in a different color on the voice playback progress bar, and a voice object identifier is marked on a voice play segment of each of the voice objects.

The voice play segment marks of different voice objects have different colors, and are used to distinguish voice play segments of different voice objects, so that the voice time segments of each voice object are displayed very intuitively.

Referring to FIG. 2, the voice play segments of different voice objects in the voice playback progress bar have different colors, and the voice play segment of each voice object is marked with a voice object identifier. For example, the voice play segment of voice object A is marked with "person A", the voice play segment of voice object B is marked with "person B", and the voice play segment of voice object C is marked with "person C".

In another embodiment of the present application, another implementation of marking the voice play segment of each of the voice objects on the voice playback progress bar is described, which may specifically include:

The voice play segments of each of the voice objects are marked on the voice playback progress bar in different shading patterns, and the voice object identifiers are marked on the voice play segments of the respective voice objects.

The voice play segment marks of different voice objects have different shading patterns, and are used to distinguish voice play segments of different voice objects, so as to display the voice time segments of each voice object very intuitively.

It should be noted that the manner of marking the voice play segments of the voice objects on the voice playback progress bar is not limited to marking with different colors or with different shading patterns. Any marking manner that can distinguish the voice play segments of different voice objects falls within the protection scope of the present application.

Certainly, in this embodiment, the voice play segments of the voice objects may also be marked with a combination of color and shading pattern. Specifically, the voice play segments of different voice objects may be marked with different colors and the same shading pattern; or with the same color and different shading patterns; or with different colors and different shading patterns.
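The three color and shading combinations described above can be sketched as a function that assigns a (color, pattern) pair to each voice object. The palette, the pattern names, the `mode` parameter, and the function name `assign_styles` are all assumptions made for illustration.

```python
from typing import Dict, List, Tuple

COLORS = ["red", "green", "blue", "orange"]            # illustrative palette
PATTERNS = ["solid", "stripes", "dots", "crosshatch"]  # illustrative shading patterns


def assign_styles(object_ids: List[str], mode: str = "color") -> Dict[str, Tuple[str, str]]:
    """Map each voice object to a (color, pattern) pair.

    mode="color":   different colors, same shading pattern
    mode="pattern": same color, different shading patterns
    mode="both":    different colors and different shading patterns
    """
    if mode not in ("color", "pattern", "both"):
        raise ValueError(f"unknown mode: {mode}")
    unique = list(dict.fromkeys(object_ids))  # first-seen order, repeats dropped
    if len(unique) > len(COLORS):
        raise ValueError("more voice objects than available styles")
    styles: Dict[str, Tuple[str, str]] = {}
    for i, oid in enumerate(unique):
        color = COLORS[i] if mode in ("color", "both") else COLORS[0]
        pattern = PATTERNS[i] if mode in ("pattern", "both") else PATTERNS[0]
        styles[oid] = (color, pattern)
    return styles


by_color = assign_styles(["A", "B", "A", "C"])
by_pattern = assign_styles(["A", "B"], mode="pattern")
```

Because the mapping is keyed by voice object, every voice play segment of the same voice object is drawn in the same style wherever it appears on the bar.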

In another embodiment of the present application, one implementation of selecting at least one voice play segment from the voice play segments of the voice objects marked on the voice playback progress bar for voice playback is described, which may specifically include:

Playing, in sequence, the voice play segments of each of the voice objects marked on the voice playback progress bar. This can be understood as a full voice play mode, that is, the voices of all voice objects are played.

In another embodiment of the present application, another implementation of selecting at least one voice play segment from the voice play segments of the voice objects marked on the voice playback progress bar for voice playback is described, which may specifically include:

From the voice play segments of each of the voice objects marked on the voice playback progress bar, a voice play segment of the specified voice object is selected for voice play.

Selecting a voice play segment of the specified voice object from the voice play segments of the voice objects marked on the voice playback progress bar for voice playback can be understood as: playing only the voice corresponding to the voice play segments of the specified voice object, and automatically skipping the voice play segments of non-specified voice objects.

Only the voice corresponding to the voice segment of the specified voice object is played, and the voice segment of the non-designated voice object is automatically skipped, which can save time and further improve the voice playback efficiency.
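Playing only the specified voice object amounts to filtering the marked segments by voice object identity while keeping time order. The sketch below is illustrative only; `Segment`, `playlist_for`, and `total_duration` are hypothetical names, and times are assumed to be in seconds.

```python
from typing import List, NamedTuple


class Segment(NamedTuple):
    object_id: str  # voice object identity
    start: float    # seconds
    end: float      # seconds


def playlist_for(segments: List[Segment], wanted_id: str) -> List[Segment]:
    """Segments of the specified voice object in time order; all others are skipped."""
    return [s for s in sorted(segments, key=lambda s: s.start) if s.object_id == wanted_id]


def total_duration(segments: List[Segment]) -> float:
    """Sum of the lengths of the given segments, in seconds."""
    return sum(s.end - s.start for s in segments)


recording = [
    Segment("A", 0.0, 10.0),
    Segment("B", 10.0, 25.0),
    Segment("A", 25.0, 30.0),
    Segment("C", 30.0, 60.0),
]
only_a = playlist_for(recording, "A")
# Listening time avoided compared with the full voice play mode.
saved = total_duration(recording) - total_duration(only_a)
```

In this made-up recording, voice object A accounts for 15 of 60 seconds, so skipping the other voice objects avoids 45 seconds of listening.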

Specifically, the user can drag the scroll bar on the voice playback progress bar to the voice play segment of the specified voice object, and the client then plays the voice corresponding to that voice play segment on the voice playback progress bar.
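When the user drags to a position on the bar, the client has to work out which voice play segment that position falls in. A minimal sketch using binary search follows. It assumes segments are sorted by start time and do not overlap, which the application does not state; `segment_at` is a hypothetical helper.

```python
from bisect import bisect_right
from typing import List, NamedTuple, Optional


class Segment(NamedTuple):
    object_id: str  # voice object identity
    start: float    # seconds
    end: float      # seconds


def segment_at(segments: List[Segment], t: float) -> Optional[Segment]:
    """Return the segment containing time t, or None if t falls in a gap.

    Assumes `segments` is sorted by start time and non-overlapping.
    """
    starts = [s.start for s in segments]
    i = bisect_right(starts, t) - 1  # last segment starting at or before t
    if i >= 0 and segments[i].start <= t < segments[i].end:
        return segments[i]
    return None


timeline = [
    Segment("A", 0.0, 10.0),
    Segment("B", 10.0, 20.0),
    Segment("C", 25.0, 40.0),
]
hit = segment_at(timeline, 12.0)   # inside B's segment
gap = segment_at(timeline, 22.0)   # between B and C
```

Treating each segment as a half-open interval means a boundary time such as 10.0 belongs to the segment that starts there.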

In another embodiment of the present application, another voice playing method is provided. Referring to FIG. 3, the method may include:

Step S21: Obtain a voice packet from the server.

Step S22: Generate a voice playback progress bar according to the voice, voice object identity, voice start time, and voice end time of each voice object in the voice packet, and mark the voice play segment of each voice object on the voice playback progress bar.

The steps S21-S22 are the same as the steps S11-S12 in the foregoing embodiment, and the detailed process of the steps S21-S22 can be referred to the related description of the steps S11-S12, and details are not described herein again.

Step S23: Among the voice play segments of the voice objects marked on the voice playback progress bar, display the unselected voice play segments in gray.

In the voice play segment of each of the voice objects marked on the voice playback progress bar, the unselected voice play segment is displayed in a gray color, and the time period of the voice to be played can be displayed more intuitively.

Step S24: Select at least one voice play segment from the voice play segments of each of the voice objects marked on the voice playback progress bar to perform voice play.

For the detailed process of step S24, refer to the description in the foregoing embodiments of selecting at least one voice play segment from the voice play segments of the voice objects marked on the voice playback progress bar for voice playback; details are not repeated here.
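Step S23 above can be sketched as a pure function that computes the display color of each segment from the current selection. The function name `display_colors`, the color names, and the representation of the bar as a list of voice object identities in time order are assumptions for illustration.

```python
from typing import Dict, List, Set, Tuple


def display_colors(
    bar: List[str],            # voice object identity of each segment, in time order
    colors: Dict[str, str],    # normal marking color per voice object
    selected_ids: Set[str],    # voice objects whose segments are selected for playback
    gray: str = "gray",
) -> List[Tuple[str, str]]:
    """Return (object_id, color) per segment; unselected segments are shown in gray."""
    return [(oid, colors[oid] if oid in selected_ids else gray) for oid in bar]


shown = display_colors(
    bar=["A", "B", "A", "C"],
    colors={"A": "red", "B": "green", "C": "blue"},
    selected_ids={"A"},
)
# shown == [("A", "red"), ("B", "gray"), ("A", "red"), ("C", "gray")]
```

Recomputing the colors from the selection, rather than mutating stored colors, makes it straightforward to restore the original marking when the selection changes.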

In another embodiment of the present application, the voice packet of each voice object is obtained from the server, and specifically includes:

A1. Send a voice packet request to the server.

Specifically, a voice packet request may be sent to the server according to a time and/or a voice object, so as to request the voice packet corresponding to that time and/or voice object.

A2. Receive a voice packet returned by the server in response to the voice packet request.
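Steps A1 and A2 can be illustrated with a request that carries an optional voice object and an optional time range, together with a stand-in for the server that filters its stored entries. The application does not define a request format or transport, so every name here (`VoicePacketRequest`, `answer_request`, the field names) is hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Entry:
    object_id: str  # voice object identity
    start: float    # seconds
    end: float      # seconds


@dataclass(frozen=True)
class VoicePacketRequest:
    """Request by time and/or voice object; None means 'no constraint'."""
    object_id: Optional[str] = None
    time_from: Optional[float] = None
    time_to: Optional[float] = None


def answer_request(stored: List[Entry], req: VoicePacketRequest) -> List[Entry]:
    """Stand-in for the server: return the stored entries matching the request."""
    result = []
    for e in stored:
        if req.object_id is not None and e.object_id != req.object_id:
            continue
        if req.time_from is not None and e.end <= req.time_from:
            continue  # entry ends before the requested window starts
        if req.time_to is not None and e.start >= req.time_to:
            continue  # entry starts after the requested window ends
        result.append(e)
    return result


stored = [Entry("A", 0.0, 10.0), Entry("B", 10.0, 20.0), Entry("A", 20.0, 30.0)]
by_object = answer_request(stored, VoicePacketRequest(object_id="A"))
by_time = answer_request(stored, VoicePacketRequest(time_from=5.0, time_to=15.0))
by_both = answer_request(stored, VoicePacketRequest(object_id="A", time_from=15.0))
```

In a real deployment the request would travel over whatever protocol the client and server share; the filtering logic is the only part sketched here.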

Based on the content of the foregoing various embodiments, in another embodiment of the present application, another voice playing method is provided. Referring to FIG. 4, the method may include:

Step S31: Acquire the voices of the respective voice objects, and record the start time and the end time of the voices of the respective voice objects.

When the voice object initiates the voice, the client acquires the voice of each voice object, and records the start time and the end time of the voice of each voice object.

Step S32: Encapsulate the voice, the voice object identity, the start time, and the end time of each voice object into a voice packet, and send the voice packet to the server for storage.

Step S33: Obtain a voice packet from the server.

Step S34: Generate a voice playback progress bar according to the voice, voice object identity, voice start time, and voice end time of each voice object in the voice packet, and mark the voice play segment of each voice object on the voice playback progress bar.

Step S35: Select at least one voice play segment from the voice play segments of each of the voice objects marked on the voice playback progress bar to perform voice play.

The steps S33-S35 are the same as the steps S11-S13 in the foregoing embodiment, and the detailed process of the steps S33-S35 can be referred to the related description of the steps S11-S13, and details are not described herein again.
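The encapsulation in steps S31 and S32 can be sketched as serializing each recorded voice, with its voice object identity and start and end times, into one payload. JSON with Base64-encoded audio is used here purely as an assumption; the application does not name a packet format, and `encapsulate` and `unpack` are hypothetical helpers.

```python
import base64
import json
from typing import List, Tuple

# (object_id, start_seconds, end_seconds, audio_bytes)
Recording = Tuple[str, float, float, bytes]


def encapsulate(recordings: List[Recording]) -> bytes:
    """Pack recordings into a single voice packet payload (assumed JSON layout)."""
    entries = [
        {
            "object_id": oid,
            "start": start,
            "end": end,
            "audio": base64.b64encode(audio).decode("ascii"),
        }
        for oid, start, end, audio in recordings
    ]
    return json.dumps({"entries": entries}).encode("utf-8")


def unpack(payload: bytes) -> List[Recording]:
    """Inverse of encapsulate, as the client would do after step S33."""
    data = json.loads(payload.decode("utf-8"))
    return [
        (e["object_id"], e["start"], e["end"], base64.b64decode(e["audio"]))
        for e in data["entries"]
    ]


recordings = [("A", 0.0, 1.5, b"\x00\x01"), ("B", 1.5, 4.0, b"\x02\x03\x04")]
payload = encapsulate(recordings)
restored = unpack(payload)
```

The round trip through `unpack` shows that the identity, timing, and audio survive storage on the server unchanged.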

Next, the voice playback device provided by the present application will be described. The voice playback device described below and the voice playback method described above can be referred to each other.

Referring to FIG. 5, a schematic diagram of a logical structure of a voice playback apparatus provided by the present application is shown. The voice playback apparatus includes: a first acquisition module 11, a generation module 12, a marking module 13, and a playback module 14.

The first obtaining module 11 is configured to obtain a voice packet from a server.

The generating module 12 is configured to generate a voice playback progress bar according to the voice, the voice object identity, the voice start time, and the voice end time of each voice object in the voice packet.

The marking module 13 is configured to mark a voice playing segment of each of the voice objects on the voice playback progress bar.

The playing module 14 is configured to select at least one voice playing segment from the voice playing segments of the voice objects marked on the voice playback progress bar to perform voice playing.
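The division into modules 11 to 14 can be pictured as one class whose methods play the role of each module. This is only a structural sketch: the class `VoicePlaybackDevice`, the injected `obtain` and `play` callables, and the tuple layout are assumptions, and real modules would render a user interface and drive an audio device.

```python
from typing import Callable, List, Optional, Tuple

Entry = Tuple[str, float, float]  # (object_id, start, end); audio omitted for brevity


class VoicePlaybackDevice:
    def __init__(self, obtain: Callable[[], List[Entry]], play: Callable[[Entry], None]):
        self._obtain = obtain  # role of the first obtaining module 11
        self._play = play      # audio back end used by the playing module 14
        self.bar: List[Entry] = []
        self.labels: List[str] = []

    def load(self) -> None:
        entries = self._obtain()
        # Role of the generating module 12: order entries along the time axis.
        self.bar = sorted(entries, key=lambda e: e[1])
        # Role of the marking module 13: one identifier label per segment.
        self.labels = [e[0] for e in self.bar]

    def play(self, object_id: Optional[str] = None) -> None:
        # Role of the playing module 14: play all segments, or only one voice object's.
        for entry in self.bar:
            if object_id is None or entry[0] == object_id:
                self._play(entry)


played: List[Entry] = []
device = VoicePlaybackDevice(
    obtain=lambda: [("B", 10.0, 20.0), ("A", 0.0, 10.0), ("A", 20.0, 30.0)],
    play=played.append,
)
device.load()
device.play("A")
```

Passing the obtaining and playing behavior in as callables keeps the sketch testable without a server or sound hardware.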

In this embodiment, the marking module 13 may include: a first marking unit or a second marking unit.

The first marking unit is configured to mark the voice playing segments of each of the voice objects in different colors on the voice playback progress bar, and mark the voice object identifiers on the voice playing segments of the voice objects.

The second marking unit is configured to mark the voice play segments of each of the voice objects in different shading patterns on the voice playback progress bar, and mark the voice object identifiers on the voice play segments of the voice objects.

In this embodiment, the playing module 14 may include: a first playing unit or a second playing unit.

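The first playing unit is configured to play, in sequence, the voice play segments of each of the voice objects marked on the voice playback progress bar.

The second playing unit is configured to select a voice play segment of the specified voice object from the voice play segments of the voice objects marked on the voice playback progress bar for voice playback.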

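In this embodiment, the marking module 13 may further include a display unit, configured to display in gray the unselected voice play segments among the voice play segments of the voice objects marked on the voice playback progress bar.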

In this embodiment, the first obtaining module 11 may include: a sending unit and a receiving unit.

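The sending unit is configured to send a voice packet request to the server.

The receiving unit is configured to receive the voice packet returned by the server in response to the voice packet request.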

In this embodiment, the voice playback device may further include: a second acquiring module and a sending module.

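The second acquiring module is configured to acquire the voices of each of the voice objects, and record the start time and the end time of the voices of the voice objects.

The sending module is configured to encapsulate the voice, the voice object identity, the start time, and the end time of each voice object into a voice packet, and send the voice packet to the server for storage.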

In another embodiment of the present application, a client is provided that includes a processor, a memory, and a data bus, the processor and the memory being in communication over the data bus.

The memory is used to store a program.

The processor is configured to execute the program.

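The program, when executed by the processor, implements the following method steps:

obtaining a voice packet from the server;

generating a voice playback progress bar according to the voice, voice object identity, voice start time, and voice end time of each voice object in the voice packet, and marking the voice play segment of each of the voice objects on the voice playback progress bar; and

selecting at least one voice play segment from the voice play segments of the voice objects marked on the voice playback progress bar for voice playback.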

It should be noted that the embodiments in this specification are described in a progressive manner. Each embodiment focuses on its differences from the other embodiments, and for the same or similar parts the embodiments may be referred to each other. Since the device embodiment is basically similar to the method embodiment, its description is relatively brief, and the relevant parts can be found in the description of the method embodiment.

Finally, it should also be noted that in this context, relational terms such as first and second are used merely to distinguish one entity or operation from another entity or operation, and do not necessarily require or imply any such actual relationship or order between these entities or operations. Furthermore, the terms "comprise", "include", or any other variations thereof are intended to encompass a non-exclusive inclusion, such that a process, method, article, or device that comprises a plurality of elements includes not only those elements but also other elements not expressly listed, or elements that are inherent to such a process, method, article, or device. An element defined by the phrase "comprising a ..." does not exclude the presence of additional identical elements in the process, method, article, or device that comprises the element.

For convenience of description, the above device is described as being divided by function into various units. Of course, when implementing the present application, the functions of the units may be implemented in one or more pieces of software and/or hardware.

It will be apparent to those skilled in the art from the above description of the embodiments that the present application can be implemented by means of software plus a necessary general hardware platform. Based on such understanding, the technical solution of the present application may, in essence, be embodied in the form of a software product. The software product may be stored in a storage medium such as a ROM/RAM, a magnetic disk, or an optical disk, and includes instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) to perform the methods described in the various embodiments of the present application or in portions of the embodiments.

The voice playing method, device, and client provided by the present application have been described in detail above. Specific examples are used herein to explain the principles and implementation manners of the application, and the description of the above embodiments is only intended to help understand the method of the present application and its core idea. Meanwhile, those of ordinary skill in the art may, according to the idea of the present application, make changes to the specific implementation manner and application scope. In summary, the content of this specification should not be construed as limiting the present application.

Claims

A voice playing method, comprising:

obtaining a voice packet from a server;

generating a voice playback progress bar according to the voice, voice object identity, voice start time, and voice end time of each voice object in the voice packet, and marking the voice play segment of each of the voice objects on the voice playback progress bar; and

selecting at least one voice play segment from the voice play segments of the voice objects marked on the voice playback progress bar for voice playback.
The method according to claim 1, wherein the marking a voice play segment of each of the voice objects on the voice playback progress bar comprises:

Marking a voice play segment of each of the voice objects in different colors on the voice playback progress bar, and marking a voice object identifier on a voice play segment of each of the voice objects;

Or, the voice play segments of each of the voice objects are marked on the voice playback progress bar in different shading patterns, and the voice object identifiers are marked on the voice play segments of the respective voice objects.
The method according to claim 1, wherein the at least one voice play segment is selected from the voice play segments of the voice objects marked on the voice playback progress bar to perform voice play, including:

playing, in sequence, the voice play segments of the respective voice objects marked on the voice playback progress bar.
The method according to claim 3, wherein before selecting at least one voice play segment from the voice play segments of the voice objects marked on the voice playback progress bar for voice playback, the method further comprises:

displaying in gray the unselected voice play segments among the voice play segments of the respective voice objects marked on the voice playback progress bar.
The method according to claim 1, wherein the obtaining a voice packet of each voice object from the server comprises:

Sending a voice packet request to the server;

Receiving a voice packet returned by the server in response to the voice packet request.
The method according to any one of claims 1-5, wherein before the obtaining the voice packet from the server, the method further comprises:

Obtaining voices of each of the voice objects, and recording start and end times of voices of the respective voice objects;

The voice, the voice object identity, the start time and the end time of each voice object are encapsulated into voice packets, and sent to the server for storage.
A voice playback device, comprising:

a first obtaining module, configured to obtain a voice packet from a server;

a generating module, configured to generate a voice playback progress bar according to voice, voice object identity, voice start time, and voice end time of each voice object in the voice packet;

a marking module, configured to mark a voice playing segment of each of the voice objects on the voice playback progress bar;

The playing module is configured to select at least one voice playing segment from the voice playing segments of each of the voice objects marked on the voice playback progress bar to perform voice playback.
The apparatus according to claim 7, wherein said marking module comprises:

a first marking unit, configured to mark a voice playing segment of each of the voice objects in different colors on the voice playback progress bar, and mark a voice object identifier on a voice playing segment of each of the voice objects;

or a second marking unit, configured to mark the voice play segment of each of the voice objects in a different shading pattern on the voice playback progress bar, and mark a voice object identifier on the voice play segment of each of the voice objects.
The device according to claim 7, wherein the playing module comprises:

The first playing unit is configured to perform voice playback on the voice playing segments of each of the voice objects marked on the voice playback progress bar in sequence.
The device according to claim 9, wherein the marking module further comprises:

And a display unit, configured to perform a gray color display on the unselected voice play segment of the voice play segments of each of the voice objects marked on the voice playback progress bar.
The device according to claim 7, wherein the first obtaining module comprises:

a sending unit, configured to send a voice packet request to the server;

And a receiving unit, configured to receive a voice packet returned by the server in response to the voice packet request.
The device of any of claims 7-11, wherein the device further comprises:

a second acquiring module, configured to acquire voices of each of the voice objects, and record start time and end time of voices of each of the voice objects;

a sending module, configured to encapsulate the voice, the voice object identity, the start time, and the end time of each voice object into a voice packet, and send the voice packet to the server for storage.
A client, comprising: a processor, a memory, and a data bus, wherein the processor and the memory communicate via the data bus;

The memory is configured to store a program;

the processor is configured to execute the program;

the program, when executed by the processor, implements the following method steps:

obtaining a voice packet from a server;

generating a voice playback progress bar according to the voice, voice object identity, voice start time, and voice end time of each voice object in the voice packet, and marking the voice play segment of each of the voice objects on the voice playback progress bar; and

selecting at least one voice play segment from the voice play segments of the voice objects marked on the voice playback progress bar for voice playback.