CN113329235A

CN113329235A - Audio processing method and device and cloud server

Info

Publication number: CN113329235A
Application number: CN202110600633.6A
Authority: CN
Inventors: 杨建国; 张宇
Original assignee: Taicang Taoxin Information Technology Co ltd
Current assignee: Taicang Taoxin Information Technology Co ltd
Priority date: 2021-05-31
Filing date: 2021-05-31
Publication date: 2021-08-31

Abstract

An embodiment of the invention provides an audio processing method, an audio processing device, and a cloud server, belonging to the field of audio playback. The method comprises: displaying an audio receiving interface, wherein an audio presentation area of the audio receiving interface comprises a first audio track, among a plurality of audio tracks, played in single-track playback; receiving a first operation corresponding to the first audio track; and, in response to the first operation, changing the first audio track played in the audio presentation area to the plurality of audio tracks played in multi-track playback. The application thereby enables real-time switching between single-track playback and multi-track playback.

Description

Audio processing method and device and cloud server

Technical Field

The invention relates to the technical field of audio playing, in particular to an audio processing method and device and a cloud server.

Background

In the prior art, when several audio tracks exist during audio playback, the tracks generally carry different but related content. For example, when a song is played, a first track plays the song and a second track plays the original song. Alternatively, when a sports match is played, the first audio track plays the original sound of the venue and the second audio track plays the commentary.

Usually, the tracks are placed in the left channel and the right channel respectively, and the user only needs to switch between the left and right channels to achieve the above effect. However, the content handled in this way is not real-time live content, and this approach cannot achieve fast track switching in a multi-track scenario (for example, an online meeting or a multi-person microphone session).

Disclosure of Invention

In order to overcome at least the above disadvantages in the prior art, the present invention provides an audio processing method, an audio processing apparatus, and a cloud server.

In a first aspect, the present invention provides an audio processing method, comprising:

displaying an audio receiving interface, wherein an audio presentation area of the audio receiving interface comprises a first audio track, among a plurality of audio tracks, played in single-track playback;

receiving a first operation corresponding to the first audio track;

in response to the first operation, changing the first audio track played in the audio presentation area to the plurality of audio tracks played in multi-track playback.

In a second aspect, the present invention provides an audio processing apparatus, the apparatus comprising:

an audio playing module configured to display an audio receiving interface, wherein an audio presentation area of the audio receiving interface comprises a first audio track, among a plurality of audio tracks, played in single-track playback;

an operation module configured to receive a first operation corresponding to the first audio track;

wherein the audio playing module is further configured to change, in response to the first operation, the first audio track played in the audio presentation area to the plurality of audio tracks played in multi-track playback.

In a third aspect, an embodiment of the present invention provides a computer-readable storage medium, where instructions are stored, and when executed, cause a computer to perform the audio processing method in the first aspect or any one of the possible designs of the first aspect.

In a fourth aspect, an embodiment of the present invention further provides a cloud server. The cloud server includes a processor, a machine-readable storage medium, and a network interface, which are connected through a bus system. The network interface is used for communicatively connecting with at least one play terminal, the machine-readable storage medium is used for storing programs, instructions, or code, and the processor is used for executing the programs, instructions, or code in the machine-readable storage medium to perform the audio processing method in the first aspect or any one of the possible designs of the first aspect.

Based on any one of the above aspects, in the embodiments of the present application, a first audio track, among a plurality of audio tracks, played in single-track playback is displayed in an audio presentation area of an audio receiving interface. In response to a first operation, the first audio track in the audio presentation area is changed to the plurality of audio tracks played in multi-track playback, so that fast switching between the single-track playback mode and the multi-track playback mode is achieved.

Drawings

In order to illustrate the technical solutions of the embodiments of the present invention more clearly, the drawings required by the embodiments are briefly described below. It should be understood that the following drawings illustrate only some embodiments of the present invention and therefore should not be considered as limiting its scope; those skilled in the art can obtain other related drawings from these drawings without inventive effort.

Fig. 1 is a schematic view of an application scenario of an audio processing system according to an embodiment of the present invention;

Fig. 2 is a flowchart illustrating an audio processing method according to an embodiment of the present invention;

Fig. 3 is a functional block diagram of an audio processing apparatus according to an embodiment of the present invention;

Fig. 4 is a schematic block diagram of a structure of a cloud server for implementing the audio processing method according to an embodiment of the present invention.

Detailed Description

The present invention is described in detail below with reference to the drawings, and the specific operation methods in the method embodiments can also be applied to the apparatus embodiments or the system embodiments.

Fig. 1 is an interactive schematic diagram of an audio processing system 10 according to an embodiment of the present invention. The audio processing system 10 may include a cloud server 100 and a play terminal 200 communicatively connected to the cloud server 100. The audio processing system 10 shown in fig. 1 is only one possible example, and in other possible embodiments, the audio processing system 10 may include only a portion of the components shown in fig. 1 or may include other components.

In this embodiment, the play terminal 200 may include a mobile device, a tablet computer, a laptop computer, etc., or any combination thereof. In some embodiments, the mobile device may include a smart home device, a wearable device, a smart mobile device, a virtual reality device, an augmented reality device, or the like, or any combination thereof. In some embodiments, the smart home devices may include control devices of smart electrical devices, smart monitoring devices, smart televisions, smart cameras, and the like, or any combination thereof. In some embodiments, the wearable device may include a smart bracelet, smart shoelaces, smart glasses, a smart helmet, a smart watch, a smart garment, a smart backpack, a smart accessory, or the like, or any combination thereof. In some embodiments, the smart mobile device may include a smartphone, a personal digital assistant, a gaming device, and the like, or any combination thereof. In some embodiments, the virtual reality device and/or the augmented reality device may include a virtual reality helmet, virtual reality glasses, a virtual reality patch, an augmented reality helmet, augmented reality glasses, an augmented reality patch, or the like, or any combination thereof. For example, the virtual reality device and/or augmented reality device may include various virtual reality products and the like.

In this embodiment, the cloud server 100 and the play terminal 200 in the audio processing system 10 may cooperatively perform the audio processing method described in the following method embodiment. For the specific steps performed by the cloud server 100 and the play terminal 200, reference may be made to the detailed description of the following method embodiment.

To solve the technical problem described in the Background, Fig. 2 shows a schematic flowchart of an audio processing method according to an embodiment of the present invention. The audio processing method of this embodiment may be executed by the cloud server 100 shown in Fig. 1 and is described in detail below.

Step S110, displaying an audio receiving interface, wherein an audio presentation area of the audio receiving interface comprises a first audio track, among a plurality of audio tracks, played in single-track playback;

Step S120, receiving a first operation corresponding to the first audio track;

Step S130, in response to the first operation, changing the first audio track played in the audio presentation area to the plurality of audio tracks played in multi-track playback.

In one possible embodiment, single-track playback is a playback mode in which only one track is played at any given time and the track is changed by sliding left or right; multi-track playback is a playback mode in which at least two tracks are played at the same time and are played as a mix.
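The distinction between the two playback modes can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the function names are hypothetical, tracks are modelled as lists of float samples, the mix is a plain per-sample average, and the left/right slide wraps around at the ends. The embodiment does not specify any of these details.

```python
from typing import List, Sequence


def render_single(tracks: Sequence[Sequence[float]], active: int) -> List[float]:
    """Single-track playback: only the active track is audible."""
    return list(tracks[active])


def render_mixed(tracks: Sequence[Sequence[float]]) -> List[float]:
    """Multi-track playback: all tracks are audible as one mix.

    Assumption: the mix is a per-sample average, truncated to the
    shortest track. A real mixer may weight or limit the signal.
    """
    if len(tracks) < 2:
        raise ValueError("multi-track playback needs at least two tracks")
    length = min(len(t) for t in tracks)
    return [sum(t[i] for t in tracks) / len(tracks) for i in range(length)]


def slide(active: int, direction: int, total: int) -> int:
    """Change the audible track on a left (-1) or right (+1) slide.

    Assumption: sliding past the last track wraps to the first.
    """
    return (active + direction) % total
```

In this sketch, single-track playback changes which track is heard by changing `active`, whereas multi-track playback ignores `active` and renders every track at once.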

In one possible embodiment, step S130 includes:

Step S131, determining a multi-track playback mode according to the total number of the plurality of audio tracks;

Step S132, pausing the playback of the first audio track in the audio presentation area, and updating the audio presentation area to the plurality of audio tracks played using the multi-track playback mode.

In one possible embodiment, step S131 includes:

Step S1311, determining at least two candidate multi-track playback modes corresponding to the total number of the plurality of audio tracks;

Step S1312, selecting, from the at least two candidate multi-track playback modes, the multi-track playback mode to be used this time according to a default strategy;

Step S1313, wherein the default strategy includes at least one of, or a combination of: a random selection strategy, a priority selection strategy, and a circular selection strategy.
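The three selection strategies named above can be sketched as follows. This is an illustrative reading rather than the claimed implementation: the class name `ModeSelector`, the strategy strings, and the mode labels `"mode_a"` and `"mode_b"` are hypothetical, and the sketch assumes that the candidate list is already ordered by priority and that the circular strategy advances one position per selection.

```python
import random
from typing import List, Optional


class ModeSelector:
    """Pick one multi-track playback mode from a list of candidates."""

    def __init__(self, strategy: str = "priority",
                 rng: Optional[random.Random] = None) -> None:
        self.strategy = strategy
        self.rng = rng or random.Random()
        self._cursor = 0  # position used by the circular strategy

    def select(self, candidates: List[str]) -> str:
        if len(candidates) < 2:
            raise ValueError("at least two candidate modes are expected")
        if self.strategy == "random":
            return self.rng.choice(candidates)
        if self.strategy == "priority":
            # Assumption: candidates are sorted, highest priority first.
            return candidates[0]
        if self.strategy == "circular":
            mode = candidates[self._cursor % len(candidates)]
            self._cursor += 1
            return mode
        raise ValueError(f"unknown strategy: {self.strategy}")
```

For example, a selector created with `ModeSelector("circular")` and called repeatedly with `["mode_a", "mode_b"]` alternates between the two candidates, while a `"priority"` selector always returns the first one.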

In a possible embodiment, step S130 is followed by:

Step S140, receiving a second operation corresponding to a second audio track of the plurality of audio tracks;

Step S150, in response to the second operation, changing the plurality of audio tracks played in multi-track playback in the audio presentation area to the second audio track played in single-track playback.
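The switch from single-track to multi-track playback and back can be sketched as a small state holder. The class and method names below are hypothetical, tracks are represented only by labels, and "pausing" is modelled by clearing the list of audible tracks; the embodiment does not prescribe this structure.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class AudioPresentationArea:
    """Hypothetical model of the audio presentation area's playback state."""

    tracks: List[str]
    mode: str = "single"            # "single" or "multi"
    active: Optional[int] = 0       # index of the audible track in single mode
    playing: List[str] = field(default_factory=list)

    def __post_init__(self) -> None:
        # Initial state: the first track is played in single-track playback.
        self.playing = [self.tracks[self.active]]

    def on_first_operation(self) -> None:
        """Pause the single track, then play all tracks together."""
        if self.mode != "single":
            return
        self.playing = []                 # pause the first track
        self.mode = "multi"
        self.active = None
        self.playing = list(self.tracks)  # all tracks are now audible

    def on_second_operation(self, track_index: int) -> None:
        """Return to single-track playback of the selected track."""
        if self.mode != "multi":
            return
        if not 0 <= track_index < len(self.tracks):
            raise IndexError("no such track")
        self.mode = "single"
        self.active = track_index
        self.playing = [self.tracks[track_index]]
```

A possible use: create `AudioPresentationArea(tracks=["host", "guest", "music"])`, call `on_first_operation()` to hear all three tracks, then call `on_second_operation(1)` to hear only `"guest"`.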

In a possible embodiment, when the audio presentation area displays only part of the audio content of the plurality of audio tracks played in multi-track playback, then after the switch the second audio track played in single-track playback is automatically scrolled so that its reference time point is aligned with the reference time point of the audio presentation area.
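One way to read the alignment described above is as a scroll-offset calculation. The sketch below is only an interpretation: it assumes that the presentation area shows a time window of fixed width, that the area's reference time point is expressed as a distance in seconds from the window's left edge, and that the offset is clamped to the track's duration. The function and parameter names are hypothetical.

```python
def aligned_scroll_offset(track_reference_s: float,
                          area_reference_s: float,
                          track_duration_s: float,
                          window_s: float) -> float:
    """Return the start time (in seconds) of the visible window.

    The window is positioned so that the track's reference time point
    sits at `area_reference_s` seconds from the window's left edge,
    clamped so the window never extends past either end of the track.
    """
    offset = track_reference_s - area_reference_s
    max_offset = max(0.0, track_duration_s - window_s)
    return min(max(offset, 0.0), max_offset)
```

With a 300-second track, a 30-second window, and a reference point 10 seconds from the left edge, a track reference time of 90 seconds gives a window starting at 80 seconds under these assumptions.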

In one possible embodiment, the first operation comprises a click, a double click, a long press for a preset time, or a hover for a preset time; the second operation comprises at least one of a touch action, a tap action, an eye gaze action, and a blink action on the wearable device.
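As a rough illustration of how press-based first operations might be told apart, the sketch below classifies a press from its timestamps. The thresholds `LONG_PRESS_S` and `DOUBLE_CLICK_GAP_S`, the gesture labels, and the function names are assumptions chosen for the example; the embodiment only states that preset times exist, not their values. Hover detection is not modelled here.

```python
from typing import Optional

LONG_PRESS_S = 0.5        # hypothetical preset long-press time, in seconds
DOUBLE_CLICK_GAP_S = 0.3  # hypothetical maximum gap between two clicks

# Gesture labels treated as a "first operation" in this sketch.
FIRST_OPERATIONS = {"click", "double_click", "long_press", "hover"}


def classify_press(down_s: float, up_s: float,
                   previous_up_s: Optional[float] = None) -> str:
    """Classify a press from its press and release timestamps."""
    held = up_s - down_s
    if held >= LONG_PRESS_S:
        return "long_press"
    if previous_up_s is not None and down_s - previous_up_s <= DOUBLE_CLICK_GAP_S:
        return "double_click"
    return "click"


def is_first_operation(gesture: str) -> bool:
    """Return True if the gesture label counts as a first operation."""
    return gesture in FIRST_OPERATIONS
```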

Fig. 3 is a schematic diagram of functional modules of an audio processing apparatus 300 according to an embodiment of the present invention, and in this embodiment, the audio processing apparatus 300 may be divided into the functional modules according to a method embodiment executed by the cloud server 100, that is, the following functional modules corresponding to the audio processing apparatus 300 may be used to execute the method embodiments executed by the cloud server 100. The audio processing apparatus 300 may include an audio playing module 310 and an operation module 320, and the functions of the functional modules of the audio processing apparatus 300 are described in detail below.

The audio playing module 310 may be configured to perform steps S110 and S130 above, namely, to display an audio receiving interface, where an audio presentation area of the audio receiving interface includes a first audio track, among a plurality of audio tracks, played in single-track playback, and further to change, in response to the first operation, the first audio track played in the audio presentation area to the plurality of audio tracks played in multi-track playback.

The operation module 320 may be configured to perform the above step S120, namely, to receive a first operation corresponding to the first track.

It should be noted that the division of the modules of the above apparatus is only a logical division, and the actual implementation may be wholly or partially integrated into one physical entity, or may be physically separated. And these modules can be realized in the form of software called by processing element; or may be implemented entirely in hardware; and part of the modules can be realized in the form of calling software by the processing element, and part of the modules can be realized in the form of hardware. For example, the audio playing module 310 may be a separate processing element, or may be integrated into a chip of the apparatus, or may be stored in a memory of the apparatus in the form of program codes, and a processing element of the apparatus calls and executes the functions of the audio playing module 310. Other modules are implemented similarly. In addition, all or part of the modules can be integrated together or can be independently realized. The processing element described herein may be an integrated circuit having signal processing capabilities. In implementation, each step of the above method or each module above may be implemented by an integrated logic circuit of hardware in a processor element or an instruction in the form of software.

For example, the above modules may be one or more integrated circuits configured to implement the above methods, such as one or more Application Specific Integrated Circuits (ASICs), one or more Digital Signal Processors (DSPs), or one or more Field Programmable Gate Arrays (FPGAs). For another example, when some of the above modules are implemented in the form of program code scheduled by a processing element, the processing element may be a general-purpose processor, such as a Central Processing Unit (CPU) or another processor that can call the program code. As another example, these modules may be integrated together and implemented in the form of a system-on-a-chip (SOC).

Fig. 4 is a schematic diagram illustrating a hardware structure of the cloud server 100 for implementing the audio processing method according to an embodiment of the present invention. As shown in Fig. 4, the cloud server 100 may include a processor 110, a machine-readable storage medium 120, a bus 130, and a transceiver 140.

In a specific implementation process, at least one processor 110 executes computer-executable instructions stored in the machine-readable storage medium 120 (for example, the instructions included in the audio processing apparatus 300 shown in Fig. 3), so that the processor 110 may execute the audio processing method according to the above method embodiment. The processor 110, the machine-readable storage medium 120, and the transceiver 140 are connected through the bus 130, and the processor 110 may be configured to control the transmitting and receiving actions of the transceiver 140 so as to exchange data with the aforementioned play terminal 200.

For a specific implementation process of the processor 110, reference may be made to the above-mentioned method embodiments executed by the cloud server 100, and implementation principles and technical effects are similar, which are not described herein again.

In the embodiment shown in Fig. 4, it should be understood that the processor may be a Central Processing Unit (CPU), another general-purpose processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), etc. A general-purpose processor may be a microprocessor or any conventional processor. The steps of a method disclosed in connection with the present invention may be performed directly by a hardware processor, or by a combination of hardware and software modules within the processor.

The machine-readable storage medium 120 may comprise high-speed RAM memory and may also include non-volatile storage NVM, such as at least one disk memory.

The bus 130 may be an Industry Standard Architecture (ISA) bus, a Peripheral Component Interconnect (PCI) bus, an Extended ISA (EISA) bus, or the like. The bus 130 may be divided into an address bus, a data bus, a control bus, and the like. For ease of illustration, the buses in the figures of the present application are not limited to only one bus or one type of bus.

In addition, an embodiment of the present invention further provides a readable storage medium in which computer-executable instructions are stored; when a processor executes the computer-executable instructions, the audio processing method described above is implemented.

The readable storage medium described above may be implemented by any type of volatile or non-volatile memory device or combination thereof, such as Static Random Access Memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, magnetic or optical disk. Readable storage media can be any available media that can be accessed by a general purpose or special purpose computer.

Finally, it should be noted that: the above embodiments are only used to illustrate the technical solution of the present invention, and not to limit the same; while the invention has been described in detail and with reference to the foregoing embodiments, it will be understood by those skilled in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some or all of the technical features may be equivalently replaced; and the modifications or the substitutions do not make the essence of the corresponding technical solutions depart from the scope of the technical solutions of the embodiments of the present invention.

Claims

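1. A method of audio processing, the method comprising:

displaying an audio receiving interface, wherein an audio presentation area of the audio receiving interface comprises a first audio track, among a plurality of audio tracks, played in single-track playback;

receiving a first operation corresponding to the first audio track;

in response to the first operation, changing the first audio track played in the audio presentation area to the plurality of audio tracks played in multi-track playback.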

2. The method of claim 1,

the single-track playing is a playing mode which only plays a single track at the same time and changes the track by using a left-right sliding mode;

the multi-track playing is a playing mode in which at least two tracks are played at the same time and a mixed-sound playing mode is used for playing.

3. The method of claim 2, wherein said changing the first audio track played in the audio presentation area to the plurality of audio tracks played in multi-track playback comprises:

determining a multi-track playback mode according to the total number of the plurality of audio tracks;

pausing playback of the first audio track in the audio presentation area, and updating the audio presentation area to the plurality of audio tracks played using the multi-track playback mode.

4. The method according to claim 3, wherein said determining a multi-track playback mode according to the total number of the plurality of audio tracks comprises:

determining at least two candidate multi-track playback modes corresponding to the total number of the plurality of audio tracks;

selecting, from the at least two candidate multi-track playback modes, the multi-track playback mode to be used this time according to a default strategy;

wherein the default strategy comprises at least one of, or a combination of: a random selection strategy, a priority selection strategy, and a circular selection strategy.

5. The method of any of claims 1 to 4, wherein, after changing the first audio track in the audio presentation area to the plurality of audio tracks played in multi-track playback, the method further comprises:

receiving a second operation corresponding to a second audio track of the plurality of audio tracks;

in response to the second operation, changing the plurality of audio tracks played in multi-track playback in the audio presentation area to the second audio track played in single-track playback.

6. The method of claim 5, further comprising:

when the audio presentation area displays only a part of the audio content of the plurality of tracks played with the plurality of tracks, the reference time point of the second track played with the single track is automatically scrolled to be aligned with the reference time point of the audio presentation area after switching.

7. The method according to any of claims 1 to 6, wherein the first operation comprises a click, a double click, a long press for a preset time, or a hover for a preset time; and the second operation comprises at least one of a touch action, a tap action, an eye gaze action, and a blink action on the wearable device.

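8. An audio processing apparatus, characterized in that the apparatus comprises:

an audio playing module configured to display an audio receiving interface, wherein an audio presentation area of the audio receiving interface comprises a first audio track, among a plurality of audio tracks, played in single-track playback;

an operation module configured to receive a first operation corresponding to the first audio track;

wherein the audio playing module is further configured to change, in response to the first operation, the first audio track played in the audio presentation area to the plurality of audio tracks played in multi-track playback.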

9. A computer readable storage medium storing instructions/executable code which, when executed by a processor of an electronic device, causes the electronic device to implement the method of any one of claims 1-7.

10. A cloud server, characterized in that the cloud server comprises a processor, a machine-readable storage medium, and a network interface, the machine-readable storage medium, the network interface, and the processor are connected through a bus system, the network interface is used for being connected with at least one play terminal in a communication manner, the machine-readable storage medium is used for storing programs, instructions, or codes, and the processor is used for executing the programs, instructions, or codes in the machine-readable storage medium to execute the audio processing method according to any one of claims 1 to 7.