CN112416229A

CN112416229A - Audio content adjusting method and device and electronic equipment

Info

Publication number: CN112416229A
Application number: CN202011346383.XA
Authority: CN
Inventors: 高桦
Original assignee: Vivo Mobile Communication Co Ltd
Current assignee: Vivo Mobile Communication Co Ltd
Priority date: 2020-11-26
Filing date: 2020-11-26
Publication date: 2021-02-26

Abstract

The application discloses an audio content adjusting method, an audio content adjusting device, and electronic equipment. It belongs to the technical field of communication and addresses the problem that existing electronic equipment can only adjust an audio parameter of a multimedia file as a whole, so that the audio adjusting mode is single and not flexible enough. The method comprises the following steps: the electronic equipment receives a first input on a target audio track in the audio content; in response to the first input, the electronic equipment executes a target operation on the target audio content corresponding to the target audio track; wherein the audio content includes N audio tracks, the N audio tracks include the target audio track, and N is a positive integer; the target operation is used for editing the audio content of the target audio track.

Description

Audio content adjusting method and device and electronic equipment

Technical Field

The application belongs to the technical field of communication, and particularly relates to an audio content adjusting method, an audio content adjusting device and electronic equipment.

Background

With the development of electronic device technology, playing multimedia files (e.g., audio, video) through an electronic device (e.g., a mobile phone or a tablet computer) has become a very popular form of playing. When the electronic equipment plays the multimedia file, the volume of the whole multimedia file can be adjusted according to the operation of a user, and different tone qualities of the multimedia file can be switched. For example, after receiving the input of the user for adjusting the volume of the multimedia file, the electronic device may directly adjust the overall volume of the multimedia file; alternatively, the electronic device may adjust the audio quality of the multimedia file as a whole after receiving the user input to adjust the audio quality of the multimedia file.

In the related art, the audio content in an existing multimedia file is rich in composition; for example, the audio content may include character voices, background sound, and the like. However, in the above examples, the electronic device can only adjust an audio parameter of the multimedia file as a whole.

Therefore, in existing electronic equipment, any audio parameter of a multimedia file can only be adjusted as a whole, and the audio adjusting mode is single and inflexible.

Disclosure of Invention

The embodiment of the application aims to provide an audio content adjusting method, an audio content adjusting device, and electronic equipment that can solve the problem that existing electronic equipment can only adjust an audio parameter of a multimedia file as a whole, so that the audio adjusting mode is single and not flexible enough.

In order to solve the technical problem, the present application is implemented as follows:

In a first aspect, an embodiment of the present application provides an audio content adjusting method, where the method includes: the electronic equipment receives a first input on a target audio track in the audio content; in response to the first input, the electronic equipment executes a target operation on the target audio content corresponding to the target audio track; wherein the audio content includes N audio tracks, the N audio tracks include the target audio track, and N is a positive integer; the target operation is used for editing the audio content of the target audio track.

In a second aspect, an embodiment of the present application provides an audio content adjusting apparatus, where the apparatus includes a receiving module and a display module; the receiving module is used for receiving a first input on a target audio track in the audio content; the display module is configured to execute a target operation on the target audio content corresponding to the target audio track in response to the first input received by the receiving module; wherein the audio content includes N audio tracks, the N audio tracks include the target audio track, and N is a positive integer; the target operation is used for editing the audio content of the target audio track.

In a third aspect, an embodiment of the present application provides an electronic device, which includes a processor, a memory, and a program or instructions stored on the memory and executable on the processor, and when executed by the processor, the program or instructions implement the steps of the method according to the first aspect.

In a fourth aspect, embodiments of the present application provide a readable storage medium, on which a program or instructions are stored, which when executed by a processor implement the steps of the method according to the first aspect.

In a fifth aspect, an embodiment of the present application provides a chip, where the chip includes a processor and a communication interface, where the communication interface is coupled to the processor, and the processor is configured to execute a program or instructions to implement the method according to the first aspect.

In the embodiment of the application, after receiving a first input of a target audio track in audio content, the electronic device may perform a target operation on the target audio content corresponding to the target audio track, where the target operation may be used to edit the track content of the target audio track. Therefore, when a user needs to edit a file containing audio in the electronic equipment, the electronic equipment can directly process the audio according to the input of the user, extract data corresponding to a target audio track required by the user according to the editing requirement of the user, and edit the target audio track, so that the steps of operating the electronic equipment by the user are saved, and the use efficiency of the electronic equipment is improved.

Drawings

Fig. 1 is a schematic flowchart of an audio content adjusting method according to an embodiment of the present application;

Fig. 2 is a schematic diagram of an interface applied by an audio content adjusting method according to an embodiment of the present disclosure;

Fig. 3 is a second schematic diagram of an interface applied by an audio content adjusting method according to an embodiment of the present application;

Fig. 4 is a schematic structural diagram of an audio content adjusting apparatus according to an embodiment of the present disclosure;

Fig. 5 is a schematic structural diagram of an electronic device according to an embodiment of the present disclosure;

Fig. 6 is a second schematic structural diagram of an electronic device according to an embodiment of the present disclosure.

Detailed Description

The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are some, but not all, embodiments of the present application. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.

The terms "first," "second," and the like in the description and in the claims of the present application are used for distinguishing between similar objects and not necessarily for describing a particular sequential or chronological order. It will be appreciated that the terms so used may be interchanged under appropriate circumstances, such that embodiments of the application may be practiced in sequences other than those illustrated or described herein. Objects distinguished by "first," "second," and the like are generally of one class, and the number of objects is not limited; e.g., the first object can be one or more than one. In addition, "and/or" in the specification and claims means at least one of the connected objects, and the character "/" generally means that the preceding and succeeding related objects are in an "or" relationship.

Audio track

Audio tracks are the parallel "tracks", displayed one after another, that you see in sequencer software. Each track defines its own attributes, such as the timbre, the timbre library, the number of channels, the input/output ports, and the volume.
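As a minimal sketch of the track attributes listed above, a track can be modelled as a simple record. The class name `Track`, the field names, and the default values are illustrative assumptions and do not come from the application.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Track:
    """Hypothetical record holding the per-track attributes named in the text."""
    identifier: str                  # e.g. "person 1" or "background music"
    timbre: str = "default"          # timbre selected for this track
    timbre_library: str = "default"  # library the timbre is taken from
    channels: int = 2                # number of channels
    input_port: str = "in0"          # input port (assumed naming)
    output_port: str = "out0"        # output port (assumed naming)
    volume: float = 1.0              # linear gain, 1.0 leaves the track unchanged
    samples: List[float] = field(default_factory=list)  # audio data of the track


voice = Track(identifier="person 1", channels=1, samples=[0.1, -0.2, 0.3])
```

Keeping the attributes on the track itself is what later allows one track to be edited without touching the others.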

The audio content adjusting method provided by the embodiment of the present application is described in detail below with reference to the accompanying drawings through specific embodiments and application scenarios thereof.

The audio content adjusting method provided by the embodiment of the application can be applied to scenarios in which audio content is adjusted.

Consider such a scenario. Assume that a user plays a video on a mobile phone, and the audio portion of the video includes audio components corresponding to the voices of 3 persons (person 1, person 2, and person 3) and background music. The user only needs to pay attention to the voice of person 1 and therefore wants to remove person 2, person 3, and the background music. In the related art, the user uses a computer to import the video file into an audio processing application. The audio processing application extracts the audio part of the video, splits the audio part into 3 person parts and a background music part according to a received splitting operation, extracts the audio content of person 1 according to a received editing operation, and stores the audio content of person 1 according to a storing operation. Finally, the audio content of person 1 is sent to the mobile phone for the user to use. With this technical scheme, a user who wants to extract the audio content of person 1 must import the whole audio into a professional audio processing application on another device and store the processed audio separately to obtain the required audio. The steps for personalized processing of the audio content are therefore cumbersome and time-consuming, which reduces working efficiency.

In the audio content adjusting method provided by the embodiment of the application, after receiving from the user a first input on the target audio track identifier corresponding to person 1 in the audio content of a video, the mobile phone may perform an extraction editing operation on the target audio content corresponding to the target audio track identifier. Therefore, when a user needs to edit a file containing audio in the mobile phone, the mobile phone can process the audio according to the input of the user, extract the data corresponding to the target audio track required by the user according to the editing requirement of the user, and edit the target audio track, so that the steps of operating the mobile phone are reduced and the use efficiency of the mobile phone is improved.

The present embodiment provides an audio content adjusting method, as shown in fig. 1, which is applied to an electronic device, and includes the following steps 301 and 302:

Step 301: The audio content adjusting device receives a first input on a target audio track in the audio content.

In the embodiment of the present application, the audio content may be audio content in any audio in the electronic device.

For example, the audio content may be audio content of audio in an audio application in the electronic device, for example, audio content of any audio in a music application; or, the audio content corresponding to the audio portion of the video-type application, for example, the audio content corresponding to the audio portion of the video in the video application.

In the embodiment of the present application, the target audio track may be any audio track in the audio content.

In an embodiment of the present application, the first input may be used to select the target track.

In this embodiment of the application, the first input may be a touch input, for example, a click input, a voice input, or an input of a specific gesture, which is not limited in this embodiment of the application.

Step 302: In response to the first input, the audio content adjusting device executes a target operation on the target audio content corresponding to the target audio track.

In the embodiment of the present application, the target operation is used to edit the audio content of the target track.

For example, the target operation may be adjusting the playing speed, the playing volume, the tone color, or the audio quality of the target audio content corresponding to the target audio track; it may also be clipping or synthesizing the target audio content corresponding to the target audio track.

It should be noted that the above audio quality means tone quality. For example, a music application may provide the same audio in smooth tone quality and in high-quality tone quality, depending on the data content contained in the audio. The data content of the high-quality audio is richer than that of the smooth audio, which is directly reflected in the high-quality audio occupying more memory than the smooth audio.

In this embodiment, the target audio content is audio data corresponding to a target audio track.
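Steps 301 and 302 can be sketched as follows, assuming each track is held as a list of float samples keyed by a track identifier. The function names `set_volume` and `apply_target_operation` are hypothetical, and a volume change stands in for whichever target operation is chosen.

```python
from typing import Callable, Dict, List

Samples = List[float]


def set_volume(gain: float) -> Callable[[Samples], Samples]:
    """Return a target operation that scales every sample by a linear gain."""
    def operation(samples: Samples) -> Samples:
        return [s * gain for s in samples]
    return operation


def apply_target_operation(
    tracks: Dict[str, Samples],
    target_id: str,
    operation: Callable[[Samples], Samples],
) -> Dict[str, Samples]:
    """Edit only the track selected by the first input; leave the others as-is."""
    if target_id not in tracks:
        raise KeyError(f"unknown track: {target_id}")
    edited = dict(tracks)  # shallow copy so the caller's mapping is not modified
    edited[target_id] = operation(tracks[target_id])
    return edited


tracks = {"person 1": [0.5, -0.5], "background music": [0.25, 0.25]}
edited = apply_target_operation(tracks, "person 1", set_volume(0.5))
```

Passing the operation in as a function keeps the selection step (the first input) separate from the edit itself, so speed, timbre, or quality changes could be substituted without altering the selection logic.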

Optionally, in an embodiment of the present application, the audio content includes N audio tracks, the N audio tracks include the target audio track, and the N audio tracks include at least one of: the dialogue tracks of X persons in the target audio content and the background tracks in the target audio content; wherein N and X are positive integers, and X is less than or equal to N.

It will be appreciated that the audio content described above may be synthesized from N tracks.
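One simple way to picture this synthesis is sample-wise summation of the N tracks. This is only an assumption for illustration; the application does not state how the tracks are combined, and `mix_tracks` with its optional per-track gains is a hypothetical helper.

```python
from typing import Dict, List, Optional


def mix_tracks(
    tracks: Dict[str, List[float]],
    gains: Optional[Dict[str, float]] = None,
) -> List[float]:
    """Sum the tracks sample by sample, applying an optional gain per track."""
    gains = gains or {}
    length = max((len(samples) for samples in tracks.values()), default=0)
    mixed = [0.0] * length
    for name, samples in tracks.items():
        gain = gains.get(name, 1.0)  # tracks without an entry keep their level
        for i, sample in enumerate(samples):
            mixed[i] += gain * sample
    return mixed


tracks = {"person 1": [0.5, 0.25], "background music": [0.25, 0.25, 0.5]}
full_mix = mix_tracks(tracks)
voice_only = mix_tracks(tracks, {"background music": 0.0})
```

With this view, muting or attenuating a single track amounts to changing one gain before the tracks are summed again.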

Example 1: when the audio content is the audio content corresponding to the audio part of a video in a video application, the target audio content may be composed of the audio track corresponding to the background music, the dialogue audio tracks corresponding to each of the 3 persons in the video, and the audio track corresponding to the background sound, for a total of 5 audio tracks.

It can be understood that, after the audio content adjusting apparatus splits the target audio content into N audio tracks, the electronic device may extract the audio data on the N audio tracks and upload the audio data to a network server corresponding to the audio content. The network server stores in advance an audio track data sample library, which includes audio track data samples corresponding to multiple different types of audio tracks (for example, a person's dialogue track sample, a background sound sample, and a background music sample). After receiving the audio data on the N audio tracks, the network server may compare data parameters (for example, a frequency parameter, a timbre parameter, and the like) of that audio data with the data parameters of the audio track data samples, thereby determining the type of each audio track. For example, when a data parameter of the audio data of a certain track matches a person's dialogue track sample, the type of the track may be determined as a person's dialogue track.
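The comparison against the sample library could, for instance, be a nearest-match lookup. The sketch below assumes each track is summarised by two numeric parameters (a frequency value and a timbre value); the library contents, the numbers, and the use of Euclidean distance are all illustrative assumptions, not the matching rule of the application.

```python
import math
from typing import Dict, Sequence

# Hypothetical sample library: track type -> (frequency parameter, timbre parameter).
SAMPLE_LIBRARY: Dict[str, Sequence[float]] = {
    "dialogue": (200.0, 0.8),
    "background sound": (2000.0, 0.2),
    "background music": (800.0, 0.5),
}


def classify_track(
    parameters: Sequence[float],
    library: Dict[str, Sequence[float]] = SAMPLE_LIBRARY,
) -> str:
    """Return the track type whose sample parameters are closest to the input."""
    return min(library, key=lambda label: math.dist(parameters, library[label]))


track_type = classify_track((220.0, 0.7))
```

A real system would normalise the parameters and use richer features; the point here is only that the type is decided by comparing a track's parameters with stored samples.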

Illustratively, the dialogue tracks of the X persons are the audio tracks corresponding to the voices of the persons in the target audio content. For example, in combination with Example 1 above, the tracks corresponding to the 3 persons are dialogue tracks.

In one example, the dialogue tracks of the X persons may be 1 track, X tracks, or Y tracks, where Y is a positive integer smaller than X.

Illustratively, the background track is the audio track corresponding to the background sound in the target audio content. It is understood that the background sound may be any track in the target audio other than the dialogue tracks of the X persons and the background music. For example, when the target audio content contains a person's dialogue track, a track corresponding to background music, and a track corresponding to rainfall, the track corresponding to rainfall is the background track.

Therefore, the audio content adjusting device can adjust various types of target audio contents, so that the requirement of a user for individually editing the audio is fully met.

In the embodiment of the application, after receiving a first input of a target audio track in audio content, the audio content adjusting apparatus may perform a target operation on the target audio content corresponding to the target audio track, where the target operation may be used to edit the audio track content of the target audio track. Therefore, when a user needs to edit a file containing audio in the electronic equipment, the audio content adjusting device can directly process the audio according to the input of the user, extract data corresponding to the target audio track required by the user according to the editing requirement of the user, and edit the target audio track, so that the steps of the user for operating the electronic equipment are saved, and the use efficiency of the electronic equipment is improved.

Optionally, in this embodiment of the present application, before step 301, the audio content adjusting method provided in this embodiment of the present application may include the following step A1 and step A2:

Step A1: The audio content adjusting device receives a second input from the user while playing the target audio corresponding to the target audio content.

Step A2: In response to the second input, the audio content adjusting device displays the playing interface of the target audio.

On the basis of step A1 and step A2, the audio content adjusting method provided by the embodiment of the present application may further include the following step A3:

Step A3: The audio content adjusting device receives from the user a first input on a target audio track identifier among the N audio track identifiers.

Illustratively, the playback interface displays track identifiers of N tracks of the target audio, where the target track is a track corresponding to the target track identifier.

For example, the second input may be a touch input, for example, a click input; it may also be a voice input or a gesture input, which is not limited in the embodiments of the present application.

Illustratively, the second input is used for triggering the electronic device to start a processing function for the audio content.

Illustratively, the processing function may comprise at least one of: splitting the audio into different audio tracks, and extracting a specific audio track from the audio.

In one example, the second input may be an input by the user on an audio content processing control, and the electronic device starts the processing function for the audio content after receiving the second input. The audio content processing control can be displayed in a display interface containing audio content. For example, when the audio content is audio in a music application, the audio content processing control may be displayed on the playing interface for playing the audio.

For example, the track identifier may be an identifier indicating a particular track, where one track corresponds to one track identifier.

For example, the audio track identifier may be an image identifier, a text identifier, or a digital identifier, which is not limited in this embodiment of the application.

In the embodiment of the present application, the track identifier may be displayed on any interface.

For example, the audio track identifier may be displayed on the interface where the target audio is located; it may also be displayed on a floating interface, where the floating interface may be displayed on the interface where the target audio is located or on any interface of the electronic device, which is not limited in this embodiment of the application.

Example 2: with reference to Example 1, when the target audio is the audio content corresponding to the audio part of a video in a video application, the track identifiers corresponding to the target audio content may include track identifier 1 of the audio track corresponding to the background music; track identifier 2, track identifier 3, and track identifier 4 of the dialogue audio tracks corresponding to the 3 persons; and track identifier 5 of the audio track corresponding to the background sound.

Further, after determining the type of the audio data, the electronic device may further distinguish audio tracks of the same type according to different parameters of the audio track data and add audio track identifiers to the audio tracks. For example, when the network server determines that 3 audio tracks in the same audio are persons' dialogue tracks, it may further distinguish the 3 audio tracks according to their different voiceprint information and add the audio track identifiers person 1, person 2, and person 3 to them.
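A minimal sketch of this labelling step, assuming each dialogue track has already been reduced to a numeric voiceprint vector: tracks whose voiceprints lie within a distance threshold of a known speaker share that speaker's identifier, and otherwise a new identifier is issued. The function name, the threshold, and the vector representation are hypothetical.

```python
import math
from typing import List, Sequence


def label_dialogue_tracks(
    voiceprints: List[Sequence[float]],
    threshold: float = 0.5,
) -> List[str]:
    """Assign 'person k' identifiers by grouping similar voiceprints."""
    known: List[Sequence[float]] = []  # one reference voiceprint per speaker
    labels: List[str] = []
    for voiceprint in voiceprints:
        for index, reference in enumerate(known):
            if math.dist(voiceprint, reference) <= threshold:
                labels.append(f"person {index + 1}")
                break
        else:
            # No known speaker is close enough, so register a new one.
            known.append(voiceprint)
            labels.append(f"person {len(known)}")
    return labels


labels = label_dialogue_tracks([(0.0, 0.0), (1.0, 1.0), (0.1, 0.0), (3.0, 3.0)])
```

In this example the third voiceprint is close to the first, so both receive the identifier "person 1".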

It should be noted that, a specific audio track data sample and an audio track identifier corresponding to the sample may be pre-stored in the network server. For example, a documentary dialogue track sample can be pre-stored in the network server, and a text type track identifier "documentary" can be added to the documentary dialogue track sample.

Illustratively, the target track is the track corresponding to the target track identifier.

Illustratively, the target track may be any of N tracks.

Illustratively, the above-described track identification may also be used to select different tracks in the audio.

In one example, the track identification can be a track control that is displayed in the form of a track identification.

Example 3: with reference to Examples 1 and 2 above, track identifier 1 may be used to select the track corresponding to the background music; track identifier 2, track identifier 3, and track identifier 4 may be used to select the dialogue tracks corresponding to the 3 persons; and track identifier 5 may be used to select the track corresponding to the background sound.

In an embodiment of the application, the first input may be used to select the target track.

In this embodiment of the application, the first input may be a touch input, for example, a click input; it may also be a voice input or an input of a specific gesture, which is not limited in this embodiment of the application.

The following two cases are exemplified for N tracks in the target audio:

In the first case: the N tracks include the dialogue tracks of X persons in the target audio.

Example 1: assume that the electronic device is playing video 1 in a video application, the audio portion of video 1 (i.e., the target audio) includes the dialogue of 3 persons (person 1, person 2, and person 3 converse in the video), and the user needs to adjust the volume of the audio content corresponding to person 1. As shown in (a) in fig. 2, the electronic device receives a click input (i.e., the second input) from the user on the audio processing control 32 on the video interface 31. After receiving the click input, as shown in (b) in fig. 2, the electronic device displays a floating window 33 on the video interface 31 and displays, in the floating window, the track identifiers of the 3 tracks of the audio portion, namely, the identifier 34 of person 1, the identifier 35 of person 2, and the identifier 36 of person 3. Then, the electronic device receives a click input (i.e., the first input) from the user on the identifier 34 of person 1 among the 3 track identifiers. After receiving this click input, as shown in (c) in fig. 2, the electronic device displays in the floating window the adjustment identifiers corresponding to the track of person 1, including a playing speed adjusting control, a volume adjusting control 37, and a track switch identifier. The user can control the volume of the track of person 1 in the video by operating the volume adjusting control 37 (i.e., the above target operation), thereby adjusting the audio parameters of part of the audio in the video in a personalized manner.

In the second case: the N tracks include a background track in the target audio.

Example 2: assume that the electronic device is playing video 1 in a video application, the audio portion of video 1 (i.e., the target audio) includes the dialogue of 3 persons (person 1, person 2, and person 3 converse in the video) as well as background music and background sound, and the user needs to turn off the background sound. As shown in (a) in fig. 2, the electronic device receives a click input (i.e., the second input) from the user on the audio processing control 32 on the video interface 31. After receiving the click input, as shown in (a) in fig. 3, the electronic device displays a floating window 33 on the video interface 31 and displays, in the floating window, the track identifiers of the 5 tracks of the audio portion, namely, the identifier 34 of person 1, the identifier 35 of person 2, the identifier 36 of person 3, the identifier 41 of the background music, and the identifier 42 of the background sound. Then, the electronic device receives a click input (i.e., the first input) from the user on the identifier 42 of the background sound among the 5 track identifiers. After receiving this click input, as shown in (b) in fig. 3, the electronic device displays in the floating window the adjustment identifiers corresponding to the track of the background sound, including a playing speed adjusting control, a volume adjusting control, and a track switch identifier 43. When the user then clicks the track switch identifier 43, the electronic device turns off the background sound of the audio portion. Thus, when the electronic device plays the video, the user no longer hears the background sound.
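The per-track controls shown in the floating window in the two examples above can be sketched as a small state object per track. `TrackControl`, its fields, and `effective_gain` are hypothetical names, and representing "off" as a gain of zero is an assumption made for illustration.

```python
from dataclasses import dataclass


@dataclass
class TrackControl:
    """Hypothetical state behind the volume control and track switch of one track."""
    identifier: str
    volume: float = 1.0   # position of the volume adjusting control (linear gain)
    enabled: bool = True  # state of the track switch

    def toggle(self) -> None:
        """Flip the track switch, as a click on the switch identifier would."""
        self.enabled = not self.enabled

    def effective_gain(self) -> float:
        """Gain to apply to this track when the audio is played."""
        return self.volume if self.enabled else 0.0


names = ["person 1", "person 2", "person 3", "background music", "background sound"]
controls = {name: TrackControl(name) for name in names}
controls["person 1"].volume = 0.5        # the volume adjustment of the first case
controls["background sound"].toggle()    # the track switch click of the second case
gains = {name: control.effective_gain() for name, control in controls.items()}
```

The resulting gains could then be applied to the individual tracks before they are combined for playback.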

For example, when the target audio is an audio file in the electronic device, playing the target audio corresponding to the target audio content means playing the audio file; for example, when the target audio is the audio in a music file, playing the target audio means playing the music file. When the target audio is the audio part of a file of another type in the electronic device, playing the target audio corresponding to the target audio content means playing that file; for example, when the target audio is the audio part of a video file, playing the target audio means playing the video file.

For example, the playing interface may be a playing interface of a file corresponding to the target audio, for example, when the file corresponding to the target audio is an audio file, the playing interface is a playing interface of the audio file, and when the file corresponding to the target audio is a video file, the playing interface is a playing interface corresponding to the video file.

Example 3: with reference to the foregoing Example 1, after receiving the click input (i.e., the foregoing second input) from the user on the audio processing control 32 on the video interface 31, the electronic device displays, in the video interface 31, the track identifiers of the 3 tracks of the audio portion, namely, the identifier 34 of person 1, the identifier 35 of person 2, and the identifier 36 of person 3.

Therefore, after receiving the data of the target audio that the user needs to process, the audio content adjusting device can directly display the audio track identifiers on the playing interface for the user to view and select, so that the user can adjust the audio parameters of the target audio in a personalized manner, which improves the efficiency of using the electronic equipment.

Optionally, in an embodiment of the present application, the N tracks include the dialogue tracks of X persons in the target audio. Based on this, in the above step 302, the audio content adjusting method provided by the embodiment of the present application may include the following step B:

Step B: While playing the target audio, the audio content adjusting device displays the audio track identifiers of the dialogue tracks of the X persons according to the real-time playing content of the target audio.

For example, the audio content adjusting device may display the corresponding track identifiers one after another, following the real-time order in which the X persons make sounds.

Example 4: with reference to Example 1 above, after the electronic device receives the click input (i.e., the second input) from the user on the audio processing control 32 on the video interface 31, then, as video 1 plays, the electronic device displays the identifier 34 of person 1 in the video interface 31 when person 1 speaks, displays the identifier 35 of person 2 when person 2 speaks, and displays the identifier 36 of person 3 when person 3 speaks.

Therefore, the audio content adjusting device can display the audio track identifier corresponding to each person in real time, according to the times at which the X persons in the target audio make sounds, so that the user can conveniently and accurately locate the voice of the required person and accurately identify the audio track corresponding to that person.
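Step B can be sketched as a lookup from the current playback time to the identifiers of the persons who are speaking at that moment. The sketch assumes that speaking intervals (start and end times in seconds) are already known for every dialogue track; how they are obtained is not covered here, and `active_identifiers` is a hypothetical helper.

```python
from typing import Dict, List, Tuple

Interval = Tuple[float, float]  # (start, end) in seconds, end exclusive


def active_identifiers(
    speech_intervals: Dict[str, List[Interval]],
    playback_time: float,
) -> List[str]:
    """Return the track identifiers to display at the given playback time."""
    return [
        identifier
        for identifier, intervals in speech_intervals.items()
        if any(start <= playback_time < end for start, end in intervals)
    ]


speech_intervals = {
    "person 1": [(0.0, 2.0), (5.0, 6.0)],
    "person 2": [(2.0, 4.0)],
    "person 3": [(3.5, 5.0)],
}
shown_at_1s = active_identifiers(speech_intervals, 1.0)
shown_at_3_7s = active_identifiers(speech_intervals, 3.7)
```

Calling the function on each playback tick yields the identifiers that should currently be visible, including the case where two persons speak at once.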

Optionally, in this embodiment of the present application, in step 301, the audio content adjusting method provided in this embodiment of the present application may include step C:

Step C: The audio content adjusting device receives a first input on a target area in a video frame.

Illustratively, the video content includes the audio content, different regions in the video frame correspond to different tracks, and the target region corresponds to the target track.

Illustratively, the target area may be any area in a video picture.

For example, the audio content adjusting device may divide the video frame into one or more different regions according to different display contents in the video frame.

In an example, the region where the person is located in the video picture is a person region, and the region where the background is located is a background region.

For example, the first input may be an input on any region of the video picture, and it is used to select the track corresponding to that region.

Therefore, when the electronic equipment displays the video picture, the audio content adjusting device can select the audio track corresponding to the area by receiving the input of the user to the area displaying different contents in the video picture, further execute the target operation required by the user and carry out editing operation on the audio track, thereby further simplifying the operation steps of editing the target audio track in the video picture by the user and improving the efficiency of using the electronic equipment by the user.

Optionally, in this embodiment of the application, the video picture includes Y persons and a background area; when the target area is the area where a target person among the Y persons is located, the target track is the track corresponding to the target person; when the target area is the background area, the target track is a background track.

For example, after the audio content adjusting apparatus receives a first input of the area where the target person is located from the user, an audio track corresponding to the target person may be selected, and then a target operation is performed on the audio track; when the audio content adjusting device receives a first input of the user to the background area, the audio track corresponding to the background area can be selected, and then the target operation is executed on the audio track.

Therefore, when the electronic equipment displays the video picture, the user can select the audio track corresponding to the area by directly performing first input on the corresponding area of the video picture, and then the target operation required by the user is executed, so that the flexibility and richness of the user for controlling the video audio track are enhanced, and the experience richness of the user is also improved.
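The mapping from a touched area to an audio track can be sketched as a simple hit test. This is an illustration under stated assumptions, not the method of the application: person areas are assumed to be axis-aligned rectangles in pixel coordinates, any point outside every person area is treated as the background area, and the names `Region`, `BACKGROUND_TRACK` and `track_for_touch` are hypothetical.

```python
from dataclasses import dataclass
from typing import List

BACKGROUND_TRACK = "background"  # hypothetical identifier of the background track


@dataclass(frozen=True)
class Region:
    """Rectangular area of the video picture occupied by one person (assumed shape)."""
    track_id: str
    left: int
    top: int
    right: int
    bottom: int

    def contains(self, x: int, y: int) -> bool:
        return self.left <= x < self.right and self.top <= y < self.bottom


def track_for_touch(person_regions: List[Region], x: int, y: int) -> str:
    """Return the track selected by a first input at picture coordinates (x, y)."""
    for region in person_regions:
        if region.contains(x, y):
            return region.track_id
    # No person area was hit, so the input falls in the background area.
    return BACKGROUND_TRACK


# Illustrative layout only: two persons in a 1920x1080 picture.
regions = [
    Region("person_1", 100, 200, 500, 900),
    Region("person_2", 1200, 250, 1600, 950),
]
```

The returned identifier is then the target track on which the target operation would be performed.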

Optionally, in this embodiment of the present application, in the step 302, the audio content adjusting method provided in this embodiment of the present application may include the following step D1:

Step D1: the audio content adjusting device performs the target operation on the target audio content of the target audio track in the target audio to obtain the edited target audio content.

Based on the step D1, after the step 302, the audio content adjusting method provided by the embodiment of the present application may include the following step D2:

Step D2: the audio content adjusting device stores the edited target audio content.

For example, the above target operation may refer to the foregoing description, and is not described herein again.

In one example, the target operation may be a clipping operation, for example, cutting a portion of audio content from the complete audio content of a certain track.

Example 5: assume that the electronic device is playing video 1 in a video application, that the audio portion of video 1 (i.e., the target audio) contains the dialogue of 3 persons (person 1, person 2 and person 3 each speak in the video), and that the user only needs the electronic device to play part of the audio content corresponding to person 1. After the electronic device receives the user's click input (i.e., the second input) on the identifier 34 of person 1 among the 3 track identifiers, an adjustment identifier corresponding to the track of person 1, including an audio clip identifier, is displayed in the floating window 31. The user can start the clipping mode for the audio content corresponding to person 1 by clicking the audio clip identifier, and can clip the required audio content (the target audio content) by performing the clipping operation (i.e., the target operation) in the clipping mode. When the electronic device then receives the user's click input on the save control, the electronic device stores the clipped audio content.

Therefore, when a user needs to clip a certain part of the audio, the user can edit that audio directly, in a personalized manner, within the application to which the audio belongs, which fully meets the user's requirement for personalized audio processing.
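Steps D1 and D2 can be sketched for the clipping case as follows. The sketch assumes that a track is available as a flat list of samples at a known sample rate and that storing means keeping the clip in an in-memory dictionary; neither detail is specified by the application, and `clip_track` and `store_clip` are hypothetical names.

```python
from typing import Dict, List


def clip_track(samples: List[int], sample_rate: int,
               start_s: float, end_s: float) -> List[int]:
    """Step D1 (sketch): cut the interval [start_s, end_s) out of one track."""
    if not 0 <= start_s < end_s:
        raise ValueError("expected 0 <= start_s < end_s")
    first = int(round(start_s * sample_rate))
    # Clamp the end so that a clip running past the track is simply shortened.
    last = min(int(round(end_s * sample_rate)), len(samples))
    return samples[first:last]


def store_clip(store: Dict[str, List[int]], name: str, clip: List[int]) -> None:
    """Step D2 (sketch): keep the edited target audio content under a name."""
    store[name] = list(clip)


# Illustrative data only: a 5-second "track" sampled at 2 samples per second.
person_1_track = list(range(10))
saved: Dict[str, List[int]] = {}
store_clip(saved, "person_1_clip", clip_track(person_1_track, 2, 1.0, 3.0))
```

A real device would more likely write the clip to a file in an audio container format; the dictionary only stands in for that storage step.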

It should be noted that, in the audio content adjusting method provided in the embodiment of the present application, the execution main body may be an audio content adjusting apparatus, or a control module of the audio content adjusting apparatus for executing the audio content adjusting method. The embodiment of the present application takes an audio content adjusting apparatus as an example to execute an audio content adjusting method, and describes an audio content adjusting apparatus provided in the embodiment of the present application.

Fig. 4 is a schematic structural diagram of a possible audio content adjusting apparatus for implementing the embodiment of the present application. As shown in fig. 4, the apparatus 600 includes a receiving module 601 and a display module 602; the receiving module 601 is configured to receive a first input of a target audio track in audio content; the display module 602 is configured to execute a target operation on a target audio content corresponding to the target audio track in response to the first input received by the receiving module 601; wherein N is a positive integer; the target operation is used for editing the audio content of the target audio track.

According to the audio content adjusting apparatus provided by the embodiment of the application, after receiving the first input of the target audio track in the audio content, the audio content adjusting apparatus may perform the target operation on the target audio content corresponding to the target audio track, where the target operation may be used to edit the audio track content of the target audio track. Therefore, when a user needs to edit a file containing audio in the electronic equipment, the audio content adjusting device can directly process the audio according to the input of the user, extract data corresponding to the target audio track required by the user according to the editing requirement of the user, and edit the target audio track, so that the steps of the user for operating the electronic equipment are saved, and the use efficiency of the electronic equipment is improved.

Optionally, in an embodiment of the present application, the audio content includes N audio tracks, the N audio tracks include the target audio track, and the N audio tracks include at least one of: the dialogue tracks of X persons in the target audio and the background tracks in the target audio; wherein X is a positive integer and X is less than or equal to N.
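The relationship between the N audio tracks and the X dialogue tracks can be made concrete with a small data model. This is only a sketch: the class names `TrackKind`, `Track` and `AudioContent` and the string identifiers are hypothetical, and the example data is invented for illustration.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List


class TrackKind(Enum):
    DIALOGUE = "dialogue"
    BACKGROUND = "background"


@dataclass(frozen=True)
class Track:
    track_id: str
    kind: TrackKind


@dataclass
class AudioContent:
    tracks: List[Track]

    @property
    def n(self) -> int:
        """N: the total number of audio tracks."""
        return len(self.tracks)

    @property
    def x(self) -> int:
        """X: the number of dialogue tracks, so X <= N always holds."""
        return sum(1 for t in self.tracks if t.kind is TrackKind.DIALOGUE)

    def find(self, track_id: str) -> Track:
        """Look up the target track by its identifier."""
        for t in self.tracks:
            if t.track_id == track_id:
                return t
        raise KeyError(track_id)


# Illustrative data only: three dialogue tracks plus one background track.
video_1_audio = AudioContent([
    Track("person_1", TrackKind.DIALOGUE),
    Track("person_2", TrackKind.DIALOGUE),
    Track("person_3", TrackKind.DIALOGUE),
    Track("background", TrackKind.BACKGROUND),
])
```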

Optionally, in this embodiment of the application, the receiving module 601 is further configured to receive a second input of the user when the target audio corresponding to the target audio content is played; the display module 602 is further configured to display a playing interface of the target audio in response to the second input received by the receiving module 601, where the playing interface displays track identifiers of N tracks of the target audio, and the target track is a track corresponding to the target track identifier.

Optionally, in an embodiment of the present application, the N tracks include: the dialogue tracks of X persons in the target audio; the display module 602 is specifically configured to display the track identifiers of the dialog tracks of the X persons according to the real-time audio playing content of the target audio in the process of playing the target audio.

Optionally, in this embodiment of the application, the receiving module 601 is specifically configured to receive a first input to a target area in a video picture; wherein the video content includes the audio content, different regions in the video frame correspond to different audio tracks, and the target region corresponds to the target audio track.

The audio content adjusting device in the embodiment of the present application may be a device, or may be a component, an integrated circuit, or a chip in a terminal. The device can be mobile electronic equipment or non-mobile electronic equipment. By way of example, the mobile electronic device may be a mobile phone, a tablet computer, a notebook computer, a palm top computer, a vehicle-mounted electronic device, a wearable device, an ultra-mobile personal computer (UMPC), a netbook or a Personal Digital Assistant (PDA), and the like, and the non-mobile electronic device may be a server, a Network Attached Storage (NAS), a Personal Computer (PC), a Television (TV), a teller machine or a self-service machine, and the like, and the embodiments of the present application are not particularly limited.

The audio content adjusting apparatus in the embodiment of the present application may be an apparatus having an operating system. The operating system may be an Android operating system, an iOS operating system, or another possible operating system, and the embodiments of the present application are not specifically limited in this respect.

The audio content adjusting apparatus provided in the embodiment of the present application can implement each process implemented by the method embodiments in fig. 1 to fig. 3, and is not described herein again to avoid repetition.

It should be noted that, as shown in fig. 4, the modules that are necessarily included in the audio content adjusting apparatus 600 are illustrated by solid line boxes, such as the receiving module 601.

Optionally, as shown in fig. 5, an electronic device 800 is further provided in this embodiment of the present application, and includes a processor 801, a memory 802, and a program or an instruction stored in the memory 802 and executable on the processor 801, where the program or the instruction is executed by the processor 801 to implement each process of the foregoing audio content adjusting method embodiment, and can achieve the same technical effect, and in order to avoid repetition, details are not repeated here.

It should be noted that the electronic device in the embodiment of the present application includes the mobile electronic device and the non-mobile electronic device described above.

Fig. 6 is a schematic diagram of a hardware structure of an electronic device implementing an embodiment of the present application.

The electronic device 100 includes, but is not limited to: a radio frequency unit 101, a network module 102, an audio output unit 103, an input unit 104, a sensor 105, a display unit 106, a user input unit 107, an interface unit 108, a memory 109, and a processor 110. Wherein the user input unit 107 includes: touch panel 1071 and other input devices 1072, display unit 106 including display panel 1061, input unit 104 including image processor 1041 and microphone 1042, memory 109 may be used to store software programs (e.g., an operating system, application programs needed for at least one function), and various data.

Those skilled in the art will appreciate that the electronic device 100 may further comprise a power source (e.g., a battery) for supplying power to various components, and the power source may be logically connected to the processor 110 through a power management system, so as to implement functions of managing charging, discharging, and power consumption through the power management system. The electronic device structure shown in fig. 6 does not constitute a limitation of the electronic device, and the electronic device may include more or fewer components than those shown, combine some components, or arrange the components differently, which is not described in detail here.

The user input unit 107 is configured to receive a first input of a target audio track in the audio content; the display unit 106 is configured to perform a target operation on the target audio content corresponding to the target audio track in response to the first input received by the user input unit 107; wherein the target operation is used for editing the audio content of the target audio track.

According to the electronic equipment, after receiving a first input of a target audio track in audio content, the electronic equipment can execute a target operation on the target audio content corresponding to the target audio track, and the target operation can be used for editing the audio track content of the target audio track. Therefore, when a user needs to edit a file containing audio in the electronic equipment, the electronic equipment can directly process the audio according to the input of the user, extract data corresponding to a target audio track required by the user according to the editing requirement of the user, and edit the target audio track, so that the steps of operating the electronic equipment by the user are saved, and the use efficiency of the electronic equipment is improved.

Optionally, the user input unit 107 is configured to receive a second input from a user when a target audio corresponding to the target audio content is played; a display unit 106, configured to display a playing interface of the target audio in response to the second input received by the user input unit 107, where track identifiers of N tracks of the target audio are displayed in the playing interface, and the target track is a track corresponding to the target track identifier; the user input unit 107 is specifically configured to receive a first input of a target track identifier from the N track identifiers from a user.

Optionally, the N tracks include: the dialogue tracks of X persons in the target audio; the display unit 106 is specifically configured to display the track identifiers of the dialog tracks of the X persons according to the real-time audio playing content of the target audio in the process of playing the target audio.

Optionally, the user input unit 107 is specifically configured to receive a first input to a target area in a video picture; wherein the video content comprises the audio content, different regions in the video picture correspond to different audio tracks, and the target region corresponds to the target audio track.

It should be understood that, in the embodiment of the present application, the input Unit 104 may include a Graphics Processing Unit (GPU) 1041 and a microphone 1042, and the Graphics Processing Unit 1041 processes image data of a still picture or a video obtained by an image capturing device (such as a camera) in a video capturing mode or an image capturing mode. The display unit 106 may include a display panel 1061, and the display panel 1061 may be configured in the form of a liquid crystal display, an organic light emitting diode, or the like. The user input unit 107 includes a touch panel 1071 and other input devices 1072. The touch panel 1071 is also referred to as a touch screen. The touch panel 1071 may include two parts of a touch detection device and a touch controller. Other input devices 1072 may include, but are not limited to, a physical keyboard, function keys (e.g., volume control keys, switch keys, etc.), a trackball, a mouse, and a joystick, which are not described in detail herein. The memory 109 may be used to store software programs as well as various data including, but not limited to, application programs and an operating system. The processor 110 may integrate an application processor, which primarily handles operating systems, user interfaces, applications, etc., and a modem processor, which primarily handles wireless communications. It will be appreciated that the modem processor described above may not be integrated into the processor 110.

The embodiment of the present application further provides a readable storage medium, where a program or an instruction is stored on the readable storage medium, and when the program or the instruction is executed by a processor, the program or the instruction implements each process of the above-mentioned audio content adjusting method embodiment, and can achieve the same technical effect, and in order to avoid repetition, details are not repeated here.

The processor is the processor in the electronic device described in the above embodiment. The readable storage medium includes a computer readable storage medium, such as a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk or an optical disk, and so on.

The embodiment of the present application further provides a chip, where the chip includes a processor and a communication interface, the communication interface is coupled to the processor, and the processor is configured to run a program or an instruction to implement each process of the above-mentioned audio content adjusting method embodiment, and can achieve the same technical effect, and in order to avoid repetition, the description is omitted here.

It should be understood that the chips mentioned in the embodiments of the present application may also be referred to as a system-level chip, a system chip, a chip system, or a system-on-chip (SoC), etc.

It should be noted that, in this document, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of other like elements in a process, method, article, or apparatus that comprises the element. Further, it should be noted that the scope of the methods and apparatus of the embodiments of the present application is not limited to performing the functions in the order illustrated or discussed, but may include performing the functions in a substantially simultaneous manner or in a reverse order based on the functions involved, e.g., the methods described may be performed in an order different than that described, and various steps may be added, omitted, or combined. In addition, features described with reference to certain examples may be combined in other examples.

Through the above description of the embodiments, those skilled in the art will clearly understand that the method of the above embodiments can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware, but in many cases, the former is a better implementation manner. Based on such understanding, the technical solutions of the present application may be embodied in the form of a software product, which is stored in a storage medium (such as ROM/RAM, magnetic disk, optical disk) and includes instructions for enabling a terminal (such as a mobile phone, a computer, a server, an air conditioner, or a network device) to execute the method according to the embodiments of the present application.

While the present embodiments have been described with reference to the accompanying drawings, it is to be understood that the invention is not limited to the precise embodiments described above, which are meant to be illustrative and not restrictive, and that various changes may be made therein by those skilled in the art without departing from the spirit and scope of the invention as defined by the appended claims.

Claims

1. A method for audio content adaptation, the method comprising:

receiving a first input to a target audio track in audio content;

in response to the first input, performing a target operation on target audio content corresponding to the target audio track;

wherein the target operation is used to edit the audio content of the target audio track.

2. The method of claim 1, wherein the audio content comprises N audio tracks, wherein the N audio tracks comprise the target audio track, and wherein the N audio tracks comprise at least one of: the dialogue tracks of X characters in the target audio content and the background tracks in the target audio content;

wherein N and X are positive integers, and X is less than or equal to N.

3. The method of claim 1, wherein prior to receiving the first input for the target audio track in the audio content, the method further comprises:

receiving a second input of the user under the condition of playing a target audio corresponding to the target audio content;

responding to the second input, displaying a playing interface of the target audio, wherein audio track identifications of N audio tracks of the target audio are displayed in the playing interface, and the target audio track is an audio track corresponding to the target audio track identification;

the receiving a first input to a target audio track in audio content comprises:

and receiving a first input of a target audio track identifier in the N audio track identifiers from a user.

4. The method of claim 3, wherein the N tracks comprise: a dialog track for X people in the target audio;

the displaying of the audio track identifications of the N audio tracks of the target audio comprises:

and in the process of playing the target audio, displaying the audio track identifications of the dialogue audio tracks of the X persons according to the real-time audio playing content of the target audio.

5. The method of claim 1, wherein receiving a first input for a target audio track in audio content comprises:

receiving a first input of a target area in a video picture; wherein the video content comprises the audio content, different regions in the video picture correspond to different audio tracks, and the target region corresponds to the target audio track.

6. The method according to claim 5, wherein the video frame comprises Y persons and a background area, and when the target area is an area where a target person is located in the Y persons, the target audio track is an audio track corresponding to the target person; in a case where the target region is the background region, the target track is a background track.

7. An audio content adjusting device, comprising a receiving module and a display module;

the receiving module is used for receiving a first input of a target audio track in the audio content;

the display module is used for responding to the first input received by the receiving module and executing target operation on target audio content corresponding to the target audio track;

wherein N is a positive integer; the target operation is used to edit the audio content of the target audio track.

8. The apparatus of claim 7, wherein the audio content comprises N audio tracks, wherein the N audio tracks comprise the target audio track, and wherein the N audio tracks comprise at least one of: the dialogue tracks of X people in the target audio and the background track in the target audio;

wherein X is a positive integer and X is less than or equal to N.

9. The apparatus of claim 7,

the receiving module is further configured to receive a second input of the user under the condition that the target audio corresponding to the target audio content is played;

the display module is further configured to display a playing interface of the target audio in response to the second input received by the receiving module, where track identifiers of N tracks of the target audio are displayed in the playing interface, and the target track is a track corresponding to the target track identifier.

10. An electronic device comprising a processor, a memory, and a program or instructions stored on the memory and executable on the processor, the program or instructions when executed by the processor implementing the steps of the audio content adaptation method of any of claims 1-6.