WO2020147522A1 - Method and device for processing audio - Google Patents

Method and device for processing audio

Info

Publication number
WO2020147522A1
WO2020147522A1 PCT/CN2019/127603 CN2019127603W WO2020147522A1 WO 2020147522 A1 WO2020147522 A1 WO 2020147522A1 CN 2019127603 W CN2019127603 W CN 2019127603W WO 2020147522 A1 WO2020147522 A1 WO 2020147522A1
Authority
WO
WIPO (PCT)
Prior art keywords
audio
dubbing
track
recorded
volume
Prior art date
Application number
PCT/CN2019/127603
Other languages
French (fr)
Chinese (zh)
Inventor
思磊
Original Assignee
北京字节跳动网络技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 北京字节跳动网络技术有限公司
Publication of WO2020147522A1

Classifications

    • G PHYSICS
    • G11 INFORMATION STORAGE
    • G11B INFORMATION STORAGE BASED ON RELATIVE MOVEMENT BETWEEN RECORD CARRIER AND TRANSDUCER
    • G11B20/00 Signal processing not specific to the method of recording or reproducing; Circuits therefor
    • G11B20/10 Digital recording or reproducing
    • G11B20/10527 Audio or video recording; Data buffering arrangements
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L13/00 Speech synthesis; Text to speech systems
    • G10L13/02 Methods for producing synthetic speech; Speech synthesisers
    • G10L13/033 Voice editing, e.g. manipulating the voice of the synthesiser
    • G PHYSICS
    • G11 INFORMATION STORAGE
    • G11B INFORMATION STORAGE BASED ON RELATIVE MOVEMENT BETWEEN RECORD CARRIER AND TRANSDUCER
    • G11B20/00 Signal processing not specific to the method of recording or reproducing; Circuits therefor
    • G11B20/10 Digital recording or reproducing
    • G PHYSICS
    • G11 INFORMATION STORAGE
    • G11B INFORMATION STORAGE BASED ON RELATIVE MOVEMENT BETWEEN RECORD CARRIER AND TRANSDUCER
    • G11B20/00 Signal processing not specific to the method of recording or reproducing; Circuits therefor
    • G11B20/10 Digital recording or reproducing
    • G11B20/10527 Audio or video recording; Data buffering arrangements
    • G11B2020/10537 Audio or video recording
    • G11B2020/10546 Audio or video recording specifically adapted for audio data
    • G PHYSICS
    • G11 INFORMATION STORAGE
    • G11B INFORMATION STORAGE BASED ON RELATIVE MOVEMENT BETWEEN RECORD CARRIER AND TRANSDUCER
    • G11B20/00 Signal processing not specific to the method of recording or reproducing; Circuits therefor
    • G11B20/10 Digital recording or reproducing
    • G11B20/10527 Audio or video recording; Data buffering arrangements
    • G11B2020/10537 Audio or video recording
    • G11B2020/10546 Audio or video recording specifically adapted for audio data
    • G11B2020/10555 Audio or video recording specifically adapted for audio data wherein the frequency, the amplitude, or other characteristics of the audio signal is taken into account
    • G11B2020/10574 Audio or video recording specifically adapted for audio data wherein the frequency, the amplitude, or other characteristics of the audio signal is taken into account volume or amplitude

Definitions

  • the embodiments of the present disclosure relate to the field of computer technology, in particular to methods and devices for processing audio.
  • the embodiments of the present disclosure propose methods and apparatuses for processing audio.
  • an embodiment of the present disclosure provides a method for processing audio.
  • The method includes: in response to detecting a user-triggered start dubbing signal, adjusting the volume of the dubbing track on the audio to be dubbed to a first target volume, and adjusting the volume of the audio tracks included in the audio to be dubbed, other than the dubbing track, to a second target volume, where the dubbing track is an audio track with a preset volume added to the audio to be dubbed in advance; obtaining the audio signal to be recorded, and recording it on the dubbing track; and, in response to detecting an end dubbing signal, saving at the first target volume the to-be-recorded audio signal recorded on the dubbing track during the dubbing time period, and saving at the second target volume the audio signals, during the dubbing time period, on the audio tracks other than the dubbing track included in the audio to be dubbed.
  • The method further includes: adjusting the volume of the dubbing track to the preset volume, and adjusting the volume of the audio tracks included in the audio to be dubbed, other than the dubbing track, to the initial volume.
  • The method further includes: in response to detecting a user-triggered modify dubbing signal, displaying an interface for modifying the to-be-recorded audio signal recorded on the dubbing track; and, in response to detecting a user-triggered end modification dubbing signal, saving the modified to-be-recorded audio signal on the dubbing track.
  • the modification operation includes at least one of the following: a deletion operation, a cropping operation, and a re-recording operation.
  • The first target volume and the second target volume are each either a preset volume or a volume adjusted by the user.
  • an embodiment of the present disclosure provides an apparatus for processing audio.
  • The apparatus includes: an adjustment unit configured to, in response to detecting a user-triggered start dubbing signal, adjust the volume of the dubbing track on the audio to be dubbed to a first target volume, and adjust the volume of the audio tracks included in the audio to be dubbed, other than the dubbing track, to a second target volume, where the dubbing track is an audio track with a preset volume added to the audio to be dubbed in advance; a recording unit configured to obtain the audio signal to be recorded and record it on the dubbing track; and a saving unit configured to, in response to detecting an end dubbing signal, save at the first target volume the to-be-recorded audio signal recorded on the dubbing track during the dubbing time period, and save at the second target volume the audio signals, during the dubbing time period, on the audio tracks other than the dubbing track included in the audio to be dubbed.
  • the saving unit is further configured to: adjust the volume of the dubbed audio track to a preset volume, and adjust the volume of other audio tracks included in the audio to be dubbed except for the dubbed audio track to the initial volume.
  • The saving unit includes: a display module configured to, in response to detecting a user-triggered modify dubbing signal, display an interface for modifying the to-be-recorded audio signal recorded on the dubbing track; and a saving module configured to, in response to detecting a user-triggered end modification dubbing signal, save the modified to-be-recorded audio signal on the dubbing track.
  • the modification operation includes at least one of the following: a deletion operation, a cropping operation, and a re-recording operation.
  • The first target volume and the second target volume are each either a preset volume or a volume adjusted by the user.
  • The embodiments of the present disclosure provide a terminal device. The terminal device includes: one or more processors; and a storage device on which one or more programs are stored. When the one or more programs are executed by the one or more processors, the one or more processors implement the method described in any implementation manner of the first aspect.
  • an embodiment of the present disclosure provides a computer-readable medium on which a computer program is stored, which when executed by a processor implements the method described in any one of the implementation manners of the first aspect.
  • The method and apparatus for processing audio provided by the embodiments of the present disclosure, in response to detecting a user-triggered start dubbing signal, adjust the volume of the dubbing track on the audio to be dubbed to the first target volume and adjust the volume of the audio tracks included in the audio to be dubbed, other than the dubbing track, to the second target volume. They then obtain the audio signal to be recorded and record it on the dubbing track. In response to detecting the end dubbing signal, they save at the first target volume the to-be-recorded audio signal recorded on the dubbing track during the dubbing time period, and save at the second target volume the audio signals, during the dubbing time period, on the audio tracks other than the dubbing track.
  • Because the recording is made on an audio track added to the audio to be dubbed, dubbing can be performed without modifying the original audio to be dubbed. Setting the first target volume and the second target volume helps to better integrate the recorded audio signal with the original audio to be dubbed, and helps to flexibly dub and modify the dubbed audio.
  • Fig. 1 is an exemplary system architecture diagram in which an embodiment of the present disclosure can be applied;
  • Fig. 2 is a flowchart of one embodiment of a method for processing audio according to an embodiment of the present disclosure
  • Fig. 3 is a schematic diagram of an application scenario of the method for processing audio according to an embodiment of the present disclosure
  • Fig. 4 is a flowchart of another embodiment of a method for processing audio according to an embodiment of the present disclosure
  • Fig. 5 is a schematic structural diagram of an embodiment of an apparatus for processing audio according to an embodiment of the present disclosure
  • Fig. 6 is a schematic structural diagram of a terminal device suitable for implementing embodiments of the present disclosure.
  • FIG. 1 shows an exemplary system architecture 100 of a method for processing audio or an apparatus for processing audio to which an embodiment of the present disclosure can be applied.
  • the system architecture 100 may include terminal devices 101, 102, 103, a network 104, and a server 105.
  • the network 104 is a medium used to provide a communication link between the terminal devices 101, 102, 103 and the server 105.
  • the network 104 may include various connection types, such as wired, wireless communication links, or fiber optic cables, and so on.
  • the user can use the terminal devices 101, 102, 103 to interact with the server 105 through the network 104 to receive or send messages, and so on.
  • Various communication client applications such as audio player applications, video player applications, web browser applications, and social platform software, may be installed on the terminal devices 101, 102, and 103.
  • the terminal devices 101, 102, and 103 may be hardware or software. When the terminal devices 101, 102, 103 are hardware, they may be various electronic devices. When the terminal devices 101, 102, and 103 are software, they can be installed in the aforementioned electronic devices. It can be implemented as multiple software or software modules (for example, software or software modules used to provide distributed services), or as a single software or software module. There is no specific limit here.
  • the server 105 may be a server that provides various services, such as a background audio and video resource server that provides support for audio and video played on the terminal devices 101, 102, and 103.
  • the background audio and video resource server can send audio and video to the terminal device, and can also receive the audio and video sent by the terminal device.
  • The method for processing audio provided by the embodiments of the present disclosure is generally executed by the terminal devices 101, 102, 103, and correspondingly, the device for processing audio is generally set in the terminal devices 101, 102, 103.
  • The server can be hardware or software.
  • When the server is hardware, it can be implemented as a distributed server cluster composed of multiple servers, or as a single server.
  • When the server is software, it can be implemented as multiple software or software modules (for example, software or software modules for providing distributed services), or as a single software or software module. There is no specific limit here.
  • The numbers of terminal devices, networks, and servers in FIG. 1 are only schematic. According to the implementation needs, there can be any number of terminal devices, networks and servers.
  • the above system architecture may not include the network and the server, but only the terminal device.
  • the method for processing audio includes the following steps:
  • Step 201: In response to detecting a user-triggered start dubbing signal, adjust the volume of the dubbing track on the audio to be dubbed to the first target volume, and adjust the volume of the audio tracks included in the audio to be dubbed, other than the dubbing track, to the second target volume.
  • The execution subject of the method for processing audio may, in response to detecting the user-triggered start dubbing signal, adjust the volume of the dubbing track on the audio to be dubbed to the first target volume, and adjust the volume of the audio tracks included in the audio to be dubbed, other than the dubbing track, to the second target volume.
  • the dubbing audio track is an audio track with a preset volume added to the audio to be dubbed in advance.
  • the audio to be dubbed may be the audio selected by the user from a preset audio set (for example, the audio set stored locally by the above-mentioned executive body).
  • the preset volume can be set to 0.
  • the audio to be dubbed may be the audio obtained by the above-mentioned executive body from a remote location through a wired connection or a wireless connection in advance, or an audio obtained locally. It should be understood that the audio to be dubbed may be a separate audio file or an audio component included in the video file.
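  • The above-mentioned start dubbing signal may be a signal triggered by the user and used to indicate the start of dubbing the audio to be dubbed.
  • For example, when the user clicks a start dubbing button displayed on the screen of the execution subject, a start dubbing signal is generated.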
  • The above-mentioned first target volume and second target volume may be the volumes used when the dubbed audio is played after dubbing ends.
  • The first target volume and the second target volume may each be a preset volume or a volume adjusted by the user.
  • the above-mentioned execution subject may show the user an interface for adjusting the volume, and the user may adjust the first target volume and the second target volume used during dubbing on the interface.
  • the first target volume and the second target volume may be a set fixed volume, or may be a volume determined according to a set percentage. For example, assuming that the original volume of the audio to be dubbed is 100%, the first target volume may be set to 80% of the foregoing original volume, and the second target volume may be set to 20% of the foregoing original volume. By setting the first target volume and the second target volume, the user can more flexibly blend the dubbed audio track with the audio to be dubbed, thereby helping to improve the effect of dubbing.
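The volume adjustment of step 201 can be sketched as follows. This is a minimal illustration rather than the disclosed implementation: the `Track` class, the function names, and the convention that a volume is a linear gain with 1.0 meaning 100% of the original volume are all assumptions made for the example. Only the 80% and 20% figures come from the text above.

```python
from dataclasses import dataclass, field


@dataclass
class Track:
    """Hypothetical track model; volume is a linear gain (1.0 = 100% of original)."""
    name: str
    volume: float = 1.0
    samples: list = field(default_factory=list)


def resolve_target_volumes(original_volume, first_pct=0.8, second_pct=0.2):
    """Turn percentage settings into the first and second target volumes."""
    return original_volume * first_pct, original_volume * second_pct


def on_start_dubbing(tracks, dubbing_name, first_target, second_target):
    """Set the dubbing track to the first target and every other track to the second.

    Returns the volumes held before the change so they can be restored later.
    """
    previous = {t.name: t.volume for t in tracks}
    for t in tracks:
        t.volume = first_target if t.name == dubbing_name else second_target
    return previous


# The dubbing track was added in advance with a preset volume of 0.
tracks = [Track("music"), Track("dialogue"), Track("dub", volume=0.0)]
first, second = resolve_target_volumes(1.0)
previous = on_start_dubbing(tracks, "dub", first, second)
```

Returning the previous volumes is a design choice of this sketch; it makes it straightforward to put the tracks back to the preset and initial volumes once dubbing ends.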
  • Step 202 Obtain the audio signal to be recorded, and record the audio signal to be recorded on the dubbing audio track.
  • the above-mentioned execution subject can obtain the audio signal to be recorded, and record the audio signal to be recorded on the dubbing audio track.
  • the audio signal to be recorded may be an audio signal for recording on a dubbing audio track.
  • the audio signal to be recorded may be a pre-stored audio signal obtained remotely or locally by the above-mentioned executive body.
  • the audio signal to be recorded may be an audio signal collected in real time by the target sound collecting device.
  • the target sound collection device may be a device (such as a microphone) included in the execution subject, or a device communicatively connected with the execution subject.
  • It should be noted that the method of adding a dubbing track to the audio to be dubbed described in step 201 and the method of recording the to-be-recorded audio signal on the dubbing track described in step 202 are well-known technologies that are widely studied and applied, and are not repeated here.
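Although the text treats recording as known technology, a small sketch may help make step 202 concrete. It assumes a track is a plain list of float samples and that the recording is written at a given sample offset; the function name `record_onto_track` and the padding behaviour are hypothetical choices for illustration.

```python
def record_onto_track(track_samples, incoming, start_index):
    """Write `incoming` samples into `track_samples` beginning at `start_index`.

    The track is padded with silence (0.0) if the recording runs past its end.
    Returns the (start, end) sample range that was written.
    """
    end_index = start_index + len(incoming)
    if end_index > len(track_samples):
        track_samples.extend([0.0] * (end_index - len(track_samples)))
    track_samples[start_index:end_index] = incoming
    return start_index, end_index


dub_track = [0.0] * 6            # silent dubbing track added in advance
mic_chunk = [0.5, -0.5, 0.25]    # stand-in for samples from a microphone
written = record_onto_track(dub_track, mic_chunk, start_index=4)
```

The same function would serve for a pre-stored signal, since only the source of `incoming` differs.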
  • Step 203: In response to detecting the end dubbing signal, save at the first target volume the to-be-recorded audio signal recorded on the dubbing track during the dubbing time period, and save at the second target volume the audio signals, during the dubbing time period, on the audio tracks other than the dubbing track included in the audio to be dubbed.
  • The above-mentioned execution subject may, in response to detecting the end dubbing signal, save at the first target volume the to-be-recorded audio signal recorded on the dubbing track during the dubbing time period, and save at the second target volume the audio signals, during the dubbing time period, on the audio tracks other than the dubbing track included in the audio to be dubbed.
  • the dubbing time period is the time period during which the to-be-recorded audio signal recorded on the dubbing audio track is played.
  • The dubbing time period may be the period from the time the start dubbing signal is detected to the time the end dubbing signal is detected.
  • Alternatively, the dubbing time period may be a period that starts when the start dubbing signal is detected and whose duration equals the playing time of the audio signal to be recorded.
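The two ways of determining the dubbing time period can be written out as below. The function names, the use of seconds, and the sample rate parameter are assumptions for the example; the source does not specify units.

```python
def period_from_signals(start_time, end_time):
    """Period bounded by the start and end dubbing signals (times in seconds)."""
    if end_time < start_time:
        raise ValueError("end signal precedes start signal")
    return start_time, end_time


def period_from_duration(start_time, num_samples, sample_rate):
    """Period that starts at the start signal and lasts as long as the recording plays."""
    return start_time, start_time + num_samples / sample_rate


live = period_from_signals(12.0, 15.5)
stored = period_from_duration(12.0, num_samples=88200, sample_rate=44100)
```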
  • the above-mentioned dubbing end signal may be a signal triggered by a user and used to instruct the end of dubbing operation, or it may be a signal automatically generated by the execution subject and used to instruct the end of dubbing operation.
  • When the audio signal to be recorded is an audio signal collected in real time by the target sound collecting device, the end dubbing signal is generated when the user clicks the end dubbing button displayed on the screen of the execution subject.
  • Alternatively, a start dubbing signal is generated when it is detected that the user's finger presses the screen, and an end dubbing signal is generated when it is detected that the user's finger leaves the screen.
  • When the audio signal to be recorded is a pre-stored audio signal, the execution subject generates an end dubbing signal when it detects that the audio signal to be recorded has been completely recorded on the dubbing track.
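One way to picture how these signals could be generated is a small state machine. The class `DubbingSignalSource`, its method names, and the idea of reporting progress in sample counts are hypothetical; the sketch only mirrors the two cases described above, a press-and-hold live recording and a pre-stored signal that ends automatically.

```python
from enum import Enum, auto


class Signal(Enum):
    START_DUBBING = auto()
    END_DUBBING = auto()


class DubbingSignalSource:
    """Maps UI and recording events to start and end dubbing signals."""

    def __init__(self, prestored_length=None):
        # Number of samples in a pre-stored signal, or None for live capture.
        self.prestored_length = prestored_length
        self.dubbing = False
        self.recorded = 0

    def press(self):
        """Finger down: begins dubbing if it is not already running."""
        if self.dubbing:
            return None
        self.dubbing, self.recorded = True, 0
        return Signal.START_DUBBING

    def release(self):
        """Finger lifted: ends a live recording, ignored for pre-stored signals."""
        if not self.dubbing or self.prestored_length is not None:
            return None
        self.dubbing = False
        return Signal.END_DUBBING

    def samples_recorded(self, count):
        """Progress report: ends automatically once a pre-stored signal is fully recorded."""
        if not self.dubbing:
            return None
        self.recorded += count
        if self.prestored_length is not None and self.recorded >= self.prestored_length:
            self.dubbing = False
            return Signal.END_DUBBING
        return None


live = DubbingSignalSource()
live_events = [live.press(), live.samples_recorded(100), live.release()]

stored = DubbingSignalSource(prestored_length=300)
stored_events = [stored.press(), stored.samples_recorded(200), stored.samples_recorded(100)]
```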
  • The above-mentioned execution subject may also adjust the volume of the dubbing track to the preset volume, and adjust the volume of the audio tracks included in the audio to be dubbed, other than the dubbing track, to the initial volume.
  • The initial volume is the volume of the audio to be dubbed before the audio signal is recorded on the dubbing track. Therefore, after dubbing is finished, when the aforementioned audio to be dubbed is played, the volume of the dubbing track does not affect the playback of the audio.
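Step 203 and the optional volume reset can be sketched as follows. The source does not say how a signal is "saved at" a volume; this example assumes one plausible reading, in which the target volume is applied as a gain to the samples inside the dubbing time period. All function names and the dictionary-of-lists data layout are hypothetical.

```python
def scale(samples, start, end, gain):
    """Return a copy of `samples` with the range [start, end) multiplied by `gain`."""
    return [s * gain if start <= i < end else s for i, s in enumerate(samples)]


def save_dubbing_period(tracks, dubbing_name, period, first_target, second_target):
    """Apply the first target to the dubbing track and the second to all other tracks."""
    start, end = period
    return {
        name: scale(samples, start, end,
                    first_target if name == dubbing_name else second_target)
        for name, samples in tracks.items()
    }


def restore_volumes(volumes, dubbing_name, preset_volume, initial_volumes):
    """Set the dubbing track back to the preset volume and the others to their initial volume."""
    for name in volumes:
        volumes[name] = preset_volume if name == dubbing_name else initial_volumes[name]
    return volumes


tracks = {"music": [1.0, 1.0, 1.0, 1.0], "dub": [0.0, 0.5, 0.5, 0.0]}
saved = save_dubbing_period(tracks, "dub", period=(1, 3),
                            first_target=0.8, second_target=0.2)
volumes = restore_volumes({"music": 0.2, "dub": 0.8}, "dub",
                          preset_volume=0.0, initial_volumes={"music": 1.0})
```

Because `scale` returns copies, the input tracks are left untouched, which matches the stated goal of not modifying the original audio to be dubbed.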
  • FIG. 3 is a schematic diagram of an application scenario of the method for processing audio according to the present embodiment.
  • the terminal device 301 is playing the audio 302 to be dubbed, and the user wants to dub the audio 302 to be dubbed.
  • the terminal device 301 adds a dubbing track 3021 with a volume of zero to the audio 302 to be dubbed in advance.
  • the user presses the start dubbing button 303 on the screen of the terminal device 301, and the terminal device 301 generates a start dubbing signal.
  • When the terminal device 301 detects the start dubbing signal, it adjusts the volume of the dubbing track to the first target volume (for example, 80% of the original volume of the audio to be dubbed) and adjusts the volume of the audio tracks included in the audio to be dubbed, other than the dubbing track, to the second target volume (for example, 20% of the original volume of the audio to be dubbed), both target volumes having been preset by the user.
  • the microphone on the terminal device 301 collects the user's voice, generates an audio signal 304 to be recorded, and records the audio signal 304 to be recorded on the dubbing audio track 3021.
  • When the user's finger is lifted from the start dubbing button 303, the terminal device 301 generates an end dubbing signal. The terminal device 301 then saves at the first target volume the to-be-recorded audio signal recorded on the dubbing track during the dubbing time period, and saves at the second target volume the audio signals, during the dubbing time period, on the audio tracks other than the dubbing track included in the audio to be dubbed, thereby obtaining the dubbed audio 305.
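The scenario does not describe how the dubbed audio 305 is rendered for playback. As an illustration only, the sketch below assumes that a player sums the tracks sample by sample after applying each track's gain and clamps the result to full scale. The gains 0.25 and 0.75 are arbitrary example values, not the 20% and 80% of the scenario, and `mix_down` is a hypothetical helper.

```python
def mix_down(tracks, gains):
    """Sum equally long tracks sample by sample after applying each track's gain."""
    length = len(next(iter(tracks.values())))
    if any(len(samples) != length for samples in tracks.values()):
        raise ValueError("all tracks must have the same length")
    mixed = [0.0] * length
    for name, samples in tracks.items():
        for i, s in enumerate(samples):
            mixed[i] += s * gains[name]
    # Clamp to [-1.0, 1.0] so the sum cannot exceed full scale.
    return [max(-1.0, min(1.0, s)) for s in mixed]


tracks = {
    "original": [0.5, 0.5, -0.5, 1.0],   # stand-in for the audio 302
    "dub":      [0.0, 1.0,  1.0, 1.0],   # stand-in for the dubbing track 3021
}
dubbed = mix_down(tracks, {"original": 0.25, "dub": 0.75})
```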
  • The method provided by the above-mentioned embodiment of the present disclosure, in response to detecting a user-triggered start dubbing signal, adjusts the volume of the dubbing track on the audio to be dubbed to the first target volume and adjusts the volume of the audio tracks included in the audio to be dubbed, other than the dubbing track, to the second target volume. It then obtains the audio signal to be recorded and records it on the dubbing track. In response to detecting the end dubbing signal, it saves at the first target volume the to-be-recorded audio signal recorded on the dubbing track during the dubbing time period, and saves at the second target volume the audio signals, during the dubbing time period, on the audio tracks other than the dubbing track included in the audio to be dubbed.
  • Because the recording is made on an audio track added to the audio to be dubbed, dubbing can be performed without modifying the original audio to be dubbed. Setting the first target volume and the second target volume helps to better integrate the recorded audio signal with the original audio to be dubbed, which helps to flexibly dub and modify the dubbed audio.
  • FIG. 4 shows a flow 400 of still another embodiment of a method for processing audio.
  • the process 400 of the method for processing audio includes the following steps:
  • Step 401: In response to detecting a user-triggered start dubbing signal, adjust the volume of the dubbing track on the audio to be dubbed to the first target volume, and adjust the volume of the audio tracks included in the audio to be dubbed, other than the dubbing track, to the second target volume.
  • step 401 is basically the same as step 201 in the embodiment corresponding to FIG. 2, and will not be repeated here.
  • Step 402 Obtain the audio signal to be recorded, and record the audio signal to be recorded on the dubbing audio track.
  • step 402 is basically the same as step 202 in the embodiment corresponding to FIG. 2, and will not be repeated here.
  • Step 403: In response to detecting the end dubbing signal, save at the first target volume the to-be-recorded audio signal recorded on the dubbing track during the dubbing time period, and save at the second target volume the audio signals, during the dubbing time period, on the audio tracks other than the dubbing track included in the audio to be dubbed.
  • step 403 is basically the same as step 203 in the embodiment corresponding to FIG. 2, and will not be repeated here.
  • Step 404: In response to detecting a user-triggered modify dubbing signal, display an interface for modifying the to-be-recorded audio signal recorded on the dubbing track.
  • The execution subject of the method for processing audio may, in response to detecting the user-triggered modify dubbing signal, display an interface for modifying the to-be-recorded audio signal recorded on the dubbing track.
  • The modify dubbing signal may be a signal triggered by the user and used to indicate that the user wants to modify the dubbing signal on the saved dubbing track.
  • For example, when the user clicks a modify dubbing button displayed on the screen of the execution subject, a modify dubbing signal is generated.
  • An interface for modifying the to-be-recorded audio signal recorded on the dubbing track is then displayed on the screen, and the user can use this interface to control the execution subject to perform modification operations on the to-be-recorded audio signal on the dubbing track.
  • the modification operation may include at least one of the following: a deletion operation, a cropping operation, and a re-recording operation.
  • the delete operation can be used to delete the to-be-recorded audio signal on the dubbing audio track.
  • the cropping operation can be used to delete part of the audio signal to be recorded on the dubbing audio track.
  • The re-recording operation can be used to replace the to-be-recorded audio signal on the dubbing track with a newly recorded audio signal.
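The three modification operations can be sketched on a list of samples as below. The source does not state whether deleted material is removed from the timeline or replaced with silence; this example assumes silence so that the dubbing track stays aligned with the other tracks. The function names are hypothetical.

```python
def delete_recording(samples):
    """Deletion: silence the whole recording while keeping the track length."""
    return [0.0] * len(samples)


def crop_recording(samples, start, end):
    """Cropping: silence only the part in the range [start, end)."""
    return [0.0 if start <= i < end else s for i, s in enumerate(samples)]


def rerecord(samples, new_samples, start):
    """Re-recording: overwrite the old recording from `start` with a new take."""
    end = min(len(samples), start + len(new_samples))
    return samples[:start] + new_samples[:end - start] + samples[end:]


dub = [0.0, 0.5, 0.5, 0.5, 0.0]
deleted = delete_recording(dub)
cropped = crop_recording(dub, 1, 2)
retaken = rerecord(dub, [0.25, 0.25], start=2)
```

Each function returns a new list, so the saved dubbing track is only replaced once the user confirms the modification.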
  • Step 405: In response to detecting a user-triggered end modification dubbing signal, save the modified to-be-recorded audio signal on the dubbing track.
  • The above-mentioned execution subject may, in response to detecting the user-triggered end modification dubbing signal, save the modified to-be-recorded audio signal on the dubbing track.
  • The above-mentioned end modification dubbing signal may be a signal triggered by the user and used to indicate that the user has finished modifying the dubbing signal on the saved dubbing track.
  • For example, when the user clicks the end modification dubbing button displayed on the screen of the execution subject, the end modification dubbing signal is generated. Then, the above-mentioned execution subject saves the dubbing track after the modification operation has been performed.
  • the process 400 of the method for processing audio in this embodiment highlights the step of modifying the dubbing audio track. Therefore, the solution described in this embodiment can flexibly modify the dubbed audio to be dubbed without affecting the original audio to be dubbed, thereby further improving the flexibility of dubbing.
  • the present disclosure provides an embodiment of a device for processing audio.
  • the device embodiment corresponds to the method embodiment shown in FIG.
  • the device can be applied to various electronic devices.
  • The apparatus 500 for processing audio in this embodiment includes: an adjustment unit 501 configured to, in response to detecting a user-triggered start dubbing signal, adjust the volume of the dubbing track on the audio to be dubbed to the first target volume, and adjust the volume of the audio tracks included in the audio to be dubbed, other than the dubbing track, to the second target volume, where the dubbing track is an audio track with a preset volume added to the audio to be dubbed in advance; a recording unit 502 configured to obtain the audio signal to be recorded and record it on the dubbing track; and a saving unit 503 configured to, in response to detecting the end dubbing signal, save at the first target volume the to-be-recorded audio signal recorded on the dubbing track during the dubbing time period, and save at the second target volume the audio signals, during the dubbing time period, on the audio tracks other than the dubbing track included in the audio to be dubbed.
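The division of the apparatus into three units can be pictured with a small class sketch. The class and method names are hypothetical, and the units are reduced to the bare minimum: the adjustment unit returns new per-track volumes, the recording unit appends samples, and the saving unit returns a record of what would be stored.

```python
class AdjustmentUnit:
    """Stand-in for unit 501: sets per-track volumes when dubbing starts."""

    def adjust(self, volumes, dubbing_name, first_target, second_target):
        return {name: first_target if name == dubbing_name else second_target
                for name in volumes}


class RecordingUnit:
    """Stand-in for unit 502: appends incoming samples to the dubbing track."""

    def record(self, dubbing_samples, incoming):
        return dubbing_samples + list(incoming)


class SavingUnit:
    """Stand-in for unit 503: stores the recording together with the volumes in effect."""

    def save(self, dubbing_samples, volumes):
        return {"dubbing_samples": list(dubbing_samples), "volumes": dict(volumes)}


class AudioProcessingApparatus:
    """Stand-in for apparatus 500, wired from the three units."""

    def __init__(self):
        self.adjustment = AdjustmentUnit()
        self.recording = RecordingUnit()
        self.saving = SavingUnit()

    def dub(self, volumes, dubbing_name, incoming, first_target, second_target):
        adjusted = self.adjustment.adjust(volumes, dubbing_name, first_target, second_target)
        recorded = self.recording.record([], incoming)
        return self.saving.save(recorded, adjusted)


result = AudioProcessingApparatus().dub(
    {"music": 1.0, "dub": 0.0}, "dub", [0.5, 0.25], first_target=0.8, second_target=0.2)
```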
  • The adjustment unit 501 may, in response to detecting the user-triggered start dubbing signal, adjust the volume of the dubbing track on the audio to be dubbed to the first target volume, and adjust the volume of the audio tracks included in the audio to be dubbed, other than the dubbing track, to the second target volume.
  • the dubbing audio track is an audio track with a preset volume added to the audio to be dubbed in advance. Usually, the preset volume can be set to 0.
  • the audio to be dubbed may be the audio obtained by the foregoing device 500 from a remote location through a wired connection or a wireless connection in advance, or an audio obtained locally.
  • the audio to be dubbed may be a separate audio file or an audio component included in the video file.
  • The above-mentioned start dubbing signal may be a signal triggered by the user and used to indicate the start of dubbing the audio to be dubbed.
  • For example, when the user clicks a start dubbing button displayed on the screen of the apparatus 500, a start dubbing signal is generated.
  • The above-mentioned first target volume and second target volume may be the volumes used when the dubbed audio is played after dubbing ends.
  • The recording unit 502 can obtain the audio signal to be recorded, and record the audio signal to be recorded on the dubbing track.
  • the audio signal to be recorded may be an audio signal used for recording on a dubbing audio track.
  • the audio signal to be recorded may be a pre-stored audio signal obtained remotely or locally by the aforementioned apparatus 500.
  • the audio signal to be recorded may be an audio signal collected in real time by the target sound collecting device.
  • the target sound collection device may be a device (such as a microphone) included in the foregoing device 500, or may be a device communicatively connected to the foregoing device 500.
  • The saving unit 503 may, in response to detecting the end dubbing signal, save at the first target volume the to-be-recorded audio signal recorded on the dubbing track during the dubbing time period, and save at the second target volume the audio signals, during the dubbing time period, on the audio tracks other than the dubbing track included in the audio to be dubbed.
  • the dubbing time period is the time period during which the to-be-recorded audio signal recorded on the dubbing audio track is played.
  • The dubbing time period may be the period from the time the start dubbing signal is detected to the time the end dubbing signal is detected.
  • Alternatively, the dubbing time period may be a period that starts when the start dubbing signal is detected and whose duration equals the playing time of the audio signal to be recorded.
  • the above-mentioned dubbing end signal may be a signal triggered by a user and used to instruct the end of dubbing operation, or it may be a signal automatically generated by the above-mentioned apparatus 500 and used to instruct the end of dubbing operation.
  • When the audio signal to be recorded is an audio signal collected in real time by the target sound collection device, an end dubbing signal is generated when the user clicks the end dubbing button displayed on the screen of the apparatus 500.
  • Alternatively, a start dubbing signal is generated when it is detected that the user's finger presses the screen, and an end dubbing signal is generated when it is detected that the user's finger leaves the screen.
  • When the audio signal to be recorded is a pre-stored audio signal, the apparatus 500 generates an end dubbing signal when it detects that the audio signal to be recorded has been completely recorded on the dubbing track.
  • The saving unit 503 may be further configured to: adjust the volume of the dubbing track to the preset volume, and adjust the volume of the audio tracks included in the audio to be dubbed, other than the dubbing track, to the initial volume.
  • The saving unit 503 may include: a display module (not shown in the figure) configured to, in response to detecting a user-triggered modify dubbing signal, display an interface for modifying the to-be-recorded audio signal recorded on the dubbing track; and a saving module (not shown in the figure) configured to, in response to detecting a user-triggered end modification dubbing signal, save the modified to-be-recorded audio signal on the dubbing track.
  • the modification operation includes at least one of the following: a deletion operation, a cropping operation, and a re-recording operation.
  • the first target volume and the second target volume are each either a preset volume or a volume adjusted by the user.
  • the device provided in the above-mentioned embodiment of the present disclosure, in response to detecting a user-triggered start dubbing signal, adjusts the volume of the dubbing track on the audio to be dubbed to the first target volume and adjusts the volume of the audio tracks, other than the dubbing track, included in the audio to be dubbed to the second target volume; it then acquires the audio signal to be recorded and records it on the dubbing track; and, in response to detecting the end dubbing signal, it saves at the first target volume the to-be-recorded audio signal recorded on the dubbing track during the dubbing time period, and saves at the second target volume the to-be-recorded audio signals on the audio tracks, other than the dubbing track, included in the audio to be dubbed during the dubbing time period. Adding an audio track to the audio to be dubbed in this way enables dubbing without modifying the original audio to be dubbed, which helps to flexibly dub the audio to be dubbed and modify the dubbing.
  • FIG. 6 shows a schematic structural diagram of a terminal device 600 suitable for implementing embodiments of the present disclosure.
  • the terminal devices in the embodiments of the present disclosure may include, but are not limited to, mobile terminals such as mobile phones, notebook computers, digital broadcast receivers, PDAs (personal digital assistants), PADs (tablet computers), PMPs (portable multimedia players), and in-vehicle terminals (for example, car navigation terminals), as well as fixed terminals such as digital TVs and desktop computers.
  • the terminal device shown in FIG. 6 is only an example, and should not bring any limitation to the functions and use scope of the embodiments of the present disclosure.
  • the terminal device 600 may include a processing device (such as a central processing unit, a graphics processor, etc.) 601, which may execute various appropriate actions and processing according to a program stored in a read-only memory (ROM) 602 or a program loaded from a storage device 608 into a random access memory (RAM) 603.
  • the RAM 603 also stores various programs and data required for the operation of the terminal device 600.
  • the processing device 601, ROM 602, and RAM 603 are connected to each other via a bus 604.
  • An input/output (I/O) interface 605 is also connected to the bus 604.
  • the following devices can be connected to the I/O interface 605: input devices 606 including, for example, a touch screen, touch pad, keyboard, mouse, camera, microphone, accelerometer, gyroscope, etc.; output devices 607 including, for example, a liquid crystal display (LCD), speaker, vibrator, etc.; storage devices 608 including, for example, a magnetic tape, a hard disk, etc.; and a communication device 609.
  • the communication device 609 may allow the terminal device 600 to perform wireless or wired communication with other devices to exchange data.
  • although FIG. 6 shows a terminal device 600 having various devices, it should be understood that it is not required to implement or have all the illustrated devices. More or fewer devices may alternatively be implemented or provided.
  • embodiments of the present disclosure include a computer program product that includes a computer program carried on a computer-readable medium, the computer program containing program code for performing the method shown in the flowchart.
  • the computer program may be downloaded and installed from the network through the communication device 609, or from the storage device 608, or from the ROM 602.
  • when the computer program is executed by the processing device 601, the above-described functions defined in the method of the embodiments of the present disclosure are executed.
  • the computer-readable medium described in the present disclosure may be a computer-readable signal medium or a computer-readable storage medium, or any combination of the two.
  • the computer-readable storage medium may be, for example, but not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the above. More specific examples of computer-readable storage media may include, but are not limited to: an electrical connection with one or more wires, a portable computer disk, a hard disk, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), optical fiber, portable compact disk read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
  • the computer-readable storage medium may be any tangible medium that contains or stores a program, and the program may be used by or in combination with an instruction execution system, apparatus, or device.
  • the computer-readable signal medium may include a data signal that is propagated in baseband or as part of a carrier wave, in which computer-readable program code is carried. This propagated data signal can take many forms, including but not limited to electromagnetic signals, optical signals, or any suitable combination of the foregoing.
  • the computer-readable signal medium may also be any computer-readable medium other than a computer-readable storage medium, and the computer-readable signal medium may send, propagate, or transmit a program for use by or in combination with an instruction execution system, apparatus, or device.
  • the program code contained on the computer-readable medium may be transmitted using any appropriate medium, including but not limited to: electric wires, optical cables, RF (radio frequency), etc., or any suitable combination of the foregoing.
  • the above-mentioned computer-readable medium may be included in the above-mentioned terminal device; or it may exist alone without being assembled into the terminal device.
  • the above-mentioned computer-readable medium carries one or more programs.
  • when the one or more programs are executed by the terminal device, the terminal device is caused to: in response to detecting the start dubbing signal triggered by the user, adjust the volume of the dubbing track on the audio to be dubbed to the first target volume, and adjust the volume of the audio tracks, other than the dubbing track, included in the audio to be dubbed to the second target volume, wherein the dubbing track is an audio track with a preset volume added to the audio to be dubbed in advance; obtain the audio signal to be recorded and record it on the dubbing track; and, in response to detecting the end dubbing signal, save at the first target volume the to-be-recorded audio signal recorded on the dubbing track during the dubbing time period, and save at the second target volume the to-be-recorded audio signals on the audio tracks, other than the dubbing track, included in the audio to be dubbed during the dubbing time period.
  • the computer program code used to perform the operations of the present disclosure can be written in one or more programming languages or a combination thereof.
  • the programming languages include object-oriented programming languages such as Java, Smalltalk, and C++, as well as conventional procedural programming languages such as the "C" language or similar programming languages.
  • the program code may execute entirely on the user's computer, partly on the user's computer, as an independent software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server.
  • the remote computer may be connected to the user's computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or may be connected to an external computer (for example, through the Internet using an Internet service provider).
  • each block in the flowchart or block diagram may represent a module, program segment, or part of code that contains one or more executable instructions for implementing the specified logical functions.
  • the functions marked in the block may also occur in a different order from the order marked in the drawings. For example, two blocks shown in succession can actually be executed substantially in parallel, or they can sometimes be executed in the reverse order, depending on the functions involved.
  • each block in the block diagrams and/or flowcharts, and combinations of blocks in the block diagrams and/or flowcharts, can be implemented by a dedicated hardware-based system that performs the specified functions or operations, or by a combination of dedicated hardware and computer instructions.
  • the units involved in the embodiments described in the present disclosure can be implemented in software or hardware. Among them, the name of the unit does not constitute a limitation on the unit itself under certain circumstances.
  • the recording unit can also be described as a "unit of the recording unit".

Abstract

Disclosed are a method and device for processing audio. A specific embodiment of the method comprises: in response to a dubbing start signal triggered by a user being detected, adjusting the volume of a dubbing track on audio to be dubbed to a first target volume, and adjusting the volume of audio tracks, other than the dubbing track, comprised in the audio to be dubbed to a second target volume; acquiring an audio signal to be recorded, and recording the audio signal to be recorded on the dubbing track; and in response to a dubbing end signal being detected, storing, at the first target volume, the audio signal recorded on the dubbing track within a dubbing time period, and storing, at the second target volume, audio signals to be recorded on the audio tracks, other than the dubbing track, comprised in the audio to be dubbed within the dubbing time period. According to the embodiment, dubbing can be performed without modifying original audio to be dubbed, thereby facilitating the flexible dubbing of audio to be dubbed and modifying dubbing.

Description

Method and device for processing audio
Cross-reference to related applications
This application is based on, and claims priority to, Chinese patent application No. 201910037108.0, filed on January 15, 2019 and entitled "Method and Apparatus for Processing Audio", the entire content of which is incorporated herein by reference.
Technical field
The embodiments of the present disclosure relate to the field of computer technology, and in particular to a method and device for processing audio.
Background
With the development of Internet technology, people can use electronic devices such as mobile phones and tablet computers to obtain video, audio, and other content from Internet resources, and can also record video and audio and dub them. When people dub original audio, the prior art usually either mixes the user's voice directly with the original audio or replaces a segment of the original audio with the user's voice.
Summary of the invention
The embodiments of the present disclosure propose a method and device for processing audio.
In a first aspect, an embodiment of the present disclosure provides a method for processing audio. The method includes: in response to detecting a start dubbing signal triggered by a user, adjusting the volume of a dubbing track on the audio to be dubbed to a first target volume, and adjusting the volume of the audio tracks, other than the dubbing track, included in the audio to be dubbed to a second target volume, where the dubbing track is an audio track with a preset volume that is added to the audio to be dubbed in advance; acquiring an audio signal to be recorded, and recording the audio signal to be recorded on the dubbing track; and, in response to detecting an end dubbing signal, saving at the first target volume the to-be-recorded audio signal recorded on the dubbing track within a dubbing time period, and saving at the second target volume the to-be-recorded audio signals on the audio tracks, other than the dubbing track, included in the audio to be dubbed within the dubbing time period.
In some embodiments, after saving at the second target volume the to-be-recorded audio signals on the audio tracks, other than the dubbing track, included in the audio to be dubbed within the dubbing time period, the method further includes: adjusting the volume of the dubbing track to the preset volume, and adjusting the volume of the audio tracks, other than the dubbing track, included in the audio to be dubbed to an initial volume.
In some embodiments, after saving at the second target volume the to-be-recorded audio signals on the audio tracks, other than the dubbing track, included in the audio to be dubbed within the dubbing time period, the method further includes: in response to detecting a modify dubbing signal triggered by the user, displaying an interface for performing a modification operation on the to-be-recorded audio signal recorded on the dubbing track; and, in response to detecting an end modification signal triggered by the user, saving the modified to-be-recorded audio signal on the dubbing track.
In some embodiments, the modification operation includes at least one of the following: a deletion operation, a cropping operation, and a re-recording operation.
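The three modification operations are not defined further in the text. The following Python sketch shows one possible reading, assuming the dubbing track is a plain list of samples. The function names, and the choice to silence samples rather than shorten the track so that it stays time-aligned with the other tracks, are illustrative assumptions and not details taken from the disclosure.

```python
def delete_range(samples, start, end):
    """Deletion (assumed meaning): silence samples[start:end], keeping the length."""
    end = min(end, len(samples))
    return samples[:start] + [0.0] * max(0, end - start) + samples[end:]


def crop_to_range(samples, start, end):
    """Cropping (assumed meaning): keep only samples[start:end], silence the rest."""
    return [s if start <= i < end else 0.0 for i, s in enumerate(samples)]


def rerecord_range(samples, start, new_samples):
    """Re-recording (assumed meaning): overwrite from `start` with new samples."""
    end = start + len(new_samples)
    return samples[:start] + list(new_samples) + samples[end:]
```

Each function returns a new list, so the previously saved dubbing stays available until the user confirms the modification.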
In some embodiments, the first target volume and the second target volume are each either a preset volume or a volume adjusted by the user.
In a second aspect, an embodiment of the present disclosure provides a device for processing audio. The device includes: an adjustment unit configured to, in response to detecting a start dubbing signal triggered by a user, adjust the volume of a dubbing track on the audio to be dubbed to a first target volume, and adjust the volume of the audio tracks, other than the dubbing track, included in the audio to be dubbed to a second target volume, where the dubbing track is an audio track with a preset volume that is added to the audio to be dubbed in advance; a recording unit configured to acquire an audio signal to be recorded and record the audio signal to be recorded on the dubbing track; and a saving unit configured to, in response to detecting an end dubbing signal, save at the first target volume the to-be-recorded audio signal recorded on the dubbing track within a dubbing time period, and save at the second target volume the to-be-recorded audio signals on the audio tracks, other than the dubbing track, included in the audio to be dubbed within the dubbing time period.
In some embodiments, the saving unit is further configured to: adjust the volume of the dubbing track to the preset volume, and adjust the volume of the audio tracks, other than the dubbing track, included in the audio to be dubbed to an initial volume.
In some embodiments, the saving unit includes: a display module configured to, in response to detecting a modify dubbing signal triggered by the user, display an interface for performing a modification operation on the to-be-recorded audio signal recorded on the dubbing track; and a saving module configured to, in response to detecting an end modification signal triggered by the user, save the modified to-be-recorded audio signal on the dubbing track.
In some embodiments, the modification operation includes at least one of the following: a deletion operation, a cropping operation, and a re-recording operation.
In some embodiments, the first target volume and the second target volume are each either a preset volume or a volume adjusted by the user.
In a third aspect, an embodiment of the present disclosure provides a terminal device. The terminal device includes: one or more processors; and a storage device on which one or more programs are stored; when the one or more programs are executed by the one or more processors, the one or more processors implement the method described in any implementation of the first aspect.
In a fourth aspect, an embodiment of the present disclosure provides a computer-readable medium on which a computer program is stored; when executed by a processor, the computer program implements the method described in any implementation of the first aspect.
In the method and device for processing audio provided by the embodiments of the present disclosure, in response to detecting a start dubbing signal triggered by a user, the volume of the dubbing track on the audio to be dubbed is adjusted to the first target volume and the volume of the audio tracks, other than the dubbing track, included in the audio to be dubbed is adjusted to the second target volume; the audio signal to be recorded is then acquired and recorded on the dubbing track; and, in response to detecting an end dubbing signal, the to-be-recorded audio signal recorded on the dubbing track within the dubbing time period is saved at the first target volume, and the to-be-recorded audio signals on the audio tracks, other than the dubbing track, included in the audio to be dubbed within the dubbing time period are saved at the second target volume. Because an audio track is added to the audio to be dubbed, dubbing can be performed without modifying the original audio to be dubbed. Setting the first target volume and the second target volume helps to blend the recorded audio signal with the original audio to be dubbed, and helps to flexibly dub the audio to be dubbed and modify the dubbing.
Brief description of the drawings
Other features, objects, and advantages of the present disclosure will become more apparent upon reading the detailed description of non-limiting embodiments made with reference to the following drawings:
FIG. 1 is an exemplary system architecture diagram to which an embodiment of the present disclosure can be applied;
FIG. 2 is a flowchart of one embodiment of a method for processing audio according to an embodiment of the present disclosure;
FIG. 3 is a schematic diagram of an application scenario of the method for processing audio according to an embodiment of the present disclosure;
FIG. 4 is a flowchart of another embodiment of the method for processing audio according to an embodiment of the present disclosure;
FIG. 5 is a schematic structural diagram of one embodiment of a device for processing audio according to an embodiment of the present disclosure;
FIG. 6 is a schematic structural diagram of a terminal device suitable for implementing embodiments of the present disclosure.
Detailed description
The present disclosure is described in further detail below with reference to the drawings and embodiments. It can be understood that the specific embodiments described here are only used to explain the relevant disclosure, not to limit it. It should also be noted that, for ease of description, only the parts related to the relevant disclosure are shown in the drawings.
It should be noted that the embodiments of the present disclosure and the features of the embodiments can be combined with each other provided there is no conflict. The present disclosure is described in detail below with reference to the drawings and in conjunction with the embodiments.
FIG. 1 shows an exemplary system architecture 100 to which the method for processing audio or the device for processing audio of an embodiment of the present disclosure can be applied.
As shown in FIG. 1, the system architecture 100 may include terminal devices 101, 102, 103, a network 104, and a server 105. The network 104 is the medium used to provide communication links between the terminal devices 101, 102, 103 and the server 105. The network 104 may include various connection types, such as wired or wireless communication links, or fiber optic cables.
A user can use the terminal devices 101, 102, 103 to interact with the server 105 through the network 104, for example to receive or send messages. Various communication client applications may be installed on the terminal devices 101, 102, 103, such as audio player applications, video player applications, web browser applications, and social platform software.
The terminal devices 101, 102, 103 may be hardware or software. When the terminal devices 101, 102, 103 are hardware, they may be various electronic devices. When the terminal devices 101, 102, 103 are software, they can be installed in the above-mentioned electronic devices. They may be implemented as multiple pieces of software or software modules (for example, software or software modules used to provide distributed services), or as a single piece of software or software module. No specific limitation is made here.
The server 105 may be a server that provides various services, for example a back-end audio and video resource server that supports the audio and video played on the terminal devices 101, 102, 103. The back-end audio and video resource server can send audio and video to the terminal devices, and can also receive audio and video sent by the terminal devices.
It should be noted that the method for processing audio provided by the embodiments of the present disclosure is generally executed by the terminal devices 101, 102, 103; accordingly, the device for processing audio is generally provided in the terminal devices 101, 102, 103.
It should be noted that the server may be hardware or software. When the server is hardware, it can be implemented as a distributed server cluster composed of multiple servers, or as a single server. When the server is software, it can be implemented as multiple pieces of software or software modules (for example, software or software modules used to provide distributed services), or as a single piece of software or software module. No specific limitation is made here.
It should be understood that the numbers of terminal devices, networks, and servers in FIG. 1 are only illustrative. There may be any number of terminal devices, networks, and servers, depending on implementation needs. If the audio to be dubbed does not need to be obtained remotely, the above system architecture may include only the terminal device, without the network and the server.
With continued reference to FIG. 2, a flow 200 of one embodiment of the method for processing audio according to the present disclosure is shown. The method for processing audio includes the following steps:
Step 201: in response to detecting a start dubbing signal triggered by a user, adjust the volume of the dubbing track on the audio to be dubbed to a first target volume, and adjust the volume of the audio tracks, other than the dubbing track, included in the audio to be dubbed to a second target volume.
In this embodiment, the execution subject of the method for processing audio (for example, the terminal device shown in FIG. 1) may, in response to detecting a start dubbing signal triggered by a user, adjust the volume of the dubbing track on the audio to be dubbed to the first target volume, and adjust the volume of the audio tracks, other than the dubbing track, included in the audio to be dubbed to the second target volume. Here, the dubbing track is an audio track with a preset volume that is added to the audio to be dubbed in advance. As an example, the audio to be dubbed may be audio selected by the user from a preset audio set (for example, an audio set stored locally by the execution subject). When the execution subject detects that the user intends to dub the audio to be dubbed (for example, the user selects the audio to be dubbed and clicks a "dubbing" button), it adds a new audio track to the audio to be dubbed as the dubbing track. Typically, the preset volume can be set to 0.
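Step 201 can be illustrated with a minimal Python sketch. The class and attribute names (`Track`, `DubbingSession`, `on_start_dubbing`, `initial_volumes`) are hypothetical, and volumes are assumed to be linear gains between 0.0 and 1.0; the disclosure does not prescribe a data model.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Track:
    """One audio track; `volume` is assumed to be a linear gain in [0.0, 1.0]."""
    name: str
    volume: float = 1.0
    samples: list = field(default_factory=list)


@dataclass
class DubbingSession:
    """Hypothetical container for the audio to be dubbed plus its dubbing track."""
    original_tracks: list
    preset_volume: float = 0.0  # the text suggests 0 as a typical preset volume
    dub_track: Optional[Track] = None
    initial_volumes: dict = field(default_factory=dict)

    def __post_init__(self):
        # The dubbing track is added in advance, at the preset volume.
        self.dub_track = Track("dub", volume=self.preset_volume)

    def on_start_dubbing(self, first_target: float, second_target: float) -> None:
        """Apply the target volumes when the start dubbing signal is detected."""
        # Remember the pre-dubbing volumes so they can be restored afterwards.
        self.initial_volumes = {t.name: t.volume for t in self.original_tracks}
        self.dub_track.volume = first_target
        for track in self.original_tracks:
            track.volume = second_target
```

Keeping the original tracks' samples untouched and changing only their volume reflects the stated goal of dubbing without modifying the original audio.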
The audio to be dubbed may be audio that the execution subject obtained in advance from a remote location through a wired or wireless connection, or audio obtained locally. It should be understood that the audio to be dubbed may be a separate audio file or the audio component of a video file.
The start dubbing signal may be a signal triggered by the user to indicate the start of the dubbing operation on the audio to be dubbed. As an example, a start dubbing signal is generated when the user clicks the start dubbing button displayed on the screen of the execution subject.
The first target volume and the second target volume may be the volumes used when the dubbed audio is played back after dubbing ends.
In some optional implementations of this embodiment, the first target volume and the second target volume may each be a preset volume, or may each be a volume adjusted by the user. As an example, the execution subject may present the user with an interface for adjusting volume, in which the user can adjust the first target volume and the second target volume used during dubbing.
It should be noted that the first target volume and the second target volume may be fixed volumes, or volumes determined from a configured percentage. For example, assuming the original volume of the audio to be dubbed is 100%, the first target volume may be set to 80% of the original volume and the second target volume to 20% of the original volume. Setting the first target volume and the second target volume lets the user blend the dubbing track with the audio to be dubbed more flexibly, which helps to improve the dubbing effect.
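As a small illustration of the two ways of specifying a target volume described above, the following Python sketch resolves a target either from a fixed value or from a percentage of the original volume. The function name and parameters are assumptions made for this example.

```python
def resolve_target_volume(original_volume, fixed=None, percent=None):
    """Return a target volume given either a fixed value or a percentage.

    Exactly one of `fixed` or `percent` must be provided.
    """
    if (fixed is None) == (percent is None):
        raise ValueError("specify exactly one of fixed or percent")
    if fixed is not None:
        return fixed
    return original_volume * percent / 100.0
```

With an original volume of 1.0, `percent=80` and `percent=20` correspond to the 80% and 20% figures used in the example in the text.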
Step 202: acquire the audio signal to be recorded, and record the audio signal to be recorded on the dubbing track.
In this embodiment, the execution subject (for example, the terminal device shown in FIG. 1) may acquire the audio signal to be recorded and record it on the dubbing track. The audio signal to be recorded is an audio signal intended to be recorded on the dubbing track. As an example, the audio signal to be recorded may be a pre-stored audio signal that the execution subject obtains remotely or locally. Alternatively, the audio signal to be recorded may be an audio signal collected in real time by a target sound collection device. The target sound collection device may be a device included in the execution subject (for example, a microphone), or a device communicatively connected to the execution subject.
It should be noted that the method of adding a dubbing track to the audio to be dubbed described in step 201, and the method of recording the audio signal to be recorded on the dubbing track described in step 202, are well-known techniques that are widely studied and applied, and are not described in detail here.
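Although the text treats recording onto a track as a known technique, a minimal Python sketch may help fix ideas. It assumes the dubbing track is a list of samples and that the source is any iterable of samples, so a list can stand in for a pre-stored signal and a generator for real-time capture. The function name and the silence-padding behaviour are assumptions of this example.

```python
from typing import Iterable, List


def record_onto_dub_track(dub_samples: List[float],
                          source: Iterable[float],
                          start_index: int) -> int:
    """Write samples from `source` into `dub_samples` starting at `start_index`.

    Returns the index one past the last written sample.
    """
    position = start_index
    for sample in source:
        # Pad with silence if recording reaches past the current end of the track.
        if position >= len(dub_samples):
            dub_samples.extend([0.0] * (position - len(dub_samples) + 1))
        dub_samples[position] = sample
        position += 1
    return position
```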
Step 203: In response to detecting an end-dubbing signal, save, at the first target volume, the to-be-recorded audio signal recorded on the dubbing track during the dubbing time period, and save, at the second target volume, the to-be-recorded audio signals on the audio tracks, other than the dubbing track, that are included in the audio to be dubbed during the dubbing time period.
In this embodiment, in response to detecting the end-dubbing signal, the execution body may save, at the first target volume, the to-be-recorded audio signal recorded on the dubbing track during the dubbing time period, and save, at the second target volume, the to-be-recorded audio signals on the audio tracks, other than the dubbing track, that are included in the audio to be dubbed during the dubbing time period. The dubbing time period is the time period that the audio signal recorded on the dubbing track occupies during playback. As an example, when the audio signal to be recorded is captured in real time by the target sound collection device, the dubbing time period may run from the time the start-dubbing signal is detected to the time the end-dubbing signal is detected. When the audio signal to be recorded is a pre-stored audio signal, the dubbing time period may start at the time the start-dubbing signal is detected and last for the playback duration of the audio signal to be recorded.
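The two ways of determining the dubbing time period can be sketched as follows. This is a minimal illustration, not the patent's implementation; the function name `dubbing_period` and its parameters are hypothetical, and times are assumed to be seconds on the playback timeline.

```python
from typing import Optional, Tuple


def dubbing_period(start_signal_time: float,
                   end_signal_time: Optional[float] = None,
                   stored_clip_duration: Optional[float] = None) -> Tuple[float, float]:
    """Return (start, end) of the dubbing time period in seconds.

    Hypothetical helper: pass stored_clip_duration for a pre-stored clip,
    or end_signal_time for audio captured in real time.
    """
    if stored_clip_duration is not None:
        # Pre-stored audio: the period starts at the start-dubbing signal
        # and lasts for the playback duration of the clip.
        return (start_signal_time, start_signal_time + stored_clip_duration)
    if end_signal_time is None:
        raise ValueError("live capture needs the end-dubbing signal time")
    if end_signal_time < start_signal_time:
        raise ValueError("end-dubbing signal precedes start-dubbing signal")
    # Live capture: the period runs between the two detected signals.
    return (start_signal_time, end_signal_time)
```

For example, `dubbing_period(2.0, end_signal_time=5.5)` covers a live recording, while `dubbing_period(2.0, stored_clip_duration=3.0)` covers a three-second pre-stored clip.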
The end-dubbing signal may be a user-triggered signal indicating that the dubbing operation is to end, or a signal with the same meaning generated automatically by the execution body. As an example, if the audio signal to be recorded is captured in real time by the target sound collection device, the end-dubbing signal is generated when the user taps an end-dubbing button displayed on the screen of the execution body. Alternatively, a start-dubbing signal is generated when the user's finger is detected pressing and holding the start-dubbing button displayed on the screen of the execution body, and the end-dubbing signal is generated when the user's finger is detected leaving the screen. As another example, if the audio signal to be recorded is a pre-stored audio signal, the end-dubbing signal is generated when the execution body detects that the audio signal to be recorded has been completely recorded on the dubbing track.
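The trigger conditions above can be summarized as a lookup from an event to the signal it generates. The sketch below is only illustrative: the event names and the `signal_for_event` function are hypothetical, since the source does not define an event model.

```python
from enum import Enum
from typing import Optional


class Signal(Enum):
    START_DUBBING = "start_dubbing"
    END_DUBBING = "end_dubbing"


# Hypothetical event names; the source names no concrete UI or recorder events.
LIVE_CAPTURE_EVENTS = {
    "start_button_pressed": Signal.START_DUBBING,  # finger presses and holds
    "end_button_tapped": Signal.END_DUBBING,       # explicit end button
    "finger_lifted": Signal.END_DUBBING,           # press-and-hold released
}
STORED_CLIP_EVENTS = {
    "start_button_pressed": Signal.START_DUBBING,
    "clip_fully_recorded": Signal.END_DUBBING,     # generated automatically
}


def signal_for_event(event: str, live_capture: bool) -> Optional[Signal]:
    """Map an event to a dubbing signal, or None if it triggers nothing."""
    table = LIVE_CAPTURE_EVENTS if live_capture else STORED_CLIP_EVENTS
    return table.get(event)
```

Keeping the two tables separate mirrors the text: the user ends a live recording, whereas the device itself ends the recording of a pre-stored clip.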
In some optional implementations of this embodiment, after step 203, the execution body may also adjust the volume of the dubbing track to the preset volume and adjust the volume of the audio tracks, other than the dubbing track, that are included in the audio to be dubbed to the initial volume. The initial volume is the volume of the audio to be dubbed before audio was recorded on the dubbing track. As a result, once dubbing has ended, the volume of the dubbing track does not affect playback when the audio to be dubbed is played.
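A minimal sketch of this volume reset follows. It assumes, purely for illustration, that volumes are kept per track in a dictionary and that the preset volume of the dubbing track is 0; `restore_volumes` and its parameters are hypothetical names.

```python
from typing import Dict


def restore_volumes(initial_volumes: Dict[str, float],
                    dubbing_track: str,
                    preset_volume: float = 0.0) -> Dict[str, float]:
    """Return the playback volumes to apply once dubbing has ended.

    initial_volumes holds the volume each original track had before
    recording started (an assumed per-track representation).
    """
    restored = dict(initial_volumes)         # original tracks: initial volume
    restored[dubbing_track] = preset_volume  # dubbing track: preset volume
    return restored
```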
With continued reference to FIG. 3, FIG. 3 is a schematic diagram of an application scenario of the method for processing audio according to this embodiment. In the application scenario of FIG. 3, the terminal device 301 is playing the audio 302 to be dubbed, and the user wants to dub it. To make this possible, the terminal device 301 has added a dubbing track 3021 with a volume of zero to the audio 302 to be dubbed in advance. The user then presses the start-dubbing button 303 on the screen of the terminal device 301, and the terminal device 301 generates a start-dubbing signal. On detecting the start-dubbing signal, the terminal device 301 applies the first target volume (for example, 80% of the original volume of the audio to be dubbed) and the second target volume (for example, 20% of the original volume of the audio to be dubbed) preset by the user: it adjusts the volume of the dubbing track to the first target volume and adjusts the volume of the audio tracks, other than the dubbing track, that are included in the audio to be dubbed to the second target volume. The microphone of the terminal device 301 then captures the user's voice, generates the audio signal 304 to be recorded, and records it on the dubbing track 3021. When the user's finger lifts from the start-dubbing button 303, the terminal device 301 generates an end-dubbing signal. The terminal device 301 then saves, at the first target volume, the to-be-recorded audio signal recorded on the dubbing track during the dubbing time period, and saves, at the second target volume, the to-be-recorded audio signals on the audio tracks, other than the dubbing track, that are included in the audio to be dubbed during the dubbing time period, thereby obtaining the dubbed audio 305.
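The scenario of FIG. 3 can be walked through with a small model. This is a sketch under stated assumptions rather than the disclosed implementation: tracks are lists of float samples, a "volume" is a linear gain multiplied into each sample when saving, and the names `Track`, `DubbingSession`, `on_start`, `record` and `on_end` are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Track:
    samples: List[float]
    volume: float = 1.0  # assumed linear gain; 1.0 = original volume


@dataclass
class DubbingSession:
    tracks: Dict[str, Track]    # original tracks (at least one is assumed)
    first_target: float = 0.8   # example value from the scenario
    second_target: float = 0.2  # example value from the scenario
    dub_name: str = "dub"
    start_index: int = 0
    end_index: int = 0

    def __post_init__(self) -> None:
        # Add the dubbing track in advance, silent and at volume zero.
        length = max(len(t.samples) for t in self.tracks.values())
        self.tracks[self.dub_name] = Track([0.0] * length, volume=0.0)

    def on_start(self, index: int) -> None:
        # Start-dubbing signal: set the two target volumes.
        self.start_index = self.end_index = index
        for name, track in self.tracks.items():
            is_dub = name == self.dub_name
            track.volume = self.first_target if is_dub else self.second_target

    def record(self, captured: List[float]) -> None:
        # Write captured samples onto the dubbing track, clipped to its length.
        dub = self.tracks[self.dub_name].samples
        captured = captured[: len(dub) - self.start_index]
        self.end_index = self.start_index + len(captured)
        dub[self.start_index:self.end_index] = captured

    def on_end(self) -> Dict[str, List[float]]:
        # End-dubbing signal: save each track's samples within the dubbing
        # time period, scaled by that track's current target volume.
        s, e = self.start_index, self.end_index
        return {name: [x * t.volume for x in t.samples[s:e]]
                for name, t in self.tracks.items()}
```

A session would be driven as `session.on_start(i)`, then `session.record(samples)`, then `session.on_end()`. The original tracks' own samples are never overwritten in this model, which matches the stated goal of dubbing without modifying the original audio.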
In the method provided by the above embodiment of the present disclosure, in response to detecting a user-triggered start-dubbing signal, the volume of the dubbing track on the audio to be dubbed is adjusted to the first target volume and the volume of the audio tracks, other than the dubbing track, that are included in the audio to be dubbed is adjusted to the second target volume. The audio signal to be recorded is then acquired and recorded on the dubbing track. In response to detecting the end-dubbing signal, the to-be-recorded audio signal recorded on the dubbing track during the dubbing time period is saved at the first target volume, and the to-be-recorded audio signals on the audio tracks, other than the dubbing track, that are included in the audio to be dubbed during the dubbing time period are saved at the second target volume. Because dubbing is performed by adding a track to the audio to be dubbed, the original audio to be dubbed does not have to be modified. Setting the first target volume and the second target volume helps the recorded audio signal blend better with the original audio to be dubbed, and helps make dubbing the audio and modifying the dubbing more flexible.
With further reference to FIG. 4, a flow 400 of yet another embodiment of the method for processing audio is shown. The flow 400 of the method for processing audio includes the following steps:
Step 401: In response to detecting a user-triggered start-dubbing signal, adjust the volume of the dubbing track on the audio to be dubbed to the first target volume, and adjust the volume of the audio tracks, other than the dubbing track, that are included in the audio to be dubbed to the second target volume.
In this embodiment, step 401 is substantially the same as step 201 in the embodiment corresponding to FIG. 2, and is not described again here.
Step 402: Acquire an audio signal to be recorded, and record the audio signal to be recorded on the dubbing track.
In this embodiment, step 402 is substantially the same as step 202 in the embodiment corresponding to FIG. 2, and is not described again here.
Step 403: In response to detecting an end-dubbing signal, save, at the first target volume, the to-be-recorded audio signal recorded on the dubbing track during the dubbing time period, and save, at the second target volume, the to-be-recorded audio signals on the audio tracks, other than the dubbing track, that are included in the audio to be dubbed during the dubbing time period.
In this embodiment, step 403 is substantially the same as step 203 in the embodiment corresponding to FIG. 2, and is not described again here.
Step 404: In response to detecting a user-triggered modify-dubbing signal, display an interface for performing a modification operation on the to-be-recorded audio signal recorded on the dubbing track.
In this embodiment, the execution body of the method for processing audio (for example, the terminal device shown in FIG. 1) may, in response to detecting the user-triggered modify-dubbing signal, display an interface for performing a modification operation on the to-be-recorded audio signal recorded on the dubbing track.
The modify-dubbing signal may be a user-triggered signal indicating that the user wants to modify the dubbing signal on the saved dubbing track. As an example, the modify-dubbing signal is generated when the user taps a modify-dubbing button displayed on the screen of the execution body. The interface for performing a modification operation on the to-be-recorded audio signal recorded on the dubbing track is then displayed on the screen, and the user can use this interface to make the execution body modify the to-be-recorded audio signal on the dubbing track.
In some optional implementations of this embodiment, the modification operation may include at least one of the following: a deletion operation, a cropping operation, and a re-recording operation. The deletion operation may be used to delete the to-be-recorded audio signal on the dubbing track. The cropping operation may be used to delete part of the to-be-recorded audio signal on the dubbing track. The re-recording operation may be used to replace the to-be-recorded audio signal on the dubbing track with a newly recorded one.
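The three modification operations can be illustrated on a dubbing track held as a list of samples. The sketch assumes, as a simplification not stated in the source, that deleted audio is replaced by silence so the dubbing track stays aligned with the other tracks; all function names are hypothetical. Each function returns a new list and touches only the dubbing track.

```python
from typing import List


def delete_dub(samples: List[float]) -> List[float]:
    """Deletion: remove the whole recorded signal (modelled as silence)."""
    return [0.0] * len(samples)


def crop_dub(samples: List[float], start: int, end: int) -> List[float]:
    """Cropping: remove only the part of the signal in [start, end)."""
    start = max(0, start)
    end = min(len(samples), end)
    cropped = list(samples)
    for i in range(start, end):
        cropped[i] = 0.0
    return cropped


def rerecord_dub(samples: List[float], start: int,
                 new_samples: List[float]) -> List[float]:
    """Re-recording: overwrite the signal from `start` with a new take."""
    replaced = list(samples)
    new_samples = new_samples[: max(0, len(samples) - start)]
    replaced[start:start + len(new_samples)] = new_samples
    return replaced
```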
Step 405: In response to detecting a user-triggered end-modification signal, save the modified to-be-recorded audio signal on the dubbing track.
In this embodiment, the execution body may, in response to detecting the user-triggered end-modification signal, save the modified to-be-recorded audio signal on the dubbing track. The end-modification signal may be a user-triggered signal indicating that the user has finished modifying the dubbing signal on the saved dubbing track. As an example, the end-modification signal is generated when the user taps an end-modification button displayed on the screen of the execution body. The execution body then saves the dubbing track on which the modification operation has been performed.
As can be seen from FIG. 4, compared with the embodiment corresponding to FIG. 2, the flow 400 of the method for processing audio in this embodiment highlights the steps of modifying the dubbing track. The solution described in this embodiment can therefore flexibly modify the dubbing on the dubbed audio without affecting the original audio to be dubbed, further improving the flexibility of dubbing.
With further reference to FIG. 5, as an implementation of the methods shown in the above figures, the present disclosure provides an embodiment of an apparatus for processing audio. This apparatus embodiment corresponds to the method embodiment shown in FIG. 2, and the apparatus can be applied to various electronic devices.
As shown in FIG. 5, the apparatus 500 for processing audio in this embodiment includes: an adjustment unit 501, configured to, in response to detecting a user-triggered start-dubbing signal, adjust the volume of the dubbing track on the audio to be dubbed to the first target volume and adjust the volume of the audio tracks, other than the dubbing track, that are included in the audio to be dubbed to the second target volume, where the dubbing track is a track with a preset volume added to the audio to be dubbed in advance; a recording unit 502, configured to acquire an audio signal to be recorded and record it on the dubbing track; and a saving unit 503, configured to, in response to detecting an end-dubbing signal, save, at the first target volume, the to-be-recorded audio signal recorded on the dubbing track during the dubbing time period, and save, at the second target volume, the to-be-recorded audio signals on the audio tracks, other than the dubbing track, that are included in the audio to be dubbed during the dubbing time period.
In this embodiment, the adjustment unit 501 may, in response to detecting the user-triggered start-dubbing signal, adjust the volume of the dubbing track on the audio to be dubbed to the first target volume and adjust the volume of the audio tracks, other than the dubbing track, that are included in the audio to be dubbed to the second target volume. The dubbing track is a track with a preset volume added to the audio to be dubbed in advance. Typically, the preset volume may be set to 0. The audio to be dubbed may be audio that the apparatus 500 obtained remotely in advance through a wired or wireless connection, or audio obtained locally.
It should be understood that the audio to be dubbed may be a standalone audio file or the audio component of a video file.
The start-dubbing signal may be a user-triggered signal indicating that the operation of dubbing the audio to be dubbed is to start. As an example, the start-dubbing signal is generated when the user taps the start-dubbing button displayed on the screen of the apparatus 500.
The first target volume and the second target volume may be the volumes used when the dubbed audio is played after dubbing has ended.
In this embodiment, the recording unit 502 may acquire the audio signal to be recorded and record it on the dubbing track. The audio signal to be recorded is an audio signal intended for recording on the dubbing track. As an example, it may be a pre-stored audio signal that the apparatus 500 obtains remotely or locally. Alternatively, it may be an audio signal captured in real time by a target sound collection device. The target sound collection device may be a device included in the apparatus 500 (for example, a microphone) or a device communicatively connected to the apparatus 500.
In this embodiment, the saving unit 503 may, in response to detecting the end-dubbing signal, save, at the first target volume, the to-be-recorded audio signal recorded on the dubbing track during the dubbing time period, and save, at the second target volume, the to-be-recorded audio signals on the audio tracks, other than the dubbing track, that are included in the audio to be dubbed during the dubbing time period.
The dubbing time period is the time period that the audio signal recorded on the dubbing track occupies during playback. As an example, when the audio signal to be recorded is captured in real time by the target sound collection device, the dubbing time period may run from the time the start-dubbing signal is detected to the time the end-dubbing signal is detected. When the audio signal to be recorded is a pre-stored audio signal, the dubbing time period may start at the time the start-dubbing signal is detected and last for the playback duration of the audio signal to be recorded.
The end-dubbing signal may be a user-triggered signal indicating that the dubbing operation is to end, or a signal with the same meaning generated automatically by the apparatus 500. As an example, if the audio signal to be recorded is captured in real time by the target sound collection device, the end-dubbing signal is generated when the user taps the end-dubbing button displayed on the screen of the apparatus 500. Alternatively, a start-dubbing signal is generated when the user's finger is detected pressing and holding the start-dubbing button displayed on the screen of the apparatus 500, and the end-dubbing signal is generated when the user's finger is detected leaving the screen. As another example, if the audio signal to be recorded is a pre-stored audio signal, the end-dubbing signal is generated when the apparatus 500 detects that the audio signal to be recorded has been completely recorded on the dubbing track.
In some optional implementations of this embodiment, the saving unit 503 may be further configured to: adjust the volume of the dubbing track to the preset volume, and adjust the volume of the audio tracks, other than the dubbing track, that are included in the audio to be dubbed to the initial volume.
In some optional implementations of this embodiment, the saving unit 503 may include: a display module (not shown in the figure), configured to, in response to detecting a user-triggered modify-dubbing signal, display an interface for performing a modification operation on the to-be-recorded audio signal recorded on the dubbing track; and a saving module (not shown in the figure), configured to, in response to detecting a user-triggered end-modification signal, save the modified to-be-recorded audio signal on the dubbing track.
In some optional implementations of this embodiment, the modification operation includes at least one of the following: a deletion operation, a cropping operation, and a re-recording operation.
In some optional implementations of this embodiment, the first target volume and the second target volume are each a preset volume or a volume adjusted by the user.
In the apparatus provided by the above embodiment of the present disclosure, in response to detecting a user-triggered start-dubbing signal, the volume of the dubbing track on the audio to be dubbed is adjusted to the first target volume and the volume of the audio tracks, other than the dubbing track, that are included in the audio to be dubbed is adjusted to the second target volume. The audio signal to be recorded is then acquired and recorded on the dubbing track. In response to detecting the end-dubbing signal, the to-be-recorded audio signal recorded on the dubbing track during the dubbing time period is saved at the first target volume, and the to-be-recorded audio signals on the audio tracks, other than the dubbing track, that are included in the audio to be dubbed during the dubbing time period are saved at the second target volume. Because dubbing is performed by adding a track to the audio to be dubbed, the original audio to be dubbed does not have to be modified, which helps make dubbing the audio and modifying the dubbing more flexible.
Reference is now made to FIG. 6, which shows a schematic structural diagram of a terminal device 600 suitable for implementing embodiments of the present disclosure. The terminal device in the embodiments of the present disclosure may include, but is not limited to, mobile terminals such as mobile phones, notebook computers, digital broadcast receivers, PDAs (personal digital assistants), PADs (tablet computers), PMPs (portable multimedia players), and in-vehicle terminals (for example, in-vehicle navigation terminals), as well as fixed terminals such as digital TVs and desktop computers. The terminal device shown in FIG. 6 is only an example and should not impose any limitation on the functions or scope of use of the embodiments of the present disclosure.
As shown in FIG. 6, the terminal device 600 may include a processing device (for example, a central processing unit or a graphics processor) 601, which can perform various appropriate actions and processing according to a program stored in a read-only memory (ROM) 602 or a program loaded from a storage device 608 into a random access memory (RAM) 603. The RAM 603 also stores various programs and data required for the operation of the terminal device 600. The processing device 601, the ROM 602, and the RAM 603 are connected to one another via a bus 604. An input/output (I/O) interface 605 is also connected to the bus 604.
Generally, the following devices may be connected to the I/O interface 605: an input device 606 including, for example, a touch screen, touch pad, keyboard, mouse, camera, microphone, accelerometer, or gyroscope; an output device 607 including, for example, a liquid crystal display (LCD), speaker, or vibrator; a storage device 608 including, for example, a magnetic tape or hard disk; and a communication device 609. The communication device 609 may allow the terminal device 600 to communicate wirelessly or by wire with other devices to exchange data. Although FIG. 6 shows a terminal device 600 with various devices, it should be understood that not all of the illustrated devices are required to be implemented or present. More or fewer devices may alternatively be implemented or present.
In particular, according to embodiments of the present disclosure, the processes described above with reference to the flowcharts may be implemented as computer software programs. For example, embodiments of the present disclosure include a computer program product comprising a computer program carried on a computer-readable medium, the computer program containing program code for performing the methods shown in the flowcharts. In such an embodiment, the computer program may be downloaded and installed from a network through the communication device 609, installed from the storage device 608, or installed from the ROM 602. When the computer program is executed by the processing device 601, the above functions defined in the methods of the embodiments of the present disclosure are performed.
It should be noted that the computer-readable medium described in the present disclosure may be a computer-readable signal medium, a computer-readable storage medium, or any combination of the two. The computer-readable storage medium may be, for example, but is not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the above. More specific examples of computer-readable storage media may include, but are not limited to: an electrical connection with one or more wires, a portable computer disk, a hard disk, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), an optical fiber, portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above. In the present disclosure, a computer-readable storage medium may be any tangible medium that contains or stores a program that can be used by or in combination with an instruction execution system, apparatus, or device. A computer-readable signal medium, in the present disclosure, may include a data signal propagated in baseband or as part of a carrier wave and carrying computer-readable program code. Such a propagated data signal may take many forms, including but not limited to an electromagnetic signal, an optical signal, or any suitable combination of the above. A computer-readable signal medium may also be any computer-readable medium other than a computer-readable storage medium that can send, propagate, or transmit a program for use by or in combination with an instruction execution system, apparatus, or device. Program code contained on a computer-readable medium may be transmitted over any appropriate medium, including but not limited to electric wire, optical cable, RF (radio frequency), or any suitable combination of the above.
The computer-readable medium may be included in the terminal device, or it may exist separately without being assembled into the terminal device. The computer-readable medium carries one or more programs which, when executed by the terminal device, cause the terminal device to: in response to detecting a user-triggered start-dubbing signal, adjust the volume of the dubbing track on the audio to be dubbed to the first target volume and adjust the volume of the audio tracks, other than the dubbing track, that are included in the audio to be dubbed to the second target volume, where the dubbing track is a track with a preset volume added to the audio to be dubbed in advance; acquire an audio signal to be recorded and record it on the dubbing track; and, in response to detecting an end-dubbing signal, save, at the first target volume, the to-be-recorded audio signal recorded on the dubbing track during the dubbing time period, and save, at the second target volume, the to-be-recorded audio signals on the audio tracks, other than the dubbing track, that are included in the audio to be dubbed during the dubbing time period.
Computer program code for performing the operations of the present disclosure may be written in one or more programming languages or a combination thereof, including object-oriented programming languages such as Java, Smalltalk, and C++, and conventional procedural programming languages such as the "C" language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a standalone software package, partly on the user's computer and partly on a remote computer, or entirely on a remote computer or server. Where a remote computer is involved, the remote computer may be connected to the user's computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or may be connected to an external computer (for example, over the Internet through an Internet service provider).
The flowcharts and block diagrams in the drawings illustrate the architecture, functionality, and operation of possible implementations of systems, methods, and computer program products according to various embodiments of the present disclosure. In this regard, each block in a flowchart or block diagram may represent a module, program segment, or portion of code that contains one or more executable instructions for implementing the specified logical function. It should also be noted that, in some alternative implementations, the functions noted in the blocks may occur in an order different from that noted in the drawings. For example, two blocks shown in succession may in fact be executed substantially in parallel, or they may sometimes be executed in the reverse order, depending on the functionality involved. It should also be noted that each block of the block diagrams and/or flowcharts, and combinations of blocks in the block diagrams and/or flowcharts, may be implemented by a dedicated hardware-based system that performs the specified functions or operations, or by a combination of dedicated hardware and computer instructions.
The units described in the embodiments of the present disclosure may be implemented in software or in hardware. The name of a unit does not, in some cases, constitute a limitation on the unit itself; for example, the recording unit may also be described as "a unit of the recording unit".
The above description is merely a description of preferred embodiments of the present disclosure and of the technical principles applied. Those skilled in the art should understand that the scope of disclosure involved in the present disclosure is not limited to technical solutions formed by the specific combination of the above technical features, and should also cover other technical solutions formed by any combination of the above technical features or their equivalents without departing from the above disclosed concept, for example, technical solutions formed by replacing the above features with (but not limited to) technical features having similar functions disclosed in the present disclosure.

Claims (12)

  1. A method for processing audio, comprising:
    in response to detecting a user-triggered start-dubbing signal, adjusting the volume of a dubbing track on audio to be dubbed to a first target volume, and adjusting the volume of the tracks included in the audio to be dubbed, other than the dubbing track, to a second target volume, wherein the dubbing track is a track of preset volume added to the audio to be dubbed in advance;
    acquiring an audio signal to be recorded, and recording the audio signal to be recorded on the dubbing track;
    in response to detecting an end-dubbing signal, saving, at the first target volume, the audio signal to be recorded that was recorded on the dubbing track during the dubbing period, and saving, at the second target volume, the audio signals to be recorded on the tracks included in the audio to be dubbed, other than the dubbing track, during the dubbing period.
  2. The method according to claim 1, wherein, after the saving, at the second target volume, of the audio signals to be recorded on the tracks included in the audio to be dubbed, other than the dubbing track, during the dubbing period, the method further comprises:
    adjusting the volume of the dubbing track to the preset volume, and adjusting the volume of the tracks included in the audio to be dubbed, other than the dubbing track, to an initial volume.
  3. The method according to claim 1, wherein, after the saving, at the second target volume, of the audio signals to be recorded on the tracks included in the audio to be dubbed, other than the dubbing track, during the dubbing period, the method further comprises:
    in response to detecting a user-triggered modify-dubbing signal, displaying an interface for performing a modification operation on the audio signal to be recorded that was recorded on the dubbing track;
    in response to detecting a user-triggered end-modification signal, saving the modified audio signal to be recorded on the dubbing track.
  4. The method according to claim 3, wherein the modification operation comprises at least one of the following: a deletion operation, a cropping operation, and a re-recording operation.
  5. The method according to any one of claims 1-4, wherein the first target volume and the second target volume are each a preset volume, or are each a volume adjusted by the user.
  6. An apparatus for processing audio, comprising:
    an adjustment unit, configured to, in response to detecting a user-triggered start-dubbing signal, adjust the volume of a dubbing track on audio to be dubbed to a first target volume, and adjust the volume of the tracks included in the audio to be dubbed, other than the dubbing track, to a second target volume, wherein the dubbing track is a track of preset volume added to the audio to be dubbed in advance;
    a recording unit, configured to acquire an audio signal to be recorded, and record the audio signal to be recorded on the dubbing track;
    a saving unit, configured to, in response to detecting an end-dubbing signal, save, at the first target volume, the audio signal to be recorded that was recorded on the dubbing track during the dubbing period, and save, at the second target volume, the audio signals to be recorded on the tracks included in the audio to be dubbed, other than the dubbing track, during the dubbing period.
  7. The apparatus according to claim 6, wherein the saving unit is further configured to:
    adjust the volume of the dubbing track to the preset volume, and adjust the volume of the tracks included in the audio to be dubbed, other than the dubbing track, to an initial volume.
  8. The apparatus according to claim 6, wherein the saving unit comprises:
    a display module, configured to, in response to detecting a user-triggered modify-dubbing signal, display an interface for performing a modification operation on the audio signal to be recorded that was recorded on the dubbing track;
    a saving module, configured to, in response to detecting a user-triggered end-modification signal, save the modified audio signal to be recorded on the dubbing track.
  9. The apparatus according to claim 8, wherein the modification operation comprises at least one of the following: a deletion operation, a cropping operation, and a re-recording operation.
  10. The apparatus according to any one of claims 6-9, wherein the first target volume and the second target volume are each a preset volume, or are each a volume adjusted by the user.
  11. A terminal device, comprising:
    one or more processors;
    a storage device on which one or more programs are stored,
    wherein, when the one or more programs are executed by the one or more processors, the one or more processors are caused to implement the method according to any one of claims 1-5.
  12. A computer-readable medium on which a computer program is stored, wherein the program, when executed by a processor, implements the method according to any one of claims 1-5.
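
As a rough illustration of the optional follow-up steps recited in claims 2 to 4 (restoring volumes after dubbing, and the deletion, cropping, and re-recording operations), the sketch below uses plain Python lists and dictionaries. All function names, the sample-index cropping convention, and the data representation are assumptions made for this example and are not taken from the disclosure.

```python
from typing import Dict, List


def restore_volumes(volumes: Dict[str, float], dub_name: str,
                    preset_volume: float,
                    initial_volumes: Dict[str, float]) -> Dict[str, float]:
    """Return the volumes after dubbing ends: the dubbing track goes back to
    its preset volume, every other track back to its initial volume."""
    return {name: preset_volume if name == dub_name else initial_volumes[name]
            for name in volumes}


def delete_dub(samples: List[float]) -> List[float]:
    """Deletion: discard everything recorded on the dubbing track."""
    return []


def crop_dub(samples: List[float], start: int, end: int) -> List[float]:
    """Cropping: keep only the samples in the half-open range [start, end)."""
    return samples[start:end]


def rerecord_dub(samples: List[float], new_take: List[float]) -> List[float]:
    """Re-recording: replace the previous take with a new one."""
    return list(new_take)


# Example usage with made-up values.
volumes = {"music": 0.3, "dub": 0.9}
restored = restore_volumes(volumes, "dub", preset_volume=0.0,
                           initial_volumes={"music": 1.0})
cropped = crop_dub([0.1, 0.2, 0.3, 0.4], 1, 3)
```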
PCT/CN2019/127603 2019-01-15 2019-12-23 Method and device for processing audio WO2020147522A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201910037108.0 2019-01-15
CN201910037108.0A CN111435600B (en) 2019-01-15 2019-01-15 Method and apparatus for processing audio

Publications (1)

Publication Number Publication Date
WO2020147522A1 (en) 2020-07-23

Family

ID=71580079

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/127603 WO2020147522A1 (en) 2019-01-15 2019-12-23 Method and device for processing audio

Country Status (2)

Country Link
CN (1) CN111435600B (en)
WO (1) WO2020147522A1 (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112000308B (en) * 2020-09-10 2023-04-18 成都拟合未来科技有限公司 Double-track audio playing control method, system, terminal and medium
CN112954390B (en) * 2021-01-26 2023-05-09 北京有竹居网络技术有限公司 Video processing method, device, storage medium and equipment
CN113421577A (en) * 2021-05-10 2021-09-21 北京达佳互联信息技术有限公司 Video dubbing method and device, electronic equipment and storage medium

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2008002803A2 (en) * 2006-06-29 2008-01-03 Podfitness, Inc. Mixing media files
US20100319518A1 (en) * 2009-06-23 2010-12-23 Virendra Kumar Mehta Systems and methods for collaborative music generation
CN105359214A (en) * 2013-05-03 2016-02-24 石哲 Method for producing media contents in duet mode and apparatus used therein
US20160203830A1 (en) * 2013-10-18 2016-07-14 Apple Inc. Content Aware Audio Ducking
CN106952642A (en) * 2016-01-06 2017-07-14 广州酷狗计算机科技有限公司 The method and apparatus of audio synthesis

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103474055A (en) * 2012-08-06 2013-12-25 苏州沃通信息科技有限公司 Mobile phone KTV solution
CN104978145A (en) * 2015-01-27 2015-10-14 中兴通讯股份有限公司 Recording realization method and apparatus and mobile terminal
CN204795456U (en) * 2015-07-29 2015-11-18 王泰来 Dual track audio playback device
CN105336348B (en) * 2015-11-16 2019-03-05 合一网络技术(北京)有限公司 The processing system and method for Multi-audio-frequency track in video editing

Also Published As

Publication number Publication date
CN111435600B (en) 2021-05-18
CN111435600A (en) 2020-07-21

Similar Documents

Publication Publication Date Title
US9483110B2 (en) Adaptive media file rewind
WO2021073315A1 (en) Video file generation method and device, terminal and storage medium
WO2021093737A1 (en) Method and apparatus for generating video, electronic device, and computer readable medium
WO2020147522A1 (en) Method and device for processing audio
US11670339B2 (en) Video acquisition method and device, terminal and medium
US8135865B2 (en) Synchronization and transfer of digital media items
WO2021196903A1 (en) Video processing method and device, readable medium and electronic device
WO2023051293A1 (en) Audio processing method and apparatus, and electronic device and storage medium
WO2021114979A1 (en) Video page display method and apparatus, electronic device and computer-readable medium
CN110289024B (en) Audio editing method and device, electronic equipment and storage medium
WO2020143555A1 (en) Method and device used for displaying information
WO2021073205A1 (en) Video processing method and apparatus, storage medium, and electronic device
WO2022142619A1 (en) Method and device for private audio or video call
WO2023284437A1 (en) Media file processing method and apparatus, device, readable storage medium, and product
CN109582274B (en) Volume adjusting method and device, electronic equipment and computer readable storage medium
WO2020224294A1 (en) Method, system, and apparatus for processing information
WO2022179522A1 (en) Recommended video display method and apparatus, medium, and electronic device
US20240103802A1 (en) Method, apparatus, device and medium for multimedia processing
WO2021218646A1 (en) Interaction method and apparatus, and electronic device
CN111460211A (en) Audio information playing method and device and electronic equipment
WO2020143556A1 (en) Method and apparatus used for displaying page
CN105306501A (en) Method and system for performing interactive update on multimedia data
JP4191221B2 (en) Recording / reproducing apparatus, simultaneous recording / reproducing control method, and simultaneous recording / reproducing control program
CN106328174A (en) Method and device for processing recording data
US9767194B2 (en) Media file abbreviation retrieval

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 19910061

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

32PN Ep: public notification in the ep bulletin as address of the adressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 03.11.2021)

122 Ep: pct application non-entry in european phase

Ref document number: 19910061

Country of ref document: EP

Kind code of ref document: A1