WO2022042634A1 - Audio data processing method, apparatus, device, and storage medium - Google Patents
Audio data processing method, apparatus, device, and storage medium
- Publication number
- WO2022042634A1 (application PCT/CN2021/114706)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- sound effect
- audio data
- audio
- sound
- data
- Prior art date
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/80—Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
- H04N21/85—Assembly of content; Generation of multimedia applications
- H04N21/854—Content authoring
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/01—Input arrangements or combined input and output arrangements for interaction between user and computer
- G06F3/048—Interaction techniques based on graphical user interfaces [GUI]
- G06F3/0481—Interaction techniques based on graphical user interfaces [GUI] based on specific properties of the displayed interaction object or a metaphor-based environment, e.g. interaction with desktop elements like windows or icons, or assisted by a cursor's changing behaviour or appearance
- G06F3/0482—Interaction with lists of selectable items, e.g. menus
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/01—Input arrangements or combined input and output arrangements for interaction between user and computer
- G06F3/048—Interaction techniques based on graphical user interfaces [GUI]
- G06F3/0484—Interaction techniques based on graphical user interfaces [GUI] for the control of specific functions or operations, e.g. selecting or manipulating an object, an image or a displayed text element, setting a parameter value or selecting a range
- G06F3/04847—Interaction techniques to control parameter settings, e.g. interaction with sliders or dials
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/16—Sound input; Sound output
- G06F3/165—Management of the audio stream, e.g. setting of volume, audio stream path
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/02—Feature extraction for speech recognition; Selection of recognition unit
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
- G10L25/48—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use
- G10L25/51—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination
- G10L25/57—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination for processing of video signals
-
- G—PHYSICS
- G11—INFORMATION STORAGE
- G11B—INFORMATION STORAGE BASED ON RELATIVE MOVEMENT BETWEEN RECORD CARRIER AND TRANSDUCER
- G11B27/00—Editing; Indexing; Addressing; Timing or synchronising; Monitoring; Measuring tape travel
- G11B27/02—Editing, e.g. varying the order of information signals recorded on, or reproduced from, record carriers
- G11B27/031—Electronic editing of digitised analogue information signals, e.g. audio or video signals
-
- G—PHYSICS
- G11—INFORMATION STORAGE
- G11B—INFORMATION STORAGE BASED ON RELATIVE MOVEMENT BETWEEN RECORD CARRIER AND TRANSDUCER
- G11B27/00—Editing; Indexing; Addressing; Timing or synchronising; Monitoring; Measuring tape travel
- G11B27/10—Indexing; Addressing; Timing or synchronising; Measuring tape travel
- G11B27/34—Indicating arrangements
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/40—Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
- H04N21/43—Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
- H04N21/439—Processing of audio elementary streams
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/40—Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
- H04N21/43—Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
- H04N21/44—Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/40—Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
- H04N21/47—End-user applications
- H04N21/472—End-user interface for requesting content, additional data or services; End-user interface for interacting with content, e.g. for content reservation or setting reminders, for requesting event notification, for manipulating displayed content
- H04N21/47205—End-user interface for requesting content, additional data or services; End-user interface for interacting with content, e.g. for content reservation or setting reminders, for requesting event notification, for manipulating displayed content for manipulating displayed content, e.g. interacting with MPEG-4 objects, editing locally
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
- G10L25/48—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use
- G10L25/51—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination
Definitions
- the embodiments of the present disclosure relate to the technical field of audio data processing, and in particular, to a method, apparatus, device, and storage medium for processing audio data.
- the video application provided by the related art has a voice-changing function, through which a user can add his/her favorite voice-changing effect to a video.
- the way of adding a voice changing effect to a video in the related art cannot meet the needs of users.
- embodiments of the present disclosure provide a method, apparatus, device, and storage medium for processing audio data.
- a first aspect of the embodiments of the present disclosure provides a method for processing audio data, the method including: acquiring a first playback position on first audio data and a user's audition instruction for a first sound effect; adding the first sound effect to a first audio clip in the first audio data to generate sound effect audition data, and playing the sound effect audition data, wherein the first audio clip refers to the audio clip in the first audio data that starts at the first playback position; and, when receiving the user's first adding instruction for a second sound effect, adding the second sound effect to a second audio clip in the first audio data to obtain second audio data, wherein the first adding instruction includes information of a first added length of the second sound effect in the first audio data, and the second audio clip refers to the audio clip in the first audio data that starts at the first playback position and whose length is the first added length.
- a second aspect of the embodiments of the present disclosure provides an apparatus for processing audio data, the apparatus comprising:
- the first obtaining module is configured to obtain the first playback position on the first audio data and the user's audition instruction for the first sound effect.
- a first sound effect adding module configured to add a first sound effect to a first audio clip in the first audio data, generate sound effect audition data, and play the sound effect audition data, wherein the first audio clip refers to the audio clip in the first audio data that starts at the first playback position.
- the first receiving module is configured to receive a user's first adding instruction for the second sound effect, where the first adding instruction includes information of the first adding length of the second sound effect in the first audio data.
- a second sound effect adding module configured to add a second sound effect to a second audio clip in the first audio data to obtain second audio data, where the second audio clip refers to the audio clip that starts at the first playback position and whose length is the first added length.
- a third aspect of the embodiments of the present disclosure provides a terminal device, the terminal device includes a memory and a processor, wherein a computer program is stored in the memory, and when the computer program is executed by the processor, the method of the first aspect can be implemented.
- a fourth aspect of the embodiments of the present disclosure provides a computer-readable storage medium, where a computer program is stored in the storage medium, and when the computer program is executed by a processor, the method of the first aspect can be implemented.
- the first sound effect is added to the first audio segment in the first audio data to obtain sound effect audition data, and the sound effect audition data is then played. When the user's first adding instruction for the second sound effect is received, the second sound effect is added, according to the information of the first added length carried in the first adding instruction, to the second audio segment in the first audio data that starts at the first playback position, to obtain the second audio data.
- the user can select any position on the audio data to audition a sound effect and, based on the audition result, add a satisfactory sound effect to a certain audio segment of the audio data. Compared with approaches in which the result of adding a sound effect cannot be auditioned, the user can pick a satisfactory sound effect through audition and then add it to the audio data, which ensures that the sound effect finally added to the audio data is one the user is satisfied with, so the user does not need to add sound effects repeatedly because of dissatisfaction with an added one.
- a user can add a specific sound effect to a specific audio clip in the audio data, and can add multiple sound effects to multiple audio clips of the audio data respectively, which enriches the results of adding sound effects, makes adding sound effects more engaging, and enhances the user experience.
- FIG. 1 is a flowchart of a method for processing audio data provided by an embodiment of the present disclosure.
- FIGS. 2A-2B are schematic diagrams of an operation method of an operation interface provided by an embodiment of the present disclosure.
- FIG. 3A is a schematic diagram of a method for adding sound effects provided by an embodiment of the present disclosure.
- FIG. 3B is a schematic diagram of another sound effect adding method provided by an embodiment of the present disclosure.
- FIGS. 4A-4B are schematic diagrams of still another sound effect adding method provided by an embodiment of the present disclosure.
- FIGS. 5A-5B are schematic diagrams of still another sound effect adding method provided by an embodiment of the present disclosure.
- FIG. 6 is a schematic diagram of an audio smoothing method provided by an embodiment of the present disclosure.
- FIG. 7 is a flowchart of another audio data processing method provided by an embodiment of the present disclosure.
- FIG. 8 is a schematic diagram of a method for adding sound effects provided by an embodiment of the present disclosure.
- FIG. 9 is a schematic structural diagram of an apparatus for processing audio data provided by an embodiment of the present disclosure.
- FIG. 10 is a schematic structural diagram of a terminal device in an embodiment of the present disclosure.
- FIG. 1 is a flowchart of an audio data processing method provided by an embodiment of the present disclosure, and the method may be executed by a terminal device.
- the terminal device can be exemplarily understood as a device with audio processing capability and video playback capability, such as a mobile phone, a tablet computer, a notebook computer, a desktop computer, and a smart TV.
- the terminal device referred to in this embodiment may be equipped with a sound collection device (such as a microphone, but not limited to a microphone).
- a sound effect in this embodiment refers to a sound form that can be superimposed on audio data so that the original audio data produces a specific sound performance effect (for example, a trembling effect, an electronic sound effect, or a cartoon character effect, but not limited to these effects).
- the terminal device referred to in this embodiment may also be equipped with both a shooting device and a sound collection device. While the shooting device captures video images, the sound collection device captures audio data, producing a video with audio. The terminal device can use the method of this embodiment to add one or more sound effects to the audio data of that video. As shown in FIG. 1, the method provided by this embodiment includes the following steps:
- Step 101 Acquire a first playback position on the first audio data and a user's audition instruction for the first sound effect.
- the term "first audio data" in this embodiment is used only to distinguish among the multiple pieces of audio data involved in this embodiment, and has no other meaning.
- the first audio data in this embodiment may be understood as audio data of the video to be edited in the video editing interface or audio data recorded in other ways.
- the first audio data referred to in this embodiment may be audio data to which one or more sound effects have been added, or may also be audio data to which no sound effects have been added.
- when the first audio data is audio data to which a sound effect has been added, the sound effect may have been added to all of the first audio data, or to a certain audio segment of the first audio data.
- multiple sound effects may be understood as being added to different audio segments of the audio data, for example, one sound effect per audio segment.
- the first playback position in this embodiment can be understood as a position selected by the user on the first audio data, or, when the user does not select a position, as a default position on the first audio data, for example, the start position or the middle position of the audio data, but not limited to these positions.
- in this embodiment, the operation of acquiring the first playback position on the first audio data and the user's audition instruction for the first sound effect can be triggered by at least one of the following user operations: a preset operation by the user on the operation interface, or a preset voice command.
- FIGS. 2A-2B are schematic diagrams of an operation method of an operation interface provided by an embodiment of the present disclosure.
- the operation interface includes a display area 21, a first operation area 22, and a second operation area 23.
- the first operation area 22 includes a playback progress control, and by moving the position of the focus on the playback progress control, the playback position of the first audio data can be selected.
- the second operation area 23 includes at least one sound effect control, wherein each sound effect control corresponds to one sound effect.
- the interfaces shown in FIG. 2A and FIG. 2B may be specifically a video editing interface.
- the display area 21 is used to display the video image, and the playback progress control in the first operation area 22 is used to adjust the playback position of the video.
- the playback position is mapped to the first audio data to obtain the playback position of the first audio data (i.e., the first playback position in this embodiment).
- the interfaces shown in FIGS. 2A and 2B may specifically be audio editing interfaces, in which the display area 21 may be used to display the sound curve of the first audio data, and the focus position on the playback progress control indicates the playback position of the first audio data.
- the user can perform a click operation at any position on the playback progress axis (i.e., the line on which the focus lies in FIG. 2A).
- the interface shown in FIG. 2A may display prompt information prompting the user to select the starting position of the sound effect, which helps the user quickly select a position on the playback progress axis.
- the focus of the playback progress control will move to the position where the user clicked.
- the first playback position referred to in this embodiment can be obtained by mapping the position of the focus on the playback progress axis to the first audio data.
- the user can also drag the focus along the playback progress axis according to the prompt information; when the focus stops moving, the position of the focus on the playback progress axis is mapped to the first audio data to obtain the first playback position referred to in this embodiment.
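The mapping from the focus position on the playback progress axis to a playback position in the audio data can be sketched as follows. This is a minimal illustration that assumes a linear mapping; the text does not specify the mapping, and the function name `focus_to_playback_position` and its parameters (pixel coordinates, duration in seconds) are hypothetical.

```python
def focus_to_playback_position(focus_x: float, axis_width: float, duration_s: float) -> float:
    """Map a focus position on the progress axis to a playback time in seconds.

    Assumes a linear relationship between the axis and the audio duration.
    """
    if axis_width <= 0:
        raise ValueError("axis_width must be positive")
    # Clamp so that a click slightly outside the axis still yields a valid position.
    fraction = min(max(focus_x / axis_width, 0.0), 1.0)
    return fraction * duration_s


# A focus at the middle of a 300-unit axis over 20 seconds of audio.
first_playback_position = focus_to_playback_position(150, 300, 20.0)
```

The same function would serve both the click case and the drag case, since both end with a single focus position on the axis.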
- the user triggers the audition instruction by performing a preset operation on a sound effect control in the second operation area 23, where the preset operation may be, for example, one of clicking the sound effect control, clicking the sound effect control multiple times in succession, or sliding the sound effect control in a preset direction, but is not limited to these operations.
- the audition command of the sound effect corresponding to the sound effect control 2 is triggered, and the sound effect of the sound effect control 2 is the first sound effect in this embodiment.
- the user may continue to perform preset operations on other sound effect controls in the second operation area 23 to trigger the audition of the sound effects of other sound effect controls.
- the display interface of FIG. 2B can also be used to display a prompt message on how to audition sound effects, so as to teach the user how to audition audio effects quickly and reduce the difficulty of use for the user.
- Step 102 Add the first sound effect to the first audio clip in the first audio data, generate sound effect audition data, and play the sound effect audition data, wherein the first audio clip refers to the audio clip in the first audio data that starts at the first playback position.
- the methods for adding the first sound effect in this embodiment include the following:
- the first sound effect may be added to all audio data of the first audio data after the first playback position.
- FIG. 3A is a schematic diagram of a method for adding sound effects provided by an embodiment of the present disclosure.
- the first audio data is, for example, audio data with a length of m seconds.
- the first playback position is, for example, the position of the f-th second of the audio data.
- the first sound effect may be added to the audio clip from the f-th second to the m-th second.
- the first sound effect may be added to an audio clip in the first audio data that starts at the first playback position and has a preset duration (for example, 1 second, but not limited to 1 second).
- FIG. 3B is a schematic diagram of another sound effect adding method provided by an embodiment of the present disclosure.
- the first audio data is, for example, audio data with a length of G seconds.
- the first playback position is, for example, the position of the K-th second of the audio data.
- the first sound effect may be added to the audio data of the first audio data within the preset duration after the K-th second.
- although this embodiment describes only the above two methods for adding the first sound effect, this does not mean that they are the only methods for adding the first sound effect.
- in practice, the method for adding the first sound effect can be set as required and need not be limited to one or several fixed adding methods.
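The two audition ranges described above (everything after the first playback position, or a clip of preset duration) can be sketched as a single helper. This is an illustrative sketch: the function name `audition_clip` is hypothetical, and clipping the preset-duration range to the end of the audio is an assumption not stated in the text.

```python
from typing import Optional, Tuple


def audition_clip(start_s: float, total_s: float,
                  preset_s: Optional[float] = None) -> Tuple[float, float]:
    """Return (start, end) in seconds of the clip that receives the auditioned effect."""
    if not 0.0 <= start_s <= total_s:
        raise ValueError("start position must lie within the audio data")
    if preset_s is None:
        # FIG. 3A style: all audio after the first playback position.
        return start_s, total_s
    # FIG. 3B style: a clip of preset duration (assumed to stop at the end of the audio).
    return start_s, min(start_s + preset_s, total_s)
```

For example, `audition_clip(3.0, 10.0)` covers seconds 3 to 10, while `audition_clip(3.0, 10.0, 1.0)` covers only seconds 3 to 4.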
- the focus on the playback progress control moves synchronously on the playback progress axis.
- the position of the focus on the playback progress axis will be restored to the starting position of the sound effect selected by the user. That is to say, after the sound effect audition data is played, the playback position of the first audio data referred to in this embodiment will be restored to the first playback position referred to in this embodiment.
- this is only an implementation manner of this embodiment, but not all manners. In fact, in other implementation manners, after the sound effect audition data is played, the focus position may not be restored to the initial position selected by the user.
- This embodiment restores the playback position of the first audio data to the first playback position after the sound effect audition data is played, which can facilitate the user to audition multiple sound effects.
- when the user needs to audition sound effects other than the current sound effect, there is no need to re-select the starting position of the sound effect, which simplifies the user's operation and improves the user experience.
- when the user is satisfied with the current sound effect, there is no need to search again for the starting position used during the audition, which simplifies the user's operation and improves the accuracy of adding the sound effect.
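The behavior of restoring the playback position after each audition can be sketched as below. This is a simplified model under stated assumptions: the class `AuditionSession` and its fields are hypothetical, actual audio playback is omitted, and only the bookkeeping of positions is shown.

```python
from typing import List, Tuple


class AuditionSession:
    """Tracks the playback position while several sound effects are auditioned."""

    def __init__(self, first_playback_position: float) -> None:
        self.first_playback_position = first_playback_position
        self.position = first_playback_position
        self.history: List[Tuple[str, float, float]] = []

    def audition(self, effect: str, clip_length: float) -> None:
        start = self.position
        # While the preview plays, the playback position advances through the clip.
        self.position = start + clip_length
        self.history.append((effect, start, self.position))
        # After the preview finishes, return to the position the user selected.
        self.position = self.first_playback_position


session = AuditionSession(5.0)
session.audition("robot", 1.0)
session.audition("echo", 1.0)  # starts from the same position as the first audition
```

Because the position is restored after every preview, each auditioned effect is heard over the same stretch of audio, which is what makes the comparison between effects meaningful.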
- Step 103 Receive a first adding instruction for the second sound effect from the user, where the first adding instruction includes information of the first adding length of the second sound effect in the first audio data.
- the first adding instruction referred to in this embodiment may, for example, be triggered by a preset interface operation or a voice instruction, where the preset interface operation may be, for example, long-pressing the sound effect control, sliding it in a preset direction, or clicking it continuously, but is not limited to these operations.
- prompt information for prompting the user how to add sound effects to the first audio data may also be displayed. With this prompt information, the user can quickly learn how to add sound effects and smoothly add them to the audio data, which improves the fluency of the user's operation and the user experience.
- the information of the first added length referred to in this embodiment may exemplarily include the information of the cut-off position of the sound effect or the information of the duration of the sound effect.
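Since the first added length may be given either as a cut-off position or as a duration, both forms can be normalized to an end position. The sketch below is illustrative only; the function name `resolve_clip_end` and its keyword parameters are hypothetical, and clamping to the end of the audio is an assumption.

```python
from typing import Optional


def resolve_clip_end(start_s: float, total_s: float, *,
                     cutoff_s: Optional[float] = None,
                     duration_s: Optional[float] = None) -> float:
    """Return the end position (seconds) of the clip that receives the sound effect."""
    if (cutoff_s is None) == (duration_s is None):
        raise ValueError("provide exactly one of cutoff_s or duration_s")
    end = cutoff_s if cutoff_s is not None else start_s + duration_s
    if end < start_s:
        raise ValueError("the clip must not end before it starts")
    # Assumed behavior: never extend past the end of the audio data.
    return min(end, total_s)
```

With a start position of 2 seconds, a cut-off position of 6 seconds and a duration of 4 seconds describe the same clip.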
- Step 104 Add the second sound effect to the second audio clip in the first audio data to obtain the second audio data, wherein the second audio clip refers to the audio clip in the first audio data that starts at the first playback position and whose length is the first added length.
- the second sound effect referred to in this embodiment may be the first sound effect that the user has auditioned, or may be a sound effect that the user selected from the operation interface without auditioning it.
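Step 104 can be illustrated on raw samples: the effect is applied only to the second audio clip, and the rest of the first audio data is left unchanged. This is a sketch under assumptions: audio is modeled as a list of float samples, the effect is any length-preserving function, and the names `add_effect_to_clip` and `half_gain` are hypothetical stand-ins rather than anything defined in the text.

```python
from typing import Callable, List


def add_effect_to_clip(samples: List[float], sample_rate: int,
                       start_s: float, length_s: float,
                       effect: Callable[[List[float]], List[float]]) -> List[float]:
    """Return new audio in which only [start_s, start_s + length_s) has the effect."""
    begin = int(round(start_s * sample_rate))
    end = min(int(round((start_s + length_s) * sample_rate)), len(samples))
    processed = effect(samples[begin:end])
    if len(processed) != end - begin:
        raise ValueError("the effect must preserve the clip length")
    # The input list is not modified; a new list is built instead.
    return samples[:begin] + processed + samples[end:]


def half_gain(clip: List[float]) -> List[float]:
    """Placeholder effect that halves the amplitude of every sample."""
    return [0.5 * s for s in clip]


first_audio = [1.0] * 8  # 2 seconds at a toy sample rate of 4 Hz
second_audio = add_effect_to_clip(first_audio, 4, 0.5, 1.0, half_gain)
```

Returning a new list keeps the first audio data available, which matches the text's distinction between first and second audio data.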
- the method for adding the second sound effect may at least include the following:
- FIGS. 4A-4B are schematic diagrams of another sound effect adding method provided by an embodiment of the present disclosure.
- the playback position f is the starting position for adding sound effects (i.e., the first playback position) selected by the user. As shown in FIG.
- when the user long-presses sound effect control n, if the pressing time is detected to be longer than a first preset time (for example, 0.5 seconds, but not limited to 0.5 seconds), the sound effect corresponding to sound effect control n (i.e., the second sound effect) is added to the first audio data starting from the playback position f.
- the audio with the sound effect added is played from the playback position f, and the focus on the playback progress control moves along the playback progress axis in the direction of the arrow in FIG. 4A as the audio data plays. As shown in FIG.
- the user can trigger the adding operation of the sound effect by sliding the sound effect control in the preset direction.
- the sound effect corresponding to the sound effect control (i.e., the second sound effect)
- adding the sound effect is stopped, and the ending position of the sound effect on the first audio data is referred to here as the third playback position.
- the audio clip located between the first playback position and the third playback position on the first audio data is the second audio clip in this embodiment.
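The long-press interaction can be sketched as a small state holder: adding begins once the press exceeds the preset time, and the second audio clip runs from the first playback position to the position at which adding stops. The class `LongPressAdder` and its method names are hypothetical, and treating the release of the press as the stop condition is an assumption, since the text does not state what stops the adding.

```python
from typing import Optional, Tuple

LONG_PRESS_THRESHOLD_S = 0.5  # example value of the first preset time from the text


class LongPressAdder:
    """Derives the second audio clip from a long press on a sound effect control."""

    def __init__(self, first_playback_position: float) -> None:
        self.start = first_playback_position
        self.adding = False

    def on_hold(self, held_s: float) -> None:
        # Adding starts only once the press lasts longer than the preset time.
        if held_s > LONG_PRESS_THRESHOLD_S:
            self.adding = True

    def on_release(self, playback_position: float) -> Optional[Tuple[float, float]]:
        # Assumed stop condition: the user lifts the finger.
        if not self.adding:
            return None
        self.adding = False
        # (first playback position, third playback position)
        return self.start, playback_position
```

A press shorter than the threshold yields no clip, while a longer press released at 7.5 seconds from a start of 3.0 seconds yields the clip `(3.0, 7.5)`.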
- FIGS. 5A-5B are schematic diagrams of another sound effect adding method provided by an embodiment of the present disclosure.
- in FIG. 5A, the playback position f is the starting position (i.e., the first playback position) of the sound effect selected by the user.
- the sound effect corresponding to the sound effect control n (i.e., the second sound effect)
- the focus on the playback progress control moves along the playback progress axis in the direction of the arrow in FIG. 5A along with the playback of the first audio data.
- the audio segment of the first audio data from the position f to the position u is the second audio segment in this embodiment.
- the user can trigger the adding operation of the sound effect by continuously clicking the sound effect control.
- the sound effect corresponding to the sound effect control (i.e., the second sound effect)
- the second audio clip is added to the first audio data from the first playback position.
- the audio clip during the period from the first playback position to the stop of adding the sound effect is referred to as the second audio clip in this embodiment.
- this embodiment may continue to use the above method to obtain a second playback position on the second audio data and the user's second adding instruction for a third sound effect, and, according to the second adding instruction, add the third sound effect to a third audio clip in the second audio data to obtain third audio data. The third audio data obtained in this way is audio data that includes two kinds of sound effects. The third audio clip can be understood as the audio segment in the second audio data that starts at the second playback position and whose length is the second added length.
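Adding sound effects one after another can be modeled as a growing list of effect regions, where each adding instruction yields a new version of the audio description. This is only a bookkeeping sketch: the `Region` tuple, the function names, and the half-open interval convention are assumptions made for illustration.

```python
from typing import List, Tuple

Region = Tuple[str, float, float]  # (effect name, start, end) in seconds


def add_region(regions: List[Region], effect: str,
               start_s: float, length_s: float) -> List[Region]:
    """Return a new region list with one more sound effect, ordered by start time."""
    return sorted(regions + [(effect, start_s, start_s + length_s)],
                  key=lambda region: region[1])


def effects_at(regions: List[Region], t: float) -> List[str]:
    """List the effects active at time t (intervals treated as half-open)."""
    return [name for name, start, end in regions if start <= t < end]


second_audio_regions = add_region([], "robot", 2.0, 3.0)                   # second sound effect
third_audio_regions = add_region(second_audio_regions, "echo", 5.0, 2.0)   # third sound effect
```

Here the second effect covers seconds 2 to 5 and the third covers seconds 5 to 7, so the resulting description carries two kinds of sound effects on different audio clips.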
- FIG. 6 is a schematic diagram of an audio smoothing method provided by an embodiment of the present disclosure, and the width between two horizontal lines in FIG. 6 represents the volume. As shown in FIG. 6, it is assumed that the sound effect 61 in FIG. 6 is the second sound effect in this embodiment
- the sound effect 62 is the third sound effect in this embodiment
- the end position 63 of the second sound effect is the starting position of the third sound effect
- the second sound effect needs to be faded out, so that its volume starts to decrease at a position 64, which is a preset distance before the end position 63, and drops to 0 at the end position 63.
- the third sound effect can be faded in from the position 64, so that its volume gradually increases from the position 64 and reaches the set value at the end position 63.
- the smoothing method provided in FIG. 6 is only illustrative, and not the only limitation of this embodiment. In fact, the related art has already provided a variety of audio smoothing methods, and this embodiment can adopt any one of them as required.
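- One possible reading of the smoothing in FIG. 6 is a pair of gain envelopes over the span from position 64 to position 63. The sketch below assumes linear ramps, integer sample positions, and that the second sound effect plays at the set value before fading; the embodiment fixes none of these, and `fade_envelopes` is a hypothetical name.

```python
from typing import List, Tuple


def fade_envelopes(pos_64: int, pos_63: int, set_value: float = 1.0) -> Tuple[List[float], List[float]]:
    """Return (fade_out, fade_in) gains for every position from pos_64 to pos_63 inclusive.

    fade_out: volume of the second sound effect, falling to 0 at pos_63.
    fade_in:  volume of the third sound effect, rising to set_value at pos_63.
    """
    if pos_63 <= pos_64:
        raise ValueError("position 64 must come before end position 63")
    span = pos_63 - pos_64
    positions = range(pos_64, pos_63 + 1)
    fade_out = [set_value * (pos_63 - p) / span for p in positions]
    fade_in = [set_value * (p - pos_64) / span for p in positions]
    return fade_out, fade_in


fade_out, fade_in = fade_envelopes(pos_64=0, pos_63=4)
```

- With linear ramps the two gains sum to the set value at every position, so the combined level stays constant across the transition; other curves would trade this property for a different perceived loudness.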
- the first sound effect is added to the first audio segment in the first audio data to obtain sound effect audition data, and the sound effect audition data is then played. When the user's first adding instruction for the second sound effect is received, the second sound effect is added, according to the information of the first added length carried in the first adding instruction, to the second audio segment of the first audio data whose starting position is the first playback position, and the second audio data is obtained.
- the user can select any position on the audio data to audition a sound effect and, according to the audition result, add a satisfactory sound effect to a certain audio segment of the audio data. Compared with the case where the result of adding a sound effect cannot be auditioned, the user can select a satisfactory sound effect through audition and then add it to the audio data. This ensures that the sound effect finally added to the audio data is one the user is satisfied with, and avoids repeated re-adding because the user is dissatisfied with an added sound effect, which simplifies the user's operation and improves the user experience.
- a user can add a specific sound effect to a specific audio clip in the audio data, and can add multiple sound effects to the multiple audio clips of the audio data correspondingly, which improves the fun of adding sound effects and enhances the user experience.
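- The audition-then-add flow summarized above can be sketched as a small session object. This is only an illustration under stated assumptions: `EditSession`, its method names, sample-index positions, and auditioning through to the end of the audio are hypothetical choices, and actual playback is replaced by a comment.

```python
from typing import Callable, List

Samples = List[float]
Effect = Callable[[Samples], Samples]  # assumed to preserve segment length


class EditSession:
    """Hypothetical editing session holding audio samples and a playback position."""

    def __init__(self, audio: Samples) -> None:
        self.audio = list(audio)
        self.position = 0

    def _apply(self, start: int, end: int, effect: Effect) -> Samples:
        # Build a new list so the stored audio is never modified implicitly.
        return self.audio[:start] + effect(self.audio[start:end]) + self.audio[end:]

    def audition(self, start: int, effect: Effect) -> Samples:
        """Generate audition data from `start` onward, then restore the position."""
        self.position = start
        preview = self._apply(start, len(self.audio), effect)
        # A real player would play preview[start:] here, advancing the position.
        self.position = len(self.audio)
        # After the audition data has been played, return to the first playback position.
        self.position = start
        return preview

    def add(self, effect: Effect, length: int) -> Samples:
        """Commit `effect` to [position, position + length) and return the new audio."""
        end = min(self.position + length, len(self.audio))
        self.audio = self._apply(self.position, end, effect)
        return self.audio


def halve(segment: Samples) -> Samples:
    """Stand-in sound effect: halve the amplitude."""
    return [s * 0.5 for s in segment]


session = EditSession([1.0] * 6)
preview = session.audition(2, halve)   # audition only; stored audio is untouched
untouched = list(session.audio)
second_audio = session.add(halve, 3)   # adding instruction with added length 3
```

- Keeping the audition data separate from the stored audio is what lets the user try several sound effects at the same position before committing one.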
- FIG. 7 is a flowchart of another audio data processing method provided by an embodiment of the present disclosure. As shown in FIG. 7 , the above-mentioned method for adding a first sound effect includes the following steps:
- Step 701 Identify the target sound from the first audio segment based on a preset sound recognition model.
- Step 702 Add the first sound effect to the target sound in the first audio clip to generate audition data of the first sound effect on the first audio clip.
- the target sound referred to in this embodiment includes at least one of the following: human voice, animal sound, vehicle sound, musical instrument sound, background sound, and foreground sound.
- the voice recognition model in this embodiment can be exemplarily understood as a model that is pre-trained by a model training method and can identify and extract one or more of the above target voices.
- the model may be at least one of the following models, but is not limited to the following models: a support vector machine model, a deep neural network model, a logistic regression model, and the like.
- this embodiment may provide the user with an editing interface, which may include at least one sound option.
- the user can select one or more of these sound options as the target sound; the sound recognition model for recognizing the target sound is then called to identify the target sound in the first audio segment and obtain the positions where the target sound is located.
- FIG. 8 is a schematic diagram of a method for adding sound effects provided by an embodiment of the present disclosure.
- the horizontal lines in FIG. 8 represent audio data.
- the positions of the target sounds identified based on the sound recognition model are q and w.
- the method for adding sound effects in this embodiment may also be applied to the process of adding the second sound effect in the above-mentioned embodiment; that is, the second sound effect can be added to the target sound in the above-mentioned second audio segment based on the method in this embodiment.
- the specific adding method may refer to the method in this embodiment, which will not be repeated here.
- the target sound in the audio clip is identified by the preset sound recognition model, and the sound effect is added to the target sound of the audio clip, which can make the effect of adding the sound effect more interesting and improve the user experience.
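- Steps 701 and 702 can be sketched as "find the regions of the target sound, then apply the sound effect only inside those regions". In the sketch below, `detect_regions` is a deliberately trivial amplitude threshold standing in for the preset sound recognition model (which the embodiment describes as, for example, a support vector machine or deep neural network); it and `apply_to_targets` are hypothetical names, and positions are assumed to be sample indices.

```python
from typing import Callable, List, Tuple

Samples = List[float]
Effect = Callable[[Samples], Samples]  # assumed to preserve segment length
Region = Tuple[int, int]               # half-open interval [start, end)


def detect_regions(audio: Samples, threshold: float) -> List[Region]:
    """Toy stand-in for the recognition model: runs of samples at or above a threshold."""
    regions: List[Region] = []
    start = None
    for i, sample in enumerate(audio):
        if abs(sample) >= threshold:
            if start is None:
                start = i
        elif start is not None:
            regions.append((start, i))
            start = None
    if start is not None:
        regions.append((start, len(audio)))
    return regions


def apply_to_targets(audio: Samples, seg_start: int, seg_len: int,
                     regions: List[Region], effect: Effect) -> Samples:
    """Apply `effect` only where a target region overlaps the chosen audio segment."""
    seg_end = min(seg_start + seg_len, len(audio))
    out = list(audio)
    for r_start, r_end in regions:
        lo, hi = max(r_start, seg_start), min(r_end, seg_end)
        if lo < hi:  # skip regions that fall outside the segment
            out[lo:hi] = effect(out[lo:hi])
    return out


audio = [0.0, 0.0, 1.0, 1.0, 0.0, 0.0, 1.0, 0.0]
regions = detect_regions(audio, threshold=0.5)
audition = apply_to_targets(audio, seg_start=1, seg_len=5, regions=regions,
                            effect=lambda seg: [s * 0.5 for s in seg])
```

- The second detected region lies outside the chosen segment, so it is left unchanged; only target sound inside the segment receives the effect.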
- FIG. 9 is a schematic structural diagram of an apparatus for processing audio data provided by an embodiment of the present disclosure, and the processing apparatus may be understood as the above-mentioned terminal device or a part of functional modules in the above-mentioned terminal device. As shown in FIG. 9, the processing device 90 includes:
- the first obtaining module 91 is configured to obtain the first playback position on the first audio data and the user's audition instruction for the first sound effect.
- the first sound effect adding module 92 is configured to add the first sound effect to the first audio segment in the first audio data, generate sound effect audition data, and play the sound effect audition data, wherein the first audio segment refers to an audio segment starting from the first playback position in the first audio data.
- the first receiving module 93 is configured to receive the user's first adding instruction for the second sound effect, where the first adding instruction includes the information of the first added length of the second sound effect in the first audio data.
- the second sound effect adding module 94 is configured to add the second sound effect to a second audio segment in the first audio data to obtain second audio data, wherein the second audio segment refers to the audio segment that takes the first playback position as its starting position and whose length is the first added length.
- the first audio data is audio data of the video to be edited in the video editing interface.
- the first obtaining module 91 includes:
- a first display unit configured to display the video editing interface, where the video editing interface includes a video playback progress control and a sound effect control for the first sound effect.
- An acquiring unit configured to acquire the first playback position selected by the user through the playback progress control, and the audition instruction triggered by the user through the sound effect control.
- the processing device 90 may further include:
- the restoring module is configured to restore the playing position of the first audio data to the first playing position after the sound effect audition data is played.
- the first sound effect adding module 92 includes:
- the recognition unit is configured to recognize the target sound from the first audio segment based on a preset sound recognition model.
- a sound effect adding unit configured to add the first sound effect to the target sound in the first audio clip, and generate audition data of the first sound effect on the first audio clip.
- the processing device 90 may further include:
- a display module for displaying at least one sound option.
- a determination module configured to determine a sound selected by the user from the at least one sound option as a target sound.
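- the target sound includes at least one of the following: human voice, animal sound, vehicle sound, musical instrument sound, background sound, and foreground sound.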
- the second sound effect adding module 94 is configured to: add the second sound effect to the target sound in the second audio segment.
- the processing device 90 may further include:
- a second acquiring module configured to acquire the second playback position on the second audio data and the user's second adding instruction for the third sound effect, where the second adding instruction includes the information of the second added length of the third sound effect on the second audio data.
- a third sound effect adding module configured to add the third sound effect to a third audio segment in the second audio data to obtain third audio data, where the third audio segment refers to the audio segment in the second audio data that takes the second playback position as its starting position and whose length is the second added length.
- the processing device 90 may further include:
- a smoothing processing module configured to perform fade-out processing on the second sound effect and fade-in processing on the third sound effect when the end position of the second sound effect and the second playback position are two consecutive playback positions on the third audio data.
- the apparatus provided in this embodiment can execute the method in any of the foregoing embodiments in FIG. 1 to FIG. 8 , and the execution manner and beneficial effects thereof are similar, and details are not described herein again.
- An embodiment of the present disclosure further provides a terminal device. The terminal device includes a processor and a memory, wherein a computer program is stored in the memory, and when the computer program is executed by the processor, the method of any one of the above embodiments in FIG. 1 to FIG. 8 can be implemented.
- FIG. 10 is a schematic structural diagram of a terminal device in an embodiment of the present disclosure.
- the terminal device 1000 in the embodiment of the present disclosure may include, but is not limited to, mobile terminals such as a mobile phone, a notebook computer, a digital broadcast receiver, a PDA (personal digital assistant), a PAD (tablet computer), a PMP (portable multimedia player), and a vehicle-mounted terminal (for example, a car navigation terminal), as well as stationary terminals such as a digital TV and a desktop computer.
- the terminal device shown in FIG. 10 is only an example, and should not impose any limitations on the functions and scope of use of the embodiments of the present disclosure.
- a terminal device 1000 may include a processing device (e.g., a central processing unit, a graphics processor, etc.) 1001, which may perform various appropriate operations and processes according to a program stored in a read-only memory (ROM) 1002 or a program loaded from a storage device 1008 into a random access memory (RAM) 1003. The RAM 1003 also stores various programs and data required for the operation of the terminal device 1000.
- the processing device 1001, the ROM 1002, and the RAM 1003 are connected to each other through a bus 1004.
- An input/output (I/O) interface 1005 is also connected to the bus 1004 .
- the following devices can be connected to the I/O interface 1005: input devices 1006 including, for example, a touch screen, touchpad, keyboard, mouse, camera, microphone, accelerometer, and gyroscope; output devices 1007 including, for example, a liquid crystal display (LCD), speakers, and a vibrator; storage devices 1008 including, for example, a magnetic tape and a hard disk; and a communication device 1009.
- the communication device 1009 may allow the terminal device 1000 to communicate with other devices, by wireless or wired means, to exchange data.
- although FIG. 10 shows the terminal device 1000 having various devices, it should be understood that not all of the illustrated devices are required to be implemented or provided. More or fewer devices may alternatively be implemented or provided.
- embodiments of the present disclosure include a computer program product comprising a computer program carried on a non-transitory computer readable medium, the computer program containing program code for performing the method illustrated in the flowchart.
- the computer program may be downloaded and installed from the network via the communication device 1009, or from the storage device 1008, or from the ROM 1002.
- when the computer program is executed by the processing device 1001, the above-mentioned functions defined in the methods of the embodiments of the present disclosure are executed.
- the computer-readable medium mentioned above in the present disclosure may be a computer-readable signal medium or a computer-readable storage medium, or any combination of the above two.
- the computer-readable storage medium can be, for example, but not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus or device, or a combination of any of the above. More specific examples of computer readable storage media may include, but are not limited to, electrical connections with one or more wires, portable computer disks, hard disks, random access memory (RAM), read only memory (ROM), erasable Programmable read only memory (EPROM or flash memory), fiber optics, portable compact disk read only memory (CD-ROM), optical storage devices, magnetic storage devices, or any suitable combination of the foregoing.
- a computer-readable storage medium may be any tangible medium that contains or stores a program that can be used by or in conjunction with an instruction execution system, apparatus, or device.
- a computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave with computer-readable program code embodied thereon. Such propagated data signals may take a variety of forms, including but not limited to electromagnetic signals, optical signals, or any suitable combination of the foregoing.
- a computer-readable signal medium can also be any computer-readable medium other than a computer-readable storage medium that can transmit, propagate, or transport the program for use by or in connection with the instruction execution system, apparatus, or device.
- Program code embodied on a computer readable medium may be transmitted using any suitable medium including, but not limited to, electrical wire, optical fiber cable, RF (radio frequency), etc., or any suitable combination of the foregoing.
- clients and servers can communicate using any currently known or future-developed network protocol such as HTTP (HyperText Transfer Protocol), and can be interconnected with digital data communication (e.g., a communication network) in any form or medium.
- Examples of communication networks include local area networks ("LAN"), wide area networks ("WAN"), internetworks (e.g., the Internet), and peer-to-peer networks (e.g., ad hoc peer-to-peer networks), as well as any currently known or future-developed network.
- the above-mentioned computer-readable medium may be included in the above-mentioned terminal device; or may exist independently without being assembled into the terminal device.
- the above-mentioned computer-readable medium carries one or more programs, and when the one or more programs are executed by the terminal device, the terminal device: obtains the first playback position on the first audio data and the user's audition instruction for the first sound effect; adds the first sound effect to the first audio clip in the first audio data, generates the sound effect audition data, and plays the sound effect audition data, wherein the first audio clip refers to the audio clip in the first audio data that takes the first playback position as its starting position; and then, when receiving the user's first adding instruction for the second sound effect, adds the second sound effect to the second audio clip in the first audio data to obtain the second audio data, wherein the first adding instruction includes the information of the first added length of the second sound effect in the first audio data, and the second audio clip refers to the audio clip in the first audio data that takes the first playback position as its starting position and whose length is the first added length.
- Computer program code for performing operations of the present disclosure may be written in one or more programming languages, including but not limited to object-oriented programming languages such as Java, Smalltalk, and C++, and conventional procedural programming languages such as the "C" language or similar programming languages.
- the program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server.
- the remote computer may be connected to the user's computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or may be connected to an external computer (e.g., through the Internet using an Internet service provider).
- each block in the flowchart or block diagrams may represent a module, segment, or portion of code that contains one or more executable instructions for implementing the specified logical functions.
- the functions noted in the blocks may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved.
- each block of the block diagrams and/or flowchart illustrations, and combinations of blocks in the block diagrams and/or flowchart illustrations, can be implemented in dedicated hardware-based systems that perform the specified functions or operations, or can be implemented in a combination of dedicated hardware and computer instructions.
- the units involved in the embodiments of the present disclosure may be implemented in software or in hardware. In some cases, the name of a unit does not constitute a limitation on the unit itself.
- exemplary types of hardware logic components include: Field Programmable Gate Arrays (FPGAs), Application Specific Integrated Circuits (ASICs), Application Specific Standard Products (ASSPs), Systems on Chips (SOCs), Complex Programmable Logical Devices (CPLDs) and more.
- a machine-readable medium may be a tangible medium that may contain or store a program for use by or in connection with the instruction execution system, apparatus or device.
- the machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium.
- Machine-readable media may include, but are not limited to, electronic, magnetic, optical, electromagnetic, infrared, or semiconductor systems, apparatuses, or devices, or any suitable combination of the foregoing.
- machine-readable storage media would include one or more wire-based electrical connections, portable computer disks, hard disks, random access memory (RAM), read only memory (ROM), erasable programmable read only memory (EPROM or flash memory), fiber optics, compact disk read only memory (CD-ROM), optical storage, magnetic storage, or any suitable combination of the foregoing.
- An embodiment of the present disclosure further provides a computer-readable storage medium, where a computer program is stored in the storage medium, and when the computer program is executed by a processor, the method of any of the foregoing embodiments in FIG. 1 to FIG. 8 can be implemented, The implementation manner and beneficial effects thereof are similar, and will not be repeated here.
Landscapes
- Engineering & Computer Science (AREA)
- Multimedia (AREA)
- Signal Processing (AREA)
- Human Computer Interaction (AREA)
- Theoretical Computer Science (AREA)
- General Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Computational Linguistics (AREA)
- Acoustics & Sound (AREA)
- Databases & Information Systems (AREA)
- General Health & Medical Sciences (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Computer Security & Cryptography (AREA)
- Reverberation, Karaoke And Other Acoustics (AREA)
- User Interface Of Digital Computer (AREA)
- Two-Way Televisions, Distribution Of Moving Picture Or The Like (AREA)
Claims (12)
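- A method for processing audio data, characterized by comprising: obtaining a first playback position on first audio data and a user's audition instruction for a first sound effect; adding the first sound effect to a first audio segment in the first audio data to generate sound effect audition data, and playing the sound effect audition data, wherein the first audio segment refers to an audio segment in the first audio data that takes the first playback position as its starting position; receiving the user's first adding instruction for a second sound effect, the first adding instruction including information of a first added length of the second sound effect in the first audio data; and adding the second sound effect to a second audio segment in the first audio data to obtain second audio data, wherein the second audio segment refers to an audio segment that takes the first playback position as its starting position and whose length is the first added length.
- The method according to claim 1, characterized in that the first audio data is audio data of a video to be edited in a video editing interface.
- The method according to claim 2, characterized in that obtaining the first playback position on the first audio data and the user's audition instruction for the first sound effect comprises: displaying the video editing interface, the video editing interface including a playback progress control of the video and a sound effect control of the first sound effect; and obtaining the first playback position selected by the user through the playback progress control, and the audition instruction triggered by the user through the sound effect control.
- The method according to any one of claims 1-3, characterized in that the method further comprises: after the sound effect audition data has finished playing, restoring the playback position of the first audio data to the first playback position.
- The method according to claim 1, characterized in that adding the first sound effect to the first audio segment in the first audio data to generate the sound effect audition data comprises: identifying a target sound from the first audio segment based on a preset sound recognition model; and adding the first sound effect to the target sound in the first audio segment to generate audition data of the first sound effect on the first audio segment.
- The method according to claim 5, characterized in that, before identifying the target sound from the first audio segment based on the preset sound recognition model, the method further comprises: displaying at least one sound option; and determining the sound selected by the user from the at least one sound option as the target sound.
- The method according to claim 5 or 6, characterized in that adding the second sound effect to the second audio segment in the first audio data comprises: adding the second sound effect to the target sound in the second audio segment.
- The method according to claim 1, characterized in that, after adding the second sound effect to the second audio segment in the first audio data to obtain the second audio data, the method further comprises: obtaining a second playback position on the second audio data and the user's second adding instruction for a third sound effect, the second adding instruction including information of a second added length of the third sound effect on the second audio data; and adding the third sound effect to a third audio segment in the second audio data to obtain third audio data, wherein the third audio segment refers to an audio segment in the second audio data that takes the second playback position as its starting position and whose length is the second added length.
- The method according to claim 8, characterized in that the method further comprises: if the end position of the second sound effect and the second playback position are two consecutive playback positions on the third audio data, performing fade-out processing on the second sound effect and fade-in processing on the third sound effect.
- An apparatus for processing audio data, characterized by comprising: a first obtaining module, configured to obtain a first playback position on first audio data and a user's audition instruction for a first sound effect; a first sound effect adding module, configured to add the first sound effect to a first audio segment in the first audio data to generate sound effect audition data, and to play the sound effect audition data, wherein the first audio segment refers to an audio segment in the first audio data that takes the first playback position as its starting position; a first receiving module, configured to receive the user's first adding instruction for a second sound effect, the first adding instruction including information of a first added length of the second sound effect in the first audio data; and a second sound effect adding module, configured to add the second sound effect to a second audio segment in the first audio data to obtain second audio data, wherein the second audio segment refers to an audio segment that takes the first playback position as its starting position and whose length is the first added length.
- A terminal device, characterized by comprising: a memory and a processor, wherein a computer program is stored in the memory, and when the computer program is executed by the processor, the method according to any one of claims 1-9 is implemented.
- A computer-readable storage medium, characterized in that a computer program is stored in the storage medium, and when the computer program is executed by a processor, the method according to any one of claims 1-9 is implemented.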
Priority Applications (3)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
EP21860468.4A EP4192021A4 (en) | 2020-08-26 | 2021-08-26 | AUDIO DATA PROCESSING METHOD AND DEVICE AS WELL AS DEVICE AND STORAGE MEDIUM |
US18/023,286 US20230307004A1 (en) | 2020-08-26 | 2021-08-26 | Audio data processing method and apparatus, and device and storage medium |
JP2023513340A JP2023538943A (ja) | 2020-08-26 | 2021-08-26 | Audio data processing method, apparatus, device, and storage medium |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010873112.3A CN112165647B (zh) | 2020-08-26 | 2020-08-26 | Audio data processing method and apparatus, and device and storage medium |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2022042634A1 (zh) | 2022-03-03 |
Family
ID=73860283
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2021/114706 WO2022042634A1 (zh) | 2020-08-26 | 2021-08-26 | Audio data processing method and apparatus, and device and storage medium |
Country Status (5)
Country | Link |
---|---|
US (1) | US20230307004A1 (zh) |
EP (1) | EP4192021A4 (zh) |
JP (1) | JP2023538943A (zh) |
CN (1) | CN112165647B (zh) |
WO (1) | WO2022042634A1 (zh) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN115086708A (zh) * | 2022-06-06 | 2022-09-20 | 北京奇艺世纪科技有限公司 | Video playback method and apparatus, electronic device, and storage medium |
Families Citing this family (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
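CN112165647B (zh) * | 2020-08-26 | 2022-06-17 | 北京字节跳动网络技术有限公司 | Audio data processing method and apparatus, and device and storage medium |
CN113838451B (zh) * | 2021-08-17 | 2022-09-23 | 北京百度网讯科技有限公司 | Speech processing and model training method, apparatus, device, and storage medium |
CN113891151A (zh) * | 2021-09-28 | 2022-01-04 | 北京字跳网络技术有限公司 | Audio processing method and apparatus, electronic device, and storage medium |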
Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
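CN102724423A (zh) * | 2011-09-30 | 2012-10-10 | 新奥特(北京)视频技术有限公司 | Method and apparatus for segmented processing of material |
WO2017013762A1 (ja) * | 2015-07-22 | 2017-01-26 | Pioneer DJ株式会社 | Sound processing device and sound processing method |
CN106559572A (zh) * | 2016-11-15 | 2017-04-05 | 努比亚技术有限公司 | Noise localization method and apparatus |
CN108965757A (zh) * | 2018-08-02 | 2018-12-07 | 广州酷狗计算机科技有限公司 | Video recording method, apparatus, terminal, and storage medium |
CN109346111A (zh) * | 2018-10-11 | 2019-02-15 | 广州酷狗计算机科技有限公司 | Data processing method, apparatus, terminal, and storage medium |
CN111142838A (zh) * | 2019-12-30 | 2020-05-12 | 广州酷狗计算机科技有限公司 | Audio playback method and apparatus, computer device, and storage medium |
CN112165647A (zh) * | 2020-08-26 | 2021-01-01 | 北京字节跳动网络技术有限公司 | Audio data processing method and apparatus, and device and storage medium |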
Family Cites Families (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US9728225B2 (en) * | 2013-03-12 | 2017-08-08 | Cyberlink Corp. | Systems and methods for viewing instant updates of an audio waveform with an applied effect |
CN106155623B (zh) * | 2016-06-16 | 2017-11-14 | 广东欧珀移动通信有限公司 | Sound effect configuration method, system, and related device |
US10062367B1 (en) * | 2017-07-14 | 2018-08-28 | Music Tribe Global Brands Ltd. | Vocal effects control system |
CN109754825B (zh) * | 2018-12-26 | 2021-02-19 | 广州方硅信息技术有限公司 | Audio processing method, apparatus, device, and computer-readable storage medium |
WO2020151008A1 (en) * | 2019-01-25 | 2020-07-30 | Microsoft Technology Licensing, Llc | Automatically adding sound effects into audio files |
CN110377212B (zh) * | 2019-06-28 | 2021-03-16 | 上海元笛软件有限公司 | Method and apparatus for triggering display through audio, computer device, and storage medium |
2020
- 2020-08-26 CN CN202010873112.3A patent/CN112165647B/zh active Active
2021
- 2021-08-26 WO PCT/CN2021/114706 patent/WO2022042634A1/zh active Application Filing
- 2021-08-26 EP EP21860468.4A patent/EP4192021A4/en active Pending
- 2021-08-26 US US18/023,286 patent/US20230307004A1/en active Pending
- 2021-08-26 JP JP2023513340A patent/JP2023538943A/ja active Pending
Non-Patent Citations (1)
Title |
---|
See also references of EP4192021A4 |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
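CN115086708A (zh) * | 2022-06-06 | 2022-09-20 | 北京奇艺世纪科技有限公司 | Video playback method and apparatus, electronic device, and storage medium |
CN115086708B (zh) * | 2022-06-06 | 2024-03-08 | 北京奇艺世纪科技有限公司 | Video playback method and apparatus, electronic device, and storage medium |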
Also Published As
Publication number | Publication date |
---|---|
CN112165647A (zh) | 2021-01-01 |
CN112165647B (zh) | 2022-06-17 |
EP4192021A1 (en) | 2023-06-07 |
JP2023538943A (ja) | 2023-09-12 |
EP4192021A4 (en) | 2024-01-03 |
US20230307004A1 (en) | 2023-09-28 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
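WO2022042634A1 (zh) | Audio data processing method and apparatus, and device and storage medium | |
JP7181320B2 (ja) | Method, apparatus, terminal, and medium for selecting background music and shooting video | |
JP7387891B2 (ja) | Video file generation method, apparatus, terminal, and storage medium | |
KR20220103110A (ko) | Video generation apparatus and method, electronic device, and computer-readable medium | |
US11670339B2 (en) | Video acquisition method and device, terminal and medium | |
CN110677711A (zh) | Video soundtrack method and apparatus, electronic device, and computer-readable medium | |
CN111970571B (zh) | Video production method, apparatus, device, and storage medium | |
CN109346111B (zh) | Data processing method, apparatus, terminal, and storage medium | |
CN111833460B (zh) | Augmented reality image processing method and apparatus, electronic device, and storage medium | |
CN113257218B (zh) | Speech synthesis method and apparatus, electronic device, and storage medium | |
CN109600559B (zh) | Video special effect adding method and apparatus, terminal device, and storage medium | |
WO2021057740A1 (zh) | Video generation method and apparatus, electronic device, and computer-readable medium | |
US11272136B2 (en) | Method and device for processing multimedia information, electronic equipment and computer-readable storage medium | |
CN112153460A (zh) | Video soundtrack method and apparatus, electronic device, and storage medium | |
WO2023051293A1 (zh) | Audio processing method and apparatus, electronic device, and storage medium | |
EP4322537A1 (en) | Video processing method and apparatus, electronic device, and storage medium | |
WO2022160603A1 (zh) | Song recommendation method and apparatus, electronic device, and storage medium | |
WO2023011318A1 (zh) | Media file processing method, apparatus, device, readable storage medium, and product | |
CN110798327A (zh) | Message processing method, device, and storage medium | |
CN110413834B (zh) | Voice comment modification method, system, medium, and electronic device | |
WO2024037480A1 (zh) | Interaction method and apparatus, electronic device, and storage medium | |
CN111312280B (zh) | Method and apparatus for controlling speech | |
US20230403413A1 (en) | Method and apparatus for displaying online interaction, electronic device and computer readable medium | |
WO2023040633A1 (zh) | Video generation method and apparatus, terminal device, and storage medium | |
CN115756258A (zh) | Audio special effect editing method, apparatus, device, and storage medium |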
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| | 121 | Ep: the EPO has been informed by WIPO that EP was designated in this application | Ref document number: 21860468; Country of ref document: EP; Kind code of ref document: A1 |
| | WWE | WIPO information: entry into national phase | Ref document number: 202327011688; Country of ref document: IN |
| | ENP | Entry into the national phase | Ref document number: 2023513340; Country of ref document: JP; Kind code of ref document: A |
| | WWE | WIPO information: entry into national phase | Ref document number: 2021860468; Country of ref document: EP |
| | ENP | Entry into the national phase | Ref document number: 2021860468; Country of ref document: EP; Effective date: 20230227 |
| | NENP | Non-entry into the national phase | Ref country code: DE |