CN111179970B - Audio and video processing method, synthesis device, electronic equipment and storage medium - Google Patents


Info

Publication number
CN111179970B
CN111179970B (application number CN201910713206.1A)
Authority
CN
China
Prior art keywords
audio data
processed
specific
audio
video
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910713206.1A
Other languages
Chinese (zh)
Other versions
CN111179970A (en)
Inventor
Wang Sheng (王胜)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tencent Technology Shenzhen Co Ltd
Original Assignee
Tencent Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tencent Technology Shenzhen Co Ltd filed Critical Tencent Technology Shenzhen Co Ltd
Priority to CN201910713206.1A priority Critical patent/CN111179970B/en
Publication of CN111179970A publication Critical patent/CN111179970A/en
Application granted granted Critical
Publication of CN111179970B publication Critical patent/CN111179970B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G - PHYSICS
    • G10 - MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L - SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 - Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/48 - Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use
    • G10L25/51 - Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination
    • G10L25/57 - Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination for processing of video signals
    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04N - PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N5/00 - Details of television systems
    • H04N5/222 - Studio circuitry; Studio devices; Studio equipment
    • H04N5/262 - Studio circuits, e.g. for mixing, switching-over, change of character of image, other special effects; Cameras specially adapted for the electronic generation of special effects
    • H04N5/265 - Mixing
    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04N - PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N5/00 - Details of television systems
    • H04N5/76 - Television signal recording
    • H04N5/91 - Television signal processing therefor
    • H04N5/92 - Transformation of the television signal for recording, e.g. modulation, frequency changing; Inverse transformation for playback
    • H04N5/9201 - Transformation of the television signal for recording, e.g. modulation, frequency changing; Inverse transformation for playback involving the multiplexing of an additional signal and the video signal
    • H04N5/9202 - Transformation of the television signal for recording, e.g. modulation, frequency changing; Inverse transformation for playback involving the multiplexing of an additional signal and the video signal, the additional signal being a sound signal

Landscapes

  • Engineering & Computer Science (AREA)
  • Signal Processing (AREA)
  • Multimedia (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Television Signal Processing For Recording (AREA)

Abstract

The invention discloses an audio and video processing method, a synthesis method, an apparatus, electronic equipment and a storage medium. The audio and video processing method includes: acquiring audio data to be processed, where the audio data to be processed is generated by a microphone collecting source audio data that is output by a loudspeaker and carries first specific audio data; determining the position of second specific audio data in the audio data to be processed, where the second specific audio data is the first specific audio data carrying background noise; and removing the second specific audio data and the delayed audio data from the audio data to be processed based on that position, so as to obtain target audio. The provided methods, apparatus, electronic equipment and storage medium solve the prior-art problem of discontinuous background music during audio and video synthesis in multi-segment short video recording.

Description

Audio and video processing method, synthesis device, electronic equipment and storage medium
Technical Field
The present invention relates to the field of computer technologies, and in particular, to an audio and video processing method, a synthesis method, an apparatus, an electronic device, and a storage medium.
Background
In multi-segment short video recording, a user records several different short video segments and then synthesizes them into a complete video. According to the actual needs of the application scene, the user may also configure corresponding audio data for the different segments, so that the audio is added to the complete video as background music when the segments are synthesized, enhancing the practicability and interest of the complete video.
However, the inventors realized that during multi-segment recording there are pauses between the recordings of different segments, which introduce a delay and cause the background music added when synthesizing the complete video to be discontinuous.
As can be seen from the above, the problem of discontinuous background music during audio and video synthesis in multi-segment short video recording remains to be solved.
Disclosure of Invention
In order to solve the problem of discontinuous background music during audio and video synthesis in a short video multi-segment recording process in the related art, embodiments of the present invention provide an audio and video processing method, a synthesizing method, an apparatus, an electronic device, and a storage medium.
The technical scheme adopted by the invention is as follows:
according to an aspect of an embodiment of the present invention, an audio/video processing method includes: acquiring audio data to be processed, wherein the audio data to be processed is generated by acquiring source audio data which is output by a loudspeaker and carries first specific audio data by a microphone; determining the position of second specific audio data in the audio data to be processed, wherein the second specific audio data is first specific audio data carrying background noise; and removing the second specific audio data and the delay audio data from the audio data to be processed based on the position of the second specific audio data in the audio data to be processed, so as to obtain target audio.
According to an aspect of an embodiment of the present invention, an audio and video synthesis method includes: in the multi-segment short video recording process, for the source audio data configured for each short video segment, acquiring multiple pieces of audio data to be processed, where each piece corresponds to one segment and is generated by a microphone collecting source audio data that is output by a loudspeaker and carries first specific audio data; for each piece of audio data to be processed, determining the position of second specific audio data in it, where the second specific audio data is the first specific audio data carrying background noise; removing the second specific audio data and the delayed audio data from the audio data to be processed based on that position, so as to obtain the target audio corresponding to one segment; and synthesizing a complete video from the multiple target audios corresponding to the different segments and the recorded segments.
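To make the claimed flow concrete, the following Python sketch trims each per-segment recording at the end of the detected marker and joins the results. All function names are invented for illustration, and the simple correlation-based marker detection is an assumption: the patent does not prescribe a particular detection algorithm.

```python
import numpy as np

def extract_target_audio(recorded: np.ndarray, marker: np.ndarray) -> np.ndarray:
    """Find the known marker (the 'first specific audio data') in the
    recording and keep only what follows its termination point, thereby
    dropping both the delayed audio data and the noisy marker."""
    corr = np.correlate(recorded, marker, mode="valid")
    start = int(np.argmax(corr))          # best alignment of the marker
    return recorded[start + len(marker):]

def synthesize_background(recordings, marker) -> np.ndarray:
    """Trim every per-segment recording, then concatenate the target
    audios into one continuous background-music track."""
    return np.concatenate([extract_target_audio(r, marker) for r in recordings])
```

Because each segment is trimmed at the marker's end rather than by a device-specific delay estimate, the concatenated track contains no delayed audio at segment junctions.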
According to an aspect of an embodiment of the present invention, an audio and video processing apparatus includes: a data acquisition module for acquiring audio data to be processed, where the audio data to be processed is generated by a microphone collecting source audio data that is output by a loudspeaker and carries first specific audio data; a position determining module for determining the position of second specific audio data in the audio data to be processed, where the second specific audio data is the first specific audio data carrying background noise; and a data removing module for removing the second specific audio data and the delayed audio data from the audio data to be processed based on that position, so as to obtain target audio.
In an exemplary embodiment, the apparatus further comprises: a data generation module for generating the first specific audio data; the data splicing module is used for acquiring the source audio data and splicing the source audio data with the first specific audio data to obtain spliced audio data; the data acquisition module is used for controlling the loudspeaker to output the spliced audio data and controlling the microphone to acquire the spliced audio data so as to generate the audio data to be processed.
In an exemplary embodiment, the data acquisition module includes: the recording unit is used for recording the voice segment to obtain the first specific audio data; or, a selecting unit, configured to select a piece of audio data from an audio library as the first specific audio data.
In an exemplary embodiment, the data removal module includes: a termination position point determining unit, configured to determine a termination position point of the second specific audio data in the audio data to be processed according to a position of the second specific audio data in the audio data to be processed; a starting position point determining unit, configured to take an ending position point of the second specific audio data in the audio data to be processed as a starting position point of the target audio in the audio data to be processed; and the data extraction unit is used for extracting the target audio from the audio data to be processed based on the initial position point of the target audio in the audio data to be processed.
In an exemplary embodiment, the sound effect processing module includes: and the audio mixing unit is used for mixing the source audio data with the target audio.
In an exemplary embodiment, the apparatus further comprises: the distribution module is used for obtaining a plurality of target audios aiming at source audio data configured for each short video in the short video multi-segment recording process, wherein each target audio corresponds to one short video; the synthesizing module is used for synthesizing a plurality of target audios into background music and synthesizing a plurality of short videos into a complete video; and an adding module for adding the background music to the complete video.
According to an aspect of an embodiment of the present invention, an audio and video synthesis apparatus includes: a data acquisition module for acquiring, in the multi-segment short video recording process, multiple pieces of audio data to be processed for the source audio data configured for each short video segment, where each piece corresponds to one segment and is generated by the microphone collecting source audio data that is output by the loudspeaker and carries first specific audio data; a position determining module for determining, for each piece of audio data to be processed, the position of second specific audio data in it, where the second specific audio data is the first specific audio data carrying background noise; a data removing module for removing the second specific audio data and the delayed audio data from the audio data to be processed based on that position, so as to obtain the target audio corresponding to one segment; and a video synthesis module for synthesizing a complete video from the multiple target audios corresponding to the different segments and the recorded segments.
According to an aspect of an embodiment of the present invention, an electronic device includes a processor and a memory, where the memory stores computer-readable instructions which, when executed by the processor, implement the audio and video processing method and the audio and video synthesis method described above.
According to an aspect of an embodiment of the present invention, a storage medium has stored thereon a computer program which, when executed by a processor, implements the audio and video processing method and the audio and video synthesis method described above.
In the above technical solution, when the speaker outputs the source audio data carrying the first specific audio data, the audio data to be processed that the microphone's collection generates is obtained. The position of the second specific audio data in the audio data to be processed is determined, and, based on that position, the second specific audio data and the delayed audio data are removed to obtain the target audio. That is, relying on the first specific audio data carried by the source audio data, the delayed audio data is removed, eliminating the delay introduced in the multi-segment short video recording process and thereby guaranteeing the continuity of the background music added when the complete video is synthesized.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the invention as claimed.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the invention and together with the description, serve to explain the principles of the invention.
Fig. 1 is a schematic diagram of background music discontinuity during a short video multi-segment recording process according to the present invention.
Fig. 2 is a hardware configuration diagram of an electronic device, which is shown according to an exemplary embodiment.
Fig. 3 is a flow chart illustrating a method of audio and video processing according to an exemplary embodiment.
Fig. 4 is a flow chart illustrating another audio-video processing method according to an exemplary embodiment.
Fig. 5 is a flow chart of step 330 in one embodiment of the corresponding embodiment of Fig. 3.
Fig. 6 is a flow chart of step 350 in one embodiment of the corresponding embodiment of Fig. 3.
Fig. 7 is a schematic diagram of extracting target audio from audio data to be processed according to the present invention.
Fig. 8 is a flowchart illustrating another audio-video processing method according to an exemplary embodiment.
Fig. 9 is a flowchart illustrating a method of audio and video synthesis according to an exemplary embodiment.
Fig. 10 is a flow chart of step 690 in one embodiment of the corresponding embodiment of Fig. 9.
Fig. 11 is a schematic diagram of an audio/video synthesis method in a short video multi-segment recording process according to the present invention.
Fig. 12 is a block diagram of an audio-video processing apparatus according to an exemplary embodiment.
Fig. 13 is a block diagram of an audio and video synthesis apparatus according to an exemplary embodiment.
Fig. 14 is a block diagram of an electronic device, according to an example embodiment.
Specific embodiments of the invention are shown in the drawings and described hereinafter, with the understanding that the present disclosure is to be considered in all respects illustrative rather than restrictive, the scope of the inventive concepts being indicated by the appended claims.
Detailed Description
Reference will now be made in detail to exemplary embodiments, examples of which are illustrated in the accompanying drawings. When the following description refers to the accompanying drawings, the same numbers in different drawings refer to the same or similar elements, unless otherwise indicated. The implementations described in the following exemplary examples do not represent all implementations consistent with the invention. Rather, they are merely examples of apparatus and methods consistent with aspects of the invention as detailed in the accompanying claims.
As described above, in the multi-segment short video recording process, the introduced delay causes a discontinuity in the background music added when the short video segments are synthesized into a complete video.
The discontinuity of background music is specifically described herein.
Currently, multi-segment short video recording means that multiple short video segments of a certain duration are cut from the same video or from different videos, and it is usually implemented on electronic devices such as smart phones, tablet computers, notebook computers, palmtop computers, and desktop computers. It should be noted that the segments are independent of one another, that is, they have time discontinuities, especially segments derived from the same video.
According to the actual needs of the application scene, the user can also configure audio data for different short video segments during recording. That is, while a segment is being recorded, the speaker outputs both the audio data carried by the video itself (some videos, of course, may be silent) and the audio data the user configured for the segment, which enhances the practicability and interest of audio and video synthesis and helps improve the user experience.
However, the inventor realized that, taking the Android system deployed on a smart phone as an example, audio processing on Android has latency: the audio data collected by the microphone lags the audio data output by the speaker by a certain delay. In other words, the beginning of each recorded segment contains delayed audio data, that is, a delay is introduced. As shown in Fig. 1, 101 to 103 are recorded short video segments, and 1011, 1021, and 1031 are the delayed audio data located at the beginning of each.
Based on this, when the complete video is synthesized, that is, generated by stitching the segments together as shown at 104 in Fig. 1, the background music in the complete video becomes discontinuous: delayed audio data sits at the junction between different segments, for example as shown at 105 in Fig. 1.
The echo cancellation technology of the Android system could be used to cancel the delay introduced during multi-segment recording. However, the inventor found that the audio-processing delay of Android differs across smart phones, ranging from about 10 milliseconds to hundreds of milliseconds, so no unified processing is possible, and the synthesis quality of the complete video varies greatly between devices. If unified processing were applied anyway, some smart phones would eliminate the discontinuity of the background music while others would not; echo cancellation therefore still cannot completely eliminate the discontinuity.
Therefore, the invention provides an audio and video processing method that can completely eliminate the discontinuity of the background music and is suitable for various types of electronic equipment, giving it good universality and practicability. Correspondingly, a matching audio and video processing apparatus is deployed in the electronic equipment, which includes, but is not limited to, a smart phone, a tablet computer, a notebook computer, a palmtop computer, and a desktop computer, so as to implement the method.
Referring to fig. 2, fig. 2 is a block diagram of an electronic device, according to an example embodiment.
It should be noted that this electronic device is only an example adapted to the present invention, and should not be construed as providing any limitation on the scope of use of the present invention. Nor should such an electronic device be construed as necessarily relying on or necessarily having one or more of the components of the exemplary electronic device 200 shown in fig. 2.
The hardware structure of the electronic device 200 may vary widely depending on configuration and performance. As shown in Fig. 2, the electronic device 200 includes: a power supply 210, an interface 230, at least one memory 250, and at least one central processing unit (CPU) 270.
Specifically, the power supply 210 is configured to provide an operating voltage for each hardware device on the electronic device 200.
The interface 230 includes at least one input/output interface 235 for receiving external signals. Of course, in other examples of the adaptation of the present invention, the interface 230 may further include at least one wired or wireless network interface 231, at least one serial-parallel interface 233, and at least one USB interface 237, as shown in fig. 2, which is not specifically limited herein. For example, the speaker is an output interface of the input-output interfaces 235, and the microphone is an input interface of the input-output interfaces 235.
The memory 250 may be a carrier for storing resources, such as a read-only memory, a random access memory, a magnetic disk, or an optical disk, where the resources stored include an operating system 251, application programs 253, and data 255, and the storage mode may be transient storage or permanent storage.
The operating system 251 is used to manage and control the hardware devices and the applications 253 on the electronic device 200, so that the central processing unit 270 can operate on and process the mass data 255 in the memory 250. It may be Windows Server™, Mac OS X™, Unix™, Linux™, FreeBSD™, or the like.
The application 253 is a computer program that performs at least one specific task based on the operating system 251, and may include at least one module (not shown in fig. 2), each of which may respectively contain a series of computer readable instructions for the electronic device 200. For example, the audio/video processing apparatus may be regarded as an application 253 deployed on the electronic device.
The data 255 may be photographs, pictures, etc. stored on a disk, or may be audio data to be processed, an audio library, etc. stored in the memory 250.
The central processing unit 270 may include one or more processors and is configured to communicate with the memory 250 via at least one communication bus to read the computer-readable instructions stored there, thereby operating on and processing the mass data 255 in the memory 250. For example, the audio and video processing method is accomplished by the central processing unit 270 reading a series of computer-readable instructions stored in the memory 250.
It is to be understood that the configuration shown in fig. 2 is merely illustrative and that electronic device 200 may also include more or fewer components than shown in fig. 2 or have different components than shown in fig. 2. The components shown in fig. 2 may be implemented in hardware, software, or a combination thereof.
Referring to fig. 3, in an exemplary embodiment, an audio/video processing method is applicable to an electronic device, for example, a smart phone, where the structure of the electronic device may be as shown in fig. 2.
The audio and video processing method can be executed by the electronic equipment, and can also be understood to be executed by an audio and video processing device deployed in the electronic equipment. In the method embodiments described below, the execution subject of each step is described as an electronic device for convenience of description, but this configuration is not limited thereto.
As shown in fig. 3 (a), the audio/video processing method may include the following steps:
In step 310, audio data to be processed is obtained.
The audio data to be processed are generated by collecting source audio data carrying first specific audio data and output by a loudspeaker through a microphone.
In this embodiment, the source audio data refers to audio data configured for different short videos according to actual needs of an application scene by a user.
The source audio data may be derived from audio data in an audio library, for example, in a KTV scene, the source audio data is an accompaniment of a song in the audio library; or may be audio data recorded by the user according to the actual needs of the application scene, for example, a section of voice recorded by the user.
It will also be appreciated that for short video multi-segment recordings, the source audio data is essentially known audio data, intended to be added as background music when the full video is synthesized from multiple segments of short video.
It should be noted that, besides the source audio data carrying the first specific audio data, the speaker output may also include audio data carried by the video itself. The audio data collected by the microphone is subject to a certain delay, and this delay is the same as the delay incurred when the microphone collects the source audio data; therefore, in this embodiment, delay cancellation is performed based on the source audio data alone.
Second, as previously described, the beginning of the audio data captured by the microphone contains delayed audio data, for example ambient sound; this introduces a delay that causes the background music to be discontinuous during audio and video synthesis. Eliminating the delay to ensure the continuity of the background music therefore essentially means eliminating the delayed audio data at the beginning of the audio data collected by the microphone.
Based on this, in the present embodiment, the first specific audio data is added at the beginning of the source audio data in order to eliminate the delay introduced in the short video multi-segment recording process.
The first specific audio data is also known substantially, and may be audio data derived from an audio library, or audio data recorded by a user according to actual needs of an application scene, which is not limited herein.
Therefore, when the loudspeaker plays the source audio data carrying the first specific audio data and the microphone collects it, the delay introduced in the multi-segment short video recording process can be eliminated based on the first specific audio data.
It will be appreciated that the electronic device may store the audio data collected by the microphone, taking into account processing capabilities, after the microphone collects the audio data. For example, the audio data is stored to the memory 250 shown in fig. 2.
The audio data to be processed may therefore be acquired in real time as the microphone collects it, so that the related processing is performed immediately, or it may be acquired from a historical period, or processed according to an operator's instruction.
In other words, the acquired audio data to be processed may be real-time audio data collected by the microphone or pre-selected stored historical audio data, which is not particularly limited in this embodiment.
Step 330, determining the position of the second specific audio data in the audio data to be processed.
Wherein the second specific audio data is first specific audio data carrying background noise.
As mentioned above, the speaker plays the source audio data carrying the first specific audio data. Since background noise is inevitable during recording, the audio data collected by the microphone is essentially the first specific audio data carrying background noise plus the source audio data carrying background noise. In addition, because a delay is introduced during recording, the collected audio data also contains delayed audio data at its beginning.
As can be seen from the above, the audio data to be processed sequentially contains delayed audio data, the first specific audio data carrying background noise (i.e., the second specific audio data), and the source audio data carrying background noise.
Based on this, in this embodiment, eliminating the delay to ensure the continuity of the background music essentially means removing both the delayed audio data at the beginning of the collected audio and the second specific audio data that precedes the source audio data; in other words, only the source audio data carrying background noise is retained.
Specifically, the position of the second specific audio data in the audio data to be processed is determined. Then, with that position as the boundary, the audio data before it (the second specific audio data and the delayed audio data) is removed, and the audio data after it (the source audio data carrying background noise) is preserved.
In step 350, the second specific audio data and the delayed audio data are removed from the audio data to be processed based on the position of the second specific audio data in the audio data to be processed, so as to obtain target audio.
Through the above process, the delayed audio data in the audio data to be processed can be accurately determined based on the first specific audio data, so the delay introduced during multi-segment short video recording can be accurately eliminated, solving the prior-art problem of discontinuous background music during audio and video synthesis.
In addition, there is no need to consider the differences in delay introduced when different types of electronic equipment perform short video multi-segment recording, so the method has higher universality.
Further, in another exemplary embodiment, as shown in fig. 3 (b), after step 350, the method as described above may further include the steps of:
Step 370, performing sound effect processing on the target audio.
As described above, the target audio is essentially the source audio data carrying background noise; understandably, after recording and the interference of background noise, the audio effect of the target audio is degraded.
Among them, sound effect processing includes, but is not limited to: noise reduction, excitation, compression limiting, equalization, mixing, etc.
For example, the source audio data is mixed with the target audio.
Under the action of the embodiment, the audio effect of the target audio is fully ensured, and the synthesis quality of the audio and the video is improved.
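As a minimal sketch of the mixing step, the snippet below blends the clean source audio into the recorded target audio. It assumes float sample arrays in [-1, 1] at a common sample rate; the `mix_tracks` name and the 0.5 gains are illustrative and not taken from the patent.

```python
import numpy as np

def mix_tracks(source, target, source_gain=0.5, target_gain=0.5):
    # Pad to the longer track, apply per-track gains, and clip to
    # keep the mixed samples inside the valid [-1, 1] range.
    source = np.asarray(source, dtype=float)
    target = np.asarray(target, dtype=float)
    mixed = np.zeros(max(len(source), len(target)))
    mixed[:len(source)] += source_gain * source
    mixed[:len(target)] += target_gain * target
    return np.clip(mixed, -1.0, 1.0)
```

A simple equal-gain sum is only one of the listed sound effect options; noise reduction, compression limiting, and equalization would each need their own processing stage.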
Referring to fig. 4, in an exemplary embodiment, prior to step 310, the method as described above may further include the steps of:
Step 410, generating the first specific audio data.
Optionally, recording the voice segment to obtain the first specific audio data.
Optionally, a piece of audio data is selected from an audio library as the first specific audio data.
Step 430, obtaining the source audio data, and splicing the source audio data with the first specific audio data to obtain spliced audio data.
The spliced audio data are source audio data carrying first specific audio data, and the first specific audio data are located before the source audio data.
Step 450, controlling the loudspeaker to output the spliced audio data, and controlling the microphone to collect the spliced audio data to generate the audio data to be processed.
Under the action of the above embodiment, the addition of the first specific audio data is implemented; that is, the first specific audio data is added as known audio data. Thus, through the output of the speaker and the collection of the microphone, the audio data collected by the microphone necessarily contains the first specific audio data, or the first specific audio data carrying background noise, which enables the subsequent elimination, based on the first specific audio data, of the delay introduced in the short video multi-segment recording process.
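Steps 410 through 450 can be sketched as follows. This is an illustrative model (the function names are not from the patent): the known marker, standing in for the first specific audio data, is concatenated before the source audio, and the microphone path is simulated as a leading delay plus additive background noise.

```python
import numpy as np

def splice_with_marker(marker, source):
    # Spliced audio = first specific audio data followed by the
    # source audio data, assuming both share one sample rate.
    return np.concatenate([np.asarray(marker, dtype=float),
                           np.asarray(source, dtype=float)])

def simulate_capture(spliced, delay_samples, noise_level=0.01, seed=0):
    # Model the microphone signal: delayed audio data (here silence)
    # at the start, then the spliced audio plus background noise.
    rng = np.random.default_rng(seed)
    captured = np.concatenate([np.zeros(delay_samples), spliced])
    return captured + noise_level * rng.standard_normal(len(captured))
```

In a real implementation the capture comes from the device microphone rather than a simulation; the model above only serves to make the later localization and trimming steps testable.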
Referring to fig. 5, in an exemplary embodiment, step 330 may include the steps of:
step 331, performing a cross-correlation operation between the first specific audio data and the audio data to be processed, to obtain a cross-correlation operation result.
Wherein the cross-correlation operation result indicates a strong correlation between the first specific audio data and the second specific audio data in the audio data to be processed.
That is, since the second specific audio data in the audio data to be processed is substantially the first specific audio data carrying background noise, the first specific audio data and the second specific audio data exhibit strong correlation under the cross-correlation operation. By contrast, the first specific audio data and the remaining audio data in the audio data to be processed, for example the delayed audio data or the source audio data carrying background noise, exhibit weak correlation.
Step 333, obtaining the position of the second specific audio data in the audio data to be processed according to the strong correlation indicated by the cross-correlation operation result.
Thus, if the cross-correlation operation result indicates strong correlation, the audio data in the audio data to be processed that was cross-correlated with the first specific audio data is the second specific audio data; that is, the position of the second specific audio data in the audio data to be processed is found here.
Similarly, if the cross-correlation operation result indicates weak correlation, the audio data in the audio data to be processed that was cross-correlated with the first specific audio data may be the delayed audio data or the source audio data carrying background noise; that is, the position of the second specific audio data in the audio data to be processed is not found here.
Under the action of the embodiment, the position of the second specific audio data in the audio data to be processed is accurately found based on the strong correlation between the first specific audio data and the second specific audio data, which is beneficial to eliminating delay introduced in the short video multi-segment recording process.
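The cross-correlation search of steps 331 and 333 can be sketched with NumPy as below. `locate_marker` (an illustrative name) slides the known first specific audio data over the captured signal and takes the lag with the strongest correlation as the marker's start position.

```python
import numpy as np

def locate_marker(marker, captured):
    # Correlate the captured audio with the known marker at every
    # lag; the peak of the result marks the strong correlation with
    # the (noise-corrupted) second specific audio data.
    marker = np.asarray(marker, dtype=float)
    captured = np.asarray(captured, dtype=float)
    corr = np.correlate(captured, marker, mode="valid")
    return int(np.argmax(corr))  # start index of the marker
```

With `mode="valid"`, `corr[k]` is the inner product of the marker with `captured[k:k+len(marker)]`, so the argmax is the sample offset of the marker, i.e. the length of the delayed audio data when the marker was spliced at the front of the source audio.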
Referring to fig. 6, in an exemplary embodiment, step 350 may include the steps of:
step 351, determining a termination position point of the second specific audio data in the audio data to be processed according to the position of the second specific audio data in the audio data to be processed.
Step 353, taking the ending position point of the second specific audio data in the audio data to be processed as the starting position point of the target audio in the audio data to be processed.
Step 355, extracting the target audio from the audio data to be processed based on the initial position point of the target audio in the audio data to be processed.
Specifically, as shown in fig. 7, the position of the second specific audio data 301 in the audio data 300 to be processed refers to a start position point 3011 and a stop position point 3012.
Then, the end position point 3012 is taken as the start position point 3021 of the target audio 302 in the audio data 300 to be processed.
Thus, bounded by the start position point 3021, the audio data preceding it, including the second specific audio data 301 and the delayed audio data 303, is removed, while the audio data following it, i.e. the source audio data carrying background noise, i.e. the target audio 302, is preserved.
The process completes the extraction of the target audio, effectively eliminates the delay audio data, and further fully ensures the continuity of background music during the subsequent audio and video synthesis.
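The trimming of steps 351 through 355 then reduces to array slicing. In this sketch (names illustrative), the termination position point is the marker's start position plus its length, and everything before that point, namely the delayed audio data and the second specific audio data, is discarded.

```python
import numpy as np

def trim_to_target(captured, marker_start, marker_len):
    # Termination position point of the second specific audio data
    # becomes the start position point of the target audio.
    end_point = marker_start + marker_len
    # Keep only the source audio carrying background noise.
    return np.asarray(captured, dtype=float)[end_point:]
```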
Through the above embodiments, although the audio data collected by the microphone is delayed relative to the audio data output by the loudspeaker, the delay is eliminated, the problem that different electronic devices obtain different sound effects is avoided, and the universality of the method is effectively extended.
Referring to fig. 8, in an exemplary embodiment, after step 350 or step 370, the method as described above may further include the steps of:
step 510, in the process of recording multiple short videos, obtaining multiple target audios according to source audio data configured for each short video, where each target audio corresponds to one short video.
Step 530, synthesizing a plurality of target audios into background music, and synthesizing a plurality of short videos into a complete video.
Step 550, adding the background music to the complete video.
In the process, the audio and video synthesis in the short video multi-section recording process is realized.
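A minimal sketch of the audio side of steps 510 and 530: because each per-segment target audio already had its leading delay removed, concatenating them in recording order yields continuous background music. The function name is illustrative, and video concatenation is omitted here.

```python
import numpy as np

def assemble_background_music(target_audios):
    # Join the delay-free target audios of the recorded segments
    # back to back to form the background music track.
    return np.concatenate([np.asarray(t, dtype=float) for t in target_audios])
```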
The audio-video synthesis method is further described below based on short video multi-segment recording.
Referring to fig. 9, in an exemplary embodiment, an audio/video synthesis method is applicable to an electronic device, for example, a smart phone, where the structure of the electronic device may be as shown in fig. 2.
The audio and video synthesis method can be executed by the electronic equipment, and can also be understood to be executed by an audio and video synthesis device arranged in the electronic equipment. In the method embodiments described below, the execution subject of each step is described as an electronic device for convenience of description, but this is not limiting.
In step 610, in the short video multi-segment recording process, for the source audio data respectively configured for the multi-segment short video, a plurality of audio data to be processed are obtained, each audio data to be processed corresponds to a segment of short video, and the audio data to be processed is generated by the microphone when the speaker outputs the source audio data carrying the first specific audio data.
Step 630, for each audio data to be processed, determining a position of a second specific audio data in the audio data to be processed, where the second specific audio data is a first specific audio data carrying background noise.
Step 650, removing the second specific audio data and the delayed audio data from the audio data to be processed based on the position of the second specific audio data in the audio data to be processed, and obtaining the target audio corresponding to the short video.
Step 690, synthesizing the complete video according to the multiple target audios corresponding to the different short videos and the recorded multiple short videos.
Referring to fig. 10, in an exemplary embodiment, step 690 may include the steps of:
in step 691, a plurality of target audios corresponding to different short videos are synthesized into background music, and the recorded multiple short videos are synthesized into the complete video.
Step 693, adding the background music to the full video.
Specifically, as shown in fig. 11, the audio-video composition includes two branches: splicing the output branch and collecting the synthesized branch.
Splicing output branches:
the source audio data 701 and the first specific audio data 702 are acquired respectively, then decoded and spliced to form the source audio data 701 carrying the first specific audio data 702, which is output by the speaker 703.
Collecting and synthesizing branch:
while the speaker 703 outputs the source audio data 701 carrying the first specific audio data 702, the microphone 704 captures and generates the audio data 705 to be processed, at this time, the audio data 705 to be processed includes the delayed audio data, the first specific audio data carrying the background noise (i.e., the second specific audio data), and the source audio data carrying the background noise due to the introduced delay.
Then, the first specific audio data 702 and the audio data to be processed 705 are subjected to a cross-correlation operation 706, so that the position 707 of the second specific audio data in the audio data to be processed 705 can be determined, and the target audio 708 is extracted from the audio data to be processed 705.
Then, the target audio 708 is mixed with the source audio data 701, finally resulting in the target audio 709 with a good sound effect.
Finally, the synthesis of the background music is performed according to the target audio 709, and the synthesis of the complete video is performed according to the short video, and the background music is added to the complete video, thereby completing the audio-video synthesis process.
Through the process, the continuity of background music during audio and video synthesis in the short video multi-section recording process is fully ensured.
In addition, the audio and video synthesis quality is effectively improved through the mixing of the source audio data and the target audio.
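Putting the two branches together, the capture-and-synthesis path of fig. 11 can be sketched end to end as below. This is an assumed composite (the function name and the 50/50 mixing gains are illustrative): cross-correlation localization, trimming, and mixing combined in one function.

```python
import numpy as np

def remove_recording_delay(marker, source, captured):
    # Locate the marker (first specific audio data) in the captured
    # signal via cross-correlation.
    marker = np.asarray(marker, dtype=float)
    captured = np.asarray(captured, dtype=float)
    start = int(np.argmax(np.correlate(captured, marker, mode="valid")))
    # Drop the delayed audio data and the marker itself.
    target = captured[start + len(marker):]
    # Mix the clean source back in for a better sound effect.
    source = np.asarray(source, dtype=float)
    n = min(len(target), len(source))
    return 0.5 * target[:n] + 0.5 * source[:n]
```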
The following is an embodiment of the apparatus of the present invention, which may be used to execute the audio/video processing method according to the present invention. For details not disclosed in the embodiment of the apparatus of the present invention, please refer to a method embodiment of the audio/video processing method related to the present invention.
Referring to fig. 12, in an exemplary embodiment, an audio/video processing device 900 includes, but is not limited to: a data acquisition module 910, a location determination module 930, and a data removal module 950.
The data obtaining module 910 is configured to obtain audio data to be processed, where the audio data to be processed is generated by collecting, by a microphone, source audio data that is output by a speaker and carries first specific audio data.
The location determining module 930 is configured to determine a location of second specific audio data in the audio data to be processed, where the second specific audio data is first specific audio data carrying background noise.
The data removing module 950 is configured to remove the second specific audio data and the delayed audio data from the audio data to be processed based on the position of the second specific audio data in the audio data to be processed, so as to obtain the target audio.
In an exemplary embodiment, the audio-video processing device 900 described above further includes, but is not limited to:
the data generation module is used for generating the first specific audio data.
And the data splicing module is used for acquiring the source audio data and splicing the source audio data with the first specific audio data to obtain spliced audio data.
The data acquisition module is used for controlling the loudspeaker to output the spliced audio data and controlling the microphone to acquire the spliced audio data so as to generate the audio data to be processed.
In an exemplary embodiment, the data acquisition module 910 includes, but is not limited to:
the recording unit is used for recording the voice section to obtain the first specific audio data. Or alternatively, the first and second heat exchangers may be,
and the selecting unit is used for selecting a section of audio data from the audio library as the first specific audio data.
In an exemplary embodiment, the location determination module 930 includes, but is not limited to:
the cross-correlation operation unit is used for performing cross-correlation operation between the first specific audio data and the audio data to be processed to obtain a cross-correlation operation result, and the cross-correlation operation result indicates strong correlation between the first specific audio data and the second specific audio data in the audio data to be processed.
And the position determining unit is used for obtaining the position of the second specific audio data in the audio data to be processed according to the strong correlation indicated by the cross-correlation operation result.
In an exemplary embodiment, the data removal module 950 includes, but is not limited to:
the termination position point determining unit is used for determining the termination position point of the second specific audio data in the audio data to be processed according to the position of the second specific audio data in the audio data to be processed.
And the starting position point determining unit is used for taking the ending position point of the second specific audio data in the audio data to be processed as the starting position point of the target audio in the audio data to be processed.
And the data extraction unit is used for extracting the target audio from the audio data to be processed based on the initial position point of the target audio in the audio data to be processed.
In an exemplary embodiment, the audio-video processing device 900 described above further includes, but is not limited to: sound effect processing modules including, but not limited to:
and the audio mixing unit is used for mixing the source audio data with the target audio.
In an exemplary embodiment, the audio-video processing device 900 described above further includes, but is not limited to:
the distribution module is used for obtaining a plurality of target audios aiming at source audio data configured for each short video in the short video multi-segment recording process, and each target audio corresponds to one short video.
And the synthesis module is used for synthesizing a plurality of target audios into background music and synthesizing a plurality of short videos into a complete video.
And an adding module for adding the background music to the complete video.
Referring to fig. 13, in an exemplary embodiment, an audio-video synthesizing apparatus 1100 includes, but is not limited to: a data acquisition module 1110, a location determination module 1130, a data removal module 1150, and a video composition module 1190.
The data obtaining module 1110 is configured to obtain, for source audio data of multiple short videos in a short video multi-segment recording process, a plurality of audio data to be processed, where each audio data to be processed corresponds to a short video, and the audio data to be processed is generated by collecting, by a microphone, the source audio data carrying the first specific audio data output by a speaker.
The location determining module 1130 is configured to determine, for each piece of audio data to be processed, a location of second specific audio data in the piece of audio data to be processed, where the second specific audio data is first specific audio data carrying background noise.
The data removing module 1150 is configured to remove the second specific audio data and the delayed audio data from the audio data to be processed based on the position of the second specific audio data in the audio data to be processed, so as to obtain the target audio corresponding to the short video.
The video synthesis module 1190 is configured to synthesize a complete video according to the multiple target audios corresponding to the different short videos and the recorded multiple short videos.
In an exemplary embodiment, the video composition module 1190 includes, but is not limited to:
and the synthesizing unit is used for synthesizing a plurality of target audios corresponding to different short videos into background music and synthesizing the recorded multiple sections of short videos into the complete video.
And the adding unit is used for adding the background music to the complete video.
In an exemplary embodiment, the audio-video synthesizing apparatus 1100 as described above further includes related functional modules for implementing the following steps, including but not limited to:
Wherein the first specific audio data is generated.
And acquiring the source audio data, and splicing the source audio data with the first specific audio data to obtain spliced audio data.
And controlling the loudspeaker to output the spliced audio data, and controlling the microphone to collect the spliced audio data to generate the audio data to be processed.
In an exemplary embodiment, the audio-video synthesizing apparatus 1100 as described above further includes related functional modules for implementing the following steps, including but not limited to:
and recording the voice segment to obtain the first specific audio data; or
a piece of audio data is selected from an audio library as the first specific audio data.
In an exemplary embodiment, the audio-video synthesizing apparatus 1100 as described above further includes related functional modules for implementing the following steps, including but not limited to:
and performing cross-correlation operation between the first specific audio data and the audio data to be processed to obtain a cross-correlation operation result, wherein the cross-correlation operation result indicates strong correlation between the first specific audio data and the second specific audio data in the audio data to be processed.
And obtaining the position of the second specific audio data in the audio data to be processed according to the strong correlation indicated by the cross-correlation operation result.
In an exemplary embodiment, the audio-video synthesizing apparatus 1100 as described above further includes related functional modules for implementing the following steps, including but not limited to:
and determining a termination position point of the second specific audio data in the audio data to be processed according to the position of the second specific audio data in the audio data to be processed.
And taking the ending position point of the second specific audio data in the audio data to be processed as the starting position point of the target audio in the audio data to be processed.
And extracting the target audio from the audio data to be processed based on the initial position point of the target audio in the audio data to be processed.
In an exemplary embodiment, the audio-video synthesizing apparatus 1100 as described above further includes related functional modules for implementing the following steps, including but not limited to:
wherein the source audio data is mixed with the target audio.
It should be noted that, when the audio/video processing and synthesizing device provided in the foregoing embodiment performs audio/video processing and synthesizing, only the division of the foregoing functional modules is used as an example, in practical application, the foregoing functional allocation may be completed by different functional modules according to needs, that is, the internal structure of the audio/video processing and synthesizing device is divided into different functional modules, so as to complete all or part of the functions described above.
In addition, the embodiments of the audio/video processing and synthesizing device and the audio/video processing and synthesizing method provided in the foregoing embodiments belong to the same concept, and the specific manner in which each module performs the operation has been described in detail in the method embodiment, which is not described herein again.
Referring to fig. 14, in an exemplary embodiment, an electronic device 1000 is provided, including but not limited to: at least one processor 1001, at least one memory 1002, and at least one communication bus 1003.
Wherein the memory 1002 has stored thereon computer readable instructions, the processor 1001 reads the computer readable instructions stored in the memory 1002 via the communication bus 1003.
The computer readable instructions, when executed by the processor 1001, implement the audio and video processing method and the audio and video synthesizing method in the above embodiments.
In an exemplary embodiment, a storage medium has stored thereon a computer program which, when executed by a processor, implements the audio-video processing method and the audio-video synthesizing method in the above embodiments.
The foregoing is merely illustrative of the preferred embodiments of the present invention and is not intended to limit the embodiments of the present invention, and those skilled in the art can easily make corresponding variations or modifications according to the main concept and spirit of the present invention, so that the protection scope of the present invention shall be defined by the claims.

Claims (14)

1. An audio/video processing method, comprising:
acquiring audio data to be processed, wherein the audio data to be processed is generated by acquiring source audio data which is output by a loudspeaker and carries first specific audio data by a microphone; wherein the first specific audio data is known audio data and spliced before the source audio data;
determining the position of second specific audio data in the audio data to be processed, wherein the second specific audio data is first specific audio data carrying background noise;
determining a termination position point of the second specific audio data in the audio data to be processed according to the position of the second specific audio data in the audio data to be processed;
taking the ending position point of the second specific audio data in the audio data to be processed as the starting position point of the target audio in the audio data to be processed;
and extracting the target audio from the audio data to be processed based on the initial position point of the target audio in the audio data to be processed.
2. The method of claim 1, wherein prior to the obtaining the audio data to be processed, the method further comprises:
Generating the first specific audio data;
acquiring the source audio data, and splicing the source audio data with the first specific audio data to obtain spliced audio data;
and controlling the loudspeaker to output the spliced audio data, and controlling the microphone to collect the spliced audio data to generate the audio data to be processed.
3. The method of claim 2, wherein the generating the first particular audio data comprises:
recording the voice segment to obtain the first specific audio data; or
a piece of audio data is selected from an audio library as the first specific audio data.
4. The method of claim 1, wherein the determining the location of the second particular audio data in the audio data to be processed comprises:
performing cross-correlation operation between the first specific audio data and the audio data to be processed to obtain a cross-correlation operation result, wherein the cross-correlation operation result indicates strong correlation between the first specific audio data and the second specific audio data in the audio data to be processed;
and obtaining the position of the second specific audio data in the audio data to be processed according to the strong correlation indicated by the cross-correlation operation result.
5. The method of claim 1, wherein after obtaining the target audio, the method further comprises:
mixing the source audio data with the target audio.
6. The method of any one of claims 1 to 5, wherein after obtaining the target audio, the method further comprises:
in the short video multi-segment recording process, a plurality of target audios are obtained aiming at source audio data configured for each short video, and each target audio corresponds to one short video;
synthesizing a plurality of target audios into background music, and synthesizing a plurality of short videos into a complete video;
the background music is added to the full video.
7. An audio and video synthesis method is characterized by comprising the following steps:
in the short video multi-section recording process, a plurality of pieces of audio data to be processed are acquired aiming at source audio data respectively configured for a plurality of sections of short videos, each piece of audio data to be processed corresponds to a section of short video, and the audio data to be processed is generated by collecting source audio data which is output by a loudspeaker and carries first specific audio data by a microphone; wherein the first specific audio data is known audio data and spliced before the source audio data;
Determining the position of second specific audio data in the audio data to be processed according to each piece of audio data to be processed, wherein the second specific audio data is first specific audio data carrying background noise;
determining a termination position point of the second specific audio data in the audio data to be processed according to the position of the second specific audio data in the audio data to be processed;
taking the ending position point of the second specific audio data in the audio data to be processed as the starting position point of the target audio in the audio data to be processed;
extracting target audio corresponding to a short video from the audio data to be processed based on the initial position point of the target audio in the audio data to be processed;
and synthesizing the complete video according to the multiple target audios corresponding to the different short videos and the recorded multiple short videos.
8. The method of claim 7, wherein synthesizing the complete video from the plurality of target audios corresponding to the different short videos and the recorded plurality of pieces of short video comprises:
synthesizing a plurality of target audios corresponding to different short videos into background music, and synthesizing a plurality of recorded short videos into the complete video;
The background music is added to the full video.
9. An audio/video processing apparatus, comprising:
the data acquisition module is used for acquiring audio data to be processed, wherein the audio data to be processed is generated by acquiring source audio data which is output by a loudspeaker and carries first specific audio data by a microphone; wherein the first specific audio data is known audio data and spliced before the source audio data;
the position determining module is used for determining the position of second specific audio data in the audio data to be processed, wherein the second specific audio data is first specific audio data carrying background noise;
a termination position point determining unit, configured to determine a termination position point of the second specific audio data in the audio data to be processed according to a position of the second specific audio data in the audio data to be processed;
a starting position point determining unit, configured to take an ending position point of the second specific audio data in the audio data to be processed as a starting position point of the target audio in the audio data to be processed;
And the data extraction unit is used for extracting the target audio from the audio data to be processed based on the initial position point of the target audio in the audio data to be processed.
10. The apparatus of claim 9, wherein the location determination module comprises:
the cross-correlation operation unit is used for performing cross-correlation operation between the first specific audio data and the audio data to be processed to obtain a cross-correlation operation result, wherein the cross-correlation operation result indicates strong correlation between the first specific audio data and the second specific audio data in the audio data to be processed;
and the position determining unit is used for obtaining the position of the second specific audio data in the audio data to be processed according to the strong correlation indicated by the cross-correlation operation result.
11. An audio and video synthesizing apparatus, comprising:
the data acquisition module is used for acquiring a plurality of pieces of audio data to be processed aiming at source audio data which are respectively configured for a plurality of pieces of short videos in the short video multi-section recording process, wherein each piece of audio data to be processed corresponds to a short video, and the audio data to be processed is generated by collecting, by the microphone, the source audio data which is output by the loudspeaker and carries first specific audio data; wherein the first specific audio data is known audio data and spliced before the source audio data;
The position determining module is used for determining the position of second specific audio data in the audio data to be processed according to each piece of audio data to be processed, wherein the second specific audio data is first specific audio data carrying background noise;
the data removing module is used for determining a termination position point of the second specific audio data in the audio data to be processed according to the position of the second specific audio data in the audio data to be processed; taking the ending position point of the second specific audio data in the audio data to be processed as the starting position point of the target audio in the audio data to be processed; extracting target audio corresponding to a short video from the audio data to be processed based on the initial position point of the target audio in the audio data to be processed;
and the video synthesis module is used for synthesizing the complete video according to a plurality of target audios corresponding to different short videos and the recorded multi-section short videos.
12. The apparatus of claim 11, wherein the video compositing module comprises:
the synthesizing unit is used for synthesizing a plurality of target audios corresponding to different short videos into background music and synthesizing a plurality of recorded short videos into the complete video;
and the adding unit is used for adding the background music to the complete video.
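The synthesizing unit's audio step, combining per-segment target audios into one background-music track, can be sketched as follows. This is a minimal illustration assuming the segments are NumPy sample arrays in recording order; muxing the track into the video container is out of scope here.

```python
import numpy as np

def build_background_music(target_audios):
    """Join the per-segment target audios, in recording order, into one
    continuous background-music track for the synthesized complete video."""
    return np.concatenate(target_audios)
```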
13. An electronic device, comprising:
a processor; and
a memory having stored thereon computer readable instructions which, when executed by the processor, implement the method of any one of claims 1 to 8.
14. A computer readable storage medium, on which a computer program is stored, which computer program, when being executed by a processor, implements the method according to any one of claims 1 to 8.
CN201910713206.1A 2019-08-02 2019-08-02 Audio and video processing method, synthesis device, electronic equipment and storage medium Active CN111179970B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910713206.1A CN111179970B (en) 2019-08-02 2019-08-02 Audio and video processing method, synthesis device, electronic equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910713206.1A CN111179970B (en) 2019-08-02 2019-08-02 Audio and video processing method, synthesis device, electronic equipment and storage medium

Publications (2)

Publication Number Publication Date
CN111179970A CN111179970A (en) 2020-05-19
CN111179970B true CN111179970B (en) 2023-10-20

Family

ID=70651833

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910713206.1A Active CN111179970B (en) 2019-08-02 2019-08-02 Audio and video processing method, synthesis device, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN111179970B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112153456B (en) * 2020-09-25 2023-03-28 北京达佳互联信息技术有限公司 Video data recording method and device, electronic equipment and storage medium

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6226608B1 (en) * 1999-01-28 2001-05-01 Dolby Laboratories Licensing Corporation Data framing for adaptive-block-length coding system
CN1770848A * 2004-11-04 2006-05-10 Matsushita Electric Industrial Co., Ltd. Audio signal delay apparatus and method
CN102625006A * 2011-01-31 2012-08-01 Shenzhen Sanshi Technology Co., Ltd. Method and system for synchronization and alignment of echo cancellation data and audio communication equipment
CN102971788A * 2010-04-13 2013-03-13 Fraunhofer-Gesellschaft zur Foerderung der Angewandten Forschung e.V. Method and encoder and decoder for gapless playback of an audio signal
CN106067990A * 2016-06-29 2016-11-02 He Information Technology (Beijing) Co., Ltd. Audio processing method and device, and video player
CN108198551A * 2018-01-15 2018-06-22 Shenzhen Qianhai Heijing Technology Co., Ltd. Method and device for processing echo cancellation delay
CN109308905A * 2017-07-28 2019-02-05 Beijing Sogou Technology Development Co., Ltd. Audio data processing method and device, electronic equipment and storage medium
US10284985B1 * 2013-03-15 2019-05-07 Smule, Inc. Crowd-sourced device latency estimation for synchronization of recordings in vocal capture applications
CN109951651A * 2019-02-20 2019-06-28 Zhejiang University of Technology Collaboration method for audio playback and video capture


Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
A multi-channel real-time streaming media synchronization and synthesis scheme for Internet applications; Wang Yinglan et al.; Journal of Donghua University (Natural Science Edition); Vol. 44, No. 1; pp. 108-114 *

Also Published As

Publication number Publication date
CN111179970A (en) 2020-05-19

Similar Documents

Publication Publication Date Title
CN109379633B (en) Video editing method and device, computer equipment and readable storage medium
CN110324718A (en) Audio-video generation method, device, electronic equipment and readable medium
CN108234793B (en) Communication method, communication device, electronic equipment and storage medium
CN110830832B (en) Audio playing parameter configuration method of mobile terminal and related equipment
JP2023530964A (en) SEARCH RESULT DISPLAY METHOD, APPARATUS, READABLE MEDIUM AND ELECTRONIC DEVICE
WO2021052130A1 (en) Video processing method, apparatus and device, and computer-readable storage medium
CN111179970B (en) Audio and video processing method, synthesis device, electronic equipment and storage medium
CN113542626B (en) Video dubbing method and device, computer equipment and storage medium
CN110312161B (en) Video dubbing method and device and terminal equipment
CN105578224A (en) Multimedia data acquisition method, device, smart television and set-top box
CN112911332B (en) Method, apparatus, device and storage medium for editing video from live video stream
CN104240697A (en) Audio data feature extraction method and device
CN116016817A (en) Video editing method, device, electronic equipment and storage medium
CN111145770B (en) Audio processing method and device
CN111147655B (en) Model generation method and device
CN115243087A (en) Audio and video co-shooting processing method and device, terminal equipment and storage medium
CN109495786B (en) Pre-configuration method and device of video processing parameter information and electronic equipment
CN109889737B (en) Method and apparatus for generating video
CN112584225A (en) Video recording processing method, video playing control method and electronic equipment
CN111145769A (en) Audio processing method and device
CN113948054A (en) Audio track processing method, device, electronic equipment and storage medium
CN111210837B (en) Audio processing method and device
CN111145793B (en) Audio processing method and device
CN114979764B (en) Video generation method, device, computer equipment and storage medium
CN111145792B (en) Audio processing method and device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant