WO2021008055A1 - Video synthesis method and apparatus, terminal, and storage medium - Google Patents

Video synthesis method and apparatus, terminal, and storage medium

Info

Publication number
WO2021008055A1
Authority
WO
WIPO (PCT)
Prior art keywords: video, time point, audio, videos, accent
Prior art date
Application number
PCT/CN2019/120302
Other languages: English (en), Chinese (zh)
Inventor
吴晗
李文涛
王森
陈恒全
Original Assignee
广州酷狗计算机科技有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 广州酷狗计算机科技有限公司
Publication of WO2021008055A1


Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10H ELECTROPHONIC MUSICAL INSTRUMENTS; INSTRUMENTS IN WHICH THE TONES ARE GENERATED BY ELECTROMECHANICAL MEANS OR ELECTRONIC GENERATORS, OR IN WHICH THE TONES ARE SYNTHESISED FROM A DATA STORE
    • G10H 1/00 Details of electrophonic musical instruments
    • G10H 1/36 Accompaniment arrangements
    • G10H 1/40 Rhythm
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 5/00 Details of television systems
    • H04N 5/222 Studio circuitry; Studio devices; Studio equipment
    • H04N 5/262 Studio circuits, e.g. for mixing, switching-over, change of character of image, other special effects; Cameras specially adapted for the electronic generation of special effects
    • H04N 5/265 Mixing

Definitions

  • This application relates to the technical field of video processing, and in particular, to a video synthesis method, apparatus, terminal, and storage medium.
  • The embodiments of the present application provide a video synthesis method, apparatus, terminal, and storage medium, which can solve the problem of low video synthesis efficiency.
  • The technical solution is as follows:
  • In a first aspect, a video synthesis method includes:
  • sending a material acquisition request to a server, where the material acquisition request carries feature information of the material audio;
  • acquiring the material video set, the material audio, and the accent beat time point of the material audio sent by the server;
  • determining a plurality of material videos in the material video set based on the accent beat time point; and
  • synthesizing the plurality of material videos and the material audio based on the accent beat time point to obtain a synthesized video, where the switching time point of each material video in the synthesized video is an accent beat time point of the audio data.
  • Optionally, determining a plurality of material videos in the material video set based on the accent beat time point includes:
  • determining a plurality of material videos in the material video set.
  • Optionally, determining a plurality of material videos in the material video set based on the number N of accent beat time points and the start time point and end time point of the material audio includes:
  • determining N+1 material videos in the material video set.
  • Optionally, synthesizing the plurality of material videos and the material audio based on the accent beat time point to obtain a synthesized video includes:
  • synthesizing each sub-video to obtain a synthesized material video; and
  • synthesizing the synthesized material video and the material audio to obtain the synthesized video.
  • Optionally, determining the sub-video corresponding to the currently acquired material video based on the currently acquired material video and the accent beat time point includes:
  • determining N+1 material videos in the material video set.
  • Optionally, the material video set is a total material video spliced from a plurality of material videos.
  • Optionally, the material video set is a video set including multiple independent material videos.
  • Optionally, acquiring the material video set, the material audio, and the accent beat time point of the material audio sent by the server includes:
  • determining the accent beat time point of the material audio used for synthesizing the video.
  • In a second aspect, a video synthesis device includes:
  • a sending module configured to send a material acquisition request to a server, where the material acquisition request carries feature information of the material audio;
  • an acquiring module configured to acquire the material video set, the material audio, and the accent beat time point of the material audio sent by the server;
  • a determining module configured to determine a plurality of material videos in the material video set based on the accent beat time point; and
  • a synthesis module configured to synthesize the plurality of material videos and the material audio based on the accent beat time point to obtain a synthesized video, where the switching time point of each material video in the synthesized video is an accent beat time point of the audio data.
  • Optionally, the determining module is configured to:
  • determine a plurality of material videos in the material video set.
  • Optionally, the determining module is configured to:
  • determine N+1 material videos in the material video set.
  • Optionally, the synthesis module is configured to:
  • synthesize each sub-video to obtain a synthesized material video; and
  • synthesize the synthesized material video and the material audio to obtain the synthesized video.
  • Optionally, the synthesis module is configured to:
  • determine N+1 material videos in the material video set.
  • Optionally, the material video set is a total material video spliced from a plurality of material videos.
  • Optionally, the material video set is a video set including multiple independent material videos.
  • Optionally, the acquiring module 1120 is configured to:
  • determine the accent beat time point of the material audio used for synthesizing the video.
  • In a third aspect, a terminal is provided. The terminal includes a processor and a memory; at least one instruction is stored in the memory, and the instruction is loaded and executed by the processor to implement the operations performed by the video synthesis method of the first aspect.
  • In another aspect, a computer-readable storage medium is provided. At least one instruction is stored in the computer-readable storage medium, and the instruction is loaded and executed by a processor to implement the operations of the video synthesis method of the first aspect.
  • In the solution shown in the embodiments of the present application, the material video set, the material audio, and the accent beat time point of the material audio are obtained from the server. Multiple material videos are then selected from the material video set, and finally, based on the accent beat time point, the multiple material videos and the material audio are synthesized to obtain a synthesized video in which the switching time point of each material video is an accent beat time point of the audio data. In this way, the material can be obtained automatically and the material video and material audio can be synthesized automatically without manual processing, so the efficiency is high.
  • FIG. 1 is a flowchart of a video synthesis method provided by an embodiment of the present application.
  • FIG. 2 is a schematic diagram of an application program interface provided by an embodiment of the present application.
  • FIG. 3 is a schematic diagram of an application program interface provided by an embodiment of the present application.
  • FIG. 4 is a schematic diagram of calculating the number of material videos provided by an embodiment of the present application.
  • FIG. 5 is a schematic diagram of calculating the number of material videos provided by an embodiment of the present application.
  • FIG. 6 is a schematic diagram of calculating the number of material videos provided by an embodiment of the present application.
  • FIG. 7 is a schematic diagram of an application program interface provided by an embodiment of the present application.
  • FIG. 8 is a schematic diagram of calculating the duration of a sub-video provided by an embodiment of the present application.
  • FIG. 9 is a schematic diagram of calculating the duration of a sub-video provided by an embodiment of the present application.
  • FIG. 10 is a schematic diagram of calculating the duration of a sub-video provided by an embodiment of the present application.
  • FIG. 11 is a schematic structural diagram of a video synthesis device provided by an embodiment of the present application.
  • FIG. 12 is a schematic structural diagram of a terminal provided by an embodiment of the present application.
  • The embodiments of the present application provide a video synthesis method, which may be implemented by a terminal.
  • The terminal may be a mobile phone, a tablet computer, or the like.
  • An application program that can be used to make a composite video (hereinafter referred to as a video production application) is installed in the terminal.
  • The video production application can be a comprehensive application with a variety of functions, such as composite video production, video recording, video playback, video editing, and live broadcasting, or it can be a single-function application that only makes composite videos.
  • The user can select music in the video production application and obtain the material for making the synthesized video from the server through the application.
  • The material can include the material audio corresponding to the music and some material videos.
  • The application program can synthesize the acquired materials based on this method to obtain a synthesized video.
  • Alternatively, a music playback application and a video production application can be installed in the terminal at the same time.
  • The music playback application can be a comprehensive application with a variety of functions, such as music playback, audio recording, and live broadcasting, or it can be a single-function application with only the music playback function.
  • The user can select favorite music through the music playback application and obtain materials through the video production application to make a composite video.
  • In the embodiments of the present application, the case where both a music playback application and a video production application are installed in the terminal is taken as an example for description.
  • FIG. 1 is a flowchart of a video synthesis method provided by an embodiment of the present application. Referring to FIG. 1, this embodiment includes the following steps:
  • Step 101: Send a material acquisition request to a server, where the material acquisition request carries feature information of the material audio.
  • The feature information of the material audio may be a music name, a hash value of the music name, a hash value of the audio data, or the like.
  • The feature information can uniquely identify the material audio; its specific form is not limited here.
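  • For illustration, the request body might be built as in the following minimal sketch; the field name and the choice of an MD5 hash of the music name are assumptions made for this example, not a format fixed by this application:

```python
import hashlib
import json

def build_material_request(music_name: str) -> str:
    # Feature information that uniquely identifies the material audio;
    # here, a hash value of the music name (one of the options above).
    feature = hashlib.md5(music_name.encode("utf-8")).hexdigest()
    # The field name "audio_feature" is hypothetical.
    return json.dumps({"audio_feature": feature})

print(build_material_request("example song"))
```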
  • In application, the terminal is installed with a music playing application and a video production application.
  • The music playing application provides the user with a music selection interface.
  • As shown in FIG. 2, the interface may include a search bar and a music list.
  • The music list can display information such as the music name and the music duration.
  • The user can select favorite music in the music list, or search for favorite music through the search bar and then select it.
  • After the music is selected, the music playback application can jump to the music playback interface shown in FIG. 3.
  • The playback interface can display the lyrics, music name, artist name, playback progress, and so on, of the currently playing music, and is also provided with a submission option.
  • The submission option is the option that triggers video synthesis; when the user selects it, this indicates that the user wants to use the currently playing music as the material audio to make a composite video.
  • When the submission option is selected, the music playing application starts the video production application installed in the terminal through the system and sends the feature information of the currently playing music (i.e., the material audio) to the video production application. The video production application then sends a material acquisition request carrying the feature information of the material audio to the server through the terminal.
  • The server may be a background server of the video production application.
  • Step 102: Obtain the material video set, the material audio, and the accent beat time point of the material audio sent by the server.
  • The accent beat time point is the time point corresponding to a beat point whose beat value is 1 in the material audio.
  • The server may store dot data of the audio.
  • The dot data includes beat time points and beat values.
  • The beat time points and the corresponding beat values can be generated by technicians with machine assistance according to the BPM (Beats Per Minute) and beat information of the audio data, or can be manually marked by a technician listening to the audio data.
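  • To make the relationship between dot data and accent beat time points concrete, the following sketch pulls out the accent beats; the dot-data layout, a list of (beat time point, beat value) pairs with times in seconds, is an assumption for illustration:

```python
def accent_beat_time_points(dot_data):
    """Return the time points whose beat value is 1, i.e. the accent
    beat time points of the material audio."""
    return [t for t, beat_value in dot_data if beat_value == 1]

# Dot data as (beat time point, beat value) pairs, e.g. 4/4 time at 60 BPM.
dot_data = [(0.0, 1), (1.0, 2), (2.0, 3), (3.0, 4), (4.0, 1), (5.0, 2)]
print(accent_beat_time_points(dot_data))  # [0.0, 4.0]
```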
  • The server can also store multiple material videos, score them according to indicators such as size, image quality, and clarity, and then select a preset number of the highest-scoring material videos from the stored material videos.
  • The preset number can be specified by technicians according to the general needs of users, for example, 20.
  • The server can then cut each material video to a preset duration.
  • The preset duration can also be specified by technicians according to the general needs of users, for example, 7 s. Because the duration between two accent beats in typical audio is usually 2 s to 4 s, setting the preset duration to more than 4 s helps avoid the clipped material video being shorter than the interval between two accent beats.
  • In addition, since the material video needs to be transmitted to the terminal, and considering the transmission delay, the duration of the material video should not be too long; 6 s to 10 s is suitable. The server can perform the above processing after receiving the material acquisition request sent by the terminal, or perform it in advance and store the processed material videos, so that after a request is received the locally stored processed material videos can be obtained directly, improving the overall efficiency of making composite videos.
  • After the server cuts the preset number of material videos, it can send them to the terminal as the material video set. Alternatively, considering the transmission delay, the server can splice the cut material videos into one total material video and send the total material video to the terminal as the material video set, so that the transmission delay is smaller. Note that when the total material video is used as the material video set, the server also needs to send the separation time point of each material video within the total material video to the terminal. For example, if the total material video consists of five 7 s material videos, the separation time points are 0:07 (0 minutes 7 seconds), 0:14, 0:21, 0:28, and 0:35.
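  • This server-side preparation can be sketched as follows; the 20-video and 7 s defaults come from the text above, while the data layout (a list of (video identifier, score) pairs, with the score already combining size, image quality, clarity, and similar indicators) is an assumption:

```python
def prepare_material_set(videos, preset_number=20, preset_duration=7.0):
    """Select the highest-scoring material videos, cut each one to the
    preset duration, and return the separation time points of the
    spliced total material video (0:07, 0:14, ... for 7 s clips)."""
    chosen = sorted(videos, key=lambda v: v[1], reverse=True)[:preset_number]
    separation_points, elapsed = [], 0.0
    for _ in chosen:                 # every clip is cut to preset_duration
        elapsed += preset_duration
        separation_points.append(elapsed)
    return chosen, separation_points

_, points = prepare_material_set(
    [("a", 0.9), ("b", 0.7), ("c", 0.8), ("d", 0.6), ("e", 0.8)],
    preset_number=5)
print(points)  # [7.0, 14.0, 21.0, 28.0, 35.0]
```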
  • After receiving the material acquisition request, the server obtains the corresponding material audio and the dot data of the material audio according to the feature information of the material audio. Because the subsequent synthesis mainly uses the accent beat time points in the dot data (the time points corresponding to beat points with a beat value of 1), the server can send only the accent beat time points, together with the material audio and the material video set, to the terminal, reducing the amount of data transmitted. The terminal receives the material video set, the material audio, and the accent beat time points of the material audio sent by the server.
  • The material audio received by the terminal is the original material audio, and the terminal can also cut the original material audio.
  • The processing can be as follows: the server also sends a preset cutting time point to the terminal; based on the preset cutting time point and a preset cutting duration, the original material audio is cut to obtain the material audio for synthesizing the video, and the accent beat time points of the material audio for synthesizing the video are determined from the accent beat time points of the original material audio.
  • The preset cutting time point can be a time point determined by technicians based on comprehensive considerations such as the rhythm of the material audio, or it can be the climax time point of the material audio.
  • The climax time point can be manually marked by technicians or collected by machine. If the server sends both of these time points to the terminal, the terminal preferentially uses the time point determined by the technicians based on the rhythm of the audio data and other comprehensive considerations.
  • After obtaining the preset cutting time point and the original material audio, the terminal intercepts, from the original material audio, the audio of the preset cutting duration after the preset cutting time point as the material audio for synthesizing the video.
  • If no preset cutting time point is received, the terminal can intercept the audio of the preset cutting duration after the start time point of the original material audio as the material audio for synthesizing the video.
  • The material audio in the following steps is the material audio used to synthesize the video; correspondingly, the accent beat time points are those of the material audio used to synthesize the video.
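  • A minimal sketch of this cutting step, with times in seconds: the material audio for synthesizing the video is the interval of the preset cutting duration starting at the preset cutting time point, and its accent beat time points are those of the original material audio that fall inside the interval, shifted to the new origin:

```python
def cut_material_audio(original_accents, cut_start, cut_duration, total_duration):
    """Cut the original material audio from cut_start for cut_duration
    seconds, and re-base the accent beat time points. If the server sent
    no preset cutting time point, the caller passes cut_start=0.0
    (cutting from the start time point)."""
    cut_end = min(cut_start + cut_duration, total_duration)
    accents = [t - cut_start for t in original_accents
               if cut_start <= t <= cut_end]
    return (cut_start, cut_end), accents

# Original audio: 60 s, accent beats every 3 s; cut 15 s starting at 0:30.
original = [float(t) for t in range(0, 61, 3)]
interval, accents = cut_material_audio(original, 30.0, 15.0, 60.0)
print(interval, accents)  # (30.0, 45.0) [0.0, 3.0, 6.0, 9.0, 12.0, 15.0]
```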
  • Step 103: Based on the accent beat time points, determine multiple material videos in the material video set.
  • The terminal may determine multiple material videos in the acquired material video set based on the number N of accent beat time points and the start time point and end time point of the material audio.
  • Depending on whether the start time point or the end time point of the material audio is itself an accent beat time point, there are the following cases when determining the material videos:
  • For example, if the number of accent beat time points is 5, the start time point of the material audio is an accent beat time point, and the end time point is not, the accent beat time points divide the material audio into 5 parts.
  • Each part can correspond to one material video; therefore, 5 material videos can be determined in the material video set.
  • If the number of accent beat time points is 5 and both the start time point and the end time point of the material audio are accent beat time points, the material audio is divided into 4 parts; each part can correspond to one material video, so 4 material videos can be determined in the material video set.
  • If the number of accent beat time points is 5 and neither the start time point nor the end time point of the material audio is an accent beat time point, the material audio is divided into 6 parts, so 6 material videos can be determined in the material video set.
  • If the number of material videos included in the material video set is less than the calculated number of material videos that need to be determined, all the material videos in the material video set can be determined.
  • The N, N-1, or N+1 material videos determined above can be selected at random when the material video set is a video set of multiple independent material videos. When the material video set is a total material video, in addition to random selection, the material videos can be selected in order from first to last, from last to first, or at intervals starting from the first.
  • The specific selection method is not limited in the embodiments of this application.
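  • The three cases reduce to a simple rule: the N accent beat time points, together with whichever of the start and end time points are not themselves accent beats, bound the segments of the material audio, giving N-1, N, or N+1 segments, with one material video needed per segment. A sketch with times in seconds, using random selection (one of the options above):

```python
import random

def count_needed_videos(accents, start, end):
    """Number of material videos: the accent beat time points divide the
    audio into segments; a boundary that is itself an accent beat does
    not add a segment (N-1, N, or N+1 depending on the boundaries)."""
    segments = len(accents) + 1
    if start in accents:
        segments -= 1
    if end in accents:
        segments -= 1
    return segments

def pick_material_videos(video_set, accents, start, end):
    needed = count_needed_videos(accents, start, end)
    if needed >= len(video_set):      # fewer videos than needed: use all
        return list(video_set)
    return random.sample(video_set, needed)

# 5 accents, the start is an accent beat, the end is not -> 5 videos.
print(count_needed_videos([0.0, 3.0, 6.0, 9.0, 12.0], 0.0, 15.0))  # 5
```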
  • Step 104: Based on the accent beat time points, synthesize the multiple material videos and the material audio to obtain a synthesized video, where the switching time point of each material video in the synthesized video is an accent beat time point of the audio data.
  • When synthesizing the video, the terminal can determine the synthesis order of the material videos at random; when the material video set is a total material video, the synthesis order can also be determined according to the position of each material video in the total material video.
  • Following the synthesis order, the material videos are obtained one by one, and for each material video obtained, the sub-video corresponding to it is determined based on the currently obtained material video and the accent beat time points.
  • Each sub-video is then synthesized to obtain a synthesized material video; switching special effects, such as fade-in, fade-out, pop-in, and louvered (blinds) transitions, can be added between sub-videos.
  • The switching special effects and their durations can be preset by technicians according to the general needs of users. Then, the synthesized material video and the audio data are synthesized to obtain the synthesized video. Finally, the composite video can be played automatically by the video production application; as shown in FIG. 7, the composite video plays in the middle of the display interface of the video production application.
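  • As one possible realization of this final step (not an implementation stated by this application), the sub-videos can be concatenated and the material audio attached with the moviepy library; this sketch assumes moviepy 1.x and local file names, and omits the switching special effects:

```python
from moviepy.editor import (AudioFileClip, VideoFileClip,
                            concatenate_videoclips)

# Sub-videos already cut to the durations between accent beat time points.
sub_videos = [VideoFileClip(p) for p in ("sub1.mp4", "sub2.mp4", "sub3.mp4")]

synthesized_material = concatenate_videoclips(sub_videos)  # joined sub-videos
material_audio = AudioFileClip("material_audio.mp3")
synthesized_video = synthesized_material.set_audio(material_audio)
synthesized_video.write_videofile("synthesized.mp4")
```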
  • Case 1: If the currently acquired material video is first in the synthesis order, determine the first duration between the start time point of the material audio and the first accent beat time point, i.e., the accent beat time point after, and closest to, the start time point. In the material video, intercept a video of the first duration from the start time point of the material video as the first sub-video corresponding to the material video.
  • Case 2: If the currently acquired material video is not first in the synthesis order, determine the first total duration of the sub-videos already generated, and determine the first time point, i.e., the time point that is the first total duration after the start time point of the material audio. Then look for the second accent beat time point, i.e., the accent beat time point after, and closest to, the first time point. If a second accent beat time point exists, determine the second duration between the first time point and the second accent beat time point, and, in the material video, intercept a video of the second duration from the start time point of the material video as the second sub-video corresponding to the material video. If no second accent beat time point exists, determine the third duration from the first time point to the end time point of the material audio, and, in the material video, intercept a video of the third duration from the start time point of the material video as the third sub-video corresponding to the material video.
  • For example, suppose the duration of the material audio is 15 s and its start time point is 0:00. If the first duration between the start time point 0:00 and the first accent beat time point 0:03 is 3 s, then, in the material video, 3 s is intercepted from the start time point of the material video as the corresponding first sub-video.
  • As another example, suppose the duration of the material audio is 15 s, its start time point is 0:00, its end time point is 0:15, and the first total duration of the generated sub-videos is 13 s, so the first time point after the start time point of the material audio is 0:13. If the second accent beat time point is 0:14, the second duration from 0:13 to 0:14 is 1 s, and, in the material video, 1 s is intercepted from the start time point of the material video as the corresponding second sub-video. If there is no accent beat time point after 0:13, the third duration from 0:13 to the end time point 0:15 is 2 s, and, in the material video, 2 s is intercepted from the start time point of the material video as the corresponding third sub-video.
  • The duration of the finally obtained synthesized video is the total duration of the sub-videos.
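  • Cases 1 and 2 amount to the following loop, sketched here with times in seconds: each sub-video runs from the end of the audio already covered to the next accent beat time point, or to the end of the material audio when no accent beat remains, so every switch lands on an accent beat:

```python
def sub_video_durations(accents, audio_duration):
    """Duration of each sub-video so that every switch between material
    videos falls on an accent beat time point of the material audio."""
    durations = []
    covered = 0.0  # "first total duration" of the sub-videos generated so far
    while covered < audio_duration:
        upcoming = [t for t in accents if t > covered]
        nxt = min(upcoming) if upcoming else audio_duration
        durations.append(nxt - covered)
        covered = nxt
    return durations

# 15 s of material audio with accent beats at 3, 7, 10, and 14 s.
print(sub_video_durations([3.0, 7.0, 10.0, 14.0], 15.0))
# [3.0, 4.0, 3.0, 4.0, 1.0]
```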
  • In the method provided by the embodiments of the present application, the material video set, the material audio, and the accent beat time points of the material audio are obtained from the server; multiple material videos are then selected from the material video set; and finally, based on the accent beat time points, the multiple material videos and the material audio are synthesized to obtain a synthesized video in which the switching time point of each material video is an accent beat time point of the audio data. In this way, the terminal can automatically obtain the material and automatically synthesize the material video and the material audio without manual processing, so the efficiency is higher.
  • In addition, a composite video can be made from the corresponding material audio and the shared material videos stored in the server, and the composite video can be used as a demo (sample) corresponding to the music and played automatically for users.
  • In this way, users can be attracted to enter the above-mentioned video production application and make composite videos by themselves.
  • Based on the same technical concept, an embodiment of the present application also provides a video synthesis device.
  • The device may be the terminal in the foregoing embodiments.
  • As shown in FIG. 11, the device includes a sending module 1110, an acquiring module 1120, a determining module 1130, and a synthesis module 1140.
  • The sending module 1110 is configured to send a material acquisition request to the server, where the material acquisition request carries feature information of the material audio.
  • The acquiring module 1120 is configured to acquire the material video set, the material audio, and the accent beat time point of the material audio sent by the server.
  • The determining module 1130 is configured to determine multiple material videos in the material video set based on the accent beat time point.
  • The synthesis module 1140 is configured to synthesize the multiple material videos and the material audio based on the accent beat time point to obtain a synthesized video, where the switching time point of each material video in the synthesized video is an accent beat time point of the audio data.
  • Optionally, the determining module 1130 is configured to:
  • determine a plurality of material videos in the material video set.
  • Optionally, the determining module 1130 is configured to:
  • determine N+1 material videos in the material video set.
  • Optionally, the synthesis module 1140 is configured to:
  • synthesize each sub-video to obtain a synthesized material video; and
  • synthesize the synthesized material video and the material audio to obtain the synthesized video.
  • Optionally, the synthesis module 1140 is configured to:
  • determine N+1 material videos in the material video set.
  • Optionally, the material video set is a total material video spliced from a plurality of material videos.
  • Optionally, the material video set is a video set including multiple independent material videos.
  • Optionally, the acquiring module 1120 is configured to:
  • determine the accent beat time point of the material audio used for synthesizing the video.
  • It should be noted that the video synthesis device provided in the above embodiment is illustrated only by the division of the above functional modules when it synthesizes videos. In practical applications, the above functions can be allocated to different functional modules as needed; that is, the internal structure of the terminal can be divided into different functional modules to complete all or part of the functions described above.
  • In addition, the video synthesis device provided in the foregoing embodiment belongs to the same concept as the video synthesis method embodiments; see the method embodiments for its specific implementation process, which is not repeated here.
  • FIG. 12 shows a structural block diagram of a terminal 1200 provided by an exemplary embodiment of the present application.
  • The terminal 1200 may be a smart phone, a tablet computer, an MP3 (Moving Picture Experts Group Audio Layer III) player, an MP4 (Moving Picture Experts Group Audio Layer IV) player, a laptop, or a desktop computer.
  • The terminal 1200 may also be called user equipment, a portable terminal, a laptop terminal, a desktop terminal, or by other names.
  • Generally, the terminal 1200 includes a processor 1201 and a memory 1202.
  • The processor 1201 may include one or more processing cores, such as a 4-core processor or an 8-core processor.
  • The processor 1201 may be implemented in at least one of the following hardware forms: DSP (Digital Signal Processing), FPGA (Field-Programmable Gate Array), or PLA (Programmable Logic Array).
  • The processor 1201 may also include a main processor and a coprocessor.
  • The main processor is a processor for processing data in the awake state, also called a CPU (Central Processing Unit); the coprocessor is a low-power processor for processing data in the standby state.
  • In some embodiments, the processor 1201 may be integrated with a GPU (Graphics Processing Unit), which is responsible for rendering and drawing the content to be displayed on the display screen.
  • In some embodiments, the processor 1201 may also include an AI (Artificial Intelligence) processor for processing computing operations related to machine learning.
  • The memory 1202 may include one or more computer-readable storage media, which may be non-transitory.
  • The memory 1202 may also include high-speed random access memory and non-volatile memory, such as one or more magnetic disk storage devices and flash memory storage devices.
  • In some embodiments, the non-transitory computer-readable storage medium in the memory 1202 is used to store at least one instruction, and the at least one instruction is executed by the processor 1201 to implement the video synthesis method provided in the method embodiments of the present application.
  • In some embodiments, the terminal 1200 may further include a peripheral device interface 1203 and at least one peripheral device.
  • The processor 1201, the memory 1202, and the peripheral device interface 1203 may be connected by a bus or a signal line.
  • Each peripheral device can be connected to the peripheral device interface 1203 through a bus, a signal line, or a circuit board.
  • Specifically, the peripheral devices include at least one of a radio frequency circuit 1204, a touch display screen 1205, a camera 1206, an audio circuit 1207, a positioning component 1208, and a power supply 1209.
  • The peripheral device interface 1203 may be used to connect at least one I/O (Input/Output)-related peripheral device to the processor 1201 and the memory 1202.
  • In some embodiments, the processor 1201, the memory 1202, and the peripheral device interface 1203 are integrated on the same chip or circuit board; in some other embodiments, any one or two of the processor 1201, the memory 1202, and the peripheral device interface 1203 may be implemented on a separate chip or circuit board, which is not limited in this embodiment.
  • The radio frequency circuit 1204 is used for receiving and transmitting RF (Radio Frequency) signals, also called electromagnetic signals.
  • The radio frequency circuit 1204 communicates with communication networks and other communication devices through electromagnetic signals.
  • The radio frequency circuit 1204 converts electrical signals into electromagnetic signals for transmission, or converts received electromagnetic signals into electrical signals.
  • Optionally, the radio frequency circuit 1204 includes an antenna system, an RF transceiver, one or more amplifiers, a tuner, an oscillator, a digital signal processor, a codec chipset, a subscriber identity module card, and so on.
  • The radio frequency circuit 1204 can communicate with other terminals through at least one wireless communication protocol.
  • The wireless communication protocols include, but are not limited to: metropolitan area networks, various generations of mobile communication networks (2G, 3G, 4G, and 5G), wireless local area networks, and/or WiFi (Wireless Fidelity) networks.
  • In some embodiments, the radio frequency circuit 1204 may also include NFC (Near Field Communication)-related circuits, which is not limited in this application.
  • The display screen 1205 is used to display a UI (User Interface).
  • The UI can include graphics, text, icons, videos, and any combination thereof.
  • The display screen 1205 also has the ability to collect touch signals on or above its surface.
  • The touch signal may be input to the processor 1201 as a control signal for processing.
  • At this time, the display screen 1205 may also be used to provide virtual buttons and/or a virtual keyboard, also called soft buttons and/or a soft keyboard.
  • In some embodiments, there may be one display screen 1205, provided on the front panel of the terminal 1200; in other embodiments, there may be at least two display screens 1205, respectively arranged on different surfaces of the terminal 1200 or in a folded design; in still other embodiments, the display screen 1205 may be a flexible display screen disposed on a curved or folding surface of the terminal 1200. The display screen 1205 can even be set in a non-rectangular irregular pattern, that is, a special-shaped screen.
  • The display screen 1205 may be made of materials such as LCD (Liquid Crystal Display) and OLED (Organic Light-Emitting Diode).
  • The camera assembly 1206 is used to capture images or videos.
  • Optionally, the camera assembly 1206 includes a front camera and a rear camera.
  • Generally, the front camera is arranged on the front panel of the terminal, and the rear camera is arranged on the back of the terminal.
  • In some embodiments, there are at least two rear cameras, each being one of a main camera, a depth-of-field camera, a wide-angle camera, or a telephoto camera, so that the main camera and the depth-of-field camera can be fused to realize the background blur function, or the main camera and the wide-angle camera can be fused to realize panoramic shooting, VR (Virtual Reality) shooting, or other fusion shooting functions.
  • In some embodiments, the camera assembly 1206 may also include a flash.
  • The flash can be a single-color-temperature flash or a dual-color-temperature flash. A dual-color-temperature flash refers to a combination of a warm-light flash and a cold-light flash, and can be used for light compensation under different color temperatures.
  • The audio circuit 1207 may include a microphone and a speaker.
  • The microphone is used to collect sound waves from the user and the environment, convert the sound waves into electrical signals, and input them to the processor 1201 for processing or to the radio frequency circuit 1204 for voice communication. For stereo collection or noise reduction, there may be multiple microphones arranged at different parts of the terminal 1200.
  • The microphone can also be an array microphone or an omnidirectional collection microphone.
  • The speaker is used to convert electrical signals from the processor 1201 or the radio frequency circuit 1204 into sound waves.
  • The speaker can be a traditional membrane speaker or a piezoelectric ceramic speaker.
  • When the speaker is a piezoelectric ceramic speaker, it can not only convert electrical signals into sound waves audible to humans, but also convert electrical signals into sound waves inaudible to humans for purposes such as distance measurement.
  • In some embodiments, the audio circuit 1207 may also include a headphone jack.
  • The positioning component 1208 is used to locate the current geographic position of the terminal 1200 to implement navigation or LBS (Location Based Service).
  • The positioning component 1208 may be a positioning component based on the GPS (Global Positioning System) of the United States, the BeiDou system of China, the GLONASS system of Russia, or the Galileo system of the European Union.
  • The power supply 1209 is used to supply power to the various components in the terminal 1200.
  • The power supply 1209 may be an alternating current supply, a direct current supply, a disposable battery, or a rechargeable battery.
  • When the power supply 1209 includes a rechargeable battery, the rechargeable battery may support wired or wireless charging.
  • The rechargeable battery can also be used to support fast charging technology.
  • In some embodiments, the terminal 1200 further includes one or more sensors 1210.
  • The one or more sensors 1210 include, but are not limited to: an acceleration sensor 1211, a gyroscope sensor 1212, a pressure sensor 1213, a fingerprint sensor 1214, an optical sensor 1215, and a proximity sensor 1216.
  • The acceleration sensor 1211 can detect the magnitude of acceleration on the three coordinate axes of the coordinate system established by the terminal 1200.
  • For example, the acceleration sensor 1211 can be used to detect the components of gravitational acceleration on the three coordinate axes.
  • The processor 1201 may control the touch display screen 1205 to display the user interface in a landscape view or a portrait view according to the gravitational acceleration signal collected by the acceleration sensor 1211.
  • The acceleration sensor 1211 can also be used to collect game or user motion data.
  • The gyroscope sensor 1212 can detect the body direction and rotation angle of the terminal 1200, and can cooperate with the acceleration sensor 1211 to collect the user's 3D actions on the terminal 1200.
  • The processor 1201 can implement the following functions according to the data collected by the gyroscope sensor 1212: motion sensing (such as changing the UI according to the user's tilt operation), image stabilization during shooting, game control, and inertial navigation.
  • The pressure sensor 1213 may be disposed on the side frame of the terminal 1200 and/or the lower layer of the touch display screen 1205.
  • When the pressure sensor 1213 is disposed on the side frame of the terminal 1200, the user's holding signal can be detected, and the processor 1201 performs left/right hand recognition or quick operations according to the holding signal collected by the pressure sensor 1213.
  • When the pressure sensor 1213 is disposed on the lower layer of the touch display screen 1205, the processor 1201 controls the operability controls on the UI according to the user's pressure operation on the touch display screen 1205.
  • The operability controls include at least one of a button control, a scroll bar control, an icon control, and a menu control.
  • The fingerprint sensor 1214 is used to collect the user's fingerprint.
  • The processor 1201 identifies the user's identity according to the fingerprint collected by the fingerprint sensor 1214, or the fingerprint sensor 1214 itself identifies the user's identity according to the collected fingerprint. When the identity is recognized as trusted, the processor 1201 authorizes the user to perform related sensitive operations, including unlocking the screen, viewing encrypted information, downloading software, making payments, and changing settings.
  • The fingerprint sensor 1214 may be provided on the front, back, or side of the terminal 1200. When a physical button or a manufacturer logo is provided on the terminal 1200, the fingerprint sensor 1214 can be integrated with the physical button or the manufacturer logo.
  • The optical sensor 1215 is used to collect the ambient light intensity.
  • In one embodiment, the processor 1201 may control the display brightness of the touch display screen 1205 according to the ambient light intensity collected by the optical sensor 1215: when the ambient light intensity is high, the display brightness of the touch display screen 1205 is increased; when the ambient light intensity is low, the display brightness is decreased.
  • In another embodiment, the processor 1201 may also dynamically adjust the shooting parameters of the camera assembly 1206 according to the ambient light intensity collected by the optical sensor 1215.
  • The proximity sensor 1216, also called a distance sensor, is usually arranged on the front panel of the terminal 1200.
  • The proximity sensor 1216 is used to collect the distance between the user and the front of the terminal 1200.
  • In one embodiment, when the proximity sensor 1216 detects that the distance between the user and the front of the terminal 1200 gradually decreases, the processor 1201 controls the touch display screen 1205 to switch from the bright-screen state to the rest-screen state; when the proximity sensor 1216 detects that the distance between the user and the front of the terminal 1200 gradually increases, the processor 1201 controls the touch display screen 1205 to switch from the rest-screen state to the bright-screen state.
  • Those skilled in the art can understand that the structure shown in FIG. 12 does not constitute a limitation on the terminal 1200, which may include more or fewer components than shown, combine certain components, or adopt a different component arrangement.
  • In an exemplary embodiment, a computer-readable storage medium, such as a memory storing instructions, is also provided.
  • The instructions can be executed by a processor in the terminal to complete the video synthesis method of the foregoing embodiments.
  • The computer-readable storage medium may be non-transitory.
  • For example, the computer-readable storage medium may be a ROM (Read-Only Memory), a RAM (Random Access Memory), a CD-ROM, a magnetic tape, a floppy disk, an optical data storage device, or the like.

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Television Signal Processing For Recording (AREA)
  • Two-Way Televisions, Distribution Of Moving Picture Or The Like (AREA)

Abstract

The invention relates to a video synthesis method in the technical field of video processing. The method comprises the steps of: sending a material acquisition request to a server, the material acquisition request carrying feature information of a material audio; acquiring a material video set, the material audio, and an accent beat time point of the material audio sent by the server; determining, on the basis of the accent beat time point, a plurality of material videos from the material video set; and synthesizing, on the basis of the accent beat time point, the plurality of material videos and the material audio to obtain a synthesized video, a switching time point of each material video in the synthesized video being the accent beat time point of the audio data. By means of the present invention, video synthesis efficiency can be improved.
PCT/CN2019/120302 2019-07-17 2019-11-22 Video synthesis method and apparatus, terminal, and storage medium WO2021008055A1 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201910647507.9A CN110336960B (zh) 2019-07-17 2019-07-17 Video synthesis method and apparatus, terminal, and storage medium
CN201910647507.9 2019-07-17

Publications (1)

Publication Number Publication Date
WO2021008055A1 (fr) 2021-01-21

Family

ID=68145712

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/120302 WO2021008055A1 (fr) 2019-07-17 2019-11-22 Video synthesis method and apparatus, terminal, and storage medium

Country Status (2)

Country Link
CN (1) CN110336960B (fr)
WO (1) WO2021008055A1 (fr)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113613061A (zh) * 2021-07-06 2021-11-05 北京达佳互联信息技术有限公司 Beat-point template generation method, apparatus, device, and storage medium
CN113727038A (zh) * 2021-07-28 2021-11-30 北京达佳互联信息技术有限公司 Video processing method and apparatus, electronic device, and storage medium
CN114286164A (zh) * 2021-12-28 2022-04-05 北京思明启创科技有限公司 Video synthesis method and apparatus, electronic device, and storage medium
CN114390356A (zh) * 2022-01-19 2022-04-22 维沃移动通信有限公司 Video processing method, video processing apparatus, and electronic device

Families Citing this family (22)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112235631B (zh) * 2019-07-15 2022-05-03 北京字节跳动网络技术有限公司 Video processing method and apparatus, electronic device, and storage medium
CN110336960B (zh) * 2019-07-17 2021-12-10 广州酷狗计算机科技有限公司 Video synthesis method and apparatus, terminal, and storage medium
CN110519638B (zh) * 2019-09-06 2023-05-16 Oppo广东移动通信有限公司 Processing method, processing apparatus, electronic apparatus, and storage medium
CN110677711B (zh) * 2019-10-17 2022-03-01 北京字节跳动网络技术有限公司 Video soundtrack-matching method and apparatus, electronic device, and computer-readable medium
CN110797055B (zh) * 2019-10-29 2021-09-03 北京达佳互联信息技术有限公司 Multimedia resource synthesis method and apparatus, electronic device, and storage medium
CN110769309B (zh) * 2019-11-04 2023-03-31 北京字节跳动网络技术有限公司 Method, apparatus, electronic device, and medium for displaying music points
CN112822563A (zh) * 2019-11-15 2021-05-18 北京字节跳动网络技术有限公司 Method, apparatus, electronic device, and computer-readable medium for generating video
CN112822541B (zh) * 2019-11-18 2022-05-20 北京字节跳动网络技术有限公司 Video generation method and apparatus, electronic device, and computer-readable medium
CN111064992A (zh) * 2019-12-10 2020-04-24 懂频智能科技(上海)有限公司 Method for automatically switching video content according to music beats
CN110933487B (zh) * 2019-12-18 2022-05-03 北京百度网讯科技有限公司 Beat-synced video generation method, apparatus, device, and storage medium
CN111065001B (zh) * 2019-12-25 2022-03-22 广州酷狗计算机科技有限公司 Video production method, apparatus, device, and storage medium
CN111031394B (zh) * 2019-12-30 2022-03-22 广州酷狗计算机科技有限公司 Video production method, apparatus, device, and storage medium
CN111625682B (zh) * 2020-04-30 2023-10-20 腾讯音乐娱乐科技(深圳)有限公司 Video generation method and apparatus, computer device, and storage medium
CN111741365B (zh) * 2020-05-15 2021-10-26 广州小迈网络科技有限公司 Video synthesis data processing method, system, apparatus, and storage medium
CN111970571B (zh) * 2020-08-24 2022-07-26 北京字节跳动网络技术有限公司 Video production method, apparatus, device, and storage medium
CN112153463B (zh) * 2020-09-04 2023-06-16 上海七牛信息技术有限公司 Multi-material video synthesis method and apparatus, electronic device, and storage medium
CN112866584B (zh) * 2020-12-31 2023-01-20 北京达佳互联信息技术有限公司 Video synthesis method and apparatus, terminal, and storage medium
CN113014959B (zh) * 2021-03-15 2022-08-09 福建省捷盛网络科技有限公司 Internet short-video merging system
CN115695899A (zh) * 2021-07-23 2023-02-03 花瓣云科技有限公司 Video generation method, electronic device, and medium
CN113676772B (zh) * 2021-08-16 2023-08-08 上海哔哩哔哩科技有限公司 Video generation method and apparatus
WO2023051245A1 (fr) * 2021-09-29 2023-04-06 北京字跳网络技术有限公司 Video processing method and apparatus, device, and storage medium
CN113923378B (zh) * 2021-09-29 2024-03-19 北京字跳网络技术有限公司 Video processing method and apparatus, device, and storage medium

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101421707A (zh) * 2006-04-13 2009-04-29 伊默生公司 System and method for automatically producing haptic events from a digital audio signal
CN101640057A (zh) * 2009-05-31 2010-02-03 北京中星微电子有限公司 Audio and video matching method and apparatus
US20100220197A1 (en) * 2009-03-02 2010-09-02 John Nicholas Dukellis Assisted Video Creation Utilizing a Camera
CN102117638A (zh) * 2009-12-30 2011-07-06 北京华旗随身数码股份有限公司 Method and playback apparatus for video output controlled by music rhythm
CN107770457A (zh) * 2017-10-27 2018-03-06 维沃移动通信有限公司 Video production method and mobile terminal
CN108124101A (zh) * 2017-12-18 2018-06-05 北京奇虎科技有限公司 Video capture method and apparatus, electronic device, and computer-readable storage medium
CN108259983A (zh) * 2017-12-29 2018-07-06 广州市百果园信息技术有限公司 Video image processing method, computer-readable storage medium, and terminal
CN109413342A (zh) * 2018-12-21 2019-03-01 广州酷狗计算机科技有限公司 Audio and video processing method and apparatus, terminal, and storage medium
CN110336960A (zh) * 2019-07-17 2019-10-15 广州酷狗计算机科技有限公司 Video synthesis method and apparatus, terminal, and storage medium

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2001313915A (ja) * 2000-04-28 2001-11-09 Matsushita Electric Ind Co Ltd Video conference apparatus
CN107483843B (zh) * 2017-08-16 2019-11-15 成都品果科技有限公司 Audio-video matching and editing method and apparatus
CN107770626B (zh) * 2017-11-06 2020-03-17 腾讯科技(深圳)有限公司 Video material processing method, video synthesis method, apparatus, and storage medium


Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113613061A (zh) * 2021-07-06 2021-11-05 北京达佳互联信息技术有限公司 Beat-point template generation method, apparatus, device, and storage medium
CN113727038A (zh) * 2021-07-28 2021-11-30 北京达佳互联信息技术有限公司 Video processing method and apparatus, electronic device, and storage medium
CN113727038B (zh) 2021-07-28 2023-09-05 北京达佳互联信息技术有限公司 Video processing method and apparatus, electronic device, and storage medium
CN114286164A (zh) * 2021-12-28 2022-04-05 北京思明启创科技有限公司 Video synthesis method and apparatus, electronic device, and storage medium
CN114286164B (zh) 2021-12-28 2024-02-09 北京思明启创科技有限公司 Video synthesis method and apparatus, electronic device, and storage medium
CN114390356A (zh) * 2022-01-19 2022-04-22 维沃移动通信有限公司 Video processing method, video processing apparatus, and electronic device

Also Published As

Publication number Publication date
CN110336960A (zh) 2019-10-15
CN110336960B (zh) 2021-12-10

Similar Documents

Publication Publication Date Title
WO2021008055A1 (fr) Video synthesis method and apparatus, terminal, and storage medium
WO2020253096A1 (fr) Video synthesis method and apparatus, terminal, and storage medium
CN110267067B (zh) Live streaming room recommendation method, apparatus, device, and storage medium
US11632584B2 (en) Video switching during music playback
CN111065001B (zh) Video production method, apparatus, device, and storage medium
CN110491358B (zh) Method, apparatus, device, system, and storage medium for audio recording
CN111918090B (zh) Live broadcast picture display method and apparatus, terminal, and storage medium
WO2021068903A1 (fr) Method for determining volume adjustment ratio information, apparatus, device, and storage medium
CN111142838B (zh) Audio playback method and apparatus, computer device, and storage medium
CN111061405B (zh) Method, apparatus, device, and storage medium for recording song audio
EP3618055B1 (fr) Audio mixing method and terminal, and storage medium
CN109743461B (zh) Audio data processing method and apparatus, terminal, and storage medium
CN109982129B (zh) Short-video playback control method, apparatus, and storage medium
WO2021139535A1 (fr) Method, apparatus, and system for playing audio content, and device and storage medium
WO2022095465A1 (fr) Information display method and apparatus
WO2023011050A1 (fr) Method and system for realizing microphone-connection chorus, and device and storage medium
WO2020244516A1 (fr) Online interaction method and device
CN110798327B (zh) Message processing method, device, and storage medium
CN111818358A (zh) Audio file playback method and apparatus, terminal, and storage medium
WO2022227581A1 (fr) Resource display method and computer device
CN111083526A (zh) Video transition method and apparatus, computer device, and storage medium
CN114245218A (zh) Audio and video playback method and apparatus, computer device, and storage medium
CN112822544B (zh) Video material file generation method, video synthesis method, device, and medium
CN111031394B (zh) Video production method, apparatus, device, and storage medium
CN110808021B (zh) Audio playback method and apparatus, terminal, and storage medium

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 19937614

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 19937614

Country of ref document: EP

Kind code of ref document: A1