WO2016177296A1

WO2016177296A1 - Video generation method and apparatus

Info

Publication number: WO2016177296A1
Application number: PCT/CN2016/080666
Authority: WO
Inventors: 王超; 李纯
Original assignee: 腾讯科技（深圳）有限公司
Priority date: 2015-05-04
Filing date: 2016-04-29
Publication date: 2016-11-10
Also published as: CN104967900B; CN104967900A

Abstract

Disclosed are a video generation method and apparatus, which belong to the technical field of computers. The method comprises: playing an accompanying audio of a song to be recorded, displaying a lyric subtitle corresponding to the accompanying audio, and performing video image shooting and audio recording; after video image shooting and audio recording are completed, displaying options of at least one pre-stored video special-effect combination, and displaying a synthetic key; when a selection instruction for a first video special-effect combination of at least one video special-effect combination is received, according to the first video special-effect combination, performing special-effect combination processing on a shot video image and displaying the processed video image, and at the same time, playing the accompanying audio and the recorded audio; and when a click instruction for the synthetic key is received, synthesizing the accompanying audio, the recorded audio and the processed video image to obtain a synthesized video. By means of the present invention, the flexibility of song video recording can be enhanced.

Description

Method and device for generating video

The present application claims priority to Chinese Patent Application No. 201510221018.9, entitled "A Method and Apparatus for Generating Video", filed on May 4, 2015, the entire contents of which are incorporated herein by reference.

Technical field

The embodiments of the present invention relate to the field of computer technologies, and in particular, to a method and an apparatus for generating a video.

Background

With the development of computer technology, terminals such as mobile phones and computers have come into wide use, the variety of applications available on them keeps growing, and the functions of those applications are becoming increasingly rich. The singing application (or karaoke application) is a very popular entertainment application.

Users can record songs through the singing application. While recording a song, they can also capture video images and obtain a song with video.

In the process of implementing the embodiments of the present invention, the inventors have found that the related art has at least the following problems:

Based on the above process of recording a song, the video accompanying the song is generally a plain presentation of what the camera captured, so the flexibility of recording a song video is poor.

Summary of the invention

In order to solve the problems of the related art, an embodiment of the present invention provides a method and apparatus for generating a video. The technical solution is as follows:

In a first aspect, a method of generating a video is provided, the method comprising:

Playing the accompaniment audio of the song to be recorded, displaying the lyrics subtitle corresponding to the accompaniment audio, and performing video image shooting and audio recording;

After the capturing of the video image and the recording of the audio have ended, displaying an option of at least one video effect combination stored in advance, and displaying a composite button, wherein the video effect combination includes at least one filter and/or at least one foreground video;

When a selection instruction for a first video effect combination in the at least one video effect combination is received, performing, according to the first video effect combination, combined special effect processing on the captured video image, and displaying the processed video image while playing the accompaniment audio and the recorded audio;

When the click command for the composite button is received, the accompaniment audio, the recorded audio, and the processed video image are combined to obtain a composite video.

In a second aspect, an apparatus for generating a video is provided, the apparatus comprising:

a playing module, configured to play the accompaniment audio of the song to be recorded, display the lyrics subtitle corresponding to the accompaniment audio, and perform video image shooting and audio recording;

a display module, configured to display an option of at least one pre-stored video special effect combination after the capturing of the video image and the recording of the audio have ended, and to display a composite button, wherein the video special effect combination includes at least one filter and/or at least one foreground video;

a preview module, configured to, when receiving a selection instruction for a first video special effect combination in the at least one video special effect combination, perform combined special effect processing on the captured video image according to the first video special effect combination, and display the processed video image while playing the accompaniment audio and the recorded audio;

And a synthesizing module, configured to synthesize the accompaniment audio, the recorded audio, and the processed video image to obtain a synthesized video when receiving a click instruction to the composite button.

In a third aspect, a terminal is provided, where the terminal includes:

One or more processors; and

Memory

The memory stores one or more programs, the one or more programs being configured to be executed by the one or more processors, the one or more programs including instructions for:

When receiving a selection instruction for the first video special effect combination in the at least one video special effect combination, performing combined special effect processing on the captured video image according to the first video special effect combination, and displaying the processed video image while playing the accompaniment audio and the recorded audio;

The beneficial effects brought by the technical solutions provided by the embodiments of the present invention are:

In the embodiment of the present invention, the accompaniment audio of the song to be recorded is played, the lyrics subtitle corresponding to the accompaniment audio is displayed, and the video image is captured and the audio is recorded. After the capturing of the video image and the recording of the audio have ended, an option of at least one pre-stored video effect combination is displayed together with a composite button, wherein the video effect combination includes at least one filter and/or at least one foreground video. When a selection instruction for the first video special effect combination in the at least one video special effect combination is received, combined special effect processing is performed on the captured video image according to the first video special effect combination, and the processed video image is displayed while the accompaniment audio and the recorded audio are played. When the click command for the composite button is received, the accompaniment audio, the recorded audio, and the processed video image are combined to obtain a composite video. The video image in the composite video thus obtained is not a plain presentation of the content captured by the camera but has been processed with a combination of video effects, which enhances the flexibility of recording a song video.

Brief description of the drawings

In order to illustrate the technical solutions in the embodiments of the present invention more clearly, the drawings used in the description of the embodiments are briefly described below. The drawings in the following description show only some embodiments of the present invention, and those of ordinary skill in the art may obtain other drawings from them without creative effort.

FIG. 1 is a flowchart of a method for generating a video according to an embodiment of the present invention;

FIG. 2 is a schematic diagram of an interface display according to an embodiment of the present invention;

FIG. 3 is a schematic diagram of an interface display according to an embodiment of the present invention;

FIG. 4 is a schematic diagram of an interface display according to an embodiment of the present invention;

FIG. 5 is a schematic structural diagram of a video generating apparatus according to an embodiment of the present invention;

FIG. 6 is a schematic structural diagram of a terminal according to an embodiment of the present invention.

Detailed description

The embodiments of the present invention will be further described in detail below with reference to the accompanying drawings.

An embodiment of the present invention provides a method for generating a video. As shown in FIG. 1 , the processing procedure of the method may include the following steps:

Step 101: Play the accompaniment audio of the song to be recorded, display the lyrics subtitle corresponding to the accompaniment audio, and perform video image shooting and audio recording.

Step 102: After the shooting of the video image and the recording of the audio are finished, display an option of at least one pre-stored video special effect combination, and display a composite button, wherein the video special effect combination includes at least one filter and/or at least one foreground video.

Step 103: When receiving a selection instruction for the first video special effect combination in the at least one video special effect combination, perform combined special effect processing on the captured video image according to the first video special effect combination, and display the processed video image while playing the accompaniment audio and the recorded audio.

Step 104: When receiving a click command for the composite button, synthesizing the accompaniment audio, the recorded audio, and the processed video image to obtain a composite video.

The embodiment of the invention provides a method for generating a video, and the method is executed by a terminal. The terminal may be any terminal with a video capture function, such as a mobile phone or tablet with a camera, and an application for recording songs and videos may be installed on the terminal. The terminal may be provided with a processor and a memory: the processor may be used for processing video images and audio, and the memory may be used for storing the data required and generated in the processing described below. The terminal may also be equipped with input and output devices such as a camera, a microphone, a screen, and an audio output device. The camera can be used for capturing video images, the microphone for recording audio, the screen for displaying video, lyrics subtitles, and the like, and the audio output device, which may be headphones or a speaker, for playing audio. This embodiment describes the solution in detail using a mobile phone as an example of the terminal. Other cases are similar and are not described again.

The processing flow shown in FIG. 1 will be described in detail below with reference to specific implementations, and the content can be as follows:

The song to be recorded may be a song or a song segment that the user wants to sing karaoke to.

In the implementation, the user can install the above application on the terminal and launch it, which triggers the terminal to display the main interface of the application. A song-selection button can be displayed in the main interface. After the user clicks the song-selection button, the terminal switches to the song selection interface of the application, which can display a song list in which the user can select a song he or she likes (that is, the target song mentioned later). The song list can show songs whose accompaniment files are stored locally as well as songs whose accompaniment files are stored on the network side. After the user selects a song in the song list, the terminal can use that song as the song to be recorded and display its recording interface, which can contain a recording button. When the user clicks the recording button, the terminal is triggered to retrieve the accompaniment file of the song and play it to output the accompaniment audio, and the lyrics subtitle corresponding to the song is displayed on the recording interface. At the same time, the terminal can start capturing video images through its front camera and recording audio through the microphone. In addition, the video image captured by the front camera can be shown on the screen in real time, so that the user can adjust the position of the terminal based on the image displayed.

Optionally, the user can select a favorite segment of the target song as the song to be recorded. Correspondingly, before step 101, the following processing may be performed: determining the song segment selected by the user in the target song. Correspondingly, in step 101, the accompaniment audio of the song segment can be played.

In the implementation, after the user selects a certain song (ie, the target song) in the song list, the user can select a song segment to be recorded in the song, and further, the terminal can obtain the song segment selected by the user in the target song.

Optionally, the user can select a song segment in the target song in a variety of ways, and several alternative treatments are given below:

In the first mode, the lyrics list of the target song is displayed; the starting point and the ending point of the song segment set by the user in the lyrics list are obtained; and the song segment selected by the user in the target song is determined according to the starting point and the ending point.

In the implementation, the user can drag a start line and an end line displayed in the lyrics, and the song segment corresponding to the lyric content between the start line and the end line is the song segment selected by the user. Specifically, as shown in FIG. 2, the lyric list of the target song may be displayed in the recording interface, with the start line and the end line displayed in the lyric list. The user can drag the start line and the end line up and down to select a favorite song segment from the lyric list. The lyric line below the start line is the starting lyric (starting sentence) of the selected song segment, and the lyric line above the end line is the ending lyric (terminating sentence). The user can then click the recording button, which triggers the terminal to obtain the start time point of the starting lyric and the end time point of the ending lyric as the starting point and the ending point of the song segment, and to determine the selected song segment in the target song according to the starting point and the ending point. The terminal can then play the accompaniment audio of the song segment.
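The mapping from the dragged start and end lines to a starting point and an ending point can be sketched as follows. This is a minimal illustration only: the `LyricLine` type, the `segment_from_lyrics` function, and the sample timings are hypothetical names and values that do not appear in the embodiment.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LyricLine:
    """One lyric sentence with its time range in the song (hypothetical type)."""
    text: str
    start_s: float  # time at which the sentence starts, in seconds
    end_s: float    # time at which the sentence ends, in seconds


def segment_from_lyrics(lines, first_index, last_index):
    """Return (starting point, ending point) in seconds for the selection.

    first_index is the sentence just below the start line (the starting
    sentence); last_index is the sentence just above the end line (the
    terminating sentence).
    """
    if not 0 <= first_index <= last_index < len(lines):
        raise ValueError("selection must cover at least one lyric line")
    # Start time of the starting sentence, end time of the terminating sentence.
    return lines[first_index].start_s, lines[last_index].end_s


# Sample lyric timings, invented for illustration.
lyrics = [
    LyricLine("line one", 0.0, 4.5),
    LyricLine("line two", 4.5, 9.0),
    LyricLine("line three", 9.0, 15.0),
    LyricLine("line four", 15.0, 21.5),
]
segment = segment_from_lyrics(lyrics, 1, 2)
```

Here the user has placed the start line above "line two" and the end line below "line three", so the segment runs from the start of the second sentence to the end of the third.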

In addition, an upper limit on the duration of the song segment, such as 30 seconds, can be set in the application. If the time difference between the above starting point and ending point is greater than 30 seconds, the recording button in the recording interface can be set to an unclickable state. Likewise, a minimum duration of the song segment, such as 10 seconds, can be set in advance in the application. If the time difference between the above starting point and ending point is less than 10 seconds, the recording button in the recording interface can be set to an unclickable state.
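The duration check that decides whether the recording button is clickable can be sketched as below. The 10-second and 30-second limits are the examples given in the text; the function and constant names are hypothetical.

```python
MIN_SEGMENT_S = 10.0  # example minimum duration from the text
MAX_SEGMENT_S = 30.0  # example maximum duration from the text


def record_button_enabled(start_s, end_s, min_s=MIN_SEGMENT_S, max_s=MAX_SEGMENT_S):
    """Return True when the selected segment length is within the limits.

    The button is disabled only when the duration is strictly less than the
    minimum or strictly greater than the maximum, as described in the text.
    """
    duration = end_s - start_s
    return min_s <= duration <= max_s
```

A segment of exactly 10 or exactly 30 seconds leaves the button clickable under this reading, because the text disables it only for durations below 10 or above 30 seconds.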

In the second mode, the playing time axis of the target song is displayed; the starting point and the ending point of the song segment set by the user in the playing time axis are acquired; and the song segment selected by the user in the target song is determined according to the starting point and the ending point.

In the implementation, the recording interface can display the playing time axis of the target song and the recording button, and two lines located at different positions can be displayed on the playing time axis. The user can select a favorite song segment by dragging the two lines. After the selection, the user can click the recording button in the recording interface, which triggers the terminal to obtain the playing time points of the two lines. These two playing time points are the starting point and the ending point of the song segment selected by the user. The terminal can then acquire the song segment between the starting point and the ending point in the target song and play the accompaniment audio of the song segment.

Optionally, before recording starts, a variety of filter options can also be displayed in the recording interface, as shown in FIG. 3, from which the user can select a filter for real-time processing of the captured video image. After selecting the filter, the user can click the recording button to start recording. The application can then process each captured image frame according to the filter selected by the user, output the filtered video image to the screen for display, and encode it and save it to a file in real time.
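Per-frame filtering of this kind can be sketched as follows, using a black and white filter as the example. The embodiment does not specify how a filter computes pixel values; the luma weights below (0.299, 0.587, 0.114) are an assumption, as are the function names and the representation of a frame as rows of (r, g, b) tuples.

```python
def black_and_white(frame):
    """Convert an RGB frame (rows of (r, g, b) tuples) to grayscale.

    The weighting is an assumed, commonly used luma formula; the embodiment
    itself does not say how the filter adjusts pixel values.
    """
    out = []
    for row in frame:
        new_row = []
        for r, g, b in row:
            y = round(0.299 * r + 0.587 * g + 0.114 * b)
            new_row.append((y, y, y))
        out.append(new_row)
    return out


def process_stream(frames, frame_filter):
    """Yield each captured frame after applying the selected filter."""
    for frame in frames:
        yield frame_filter(frame)
```

In a real application each yielded frame would then be sent to the screen and to the encoder; the generator form simply reflects that frames are handled one at a time as they are captured.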

Step 102: After the shooting of the video image and the recording of the audio are finished, displaying an option of the at least one video special effect combination stored in advance, and displaying the composite button; wherein the video special effect combination includes at least one filter and/or at least one foreground video.

A video special effect combination is a combination of multiple video special effects for processing video images. A video special effect may be a filter, a foreground video, or the like. A filter is a tool that adjusts the pixel value of each pixel in the video image to achieve a specific visual effect, such as a black and white filter or a quaint filter. A foreground video is a video overlaid on top of the video image.

In the implementation, the shooting of the video image and the recording of the audio end when the last audio frame of the accompaniment audio of the song to be recorded finishes playing, or when the user clicks the end button in the recording interface while the accompaniment audio is playing. The terminal can then switch to the preview interface of the application. As shown in FIG. 4, the preview interface may display options for one or more video special effect combinations pre-stored locally. A video special effect combination may use different filters and/or foreground videos in different time segments of the video image, and may also use different filters and/or foreground videos in the same time segment. The processing information corresponding to each video special effect combination may be recorded in the application, and may include the start time point and the end time point of each filter and each foreground video in the combination. In addition, a composite button for synthesizing the video image and the audio may be displayed in the preview interface.

In the implementation, the user can select a preferred video special effect combination from the options displayed on the preview interface. When the user clicks one of the options, the terminal receives a selection instruction for that video special effect combination (that is, the first video special effect combination). The terminal can then obtain the stored processing information of the combination and, according to the start time point and the end time point of each filter and foreground video in the processing information, process the video frames covered by each filter and foreground video, outputting the processed video images to the screen for display in real time. For example, suppose the start time point of a black and white filter in the first video special effect combination is the 5th second and its end time point is the 13th second. When the terminal performs combined special effect processing on the captured video image according to the first video special effect combination, the video frames from the 5th to the 13th second of the captured video are processed with the black and white filter.
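The use of processing information to decide which frames each effect touches can be sketched as follows. The `EffectEntry` type and `apply_combination` function are hypothetical, frames are stood in for by strings, and treating the time range as half-open (start included, end excluded) is an assumption, since the text does not say whether the end time point is inclusive.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class EffectEntry:
    """Processing information for one effect in a combination (hypothetical)."""
    name: str
    start_s: float    # start time point of the effect
    end_s: float      # end time point of the effect
    apply: Callable   # maps a frame to a processed frame


def apply_combination(frames, fps, entries):
    """Apply every effect whose time range covers each frame's timestamp."""
    processed = []
    for index, frame in enumerate(frames):
        t = index / fps  # timestamp of this frame in seconds
        for entry in entries:
            # Assumed half-open range: start included, end excluded.
            if entry.start_s <= t < entry.end_s:
                frame = entry.apply(frame)
        processed.append(frame)
    return processed


# The example from the text: a black and white filter from second 5 to second 13.
combination = [EffectEntry("black_and_white", 5.0, 13.0, lambda f: f + "+bw")]
frames = [f"f{i}" for i in range(15)]  # one placeholder frame per second
processed = apply_combination(frames, fps=1, entries=combination)
```

Because every entry is checked for every frame, two entries with overlapping ranges are both applied to the frames they share, which matches the statement that different effects may be used in the same time segment.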

While the video image is being processed, the accompaniment audio and the recorded audio can be acquired, and the audio and video are played synchronously according to the time of each video frame and the time of each audio frame in the accompaniment audio and the recorded audio.
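One simple way to picture timestamp-based synchronization is to merge the timestamped video frames and audio frames into a single presentation order. The sketch below is illustrative only: the function name and the representation of each stream as a time-sorted list of (timestamp, payload) pairs are assumptions, and an actual player would also wait on a clock before presenting each item.

```python
import heapq


def playback_order(video_frames, audio_frames):
    """Merge two time-sorted (timestamp, payload) streams into one order.

    Each output item is (timestamp, kind, payload), where kind is "video"
    or "audio". Both inputs must already be sorted by timestamp.
    """
    tagged_video = ((t, "video", p) for t, p in video_frames)
    tagged_audio = ((t, "audio", p) for t, p in audio_frames)
    return list(heapq.merge(tagged_video, tagged_audio))


video = [(0.0, "v0"), (0.04, "v1"), (0.08, "v2")]
audio = [(0.0, "a0"), (0.023, "a1"), (0.046, "a2"), (0.069, "a3")]
order = playback_order(video, audio)
```

The sample timestamps are invented. With them, the audio frame at 0.023 s is scheduled before the video frame at 0.04 s, and so on, so audio and video stay aligned on one timeline.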

If the preview interface displays multiple video special effect combination options, then after previewing one combination the user can select other combinations to preview, and finally choose a favorite one.

In the implementation, after selecting and previewing a video special effect combination, the user can click the composite button in the preview interface. The terminal then receives the click command for the composite button, acquires the video image processed with the video special effects, and encodes it with ffmpeg (a set of open source computer programs that can be used to record and convert digital audio and video and to turn them into streams). In addition, the terminal can acquire the accompaniment audio and the recorded audio and encode them to obtain the encoded audio. The encoded video image and the encoded audio are then synthesized with ffmpeg to obtain a composite video.
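The final synthesis step can be pictured as an ffmpeg command-line invocation, sketched below. The embodiment does not say whether ffmpeg is used as a command-line tool or as a library, so the command-line form, the function name, and the file names are all assumptions. The sketch further assumes that the processed video has already been encoded to a file, so the video stream is copied while the two audio inputs are mixed and encoded as AAC.

```python
def build_ffmpeg_command(video_path, accompaniment_path, vocal_path, output_path):
    """Build an ffmpeg argument list that muxes one video with two mixed audios.

    Input 0 is the processed video, input 1 the accompaniment audio, and
    input 2 the recorded audio. The amix filter mixes the two audio inputs.
    """
    mix = "[1:a][2:a]amix=inputs=2:duration=longest[mixed]"
    return [
        "ffmpeg", "-y",
        "-i", video_path,
        "-i", accompaniment_path,
        "-i", vocal_path,
        "-filter_complex", mix,
        "-map", "0:v",       # take the video stream from the first input
        "-map", "[mixed]",   # take the mixed audio from the filter graph
        "-c:v", "copy",      # assumption: video is already encoded
        "-c:a", "aac",
        output_path,
    ]


# Hypothetical file names, for illustration only.
cmd = build_ffmpeg_command("processed.mp4", "accompaniment.m4a", "vocal.m4a", "out.mp4")
```

The resulting list could be passed to `subprocess.run(cmd, check=True)` on a system where ffmpeg is installed. Building the arguments as a list rather than a single string avoids shell quoting problems with file names.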

Optionally, the user may also apply special effects to the recorded audio when recording the song. Correspondingly, the following processing may be performed: displaying an option of at least one audio special effect; when a selection instruction for a first audio special effect in the at least one audio special effect is received, performing special effect processing on the recorded audio according to the first audio special effect, and playing the processed audio. Correspondingly, the processing of step 104 may be as follows: when receiving a click command for the composite button, synthesizing the processed audio, the accompaniment audio, and the processed video image to obtain a composite video.

In the implementation, the above preview interface can also display one or more audio special effect options, such as a doll voice or a phonograph effect. The user can select a preferred audio special effect (that is, the first audio special effect) from the displayed options. The terminal then receives a selection instruction for that audio special effect, performs special effect processing on the recorded audio accordingly, and plays the processed audio on the preview interface for the user to preview. After selecting and previewing the audio special effect, the user can click the composite button displayed in the preview interface to trigger the terminal to obtain the encoded video image. In addition, the terminal can acquire the accompaniment audio and the processed audio and encode them to obtain the encoded audio. The encoded video image and the encoded audio are then synthesized with ffmpeg to obtain a composite video.

Optionally, when the user records the song, the recorded audio may also be edited. Correspondingly, the following processing may be performed: displaying an option of at least one audio editing process; when a selection instruction for a first audio editing process in the at least one audio editing process is received, performing the first audio editing process on the recorded audio, and playing the processed audio. Correspondingly, the processing of step 104 may be as follows: when receiving a click command for the composite button, synthesizing the processed audio, the accompaniment audio, and the processed video image to obtain a composite video.

In the implementation, the above preview interface may also display one or more options for audio editing processing, such as turning the volume up, turning the volume down, moving the vocals forward, moving the vocals backward, and denoising. The user may select one of the displayed audio editing processing options (that is, the first audio editing process). Specifically, the preview interface may display options such as volume adjustment, vocal movement, and denoising. When the user needs to adjust the volume of the recorded audio, the user can click the volume adjustment option, which triggers the terminal to display a volume adjustment axis (on which the volume gradually increases from left to right, or from bottom to top) with a volume adjustment line displayed on it. The user can adjust the volume by moving the volume adjustment line. When the user clicks the vocal movement option, the terminal receives the selection instruction for that option and can display a time axis with a vocal moving line at its middle position. The user can adjust the amount of movement by moving the vocal moving line: when the line moves forward, the terminal moves the recorded audio forward, and when the line moves backward, the terminal moves the recorded audio backward. This avoids having to re-record the audio when the recorded audio is misaligned. When the user clicks the denoising option, the terminal receives a selection instruction for the denoising option and can then denoise the recorded audio. After selecting the first audio editing processing option, the user can click the preview button in the preview interface, which triggers the terminal to perform the first audio editing process on the recorded audio and play the processed audio. After selecting and previewing the audio editing process, the user can click the composite button displayed in the preview interface to trigger the terminal to obtain the encoded video image. In addition, the terminal can acquire the accompaniment audio and the processed audio and encode them to obtain the encoded audio. The encoded video image and the encoded audio are then synthesized with ffmpeg to obtain a composite video.
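Two of the editing operations above, volume adjustment and vocal movement, can be sketched on raw samples as follows. The function names, the use of 16-bit PCM samples held in a Python list, and the sign convention for the offset are assumptions; the embodiment does not specify how the terminal implements these operations.

```python
def adjust_volume(samples, gain):
    """Scale 16-bit PCM samples by gain, clipping to the valid range."""
    return [max(-32768, min(32767, round(s * gain))) for s in samples]


def shift_vocals(samples, offset):
    """Shift the recorded vocals by offset samples, keeping the length.

    Assumed convention: a positive offset delays the vocals (silence is
    padded at the start); a negative offset advances them (leading samples
    are dropped and silence is padded at the end).
    """
    n = len(samples)
    if offset >= 0:
        return ([0] * offset + samples)[:n]
    return (samples[-offset:] + [0] * (-offset))[:n]
```

Keeping the output the same length as the input means the shifted vocals still line up sample for sample with the accompaniment audio when the two are mixed.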

In the embodiment of the present invention, the accompaniment audio of the song to be recorded is played, the lyrics subtitle corresponding to the accompaniment audio is displayed, and the video image is captured and the audio is recorded. After the capturing of the video image and the recording of the audio have ended, an option of at least one pre-stored video effect combination is displayed together with a composite button, wherein the video effect combination includes at least one filter and/or at least one foreground video. When a selection instruction for the first video effect combination in the at least one video effect combination is received, combined special effect processing is performed on the captured video image according to the first video special effect combination, and the processed video image is displayed while the accompaniment audio and the recorded audio are played. When the click command for the composite button is received, the accompaniment audio, the recorded audio, and the processed video image are combined to obtain a composite video. The video image in the composite video thus obtained is not a plain presentation of the content captured by the camera but has been processed with video special effects, which enhances the flexibility of recording a song video.

Based on the same technical concept, an embodiment of the present invention further provides a device for generating a video. As shown in FIG. 5, the device includes:

The playing module 510 is configured to play the accompaniment audio of the song to be recorded, display the lyrics subtitle corresponding to the accompaniment audio, and perform video image capturing and audio recording;

The display module 520 is configured to display an option of at least one pre-stored video effect combination after the shooting of the video image and the recording of the audio have ended, and to display a composite button, wherein the video effect combination includes at least one filter and/or at least one foreground video;

The preview module 530 is configured to: when receiving the selection instruction for the first video special effect combination in the at least one video special effect combination, perform combined special effect processing on the captured video image according to the first video special effect combination, and display the processed video image while playing the accompaniment audio and the recorded audio;

The synthesizing module 540 is configured to synthesize the accompaniment audio, the recorded audio, and the processed video image to obtain a synthesized video when receiving a click instruction to the composite button.

Optionally, the device further includes a determining module, configured to determine a song segment selected by the user in the target song before the accompaniment audio of the song to be recorded is played;

The playing module 510 is configured to: play the accompaniment audio of the song segment.

Optionally, the determining module is configured to:

Display a list of lyrics of the target song;

Obtain a starting point and an ending point of a song segment set by the user in the lyrics list;

And determine, according to the starting point and the ending point, the song segment selected by the user in the target song.

Optionally, the determining module is configured to:

Display the playback timeline of the target song;

Obtain a starting point and an ending point of a song segment set by the user in the playing time axis;

Optionally, the display module 520 is further configured to display an option of at least one audio effect;

The preview module 530 is further configured to, when receiving a selection instruction for the first audio special effect in the at least one audio special effect, perform special effect processing on the recorded audio according to the first audio special effect and play the processed audio;

The synthesizing module 540 is configured to synthesize the processed audio, the accompaniment audio, and the processed video image to obtain a synthesized video when receiving a click instruction to the composite button.

Optionally, the display module 520 is further configured to display an option of at least one audio editing process;

The preview module 530 is further configured to: when the selection instruction for the first audio editing process in the at least one audio editing process is received, perform the first audio editing process on the recorded audio and play the processed audio;

It should be noted that the device for generating a video provided by the foregoing embodiment is illustrated only with the above division of functional modules as an example. In an actual application, the functions may be assigned to different functional modules as needed, that is, the internal structure of the device may be divided into different functional modules to complete all or part of the functions described above. In addition, the device for generating a video provided by the foregoing embodiment is based on the same concept as the method for generating a video. The specific implementation process is described in detail in the method embodiment, and details are not described herein again.

Please refer to FIG. 6, which is a schematic structural diagram of a terminal according to an embodiment of the present invention. The terminal may be used to implement the method for generating a video provided in the foregoing embodiments. Specifically:

The terminal 900 may include an RF (Radio Frequency) circuit 110, a memory 120 including one or more computer-readable storage media, an input unit 130, a display unit 140, a sensor 150, an audio circuit 160, a WiFi (Wireless Fidelity) module 170, a processor 180 including one or more processing cores, a power supply 190, and other components. It will be understood by those skilled in the art that the terminal structure shown in FIG. 6 does not constitute a limitation on the terminal, which may include more or fewer components than those illustrated, combine certain components, or use a different arrangement of components. Specifically:

The RF circuit 110 can be used for receiving and transmitting signals while information is being sent or received or during a call. In particular, after receiving downlink information from a base station, it hands the information to one or more processors 180 for processing; in addition, it sends uplink data to the base station. Generally, the RF circuit 110 includes, but is not limited to, an antenna, at least one amplifier, a tuner, one or more oscillators, a Subscriber Identity Module (SIM) card, a transceiver, a coupler, an LNA (Low Noise Amplifier), a duplexer, and the like. In addition, the RF circuit 110 can also communicate with networks and other devices via wireless communication. The wireless communication may use any communication standard or protocol, including but not limited to GSM (Global System for Mobile Communications), GPRS (General Packet Radio Service), CDMA (Code Division Multiple Access), WCDMA (Wideband Code Division Multiple Access), LTE (Long Term Evolution), e-mail, SMS (Short Message Service), and the like.

The memory 120 can be used to store software programs and modules, and the processor 180 executes various functional applications and data processing by running the software programs and modules stored in the memory 120. The memory 120 may mainly include a program storage area and a data storage area, wherein the program storage area may store an operating system, an application required for at least one function (such as a sound playing function, an image playing function, etc.), and the like; the data storage area may store data created through the use of the terminal 900 (such as audio data, a phone book, etc.), and the like. Moreover, the memory 120 can include high-speed random access memory, and can also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other volatile solid state storage device. Accordingly, the memory 120 may also include a memory controller to provide the processor 180 and the input unit 130 with access to the memory 120.

The input unit 130 can be configured to receive input numeric or character information and to generate keyboard, mouse, joystick, optical, or trackball signal inputs related to user settings and function control. In particular, the input unit 130 can include a touch-sensitive surface 131 as well as other input devices 132. The touch-sensitive surface 131, also referred to as a touch display or trackpad, can collect touch operations performed by the user on or near it (such as operations performed by the user on or near the touch-sensitive surface 131 using a finger, a stylus, or any other suitable object or accessory) and drive the corresponding connected device according to a preset program. Optionally, the touch-sensitive surface 131 can include two parts: a touch detection device and a touch controller. The touch detection device detects the touch position of the user, detects the signal produced by the touch operation, and transmits the signal to the touch controller; the touch controller receives the touch information from the touch detection device, converts it into contact coordinates, sends the coordinates to the processor 180, and can receive and execute commands sent by the processor 180. In addition, the touch-sensitive surface 131 can be implemented in various types such as resistive, capacitive, infrared, and surface acoustic wave. In addition to the touch-sensitive surface 131, the input unit 130 can also include other input devices 132. Specifically, the other input devices 132 may include, but are not limited to, one or more of a physical keyboard, function keys (such as volume control buttons, switch buttons, etc.), a trackball, a mouse, a joystick, and the like.

The display unit 140 can be used to display information entered by the user or information provided to the user, as well as the various graphical user interfaces of the terminal 900, which can be composed of graphics, text, icons, video, and any combination thereof. The display unit 140 may include a display panel 141. Optionally, the display panel 141 may be configured in the form of an LCD (Liquid Crystal Display), an OLED (Organic Light-Emitting Diode), or the like. Further, the touch-sensitive surface 131 may cover the display panel 141. When the touch-sensitive surface 131 detects a touch operation on or near it, the operation is passed to the processor 180 to determine the type of the touch event, and the processor 180 then provides a corresponding visual output on the display panel 141 according to the type of the touch event. Although in FIG. 6 the touch-sensitive surface 131 and the display panel 141 are implemented as two separate components to provide input and output functions, in some embodiments the touch-sensitive surface 131 can be integrated with the display panel 141 to provide the input and output functions.

The terminal 900 can also include at least one type of sensor 150, such as a light sensor, a motion sensor, and other sensors. Specifically, the light sensor may include an ambient light sensor and a proximity sensor, wherein the ambient light sensor may adjust the brightness of the display panel 141 according to the brightness of the ambient light, and the proximity sensor may turn off the display panel 141 and/or the backlight when the terminal 900 is moved to the ear. As one kind of motion sensor, a gravity acceleration sensor can detect the magnitude of acceleration in all directions (usually three axes) and, when stationary, can detect the magnitude and direction of gravity. It can be used for applications that identify the attitude of the mobile phone (such as switching between landscape and portrait orientation, related games, and magnetometer attitude calibration), for vibration-recognition functions (such as a pedometer or tap detection), and so on. The terminal 900 can also be configured with a gyroscope, a barometer, a hygrometer, a thermometer, an infrared sensor, and other sensors, which are not described in detail here.

The audio circuit 160, the speaker 161, and the microphone 162 can provide an audio interface between the user and the terminal 900. On the one hand, the audio circuit 160 can transmit the electrical signal converted from received audio data to the speaker 161, which converts it into a sound signal for output; on the other hand, the microphone 162 converts a collected sound signal into an electrical signal, which the audio circuit 160 receives and converts into audio data. The audio data is then output to the processor 180 for processing and sent via the RF circuit 110 to, for example, another terminal, or output to the memory 120 for further processing. The audio circuit 160 may also include an earphone jack to provide communication between a peripheral earphone and the terminal 900.

WiFi is a short-range wireless transmission technology. Through the WiFi module 170, the terminal 900 can help users send and receive e-mail, browse web pages, and access streaming media; the module provides wireless broadband Internet access for users. Although FIG. 6 shows the WiFi module 170, it can be understood that the module is not an essential component of the terminal 900 and can be omitted as needed without changing the essence of the invention.

The processor 180 is the control center of the terminal 900. It connects the various parts of the entire mobile phone using various interfaces and lines, and performs the various functions of the terminal 900 and processes data by running or executing the software programs and/or modules stored in the memory 120 and calling the data stored in the memory 120, thereby monitoring the mobile phone as a whole. Optionally, the processor 180 may include one or more processing cores; preferably, the processor 180 may integrate an application processor and a modem processor, where the application processor mainly handles the operating system, the user interface, applications, and the like, and the modem processor mainly handles wireless communication. It can be understood that the modem processor may also not be integrated into the processor 180.

The terminal 900 also includes a power supply 190 (such as a battery) that supplies power to the various components. Preferably, the power supply can be logically connected to the processor 180 through a power management system, so that functions such as charging, discharging, and power consumption management are handled through the power management system. The power supply 190 may also include any one or more of a DC or AC power source, a recharging system, a power failure detection circuit, a power converter or inverter, a power status indicator, and the like.

Although not shown, the terminal 900 may further include a camera, a Bluetooth module, and the like, which are not described in detail here. Specifically, in this embodiment, the display unit of the terminal 900 is a touch screen display, and the terminal 900 further includes a memory and one or more programs, wherein the one or more programs are stored in the memory, are configured to be executed by one or more processors, and contain instructions for performing the method of generating a video described in the embodiments above.

In an exemplary embodiment, there is also provided a non-transitory computer-readable storage medium comprising instructions, such as a memory comprising instructions, which are executable by a processor of a mobile terminal to perform the method of generating a video described above. For example, the non-transitory computer-readable storage medium may be a ROM (Read-Only Memory), a RAM (Random-Access Memory), a CD-ROM (Compact Disc Read-Only Memory), a magnetic tape, a floppy disk, an optical data storage device, and the like.

A person skilled in the art may understand that all or part of the steps of the above embodiments may be implemented by hardware, or by a program instructing the related hardware, and the program may be stored in a computer-readable storage medium. The storage medium mentioned may be a read-only memory, a magnetic disk, an optical disk, or the like.

The above are only preferred embodiments of the present invention and are not intended to limit the present invention. Any modifications, equivalents, improvements, etc., made within the spirit and scope of the present invention shall be included in the scope of protection of the present invention.

Claims

1. A method of generating a video, the method comprising:

Playing the accompaniment audio of a song to be recorded, displaying the lyrics subtitle corresponding to the accompaniment audio, and performing video image shooting and audio recording;

After the capturing of the video image and the recording of the audio end, displaying options of at least one pre-stored video special effect combination, and displaying a composite button, wherein the video special effect combination includes at least one filter and/or at least one foreground video;

When a selection instruction for a first video special effect combination in the at least one video special effect combination is received, performing combined special effect processing on the captured video image according to the first video special effect combination, displaying the processed video image, and simultaneously playing the accompaniment audio and the recorded audio; and

When a click instruction for the composite button is received, synthesizing the accompaniment audio, the recorded audio, and the processed video image to obtain a synthesized video.
2. The method according to claim 1, wherein before the playing of the accompaniment audio of the song to be recorded, the method further comprises: determining a song segment selected by a user in a target song;

The playing of the accompaniment audio of the song to be recorded comprises: playing the accompaniment audio of the song segment.
3. The method according to claim 2, wherein the determining of a song segment selected by the user in the target song comprises:

Displaying a list of lyrics of the target song;

Obtaining a starting point and an ending point of a song segment set by the user in the list of lyrics; and

Determining, according to the starting point and the ending point, the song segment selected by the user in the target song.
4. The method according to claim 2, wherein the determining of a song segment selected by the user in the target song comprises:

Displaying the playback timeline of the target song;

Obtaining a starting point and an ending point of a song segment set by the user on the playback timeline; and

Determining, according to the starting point and the ending point, the song segment selected by the user in the target song.
5. The method according to claim 1, further comprising:

Displaying options of at least one audio special effect; and

When a selection instruction for a first audio special effect in the at least one audio special effect is received, performing special effect processing on the recorded audio according to the first audio special effect, and playing the processed audio;

Wherein the synthesizing, when the click instruction for the composite button is received, of the accompaniment audio, the recorded audio, and the processed video image to obtain a synthesized video comprises:

When the click instruction for the composite button is received, synthesizing the processed audio, the accompaniment audio, and the processed video image to obtain the synthesized video.
6. The method according to claim 1, further comprising:

Displaying options of at least one audio editing process; and

When a selection instruction for a first audio editing process in the at least one audio editing process is received, performing the first audio editing process on the recorded audio, and playing the processed audio;

Wherein the synthesizing, when the click instruction for the composite button is received, of the accompaniment audio, the recorded audio, and the processed video image to obtain a synthesized video comprises:

When the click instruction for the composite button is received, synthesizing the processed audio, the accompaniment audio, and the processed video image to obtain the synthesized video.
7. A device for generating a video, wherein the device comprises:

a playing module, configured to play the accompaniment audio of a song to be recorded, display the lyrics subtitle corresponding to the accompaniment audio, and perform video image shooting and audio recording;

a display module, configured to display options of at least one pre-stored video special effect combination after the capturing of the video image and the recording of the audio end, and to display a composite button, wherein the video special effect combination includes at least one filter and/or at least one foreground video;

a preview module, configured to: when a selection instruction for a first video special effect combination in the at least one video special effect combination is received, perform combined special effect processing on the captured video image according to the first video special effect combination, display the processed video image, and simultaneously play the accompaniment audio and the recorded audio; and

a synthesizing module, configured to synthesize the accompaniment audio, the recorded audio, and the processed video image to obtain a synthesized video when a click instruction for the composite button is received.
8. The device according to claim 7, wherein the device further comprises: a determining module, configured to determine, before the accompaniment audio of the song to be recorded is played, a song segment selected by a user in a target song;

The playing module is configured to play the accompaniment audio of the song segment.
9. The device according to claim 8, wherein the determining module is configured to:

Display a list of lyrics of the target song;

Obtain a starting point and an ending point of a song segment set by the user in the list of lyrics; and

Determine, according to the starting point and the ending point, the song segment selected by the user in the target song.
10. The device according to claim 8, wherein the determining module is configured to:

Display the playback timeline of the target song;

Obtain a starting point and an ending point of a song segment set by the user on the playback timeline; and

Determine, according to the starting point and the ending point, the song segment selected by the user in the target song.
11. The device according to claim 7, wherein the display module is further configured to display options of at least one audio special effect;

The preview module is further configured to: when a selection instruction for a first audio special effect in the at least one audio special effect is received, perform special effect processing on the recorded audio according to the first audio special effect, and play the processed audio;

The synthesizing module is configured to synthesize the processed audio, the accompaniment audio, and the processed video image to obtain a synthesized video when the click instruction for the composite button is received.
12. The device according to claim 7, wherein the display module is further configured to display options of at least one audio editing process;

The preview module is further configured to: when a selection instruction for a first audio editing process in the at least one audio editing process is received, perform the first audio editing process on the recorded audio, and play the processed audio;

The synthesizing module is configured to synthesize the processed audio, the accompaniment audio, and the processed video image to obtain a synthesized video when the click instruction for the composite button is received.
13. A terminal, wherein the terminal comprises:

One or more processors; and

A memory;

The memory stores one or more programs, the one or more programs being configured to be executed by the one or more processors, the one or more programs including instructions for:

Playing the accompaniment audio of a song to be recorded, displaying the lyrics subtitle corresponding to the accompaniment audio, and performing video image shooting and audio recording;

After the capturing of the video image and the recording of the audio end, displaying options of at least one pre-stored video special effect combination, and displaying a composite button, wherein the video special effect combination includes at least one filter and/or at least one foreground video;

When a selection instruction for a first video special effect combination in the at least one video special effect combination is received, performing combined special effect processing on the captured video image according to the first video special effect combination, displaying the processed video image, and simultaneously playing the accompaniment audio and the recorded audio; and

When a click instruction for the composite button is received, synthesizing the accompaniment audio, the recorded audio, and the processed video image to obtain a synthesized video.
14. The terminal according to claim 13, wherein the one or more programs further comprise instructions for:

Determining a song segment selected by a user in a target song; and

Playing the accompaniment audio of the song segment.
15. The terminal according to claim 14, wherein the one or more programs further comprise instructions for:

Displaying a list of lyrics of the target song;

Obtaining a starting point and an ending point of a song segment set by the user in the list of lyrics; and

Determining, according to the starting point and the ending point, the song segment selected by the user in the target song.
16. The terminal according to claim 14, wherein the one or more programs further comprise instructions for:

Displaying the playback timeline of the target song;

Obtaining a starting point and an ending point of a song segment set by the user on the playback timeline; and

Determining, according to the starting point and the ending point, the song segment selected by the user in the target song.
17. The terminal according to claim 13, wherein the one or more programs further comprise instructions for:

Displaying options of at least one audio special effect; and

When a selection instruction for a first audio special effect in the at least one audio special effect is received, performing special effect processing on the recorded audio according to the first audio special effect, and playing the processed audio;

Wherein the synthesizing, when the click instruction for the composite button is received, of the accompaniment audio, the recorded audio, and the processed video image to obtain a synthesized video comprises:

When the click instruction for the composite button is received, synthesizing the processed audio, the accompaniment audio, and the processed video image to obtain the synthesized video.
18. The terminal according to claim 13, wherein the one or more programs further comprise instructions for:

Displaying options of at least one audio editing process; and

When a selection instruction for a first audio editing process in the at least one audio editing process is received, performing the first audio editing process on the recorded audio, and playing the processed audio;

Wherein the synthesizing, when the click instruction for the composite button is received, of the accompaniment audio, the recorded audio, and the processed video image to obtain a synthesized video comprises:

When the click instruction for the composite button is received, synthesizing the processed audio, the accompaniment audio, and the processed video image to obtain the synthesized video.