WO2019086037A1 - Video material processing method, video synthesis method, terminal device and storage medium


Info

Publication number
WO2019086037A1
Authority
WO
WIPO (PCT)
Prior art keywords
video
effect
material set
segment
content
Application number
PCT/CN2018/114100
Other languages
French (fr)
Chinese (zh)
Inventor
张涛 (Zhang Tao)
董霙 (Dong Ying)
Original Assignee
腾讯科技(深圳)有限公司 (Tencent Technology (Shenzhen) Company Limited)
Application filed by 腾讯科技(深圳)有限公司 (Tencent Technology (Shenzhen) Company Limited)
Publication of WO2019086037A1 publication Critical patent/WO2019086037A1/en

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/44Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/47End-user applications
    • H04N21/472End-user interface for requesting content, additional data or services; End-user interface for interacting with content, e.g. for content reservation or setting reminders, for requesting event notification, for manipulating displayed content
    • H04N21/47205End-user interface for requesting content, additional data or services; End-user interface for interacting with content, e.g. for content reservation or setting reminders, for requesting event notification, for manipulating displayed content for manipulating displayed content, e.g. interacting with MPEG-4 objects, editing locally

Definitions

  • The present application relates to the field of video synthesis, and in particular to a video material processing method, a video synthesis method, a terminal device, and a storage medium.
  • Video production recombines and re-encodes pictures, videos, audio, and other materials to generate a video.
  • Video production typically requires installing video production software on a personal computing device.
  • Such software provides feature-rich video editing functions, but is complicated to operate.
  • The present application therefore proposes a new video synthesis scheme to address the problem of how to make video synthesis more convenient to operate.
  • A video material processing method, executed by a terminal device, includes: acquiring a material set of a video to be synthesized and determining attributes of the material set, where the material set includes multiple material elements, each material element includes at least one type of media content among pictures, text, audio, and video, and the attributes include the play order and play duration of each material element in the material set; determining an effect parameter corresponding to the material set, where the effect parameter corresponds to a video effect mode; and transmitting the material set and the effect parameter to a video composition server, so that the video composition server synthesizes the multiple material elements in the material set into a video in the corresponding video effect mode according to the effect parameter and the attributes of the material set.
  • A video synthesis method, executed by a server, includes: acquiring, from a material processing application, a material set of a video to be synthesized and an effect parameter for that material set, where the material set includes multiple material elements, each material element includes at least one type of media content among pictures, text, audio, and video, the attributes of the material set include the play order and play duration of each material element, and the effect parameter corresponds to a video effect mode; and synthesizing the multiple material elements in the material set into a video in that video effect mode according to the effect parameter and the attributes of the material set.
  • A terminal device includes a processor and a memory; the memory stores computer readable instructions that enable the processor to perform the video material processing method according to the present application.
  • A server includes a processor and a memory; the memory stores computer readable instructions that enable the processor to perform the video synthesis method according to the present application.
  • A non-volatile storage medium stores a data processing program that, when executed by a computing device, causes the computing device to perform the video material processing method or the video synthesis method.
  • With the processing scheme of the present application, content selection can be performed in a user interface (for example, the user interfaces of FIGS. 3A to 3G), so that the material set of the video to be synthesized can be obtained conveniently.
  • The processing scheme of the present application can also clip a video automatically to generate video segments and corresponding description information, so that the user can quickly determine the content of each video segment and select segments by viewing the description information.
  • The processing scheme of the present application spares the user from performing complicated video-effect operations on the local terminal device; instead, it intuitively presents preview images (e.g., effect animations) of multiple video effect modes, making it easy for the user to quickly choose the effect mode of the video to be synthesized. On this basis, the processing scheme synthesizes the video through the video composition server, greatly improving the user experience.
  • FIG. 1 shows a schematic diagram of an application scenario 100 in accordance with some embodiments of the present application
  • FIG. 2A shows a flowchart of a method 200 of processing video material in accordance with some embodiments of the present application
  • FIG. 2B shows a flowchart of obtaining a collection of materials in accordance with some embodiments of the present application
  • FIG. 3A illustrates a schematic diagram of a user interface for acquiring picture content, according to some embodiments of the present application
  • FIG. 3B illustrates an interface diagram of a display picture of some embodiments
  • FIG. 3C shows a schematic diagram of acquiring audio information in accordance with some embodiments of the present application.
  • FIG. 3D illustrates a user interface for generating video segments in accordance with some embodiments of the present application
  • FIG. 3E shows the editing interface of a video clip
  • FIG. 3F illustrates a user interface for adjusting the play order, in accordance with some embodiments of the present application
  • FIG. 3G illustrates a user interface for determining an effect parameter, in accordance with some embodiments of the present application
  • FIG. 4 illustrates a flow diagram of a video composition method 400 in accordance with some embodiments of the present application
  • FIG. 5 illustrates a video rendering process in accordance with some embodiments of the present application
  • FIG. 6 illustrates a flow diagram of a video composition method 600 in accordance with some embodiments of the present application
  • FIG. 7 shows a schematic diagram of a processing device 700 for video material in accordance with some embodiments of the present application.
  • FIG. 8 shows a schematic diagram of a video synthesizing apparatus 800 in accordance with some embodiments of the present application.
  • FIG. 9 shows a schematic diagram of a video synthesizing device 900 in accordance with some embodiments of the present application.
  • FIG. 10 shows a block diagram of the structure of a computing device.
  • FIG. 1 shows a schematic diagram of an application scenario 100 in accordance with some embodiments of the present application.
  • the application scenario 100 includes a terminal device 110 and a server 120.
  • the terminal device 110 may be, for example, various devices such as a desktop computer, a notebook computer, a tablet computer, a mobile phone, or a handheld game console, but is not limited thereto.
  • Server 120 may include one or more independent hardware servers.
  • the server 120 may also be a device resource such as a virtual server or a distributed cluster, but is not limited thereto.
  • the terminal device 110 can include various applications, such as a material processing application 111.
  • the material processing application 111 can acquire the video material of the video to be composited and transmit the video material to the server 120.
  • server 120 can synthesize the corresponding video based on the received video material.
  • the server 120 can also transmit the synthesized video to the terminal device 110.
  • The material processing application 111 may be, for example, a client or browser dedicated to managing material, which is not limited in this application.
  • server 120 can include a video composition application (not shown in FIG. 1).
  • the video composition application may be, for example, software that synthesizes a video using a collection of materials, or may be a component of various multimedia applications.
  • The multimedia application is, for example, software that provides video content to the terminal device 110. The video material processing method is described below with reference to FIG. 2A.
  • the processing method 200 of the video material may be performed by, for example, the material processing application 111, but is not limited thereto.
  • the material processing application 111 may be, for example, a browser for processing a material or a client for processing a material.
  • the material processing application 111 can also be a component of an application such as an instant messaging application (QQ, WeChat, etc.), a social networking application, a video application (such as Tencent video, etc.) or a news client.
  • the processing method 200 of the video material includes step S201, acquiring a material set of the video to be synthesized, and determining attributes of the material set.
  • the material collection may include a plurality of material elements.
  • Each material element includes at least one of media content in images, text, audio, and video.
  • The attributes of the material set include the play order and play duration of each material element in the set; one possible representation is sketched below.
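  • The patent does not prescribe a concrete data layout for the material set; purely as an illustration, the elements and attributes described above might be modeled as follows (all class and field names are hypothetical):

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class MaterialElement:
    """One element of the material set; any subset of the media fields may be present."""
    picture: Optional[str] = None   # path or URL of a picture
    text: Optional[str] = None      # caption or narration text
    audio: Optional[str] = None     # path or URL of an audio clip
    video: Optional[str] = None     # path or URL of a video clip
    play_order: int = 0             # position of this element in the synthesized video
    play_duration: float = 0.0      # seconds this element is played

@dataclass
class MaterialSet:
    elements: List[MaterialElement] = field(default_factory=list)

    def attributes(self):
        """The per-element play order and play duration, as determined in step S201."""
        return [(e.play_order, e.play_duration) for e in self.elements]
```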
  • a user interface for acquiring material elements is provided.
  • the user interface can include at least one control corresponding to at least one media type, respectively.
  • a control is a view object in the user interface for interacting with a user, such as an input box, a drop-down selection box, a button, and the like.
  • Step S201 may then, in response to an operation on any control in the user interface, obtain the media content corresponding to that control's media type and use it as one item of media content of a material element in the material set.
  • In some embodiments, when the user selects a locally stored or network picture through a picture control, step S201 may, in response to the operation on the picture control, use that picture as the picture content of a material element. Note that a material element containing picture content may also include text or audio associated with the picture.
  • In some embodiments, when the user enters text for the picture through a text input control, step S201 acquires the text information associated with the picture content in response to the operation and uses it as the text content of the corresponding material element.
  • step S201 may, in response to an operation of the audio control, acquire audio information associated with the picture content as the audio content of the corresponding material element.
  • the audio content is, for example, narration or background music or the like.
  • Step S201 may use the play duration of the picture as the play duration of the corresponding material element. To illustrate the execution of step S201 more concretely, an example is described below with reference to FIGS. 3A to 3C.
  • FIG. 3A illustrates a schematic diagram of a user interface for acquiring picture content, in accordance with some embodiments of the present application.
  • FIG. 3B shows an interface diagram of a display picture of some embodiments.
  • In some embodiments, when the user operates the control 301, step S201 can acquire a picture and display it in the preview window 302.
  • Step S201 may determine the play duration of the picture in response to the operation of the play duration control 303.
  • Step S201 may acquire text information related to the picture in the preview window 302 in response to an operation on the text input control 304. In other words, the text information supplements the picture.
  • FIG. 3C illustrates a schematic diagram of acquiring audio information in accordance with some embodiments of the present application.
  • Step S201 can obtain locally stored audio (e.g., for background music) in response to an operation on control 305.
  • step S201 can record a piece of audio content in response to operation of control 306.
  • the audio content is, for example, a narration recorded for the picture in the preview window 302.
  • In some embodiments, step S201 may acquire a video as the video content of a material element.
  • For example, in response to an operation on a video control in the user interface, step S201 acquires a video clip as the video content of a material element.
  • the video may be, for example, a video file stored locally or a video content stored in the cloud.
  • step S201 may also add text content, audio content, and the like thereto.
  • step S201 may use the playing duration of the video content as the playing duration of the material element.
  • step S201 may include steps S2011-S2014.
  • In step S2011, a piece of video is acquired.
  • In step S2012, at least one video segment is extracted from the video according to a predetermined video editing algorithm, and description information is generated for each video segment.
  • In some embodiments, step S2012 first determines at least one key image frame of the video. For each key image frame, step S2012 may extract from the video a video segment containing that key frame.
  • The video segment can include an audio clip associated with its sequence of image frames. Step S2012 can then perform speech recognition on the audio clip to obtain the corresponding text, and generate the description information of the video segment from that text. It should be understood that step S2012 may adopt any algorithm capable of automatically editing video, which is not limited in this application.
  • In step S2013, a user interface displaying the description information of each video segment is provided, so that the user can select segments according to the description information.
  • In step S2014, in response to a selection operation on at least one video segment, each selected video segment is used as the video content of one material element in the material set. In other words, step S2014 can turn each selected video segment into one material element.
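  • As a rough illustration of steps S2011 to S2014, the clipping flow might look like the sketch below; the key-frame detector, segment extractor, and speech recognizer are passed in as hypothetical callables, since the patent names these operations but no concrete algorithm or API:

```python
def clip_video(video_path, detect_key_frames, extract_segment, transcribe, window=5.0):
    """Sketch of steps S2011-S2014: cut a segment around each key image frame and
    describe it by recognizing the speech in its audio clip (assumed helpers)."""
    segments = []
    for t in detect_key_frames(video_path):                   # S2012: key image frames
        start = max(0.0, t - window / 2)
        segment = extract_segment(video_path, start, window)  # segment containing the key frame
        description = transcribe(segment["audio"])            # text recognized from the audio clip
        segments.append({"segment": segment, "description": description})
    return segments  # S2013/S2014: presented to the user for segment selection
```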
  • Alternatively, an embodiment of the present application may send a video clipping request to the cloud, so that the clipping is performed by a cloud device (for example, the server 120). On this basis, the embodiment can acquire the clipped video segments from the cloud device.
  • FIG. 3D illustrates a user interface for generating video segments in accordance with some embodiments of the present application.
  • window 307 is a preview window of the video to be clipped.
  • embodiments of the present application may generate multiple video segments, such as segment 309.
  • FIG. 3E shows the editing interface of a video clip.
  • the window 310 is a preview window of the segment 309.
  • Area 311 shows the description information of the segment 309.
  • the user can input text content corresponding to the video clip through the text input control 312.
  • the user can also obtain audio content for the video clip through control 313 or control 314.
  • icon 315 represents an acquired audio file.
  • the user can select at least one video clip.
  • the present embodiment can treat each selected video clip and the corresponding text content and audio content as one material element.
  • step S201 can acquire a plurality of material elements.
  • step S201 may use the generation order of the plurality of material elements as the default playback order.
  • step S201 may also adjust the play order of the plurality of material elements in response to the user operation.
  • Figure 3F illustrates a user interface that adjusts the playback order in accordance with some embodiments of the present application.
  • FIG. 3F presents a thumbnail corresponding to each material element, for example thumbnails 316 and 317, arranged in order within the display area.
  • Step S201 may adjust an arrangement order of each element in the material set in response to a movement operation of the thumbnail in the user interface, and use the adjusted arrangement order as a play order of the material set.
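  • A minimal sketch of this reordering, assuming each material element carries a play_order field as in the earlier sketch:

```python
def move_thumbnail(elements, src, dst):
    """Move the element whose thumbnail was dragged from position src to position
    dst, then reassign play_order so it matches the new arrangement."""
    item = elements.pop(src)
    elements.insert(dst, item)
    for i, element in enumerate(elements):
        element.play_order = i
    return elements
```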
  • the method 200 may perform step S202.
  • step S202 an effect parameter corresponding to the material set is determined.
  • each effect parameter corresponds to a video effect mode.
  • Video effects include, for example, transition effects and particle effects between adjacent material elements.
  • A transition effect refers to the scene-change effect between two scenes (i.e., two material elements).
  • Embodiments of the present application may employ predetermined techniques (e.g., wipe, overlay, page curl) to achieve a smooth transition effect.
  • The transition effect can also include the effect of a picture entering the frame (also known as a picture fly-in effect).
  • Particle effects are animated effects that simulate objects such as water, fire, fog, and gas in reality.
  • a video effect mode corresponds to the overall effect of a video to be synthesized.
  • one video effect mode can be a predetermined video effect or a combination of multiple predetermined video effects.
  • In some embodiments, step S202 may provide a user interface including multiple effect options, each of which corresponds to an effect parameter.
  • an effect parameter can be considered as an identifier corresponding to a video effect mode.
  • Step S202 may display the corresponding preview effect image in the user interface.
  • step S202 may use the effect parameter corresponding to the selected effect option as the effect parameter corresponding to the material set.
  • FIG. 3G illustrates a user interface for determining effect parameters in accordance with some embodiments of the present application.
  • Region 319 shows a number of effect options, such as options 320 and 321.
  • Each option corresponds to a video effect mode.
  • the effect animation can intuitively represent a video effect mode. In this way, the user can select a video effect mode by viewing the effect animation without performing complicated operations related to the video effect in the terminal device.
  • step S202 may select an effect parameter corresponding to the effect option currently being previewed in response to the operation of the control 323.
  • step S203 the material set and effect parameters are transmitted to a video composition server (eg, server 120).
  • the video composition server can synthesize multiple material elements in the material collection into videos corresponding to the determined video effect mode according to the effect parameters and the attributes of the material collection.
  • a video composition request is sent to the video composition server.
  • the video composition request may include a material collection and an effect parameter.
  • the video composition server can synthesize the material into a video in response to the video composition request.
  • In some embodiments, the video composition server may send prompt information about providing a video composition service to the material processing application 111.
  • In step S203, in response to receiving the prompt information, the material set and the effect parameter are transmitted to the video composition server, so that the server can synthesize the corresponding video from the received material set and effect parameter.
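  • Purely as an illustration of step S203, the transmission might be a single request such as the following; the /compose path and the JSON field names are hypothetical, since the patent only specifies that the material set and the effect parameter are sent to the video composition server:

```python
import json
import urllib.request

def request_composition(server_url, material_set, effect_parameter):
    """Send the material set and the chosen effect parameter to the video
    composition server (step S203) and return the server's response."""
    payload = json.dumps({
        "materials": material_set,    # serialized material elements and attributes
        "effect": effect_parameter,   # identifier of the chosen video effect mode
    }).encode("utf-8")
    request = urllib.request.Request(server_url + "/compose", data=payload,
                                     headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(request) as response:
        return json.load(response)    # e.g. a job id or the URL of the synthesized video
```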
  • In summary, with the video material processing method 200 of the present application, content selection can be performed in a user interface (for example, the user interfaces of FIGS. 3A to 3G), so that the material set of the video to be synthesized can be obtained conveniently.
  • The processing method 200 can also clip a video automatically to generate video segments and corresponding description information, so that the user can quickly determine the content of each segment and select segments by viewing the description information.
  • The processing method 200 spares the user from performing complicated video-effect operations on the local terminal device, and can intuitively present preview images (for example, effect animations) of multiple video effect modes, making it easy for the user to quickly determine the effect mode of the video to be synthesized. On this basis, the method 200 synthesizes the video through the video composition server, greatly improving the user experience.
  • FIG. 4 illustrates a flow diagram of a video composition method 400 in accordance with some embodiments of the present application.
  • the video synthesis method 400 can be performed by a video synthesis application.
  • Server 120 can include a video synthesis application.
  • the video composition application may be, for example, software that synthesizes a video using a collection of materials, or may be a component of various multimedia applications.
  • the multimedia application is, for example, software that provides video content to the terminal device 110.
  • In step S401, a material set of a video to be synthesized and an effect parameter for that material set are acquired from the material processing application 111, where the material set includes a plurality of material elements and each material element includes at least one type of media content among pictures, text, audio, and video.
  • The attributes of the material set include the play order and play duration of each material element in the set.
  • the effect parameter corresponds to a video effect mode.
  • In step S402, the multiple material elements in the material set are synthesized into a video in the corresponding video effect mode (i.e., the video effect mode specified by the effect parameter) according to the effect parameter and the attributes of the material set.
  • In some embodiments, step S402 first performs normalization on the material set so that each material element is converted into a predetermined format.
  • the predetermined format includes, for example, an image encoding format, an image playback frame rate, an image size, and the like.
  • the predetermined format is associated with an effect parameter.
  • each effect parameter is configured with a corresponding predetermined format.
  • step S402 can determine a corresponding predetermined format according to the effect parameter, and perform normalization processing on the material element. Based on this, step S402 can synthesize the normalized material set into a video according to the effect parameter.
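  • As a sketch of this normalization, each effect parameter could be mapped to its predetermined format and applied with FFmpeg (which this application mentions later for caption generation); the parameter names and the format table here are hypothetical:

```python
import subprocess

# Hypothetical mapping from effect parameter to the predetermined format
# (encoding, frame rate, frame size) that step S402 converts each element to.
FORMATS = {
    "effect_a": {"codec": "libx264", "fps": "25", "size": "1280x720"},
    "effect_b": {"codec": "libx264", "fps": "30", "size": "1920x1080"},
}

def normalize(element_path, effect_parameter, out_path):
    """Convert one material element into the predetermined format that is
    configured for the given effect parameter."""
    fmt = FORMATS[effect_parameter]
    subprocess.run(["ffmpeg", "-y", "-i", element_path,
                    "-c:v", fmt["codec"], "-r", fmt["fps"], "-s", fmt["size"],
                    out_path], check=True)
```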
  • the video composition application is configured with multiple video composition scripts.
  • each video composition script (which may also be referred to as a video composition template) corresponds to a video composition effect that can be executed by the video composition application.
  • step S402 can determine a plurality of rendering stages corresponding to the effect parameters.
  • Each rendering stage includes at least one of the plurality of video composition scripts described above, and the rendering result of each rendering stage is the input of the next rendering stage.
  • step S402 can render the material elements in the material collection according to multiple rendering stages to synthesize the video.
  • In this way, step S402 can implement a superimposed composite effect (i.e., the video effect mode corresponding to the effect parameter).
  • FIG. 5 illustrates a video rendering process in accordance with some embodiments of the present application.
  • The process shown in FIG. 5 includes three rendering stages S1, S2, and S3.
  • Stage S1 executes scripts X1 and X2.
  • the material set may include, for example, 20 material elements.
  • In step S402, the first 10 material elements can be rendered by executing script X1, and the last 10 by executing script X2.
  • Step S402 can then continue the overlay effect processing by executing scripts X3 and X4 at stage S2.
  • Step S402 may continue the superimposition processing at stage S3, thereby generating the rendering result corresponding to the effect parameter.
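  • A minimal sketch of this staged pipeline, assuming an even split of the inputs across the scripts of each stage and a hypothetical run_script wrapper around the composition engine:

```python
def render(material_elements, stages, run_script):
    """Multi-stage rendering as in FIG. 5: each stage runs one or more composition
    scripts, and the result of each stage becomes the input of the next."""
    inputs = material_elements
    for scripts in stages:              # e.g. [["X1", "X2"], ["X3", "X4"], ["X5"]]
        chunk = max(1, len(inputs) // len(scripts))
        inputs = [run_script(script, inputs[i * chunk:(i + 1) * chunk])
                  for i, script in enumerate(scripts)]  # one input chunk per script
    return inputs[0]                    # the final rendering result
```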
  • Step S402 may, for example, invoke an After Effects (abbreviated AE) application to execute the scripts, but is not limited thereto. The parameters of the AE command line are described below.
  • aerender is the name of the AE command-line rendering program.
  • -project test.aepx indicates that the current project template file is test.aepx.
  • -comp indicates that the composition name used for this rendering is test.
  • -RStemplate indicates that the render settings template (script) name is test_1.
  • -OMtemplate indicates that the video output template name is test_2.
  • -output indicates that the output video is named test.mov.
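  • Assembled from the parameters just described, the invocation might look like the following (sketched via Python's subprocess module; -project, -comp, -RStemplate, -OMtemplate, and -output are standard aerender flags, and the concrete names are simply the example values above):

```python
import subprocess

# Render the composition "test" of project test.aepx with the render settings
# template test_1 and the output module template test_2, writing test.mov.
subprocess.run([
    "aerender",
    "-project", "test.aepx",   # project template file
    "-comp", "test",           # composition used for this rendering
    "-RStemplate", "test_1",   # render settings template
    "-OMtemplate", "test_2",   # video output template
    "-output", "test.mov",     # name of the output video
], check=True)
```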
  • the video composition method 400 can acquire a set of materials from the material processing application 111 and determine a plurality of rendering stages corresponding to the effect parameters. Based on this, the video composition method 400 can synthesize the rendering result with the superimposed video effect by performing a plurality of rendering stages.
  • The video synthesis method 400 performs multi-stage rendering on the material set and can generate various complex video effects, thereby greatly improving the efficiency of video synthesis and increasing the variety of synthesis effects.
  • FIG. 6 shows a flow diagram of a video composition method 600 in accordance with some embodiments of the present application.
  • Video synthesis method 600 can be performed by a video synthesis application.
  • server 120 can include the video composition application.
  • the video synthesis method 600 includes steps S601 to S602.
  • the implementations of steps S601 to S602 are consistent with steps S401 to S402, respectively, and are not described herein again.
  • the video synthesis method 600 further includes step S603.
  • In step S603, voice information corresponding to the text content is generated. Specifically, step S603 can convert the text content in a material element into voice information.
  • step S603 can perform voice conversion using various predetermined speech conversion algorithms. For example, step S603 can invoke the Xunfei speech synthesis component to obtain a corresponding audio file.
  • In step S604, caption information corresponding to the voice information is generated.
  • step S604 can adopt various techniques capable of generating subtitles, which is not limited in this application.
  • step S604 may call Fast Forward MPEG (abbreviated as FFMPEG) software for caption generation, but is not limited thereto.
  • FFMPEG Fast Forward MPEG
  • the generated subtitle includes parameters such as a subtitle effect, a subtitle display time, and the like.
  • In step S605, the voice information and the subtitle information are added to the video synthesized in step S602.
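  • A sketch of steps S603 to S605, with hypothetical callables standing in for the speech-synthesis component (e.g., the Xunfei component mentioned above) and the subtitle generator, and FFmpeg merging both into the composed video:

```python
import subprocess

def add_voice_and_subtitles(video_in, text, synthesize_speech, make_srt, video_out):
    """Steps S603-S605: synthesize speech for the text content, generate captions
    timed to that speech, and add both to the video synthesized in step S602."""
    voice_path, voice_duration = synthesize_speech(text)  # S603: text -> voice information
    srt_path = make_srt(text, voice_duration)             # S604: caption information
    subprocess.run([                                      # S605: add voice and captions
        "ffmpeg", "-y", "-i", video_in, "-i", voice_path,
        "-vf", f"subtitles={srt_path}",                   # burn the captions into the frames
        "-map", "0:v", "-map", "1:a",                     # keep the video, use the new voice track
        video_out,
    ], check=True)
```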
  • FIG. 7 shows a schematic diagram of a processing device 700 for video material in accordance with some embodiments of the present application.
  • The material processing application 111 may, for example, include the video material processing device 700.
  • the processing device 700 of the video material includes a material acquisition unit 701, an effect determination unit 702, and a transmission unit 703.
  • the material acquiring unit 701 can acquire a material set of the video to be synthesized, and determine an attribute of the material set.
  • the material collection includes a plurality of material elements, each of which includes at least one of image content, text, audio, and video.
  • the attribute includes a play order and a play duration of each material element in the material set.
  • the material acquisition unit 701 can provide a user interface for obtaining a material element.
  • the user interface includes at least one control corresponding to at least one media type, respectively.
  • the at least one media type includes at least one of text, picture, audio, and video.
  • In response to an operation on any control in the user interface, the material acquisition unit 701 can acquire the media content corresponding to that control's media type and use it as one item of media content of a material element in the material set.
  • the material acquisition unit 701 can obtain a picture and use it as the picture content of a material element of the material collection in response to an operation on a picture control in the user interface.
  • the material acquisition unit 701 may also acquire the input text information associated with the picture content as the text content of the material element in response to the operation of the text input control associated with the picture control. In some embodiments, the material acquisition unit 701 can also retrieve the input audio information associated with the picture content in response to an operation of the audio control associated with the picture control and use it as the audio content of the material element. In some embodiments, the material acquisition unit 701 can also retrieve a video segment as the video content of a material element of the material collection in response to an operation on a video control in the user interface.
  • the material acquisition unit 701 first acquires a piece of video, and then extracts at least one video segment from the segment of video and generates descriptive information for each video segment according to a video editing algorithm.
  • the material acquisition unit 701 can determine at least one key image frame of the video. For each key image frame, the material acquisition unit 701 can extract a video segment containing the key image frame from the video.
  • the video clip includes an audio clip.
  • The material acquisition unit 701 can also perform speech recognition on the audio clip to obtain the corresponding text, and generate the description information of the video segment from that text.
  • the material acquisition unit 701 can provide a user interface for displaying description information of each video segment, so that the user performs segment selection according to the description information of each video segment.
  • the material acquisition unit 701 respectively determines each of the selected video segments as the video content of one material element in the material set in response to the selection operation on the video segment.
  • the material acquisition unit 701 can provide a user interface that presents thumbnails corresponding to respective material elements in the material collection.
  • the thumbnails corresponding to the respective material elements are sequentially arranged in the corresponding display area of the user interface.
  • the material acquisition unit 701 can adjust the arrangement order of the elements in the material set in response to the movement operation of the thumbnails in the user interface, and use the adjusted arrangement order as the playback order of the material collection.
  • The material acquisition unit 701 may use the play duration of the picture content as the play duration of the material element.
  • the material acquisition unit 701 may use the playing time of the video content as the playing duration of the material element.
  • the effect determination unit 702 can determine an effect parameter corresponding to the material set.
  • the effect parameter corresponds to a video effect mode.
  • the effect determination unit 702 can provide a user interface that includes a plurality of effect options. Each of these effect options corresponds to an effect parameter.
  • the effect determination unit 702 displays the corresponding preview effect map in the user interface.
  • the effect determining unit 702 sets the effect parameter corresponding to the selected effect option as the effect parameter corresponding to the material set.
  • the transmitting unit 703 may transmit the material set and the effect parameter to the video composition server, so that the video composition server combines the plurality of material elements in the material set into a video corresponding to the video effect mode according to the effect parameter and the attribute of the material set.
  • With the video material processing device 700 of the present application, content selection can be performed in a user interface (for example, the user interfaces of FIGS. 3A to 3G), so that the material set of the video to be synthesized can be obtained conveniently.
  • The processing device 700 can also clip a video automatically to generate video segments and corresponding description information, so that the user can quickly determine the content of each segment and select segments by viewing the description information.
  • The processing device 700 spares the user from performing complicated video-effect operations on the local terminal device, and can intuitively present preview images (e.g., effect animations) of multiple video effect modes, making it easy for the user to quickly determine the effect mode of the video to be synthesized. On this basis, the processing device 700 synthesizes the video through the video synthesis server, greatly improving the user experience.
  • FIG. 8 shows a schematic diagram of a video synthesis device 800 in accordance with some embodiments of the present application.
  • the video synthesis application can include a video synthesis device 800.
  • Server 120 may, for example, include the video composition application.
  • the video synthesizing apparatus 800 may include a communication unit 801 and a video synthesizing unit 802.
  • the communication unit 801 can acquire a material set of a video to be synthesized and an effect parameter regarding the material set from the material processing application 111.
  • the material collection includes a plurality of material elements, and each of the material elements includes at least one of image content, text, audio, and video.
  • the properties of the clip collection include the play order and play duration of each clip in the clip.
  • the effect parameter corresponds to a video effect mode.
  • the video synthesizing unit 802 can synthesize a plurality of material elements in the material set into a video of a video effect mode according to the effect parameter and the attribute of the material set.
  • video synthesizing unit 802 can normalize the set of material to cause each material element to be converted into a predetermined format.
  • the predetermined format includes an image encoding format, an image playback frame rate, and an image size.
  • the video synthesizing unit 802 synthesizes the normalized processed material set into a corresponding video.
  • video synthesizing unit 802 can determine a plurality of rendering stages corresponding to the effect parameters based on a plurality of video synthesis scripts for execution in the predetermined video composition application.
  • Each video composition script corresponds to a video composition effect.
  • Each rendering stage includes at least one of the plurality of video composition scripts.
  • The rendering result of each rendering stage is the input of the next rendering stage.
  • the video composition unit 802 can render the set of materials to generate a corresponding video.
  • the video effect mode may, for example, include a video transition mode between adjacent material elements. It should be noted that a more specific implementation of the video synthesizing apparatus 800 is consistent with the video synthesizing method 400, and details are not described herein again.
  • the video synthesizing apparatus 800 according to the present application can acquire a material set from the material processing application 111 and determine a plurality of rendering stages corresponding to the effect parameters.
  • the video synthesizing device 800 can synthesize the rendering result with the superimposed video effect by performing a plurality of rendering stages.
  • The video synthesizing device 800 performs multi-stage rendering on the material set and can generate various complex video effects, thereby greatly improving the efficiency of video synthesis and increasing the variety of synthesis effects.
  • FIG. 9 shows a schematic diagram of a video synthesizing device 900 in accordance with some embodiments of the present application.
  • the video synthesis application can include a video synthesis device 900.
  • Server 120 may, for example, include the video composition application.
  • the video synthesizing apparatus 900 includes a communication unit 901 and a video synthesizing unit 902.
  • the communication unit 901 can be implemented as an embodiment consistent with the communication unit 801.
  • the video synthesizing unit 902 can be implemented as an embodiment consistent with the video synthesizing unit 802, and details are not described herein again.
  • the device 900 may further include a speech synthesis unit 903, a subtitle generation unit 904, and an addition unit 905.
  • the speech synthesis unit 903 can generate the speech information corresponding to the text content.
  • the subtitle generating unit 904 can generate subtitle information corresponding to the voice information.
  • the adding unit 905 is for adding the voice information and the caption information to the generated video. It should be noted that a more specific implementation of the video synthesizing apparatus 900 is consistent with the video synthesizing method 600, and details are not described herein again.
  • FIG. 10 shows a block diagram of the structure of a computing device.
  • the computing device includes one or more processors (CPU or GPU) 1002, a communication module 1004, a memory 1006, a user interface 1010, and a communication bus 1008 for interconnecting these components.
  • the processor 1002 can receive and transmit data through the communication module 1004 to effect network communication and/or local communication.
  • User interface 1010 includes one or more output devices 1012 that include one or more speakers and/or one or more visual displays.
  • User interface 1010 also includes one or more input devices 1014, including, for example, a keyboard, a mouse, a voice command input unit or microphone, a touch screen display, a touch-sensitive tablet, a gesture-capture camera, or other input buttons or controls.
  • The memory 1006 may be a high-speed random access memory, such as DRAM, SRAM, DDR RAM, or another random-access solid-state storage device; or a non-volatile memory, such as one or more magnetic disk storage devices, optical disc storage devices, flash memory devices, or other non-volatile solid-state storage devices.
  • the memory 1006 stores a set of instructions executable by the processor 1002, including:
  • an operating system 1016, including programs for processing various basic system services and performing hardware-related tasks;
  • applications 1018, including various programs for implementing the above methods; these programs can implement the processing flows in each of the above examples.
  • application 1018 can include a video material processing application in accordance with the present application.
  • the video material processing application may include the processing device 700 of the video material shown in FIG.
  • application 1018 can include a video composition application.
  • the video composition application may include, for example, the video synthesizing device 800 shown in FIG. 8 or the video synthesizing device 900 shown in FIG.
  • each of the examples of the present application can be implemented by a data processing program executed by a data processing device such as a computer.
  • the data processing program constitutes the present application.
  • The data processing program is usually stored in a storage medium and is executed either by reading it directly from the storage medium or by installing or copying it to a storage device (such as a hard disk or memory) of the data processing device. Therefore, such a storage medium also constitutes the present application.
  • The storage medium can use any type of recording method, such as a paper storage medium (e.g., paper tape), a magnetic storage medium (e.g., floppy disk, hard disk, or flash memory), an optical storage medium (e.g., CD-ROM), or a magneto-optical storage medium (e.g., MO).
  • the present application therefore also discloses a non-volatile storage medium in which is stored a data processing program for performing any of the above-described methods of the present application.
  • The method steps described in this application can be implemented by a data processing program, and can also be implemented in hardware, for example, by logic gates, switches, application-specific integrated circuits (ASICs), programmable logic controllers, embedded microcontrollers, and so on.

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Databases & Information Systems (AREA)
  • Human Computer Interaction (AREA)
  • Two-Way Televisions, Distribution Of Moving Picture Or The Like (AREA)
  • Television Signal Processing For Recording (AREA)

Abstract

Disclosed in the present application are a video material processing method, a video synthesis method, a terminal device and a storage medium. The video material processing method, performed by the terminal device, comprises: acquiring a material set of a video to be synthesized, and determining attributes of the material set, wherein the material set comprises multiple material elements, each material element comprises at least one type of media content among pictures, text, audio and video, and the attributes comprise the playback order and the playback duration of each material element in the material set; determining effect parameters corresponding to the material set, the effect parameters corresponding to video effect modes; and transmitting the material set and the effect parameters to a video synthesis server to enable the video synthesis server to synthesize the multiple material elements in the material set into a video corresponding to the video effect modes on the basis of the effect parameters and the attributes of the material set.

Description

Video material processing method, video synthesis method, terminal device and storage medium
This application claims priority to Chinese Patent Application No. 201711076478.2, filed with the Chinese Patent Office on November 6, 2017 and entitled "Video material processing method, video synthesis method, apparatus and storage medium", which is incorporated herein by reference in its entirety.
Technical Field
The present application relates to the field of video synthesis, and in particular to a video material processing method, a video synthesis method, a terminal device, and a storage medium.
Background
With the development of multimedia technology, video production has become widely used in people's lives. Video production recombines and re-encodes pictures, videos, audio, and other materials to generate a video. Currently, video production typically requires installing video production software on a personal computing device. Such software provides feature-rich video editing functions, but is complicated to operate.
Summary of the Invention
To this end, the present application proposes a new video synthesis scheme to address the problem of how to make video synthesis more convenient to operate.
According to one aspect of the present application, a video material processing method is provided, executed by a terminal device, the method including: acquiring a material set of a video to be synthesized and determining attributes of the material set, where the material set includes multiple material elements, each material element includes at least one type of media content among pictures, text, audio, and video, and the attributes include the play order and play duration of each material element in the material set; determining an effect parameter corresponding to the material set, where the effect parameter corresponds to a video effect mode; and transmitting the material set and the effect parameter to a video composition server, so that the video composition server synthesizes the multiple material elements in the material set into a video in the corresponding video effect mode according to the effect parameter and the attributes of the material set.
According to one aspect of the present application, a video synthesis method is provided, executed by a server, the method including: acquiring, from a material processing application, a material set of a video to be synthesized and an effect parameter for that material set, where the material set includes multiple material elements, each material element includes at least one type of media content among pictures, text, audio, and video, the attributes of the material set include the play order and play duration of each material element, and the effect parameter corresponds to a video effect mode; and synthesizing the multiple material elements in the material set into a video in that video effect mode according to the effect parameter and the attributes of the material set.
According to one aspect of the present application, a terminal device is provided, including a processor and a memory; the memory stores computer readable instructions that enable the processor to perform the video material processing method according to the present application.
According to one aspect of the present application, a server is provided, including a processor and a memory; the memory stores computer readable instructions that enable the processor to perform the video synthesis method according to the present application.
According to one aspect of the present application, a non-volatile storage medium is provided, storing a data processing program that, when executed by a computing device, causes the computing device to perform the video material processing method or the video synthesis method.
In summary, with the video material processing scheme of the present application, content selection can be performed in a user interface (for example, the user interfaces of FIGS. 3A to 3G), so that the material set of the video to be synthesized can be obtained conveniently. In particular, the processing scheme can also clip a video automatically to generate video segments and corresponding description information, so that the user can quickly determine the content of each segment and select segments by viewing the description information. In addition, the processing scheme spares the user from performing complicated video-effect operations on the local terminal device, and can intuitively present preview images (e.g., effect animations) of multiple video effect modes, making it easy for the user to quickly determine the effect mode of the video to be synthesized. On this basis, the processing scheme synthesizes the video through the video composition server, greatly improving the user experience.
Brief Description of the Drawings
To explain the technical solutions in the examples of the present application more clearly, the drawings used in the description of the examples are briefly introduced below. Obviously, the drawings in the following description are only some examples of the present application; for a person of ordinary skill in the art, other drawings can be obtained from these drawings without creative effort.
FIG. 1 shows a schematic diagram of an application scenario 100 according to some embodiments of the present application;
FIG. 2A shows a flowchart of a video material processing method 200 according to some embodiments of the present application;
FIG. 2B shows a flowchart of acquiring a material set according to some embodiments of the present application;
FIG. 3A shows a schematic diagram of a user interface for acquiring picture content according to some embodiments of the present application;
FIG. 3B shows a schematic diagram of an interface displaying a picture according to some embodiments;
FIG. 3C shows a schematic diagram of acquiring audio information according to some embodiments of the present application;
FIG. 3D shows a user interface for generating video segments according to some embodiments of the present application;
FIG. 3E shows the editing interface of a video clip;
FIG. 3F shows a user interface for adjusting the play order according to some embodiments of the present application;
FIG. 3G shows a user interface for determining an effect parameter according to some embodiments of the present application;
FIG. 4 shows a flowchart of a video synthesis method 400 according to some embodiments of the present application;
FIG. 5 shows a video rendering process according to some embodiments of the present application;
FIG. 6 shows a flowchart of a video synthesis method 600 according to some embodiments of the present application;
FIG. 7 shows a schematic diagram of a video material processing device 700 according to some embodiments of the present application;
FIG. 8 shows a schematic diagram of a video synthesis device 800 according to some embodiments of the present application;
FIG. 9 shows a schematic diagram of a video synthesis device 900 according to some embodiments of the present application; and
FIG. 10 shows a block diagram of the structure of a computing device.
Detailed Description
The technical solutions in the examples of the present application are described below clearly and completely with reference to the accompanying drawings. Obviously, the described examples are only some of the examples of the present application, not all of them. All other examples obtained by a person of ordinary skill in the art based on the examples in the present application without creative effort fall within the scope of protection of the present application.
图1示出了根据本申请一些实施例的应用场景100的示意图。如图1所示,应用场景100包括终端设备110和服务器120。这里,终端设备110例如可以是台式电脑、笔记本电脑、平板电脑、移动电话或掌上游戏机等各种设备,但不限于此。服务器120可以包括一个或多个硬件独立的服务器。服务器120还可以是虚拟服务器或者分布式集群等设备资源,但不限于此。终端设备110可以包括各种应用,例如素材处理应用111。素材处理应用111可以获取待合成视频的视频素材,并将视频素材传输到服务器120中。这样,服务器120可以基于所接收的视频素材合成相应的视频。服务器120还可以将所合成的视频传输到终端设备110。这里,素材处理应用111例如可以是专用于管理素材的客户端或者浏览器等,本申请对此不作限制。在一些实施例中,服务器120可以包括视频合成应用(图1未示出)。视频合成应用例如可以是利用素材集合来合成视频的软件,也可以是各种多媒体应用的组件。这里,多媒体应用例如是向终端设备110提供视频内容的软件。下面结合图2对视频素材的处理方法进行说明。FIG. 1 shows a schematic diagram of an application scenario 100 in accordance with some embodiments of the present application. As shown in FIG. 1, the application scenario 100 includes a terminal device 110 and a server 120. Here, the terminal device 110 may be, for example, various devices such as a desktop computer, a notebook computer, a tablet computer, a mobile phone, or a handheld game console, but is not limited thereto. Server 120 may include one or more hardware independent servers. The server 120 may also be a device resource such as a virtual server or a distributed cluster, but is not limited thereto. The terminal device 110 can include various applications, such as a material processing application 111. The material processing application 111 can acquire the video material of the video to be composited and transmit the video material to the server 120. In this way, the server 120 can synthesize the corresponding video based on the received video material. The server 120 can also transmit the synthesized video to the terminal device 110. Here, the material processing application 111 may be, for example, a client or a browser dedicated to managing the material, and the like, which is not limited in this application. In some embodiments, server 120 can include a video composition application (not shown in FIG. 1). The video composition application may be, for example, software that synthesizes a video using a collection of materials, or may be a component of various multimedia applications. Here, the multimedia application is, for example, software that provides video content to the terminal device 110. The processing method of the video material will be described below with reference to FIG.
FIG. 2A shows a flowchart of a method 200 for processing video material according to some embodiments of the present application. The method 200 may be performed, for example, by the material processing application 111, but is not limited thereto. Here, the material processing application 111 may be, for example, a browser or a client for processing material. The material processing application 111 may also be a component of an application such as an instant messaging application (QQ, WeChat, and so on), a social networking application, a video application (such as Tencent Video), or a news client.
As shown in FIG. 2A, the method 200 for processing video material includes step S201: acquiring a material set of the video to be synthesized and determining attributes of the material set. Here, the material set may include a plurality of material elements, each of which includes at least one type of media content among pictures, text, audio, and video. The attributes of the material set include the play order and play duration of each material element in the set. According to some embodiments of the present application, step S201 provides a user interface for acquiring material elements. The user interface may include at least one control, each corresponding to a media type. Here, a control is a view object in the user interface used to interact with the user, such as an input box, a drop-down selection box, or a button. The media types include, for example, text, picture, audio, and video, but are not limited thereto. On this basis, in response to an operation on any control in the user interface, step S201 may acquire the media content corresponding to the media type of that control and use it as one item of media content of a material element in the material set.
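The patent does not fix any concrete data layout for the material set; the following Python sketch is only one plausible shape for it, and every name in it (MaterialElement, MaterialSet, and so on) is hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class MaterialElement:
    """One element of the material set; each field is one item of media content."""
    picture: Optional[str] = None   # path or URL of a picture
    text: Optional[str] = None      # caption or narration text
    audio: Optional[str] = None     # path of narration or background music
    video: Optional[str] = None     # path of a video clip
    duration: float = 0.0           # play duration in seconds

@dataclass
class MaterialSet:
    """Material set plus its attributes: play order and per-element duration."""
    elements: List[MaterialElement] = field(default_factory=list)

    @property
    def play_order(self) -> List[int]:
        # The default play order is the order in which elements were generated.
        return list(range(len(self.elements)))
```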
In some embodiments, when the user selects a picture, stored locally or taken from the network, through a picture control, step S201 may, in response to the operation on the picture control, use that picture as the picture content of a material element. Note that a material element containing picture content may typically also include text or audio associated with the picture. In some embodiments, when the user inputs text corresponding to the picture through a text input control, step S201, in response to the operation on the text input control, acquires the text information associated with the picture content and uses it as the text content of the corresponding material element. In some embodiments, step S201 may, in response to an operation on an audio control, acquire audio information associated with the picture content and use it as the audio content of the corresponding material element. Here, the audio content is, for example, narration or background music. In addition, step S201 may use the play duration of the picture as the play duration of the corresponding material element. To illustrate the execution of step S201 more concretely, an example is described below with reference to FIGS. 3A to 3C.
FIG. 3A shows a schematic user interface for acquiring picture content according to some embodiments of the present application, and FIG. 3B shows an interface displaying a picture in some embodiments. As shown in FIGS. 3A and 3B, when the user operates the control 301, step S201 may acquire a picture and display it in the preview window 302. Step S201 may determine the play duration of the picture in response to an operation on the play duration control 303, and may acquire text information related to the picture in the preview window 302 in response to an operation on the text input control 304. In other words, the text information is a supplementary description of the picture. FIG. 3C shows a schematic diagram of acquiring audio information according to some embodiments of the present application. For example, step S201 may acquire locally stored audio (for example, background music) in response to an operation on the control 305. As another example, step S201 may record a piece of audio content in response to an operation on the control 306. The audio content is, for example, narration recorded for the picture in the preview window 302.
In some embodiments, step S201 may acquire a segment of video and use it as the video content of a material element. For example, in response to an operation on a video control in the user interface, step S201 acquires a video clip and uses it as the video content of a material element. Here, the video may be, for example, a video file stored locally, or video content stored in the cloud. For a material element containing video content, step S201 may also add text content, audio content, and so on. When a material element includes video content, step S201 may use the play duration of the video content as the play duration of that material element.
In some embodiments, step S201 may include steps S2011 to S2014. As shown in FIG. 2B, in step S2011, a segment of video is acquired. In step S2012, according to a predetermined video clipping algorithm, at least one video segment is extracted from the video and description information is generated for each video segment. Specifically, according to some embodiments of the present application, step S2012 first determines at least one key image frame of the video. For each key image frame, step S2012 may extract from the video a video segment containing that key image frame. The video segment may include an audio segment associated with the segment's sequence of image frames. Step S2012 then performs speech recognition on the audio segment to obtain the corresponding text, and generates the description information of the video segment from that text. It should be understood that step S2012 may employ any of various algorithms capable of automatically clipping video, which is not limited in this application.
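Step S2012 leaves the clipping algorithm open. As an illustration of the key-frame idea only, the sketch below picks key frames from precomputed inter-frame difference scores and cuts a fixed window around each; the scoring, threshold, and window size are all assumptions for illustration, not values from the patent.

```python
from typing import List, Tuple

def find_key_frames(frame_diffs: List[float], threshold: float = 0.4) -> List[int]:
    """Pick frames whose difference from the previous frame exceeds a threshold.

    frame_diffs[i] is assumed to be a normalized (0..1) difference score
    between frame i and frame i-1, computed upstream by a decoder.
    """
    return [i for i, d in enumerate(frame_diffs) if d >= threshold]

def segments_around(key_frames: List[int], total_frames: int,
                    half_window: int = 75) -> List[Tuple[int, int]]:
    """Build one (start, end) frame range per key frame, clamped to the video."""
    return [(max(0, k - half_window), min(total_frames - 1, k + half_window))
            for k in key_frames]

# Example: a 600-frame video with two visually abrupt changes.
diffs = [0.02] * 600
diffs[150] = 0.8   # scene change
diffs[420] = 0.6   # scene change
print(segments_around(find_key_frames(diffs), total_frames=600))
```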
On this basis, in step S2013, a user interface displaying the description information of each video segment is provided, so that the user can select segments according to each segment's description information.
In step S2014, in response to a selection operation on at least one video segment, each selected video segment is used as the video content of one material element in the material set. In other words, step S2014 may generate each selected video segment as a material element.
Note also that the video need not be clipped locally in step S2012; embodiments of the present application may instead send a video clipping request to the cloud and have a cloud device (for example, the server 120) perform the clipping. On this basis, embodiments of the present application can acquire the clipped video segments from the cloud device. In addition, to illustrate the process of generating a material element containing video content more concretely, an example is described below with reference to FIGS. 3D and 3E.
FIG. 3D shows a user interface for generating video segments according to some embodiments of the present application. As shown in FIG. 3D, the window 307 is a preview window of the video to be clipped. In response to an operation on the control 308, embodiments of the present application may generate multiple video segments, such as the segment 309. FIG. 3E shows an editing interface for a video segment. For example, in response to an operation on the segment 309 (such as a click or double-click), the interface shown in FIG. 3E is entered. The window 310 is a preview window of the segment 309, and the area 311 shows the description information of the segment 309. In addition, the user may input text content corresponding to the video segment through the text input control 312, and may acquire audio content for the video segment through the control 313 or the control 314. For example, the icon 315 represents an acquired audio file. Further, by operating the checkboxes in FIG. 3D, the user may select at least one video segment. In this way, this embodiment can treat each selected video segment, together with its corresponding text content and audio content, as one material element.
In summary, step S201 can acquire a plurality of material elements. Here, step S201 may use the order in which the material elements were generated as the default play order. In addition, step S201 may adjust the play order of the material elements in response to user operations. For example, FIG. 3F shows a user interface for adjusting the play order according to some embodiments of the present application. FIG. 3F presents a thumbnail for each material element, for example 316 and 317, arranged in sequence within the display area. Step S201 may adjust the arrangement order of the elements in the material set in response to a move operation on a thumbnail in the user interface, and use the adjusted arrangement order as the play order of the material set, as sketched below.
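The reordering itself reduces to moving one index within the play-order list. A minimal sketch, with all names hypothetical:

```python
def move_thumbnail(order: list, src: int, dst: int) -> list:
    """Reorder material elements after a thumbnail is dragged from src to dst."""
    order = order.copy()
    order.insert(dst, order.pop(src))
    return order

print(move_thumbnail([0, 1, 2, 3], src=3, dst=1))  # -> [0, 3, 1, 2]
```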
For the material set determined in step S201, the method 200 may perform step S202. In step S202, an effect parameter corresponding to the material set is determined. Here, each effect parameter corresponds to one video effect mode. Video effects include, for example, transition effects between adjacent material elements, particle effects, and so on. A transition effect is the scene-transition effect between two scenes (that is, two material elements). For example, embodiments of the present application may employ predetermined techniques (such as wipes, dissolves, or page curls) to achieve a smooth transition. A transition effect may also include the effect of a picture entering the frame (also called a picture fly-in effect). Particle effects are animation effects that simulate real-world phenomena such as water, fire, fog, or gas. Note that one video effect mode corresponds to the overall effect of a video to be synthesized; in practice, a video effect mode may be a single predetermined video effect or a combination of several predetermined video effects. To spare the user complicated operations on the video effect mode in the terminal device 110, step S202 may provide a user interface containing multiple effect options, where each effect option corresponds to one effect parameter. Here, an effect parameter can be regarded as an identifier of the corresponding video effect mode. In response to a preview operation on any of the effect options, step S202 may display the corresponding preview rendering in the user interface. In response to a selection operation on any of the effect options, step S202 may use the effect parameter corresponding to the selected option as the effect parameter of the material set. For example, FIG. 3G shows a user interface for determining the effect parameter according to some embodiments of the present application. As shown in FIG. 3G, the area 319 shows multiple effect options, such as 320 and 321, each corresponding to a video effect mode. For example, when the effect option 320 is previewed, the corresponding effect animation is displayed in the window 318, and the option in the window 322 indicates the effect option currently being previewed. Here, the effect animation can intuitively represent a video effect mode, so the user can select a mode simply by viewing the animation, without performing complicated effect-related operations on the terminal device. For example, step S202 may, in response to an operation on the control 323, select the effect parameter corresponding to the effect option currently being previewed.
After the material set and the effect parameter are determined, the method 200 may perform step S203. In step S203, the material set and the effect parameter are transmitted to a video composition server (for example, the server 120). In this way, the video composition server can synthesize the multiple material elements in the set into a video corresponding to the determined video effect mode according to the effect parameter and the attributes of the material set. According to some embodiments, step S203 sends a video composition request to the video composition server. The video composition request may include the material set and the effect parameter, so that the server can synthesize the video in response to the request. According to some embodiments of the present application, the video composition server may send the material processing application 111 a notification that a video composition service is available; in step S203, in response to receiving that notification, the material set and the effect parameter are sent to the server, so that the server can synthesize the corresponding video from the received material set and effect parameter.
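The patent does not specify a wire format for the composition request. As a sketch only, assuming a JSON-over-HTTP request to a hypothetical endpoint:

```python
import json
import urllib.request

# Hypothetical endpoint; the patent names no URL or protocol.
COMPOSE_URL = "https://example.com/api/compose"

def send_composition_request(material_set: dict, effect_parameter: str) -> bytes:
    """Send the material set and effect parameter as one composition request."""
    payload = json.dumps({
        "materials": material_set,   # elements plus play order and durations
        "effect": effect_parameter,  # identifier of the video effect mode
    }).encode("utf-8")
    req = urllib.request.Request(
        COMPOSE_URL, data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return resp.read()           # e.g. a job id, or the finished video
```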
In summary, the method 200 for processing video material according to the present application allows content to be selected in a user interface (for example, the interfaces of FIGS. 3A to 3G), so that the material set of the video to be synthesized can be acquired conveniently. In particular, the method 200 can also clip the video automatically to generate video segments and corresponding description information, so that the user can quickly determine the content of each segment from its description and select segments accordingly. In addition, the method 200 spares the user complicated effect-related operations on the local terminal device: it intuitively presents preview renderings (for example, effect animations) of multiple video effect modes, making it easy to quickly determine the effect mode of the video to be synthesized. On this basis, the method 200 of the present application can synthesize the video through the video composition server, greatly improving the user experience.
The video composition process is further described below with reference to FIG. 4. FIG. 4 shows a flowchart of a video composition method 400 according to some embodiments of the present application. The method 400 may be performed by a video composition application; the server 120 may include such an application. The video composition application may be, for example, software that synthesizes a video from a material set, or a component of any of various multimedia applications. Here, a multimedia application is, for example, software that provides video content to the terminal device 110.
As shown in FIG. 4, in step S401, a material set of a video to be synthesized and an effect parameter for that set are acquired from the material processing application 111. The material set includes a plurality of material elements, each of which includes at least one type of media content among pictures, text, audio, and video. The attributes of the material set include the play order and play duration of each material element in the set, and the effect parameter corresponds to one video effect mode.
In step S402, the multiple material elements in the material set are synthesized, according to the effect parameter and the attributes of the material set, into a video in the corresponding video effect mode (that is, the mode specified by the effect parameter). In some embodiments, step S402 normalizes the material set so that each material element is converted into a predetermined format. The predetermined format covers, for example, the image encoding format, the playback frame rate, and the image size. In some embodiments, the predetermined format is associated with the effect parameter; in other words, each effect parameter is configured with a corresponding predetermined format. Step S402 can therefore determine the predetermined format from the effect parameter and normalize the material elements accordingly. On this basis, step S402 can synthesize the normalized material set into a video according to the effect parameter.
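The patent names FFMPEG only in connection with subtitles later on, but a normalization pass of this kind could plausibly shell out to ffmpeg as below; the profile values (codec, frame rate, size) and the per-effect mapping are assumptions for illustration.

```python
import subprocess

# Illustrative only: one normalization profile per effect parameter.
# The key "effect_320" and all values are assumptions, not from the patent.
PROFILES = {
    "effect_320": {"fps": 25, "size": "1280:720", "codec": "libx264"},
}

def normalize(src: str, dst: str, effect: str) -> None:
    """Re-encode one material element to the predetermined format."""
    p = PROFILES[effect]
    subprocess.run([
        "ffmpeg", "-y", "-i", src,
        "-r", str(p["fps"]),           # unify the playback frame rate
        "-vf", f"scale={p['size']}",   # unify the image size
        "-c:v", p["codec"],            # unify the image encoding format
        dst,
    ], check=True)
```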
In some embodiments, the video composition application is configured with multiple video composition scripts. Here, each video composition script (which may also be called a video composition template) corresponds to one video composition effect and can be executed by the video composition application. Based on the effect parameter, step S402 may determine the multiple rendering stages corresponding to that parameter. Each rendering stage includes at least one of the video composition scripts, and the rendering result of each stage is the input of the next stage. Step S402 can thus render the material elements of the material set through these stages to synthesize the video; through the multiple rendering stages, step S402 can realize a layered composite effect (that is, the video effect mode corresponding to the effect parameter). FIG. 5 shows a video rendering process according to some embodiments of the present application. The process shown in FIG. 5 includes three rendering stages: S1, S2, and S3. Stage S1 executes scripts X1 and X2. Suppose, for example, that the material set includes 20 material elements. Step S402 may render the first 10 elements by executing script X1 and the last 10 by executing script X2. For the rendering result of S1, step S402 may continue the overlay-effect processing in stage S2 by executing scripts X3 and X4, and then continue on the result of S2 in stage S3, producing the rendering result corresponding to the effect parameter. Here, each script is formatted, for example, in Extensible Markup Language (XML). Step S402 may, for example, invoke the After Effects (AE) application to execute the scripts, but is not limited thereto. An example of the command with which step S402 invokes AE to perform a rendering operation is as follows:
aerender -project test.aepx -comp "test" -RStemplate "test_1" -OMtemplate "test_2" -output test.mov
Here, aerender is the name of the AE command-line rendering program.
-project test.aepx specifies that the project template file is test.aepx.
-comp specifies that the composition used for this render is named "test".
-RStemplate specifies that the render settings template is test_1.
-OMtemplate specifies that the output module template is test_2.
-output specifies that the output video is named test.mov.
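Chaining such aerender calls gives the staged pipeline of FIG. 5, simplified here to one script per stage. The sketch below is a rough illustration under the assumption that each stage's project file references the previous stage's output movie as footage; all project, composition, and template names are hypothetical.

```python
import subprocess

# Stages of FIG. 5; every project and template name below is hypothetical.
STAGES = [
    ("stage1.aepx", "s1", "X1"),
    ("stage2.aepx", "s2", "X3"),
    ("stage3.aepx", "s3", "X5"),
]

prev_output = None
for i, (project, comp, template) in enumerate(STAGES, start=1):
    out = f"stage{i}.mov"
    # Each stage's .aepx is assumed to reference the previous stage's movie
    # as footage, so prev_output feeds the next render implicitly.
    subprocess.run([
        "aerender", "-project", project,
        "-comp", comp,
        "-RStemplate", template,
        "-output", out,
    ], check=True)
    prev_output = out

print("final render:", prev_output)
```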
In summary, the video composition method 400 according to the present application can acquire the material set from the material processing application 111 and determine the multiple rendering stages corresponding to the effect parameter. On this basis, the method 400 can synthesize a rendering result with layered video effects by executing the multiple rendering stages. In particular, by rendering the material set in multiple stages, the method 400 can generate a wide variety of complex video effects, greatly improving the efficiency of video composition and broadening the range of composition effects.
FIG. 6 shows a flowchart of a video composition method 600 according to some embodiments of the present application. The method 600 may be performed by a video composition application; for example, the server 120 may include such an application.
As shown in FIG. 6, the video composition method 600 includes steps S601 and S602, whose implementations are consistent with steps S401 and S402 respectively and are not repeated here. In addition, the method 600 further includes steps S603 to S605.
In step S603, speech information corresponding to the text content is generated. Specifically, the text content of a material element can be converted into speech in step S603, using any of various predetermined speech synthesis algorithms. For example, step S603 may invoke the iFlytek speech synthesis component to obtain the corresponding audio file.
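As an illustration only (the patent names iFlytek's component, which is not shown here), an offline stand-in such as pyttsx3 could render a material element's text to an audio file:

```python
import pyttsx3  # offline TTS stand-in; the patent itself names iFlytek's component

def text_to_speech(text: str, out_path: str) -> None:
    """Render one material element's text content to an audio file."""
    engine = pyttsx3.init()
    engine.save_to_file(text, out_path)  # queue the utterance for file output
    engine.runAndWait()                  # run the queued synthesis

text_to_speech("A narration line for the picture.", "narration.wav")
```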
In step S604, subtitle information corresponding to the speech information is generated. Here, step S604 may employ any of various subtitle generation techniques, which this application does not limit. For example, step S604 may invoke the Fast Forward MPEG (FFMPEG) software for subtitle generation, but is not limited thereto. The generated subtitles include parameters such as the subtitle effect and the subtitle display time.
In step S605, the speech information and the subtitle information are added to the video synthesized in step S602.
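One plausible way to carry out steps S604 and S605 with ffmpeg is to burn an SRT subtitle file into the frames while muxing in the synthesized narration; the file names below are hypothetical.

```python
import subprocess

def add_voice_and_subtitles(video: str, voice: str, srt: str, out: str) -> None:
    """Mux the synthesized narration and burn the subtitles into the video."""
    subprocess.run([
        "ffmpeg", "-y",
        "-i", video, "-i", voice,
        "-vf", f"subtitles={srt}",      # burn the subtitle track into the frames
        "-map", "0:v", "-map", "1:a",   # video from input 0, audio from input 1
        "-c:a", "aac",
        out,
    ], check=True)

add_voice_and_subtitles("composed.mp4", "narration.wav", "subs.srt", "final.mp4")
```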
FIG. 7 shows a schematic diagram of an apparatus 700 for processing video material according to some embodiments of the present application. The material processing application 111 may, for example, include the apparatus 700. As shown in FIG. 7, the apparatus 700 includes a material acquisition unit 701, an effect determination unit 702, and a transmission unit 703. The material acquisition unit 701 can acquire the material set of the video to be synthesized and determine the attributes of the material set. The material set includes a plurality of material elements, each of which includes at least one type of media content among pictures, text, audio, and video. The attributes include the play order and play duration of each material element in the set. In some embodiments, the material acquisition unit 701 may provide a user interface for acquiring material elements. The user interface includes at least one control, each corresponding to a media type; the media types include at least one of text, picture, audio, and video. In response to an operation on any control in the user interface, the material acquisition unit 701 can acquire the media content corresponding to the media type of that control and use it as one item of media content of a material element in the material set. In some embodiments, in response to an operation on a picture control in the user interface, the material acquisition unit 701 can acquire a picture and use it as the picture content of a material element of the set. In some embodiments, the material acquisition unit 701 can also, in response to an operation on a text input control associated with the picture control, acquire the input text information associated with the picture content and use it as the text content of the material element. In some embodiments, the material acquisition unit 701 can also, in response to an operation on an audio control associated with the picture control, acquire the input audio information associated with the picture content and use it as the audio content of the material element. In some embodiments, the material acquisition unit 701 can also, in response to an operation on a video control in the user interface, acquire a video clip and use it as the video content of a material element of the set.
In some embodiments, the material acquisition unit 701 first acquires a segment of video and then, according to a video clipping algorithm, extracts at least one video segment from it and generates description information for each video segment.
Specifically, the material acquisition unit 701 may determine at least one key image frame of the video. For each key image frame, the unit may extract from the video a video segment containing that frame; the video segment includes an audio segment. The material acquisition unit 701 may also perform speech recognition on the audio segment to obtain the corresponding text, and generate the description information of the video segment from that text.
On this basis, the material acquisition unit 701 may provide a user interface displaying the description information of each video segment, so that the user can select segments according to each segment's description. In response to a selection operation on the video segments, the material acquisition unit 701 uses each selected segment as the video content of one material element in the material set.
In some embodiments, the material acquisition unit 701 may provide a user interface presenting a thumbnail for each material element in the set, with the thumbnails arranged in sequence within the corresponding display area of the user interface. The unit may adjust the arrangement order of the elements in the material set in response to a move operation on a thumbnail in the user interface, and use the adjusted order as the play order of the material set. In some embodiments, when a material element includes picture content, the material acquisition unit 701 may use the play duration of the picture content as the play duration of that element; when a material element includes video content, the unit may use the play duration of the video content as the play duration of that element.
The effect determination unit 702 can determine the effect parameter corresponding to the material set; the effect parameter corresponds to one video effect mode. In some embodiments, the effect determination unit 702 may provide a user interface containing multiple effect options, each corresponding to one effect parameter. In response to a preview operation on any of the effect options, the unit displays the corresponding preview rendering in the user interface. In response to a selection operation on any of the effect options, the unit uses the effect parameter corresponding to the selected option as the effect parameter of the material set.
The transmission unit 703 can transmit the material set and the effect parameter to the video composition server, so that the server can synthesize the multiple material elements in the set into a video corresponding to the video effect mode according to the effect parameter and the attributes of the material set. Note that the more specific implementation of the apparatus 700 is consistent with the method 200 and is not repeated here. In summary, the apparatus 700 for processing video material according to the present application allows content to be selected in a user interface (for example, the interfaces of FIGS. 3A to 3G), so that the material set of the video to be synthesized can be acquired conveniently. In particular, the apparatus 700 can also clip video automatically to generate video segments and corresponding description information, so that the user can quickly determine the content of each segment from its description and select segments accordingly. In addition, the apparatus 700 spares the user complicated effect-related operations on the local terminal device by intuitively presenting preview renderings (for example, effect animations) of multiple video effect modes, making it easy to quickly determine the effect mode of the video to be synthesized. On this basis, the apparatus 700 can synthesize the video through the video composition server, greatly improving the user experience.
FIG. 8 shows a schematic diagram of a video composition apparatus 800 according to some embodiments of the present application. The video composition application may include the apparatus 800; the server 120 may, for example, include that application.
As shown in FIG. 8, the video composition apparatus 800 may include a communication unit 801 and a video composition unit 802. The communication unit 801 can acquire, from the material processing application 111, the material set of a video to be synthesized and the effect parameter for that set. The material set includes a plurality of material elements, each of which includes at least one type of media content among pictures, text, audio, and video. The attributes of the material set include the play order and play duration of each material element, and the effect parameter corresponds to one video effect mode.
The video composition unit 802 can synthesize the multiple material elements in the material set into a video in the corresponding video effect mode according to the effect parameter and the attributes of the set. In some embodiments, the unit 802 may normalize the material set so that each element is converted into a predetermined format covering the image encoding format, the playback frame rate, and the image size, and then synthesize the normalized set into the video according to the effect parameter. In some embodiments, based on multiple video composition scripts executable in a predetermined video composition application, the unit 802 may determine the multiple rendering stages corresponding to the effect parameter. Each video composition script corresponds to one video composition effect; each rendering stage includes at least one of the scripts, and the rendering result of each stage is the input of the next. Based on these stages, the unit 802 can render the material set to generate the video. The video effect mode may include, for example, a video transition mode between adjacent material elements. Note that the more specific implementation of the apparatus 800 is consistent with the video composition method 400 and is not repeated here. In summary, the video composition apparatus 800 according to the present application can acquire the material set from the material processing application 111, determine the multiple rendering stages corresponding to the effect parameter, and synthesize a rendering result with layered video effects by executing those stages. In particular, by rendering the material set in multiple stages, the apparatus 800 can generate a wide variety of complex video effects, greatly improving the efficiency of video composition and broadening the range of composition effects.
FIG. 9 shows a schematic diagram of a video composition apparatus 900 according to some embodiments of the present application. The video composition application may include the apparatus 900; the server 120 may, for example, include that application.
As shown in FIG. 9, the video composition apparatus 900 includes a communication unit 901 and a video composition unit 902. Here, the communication unit 901 may be implemented consistently with the communication unit 801, and the video composition unit 902 consistently with the video composition unit 802; these are not repeated here. In addition, the apparatus 900 may further include a speech synthesis unit 903, a subtitle generation unit 904, and an adding unit 905.
When a material element in the material set includes picture content and corresponding text content, the speech synthesis unit 903 can generate speech information corresponding to the text content, and the subtitle generation unit 904 can generate subtitle information corresponding to the speech information. On this basis, the adding unit 905 adds the speech information and the subtitle information to the generated video. Note that the more specific implementation of the apparatus 900 is consistent with the video composition method 600 and is not repeated here.
FIG. 10 shows a structural diagram of a computing device. As shown in FIG. 10, the computing device includes one or more processors (CPUs or GPUs) 1002, a communication module 1004, a memory 1006, a user interface 1010, and a communication bus 1008 for interconnecting these components.
The processor 1002 can receive and send data through the communication module 1004 to implement network communication and/or local communication.
The user interface 1010 includes one or more output devices 1012, which include one or more speakers and/or one or more visual displays. The user interface 1010 also includes one or more input devices 1014, such as a keyboard, a mouse, a voice command input unit or microphone, a touchscreen display, a touch-sensitive input pad, a gesture-capture camera, or other input buttons or controls.
The memory 1006 may be high-speed random access memory, such as DRAM, SRAM, DDR RAM, or other random-access solid-state storage devices; or non-volatile memory, such as one or more magnetic disk storage devices, optical disc storage devices, flash memory devices, or other non-volatile solid-state storage devices.
The memory 1006 stores a set of instructions executable by the processor 1002, including:
an operating system 1016, including programs for handling various basic system services and for performing hardware-related tasks; and
applications 1018, including various programs for implementing the above methods; these programs can implement the processing flows of the examples above. When the computing device of FIG. 10 is implemented as the terminal device 110, the applications 1018 may include a video material processing application according to the present application, which may include the apparatus 700 for processing video material shown in FIG. 7. When the computing device of FIG. 10 is implemented as the server 120, the applications 1018 may include a video composition application, which may include, for example, the video composition apparatus 800 of FIG. 8 or the video composition apparatus 900 of FIG. 9.
In addition, each example of the present application can be implemented by a data processing program executed by a data processing device such as a computer; such a data processing program evidently constitutes the present application. Further, a data processing program usually stored on a storage medium is executed by reading the program directly from the storage medium, or by installing or copying the program to a storage device (such as a hard disk or memory) of the data processing device; such a storage medium therefore also constitutes the present invention. The storage medium may use any type of recording method, for example a paper storage medium (such as paper tape), a magnetic storage medium (such as a floppy disk, hard disk, or flash memory), an optical storage medium (such as a CD-ROM), or a magneto-optical storage medium (such as an MO disc).
The present application therefore also discloses a non-volatile storage medium storing a data processing program for performing any of the examples of the above methods of the present application.
In addition, the method steps described in this application may be implemented not only by a data processing program but also by hardware, for example by logic gates, switches, application-specific integrated circuits (ASICs), programmable logic controllers, embedded microcontrollers, and the like. Hardware that can implement the methods described herein may therefore also constitute the present application.
The above descriptions are merely illustrative examples of the present application and are not intended to limit it. Any modification, equivalent replacement, improvement, and the like made within the spirit and principles of the present application shall fall within the scope of protection of the present application.

Claims (16)

  1. A method for processing video material, performed by a terminal device, the method comprising:
    acquiring a material set of a video to be synthesized and determining attributes of the material set, wherein the material set comprises a plurality of material elements, each material element comprises at least one type of media content among pictures, text, audio, and video, and the attributes comprise a play order and a play duration of each material element in the material set;
    determining an effect parameter corresponding to the material set, the effect parameter corresponding to a video effect mode; and
    transmitting the material set and the effect parameter to a video composition server, so that the video composition server synthesizes the plurality of material elements in the material set into a video corresponding to the video effect mode according to the effect parameter and the attributes of the material set.
  2. The method of claim 1, wherein acquiring the material set of the video to be synthesized comprises:
    providing a user interface for acquiring material elements, the user interface comprising at least one control each corresponding to a media type, the media types comprising at least one of text, picture, audio, and video; and
    in response to an operation on any control in the user interface, acquiring media content corresponding to the media type of that control and using it as one item of media content of a material element in the material set.
  3. The method of claim 2, wherein, in response to the operation on any control in the user interface, acquiring the media content corresponding to the media type of that control and using it as one item of media content of a material element in the material set comprises:
    in response to an operation on a picture control in the user interface, acquiring a picture and using it as the picture content of a material element of the material set.
  4. The method of claim 2, wherein, in response to the operation on any control in the user interface, acquiring the media content of the media type corresponding to that control and using it as one item of media content of a material element in the material set comprises:
    in response to an operation on a video control in the user interface, acquiring a video clip and using it as the video content of a material element of the material set.
  5. The method of claim 1, wherein acquiring the material set of the video to be synthesized comprises:
    acquiring a segment of video;
    extracting, according to a video clipping algorithm, at least one video segment from the video and generating description information for each video segment;
    providing a user interface displaying the description information of each video segment, so that a user performs segment selection according to the description information of each video segment; and
    in response to a selection operation on the at least one video segment, using each selected video segment as the video content of one material element in the material set.
  6. The method of claim 5, wherein extracting, according to the video clipping algorithm, at least one video segment from the video and generating the description information of each video segment comprises:
    determining at least one key image frame of the video;
    for each key image frame, extracting from the video a video segment containing the key image frame, the video segment comprising an audio segment; and
    performing speech recognition on the audio segment to obtain corresponding text, and generating the description information corresponding to the video segment according to the text.
  7. The method of claim 1, wherein determining the attributes of the material set comprises:
    providing a user interface presenting a thumbnail corresponding to each material element in the material set, the thumbnails being arranged in sequence within a corresponding display area of the user interface; and
    adjusting an arrangement order of the elements in the material set in response to a move operation on a thumbnail in the user interface, and using the adjusted arrangement order as the play order of the material set.
  8. The method of claim 1, wherein determining the effect parameter corresponding to the material set, the effect parameter corresponding to a video effect mode, comprises:
    providing a user interface containing a plurality of effect options, wherein each effect option corresponds to one effect parameter;
    in response to a preview operation on any of the plurality of effect options, displaying a corresponding preview rendering in the user interface; and
    in response to a selection operation on any of the plurality of effect options, using the effect parameter corresponding to the selected effect option as the effect parameter corresponding to the material set.
  9. The method of claim 1, wherein transmitting the material set and the effect parameter to the video composition server, so that the video composition server synthesizes the plurality of material elements in the material set into the video corresponding to the video effect mode according to the effect parameter and the attributes of the material set, comprises: sending a video composition request to the video composition server, the video composition request comprising the material set and the effect parameter, so that the video composition server, in response to the video composition request, synthesizes the plurality of material elements in the material set into the video corresponding to the video effect mode.
  10. A video composition method, performed by a server, the method comprising:
    acquiring, from a material processing application, a material set of a video to be synthesized and an effect parameter for the material set, wherein the material set comprises a plurality of material elements, each material element comprises at least one type of media content among pictures, text, audio, and video, the attributes of the material set comprise a play order and a play duration of each material element in the material set, and the effect parameter corresponds to a video effect mode; and
    synthesizing the plurality of material elements in the material set into a video in the video effect mode according to the effect parameter and the attributes of the material set.
  11. The method of claim 10, wherein, when a material element in the material set comprises picture content and corresponding text content, the method further comprises:
    generating speech information corresponding to the text content;
    generating subtitle information corresponding to the speech information; and
    adding the speech information and the subtitle information to the video.
  12. The method of claim 10, wherein synthesizing, according to the effect parameter and the attributes of the material set, the plurality of material elements in the material set into the video in the video effect mode comprises:
    determining, based on a plurality of video composition scripts to be executed in a predetermined video composition application, a plurality of rendering stages corresponding to the effect parameter, wherein each video composition script corresponds to one video composition effect, each rendering stage comprises at least one of the plurality of video composition scripts, and the rendering result of each rendering stage serves as the input content of the next rendering stage; and
    rendering the material set based on the plurality of rendering stages to generate the video.
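The staged rendering of claim 12 can be sketched as follows: the effect parameter selects an ordered list of rendering stages, each stage groups one or more composition scripts (one composition effect per script), and each stage's rendering result becomes the input content of the next stage. The stage table and the toy scripts below are invented for the example:

```python
# Sketch of staged rendering: stages are selected by the effect parameter,
# each stage holds >= 1 composition script, and each stage's result feeds
# the next stage. The STAGES table and lambda "scripts" are assumptions.
from typing import Callable, Dict, List

Script = Callable[[dict], dict]  # one video composition effect per script

STAGES: Dict[str, List[List[Script]]] = {
    "slideshow": [
        [lambda c: {**c, "scaled": True}],        # stage 1: scale the frames
        [lambda c: {**c, "transition": "fade"},   # stage 2: transition effect
         lambda c: {**c, "music": "bgm.mp3"}],    #          plus background music
    ],
}

def render(material_set: dict, effect_parameter: str) -> dict:
    content = material_set
    for stage in STAGES[effect_parameter]:
        for script in stage:          # run every script of the current stage
            content = script(content)
        # content now holds this stage's result, i.e. the next stage's input
    return content
```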
  13. The method of claim 10, wherein synthesizing, according to the effect parameter and the attributes of the material set, the plurality of material elements in the material set into the video in the video effect mode comprises:
    normalizing the material set so that each material element is converted into a predetermined format, the predetermined format comprising an image encoding format, an image playback frame rate, and an image size; and
    synthesizing the normalized material set into the video according to the effect parameter.
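A sketch of the normalization step of claim 13 using the ffmpeg command line follows; the concrete choices of H.264 encoding, 30 fps, and a 1280x720 frame are example values for the predetermined format, not values fixed by the claim:

```python
# Sketch: convert one material element to a predetermined format before
# composition. The specific codec, frame rate and size are example choices.
import subprocess

def normalize(input_path: str, output_path: str) -> None:
    subprocess.run(
        ["ffmpeg", "-y", "-i", input_path,
         "-c:v", "libx264",          # predetermined image encoding format
         "-r", "30",                 # predetermined image playback frame rate
         "-vf", "scale=1280:720",    # predetermined image size
         output_path],
        check=True,
    )
```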
  14. A terminal device, comprising a processor and a memory, the memory storing computer-readable instructions that cause the processor to perform the method of any one of claims 1-9.
  15. A server, comprising a processor and a memory, the memory storing computer-readable instructions that cause the processor to perform the method of any one of claims 10-13.
  16. A non-volatile storage medium storing a data processing program which, when executed by a computing device, causes the computing device to perform the method of any one of claims 1-13.
PCT/CN2018/114100 2017-11-06 2018-11-06 Video material processing method, video synthesis method, terminal device and storage medium WO2019086037A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201711076478.2A CN107770626B (en) 2017-11-06 2017-11-06 Video material processing method, video synthesizing device and storage medium
CN201711076478.2 2017-11-06

Publications (1)

Publication Number Publication Date
WO2019086037A1 (en) 2019-05-09

Family

ID=61273334

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2018/114100 WO2019086037A1 (en) 2017-11-06 2018-11-06 Video material processing method, video synthesis method, terminal device and storage medium

Country Status (2)

Country Link
CN (1) CN107770626B (en)
WO (1) WO2019086037A1 (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112532896A (en) * 2020-10-28 2021-03-19 北京达佳互联信息技术有限公司 Video production method, video production device, electronic device and storage medium
US20220417591A1 (en) * 2020-03-24 2022-12-29 Beijing Dajia Internet Information Technology Co., Ltd. Video rendering method and apparatus, electronic device, and storage medium

Families Citing this family (47)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107770626B (en) * 2017-11-06 2020-03-17 腾讯科技(深圳)有限公司 Video material processing method, video synthesizing device and storage medium
CN108540854A (en) * 2018-03-29 2018-09-14 努比亚技术有限公司 Live video clipping method, terminal and computer readable storage medium
CN108536790A (en) * 2018-03-30 2018-09-14 北京市商汤科技开发有限公司 The generation of sound special efficacy program file packet and sound special efficacy generation method and device
CN108495171A (en) * 2018-04-03 2018-09-04 优视科技有限公司 Method for processing video frequency and its device, storage medium, electronic product
CN114125512B (en) * 2018-04-10 2023-01-31 腾讯科技(深圳)有限公司 Promotion content pushing method and device and storage medium
CN108924584A (en) * 2018-05-30 2018-11-30 互影科技(北京)有限公司 The packaging method and device of interactive video
CN108900927A (en) * 2018-06-06 2018-11-27 芽宝贝(珠海)企业管理有限公司 The generation method and device of video
CN108986227B (en) * 2018-06-28 2022-11-29 北京市商汤科技开发有限公司 Particle special effect program file package generation method and device and particle special effect generation method and device
CN108900897B (en) * 2018-07-09 2021-10-15 腾讯科技(深圳)有限公司 Multimedia data processing method and device and related equipment
CN109168027B (en) * 2018-10-25 2020-12-11 北京字节跳动网络技术有限公司 Instant video display method and device, terminal equipment and storage medium
CN109658483A (en) * 2018-11-20 2019-04-19 北京弯月亮科技有限公司 The generation system and generation method of Video processing software data file
CN109379643B (en) * 2018-11-21 2020-06-09 北京达佳互联信息技术有限公司 Video synthesis method, device, terminal and storage medium
WO2020107297A1 (en) * 2018-11-28 2020-06-04 深圳市大疆创新科技有限公司 Video clipping control method, terminal device, system
KR20230173220A (en) 2019-01-18 2023-12-26 Snap Inc. Systems and methods for generating personalized videos with customized text messages
CN109819179B (en) * 2019-03-21 2022-02-01 腾讯科技(深圳)有限公司 Video editing method and device
CN110336960B (en) * 2019-07-17 2021-12-10 广州酷狗计算机科技有限公司 Video synthesis method, device, terminal and storage medium
CN110445992A (en) * 2019-08-16 2019-11-12 深圳特蓝图科技有限公司 A kind of video clipping synthetic method based on XML
CN112822541B (en) 2019-11-18 2022-05-20 北京字节跳动网络技术有限公司 Video generation method and device, electronic equipment and computer readable medium
CN111010591B (en) * 2019-12-05 2021-09-17 北京中网易企秀科技有限公司 Video editing method, browser and server
CN111883099B (en) * 2020-04-14 2021-10-15 北京沃东天骏信息技术有限公司 Audio processing method, device, system, browser module and readable storage medium
CN111479158B (en) * 2020-04-16 2022-06-10 北京达佳互联信息技术有限公司 Video display method and device, electronic equipment and storage medium
CN111416991B (en) * 2020-04-28 2022-08-05 Oppo(重庆)智能科技有限公司 Special effect processing method and apparatus, and storage medium
CN111614912B (en) * 2020-05-26 2023-10-03 北京达佳互联信息技术有限公司 Video generation method, device, equipment and storage medium
CN111710021A (en) * 2020-05-26 2020-09-25 珠海九松科技有限公司 Method and system for generating dynamic video based on static medical materials
CN111787395B (en) * 2020-05-27 2023-04-18 北京达佳互联信息技术有限公司 Video generation method and device, electronic equipment and storage medium
CN111831615B (en) * 2020-05-28 2024-03-12 北京达佳互联信息技术有限公司 Method, device and system for generating video file
CN111683280B (en) * 2020-06-04 2024-06-21 腾讯科技(深圳)有限公司 Video processing method and device and electronic equipment
CN111767414A (en) * 2020-06-12 2020-10-13 上海传英信息技术有限公司 Dynamic image generation method and device
CN113838490B (en) * 2020-06-24 2022-11-11 华为技术有限公司 Video synthesis method and device, electronic equipment and storage medium
CN111951357A (en) * 2020-08-11 2020-11-17 深圳市前海手绘科技文化有限公司 Application method of sound material in hand-drawn animation
CN112040271A (en) * 2020-09-04 2020-12-04 杭州七依久科技有限公司 Cloud intelligent editing system and method for visual programming
CN114390354B (en) * 2020-10-21 2024-05-10 西安诺瓦星云科技股份有限公司 Program production method, device and system and computer readable storage medium
CN112287168A (en) * 2020-10-30 2021-01-29 北京有竹居网络技术有限公司 Method and apparatus for generating video
CN112632326B (en) * 2020-12-24 2022-02-18 北京风平科技有限公司 Video production method and device based on video script semantic recognition
CN112969092B (en) * 2021-01-29 2022-05-10 稿定(厦门)科技有限公司 Video file playing system
CN113055730B (en) * 2021-02-07 2023-08-18 深圳市欢太科技有限公司 Video generation method, device, electronic equipment and storage medium
CN115209215A (en) * 2021-04-09 2022-10-18 北京字跳网络技术有限公司 Video processing method, device and equipment
CN113810538B (en) * 2021-09-24 2023-03-17 维沃移动通信有限公司 Video editing method and video editing device
CN113992940B (en) * 2021-12-27 2022-03-29 北京美摄网络科技有限公司 Web end character video editing method, system, electronic equipment and storage medium
CN113986087B (en) * 2021-12-27 2022-04-12 深圳市大头兄弟科技有限公司 Video rendering method based on subscription
CN114286164B (en) * 2021-12-28 2024-02-09 北京思明启创科技有限公司 Video synthesis method and device, electronic equipment and storage medium
CN114401377A (en) * 2021-12-30 2022-04-26 杭州摸象大数据科技有限公司 Financial marketing video generation method and device, computer equipment and storage medium
CN114466222B (en) * 2022-01-29 2023-09-26 北京百度网讯科技有限公司 Video synthesis method and device, electronic equipment and storage medium
CN114979054B (en) * 2022-05-13 2024-06-18 维沃移动通信有限公司 Video generation method, device, electronic equipment and readable storage medium
CN115129212A (en) * 2022-05-30 2022-09-30 腾讯科技(深圳)有限公司 Video editing method, video editing device, computer equipment, storage medium and product
CN116634058B (en) * 2022-05-30 2023-12-22 荣耀终端有限公司 Editing method of media resources, electronic equipment and readable storage medium
CN115134659B (en) * 2022-06-15 2024-06-25 阿里巴巴云计算(北京)有限公司 Video editing and configuring method, device, browser, electronic equipment and storage medium

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20060233514A1 (en) * 2005-04-14 2006-10-19 Shih-Hsiung Weng System and method of video editing
CN101086886A (en) * 2006-06-07 2007-12-12 索尼株式会社 Recording system and recording method
US20080247726A1 (en) * 2007-04-04 2008-10-09 Nhn Corporation Video editor and method of editing videos
CN103928039A (en) * 2014-04-15 2014-07-16 北京奇艺世纪科技有限公司 Video compositing method and device
CN104780439A (en) * 2014-01-15 2015-07-15 腾讯科技(深圳)有限公司 Video processing method and device
CN105657538A (en) * 2015-12-31 2016-06-08 北京东方云图科技有限公司 Method and device for synthesizing video file by mobile terminal
CN105679347A (en) * 2016-01-07 2016-06-15 北京东方云图科技有限公司 Method and apparatus for making video file through programming process
CN107193841A (en) * 2016-03-15 2017-09-22 北京三星通信技术研究有限公司 Media file accelerates the method and apparatus played, transmit and stored
CN107770626A (en) * 2017-11-06 2018-03-06 腾讯科技(深圳)有限公司 Processing method, image synthesizing method, device and the storage medium of video material

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107085612A (en) * 2017-05-15 2017-08-22 腾讯科技(深圳)有限公司 media content display method, device and storage medium

Also Published As

Publication number Publication date
CN107770626B (en) 2020-03-17
CN107770626A (en) 2018-03-06

Similar Documents

Publication Publication Date Title
WO2019086037A1 (en) Video material processing method, video synthesis method, terminal device and storage medium
US11943486B2 (en) Live video broadcast method, live broadcast device and storage medium
WO2022048478A1 (en) Multimedia data processing method, multimedia data generation method, and related device
WO2020077856A1 (en) Video photographing method and apparatus, electronic device and computer readable storage medium
WO2020029526A1 (en) Method for adding special effect to video, device, terminal apparatus, and storage medium
WO2020077855A1 (en) Video photographing method and apparatus, electronic device and computer readable storage medium
EP3195601B1 (en) Method of providing visual sound image and electronic device implementing the same
US11670339B2 (en) Video acquisition method and device, terminal and medium
JP2022552344A (en) MOVIE FILE GENERATION METHOD, DEVICE, TERMINAL AND STORAGE MEDIUM
WO2019047878A1 (en) Method for controlling terminal by voice, terminal, server and storage medium
CN112804459A (en) Image display method and device based on virtual camera, storage medium and electronic equipment
WO2023104102A1 (en) Live broadcasting comment presentation method and apparatus, and device, program product and medium
JP2004288197A (en) Interface for presenting data expression in screen area inset
WO2010102525A1 (en) Method for generating gif, and system and media player thereof
CN111629253A (en) Video processing method and device, computer readable storage medium and electronic equipment
EP3024223B1 (en) Videoconference terminal, secondary-stream data accessing method, and computer storage medium
WO2022000983A1 (en) Video processing method and apparatus, and electronic device and storage medium
CA3001480C (en) Video-production system with dve feature
WO2019227429A1 (en) Method, device, apparatus, terminal, server for generating multimedia content
JP2005051703A (en) Live streaming broadcasting method, live streaming broadcasting apparatus, live streaming broadcasting system, program, recording medium, broadcasting method, and broadcasting apparatus
US10698744B2 (en) Enabling third parties to add effects to an application
JP2022145503A (en) Live distribution information processing method, apparatus, electronic device, storage medium, and program
US9569546B2 (en) Sharing of documents with semantic adaptation across mobile devices
CN116126177A (en) Data interaction control method and device, electronic equipment and storage medium
WO2023182937A2 (en) Special effect video determination method and apparatus, electronic device and storage medium

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 18873412

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 18873412

Country of ref document: EP

Kind code of ref document: A1