WO2022213801A1 - Video processing method, apparatus and device - Google Patents

Video processing method, apparatus and device

Info

Publication number
WO2022213801A1
WO2022213801A1 · PCT/CN2022/082095 · CN2022082095W
Authority
WO
WIPO (PCT)
Prior art keywords
video
timestamp
target
placeholder
template
Prior art date
Application number
PCT/CN2022/082095
Other languages
English (en)
French (fr)
Inventor
诸葛晶晶
李耔余
沈言浩
倪光耀
Original Assignee
北京字跳网络技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 北京字跳网络技术有限公司
Priority to US18/551,967 (published as US20240177374A1)
Publication of WO2022213801A1

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/44Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs
    • H04N21/44012Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs involving rendering scenes according to scene graphs, e.g. MPEG-4 scene graphs
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T11/002D [Two Dimensional] image generation
    • G06T11/001Texturing; Colouring; Generation of texture or colour
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/44Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/45Management operations performed by the client for facilitating the reception of or the interaction with the content or administrating data related to the end-user or to the client device itself, e.g. learning user preferences for recommending movies, resolving scheduling conflicts
    • H04N21/458Scheduling content for creating a personalised stream, e.g. by combining a locally stored advertisement with an incoming stream; Updating operations, e.g. for OS modules ; time-related management operations
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2210/00Indexing scheme for image generation or computer graphics
    • G06T2210/32Image data format

Definitions

  • the embodiments of the present disclosure relate to the technical field of video processing, and in particular, to a video processing method, apparatus, device, storage medium, computer program product, and computer program.
  • Embodiments of the present disclosure provide a video processing method, apparatus, device, storage medium, computer program product, and computer program. The method can render various types of materials, including text, pictures, and videos, based on placeholders to generate rendered videos containing multiple material types, improving the user experience.
  • an embodiment of the present disclosure provides a video processing method, including:
  • the video template includes a plurality of placeholders, wherein each of the placeholders is used to indicate at least one of text, pictures and videos;
  • the types of the multiple materials include at least one of text, pictures, and videos;
  • the multiple materials are respectively imported into the corresponding placeholder positions in the video template and rendered to obtain a synthesized video.
  • an embodiment of the present disclosure provides a video processing apparatus, including:
  • a receiving module for receiving a video generation request
  • the first obtaining module is configured to obtain a video template according to the video generation request, wherein the video template includes a plurality of placeholders, wherein each of the placeholders is used to indicate at least one of text, pictures, and videos;
  • a second acquiring module configured to acquire multiple materials according to the video generation request, wherein the types of the multiple materials include at least one of text, pictures and videos;
  • a rendering module configured to import the multiple materials into the corresponding placeholder positions in the video template based on the types of the materials and render them to obtain a synthesized video.
  • embodiments of the present disclosure provide an electronic device, including: a processor and a memory;
  • the memory stores computer-executable instructions
  • the processor executes the computer-executable instructions, so that the electronic device executes the video processing method described in the first aspect.
  • an embodiment of the present disclosure provides a computer-readable storage medium, where computer-executable instructions are stored in the computer-readable storage medium, and when a processor executes the computer-executable instructions, the implementation is as described in the first aspect above video processing method.
  • an embodiment of the present disclosure provides a computer program product, including a computer program that, when executed by a processor, implements the video processing method described in the first aspect.
  • an embodiment of the present disclosure provides a computer program that, when executed by a processor, implements the video processing method described in the first aspect.
  • the method first receives a video generation request from a user, and then acquires a video template and multiple materials according to the video generation request, wherein the video template includes a plurality of placeholders, each placeholder is used to indicate at least one of text, pictures, and videos, and the multiple materials include at least one of text, pictures, and videos; then, based on the material types, the multiple materials are imported into the positions of the corresponding placeholders in the video template and rendered to obtain the synthesized video.
  • the embodiments of the present disclosure render multiple types of materials including text, pictures, and videos based on placeholders to obtain rendered videos, and can generate multiple types of rendered videos, which improves user experience.
  • FIG. 1 is a schematic scene diagram of a video processing method according to an embodiment of the present disclosure
  • FIG. 2 is a schematic flowchart 1 of a video processing method provided by an embodiment of the present disclosure
  • FIG. 3 is a second schematic flowchart of a video processing method provided by an embodiment of the present disclosure.
  • FIG. 4 is a structural block diagram of a video processing apparatus provided by an embodiment of the present disclosure.
  • FIG. 5 is a schematic diagram of a hardware structure of an electronic device according to an embodiment of the present disclosure.
  • an embodiment of the present disclosure provides a video processing method: obtaining a video template according to a user request, where the video template includes a plurality of placeholders and each placeholder is used to indicate at least one of text, pictures, and videos; obtaining multiple materials according to the user request, where the multiple materials include at least one of text, pictures, and videos; and importing the various types of materials, including text, pictures, and videos, into the positions of the corresponding placeholders in the video template and rendering them to obtain a synthesized video.
  • the placeholders in the video template are used to render various types of materials including text, pictures and videos to obtain a rendered video, which improves the user experience.
  • FIG. 1 is a schematic scene diagram of a video processing method provided by an embodiment of the present disclosure.
  • the system provided in this embodiment includes a client 101 and a server 102 .
  • the client 101 may be installed on devices such as mobile phones, tablet computers, personal computers, wearable electronic devices, and smart home devices. This embodiment does not specifically limit the implementation of the client 101, as long as the client 101 can perform input and output interaction with the user.
  • Server 102 may comprise a single server or a cluster of several servers.
  • FIG. 2 is a first schematic flowchart of a video processing method provided by an embodiment of the present disclosure.
  • the method of this embodiment can be applied to the server shown in FIG. 1 , and the video processing method includes:
  • the client may be installed on a terminal device, such as a personal computer, a tablet computer, a mobile phone, a wearable electronic device, a smart home device, or other devices.
  • the client can send the user's video generation request to the server.
  • S202 Acquire a video template according to the video generation request, wherein the video template includes multiple placeholders, wherein each placeholder is used to indicate at least one of text, picture and video.
  • a video template may be obtained from a video template library, where the video template includes multiple placeholders.
  • each placeholder may be set with a corresponding type label, and the type label is used to indicate that the corresponding placeholder belongs to at least one type of text, picture and video.
  • the placeholder may have a preset format and include at least one of the following parameters: a type label, used to indicate the type of material supported by the placeholder (e.g., at least one of text, pictures, and videos); and a placeholder identifier, used to indicate the corresponding rendering effect, material resolution, and the like.
  • when the placeholder supports video-type materials, the placeholder may further include the following parameters: the start time of the video material, the end time of the video material, the start time in the video to be synthesized, and the end time in the video to be synthesized.
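The placeholder parameters described above can be sketched as a small data structure. The following Python sketch is illustrative only; the field names (`identifier`, `type_label`, `src_in`, etc.) are assumptions, not names used in the disclosure:

```python
from dataclasses import dataclass
from enum import Flag, auto
from typing import Optional

class MaterialType(Flag):
    """Material types a placeholder can indicate; Flag allows combinations."""
    TEXT = auto()
    PICTURE = auto()
    VIDEO = auto()

@dataclass
class Placeholder:
    """One placeholder in a video template (field names are illustrative)."""
    identifier: str                   # maps to a rendering effect, resolution, etc.
    type_label: MaterialType          # material type(s) this placeholder supports
    # The four extra parameters below apply only when VIDEO is supported:
    src_in: Optional[float] = None    # start time of the video material (s)
    src_out: Optional[float] = None   # end time of the video material (s)
    dest_in: Optional[float] = None   # start time in the video to be synthesized (s)
    dest_out: Optional[float] = None  # end time in the video to be synthesized (s)

# A title slot that accepts either text or a picture:
slot = Placeholder("ph_title", MaterialType.TEXT | MaterialType.PICTURE)
```

Using a `Flag` lets one placeholder advertise support for several material types at once, matching the "at least one of text, pictures, and videos" wording above.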
  • acquiring a video template from a video template library may include the following two methods:
  • a video template may be randomly selected from the video template library in response to the video generation request.
  • the user can select the corresponding video template or video template type on the client side, and the user selection information is added to the video generation request according to the user's selection on the client; after receiving the video generation request, the server parses it to obtain the user selection information and selects, from the video template library, the video template determined by the user at the client according to the user selection information.
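The two selection methods above (random selection, or honoring the user's choice parsed from the request) might be combined as in this illustrative Python sketch; the function name, library shape, and keys are assumptions, not part of the disclosure:

```python
import random

def pick_template(template_library, user_selection=None, rng=random):
    """Return a video template: honor the user's selection parsed from the
    video generation request when present, otherwise pick one at random."""
    if user_selection is not None and user_selection in template_library:
        return template_library[user_selection]
    return rng.choice(list(template_library.values()))

# A toy template library keyed by template name:
library = {"birthday": {"id": "birthday"}, "travel": {"id": "travel"}}
chosen = pick_template(library, user_selection="travel")
```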
  • the process of creating a video template includes:
  • obtaining video template production materials, where the video template production materials include at least one of rendering materials and cutscenes; pre-adding multiple placeholders; and creating the video template according to the video template production materials and the pre-added multiple placeholders.
  • the pre-added placeholders are used to indicate at least one of the following three types of information: pictures, text, and videos.
  • the text includes letters, numbers, symbols, and the like.
  • S203 Acquire multiple materials according to the video generation request, where the types of the multiple materials include at least one of text, picture, and video.
  • the material may be obtained from a material library (which may include a database).
  • the multiple materials may include at least one of multiple types of text, pictures, and videos, and correspondingly, the placeholders in the video template may also include placeholders used to indicate at least one of the multiple types of text, pictures, and videos.
  • based on the type of each material, each material is imported into the position of the placeholder corresponding to its type so that the material replaces the placeholder, and the image frames of the video template with the imported materials are rendered frame by frame to obtain the synthesized video.
  • the user's video generation request is received first, and then a video template and multiple materials are obtained according to the video generation request, wherein the video template includes multiple placeholders, each placeholder is used to indicate at least one of text, pictures, and videos, and the multiple materials include at least one of text, pictures, and videos; then, based on the material types, the multiple materials are imported into the corresponding placeholder positions in the video template and rendered to obtain a synthesized video.
  • the embodiments of the present disclosure render multiple types of materials including text, pictures, and videos based on placeholders to obtain rendered videos, which can provide users with rendered videos including multiple types of materials, thereby improving user experience.
  • the video processing method according to the embodiments of the present disclosure has been described above in conjunction with the server. Those skilled in the art should understand that the method can also be executed by a device with the client installed, or by an integrated all-in-one device that performs both the server function and the client function. For the sake of brevity, the specific steps will not be repeated.
  • FIG. 3 is a second schematic flowchart of a video processing method provided by an embodiment of the present disclosure.
  • the above-mentioned material includes a first type label, which is used to indicate the type of the material; the placeholder includes a second type label, which is used to indicate the type indicated by the placeholder.
  • step S204 based on the material type, multiple materials are imported into the corresponding placeholder positions in the video template and rendered to obtain a synthesized video, which may include:
  • S301 Filter out target materials and target placeholders whose first type labels are consistent with the second type labels.
  • the placeholders in the video template can be identified, which specifically includes: acquiring each video template image frame from the video template according to the video timestamp of the video to be synthesized, determining whether a placeholder exists in each video template image frame, and, if one exists, identifying the placeholder and obtaining its second type label.
  • the consistency between the second type label and the first type label may include consistency of the type information indicated by the label.
  • both the first type labels and the second type labels may be implemented as tags.
  • each material includes a corresponding first type tag
  • the first type tag may be added when the material is generated.
  • the first type tag may be the first number information of each material, and the first number information may be customized by the client to indicate any material.
  • the placeholder in the video template may be a placeholder added when the video template is produced, and each placeholder is configured with a predefined second type label, where each predefined second type label is used to indicate the type of material that the placeholder matches.
  • the second type tag may be second number information that matches the first number information of the material.
  • all placeholders may be traversed according to the first type label of the material until a second type label consistent with the first type label of the material is queried.
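The traversal described above can be sketched as a linear search over the placeholders; the dictionary keys used here (`first_type_label`, `second_type_label`, `id`) are illustrative assumptions:

```python
def match_placeholder(material, placeholders):
    """Traverse all placeholders until one whose second type label is
    consistent with the material's first type label is found."""
    for ph in placeholders:
        if ph["second_type_label"] == material["first_type_label"]:
            return ph  # target placeholder for this target material
    return None        # no placeholder matches this material

placeholders = [{"id": "ph0", "second_type_label": "picture"},
                {"id": "ph1", "second_type_label": "text"}]
caption = {"name": "caption", "first_type_label": "text"}
```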
  • the specific screening process is similar to the above and, for the sake of brevity, will not be repeated here.
  • materials can be classified into three types: text materials, picture materials, and video materials.
  • different preprocessing methods are used to process the target material before it is imported into the position of the corresponding target placeholder in the video template.
  • S303 Render the image frame of the video template imported into the target material to obtain a synthesized video.
  • each frame of the video template imported into the target material is rendered to obtain a synthesized video.
  • the video template into which the target material is imported has a target placeholder, and a corresponding rendering effect is used for rendering according to the target placeholder.
  • the renderer corresponding to the target placeholder of the video template is identified, and the image frames of the video template with each imported material are rendered according to the rendering effect of the renderer.
  • the renderer may include a shader renderer, and the shader renderer is used to indicate rendering effect attributes such as the position, shape, transparency, and dynamic effect of the placeholder material.
  • the material can be imported into the corresponding position of the video template, and corresponding rendering can be performed to improve the presentation effect of the synthesized video.
  • step S302, in which each target material is preprocessed and then imported into the position of the target placeholder in the video template, is mainly described in detail. The specific details are as follows:
  • typesetting processing may be performed on the text material according to characteristics such as the size or shape of the placeholder, and the typeset text material may be converted into a texture format.
  • if the target material includes a picture material, the picture material is converted into a texture format and then imported into the position of the target placeholder in the video template.
  • the image file format may include BMP, TGA, JPG, GIF, PNG, and other formats; after conversion, the texture format may include R5G6B5, A4R4G4B4, A1R5G5B5, R8G8B8, A8R8G8B8, and other formats.
  • a known or future-developed texture-conversion method can be used to convert the picture material into a texture; the present disclosure does not limit the specific texture-conversion method.
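As one concrete example of such a conversion, packing an 8-bit-per-channel RGB pixel into the 16-bit R5G6B5 texture format listed above keeps the top 5, 6, and 5 bits of the red, green, and blue channels respectively. This is a per-pixel sketch, not the disclosure's actual conversion routine:

```python
def rgb888_to_r5g6b5(r, g, b):
    """Pack one 8-bit-per-channel pixel into 16-bit R5G6B5 by keeping the
    top 5 red bits, 6 green bits, and 5 blue bits."""
    return ((r >> 3) << 11) | ((g >> 2) << 5) | (b >> 3)

# White maps to all 16 bits set; pure red occupies the top 5 bits.
assert rgb888_to_r5g6b5(255, 255, 255) == 0xFFFF
assert rgb888_to_r5g6b5(255, 0, 0) == 0xF800
```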
  • if the target material includes a video material, an image frame is extracted from the video material, converted into a texture format, and then imported into the position of the target placeholder in the video template.
  • image frames of the corresponding video material need to be screened out from the video material according to the timestamp of the video to be synthesized.
  • the specific process of extracting image frames from the video material includes: determining the first start timestamp and the first end timestamp of the video material in the video to be synthesized; determining the second start timestamp and the second end timestamp indicated by the placeholder; calculating, according to the timestamp of the currently rendered frame of the video to be synthesized, the first start timestamp and the first end timestamp, and the second start timestamp and the second end timestamp, the target timestamp of the image frame to be extracted from the video material; and extracting image frames from the video material according to the target timestamp.
  • calculating the target timestamp includes: obtaining the time length indicated by the placeholder according to the second end timestamp and the second start timestamp; obtaining the proportional time length of the target timestamp in the video material according to the product of that time length and the ratio of the difference between the timestamp of the currently rendered frame and the first start timestamp to the difference between the first end timestamp and the first start timestamp; and obtaining the target timestamp according to the second start timestamp and the proportional time length of the target timestamp in the video material.
  • the specific formula of the calculation process can be:
  • t_src = src_in + (curTime - dest_in) / (dest_out - dest_in) × (src_out - src_in)
  • t_src is the target timestamp of the extracted image frame; dest_in is the first start timestamp; dest_out is the first end timestamp; src_in is the second start timestamp; src_out is the second end timestamp; curTime is the timestamp of the currently rendered frame.
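Under these definitions, the proportional mapping can be written as a small function (an illustrative sketch of the calculation described above, not code from the disclosure):

```python
def target_timestamp(cur_time, dest_in, dest_out, src_in, src_out):
    """Map the currently rendered frame's timestamp in the video to be
    synthesized to a timestamp inside the video material: the elapsed
    fraction of the placeholder's span selects the same fraction of the
    material's in/out range."""
    ratio = (cur_time - dest_in) / (dest_out - dest_in)
    return src_in + ratio * (src_out - src_in)

# Halfway through a 0-10 s placeholder span maps to halfway through the
# 2-6 s material range, i.e. t_src = 4.0.
t_src = target_timestamp(cur_time=5.0, dest_in=0.0, dest_out=10.0,
                         src_in=2.0, src_out=6.0)
```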
  • extracting an image frame from the video material according to the target timestamp includes: if the time length of the video material is less than the time length indicated by the placeholder corresponding to the video material in the video template, returning to the starting point of the video material and continuing to extract image frames.
  • the time length indicated by the placeholder may be obtained according to the difference between the second start timestamp indicated by the placeholder and the second end timestamp.
  • image frames are extracted from the video material according to the target timestamp of the extracted image frame. If the time length of the video material is less than the time length indicated by the placeholder in the video template, extraction returns to the starting point of the video material and continues; that is, the index of the extracted image frame is idx = (t_src % T) × fps (% is the remainder operation), where T is the time length of the video material and fps is the frame rate.
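The looping extraction idx = (t_src % T) × fps can be sketched as follows (illustrative only):

```python
def looped_frame_index(t_src, material_length, fps):
    """idx = (t_src % T) * fps: when the target timestamp runs past the end
    of a short video material, wrap around to the material's start."""
    return int((t_src % material_length) * fps)

# At 30 fps, 2.5 s into a 10 s material is frame 75; a target timestamp of
# 12.5 s wraps back to the same frame.
assert looped_frame_index(2.5, 10.0, 30) == 75
assert looped_frame_index(12.5, 10.0, 30) == 75
```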
  • the video processing method provided by the above embodiment may be executed by the server, and the video generation request comes from the client. Accordingly, after the multiple materials are respectively imported into the corresponding placeholder positions in the video template and rendered to obtain the synthesized video in the above step S204, the method further includes: sending the synthesized video to the client. Sending the synthesized video to the client further improves the user experience.
  • FIG. 4 is a structural block diagram of a video processing apparatus provided by an embodiment of the present disclosure.
  • the apparatus includes: a receiving module 401 , a first obtaining module 402 , a second obtaining module 403 and a rendering module 404 .
  • the receiving module 401 is used for receiving a video generation request
  • the first obtaining module 402 is configured to obtain a video template according to the video generation request, wherein the video template includes a plurality of placeholders, wherein each of the placeholders is used to indicate at least one of text, pictures, and videos;
  • a second acquiring module 403, configured to acquire multiple materials according to the video generation request, wherein the types of the multiple materials include at least one of text, pictures, and videos;
  • the rendering module 404 is configured to, based on the types of the materials, respectively import the multiple materials into the corresponding placeholder positions in the video template and render them to obtain a synthesized video.
  • the material includes a first type tag, and the placeholder includes a second type tag;
  • the rendering module 404 includes:
  • a screening unit 4041 configured to filter out target materials and target placeholders whose labels of the first type are consistent with those of the second type;
  • Importing unit 4042 configured to import the target material into the position of the target placeholder in the video template after preprocessing
  • the rendering unit 4043 is configured to render the image frame of the video template imported into the target material to obtain the synthesized video.
  • the rendering unit 4043 includes:
  • the first rendering subunit 40431 is configured to, if the target material includes a text material, perform typesetting processing on the text material, convert it into a texture format, and import it into the position of the target placeholder in the video template;
  • the second rendering subunit 40432 is configured to, if the target material includes a picture material, convert the picture material into a texture format and import it into the position of the target placeholder in the video template;
  • the third rendering subunit 40433 is configured to, if the target material includes a video material, extract an image frame from the video material, convert the extracted image frame into a texture format, and import it into the position of the target placeholder in the video template.
  • the third rendering subunit 40433 is specifically configured to: determine the first start timestamp and the first end timestamp of the video material in the video to be synthesized; determine the second start timestamp and the second end timestamp indicated by the placeholder; calculate, according to the timestamp of the currently rendered frame of the video to be synthesized, the first start timestamp and the first end timestamp, and the second start timestamp and the second end timestamp, the target timestamp of the image frame extracted from the video material; and extract the image frame from the video material according to the target timestamp.
  • the third rendering subunit 40433 is specifically configured to: obtain, according to the second end timestamp and the second start timestamp, the time length indicated by the placeholder; obtain the proportional time length of the target timestamp in the video material according to the product of that time length and the ratio of the difference between the timestamp of the current rendering frame and the first start timestamp to the difference between the first end timestamp and the first start timestamp; obtain the target timestamp according to the second start timestamp and the proportional time length of the target timestamp in the video material; and perform the calculation to extract the image frame from the video material.
  • the formula for the target timestamp is:
  • t_src = src_in + (curTime - dest_in) / (dest_out - dest_in) × (src_out - src_in)
  • t_src is the target timestamp of the extracted image frame; dest_in is the first start timestamp; dest_out is the first end timestamp; src_in is the second start timestamp; src_out is the second end timestamp; curTime is the timestamp of the currently rendered frame.
  • the third rendering subunit 40433 is further configured to, if the time length of the video material is less than the time length indicated by the placeholder corresponding to the video material in the video template, continue to extract image frames from the starting point of the video material.
  • the rendering unit 4043 is specifically configured to identify the renderer corresponding to the target placeholder in the video template, and to render the image frames of the video template with the imported target material according to the rendering effect of the renderer.
  • the apparatus further includes: a production module 405, configured to obtain video template production materials, wherein the video template production materials include at least one of rendering materials and cutscenes; pre-add the plurality of placeholders; and produce the video template according to the video template production materials and the pre-added plurality of placeholders.
  • the apparatus is applied to a server, the video generation request is from a client, and the apparatus further includes:
  • a sending module 406, configured to, after the multiple materials are respectively imported into the corresponding placeholder positions in the video template and rendered to obtain the synthesized video, send the synthesized video to the client.
  • the embodiments of the present disclosure further provide an electronic device.
  • the electronic device 500 may be a client device or a server.
  • the client device may include, but is not limited to, mobile clients such as mobile phones, notebook computers, digital broadcast receivers, personal digital assistants (Personal Digital Assistant, PDA), tablet computers (Portable Android Device, PAD), portable multimedia players (Portable Media Player, PMP), in-vehicle clients (such as in-vehicle navigation clients), and wearable electronic devices, as well as fixed clients such as digital TVs (Television), desktop computers, and smart home devices.
  • the electronic device shown in FIG. 5 is only an example, and should not impose any limitation on the function and scope of use of the embodiments of the present disclosure.
  • the electronic device 500 may include a processing device (such as a central processing unit or a graphics processor) 501, which can perform various appropriate actions and processes according to a program stored in a read-only memory (Read Only Memory, ROM for short) 502 or a program loaded from a storage device 508 into a random access memory (Random Access Memory, RAM for short) 503, thereby implementing the video processing method according to the embodiments of the present disclosure.
  • the RAM 503 also stores various programs and data required for the operation of the electronic device 500.
  • the processing device 501, the ROM 502, and the RAM 503 are connected to each other through a bus 504.
  • An Input/Output (I/O for short) interface 505 is also connected to the bus 504 .
  • the following devices can be connected to the I/O interface 505: input devices 506 including, for example, a touch screen, a touch pad, a keyboard, a mouse, a camera, a microphone, an accelerometer, and a gyroscope; output devices 507 including, for example, a liquid crystal display (Liquid Crystal Display, LCD for short), a speaker, and a vibrator; storage devices 508 including, for example, a magnetic tape and a hard disk; and a communication device 509. The communication device 509 may allow the electronic device 500 to communicate wirelessly or by wire with other devices to exchange data. While FIG. 5 shows the electronic device 500 having various devices, it should be understood that not all of the illustrated devices are required to be implemented or provided; more or fewer devices may alternatively be implemented or provided.
  • embodiments of the present disclosure include a computer program product comprising a computer program carried on a computer-readable medium, the computer program comprising program code for performing the method illustrated in the flowchart.
  • the computer program may be downloaded and installed from the network via the communication device 509, or from the storage device 508, or from the ROM 502.
  • when the computer program is executed by the processing device 501, the above-mentioned functions defined in the methods of the embodiments of the present disclosure are executed.
  • the computer-readable medium mentioned above in the present disclosure may be a computer-readable signal medium or a computer-readable storage medium, or any combination of the above two.
  • the computer-readable storage medium can be, for example, but not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus or device, or a combination of any of the above.
  • more specific examples of computer-readable storage media may include, but are not limited to: an electrical connection with one or more wires, a portable computer disk, a hard disk, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), an optical fiber, portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above.
  • a computer-readable storage medium may be any tangible medium that includes or stores a program that can be used by or in conjunction with an instruction execution system, apparatus, or device.
  • a computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave with computer-readable program code embodied thereon. Such propagated data signals may take a variety of forms, including but not limited to electromagnetic signals, optical signals, or any suitable combination of the foregoing.
  • a computer-readable signal medium can also be any computer-readable medium other than a computer-readable storage medium that can send, propagate, or transmit the program for use by or in connection with the instruction execution system, apparatus, or device.
  • the program code included on the computer-readable medium can be transmitted by any suitable medium, including but not limited to: electric wire, optical cable, radio frequency (RF for short), etc., or any suitable combination of the above.
  • the above-mentioned computer-readable medium may be included in the above-mentioned electronic apparatus; or may exist alone without being incorporated into the electronic apparatus.
  • the aforementioned computer-readable medium carries one or more programs, and when the aforementioned one or more programs are executed by the electronic device, causes the electronic device to execute the methods shown in the foregoing embodiments.
  • computer program code for carrying out operations of the present disclosure may be written in one or more programming languages or combinations thereof, including object-oriented programming languages such as Java, Smalltalk, and C++, as well as conventional procedural programming languages such as the "C" language or similar programming languages.
  • the program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server.
  • in the case involving a remote computer, the remote computer can be connected to the user's computer through any kind of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or can be connected to an external computer (for example, through the Internet using an Internet service provider).
  • each block in the flowcharts or block diagrams may represent a module, program segment, or portion of code that contains one or more executable instructions for implementing the specified logical functions.
  • the functions noted in the blocks may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved.
  • each block of the block diagrams and/or flowcharts, and combinations of blocks in the block diagrams and/or flowcharts, can be implemented by dedicated hardware-based systems that perform the specified functions or operations, or can be implemented by a combination of dedicated hardware and computer instructions.
  • the units involved in the embodiments of the present disclosure may be implemented in a software manner, and may also be implemented in a hardware manner.
  • the name of a unit does not, under certain circumstances, constitute a limitation on the unit itself; for example, the first obtaining unit may also be described as "a unit that obtains at least two Internet Protocol addresses".
  • exemplary types of hardware logic components that can be used include: Field Programmable Gate Arrays (FPGA), Application Specific Integrated Circuits (ASIC), Application Specific Standard Products (ASSP), Systems on Chip (SOC), Complex Programmable Logic Devices (CPLD), etc.
  • a machine-readable medium may be a tangible medium that may include or store a program for use by or in connection with the instruction execution system, apparatus or device.
  • the machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium.
  • machine-readable media may include, but are not limited to, electronic, magnetic, optical, electromagnetic, infrared, or semiconductor systems, apparatuses, or devices, or any suitable combination of the foregoing.
  • machine-readable storage media would include one or more wire-based electrical connections, portable computer disks, hard disks, random access memory (RAM), read only memory (ROM), erasable programmable read only memory (EPROM or flash memory), fiber optics, compact disk read only memory (CD-ROM), optical storage, magnetic storage, or any suitable combination of the foregoing.
  • a video processing method, including: receiving a video generation request;
  • obtaining a video template according to the video generation request, where the video template includes a plurality of placeholders, each of which is used to indicate at least one of text, a picture, and a video;
  • obtaining a plurality of materials according to the video generation request, where the types of the plurality of materials include at least one of text, pictures, and videos;
  • importing, based on the types of the materials, the plurality of materials into the positions of the corresponding placeholders in the video template and rendering them to obtain a synthesized video.
  • the material includes a first type tag, and the placeholder includes a second type tag; importing, based on the types of the materials, the plurality of materials into the positions of the corresponding placeholders in the video template and rendering them to obtain the synthesized video includes: filtering out a target material and a target placeholder whose first type tag is consistent with the second type tag;
  • preprocessing the target material and importing it into the position of the target placeholder in the video template; and rendering the image frames of the video template into which the target material has been imported to obtain the synthesized video.
  • preprocessing the target material and then importing it into the position of the target placeholder in the video template includes: if the target material includes text material, importing the text material into the position of the target placeholder in the video template after typesetting it and converting it to a texture format; if the target material includes picture material, importing the picture material into the position of the target placeholder in the video template after converting it to a texture format; and if the target material includes video material, extracting image frames from the video material and importing the extracted image frames into the position of the target placeholder in the video template after converting them to a texture format.
  • extracting image frames from the video material includes: determining a first start timestamp and a first end timestamp of the video material in the video to be synthesized; determining a second start timestamp and a second end timestamp indicated by the placeholder; calculating a target timestamp of the image frame to be extracted from the video material according to the timestamp of the currently rendered frame of the video to be synthesized, the first start timestamp and the first end timestamp, and the second start timestamp and the second end timestamp; and extracting the image frame from the video material according to the target timestamp.
  • calculating the target timestamp of the image frame to be extracted from the video material according to the timestamp of the currently rendered frame of the video to be synthesized, the first start timestamp and the first end timestamp, and the second start timestamp and the second end timestamp includes: obtaining the time length indicated by the placeholder according to the second end timestamp and the second start timestamp; multiplying the ratio of the difference between the timestamp of the currently rendered frame and the first start timestamp to the difference between the first end timestamp and the first start timestamp by the time length indicated by the placeholder, to obtain the proportional time length of the target timestamp within the video material; and obtaining the target timestamp according to the second start timestamp and the proportional time length of the target timestamp within the video material.
  • extracting the image frame from the video material according to the target timestamp includes: if the time length of the video material is less than the time length indicated by the placeholder corresponding to the video material in the video template, continuing to extract image frames again from the starting point of the video material.
  • rendering the image frames of the video template into which the target material has been imported includes: identifying a renderer corresponding to the target placeholder in the video template; and rendering the image frames of the video template into which the target material has been imported according to the rendering special effects of the renderer.
  • before receiving the video generation request, the method further includes: obtaining video template production materials, where the video template production materials include at least one of rendering materials and cutscenes; pre-adding the plurality of placeholders; and producing the video template according to the video template production materials and the pre-added plurality of placeholders.
  • a video processing apparatus including:
  • a receiving module for receiving a video generation request
  • a first obtaining module, configured to obtain a video template according to the video generation request, where the video template includes a plurality of placeholders, each of which is used to indicate at least one of text, a picture, and a video;
  • a second acquiring module configured to acquire multiple materials according to the video generation request, wherein the types of the multiple materials include at least one of text, pictures and videos;
  • a rendering module configured to import the multiple materials into the corresponding placeholder positions in the video template based on the types of the materials and render them to obtain a synthesized video.
  • the material includes a first type tag
  • the placeholder includes a second type tag
  • the rendering module includes: a filtering unit, configured to filter out a target material and a target placeholder whose first type tag is consistent with the second type tag; an importing unit, configured to preprocess the target material and import it into the position of the target placeholder in the video template; and a rendering unit, configured to render the image frames of the video template into which the target material has been imported to obtain the synthesized video.
  • the rendering unit includes: a first rendering subunit, configured to, if the target material includes text material, typeset the text material, convert it to a texture format, and import it into the position of the target placeholder in the video template; a second rendering subunit, configured to, if the target material includes picture material, convert the picture material to a texture format and import it into the position of the target placeholder in the video template; and a third rendering subunit, configured to, if the target material includes video material, extract image frames from the video material, convert the extracted image frames to a texture format, and import them into the position of the target placeholder in the video template.
  • the third rendering subunit is specifically configured to determine a first start timestamp and a first end timestamp of the video material in the video to be synthesized; determine a second start timestamp and a second end timestamp indicated by the placeholder; calculate a target timestamp of the image frame to be extracted from the video material according to the timestamp of the currently rendered frame of the video to be synthesized, the first start timestamp and the first end timestamp, and the second start timestamp and the second end timestamp; and extract the image frame from the video material according to the target timestamp.
  • the third rendering subunit is specifically configured to: obtain the time length indicated by the placeholder according to the second end timestamp and the second start timestamp; multiply the ratio of the difference between the timestamp of the currently rendered frame and the first start timestamp to the difference between the first end timestamp and the first start timestamp by the time length indicated by the placeholder, to obtain the proportional time length of the target timestamp within the video material; and obtain the target timestamp according to the second start timestamp and the proportional time length of the target timestamp within the video material.
  • the third rendering subunit is further configured to, if the time length of the video material is less than the time length indicated by the placeholder corresponding to the video material in the video template, continue to extract image frames again from the starting point of the video material.
  • the rendering unit is specifically configured to identify a renderer corresponding to the target placeholder in the video template, and to render the image frames of the video template into which the target material has been imported according to the rendering special effects of the renderer.
  • the apparatus further includes: a production module, configured to obtain video template production materials, where the video template production materials include at least one of rendering materials and cutscenes; pre-add the plurality of placeholders; and produce the video template according to the video template production materials and the pre-added plurality of placeholders.
  • an electronic device comprising: a processor and a memory;
  • the memory stores computer-executable instructions
  • the processor executes the computer-executable instructions, so that the electronic device executes the video processing method described in the first aspect and various possible designs of the first aspect.
  • a computer-readable storage medium, in which computer-executable instructions are stored; when a processor executes the computer-executable instructions, the video processing method described in the first aspect and the various possible designs of the first aspect is implemented.
  • embodiments of the present disclosure provide a computer program product, including a computer program that, when executed by a processor, implements the video processing method described in the first aspect and various possible designs of the first aspect.
  • an embodiment of the present disclosure provides a computer program that, when executed by a processor, implements the video processing method described in the first aspect and various possible designs of the first aspect.

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Processing Or Creating Images (AREA)

Abstract

Embodiments of the present disclosure provide a video processing method, apparatus, and device. The method includes: receiving a video generation request; obtaining a video template according to the video generation request, where the video template includes a plurality of placeholders, each of which is used to indicate at least one of text, a picture, and a video; obtaining a plurality of materials according to the video generation request, where the types of the plurality of materials include at least one of text, pictures, and videos; and, based on the types of the materials, importing the plurality of materials into the positions of the corresponding placeholders in the video template and rendering them to obtain a synthesized video. The present disclosure can render multiple types of materials, including text, pictures, and videos, based on placeholders, and can generate a rendered video including multiple types of materials for each video generation request, improving the user experience.

Description

Video processing method, apparatus, and device
This application claims priority to Chinese patent application No. 202110385345.3, titled "Video processing method, apparatus, and device", filed with the China National Intellectual Property Administration on April 9, 2021, the entire contents of which are incorporated herein by reference.
Technical Field
Embodiments of the present disclosure relate to the technical field of video processing, and in particular, to a video processing method, apparatus, device, storage medium, computer program product, and computer program.
Background
With the improvement of client device hardware performance and the continuous progress of artificial intelligence technology, more and more applications (Applications, APPs) run on client devices. For some video APPs, in the traditional video rendering process, a user can only render content according to a single type of rendering template, so the resulting rendered video has a monotonous effect and cannot meet the user's demand for video diversity.
Summary
Embodiments of the present disclosure provide a video processing method, apparatus, device, storage medium, computer program product, and computer program. The method can render multiple types of materials, including text, pictures, and videos, based on placeholders, and generate rendered videos including multiple types of materials, improving the user experience.
In a first aspect, an embodiment of the present disclosure provides a video processing method, including:
receiving a video generation request;
obtaining a video template according to the video generation request, where the video template includes a plurality of placeholders, each of which is used to indicate at least one of text, a picture, and a video;
obtaining a plurality of materials according to the video generation request, where the types of the plurality of materials include at least one of text, pictures, and videos; and
based on the types of the materials, importing the plurality of materials into the positions of the corresponding placeholders in the video template and rendering them to obtain a synthesized video.
In a second aspect, an embodiment of the present disclosure provides a video processing apparatus, including:
a receiving module, configured to receive a video generation request;
a first obtaining module, configured to obtain a video template according to the video generation request, where the video template includes a plurality of placeholders, each of which is used to indicate at least one of text, a picture, and a video;
a second obtaining module, configured to obtain a plurality of materials according to the video generation request, where the types of the plurality of materials include at least one of text, pictures, and videos;
a rendering module, configured to, based on the types of the materials, import the plurality of materials into the positions of the corresponding placeholders in the video template and render them to obtain a synthesized video.
In a third aspect, an embodiment of the present disclosure provides an electronic device, including: a processor and a memory;
the memory stores computer-executable instructions;
the processor executes the computer-executable instructions, so that the electronic device performs the video processing method described in the first aspect above.
In a fourth aspect, an embodiment of the present disclosure provides a computer-readable storage medium in which computer-executable instructions are stored; when a processor executes the computer-executable instructions, the video processing method described in the first aspect above is implemented.
In a fifth aspect, an embodiment of the present disclosure provides a computer program product, including a computer program that, when executed by a processor, implements the video processing method described in the first aspect above.
In a sixth aspect, an embodiment of the present disclosure provides a computer program that, when executed by a processor, implements the video processing method described in the first aspect above.
With the video processing method, apparatus, device, storage medium, computer program product, and computer program provided by this embodiment, the method first receives a user's video generation request, and then obtains a video template and a plurality of materials according to the video generation request, where the video template includes a plurality of placeholders, each placeholder is used to indicate at least one of text, a picture, and a video, and the plurality of materials include at least one of text, pictures, and videos; then, based on the material types, the plurality of materials are respectively imported into the positions of the corresponding placeholders in the video template and rendered to obtain a synthesized video. By rendering multiple types of materials, including text, pictures, and videos, based on placeholders, the embodiments of the present disclosure can generate rendered videos including multiple types of materials, improving the user experience.
Brief Description of the Drawings
To explain the technical solutions in the embodiments of the present disclosure or the prior art more clearly, the following briefly introduces the drawings needed in the description of the embodiments or the prior art. Obviously, the drawings in the following description are some embodiments of the present disclosure, and those of ordinary skill in the art can obtain other drawings based on these drawings without creative effort.
FIG. 1 is a schematic diagram of a scenario of a video processing method provided by an embodiment of the present disclosure;
FIG. 2 is a first schematic flowchart of a video processing method provided by an embodiment of the present disclosure;
FIG. 3 is a second schematic flowchart of a video processing method provided by an embodiment of the present disclosure;
FIG. 4 is a structural block diagram of a video processing apparatus provided by an embodiment of the present disclosure;
FIG. 5 is a schematic diagram of the hardware structure of an electronic device provided by an embodiment of the present disclosure.
Detailed Description
To make the objectives, technical solutions, and advantages of the embodiments of the present disclosure clearer, the technical solutions in the embodiments of the present disclosure will be described clearly and completely below with reference to the drawings in the embodiments of the present disclosure. Obviously, the described embodiments are only some of the embodiments of the present disclosure, not all of them. Based on the embodiments in the present disclosure, all other embodiments obtained by those of ordinary skill in the art without creative effort fall within the protection scope of the present disclosure.
At present, in the traditional video rendering process, a user can only render certain fixed types of content according to a fixed rendering template, so the resulting rendered video has a monotonous effect, cannot meet the user's demand for video diversity, and gives a poor user experience. To solve the above technical problem, embodiments of the present disclosure provide a video processing method: a video template is obtained according to a user request, the video template includes a plurality of placeholders, and each placeholder is used to indicate at least one of text, a picture, and a video; a plurality of materials are obtained according to the user request, the plurality of materials here including at least one of text, pictures, and videos; and the materials of multiple types, including text, pictures, and videos, are respectively imported into the positions of the corresponding placeholders in the video template and rendered to obtain a synthesized video. By using the placeholders in the video template to render multiple types of materials, including text, pictures, and videos, into a rendered video, the user experience is improved.
Referring to FIG. 1, FIG. 1 is a schematic diagram of a scenario of a video processing method provided by an embodiment of the present disclosure. As shown in FIG. 1, the system provided by this embodiment includes a client 101 and a server 102. The client 101 can be installed on devices such as mobile phones, tablet computers, personal computers, wearable electronic devices, and smart home devices. This embodiment places no particular restriction on the implementation of the client 101, as long as the client 101 can perform input and output interaction with the user. The server 102 may include one server or a cluster of several servers.
Referring to FIG. 2, FIG. 2 is a first schematic flowchart of a video processing method provided by an embodiment of the present disclosure. The method of this embodiment can be applied to the server shown in FIG. 1, and the video processing method includes:
S201: Receive a video generation request.
Specifically, a video generation request sent by any client is received.
In the embodiments of the present disclosure, the client can be installed on a mobile terminal, such as a personal computer, a tablet computer, a mobile phone, a wearable electronic device, or a smart home device. The client can send the user's video generation request to the server.
S202: Obtain a video template according to the video generation request, where the video template includes a plurality of placeholders, each of which is used to indicate at least one of text, a picture, and a video.
Specifically, the video template can be obtained from a video template library, where the video template includes a plurality of placeholders. Each placeholder can be provided with a corresponding type tag, and the type tag is used to indicate that the corresponding placeholder belongs to at least one of the text, picture, and video types.
In one embodiment, a placeholder may have a preset format and include at least one of the following parameters: a type tag used to indicate the material type supported by the placeholder (for example, at least one of text, picture, and video); and a placeholder identifier, which is used to indicate the corresponding rendering effect, material resolution, and the like. In addition, when the placeholder supports video-type materials, the placeholder may further include the following parameters: the start time of the video material, the end time of the video material, the start time in the video to be synthesized, and the end time in the video to be synthesized.
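As one way to picture the placeholder parameters listed above, the following Python sketch models a placeholder as a small record. It is illustrative only; all field names (`type_tag`, `placeholder_id`, `src_in`, `src_out`, `dest_in`, `dest_out`) are assumptions and do not come from the disclosure.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Placeholder:
    """Illustrative placeholder record; field names are hypothetical."""
    type_tag: str                    # supported material type: "text", "image", or "video"
    placeholder_id: str              # identifies the rendering effect, material resolution, etc.
    # Only meaningful for video-type placeholders:
    src_in: Optional[float] = None   # start time within the video material (seconds)
    src_out: Optional[float] = None  # end time within the video material
    dest_in: Optional[float] = None  # start time within the video to be synthesized
    dest_out: Optional[float] = None # end time within the video to be synthesized

# Example: a video slot that fills seconds 5-8 of the output video
slot = Placeholder(type_tag="video", placeholder_id="slot_1",
                   src_in=0.0, src_out=3.0, dest_in=5.0, dest_out=8.0)
```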
In the embodiments of the present disclosure, obtaining the video template from the video template library may include the following two approaches:
Approach 1: In response to the video generation request, a video template is randomly selected from the video template library.
Approach 2: The user can select a corresponding video template or video template type on the client, and user selection information is added to the video generation request according to the user's selection on the client; after receiving the video generation request, the server parses out the user selection information and, according to the user selection information, selects from the video template library the video template determined by the user on the client.
In one embodiment of the present disclosure, before step S201, the process of building the video template includes:
obtaining video template production materials, where the video template production materials include at least one of rendering materials and cutscenes; pre-adding a plurality of placeholders; and producing the video template according to the video template production materials and the pre-added plurality of placeholders.
The pre-added placeholders are used to indicate at least the following three types of information: pictures, text, and video. Here, text can include characters, numbers, symbols, and the like.
S203: Obtain a plurality of materials according to the video generation request, where the types of the plurality of materials include at least one of text, pictures, and videos.
In one embodiment, the materials can be obtained from a material library (which may include a database).
S204: Based on the types of the materials, import the plurality of materials into the positions of the corresponding placeholders in the video template and render them to obtain a synthesized video.
In the embodiments of the present disclosure, the plurality of materials may include at least one of the multiple types of text, pictures, and videos, and correspondingly, the placeholders in the video template may also include at least one of the placeholders indicating the multiple types of text, pictures, and videos.
Specifically, according to the type of each material, the material of each type is imported into the position of the placeholder corresponding to that type so that the material replaces the placeholder, and the image frames of the video template after the materials are imported are rendered frame by frame, thereby obtaining the synthesized video.
As can be seen from the above description, a user's video generation request is first received, and then a video template and a plurality of materials are obtained according to the video generation request, where the video template includes a plurality of placeholders, each placeholder is used to indicate at least one of text, a picture, and a video, and the plurality of materials include at least one of text, pictures, and videos; then, based on the material types, the plurality of materials are respectively imported into the positions of the corresponding placeholders in the video template and rendered to obtain the synthesized video. By rendering multiple types of materials, including text, pictures, and videos, based on placeholders, the embodiments of the present disclosure can provide the user with a rendered video including multiple types of materials, improving the user experience.
The video processing method according to the embodiments of the present disclosure has been described above in connection with the server. Those skilled in the art should understand that the video processing method according to the embodiments of the present disclosure can also be executed by a device on which the client is installed, or by an all-in-one device that integrates both server and client functions. For brevity, the specific steps and methods will not be repeated.
Referring to FIG. 3, FIG. 3 is a second schematic flowchart of a video processing method provided by an embodiment of the present disclosure. In this embodiment, the above material includes a first type tag used to indicate the type of the material, and the placeholder includes a second type tag used to indicate the type indicated by the placeholder. Accordingly, in step S204, importing the plurality of materials into the positions of the corresponding placeholders in the video template based on the types of the materials and rendering them to obtain the synthesized video may specifically include:
S301: Filter out a target material and a target placeholder whose first type tag is consistent with the second type tag.
In one embodiment, identifying the placeholders in the video template may specifically include: obtaining each video template image frame from the video template according to the video timestamp of the video to be synthesized, and determining whether a placeholder exists in each video template image frame; if one exists, the placeholder is identified and its second type tag is obtained.
Specifically, all the materials are traversed according to the second type tag of the placeholder until a first type tag consistent with the second type tag of the placeholder is found; the material corresponding to that first type tag is determined as the target material, and the corresponding placeholder is the target placeholder. Here, the second type tag being consistent with the first type tag may include the type information indicated by the tags being consistent. Optionally, both the first type tag and the second type tag include tag labels.
In the embodiments of the present disclosure, each material includes a corresponding first type tag, which may be added when the material is generated. For example, the first type tag may be first numbering information for each material; the first numbering information may be customized by the client and is used to indicate any material.
The placeholders in the video template may be placeholders added when the video template is produced, and each placeholder is configured with a predefined second type tag, each of which is used to indicate the material type matching that placeholder. For example, the second type tag may be second numbering information that matches the first numbering information of the material.
In addition, in one embodiment, all the placeholders can also be traversed according to the first type tag of the material until a second type tag consistent with the first type tag of the material is found. The specific filtering process is similar to the above and, for brevity, will not be repeated.
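The tag-matching step described above (for each placeholder, traverse the materials until a first type tag equals the second type tag) can be sketched as follows. This is a minimal illustration under assumptions: the dictionary layout and the key `tag` are invented for the example, and the first matching material is paired with the placeholder.

```python
def match_targets(placeholders, materials):
    """Pair each placeholder with the first material whose first type tag
    matches the placeholder's second type tag (hypothetical tag scheme)."""
    pairs = []
    for ph in placeholders:
        for m in materials:
            if m["tag"] == ph["tag"]:   # tags are consistent -> target pair
                pairs.append((m, ph))
                break
    return pairs

materials = [{"tag": "text", "data": "Hello"},
             {"tag": "video", "data": "clip.mp4"}]
placeholders = [{"tag": "video"}, {"tag": "text"}]
matched = match_targets(placeholders, materials)
```

As the disclosure notes, the traversal could equally start from the material's first type tag and search the placeholders; the pairing logic is symmetric.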
S302: Preprocess the target material and import it into the position of the target placeholder in the video template.
In the embodiments of the present disclosure, materials can be divided into three types: text materials, picture materials, and video materials. Depending on the type of the target material, different preprocessing methods are used to process the target material so as to import it into the position of the corresponding target placeholder in the video template.
S303: Render the image frames of the video template into which the target material has been imported to obtain the synthesized video.
In the embodiments of the present disclosure, each frame of the video template into which the target material has been imported is rendered to obtain the synthesized video.
The video template into which the target material has been imported has a target placeholder, and rendering is performed with the corresponding rendering effect according to that target placeholder. Specifically, the renderer corresponding to the target placeholder of the video template is identified according to the parameters in the target placeholder (for example, the placeholder identifier), and the image frames of the video template into which the materials have been imported are rendered according to the rendering effects of the renderer.
The renderer may include a shader renderer, which is used to indicate rendering effect attributes such as the position, shape, transparency, and dynamic effects of the placed material.
As can be seen from the above description, by matching the tag of the placeholder with the tag of the material, a material can be imported into the corresponding position in the video template and rendered accordingly, improving the presentation of the synthesized video.
In a video processing method provided by another embodiment of the present disclosure, step S302 above, in which each target material is preprocessed and then imported into the position of the target placeholder in the video template, is described in detail as follows:
S3021: If the target material includes text material, the text material is typeset and converted to a texture format, and then imported into the position of the target placeholder in the video template.
In the embodiments of the present disclosure, the text material can be typeset according to features such as the size or shape of the placeholder, and the typeset text material is converted into a texture format.
S3022: If the target material includes picture material, the picture material is converted to a texture format and then imported into the position of the target placeholder in the video template.
In the embodiments of the present disclosure, image file formats may include BMP, TGA, JPG, GIF, PNG, and other formats; after conversion, texture formats may include R5G6B5, A4R4G4B4, A1R5G5B5, R8G8B8, A8R8G8B8, and other formats. The picture material can be converted to a texture format using any known or future-developed texture conversion method; the present disclosure places no restriction on the specific texture conversion method.
S3023: If the target material includes video material, image frames are extracted from the video material, converted to a texture format, and then imported into the position of the target placeholder in the video template.
In the embodiments of the present disclosure, for video material, the image frames of the corresponding video material need to be filtered out from the video material according to the timestamp of the video to be synthesized.
The specific process of extracting image frames from the video material includes: determining a first start timestamp and a first end timestamp of the video material in the video to be synthesized; determining a second start timestamp and a second end timestamp indicated by the placeholder; calculating a target timestamp of the image frame to be extracted from the video material according to the timestamp of the currently rendered frame of the video to be synthesized, the first start timestamp and the first end timestamp, and the second start timestamp and the second end timestamp; and extracting the image frame from the video material according to the target timestamp.
Calculating the target timestamp of the image frame to be extracted from the video material according to the timestamp of the currently rendered frame of the video to be synthesized, the first start timestamp and the first end timestamp, and the second start timestamp and the second end timestamp includes:
obtaining the time length indicated by the placeholder according to the second end timestamp and the second start timestamp;
multiplying the ratio of the difference between the timestamp of the currently rendered frame and the first start timestamp to the difference between the first end timestamp and the first start timestamp by the time length indicated by the placeholder, to obtain the proportional time length of the target timestamp within the video material;
obtaining the target timestamp according to the second start timestamp and the proportional time length of the target timestamp within the video material.
The specific formula for calculating the target timestamp of the image frame to be extracted from the video material, according to the timestamp of the currently rendered frame of the video to be synthesized, the first start timestamp and the first end timestamp, and the second start timestamp and the second end timestamp, can be:
t_src = src_in + ((curTime - dest_in) / (dest_out - dest_in)) × (src_out - src_in)
In the formula, t_src is the target timestamp of the extracted image frame; dest_in is the first start timestamp; dest_out is the first end timestamp; src_in is the second start timestamp; src_out is the second end timestamp; and curTime is the timestamp of the currently rendered frame. Through the above formula, the target timestamp of the image frame to be extracted can be obtained, so that the image frame can be extracted from the video material according to the target timestamp.
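The formula above maps the progress of the currently rendered frame through the placeholder's span in the output video onto the corresponding position within the source material. A minimal Python sketch of this computation follows; the function name `target_timestamp` and its keyword arguments are illustrative, not identifiers from the disclosure.

```python
def target_timestamp(cur_time, dest_in, dest_out, src_in, src_out):
    """t_src = src_in + (curTime - dest_in) / (dest_out - dest_in) * (src_out - src_in).

    cur_time is the timestamp of the currently rendered frame; dest_in/dest_out
    are the material's start/end in the video to be synthesized; src_in/src_out
    are the start/end indicated by the placeholder within the source material.
    """
    progress = (cur_time - dest_in) / (dest_out - dest_in)  # 0.0 at dest_in, 1.0 at dest_out
    return src_in + progress * (src_out - src_in)

# Halfway through a 3-second slot (5.0..8.0) maps to halfway through
# the placeholder-indicated span (0.0..2.0) of the source material:
t_src = target_timestamp(cur_time=6.5, dest_in=5.0, dest_out=8.0,
                         src_in=0.0, src_out=2.0)  # -> 1.0
```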
As can be seen from the above description, through the preprocessing of text materials, picture materials, and video materials, matched input of materials of different types and forms can be achieved, further improving the video effect.
In one embodiment of the present disclosure, extracting image frames from the video material according to the target timestamp includes: if the time length of the video material is less than the time length indicated by the placeholder corresponding to the video material in the video template, continuing to extract image frames again from the starting point of the video material.
The time length indicated by the placeholder can be obtained by taking the difference between the second end timestamp and the second start timestamp indicated by the placeholder.
Specifically, image frames are extracted from the video material according to the timestamps of the extracted image frames; if the time length of the video material is less than the time length indicated by the placeholder in the video template, extraction returns to the starting point of the video material and continues, that is, the extracted image frame index idx = (t_src % T) × fps (% denotes the remainder operation), where T is the time length of the video material and fps is the frame rate.
As can be seen from the above description, by returning to the starting point of the video material and continuing to extract image frames, the completeness of the input video material is ensured, so that the synthesized video will not suffer from frames with missing content.
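The wrap-around rule idx = (t_src % T) × fps can be sketched as below. This is an illustrative helper under assumptions: the name `frame_index` is hypothetical, and the index is truncated to an integer frame number.

```python
def frame_index(t_src, material_length, fps):
    """idx = (t_src % T) * fps: when the placeholder's span outlasts the
    material, wrap back to the material's starting point and keep extracting."""
    return int((t_src % material_length) * fps)

# A 3-second material at 30 fps, asked for t_src = 7.5 s:
# 7.5 % 3.0 = 1.5 s into the material -> frame 45
idx = frame_index(7.5, 3.0, 30)  # -> 45
```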
In one embodiment of the present disclosure, the video processing method provided by the above embodiments can be executed by a server, and the video generation request comes from a client. Accordingly, after step S204 of importing the plurality of materials into the positions of the corresponding placeholders in the video template and rendering them to obtain the synthesized video, the method further includes: sending the synthesized video to the client. Sending the synthesized video to the client further improves the user experience.
Corresponding to the video processing method of the above embodiments, FIG. 4 is a structural block diagram of a video processing apparatus provided by an embodiment of the present disclosure. For ease of description, only the parts related to the embodiments of the present disclosure are shown. Referring to FIG. 4, the apparatus includes: a receiving module 401, a first obtaining module 402, a second obtaining module 403, and a rendering module 404.
The receiving module 401 is configured to receive a video generation request;
the first obtaining module 402 is configured to obtain a video template according to the video generation request, where the video template includes a plurality of placeholders, each of which is used to indicate at least one of text, a picture, and a video;
the second obtaining module 403 is configured to obtain a plurality of materials according to the video generation request, where the types of the plurality of materials include at least one of text, pictures, and videos;
the rendering module 404 is configured to, based on the types of the materials, import the plurality of materials into the positions of the corresponding placeholders in the video template and render them to obtain a synthesized video.
According to one or more embodiments of the present disclosure, the material includes a first type tag, and the placeholder includes a second type tag; the rendering module 404 includes:
a filtering unit 4041, configured to filter out a target material and a target placeholder whose first type tag is consistent with the second type tag;
an importing unit 4042, configured to preprocess the target material and import it into the position of the target placeholder in the video template;
a rendering unit 4043, configured to render the image frames of the video template into which the target material has been imported to obtain the synthesized video.
According to one or more embodiments of the present disclosure, the rendering unit 4043 includes:
a first rendering subunit 40431, configured to, if the target material includes text material, typeset the text material, convert it to a texture format, and import it into the position of the target placeholder in the video template;
a second rendering subunit 40432, configured to, if the target material includes picture material, convert the picture material to a texture format and import it into the position of the target placeholder in the video template;
a third rendering subunit 40433, configured to, if the target material includes video material, extract image frames from the video material, convert the extracted image frames to a texture format, and import them into the position of the target placeholder in the video template.
According to one or more embodiments of the present disclosure, the third rendering subunit 40433 is specifically configured to determine a first start timestamp and a first end timestamp of the video material in the video to be synthesized; determine a second start timestamp and a second end timestamp indicated by the placeholder; calculate a target timestamp of the image frame to be extracted from the video material according to the timestamp of the currently rendered frame of the video to be synthesized, the first start timestamp and the first end timestamp, and the second start timestamp and the second end timestamp; and extract the image frame from the video material according to the target timestamp.
According to one or more embodiments of the present disclosure, the third rendering subunit 40433 is specifically configured to: obtain the time length indicated by the placeholder according to the second end timestamp and the second start timestamp; multiply the ratio of the difference between the timestamp of the currently rendered frame and the first start timestamp to the difference between the first end timestamp and the first start timestamp by the time length indicated by the placeholder, to obtain the proportional time length of the target timestamp within the video material; and obtain the target timestamp according to the second start timestamp and the proportional time length of the target timestamp within the video material.
Specifically, the formula for calculating the target timestamp of the image frame to be extracted from the video material, according to the timestamp of the currently rendered frame of the video to be synthesized, the first start timestamp and the first end timestamp, and the second start timestamp and the second end timestamp, is:
t_src = src_in + ((curTime - dest_in) / (dest_out - dest_in)) × (src_out - src_in)
In the formula, t_src is the target timestamp of the extracted image frame; dest_in is the first start timestamp; dest_out is the first end timestamp; src_in is the second start timestamp; src_out is the second end timestamp; and curTime is the timestamp of the currently rendered frame.
According to one or more embodiments of the present disclosure, the third rendering subunit 40433 is further configured to, if the time length of the video material is less than the time length indicated by the placeholder corresponding to the video material in the video template, continue to extract image frames again from the starting point of the video material.
According to one or more embodiments of the present disclosure, the rendering unit 4043 is specifically configured to identify the renderer corresponding to the target placeholder in the video template, and to render the image frames of the video template into which the target material has been imported according to the rendering special effects of the renderer.
According to one or more embodiments of the present disclosure, the apparatus further includes: a production module 405, configured to obtain video template production materials, where the video template production materials include at least one of rendering materials and cutscenes; pre-add the plurality of placeholders; and produce the video template according to the video template production materials and the pre-added plurality of placeholders.
According to one or more embodiments of the present disclosure, the apparatus is applied to a server, the video generation request comes from a client, and the apparatus further includes:
a sending module 406, configured to send the synthesized video to the client after the plurality of materials are imported into the positions of the corresponding placeholders in the video template and rendered to obtain the synthesized video.
The apparatus provided by this embodiment can be used to execute the technical solutions of the above method embodiments; its implementation principles and technical effects are similar and will not be repeated here.
To implement the above embodiments, an embodiment of the present disclosure further provides an electronic device.
Referring to FIG. 5, it shows a schematic structural diagram of an electronic device 500 suitable for implementing the embodiments of the present disclosure. The electronic device 500 may be a client device or a server. The client device may include, but is not limited to, mobile clients such as mobile phones, notebook computers, digital broadcast receivers, Personal Digital Assistants (PDA), tablet computers (Portable Android Device, PAD), Portable Media Players (PMP), in-vehicle clients (for example, in-vehicle navigation clients), and wearable electronic devices, as well as fixed clients such as digital TVs (Television), desktop computers, and smart home devices. The electronic device shown in FIG. 5 is only an example and should not impose any limitation on the functions and scope of use of the embodiments of the present disclosure.
As shown in FIG. 5, the electronic device 500 may include a processing device (such as a central processing unit or a graphics processor) 501, which can perform various appropriate actions and processes according to a program stored in a read-only memory (ROM) 502 or a program loaded from a storage device 508 into a random access memory (RAM) 503, thereby implementing the video processing method according to the embodiments of the present disclosure. In the RAM 503, various programs and data required for the operation of the electronic device 500 are also stored. The processing device 501, the ROM 502, and the RAM 503 are connected to each other through a bus 504. An Input/Output (I/O) interface 505 is also connected to the bus 504.
Generally, the following devices can be connected to the I/O interface 505: input devices 506 including, for example, a touch screen, touch pad, keyboard, mouse, camera, microphone, accelerometer, gyroscope, etc.; output devices 507 including, for example, a Liquid Crystal Display (LCD), speaker, vibrator, etc.; storage devices 508 including, for example, magnetic tape, hard disk, etc.; and a communication device 509. The communication device 509 may allow the electronic device 500 to communicate wirelessly or by wire with other devices to exchange data. While FIG. 5 shows the electronic device 500 having various devices, it should be understood that not all of the illustrated devices are required to be implemented or provided. More or fewer devices may alternatively be implemented or provided.
In particular, according to the embodiments of the present disclosure, the process described above with reference to the flowcharts can be implemented as a computer software program. For example, the embodiments of the present disclosure include a computer program product, which includes a computer program carried on a computer-readable medium, the computer program containing program code for executing the method shown in the flowcharts. In such embodiments, the computer program may be downloaded and installed from the network through the communication device 509, or installed from the storage device 508, or installed from the ROM 502. When the computer program is executed by the processing device 501, the above-mentioned functions defined in the methods of the embodiments of the present disclosure are executed.
It should be noted that the above computer-readable medium of the present disclosure may be a computer-readable signal medium or a computer-readable storage medium, or any combination of the two. The computer-readable storage medium may be, for example, but not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the above. More specific examples of computer-readable storage media may include, but are not limited to: an electrical connection with one or more wires, a portable computer disk, a hard disk, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), an optical fiber, portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above. In the present disclosure, the computer-readable storage medium may be any tangible medium that contains or stores a program that may be used by or in combination with an instruction execution system, apparatus, or device. In the present disclosure, the computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, which carries computer-readable program code. Such a propagated data signal may take many forms, including but not limited to an electromagnetic signal, an optical signal, or any suitable combination of the above. The computer-readable signal medium may also be any computer-readable medium other than a computer-readable storage medium, which can send, propagate, or transmit a program for use by or in combination with an instruction execution system, apparatus, or device. The program code contained on the computer-readable medium may be transmitted by any suitable medium, including but not limited to: an electric wire, an optical cable, radio frequency (RF), etc., or any suitable combination of the above.
The above computer-readable medium may be included in the above electronic device, or may exist alone without being assembled into the electronic device.
The above computer-readable medium carries one or more programs; when the one or more programs are executed by the electronic device, the electronic device is caused to execute the methods shown in the above embodiments.
Computer program code for performing the operations of the present disclosure may be written in one or more programming languages or combinations thereof, including object-oriented programming languages such as Java, Smalltalk, and C++, as well as conventional procedural programming languages such as the "C" language or similar programming languages. The program code may be executed entirely on the user's computer, partly on the user's computer, as an independent software package, partly on the user's computer and partly on a remote computer, or entirely on a remote computer or server. In the case involving a remote computer, the remote computer may be connected to the user's computer through any kind of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or may be connected to an external computer (for example, through the Internet using an Internet service provider).
The flowcharts and block diagrams in the accompanying drawings illustrate the possible architectures, functions, and operations of systems, methods, and computer program products according to various embodiments of the present disclosure. In this regard, each block in a flowchart or block diagram may represent a module, program segment, or portion of code that contains one or more executable instructions for implementing the specified logical functions. It should also be noted that, in some alternative implementations, the functions noted in the blocks may occur in an order different from that noted in the drawings. For example, two blocks shown in succession may in fact be executed substantially in parallel, and they may sometimes be executed in the reverse order, depending on the functions involved. It should also be noted that each block in the block diagrams and/or flowcharts, and combinations of blocks in the block diagrams and/or flowcharts, can be implemented by dedicated hardware-based systems that perform the specified functions or operations, or can be implemented by a combination of dedicated hardware and computer instructions.
The units involved in the described embodiments of the present disclosure may be implemented in software or in hardware. The name of a unit does not, under certain circumstances, constitute a limitation on the unit itself; for example, the first obtaining unit may also be described as "a unit that obtains at least two Internet Protocol addresses".
The functions described above herein may be performed at least in part by one or more hardware logic components. For example, without limitation, exemplary types of hardware logic components that can be used include: Field Programmable Gate Arrays (FPGA), Application Specific Integrated Circuits (ASIC), Application Specific Standard Products (ASSP), Systems on Chip (SOC), Complex Programmable Logic Devices (CPLD), and so on.
In the context of the present disclosure, a machine-readable medium may be a tangible medium that may contain or store a program for use by or in combination with an instruction execution system, apparatus, or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. The machine-readable medium may include, but is not limited to, electronic, magnetic, optical, electromagnetic, infrared, or semiconductor systems, apparatuses, or devices, or any suitable combination of the above. More specific examples of machine-readable storage media would include an electrical connection based on one or more wires, a portable computer disk, a hard disk, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), an optical fiber, portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above.
In a first aspect, according to one or more embodiments of the present disclosure, a video processing method is provided, including:
receiving a video generation request;
obtaining a video template according to the video generation request, where the video template includes a plurality of placeholders, each of which is used to indicate at least one of text, a picture, and a video;
obtaining a plurality of materials according to the video generation request, where the types of the plurality of materials include at least one of text, pictures, and videos; and
based on the types of the materials, importing the plurality of materials into the positions of the corresponding placeholders in the video template and rendering them to obtain a synthesized video.
According to one or more embodiments of the present disclosure, the material includes a first type tag, and the placeholder includes a second type tag; importing, based on the types of the materials, the plurality of materials into the positions of the corresponding placeholders in the video template and rendering them to obtain the synthesized video includes: filtering out a target material and a target placeholder whose first type tag is consistent with the second type tag; preprocessing the target material and importing it into the position of the target placeholder in the video template; and rendering the image frames of the video template into which the target material has been imported to obtain the synthesized video.
According to one or more embodiments of the present disclosure, preprocessing the target material and then importing it into the position of the target placeholder in the video template includes: if the target material includes text material, importing the text material into the position of the target placeholder in the video template after typesetting it and converting it to a texture format; if the target material includes picture material, importing the picture material into the position of the target placeholder in the video template after converting it to a texture format; and if the target material includes video material, extracting image frames from the video material and importing the extracted image frames into the position of the target placeholder in the video template after converting them to a texture format.
According to one or more embodiments of the present disclosure, extracting image frames from the video material includes: determining a first start timestamp and a first end timestamp of the video material in the video to be synthesized; determining a second start timestamp and a second end timestamp indicated by the placeholder; calculating a target timestamp of the image frame to be extracted from the video material according to the timestamp of the currently rendered frame of the video to be synthesized, the first start timestamp and the first end timestamp, and the second start timestamp and the second end timestamp; and extracting the image frame from the video material according to the target timestamp.
According to one or more embodiments of the present disclosure, calculating the target timestamp of the image frame to be extracted from the video material according to the timestamp of the currently rendered frame of the video to be synthesized, the first start timestamp and the first end timestamp, and the second start timestamp and the second end timestamp includes: obtaining the time length indicated by the placeholder according to the second end timestamp and the second start timestamp; multiplying the ratio of the difference between the timestamp of the currently rendered frame and the first start timestamp to the difference between the first end timestamp and the first start timestamp by the time length indicated by the placeholder, to obtain the proportional time length of the target timestamp within the video material; and obtaining the target timestamp according to the second start timestamp and the proportional time length of the target timestamp within the video material.
According to one or more embodiments of the present disclosure, extracting the image frame from the video material according to the target timestamp includes: if the time length of the video material is less than the time length indicated by the placeholder corresponding to the video material in the video template, continuing to extract image frames again from the starting point of the video material.
According to one or more embodiments of the present disclosure, rendering the image frames of the video template into which the target material has been imported includes: identifying the renderer corresponding to the target placeholder in the video template; and rendering the image frames of the video template into which the target material has been imported according to the rendering special effects of the renderer.
According to one or more embodiments of the present disclosure, before receiving the video generation request, the method further includes: obtaining video template production materials, where the video template production materials include at least one of rendering materials and cutscenes; pre-adding the plurality of placeholders; and producing the video template according to the video template production materials and the pre-added plurality of placeholders.
第二方面,根据本公开的一个或多个实施例,提供了一种视频处理装置,包括:
接收模块,用于接收视频生成请求;
第一获取模块,用于根据所述视频生成请求获取视频模板,其中所述视频模板中包括多个占位符,其中,每个所述占位符用于指示文本、图片和视频中的至少一种;
第二获取模块,用于根据所述视频生成请求获取多个素材,其中所述多个素材的类型包括文本、图片和视频中的至少一种;
渲染模块,用于基于所述素材的类型,将所述多个素材分别导入所述视频模板中对应的占位符的位置并进行渲染,得到合成的视频。
根据本公开的一个或多个实施例,所述素材包括第一类型标签,所述占位符包括第二类型标签;所述渲染模块,包括:筛选单元,用于筛选出所述第一类型标签与所述第二类型标签一致的目标素材和目标占位符;导入单元,用于将所述目标素材进行预处理后导入所述视频模板中所述目标占位符的位置;渲染单元,用于对导入所述目标素材的视频模板的图像帧进行渲染,得到所述合成的视频。
According to one or more embodiments of the present disclosure, the rendering unit includes: a first rendering subunit, configured to, if the target material includes a text material, perform layout and texture-format conversion on the text material and then import it into the position of the target placeholder in the video template; a second rendering subunit, configured to, if the target material includes an image material, perform texture-format conversion on the image material and then import it into the position of the target placeholder in the video template; and a third rendering subunit, configured to, if the target material includes a video material, extract image frames from the video material, perform texture-format conversion on the extracted image frames, and then import them into the position of the target placeholder in the video template.
According to one or more embodiments of the present disclosure, the third rendering subunit is specifically configured to: determine a first start timestamp and a first end timestamp of the video material in the video to be composed; determine a second start timestamp and a second end timestamp indicated by the placeholder; calculate, according to a timestamp of a current render frame of the video to be composed, the first start timestamp and the first end timestamp, and the second start timestamp and the second end timestamp, a target timestamp of the image frame to be extracted from the video material; and extract the image frame from the video material according to the target timestamp.
According to one or more embodiments of the present disclosure, the third rendering subunit is specifically configured to: obtain, according to the second end timestamp and the second start timestamp, a time length indicated by the placeholder; obtain a proportional time length of the target timestamp within the video material as the product of the time length indicated by the placeholder and the ratio of the difference between the timestamp of the current render frame and the first start timestamp to the difference between the first end timestamp and the first start timestamp; and obtain the target timestamp according to the second start timestamp and the proportional time length of the target timestamp within the video material.
According to one or more embodiments of the present disclosure, the third rendering subunit is further configured to, if the time length of the video material is less than the time length indicated by the placeholder corresponding to the video material in the video template, continue to extract image frames from the start of the video material again.
According to one or more embodiments of the present disclosure, the rendering unit is specifically configured to identify a renderer corresponding to the target placeholder in the video template, and render, according to a rendering effect of the renderer, the image frames of the video template into which the target material has been imported.
According to one or more embodiments of the present disclosure, the apparatus further includes a production module, configured to: acquire video template production materials, wherein the video template production materials include at least one of rendering materials and transition animations; pre-add the plurality of placeholders; and produce the video template according to the video template production materials and the pre-added plurality of placeholders.
In a third aspect, according to one or more embodiments of the present disclosure, an electronic device is provided, including a processor and a memory;
the memory stores computer-executable instructions; and
the processor executes the computer-executable instructions, causing the electronic device to perform the video processing method described in the first aspect and the various possible designs of the first aspect above.
In a fourth aspect, according to one or more embodiments of the present disclosure, a computer-readable storage medium is provided, wherein the computer-readable storage medium stores computer-executable instructions, and when a processor executes the computer-executable instructions, the video processing method described in the first aspect and the various possible designs of the first aspect above is implemented.
In a fifth aspect, an embodiment of the present disclosure provides a computer program product, including a computer program, wherein when the computer program is executed by a processor, the video processing method described in the first aspect and the various possible designs of the first aspect above is implemented.
In a sixth aspect, an embodiment of the present disclosure provides a computer program, wherein when the computer program is executed by a processor, the video processing method described in the first aspect and the various possible designs of the first aspect above is implemented.
The above description is merely of the preferred embodiments of the present disclosure and of the technical principles applied. Those skilled in the art should understand that the scope of the disclosure involved herein is not limited to technical solutions formed by the specific combination of the above technical features, and should also cover other technical solutions formed by any combination of the above technical features or their equivalent features without departing from the above disclosed concept, for example, technical solutions formed by replacing the above features with technical features having similar functions disclosed in (but not limited to) the present disclosure.
In addition, although the operations are depicted in a particular order, this should not be understood as requiring that these operations be performed in the particular order shown or in sequential order. Under certain circumstances, multitasking and parallel processing may be advantageous. Likewise, although several specific implementation details are included in the above discussion, these should not be construed as limiting the scope of the present disclosure. Certain features that are described in the context of separate embodiments can also be implemented in combination in a single embodiment. Conversely, various features that are described in the context of a single embodiment can also be implemented in multiple embodiments separately or in any suitable subcombination.
Although the subject matter has been described in language specific to structural features and/or methodological logical acts, it should be understood that the subject matter defined in the appended claims is not necessarily limited to the specific features or acts described above. Rather, the specific features and acts described above are merely example forms of implementing the claims.

Claims (13)

  1. A video processing method, the method comprising:
    receiving a video generation request;
    acquiring a video template according to the video generation request, wherein the video template comprises a plurality of placeholders, and each of the placeholders is used to indicate at least one of text, an image, and a video;
    acquiring a plurality of materials according to the video generation request, wherein types of the plurality of materials comprise at least one of text, image, and video; and
    importing, based on the types of the materials, the plurality of materials respectively into positions of corresponding placeholders in the video template and performing rendering, to obtain a composed video.
  2. The method according to claim 1, wherein the materials comprise a first type tag, and the placeholders comprise a second type tag;
    the importing, based on the types of the materials, the plurality of materials respectively into the positions of the corresponding placeholders in the video template and performing rendering to obtain the composed video comprises:
    screening out a target material and a target placeholder whose first type tag and second type tag are consistent;
    preprocessing the target material and then importing it into the position of the target placeholder in the video template; and
    rendering image frames of the video template into which the target material has been imported, to obtain the composed video.
  3. The method according to claim 2, wherein the preprocessing the target material and then importing it into the position of the target placeholder in the video template comprises:
    if the target material comprises a text material, performing layout and texture-format conversion on the text material, and then importing it into the position of the target placeholder in the video template;
    if the target material comprises an image material, performing texture-format conversion on the image material, and then importing it into the position of the target placeholder in the video template; and
    if the target material comprises a video material, extracting image frames from the video material, performing texture-format conversion on the extracted image frames, and then importing them into the position of the target placeholder in the video template.
  4. The method according to claim 3, wherein the extracting image frames from the video material comprises:
    determining a first start timestamp and a first end timestamp of the video material in the video to be composed;
    determining a second start timestamp and a second end timestamp indicated by the placeholder;
    calculating, according to a timestamp of a current render frame of the video to be composed, the first start timestamp and the first end timestamp, and the second start timestamp and the second end timestamp, a target timestamp of an image frame to be extracted from the video material; and
    extracting the image frame from the video material according to the target timestamp.
  5. The method according to claim 4, wherein the calculating, according to the timestamp of the current render frame of the video to be composed, the first start timestamp and the first end timestamp, and the second start timestamp and the second end timestamp, the target timestamp of the image frame to be extracted from the video material comprises:
    obtaining, according to the second end timestamp and the second start timestamp, a time length indicated by the placeholder;
    obtaining a proportional time length of the target timestamp within the video material as the product of the time length indicated by the placeholder and the ratio of the difference between the timestamp of the current render frame and the first start timestamp to the difference between the first end timestamp and the first start timestamp; and
    obtaining the target timestamp according to the second start timestamp and the proportional time length of the target timestamp within the video material.
  6. The method according to claim 4 or 5, wherein the extracting the image frame from the video material according to the target timestamp comprises:
    if the time length of the video material is less than the time length indicated by the placeholder corresponding to the video material in the video template, continuing to extract image frames from the start of the video material again.
  7. The method according to any one of claims 2 to 6, wherein the rendering the image frames of the video template into which the target material has been imported comprises:
    identifying a renderer corresponding to the target placeholder in the video template; and
    rendering, according to a rendering effect of the renderer, the image frames of the video template into which the target material has been imported.
  8. The method according to any one of claims 1 to 7, wherein before the receiving the video generation request, the method further comprises:
    acquiring video template production materials, wherein the video template production materials comprise at least one of rendering materials and transition animations;
    pre-adding the plurality of placeholders; and
    producing the video template according to the video template production materials and the pre-added plurality of placeholders.
  9. A video processing apparatus, the apparatus comprising:
    a receiving module, configured to receive a video generation request;
    a first acquiring module, configured to acquire a video template according to the video generation request, wherein the video template comprises a plurality of placeholders, and each of the placeholders is used to indicate at least one of text, an image, and a video;
    a second acquiring module, configured to acquire a plurality of materials according to the video generation request, wherein types of the plurality of materials comprise at least one of text, image, and video; and
    a rendering module, configured to import, based on the types of the materials, the plurality of materials respectively into positions of corresponding placeholders in the video template and perform rendering, to obtain a composed video.
  10. An electronic device, comprising a processor and a memory, wherein
    the memory stores computer-executable instructions; and
    the processor executes the computer-executable instructions stored in the memory, causing the electronic device to perform the video processing method according to any one of claims 1 to 8.
  11. A computer-readable storage medium, wherein the computer-readable storage medium stores computer-executable instructions, and when a processor executes the computer-executable instructions, the video processing method according to any one of claims 1 to 8 is implemented.
  12. A computer program product, comprising a computer program, wherein when the computer program is executed by a processor, the video processing method according to any one of claims 1 to 8 is implemented.
  13. A computer program, wherein when the computer program is executed by a processor, the video processing method according to any one of claims 1 to 8 is implemented.
PCT/CN2022/082095 2021-04-09 2022-03-21 Video processing method, apparatus and device WO2022213801A1 (zh)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US18/551,967 US20240177374A1 (en) 2021-04-09 2022-03-21 Video processing method, apparatus and device

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202110385345.3 2021-04-09
CN202110385345.3A CN115209215B (zh) Video processing method, apparatus and device

Publications (1)

Publication Number Publication Date
WO2022213801A1 true WO2022213801A1 (zh) 2022-10-13

Family

ID=83545140

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/082095 WO2022213801A1 (zh) Video processing method, apparatus and device

Country Status (3)

Country Link
US (1) US20240177374A1 (zh)
CN (1) CN115209215B (zh)
WO (1) WO2022213801A1 (zh)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116091738A (zh) * 2023-04-07 2023-05-09 湖南快乐阳光互动娱乐传媒有限公司 Virtual AR generation method and system, electronic device, and storage medium
WO2024160128A1 (zh) * 2023-02-03 2024-08-08 北京字跳网络技术有限公司 Method and apparatus for generating video template, and electronic device

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115988255A (zh) * 2022-12-23 2023-04-18 北京字跳网络技术有限公司 Special effect generation method and apparatus, electronic device, and storage medium

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103928039A (zh) * 2014-04-15 2014-07-16 北京奇艺世纪科技有限公司 Video synthesis method and device
US20170238067A1 (en) * 2016-02-17 2017-08-17 Adobe Systems Incorporated Systems and methods for dynamic creative optimization for video advertisements
CN107770626A (zh) * 2017-11-06 2018-03-06 腾讯科技(深圳)有限公司 Video material processing method, video synthesis method, device, and storage medium
CN109168028A (zh) * 2018-11-06 2019-01-08 北京达佳互联信息技术有限公司 Video generation method and apparatus, server, and storage medium
CN110072120A (zh) * 2019-04-23 2019-07-30 上海偶视信息科技有限公司 Video generation method and apparatus, computer device, and storage medium

Family Cites Families (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9576302B2 (en) * 2007-05-31 2017-02-21 Aditall Llc. System and method for dynamic generation of video content
CN101448089B (zh) * 2007-11-26 2013-03-06 新奥特(北京)视频技术有限公司 Non-linear editing system
EP2238743A4 (en) * 2007-12-17 2011-03-30 Stein Gausereide REAL-TIME VIDEO INCLUSION SYSTEM
EP2428957B1 (en) * 2010-09-10 2018-02-21 Nero Ag Time stamp creation and evaluation in media effect template
US9277198B2 (en) * 2012-01-31 2016-03-01 Newblue, Inc. Systems and methods for media personalization using templates
CN111131727A (zh) * 2018-10-31 2020-05-08 北京国双科技有限公司 Video data processing method and device
CN109769141B (zh) * 2019-01-31 2020-07-14 北京字节跳动网络技术有限公司 Video generation method and apparatus, electronic device, and storage medium
CN110060317A (zh) * 2019-03-16 2019-07-26 平安城市建设科技(深圳)有限公司 Automatic poster configuration method, device, storage medium, and apparatus
CN110708596A (zh) * 2019-09-29 2020-01-17 北京达佳互联信息技术有限公司 Method and apparatus for generating video, electronic device, and readable storage medium
CN111222063A (zh) * 2019-11-26 2020-06-02 北京达佳互联信息技术有限公司 Rich text rendering method and apparatus, electronic device, and storage medium
CN111669623B (zh) * 2020-06-28 2023-10-13 腾讯科技(深圳)有限公司 Video special effect processing method and apparatus, and electronic device
CN111966931A (zh) * 2020-08-23 2020-11-20 云知声智能科技股份有限公司 Control rendering method and device

Also Published As

Publication number Publication date
US20240177374A1 (en) 2024-05-30
CN115209215A (zh) 2022-10-18
CN115209215B (zh) 2024-07-12


Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22783868

Country of ref document: EP

Kind code of ref document: A1

WWE Wipo information: entry into national phase

Ref document number: 18551967

Country of ref document: US

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 22783868

Country of ref document: EP

Kind code of ref document: A1
