CN113891113B - Video clip synthesis method and electronic equipment - Google Patents
Video clip synthesis method and electronic equipment
- Publication number
- CN113891113B (application CN202111152811.XA / CN202111152811A)
- Authority
- CN
- China
- Prior art keywords
- video
- synthesis
- segment
- sdk
- browser
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION; H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/23424—Processing of video elementary streams involving splicing one content stream with another content stream, e.g. for inserting or substituting an advertisement
- H04N21/23418—Processing of video elementary streams involving operations for analysing video streams, e.g. detecting features or characteristics
- H04N21/44008—Processing of video elementary streams involving operations for analysing video streams, e.g. detecting features or characteristics in the video stream
- H04N21/44016—Processing of video elementary streams involving splicing one content stream with another content stream, e.g. for substituting a video clip
- H04N21/8456—Structuring of content, e.g. decomposing content into time segments, by decomposing the content in the time domain, e.g. in time segments
- H04N21/8547—Content authoring involving timestamps for synchronizing content
Abstract
The embodiments of this application disclose a video clip synthesis method and an electronic device. The method comprises: receiving a user's material-adding and clipping operations through a video clip synthesis interface, and determining a video synthesis scheme, where the scheme specifies the content to be synthesized for each of a plurality of image frames of the video to be synthesized; determining the total duration of the video to be synthesized while executing video synthesis in response to a video synthesis request; determining a plurality of segment durations from the total duration and a target segment count, and creating a plurality of segment synthesis tasks from those durations; processing the segment synthesis tasks in parallel through multithreading; and splicing and rendering the segment synthesis results of the respective tasks to output a video synthesis result. The embodiments of this application improve the efficiency of video synthesis processing.
Description
Technical Field
The present disclosure relates to the field of video synthesis technologies, and in particular, to a video clip synthesis method and an electronic device.
Background
The rise of industries such as live streaming and short video has broadened the product promotion channels available to merchants and enterprises. For example, a single live broadcast may produce a video resource of several hours containing extensive commodity explanation, factory introduction, and so on. But live broadcasting is time-sensitive, so how to make use of existing live video resources after a broadcast ends has become a problem that platforms and merchants need to solve.
In the prior art, some systems can identify the start and end points of the portion of a live video in which a specific commodity is explained, cut out the corresponding video clip as an explanation video for that commodity object, and publish it on pages such as the commodity's detail page for consumers to view at any time.
Although live video content can be converted into commodity explanation videos in this way, the quality of the generated videos is uneven, because the live content may include relatively low-quality or invalid material. Therefore, some software developers provide video clip-and-compose tools through which users can trim the original video, remove the low-quality content, splice multiple video segments and pictures together, add materials such as subtitles and stickers, and finally synthesize a new video, which the user can then publish.
Such clip composition can improve the quality of the produced video, but because the picture content must be composed frame by frame in sequence throughout the video composition process, the composition time is no less than the duration of the video actually being produced, so video composition takes a long time. For example, to synthesize a 100-second video, after the user finishes preparing and editing the materials, generating the synthesized video will take 100 seconds or more, leaving the user waiting for a long time.
Therefore, how to improve the video synthesis processing efficiency becomes a technical problem that needs to be solved by those skilled in the art.
Disclosure of Invention
The application provides a video clip synthesis method and electronic equipment, which can improve video synthesis processing efficiency.
The application provides the following scheme:
a video clip composition method, comprising:
receiving material adding and editing operation of a user through a video editing and synthesizing interface, and determining a video synthesizing scheme, wherein the video synthesizing scheme comprises contents to be synthesized of a plurality of image frames in video to be synthesized;
determining the total duration of the video to be synthesized in the process of executing video synthesis according to the video synthesis request;
determining a plurality of segment durations according to the total duration and the target segment number, and creating a plurality of segment synthesis tasks according to the plurality of segment durations;
parallel processing is carried out on the plurality of segmentation synthesis tasks through a multithreading technology;
and splicing and rendering the segment synthesis results respectively corresponding to the segment synthesis tasks, and outputting a video synthesis result.
The video clip composing interface is generated and displayed based on browser technology, responds to the material addition and clipping operation of a user in a browser, and performs video composing in the browser.
The page code of the video clip synthesis page comprises an SDK, wherein the SDK is used for providing a video clip function and a video synthesis function for the video clip synthesis page; the SDK is common to a plurality of developers.
Wherein, the method further comprises:
after receiving the synthesis request, creating an audio node based on browser technology, wherein the audio node is used to periodically play a target sound so as to serve as the refresh mechanism that the video synthesis process relies upon.
Wherein the determining a plurality of segment durations according to the total duration and the segment number comprises:
if the total duration is not evenly divisible by the number of segments, processing the segment boundaries by rounding the segment durations so that the sum of the segment durations equals the total duration.
Wherein the clipping operation comprises: creating a plurality of material tracks, and editing the picture level, start time and end time of the added materials through the material tracks, so as to superimpose and/or splice the plurality of materials in the time and/or space dimensions.
Wherein, the method further comprises:
in response to the user's material-adding and clipping operations, providing video picture preview content, so that the position of the material content within the picture can be visually edited based on the previewed video picture.
A video clip composition method, comprising:
providing, to a plurality of developers, a browser-technology-based Software Development Kit (SDK), its Application Programming Interface (API), and a structure description protocol, wherein the SDK comprises an SDK providing a video clip function and an SDK providing a video synthesis function, so that the developers use the API and the structure description protocol to develop browser-based video clip synthesis pages and write the SDK into the page code;
in the process of displaying the video clip synthesis page to the user, responding in the browser, through the SDK of the video clip function, to the user's material-adding and clipping operations;
after receiving the video composition request, performing video composition processing in a browser through the SDK of the video composition function.
Wherein the SDK further comprises an SDK for providing a preview function;
the method further comprises the steps of:
in response to the user's material-adding and clipping operations, video preview content is provided via the SDK of the preview function for visual clipping based on the previewed video.
Wherein the performing video synthesis processing in the browser through the SDK of the video synthesis function comprises:
determining, according to the user's added materials and clipping operations, a plurality of pieces of content to be synthesized respectively corresponding to the multiple image frames of the video to be synthesized, so as to record the video to be synthesized frame by frame;
when the current image frame is recorded, the multiple to-be-synthesized contents corresponding to the current image frame are respectively converted into visual image streams, and the visual image streams are provided for a recorder unit to record the current image frame.
Wherein the performing video synthesis processing in the browser through the SDK of the video synthesis function comprises:
determining the total duration of the video to be synthesized according to the added materials and the editing operation of the user;
determining a plurality of segment durations according to the total duration and the target segment number;
creating a plurality of segment synthesis tasks according to the segment durations;
parallel processing is carried out on the plurality of segmentation synthesis tasks through a browser multithreading technology;
and splicing and rendering the segment synthesis results respectively corresponding to the segment synthesis tasks, and outputting a video synthesis result.
Wherein the API corresponding to the SDK of the video synthesis function is associated with a segment-count parameter, so that the developer can specify the target number of segments.
A video clip compositing device comprising:
a video composition scheme determining unit, configured to receive a material adding and editing operation of a user through a video editing and composing interface, and determine a video composition scheme, where the video composition scheme includes contents to be composed of a plurality of image frames in a video to be composed;
the total duration determining unit is used for determining the total duration of the video to be synthesized in the process of executing video synthesis according to the video synthesis request;
the segment synthesis task creation unit is used for determining a plurality of segment durations according to the total duration and the target segment number, and creating a plurality of segment synthesis tasks according to the plurality of segment durations;
the parallel processing unit is used for carrying out parallel processing on the plurality of segmented synthesis tasks through a multithreading technology;
and the splicing rendering unit is used for splicing and rendering the segmentation synthesis results respectively corresponding to the segmentation synthesis tasks and outputting a video synthesis result.
A computer readable storage medium having stored thereon a computer program which when executed by a processor performs the steps of the method of any of the preceding claims.
An electronic device, comprising:
one or more processors; and
A memory associated with the one or more processors, the memory for storing program instructions that, when read for execution by the one or more processors, perform the steps of the method of any of the preceding claims.
According to a specific embodiment provided by the application, the application discloses the following technical effects:
according to the embodiment of the application, in the process of executing the video synthesis task driven by the video synthesis scheme (schema), the specific video synthesis task can be divided into a plurality of segment synthesis tasks according to the total duration of the video to be synthesized and the target segment number. In this way, the multiple segmentation synthesis tasks can be processed in parallel through a multithreading technology, and then the segmentation synthesis results corresponding to the multiple segmentation synthesis tasks are spliced and rendered to output a video synthesis result. In this way, since multi-line Cheng Fenduan parallel synthesis is possible, video synthesis efficiency can be improved and time required for video synthesis can be shortened.
The specific video synthesis scheme can be generated after the user performs material-adding and clipping operations through a video clip synthesis interface. That interface can be a Web page generated and displayed based on browser technology, which responds to the user's material-adding and clipping operations directly in the browser and performs video synthesis in the browser, thereby saving the developer's service cost.
The embodiments of this application can also provide developers with a clipping-function SDK, a video-synthesis-function SDK, and a structure description protocol, all based on browser technology. In this way, when a developer builds a video clip composition interface, the clipping and composition functions are implemented by the unified SDK, so the developer can concentrate on designing the style and the front-end/back-end links of its own interface, incubating more product forms and jointly building a web video clipping ecosystem.
Of course, not all of the above-described advantages need be achieved at the same time in practicing any one of the products of the present application.
Drawings
In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, the drawings that are needed in the embodiments will be briefly described below, and it is obvious that the drawings in the following description are only some embodiments of the present application, and that other drawings can be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a schematic diagram of a system architecture provided by an embodiment of the present application;
FIG. 2 is a flow chart of a first method provided by an embodiment of the present application;
FIG. 3 is a schematic diagram of a video editing process provided by an embodiment of the present application;
FIG. 4 is a schematic diagram of a video composition process provided in an embodiment of the present application;
FIG. 5 is a schematic diagram of a parallel synthesis and a tiled rendering process for each segment provided in an embodiment of the present application;
FIG. 6 is a flow chart of a second method provided by an embodiment of the present application;
FIG. 7 is a schematic diagram of an apparatus provided by an embodiment of the present application;
fig. 8 is a schematic diagram of an electronic device provided in an embodiment of the present application.
Detailed Description
The following description of the embodiments of the present application will be made clearly and fully with reference to the accompanying drawings, in which it is evident that the embodiments described are only some, but not all, of the embodiments of the present application. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments in the present application are within the scope of the protection of the present application.
In order to facilitate understanding of the technical solution provided in the embodiments of the present application, the following first describes a flow of video synthesis processing in a simple manner.
In a video synthesis process, the user generally first imports specific materials (including video, pictures, decorative text, etc.); then, in the video synthesis tool, visual editing can be performed through a series of actions such as dragging, text can be added to the picture, and the position of specific materials in space and on the time axis can be adjusted. The video synthesis tool thus knows what the user needs to synthesize, and specifically what should appear in each frame. The user can then click an operation option such as "synthesize" to enter the actual synthesis flow. During synthesis, the picture content must be recorded frame by frame. Specifically, for the current image frame, the multiple pieces of content to be synthesized that should appear in that frame are first determined according to the user's material-adding and clipping results (possibly including a frame of some original video, plus pictures, text, picture-in-picture images, etc. superimposed on the picture); these pieces of content are then converted into a visual picture stream (for example, a Canvas stream), and the Canvas stream corresponding to each piece of content is submitted to a recorder for recording, generating one frame of the real video.
Alternatively, the video synthesis tool may provide templates for the user, which may include preset special effects, backgrounds, and so on; the user simply replaces the main video, text and other content in the template to finish editing the video to be synthesized. Clicking a synthesis option then triggers the synthesis flow, which is essentially the same as the process described above.
Because each frame of image needs to be recorded serially, the time spent in the specific video synthesizing process in the prior art is longer, at least not shorter than the total duration of the video to be synthesized. For example, if a video of 100 seconds needs to be synthesized, the time for specifically generating the synthesized video is not less than 100 seconds after the user completes the addition and editing of the material. That is, the user experience is that after completing the adding and editing operations of the material by a series of drag and the like, it takes a relatively long time to wait for the final composite video to be generated.
In view of the above, the embodiments of this application provide a multi-threaded segmented parallel synthesis scheme: after the user finishes adding and editing the materials, the total duration of the video to be synthesized is determined, and a plurality of segment synthesis tasks are then generated. For example, with a total duration of 100 seconds divided into four segments, each segment corresponds to a 25-second synthesis task: segment 1 covers seconds 1 to 25, segment 2 seconds 26 to 50, segment 3 seconds 51 to 75, segment 4 seconds 76 to 100, and so on. The segment synthesis tasks can be executed in parallel on multiple threads, and the respective results are then spliced and rendered into the final synthesized video. The time required for synthesis is then the longest time taken by any single segment synthesis task plus the time needed for splicing and rendering, which together are less than the time required for serial recording.
The above multi-threaded segmented parallel synthesis scheme can be used in existing video clip composition tools. In addition, the inventors found while implementing this application that, in the prior art, a video clip composition tool usually exists as client software, so a user who needs to compose video must first download the software locally. For a relatively large commodity information system (e.g., an e-commerce system), such a tool is a third-party tool. Therefore, if a merchant user needs to synthesize a video while using the commodity object information system, the user can only download and install the third-party tool, clip and synthesize the video with it, download the synthesized video locally, and then upload it to the back end of the specific developer of the commodity information system for publication, and so on. That is, the user has to switch back and forth between the commodity object information system and the third-party video composition tool.
Therefore, if video-composition-related services could also be provided within the commodity object information system, merchants would be better served. In that case, the multi-threaded segmented parallel synthesis scheme can be applied to a video synthesis service provided inside the commodity object information system.
The commodity object information system itself is mainly a commodity object service, covering the publication of commodity objects, transaction links, and so on. Requirements specifically related to video composition usually arise in the service links of one or more specific developers associated with the system. For example, inside the company that develops and operates the commodity object information system, different product lines may provide services for different aspects and industries; each product line may in turn include a number of functional modules, each corresponding to its own developer, for example a developer providing a commodity publication service, a developer providing an information recommendation service, and so on. In a specific implementation, a particular developer can therefore decide whether to provide a video composition service to users within its existing product link.
However, the product links of different developers may differ, and the functions related to video synthesis may appear at different nodes of different links. A developer may need to connect or merge the video synthesis functions with its own service link, or apply personalized settings to the interface so that it matches the overall tone of that developer's product, and so on. It is therefore not feasible to simply reuse one and the same video composition application or service across different developers.
One way to achieve the above is for each developer to separately develop its own video composition service and provide it to users. However, although the developers' application-layer designs (interface design, front-end link design, etc.) differ, the core content related to video composition is the same, so redundant development would occur across the different developers.
To reduce redundant development, the embodiments of this application provide a set of generic video clip protocols and a highly customizable clipping and composition kernel, which may include an SDK (Software Development Kit) dedicated to providing the video clip function and an SDK for the video composition function, and expose the interfaces of these SDKs, i.e., APIs (Application Programming Interfaces), to developers. Each developer can then build its video composition service on top of the protocol and the SDK APIs. That is, a developer need not care about the implementation logic of clipping and composition, and only needs to design page styles, front-end and back-end links, and so on; redundant development across developers is reduced, and the fluidity and reusability of data are improved.
In practical applications, however, the following problem may also exist: as described above, the commodity object information system is not necessarily dedicated to video composition, and for the developer of a specific product line or functional module within the system, even if a video composition service is provided, it may not be that developer's primary service content. Providing it may therefore occupy additional service resources and increase the developer's service cost. Some smaller departments may have even more limited service resources, or none to spare for providing a video composition service, and so on.
For the above situation, the SDK provided in the embodiments of this application can run directly on the user's terminal device, so that the user's own hardware resources are used to complete the clipping and video synthesis process. This could be achieved through client software or through browser technology; the embodiments of this application choose the latter because client software involves download and installation problems. That is, the corresponding SDK and structure description protocol may all be provided based on browser technology. Each developer can then integrate this SDK, perform encapsulation and customization at the upper application layer, and develop its own video clip synthesis pages and related front-end and back-end links. The clip synthesis page can thus exist as a Web page, and the developer can publish links to it through various channels, for example in a back-end system with relatively heavy traffic, so that users such as merchants can access the link directly and use the functions within the page. In addition, the developer can write the SDK directly into the page code, so that video clip synthesis based on the page is completed entirely in the browser without occupying server resources, avoiding any drain on the developer's service cost. In other words, a developer can provide a video clip synthesis service to its users at essentially zero cost.
Of course, the SDK and the structure description protocol provided by the browser technology are not limited to be provided to a developer in a commodity object information system, and may be opened to other developers.
In the above process of providing developers with the browser-based SDK and the related structure description protocol, the SDK may include an SDK providing the clipping function, an SDK providing the video synthesis function, and so on. Since multithreading is also supported in the browser, the SDK providing the video synthesis function can use the aforementioned multi-threaded segmented parallel synthesis scheme to shorten the time required for video synthesis.
From the perspective of system architecture, referring to FIG. 1, the embodiments of this application may provide various SDKs, APIs, and structure description protocols to multiple developers, so that the developers can participate in the development of their own video clip synthesis pages and implement personalized designs for page styles, front-end and back-end links, and so on. The clipping functions and video composition are implemented through the SDK and need not be redesigned by each developer. In addition, the SDK can be developed based on browser technology, so a developer can deliver the video clip-and-compose tool as a Web page whose link is published through various channels; a user clicks the link to enter the page and perform video clipping and synthesis there. During synthesis, the multi-threaded segmented parallel synthesis scheme can be used to improve video synthesis efficiency.
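As a rough illustration of how a developer's page code might invoke such a composition kernel, the following TypeScript sketch uses hypothetical names (ClipSDK, buildSchema, compose, segmentCount); the real API is whatever the SDK described above actually exposes.

```typescript
// Hypothetical sketch of a developer calling the clip/composition SDK from page code.
// All identifiers (ClipSDK, VideoSchema, compose, segmentCount) are illustrative, not the real API.
interface VideoSchema {
  tracks: unknown[];   // material tracks described by the structure description protocol
  duration?: number;   // total duration in seconds, derived from the tracks
}

declare const ClipSDK: {
  // builds a schema from the user's add-material / clipping operations in the page
  buildSchema(): VideoSchema;
  // runs segmented, multi-threaded composition in the browser and resolves to a video Blob
  compose(schema: VideoSchema, options: { segmentCount: number }): Promise<Blob>;
};

async function onComposeButtonClick(): Promise<void> {
  const schema = ClipSDK.buildSchema();
  // segmentCount corresponds to the "target segment number" parameter exposed by the composition API
  const video = await ClipSDK.compose(schema, { segmentCount: 4 });
  // the developer's own front-end/back-end link decides what to do with the result
  console.log('composed video size:', video.size);
}
```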
The following describes the specific technical scheme provided in the embodiment of the present application in detail.
Example 1
First, this embodiment provides a video clip synthesis method from the perspective of multi-threaded segmented parallel video synthesis. The execution subject of the method may be an independent video clip composition tool, or the SDK-based development kit described above, and it may be implemented with client technology or browser technology.
Specifically, referring to fig. 2, the method may include:
s201: receiving material adding and editing operation of a user through a video editing and synthesizing interface, and determining a video synthesizing scheme, wherein the video synthesizing scheme comprises contents to be synthesized of a plurality of image frames in video to be synthesized;
in a specific implementation, the video clip composition interface may specifically refer to an interface provided in a video composition tool that exists in a client form, or, as described above, the video clip composition interface may also be a Web page generated based on browser technology. In an alternative manner, the specific Web page may also be developed and implemented based on the SDK and the structure description protocol provided in the embodiments of the present application.
As for the clip-function SDKs, there may be various kinds, for example: SDKs for adding materials (video, pictures, text, etc.), SDKs for adding special effects such as decorative text and animations, SDKs for realizing an immersive video look (filling modes that crop edge portions, etc.), SDKs for muting video (e.g., removing the video's sound), SDKs for adding music, SDKs for trimming video (e.g., using only the 3rd-to-5th-second content of a given original video material, which can be done through the capability the SDK provides), and the like.
Each SDK may provide a specific API to enable a developer to implement a call to the corresponding SDK through the specific API in a developed page to implement a corresponding function.
The structure description protocol may include description protocols for structures such as basic materials (pictures, copy, music, video, decorative text, etc.), video special effects (transitions, filters, video filling, etc.), video trimming, splicing, and so on. The concept of material tracks is also introduced: material tracks receive materials, different tracks distinguish picture levels, and the time order is expressed within a single track. Superposition and splice-rendering of the various basic materials in time and space can therefore be supported, as can additional effects such as complex transitions, filters and filling modes, enabling functional video editing. Specifically, multiple tracks and multiple materials per track are supported, and effect linkage among single or multiple materials is supported.
When a specific developer develops the video clip composition page, the above-mentioned structure description protocol can be used to describe various structures, and a specific API can be used to call the SDK of the corresponding function. In addition, the specific clipping function SDK and the video composition function SDK may also be directly written in page code or the like to realize responding to the material addition and clipping operation of the user in the browser and performing video composition in the browser.
Once development of a video clip synthesis page is complete, a user can access the page and perform video clip synthesis operations. For example, as shown in FIG. 3, when editing, the user may first upload pictures, copy, video or other materials to a media resource library, or use a media resource library, special-effects library, etc. provided by the system. A material track can then be created, and a material selected and dragged onto it; correspondingly, the SDK creates a control bar for that material, and by dragging the control bar along the track and scaling its length, the user determines the point in time at which the material appears on the corresponding picture level and its duration. When the next material is added, if it should be shown at the same picture level as the previous one, it can be dragged onto the previously created material track and its position and length on the track adjusted; alternatively, a new material track can be created (different material tracks share the same time axis), so that the next material is presented at another picture level and may overlap the previous material in time, and so on. In this way multiple picture levels can be designed through the material tracks, and within the same picture level the appearance times of different materials can be arranged. Multiple materials can thus be superimposed in both the time and space dimensions, improving the presentation of the video.
In addition, transition effects between different video materials within the same picture level can be realized through a material track. For example, if two video segments need to be played continuously in a certain picture level, the first and second segments may partially overlap on the time axis, so that when the first segment is about to end the second begins to play, achieving a transition effect, and so on.
It should also be noted that, in a specific implementation, an SDK providing a preview function may be provided, so that while the user creates material tracks and adds materials to them, a visual preview can be shown (at this point only the content to be synthesized is converted into a visual image stream so it can be played continuously; a real video has not yet been generated), presenting the effect of each material in the current design state. During preview, playback can be paused at any time, the content in the picture can be dragged, its position within its picture level changed, and so on. That is, the material track can only determine which contents appear in the same picture level and the start and end times of each, not where each piece of content is placed within the picture. The display position of specific content within the picture can therefore be adjusted through the visual preview picture.
In summary, since multiple material tracks can be created and the picture level, start time and end time of the added materials can be edited through those tracks, the materials can be superimposed and/or spliced in the time and/or space dimensions. Each added material can therefore carry information about its picture level, start time, end time, position in the picture, and so on. If the material itself is time-based (for example, a video material), its time information comprises two parts: the start time and playback duration on the time axis of its track, which determine in which period of the final video it is played; and which period of the original video is played. For example, a source video may have 10 seconds of content, of which the 3rd-to-5th-second content needs to be played during the 10th-to-12th-second period of the video to be synthesized. Both "play during seconds 10–12 of the video to be synthesized" and "use seconds 3–5 of this source video" can be recorded in the description information of that video. During synthesis, the content to be synthesized in each frame can be determined from this information.
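To make this description information concrete, the following TypeScript sketch shows one possible shape for tracks and clips under the structure description protocol; all field names are assumptions for illustration and are not taken from the actual protocol.

```typescript
// Illustrative shape of a track-based composition schema; field names are assumptions,
// not the actual structure description protocol.
interface MaterialClip {
  type: 'video' | 'image' | 'text' | 'audio';
  source?: string;        // URL of the material in the media resource library
  trackStart: number;     // when the clip starts on the shared timeline (seconds)
  trackEnd: number;       // when the clip ends on the shared timeline (seconds)
  sourceStart?: number;   // for video/audio: in-point inside the original material
  sourceEnd?: number;     // for video/audio: out-point inside the original material
  position?: { x: number; y: number; width: number; height: number }; // placement in the picture
}

interface MaterialTrack {
  level: number;          // picture level; higher levels are drawn over lower ones
  clips: MaterialClip[];  // clips on one track share the same level and timeline
}

// Example from the text: play seconds 3–5 of a source video during seconds 10–12 of the output.
const exampleTrack: MaterialTrack = {
  level: 0,
  clips: [{
    type: 'video',
    source: 'https://example.com/material.mp4', // placeholder URL
    trackStart: 10, trackEnd: 12,
    sourceStart: 3, sourceEnd: 5,
  }],
};
```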
The display style and position of the material tracks in the page, the style of the operation controls, and so on can all be customized by each developer according to its own needs; the capabilities provided by the SDKs of the embodiments of this application are used only when responding to user operations such as selecting, dragging and clipping.
S202: determining the total duration of the video to be synthesized in the process of executing video synthesis according to the video synthesis request;
after the addition of various materials and the design of the time and space where the materials are located, it is equivalent to the knowledge of the compositing tool how each frame of the video that the user wants to compose is presented, i.e. by which specific content each frame needs to compose together, etc. Then, the specific content to be synthesized can be converted into a visual picture stream frame by frame, and picture recording is carried out through a recorder.
In the embodiments of this application, in order to improve synthesis efficiency, multi-threaded segmented parallel synthesis can be performed. To this end, after the user finishes adding and arranging the materials, the total duration of the video to be synthesized is determined first. Since the material tracks used in the spatial and temporal design all correspond to the same time axis, the total duration can be determined from the length of the materials added on each track: because the time axes of all tracks start at the same point, the length of the longest track can be taken as the total duration of the video to be synthesized.
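A minimal sketch of this total-duration rule, assuming a track/clip structure like the one above, could look as follows.

```typescript
// Sketch: the total duration of the video to be synthesized is the end time of the
// longest material track, since all tracks share the same timeline origin.
interface TrackLike {
  clips: { trackEnd: number }[]; // end time (seconds) of each clip on the shared timeline
}

function totalDuration(tracks: TrackLike[]): number {
  let total = 0;
  for (const track of tracks) {
    for (const clip of track.clips) {
      total = Math.max(total, clip.trackEnd);
    }
  }
  return total;
}
```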
S203: determining a plurality of segment durations according to the total duration and the target segment number, and creating a plurality of segment synthesis tasks according to the plurality of segment durations;
after the total duration of the video to be synthesized is determined, a plurality of segment durations can be determined according to the target segment number, and then a plurality of segment synthesis tasks can be created according to the segment durations. Specifically, the number of target segments may be fixed, or may be set by a developer according to actual situations. Of course, when setting is performed by the developer, the maximum number of segments supported may be set, and the developer may set the number of segments within an appropriate range. In a specific implementation, the number of segments may be set in a specific page code, the number of segments may be used as a parameter, call to a specific composite SDK, and so on.
There are several ways to determine the segment durations from the total duration and the target number of segments. If the total duration is evenly divisible by the target number of segments, the division can be performed directly so that all segment durations are equal. For example, if the total duration of the video to be synthesized is 100 seconds and the target number of segments is 4, each of the four segments lasts 25 seconds.
If the total duration is not evenly divisible by the target number of segments, frame loss or similar problems may occur at the boundaries between segments. To avoid this, the segment boundaries can be handled by rounding the segment durations so that their sum equals the total duration. For example, if the total duration is 97 seconds and the target number of segments is 3, dividing 97 by 3 directly yields a repeating decimal, 32.333…. In this case the three segment durations can be set to 32, 32 and 33 seconds respectively, so that they add up exactly to the total duration of the video to be synthesized, and so on.
After the segment durations are determined, the video synthesis task can be split into a plurality of segment synthesis tasks. For example, with the three segment durations of 32, 32 and 33 seconds above, three segment synthesis tasks can be generated: task 1 synthesizes the segment from second 1 to second 32, task 2 the segment from second 33 to second 64, and task 3 the segment from second 65 to second 97, and so on.
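The following TypeScript sketch reproduces this segmentation logic (integer-second segments whose durations sum to the total), matching the 97-second to 32/32/33 example; the exact rounding strategy used by the embodiments may differ.

```typescript
// Sketch of splitting the total duration into `segmentCount` integer-second segments
// whose durations sum exactly to the total, matching the 97 s -> 32/32/33 example.
interface SegmentTask {
  index: number;
  startSecond: number; // inclusive, 0-based offset into the video to be synthesized
  endSecond: number;   // exclusive
}

function createSegmentTasks(totalSeconds: number, segmentCount: number): SegmentTask[] {
  const base = Math.floor(totalSeconds / segmentCount);
  const remainder = totalSeconds - base * segmentCount;
  const tasks: SegmentTask[] = [];
  let cursor = 0;
  for (let i = 0; i < segmentCount; i++) {
    // distribute the leftover seconds to the last `remainder` segments
    const length = base + (i >= segmentCount - remainder ? 1 : 0);
    tasks.push({ index: i, startSecond: cursor, endSecond: cursor + length });
    cursor += length;
  }
  return tasks;
}

// createSegmentTasks(97, 3) -> [0,32), [32,64), [64,97): durations 32, 32, 33
```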
S204: parallel processing is carried out on the plurality of segmentation synthesis tasks through a multithreading technology;
after determining the plurality of segment synthesis tasks, the plurality of segment synthesis tasks may be processed in parallel by a multi-threading technique. That is, multiple threads may be created, each for processing one of the segment composite tasks, and the multiple threads may be processed in parallel. For example, in a browser-technology-based implementation, multiple threads may be created for executing multiple segmentation composition tasks in parallel, in a web-worker-based fashion. The web-worker specifically opens a sub-thread based on the single-thread execution of the Javascript, so as to be used for processing the program without affecting the execution of the main thread, and returns to the main thread after the sub-thread is executed, so that the execution process of the main thread is not affected in the process.
When each thread processes a specific segmentation synthesis task, the content to be synthesized in each image frame can be determined respectively, then converted into a visual image stream, and then delivered to a recorder unit for recording, so that a specific video image frame is generated.
In a specific implementation, the video synthesis task processing can be realized through the video-synthesis-function SDK provided by the embodiments of this application. In one implementation, the SDK implements a schema-driven video player based on Canvas (the schema being the video design scheme generated after the user adds materials and performs clipping operations) and redraws the Canvas at each moment. Since each material carries time attributes such as start time, end time and playback start after the user finishes the material-adding and clipping operations, the player can decide from those attributes whether a material should be drawn. As shown in FIG. 4, for each material whose time range covers the current moment, the material's image resource is obtained, converted into a visual image stream, and drawn on the canvas. During drawing, the drawing order is determined by the track-level relationship in the schema: low tracks are drawn first and high tracks afterwards, so material on a high track naturally covers material on a low track on the canvas.
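A minimal sketch of such a schema-driven redraw, drawing the materials that hit the current moment in ascending track-level order, might look like this (data shapes are illustrative).

```typescript
// Sketch of the schema-driven redraw at a given moment: materials whose time range
// covers `timeSec` are drawn in ascending track-level order, so higher tracks cover lower ones.
interface DrawableClip {
  level: number;
  trackStart: number;
  trackEnd: number;
  image: CanvasImageSource;                      // already-decoded frame / picture / rendered text
  rect: { x: number; y: number; w: number; h: number };
}

function drawFrame(ctx: CanvasRenderingContext2D, clips: DrawableClip[], timeSec: number): void {
  ctx.clearRect(0, 0, ctx.canvas.width, ctx.canvas.height);
  const visible = clips
    .filter((c) => timeSec >= c.trackStart && timeSec < c.trackEnd) // hits the playing range
    .sort((a, b) => a.level - b.level);                             // low track first, high track last
  for (const c of visible) {
    ctx.drawImage(c.image, c.rect.x, c.rect.y, c.rect.w, c.rect.h);
  }
}
```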
In a specific implementation, the Canvas animation can be output dynamically as an image stream frame by frame and recorded into a real video based on the MediaRecorder recording capability, so that what is obtained (the real video) is what was seen (the previewed picture). In addition, FFmpeg transcoding capabilities can be used to produce a standardized video (e.g., in MP4 format).
MediaRecorder is an API for audio and video recording; each platform currently has its own implementation, and a MediaRecorder instance can be initialized directly in a modern browser. The core input of MediaRecorder is a Stream, so by combining it with captureStream, the Canvas animation can be recorded into a segment of real video. This is the concrete way the schema is converted into a real video in the browser.
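A small sketch of this captureStream + MediaRecorder recording path is shown below; the MIME type, frame rate and stop condition are illustrative choices rather than values prescribed by the embodiments.

```typescript
// Sketch of recording a Canvas animation into a real video with captureStream + MediaRecorder.
// The MIME type and frame rate are illustrative choices, not values mandated by the text.
function recordCanvas(canvas: HTMLCanvasElement, durationMs: number): Promise<Blob> {
  return new Promise((resolve) => {
    const stream = canvas.captureStream(30);                 // 30 fps canvas stream
    const recorder = new MediaRecorder(stream, { mimeType: 'video/webm' });
    const chunks: BlobPart[] = [];
    recorder.ondataavailable = (e) => { if (e.data.size > 0) chunks.push(e.data); };
    recorder.onstop = () => resolve(new Blob(chunks, { type: 'video/webm' }));
    recorder.start();
    setTimeout(() => recorder.stop(), durationMs);           // stop once the segment is finished
  });
}
```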
FFmpeg is an open-source suite of multimedia processing tools that can record and convert digital audio and video and turn them into streams. FFmpeg is very powerful, providing video capture, video format conversion, video screenshots, video watermarking, and so on. In the embodiments of this application it is mainly used for audio/video demultiplexing, transcoding and video synthesis.
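Assuming the in-browser ffmpeg.wasm build of FFmpeg (the @ffmpeg/ffmpeg package, 0.x API) is used for transcoding, a sketch could look as follows; other FFmpeg integrations are equally possible.

```typescript
// Sketch: transcoding the recorded WebM into a standardized MP4 in the browser.
// Assumes the ffmpeg.wasm build of FFmpeg (@ffmpeg/ffmpeg, 0.x API); other integrations are possible.
import { createFFmpeg, fetchFile } from '@ffmpeg/ffmpeg';

async function transcodeToMp4(webm: Blob): Promise<Blob> {
  const ffmpeg = createFFmpeg({ log: false });
  await ffmpeg.load();
  ffmpeg.FS('writeFile', 'input.webm', await fetchFile(webm));
  await ffmpeg.run('-i', 'input.webm', '-c:v', 'libx264', '-pix_fmt', 'yuv420p', 'output.mp4');
  const data = ffmpeg.FS('readFile', 'output.mp4');
  return new Blob([data.buffer], { type: 'video/mp4' });
}
```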
It should be noted that because video synthesis proceeds frame by frame, a refresh mechanism is needed to determine when to update the picture; that is, the update of the next frame can be triggered by a periodically occurring event. After the content to be synthesized for one image frame has been converted into a visual image stream and handed to the recorder to generate that video frame, the system must wait for the next period's trigger event before generating the next frame, so that the picture is updated.
When video synthesis is based on browser technology, one implementation is to rely directly on the browser's own refresh mechanism to update the picture: the browser's refresh events are monitored, and the frame of the composite video is updated when the next refresh event is observed. However, this approach has a problem: the browser's refresh mechanism is frozen once the user switches to another tab or minimizes the browser window, so it can no longer drive the video composition process. As a result, while composition runs in the browser, the user would have to stay on the current page, unable to switch tabs or minimize the browser.
To avoid this, the embodiment of the present application also provides an improved way. Instead of relying on the browser's own refresh mechanism, an audio node can be created based on browser technology and a target sound played periodically through that node; in a preferred embodiment the volume gain of the target sound can be set to 0 so as not to disturb the user. In this way, the audio node serves as the refresh mechanism relied upon during video composition: the event of the target sound being played periodically is listened for, and the picture refresh operation is performed each time the target sound plays. Because the audio node is not frozen by page switching or browser minimization, and keeps playing the target sound periodically as long as it is not actively stopped in the program and the user does not shut the browser down, the user can freely switch pages or minimize the browser while the video is being synthesized, which improves the user experience.
In a specific implementation, the Web Audio API is used to open an audio node, and a node with volume 0 is created to simulate a hardware timer, so that the period of every pixel operation is controlled accurately and is unaffected by the browser's inactive state. The Web Audio API is powerful: it can usually be combined with hardware such as a microphone to capture real audio in real time and process the corresponding audio nodes, including operations such as audio effects and clipping. The main reason for choosing this API in the embodiments of the present application is that it allows time to be controlled very precisely with little delay, so the developer can keep accurate control over timing.
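A minimal sketch of such an audio-driven timer is shown below; the frame interval and the drawNextFrame callback are hypothetical, and it assumes the AudioContext has already been allowed to run (e.g., resumed after a user gesture).

```typescript
const audioCtx = new AudioContext();

function scheduleTick(frameIntervalMs: number, onTick: () => void): void {
  // A silent buffer whose duration equals one frame interval.
  const length = Math.max(1, Math.round((audioCtx.sampleRate * frameIntervalMs) / 1000));
  const silence = audioCtx.createBuffer(1, length, audioCtx.sampleRate);

  const source = audioCtx.createBufferSource();
  source.buffer = silence;

  // Gain of 0 so the periodically played "target sound" is inaudible.
  const gain = audioCtx.createGain();
  gain.gain.value = 0;
  source.connect(gain).connect(audioCtx.destination);

  // When the silent clip ends, refresh the picture and schedule the next tick.
  // Unlike requestAnimationFrame, this keeps firing in background tabs.
  source.onended = () => {
    onTick();
    scheduleTick(frameIntervalMs, onTick);
  };
  source.start();
}

// Example: drive the frame-by-frame composition at roughly 30 fps.
// scheduleTick(1000 / 30, drawNextFrame);
```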
In addition, for the extreme case where a video plays for about 10 seconds in the inactive state and its picture then freezes (with no hook exposed to the developer), the video can be reloaded periodically and synchronized to the latest playback progress recorded before, and the recorder is paused until the video loads successfully and its first frame is obtained.
It should be noted that the multiple segment synthesis tasks executed in parallel above mainly synthesize the video picture content. In a specific implementation, audio content may also exist in the finally generated video; since the audio content is handled as whole segments, it can be processed as a whole and does not need frame-by-frame rendering and synthesis. Therefore, the audio part can be extracted from the video synthesis scheme designed by the user and recorded separately. Specifically, it can be detected whether audio has been added to the video description structure or whether particular video materials carry sound; if so, the audio can be separated out, and overlay recording is performed according to the level, order and the like of the separated audio to obtain an audio recording result, as sketched below. This recording process may also run in parallel with the segment synthesis tasks.
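As an illustration, the separated audio clips could be overlaid in a single pass with an OfflineAudioContext as sketched below; the AudioClip structure, its fields, and the sample rate are hypothetical assumptions, not the prescribed recording procedure.

```typescript
// Hypothetical description of one separated audio clip.
interface AudioClip {
  data: ArrayBuffer; // audio bytes extracted from the material
  startTime: number; // seconds, offset within the final video
  volume: number;    // per-clip gain used for the overlay mix
}

async function mixAudio(clips: AudioClip[], totalSeconds: number): Promise<AudioBuffer> {
  const sampleRate = 44100;
  const offline = new OfflineAudioContext(2, Math.ceil(totalSeconds * sampleRate), sampleRate);

  for (const clip of clips) {
    const buffer = await offline.decodeAudioData(clip.data.slice(0));
    const source = offline.createBufferSource();
    source.buffer = buffer;

    const gain = offline.createGain();
    gain.gain.value = clip.volume;
    source.connect(gain).connect(offline.destination);

    // Overlay each clip at its own offset within the timeline.
    source.start(clip.startTime);
  }

  // Render the whole mix offline; this is independent of the frame-by-frame
  // picture synthesis and can proceed in parallel with it.
  return offline.startRendering();
}
```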
S205: splicing and rendering the segment synthesis results respectively corresponding to the segment synthesis tasks, and outputting a video synthesis result.
After the multiple segment synthesis tasks are completed, as shown in fig. 5, the segment synthesis results corresponding to them may be spliced and rendered; if an audio recording result was recorded separately, it may also be merged in, so as to generate the final video synthesis result.
In summary, according to the embodiment of the present application, during execution of a video synthesis task driven by a video synthesis scheme (schema), the video synthesis task may be divided into a plurality of segment synthesis tasks according to the total duration of the video to be synthesized and the target segment number. These segment synthesis tasks can then be processed in parallel through a multithreading technology, and the segment synthesis results corresponding to the segment synthesis tasks are spliced and rendered to output a video synthesis result. Because multi-threaded segmented parallel synthesis is possible in this way, video synthesis efficiency can be improved and the time required for video synthesis shortened.
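For illustration, a minimal sketch of the segmentation and parallel dispatch might look as follows; the segment-worker.js script, its message format, and the handling of the rounding remainder are assumptions rather than the prescribed implementation.

```typescript
// Split the total duration into segment durations whose sum equals the total.
function splitSegments(totalMs: number, segmentCount: number): number[] {
  const base = Math.round(totalMs / segmentCount);
  const durations = new Array<number>(segmentCount).fill(base);
  // Absorb the rounding remainder into the last segment so the sum stays exact.
  durations[segmentCount - 1] = totalMs - base * (segmentCount - 1);
  return durations;
}

// Dispatch each segment synthesis task to its own Web Worker.
function runSegmentTasks(totalMs: number, segmentCount: number): Promise<Blob[]> {
  let offset = 0;
  const tasks = splitSegments(totalMs, segmentCount).map((duration) => {
    const start = offset;
    offset += duration;
    return new Promise<Blob>((resolve) => {
      const worker = new Worker('segment-worker.js'); // hypothetical worker script
      worker.onmessage = (e: MessageEvent<Blob>) => {
        resolve(e.data); // the segment's synthesized video data
        worker.terminate();
      };
      worker.postMessage({ start, duration });
    });
  });
  // The segment results are later spliced in order into the final video.
  return Promise.all(tasks);
}
```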
The specific video composition scheme can be generated by a user through a video clip composition interface after material addition and clipping operations. This interface can be a Web page generated and displayed based on browser technology, which responds to the user's material addition and clipping operations directly in the browser and performs the video composition in the browser, thereby saving service costs for the developer.
The embodiment of the application can also provide developers with a clipping function SDK, a video synthesis function SDK and a structure description protocol based on browser technology. In this way, when a developer builds a video clip composition interface, the clipping function, the composition function and the like can all be implemented through the unified SDKs, so that the developer can concentrate on designing the style, the front-end and back-end links and the like of the specific interface, incubating more product forms and jointly building the web video clipping ecosystem.
Embodiment two
The second embodiment provides a video clip composition method, described mainly from the perspective of the capability provider (i.e., the provider of the specific SDK, structure description protocol, etc.); referring to fig. 6, the method may include:
S601: providing a Software Development Kit (SDK) based on browser technology and an Application Programming Interface (API) thereof and a structure description protocol for a plurality of developers, wherein the SDK comprises an SDK for providing a video clip function and an SDK for providing a video synthesis function, so that the developers develop a video clip synthesis page based on browser by using the API and the structure description protocol and write the SDK into page codes;
S602: responding to the added materials and editing operation of the user in a browser through the SDK of the video editing function in the process of displaying the video editing synthetic page to the user;
S603: after receiving the video composition request, performing video composition processing in a browser through the SDK of the video composition function.
Specifically, the SDK may further include an SDK for providing a preview function; in that case, in response to the user's material addition and clipping operations, video preview content may be provided through the preview-function SDK so that visual clipping can be performed based on the previewed video picture.
Specifically, when video synthesis processing is performed in the browser, multiple pieces of content to be synthesized corresponding to multiple frame images in the video to be synthesized respectively can be determined according to the added materials and clipping operation of the user so as to record the video to be synthesized frame by frame; when the current image frame is recorded, the multiple to-be-synthesized contents corresponding to the current image frame are respectively converted into visual image streams, and the visual image streams are provided for a recorder unit to record the current image frame.
In addition, in a specific implementation, when the specific SDK provides the video synthesis capability, the multi-threaded segmented parallel synthesis scheme of Embodiment one can also be adopted to improve synthesis efficiency. For example, first, the total duration of the video to be synthesized may be determined according to the user's added material and clipping operations; then, a plurality of segment durations are determined according to the total duration and the target segment number, and a plurality of segment synthesis tasks are created according to the segment durations; the segment synthesis tasks can then be processed in parallel through the browser's multithreading technology; finally, the segment synthesis results corresponding to the segment synthesis tasks are spliced and rendered, and a video synthesis result is output.
Specifically, the API corresponding to the SDK of the video composition function may further carry a segment-count parameter, so that the developer can specify the target segment number according to actual requirements.
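As a purely hypothetical illustration, such an API might be shaped like the sketch below; the names composeVideo and ComposeOptions are assumptions for this sketch and not the actual SDK interface.

```typescript
// Hypothetical options accepted by the composition SDK's API.
interface ComposeOptions {
  schema: object;        // the video design scheme produced by the clip SDK
  segmentCount?: number; // target number of parallel segment synthesis tasks
}

// Hypothetical entry point exposed by the video synthesis function SDK.
declare function composeVideo(options: ComposeOptions): Promise<Blob>;

// Usage: a developer who wants more parallelism can raise segmentCount.
// composeVideo({ schema: mySchema, segmentCount: 4 }).then((mp4) => { /* download */ });
```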
For the undescribed portions of the second embodiment, reference may be made to the description of the first embodiment, and the description is omitted here.
It should be noted that the embodiments of the present application may involve the use of user data. In practical applications, user-specific personal data may be used in the schemes described herein within the scope allowed by the applicable laws and regulations of the relevant country and in compliance with them (for example, with the user's explicit consent after the user has actually been notified, etc.).
Corresponding to the first embodiment, the embodiment of the present application further provides a video clip synthesizing device, referring to fig. 7, which specifically may include:
a video composition scheme determining unit 701, configured to receive a material adding and clipping operation of a user through a video clipping and composing interface, and determine a video composition scheme, where the video composition scheme includes contents to be composed of a plurality of image frames in a video to be composed;
a total duration determining unit 702, configured to determine a total duration of a video to be synthesized in a process of performing video synthesis according to a video synthesis request;
a segment synthesis task creation unit 703, configured to determine a plurality of segment durations according to the total duration and the target segment number, and create a plurality of segment synthesis tasks according to the plurality of segment durations;
a parallel processing unit 704, configured to perform parallel processing on the multiple segment synthesis tasks through a multithreading technology;
and the splicing rendering unit 705 is configured to output a video synthesis result by performing splicing rendering on the segment synthesis results respectively corresponding to the plurality of segment synthesis tasks.
The video clip composing interface is generated and displayed based on browser technology, responds to the material addition and clipping operation of a user in a browser, and performs video composing in the browser.
The page code of the video clip synthesis page comprises an SDK, wherein the SDK is used for providing a video clip function and a video synthesis function for the video clip synthesis page; the SDK is common to a plurality of developers.
Specifically, the device may further include:
and the audio node creation unit is used for creating an audio node based on browser technology after the synthesis request is received, wherein the audio node is used for periodically playing the target sound to serve as the refresh mechanism relied upon in the video synthesis process.
The segment synthesis task creation unit may specifically be configured to:
if the total duration is not evenly divisible by the number of segments, process the segment boundary by rounding the segment durations so that the sum of the segment durations is equal to the total duration.
Specifically, the clipping operation includes: and creating a plurality of material tracks, and editing the picture level, the starting time and the ending time of the added materials through the material tracks so as to perform superposition and/or splicing operation on the plurality of materials in time and/or space dimensions.
In addition, the apparatus may further include:
and the preview unit is used for providing video picture preview content in the process of responding to the user's material addition and clipping operations, so that visual clipping of the position of the material content in the picture can be performed based on the previewed video picture.
In addition, the embodiment of the application further provides a computer readable storage medium, on which a computer program is stored, which when executed by a processor, implements the steps of the method of any one of the foregoing method embodiments.
And an electronic device comprising:
one or more processors; and
a memory associated with the one or more processors for storing program instructions that, when read for execution by the one or more processors, perform the steps of the method of any of the preceding method embodiments.
Fig. 8 illustrates an architecture of an electronic device, which may include, inter alia, a processor 810, a video display adapter 811, a disk drive 812, an input/output interface 813, a network interface 814, and a memory 820. The processor 810, video display adapter 811, disk drive 812, input/output interface 813, network interface 814, and memory 820 may be communicatively coupled via a communication bus 830.
The processor 810 may be implemented by a general-purpose CPU (Central Processing Unit, processor), a microprocessor, an application specific integrated circuit (Application Specific Integrated Circuit, ASIC), or one or more integrated circuits, etc., for executing relevant programs to implement the technical solutions provided herein.
The Memory 820 may be implemented in the form of ROM (Read Only Memory), RAM (Random Access Memory), static storage device, dynamic storage device, or the like. The memory 820 may store an operating system 821 for controlling the operation of the electronic device 800, and a Basic Input Output System (BIOS) for controlling low-level operation of the electronic device 800. In addition, a web browser 823, a data storage management system 824, a video clip composition system 825, and the like may also be stored. The video clip composition system 825 may be an application program embodying the operations of the foregoing steps in the embodiments of the present application. In general, when implemented in software or firmware, the relevant program code is stored in memory 820 and executed by processor 810.
The input/output interface 813 is used to connect with an input/output module to realize information input and output. The input/output module may be configured as a component in a device (not shown) or may be external to the device to provide corresponding functionality. Wherein the input devices may include a keyboard, mouse, touch screen, microphone, various types of sensors, etc., and the output devices may include a display, speaker, vibrator, indicator lights, etc.
Network interface 814 is used to connect communication modules (not shown) to enable communication interactions of the present device with other devices. The communication module may implement communication through a wired manner (such as USB, network cable, etc.), or may implement communication through a wireless manner (such as mobile network, WIFI, bluetooth, etc.).
Bus 830 includes a path for transferring information between components of the device (e.g., processor 810, video display adapter 811, disk drive 812, input/output interface 813, network interface 814, and memory 820).
It is noted that although the above-described devices illustrate only the processor 810, video display adapter 811, disk drive 812, input/output interface 813, network interface 814, memory 820, bus 830, etc., the device may include other components necessary to achieve proper operation in an implementation. Furthermore, it will be understood by those skilled in the art that the above-described apparatus may include only the components necessary to implement the present application, and not all the components shown in the drawings.
From the above description of embodiments, it will be apparent to those skilled in the art that the present application may be implemented in software plus a necessary general purpose hardware platform. Based on such understanding, the technical solutions of the present application may be embodied essentially or in a part contributing to the prior art in the form of a software product, which may be stored in a storage medium, such as a ROM/RAM, a magnetic disk, an optical disk, etc., including several instructions to cause a computer device (which may be a personal computer, a server, or a network device, etc.) to perform the methods described in the embodiments or some parts of the embodiments of the present application.
In this specification, each embodiment is described in a progressive manner; identical and similar parts of the embodiments may be referred to each other, and each embodiment mainly describes its differences from the others. In particular, for the system or system embodiments, since they are substantially similar to the method embodiments, the description is relatively brief, and reference may be made to the corresponding parts of the method embodiments. The systems and system embodiments described above are merely illustrative: units described as separate components may or may not be physically separate, and components shown as units may or may not be physical units; they may be located in one place or distributed over a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment. Those of ordinary skill in the art can understand and implement this without undue burden.
The above describes the video clip synthesizing method and the electronic device provided by the present application in detail, and specific examples are applied to the description of the principles and the implementation modes of the present application, and the description of the above examples is only used for helping to understand the method and the core idea of the present application; also, as will occur to those of ordinary skill in the art, many modifications are possible in view of the teachings of the present application, both in the detailed description and the scope of its applications. In view of the foregoing, this description should not be construed as limiting the application.
Claims (7)
1. A method of video clip composition, comprising:
receiving material adding and editing operation of a user through a video editing and synthesizing interface provided by a functional module in a commodity object information system, and determining a video synthesizing scheme, wherein the video synthesizing scheme comprises contents to be synthesized of a plurality of image frames in video to be synthesized; the video clip synthesizing interface is generated by introducing a software development kit SDK and a structural body description protocol provided based on browser technology and performing customized encapsulation with an application layer of the functional module so as to respond to the material addition and clipping operation of a user in a browser and execute video synthesizing processing;
determining the total duration of the video to be synthesized in the process of executing video synthesis according to the video synthesis request;
determining a plurality of segment durations according to the total duration and the target segment number, and creating a plurality of segment synthesis tasks according to the plurality of segment durations;
parallel processing is carried out on the plurality of segmentation synthesis tasks through a multithreading technology supported by a browser, wherein in the video synthesis process, an audio node is also created based on the browser technology, and the audio node is used for periodically playing target sound to be used as a refreshing mechanism relied on in the video synthesis process; the refresh mechanism is used for: executing a picture refreshing operation in a mode of monitoring the event of periodically playing the target sound;
And splicing and rendering the segment synthesis results respectively corresponding to the segment synthesis tasks, and outputting a video synthesis result.
2. The method according to claim 1, wherein,
the determining a plurality of segment durations according to the total duration and the segment number includes:
if the total duration is not evenly divisible by the number of segments, processing the segment boundary in a way of rounding the segment durations so that the sum of the segment durations is equal to the total duration.
3. A method according to claim 1 or 2, characterized in that,
the clipping operation includes: and creating a plurality of material tracks, and editing the picture level, the starting time and the ending time of the added materials through the material tracks so as to perform superposition and/or splicing operation on the plurality of materials in time and/or space dimensions.
4. The method according to claim 1 or 2, further comprising:
in response to the user's material addition and the clipping operation, providing video picture preview content, so that visual clipping of the position of the material content in the picture is performed based on the previewed video picture.
5. A method of video clip composition, comprising:
providing a Software Development Kit (SDK) based on browser technology and an Application Programming Interface (API) thereof and a structure description protocol for a plurality of developers, wherein the SDK comprises an SDK for providing a video clip function and an SDK for providing a video synthesis function, so that the developers develop a video clip synthesis page based on browser by using the API and the structure description protocol and write the SDK into page codes;
responding to the added materials and editing operation of the user in a browser through the SDK of the video editing function in the process of displaying the video editing synthetic page to the user;
after receiving the video synthesis request, performing video synthesis processing in a browser through the SDK of the video synthesis function; in the video synthesis process, an audio node is also created based on a browser technology, and the audio node is used for periodically playing target sound to serve as a refreshing mechanism which is dependent in the video synthesis process; the refresh mechanism is used for: and executing the picture refreshing operation in a mode of monitoring the event of periodically playing the target sound.
6. A computer readable storage medium, on which a computer program is stored, characterized in that the program, when being executed by a processor, implements the steps of the method according to any one of claims 1 to 5.
7. An electronic device, comprising:
one or more processors; and
a memory associated with the one or more processors for storing program instructions that, when read for execution by the one or more processors, perform the steps of the method of any of claims 1 to 5.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111152811.XA CN113891113B (en) | 2021-09-29 | 2021-09-29 | Video clip synthesis method and electronic equipment |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111152811.XA CN113891113B (en) | 2021-09-29 | 2021-09-29 | Video clip synthesis method and electronic equipment |
Publications (2)
Publication Number | Publication Date |
---|---|
CN113891113A CN113891113A (en) | 2022-01-04 |
CN113891113B true CN113891113B (en) | 2024-03-12 |
Family
ID=79008173
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202111152811.XA Active CN113891113B (en) | 2021-09-29 | 2021-09-29 | Video clip synthesis method and electronic equipment |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113891113B (en) |
Families Citing this family (13)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114501079B (en) * | 2022-01-29 | 2024-10-01 | 京东方科技集团股份有限公司 | Method for processing multimedia data and related equipment |
CN114666514B (en) * | 2022-03-18 | 2024-02-02 | 稿定(厦门)科技有限公司 | Data processing method, device, electronic equipment and storage medium |
CN114615548B (en) * | 2022-03-29 | 2023-12-26 | 湖南国科微电子股份有限公司 | Video data processing method and device and computer equipment |
CN114827722A (en) * | 2022-04-12 | 2022-07-29 | 咪咕文化科技有限公司 | Video preview method, device, equipment and storage medium |
CN114979766B (en) * | 2022-05-11 | 2023-11-21 | 深圳市闪剪智能科技有限公司 | Audio and video synthesis method, device, equipment and storage medium |
CN115052201A (en) * | 2022-05-17 | 2022-09-13 | 阿里巴巴(中国)有限公司 | Video editing method and electronic equipment |
CN115086717A (en) * | 2022-06-01 | 2022-09-20 | 北京元意科技有限公司 | Method and system for real-time editing, rendering and synthesizing of audio and video works |
CN115278306B (en) * | 2022-06-20 | 2024-05-31 | 阿里巴巴(中国)有限公司 | Video editing method and device |
CN117671100A (en) * | 2022-08-31 | 2024-03-08 | 北京字跳网络技术有限公司 | Rendering level sequence adjustment method and device |
CN115499684A (en) * | 2022-09-14 | 2022-12-20 | 广州方硅信息技术有限公司 | Video resource exporting method and device and live network broadcasting system |
CN117998163A (en) * | 2022-11-07 | 2024-05-07 | 北京字跳网络技术有限公司 | Video editing method, device, electronic equipment and storage medium |
CN115914504A (en) * | 2022-11-25 | 2023-04-04 | 杭州当虹科技股份有限公司 | Method for realizing clipping and splitting of video file on HTML5 page |
CN115955583A (en) * | 2022-12-19 | 2023-04-11 | 北京沃东天骏信息技术有限公司 | Video synthesis method and device |
Citations (17)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CA2281270A1 (en) * | 1999-09-01 | 2001-03-01 | Blais, Stephane R. | Interactive audio internet system |
EP1143353A2 (en) * | 2000-03-09 | 2001-10-10 | Ateon Networks, Inc. | Adaptive media streaming server for playing live and streaming media content on demand through web client's browser with no additional software or plug-ins |
AU6880901A (en) * | 1997-01-29 | 2001-11-08 | Tangozebra Limited | Method of transferring media files over a communications network |
WO2007070846A2 (en) * | 2005-12-15 | 2007-06-21 | Mediaguide, Inc. | Method and apparatus for automatic detection and identification of broadcast audio or video signals |
US7240006B1 (en) * | 2000-09-27 | 2007-07-03 | International Business Machines Corporation | Explicitly registering markup based on verbal commands and exploiting audio context |
CN101098483A (en) * | 2007-07-19 | 2008-01-02 | 上海交通大学 | Video cluster transcoding system using image group structure as parallel processing element |
CN101478669A (en) * | 2008-08-29 | 2009-07-08 | 百视通网络电视技术发展有限责任公司 | Media playing control method based on browser on IPTV system |
CN104866512A (en) * | 2014-02-26 | 2015-08-26 | 腾讯科技(深圳)有限公司 | Method, device and system for extracting webpage content |
KR20160072510A (en) * | 2014-12-15 | 2016-06-23 | 조은형 | Method for reproduing contents and electronic device performing the same |
CN109040779A (en) * | 2018-07-16 | 2018-12-18 | 腾讯科技(深圳)有限公司 | Caption content generation method, device, computer equipment and storage medium |
WO2019024919A1 (en) * | 2017-08-03 | 2019-02-07 | 腾讯科技(深圳)有限公司 | Video transcoding method and apparatus, server, and readable storage medium |
CN109640168A (en) * | 2018-11-27 | 2019-04-16 | Oppo广东移动通信有限公司 | Method for processing video frequency, device, electronic equipment and computer-readable medium |
CN110737532A (en) * | 2019-10-15 | 2020-01-31 | 四川长虹电器股份有限公司 | Android television browser memory optimization method |
CN111899322A (en) * | 2020-06-29 | 2020-11-06 | 腾讯科技(深圳)有限公司 | Video processing method, animation rendering SDK, device and computer storage medium |
WO2021073315A1 (en) * | 2019-10-14 | 2021-04-22 | 北京字节跳动网络技术有限公司 | Video file generation method and device, terminal and storage medium |
WO2021098670A1 (en) * | 2019-11-18 | 2021-05-27 | 北京字节跳动网络技术有限公司 | Video generation method and apparatus, electronic device, and computer-readable medium |
CN113015005A (en) * | 2021-05-25 | 2021-06-22 | 腾讯科技(深圳)有限公司 | Video clipping method, device and equipment and computer readable storage medium |
Family Cites Families (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20120323897A1 (en) * | 2011-06-14 | 2012-12-20 | Microsoft Corporation | Query-dependent audio/video clip search result previews |
US20130067314A1 (en) * | 2011-09-10 | 2013-03-14 | Microsoft Corporation | Batch Document Formatting and Layout on Display Refresh |
US11665379B2 (en) * | 2019-11-26 | 2023-05-30 | Photo Sensitive Cinema (PSC) | Rendering image content as time-spaced frames |
- 2021-09-29 CN CN202111152811.XA patent/CN113891113B/en active Active
Patent Citations (17)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
AU6880901A (en) * | 1997-01-29 | 2001-11-08 | Tangozebra Limited | Method of transferring media files over a communications network |
CA2281270A1 (en) * | 1999-09-01 | 2001-03-01 | Blais, Stephane R. | Interactive audio internet system |
EP1143353A2 (en) * | 2000-03-09 | 2001-10-10 | Ateon Networks, Inc. | Adaptive media streaming server for playing live and streaming media content on demand through web client's browser with no additional software or plug-ins |
US7240006B1 (en) * | 2000-09-27 | 2007-07-03 | International Business Machines Corporation | Explicitly registering markup based on verbal commands and exploiting audio context |
WO2007070846A2 (en) * | 2005-12-15 | 2007-06-21 | Mediaguide, Inc. | Method and apparatus for automatic detection and identification of broadcast audio or video signals |
CN101098483A (en) * | 2007-07-19 | 2008-01-02 | 上海交通大学 | Video cluster transcoding system using image group structure as parallel processing element |
CN101478669A (en) * | 2008-08-29 | 2009-07-08 | 百视通网络电视技术发展有限责任公司 | Media playing control method based on browser on IPTV system |
CN104866512A (en) * | 2014-02-26 | 2015-08-26 | 腾讯科技(深圳)有限公司 | Method, device and system for extracting webpage content |
KR20160072510A (en) * | 2014-12-15 | 2016-06-23 | 조은형 | Method for reproduing contents and electronic device performing the same |
WO2019024919A1 (en) * | 2017-08-03 | 2019-02-07 | 腾讯科技(深圳)有限公司 | Video transcoding method and apparatus, server, and readable storage medium |
CN109040779A (en) * | 2018-07-16 | 2018-12-18 | 腾讯科技(深圳)有限公司 | Caption content generation method, device, computer equipment and storage medium |
CN109640168A (en) * | 2018-11-27 | 2019-04-16 | Oppo广东移动通信有限公司 | Method for processing video frequency, device, electronic equipment and computer-readable medium |
WO2021073315A1 (en) * | 2019-10-14 | 2021-04-22 | 北京字节跳动网络技术有限公司 | Video file generation method and device, terminal and storage medium |
CN110737532A (en) * | 2019-10-15 | 2020-01-31 | 四川长虹电器股份有限公司 | Android television browser memory optimization method |
WO2021098670A1 (en) * | 2019-11-18 | 2021-05-27 | 北京字节跳动网络技术有限公司 | Video generation method and apparatus, electronic device, and computer-readable medium |
CN111899322A (en) * | 2020-06-29 | 2020-11-06 | 腾讯科技(深圳)有限公司 | Video processing method, animation rendering SDK, device and computer storage medium |
CN113015005A (en) * | 2021-05-25 | 2021-06-22 | 腾讯科技(深圳)有限公司 | Video clipping method, device and equipment and computer readable storage medium |
Also Published As
Publication number | Publication date |
---|---|
CN113891113A (en) | 2022-01-04 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN113891113B (en) | Video clip synthesis method and electronic equipment | |
US11887630B2 (en) | Multimedia data processing method, multimedia data generation method, and related device | |
US20210358524A1 (en) | Method and device of editing a video | |
US11856271B2 (en) | Symbiotic interactive video | |
CN107770626A (en) | Processing method, image synthesizing method, device and the storage medium of video material | |
JP7446468B2 (en) | Video special effects processing methods, devices, electronic equipment and computer programs | |
US9620173B1 (en) | Automated intelligent visualization of data through text and graphics | |
CN108965397A (en) | Cloud video editing method and device, editing equipment and storage medium | |
TWI479332B (en) | Selective hardware acceleration in video playback systems | |
US20060204214A1 (en) | Picture line audio augmentation | |
SG173703A1 (en) | Method for generating gif, and system and media player thereof | |
CN113190314A (en) | Interactive content generation method and device, storage medium and electronic equipment | |
JP2005051703A (en) | Live streaming broadcasting method, live streaming broadcasting apparatus, live streaming broadcasting system, program, recording medium, broadcasting method, and broadcasting apparatus | |
KR20210083690A (en) | Animation Content Production System, Method and Computer program | |
JP2001024610A (en) | Automatic program producing device and recording medium with programs recorded therein | |
US10269388B2 (en) | Clip-specific asset configuration | |
CN112017261B (en) | Label paper generation method, apparatus, electronic device and computer readable storage medium | |
CN113301389B (en) | Comment processing method and device for generating video | |
CN117998163A (en) | Video editing method, device, electronic equipment and storage medium | |
CN116962807A (en) | Video rendering method, device, equipment and storage medium | |
JP4681685B1 (en) | Video editing system and video editing method | |
CN101882451B (en) | Device and method for generating DVD dynamic menu | |
CN115134659B (en) | Video editing and configuring method, device, browser, electronic equipment and storage medium | |
KR101263179B1 (en) | Method of setting up background image of mobile terminal using moving picture, the mobile with the Apparatus setting up background image of mobile terminal using moving picture, the system thereof and recording medium thereof | |
CN113556576B (en) | Video generation method and device |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
GR01 | Patent grant |