CN114845161A - Video generation method, device, equipment and medium based on video structure information - Google Patents

Video generation method, device, equipment and medium based on video structure information

Info

Publication number
CN114845161A
CN114845161A
Authority
CN
China
Prior art keywords
video
content
structure information
reference video
label
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210503832.XA
Other languages
Chinese (zh)
Inventor
李令斌
张悦
陈新利
魏北冬
司奇刚
Current Assignee
Beijing Sankuai Online Technology Co Ltd
Original Assignee
Beijing Sankuai Online Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Beijing Sankuai Online Technology Co Ltd filed Critical Beijing Sankuai Online Technology Co Ltd
Priority to CN202210503832.XA priority Critical patent/CN114845161A/en
Publication of CN114845161A publication Critical patent/CN114845161A/en
Pending legal-status Critical Current


Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00: Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40: Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43: Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/44: Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs
    • H04N21/44016: Processing of video elementary streams involving splicing one content stream with another content stream, e.g. for substituting a video clip
    • H04N21/47: End-user applications
    • H04N21/472: End-user interface for requesting content, additional data or services; End-user interface for interacting with content, e.g. for content reservation or setting reminders, for requesting event notification, for manipulating displayed content
    • H04N21/4722: End-user interface for requesting additional data associated with the content

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Databases & Information Systems (AREA)
  • Human Computer Interaction (AREA)
  • Television Signal Processing For Recording (AREA)

Abstract

The application provides a video generation method, apparatus, device and medium based on video structure information, relates to the technical field of video processing, and aims to generate a video having the same video structure as a reference video. The method comprises the following steps: performing structure analysis on a reference video to obtain reference video structure information, wherein the reference video structure information comprises: the content label of each video clip included in the reference video and the time period each video clip occupies in the reference video; acquiring video generation material marked with any one content label in the reference video structure information; and generating a video to be launched having the reference video structure information by using the acquired video generation material marked with any one content label.

Description

Video generation method, device, equipment and medium based on video structure information
Technical Field
The present application relates to the field of video processing technologies, and in particular, to a video generation method, apparatus, device, and medium based on video structure information.
Background
In the short-video era, videos have brought great changes to people's lives; for example, users can use video advertisements for promotion. However, most users are unable to produce videos themselves, so the related art provides video generation templates to help users produce videos.
In the related art, the video generation templates provided to users are usually entered manually; the number of manually entered templates is limited, and producing them consumes substantial human resources. Therefore, how to generate videos quickly and automatically is a technical problem that urgently needs to be solved.
Disclosure of Invention
In view of the foregoing, embodiments of the present application provide a method, an apparatus, a device, and a medium for video generation based on video structure information, so as to overcome the foregoing problems or at least partially solve the foregoing problems.
In a first aspect of the embodiments of the present application, a video generation method based on video structure information is provided, where the method includes:
performing structure analysis on a reference video to obtain reference video structure information, wherein the reference video structure information comprises: the content label of each video clip included in the reference video and the time period each video clip occupies in the reference video;
acquiring a video generation material marked with any one content label in the reference video structure information;
and generating a video to be launched with the reference video structure information by utilizing the obtained video generation material marked with any one content label.
Optionally, performing structure analysis on the reference video to obtain reference video structure information, including:
performing time sequence identification on the reference video to obtain a content tag of each video frame in the reference video;
segmenting the reference video into a plurality of video segments according to the content labels of all video frames in the reference video, wherein each video segment comprises a plurality of continuous video frames with the same content label;
and arranging the content labels of the plurality of video clips and the time periods the video clips occupy in the reference video in time order to obtain the reference video structure information.
Optionally, generating a video to be delivered with the reference video structure information by using the obtained video generation material marked with any one of the content tags, including:
acquiring a video generation material marked with any one content label in the reference video structure information from a material library;
processing the video generation material marked with the content label according to the corresponding time period of each content label in the reference video structure information to obtain a video segment to be spliced of the content label;
and splicing the video clips to be spliced of each content label in the reference video structure information according to the sequence of the time periods corresponding to the video clips to be spliced in the reference video structure information to obtain the video to be launched with the reference video structure information.
Optionally, obtaining the video generation material marked with any one of the content tags in the reference video structure information from a material library, including:
acquiring a text material and a video clip material marked with any content label in the reference video structure information from the material library;
processing the video generation material marked with the content label according to the corresponding time period of each content label in the reference video structure information to obtain the video clip to be spliced of the content label, wherein the processing comprises the following steps:
taking the acquired text material corresponding to each content label as a subtitle of the video segment material corresponding to the content label, and adding the subtitle to the video segment material corresponding to the content label;
and trimming the video clip material corresponding to each content label according to the time period corresponding to that content label in the reference video structure information, or splicing a plurality of video clip materials corresponding to the content label, to obtain the video clip to be spliced for the content label.
Optionally, the method further comprises:
acquiring an original video uploaded by a user;
carrying out time sequence identification on the original video to obtain a content label of each video frame in the original video;
according to the content labels of all video frames in the original video, segmenting the original video into a plurality of video segments with content labels, wherein each video segment with the content labels comprises a plurality of continuous video frames with the same content label;
and storing the video clips with the content tags as video clip materials to the material library.
Optionally, the method further comprises:
acquiring user behavior data generated after the video to be launched with the reference video structure information is launched;
and, under the condition that the user behavior data of the launched video having the reference video structure information exceeds a high-quality video threshold, taking the plurality of video clips to be spliced from which that video was spliced as video clip materials and storing them in the material library.
Optionally, the method further comprises:
acquiring user behavior data generated after each video from a third-party video content server is launched;
determining the video with the user behavior data exceeding a high-quality video threshold as the reference video;
generating a video to be launched with the reference video structure information by using the obtained video generation material marked with any one of the content tags, wherein the method comprises the following steps:
and when a video generation request triggered by a client of the current video content server is detected, generating a video to be launched with the reference video structure information by using the obtained video generation material marked with any one content label, wherein the current video content server is different from the third-party video content server.
In a second aspect of the embodiments of the present application, there is provided a video generating apparatus based on video structure information, the apparatus including:
the structure analysis module is used for performing structure analysis on the reference video to obtain reference video structure information, wherein the reference video structure information comprises: the content label of each video clip included in the reference video and the time period each video clip occupies in the reference video;
the material acquisition module is used for acquiring a video generation material marked with any one content label in the reference video structure information;
and the video generation module is used for generating a video to be launched with the reference video structure information by utilizing the obtained video generation material marked with any one content label.
In a third aspect of the embodiments of the present application, an electronic device is provided, which includes a memory, a processor, and a computer program stored on the memory and executable on the processor, and when the processor executes the computer program, the video generation method based on video structure information as disclosed in the embodiments of the present application is implemented.
In a fourth aspect of the embodiments of the present application, a computer-readable storage medium is provided, on which a computer program is stored, and the computer program, when executed by a processor, implements the video generation method based on video structure information as disclosed in the embodiments of the present application.
The embodiment of the application has the following advantages:
in the embodiment of the present application, structure analysis is performed on a reference video to obtain reference video structure information, where the reference video structure information comprises: the content label of each video clip included in the reference video and the time period each video clip occupies in the reference video; video generation material marked with any one content label in the reference video structure information is acquired; and a video to be launched having the reference video structure information is generated by using the acquired video generation material marked with any one content label. Therefore, the reference video structure information can be obtained from the reference video, video generation material having the same content labels as the video clips included in the reference video can be obtained according to the reference video structure information, and a video to be launched having the reference video structure information can be quickly generated by using the obtained video generation material.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings needed to be used in the description of the embodiments of the present application will be briefly introduced below, and it is obvious that the drawings in the following description are only some embodiments of the present application, and it is obvious for those skilled in the art that other drawings can be obtained according to these drawings without inventive exercise.
Fig. 1 is a flowchart illustrating steps of a video generation method based on video structure information according to an embodiment of the present application;
FIG. 2 is a schematic flow chart of creating a video generation template and generating a video in an embodiment of the present application;
fig. 3 is a schematic structural diagram of a video generation apparatus based on video structure information in an embodiment of the present application.
Detailed Description
In order to make the aforementioned objects, features and advantages of the present application more comprehensible, the present application is described in further detail with reference to the accompanying drawings and the detailed description.
Referring to fig. 1, a flowchart illustrating steps of a video generation method based on video structure information in an embodiment of the present application is shown, and as shown in fig. 1, the video generation method based on video structure information may specifically include the following steps:
step S11: performing structural analysis on a reference video to obtain reference video structural information, wherein the reference video structural information is as follows: the reference video comprises content labels of the video clips and time periods occupied by the video clips in the reference video.
The reference video may be any video, and is typically a high-quality video; whether a video serves as a reference video can be determined from user behavior data such as the number of times the video is forwarded and liked, or through manual screening.
By performing structural analysis on the reference video, a plurality of video clips contained in the reference video, a content tag of each video clip, and a time period occupied by each video clip in the reference video can be determined. The time period occupied by each video clip in the reference video comprises the occupied time length and the occupied sequence. The content tag of each video clip is determined by the content tags of a plurality of video frames constituting the video clip, and a video clip is composed of a plurality of video frames having the same content tag, so that the content tag of a video clip is identical to the content tag of any one of the video frames constituting the video clip.
For example, for a reference video with a duration of 10 seconds in which the video content of the 1st and 2nd seconds is food, the content of the 3rd and 4th seconds is the interior environment, the content of the 5th, 6th and 7th seconds is again food, and the content of the 8th, 9th and 10th seconds is other content, the obtained reference video structure information may be: (1,2)-food; (3,4)-interior environment; (5,7)-food; (8,10)-other.
It can be understood that the reference video structure information can be described in various forms, such as a fixed data format or a table. For example, the structure information of the last example can also be described as [2, food][2, interior environment][3, food][3, other], where the content in each bracket describes, in order, the duration and the content label of a video clip, and the order of the brackets describes the order of the video clips.
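The fixed data format described above can be sketched as a small data structure. This is a minimal illustration, not the patent's implementation; the class and function names, and the use of Python, are assumptions.

```python
from dataclasses import dataclass

@dataclass
class StructureEntry:
    duration: int  # seconds the clip occupies
    label: str     # content label of the clip

# The 10-second example from the text: 2 s food, 2 s interior environment,
# 3 s food, 3 s other.
reference_structure = [
    StructureEntry(2, "food"),
    StructureEntry(2, "interior environment"),
    StructureEntry(3, "food"),
    StructureEntry(3, "other"),
]

def to_time_periods(structure):
    """Expand durations into the 1-indexed (start, end) second ranges used above."""
    periods, t = [], 0
    for entry in structure:
        periods.append(((t + 1, t + entry.duration), entry.label))
        t += entry.duration
    return periods
```

Applied to `reference_structure`, this recovers the "(1,2)-food; (3,4)-interior environment; (5,7)-food; (8,10)-other" representation, showing the two formats carry the same information.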
Step S12: and acquiring the video generation material marked with any content label in the reference video structure information.
After the reference video structure information is obtained, the video generation material marked with the content label is obtained for the content label of each video clip included in the reference video structure information. Alternatively, the video generation material marked with the content tag may be acquired from a material library.
Step S13: and generating a video to be launched with the reference video structure information by utilizing the obtained video generation material marked with any one content label.
A video to be launched is generated by using the video generation material corresponding to the content label of each video clip, and the video structure information of the video to be launched is consistent with the reference video structure information. The generated video to be launched can be used for recording daily life, promoting goods or shops, making friends through video, and the like.
Following the previous example, video generation materials with the food label, the interior environment label and the other label can be obtained: the video material with the food label and a duration of two seconds is placed first, the video material with the interior environment label and a duration of two seconds is placed second, the video material with the food label and a duration of three seconds is placed third, and the video material with the other label and a duration of three seconds is placed last; the four video materials are spliced in this order to obtain the video to be launched.
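The ordering step above can be sketched as follows. This is an illustrative assumption, not the patent's implementation: the material library is a plain label-indexed dictionary, and the simplest selection policy (first matching material) stands in for whatever selection the real system would use.

```python
def assemble_timeline(structure, material_library):
    """structure: list of (duration, label) in playback order.
    material_library: dict mapping a content label to a list of clip ids.
    Returns the ordered (clip, duration) pairs to splice into the video."""
    timeline = []
    for duration, label in structure:
        candidates = material_library.get(label, [])
        if not candidates:
            raise KeyError(f"no material labelled {label!r}")
        # Simplest selection policy: take the first matching material.
        timeline.append((candidates[0], duration))
    return timeline  # splicing the clips in this order yields the video
```

For the running example, the returned timeline places a food clip first and second-to-last, the interior environment clip second, and the other clip last, matching the reference structure.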
By adopting the technical scheme of the embodiment of the application, the reference video structure information can be obtained according to the reference video, so that the video generation material with the same content label as the video clip included in the reference video is obtained according to the reference video structure information, and the video to be launched with the reference video structure information is quickly generated by using the obtained video generation material.
Optionally, on the basis of the above technical solution, the reference video may be a video originated from a third-party video content server.
In order to determine a reference video from a plurality of videos in the third-party video content server, user behavior data generated after each video is released may be obtained, and a video with the user behavior data exceeding a high-quality video threshold value is determined as the reference video. The user behavior data comprises data generated by behaviors of forwarding, praise, commenting and the like of the video by the user, and each user behavior can have different weights.
The user behavior data reflects the preference degree of the user for the video, so that the reference video with the user behavior data exceeding the high-quality video threshold value is the video which is relatively in line with the preference of the user. The video to be delivered generated according to the reference video has the same reference video structure information as the reference video, and the video to be delivered can also be relatively in line with the preference of the user.
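The weighted screening described above can be sketched as a score-and-threshold check. The patent only says that behaviors such as forwarding, liking and commenting may carry different weights; the specific weights, the score formula, and all names below are illustrative assumptions.

```python
# Assumed per-behavior weights; forwarding counts most here, arbitrarily.
BEHAVIOR_WEIGHTS = {"forward": 3.0, "comment": 2.0, "like": 1.0}

def engagement_score(behavior_counts):
    """Weighted sum of per-behavior counts for one video."""
    return sum(BEHAVIOR_WEIGHTS.get(name, 0.0) * count
               for name, count in behavior_counts.items())

def select_reference_videos(videos, threshold):
    """videos: list of (video_id, behavior_counts).
    Returns the ids whose score exceeds the high-quality video threshold."""
    return [video_id for video_id, counts in videos
            if engagement_score(counts) > threshold]
```

Videos passing the threshold become reference videos whose structure information is then extracted.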
Optionally, the reference video structure information may be stored, and when a video needs to be generated subsequently, the video is generated directly and quickly from the stored reference video structure information. A video generation template can also be created from the reference video structure information; the template records the content label of each video clip and the time period each video clip occupies, and the video can then be generated directly from the video generation template.
When a video generation request triggered by a client of a current video content server is detected, reference video structure information can be obtained, a video generation material marked with a content label in the reference video structure information is obtained, and a video to be launched with the reference video structure information is further generated. The current video content server is different from the third party video content server.
Therefore, the current video content server can learn the high-quality video in the third-party video content server, and generate the video to be released for releasing in the current video content server, so that the number of the video in the current video content server is expanded.
Optionally, on the basis of the above technical solution, structure analysis is performed on the reference video to obtain the reference video structure information as follows: time-sequence identification is performed on the reference video to obtain the content label of each video frame in the reference video; the reference video is segmented into a plurality of video clips according to the content labels of the video frames, where each video clip comprises a plurality of continuous video frames with the same content label; and the content labels of the plurality of video clips and the time periods the video clips occupy in the reference video are arranged in time order to obtain the reference video structure information.
Firstly, video frame extraction is carried out on a reference video to obtain a plurality of video frames of the reference video. And then, carrying out image identification or image classification on each video frame of the reference video to obtain a content label of each video frame, wherein the content label of the video frame reflects the content described by the video frame.
After the content label of each video frame of the reference video is obtained, a plurality of adjacent video frames with the same content label are aggregated, so that a plurality of video clips can be obtained. The content label shared by the video frames in a video clip is taken as the content label of that video clip; from the times of the starting and ending video frames of a video clip, the duration of the video clip can be obtained; and from the positions of the video clips in the reference video, the order of the video clips can be obtained.
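The aggregation step above amounts to run-length grouping of the per-frame labels. The sketch below is an assumption for illustration (one frame per second for simplicity; function name invented), not the patent's implementation.

```python
from itertools import groupby

def segment_by_label(frame_labels):
    """frame_labels: per-frame content labels in time order, one frame per second.
    Returns 1-indexed (label, start_second, end_second) triples, one per
    maximal run of consecutive frames sharing a content label."""
    segments, index = [], 0
    for label, run in groupby(frame_labels):
        n = len(list(run))
        segments.append((label, index + 1, index + n))
        index += n
    return segments
```

On the 10-frame sequence of the running example (two food frames, two interior environment frames, three food frames, three other frames) this yields exactly the (1,2), (3,4), (5,7), (8,10) segmentation described earlier.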
Optionally, the content label of each video clip of the reference video may be identified directly, or the plurality of video clips may be identified based on the video pictures and the video speech, obtaining the content label of each video clip, the time period it occupies, and the order among the plurality of video clips.
Optionally, on the basis of the above technical solution, the video generation material may be obtained from a material library. The video generation materials in the material library can be pictures and videos uploaded by the user terminal, comment information of other users and the like. For example, a picture of a shop, a picture of a dish, a video of the shop uploaded by the merchant, and comment information of the customer to the merchant.
The content labels of pictures may be obtained using an image processing model, which may be an image classification model, an image recognition model, or the like. The content labels of comment information may be obtained using a language processing model, a language representation model, or the like. The method for obtaining the labels of pictures and comment information is not limited in this application. A plurality of video frames of a video can be identified to obtain the content label of the video. In the case that the content labels of the video frames differ, the video can be segmented into a plurality of video clips, where the content labels of the video frames within each clip are the same, and each video clip is used as a video generation material.
Optionally, other pictures, copy cases and videos which do not infringe the copyright of other people and carry respective tags can be used as the video generation material. For example, world famous paintings, ancient poems, and the like may be used as the video generation material.
The video generation material with a content label acquired from the material library may be only pictures or text; even when it is video clip material, its duration may be inconsistent with the duration of the video clip with that content label in the reference video structure information. Therefore, the acquired video generation material with the content label needs to be processed according to the time period corresponding to the content label in the reference video structure information to obtain the video clip to be spliced for that content label. The duration of the video clip to be spliced for a content label is the same as the duration of the video clip with that content label in the reference video structure information. The method for generating the video clips to be spliced from the video generation material is not limited in this application.
If the obtained video generation material with the content label is a picture, a video clip to be spliced can be generated from the picture. For example, three food pictures in the video generation material may be synthesized into a video clip to be spliced with the food label; if the duration corresponding to the food label in the reference video structure information is 3 seconds, the generated video clip to be spliced is also 3 seconds. The method for generating video clips from pictures can refer to the related art and is not limited in this application.
If the obtained video generation material with the content label is a picture and comment information carrying the same label, a video clip to be spliced can be generated from the picture and the comment information. For example, if the label of a picture is food and the label of a piece of comment information is also food, a video clip to be spliced with the food label can be generated using both (for example, the comment information is added as a subtitle to the video clip generated from the picture). The method for generating a video clip from a picture and comment information may refer to the related art and is not limited in this application.
If the obtained video generation material with the content label is a video, a video clip to be spliced can be generated from the video. If the duration of the video is longer than the duration of the video clip corresponding to the content label in the reference video structure information, the video is trimmed; if the duration of the video is shorter, the video can be stretched in time, or further video generation materials with the same content label can be obtained and spliced to produce a longer video which is then trimmed, and so on.
Optionally, if the obtained video generation material with the content label is video clip material and text material, the text material with the content label can be used as a subtitle of the video clip material with the same content label and added to that video clip material. In the case that the duration of the video clip material is greater than the duration of the video clip corresponding to the content label in the reference video structure information, the video clip material is trimmed according to the corresponding time period; in the case that the duration of the video clip material is less than the duration of that video clip, a plurality of video clip materials with the content label are spliced, thereby obtaining the video clip to be spliced for the content label.
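The trim-or-splice logic for fitting materials to the required duration can be sketched as below. All names are assumptions; this models only the duration bookkeeping, not actual video editing.

```python
def fit_to_duration(materials, target_seconds):
    """materials: list of (clip_id, duration) sharing one content label.
    Consume materials in order, trimming the last one when the total would
    overshoot; fail if the pool cannot fill the target.
    Returns a list of (clip_id, seconds_used) totalling exactly target_seconds."""
    used, remaining = [], target_seconds
    for clip_id, duration in materials:
        if remaining <= 0:
            break
        take = min(duration, remaining)  # trim if the clip is too long
        used.append((clip_id, take))
        remaining -= take
    if remaining > 0:
        raise ValueError(f"materials too short by {remaining} s")
    return used
```

A single 5-second clip fitted to a 3-second slot is trimmed to 3 seconds; a 2-second and a 3-second clip fitted to a 4-second slot are spliced, with the second clip trimmed to 2 seconds.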
Optionally, other processing may be performed on the pictures, the comment information, and the video segments in the video generation material to generate the video segments to be spliced. For example, the picture, the comment information and the video clip can be used simultaneously to generate the video clip to be spliced.
After the video clips to be spliced of the content labels in the reference video structure information are obtained, the video clips to be spliced can be spliced according to the sequence of the time periods corresponding to the content labels in the reference video structure information, and the video to be delivered with the reference video structure information is obtained.
With the technical solution of this embodiment of the application, for each content tag in the reference video structure information, video generation materials carrying the same tag are used to produce a video clip to be spliced that occupies the same time period as that tag; the clips are then spliced into the video to be launched. A video with the same structure information can thus be generated quickly from the reference video structure information.
Optionally, on the basis of the above technical solution, the video clip materials in the material library may be uploaded by users. A video uploaded by a user may be an original video carrying multiple content tags; time sequence identification can be performed on it to obtain the content tag of each video frame, and the original video is then segmented, according to those frame tags, into multiple video clips with content tags, each clip comprising multiple continuous video frames sharing the same content tag.
First, frames are extracted from the original video to obtain its video frames. Then, image identification or image classification is performed on each frame to obtain its content tag. After the content tag of each frame is obtained, adjacent frames with the same content tag are aggregated, yielding multiple video clips; the tag shared by a clip's frames is taken as the content tag of that clip.
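The aggregation of adjacent same-tag frames is a run-length grouping, which can be sketched with `itertools.groupby`. Frame tags are assumed to arrive as a list in temporal order; the output format is an assumption for illustration.

```python
from itertools import groupby

def segment_by_tag(frame_tags):
    """frame_tags: per-frame content tags in temporal order. Returns a list
    of (tag, start_frame, end_frame) clips, each covering the consecutive
    frames that share one tag (end_frame exclusive)."""
    clips, start = [], 0
    for tag, run in groupby(frame_tags):
        length = len(list(run))
        clips.append((tag, start, start + length))
        start += length
    return clips
```

For instance, the tag sequence a, a, b, b, b, a produces three clips, matching the requirement that each clip contains continuous frames with the same label.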
And after obtaining a plurality of video clips with content labels, storing the video clips with the content labels as video clip materials in a material library.
Therefore, compared with the prior art, in which a video containing multiple segments is used directly as one video material, segmenting the video into multiple clips increases the amount of video material and facilitates subsequent use of the clip materials.
Optionally, on the basis of the above technical solution, after the video to be launched is generated, it can be launched, and the user behavior data generated after launching can be obtained. When the user behavior data exceeds a high-quality video threshold, the video clips to be spliced from which the launched video was assembled can be stored in the material library as video clip materials.
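The posterior quality check can be sketched as a threshold filter. The behavior metric is modeled as a single score here for illustration; the patent does not specify the metric or threshold, so all names are assumptions.

```python
def harvest_quality_segments(launched_videos, threshold, library):
    """launched_videos: list of {"score": behavior metric, "segments": [...]}.
    Segments of videos whose metric exceeds the high-quality threshold are
    appended to the material library (modeled as a list) and returned."""
    harvested = []
    for video in launched_videos:
        if video["score"] > threshold:
            harvested.extend(video["segments"])
    library.extend(harvested)
    return harvested
```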
Because these clips to be spliced are obtained by processing the original video generation materials, storing them in the material library expands the pool of video material available for generating videos.
Fig. 2 shows a flow diagram of creating a video generation template and generating a video. Structure analysis can be performed on a reference video to obtain the reference video structure information, and a video generation template is created from it; a manually entered video generation template can also be received. Video generation materials with content tags are produced from a merchant photo album, merchant videos, and user comments uploaded by the merchant: an image understanding model processes the photo album to obtain pictures with content tags, a language processing model processes the comments to obtain text copy with content tags, and time sequence identification and video segmentation are applied to the merchant videos to obtain video clips with tags. The tagged video generation materials are filled into the video generation template according to their content tags and spliced into the video to be launched. The video is then launched, and those launched videos judged high-quality by their user behavior data are segmented to obtain further tagged video clips.
Therefore, on one hand, a video generation template can be created quickly and automatically, while manually entered templates can still be received; the created template enables rapid production of the video to be launched. Segmenting merchant videos into multiple clips expands the amount of video material, and segmenting high-quality launched videos (a posterior check on the launched video) expands it further.
Optionally, as an embodiment, the video to be launched is a video used by a merchant to promote a store. A reference video that is likewise used for store promotion is pre-selected on a third-party video content server, its reference video structure information is obtained, and a target video generation template for store promotion is created from that information.
In response to a video generation request triggered by the merchant's client, the target video generation template is displayed along with other video generation templates. When the merchant selects the target template, video generation materials for the content tags it contains are obtained. These materials can come from the current platform to which the merchant's store belongs, for example pictures of the merchant's goods on the platform, comments written by users about the merchant, or materials uploaded by the merchant in advance or at present; they can also be obtained from the material library. Multiple candidate materials may be presented for the merchant to choose from, and the video to be launched for promoting the store is generated from the selected materials and the target video generation template.
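Filling a selected template with candidate materials by content tag can be sketched as below. The template is modeled as an ordered list of tags, and picking the first candidate stands in for the merchant's choice; all names and shapes are hypothetical.

```python
def fill_template(template_slots, materials):
    """template_slots: ordered list of content tags from the target template;
    materials: {tag: [candidate materials]}. Fills each slot with the first
    candidate (a stand-in for user selection) and reports tags with no
    available material."""
    filled, missing = [], []
    for tag in template_slots:
        candidates = materials.get(tag, [])
        if candidates:
            filled.append((tag, candidates[0]))
        else:
            missing.append(tag)
    return filled, missing
```

Slots left in `missing` would need additional materials, for example an upload prompt to the merchant.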
Optionally, in the process of generating the video to be launched, multiple choices can be offered so that the user can customize a satisfactory video, or the video can be generated automatically with one click.
It should be noted that, for simplicity of description, the method embodiments are described as a series or combination of acts; however, those skilled in the art will recognize that the embodiments of the present application are not limited by the order of acts described, as some steps may be performed in other orders or concurrently. Further, those skilled in the art will also appreciate that the embodiments described in the specification are preferred embodiments, and that the acts involved are not necessarily required by the embodiments of the present application.
Fig. 3 is a schematic structural diagram of a video generation apparatus based on video structure information according to an embodiment of the present application, and as shown in fig. 3, the video generation apparatus based on video structure information includes: structure analysis module, material acquisition module and video generation module, wherein:
the structure analysis module is used for performing structure analysis on the reference video to obtain reference video structure information, wherein the reference video structure information comprises: the content label of each video clip contained in the reference video and the time period each video clip occupies in the reference video;
the material acquisition module is used for acquiring a video generation material marked with any one content label in the reference video structure information;
and the video generation module is used for generating a video to be launched with the reference video structure information by utilizing the obtained video generation material marked with any one content label.
Optionally, as an embodiment, the structural analysis module includes:
the time sequence identification unit is used for carrying out time sequence identification on the reference video to obtain a content label of each video frame in the reference video;
the video segmentation unit is used for segmenting the reference video into a plurality of video segments according to the content labels of all the video frames in the reference video, and each video segment comprises a plurality of continuous video frames with the same content label;
and the segment arrangement unit is used for arranging the content labels of the video segments and the time periods the video segments occupy in the reference video in time order, to obtain the reference video structure information.
Optionally, as an embodiment, the video generating module includes:
a material acquisition unit, configured to acquire a video generation material marked with any one of the content tags in the reference video structure information from a material library;
the material processing unit is used for processing the obtained video generation material marked with the content label according to the time period corresponding to each content label in the reference video structure information to obtain a video segment to be spliced of the content label;
and the segment splicing unit is used for splicing the video segments to be spliced of each content label in the reference video structure information according to the sequence of the time periods corresponding to the video segments in the reference video structure information to obtain the video to be launched with the reference video structure information.
Optionally, as an embodiment, the material obtaining unit includes:
the material subunit is used for acquiring the text material and the video segment material marked with any content label in the reference video structure information from the material library;
the material processing unit includes:
the caption adding subunit is used for taking the acquired text material corresponding to each content label as the caption of the video segment material corresponding to the content label and adding the caption into the video segment material corresponding to the content label;
and the material processing subunit is configured to intercept, according to a time period corresponding to each content tag in the reference video structure information, a video segment material corresponding to the content tag, or splice a plurality of video segment materials corresponding to the content tag to obtain a to-be-spliced video segment of the content tag.
Optionally, as an embodiment, the method further includes:
the original video acquisition module is used for acquiring an original video uploaded by a user;
the original video identification module is used for carrying out time sequence identification on the original video to obtain content labels of all video frames in the original video;
the original video segmentation module is used for segmenting the original video into a plurality of video segments with content labels according to the content labels of all the video frames in the original video, wherein each video segment with the content labels comprises a plurality of continuous video frames with the same content label;
and the original video adding module is used for storing the video clips with the content labels as the video clip materials to the material library.
Optionally, as an embodiment, the method further includes:
the data acquisition module is used for acquiring user behavior data generated after the video to be launched with the reference video structure information is launched;
and the material adding module is used for taking a plurality of video clips to be spliced of the video to be launched obtained by splicing as the video clip materials and storing the video clip materials into the material library under the condition that the user behavior data of the video to be launched with the reference video structure information exceeds a high-quality video threshold value.
Optionally, as an embodiment, the method further includes:
the third-party acquisition module is used for acquiring user behavior data generated after each video from the third-party video content server is launched;
the reference video determining module is used for determining a video with user behavior data exceeding a high-quality video threshold as the reference video;
the video generation module includes:
and the current video generation unit is used for generating a video to be launched with the reference video structure information by using the acquired video generation material marked with any one content label when a video generation request triggered by a client of a current video content server is detected, wherein the current video content server is different from the third-party video content server.
It should be noted that the device embodiments are similar to the method embodiments, so the description here is brief; for relevant details, reference may be made to the method embodiments.
The embodiment of the present application further provides an electronic device, which includes a processor, a memory, and a computer program that is stored on the memory and can be run on the processor, and when the processor executes the program, the video generation method based on the video structure information disclosed in the embodiment of the present application is implemented.
The embodiment of the present application further provides a computer-readable storage medium, where a computer program is stored on the computer-readable storage medium, and when the computer program is executed, the video generation method based on video structure information disclosed in the embodiment of the present application is implemented.
The embodiments in the present specification are described in a progressive manner, each embodiment focuses on differences from other embodiments, and the same and similar parts among the embodiments are referred to each other.
As will be appreciated by one skilled in the art, embodiments of the present application may be provided as a method, apparatus or computer program product. Accordingly, embodiments of the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, embodiments of the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
Embodiments of the present application are described with reference to flowchart illustrations and/or block diagrams of methods, apparatus, electronic devices and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing terminal to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing terminal, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing terminal to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing terminal to cause a series of operational steps to be performed on the computer or other programmable terminal to produce a computer implemented process such that the instructions which execute on the computer or other programmable terminal provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
While preferred embodiments of the present application have been described, additional variations and modifications of these embodiments may occur to those skilled in the art once they learn of the basic inventive concepts. Therefore, it is intended that the appended claims be interpreted as including the preferred embodiment and all such alterations and modifications as fall within the true scope of the embodiments of the application.
Finally, it should also be noted that, herein, relational terms such as first and second, and the like may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or terminal that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or terminal. Without further limitation, an element defined by the phrase "comprising an … …" does not exclude the presence of other like elements in a process, method, article, or terminal that comprises the element.
The video generation method, device, equipment, and medium based on video structure information provided by the present application have been described in detail above. Specific examples are used herein to explain the principles and implementation of the application, and the description of the above embodiments is only intended to help understand the method and its core idea. Meanwhile, for a person skilled in the art, there may be variations in specific implementation and application scope according to the idea of the present application. In summary, the content of this specification should not be construed as limiting the present application.

Claims (10)

1. A video generation method based on video structure information, the method comprising:
performing structure analysis on a reference video to obtain reference video structure information, wherein the reference video structure information comprises: the content label of each video clip contained in the reference video and the time period each video clip occupies in the reference video;
acquiring a video generation material marked with any one content label in the reference video structure information;
and generating a video to be launched with the reference video structure information by utilizing the obtained video generation material marked with any one content label.
2. The method of claim 1, wherein performing a structure analysis on the reference video to obtain reference video structure information comprises:
performing time sequence identification on the reference video to obtain a content tag of each video frame in the reference video;
segmenting the reference video into a plurality of video segments according to the content labels of all video frames in the reference video, wherein each video segment comprises a plurality of continuous video frames with the same content label;
and arranging the respective content labels of the plurality of video clips and the time periods the video clips occupy in the reference video in time order, to obtain the reference video structure information.
3. The method of claim 1, wherein generating the video to be delivered with the reference video structure information using the obtained video generation material marked with any one of the content tags comprises:
acquiring a video generation material marked with any one content label in the reference video structure information from a material library;
processing the video generation material marked with the content label according to the corresponding time period of each content label in the reference video structure information to obtain a video segment to be spliced of the content label;
and splicing the video clips to be spliced of each content label in the reference video structure information according to the sequence of the time periods corresponding to the content labels in the reference video structure information to obtain the video to be launched with the reference video structure information.
4. The method of claim 3, wherein obtaining video production material tagged with any of the content tags in the reference video structure information from a material library comprises:
acquiring a text material and a video clip material marked with any content label in the reference video structure information from the material library;
processing the video generation material marked with the content label according to the corresponding time period of each content label in the reference video structure information to obtain the video clip to be spliced of the content label, wherein the processing comprises the following steps:
taking the acquired text material corresponding to each content label as a subtitle of the video segment material corresponding to the content label, and adding the subtitle to the video segment material corresponding to the content label;
and intercepting video clip materials corresponding to the content labels according to the time period corresponding to each content label in the reference video structure information, or splicing a plurality of video clip materials corresponding to the content labels to obtain the video clips to be spliced of the content labels.
5. The method of claim 4, further comprising:
acquiring an original video uploaded by a user;
carrying out time sequence identification on the original video to obtain a content label of each video frame in the original video;
according to the content labels of all video frames in the original video, segmenting the original video into a plurality of video segments with content labels, wherein each video segment with the content labels comprises a plurality of continuous video frames with the same content label;
and storing the plurality of video clips with the content tags as the video clip materials to the material library.
6. The method of claim 4, further comprising:
acquiring user behavior data generated after the video to be launched with the reference video structure information is launched;
and under the condition that the user behavior data of the video to be launched with the reference video structure information exceeds a high-quality video threshold value, taking a plurality of video segments to be spliced of the video to be launched obtained by splicing as the video segment materials, and storing the video segment materials into the material library.
7. The method of any of claims 1-6, further comprising:
acquiring user behavior data generated after each video from a third-party video content server is launched;
determining the video with the user behavior data exceeding a high-quality video threshold as the reference video;
generating a video to be launched with the reference video structure information by using the obtained video generation material marked with any one of the content tags, wherein the method comprises the following steps:
and when a video generation request triggered by a client of the current video content server is detected, generating a video to be launched with the reference video structure information by using the obtained video generation material marked with any one content label, wherein the current video content server is different from the third-party video content server.
8. An apparatus for generating a video based on video structure information, the apparatus comprising:
the structure analysis module is used for carrying out structure analysis on the reference video to obtain reference video structure information, wherein the reference video structure information is as follows: the reference video comprises a content label of each video clip and a time period occupied by each video clip in the reference video;
the material acquisition module is used for acquiring a video generation material marked with any one content label in the reference video structure information;
and the video generation module is used for generating a video to be launched with the reference video structure information by utilizing the obtained video generation material marked with any one content label.
9. An electronic device, comprising:
a processor;
a memory for storing the processor-executable instructions;
wherein the processor is configured to execute the instructions to implement the video structure information based video generation method according to any one of claims 1 to 7.
10. A computer-readable storage medium, wherein instructions, when executed by a processor of an electronic device, enable the electronic device to perform the video structure information-based video generation method of any one of claims 1 to 7.
CN202210503832.XA 2022-05-10 2022-05-10 Video generation method, device, equipment and medium based on video structure information Pending CN114845161A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210503832.XA CN114845161A (en) 2022-05-10 2022-05-10 Video generation method, device, equipment and medium based on video structure information

Publications (1)

Publication Number Publication Date
CN114845161A 2022-08-02

Family

ID=82568831

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210503832.XA Pending CN114845161A (en) 2022-05-10 2022-05-10 Video generation method, device, equipment and medium based on video structure information

Country Status (1)

Country Link
CN (1) CN114845161A (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110996121A (en) * 2019-12-11 2020-04-10 北京市商汤科技开发有限公司 Information processing method and device, electronic equipment and storage medium
CN113473182A (en) * 2021-09-06 2021-10-01 腾讯科技(深圳)有限公司 Video generation method and device, computer equipment and storage medium
CN113992942A (en) * 2019-12-05 2022-01-28 腾讯科技(深圳)有限公司 Video splicing method and device and computer storage medium

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113992942A (en) * 2019-12-05 2022-01-28 腾讯科技(深圳)有限公司 Video splicing method and device and computer storage medium
CN110996121A (en) * 2019-12-11 2020-04-10 北京市商汤科技开发有限公司 Information processing method and device, electronic equipment and storage medium
WO2021114552A1 (en) * 2019-12-11 2021-06-17 北京市商汤科技开发有限公司 Information processing method and apparatus, electronic device and storage medium
CN113473182A (en) * 2021-09-06 2021-10-01 腾讯科技(深圳)有限公司 Video generation method and device, computer equipment and storage medium

Similar Documents

Publication Publication Date Title
CN108377418B (en) Video annotation processing method and device
CN103988202A (en) Image attractiveness based indexing and searching
JP2010529566A5 (en)
US10339587B2 (en) Method, medium, and system for creating a product by applying images to materials
CN110221747B (en) Presentation method of e-book reading page, computing device and computer storage medium
CN109598171B (en) Data processing method, device and system based on two-dimensional code
CN110677735A (en) Video positioning method and device
CN113852832B (en) Video processing method, device, equipment and storage medium
US20180075066A1 (en) Method and apparatus for displaying electronic photo, and mobile device
CN107239503B (en) Video display method and device
US10163013B2 (en) Photographic image extraction apparatus, photographic image extraction method, and program
CN111683267A (en) Method, system, device and storage medium for processing media information
CN112052038B (en) Method and device for generating front-end interface
CN112287168A (en) Method and apparatus for generating video
CN107315828A (en) Data processing method, device and terminal device
CN110827058A (en) Multimedia promotion resource insertion method, equipment and computer readable medium
CN111078915B (en) Click-to-read content acquisition method in click-to-read mode and electronic equipment
CN107403460B (en) Animation generation method and device
CN109120994A (en) A kind of automatic editing method, apparatus of video file and computer-readable medium
CN113518187A (en) Video editing method and device
CN114845161A (en) Video generation method, device, equipment and medium based on video structure information
CN113763009A (en) Picture processing method, picture skipping method, device, equipment and medium
JPWO2015140922A1 (en) Information processing system, information processing method, and information processing program
CN116527994A (en) Video generation method and device and electronic equipment
WO2022089427A1 (en) Video generation method and apparatus, and electronic device and computer-readable medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination