CN113473225A - Video generation method and device, electronic equipment and storage medium - Google Patents


Info

Publication number
CN113473225A
Authority
CN
China
Prior art keywords
video
target
candidate
instruction
search
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110762497.0A
Other languages
Chinese (zh)
Inventor
谢陶欣
李治中
罗洪运
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Sensetime Technology Development Co Ltd
Original Assignee
Beijing Sensetime Technology Development Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Sensetime Technology Development Co Ltd filed Critical Beijing Sensetime Technology Development Co Ltd
Priority to CN202110762497.0A priority Critical patent/CN113473225A/en
Publication of CN113473225A publication Critical patent/CN113473225A/en
Priority to PCT/CN2022/076959 priority patent/WO2023279726A1/en

Classifications

    • H04N21/44016 — Processing of video elementary streams, involving splicing one content stream with another content stream, e.g. for substituting a video clip
    • H04N21/440245 — Processing of video elementary streams, involving reformatting operations performed only on part of the stream, e.g. a region of the image or a time segment
    • H04N21/4438 — Window management, e.g. event handling following interaction with the user interface
    • H04N21/47205 — End-user interface for manipulating displayed content, e.g. interacting with MPEG-4 objects, editing locally

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Human Computer Interaction (AREA)
  • Databases & Information Systems (AREA)
  • Software Systems (AREA)
  • Television Signal Processing For Recording (AREA)
  • User Interface Of Digital Computer (AREA)

Abstract

The disclosure relates to a video generation method and apparatus, an electronic device, and a storage medium. The method includes: in response to a behavior search instruction, displaying, in a search result page, candidate video segments matched with the behavior search instruction, where the behavior search instruction includes a first search word representing a behavior, and each candidate video segment carries a behavior tag matched with the first search word; in response to a selection instruction for at least one candidate video segment in the search result page, determining each of the at least one candidate video segment as a target video segment; and generating a target video according to the at least one target video segment.

Description

Video generation method and device, electronic equipment and storage medium
Technical Field
The present disclosure relates to the field of video technologies, and in particular, to a video generation method and apparatus, an electronic device, and a storage medium.
Background
In a conventional video editing mode, if a user wants to find video segments containing a certain behavior, the user has to browse through each video segment to locate the desired content, and then create a video by manually cutting and splicing. This approach is time-consuming and labor-intensive.
Disclosure of Invention
The present disclosure provides a video generation technical solution.
According to an aspect of the present disclosure, there is provided a video generation method including:
in response to a behavior search instruction, displaying a candidate video segment matched with the behavior search instruction in a search result page, wherein the behavior search instruction comprises a first search word for representing a behavior, and the candidate video segment comprises a behavior tag matched with the first search word;
in response to a selection instruction for at least one candidate video segment in the search result page, determining each of the at least one candidate video segment as a target video segment;
and generating a target video according to the at least one target video segment.
In this way, candidate video segments matched with the behavior search instruction are displayed in a search result page in response to the behavior search instruction, where the behavior search instruction includes a first search word representing a behavior and each candidate video segment carries a behavior tag matched with the first search word; in response to a selection instruction for at least one candidate video segment in the search result page, each selected candidate video segment is determined as a target video segment, and a target video is generated from the target video segments. Candidate video segments can therefore be retrieved quickly on the basis of behavior information and assembled into a target video, which saves the user the time of picking out video segments for synthesizing the target video and improves the efficiency of video editing.
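The tag-matching search described above can be sketched as follows. This is an illustrative sketch only: the segment metadata layout, field names, and the exact-match rule are assumptions, not the disclosed implementation.

```python
# Minimal sketch: candidates are segments whose behavior tags match
# the first search word of the behavior search instruction.
# The metadata layout and exact-match rule are assumptions.

def search_candidates(segments, first_search_word):
    """Return segments carrying a behavior tag matched with the search word."""
    return [s for s in segments if first_search_word in s["behavior_tags"]]

library = [
    {"id": "clip-1", "behavior_tags": {"dancing", "smiling"}},
    {"id": "clip-2", "behavior_tags": {"running"}},
    {"id": "clip-3", "behavior_tags": {"dancing"}},
]

print([s["id"] for s in search_candidates(library, "dancing")])
# → ['clip-1', 'clip-3']
```

A real system would likely produce the behavior tags offline with an action-recognition model and serve the lookup from an inverted index rather than a linear scan.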
In one possible implementation, the behavior search instruction further includes: a second search word representing a behavior execution subject, and/or a third search word representing a scene corresponding to the behavior execution subject.
In this implementation, when the behavior search instruction includes a second search word representing a behavior execution subject, candidate video segments of a certain execution subject (for example, a certain person or a certain animal) performing a certain behavior can be quickly retrieved by combining the behavior information with the information of the behavior execution subject, and a target video is generated from those candidate video segments, improving the user's video editing efficiency. When the behavior search instruction includes a third search word representing the scene corresponding to the behavior execution subject, candidate video segments of an execution subject performing a certain behavior in a certain scene can be quickly retrieved by combining the behavior information with the corresponding scene information. When the behavior search instruction includes both the second search word and the third search word, candidate video segments of a certain execution subject performing a certain behavior in a certain scene can be quickly retrieved by combining the behavior information, the information of the behavior execution subject, and the scene information, and a target video is generated accordingly, improving the user's video editing efficiency.
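Extending the earlier sketch to the second and third search words gives a three-way filter; the `subject` and `scene` field names are assumptions for illustration.

```python
# Hypothetical combined filter over behavior (first search word),
# behavior execution subject (second), and scene (third).

def search_candidates(segments, behavior, subject=None, scene=None):
    matches = []
    for s in segments:
        if behavior not in s["behavior_tags"]:
            continue  # first search word: behavior tag must match
        if subject is not None and s.get("subject") != subject:
            continue  # second search word: behavior execution subject
        if scene is not None and s.get("scene") != scene:
            continue  # third search word: scene of the subject
        matches.append(s)
    return matches

library = [
    {"id": "a", "behavior_tags": {"running"}, "subject": "dog", "scene": "beach"},
    {"id": "b", "behavior_tags": {"running"}, "subject": "person", "scene": "beach"},
    {"id": "c", "behavior_tags": {"running"}, "subject": "dog", "scene": "park"},
]

print([s["id"] for s in search_candidates(library, "running", subject="dog")])
# → ['a', 'c']
print([s["id"] for s in search_candidates(library, "running", subject="dog", scene="beach")])
# → ['a']
```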
In one possible implementation, the behavior includes at least one of an action, an expression, and a sound.
With this implementation, candidate video segments can be quickly retrieved based on at least one of an action, an expression, and a sound, and a target video is generated from those segments, saving the user the time of selecting video segments for synthesizing the target video and improving video editing efficiency.
In one possible implementation, the selection instruction includes a trigger instruction for a first selection control;
the determining, in response to a selection instruction for at least one candidate video segment in the search result page, each of the at least one candidate video segment as a target video segment includes:
in response to a preview instruction for any candidate video segment in the search result page, displaying a first video preview window, playing the candidate video segment through the first video preview window, and displaying the first selection control in the first video preview window;
and in response to a trigger instruction for the first selection control, determining the candidate video segment as a target video segment.
In this implementation, a first video preview window is displayed in response to a preview instruction for any candidate video segment in the search result page; the candidate video segment is played through that window, the first selection control is displayed in it, and the candidate video segment is determined as a target video segment in response to a trigger instruction for the first selection control. The user can thus preview a candidate video segment in a larger window and select it quickly and conveniently.
In one possible implementation, the selection instruction includes a trigger instruction for a second selection control;
the determining, in response to a selection instruction for at least one candidate video segment in the search result page, each of the at least one candidate video segment as a target video segment includes:
in response to detecting that the user focuses on any candidate video segment in the search result page, displaying the second selection control corresponding to that candidate video segment in the search result page;
and in response to a trigger instruction for the second selection control, determining the candidate video segment as a target video segment.
In this implementation, for any candidate video segment in the search result page, the corresponding second selection control is displayed only when the user is detected to focus on that segment; if no such focus is detected, the control is not displayed. This reduces unnecessary information on screen and improves the user's experience of selecting video segments.
In one possible implementation, the method further includes:
in response to detecting that a cursor stays on any candidate video segment in the search result page, determining that the user focuses on that candidate video segment;
and/or,
in response to detecting that the user's gaze rests on any candidate video segment in the search result page, determining that the user focuses on that candidate video segment.
According to the implementation mode, whether the user pays attention to the candidate video clip in the search result page can be accurately detected.
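The cursor-stay and gaze-stay detection above amounts to a dwell-time check, which can be sketched as follows. The 0.5 s threshold, the class name, and the enter/leave event model are all assumptions; the patent does not specify them.

```python
# Dwell-based attention sketch: a candidate segment counts as "focused"
# once the cursor (or gaze point) has stayed on it for at least a
# threshold duration. Threshold and event names are assumptions.

DWELL_THRESHOLD_S = 0.5

class AttentionDetector:
    def __init__(self, threshold=DWELL_THRESHOLD_S):
        self.threshold = threshold
        self._entered_at = {}  # segment id -> time the cursor/gaze entered

    def on_enter(self, segment_id, timestamp):
        self._entered_at[segment_id] = timestamp

    def on_leave(self, segment_id):
        self._entered_at.pop(segment_id, None)

    def is_focused(self, segment_id, now):
        entered = self._entered_at.get(segment_id)
        return entered is not None and (now - entered) >= self.threshold

det = AttentionDetector()
det.on_enter("clip-7", timestamp=10.0)
print(det.is_focused("clip-7", now=10.2))  # → False (dwell too short)
print(det.is_focused("clip-7", now=10.6))  # → True  (dwell long enough)
```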
In one possible implementation, the method further includes:
displaying a video clip control in the search results page according to the number of the target video segments, wherein icon content of the video clip control comprises the number of the target video segments.
In this implementation, a video clip control is displayed in the search result page according to the number of target video segments, and the icon content of the control includes that number. The user can therefore see how many target video segments have been selected directly in the search result page, without opening the video clip interface or another interface to confirm it, which further improves the convenience of video editing.
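One plausible reading of the figures is that the control appears once at least one segment is selected, with its badge carrying the count; the dict shape below is an assumption for illustration.

```python
# Sketch of the clip-control badge state: visibility and icon text
# derived from the number of selected target segments (assumed policy).

def clip_control_state(selected_segments):
    count = len(selected_segments)
    return {"visible": count > 0, "badge": str(count)}

print(clip_control_state([]))            # → {'visible': False, 'badge': '0'}
print(clip_control_state(["s1", "s2"]))  # → {'visible': True, 'badge': '2'}
```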
In one possible implementation, the generating a target video according to at least one target video segment includes:
in response to a video clipping instruction, displaying a video clip interface, where the video clip interface includes at least one second video preview window in one-to-one correspondence with the at least one target video segment, and each second video preview window is used for displaying its target video segment;
and in response to a video composition instruction, compositing the at least one target video segment according to the order of the second video preview windows in the video clip interface to obtain the target video.
According to the implementation mode, each target video segment can be displayed through the video editing interface, so that a user can conveniently and intuitively know the information of each target video segment of the target video to be synthesized.
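The composition step above can be modeled as laying segments on an output timeline in preview-window order. This is a data-model sketch only: the class and field names are assumptions, and the actual rendering of frames (e.g. with a tool such as FFmpeg) is omitted.

```python
# Sketch: concatenate target segments in the on-screen order of their
# second video preview windows. Names are assumptions; no real rendering.

from dataclasses import dataclass

@dataclass
class TargetSegment:
    clip_id: str
    start: float  # seconds within the source clip
    end: float

def compose(windows):
    """Place segments on an output timeline in preview-window order."""
    timeline, cursor = [], 0.0
    for seg in windows:
        duration = seg.end - seg.start
        timeline.append((seg.clip_id, cursor, cursor + duration))
        cursor += duration
    return timeline, cursor

windows = [TargetSegment("b", 2.0, 5.0), TargetSegment("a", 0.0, 4.0)]
timeline, total = compose(windows)
print(timeline)  # → [('b', 0.0, 3.0), ('a', 3.0, 7.0)]
print(total)     # → 7.0
```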
In one possible implementation, after the displaying the video clip interface, the method further comprises:
in response to detecting that the user is interested in any of the second video preview windows in the video clip interface, playing a target video segment in the second video preview window.
With this implementation, for any target video segment in the video clip interface, the segment is played only when the user is detected to focus on it (or on its corresponding second video preview window); otherwise it is not played. The user can thus conveniently watch the segments of interest, unnecessary playback is reduced, and the video editing experience is improved.
In one possible implementation manner, the second video preview window includes a clipping control, and the clipping control includes a clipping start point sub-control and a clipping end point sub-control;
after the displaying the video clip interface, the method further comprises:
and responding to a dragging instruction aiming at the cutting starting point sub-control and/or the cutting end point sub-control, and cutting the target video segment in the second video preview window to obtain the cut target video segment.
According to the implementation mode, a user can conveniently cut each target video segment in the video clipping interface, so that the convenience of video clipping can be further improved.
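A minimal sketch of the drag-to-trim behavior, assuming handle positions are clamped so the cut start never precedes the clip, neither handle exceeds the clip duration, and the start never passes the end. Function and argument names are illustrative.

```python
# Sketch: clamp the dragged start/end handles into a valid trim range.

def apply_trim(duration, start_handle, end_handle):
    start = max(0.0, min(start_handle, duration))
    end = max(start, min(end_handle, duration))
    return start, end

print(apply_trim(10.0, -2.0, 15.0))  # → (0.0, 10.0) handles dragged past both edges
print(apply_trim(10.0, 3.0, 8.5))    # → (3.0, 8.5)  ordinary trim
```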
In one possible implementation, after the displaying the video clip interface, the method further comprises:
adjusting an order of the second video preview windows in the video clip interface in response to a move operation for any of the second video preview windows.
According to the implementation mode, the user can conveniently adjust the sequence of the target video clips in the video clip interface, so that the convenience of video clip can be further improved.
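The move operation above reduces to reordering a list of preview windows; the index semantics (remove from source slot, insert at destination slot) are an assumption about how the drag is interpreted.

```python
# Sketch: dragging a second video preview window from one slot to
# another reorders the segment sequence used for composition.

def move_window(order, src, dst):
    order = list(order)  # keep the caller's list unchanged
    window = order.pop(src)
    order.insert(dst, window)
    return order

print(move_window(["w1", "w2", "w3"], 2, 0))  # → ['w3', 'w1', 'w2']
```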
In one possible implementation, after the displaying the video clip interface, the method further comprises:
and displaying a deletion control corresponding to any one of the second video preview windows in response to the detection that the user focuses on the second video preview window.
In this implementation, for any second video preview window in the video clip interface, the corresponding deletion control is displayed only when the user is detected to focus on that window; otherwise it is not displayed. This reduces unnecessary information on screen and improves the video editing experience.
In one possible implementation manner, after the displaying of the deletion control corresponding to the second video preview window, the method further includes:
and in response to the deletion control being triggered, deleting the second video preview window from the video clip interface, and deleting a target video segment corresponding to the second video preview window.
According to the implementation mode, the user can delete the target video segment in the video editing interface conveniently, so that the convenience of the user in video editing can be further improved.
In one possible implementation, the video clip interface further includes a third video preview window, and the third video preview window is used for previewing the target video.
According to the implementation mode, the user can conveniently preview the target video in the video editing interface, so that the convenience of the user in video editing can be further improved.
In one possible implementation, the generating a target video according to at least one target video segment includes:
determining a target resolution selected by a user;
and generating the target video according to the target resolution and the at least one target video segment.
According to the implementation mode, the requirements of users on different resolutions can be met.
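One way to honor a user-selected target resolution is to scale each source segment to fit within it while preserving aspect ratio (letterboxing the remainder). The resolution menu and rounding policy below are assumptions, not part of the disclosure.

```python
# Sketch: fit a source frame inside the user-selected target resolution
# without distorting its aspect ratio. Menu entries are assumptions.

TARGET_RESOLUTIONS = {"720p": (1280, 720), "1080p": (1920, 1080)}

def fit_within(src_w, src_h, dst_w, dst_h):
    scale = min(dst_w / src_w, dst_h / src_h)
    return round(src_w * scale), round(src_h * scale)

dst_w, dst_h = TARGET_RESOLUTIONS["1080p"]
print(fit_within(640, 480, dst_w, dst_h))  # → (1440, 1080) 4:3 source in a 16:9 frame
```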
In one possible implementation, after the generating the target video according to at least one of the target video segments, the method further includes:
in response to a publishing instruction corresponding to the target video, determining a target publishing platform and a target publishing time selected by the user;
and sending a publishing request corresponding to the target video to a server corresponding to the target publishing platform, wherein the publishing request comprises the target publishing time.
According to this implementation, the target video can be automatically published on time at the publishing time desired by the user.
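The publishing request sent to the target platform's server might be shaped as follows. The JSON field names and the ISO-8601 timestamp format are assumptions; the disclosure specifies only that the request carries the target publishing time.

```python
# Sketch of an assumed publish-request payload; the patent does not
# specify the wire format, only that it includes the publishing time.

import json
from datetime import datetime, timezone

def build_publish_request(video_id, platform, publish_time):
    return json.dumps({
        "video_id": video_id,
        "platform": platform,
        "publish_time": publish_time.isoformat(),
    })

req = build_publish_request(
    "video-42", "example-platform",
    datetime(2021, 7, 6, 12, 0, tzinfo=timezone.utc),
)
print(req)
```

The receiving server would then hold the video and release it when the scheduled time arrives.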
According to an aspect of the present disclosure, there is provided a video generating apparatus including:
the first display module is used for responding to a behavior search instruction, and displaying a candidate video segment matched with the behavior search instruction in a search result page, wherein the behavior search instruction comprises a first search word used for representing a behavior, and the candidate video segment comprises a behavior tag matched with the first search word;
the first determining module is used for determining, in response to a selection instruction for at least one candidate video segment in the search result page, each of the at least one candidate video segment as a target video segment;
and the generating module is used for generating a target video according to at least one target video segment.
In one possible implementation, the behavior search instruction further includes: the second search word is used for representing the behavior execution main body, and/or the third search word is used for representing the scene corresponding to the behavior execution main body.
In one possible implementation, the behavior includes at least one of an action, an expression, and a sound.
In one possible implementation, the selection instruction includes a trigger instruction for a first selection control;
the first determination module is to:
in response to a preview instruction for any candidate video segment in the search result page, displaying a first video preview window, playing the candidate video segment through the first video preview window, and displaying the first selection control in the first video preview window;
and in response to a trigger instruction for the first selection control, determining the candidate video segment as a target video segment.
In one possible implementation, the selection instruction includes a trigger instruction for a second selection control;
the first determination module is to:
in response to detecting that the user focuses on any candidate video segment in the search result page, displaying the second selection control corresponding to that candidate video segment in the search result page;
and in response to a trigger instruction for the second selection control, determining the candidate video segment as a target video segment.
In one possible implementation, the apparatus further includes:
a second determination module, configured to determine, in response to detecting that a cursor stays on any candidate video segment in the search result page, that the user focuses on that candidate video segment;
and/or,
a third determination module, configured to determine, in response to detecting that the user's gaze rests on any candidate video segment in the search result page, that the user focuses on that candidate video segment.
In one possible implementation, the apparatus further includes:
and the display module is used for displaying a video clip control in the search result page according to the number of the target video segments, wherein the icon content of the video clip control comprises the number of the target video segments.
In one possible implementation, the generating module is configured to:
in response to a video clipping instruction, displaying a video clipping interface, wherein the video clipping interface comprises at least one second video preview window in one-to-one correspondence with at least one target video segment, and the second video preview window is used for displaying the target video segment;
and responding to a video composition instruction, and compositing at least one target video segment according to the sequence of at least one second video preview window in the video clip interface to obtain a target video.
In one possible implementation, the apparatus further includes:
and the playing module is used for responding to the detection that the user pays attention to any one second video preview window in the video clip interface, and playing the target video segment in the second video preview window.
In one possible implementation manner, the second video preview window includes a clipping control, and the clipping control includes a clipping start point sub-control and a clipping end point sub-control;
the device further comprises:
and the cutting module is used for responding to a dragging instruction aiming at the cutting starting point sub-control and/or the cutting end point sub-control, and cutting the target video segment in the second video preview window to obtain the cut target video segment.
In one possible implementation, the apparatus further includes:
and the adjusting module is used for responding to the moving operation aiming at any one second video preview window and adjusting the sequence of the second video preview window in the video clip interface.
In one possible implementation, the apparatus further includes:
and the second display module is used for responding to the detection that the user pays attention to any one of the second video preview windows and displaying the deletion control corresponding to the second video preview window.
In one possible implementation, the apparatus further includes:
and the deleting module is used for deleting the second video preview window from the video clip interface and deleting the target video segment corresponding to the second video preview window in response to the deletion control being triggered.
In one possible implementation, the video clip interface further includes a third video preview window, and the third video preview window is used for previewing the target video.
In one possible implementation, the generating module is configured to:
determining a target resolution selected by a user;
and generating a target video according to the target resolution and the at least one target video segment.
In one possible implementation, the apparatus further includes:
the fourth determining module is used for determining, in response to a publishing instruction corresponding to the target video, a target publishing platform and a target publishing time selected by the user;
and the sending module is used for sending a publishing request corresponding to the target video to a server corresponding to the target publishing platform, wherein the publishing request comprises the target publishing time.
According to an aspect of the present disclosure, there is provided an electronic device including: one or more processors; a memory for storing executable instructions; wherein the one or more processors are configured to invoke the memory-stored executable instructions to perform the above-described method.
According to an aspect of the present disclosure, there is provided a computer readable storage medium having stored thereon computer program instructions which, when executed by a processor, implement the above-described method.
In the embodiments of the present disclosure, candidate video segments matched with a behavior search instruction are displayed in a search result page in response to the behavior search instruction, where the behavior search instruction includes a first search word representing a behavior and each candidate video segment carries a behavior tag matched with the first search word; in response to a selection instruction for at least one candidate video segment in the search result page, each selected candidate video segment is determined as a target video segment, and a target video is generated from the at least one target video segment. Candidate video segments can therefore be retrieved quickly based on behavior information and assembled into the target video, saving the user the time of selecting video segments for synthesizing the target video and improving video editing efficiency.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the disclosure.
Other features and aspects of the present disclosure will become apparent from the following detailed description of exemplary embodiments, which proceeds with reference to the accompanying drawings.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present disclosure and, together with the description, serve to explain the principles of the disclosure.
Fig. 1 shows a flowchart of a video generation method provided by an embodiment of the present disclosure.
Fig. 2 is a schematic diagram illustrating that N candidate video segments are displayed through N fourth video preview windows in the video generation method provided by the embodiment of the present disclosure.
Fig. 3 shows a schematic diagram that a candidate video segment is played through a first video preview window in the video generation method provided by the embodiment of the present disclosure, and a first selection control is displayed in the first video preview window.
Fig. 4 is a schematic diagram illustrating a second selection control corresponding to a candidate video segment displayed in a search result page in the video generation method provided by the embodiment of the disclosure.
Fig. 5 is a schematic diagram illustrating that a video clip control is displayed in a search result page according to the number of target video segments in the video generation method provided by the embodiment of the disclosure.
Fig. 6 shows another schematic diagram illustrating that a video clip control is displayed in a search result page according to the number of target video segments in the video generation method provided by the embodiment of the disclosure.
Fig. 7 shows a schematic diagram of a video clip interface in a video generation method provided by an embodiment of the present disclosure.
Fig. 8 illustrates a schematic diagram of a cropping control in a video generation method provided by an embodiment of the present disclosure.
Fig. 9 shows another schematic diagram of a video clip interface in the video generation method provided by the embodiment of the present disclosure.
Fig. 10 shows another schematic diagram of a video clip interface in the video generation method provided by the embodiment of the present disclosure.
Fig. 11 shows a block diagram of a video generation apparatus provided by an embodiment of the present disclosure.
Fig. 12 shows a block diagram of an electronic device 800 provided by an embodiment of the disclosure.
Detailed Description
Various exemplary embodiments, features and aspects of the present disclosure will be described in detail below with reference to the accompanying drawings. In the drawings, like reference numbers can indicate functionally identical or similar elements. While the various aspects of the embodiments are presented in drawings, the drawings are not necessarily drawn to scale unless specifically indicated.
The word "exemplary" is used exclusively herein to mean "serving as an example, embodiment, or illustration." Any embodiment described herein as "exemplary" is not necessarily to be construed as preferred or advantageous over other embodiments.
The term "and/or" herein merely describes an association between associated objects, and means that three relationships may exist; for example, A and/or B may mean: A exists alone, A and B exist simultaneously, or B exists alone. In addition, the term "at least one" herein means any one of a plurality or any combination of at least two of a plurality; for example, including at least one of A, B and C may mean including any one or more elements selected from the group consisting of A, B and C.
Furthermore, in the following detailed description, numerous specific details are set forth in order to provide a better understanding of the present disclosure. It will be understood by those skilled in the art that the present disclosure may be practiced without some of these specific details. In some instances, methods, means, elements and circuits that are well known to those skilled in the art have not been described in detail so as not to obscure the present disclosure.
The disclosed embodiments provide a video generation method and apparatus, an electronic device, and a storage medium. In response to a behavior search instruction, candidate video segments matched with the behavior search instruction are displayed in a search result page, wherein the behavior search instruction includes a first search word for representing a behavior and the candidate video segments include a behavior tag matched with the first search word; in response to a selection instruction for at least one candidate video segment in the search result page, the at least one candidate video segment is respectively determined as a target video segment; and a target video is generated according to the at least one target video segment. The candidate video segments can thus be quickly searched based on behavior information and the target video generated from them, saving the user the time of selecting video segments for synthesizing the target video and improving the user's video editing efficiency.
The following describes a video generation method provided by the embodiments of the present disclosure in detail with reference to the drawings.
Fig. 1 shows a flowchart of a video generation method provided by an embodiment of the present disclosure. In one possible implementation, the video generation method may be performed by a terminal device or other processing device. The terminal device may be a User Equipment (UE), a mobile device, a User terminal, a Personal Digital Assistant (PDA), a handheld device, a computing device, a vehicle-mounted device, or a wearable device. In some possible implementations, the video generation method may be implemented by a processor calling computer readable instructions stored in a memory. As shown in fig. 1, the video generating method includes steps S11 through S13.
In step S11, in response to a behavior search instruction, presenting a candidate video segment matching the behavior search instruction in a search result page, wherein the behavior search instruction includes a first search word for representing a behavior, and the candidate video segment includes a behavior tag matching the first search word.
In step S12, in response to a selection instruction for at least one of the candidate video segments in the search result page, at least one of the candidate video segments is respectively determined as a target video segment.
In step S13, a target video is generated from at least one of the target video segments.
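Steps S11 through S13 can be sketched as a minimal control flow. The names below (`VideoSegment`, `handle_behavior_search`, etc.) are illustrative assumptions for this sketch, not part of the disclosure:

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class VideoSegment:
    # A candidate segment carries the tags it was indexed with (hypothetical model).
    segment_id: str
    tags: List[str] = field(default_factory=list)

def handle_behavior_search(first_search_word: str, database: List[VideoSegment]) -> List[VideoSegment]:
    # Step S11: the candidates are segments whose behavior tag matches the first search word.
    return [seg for seg in database if first_search_word in seg.tags]

def handle_selection(candidates: List[VideoSegment], selected_ids: List[str]) -> List[VideoSegment]:
    # Step S12: each candidate the user selects becomes a target video segment.
    return [seg for seg in candidates if seg.segment_id in selected_ids]

def generate_target_video(targets: List[VideoSegment]) -> List[str]:
    # Step S13: the target video is the ordered concatenation of the target segments,
    # represented here by the ordered list of segment ids.
    return [seg.segment_id for seg in targets]
```

A real implementation would, of course, render the candidates in a search result page and synthesize actual video data; the sketch only shows the data flow between the three steps.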
In the disclosed embodiment, the behavior search instruction may represent an instruction to search for a video clip based on the behavior information. The behavior search instruction may include one or more search terms, and the behavior search instruction includes at least a first search term for representing a behavior. Wherein the first search word may represent a search word used to represent a behavior in the behavior search instruction. The search term may represent a word or phrase used for a search. Any search term may include at least one character. The embodiment of the disclosure can support a user to search a video clip in which a certain behavior occurs in a text search mode.
In one possible implementation, the behavior includes at least one of an action, an expression, and a sound. In this implementation, the action may be any action such as hugging, clapping, holding, standing up, waving, and the like. For example, the first search term used to represent an action may be "hug" or "applause". In this implementation, the expression may be any expression such as laughing, smiling, crying, sadness, dullness, surprise, and the like. For example, the first search word for representing an expression may be "laugh" or "cry". In this implementation, the first search term used to represent a sound may be a search term representing the sound content in a video clip and/or content related to the sound content in a video clip. For example, the first search term used to represent a sound may be the content of a person's speech in the video segment, such as "I feel unable"; the title or lyrics of a song sung by a person in the video segment; or the title of a poem recited by a person in the video segment, such as "Jiang Jin Jiu" ("Bring in the Wine"); this is not limited herein. With this implementation, candidate video segments can be quickly searched based on at least one of actions, expressions, and sounds, and the target video is generated according to the candidate video segments, saving the user the time of selecting video segments for synthesizing the target video and improving the user's video editing efficiency.
In one possible implementation, the behavior search instruction further includes a second search term representing a behavior execution subject. In this implementation, the second search term may refer to a search term used to represent the behavior execution subject. In the case where the behavior execution subject is a person, the second search word may be a name of the person, such as a real name, a nickname, or a character name. In the case where the behavior execution subject is an animal, the second search word may be the animal's name or its category name. For example, the category name may be "panda", and the name may be a nickname given to a particular animal, which is not limited herein. With this implementation, by combining behavior information and information of the behavior execution subject, candidate video clips of a certain execution subject (such as a certain person or a certain animal) performing a certain behavior can be quickly searched, and the target video is generated according to the candidate video clips, thereby improving the efficiency of video clipping by the user.
In another possible implementation manner, the behavior search instruction further includes a third search word for indicating a scene corresponding to the behavior execution subject. In this implementation, the third search term may refer to a search term representing the scene corresponding to the behavior execution subject. For example, the scene may be a close-up, a medium shot, a full shot, a long shot, and the like, which is not limited herein. With this implementation, by combining behavior information and the scene information corresponding to the behavior execution subject, a candidate video segment of an execution subject performing a certain behavior in a certain scene can be quickly searched, and the target video is generated according to the candidate video segment, thereby improving the efficiency of video clipping by the user.
In another possible implementation manner, the behavior search instruction further includes both a second search word for representing a behavior execution subject and a third search word for representing a scene corresponding to the behavior execution subject. With this implementation, by combining the behavior information, the information of the behavior execution subject, and the information of the scene corresponding to the behavior execution subject, a candidate video segment of a certain execution subject (such as a certain person or a certain animal) performing a certain behavior in a certain scene can be quickly searched, and the target video is generated according to the candidate video segment, thereby improving the efficiency of video clipping by the user.
In the embodiment of the disclosure, the terminal device may generate a behavior search request in response to the behavior search instruction, and send the behavior search request to the server, where the behavior search request may include a search word in the behavior search instruction, and the behavior search request includes at least a first search word. The server side can respond to the behavior search request, search candidate video clips matched with the behavior search request from the video database, and return information of the candidate video clips to the terminal equipment. And the candidate video clips matched with the behavior search request are the candidate video clips matched with the behavior search instruction. The candidate video segments may represent video segments that match the behavioral search instruction.
The server side can respond to the behavior search request, determine a target label matched with a search word in the behavior search request, and can search a video segment comprising the target label from a video database to obtain a candidate video segment. The target tag may represent a tag of a video segment matching a search term in the behavioral search request, that is, the target tag may represent a tag of a video segment matching a search term in the behavioral search instruction. A search term may be matched to one or more tags. The number of target tags may be one or more than two. The candidate video segments may include one or more tags. The candidate video segment includes at least a behavior tag matching the first search term, and the candidate video segment may further include other target tags. In one example, the candidate video clip may further include a behavior execution subject tag, such as a person name tag, an animal category name tag, or the like, that matches a second search term in the behavior search instruction that represents the behavior execution subject. In another example, the candidate video segment may further include a scene tag matching a third search term in the behavior search instruction for representing a scene corresponding to the behavior execution subject.
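The server-side lookup described above amounts to filtering the video database for segments carrying every target tag (behavior tag, and optionally subject and scene tags). A minimal sketch, in which the `segment_tags` mapping is a hypothetical stand-in for the video database:

```python
def find_candidate_segments(segment_tags, target_tags):
    """Return ids of segments whose tag set contains every target tag.

    segment_tags: maps segment id -> iterable of tags (behavior, subject, scene).
    target_tags: the tags matched against the search words in the request.
    """
    required = set(target_tags)
    # A segment qualifies only if the required tags are a subset of its tags.
    return [sid for sid, tags in segment_tags.items() if required <= set(tags)]
```

For example, searching "clap" plus the subject "Alice" would return only segments tagged with both, which matches the disclosure's combination of behavior, subject, and scene search words.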
In one possible implementation, the search term entered by the user may not be completely consistent with any target tag; for example, the user wishes to search for video clips of the "stand up" action but enters the search term "stand". In this case, in the process of determining the target tag matched with the search word, the tag that is semantically most similar to the search word can be obtained as the matched target tag by means of semantic analysis (or a natural language model). For example, the search word input by the user may be converted into a corresponding search vector, and a tag vector matching the search vector is then determined by a vector similarity calculation such as cosine similarity or Euclidean distance; the tag corresponding to that tag vector is the target tag matched with the search word.
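The cosine-similarity matching mentioned above can be sketched as follows. The embedding vectors here are toy values; in practice the search word and tags would be embedded by a language model, which this sketch assumes has already happened:

```python
import math

def cosine_similarity(u, v):
    # Standard cosine similarity: dot(u, v) / (|u| * |v|).
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def best_matching_tag(search_vec, tag_vectors):
    # The target tag is the one whose embedding is most similar to the
    # embedding of the user's search word.
    return max(tag_vectors, key=lambda tag: cosine_similarity(search_vec, tag_vectors[tag]))
```

Euclidean distance could be substituted for cosine similarity by taking the tag with the smallest distance instead of the largest similarity.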
The video database may include video clips uploaded by users who issue behavior search instructions, and may also include video clips uploaded by other users, which is not limited herein. The server side can respond to the fact that any user uploads a new video clip, conduct behavior detection on the video clip, and obtain a behavior tag of the video clip. The server side can also perform behavior execution subject detection on the video clip to obtain a behavior execution subject label of the video clip. For example, the server may perform people detection on the video segment to obtain a people tag of the video segment. For another example, the server may perform animal detection on the video segment to obtain an animal tag of the video segment. The server can also detect the scene corresponding to the behavior execution main body of the video clip to obtain the scene label corresponding to the behavior execution main body of the video clip.
The terminal device may present the candidate video segments in a search result page in response to receiving the information of the candidate video segments. The terminal device may display N candidate video clips in the search result page, where N is an integer greater than or equal to 1. For example, when the number of candidate video clips returned by the server is less than or equal to M, the terminal device may show all candidate video clips returned by the server in the search result page; in this case N is the number of candidate video clips returned by the server, where M represents the maximum number of candidate video clips that the search result page can show simultaneously, and M is greater than 1. When the number of candidate video clips returned by the server is greater than M, the terminal device may display M of the returned candidate video clips in the search result page; in this case N equals M.
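The rule above for choosing N reduces to a single expression; a one-line sketch with assumed parameter names:

```python
def segments_to_show(returned_count: int, max_on_page: int) -> int:
    # N equals the number of returned candidates when it fits on the page,
    # and is capped at M (the page's simultaneous-display maximum) otherwise.
    return min(returned_count, max_on_page)
```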
In a possible implementation manner, the terminal device may display N candidate video segments through N fourth video preview windows in the search result page, where the fourth video preview windows correspond to the candidate video segments one to one, that is, each fourth video preview window is used to display one candidate video segment respectively. Fig. 2 is a schematic diagram illustrating that N candidate video segments are displayed through N fourth video preview windows in the video generation method provided by the embodiment of the present disclosure. In the example shown in fig. 2, the first search term is "clap", and the number of the fourth video preview windows 201 in the search result page 200 is 6, which are respectively used for showing the candidate video segment 1 to the candidate video segment 6. In one example, the terminal device may present the cover page of the N candidate video clips through the N fourth video preview windows in the search result page. Wherein, the cover of any candidate video clip can be any video frame of the candidate video clip. In another example, the terminal device may play the N candidate video clips silently through the N fourth video preview windows in the search results page. In one example, the sound of the candidate video segments in the fourth video preview window may be played in response to detecting that the user is interested in any fourth video preview window in the search results page.
In the embodiment of the disclosure, the user may select one or more candidate video segments in the search result page as the target video segment. The target video segment may represent a video segment selected by a user from the candidate video segments. The user can select one candidate video segment in the search result page as the target video segment each time, and also can select a plurality of candidate video segments in the search result page as the target video segments each time. The number of target video segments may be one or more than two.
In one possible implementation, the selection instruction includes a trigger instruction for a first selection control; the determining, in response to a selection instruction for at least one of the candidate video segments in the search result page, at least one of the candidate video segments as a target video segment respectively includes: responding to a preview instruction for any candidate video clip in the search result page, displaying a first video preview window, playing the candidate video clip through the first video preview window, and displaying the first selection control in the first video preview window; and responding to a triggering instruction aiming at the first selection control, and determining the candidate video clip as a target video clip. The display mode may be pop-up, screen switching, etc., and the pop-up mode is described as an example below.
In this implementation, the user may issue a preview instruction for any candidate video segment in the search result page by clicking, touching, gazing, or the like. For example, in response to detecting a click operation on any candidate video segment in the search result page, it may be determined that a preview instruction for the candidate video segment is detected. The first video preview window may be a floating window that floats above the search result page. The size of the first video preview window may be larger than the size of the fourth video preview windows in the search result page, making it easier for the user to view the content of the candidate video segment of interest. The first selection control is a control in the first video preview window used for selecting the candidate video clip. The user can issue a trigger instruction for the first selection control by clicking, touching, gazing, or the like. For example, in response to detecting a click operation on the first selection control, it may be determined that a trigger instruction for the first selection control is detected. Fig. 3 shows a schematic diagram that a candidate video segment is played through a first video preview window in the video generation method provided by the embodiment of the present disclosure, and a first selection control is displayed in the first video preview window. In the example shown in fig. 3, in response to a preview instruction for candidate video clip 3 in the search result page, a first video preview window 202 may pop up, candidate video clip 3 may be played through the first video preview window 202, and the first selection control 203 may be displayed in the first video preview window 202. Candidate video clip 3 may be determined to be the target video segment in response to a trigger instruction for the first selection control 203.
In the implementation manner, a first video preview window is displayed in response to a preview instruction for any one of the candidate video segments in the search result page, the candidate video segment is played through the first video preview window, the first selection control is displayed in the first video preview window, and the candidate video segment is determined as a target video segment in response to a trigger instruction for the first selection control, so that a user can conveniently and quickly select the candidate video segment by previewing the candidate video segment through a larger window.
In another possible implementation manner, the selection instruction includes a trigger instruction for a second selection control; the determining, in response to a selection instruction for at least one of the candidate video segments in the search result page, at least one of the candidate video segments as a target video segment respectively includes: in response to detecting that the user pays attention to any candidate video clip in the search result page, displaying the second selection control corresponding to the candidate video clip in the search result page; and responding to a triggering instruction aiming at the second selection control, and determining the candidate video clip as a target video clip.
In this implementation, in response to detecting that the user focuses on any candidate video segment in the search result page, the second selection control corresponding to the candidate video segment may be displayed in the fourth video preview window corresponding to the candidate video segment in the search result page. The user can send out a trigger instruction aiming at the second selection control in the modes of clicking, touching, watching and the like. For example, in response to detecting a click operation on the second selection control, it may be determined that a trigger instruction for the second selection control is detected. Fig. 4 is a schematic diagram illustrating a second selection control corresponding to a candidate video segment displayed in a search result page in the video generation method provided by the embodiment of the disclosure. In the example shown in fig. 4, in response to detecting that the user focuses on the candidate video segment 3 in the search results page, the second selection control 204 corresponding to the candidate video segment 3 may be presented in the search results page. In the implementation manner, for any candidate video segment in the search result page, if it is not detected that the user pays attention to the candidate video segment, the second selection control corresponding to the candidate video segment may not be displayed, and if it is detected that the user pays attention to the candidate video segment, the second selection control corresponding to the candidate video segment may be displayed, so that unnecessary information display can be reduced, and the experience of the user in selecting the video segment can be improved.
As an example of this implementation, the method further comprises: in response to detecting that a cursor stays on any of the candidate video segments in the search result page, determining that the user focuses on that candidate video segment; and/or, in response to detecting that the user's gaze stays on any candidate video segment in the search result page, determining that the user focuses on that candidate video segment. In this example, it may be determined that the cursor stays on a candidate video segment in response to detecting that the duration for which the cursor remains on the candidate video segment reaches a first preset duration; likewise, it may be determined that the user's gaze stays on a candidate video segment in response to detecting that the duration for which the gaze remains on the candidate video segment reaches a second preset duration. The first preset duration and the second preset duration may be equal or unequal, which is not limited herein. According to this example, whether the user focuses on a candidate video clip in the search result page can be accurately detected.
In another possible implementation manner, the selection instruction includes a trigger instruction for a third selection control; the responding to the behavior search instruction, and displaying the candidate video clips matched with the behavior search instruction in a search result page, wherein the displaying comprises: responding to a behavior search instruction, displaying candidate video segments matched with the behavior search instruction in a search result page, and displaying third selection controls corresponding to the candidate video segments one by one; the determining, in response to a selection instruction for at least one of the candidate video segments in the search result page, at least one of the candidate video segments as a target video segment respectively includes: and responding to a triggering instruction aiming at any third selection control in the search result page, and determining the candidate video clip corresponding to the third selection control as a target video clip.
In one possible implementation, the method further includes: displaying a video clip control in the search result page according to the number of the target video segments, wherein the icon content of the video clip control comprises the number of the target video segments. In this implementation, the video clip control may represent a control for entering a video clip interface. Fig. 5 is a schematic diagram illustrating that a video clip control is displayed in a search result page according to the number of target video segments in the video generation method provided by the embodiment of the disclosure. In the example shown in fig. 5, the number of target video segments is 1, and the icon content of the video clip control 205 also includes the text "composite video". Fig. 6 shows another schematic diagram illustrating that a video clip control is displayed in a search result page according to the number of target video segments in the video generation method provided by the embodiment of the disclosure. In the example shown in fig. 6, the number of target video segments is 2, and the icon content of the video clip control 206 also includes the text "composite video". In one example, the video clip control can be displayed in the search result page according to the number of target video segments in response to that number being greater than or equal to 1. In this example, where the number of target video segments is 0, the video clip control may not reflect the number of target video segments; for example, only the text "composite video" may be displayed, without the number "0".
In this implementation manner, a video clip control is displayed in the search result page according to the number of the target video segments, where the icon content of the video clip control includes the number of the target video segments, so that the user can see the number of the selected target video segments in the search result page without clicking on a video clip interface or other interfaces to confirm the number of the selected target video segments, thereby further improving the convenience of video clipping for the user.
In the disclosed embodiments, the target video may be automatically generated from the at least one target video segment, or the target video may be generated from the at least one target video segment in response to a user trigger.
In one possible implementation, the generating a target video according to at least one target video segment includes: in response to a video clipping instruction, displaying a video clipping interface, wherein the video clipping interface comprises at least one second video preview window in one-to-one correspondence with at least one target video segment, and the second video preview window is used for displaying the target video segment; and responding to a video composition instruction, and compositing at least one target video segment according to the sequence of at least one second video preview window in the video clip interface to obtain a target video.
In this implementation, it may be determined that the video clip instruction is detected in response to detecting a trigger instruction for the video clip control in the search result page. Of course, the user may issue the video clip instruction by other means (for example, a first preset shortcut key), which is not limited herein. It may be determined that the video composition instruction is detected in response to a trigger instruction for a composite video control in the video clip interface. Of course, the user may also issue the composition instruction by other means (for example, a second preset shortcut key), which is not limited herein. Fig. 7 shows a schematic diagram of a video clip interface in a video generation method provided by an embodiment of the present disclosure. In the example shown in fig. 7, the video clip interface 207 includes 3 second video preview windows 208 for respectively presenting target video segment 1 through target video segment 3. In the example shown in fig. 7, the video clip interface also includes a composite video control 209. In response to a trigger instruction for the composite video control 209, the 3 target video segments may be synthesized according to the sequence of the 3 second video preview windows in the video clip interface to obtain the target video, that is, in the order of target video segment 1, target video segment 2, target video segment 3. With this implementation, each target video segment can be displayed through the video clip interface, so that the user can intuitively see the information of each target video segment of the target video to be synthesized.
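The compositing step, in which the target segments are concatenated in the order of their preview windows, can be sketched as follows. `window_order` and `segments` are illustrative names, and segments are modeled as lists of frames:

```python
def composite_in_window_order(window_order, segments):
    """Concatenate target segments following the order of their preview windows.

    window_order: the left-to-right ids of the second video preview windows.
    segments: maps a window id to that window's target segment (a frame list here).
    """
    target_video = []
    for window_id in window_order:
        # Append each segment's frames in window order to form the target video.
        target_video.extend(segments[window_id])
    return target_video
```

Reordering the preview windows before triggering the composite video control would therefore directly change the segment order in the resulting target video.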
As one example of this implementation, after the displaying the video clip interface, the method further comprises: in response to detecting that the user is interested in any of the second video preview windows in the video clip interface, playing a target video segment in the second video preview window. In this example, for any target video segment in the video clip interface, if it is not detected that the user focuses on the target video segment (or it is not detected that the user focuses on the second video preview window corresponding to the target video segment), the target video segment may not be played, and if it is detected that the user focuses on the target video segment (or it is detected that the user focuses on the second video preview window corresponding to the target video segment), the target video segment may be played, so that the user can conveniently view the content of the target video segment of interest, unnecessary playing can be reduced, and the experience of the user in video clip can be improved.
In one example, the method further comprises: in response to detecting that a cursor remains in any of the second video preview windows in the video clip interface, determining that a user is focused on a target video segment in the second video preview window; and/or, in response to detecting that the user is gazing into any of the second video preview windows in the video clip interface, determining that the user is focusing on a target video segment in the second video preview window. In this example, it may be determined that the cursor is detected to be hovering in any second video preview window in the video clip interface in response to detecting that the cursor is hovering in the second video preview window for a duration of a third preset duration; the method may further include determining that the user gaze is detected to be hovering in any of the second video preview windows in the video clip interface in response to detecting that the user gaze is hovering in the second video preview window for a duration of a fourth preset duration. The third preset time period and the fourth preset time period may be equal to or unequal to each other, and are not limited herein. According to this example, it can be accurately detected whether the user is focusing on any of the second video preview windows in the video clip interface.
As an example of this implementation, the second video preview window includes a cropping control, and the cropping control includes a cropping start sub-control and a cropping end sub-control; after the displaying of the video clip interface, the method further comprises: in response to a dragging instruction for the cropping start sub-control and/or the cropping end sub-control, cropping the target video segment in the second video preview window to obtain a cropped target video segment. In this example, the cropping control in the second video preview window may be a bar control whose length is the same as that of the progress bar of the target video segment in the second video preview window; the starting point of the cropping control may carry the cropping start sub-control, and the end point of the cropping control may carry the cropping end sub-control. The cropping start sub-control can be used to adjust the starting point of the target video segment, and the cropping end sub-control can be used to adjust the end point of the target video segment. Fig. 8 illustrates a schematic diagram of a cropping control in the video generation method provided by an embodiment of the present disclosure. As shown in fig. 8, the cropping control may include a cropping start sub-control 210 and a cropping end sub-control 211. The target video segment in the second video preview window may be cropped in response to a dragging instruction for the cropping start sub-control 210 and/or the cropping end sub-control 211, so as to obtain a cropped target video segment. According to this example, the user can conveniently crop each target video segment in the video clip interface, thereby further improving the convenience of video editing.
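The effect of dragging the start/end sub-controls can be sketched as adjusting the in-point and out-point of a segment. The data model and function names below are assumptions for illustration only, not the disclosed implementation.

```python
from dataclasses import dataclass

@dataclass
class VideoSegment:
    start_s: float  # in-point on the segment's timeline, in seconds
    end_s: float    # out-point on the segment's timeline, in seconds

    @property
    def duration_s(self):
        return self.end_s - self.start_s

def crop(segment, new_start_s=None, new_end_s=None):
    """Apply drags of the cropping start and/or end sub-controls.

    A drag of only one sub-control leaves the other endpoint unchanged;
    the start point is clamped so it can never pass the end point.
    """
    start = segment.start_s if new_start_s is None else new_start_s
    end = segment.end_s if new_end_s is None else new_end_s
    if start > end:
        start = end
    return VideoSegment(start_s=start, end_s=end)
```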
Further, in order to let the user know the duration of the composite video, the estimated total duration of the composite video can be displayed in the second video preview window and updated in real time based on the user's cropping operations on the target video segments. Still further, a duration range can be preset; when the estimated total duration of the composite video is not within this range, corresponding prompt information is output or the user's cropping operations are restricted, so that the user can produce a video meeting the duration requirement.
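The duration bookkeeping described above amounts to summing the (possibly cropped) segment durations and re-checking the total against a preset range after each cropping operation. A sketch, with purely illustrative range bounds:

```python
# Illustrative bounds; the disclosure does not specify the preset range.
MIN_TOTAL_S = 15.0
MAX_TOTAL_S = 60.0

def estimated_total_duration_s(segment_durations_s):
    """Estimated total duration of the composite video, in seconds."""
    return sum(segment_durations_s)

def duration_prompt(total_s, min_s=MIN_TOTAL_S, max_s=MAX_TOTAL_S):
    """Return prompt text when the total falls outside the preset range,
    else None (no prompt; the cropping operation is allowed)."""
    if total_s < min_s:
        return f"Composite video too short: {total_s:.1f}s (minimum {min_s:.0f}s)"
    if total_s > max_s:
        return f"Composite video too long: {total_s:.1f}s (maximum {max_s:.0f}s)"
    return None
```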
As one example of this implementation, after the displaying the video clip interface, the method further comprises: adjusting an order of the second video preview windows in the video clip interface in response to a move operation for any of the second video preview windows. For example, the user may drag the second video preview window with a mouse to change the order of the second video preview window in the video clip interface. According to the example, the user can conveniently adjust the sequence of the target video segments in the video clip interface, so that the convenience of video clip can be further improved.
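The reordering operation can be modeled as moving an item within the ordered list of second video preview windows. A minimal sketch (the function name and list representation are assumed for illustration):

```python
def move_window(order, from_index, to_index):
    """Return a new ordering with the window at from_index moved to to_index.

    The order of this list is the order in which the target video segments
    will later be composited into the target video.
    """
    order = list(order)  # copy; the original list is left untouched
    window = order.pop(from_index)
    order.insert(to_index, window)
    return order
```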
As one example of this implementation, after the displaying of the video clip interface, the method further comprises: in response to detecting that the user focuses on any second video preview window, displaying a deletion control corresponding to that second video preview window. For example, in response to detecting that the user focuses on any second video preview window in the video clip interface, a deletion control can be presented in that second video preview window, where the deletion control is used for deleting the second video preview window and the target video segment corresponding to it. In this example, for any second video preview window in the video clip interface, the corresponding deletion control may stay hidden as long as the user's focus on that window is not detected, and be displayed once such focus is detected, so that unnecessary information display is reduced and the user experience of video editing is improved.
In one example, after the displaying of the deletion control corresponding to the second video preview window, the method further includes: canceling the display of the deletion control corresponding to the second video preview window in response to detecting that the user no longer focuses on the second video preview window.
In one example, after the displaying of the deletion control corresponding to the second video preview window, the method further includes: in response to the deletion control being triggered, deleting the second video preview window from the video clip interface and deleting the target video segment corresponding to it. According to this example, the user can conveniently delete target video segments in the video clip interface, thereby further improving the convenience of video editing.
As an example of this implementation, the video clip interface further includes a third video preview window for previewing the target video. Fig. 9 shows another schematic diagram of a video clip interface in the video generation method provided by the embodiment of the present disclosure. In fig. 9, the video clip interface also includes a third video preview window 212. According to this example, the user can conveniently preview the target video in the video clip interface, thereby further improving the convenience of video editing.
As one example of this implementation, the video clip interface further includes a local upload control; the method further comprises: in response to a triggering instruction for the local upload control, determining a local video file selected by the user as a target video segment.
In one possible implementation, the generating a target video according to at least one target video segment includes: determining a target background music file selected by the user; and generating the target video according to the target background music file and the at least one target video segment. In this implementation, the target background music file may be an audio file imported by the user or an audio file selected online, which is not limited herein. In the process of generating the target video according to the target background music file, the background music of the original target video segments is removed and replaced with the music corresponding to the background music file selected by the user; of course, the user may also choose not to select a target background music file, in which case the background music of the original target video segments is retained. Further, when selecting the target background music file, the user may select one background music file for the entire target video, or select a target background music file for one or more individual target video segments, and the server performs the background music processing on the corresponding video according to the user's selection.
In one possible implementation, the generating a target video according to at least one target video segment includes: determining a video name set by the user; and generating the target video according to the video name and the at least one target video segment. In this implementation, the user can customize the video name of the target video.
In one possible implementation, the generating a target video according to at least one target video segment includes: determining a target resolution selected by the user; and generating the target video according to the target resolution and the at least one target video segment. In this implementation, the target resolution may represent a resolution selected by the user. Fig. 10 shows another schematic diagram of a video clip interface in the video generation method provided by the embodiment of the present disclosure. In one example, the user may select a target resolution in the video clip interface; for example, the target resolution may be 480P, 720P, 1080P, and so on. The target video may be generated according to the target resolution and the target video segments 1-3 in response to a triggering instruction for a composite-video control in the video clip interface. According to this implementation, users' requirements for different resolutions can be met. Further, in order to let the user know the file size (or file volume) of the composite video, the predicted file size of the composite video (e.g., 300MB) may also be displayed in the video clip interface and calculated and updated in real time based on the resolution selected by the user. Still further, a file size range (or file volume range) may be preset; when the file size of the composite video is not within this range, corresponding prompt information may be output or the user may be prompted to select another resolution, so that the user can produce a video meeting the file size requirement.
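The real-time file-size prediction can be approximated as bitrate times duration. The sketch below uses per-resolution bitrates and a size limit that are purely illustrative assumptions (real encoder output depends on content and codec, which the disclosure does not specify):

```python
# Assumed average video bitrates per resolution, in megabits per second.
APPROX_BITRATE_MBPS = {"480P": 1.5, "720P": 3.0, "1080P": 6.0}

def estimated_file_size_mb(resolution, duration_s):
    """Rough file-size estimate for the composite video, in megabytes."""
    return APPROX_BITRATE_MBPS[resolution] * duration_s / 8.0  # Mb -> MB

def file_size_prompt(size_mb, max_mb=500.0):
    """Return prompt text suggesting a lower resolution when the predicted
    size exceeds the preset range, else None."""
    if size_mb > max_mb:
        return (f"Estimated size {size_mb:.0f}MB exceeds {max_mb:.0f}MB; "
                "please select a lower resolution")
    return None
```

For example, a 400-second video at 1080P comes out to roughly 300MB under these assumed bitrates, matching the 300MB figure used as an example above.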
In another possible implementation, the generating a target video according to at least one target video segment includes: generating the target video according to a preset resolution and the at least one target video segment. In this implementation, the user is not required to select a target resolution.
In one possible implementation, after the generating the target video according to the at least one target video segment, the method further includes: in response to a publishing instruction corresponding to the target video, determining a target publishing platform and a target publishing time selected by the user; and sending a publishing request corresponding to the target video to a server corresponding to the target publishing platform, wherein the publishing request includes the target publishing time. In this implementation, the target publishing platform may represent the publishing platform selected by the user for publishing the target video, and the target publishing time may represent the time selected by the user for publishing the target video. According to this implementation, the target video can be published automatically and on time at the publishing time desired by the user.
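A publishing request of the kind described above could carry the target publishing time in its body, so that the platform's server releases the video at that moment. The field names, endpoint shape, and values below are hypothetical; the disclosure does not define a wire format.

```python
import json
from datetime import datetime, timezone

def build_publish_request(video_id, platform, publish_time):
    """Build the body of the publishing request sent to the server of the
    target publishing platform; that server publishes at publish_time."""
    return {
        "video_id": video_id,
        "platform": platform,
        "publish_time": publish_time.isoformat(),  # scheduled release moment
    }

body = build_publish_request(
    "v123", "example-platform",
    datetime(2021, 7, 6, 12, 0, tzinfo=timezone.utc))
payload = json.dumps(body)  # what would be sent to the platform's server
```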
The video generation method provided by the embodiment of the present disclosure is described below through a specific application scenario. In this application scenario, the video generation method may be performed by a terminal device. After the user enters the address of the video clip website in the browser of the terminal device, the browser may open the login page of the video clip website. After the user enters a username, password, and verification code in the login page, the browser may open the home page of the video clip website. The user may enter a person name and an action in the search area of the video clip website. The server may retrieve, from the video database, candidate video segments that match the person name and the action. The browser may present 6 candidate video segments, candidate video segments 1-6, in the search result page. The terminal device may take the candidate video segment 3 as the target video segment 1 in response to a selection instruction for the candidate video segment 3 in the search result page; take the candidate video segment 2 as the target video segment 2 in response to a selection instruction for the candidate video segment 2; and take the candidate video segment 5 as the target video segment 3 in response to a selection instruction for the candidate video segment 5. The terminal device may pop up a video clip interface in response to a triggering instruction for a video clip control in the search result page. The video clip interface can preview the target video through the third video preview window and can show the 3 target video segments through 3 second video preview windows, respectively. The user may adjust the length, order, and so on of the target video segments in the video clip interface, and may delete one or more of the target video segments.
The video clip interface can also include a local upload control, which can be used to upload a local video file. The terminal device may take the target background music file selected by the user as the background music file of the target video, and take the resolution selected by the user as the resolution of the target video. The terminal device can generate the target video in response to a triggering instruction for the composite-video control. The user may preview the target video in "my works". In response to a publishing instruction corresponding to the target video, the terminal device can select the publishing platform and publishing time of the target video, and can fill in the video name and video introduction of the target video. The terminal device can send a publishing request to the server corresponding to the target publishing platform in response to a triggering instruction for the confirm-publishing control. The user can view information such as the publishing status and publishing platform of the target video in the publishing records.
It should be noted that, in the above method embodiments, the terminal device and the server are introduced as two relatively independent entities; in some embodiments, the functions of the terminal device and the server may be implemented by the same hardware device (or system), with the terminal device and the server corresponding to different functional modules of that hardware device (or system), that is, the functions of the terminal device and the server are implemented by different functional modules.
It is understood that the above-mentioned method embodiments of the present disclosure can be combined with one another to form combined embodiments without departing from the principles and logic; for brevity, detailed descriptions of such combinations are omitted in the present disclosure. Those skilled in the art will appreciate that, in the above methods of the specific embodiments, the specific order of execution of the steps should be determined by their functions and possible inherent logic.
In addition, the present disclosure also provides a video generation apparatus, an electronic device, a computer-readable storage medium, and a program, which can be used to implement any video generation method provided by the present disclosure, and corresponding technical solutions and technical effects can be referred to in corresponding descriptions of the method section and are not described again.
Fig. 11 shows a block diagram of a video generation apparatus provided by an embodiment of the present disclosure. As shown in fig. 11, the video generation apparatus includes:
a first presentation module 31, configured to, in response to a behavior search instruction, present, in a search result page, a candidate video segment that matches the behavior search instruction, where the behavior search instruction includes a first search word used for representing a behavior, and the candidate video segment includes a behavior tag that matches the first search word;
a first determining module 32, configured to respond to a selection instruction for at least one candidate video segment in the search result page, and determine at least one candidate video segment as a target video segment respectively;
a generating module 33, configured to generate a target video according to at least one target video segment.
In one possible implementation, the behavior search instruction further includes: the second search word is used for representing the behavior execution main body, and/or the third search word is used for representing the scene corresponding to the behavior execution main body.
In one possible implementation, the behavior includes at least one of an action, an expression, and a sound.
In one possible implementation, the selection instruction includes a trigger instruction for a first selection control;
the first determining module 32 is configured to:
responding to a preview instruction for any candidate video clip in the search result page, displaying a first video preview window, playing the candidate video clip through the first video preview window, and displaying the first selection control in the first video preview window;
and responding to a triggering instruction aiming at the first selection control, and determining the candidate video clip as a target video clip.
In one possible implementation, the selection instruction includes a trigger instruction for a second selection control;
the first determining module 32 is configured to:
in response to detecting that the user pays attention to any candidate video clip in the search result page, displaying the second selection control corresponding to the candidate video clip in the search result page;
and responding to a triggering instruction aiming at the second selection control, and determining the candidate video clip as a target video clip.
In one possible implementation, the apparatus further includes:
a second determination module, configured to determine that the user focuses on any of the candidate video segments in the search result page in response to detecting that a cursor stays on the candidate video segment;
and/or,
a third determination module, configured to determine that the user focuses on any of the candidate video segments in the search result page in response to detecting that the user's line of sight stays on the candidate video segment.
In one possible implementation, the apparatus further includes:
and the display module is used for displaying a video clip control in the search result page according to the number of the target video segments, wherein the icon content of the video clip control comprises the number of the target video segments.
In a possible implementation manner, the generating module 33 is configured to:
in response to a video clipping instruction, displaying a video clipping interface, wherein the video clipping interface comprises at least one second video preview window in one-to-one correspondence with at least one target video segment, and the second video preview window is used for displaying the target video segment;
and responding to a video composition instruction, and compositing at least one target video segment according to the sequence of at least one second video preview window in the video clip interface to obtain a target video.
In one possible implementation, the apparatus further includes:
and the playing module is used for responding to the detection that the user pays attention to any one second video preview window in the video clip interface, and playing the target video segment in the second video preview window.
In one possible implementation manner, the second video preview window includes a clipping control, and the clipping control includes a clipping start point sub-control and a clipping end point sub-control;
the device further comprises:
and the cutting module is used for responding to a dragging instruction aiming at the cutting starting point sub-control and/or the cutting end point sub-control, and cutting the target video segment in the second video preview window to obtain the cut target video segment.
In one possible implementation, the apparatus further includes:
and the adjusting module is used for responding to the moving operation aiming at any one second video preview window and adjusting the sequence of the second video preview window in the video clip interface.
In one possible implementation, the apparatus further includes:
and the second display module is used for responding to the detection that the user pays attention to any one of the second video preview windows and displaying the deletion control corresponding to the second video preview window.
In one possible implementation, the apparatus further includes:
and the deleting module is used for deleting the second video preview window from the video clip interface and deleting the target video segment corresponding to the second video preview window in response to the deletion control being triggered.
In one possible implementation, the video clip interface further includes a third video preview window, and the third video preview window is used for previewing the target video.
In a possible implementation manner, the generating module 33 is configured to:
determining a target resolution selected by a user;
and generating a target video according to the target resolution and at least one target video segment.
In one possible implementation, the apparatus further includes:
the fourth determining module is used for responding to the issuing instruction corresponding to the target video and determining the target issuing platform and the target issuing time selected by the user;
and the sending module is used for sending a publishing request corresponding to the target video to a server corresponding to the target publishing platform, wherein the publishing request comprises the target publishing time.
In the embodiment of the disclosure, a candidate video segment matching a behavior search instruction is presented in a search result page in response to the behavior search instruction, wherein the behavior search instruction includes a first search word used for representing a behavior, and the candidate video segment includes a behavior tag matching the first search word; at least one candidate video segment is determined as a target video segment in response to a selection instruction for the at least one candidate video segment in the search result page; and a target video is generated according to the at least one target video segment. In this way, candidate video segments can be quickly searched and obtained based on behavior information, and the target video is generated according to those candidate video segments, so that the time the user spends selecting video segments for compositing the target video can be saved, and the user's video editing efficiency can be improved.
In some embodiments, functions or modules included in the apparatus provided in the embodiments of the present disclosure may be used to execute the method described in the above method embodiments, and specific implementations and technical effects thereof may refer to the description of the above method embodiments, which are not described herein again for brevity.
Embodiments of the present disclosure also provide a computer-readable storage medium having stored thereon computer program instructions, which when executed by a processor, implement the above-described method. The computer-readable storage medium may be a non-volatile computer-readable storage medium, or may be a volatile computer-readable storage medium.
Embodiments of the present disclosure also provide a computer program, which includes computer readable code, and when the computer readable code runs in an electronic device, a processor in the electronic device executes the above method.
The disclosed embodiments also provide a computer program product comprising computer readable code or a non-volatile computer readable storage medium carrying computer readable code, which when run in an electronic device, a processor in the electronic device performs the above method.
An embodiment of the present disclosure further provides an electronic device, including: one or more processors; a memory for storing executable instructions; wherein the one or more processors are configured to invoke the memory-stored executable instructions to perform the above-described method.
The electronic device may be provided as a terminal or other modality of device.
Fig. 12 shows a block diagram of an electronic device 800 provided by an embodiment of the disclosure. For example, the electronic device 800 may be a terminal such as a mobile phone, a computer, a digital broadcast terminal, a messaging device, a game console, a tablet device, a medical device, a fitness device, or a personal digital assistant.
Referring to fig. 12, electronic device 800 may include one or more of the following components: processing component 802, memory 804, power component 806, multimedia component 808, audio component 810, input/output (I/O) interface 812, sensor component 814, and communication component 816.
The processing component 802 generally controls overall operation of the electronic device 800, such as operations associated with display, telephone calls, data communications, camera operations, and recording operations. The processing components 802 may include one or more processors 820 to execute instructions to perform all or a portion of the steps of the methods described above. Further, the processing component 802 can include one or more modules that facilitate interaction between the processing component 802 and other components. For example, the processing component 802 can include a multimedia module to facilitate interaction between the multimedia component 808 and the processing component 802.
The memory 804 is configured to store various types of data to support operations at the electronic device 800. Examples of such data include instructions for any application or method operating on the electronic device 800, contact data, phonebook data, messages, pictures, videos, and so forth. The memory 804 may be implemented by any type or combination of volatile or non-volatile memory devices such as Static Random Access Memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, magnetic or optical disks.
The power supply component 806 provides power to the various components of the electronic device 800. The power components 806 may include a power management system, one or more power supplies, and other components associated with generating, managing, and distributing power for the electronic device 800.
The multimedia component 808 includes a screen that provides an output interface between the electronic device 800 and a user. In some embodiments, the screen may include a Liquid Crystal Display (LCD) and a Touch Panel (TP). If the screen includes a touch panel, the screen may be implemented as a touch screen to receive an input signal from a user. The touch panel includes one or more touch sensors to sense touch, slide, and gestures on the touch panel. The touch sensor may not only sense the boundary of a touch or slide action, but also detect the duration and pressure associated with the touch or slide operation. In some embodiments, the multimedia component 808 includes a front facing camera and/or a rear facing camera. The front camera and/or the rear camera may receive external multimedia data when the electronic device 800 is in an operation mode, such as a shooting mode or a video mode. Each front camera and rear camera may be a fixed optical lens system or have a focal length and optical zoom capability.
The audio component 810 is configured to output and/or input audio signals. For example, the audio component 810 includes a Microphone (MIC) configured to receive external audio signals when the electronic device 800 is in an operational mode, such as a call mode, a recording mode, and a voice recognition mode. The received audio signals may further be stored in the memory 804 or transmitted via the communication component 816. In some embodiments, audio component 810 also includes a speaker for outputting audio signals.
The I/O interface 812 provides an interface between the processing component 802 and peripheral interface modules, which may be keyboards, click wheels, buttons, etc. These buttons may include, but are not limited to: a home button, a volume button, a start button, and a lock button.
The sensor assembly 814 includes one or more sensors for providing various aspects of state assessment for the electronic device 800. For example, the sensor assembly 814 may detect an open/closed state of the electronic device 800, the relative positioning of components, such as a display and keypad of the electronic device 800, the sensor assembly 814 may also detect a change in the position of the electronic device 800 or a component of the electronic device 800, the presence or absence of user contact with the electronic device 800, orientation or acceleration/deceleration of the electronic device 800, and a change in the temperature of the electronic device 800. Sensor assembly 814 may include a proximity sensor configured to detect the presence of a nearby object without any physical contact. The sensor assembly 814 may also include a light sensor, such as a Complementary Metal Oxide Semiconductor (CMOS) or Charge Coupled Device (CCD) image sensor, for use in imaging applications. In some embodiments, the sensor assembly 814 may also include an acceleration sensor, a gyroscope sensor, a magnetic sensor, a pressure sensor, or a temperature sensor.
The communication component 816 is configured to facilitate wired or wireless communication between the electronic device 800 and other devices. The electronic device 800 may access a wireless network based on a communication standard, such as a wireless network (Wi-Fi), a second generation mobile communication technology (2G), a third generation mobile communication technology (3G), a fourth generation mobile communication technology (4G)/long term evolution of universal mobile communication technology (LTE), a fifth generation mobile communication technology (5G), or a combination thereof. In an exemplary embodiment, the communication component 816 receives a broadcast signal or broadcast related information from an external broadcast management system via a broadcast channel. In an exemplary embodiment, the communication component 816 further includes a Near Field Communication (NFC) module to facilitate short-range communications. For example, the NFC module may be implemented based on Radio Frequency Identification (RFID) technology, infrared data association (IrDA) technology, Ultra Wideband (UWB) technology, Bluetooth (BT) technology, and other technologies.
In an exemplary embodiment, the electronic device 800 may be implemented by one or more Application Specific Integrated Circuits (ASICs), Digital Signal Processors (DSPs), Digital Signal Processing Devices (DSPDs), Programmable Logic Devices (PLDs), Field Programmable Gate Arrays (FPGAs), controllers, micro-controllers, microprocessors or other electronic components for performing the above-described methods.
In an exemplary embodiment, a non-transitory computer-readable storage medium, such as the memory 804, is also provided that includes computer program instructions executable by the processor 820 of the electronic device 800 to perform the above-described methods.
The present disclosure may be systems, methods, and/or computer program products. The computer program product may include a computer-readable storage medium having computer-readable program instructions embodied thereon for causing a processor to implement various aspects of the present disclosure.
The computer readable storage medium may be a tangible device that can hold and store the instructions for use by the instruction execution device. The computer readable storage medium may be, for example, but not limited to, an electronic memory device, a magnetic memory device, an optical memory device, an electromagnetic memory device, a semiconductor memory device, or any suitable combination of the foregoing. More specific examples (a non-exhaustive list) of the computer readable storage medium would include the following: a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), a Static Random Access Memory (SRAM), a portable compact disc read-only memory (CD-ROM), a Digital Versatile Disc (DVD), a memory stick, a floppy disk, a mechanical coding device, such as punch cards or in-groove projection structures having instructions stored thereon, and any suitable combination of the foregoing. Computer-readable storage media as used herein is not to be construed as transitory signals per se, such as radio waves or other freely propagating electromagnetic waves, electromagnetic waves propagating through a waveguide or other transmission medium (e.g., optical pulses through a fiber optic cable), or electrical signals transmitted through electrical wires.
The computer-readable program instructions described herein may be downloaded from a computer-readable storage medium to a respective computing/processing device, or to an external computer or external storage device via a network, such as the internet, a local area network, a wide area network, and/or a wireless network. The network may include copper transmission cables, fiber optic transmission, wireless transmission, routers, firewalls, switches, gateway computers and/or edge servers. The network adapter card or network interface in each computing/processing device receives computer-readable program instructions from the network and forwards the computer-readable program instructions for storage in a computer-readable storage medium in the respective computing/processing device.
The computer program instructions for carrying out operations of the present disclosure may be assembler instructions, Instruction Set Architecture (ISA) instructions, machine instructions, machine-dependent instructions, microcode, firmware instructions, state-setting data, or source code or object code written in any combination of one or more programming languages, including an object oriented programming language such as Smalltalk or C++, and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The computer-readable program instructions may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any type of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet service provider). In some embodiments, electronic circuitry, such as a programmable logic circuit, a Field Programmable Gate Array (FPGA), or a Programmable Logic Array (PLA), can execute the computer-readable program instructions by utilizing state information of the computer-readable program instructions to personalize the electronic circuitry, thereby implementing aspects of the present disclosure.
Various aspects of the present disclosure are described herein with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the disclosure. It will be understood that each block of the flowchart illustrations and/or block diagrams, and combinations of blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer-readable program instructions.
These computer-readable program instructions may be provided to a processor of a general purpose computer, special purpose computer, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions/acts specified in the flowchart and/or block diagram block or blocks. These computer-readable program instructions may also be stored in a computer-readable storage medium that can direct a computer, programmable data processing apparatus, and/or other devices to function in a particular manner, such that the computer-readable medium storing the instructions comprises an article of manufacture including instructions which implement the function/act specified in the flowchart and/or block diagram block or blocks.
The computer readable program instructions may also be loaded onto a computer, other programmable data processing apparatus, or other devices to cause a series of operational steps to be performed on the computer, other programmable apparatus or other devices to produce a computer implemented process such that the instructions which execute on the computer, other programmable apparatus or other devices implement the functions/acts specified in the flowchart and/or block diagram block or blocks.
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of instructions, which comprises one or more executable instructions for implementing the specified logical function(s). In some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The computer program product may be embodied in hardware, software, or a combination thereof. In an alternative embodiment, the computer program product is embodied in a computer storage medium, and in another alternative embodiment, the computer program product is embodied in a software product, such as a Software Development Kit (SDK), or the like.
Having described embodiments of the present disclosure, the foregoing description is intended to be exemplary, not exhaustive, and is not limited to the disclosed embodiments. Many modifications and variations will be apparent to those of ordinary skill in the art without departing from the scope and spirit of the described embodiments. The terminology used herein was chosen to best explain the principles of the embodiments, the practical application, or technical improvements over technologies found in the marketplace, or to enable others of ordinary skill in the art to understand the embodiments disclosed herein.

Claims (19)

1. A method of video generation, comprising:
in response to a behavior search instruction, displaying a candidate video segment matched with the behavior search instruction in a search result page, wherein the behavior search instruction comprises a first search word for representing a behavior, and the candidate video segment comprises a behavior tag matched with the first search word;
in response to a selection instruction for at least one of the candidate video segments in the search result page, respectively determining the at least one candidate video segment as a target video segment;
and generating a target video according to at least one target video segment.
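As a non-limiting illustration of the flow recited in claim 1, the sketch below models a clip library searched by behavior tag, with user-selected candidates becoming target segments that are joined into a target video. All names (`Clip`, `search_by_behavior`, `generate_target_video`) and the string join standing in for actual video composition are assumptions for illustration, not part of the claimed method.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Clip:
    """A candidate video segment carrying behavior tags."""
    clip_id: str
    behavior_tags: List[str] = field(default_factory=list)

def search_by_behavior(library: List[Clip], first_search_word: str) -> List[Clip]:
    # Candidate segments are those whose behavior tags match the first
    # search word of the behavior search instruction.
    return [c for c in library if first_search_word in c.behavior_tags]

def generate_target_video(targets: List[Clip]) -> str:
    # Stand-in for composition: join the selected target segments in order.
    return "+".join(c.clip_id for c in targets)
```

For example, searching a three-clip library for the tag "dance" returns only the clips so tagged, and composition then follows the selection order.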
2. The method of claim 1, wherein the behavior search instruction further comprises: a second search word for representing a behavior execution subject, and/or a third search word for representing a scene corresponding to the behavior execution subject.
3. The method of claim 1 or 2, wherein the behavior comprises at least one of an action, an expression, and a sound.
4. The method according to any one of claims 1 to 3, wherein the selection instruction comprises a trigger instruction for a first selection control;
the determining, in response to a selection instruction for at least one of the candidate video segments in the search result page, each of the at least one candidate video segment as a target video segment comprises:
in response to a preview instruction for any of the candidate video segments in the search result page, displaying a first video preview window, playing the candidate video segment through the first video preview window, and displaying the first selection control in the first video preview window;
and in response to a trigger instruction for the first selection control, determining the candidate video segment as a target video segment.
5. The method according to any one of claims 1 to 4, wherein the selection instruction comprises a trigger instruction for a second selection control;
the determining, in response to a selection instruction for at least one of the candidate video segments in the search result page, each of the at least one candidate video segment as a target video segment comprises:
in response to detecting that the user focuses on any of the candidate video segments in the search result page, displaying the second selection control corresponding to the candidate video segment in the search result page;
and in response to a trigger instruction for the second selection control, determining the candidate video segment as a target video segment.
6. The method of claim 5, further comprising:
in response to detecting that a cursor stays on any of the candidate video segments in the search result page, determining that the user focuses on the candidate video segment;
and/or,
in response to detecting that a gaze of the user stays on any of the candidate video segments in the search result page, determining that the user focuses on the candidate video segment.
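Claim 6's two attention signals (a cursor remaining on a clip, or the user's gaze stopping on it) both reduce to a dwell-time test. The sketch below is a hypothetical implementation; the 0.5-second threshold is an assumption, not a value from the disclosure.

```python
DWELL_THRESHOLD_S = 0.5  # assumed dwell threshold, not specified in the disclosure

def is_focused(enter_time: float, now: float,
               threshold: float = DWELL_THRESHOLD_S) -> bool:
    # The user is deemed to focus on a candidate segment once the cursor
    # (or gaze) has stayed on it for at least the threshold duration.
    return (now - enter_time) >= threshold
```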
7. The method according to any one of claims 1 to 6, further comprising:
displaying a video clip control in the search result page according to the number of the target video segments, wherein icon content of the video clip control comprises the number of the target video segments.
8. The method according to any one of claims 1 to 7, wherein the generating a target video from at least one of the target video segments comprises:
in response to a video clipping instruction, displaying a video clipping interface, wherein the video clipping interface comprises at least one second video preview window in one-to-one correspondence with at least one target video segment, and the second video preview window is used for displaying the target video segment;
and in response to a video composition instruction, compositing the at least one target video segment according to an order of the at least one second video preview window in the video clip interface, to obtain a target video.
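Claim 8's ordering rule, under which the target video follows the order of the second video preview windows in the clip interface, can be sketched as below; `move_window` additionally mirrors the reordering move operation of claim 11. The list-of-identifiers model and the `|` join standing in for actual video concatenation are illustrative assumptions.

```python
from typing import List

def move_window(order: List[str], segment_id: str, new_index: int) -> List[str]:
    # A move operation on a second video preview window: remove the
    # segment from its current position and re-insert it at the new one.
    reordered = [s for s in order if s != segment_id]
    reordered.insert(new_index, segment_id)
    return reordered

def compose(order: List[str]) -> str:
    # Stand-in for video composition: concatenate target segments in the
    # current preview-window order to obtain the target video.
    return "|".join(order)
```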
9. The method of claim 8, wherein after the displaying a video clip interface, the method further comprises:
in response to detecting that the user focuses on any of the second video preview windows in the video clip interface, playing the target video segment in the second video preview window.
10. The method of claim 8 or 9, wherein the second video preview window comprises a cropping control, the cropping control comprising a cropping start sub-control and a cropping end sub-control;
after the displaying the video clip interface, the method further comprises:
and in response to a drag instruction for the cropping start sub-control and/or the cropping end sub-control, cropping the target video segment in the second video preview window to obtain a cropped target video segment.
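A minimal sketch of the cropping behavior in claim 10, assuming segment boundaries are expressed in seconds: dragging the cropping start/end sub-controls maps to new start and end times, clamped to the original segment. The `Segment` type and the clamping policy are assumptions for illustration.

```python
from dataclasses import dataclass

@dataclass
class Segment:
    start: float  # seconds
    end: float    # seconds

def crop(segment: Segment, new_start: float, new_end: float) -> Segment:
    # Clamp the dragged cropping points to the segment's original bounds
    # so the cropped target segment never extends past the source clip.
    start = max(segment.start, new_start)
    end = min(segment.end, new_end)
    if end <= start:
        raise ValueError("cropping end point must follow the start point")
    return Segment(start, end)
```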
11. The method of any of claims 8-10, wherein after the displaying a video clip interface, the method further comprises:
adjusting an order of the second video preview windows in the video clip interface in response to a move operation for any of the second video preview windows.
12. The method of any of claims 8-11, wherein after the displaying a video clip interface, the method further comprises:
and in response to detecting that the user focuses on any of the second video preview windows, displaying a deletion control corresponding to the second video preview window.
13. The method of claim 12, wherein after said presenting the deletion control corresponding to the second video preview window, the method further comprises:
and in response to the deletion control being triggered, deleting the second video preview window from the video clip interface, and deleting the target video segment corresponding to the second video preview window.
14. The method of any of claims 8 to 13, wherein the video clip interface further comprises a third video preview window for previewing the target video.
15. The method according to any one of claims 1 to 14, wherein generating a target video from at least one of the target video segments comprises:
determining a target resolution selected by a user;
and generating a target video according to the target resolution and at least one target video segment.
16. The method according to any one of claims 1 to 15, wherein after generating a target video from at least one of the target video segments, the method further comprises:
in response to a publishing instruction corresponding to the target video, determining a target publishing platform and a target publishing time selected by a user;
and sending a publishing request corresponding to the target video to a server corresponding to the target publishing platform, wherein the publishing request comprises the target publishing time.
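Claim 16's publishing step can be sketched as assembling a request that carries the user-selected platform and the target publishing time. The dictionary shape below is purely an assumed wire format for illustration; the disclosure does not specify one.

```python
def build_publish_request(video_id: str, platform: str, publish_time: str) -> dict:
    # The publishing request sent to the server of the target publishing
    # platform carries the target publishing time (claim 16).
    return {
        "video": video_id,
        "platform": platform,
        "publish_time": publish_time,
    }
```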
17. A video generation apparatus, comprising:
the first display module is used for responding to a behavior search instruction, and displaying a candidate video segment matched with the behavior search instruction in a search result page, wherein the behavior search instruction comprises a first search word used for representing a behavior, and the candidate video segment comprises a behavior tag matched with the first search word;
the first determining module is used for, in response to a selection instruction for at least one of the candidate video segments in the search result page, respectively determining the at least one candidate video segment as a target video segment;
and the generating module is used for generating a target video according to at least one target video segment.
18. An electronic device, comprising:
one or more processors;
a memory for storing executable instructions;
wherein the one or more processors are configured to invoke the memory-stored executable instructions to perform the method of any one of claims 1 to 16.
19. A computer readable storage medium having computer program instructions stored thereon, which when executed by a processor implement the method of any one of claims 1 to 16.
CN202110762497.0A 2021-07-06 2021-07-06 Video generation method and device, electronic equipment and storage medium Pending CN113473225A (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN202110762497.0A CN113473225A (en) 2021-07-06 2021-07-06 Video generation method and device, electronic equipment and storage medium
PCT/CN2022/076959 WO2023279726A1 (en) 2021-07-06 2022-02-18 Video generation method and apparatus, and electronic device, storage medium, computer program and computer program product

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110762497.0A CN113473225A (en) 2021-07-06 2021-07-06 Video generation method and device, electronic equipment and storage medium

Publications (1)

Publication Number Publication Date
CN113473225A 2021-10-01

Family

ID=77878553

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110762497.0A Pending CN113473225A (en) 2021-07-06 2021-07-06 Video generation method and device, electronic equipment and storage medium

Country Status (2)

Country Link
CN (1) CN113473225A (en)
WO (1) WO2023279726A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2023279726A1 (en) * 2021-07-06 2023-01-12 上海商汤智能科技有限公司 Video generation method and apparatus, and electronic device, storage medium, computer program and computer program product

Citations (5)

Publication number Priority date Publication date Assignee Title
US20120323897A1 (en) * 2011-06-14 2012-12-20 Microsoft Corporation Query-dependent audio/video clip search result previews
CN109151537A (en) * 2018-08-29 2019-01-04 北京达佳互联信息技术有限公司 Method for processing video frequency, device, electronic equipment and storage medium
CN109189987A (en) * 2017-09-04 2019-01-11 优酷网络技术(北京)有限公司 Video searching method and device
CN110121093A (en) * 2018-02-06 2019-08-13 优酷网络技术(北京)有限公司 The searching method and device of target object in video
CN112004163A (en) * 2020-08-31 2020-11-27 北京市商汤科技开发有限公司 Video generation method and device, electronic equipment and storage medium

Family Cites Families (5)

Publication number Priority date Publication date Assignee Title
US9443011B2 (en) * 2011-05-18 2016-09-13 Microsoft Technology Licensing, Llc Searching for images by video
CN110121116A (en) * 2018-02-06 2019-08-13 上海全土豆文化传播有限公司 Video generation method and device
CN112004138A (en) * 2020-09-01 2020-11-27 天脉聚源(杭州)传媒科技有限公司 Intelligent video material searching and matching method and device
CN112423138A (en) * 2020-11-06 2021-02-26 北京字节跳动网络技术有限公司 Search result display method and terminal equipment
CN113473225A (en) * 2021-07-06 2021-10-01 北京市商汤科技开发有限公司 Video generation method and device, electronic equipment and storage medium

Also Published As

Publication number Publication date
WO2023279726A1 (en) 2023-01-12

Similar Documents

Publication Publication Date Title
TWI667917B (en) Multimedia search result display method and device
CN107644646B (en) Voice processing method and device for voice processing
CN107948708B (en) Bullet screen display method and device
WO2023066297A1 (en) Message processing method and apparatus, and device and storage medium
US20200007944A1 (en) Method and apparatus for displaying interactive attributes during multimedia playback
JP2020515124A (en) Method and apparatus for processing multimedia resources
CN109413478B (en) Video editing method and device, electronic equipment and storage medium
CN107820131B (en) Comment information sharing method and device
CN108495168B (en) Bullet screen information display method and device
CN109063101B (en) Video cover generation method and device
CN107122430B (en) Search result display method and device
TW201902232A (en) Method and apparatus for previewing video search results, and computer readable storage medium
CN110234030B (en) Bullet screen information display method and device
WO2019095913A1 (en) Interface display method and apparatus
WO2019109704A1 (en) Information display method and apparatus
WO2019095821A1 (en) Interface display method and apparatus
WO2018188410A1 (en) Feedback response method and apparatus
CN109756783B (en) Poster generation method and device
WO2019095817A1 (en) Interface display method and apparatus
WO2023279726A1 (en) Video generation method and apparatus, and electronic device, storage medium, computer program and computer program product
CN106447747B (en) Image processing method and device
US20200037050A1 (en) Play Framework, Display Method, Apparatus and Storage Medium for Media Content
CN110620960B (en) Video subtitle processing method and device
CN109151544B (en) Multimedia playing and displaying method and device
CN111782110A (en) Screen capturing method and device, electronic equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
REG Reference to a national code

Ref country code: HK

Ref legal event code: DE

Ref document number: 40059718

Country of ref document: HK

RJ01 Rejection of invention patent application after publication

Application publication date: 20211001