CN116980718A - Scenario recomposition method and device for video, electronic equipment and storage medium - Google Patents

Scenario recomposition method and device for video, electronic equipment and storage medium

Info

Publication number
CN116980718A
Authority
CN
China
Prior art keywords
scenario
video
original
information
adaptation
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310275923.7A
Other languages
Chinese (zh)
Inventor
孙雨婷
莫琛
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tencent Technology Shenzhen Co Ltd
Original Assignee
Tencent Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tencent Technology Shenzhen Co Ltd filed Critical Tencent Technology Shenzhen Co Ltd
Priority to CN202310275923.7A
Publication of CN116980718A
Legal status: Pending

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/80Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
    • H04N21/85Assembly of content; Generation of multimedia applications
    • H04N21/854Content authoring
    • H04N21/8541Content authoring involving branching, e.g. to different story endings
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/23Processing of content or additional data; Elementary server operations; Server middleware
    • H04N21/234Processing of video elementary streams, e.g. splicing of video streams, manipulating MPEG-4 scene graphs
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/44Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream, rendering scenes according to MPEG-4 scene graphs
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/47End-user applications
    • H04N21/472End-user interface for requesting content, additional data or services; End-user interface for interacting with content, e.g. for content reservation or setting reminders, for requesting event notification, for manipulating displayed content
    • H04N21/47205End-user interface for requesting content, additional data or services; End-user interface for interacting with content, e.g. for content reservation or setting reminders, for requesting event notification, for manipulating displayed content for manipulating displayed content, e.g. interacting with MPEG-4 objects, editing locally

Abstract

The application relates to the technical field of the Internet, and provides a scenario recomposition method and device for video, an electronic device and a storage medium, which are used for improving the scenario recomposition efficiency of video. The method comprises the following steps: in response to an adaptation operation triggered for an original video to be adapted, presenting a selection interface containing the video generation areas associated with the original video, each video generation area being associated with one video clip set in the original video; in response to a selection operation triggered for the video generation areas, presenting an editing interface containing the original scenario information of the video clip sets respectively associated with the selected target video generation areas; and in response to a scenario recomposition operation triggered in the editing interface, acquiring the corresponding scenario recomposition information and presenting a target video generated based on the acquired at least one piece of original scenario information and the scenario recomposition information. The target video is automatically adapted based on the video clip sets respectively associated with the target video generation areas, so that the adaptation efficiency can be effectively improved.

Description

Scenario recomposition method and device for video, electronic equipment and storage medium
Technical Field
The application relates to the technical field of Internet, and provides a scenario adaptation method and device for video, electronic equipment and a storage medium.
Background
At present, the scenes and plot content of film and television works (such as films, television dramas, and various long and short videos) are shot and produced according to a pre-written script. When watching such content, the viewing object often finds that the original plot direction and drama content do not match its expectations. However, because platforms currently offer no good solution, even if the viewing object holds a view different from the current scenario, the work still plays out according to the original scenario.
In this case, a viewing object with some screenwriting ability can write a complete spin-off script by itself, but this is time-consuming and labor-intensive; a viewing object with some editing ability can splice new spin-off content by recombining segments of the original episodes, but the result obtained in this way is usually rough and breaks immersion, and the process is likewise time-consuming and labor-intensive.
Therefore, how to link scenario development to the viewing requirements of the viewing object, automatically generate a scenario that meets those requirements, and improve the scenario adaptation efficiency of video has become a problem to be solved.
Disclosure of Invention
The embodiment of the application provides a scenario adaptation method, a scenario adaptation device, electronic equipment and a storage medium for video, which are used for improving the scenario adaptation efficiency of the video.
The scenario adaptation method of the video in the embodiment of the application comprises the following steps:
in response to an adaptation operation triggered for an original video to be adapted, presenting a selection interface comprising: each video generation area associated with the original video; wherein each video generation region is associated with a set of video segments in the original video;
responsive to a selection operation triggered for the respective video generation region, presenting an editing interface comprising: original scenario information of video clip sets respectively associated with at least one selected target video generation region;
responding to the scenario recomposition operation triggered in the editing interface, acquiring corresponding scenario recomposition information, and presenting a target video generated based on the acquired at least one original scenario information and the scenario recomposition information; the target video is adapted based on the video clip sets respectively associated with the at least one target video generation region.
The scenario adaptation method of the video provided by the embodiment of the application comprises the following steps:
Analyzing script text and picture data of an original video to be adapted to obtain original scenario analysis data corresponding to the original video;
acquiring scenario adaptation information of at least one target video generation area in each video generation area associated with the original video, and updating the original scenario analysis data based on the scenario adaptation information to generate corresponding new scenario analysis data; the scenario adaptation information is adaptation information of original scenario information of video clips respectively associated with the at least one target video generation region;
generating a new script text and each scene fragment according with the scenario recomposition information based on the new scenario analytic data, and synthesizing each scene fragment into a target video based on the new script text; the target video is adapted based on the respective associated video segment sets of the at least one target video generation region.
The scenario adaptation device of the video in the embodiment of the application comprises:
a first response unit, configured to respond to an adaptation operation triggered for an original video to be adapted, and present a selection interface, where the selection interface includes: each video generation area associated with the original video; wherein each video generation region is associated with a set of video segments in the original video;
The second response unit is used for responding to the selection operation triggered by each video generation area and presenting an editing interface, and the editing interface comprises: original scenario information of video clip sets respectively associated with at least one selected target video generation region;
the third response unit is used for responding to the scenario recomposition operation triggered in the editing interface, acquiring corresponding scenario recomposition information and presenting a target video generated based on the acquired at least one original scenario information and the scenario recomposition information; the target video is adapted based on the video clip sets respectively associated with the at least one target video generation region.
Optionally, the first response unit is specifically configured to:
presenting a video playing interface for playing the original video; the video playing interface comprises an adaptation entrance;
responding to an adaptation operation triggered by an adaptation port in the video playing interface, and presenting the selection interface; one video generation area in the selection interface is marked with the current playing progress of the original video.
Optionally, the editing interface further includes scenario development options associated with each of the at least one target video generation region;
The third response unit is specifically configured to:
responding to a selection operation triggered for each scenario development option, and taking at least one selected target scenario development option as scenario adaptation information related to scenario development;
and presenting the target video obtained by adapting the plot development trend of the video clip set respectively associated with the at least one target video generation area based on the obtained at least one original plot information and the plot adaptation information.
Optionally, the editing interface further includes a first custom area;
the third response unit is specifically configured to:
responding to the input operation aiming at the first custom region, acquiring the input scenario custom content, and taking the scenario custom content as the scenario recomposition information;
and presenting the target video obtained by adapting the plot development trend of the video clip set respectively associated with the at least one target video generation area based on the obtained at least one original plot information and the plot adaptation information.
Optionally, the third response unit is further configured to:
before presenting a target video generated based on the obtained at least one original scenario information and the scenario adaptation information, presenting a next page of the editing interface in response to a next page operation for the editing interface; the next page includes: the scenario style information of the video clip set respectively associated with the at least one target video generation region and preset scenario style options;
And responding to a selection operation triggered for each scenario style option, and taking the selected at least one target scenario style option as scenario adaptation information related to the scenario style.
Optionally, the third response unit is further configured to:
before presenting a target video generated based on the obtained at least one original scenario information and the scenario adaptation information, presenting a next page of the editing interface in response to a next page operation for the editing interface; the next page includes: the original scenario roles of the video clip sets respectively associated with the at least one target video generation region and preset role options;
responding to a character modification operation triggered based on the character options, and obtaining scenario adaptation information related to scenario characters; wherein the character modification operation includes at least one of: a character replacement operation between original scenario characters, a character replacement operation between a candidate scenario character and an original scenario character, and a character addition operation based on a candidate scenario character; the candidate scenario characters include at least one of: scenario characters, other than the original scenario characters, contained in the original video, and scenario characters in other video materials.
Optionally, the third response unit is further configured to:
before presenting a target video generated based on the obtained at least one original scenario information and the scenario adaptation information, presenting a next page of the editing interface in response to a next page operation for the editing interface; the next page includes: the original scenario scenes of the video clip sets respectively associated with the at least one target video generation area and preset scene options;
acquiring scenario adaptation information related to the scenario scene in response to a scene modification operation triggered based on the scene options; wherein the scene modification operation includes at least one of: a scene replacement operation between original scenario scenes, a scene replacement operation between a candidate scenario scene and an original scenario scene, and a scene addition operation based on a candidate scenario scene; the candidate scenario scenes include at least one of: scenario scenes, other than the original scenario scenes, contained in the original video, and scenario scenes in other video materials.
Optionally, the third response unit is further configured to:
before presenting a target video generated based on the obtained at least one original scenario information and the scenario adaptation information, presenting a next page of the editing interface in response to a next page operation for the editing interface; the next page includes: setting a control for the duration;
And acquiring scenario adaptation information related to the scenario duration in response to a duration setting operation triggered by the duration setting control.
Optionally, the third response unit is further configured to:
before presenting a target video generated based on the obtained at least one original scenario information and the scenario adaptation information, presenting a next page of the editing interface in response to a next page operation for the editing interface; the next page includes: a second custom region;
and responding to the input operation aiming at the second custom region, acquiring other input custom content, and taking the other custom content as the scenario adaptation information.
Optionally, the third response unit is further configured to:
before presenting a target video generated based on the obtained at least one original scenario information and the scenario adaptation information, presenting a content generation interface comprising: generating a progress bar for the target video;
the third response unit is specifically configured to:
and determining that the generation of the target video is completed, and playing the target video in the content generation interface.
Optionally, the third response unit is further configured to:
after presenting a content generation interface and before playing the target video in the content generation interface, responding to a minimizing operation for the content generation interface, jumping to a video playing interface, and presenting a generation progress bar for the target video at a position associated with the adaptation entry in the video playing interface; the adaptation entry is used for triggering an adaptation operation for the original video.
Optionally, the apparatus further includes:
a fourth response unit, configured to respond to an editing operation for the target video, and present at least one type of editing control;
and responding to the selection operation triggered by the at least one type of editing control, editing the picture content of the target video based on the selected target editing control, and presenting a corresponding editing effect.
The scenario adaptation device of a video provided by the embodiment of the application comprises:
the analysis unit is used for analyzing the script text and the picture data of the original video to be adapted to obtain original scenario analysis data corresponding to the original video;
the updating unit is used for acquiring scenario recomposition information of at least one target video generation area in each video generation area associated with the original video, updating the original scenario analysis data based on the scenario recomposition information and generating corresponding new scenario analysis data; the scenario adaptation information is adaptation information of original scenario information of video clips respectively associated with the at least one target video generation region;
The adaptation unit is used for generating a new script text and each scene segment which accord with the scenario adaptation information based on the new scenario analysis data, and synthesizing each scene segment into a target video based on the new script text; the target video is adapted based on the respective associated video segment sets of the at least one target video generation region.
Optionally, the analysis unit is specifically configured to:
analyzing the script text of the original video to obtain an original character behavior logic sequence containing scenes;
analyzing the picture of the original video, and supplementing the logic sequence of the behavior of the original role by combining the picture analysis result to obtain scenario analysis data corresponding to the original video.
Optionally, the analysis unit is specifically configured to:
inputting script text of the original video into a trained natural language analysis model to obtain the original character behavior logic sequence output by the natural language analysis model;
the natural language analysis model is obtained by training a sample script text serving as an input characteristic and a corresponding sample character behavior logic sequence serving as an output characteristic; the sample character behavior logic sequence is constructed based on the relation among characters obtained through text analysis of the sample script and the data logic description of the scenario content structure obtained through corresponding scene analysis.
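The present application does not name a concrete architecture for the natural language analysis model. As one hedged illustration only, a generic sequence-to-sequence backbone (here T5 via the Hugging Face transformers library, an assumption rather than part of the application) could be fine-tuned with script text as the input feature and a serialized character behavior logic sequence as the output feature:

```python
import torch
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

tokenizer = AutoTokenizer.from_pretrained("t5-small")        # placeholder backbone
model = AutoModelForSeq2SeqLM.from_pretrained("t5-small")
optimizer = torch.optim.AdamW(model.parameters(), lr=3e-5)

# Hypothetical training pair: sample script text in, serialized sample
# character behavior logic sequence out.
sample_script = "INT. RESTAURANT - NIGHT. A confronts B about the letter ..."
sample_sequence = "scene=restaurant | characters=A,B | action=A confronts B about the letter"

inputs = tokenizer(sample_script, return_tensors="pt", truncation=True)
labels = tokenizer(sample_sequence, return_tensors="pt", truncation=True).input_ids

outputs = model(input_ids=inputs.input_ids,
                attention_mask=inputs.attention_mask,
                labels=labels)                               # cross-entropy loss on the target
outputs.loss.backward()                                      # one illustrative training step
optimizer.step()
```

In practice such a model would be trained over many sample script texts and their corresponding sample character behavior logic sequences, as described above.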
Optionally, the analysis unit is specifically configured to:
inputting the original character behavior logic sequence and the original video into a trained picture analysis model to obtain discrete scene segments output by the picture analysis model; each character behavior logic sequence comprises at least one character behavior logic sequence unit, and each character behavior logic sequence unit corresponds to one scene segment;
analyzing pictures in each obtained scene segment to obtain key information, and marking the original role behavior logic sequence based on the obtained key information to obtain the original scenario analysis data;
the picture analysis model is obtained by training a sample video and a sample character behavior logic sequence serving as input features and corresponding sample scene fragments serving as output features; the sample scene segment is obtained by splitting the sample video based on the sample character behavior logic sequence.
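The splitting of a video into discrete scene segments can be pictured with an off-the-shelf shot-boundary detector such as PySceneDetect. This is only an illustration of the splitting step under that assumption; the present application instead trains a dedicated picture analysis model on sample videos paired with sample character behavior logic sequences.

```python
from scenedetect import detect, ContentDetector

# Hypothetical input file; ContentDetector finds cuts from frame-content changes.
scene_list = detect("original_video.mp4", ContentDetector())

for start, end in scene_list:
    print(f"scene segment: {start.get_timecode()} -> {end.get_timecode()}")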
Optionally, the updating unit is specifically configured to:
inputting the scenario adaptation information and the original scenario analysis data into a trained natural language generation model to obtain a new character behavior logic sequence output by the natural language generation model, and taking the new character behavior logic sequence as the new scenario analysis data;
The natural language generation model is obtained by training an original sample character behavior logic sequence and sample scenario adaptation information serving as input features and a corresponding new sample character behavior logic sequence serving as output features; the new sample character behavior logic sequence is obtained by modifying the original sample character behavior logic sequence for a plurality of times based on sample scenario adaptation information; the original sample character behavior logic sequence is constructed based on the relation between characters obtained by analyzing the sample script text and the data logic description of the scenario content structure obtained by analyzing the corresponding scene.
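At inference time, the trained natural language generation model can be pictured as a sequence-to-sequence generator that consumes the serialized original character behavior logic sequence together with the scenario adaptation information and emits the new sequence. The serialization format and backbone below are assumptions for illustration only, not a format defined by the present application.

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

tokenizer = AutoTokenizer.from_pretrained("t5-small")        # a trained model in practice
model = AutoModelForSeq2SeqLM.from_pretrained("t5-small")

prompt = ("original: scene=restaurant | characters=A,B | action=A confronts B || "
          "adaptation: ending=reconciliation; style=comedy")
inputs = tokenizer(prompt, return_tensors="pt", truncation=True)

generated = model.generate(inputs.input_ids, max_new_tokens=128)
new_character_behavior_logic_sequence = tokenizer.decode(generated[0],
                                                         skip_special_tokens=True)
print(new_character_behavior_logic_sequence)
```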
Optionally, the original sample character behavior logic sequence further includes an end identifier, where the end identifier is used to represent a scenario end type of the corresponding sample video.
Optionally, the adapting unit is specifically configured to:
inputting the new character behavior logic sequence into a trained picture generation model to obtain a scene segment output by the picture generation model;
the picture generation model is obtained by training a sample character behavior logic sequence and sample key information serving as input features and corresponding sample scene fragments serving as output features; the sample key information is obtained by analyzing pictures in each sample scene segment.
Optionally, the picture generation model is a generative adversarial network; the adaptation unit is specifically configured to:
mapping key information corresponding to the new character behavior logic sequence into a plurality of feature vectors; each key information corresponds to a feature vector;
inputting the plurality of feature vectors into a generator in the picture generation model, generating corresponding images, and obtaining scene fragments composed of the images.
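A minimal sketch of this conditional generation step is given below, assuming a simple embedding of key information into feature vectors and a small transposed-convolution generator; the layer sizes and embedding scheme are illustrative and not taken from the present application. In a full generative adversarial network, a discriminator trained against real scene frames would provide the adversarial loss for this generator.

```python
import torch
import torch.nn as nn

class Generator(nn.Module):
    """Minimal conditional generator: one feature vector in, one 32x32 frame out."""
    def __init__(self, feat_dim: int = 128, img_channels: int = 3):
        super().__init__()
        self.net = nn.Sequential(
            nn.ConvTranspose2d(feat_dim, 256, 4, 1, 0), nn.BatchNorm2d(256), nn.ReLU(True),
            nn.ConvTranspose2d(256, 128, 4, 2, 1), nn.BatchNorm2d(128), nn.ReLU(True),
            nn.ConvTranspose2d(128, 64, 4, 2, 1), nn.BatchNorm2d(64), nn.ReLU(True),
            nn.ConvTranspose2d(64, img_channels, 4, 2, 1), nn.Tanh(),
        )

    def forward(self, feature_vec: torch.Tensor) -> torch.Tensor:
        # (batch, feat_dim) -> (batch, feat_dim, 1, 1) -> generated image tensor
        return self.net(feature_vec.unsqueeze(-1).unsqueeze(-1))

# Each piece of key information is mapped to one feature vector (hypothetical ids).
key_info_embedding = nn.Embedding(num_embeddings=1000, embedding_dim=128)
key_info_ids = torch.tensor([3, 17, 42])              # e.g. scene, character, mood ids
feature_vectors = key_info_embedding(key_info_ids)    # shape (3, 128)

frames = Generator()(feature_vectors)                 # shape (3, 3, 32, 32)
```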
Optionally, the updating unit is further configured to:
after scenario adaptation information of at least one target video generation area in each video generation area associated with the original video is acquired, semantic logic analysis is carried out on scenario self-defined content if the scenario adaptation information comprises the scenario self-defined content before updating the original scenario analysis data based on the scenario adaptation information;
and if the custom content is determined not to accord with the semantic logic, sending prompt information to the terminal equipment so as to prompt that the scenario custom content needs to be edited again.
An electronic device in an embodiment of the present application includes a processor and a memory, where the memory stores a computer program that, when executed by the processor, causes the processor to execute the steps of the scenario adaptation method of any one of the videos described above.
An embodiment of the present application provides a computer-readable storage medium including a computer program for causing an electronic device to execute the steps of the scenario adaptation method of any one of the videos described above, when the computer program is run on the electronic device.
Embodiments of the present application provide a computer program product comprising a computer program stored in a computer readable storage medium; when the processor of the electronic device reads the computer program from the computer-readable storage medium, the processor executes the computer program, so that the electronic device performs the steps of the scenario adaptation method of any one of the videos described above.
The application has the following beneficial effects:
The embodiment of the application provides a scenario adaptation method and device for video, an electronic device and a storage medium. According to the method, the object can trigger an adaptation operation directly for the original video to be adapted, after which a selection interface is presented for the object to choose which parts of the scenario of the original video to adapt. Specifically, the selection interface contains the video generation areas associated with the original video, and each video generation area is associated with one video clip set in the original video. Based on the selection interface, the object can select at least one target video generation area to be adapted, trigger a scenario adaptation operation in the corresponding editing interface, and set the corresponding scenario adaptation information according to its own needs; a matching target video can then be automatically generated based on the video clip sets associated with the target video generation areas and the scenario adaptation information set by the object. In this way, a scenario that meets the viewing requirements of the object can be generated automatically in combination with those requirements, which improves the flexibility and efficiency of video adaptation.
Additional features and advantages of the application will be set forth in the description which follows, and in part will be obvious from the description, or may be learned by practice of the application. The objectives and other advantages of the application will be realized and attained by the structure particularly pointed out in the written description and claims thereof as well as the appended drawings.
Drawings
The accompanying drawings, which are included to provide a further understanding of the application and are incorporated in and constitute a part of this specification, illustrate embodiments of the application and together with the description serve to explain the application and do not constitute a limitation on the application. In the drawings:
FIG. 1 is an alternative schematic diagram of an application scenario in an embodiment of the present application;
FIG. 2 is a flow chart of a scenario adaptation method of a video according to an embodiment of the present application;
FIG. 3 is a schematic diagram of a video playing interface according to an embodiment of the present application;
FIG. 4 is a schematic diagram of a video generation region in an embodiment of the present application;
FIG. 5 is a schematic diagram of a target video generation region according to an embodiment of the present application;
FIG. 6 is a schematic diagram of a scenario development trend-related editing interface in an embodiment of the present application;
FIG. 7A is a schematic diagram of a scenario style-related editing interface in an embodiment of the present application;
FIG. 7B is a schematic diagram of a scenario style-related editing interface according to an embodiment of the present application;
FIG. 8 is a schematic diagram of a scenario character-related editing interface in an embodiment of the present application;
FIG. 9 is a schematic diagram of a scenario-related editing interface in an embodiment of the present application;
FIG. 10A is a schematic diagram of a scenario duration-related editing interface in an embodiment of the present application;
FIG. 10B is a schematic diagram of a scenario duration-related editing interface according to an embodiment of the present application;
FIG. 11 is a schematic diagram of a scenario ending type-related editing interface in an embodiment of the present application;
FIG. 12 is a schematic diagram of an alternative custom content-related editing interface in accordance with an embodiment of the present application;
FIG. 13 is a schematic diagram of a content creation interface in an embodiment of the application;
FIG. 14 is a schematic diagram of a video playing interface according to another embodiment of the present application;
FIG. 15 is a diagram of a target video frame according to an embodiment of the present application;
FIG. 16 is a schematic diagram of a secondary editing process in an embodiment of the application;
FIG. 17 is a flowchart illustrating a scenario adaptation method of still another video according to an embodiment of the present application;
FIG. 18 is a flowchart illustrating another exemplary scenario adaptation method for video according to an embodiment of the present application;
Fig. 19 is a schematic structural diagram of a scenario adaptation apparatus for video according to an embodiment of the present application;
fig. 20 is a schematic structural diagram of a scenario adaptation apparatus of still another video in an embodiment of the present application;
fig. 21 is a schematic structural diagram of an electronic device according to an embodiment of the present application;
fig. 22 is a schematic structural diagram of still another electronic device according to an embodiment of the present application.
Detailed Description
For the purpose of making the objects, technical solutions and advantages of the embodiments of the present application more apparent, the technical solutions of the present application will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present application, and it is apparent that the described embodiments are some embodiments of the technical solutions of the present application, but not all embodiments. All other embodiments, based on the embodiments described in the present document, which can be obtained by a person skilled in the art without any creative effort, are within the scope of protection of the technical solutions of the present application.
Some of the concepts involved in the embodiments of the present application are described below.
Adaptation: creating a new work on the basis of an original work by changing its form of expression or its use. The scenario adaptation method of the application supports adapting the scenario of an original video to generate a target video, including but not limited to adapting the scenario development trend, scenario style, scenario roles, scenario duration, scenario ending type and the like of the original video.
Spin-off (extra story): a branched story outside the main story line, in which characters from the main story are used to open up a new small story, or a story similar to the main story line but told or performed by other characters. It may also fully develop a part that is mentioned in the main story but not elaborated there.
Prequel: a work set earlier on the timeline of an existing work. It typically describes the story that takes place before the existing work and lays the groundwork for its plot.
Sequel: a work set later on the timeline of an existing work. It typically describes the story that takes place after the existing work and extends its plot.
Video generation region: a data form used to characterize which segments of an original video are to be adapted. An original video may be associated with at least one video generation region, and each video generation region may be associated with a set of video segments in the original video. The association may specifically mean that one video generation region corresponds to one or more video segments contained in the original video; it may also mean that one video generation region corresponds to one or more video segments not contained in the original video, for example spin-off, sequel, or prequel content derived from one or more segments that the original video does contain.
Selection interface: a user-oriented interactive interface used by an object to determine which segments of an original video need to be adapted. In an embodiment of the present application, the selection interface may include the respective video generation regions associated with the original video, each video generation region being associated with a video clip set in the original video, each video clip set including at least one video clip; each video generation region can be used to perform scenario adaptation based at least on the video clip set associated with that region and to generate a corresponding changed video.
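By way of illustration only, a minimal sketch of how a video generation region and its associated video clip set might be represented in code is given below; the field names and types are assumptions for illustration and are not prescribed by the present application.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class VideoSegment:
    """One contiguous clip of the original video (times in seconds; illustrative)."""
    start: float
    end: float

@dataclass
class VideoGenerationRegion:
    """One selectable region presented in the selection interface.

    A region either maps to segments the original video already contains,
    or acts as a placeholder for spin-off / prequel / sequel content derived
    from such segments.
    """
    region_id: str
    label: str                                        # e.g. "Episode 3" (hypothetical)
    segments: List[VideoSegment] = field(default_factory=list)
    is_extension: bool = False                        # True for derived (spin-off) regions

# The selection interface can then simply present a list of such regions,
# one per episode or per group of related video clips.
regions = [VideoGenerationRegion("region_1", "Episode 1", [VideoSegment(0.0, 2700.0)])]
```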
Scenario adaptation information: information describing how plots in the original video are adapted. For example, it may specifically be an adaptation of the plot development trend, plot style, plot roles, plot duration, plot ending type and the like of part of the plot in the original video.
Editing interface: a user-oriented interactive interface used by an object to set scenario adaptation information. The editing interface may include one or more pages. For example, when one page is included, the object may set various kinds of scenario adaptation information based on that page, such as the scenario development trend, scenario style, scenario roles, scenario duration, scenario ending type, and the like listed above; when multiple pages are included, each page may correspond to one (or more) kinds of scenario adaptation information.
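As a further illustration, the scenario adaptation information collected across the editing interface pages could be aggregated into a simple structure such as the following; all field names and values are hypothetical examples rather than a format defined by the present application.

```python
# Hypothetical aggregation of scenario adaptation information set by the object.
scenario_adaptation_info = {
    "target_regions": ["region_3"],                  # selected target video generation regions
    "development": "the two leads reconcile",        # scenario development trend
    "style": "comedy",                               # scenario style option
    "roles": {"replace": {"role_B": "role_C"}},      # role replacement / addition operations
    "scenes": {"replace": {"restaurant": "seaside"}},
    "duration_minutes": 15,                          # scenario duration setting
    "ending_type": "happy_ending",                   # scenario ending type
    "custom_text": "",                               # free-form custom content, if any
}
```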
Script text: a text-format file, namely the base script on which the performance or shooting of a drama, film and the like is based. The script can be regarded as an outline of the story's development that determines the direction in which the story unfolds.
Character behavior logic sequence: a sequence obtained by analyzing and dividing script text according to scenes or characters. One script text can be divided into one or more character behavior logic sequences. For example, when the script text is divided according to scenes, one character behavior logic sequence may be descriptive information describing each fragment in one scene; when it is divided according to characters, one character behavior logic sequence may be descriptive information describing each sub-period of one character within a certain period of time. Each character behavior logic sequence comprises at least one character behavior logic sequence unit, and each character behavior logic sequence unit corresponds to one scene segment. In the embodiment of the present application, unlike the data structure of the script file, the character behavior logic sequence is a data structure oriented to the machine, whereas the script file is text information oriented to people (director, actors, etc.).
Scenario analysis data: data that contains at least the character behavior logic sequence, and that can be obtained by supplementing the character behavior logic sequence with picture analysis results. Scenario analysis data therefore also differs from the data structure of the script file: it is a data structure oriented to the machine, whereas the script file is text information oriented to people (director, actors, etc.).
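For concreteness, one hypothetical machine-oriented shape for a character behavior logic sequence (and hence for scenario analysis data after supplementation with picture analysis results) is sketched below; the present application does not prescribe a concrete schema, and all field names are illustrative.

```python
# One unit per scene segment; "segment_ref" and "key_info" illustrate the
# supplementation performed with picture analysis results.
character_behavior_logic_sequence = [
    {
        "scene": "restaurant_interior",
        "characters": ["role_A", "role_B"],
        "action": "role_A confronts role_B about the letter",
        "segment_ref": {"start": 612.0, "end": 655.5},     # filled in from picture analysis
        "key_info": {"mood": "tense", "props": ["letter"]},
    },
    # ... further character behavior logic sequence units ...
]

# Scenario analysis data = the sequence above plus any global annotations,
# e.g. an ending identifier describing the scenario ending type.
scenario_analysis_data = {
    "sequence": character_behavior_logic_sequence,
    "ending_type": "open_ending",                          # illustrative ending identifier
}
```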
Embodiments of the present application relate to artificial intelligence (Artificial Intelligence, AI) and Machine Learning (ML) techniques, designed based on computer vision techniques and Machine Learning in artificial intelligence.
The artificial intelligence technology mainly comprises a computer vision technology, a natural language processing technology, machine learning/deep learning, automatic driving, intelligent traffic and other directions. With research and progress of artificial intelligence technology, artificial intelligence is developed in various fields such as common smart home, intelligent customer service, virtual assistant, smart speaker, smart marketing, unmanned, automatic driving, robot, smart medical, etc., and it is believed that with the development of technology, artificial intelligence will be applied in more fields and become more and more important value. The scenario adaptation method of the video in the embodiment of the application can combine the artificial intelligence with the video adaptation task in the fields, and can improve the video adaptation efficiency.
Deep learning is a machine learning method in the field of artificial intelligence, and is based on an artificial neural network model. Deep learning processes data hierarchically, extracts complex representations from simple features, and trains with large amounts of data to achieve efficient learning and prediction. Deep learning has achieved significant results in many areas, such as computer vision, natural language processing, speech recognition, and the like.
An AI model is a system (e.g., a software system) that learns and predicts new data from data using artificial intelligence techniques. These models are typically based on mathematical methods such as linear regression, decision trees, support vector machines, neural networks, etc. AI models adapt to data features through training (also known as learning) and predict the output of unknown data through reasoning. Based on the deep learning method, the AI model can be trained to obtain a trained AI model that can inferentially predict the target data. AI models may be used for a variety of applications such as image classification, speech recognition, natural language processing, predictive analysis, and the like. In the embodiment of the present application, the present application mainly relates to an AI natural language analysis model (corresponding to AI natural language analysis technology), an AI natural language generation model (corresponding to AI natural language generation technology), an AI picture analysis model (corresponding to AI picture analysis technology), an AI picture generation model (corresponding to AI picture generation technology), and the like.
Specifically, the scenario adaptation method provided by the application relates to a natural language analysis technology and a natural language generation technology in a natural language processing technology.
Among them, AI natural language analysis technology is technology (which may be in the form of software) that analyzes grammar, semantics and emotion of human language through artificial intelligence algorithm. They help classify text, audio and video, extract information and generate summaries. Common uses include text classification, emotion analysis, speech recognition, machine translation, and the like.
AI natural language analysis techniques are typically implemented using the following techniques. Lexical analysis: the text is segmented into words, punctuation marks, etc. Grammatical analysis: the grammatical relations of the words in the text are determined. Syntactic analysis: the structure of sentences in the text is determined. Semantic analysis: the semantic meaning of words and sentences in the text is determined. Emotion analysis: emotional tendencies in the text are identified. Machine translation: a translation model is used to translate text from one language to another. Conversational agent: a dialogue system model is used to answer the object's queries.
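As a concrete, hedged illustration of the lexical, syntactic and semantic steps listed above, an off-the-shelf NLP toolkit such as spaCy could be used as follows; the present application instead trains its own natural language analysis model and does not prescribe any particular toolkit.

```python
import spacy

nlp = spacy.load("en_core_web_sm")       # assumes the small English model is installed
doc = nlp("A confronts B about the letter in the restaurant.")

# Lexical / grammatical / syntactic information per token.
for token in doc:
    print(token.text, token.pos_, token.dep_, token.head.text)

# Simple semantic information: named entities found in the sentence.
for ent in doc.ents:
    print(ent.text, ent.label_)
```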
According to the embodiment of the application, a trained AI natural language analysis model can be obtained through learning of an AI natural language analysis technology so as to analyze script text of a video based on the model.
AI natural language generation techniques are techniques (which may be in software form) that use artificial intelligence algorithms to generate human language from data and models. They can generate text, audio or images, and have the characteristics of correct grammar, reasonable semantics and coherent style. Common uses include news digest generation, dialogue generation, question and answer generation, report generation, and the like.
The implementation of AI natural language generation techniques generally includes the following steps. Model input: the content to be generated is input into the model. Prediction generation: text is generated using a deep learning algorithm (e.g., RNN, LSTM, Transformer). Language model training: the model is trained to ensure that the generated text complies with language rules. Sampling: the most likely results are sampled from the generated text. Result adjustment: the generated results are adjusted by tuning the input, the generation model, and the language model.
According to the embodiment of the application, a trained AI natural language generation model can be obtained through learning of an AI natural language generation technology so as to generate a new character behavior logic sequence based on the model.
In addition, the scenario adaptation method provided by the application also relates to a picture analysis technology and a picture generation technology.
Among them, the AI picture analysis technique refers to a technique of analyzing and recognizing information of objects, scenes, expressions, etc. in an image, such as image classification, object detection, semantic segmentation, etc., using artificial intelligence technique.
AI picture analysis techniques are typically implemented using the following techniques. Image recognition: objects in the image are identified using a deep learning algorithm. Video analysis: dynamic events in the video are identified. Object detection: objects in the image or video, such as people and vehicles, are identified. Visual language: the image is converted into a text description using a deep learning algorithm. Computer vision: the image is identified using image processing techniques and deep learning algorithms.
As in the embodiments of the present application, a trained AI picture analysis model may be obtained through learning of AI picture analysis techniques to split video into discrete scene segments based on the model.
AI picture generation technology refers to technology that generates images using artificial intelligence techniques, and typically generates images with specific characteristics, such as faces, animals, etc., using a Generative Adversarial Network (GAN).
AI picture generation techniques are typically implemented using the following techniques. Generation network: image and sequence data are generated using a generation network. Training data: the model is trained using a large amount of image data to ensure that the generated images are realistic. Result adjustment: the generated results are adjusted by tuning model parameters and training data.
As in the embodiment of the application, a trained AI picture generation model can be obtained through learning of AI picture generation technology to generate new scene segments based on the model.
It should be noted that the AI picture generation technology further includes AI expression migration and AI face migration.
The AI expression migration is a technology for converting the expression of a person in an input image into another expression through an artificial intelligence algorithm. This technique utilizes a deep learning technique and learns patterns of changes in the expression of a person from a large amount of image data. Expression migration can be used in the application fields of animation production, virtual performance and the like.
AI face migration is a technique that migrates features (e.g., expressions, poses, etc.) of a person in one input image to another image by means of an artificial intelligence algorithm. This technique utilizes a deep learning technique and learns the changing pattern of the character feature from a large amount of image data. The face migration can be used in the application fields of animation production, virtual performance, face recognition and the like.
In addition, the scenario adaptation method provided by the application also relates to a video encoding and decoding technology.
Video codec refers to a process of encoding an original video signal into a digital signal and decoding the digital signal into the original video signal during transmission or storage. This is the core of video compression, with the aim of reducing file size and optimizing transmission speed. Common video coding and decoding standards are h.264, h.265, VP9, etc.
In the embodiment of the application, each newly generated scene segment can be synthesized into a complete target video based on the video coding and decoding technology.
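A minimal sketch of this synthesis step is shown below, assuming the newly generated scene segments already exist as individual files and that the ffmpeg command-line tool is available; the file names are hypothetical.

```python
import subprocess

scene_files = ["scene_001.mp4", "scene_002.mp4", "scene_003.mp4"]   # generated segments

# ffmpeg's concat demuxer reads a list file and stitches the segments together,
# re-encoding them into one target video.
with open("segments.txt", "w") as f:
    for path in scene_files:
        f.write(f"file '{path}'\n")

subprocess.run(
    ["ffmpeg", "-f", "concat", "-safe", "0", "-i", "segments.txt",
     "-c:v", "libx264", "-c:a", "aac", "target_video.mp4"],
    check=True,
)
```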
The following briefly describes the design concept of the embodiment of the present application:
With the development of video technology and the popularization of application software, viewing objects watch content through video applications with increasing frequency. When a viewing object uses a video application to watch a movie or series, it may hold a view of the plot development that differs from the current scenario. In that situation, if the viewing object has no screenwriting or editing ability, it can only put up with the displeasure and imagine alternatives on its own; if it does have some screenwriting or editing ability, creating its own version is time-consuming and labor-intensive.
In order to meet the demand of viewing objects for different scenario directions, some video platforms adopt an alternative scheme: several scenario directions are preset, and every scene is shot for each variation of each direction. The corresponding video clips are then played interactively according to the viewing object's manual selections, satisfying the viewing object's imagination of different plot situations.
However, the episode content obtained in this manner requires the production team or platform to complete the script for every scenario direction and to shoot the content of every direction, which likewise consumes time and labor and increases the workload of production companies. Moreover, because of the interaction involved, viewing objects have to repeatedly rewind and replay during viewing in order to see every interactive branch, which carries a high operation cost; many viewing objects will not patiently play through every branch in this way, so the corresponding shot content is wasted.
In summary, how to link scenario development to the viewing requirements of the viewing object, so that the viewing object can see a scenario that matches its viewing preferences, has become a problem to be solved in current video production and playback.
In view of this, the embodiment of the application provides a scenario adaptation method and device for video, an electronic device and a storage medium. According to the method, the object can trigger an adaptation operation directly for the original video to be adapted, after which a selection interface is presented for the object to choose which parts of the scenario of the original video to adapt. Specifically, the selection interface contains the video generation areas associated with the original video, and each video generation area is associated with one video clip set in the original video. Based on the selection interface, the object can select at least one target video generation area to be adapted, trigger a scenario adaptation operation in the corresponding editing interface, and set the corresponding scenario adaptation information according to its own needs; a matching target video can then be automatically generated based on the video clip sets associated with the target video generation areas and the scenario adaptation information set by the object. In this way, a scenario that meets the viewing requirements of the object can be generated automatically in combination with those requirements, which improves the flexibility and efficiency of video adaptation.
The preferred embodiments of the present application will be described below with reference to the accompanying drawings of the specification, it being understood that the preferred embodiments described herein are for illustration and explanation only, and not for limitation of the present application, and embodiments of the present application and features of the embodiments may be combined with each other without conflict.
Fig. 1 is a schematic diagram of an application scenario according to an embodiment of the present application. The application scenario diagram includes two terminal devices 110 and a server 120.
In the embodiment of the present application, the terminal device 110 includes, but is not limited to, a mobile phone, a tablet computer, a notebook computer, a Personal Computer (PC), an intelligent Bluetooth device, a Virtual Reality (VR) device, an Augmented Reality (AR) device, an intelligent voice interaction device, an intelligent home appliance, a vehicle-mounted terminal, and the like; the terminal device may be provided with a video-related client, where the client may be software (such as a browser, video software, short video software, etc.), or may be a web page, an applet, etc., and the server 120 may be a background server corresponding to the software, web page, applet, etc., or a server specially used for carrying out video scenario adaptation, which is not specifically limited in the application. The server 120 may be an independent physical server, a server cluster or a distributed system formed by a plurality of physical servers, or may be a cloud server providing cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communication, middleware services, domain name services, security services, a Content Delivery Network (CDN), and basic cloud computing services such as big data and artificial intelligence platforms.
It should be noted that the scenario adaptation method of the video in the embodiments of the present application may be performed by an electronic device, which may be the terminal device 110 or the server 120; that is, the method may be performed by the terminal device 110 or the server 120 alone, or by the terminal device 110 and the server 120 together. For example, when the terminal device 110 and the server 120 execute the method together, the terminal device 110 is configured to present an original video to be adapted to an object; after the object triggers an adaptation operation for the original video, an adaptation request may be sent to the server 120, and the server 120 may analyze the script text and picture data of the original video to be adapted to obtain original scenario analysis data corresponding to the original video. In addition, in response to the adaptation operation, the terminal device 110 may also present a selection interface including the respective video generation areas associated with the original video, each video generation area being associated with a set of video segments in the original video. Based on the selection interface, the object can trigger a selection operation for the video generation areas, for example selecting at least one target video generation area, after which an editing interface containing the original scenario information of the video clip sets associated with the target video generation areas can be presented. The object can configure the video adaptation in the editing interface; the terminal device 110 responds to the scenario adaptation operation triggered in the editing interface, acquires the corresponding scenario adaptation information and informs the server 120, and after the server 120 acquires the scenario adaptation information, it updates the original scenario analysis data based on the scenario adaptation information to generate corresponding new scenario analysis data. Further, the server 120 generates a new script text and scene segments conforming to the scenario adaptation information based on the new scenario analysis data, and synthesizes the scene segments into a target video based on the new script text; the server 120 then feeds the synthesized target video back to the terminal device 110 for presentation to the object by the terminal device 110.
In an alternative embodiment, the terminal device 110 and the server 120 may communicate via a communication network.
In an alternative embodiment, the communication network is a wired network or a wireless network.
It should be noted that, the number of terminal devices and servers shown in fig. 1 is merely illustrative, and the number of terminal devices and servers is not limited in practice, and is not particularly limited in the embodiment of the present application.
In the embodiment of the application, when the number of the servers is multiple, the multiple servers can be formed into a blockchain, and the servers are nodes on the blockchain; according to the scenario adaptation method of the video disclosed by the embodiment of the application, related information of the original video and the target video can be stored in a blockchain, such as an original script text, a new script text, original scenario analysis data, new scenario analysis data, scenario adaptation information and the like.
In addition, the embodiment of the application can be applied to various scenes, including but not limited to cloud technology, artificial intelligence, intelligent transportation, auxiliary driving and other scenes.
The scenario adaptation method of video provided by the exemplary embodiment of the present application will be described below with reference to the accompanying drawings in conjunction with the above-described application scenario, and it should be noted that the above-described application scenario is only shown for the convenience of understanding the spirit and principle of the present application, and the embodiment of the present application is not limited in any way in this respect.
Referring to fig. 2, a flowchart of an implementation of a scenario adaptation method of a video in an embodiment of the present application is shown, taking a client installed in a terminal device as an execution body as an example, and the implementation flow of the method is as follows:
s21: the client responds to the adapting operation triggered by the original video to be adapted, and presents a selection interface comprising: each video generation area associated with the original video; wherein each video generation region is associated with a set of video segments in the original video.
In an embodiment of the present application, each video clip set contains at least one video clip.
It should be noted that the video types in the embodiments of the present application include, but are not limited to, various long videos, short videos, film and television episodes (such as TV dramas, movies, cartoons), variety shows, and the like.
Taking a TV drama as an example, each set (episode) of video content can be used as one video clip; alternatively, a portion of one set of video content may be used as one video clip; alternatively, some or all of multiple sets of video content may be combined as one video clip, and so on, without limitation herein. Correspondingly, in the embodiment of the application, the video generation areas corresponding to the TV drama can be divided according to the number of sets of the TV drama or the relevance of the video content; for example, each set of video content taken as one video clip corresponds to one video generation area; or the one or more video clips contained in each set of video content correspond to one video generation area; or one or more video clips obtained by partially or completely combining multiple sets of video content correspond to one video generation area, and so on, which is not specifically limited herein.
Taking a movie as an example, the movie may be divided according to the storyline of the movie, etc., e.g., the movie is divided into several parts according to the storyline, each part being a video clip. Alternatively, the division may be performed according to scenario characters, movie scenes, and the like, which are not particularly limited herein.
The foregoing is merely illustrative, and any video clip dividing method is applicable to the embodiments of the present application, which is not specifically limited herein.
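Purely as a non-limiting illustration of the association described above, the video generation areas and their video clip sets could be modeled as in the following sketch; all class names, identifiers and durations are hypothetical and not part of the claimed method.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class VideoClip:
    clip_id: str
    start_s: float   # start time within the original video, in seconds
    end_s: float     # end time, in seconds

@dataclass
class VideoGenerationArea:
    area_id: str     # e.g. "set-3", "forward", "follow-up", "extra"
    label: str
    clip_set: List[VideoClip] = field(default_factory=list)  # associated video clip set

# One area per set (episode) of the drama, plus areas for forward / follow-up / extra content.
areas = [VideoGenerationArea(f"set-{i}", f"Set {i}",
                             [VideoClip(f"set-{i}-full", 0.0, 2700.0)])
         for i in range(1, 9)]
areas += [VideoGenerationArea(name, name.title()) for name in ("forward", "follow-up", "extra")]
```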
In the embodiment of the application, the object can trigger the adaptation operation in the process of watching the original video, or can directly trigger the adaptation operation for the original video after searching for the original video, which is not specifically limited herein.
An alternative implementation manner is to implement S21 according to the following substeps, and the specific implementation procedures are as follows S211-S212:
s211: the client presents a video playing interface for playing the original video; the video playback interface includes an adaptation entry.
Wherein the adaptation entrance may be in the form of a button, a bubble, etc., without specific limitation herein.
S212: the client responds to the adaptation operation triggered via the adaptation entry in the video playing interface, and presents the selection interface; one video generation area in the selection interface is marked with the current playing progress of the original video.
Specifically, the object may trigger the adapting operation by clicking, long pressing the "adapting entry" and the like, which are not limited herein.
Fig. 3 is a schematic diagram of a video playing interface according to an embodiment of the application. S300 is an example of an adaptation entry in the embodiment of the present application. The object may click S300, jump to the selection interface listed in fig. 4, and display at least one video generation area.
Referring to fig. 4, a schematic diagram of a video generation area according to an embodiment of the application is shown. In fig. 4, each video generation area corresponds to one set of the TV drama as an example, and video generation areas for at least sets 1-8 are shown in fig. 4.
In addition, video generation areas corresponding to forward (prequel), follow-up (sequel) and extra (side-story) content and the like may also be provided.
In the above embodiment, the video clip set associated with the video generation area denoted by 1 is the first set of the video (the whole set may be regarded as one video clip, or may be divided into a plurality of video clips); the video clip set associated with the video generation area denoted by 2 is the second set of the video, and so on.
The "current user is in the current set number and progress" section shown in fig. 4 indicates that the object is currently watched as set 3, and the progress of the watching is shown in fig. 4, which is a representation form of the current playing progress of the original video in the embodiment of the present application, which is not specifically limited herein.
In the above embodiment, the object can quickly open the selection interface through the adaptation entry, check the current playing progress of the original video, and quickly locate the relevant video generation area based on the current playing progress, so as to quickly determine the target video generation area to be adapted, thereby improving efficiency.
S22: the client responds to the selection operation triggered by each video generation area, and presents an editing interface which comprises: the original scenario information of the video clip sets respectively associated with the selected at least one target video generation region.
In the embodiment of the application, the object can select, by clicking, sliding across, or other gestures, all the video generation areas for which it wants to generate an adaptation, and these serve as the target video generation areas.
Still taking the example shown in fig. 4, the original video includes sets 1-8, each set corresponding to one video generation area; in addition, three more video generation areas are provided for the forward, follow-up and extra content, giving 11 video generation areas in total.
In the embodiment of the application, the object can select one or more of the video generation areas presented on the selection interface as target video generation areas.
In the case of selecting a plurality of target video generation regions, the object may select several video generation regions that are continuous or several video generation regions that are discontinuous, which is not particularly limited herein.
After the object selects the target video generation areas to be adapted, the editing interface may display: the original scenario information of the video clip sets respectively associated with the currently selected target video generation areas, i.e., an outline of the original scenario trend, so that the object can conveniently understand the current scenario information.
In the embodiment of the present application, the display style of the original scenario information includes, but is not limited to: text and pictures. For example, text forms such as keywords (e.g., success/failure, etc.) or sentences can be adopted, or video stills corresponding to the video clips (such as well-known scenes or scenes with high popularity or discussion) can be used.
When the original scenario information is displayed in sentence form, keywords in the sentence can be further highlighted, e.g., bolded, enlarged or highlighted, which is not specifically limited herein.
In the following, the original scenario information is described taking the sentence form with highlighted keywords as an example.
Fig. 5 is a schematic diagram of a target video generation area according to an embodiment of the application.
The object selects video generating areas with the reference numbers of 3, 4 and 5 as target video generating areas, and video fragment sets respectively associated with the three target video generating areas are set 3, set 4 and set 5.
On this basis, the interface can jump to the editing interface, where the object sets the scenario adaptation information.
Fig. 6 is a schematic diagram of a scenario development trend-related editing interface according to an embodiment of the present application. As shown in fig. 6, in the "current scenario" section, original scenario information of each of the 3 rd set, the 4 th set and the 5 th set is displayed, for example: the 3 rd set is: a and B are aware of the collusion of C and implement the back-pursuit plan; the 4 th set is: c is arrested into prison, A realizes that B acts in the process of pursuing are somewhat abnormal; the 5 th set is: b starts to show his tricks gradually, a probes B at the dangerous edge.
Wherein, some keywords in the original scenario information are highlighted by thickening. Such as "collusion", "anti-pursuit", "prison", "alien", "tricks", etc.
It should be noted that the adaptation involved in the embodiments of the present application may be not only a modification of the original scenario contained in the original video, but also the generation, on the basis of the original video, of supplemental content not contained in the original video, such as forward, follow-up or extra content.
For the second case, for generating the forward, subsequent or additional content, the corresponding original scenario information may be the current scenario information of all the content of the original video, or may be "blank".
Still referring to fig. 4, for example, when the object selects "supplement extra" or "forward" or "follow-up", the video clip set associated with the target video generation area includes all video clips corresponding to 1-8 sets, so that the current scenario information of 1-8 sets can be used as the original scenario information corresponding to the target video generation area.
For another example, when the object selects "forward" or "follow-up", the portion of the original scenario information may also be displayed as "empty".
It should be noted that the foregoing is merely illustrative, and any display manner of the original scenario information is applicable to the embodiment of the present application, which is not limited herein.
In the embodiment of the application, the object can know the general scenario trend of the video clip set associated with the selected target video generation area through the listed original scenario information so as to adapt the scenario according to the own requirement on the basis.
For example, when the scenario information is described by keywords, the object may set the scenario adaptation information by deleting original keywords, clicking to add keywords, or describing new keywords, as described in detail below.
S23: the client responds to the scenario recomposition operation triggered in the editing interface, acquires corresponding scenario recomposition information, and presents a target video generated based on the acquired at least one original scenario information and the scenario recomposition information; the target video is adapted based on the video clip sets respectively associated with the at least one target video generation region.
In the embodiment of the application, the target video is adapted based at least on the video clip sets respectively associated with the target video generation areas. In addition, the adaptation can be performed in combination with other video content that has a certain association with the scenario of the video clip sets associated with the target video generation areas. For example, if segments describing the same storyline are split into different video clips or video clip sets, a certain association exists between those video contents.
Still referring to the example listed in fig. 4, the selection of the target video generation area and the generation of the target video include, but are not limited to, the following:
Case one: the object clicks on a partial episode, e.g., sets 3-5 of sets 1-8. The generated content replaces the content of sets 3-5; in the course of analyzing the scenario, the AI scenario adaptation system in the application (comprising a client and a server, hereinafter referred to as the system for short) analyzes the content of the preceding sets 1-2 and the following sets 6-8 as well as the scenario trend of the original sets 3-5, so as to generate a target video in which the scenario before and after remains consistent.
In this case, regardless of which consecutive sets the object clicks, all relevant context, or the full context, is fully considered to ensure overall coherence.
If the object clicks non-consecutive sets, such as sets 1, 3 and 5, the interleaved content (sets 2 and 4) is also adjusted or changed during generation of the target video, and all relevant context is fully considered to ensure coherence.
That is, when the object selects some target video generation areas, the target video can be generated comprehensively based on not only the video segment sets related to the target video generation areas, but also the video segments related to the video segment sets or the video segments inserted in the middle.
Case two: if the object selects to generate forward content, the original content of sets 1-8 is not changed; in the course of analyzing the scenario, the system analyzes the content of sets 1-8 to generate a target video that fits the background before the scenario timeline and has a consistent scenario.
Case three: if the object selects to generate follow-up content, the original content of sets 1-8 is not changed; in the course of analyzing the scenario, the system analyzes the content of sets 1-8 to generate a target video that fits the subsequent trend of the timeline and has a consistent scenario.
Case four: if the object selects to generate extra content, the original content of sets 1-8 is not changed, and content outside the scenario timeline is generated; in the course of analyzing the scenario, the system analyzes the content of sets 1-8 to generate a target video that fits the trend of the timeline and keeps the scenario consistent, and this option can provide more room when the object customizes keywords.
It should be noted that the above-mentioned several cases are only examples, and any case of selecting a target video generation area and generating a target video is applicable to the embodiments of the present application, and is not limited herein.
The following describes the setting procedure of scenario adaptation information in the embodiment of the present application in detail:
In an alternative embodiment, scenario development options respectively associated with the at least one target video generation area are further displayed on the editing interface.
Similar to the above-listed original scenario information, scenario development options in the embodiment of the present application may also be text forms such as keywords, sentences, and the like.
In this embodiment, one specific embodiment of step S23 is as follows S231A to S232A:
S231A: the client responds to the selection operation triggered by each scenario development option, and takes at least one selected target scenario development option as scenario adaptation information related to scenario development;
S232A: the client presents the target video obtained by adapting the plot development trend of the video clip set respectively associated with the at least one target video generation area based on the obtained at least one original plot information and plot adaptation information.
That is, the object may adapt the original video by selecting a preset scenario development option.
Still referring to fig. 6, the "scenario trend keyword" portion displays a plurality of scenario development options, i.e., scenario trend keywords, which are generated by AI (differing from the scenario trend of the original video, e.g., completely opposite to it or extending it). The object may select them by clicking or the like, may select one or more of them, or may select none of them and set the scenario adaptation information in other ways.
The scenario trend keywords listed in fig. 6 are: A and B do not perceive C's collusion; C deliberately gives himself away; A perceives B's collusion; A does not perceive B's behavioral abnormality; B is threatened by C; C escapes the pursuit; C persuades A to join; A uses a honey trap; A is actually the one behind the scenes; B has perceived and suspects A.
In the case that there are a plurality of target video generating areas, the scenario trend keywords listed above may also correspond to the target video generating areas, that is, at least one scenario trend keyword is set for each target video generating area.
As listed above, the scenario trend keywords corresponding to set 3 are: A and B do not perceive C's collusion; C deliberately gives himself away; A perceives B's collusion. The selected target scenario development option is "A perceives B's collusion".
The scenario trend keywords corresponding to set 4 are: A does not perceive B's behavioral abnormality; B is threatened by C; C escapes the pursuit; C persuades A to join. The selected target scenario development option is "C persuades A to join".
The scenario trend keywords corresponding to set 5 are: A uses a honey trap; A is actually the one behind the scenes; B has perceived and suspects A. No target scenario development option is selected here.
In the above embodiment, the object may set scenario adaptation information by clicking on the scenario development options, so as to adapt at least the scenario development trend of the video clips related to each of the target video generation areas, and may also synchronously adapt other video contents (such as the several cases listed above, which are not repeated here) related to the video clips to generate the target video with consecutive scenarios.
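A minimal sketch, assuming that the scenario adaptation information related to scenario development can be represented as the per-area scenario trend keywords selected by the object; the keys and keyword strings below are hypothetical and merely mirror the example of fig. 6.

```python
# Hypothetical representation of scenario adaptation information built from
# the selected scenario development options (scenario trend keywords).
scenario_development_options = {
    "set-3": ["A and B do not perceive C's collusion",
              "C deliberately gives himself away",
              "A perceives B's collusion"],
    "set-4": ["A does not perceive B's behavioral abnormality",
              "B is threatened by C", "C escapes the pursuit",
              "C persuades A to join"],
    "set-5": ["A uses a honey trap", "A is actually the one behind the scenes",
              "B has perceived and suspects A"],
}

selected = {"set-3": ["A perceives B's collusion"],
            "set-4": ["C persuades A to join"],
            "set-5": []}   # no option selected for set 5

scenario_adaptation_info = {"scenario_development": selected}
```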
In another alternative embodiment, a first custom region is further displayed on the editing interface.
In this embodiment, one specific embodiment of step S23 is as follows S231B to S232B:
S231B: the client responds to the input operation aiming at the first custom region, acquires the input scenario custom content, and takes the scenario custom content as scenario adaptation information;
S232B: the client presents the target video obtained by adapting the plot development trend of the video clip set respectively associated with the at least one target video generation area based on the obtained at least one original plot information and plot adaptation information.
That is, the object can also input custom content related to some scenario by itself to adapt the original video.
In addition to setting the scenario development direction through the scenario trend keywords, the object may also customize it; for example, the "custom trend" portion in fig. 6 is an example of the first custom region in the embodiment of the present application, where the object may input scenario custom content by means of voice, text, etc., for example, "A is injured during the capture and is rescued by E" as listed in fig. 6.
In the above embodiment, the object may set scenario adaptation information in a custom manner to adapt at least the scenario development trend of the video clips related to each of the target video generation areas, and may also synchronously adapt other video contents (such as several cases listed above and not repeated here) related to the video clips to generate target videos with consistent scenarios.
In summary, after the object selects the corresponding target video generation areas to be adapted, the system displays an outline of the original scenario trend of the currently selected content, so that the object can conveniently understand the current scenario information. On this basis, the object can set the scenario adaptation information related to the scenario development trend by deleting original keywords, clicking to add keywords, describing new keywords, and the like, so as to adapt the scenario development trend of the original video. The adaptation of the scenario development trend of the original video at least comprises adapting the scenario development trend of the video clips corresponding to the target video generation areas, and may further comprise adapting the scenario development trend of other related content in the original video, so as to ensure scenario continuity.
Further, the object may also set content such as the scenario style, scenario characters, scenario scenes, duration, and so on. Several specific modes are described below; the setting order and interaction forms of the scenario adaptation information can vary, the following is only a simple illustration, and this is not specifically limited herein.
In an alternative embodiment, the object may trigger setting of the scenario style through the next page operation, which specifically includes the following steps:
Before presenting the target video generated based on the obtained at least one original scenario information and the scenario adaptation information, the object clicks a ">" icon, and the client presents the next page of the editing interface in response to the next-page operation for the editing interface; the next page contains: the scenario style information of the video clip sets respectively associated with the at least one target video generation area, and preset scenario style options.
Then, the object can select one or more of the preset scenario style options to adapt the scenario style. The client, in response to the selection operation triggered for the scenario style options, takes the selected at least one target scenario style option as scenario adaptation information related to the scenario style, so as to adapt, based on the scenario adaptation information, at least the scenario style of the video clips corresponding to the target video generation areas in the original video.
In this manner, the object can select the drama style; for example, a drama that is originally a tragedy can be changed by the object into a happy, comedic, romantic, workplace or campus style, and so on.
Fig. 7A is a schematic diagram of a scenario style-related editing interface according to an embodiment of the present application. The current scenario style part displays scenario style information of video clips respectively associated with the target video generation area; the scenario style trend keyword is an example of scenario style options in the embodiment of the application.
For example, the current scenario styles of sets 3-5 listed above are: palace intrigue, period costume, action.
The "scenario style trend keyword" portion provides: palace intrigue, period costume, action, romance, campus, comedy, and so on.
The object can add a new scenario style by selecting from the scenario style trend keywords; for example, the object can further select "comedy" on top of the current scenario styles.
Fig. 7B is a schematic diagram of still another scenario style related editing interface according to an embodiment of the present application. To delete an original scenario style, the object can perform certain specified operations on the "current scenario style" portion; as shown in fig. 7B, the object long-presses a current scenario style such as "palace intrigue", a deletion control "x" is presented at its upper right corner, and the object clicks the "x" to delete that scenario style.
Alternatively, the keywords corresponding to the current scenario styles may be selected by default in the "scenario style trend keyword" portion, and the object can also delete a scenario style by cancelling the selected state of the corresponding keyword, which is not specifically limited herein.
It should be noted that, the setting of the scenario style trend related scenario recomposition information is only a simple example, and any setting mode of the scenario style trend is applicable to the embodiment of the present application and will not be described in detail herein.
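As a hedged illustration only, adding and deleting scenario style keywords as described above may amount to simple set updates; the style names are the translated examples used in this description and the helper function is hypothetical.

```python
# Hypothetical sketch: adapting the scenario style keyword set for the selected
# target video generation areas (the keywords here are illustrative only).
current_styles = {"palace intrigue", "period costume", "action"}   # current styles of sets 3-5

def adapt_styles(styles, add=(), remove=()):
    """Return a new style set with the object's additions and deletions applied."""
    return (set(styles) | set(add)) - set(remove)

new_styles = adapt_styles(current_styles, add={"comedy"}, remove={"palace intrigue"})
scenario_adaptation_info = {"scenario_style": sorted(new_styles)}
```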
In an alternative embodiment, the object may trigger setting the scenario role through the next page operation, which specifically includes the following steps:
Before presenting the target video generated based on the obtained at least one original scenario information and the scenario adaptation information, the object clicks a ">" icon, and the client presents the next page of the editing interface in response to the next-page operation for the editing interface; the next page contains: the original scenario characters of the video clip sets respectively associated with the at least one target video generation area, and preset character options.
The object may then select from these preset character options to adapt the scenario characters. The client, in response to the character modification operation triggered based on the character options, obtains scenario adaptation information related to the scenario characters, so as to adapt, based on the scenario adaptation information, at least the related characters of the video clip sets corresponding to the target video generation areas in the original video.
Fig. 8 is a schematic diagram of a scenario character related editing interface according to an embodiment of the present application. The "current scenario character" portion displays the characters currently contained in the video clip sets respectively associated with the target video generation areas. The "role replacement" portion shows some examples of character options in the embodiment of the application.
It should be noted that, in the embodiment of the present application, the role modification not only can realize the replacement of the original roles, but also can combine the roles in other video materials (such as other film and television resources) to perform the role replacement or new addition, so as to realize the linkage between different videos.
Optionally, the role modification operation includes at least one of:
a character replacement operation between original scenario characters; a character replacement operation between a candidate scenario character and an original scenario character; and a character addition operation based on a candidate scenario character.
Wherein the candidate scenario characters include at least one of: scenario characters in the original video other than the original scenario characters, and scenario characters in other video materials.
As shown in fig. 8, for the original scenario character a, it may be exchanged with other original scenario characters B, C or D to implement character replacement between the original scenario characters, such as replacing a with B; in addition, "…" can be clicked to replace more characters of the play or other plays, so as to realize the character replacement between the candidate play characters and the original play characters, such as replacing A with character E in the play, or replacing A with character F in other plays, etc.
Similarly, for the original scenario character B, it can be exchanged with other original scenario characters A, C or D, and "…" can be clicked to replace more of the scenario or other scenario characters; for the original scenario character C, the original scenario character C can be exchanged with other original scenario characters A, B or D, and the original scenario character C can be clicked with … to replace more of the scenario or other scenario characters; for the original scenario character D, it may be exchanged with other original scenario characters A, B or C, and click "…" instead of more of the play or other play characters.
Or, the "+" can be clicked, so that more characters which are not in the current scenario can be added, and the character adding operation based on the candidate scenario characters can be realized, such as adding the character G of the present scenario, the characters H of other scenarios, and the like.
Based on the above embodiments, the object may select characters it wishes to appear or characters it wishes to avoid. For the selected characters, a proportion (e.g., of appearances) can further be set, and characters can be added from other film and television episodes through system interaction, thereby adapting the scenario with respect to the scenario characters.
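A hypothetical sketch of how the three character modification operations could be recorded as scenario adaptation information; the character identifiers and dictionary layout are illustrative assumptions following the example of fig. 8, not a claimed data format.

```python
# Hypothetical sketch of the three character modification operations described above.
original_characters = ["A", "B", "C", "D"]

role_adaptation = {
    "swap": [("A", "B")],                                               # replacement between original characters
    "replace": {"C": {"source": "other material", "character": "F"}},   # candidate replaces an original character
    "add": [{"source": "this play", "character": "G"},                  # character addition operations
            {"source": "other material", "character": "H"}],
}
scenario_adaptation_info = {"scenario_characters": role_adaptation}
```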
Similar to the above-mentioned scenario character adaptation process, in an alternative embodiment, the object may trigger setting a scenario scene through a next page operation, which is specifically as follows:
Before presenting the target video generated based on the obtained at least one original scenario information and the scenario adaptation information, the object clicks a ">" icon, and the client presents the next page of the editing interface in response to the next-page operation for the editing interface; the next page contains: the original scenario scenes of the video clip sets respectively associated with the at least one target video generation area, and preset scene options.
Then, the object can select from the preset scene options to realize the adaptation of the scenario scene. And the client responds to scene modifying operation triggered based on the scene options, obtains scenario recomposition information related to the scenario scenes, and recomposites at least related scenes of the video clip set corresponding to the target video generating area in the original video based on the scenario recomposition information.
Fig. 9 is a schematic diagram of a scenario scene related editing interface according to an embodiment of the application. The "current scenario scene" portion displays the scenes currently contained in the video clip sets respectively associated with the target video generation areas. The "scene replacement" portion shows some examples of scene options in the embodiment of the application, which the object can click to select or add scenes.
Wherein the scene modifying operation includes at least one of: scene replacement operation between original scenario scenes, scene replacement operation between candidate scenario scenes and original scenario scenes, and scene addition operation based on other scenario scenes; the candidate scenario includes at least one of: and the original video contains other scenario scenes except the original scenario scene, and scenario scenes in other video materials.
For example, the object may select a certain original scenario scene and then select a replacement scene from the "scene replacement" portion; or the object may directly select a scene to be added from the "scene replacement" portion, and so on.
Based on the embodiment, the application not only can realize the replacement of original scenes, but also can combine scenes in other video materials (such as other film and television resources) to replace or add scenes, thereby realizing the linkage among different videos.
In an alternative embodiment, the object may trigger setting of the scenario duration through a next page operation, which specifically includes the following steps:
Before presenting the target video generated based on the obtained at least one original scenario information and the scenario adaptation information, the object clicks a ">" icon, and the client presents the next page of the editing interface in response to the next-page operation for the editing interface; the next page contains: a duration setting control.
Then, the object can set the scenario duration on the page, and the client responds to the duration setting operation triggered by the duration setting control to acquire scenario recomposition information related to the scenario duration so as to recompose the duration of the video clip corresponding to the target video generation area in the original video based on the scenario recomposition information.
Fig. 10A is a schematic diagram of a scenario duration-related editing interface according to an embodiment of the present application. The object can set the duration on the page, and select the duration of the album segment which is specifically needed to be generated.
For example, if the scenario duration is set to 00:50:00, it indicates that each generated set lasts 50 minutes.
In the above TV drama scenario, considering that the duration of each set of a TV drama is generally substantially the same, only the example illustrated in fig. 10A is presented, i.e., only one duration setting needs to be performed.
In addition, the application also supports setting for each target video generation area (such as each set) respectively during duration selection, and referring to fig. 10B, which is a schematic diagram of still another scenario duration related editing interface in the embodiment of the application. As in fig. 10B, one duration setting control may be configured for each target video generation region, and further, the object may perform duration configuration for each target video generation region separately, for example, set 3 to 00:46:00, set 4 to 00:45:00, set 5 to 00:44:00, and so on.
It should be noted that the above-listed time duration setting manners are only illustrative, and any time duration setting manner is applicable to the embodiments of the present application, and is not limited herein.
Based on the embodiment, the object can set the scenario duration by itself, and further perfect scenario recomposition information, so as to automatically generate the target video more meeting the own requirements based on the scenario recomposition information.
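An illustrative sketch, assuming durations are entered as "HH:MM:SS" strings and stored either once or per target video generation area as described above; the helper and keys are hypothetical.

```python
# Hypothetical sketch: scenario duration settings expressed as "HH:MM:SS".
def to_seconds(hms: str) -> int:
    h, m, s = (int(x) for x in hms.split(":"))
    return h * 3600 + m * 60 + s

# Either a single duration for all generated sets (fig. 10A) ...
single_duration = to_seconds("00:50:00")          # 3000 seconds

# ... or one duration per target video generation area (fig. 10B).
per_area_duration = {"set-3": to_seconds("00:46:00"),
                     "set-4": to_seconds("00:45:00"),
                     "set-5": to_seconds("00:44:00")}
```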
In an alternative embodiment, the object may trigger the adaptation flow of the scenario ending type through the next page operation, which specifically includes the following steps:
Before presenting the target video generated based on the obtained at least one original scenario information and the scenario adaptation information, the object clicks a ">" icon, and the client presents the next page of the editing interface in response to the next-page operation for the editing interface; the next page contains: the scenario ending type of the original video and preset scenario ending options.
The object can select from the preset scenario ending options to realize the adaptation of the scenario ending type. The client responds to the selection operation triggered for each scenario ending option, and takes the selected target scenario ending option as scenario recomposition information related to the scenario ending type so as to recomposite the ending type of the original video based on the scenario recomposition information.
Fig. 11 is a schematic diagram of a scenario ending type related editing interface according to an embodiment of the present application. Fig. 11 shows that the current scenario ending type of the original video is BE, which the object replaces with HE.
Wherein BE denotes a bad ending, HE a happy ending, OE an open ending, and NE a normal ending.
It should be noted that the above-listed end types are only simple examples, and are not limited thereto, and are not described in detail herein.
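As a small illustrative assumption, the ending types mentioned above could be encoded as an enumeration when recording the scenario adaptation information.

```python
from enum import Enum

# Hypothetical encoding of the scenario ending types mentioned above.
class EndingType(Enum):
    BE = "bad ending"
    HE = "happy ending"
    OE = "open ending"
    NE = "normal ending"

current_ending, new_ending = EndingType.BE, EndingType.HE
scenario_adaptation_info = {"ending_type": new_ending.name}
```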
In an optional implementation manner, the object may trigger a custom setting flow of any kind of scenario adaptation information through a next page operation, and the specific process is as follows:
Before presenting the target video generated based on the obtained at least one original scenario information and the scenario adaptation information, the object clicks a ">" icon, and the client presents the next page of the editing interface in response to the next-page operation for the editing interface; the next page contains: a second custom region.
The object can, by itself, input custom content of any kind related to the scenario in the second custom region, such as scenario development, scenario style, scenario character, scenario scene, scenario duration, scenario ending type and the like, so as to adapt the original video. The client, in response to the input operation for the second custom region, acquires the input other custom content and takes this other custom content as scenario adaptation information.
Fig. 12 is a schematic diagram of an editing interface related to other custom content according to an embodiment of the present application. The object can describe, using sentences, words and the like, any other content it wants to add, and after semantic recognition the system fuses it with the other scenario adaptation information. As shown in fig. 12, another piece of custom content newly added by the object is "A and C strike sparks during their meetings".
In addition, the object may describe scenario adaptation information of other classes in this section, such as "exchange a and C roles", such as "change scenario duration to 45 minutes", such as "change scenario end type to he", and so on.
Based on the above embodiment, the object can supplement scenario adaptation information of any kind through custom input.
It should be noted that, in the setting of the above-listed various scenario adaptation information, except that scenario adaptation information related to scenario development is a necessary step (for example, fig. 6), the setting steps of the other scenario adaptation information may be skipped, and the sequence between the other setting steps may be flexibly adjusted according to the actual situation, which is just a simple illustration and is not specifically limited herein.
In the embodiment of the present application, after the setting of scenario adaptation information by the object is completed, the system background (i.e., the server) learns the original episode content of the original video, and this part of the content will be described in detail in step S1806 of fig. 18. The part of the content is carried out by the background of the system, and the object can stay waiting on the original page or can carry out other operations after being minimized.
In an alternative embodiment, before presenting the target video generated based on the obtained at least one original scenario information and the scenario adaptation information, a content generation interface is first presented, the content generation interface comprising: a generation progress bar for the target video. Since generating the target video based on the scenario adaptation information takes a certain amount of time, the progress of the video adaptation can be indicated to the object via the generation progress bar.
Referring to fig. 13, a schematic diagram of a content generation interface according to an embodiment of the present application is shown. S131 is the generation progress bar in the embodiment of the present application. In addition, the progress can also be indicated directly as a percentage; for example, a current progress of 80% indicates that generation of the target video is 80% complete.
In the application, after the content generation interface is presented, considering that generating the target video takes a certain amount of time, the object can, besides waiting on the original page, minimize the content generation interface and perform other operations. An alternative embodiment is as follows:
the client, in response to a minimizing operation for the content generation interface, jumps to the video playing interface and presents, at a position related to the adaptation entry in the video playing interface, the generation progress bar for the target video; the adaptation entry is used to trigger the adaptation operation for the original video.
The position related to the adaptation entry may be near the adaptation entry, e.g., above, below, to the left or to the right of it, at a set distance (e.g., 10 pixels) or within a preset distance range. The distance or distance range may be set according to the actual situation, e.g., the larger the screen displaying the video playing interface, the smaller the distance, and so on, which is not specifically limited herein.
Still referring to fig. 13, S132 is a minimizing control listed in the embodiment of the present application, and the object may click on the minimizing control to minimize the content generation interface, and display the video playing interface shown in fig. 14.
Fig. 14 is a schematic diagram of a video playing interface according to another embodiment of the present application. Wherein, the S140 part displays the recomposition entry, and the generation progress bar is displayed below the recomposition entry.
In this way, the object can continue to watch the original video on the video playing interface, and in the process, the generation progress of the target video can be known through the generation progress bar. In addition, the object may perform some other operations, such as screenshot, bullet screen, etc., without limitation.
In the above embodiment, object stickiness is improved by minimizing the content generation interface, preventing the object from abandoning the current adaptation due to excessive waiting.
When the target video generation process is completed, i.e., the generation progress bar reaches 100%, the target video is played in the content generation interface.
In the embodiment of the present application, the playing progress bar of the target video may be further displayed in the content generating interface, which is similar to the playing progress bar of the original video listed in the video playing interface, and the object may adjust the playing progress of the target video based on the playing progress bar.
Fig. 15 is a schematic diagram of a target video frame according to an embodiment of the application. The video picture is one picture in the target video obtained through the adaptation, namely one picture of the target video when the target video is played to a certain moment.
Based on this implementation, the original episode content can be broken out of, and the object can exercise imagination and AI-assisted creation across different episodes. On this basis, the object can customize the scenario, so that personalized scenario imagination is satisfied, the object is helped to turn imagined content into reality, the object's imagination is deeply satisfied, and the film or drama that moves the object becomes an album of its own. Moreover, since the target video is automatically generated based on AI technology, it can help imagine and exercise potential specific scenario trends derived from the scenario and explore their feasibility, and can help the platform attract more types of objects, without being limited by the type of the original work or its original scenario trend.
In the application, after the target video is generated, the object can also edit the picture content of the target video secondarily. An alternative embodiment is:
the object may trigger an editing operation for the target video by means of a specified gesture, voice, or a specified operation (such as clicking an editing button in the content generation interface), and the client side presents at least one type of editing control in response to the editing operation of the object for the target video. The object can select an editing control, the client responds to a selection operation triggered by at least one type of editing control, edits the picture content of the target video based on the selected target editing control, and presents a corresponding editing effect.
Fig. 16 is a schematic diagram of a secondary editing process according to an embodiment of the application. The editing controls listed in fig. 16 are: clip editing, filters, text, stickers, special effects, image quality enhancement, cropping, and automatic captions.
Clip editing refers to secondary editing of the picture content, e.g., adding or removing segments or modifying the segment order, such as removing part of the pictures in the target video.
The filter control adds a filter to the picture content; for example, after the object clicks the "filter" icon, different filters such as bright color, contrast color, cold tone and black-and-white can be further presented, and the object can select among them and switch by sliding left/right or up/down.
The text control inserts text into the picture content; for example, the object can customize the text inserted into the pictures contained in the target video, and in particular can select the font (e.g., rounded, thin black, bold black, handwriting …), the color (e.g., red, black, blue, green …), the outline style (e.g., outlined, not outlined …) of the text, and so on.
The sticker is to insert some patterns in the picture content, for example, after the object clicks the "sticker" icon, various kinds of stickers can be further presented, and the object can select different stickers and can be switched in a left-right/up-down sliding manner.
The special effects refer to inserting some atmosphere special effects such as love, snowflake and the like into the picture content.
The image quality enhancement refers to enhancing the image quality of the image content and optimizing the texture details in the image so as to make the image more natural and clear.
Cropping refers to cutting the size of the picture content, e.g., cutting off a portion of each of the left and right sides of the video picture shown in fig. 16, and so on.
The automatic caption means that some custom captions can be added in the picture content, or custom vocabulary can be adjusted again to modify the captions, etc.
It should be noted that the editing controls listed above are only simple examples, and any editing control related to the screen content is applicable to the embodiments of the present application, and is not limited herein.
Based on this implementation, the object can perform secondary editing on the target video after it is generated, so that, on top of personalized content that already matches its imagination, the object can further customize the content in detail.
In the embodiment of the present application, if the object wants to perform scenario adaptation again, the setting process of the scenario adaptation information described above is repeated, and details are not repeated here.
The above mainly describes the scenario adaptation method of the video in the embodiment of the present application from the terminal device/client side.
The following describes the scenario adaptation method of the video in the embodiment of the present application in detail from the server side:
referring to fig. 17, a flowchart of an implementation of a scenario adaptation method of a video according to an embodiment of the present application is shown, taking a server as an execution body as an example, where the implementation flow of the method is as follows:
s171: the server analyzes script text and picture data of the original video to be adapted to obtain original scenario analysis data corresponding to the original video.
S172: the method comprises the steps that a server obtains scenario adaptation information of at least one target video generation area in each video generation area related to an original video, updates original scenario analysis data based on the scenario adaptation information, and generates corresponding new scenario analysis data; the scenario adaptation information is adaptation information of original scenario information of a video clip set associated with each of the at least one target video generation region.
That is, before the selection interaction in which the object defines the adaptation range, the system needs to complete the preprocessing, i.e., the pre-parsing, of the film or drama. The object can freely select the adaptation range, and no matter what range is selected, the AI system can generate all the content of the whole drama.
S173: the server generates a new script text and each scene fragment according with the scenario recomposition information based on the new scenario analytic data, and synthesizes each scene fragment into a target video based on the new script text; the target video is adapted based on the respective associated video segment sets of the at least one target video generation region.
In the embodiment of the application, after the object selects an original video, e.g., film/drama A (abbreviated as Play A), a specific underlying data integration description format (.moveldata) is adopted: by parsing the script (PlayA_script.txt) and picture (PlayA.mp4) data of the given Play A, the corresponding original scenario analysis data (PlayA.moveldata) is generated.
In addition, the object can modify local content of the parsed data through simple interactions at the upper layer; for the specific interaction process, reference may be made to the process of setting the scenario adaptation information listed in the above embodiments. That is, by setting scenario adaptation information, the original scenario analysis data is updated based on the scenario adaptation information to generate new scenario analysis data (PlayB.moveldata).
Further, a brand-new script (PlayB_script.txt) and the individual segment pictures can be generated based on this data format, and after synthesis a new film/drama (PlayB.mp4), i.e., the target video, is finally obtained.
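The following is a high-level, non-normative sketch of the S171-S173 flow using the PlayA/PlayB naming above; every function body is a placeholder standing in for the AI parsing, generation and video codec steps, not a real library call.

```python
# All functions below are placeholders for the AI parsing, generation and video
# encoding/decoding steps described above; they are not real library calls.
def parse_script_and_pictures(script_path, video_path):
    return {"script": script_path, "video": video_path, "sequences": []}     # PlayA.moveldata

def update_analysis_data(original_data, adaptation_info):
    return {**original_data, "adaptation": adaptation_info}                  # PlayB.moveldata

def generate_script(analysis_data):
    return "PlayB_script.txt"

def generate_scene_segments(analysis_data):
    return ["segment_001.mp4", "segment_002.mp4"]

def synthesize_video(segments, script):
    return "PlayB.mp4"                                                        # target video

def adapt_video(script_path, video_path, scenario_adaptation_info):
    original_data = parse_script_and_pictures(script_path, video_path)        # S171
    new_data = update_analysis_data(original_data, scenario_adaptation_info)  # S172
    new_script = generate_script(new_data)                                    # S173: new script text
    segments = generate_scene_segments(new_data)                              # S173: scene segments
    return synthesize_video(segments, new_script)                             # S173: target video

target = adapt_video("PlayA_script.txt", "PlayA.mp4", {"ending_type": "HE"})
```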
In the embodiment of the application, the scenario analysis data at least comprises character behavior logic sequences, and can be obtained by supplementing the character behavior logic sequences based on the picture analysis result. A character behavior logic sequence is a sequence obtained by parsing and dividing the script text by scene or by character; one script text can be divided into one or more character behavior logic sequences. For example, when divided by scene, one character behavior logic sequence may be description information describing each segment within one scene; when divided by character, one character behavior logic sequence may be description information describing each sub-period of one character within a certain period of time. Each character behavior logic sequence comprises at least one character behavior logic sequence unit, and each unit corresponds to one scene segment. For example, when divided by scene, the character behavior logic sequence corresponding to a bookstore scene may be: at time xx, character A and character B read at position xx in the bookstore; at time xx, character C checks out at the bookstore counter; at time xx, character D enters the bookstore; and so on. When divided by character, the character behavior logic sequence corresponding to character A may be: at time xx, character A does thing a1 in scene a; at time xx, character A does thing a2 in scene a; at time xx, character A goes from scene a to scene b and does thing b; and so on.
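A minimal sketch of how a character behavior logic sequence and its units (one unit per scene segment) could be represented, following the bookstore example above; the field names are assumptions made purely for illustration.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical representation: each BehaviorUnit corresponds to one scene segment.
@dataclass
class BehaviorUnit:
    time: str
    scene: str
    characters: List[str]
    action: str            # description of the sub-segment

@dataclass
class CharacterBehaviorSequence:
    divided_by: str        # "scene" or "character"
    key: str               # the scene name or the character name
    units: List[BehaviorUnit]

bookstore = CharacterBehaviorSequence(
    divided_by="scene", key="bookstore",
    units=[BehaviorUnit("t1", "bookstore", ["A", "B"], "read at position xx"),
           BehaviorUnit("t2", "bookstore", ["C"], "checks out at the counter"),
           BehaviorUnit("t3", "bookstore", ["D"], "enters the bookstore")])
```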
In summary, in the embodiment of the present application, the object may select or edit multiple types of customization items, such as the script, scenario, characters and style, of one or more pieces of film/TV content to be adapted. As the object selects the adaptation-related information, the system records the modified content in real time; the system adapts the script according to the customization items selected by the object, matches the adapted script, lines, characters and scenes with pictures to obtain new character behavior logic sequences, uses AI picture generation technology to generate scene segments and picture organization structure logic for the new character behavior logic sequences, and finally uses video encoding and decoding technology to combine these segments with the old video segments according to the picture organization structure logic, forming a complete new video, i.e., the target video.
The following describes the specific generation process of the target video in the embodiment of the present application, and various AI technologies involved in the process in detail:
in an alternative embodiment, S171 may be implemented according to the following procedure, including the following sub-steps S1711-S1712:
s1711: analyzing script text of the original video to obtain an original character behavior logic sequence containing scenes;
S1712: analyzing the picture of the original video, and supplementing the logic sequence of the behavior of the original role by combining the picture analysis result to obtain scenario analysis data corresponding to the original video.
In the embodiment of the application, the script text is the text information used for video shooting by directors, actors and other related staff. The scenario analysis data is a machine (computer) oriented data structure. Step S171 bridges the script and the pictures, converting the original video into the machine-specific underlying data description format (.moveldata).
Specifically, the script text of the original video needs to be analyzed to obtain an original character behavior logic sequence containing a scene, and the process can be implemented based on an AI natural language analysis technology, which will be described in detail below.
Furthermore, considering that the shooting of the original video is not strictly required to follow the script text and may incorporate suggestions newly proposed by directors, actors and the like, the application can, on the basis of acquiring the original character behavior logic sequence, analyze the pictures of the original video and combine the result with the original character behavior logic sequence to obtain complete analysis data, i.e., the original scenario analysis data. The above-mentioned picture analysis of the original video can be implemented in combination with AI picture analysis technology, which will be described in detail below.
In the above embodiment, the script text is combined with the picture data of the video, and scenario analysis data in a special underlying data description format is generated through analysis, thereby bridging the script and the pictures.
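A compact, assumption-laden sketch of S1711-S1712: the two analysis functions are placeholders for the natural language analysis model and the picture analysis technique described below, and their outputs are invented for illustration.

```python
# Non-normative sketch of S1711-S1712; the two analysis steps are placeholders.
def parse_script(script_text):
    # S1711: script text -> original character behavior logic sequences (with scenes)
    return [{"scene": "bookstore", "units": []}]

def analyze_pictures(video_path):
    # Picture analysis result, e.g. shots that deviate from or extend the script.
    return [{"scene": "bookstore", "extra_units": [{"action": "improvised shot"}]}]

def build_scenario_analysis_data(script_text, video_path):
    sequences = parse_script(script_text)                                   # S1711
    for seq, pic in zip(sequences, analyze_pictures(video_path)):           # S1712: supplement
        seq["units"].extend(pic["extra_units"])
    return {"sequences": sequences}                                          # original scenario analysis data

original_data = build_scenario_analysis_data("PlayA script contents ...", "PlayA.mp4")
```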
Alternatively, in step S173, the generation of the new script text may be implemented based on the AI natural language generation technique, and the generation of the scene segment may be implemented based on the AI picture generation technique.
In the embodiment of the application, the realization of the four technologies, i.e. the AI natural language analysis technology, the AI natural language generation technology, the AI picture analysis technology and the AI picture generation technology, is the technical content that the front-end research and development flow of the AI scenario adaptation system provided by the application has to be completed, and after the technologies are adopted, the input of new data and the output of expected values can be realized so as to realize the automatic generation of target videos.
The AI natural language analysis technique, AI natural language generation technique, AI picture analysis technique, and AI picture generation technique according to the embodiments of the present application will be described in detail below.
The four AI techniques listed above require a certain amount of AI training. AI training is the process of guiding an AI model to learn from data. This process generally includes: 1. preparing training data such as images, speech and text; 2. selecting a machine learning algorithm, such as a neural network or a support vector machine; 3. configuring the parameters of the AI model, such as the learning rate and the number of hidden layers; 4. running training so that the AI model learns the patterns in the training data; 5. evaluating the performance of the model and verifying, with test data, whether the model can predict accurately; 6. adjusting parameters and repeating training to improve model performance.
AI training is a cyclic process until the model can predict data with satisfactory accuracy. The trained model can be used in practical applications and continue to improve performance through learning on new data.
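A generic toy example of such a training cycle (prepare data, fit, evaluate, iterate), not specific to this application; the one-parameter linear model below is purely illustrative.

```python
import random

data = [([random.random()], random.random()) for _ in range(100)]   # 1. prepare training data
w, lr = 0.0, 0.1                                                     # 2-3. choose model/params (linear, y = w*x)

for epoch in range(20):                                              # 4. run training
    for (x,), y in data:
        w -= lr * (w * x - y) * x                                    # gradient step on squared error
mse = sum((w * x - y) ** 2 for (x,), y in data) / len(data)          # 5. evaluate performance
# 6. adjust parameters (e.g. lr, epochs) and repeat until accuracy is satisfactory
```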
In the embodiment of the present application, the implementation of several AI techniques listed above all require a certain AI training in advance. The specific process is as follows:
first, the pretreatment before training is briefly introduced:
Taking film and TV drama videos as an example, a large amount of film and TV drama data is needed for training. For the data preprocessing of a single film/TV script, a conventional AI natural language analysis technique is used to automatically disassemble the whole script and obtain the relationships among the characters (roles), and each scene of the film or TV drama is further automatically disassembled to obtain a data logic description of the script content structure.
The following describes a learning process of the AI natural language analysis technique:
After the preprocessed data are obtained, the character behavior logic sequence corresponding to each movie and television play can be obtained by means such as manual annotation, namely the sample character behavior logic sequence in the application.
For example, for each movie and television play, the data annotation personnel manually annotate the data preprocessing result of the corresponding script and correct the automatically generated erroneous data: for example, wrong scenario trend keywords, scenario style keywords, scene keywords, character keywords and the like are modified, so as to obtain a complete logic sequence, with scenes, of each character's behavior in the movie and television play. That is, the sample character behavior logic sequence in the embodiment of the present application is constructed based on the relations between characters obtained by analyzing the sample script text and the data logic description of the scenario content structure obtained by analyzing the corresponding scenes.
After the above data processing is carried out on a large number (more than 1000) of movie dramas, the resulting large number of input values (sample script texts) and output values (sample character behavior logic sequences) are used as an AI training data set, and a dedicated AI natural language analysis technology is obtained by training in combination with a conventional AI natural language analysis technology. Namely, the sample script text is taken as the input feature, the corresponding sample character behavior logic sequence is taken as the output feature, and cyclic iterative training is carried out on the natural language analysis model to be trained to obtain the trained natural language analysis model.
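A hedged sketch of this training step is shown below, assuming a sequence-to-sequence transformer backbone and a simple text serialization of the behavior logic sequence; the checkpoint name, serialization format and data are illustrative assumptions only.

```python
# Sketch: sample script text -> serialized sample character behavior logic sequence.
import torch
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

tokenizer = AutoTokenizer.from_pretrained("t5-small")          # assumed backbone
model = AutoModelForSeq2SeqLM.from_pretrained("t5-small")
optimizer = torch.optim.AdamW(model.parameters(), lr=3e-5)

# Each pair: (sample script text, serialized behavior logic sequence).
dataset = [
    ("SCENE 1 ... dialogue ...",
     "scene=library|chars=A,B|behavior=argue|trend=conflict|style=suspense"),
]

model.train()
for epoch in range(3):                                         # cyclic iterative training
    for script_text, target_sequence in dataset:
        inputs = tokenizer(script_text, return_tensors="pt", truncation=True)
        labels = tokenizer(target_sequence, return_tensors="pt", truncation=True).input_ids
        loss = model(**inputs, labels=labels).loss
        loss.backward()
        optimizer.step()
        optimizer.zero_grad()
```

At inference time, the trained model's generate routine can decode the script text of the original video into the original character behavior logic sequence, which corresponds to the alternative embodiment of step S1711 described next.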
On the basis of the above, an alternative embodiment of step S1711 is:
and inputting the script text of the original video into the trained natural language analysis model to obtain an original character behavior logic sequence output by the natural language analysis model.
According to the embodiment of the application, by learning the AI natural language analysis technology, a natural language analysis model that takes script text as input and outputs a character behavior logic sequence can be trained, so that the analysis of the original video script text can be rapidly realized based on the natural language analysis model, and the corresponding original character behavior logic sequence is obtained.
The following describes a learning process of the AI picture analysis technique:
Firstly, based on the behavior logic sequences of each role in a movie annotated by the data annotation personnel, a continuous movie is split to obtain the scattered scene segments corresponding to each behavior logic sequence unit.
After this processing is carried out in large batches, the large number of input values (complete video + behavior logic sequence) and output values (scattered scene segments) are used as an AI training data set, and a dedicated AI picture analysis technology is obtained by training in combination with a conventional AI picture analysis technology. Namely, the sample video and the sample character behavior logic sequence are taken as the input features, the corresponding sample scene segments are taken as the output features, and cyclic iterative training is carried out on the picture analysis model to be trained to obtain the trained picture analysis model. The sample scene segments are obtained by splitting the sample video based on the sample character behavior logic sequence.
In addition, this technology also analyzes the specific picture content in the split scattered scene segments and marks it with keywords such as roles and behaviors, scenes, plot trends and styles.
An alternative implementation of step S171 is to input the original character behavior logic sequence and the original video into the trained picture analysis model to obtain the scattered scene segments output by the picture analysis model; further, the pictures in each obtained scene segment are analyzed to obtain key information, and the original character behavior logic sequence is marked based on the obtained key information to obtain the original scenario analysis data.
Each character behavior logic sequence comprises at least one character behavior logic sequence unit, and each character behavior logic sequence unit corresponds to one scene segment.
In the embodiment of the application, by learning the AI picture analysis technology, a picture analysis model that takes a character behavior logic sequence and a complete video as input and outputs scattered scene segments can be trained. Based on this picture analysis model, the picture data of the original video are combined with the original character behavior logic sequence to obtain the corresponding picture analysis result, namely the scattered scene segments, each corresponding to one character behavior logic sequence unit in the original character behavior logic sequence; the original character behavior logic sequence enriched in this way yields the original scenario analysis data.
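A minimal sketch of this picture analysis stage is given below. The helper methods (`extract_frame_features`, `predict_boundaries`, `annotate_keywords`) are not real library calls; they are placeholders standing in for the trained picture analysis model.

```python
# Hypothetical sketch: split the original video into scene segments aligned with
# the behavior logic sequence units, then enrich each unit with key information.
from typing import List, Tuple

def split_into_scene_segments(video_frames: List,          # decoded frames
                              behavior_units: List[dict],  # original sequence units
                              model) -> List[Tuple[int, int, dict]]:
    """Return (start_frame, end_frame, enriched_unit) for each behavior logic unit."""
    features = model.extract_frame_features(video_frames)             # assumed method
    boundaries = model.predict_boundaries(features, behavior_units)   # one span per unit
    segments = []
    for (start, end), unit in zip(boundaries, behavior_units):
        keywords = model.annotate_keywords(features[start:end])       # roles, scene, trend, style
        enriched = {**unit, **keywords}                # mark the logic sequence with key info
        segments.append((start, end, enriched))
    return segments                                    # basis of the original scenario analysis data
```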
The following describes a learning process of the AI natural language generation technique:
Firstly, the character behavior logic sequence data generated by the AI natural language analysis technology are manually modified several times by the data annotation personnel; the character behavior logic sequence before modification is taken as the original sample character behavior logic sequence, and the character behavior logic sequence after modification is taken as the new sample character behavior logic sequence.
Specific modifications include, but are not limited to, some or all of the following: adding or deleting the scenario trend, the character relation logic, the scenes and the like of the scenario. The modified content is annotated with scenario trend keywords, storyline keywords, scene keywords and the like. This annotation information can be used as the sample scenario adaptation information.
For example, the data annotation personnel manually edit 100 adapted versions (partial scenario modifications) of the script of "Dream of the Red Chamber", the modified content is annotated with the above-mentioned keywords, and the script before modification, the 100 modified scripts and the 100 sets of modification keywords are used as part of the data set.
After processing in large batches, the large number of input values (original sample character behavior logic sequence + sample scenario modification information) and output values (new sample character behavior logic sequence) are used as an AI training data set, and a dedicated AI natural language generation technology is obtained by training in combination with a conventional AI natural language generation technology. Namely, the original sample character behavior logic sequence and the sample scenario adaptation information are taken as the input features, the corresponding new sample character behavior logic sequence is taken as the output feature, and cyclic iterative training is carried out on the natural language generation model to be trained to obtain the trained natural language generation model.
The new sample character behavior logic sequence is obtained by modifying the original sample character behavior logic sequence for a plurality of times based on sample scenario adaptation information; the original sample character behavior logic sequence is constructed based on the relation between characters obtained by analyzing the sample script text and the data logic description of the scenario content structure obtained by analyzing the corresponding scene.
Optionally, the original sample character behavior logic sequence further includes an end identifier for representing a scenario end type of the corresponding sample video.
For example, scenario ending types such as happy ending, bad ending, normal ending and open ending also need manual calibration, and these identifiers are added to the annotation of the sample character behavior logic sequence as keyword marks, so as to improve the diversity and accuracy of the model.
In the case where the scenario analysis data represents a character behavior logic sequence, an alternative embodiment of step S172 is as follows:
inputting the scenario adaptation information and the original scenario analysis data into a trained natural language generation model to obtain a new character behavior logic sequence output by the natural language generation model, and taking the new character behavior logic sequence as the new scenario analysis data.
According to the embodiment of the application, by learning the AI natural language generation technology, a natural language generation model that takes the scenario adaptation information and the old scenario analysis data as input and outputs a new character behavior logic sequence can be trained, so that the character behavior logic sequence can be rapidly updated based on the scenario adaptation information set by the object, and the new scenario analysis data are obtained.
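A hedged sketch of this step (corresponding to step S172) is shown below, again assuming a sequence-to-sequence model; the checkpoint path, prompt layout and serialization format are assumptions for illustration.

```python
# Sketch: scenario adaptation information + original scenario analysis data
# -> new character behavior logic sequence (i.e. new scenario analysis data).
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

tokenizer = AutoTokenizer.from_pretrained("./nl_generation_model")   # assumed local checkpoint
model = AutoModelForSeq2SeqLM.from_pretrained("./nl_generation_model")

adaptation_info = "trend=happy_ending|replace_role=A->B|scene=+library"
original_analysis = "scene=street|chars=A,C|behavior=farewell|trend=tragedy"

prompt = f"adapt: {adaptation_info} || original: {original_analysis}"
inputs = tokenizer(prompt, return_tensors="pt", truncation=True)
output_ids = model.generate(**inputs, max_new_tokens=256)
new_sequence = tokenizer.decode(output_ids[0], skip_special_tokens=True)
# `new_sequence` is the new character behavior logic sequence passed on to the
# picture generation stage.
```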
The following describes a learning process of the AI picture generation technique:
Firstly, the data annotation personnel manually match behavior logic sequences carrying keywords with the corresponding scene segments (this part can reuse the keyword-to-picture correspondences already available, such as the scattered scene segments obtained by the AI picture analysis technology and the analyzed roles and behaviors, scenes, plot trends, styles and the like).
After processing in large batches, the large number of input values (behavior logic sequence + keywords) and output values (scene segments) are used as an AI training data set, and a dedicated AI picture generation technology is obtained by training in combination with a conventional AI picture generation technology (generally a generative adversarial network, GAN). Namely, the sample character behavior logic sequence and the sample key information are taken as the input features, the corresponding sample scene segments are taken as the output features, and cyclic iterative training is carried out on the picture generation model to be trained to obtain the trained picture generation model.
The sample key information is obtained by analyzing pictures in each sample scene segment.
For example, for a newly created television drama XXX, a small scenario is manually extracted, the keywords that the user may change are used as the input values, and the pictures corresponding to the scenario are used as the output values.
Based on the above, new scene pictures can be generated based on the new scenario analysis data; in addition, new script text can be generated through data format conversion, so that, based on the script timeline corresponding to the new script text, the scene segments can be logically synthesized into the target video according to a suitable picture organization structure.
Specifically, the above-mentioned picture generation model may be a generative adversarial network (GAN). In the GAN training process, the sample key information corresponding to the sample character behavior logic sequence may be converted into a vector (i.e. a keyword vector) to represent the expected features of the image; the generator network receives the keyword vector and synthesizes an image; the generated image is then evaluated by the discriminator to determine whether it meets the expected features. Through training, the generator and the discriminator cooperate to achieve the desired result: the generator learns to generate images that conform to the expectation, and the discriminator learns to evaluate the generated images.
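A compact sketch of such conditional GAN training is given below; the network sizes, flattened image dimension and optimizer settings are illustrative assumptions rather than the actual configuration.

```python
# Sketch of conditional GAN training: the keyword vector conditions the generator,
# and the discriminator scores real vs. generated frames against the same condition.
import torch
from torch import nn

latent_dim, cond_dim, img_dim = 100, 64, 3 * 64 * 64     # illustrative sizes (flattened frames)

G = nn.Sequential(nn.Linear(latent_dim + cond_dim, 512), nn.ReLU(),
                  nn.Linear(512, img_dim), nn.Tanh())
D = nn.Sequential(nn.Linear(img_dim + cond_dim, 512), nn.LeakyReLU(0.2),
                  nn.Linear(512, 1), nn.Sigmoid())

opt_g = torch.optim.Adam(G.parameters(), lr=2e-4)
opt_d = torch.optim.Adam(D.parameters(), lr=2e-4)
bce = nn.BCELoss()

def train_step(real_images, keyword_vec):                # shapes: (batch, img_dim), (batch, cond_dim)
    batch = real_images.size(0)
    z = torch.randn(batch, latent_dim)
    fake = G(torch.cat([z, keyword_vec], dim=1))

    # Discriminator: real frames should score 1, generated frames 0.
    d_loss = (bce(D(torch.cat([real_images, keyword_vec], dim=1)), torch.ones(batch, 1))
              + bce(D(torch.cat([fake.detach(), keyword_vec], dim=1)), torch.zeros(batch, 1)))
    opt_d.zero_grad()
    d_loss.backward()
    opt_d.step()

    # Generator: produce frames that match the expected keyword features.
    g_loss = bce(D(torch.cat([fake, keyword_vec], dim=1)), torch.ones(batch, 1))
    opt_g.zero_grad()
    g_loss.backward()
    opt_g.step()
```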
Finally, based on the generator in the GAN, an image can be generated in combination with the keyword sentence vector. An alternative implementation flow is as follows:
Firstly, keywords and sentences are mapped into vectors: the key information corresponding to the new character behavior logic sequence (such as keyword sentences, scenario style keywords, scene keywords, character keywords and the like) is mapped into a plurality of feature vectors, each keyword or sentence being converted into one vector representing the expected features of the image. Further, the plurality of feature vectors may be input to the generator in the picture generation model, and after the corresponding images are generated, a scene segment can be composed of the images.
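An illustrative inference flow is sketched below; the embedding callable `embed`, the frame count and the way frames are stacked into a segment are assumptions for illustration.

```python
# Sketch: key information -> feature vectors -> trained generator -> scene segment.
import torch

def generate_scene_segment(generator, embed, key_info, frames_per_unit=48):
    """key_info: list of strings such as plot-trend / scene / character keywords.
    `embed` is a hypothetical callable mapping a keyword string to a cond_dim vector."""
    cond = torch.stack([embed(k) for k in key_info]).mean(dim=0, keepdim=True)  # keyword vectors
    frames = []
    for _ in range(frames_per_unit):
        z = torch.randn(1, 100)                              # latent noise
        frame = generator(torch.cat([z, cond], dim=1))       # image matching expected features
        frames.append(frame)
    return torch.cat(frames, dim=0)                          # one scene segment (stack of frames)
```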
In this process, if corresponding scenario roles need to be replaced, for example role A is replaced with role B, the role replacement can be realized by using the face migration function of the generative network. If the expression state of a character is adapted, the expression migration function of the generative network can be used. The remaining pictures of the original drama are automatically generated by the generative adversarial network GAN.
Based on the above embodiment, the target video can be automatically generated based on AI technology, which helps the object imagine and explore potential specific scenario trends and test the feasibility of a scenario, and helps the platform attract more types of objects without being limited by the original type of the works and the original scenario trend.
Optionally, after the scenario adaptation information for at least one target video generation area among the video generation areas associated with the original video is acquired, and before the original scenario analysis data are updated based on the scenario adaptation information, if the scenario adaptation information includes scenario custom content, semantic logic analysis is performed on the scenario custom content; if it is determined that the custom content does not accord with semantic logic, a prompt message is sent to the terminal device to prompt that the scenario custom content needs to be edited again.
That is, when the object selects content such as the scenario trend and character information, the system records the modified content in real time. When the object sets the scenario adaptation information by inputting scenario custom content, the semantic logic of the scenario custom content first needs to be analyzed based on the AI natural language analysis technology to determine whether it meets the logic requirements of a movie and television play. If this content meets normal semantic logic (i.e. meets the logic requirements), the next operation can continue to be executed; if it does not, the next operation cannot be carried out, and at this moment the terminal device can prompt the object to modify or re-input this content.
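A minimal sketch of this check is given below; `semantic_logic_score` stands in for the AI natural language analysis technique, and the threshold is an arbitrary illustrative value.

```python
# Hypothetical semantic-logic check applied to scenario custom content before the
# original scenario analysis data are updated.
def validate_custom_content(custom_text: str, analyzer, threshold: float = 0.5) -> dict:
    score = analyzer.semantic_logic_score(custom_text)       # assumed model method
    if score < threshold:
        # Not logically coherent for a movie/TV scenario: ask the terminal to re-edit.
        return {"ok": False, "prompt": "Scenario custom content needs to be edited again."}
    return {"ok": True, "prompt": ""}
```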
Referring to fig. 18, a flowchart of another scenario adaptation method of video according to an embodiment of the present application is shown. The specific implementation flow of the method is as follows:
Step S1801: after the object selects a play, the script text of the play is input into the trained natural language analysis model for analysis, forming the original character behavior logic sequences with scenes.
Step S1802: the pictures of the film and television play are analyzed with the trained picture analysis model in combination with the time sequence of each original character behavior logic sequence, and the complete original scenario analysis data are obtained by combining the picture analysis result with the original character behavior logic sequences.
Step S1803: the scenario adaptation information generated by the object by changing information such as the scenario, behavior, style, scene and character is obtained.
In this step, if the scenario adaptation information includes a freely input scenario portion, it is necessary to determine whether it satisfies semantic logic based on the AI natural language analysis technology. If yes, the subsequent step S1804 continues to be executed; otherwise, prompt information is sent to the terminal device to prompt that the freely input scenario portion of the object is unusable and needs to be re-input.
Step S1804: the scenario adaptation information is combined with the original character behavior logic sequences (which can also be understood as the original scenario analysis data), and each new character behavior logic sequence is obtained by using the trained natural language generation model and taken as the new scenario analysis data.
The time consumed by this step is related to the content of the adaptation; if the scenario fails to be generated, the process returns to the previous step and prompts the object about the adaptation-related content.
Step S1805: scene segments and picture organization structure logic are generated based on each new character behavior logic sequence by using the trained picture generation model.
The time consumed by this operation is related to the amount of adaptation; if a picture is difficult to generate, the process returns to the previous step and prompts the object about the adaptation-related content. In addition, if the object performs operations such as exchanging character images or changing the expression style, the content generation is carried out in cooperation with the relatively mature AI expression migration and face migration techniques.
Step S1806: using video encoding and decoding technology, the newly generated scene segments are combined with the old video segments of the original video and synthesized, according to the picture organization structure logic, into a complete new play video, which is taken as the target video.
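The overall flow of steps S1801 to S1806 can be summarized in the following pseudocode; every model handle and helper here is an assumption standing in for the components described above.

```python
# End-to-end pseudocode mirroring steps S1801-S1806 (all handles are hypothetical).
def adapt_video(original_video, script_text, adaptation_info,
                nl_analysis, picture_analysis, nl_generation, picture_generation, codec):
    # S1801: script text -> original character behavior logic sequences (with scenes)
    original_sequences = nl_analysis.analyze(script_text)

    # S1802: picture analysis + original sequences -> complete original scenario analysis data
    scenario_data = picture_analysis.analyze(original_video, original_sequences)

    # S1803: the scenario adaptation information has already been collected and logic-checked

    # S1804: adaptation info + original data -> new character behavior logic sequences
    new_sequences = nl_generation.generate(adaptation_info, scenario_data)

    # S1805: new sequences -> new scene segments + picture organization structure logic
    segments, organization = picture_generation.generate(new_sequences)

    # S1806: synthesize the new segments with reusable old clips into the target video
    return codec.compose(segments, original_video, organization)
```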
In summary, the generation of the target video in the present application may combine the old video segments of the original video with the newly generated scene segments.
Before the old video segments of the original video are combined, the original video needs to be learned, specifically:
After the object completes the setting of the scenario adaptation information, the system background learns the original scenario content. This part is performed by the system background; the object can stay waiting on the original page as shown in fig. 13, or can perform other operations after minimizing it as shown in fig. 14.
Wherein the system learning content includes, but is not limited to, the following elements:
[Scenario type and description] If the original video script is a romantic story of adolescent angst created under the conceptual framework of a vampire movie, after the style is learned the newly adapted content will not depart from the original background framework, narrative rhythm and style, unless the object deletes the original keyword style and adds new keywords when customizing the content, for example changing 'vampire' into 'zombie', changing the 'teenager' character setting, or changing 'romance' into 'comedy'; the system will then correspondingly learn from other film and television drama collections with those keywords and replace the original content and style in the final generated result.
[Speech text content and language style] If text language styles such as Scottish English, Chinese dialects, Latin or internet slang are used in the original video, the original style is maintained after learning, unless the object deletes the original keyword style and adds new keywords when customizing the content, for example changing 'Scottish English' into 'Cantonese'; the system will then correspondingly learn from other film and television episodes and language systems with those keywords and replace the original content and style in the final generated result.
[Actor images and performances] The actor images are the faces, makeup and hair of the characters in the play; the performances cover the character's personality, style, facial expressions, speech tone and the like. If a character's style in the original video has characteristics such as being aloof, intelligent and precise, walking briskly and speaking quickly, the generated content will reflect the learned style in the result, unless the object deletes the original keyword style and adds new keywords when customizing the content, for example changing 'aloof' into 'comedic'; the system will then correspondingly learn from other film and television drama roles and performance styles with those keywords and replace the original content and style in the final generated result.
[Clip style] The clip style includes the transition rhythm between clip segments, transition modes, filter styles, segment connections, special-effect applications and the like. The original clip style will be retained in the final result unless it is modified by the object.
[Music style] If the overall music application style of the original video is a pop piano style, the learned result will not add other piano styles, such as a classical or jazz style; the original pop piano style will be preserved and used as required during editing.
[Other movie dramas or general material] For example, empty-mirror (establishing-shot) content used to express the place, time and scene where the scenario occurs, or content that the object wishes to link with the original video scenario/characters, is used on demand during editing. As exemplified above, roles in other reference video materials are replaced or added, scenes in other reference video materials are replaced or added, and so on, to achieve cross-video linkage.
That is, all the final results are generated by combining the learned content with the learned training results, replacing the mental effort, time and technical cost otherwise required from people. Scenario album segments meeting the object's ideal scenario are generated through background learning, and the system can generate a different result for each object, so that all types of objects can satisfy their emotional or viewing needs.
Specifically, according to the learned [scenario type and style] and the learned [speech text style], the system generates the new script and dialogue content in text form and generates a corresponding script timeline.
For example, after the system has learned the actor images and performances of two actors according to the actor roles selected by the object, such as role A and role B, under the new script scenario the system can extract, from the original video segments, fragment scenes that can be used for the new script: for example, an original video segment that only needs role face migration and expression migration to replace the original role, or that only needs mouth-shape generation of a new dialogue to be usable in the adapted content, such as changing the face of role A into the face of role B or replacing the original dialogue of role A with the new dialogue, can be used in the final result. The extracted fragments can be synthesized in the order of the scenario script, for example fragments 1, 2 and 3 are combined together to generate the final result. The AI can also intercept, in the system background, film and television segments selected by other objects or from the network, and generate related linked scenario segments according to the scenario (such as role linkage across different television dramas), so as to satisfy the object's wishes.
For scenes that cannot be extracted from the original film and television segments, the corresponding role interactions are generated through the training results obtained after learning, and the expression, language style and facial appearance of the roles reflect the original role style. The dialogue or voice-over content of a character can be dubbed according to the lines of the new script, and the system generates the AI character speech according to the voice and mouth shape of the original character, so that the speech content accords with the scenario.
For some characters that need costumes or props, replacement is performed by means such as special-effect substitution. For example, if role A in the original video has a vampire image and the adapted role A has a zombie image, the system can migrate the original character onto the zombie image of role A by way of special-effect change. Other elements such as clothing, makeup and hair styling can also be replaced by this technique.
The AI can also automatically search for scene/character/content segments in a related film library, such as empty-mirror content used to express the place, time and scene where the scenario occurs, or content that the object wishes to link with the original video scenario/characters, in order to generate more complete album content according to the story. For example, if a library scene needs to be added to the newly generated adaptation fragment but the original video has no such scene, the exterior empty mirror of a library and interior shots combined with the characters can be selected from related material movie and television plays to synthesize the final library scene fragment.
When all the adapted scripts and video clips have been obtained, the system edits all the clips onto the required timeline. During editing, filters and transitions are applied and connected according to the original transition rhythm and style. Other dialogue, voice-over, empty mirrors, transition shots and the like can also be added between some scene segments to prevent the content from feeling disjointed.
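A minimal sketch of this timeline assembly is given below, assuming the moviepy 1.x API; the file names, time codes and clip order are illustrative only.

```python
# Sketch: cut the collected segments onto the required timeline and export the result.
from moviepy.editor import VideoFileClip, concatenate_videoclips

timeline = [                                   # (source file, start_s, end_s) in script order
    ("segment_1.mp4", 0.0, 12.5),              # newly generated scene segment
    ("original_video.mp4", 301.0, 318.0),      # reused old clip from the original video
    ("segment_3.mp4", 0.0, 9.0),
]

clips = [VideoFileClip(path).subclip(start, end) for path, start, end in timeline]
target = concatenate_videoclips(clips, method="compose")   # follows the timeline order
target.write_videofile("target_video.mp4")
```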
After learning the original video and the style customized by the object, corresponding music segments are matched during editing according to the scenario trend and emotional background of the script. The matched music segments can also be automatically adjusted to a certain extent in rhythm and key according to the script and the emotional background; for example, if the atmosphere originally called for major-key music with a gentle rhythm, the adapted content can be adjusted to minor-key music with a brisk rhythm.
Combining, but not limited to, the elements and steps above, the system produces a final result; both the duration of the result and the degree of story summarization can also be adjusted by the object at an earlier stage, as exemplified in fig. 10A or fig. 10B.
It should be noted that if re-editing or modification related to the scenario of the generated content is performed, the system will be reset and return to step S1803 to restart a complete new generation; whereas if the re-editing only concerns the picture content of the generated result, such as adding a filter, the related video editing software will be called for secondary editing.
Based on the same inventive concept, the embodiment of the application also provides a scenario adaptation device of the video. As shown in fig. 19, which is a schematic structural diagram of a scenario adaptation apparatus 1900, may include:
a first response unit 1901, configured to respond to an adaptation operation triggered for an original video to be adapted, and present a selection interface, where the selection interface includes: each video generation area associated with the original video; wherein each video generation region is associated with a video clip set in the original video;
the second response unit 1902 is configured to respond to a selection operation triggered for each video generation region, and present an editing interface, where the editing interface includes: original scenario information of video clip sets respectively associated with at least one selected target video generation region;
a third response unit 1903, configured to obtain corresponding scenario adaptation information in response to the scenario adaptation operation triggered in the editing interface, and present a target video generated based on the obtained at least one original scenario information and scenario adaptation information; the target video is adapted based on the video clip sets respectively associated with the at least one target video generation region.
Optionally, the first response unit 1901 is specifically configured to:
Presenting a video playing interface for playing the original video; the video playing interface comprises an adaptation entrance;
responding to an reprogramming operation triggered by a reprogramming inlet in the video playing interface, and presenting a selection interface; one video generation area in the selection interface is marked with the current playing progress of the original video.
Optionally, the editing interface further includes scenario development options associated with each of the at least one target video generation region;
the third response unit 1903 is specifically configured to:
responding to a selection operation triggered for each scenario development option, and taking at least one selected target scenario development option as scenario adaptation information related to scenario development;
and presenting the target video obtained by adapting the plot development trend of the video clips respectively associated with the at least one target video generation area based on the obtained at least one original plot information and plot adaptation information.
Optionally, the editing interface further includes a first custom area;
the third response unit 1903 is specifically configured to:
responding to the input operation aiming at the first custom region, acquiring the input scenario custom content, and taking the scenario custom content as scenario adaptation information;
And presenting the target video obtained by adapting the plot development trend of the video clips respectively associated with the at least one target video generation area based on the obtained at least one original plot information and plot adaptation information.
Optionally, the third response unit 1903 is further configured to:
before presenting a target video generated based on the obtained at least one original scenario information and scenario adaptation information, presenting a next page of the editing interface in response to a next page operation for the editing interface; the next page contains: the system comprises at least one target video generation area, scenario style information of each associated video clip set and preset scenario style options;
and responding to the selection operation triggered for each scenario style option, and taking at least one selected target scenario style option as scenario adaptation information related to the scenario style.
Optionally, the third response unit 1903 is further configured to:
before presenting a target video generated based on the obtained at least one original scenario information and scenario adaptation information, presenting a next page of the editing interface in response to a next page operation for the editing interface; the next page contains: the method comprises the steps that at least one target video generation area is respectively associated with original scenario roles of video clip sets and preset role options;
Responding to character modification operation triggered based on character options, and obtaining scenario adaptation information related to scenario characters; wherein the character modification operation includes at least one of: character replacement operation between original scenario characters, character replacement operation between candidate scenario characters and original scenario characters, and character addition operation based on candidate scenario characters; the candidate scenario roles include at least one of: and the scenario roles in the original video except for the original scenario roles are the scenario roles in other video materials.
Optionally, the third response unit 1903 is further configured to:
before presenting a target video generated based on the obtained at least one original scenario information and scenario adaptation information, presenting a next page of the editing interface in response to a next page operation for the editing interface; the next page contains: at least one target video generation area respectively associated with an original scenario scene of a video clip set and preset scene options;
acquiring scenario adaptation information related to a scenario scene in response to a scene modification operation triggered based on the scene options; wherein the scene modification operation includes at least one of: a scene replacement operation between original scenario scenes, a scene replacement operation between candidate scenario scenes and original scenario scenes, and a scene addition operation based on candidate scenario scenes; the candidate scenario scenes include at least one of: other scenario scenes contained in the original video except the original scenario scenes, and scenario scenes in other video materials.
Optionally, the third response unit 1903 is further configured to:
before presenting a target video generated based on the obtained at least one original scenario information and scenario adaptation information, presenting a next page of the editing interface in response to a next page operation for the editing interface; the next page contains: setting a control for the duration;
and acquiring scenario adaptation information related to the scenario duration in response to a duration setting operation triggered by the duration setting control.
Optionally, the third response unit 1903 is further configured to:
before presenting a target video generated based on the obtained at least one original scenario information and scenario adaptation information, presenting a next page of the editing interface in response to a next page operation for the editing interface; the next page contains: a second custom region;
and responding to the input operation aiming at the second custom region, acquiring other input custom content, and taking the other custom content as scenario adaptation information.
Optionally, the third response unit 1903 is further configured to:
before presenting a target video generated based on the obtained at least one original scenario information and scenario adaptation information, presenting a content generation interface comprising: generating a progress bar aiming at a target video;
The third response unit 1903 is specifically configured to:
and determining that the target video generation is completed, and playing the target video in the content generation interface.
Optionally, the third response unit 1903 is further configured to:
after presenting the content generation interface and before playing the target video in the content generation interface, responding to the minimizing operation for the content generation interface, jumping to the video playing interface, and presenting the generation progress bar for the target video at a position associated with the adaptation entry in the video playing interface; the adaptation entry is used to trigger the adaptation operation for the original video.
Optionally, the apparatus further comprises:
a fourth response unit 1904 for presenting at least one type of editing control in response to an editing operation for the target video;
and responding to a selection operation triggered by at least one type of editing control, editing the picture content of the target video based on the selected target editing control, and presenting a corresponding editing effect.
Based on the same inventive concept, the embodiment of the application also provides a scenario adaptation device of the video.
As shown in fig. 20, which is a schematic structural diagram of the scenario adaptation apparatus 2000, may include:
an analysis unit 2001, configured to analyze script text and picture data of an original video to be adapted, and obtain original scenario analysis data corresponding to the original video;
The updating unit 2002 is configured to obtain scenario adaptation information of at least one target video generation area in each video generation area associated with the original video, update the original scenario analysis data based on the scenario adaptation information, and generate corresponding new scenario analysis data; the scenario adaptation information is adaptation information of original scenario information of video clips respectively associated with at least one target video generation region;
an adaptation unit 2003 for generating a new script text and each scene segment conforming to the scenario adaptation information based on the new scenario analysis data, and synthesizing each scene segment into a target video based on the new script text; the target video is adapted based on the respective associated video segment sets of the at least one target video generation region.
Alternatively, the analysis unit 2001 is specifically configured to:
analyzing script text of the original video to obtain an original character behavior logic sequence containing scenes;
analyzing the picture of the original video, and supplementing the logic sequence of the behavior of the original role by combining the picture analysis result to obtain scenario analysis data corresponding to the original video.
Alternatively, the analysis unit 2001 is specifically configured to:
Inputting script text of an original video into a trained natural language analysis model to obtain an original character behavior logic sequence output by the natural language analysis model;
the natural language analysis model is obtained by training a sample script text serving as an input characteristic and a corresponding sample character behavior logic sequence serving as an output characteristic; the sample character behavior logic sequence is constructed based on the relation among characters obtained by analyzing the text of the sample script and the data logic description of the scenario content structure obtained by analyzing the corresponding scene.
Alternatively, the analysis unit 2001 is specifically configured to:
inputting the original character behavior logic sequence and the original video into a trained picture analysis model to obtain scattered scene fragments output by the picture analysis model; each character behavior logic sequence comprises at least one character behavior logic sequence unit, and each character behavior logic sequence unit corresponds to one scene segment;
analyzing pictures in each obtained scene segment to obtain key information, and marking an original role behavior logic sequence based on the obtained key information to obtain original scenario analysis data;
the picture analysis model is obtained by training a sample video and a sample character behavior logic sequence serving as input features and corresponding sample scene fragments serving as output features; the sample scene segment is obtained by splitting a sample video based on a sample character behavior logic sequence.
Optionally, the updating unit 2002 is specifically configured to:
inputting the scenario adaptation information and the original scenario analysis data into a trained natural language generation model to obtain a new character behavior logic sequence output by the natural language generation model, and taking the new character behavior logic sequence as the new scenario analysis data;
the natural language generation model is obtained by training an original sample character behavior logic sequence and sample scenario adaptation information serving as input features and a corresponding new sample character behavior logic sequence serving as output features; the new sample character behavior logic sequence is obtained by modifying the original sample character behavior logic sequence for a plurality of times based on sample scenario adaptation information; the original sample character behavior logic sequence is constructed based on the relation between characters obtained by analyzing the sample script text and the data logic description of the scenario content structure obtained by analyzing the corresponding scene.
Optionally, the original sample character behavior logic sequence further includes an end identifier for representing a scenario end type of the corresponding sample video.
Optionally, the adapting unit 2003 is specifically configured to:
inputting the new character behavior logic sequence into the trained picture generation model to obtain a scene segment output by the picture generation model;
The picture generation model is obtained by training a sample character behavior logic sequence and sample key information serving as input features and corresponding sample scene fragments serving as output features; the sample key information is obtained by analyzing pictures in each sample scene segment.
Optionally, the picture generation model is a generative adversarial network; the adaptation unit 2003 is specifically configured to:
mapping key information corresponding to the new character behavior logic sequence into a plurality of feature vectors; each key information corresponds to a feature vector;
the plurality of feature vectors are input to a generator in a picture generation model, corresponding images are generated, and scene segments composed of the images are obtained.
Optionally, the updating unit 2002 is further configured to:
after scenario adaptation information of at least one target video generation area in each video generation area related to the original video is acquired, if the scenario adaptation information comprises scenario custom content before updating original scenario analysis data based on the scenario adaptation information, carrying out semantic logic analysis on the scenario custom content;
if the custom content is determined not to accord with the semantic logic, a prompt message is sent to the terminal equipment to prompt that the scenario custom content needs to be edited again.
According to the method, the object can trigger the adaptation operation directly for the original video to be adapted, and a selection interface for selecting the scenario of the original video to be adapted is then presented. Specifically, the selection interface contains the video generation areas associated with the original video, each video generation area being associated with one video clip set in the original video. Based on the selection interface, the object can select at least one target video generation area to be adapted, trigger the scenario adaptation operation in the corresponding editing interface, and set the corresponding scenario adaptation information according to its own needs; a matching target video can then be automatically generated based on the video clip sets associated with the target video generation areas and the scenario adaptation information set by the object. In this way, a scenario meeting the object's viewing needs can be automatically generated in combination with those needs, improving the flexibility and efficiency of video adaptation.
For convenience of description, the above parts are described as being divided into modules (or units) by function. Of course, when implementing the present application, the functions of each module (or unit) may be implemented in one or more pieces of software or hardware.
Having described the scenario adaptation method and apparatus of a video according to an exemplary embodiment of the present application, next, an electronic device according to another exemplary embodiment of the present application is described.
Those skilled in the art will appreciate that the various aspects of the application may be implemented as a system, a method, or a program product. Accordingly, aspects of the application may be embodied in the following forms: an entirely hardware embodiment, an entirely software embodiment (including firmware, micro-code, etc.), or an embodiment combining hardware and software aspects, which may be collectively referred to herein as a "circuit", "module" or "system".
The embodiment of the application also provides electronic equipment based on the same conception as the embodiment of the method. In one embodiment, the electronic device may be a server, such as server 120 shown in FIG. 1. In this embodiment, the electronic device may be configured as shown in fig. 21, including a memory 2101, a communication module 2103, and one or more processors 2102.
A memory 2101 for storing a computer program for execution by the processor 2102. The memory 2101 may mainly include a storage program area and a storage data area, wherein the storage program area may store an operating system, a program required for running an instant communication function, and the like; the storage data area can store various instant messaging information, operation instruction sets and the like.
The memory 2101 may be a volatile memory, such as a random-access memory (RAM); the memory 2101 may also be a non-volatile memory, such as a read-only memory, a flash memory, a hard disk drive (HDD) or a solid state drive (SSD); or the memory 2101 may be any other medium that can be used to carry or store a desired computer program in the form of instructions or data structures and that can be accessed by a computer, but is not limited thereto. The memory 2101 may also be a combination of the above.
The processor 2102 may include one or more central processing units (central processing unit, CPU) or digital processing units, etc. The processor 2102 is configured to implement the scenario adaptation method of the video when calling the computer program stored in the memory 2101.
The communication module 2103 is used for communicating with terminal devices and other servers.
The specific connection medium between the memory 2101, the communication module 2103 and the processor 2102 is not limited in this embodiment. In fig. 21, the memory 2101 and the processor 2102 are connected by a bus 2104, which is depicted with a bold line; the connections between the other components are merely illustrative and not limiting. The bus 2104 may be divided into an address bus, a data bus, a control bus and the like. For ease of description, only one thick line is depicted in fig. 21, but this does not mean that there is only one bus or one type of bus.
The memory 2101 stores therein a computer storage medium having stored therein computer executable instructions for implementing a scenario adaptation method of a video according to an embodiment of the present application. The processor 2102 is configured to perform the scenario adaptation method of video described above, as shown in fig. 17.
In another embodiment, the electronic device may also be other electronic devices, such as terminal device 110 shown in fig. 1. In this embodiment, the structure of the electronic device may include, as shown in fig. 22: communication component 2210, memory 2220, display unit 2230, camera 2240, sensor 2250, audio circuit 2260, bluetooth module 2270, processor 2280, etc.
The communication component 2210 is used for communicating with the server. In some embodiments, a wireless fidelity (WiFi) module may be included; the WiFi module belongs to short-range wireless transmission technology, and the electronic device may help the user send and receive information through the WiFi module.
Memory 2220 may be used to store software programs and data. Processor 2280 performs various functions of terminal device 110 and data processing by executing software programs or data stored in memory 2220. Memory 2220 may include high-speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other volatile solid-state storage device. The memory 2220 stores an operating system that enables the terminal device 110 to operate. The memory 2220 in the present application may store an operating system and various application programs, and may also store a computer program for executing the scenario adaptation method of the video according to the embodiment of the present application.
The display unit 2230 may also be used to display information input by a user or information provided to the user and a graphical user interface (graphical user interface, GUI) of various menus of the terminal device 110. Specifically, the display unit 2230 may include a display screen 2232 provided on the front surface of the terminal device 110. The display 2232 may be configured in the form of a liquid crystal display, light emitting diodes, or the like. The display unit 2230 may be used to display various types of user interfaces in embodiments of the present application, such as a selection interface, an editing interface, a content generation interface, a video playback interface, and the like.
The display unit 2230 may also be used to receive input digital or character information and generate signal inputs related to user settings and function control of the terminal device 110. In particular, the display unit 2230 may include a touch screen 2231 provided on the front surface of the terminal device 110, which collects touch operations by the user on or near it, such as clicking buttons and dragging scroll boxes.
The touch screen 2231 may be covered on the display screen 2232, or the touch screen 2231 may be integrated with the display screen 2232 to implement input and output functions of the terminal device 110, and after integration, the integrated touch screen may be abbreviated as touch screen. The display unit 2230 of the present application may display the application program and the corresponding operation steps.
The camera 2240 may be used to capture still images, and a user may post images captured by the camera 2240 through an application. The camera 2240 may be one or more. The object generates an optical image through the lens and projects the optical image onto the photosensitive element. The photosensitive element may be a charge coupled device (charge coupled device, CCD) or a Complementary Metal Oxide Semiconductor (CMOS) phototransistor. The photosensitive element converts the optical signal into an electrical signal, which is then passed to processor 2280 for conversion into a digital image signal.
The terminal device may also include at least one sensor 2250, such as an acceleration sensor 2251, a distance sensor 2252, a fingerprint sensor 2253, a temperature sensor 2254. The terminal device may also be configured with other sensors such as gyroscopes, barometers, hygrometers, thermometers, infrared sensors, light sensors, motion sensors, and the like.
Audio circuitry 2260, speaker 2261, microphone 2262 may provide an audio interface between the user and terminal device 110. The audio circuit 2260 may transmit the received electrical signal converted from audio data to the speaker 2261, and may be converted into a sound signal by the speaker 2261 for output. The terminal device 110 may also be configured with a volume button for adjusting the volume of the sound signal. On the other hand, the microphone 2262 converts the collected sound signals into electrical signals, which are received by the audio circuit 2260 and converted into audio data, which are output to the communication component 2210 for transmission to, for example, another terminal device 110, or to the memory 2220 for further processing.
The bluetooth module 2270 is configured to interact with other bluetooth devices having bluetooth modules via a bluetooth protocol. For example, the terminal device may establish a bluetooth connection with a wearable electronic device (e.g., a smart watch) that also has a bluetooth module through the bluetooth module 2270, so as to perform data interaction.
Processor 2280 is a control center of the terminal device, connects various parts of the entire terminal using various interfaces and lines, and performs various functions of the terminal device and processes data by running or executing software programs stored in memory 2220, and invoking data stored in memory 2220. In some embodiments, processor 2280 may include one or more processing units; processor 2280 may also integrate an application processor that primarily handles operating systems, user interfaces, applications, etc., with a baseband processor that primarily handles wireless communications. It will be appreciated that the baseband processor described above may not be integrated into processor 2280. Processor 2280 of the present application may run an operating system, applications, user interface displays and touch responses, as well as a scenario adaptation method for video of embodiments of the present application, such as the method shown in fig. 2. In addition, a processor 2280 is coupled to the display unit 2230.
In some possible embodiments, aspects of the scenario adaptation method of a video provided by the present application may also be implemented in the form of a program product comprising a computer program for causing an electronic device to perform the steps in the scenario adaptation method of a video according to the various exemplary embodiments of the present application described above when the program product is run on the electronic device, e.g. the electronic device may perform the steps as shown in fig. 2 or 17.
The program product may employ any combination of one or more readable media. The readable medium may be a readable signal medium or a readable storage medium. The readable storage medium can be, for example, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or a combination of any of the foregoing. More specific examples (a non-exhaustive list) of the readable storage medium would include the following: an electrical connection having one or more wires, a portable disk, a hard disk, random Access Memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), optical fiber, portable compact disk read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
The program product of embodiments of the present application may take the form of a portable compact disc read only memory (CD-ROM) and comprise a computer program and may be run on an electronic device. However, the program product of the present application is not limited thereto, and in this document, a readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with a command execution system, apparatus, or device.
The readable signal medium may comprise a data signal propagated in baseband or as part of a carrier wave in which a readable computer program is embodied. Such a propagated data signal may take any of a variety of forms, including, but not limited to, electro-magnetic, optical, or any suitable combination of the foregoing. A readable signal medium may also be any readable medium that is not a readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with a command execution system, apparatus, or device.
A computer program embodied on a readable medium may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing.
Computer programs for performing the operations of the present application may be written in any combination of one or more programming languages, including an object oriented programming language such as Java, C++ or the like and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The computer program may execute entirely on the consumer electronic device, partly on the consumer electronic device, as a stand-alone software package, partly on the consumer electronic device and partly on a remote electronic device or entirely on the remote electronic device or server. In the case of remote electronic devices, the remote electronic device may be connected to the consumer electronic device through any kind of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or may be connected to an external electronic device (e.g., connected through the internet using an internet service provider).
It should be noted that although several units or sub-units of the apparatus are mentioned in the above detailed description, such a division is merely exemplary and not mandatory. Indeed, the features and functions of two or more of the elements described above may be embodied in one element in accordance with embodiments of the present application. Conversely, the features and functions of one unit described above may be further divided into a plurality of units to be embodied.
Furthermore, although the operations of the methods of the present application are depicted in the drawings in a particular order, this should not be understood as requiring that the operations be performed in that particular order, or that all of the illustrated operations be performed, in order to achieve desirable results. Additionally or alternatively, certain steps may be omitted, a plurality of steps may be combined into one step to be performed, and/or one step may be decomposed into a plurality of steps to be performed.
It will be appreciated by those skilled in the art that embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, etc.) having a computer-usable program embodied therein.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions executed by the processor of the computer or other programmable data processing apparatus produce means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
While preferred embodiments of the present application have been described, additional variations and modifications to those embodiments may occur to those skilled in the art once they learn of the basic inventive concept. It is therefore intended that the appended claims be interpreted as including the preferred embodiments and all such variations and modifications as fall within the scope of the application.
It will be apparent to those skilled in the art that various modifications and variations can be made to the present application without departing from the spirit or scope of the application. Thus, it is intended that the present application also include such modifications and alterations insofar as they come within the scope of the appended claims or the equivalents thereof.

Claims (25)

1. A scenario adaptation method for a video, the method comprising:
in response to an adaptation operation triggered for an original video to be adapted, presenting a selection interface, the selection interface comprising: respective video generation regions associated with the original video; wherein each video generation region is associated with a set of video segments in the original video;
in response to a selection operation triggered for the respective video generation regions, presenting an editing interface, the editing interface comprising: original scenario information of video clip sets respectively associated with at least one selected target video generation region;
and in response to a scenario adaptation operation triggered in the editing interface, acquiring corresponding scenario adaptation information, and presenting a target video generated based on the acquired at least one original scenario information and the scenario adaptation information; wherein the target video is adapted based on the video clip sets respectively associated with the at least one target video generation region.
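As an illustrative, non-limiting sketch of the interaction flow described in claim 1 — all class, field, and function names below are hypothetical and not taken from the patent — the following Python fragment models the selection interface, the editing interface, and the request assembled for target-video generation:

from dataclasses import dataclass, field

@dataclass
class VideoGenerationRegion:           # one region shown in the selection interface
    region_id: int
    clip_ids: list                     # the set of video segments associated with this region
    original_scenario: str             # original scenario information shown in the editing interface

@dataclass
class AdaptationSession:
    regions: list                                  # all regions associated with the original video
    selected: list = field(default_factory=list)

    def select(self, region_ids):
        # selection operation: the editing interface presents the original scenario information
        self.selected = [r for r in self.regions if r.region_id in region_ids]
        return [r.original_scenario for r in self.selected]

    def adapt(self, scenario_adaptation_info):
        # scenario adaptation operation: bundle what is needed to generate the target video
        return {"clips": [r.clip_ids for r in self.selected],
                "original_scenario": [r.original_scenario for r in self.selected],
                "adaptation": scenario_adaptation_info}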
2. The method of claim 1, wherein presenting a selection interface in response to an adaptation operation triggered for an original video to be adapted comprises:
presenting a video playing interface for playing the original video; wherein the video playing interface comprises an adaptation entry;
and in response to an adaptation operation triggered through the adaptation entry in the video playing interface, presenting the selection interface; wherein one video generation region in the selection interface is marked with the current playing progress of the original video.
3. The method of claim 1, wherein the editing interface further comprises scenario development options associated with each of the at least one target video generation area;
wherein the responding to the scenario adaptation operation triggered in the editing interface to acquire corresponding scenario adaptation information, and presenting a target video generated based on the acquired at least one original scenario information and the scenario adaptation information, comprises:
responding to a selection operation triggered for each scenario development option, and taking at least one selected target scenario development option as scenario adaptation information related to scenario development;
and presenting the target video obtained by adapting the scenario development trend of the video clip sets respectively associated with the at least one target video generation region based on the acquired at least one original scenario information and the scenario adaptation information.
4. The method of claim 1, wherein the editing interface further comprises a first custom region;
wherein the responding to the scenario adaptation operation triggered in the editing interface to acquire corresponding scenario adaptation information, and presenting a target video generated based on the acquired at least one original scenario information and the scenario adaptation information, comprises:
in response to an input operation for the first custom region, acquiring the input scenario custom content, and taking the scenario custom content as the scenario adaptation information;
and presenting the target video obtained by adapting the scenario development trend of the video clip sets respectively associated with the at least one target video generation region based on the acquired at least one original scenario information and the scenario adaptation information.
5. The method of claim 1, wherein prior to presenting the target video generated based on the obtained at least one original scenario information and the scenario adaptation information, the method further comprises:
in response to a next page operation for the editing interface, presenting a next page of the editing interface; the next page includes: the scenario style information of the video clip set respectively associated with the at least one target video generation region and preset scenario style options;
and in response to a selection operation triggered for the respective scenario style options, taking the selected at least one target scenario style option as scenario adaptation information related to the scenario style.
6. The method of claim 1, wherein prior to presenting the target video generated based on the obtained at least one original scenario information and the scenario adaptation information, the method further comprises:
in response to a next page operation for the editing interface, presenting a next page of the editing interface; the next page includes: original scenario characters of the video clip sets respectively associated with the at least one target video generation region and preset character options;
and in response to a character modification operation triggered based on the character options, acquiring scenario adaptation information related to scenario characters; wherein the character modification operation includes at least one of: a character replacement operation between original scenario characters, a character replacement operation between a candidate scenario character and an original scenario character, and a character addition operation based on a candidate scenario character; and the candidate scenario characters include at least one of: scenario characters in the original video other than the original scenario characters, and scenario characters in other video materials.
7. The method of claim 1, wherein prior to presenting the target video generated based on the obtained at least one original scenario information and the scenario adaptation information, the method further comprises:
in response to a next page operation for the editing interface, presenting a next page of the editing interface; the next page includes: the original scenario scenes of the video clip sets respectively associated with the at least one target video generation area and preset scene options;
and in response to a scene modification operation triggered based on the scene options, acquiring scenario adaptation information related to scenario scenes; wherein the scene modification operation includes at least one of: a scene replacement operation between original scenario scenes, a scene replacement operation between a candidate scenario scene and an original scenario scene, and a scene addition operation based on other scenario scenes; and the candidate scenario scenes include at least one of: scenario scenes in the original video other than the original scenario scenes, and scenario scenes in other video materials.
8. The method of claim 1, wherein prior to presenting the target video generated based on the obtained at least one original scenario information and the scenario adaptation information, the method further comprises:
in response to a next page operation for the editing interface, presenting a next page of the editing interface; wherein the next page includes: a duration setting control;
and acquiring scenario adaptation information related to the scenario duration in response to a duration setting operation triggered by the duration setting control.
9. The method of claim 1, wherein prior to presenting the target video generated based on the obtained at least one original scenario information and the scenario adaptation information, the method further comprises:
in response to a next page operation for the editing interface, presenting a next page of the editing interface; the next page includes: a second custom region;
and in response to an input operation for the second custom region, acquiring the input other custom content, and taking the other custom content as the scenario adaptation information.
10. The method of any one of claims 1-9, wherein prior to presenting the target video generated based on the obtained at least one original scenario information and the scenario adaptation information, the method further comprises:
presenting a content generation interface, the content generation interface comprising: generating a progress bar for the target video;
The presenting a target video generated based on the obtained at least one original scenario information and the scenario adaptation information, comprising:
determining that the generation of the target video is completed, and playing the target video in the content generation interface.
11. The method of claim 10, wherein after presenting a content generation interface, before playing the target video in the content generation interface, the method further comprises:
in response to a minimizing operation for the content generation interface, jumping to a video playing interface, and presenting a generation progress bar for the target video at a position relative to an adaptation entry in the video playing interface; wherein the adaptation entry is used for triggering an adaptation operation for the original video.
12. The method of any one of claims 1-9, wherein the method further comprises:
in response to an editing operation for the target video, presenting at least one type of editing control;
and responding to the selection operation triggered by the at least one type of editing control, editing the picture content of the target video based on the selected target editing control, and presenting a corresponding editing effect.
13. A scenario adaptation method for a video, the method comprising:
analyzing script text and picture data of an original video to be adapted to obtain original scenario analysis data corresponding to the original video;
acquiring scenario adaptation information of at least one target video generation region among the respective video generation regions associated with the original video, and updating the original scenario analysis data based on the scenario adaptation information to generate corresponding new scenario analysis data; wherein the scenario adaptation information is adaptation information of the original scenario information of the video clip sets respectively associated with the at least one target video generation region;
and generating, based on the new scenario analysis data, a new script text and scene segments that conform to the scenario adaptation information, and synthesizing the scene segments into a target video based on the new script text; wherein the target video is adapted based on the video segment sets respectively associated with the at least one target video generation region.
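A self-contained sketch of the claim-13 pipeline — parse, update with the scenario adaptation information, then regenerate the script and scene segments — is shown below. The helper functions are simplified stand-ins chosen for illustration, not the trained models described in the later claims:

def parse_scenario(script_text, frame_labels):
    # original scenario analysis data: one entry per scene, combining text and picture cues
    return [{"scene": i, "text": line, "visual": frame_labels.get(i, [])}
            for i, line in enumerate(script_text.splitlines())]

def update_parsing(parsing, adaptation_info):
    # fold the scenario adaptation information into the affected scenes
    return [dict(entry, adapted=adaptation_info.get(entry["scene"], entry["text"]))
            for entry in parsing]

def synthesize(new_parsing):
    # stand-in for generating scene segments and composing them into the target video
    new_script = "\n".join(entry["adapted"] for entry in new_parsing)
    scene_segments = ["segment_%d" % entry["scene"] for entry in new_parsing]
    return new_script, scene_segments

script = "Hero meets the mentor\nHero loses the duel"
parsing = parse_scenario(script, {1: ["rainy street"]})
new_script, segments = synthesize(update_parsing(parsing, {1: "Hero wins the duel"}))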
14. The method of claim 13, wherein analyzing the script text and picture data of the original video to be adapted to obtain the original scenario analysis data corresponding to the original video comprises:
Analyzing the script text of the original video to obtain an original character behavior logic sequence containing scenes;
and analyzing the pictures of the original video, and supplementing the original character behavior logic sequence with the picture analysis result to obtain the original scenario analysis data corresponding to the original video.
15. The method of claim 14, wherein analyzing the script text of the original video to obtain the original character behavior logic sequence comprising the scene comprises:
inputting script text of the original video into a trained natural language analysis model to obtain the original character behavior logic sequence output by the natural language analysis model;
wherein the natural language analysis model is obtained by training with a sample script text serving as an input feature and a corresponding sample character behavior logic sequence serving as an output feature; and the sample character behavior logic sequence is constructed based on relations among characters obtained through text analysis of the sample script and a data logic description of the scenario content structure obtained through corresponding scene analysis.
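The trained natural language analysis model itself is not disclosed in code form; the rule-based toy below only illustrates what a character behavior logic sequence might look like, under the hypothetical assumption that script lines follow a "Character@Scene: action" format:

import re

def to_behavior_sequence(script_text):
    sequence = []
    for line in script_text.splitlines():
        m = re.match(r"(?P<char>\w+)@(?P<scene>\w+):\s*(?P<action>.+)", line)
        if m:   # one behavior logic unit per parsed line
            sequence.append({"character": m["char"], "scene": m["scene"], "action": m["action"]})
    return sequence

sequence = to_behavior_sequence("Alice@Garden: waters the roses\nBob@Garden: arrives late")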
16. The method of claim 14, wherein analyzing the pictures of the original video in combination with the original character behavior logic sequence to obtain the original scenario analysis data corresponding to the original video comprises:
inputting the original character behavior logic sequence and the original video into a trained picture analysis model to obtain the split scene segments output by the picture analysis model; wherein the character behavior logic sequence comprises at least one character behavior logic sequence unit, and each character behavior logic sequence unit corresponds to one scene segment;
and analyzing pictures in each obtained scene segment to obtain key information, and marking the original character behavior logic sequence based on the obtained key information to obtain the original scenario analysis data;
wherein the picture analysis model is obtained by training with a sample video and a sample character behavior logic sequence serving as input features and corresponding sample scene segments serving as output features; and the sample scene segments are obtained by splitting the sample video based on the sample character behavior logic sequence.
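The splitting-and-marking step of claim 16 can be pictured with the sketch below, under the simplifying assumption that each behavior logic unit maps to an equal-length frame range and that the key information per unit is supplied externally rather than extracted by a trained picture analysis model:

def split_and_mark(behavior_sequence, total_frames, key_info_per_unit):
    # one scene segment per behavior logic unit, annotated with its key information
    step = max(total_frames // max(len(behavior_sequence), 1), 1)
    marked = []
    for i, unit in enumerate(behavior_sequence):
        marked.append(dict(unit,
                           frames=(i * step, (i + 1) * step),
                           key_info=key_info_per_unit.get(i, [])))
    return marked   # serves as the original scenario analysis data

units = [{"character": "Alice", "scene": "Garden", "action": "waters the roses"},
         {"character": "Bob", "scene": "Garden", "action": "arrives late"}]
analysis_data = split_and_mark(units, total_frames=240, key_info_per_unit={0: ["roses", "daylight"]})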
17. The method of claim 13, wherein updating the original scenario analysis data based on the scenario adaptation information to generate corresponding new scenario analysis data comprises:
inputting the scenario adaptation information and the original scenario analysis data into a trained natural language generation model to obtain a new character behavior logic sequence output by the natural language generation model, and taking the new character behavior logic sequence as the new scenario analysis data;
wherein the natural language generation model is obtained by training with an original sample character behavior logic sequence and sample scenario adaptation information serving as input features and a corresponding new sample character behavior logic sequence serving as an output feature; the new sample character behavior logic sequence is obtained by modifying the original sample character behavior logic sequence a plurality of times based on the sample scenario adaptation information; and the original sample character behavior logic sequence is constructed based on relations among characters obtained by analyzing the sample script text and a data logic description of the scenario content structure obtained by analyzing the corresponding scenes.
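A simplified stand-in for the update step of claim 17 is sketched below: the scenario adaptation information is applied to the original character behavior logic sequence as per-unit overrides, appended units, and an optional ending identifier (echoing claim 18). The dictionary layout and key names are assumptions made for illustration:

def apply_adaptation(original_sequence, adaptation):
    new_sequence = []
    for i, unit in enumerate(original_sequence):
        # per-unit overrides, e.g. a changed action or a replaced character
        new_sequence.append(dict(unit, **adaptation.get("replace", {}).get(i, {})))
    new_sequence.extend(adaptation.get("append", []))        # newly added plot units
    if "ending" in adaptation:                                # ending identifier (cf. claim 18)
        new_sequence.append({"character": "*", "scene": "*", "action": adaptation["ending"]})
    return new_sequence   # the new scenario analysis data

new_sequence = apply_adaptation(
    [{"character": "Alice", "scene": "Garden", "action": "loses the duel"}],
    {"replace": {0: {"action": "wins the duel"}}, "ending": "happy ending"})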
18. The method of claim 17, wherein the original sample character behavior logic sequence further comprises an end identifier that is used to represent a scenario end type of the corresponding sample video.
19. The method of claim 17, wherein generating scene segments conforming to the scenario adaptation information based on the new scenario analysis data comprises:
inputting the new character behavior logic sequence into a trained picture generation model to obtain a scene segment output by the picture generation model;
wherein the picture generation model is obtained by training with a sample character behavior logic sequence and sample key information serving as input features and corresponding sample scene segments serving as output features; and the sample key information is obtained by analyzing pictures in each sample scene segment.
20. The method of claim 19, wherein the picture generation model is a generative adversarial network; and inputting the new character behavior logic sequence into the trained picture generation model to obtain the scene segments output by the picture generation model comprises:
mapping key information corresponding to the new character behavior logic sequence into a plurality of feature vectors; wherein each piece of key information corresponds to one feature vector;
and inputting the plurality of feature vectors into a generator in the picture generation model to generate corresponding images, and obtaining scene segments composed of the images.
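Claims 19-20 only state that the picture generation model is a generative adversarial network whose generator turns feature vectors into images; the PyTorch fragment below is one possible generator-only sketch, with the layer sizes, image resolution, and random feature vectors chosen arbitrarily for illustration:

import torch
import torch.nn as nn

class TinyGenerator(nn.Module):
    def __init__(self, feature_dim=32, height=64, width=64):
        super().__init__()
        self.height, self.width = height, width
        self.net = nn.Sequential(
            nn.Linear(feature_dim, 256), nn.ReLU(),
            nn.Linear(256, 3 * height * width), nn.Tanh(),   # pixel values in [-1, 1]
        )

    def forward(self, feature_vectors):          # one feature vector per piece of key information
        images = self.net(feature_vectors)
        return images.view(-1, 3, self.height, self.width)

features = torch.randn(8, 32)                    # stand-ins for mapped key information
frames = TinyGenerator()(features)               # frames that would make up one scene segment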
21. The method of any one of claims 13 to 20, wherein after acquiring scenario adaptation information for at least one target video generation area of respective video generation areas associated with the original video, before updating the original scenario analysis data based on the scenario adaptation information, the method further comprises:
If the scenario adaptation information comprises scenario custom content, carrying out semantic logic analysis on the scenario custom content;
and if it is determined that the scenario custom content does not conform to semantic logic, sending prompt information to the terminal device to prompt that the scenario custom content needs to be edited again.
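In practice the semantic logic analysis of claim 21 would rely on an NLP model; the trivial heuristic below only shows the control flow assumed here — check the scenario custom content and return prompt information when it fails:

def check_custom_content(text):
    problems = []
    if not text.strip():
        problems.append("the scenario custom content is empty")
    elif len(text.split()) < 3:
        problems.append("the scenario custom content is too short to describe a plot development")
    return {"ok": not problems, "prompt": "; ".join(problems) or None}

result = check_custom_content("Hero")
# when result["ok"] is False, result["prompt"] would be sent to the terminal device
# to prompt that the scenario custom content needs to be edited again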
22. A scenario adaptation apparatus for video, comprising:
a first response unit, configured to respond to an adaptation operation triggered for an original video to be adapted, and present a selection interface, where the selection interface includes: each video generation area associated with the original video; wherein each video generation region is associated with a set of video segments in the original video;
a second response unit, used for responding to a selection operation triggered for the respective video generation regions and presenting an editing interface, the editing interface comprising: original scenario information of video clip sets respectively associated with at least one selected target video generation region;
and a third response unit, used for responding to a scenario adaptation operation triggered in the editing interface, acquiring corresponding scenario adaptation information, and presenting a target video generated based on the acquired at least one original scenario information and the scenario adaptation information; wherein the target video is adapted based on the video clip sets respectively associated with the at least one target video generation region.
23. A scenario adaptation apparatus for video, comprising:
the analysis unit is used for analyzing the script text and the picture data of the original video to be adapted to obtain original scenario analysis data corresponding to the original video;
an updating unit, used for acquiring scenario adaptation information of at least one target video generation region among the respective video generation regions associated with the original video, updating the original scenario analysis data based on the scenario adaptation information, and generating corresponding new scenario analysis data; wherein the scenario adaptation information is adaptation information of the original scenario information of the video clip sets respectively associated with the at least one target video generation region;
and an adaptation unit, used for generating, based on the new scenario analysis data, a new script text and scene segments that conform to the scenario adaptation information, and synthesizing the scene segments into a target video based on the new script text; wherein the target video is adapted based on the video segment sets respectively associated with the at least one target video generation region.
24. An electronic device comprising a processor and a memory, wherein the memory stores a computer program which, when executed by the processor, causes the processor to perform the steps of the method of any of claims 1 to 21.
25. A computer readable storage medium, characterized in that it comprises a computer program for causing an electronic device to perform the steps of the method of any one of claims 1-21 when said computer program is run on the electronic device.
CN202310275923.7A 2023-03-20 2023-03-20 Scenario recomposition method and device for video, electronic equipment and storage medium Pending CN116980718A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310275923.7A CN116980718A (en) 2023-03-20 2023-03-20 Scenario recomposition method and device for video, electronic equipment and storage medium


Publications (1)

Publication Number Publication Date
CN116980718A true CN116980718A (en) 2023-10-31

Family

ID=88483824




Legal Events

Date Code Title Description
PB01 Publication