CN117241087A - Promotional video generation method, device, and storage medium - Google Patents
Promotional video generation method, device, and storage medium
- Publication number: CN117241087A (application CN202311002169.6A)
- Authority: CN (China)
- Prior art keywords: shot, video, user
- Legal status: Pending (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Classifications
- Y02D10/00 — Energy efficient computing, e.g. low power processors, power management or thermal management (Y: general tagging of new technological developments and cross-sectional technologies; Y02: technologies for mitigation or adaptation against climate change; Y02D: climate change mitigation technologies in information and communication technologies)
Abstract
The specification discloses a promotional video generation method, device, and storage medium. In response to a user's video generation request, keywords characterizing the video to be generated are determined; a first guide word for each shot (storyboard segment) of the video to be generated, a description text for each shot, and controls through which the user inputs the video material for each shot are then displayed. In response to the video materials for the shots that the user inputs through the controls, a target video is generated from the description texts and the video materials of the shots. Under this scheme, the user only needs to determine the keywords and input the video material for each shot to obtain the target video; no copywriting or video-editing skill is required, which greatly lowers the threshold and workload of video generation and improves its efficiency.
Description
Technical Field
The present disclosure relates to the field of computer technologies, and in particular, to a method, an apparatus, and a storage medium for generating a promotional video.
Background
Video is a common multimedia form: it presents sound and pictures simultaneously and therefore conveys information efficiently. With the development of Internet technology, consumers obtain a great deal of information through video, so merchants can be recommended to consumers by displaying the merchants' promotional videos. How to generate promotional videos is therefore a problem to be solved.
Currently, a merchant can edit raw footage with video-editing tools to obtain a promotional video.
However, even with video-editing tools, the merchant still needs editing skills to produce the video. In addition, a promotional video must display not only pictures but also description text, which is usually written manually by the merchant. Together, these greatly raise the threshold and workload of generating promotional videos.
Disclosure of Invention
The present disclosure provides a promotional video generation method, device, and storage medium to partially solve the above problems in the prior art.
The technical solution adopted in this specification is as follows:
The specification provides a promotional video generation method, comprising the following steps:
determining, in response to a user's video generation request, keywords characterizing the video to be generated;
displaying a first guide word for each shot of the video to be generated, a description text for each shot, and controls through which the user inputs the video material for each shot, wherein the first guide word guides the user to input the video material for each shot through the controls, and the description texts are generated from the keywords and the comment texts that are associated with the user and correspond to the keywords;
and in response to the video materials for the shots input by the user through the controls, generating a target video from the description texts and the video materials of the shots.
Optionally, before displaying the first guide word for each shot of the video to be generated, the description text for each shot, and the controls for the video material of each shot, the method further comprises:
displaying the description text for each shot, input fields through which the user can input update operations on the description texts, and a second guide word, wherein the second guide word guides the user to input the update operations through the input fields;
in response to the update operations input by the user through the input fields, updating the description texts accordingly to obtain an updated description text for each shot;
displaying the guide word for each shot of the video to be generated, the description text for each shot, and the controls through which the user inputs the video materials then specifically comprises:
in response to a confirmation operation input by the user, displaying the guide word for each shot of the video to be generated, the updated description text for each shot, and the controls through which the user inputs the video material for each shot.
Optionally, before displaying the first guide word for each shot of the video to be generated, the description text for each shot, and the controls for the video material of each shot, the method further comprises:
searching for the comment texts that are associated with the user and correspond to the keywords;
obtaining the description texts from the keywords, the comment texts, and a pre-trained natural language model;
acquiring the attributes of the user and determining the shots of the video to be generated from those attributes;
and determining, from the description texts, the description text corresponding to each shot.
Optionally, searching for the comment texts associated with the user and corresponding to the keywords specifically comprises:
searching for the historical services executed by the user according to the acquired attributes of the user;
acquiring, from a service platform, candidate comment texts related to the services executed by the user;
and matching each candidate comment text against the keywords, and taking the candidate comment texts that hit the keywords as the comment texts associated with the user and corresponding to the keywords.
Optionally, determining the description text corresponding to each shot from the description texts specifically comprises:
generating a target prompt text from the shots and the description texts;
and inputting the target prompt text into the pre-trained natural language model to obtain the description text, output by the model, that corresponds to each shot.
Optionally, generating the target video from the description texts and the video materials of the shots specifically comprises:
for each shot, determining a candidate video for the shot from the video material corresponding to the shot;
inputting the description text of the shot into a pre-trained speech synthesis model to obtain the audio for the shot output by the model;
generating, from the description text of the shot, subtitles for the shot whose duration equals that of the shot's audio;
generating a video clip for the shot from the shot's candidate video, subtitles, and audio;
and splicing the video clips of the shots to obtain the target video.
Optionally, the video material includes videos and images;
for each shot, determining the candidate video for the shot from the video material corresponding to the shot specifically comprises:
for each shot, when the video material corresponding to the shot is determined to be a video, taking that video material as the candidate video for the shot;
and when the video material corresponding to the shot is determined to be an image, converting it into a video of preset duration and taking that video as the candidate video for the shot.
Optionally, generating the video clip for the shot from the shot's candidate video, subtitles, and audio specifically comprises:
editing the candidate video according to the difference between the duration of the shot's audio and the duration of the candidate video;
and generating the video clip for the shot from the shot's subtitles, the shot's audio, and the edited candidate video.
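As an illustration of the synthesis steps above, the following is a minimal sketch in Python, assuming moviepy 1.x (TextClip additionally requires ImageMagick). It assumes the per-shot audio has already been produced by the speech synthesis model and written to a file, so only the duration alignment, subtitling, and splicing are shown; all names are illustrative, not part of this specification.

```python
# Minimal sketch, assuming moviepy 1.x; not the specification's implementation.
from moviepy.editor import (AudioFileClip, CompositeVideoClip, TextClip,
                            VideoFileClip, concatenate_videoclips)

def build_shot_clip(video_path, audio_path, description):
    audio = AudioFileClip(audio_path)   # TTS audio for this shot's description text
    video = VideoFileClip(video_path)
    # Edit the candidate video according to the difference between the audio
    # duration and the video duration (simple trim; looping/slowdown omitted).
    video = video.subclip(0, min(video.duration, audio.duration))
    # Subtitle displayed for the same duration as the shot's audio.
    subtitle = (TextClip(description, fontsize=36, color="white")
                .set_position(("center", "bottom"))
                .set_duration(video.duration))
    return CompositeVideoClip([video, subtitle]).set_audio(audio)

def build_target_video(shots, out_path):
    # shots: list of (video_path, audio_path, description_text) tuples, in shot order.
    clips = [build_shot_clip(v, a, d) for v, a, d in shots]
    concatenate_videoclips(clips).write_videofile(out_path)  # splice into the target video
```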
The specification provides a promotional video generation device, comprising:
a keyword determination module, configured to determine, in response to a user's video generation request, keywords characterizing the video to be generated;
a display module, configured to display a first guide word for each shot of the video to be generated, a description text for each shot, and controls through which the user inputs the video material for each shot, wherein the first guide word guides the user to input the video material for each shot through the controls, and the description texts are generated from the keywords and the comment texts that are associated with the user and correspond to the keywords;
and a target video generation module, configured to generate, in response to the video materials for the shots input by the user through the controls, a target video from the description texts and the video materials of the shots.
The specification provides a computer-readable storage medium storing a computer program which, when executed by a processor, implements the above promotional video generation method.
The specification provides an electronic device comprising a memory, a processor, and a computer program stored on the memory and executable on the processor, wherein the processor implements the above promotional video generation method when executing the program.
The at least one technical solution adopted in this specification can achieve the following beneficial effects:
In the promotional video generation method, keywords characterizing the video to be generated are determined in response to a user's video generation request, and a first guide word for each shot of the video to be generated, a description text for each shot, and controls through which the user inputs the video material for each shot are displayed; in response to the video materials input by the user through the controls, a target video is generated from the description texts and the video materials of the shots. Under this scheme, the user only needs to determine the keywords and input the video material for each shot to obtain the target video; no copywriting or video-editing skill is required, which greatly lowers the threshold and workload of video generation and improves its efficiency.
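To summarize the flow concretely, the outline below sketches the three steps as they might be orchestrated; it is purely illustrative, and every helper function in it is a hypothetical stand-in for a component described in this specification, not an actual API.

```python
# Illustrative outline only; all helpers are hypothetical stand-ins.
def generate_promotional_video(user, request):
    # S100: determine keywords characterizing the video to be generated.
    keywords = determine_keywords(request)
    # S102: generate description texts from the keywords and the user's comment
    # texts, then display per-shot guide words, description texts, and controls.
    comments = find_comment_texts(user, keywords)
    shots = plan_shots(user)                    # shot list from the user's attributes
    texts = generate_description_texts(keywords, comments, shots)
    materials = collect_materials_via_controls(shots, texts)   # user uploads
    # S104: synthesize per-shot clips and splice them into the target video.
    return synthesize_target_video(shots, texts, materials)
```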
Drawings
The accompanying drawings, which are included to provide a further understanding of the specification, illustrate exemplary embodiments of the specification and, together with the description, serve to explain it; they do not unduly limit the specification. In the drawings:
Fig. 1 is a flow chart of a promotional video generation method in this specification;
Fig. 2 is a schematic diagram of a display interface in this specification;
Fig. 3 is a flow chart of a promotional video generation method in this specification;
Fig. 4 is a schematic diagram of a display interface in this specification;
Fig. 5 is a flow chart of a promotional video generation method in this specification;
Fig. 6 is a schematic diagram of a promotional video generation device provided in this specification;
Fig. 7 is a schematic diagram of the electronic device corresponding to Fig. 1 provided in this specification.
Detailed Description
For the purposes of making the objects, technical solutions and advantages of the present specification more apparent, the technical solutions of the present specification will be clearly and completely described below with reference to specific embodiments of the present specification and corresponding drawings. It will be apparent that the described embodiments are only some, but not all, of the embodiments of the present specification. All other embodiments, which can be made by one of ordinary skill in the art without undue burden from the present disclosure, are intended to be within the scope of the present disclosure.
In addition, all actions of acquiring signals, information, or data in this specification are performed in compliance with the applicable local data protection regulations and with the authorization of the corresponding device owner.
Promotional videos, such as store introduction videos, product promotion videos, and commercials, are widely used in merchants' store promotion as core content in information streams. Compared with image-and-text advertising, video advertising carries denser information, and elements such as visuals, audio, and rhythm attract consumers' attention, effectively enhancing the appeal of the advertisement and the distinctiveness of the brand.
However, although current video-editing tools offer a variety of visual effects, they still require the merchant to have editing skills. Moreover, creating a promotional video involves not only visual processing such as editing the raw footage, but also writing a video script and promotional copy and adding them to the video, which demands strong writing and operational skills from the merchant. The threshold, workload, and cost of current promotional video generation schemes are therefore high.
Based on the above, this specification provides a promotional video generation method: the user only needs to determine keywords and input the video material for each shot of the video to be generated, and a high-quality video can be produced without video editing or copywriting, which greatly lowers the threshold and workload of video generation and reduces labor cost.
The following describes in detail the technical solutions provided by the embodiments of this specification with reference to the accompanying drawings.
Fig. 1 is a flow chart of a promotional video generation method in this specification, specifically comprising the following steps:
S100: In response to a user's video generation request, determine keywords characterizing the video to be generated.
The method may be executed by an electronic device such as a video generation server. The electronic device that pre-trains the natural language model involved in the method may be the same as or different from the electronic device that executes the method; this specification does not limit this. For ease of explanation, the method is described here with a server as the executing body. The user interacts with the server through a client that includes at least an input device and a display device: the user sends requests, information, or operations to the server through the client, and the server sends information back so the client can display it. The client may be a mobile terminal such as a mobile phone or tablet computer, or a fixed terminal device; this specification does not limit this.
In practice, compared with traditional image-and-text promotion, a promotional video attracts consumers' attention and shows the merchant's product or service vividly, stimulating consumers' interest and curiosity and letting them understand the product's features, functions, and advantages more intuitively. Promotional videos also spread through more channels, such as third-party video platforms, social media, public places, exhibitions, and television; this specification does not limit the distribution channel of the generated target video.
In this specification, the technical solution is described with the user being a merchant and the generated target video being the merchant's promotional video. This does not mean the user must be a merchant; the user may be anyone who needs to generate a promotional video, and this specification does not limit this.
Generally, when users promote their services or goods, they want the generated target video to highlight one or more emphasis points, in order to stress their advantages, provide clear information, meet the needs of target consumers, and shape brand image and awareness. The emphasis points are usually characteristics (advantages or features) of the product or service, such as comprehensive functions, high service quality, good consumer experience, a pleasant store environment, or strong promotions. Thus, when the user wants to highlight such emphasis points in the video to be generated, they can serve as the keywords characterizing the video, and the keywords can be used in generation so that the final target video highlights them.
In this specification, several keywords may be compiled in advance from prior promotion demands. These keywords characterize the video to be generated, i.e., they describe the core promotion points the user wants to convey: the one or more emphasis points of the product or service to be highlighted in the video.
Optionally, N keywords are compiled from the different prior promotion demands, denoted E = {e_1, e_2, ..., e_N}.
On this basis, when the server receives a video generation request sent by the user through the client, it can determine the keywords characterizing the video to be generated. The keywords may be determined in advance from core promotion points entered in the user's history, or may be input by the user through the client and carried in the video generation request. The number of keywords may be one or several, and may be a preset fixed number or a variable the user selects flexibly; this specification does not limit this.
Optionally, the keywords include one or more of: high cost performance, good environment, and good service.
Alternatively, to improve the personalization of the target video, the user may select the core promotion points to be emphasized, i.e., choose keywords according to the user's own preferences and needs. Specifically, in response to a video generation request input by the user, several candidate keywords are displayed, different candidate keywords corresponding to different characteristics of the video to be generated; in response to a selection operation input by the user, one or more keywords are selected from the candidates; and in response to a confirmation operation input by the user, the selected keywords are taken as the keywords characterizing the video to be generated. Optionally, the one or more keywords selected by the user are taken as the main promotion points of the video and denoted e.
For example, as shown in Fig. 2, the display page presents three candidate keywords: "cost performance", "good service", and "good environment". The user may select one or more of them according to personal preference and need. After selecting, the user can input a confirmation operation through a displayed confirmation control; upon receiving it, the server takes the selected keywords as the keywords characterizing the video to be generated.
Optionally, when the candidate keywords are determined from the needs of other reference users, the displayed candidates may not meet this user's needs. To handle this, a typing field may be displayed alongside the candidate keywords, through which the user can type in keywords. Specifically, in response to a video generation request input by the user, several candidate keywords and a typing field are displayed, different candidate keywords corresponding to different promotion points; upon receiving a selection operation, keywords are selected from the candidates accordingly; and upon receiving text entered through the typing field, that text is taken as a keyword.
Alternatively, when the generated target video is to promote the user's product or service from multiple dimensions, the keywords the user selected (and/or typed) may be expanded from the candidate keywords even if the user chose only one. Specifically, after S100 and before S102 below, the method further comprises: when the number of keywords is determined to be one, acquiring the candidate keywords, each corresponding to a promotion point different from the others and from the selected keyword's; and selecting a preset number of target keywords from the candidates and expanding them into the keywords, as sketched below.
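A minimal sketch of this expansion step follows; the candidate list and the preset count are assumptions for illustration only.

```python
# Minimal sketch of keyword expansion; names and values are illustrative.
PRESET_EXPANSION_COUNT = 2

def expand_keywords(selected, candidates):
    # Candidate keywords whose promotion points differ from the selected keyword's.
    remaining = [c for c in candidates if c not in selected]
    if len(selected) == 1:   # expand only when the user chose a single keyword
        selected = selected + remaining[:PRESET_EXPANSION_COUNT]
    return selected

# expand_keywords(["good service"], ["cost performance", "good service", "good environment"])
# -> ["good service", "cost performance", "good environment"]
```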
S102: Display a first guide word for each shot of the video to be generated, a description text for each shot, and controls through which the user inputs the video material for each shot. The first guide word guides the user to input, through the controls and according to the description text of each shot, the video material for that shot; the description texts are generated from the keywords and the comment texts that are associated with the user and correspond to the keywords.
In practice, generating a target video requires not only shooting the corresponding video material but also writing description text for it, i.e., text describing the characteristics and advantages of the user's product or service. The video clips in the target video can then be arranged progressively, layer by layer, following the description texts, enhancing the playback effect of the target video.
At present, description texts (promotional copy, video scripts, and the like) are usually written manually by users, which entails a heavy workload and demands strong writing skills. This specification therefore automatically generates the description texts serving as the video script and promotional copy, lowering the user's threshold and workload. To ensure the description texts describe the user's product or service in sufficient detail, they can be generated from the keywords determined in S100. However, because keywords contain little text, description texts generated from keywords alone cannot accurately reflect the characteristics and advantages of the product or service. Besides the keywords serving as the highlighted promotion points, other text describing the user's products or services is therefore needed as a supplement, to improve the accuracy and enrich the dimensions of the description texts. To this end, this specification searches for the comment texts associated with the user and corresponding to the keywords and uses them to supplement the keywords, improving the quality of the generated description texts and enriching the target video's content.
Specifically, the services executed by the user can be looked up on the service platform, and the comment texts related to those services can then be found in the corresponding service records. In practice, a user (merchant) providing a product or service to a consumer can be regarded as the user and the consumer jointly executing a service. The product or service may be provided online and/or offline; either way, it can be executed through a service platform. When the user executes a service through the platform, the platform generates a service record for it, recording the execution time, the service type, the information on the product or service provided to the consumer, the execution result, and the consumer's comment text on the service.
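For illustration, a service record as described above might be represented as follows; the field names are assumptions, not taken from the specification.

```python
# Assumed representation of a service record; field names are illustrative.
from dataclasses import dataclass

@dataclass
class ServiceRecord:
    executed_at: str     # execution time of the service
    service_type: str    # e.g. "takeout", "accommodation", "nail care"
    offering_info: str   # the product or service provided to the consumer
    result: str          # execution result, e.g. delivered or checked out
    comment_text: str    # the consumer's comment text on the service
```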
For example, when the user is a merchant providing takeout meals, a consumer orders takeout through the takeout platform and the merchant provides the meal; this is a service executed by the user, and the takeout platform generates a service record for it, recording the order time, the meal information, whether the takeout was delivered, and the consumer's comment on the meal after delivery. Likewise, when the user is a merchant providing accommodation, a consumer books a room through the booking platform and the merchant provides the accommodation; the platform generates a service record recording the booking time, the check-in time, the room information, the check-out event, and the consumer's comment on the room after check-out. Similarly, when the user is a merchant providing nail-care services, after serving consumers offline, the consumers can rate the service on the consumption platform, and the platform generates a service record from the service provided and the comment text the consumers entered.
Thus, before the description texts are generated, the service records corresponding to the services executed by the user can be looked up on the service platform, and the comment texts associated with those services can be retrieved from the records. The comment texts corresponding to the keywords are then screened out of the retrieved texts, yielding the comment texts associated with the user and corresponding to the keywords. The screening may use any existing method, such as rule matching or text-vector similarity; this specification does not limit this.
Further, after the comment texts associated with the user and corresponding to the keywords are determined, the description texts can be generated from the keywords and those comment texts. Each generated description text takes the keywords as its emphasis (core promotion points) and, combined with the comment texts, describes the characteristics and advantages of the product or service fluently; these are generally similar to the keywords and the comment texts. For example, if the keyword is "good service" and every comment text hits that keyword, the theme and central idea of the description texts generated from them is also good service.
Further, to ensure the pictures in the generated target video show real, existing products and services actually provided by the user, the user still needs to shoot and input real video material in this step. Therefore, the first guide word for each shot of the video to be generated and the controls through which the user inputs the video material for each shot can be displayed on the user's client. Guided by the first guide words, the user uploads through the controls the video material for each shot, which may be one or more video clips and/or images; this specification does not limit this. A shot refers to the content depicted in a given time period of the video to be generated, and different shots can correspond to different shot types, such as general introduction, environment introduction, equipment introduction, service introduction, and cost-performance introduction, so that different time periods of the video describe different content. The video durations of the shots may be the same or different; this specification limits neither the duration per shot nor the number of shots.
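Correspondingly, the per-shot unit displayed to the user (guide word, description text, and upload control) might be modeled as below; again, all names are illustrative assumptions.

```python
# Assumed per-shot display unit; names are illustrative.
from dataclasses import dataclass, field

@dataclass
class Shot:
    shot_type: str         # e.g. "environment introduction", "service introduction"
    guide_word: str        # first guide word shown for this shot
    description_text: str  # auto-generated description text for this shot
    materials: list = field(default_factory=list)  # videos/images uploaded via the control
```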
In the above scheme, the description texts are generated from the keywords and the comment texts associated with the user and corresponding to the keywords, so the description text for each shot can be determined before the user is guided to input the shot's video material. When the controls for inputting the video materials are displayed, the description text for each shot is displayed as well, prompting the user to input, at the control for each shot, video material related to that shot's description text. In addition, the first guide word for each shot is displayed, guiding the user to input, through the corresponding control and according to the shot's description text, the video material for that shot. Under the guidance of the first guide words and the description texts, the user inputs the material for different shots at different controls, shooting real video clips or images in sequence, so the pictures in the final target video are real and objective, improving the credibility of the target video.
Fig. 4 shows an alternative display interface. The first display area 1000, corresponding to the first shot, contains the first shot's description text 1001, control 1002, and first guide word 1003; the second display area 2000, corresponding to the second shot, contains the second shot's description text 2001, control 2002, and guide word 2003. The first shot's guide word 1003 reads "Shoot the environment: following the description text below, aim at the merchant's signboard or hall and pan once from left to right." Its purpose is to guide the user to follow the first description text 1001, "The store's environment is very good; the dim lighting and spacious room make the whole atmosphere relaxing," shoot the video material describing the store environment according to the guiding action indicated by guide word 1003 (aim at the signboard or hall, pan once from left to right), and upload the footage through the first control 1002. Similarly, the second shot's guide word 2003 reads "Shoot the service introduction: following the description text below, film a complete service scene along one direction." Its purpose is to guide the user to follow the second description text 2001, "The store's service is excellent and makes customers feel attentively cared for," shoot the video material describing the store's service along one direction as guide word 2003 indicates, and upload the footage through the second control 2002.
Optionally, before the description text for each shot is displayed in this step, the following may be displayed: the description text for each shot, input fields through which the user can input update operations on the description texts, and a second guide word guiding the user to input those update operations through the fields; the update operations include adding to, modifying, and deleting the description texts. Accordingly, in response to the update operations input through the fields, the description texts are updated to obtain the updated description text for each shot. Displaying the description texts prompts the user to update them: if the automatically generated texts do not fully meet the user's needs or preferences, the update operations let the user customize them so the description texts used to generate the target video better match the user's expectations. Moreover, given the limits of current automatic text generation, the generated description texts may still contain errors or omissions; the user's update operations can correct errors or supplement details not yet described, improving the credibility and quality of the subsequently generated target video.
Optionally, the automatically generated description texts are denoted speech_gen. They are fed back to the user, and after the user reviews them and applies update operations, the updated description text is denoted speech_final, which may consist of n sentences, i.e., speech_final = [s_1, s_2, ..., s_n].
Further, in response to a confirmation operation input by the user, the guide word for each shot of the video to be generated, the updated description text for each shot, and the controls for the video materials are displayed. This reduces the user's copywriting workload and lowers the writing threshold while still accommodating personalized needs: the description texts match the user's preferences and actual needs, the generated target video is personalized rather than overly similar to other users' videos, and the quality of the description texts, and hence the credibility and quality of the target video, is further improved.
S104: In response to the video materials for the shots input by the user through the controls, generate a target video from the description texts and the video materials of the shots.
Specifically, the video material input by the user for each shot through the controls can include video, images, and audio. The video and images are used to generate the pictures contained in the target video, and the audio may be background music or sound effects. The input material need not be all video; it may also contain images.
Optionally, although an image is a static picture and conveys less than a video, it can still transmit information when presented together with the description text. Since the final target video is in video format, images must be converted to video before the target video is generated from them. Specifically, for each shot, a candidate video is determined from the shot's video material; a video clip is generated for the shot from its candidate video; and the video clips of the shots are spliced to obtain the target video. When generating a shot's video clip from its candidate video, audio such as background music and sound effects may be added, as may visual effects or visual processing such as color correction; the added audio or visual processing may be found automatically in a material library or input by the user, and this specification does not limit this.
For each shot, when the video material corresponding to the shot is determined to be a video, that material is taken as the shot's candidate video.
When the video material corresponding to the shot is determined to be an image, it is converted into a video of preset duration, which is taken as the shot's candidate video.
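A minimal sketch of this candidate-video rule, again assuming moviepy 1.x, with the preset duration as an assumed value:

```python
# Minimal sketch of candidate-video determination, assuming moviepy 1.x.
from moviepy.editor import ImageClip, VideoFileClip

PRESET_IMAGE_DURATION = 3.0  # seconds; an assumed preset value

def candidate_video(material_path):
    if material_path.lower().endswith((".png", ".jpg", ".jpeg")):
        # Image material: convert into a still video of preset duration.
        return ImageClip(material_path, duration=PRESET_IMAGE_DURATION)
    # Video material: use it directly as the candidate video for the shot.
    return VideoFileClip(material_path)
```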
With the promotional video generation method shown in Fig. 1, keywords characterizing the video to be generated are determined in response to a user's video generation request; a first guide word for each shot, a description text for each shot, and controls through which the user inputs the video material for each shot are displayed; and, in response to the video materials input through the controls, a target video is generated from the description texts and the video materials of the shots.
With this method, the user only needs to determine the keywords and input the video material for each shot to obtain the target video; no copywriting or video-editing skill is required, which greatly lowers the threshold and workload of video generation and improves its efficiency.
In one or more embodiments of this specification, the video to be generated in step S102 includes several shots. To improve the accuracy of the video material the user inputs, the description texts generated from the keywords and the user-associated comment texts can be mapped to the shots, i.e., the description text corresponding to each shot is determined. This may be implemented as follows, as shown in Fig. 3:
S200: Search for the comment texts that are associated with the user and correspond to the keywords.
Specifically, as described above for S102, the comment texts supplement the keywords and are used together with them to generate the description texts, so the comment texts must come from services executed by the user, and the service content they describe should be the same as or similar to the keywords. The services executed by the user can be looked up on the service platform, and the comment texts related to those services can then be found in the corresponding service records. In a takeout scenario, a consumer orders from a takeout merchant and can comment on the takeout after the meal, so the service record of the merchant's takeout service contains the comment text. In an accommodation-booking scenario, a consumer books a room at a hotel and can comment on the accommodation service after check-out, so the service record of the hotel's accommodation service contains the consumer's comment text. Thus, the service records corresponding to the user's services can be looked up on the platform and the associated comment texts retrieved from them. The comment texts corresponding to the keywords are then screened out of the retrieved texts, yielding the comment texts associated with the user and corresponding to the keywords. The screening may use any existing method, such as rule matching or text-vector similarity; this specification does not limit this.
In an optional embodiment of this specification, the comment texts in step S102 come from the service records of the user's services, but not every comment text found there is necessarily related to the determined keywords. The retrieved comment texts must therefore be screened against the keywords to obtain the comment texts associated with the user and corresponding to the keywords. Two screening schemes follow:
the first scheme is as follows: and taking comment texts related to the service executed by the user, which are acquired from the service platform, as candidate comment texts, respectively matching the candidate comment texts with the keywords, and taking the candidate comment texts hit the keywords in the candidate comment texts as comment texts corresponding to the keywords. In this scenario, candidate comment text hit against a keyword is typically a synonym or paraphrase containing the keyword or keywords. Therefore, whether the candidate comment text hits the keyword is determined, that is, whether the candidate comment text contains the keyword or the co (or near) sense word of the keyword is determined. For example, the keywords are: the candidate comment text is 'the environment of the store is very good, the atmosphere is very relaxed, and people feel comfortable and free', and the candidate comment text contains keywords, so that the candidate comment text can be used as the comment text corresponding to the keywords.
The second scheme: input the keyword into a pre-trained semantic extraction model to obtain the keyword's first semantics; take the comment texts related to the user's services acquired from the platform as candidate comment texts, input each into the semantic extraction model, and obtain the second semantics of each candidate; then, for each candidate, determine the difference between its second semantics and the keyword's first semantics, and take the one or more candidates with the smallest difference as the comment texts corresponding to the keyword. The pre-trained semantic extraction model extracts the semantics of input text; its training samples may be texts from a general corpus labeled with their semantics, and this specification limits neither its model structure nor its training process. Taking the candidates with the smallest semantic difference as the corresponding comment texts effectively screens out the comments closest in meaning to the keyword, improving the match between comment texts and keywords. A sketch of this screen follows.
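In the sketch below, the sentence-transformers package stands in for the pre-trained semantic extraction model; this is an assumption, and any encoder that yields comparable sentence embeddings would serve.

```python
# Minimal sketch of semantic screening; sentence-transformers is an assumption.
import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("paraphrase-multilingual-MiniLM-L12-v2")

def screen_by_semantics(candidates, keyword, top_m):
    kw_vec = model.encode([keyword])[0]    # "first semantics" of the keyword
    cand_vecs = model.encode(candidates)   # "second semantics" of each candidate
    # Higher cosine similarity = smaller difference from the keyword's semantics.
    sims = cand_vecs @ kw_vec / (np.linalg.norm(cand_vecs, axis=1)
                                 * np.linalg.norm(kw_vec))
    order = np.argsort(-sims)              # most similar candidates first
    return [candidates[i] for i in order[:top_m]]
```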
Optionally, the comment texts associated with the user are searched and the M comment texts corresponding to the keywords are retained, so the comment texts associated with the user and corresponding to the keywords are denoted C = {c_1, c_2, ..., c_M}.
S202: Obtain the description texts from the keywords, the comment texts, and the pre-trained natural language model.
Specifically, the keywords and the comment texts can be spliced together as the input of the pre-trained natural language model, so that the model outputs the description texts based on them.
Since the description texts are generated from the comment texts associated with the user and corresponding to the keywords, in an optional embodiment a specified prompt text is generated from the keywords and the comment texts and input into the pre-trained natural language model to obtain the description texts.
Specifically, the pre-trained natural language model is obtained through repeated iterative training on a general corpus. This specification does not limit the model structure; it may be a Generative Pre-trained Transformer (GPT), BERT (Bidirectional Encoder Representations from Transformers), a Transformer, a large language model (LLM), or the like.
Because the pre-trained natural language model has learned the concepts, syntactic structures, and context of a language through extensive pre-training on general language data, it encodes rich semantic and grammatical knowledge. It therefore has emergent ability and in-context learning (ICL) ability: even if it never directly encountered knowledge specific to description-text generation during training, upon receiving a prompt text it can use the prompt as a guide, learn the context, and generate description texts related to the input prompt.
The specified prompt text may be generated from a template pre-written according to certain rules. After the keywords and the comment texts associated with the user and corresponding to the keywords are determined, they are respectively filled into the preset slots of a target template to obtain the specified prompt text.
For example, the pre-written blank template is: "Please refer to the following store comments to write a promotional copy for this store, which mainly highlights the advantage of {?}. Comment 1: {??}; Comment 2: {??}; ... Comment N: {??}." Suppose the determined keyword is "good service", and the comment texts corresponding to the keyword are "Service is considerate and excellent", "The staff's service attitude is very enthusiastic", and "The store has many customers, but the staff's service is still very good". The slot corresponding to the keyword is "{?}" and the slots corresponding to the comment texts are "{??}", so the generated specified prompt text is: "Please refer to the following store comments to write a promotional copy for this store, which mainly highlights the advantage of good service. Comment 1: Service is considerate and excellent; Comment 2: The staff's service attitude is very enthusiastic; Comment 3: The store has many customers, but the staff's service is still very good."
Alternatively, the keyword e and the comment texts C are spliced through a specific format F_speech into the specified prompt text, recorded as p_speech. An alternative formula is as follows:

p_speech = F_speech(base, e, C)

where base denotes the pre-written blank template.
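As an illustration only, the following Python sketch shows one possible form of F_speech; the function name `f_speech`, the use of a single {??} slot standing for the whole comment list, and the template wording are assumptions, not the method's fixed implementation.

```python
def f_speech(base: str, e: str, C: list[str]) -> str:
    # Fill the keyword slot {?}, then expand the {??} slot into the
    # numbered comment list, mirroring the worked example above.
    comments = " ".join(f"Comment {i}: {c};" for i, c in enumerate(C, 1))
    return base.replace("{?}", e).replace("{??}", comments)

# Usage under the stated assumptions:
base = ("Please refer to the following store comments to write a promotional "
        "copy for this store, which mainly highlights the advantage of {?}. {??}")
p_speech = f_speech(base, "good service",
                    ["Service is considerate and excellent",
                     "The staff's service attitude is very enthusiastic"])
```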
Further, the specified prompt text is input into the pre-trained natural language model to obtain each description text output by the natural language model.

Optionally, a large language model (Large Language Model, LLM) is invoked, and the specified prompt text p_speech is input to generate the description texts, recorded as speech_gen. An alternative formula is as follows:

speech_gen = LLM(p_speech)
S204: and acquiring the attribute of the user, and determining each sub-mirror of the video to be generated according to the attribute of the user.
In practical applications, the products or services provided by different users differ greatly, so the content displayed by the target videos generated for different users also differs greatly, and thus the sub-mirrors contained in the target videos differ as well. For this purpose, different sub-mirror templates may be preset for different types of services, each sub-mirror template containing a plurality of sub-mirrors of the video to be generated. For example, the sub-mirror template corresponding to a takeout business may contain sub-mirrors such as "shoot the storefront", "shoot the raw materials", "shoot the preparation process", "shoot the product display", "shoot the environment", and "shoot the delivery". The sub-mirror template corresponding to a hotel accommodation business may contain sub-mirrors such as "hotel exterior", "hotel room close-up", "hotel facilities", "hotel service", "check-in photo spot", "hotel activities and entertainment", and "scenic spots near the hotel". The sub-mirror template corresponding to a nail-art service business may contain sub-mirrors such as "shoot the storefront", "shoot the hall", "shoot the nail-art wall", "shoot the manicure process", and "shoot the view outside the window".
Therefore, the attributes of the user can be acquired, the type of service executed by the user can be determined, the sub-mirror template matching that type can be found among the preset sub-mirror templates, and each sub-mirror of the video to be generated can be determined based on the sub-mirrors contained in the matched template. The attributes of the user include the type of service executed by the user (the industry or field of the merchant, such as catering, retail, accommodation, or services), information on the products or services provided by the user, the geographic location corresponding to the user, and the like. The attributes may be pre-filled and uploaded by the user, or may be extracted from the service executed by the user, which is not limited in this specification. For example, based on a takeout business executed by a merchant, the type of service executed by the merchant is determined to be takeout catering; when the merchant requests generation of a target video, the sub-mirror template matching the takeout catering type is found among the preset sub-mirror templates, and the subsequent steps are executed based on that template to generate the target video. A minimal lookup sketch is shown below.
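A minimal lookup sketch follows; the dictionary keys and the sub-mirror names inside it are assumptions paraphrasing the examples above.

```python
# Preset sub-mirror templates, keyed by business type (assumed key names).
SUB_MIRROR_TEMPLATES = {
    "takeout_catering": ["storefront", "raw materials", "preparation process",
                         "product display", "environment", "delivery"],
    "hotel_accommodation": ["hotel exterior", "room close-up", "facilities",
                            "service", "check-in photo spot",
                            "activities and entertainment", "nearby sights"],
    "nail_service": ["storefront", "hall", "nail-art wall",
                     "manicure process", "view outside the window"],
}

def sub_mirrors_for(user_attributes: dict) -> list[str]:
    # The business type is assumed to be part of the user's attributes.
    return SUB_MIRROR_TEMPLATES[user_attributes["business_type"]]
```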
S206: and determining the description text corresponding to each of the sub-mirrors from the description texts.
Further, in order to improve the matching degree between the video materials input by the user and the sub-mirrors, the description text corresponding to each sub-mirror can be displayed alongside the controls for inputting the video materials of that sub-mirror, so that, prompted by the description text, the user correctly inputs the video material for each sub-mirror. For this purpose, the description text corresponding to each sub-mirror must be determined from the description texts. This may be done by matching the order of the description texts one-to-one with the order of the sub-mirrors, by labeling each description text with a sub-mirror based on the pre-trained natural language model, or by treating each sub-mirror as a keyword and determining the corresponding description text through keyword matching or keyword semantic similarity, which is not limited in this specification.
For example, when determining the description text corresponding to each sub-mirror, the order of the description texts may be matched one-to-one with the order of the sub-mirrors: the sub-mirrors are arranged from the whole to the detail to obtain a first sequence, the description texts are arranged front to back to obtain a second sequence, and the sub-mirror corresponding to each description text is obtained by pairing positions in the two sequences. A sketch of this order-based pairing is shown below.
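A sketch of this order-based pairing, assuming the two sequences are equal in length:

```python
def align_by_order(sub_mirrors: list[str], descriptions: list[str]) -> dict:
    # Sub-mirrors arranged from the whole to the detail (first sequence) are
    # matched by position with the description texts in front-to-back order
    # (second sequence).
    assert len(sub_mirrors) == len(descriptions), "sequences must pair 1:1"
    return dict(zip(sub_mirrors, descriptions))
```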
In an alternative embodiment of this specification, the description text corresponding to each sub-mirror is text describing the features and advantages of the product or service provided by the user, generated based on the keywords and the comment texts associated with the user and corresponding to the keywords. To express the features and advantages of a product or service clearly from multiple dimensions, each generated description text is generally long. A user may therefore be unclear about how to shoot the video material corresponding to each description text (or which shot material actually conforms to it). By determining and displaying the sub-mirror corresponding to each description text, the user is told explicitly which video material needs to be shot, which further lowers the threshold for generating the target video and improves the accuracy, reliability, and quality of the finally generated target video. Based on this, an alternative scheme of S206 is implemented through the following steps:
First, generating a target prompt text according to each of the sub-mirrors and each of the description texts.
Specifically, in the foregoing step S204, the sub-mirrors of the video to be generated have already been determined according to the attributes of the user. The semantic analysis capability of the pre-trained natural language model can therefore be used to annotate each description text with its corresponding sub-mirror; since the description texts are used to generate the video, each description text needs to be annotated based on the sub-mirrors of the video to be generated.
Similar to the scheme in S202 of generating the description texts based on the pre-trained natural language model and the specified prompt text, this embodiment also uses the ICL capability of the pre-trained natural language model. In this scheme, the input of the pre-trained natural language model is the target prompt text, which contains the sub-mirrors and the description texts. The target prompt text may be generated from templates pre-written with rules; these templates generally differ from the templates used to generate the specified prompt text, since they are written for the purpose of labeling the description texts with sub-mirrors.
In this specification, the sub-mirrors of the video to be generated may come from the sub-mirror templates matched to users with different attributes as described in step S204, or from a preset fixed sub-mirror template, where the fixed template contains, for example, an environment sub-mirror, a facilities sub-mirror, a service sub-mirror, and a cost-performance sub-mirror. Of course, other sub-mirrors may also exist in the fixed template, such as a promotion introduction sub-mirror, a workflow introduction sub-mirror, a merchant brand story sub-mirror, and so on.
Taking the case where the sub-mirrors come from a fixed sub-mirror template, a pre-written blank template for the target prompt text is, for example: "Existing candidate sub-mirrors: [?] introduction, ..., [?] introduction. Now tell me the best-fitting sub-mirror for each sentence in the following text, in the format 'Sentence [i]: [sub-mirror] introduction'. (1) [??]; (2) [??]; ... (N) [??]." When the obtained candidate sub-mirrors are overall, environment, facilities, service, and cost performance, and the description sentences are (1) AAA, (2) BBB, (3) CCC, the slot corresponding to a sub-mirror is "[?]" and the slot corresponding to a description sentence is "[??]", so the generated target prompt text may be: "Existing candidate sub-mirrors: overall introduction, environment introduction, facilities introduction, service introduction, cost-performance introduction. Now tell me the best-fitting sub-mirror for each sentence in the following text, in the format 'Sentence [i]: [sub-mirror] introduction'. (1) AAA. (2) BBB. (3) CCC."
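As an illustration, the following sketch fills such a target prompt template; the function name `build_target_prompt` and the exact wording are assumptions.

```python
def build_target_prompt(sub_mirrors: list[str], sentences: list[str]) -> str:
    # List the candidate sub-mirrors, then the numbered description sentences,
    # mirroring the worked example above.
    mirrors = ", ".join(f"{y} introduction" for y in sub_mirrors)
    numbered = " ".join(f"({i}) {s}" for i, s in enumerate(sentences, 1))
    return (f"Existing candidate sub-mirrors: {mirrors}. "
            "Now tell me the best-fitting sub-mirror for each sentence in the "
            "following text, in the format 'Sentence [i]: [sub-mirror] "
            f"introduction'. {numbered}")
```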
Determining the sub-mirror corresponding to each description text provides a simpler and clearer guide when the description texts are displayed to the user, avoiding the user wasting too much time reading long description texts, or misunderstanding a description text and mistakenly uploading the wrong video material.
And secondly, inputting the target prompt text into the pre-trained natural language model to obtain the description text which is output by the natural language model and corresponds to each of the sub-mirrors.
Alternatively, it is determined that the video to be generated contains λ sub-mirrors, denoted Y = {y_1, y_2, ..., y_λ}. The description texts (the un-updated speech_gen, or the updated speech_final) and the sub-mirrors Y are spliced through a specific format F_script into a target prompt text, recorded as p_script. An alternative formula is as follows:

p_script = F_script(speech_final, Y)
Further, the LLM is invoked: the target prompt text p_script is input into the LLM so that it labels, in order, each of the n sentences in speech_final with the best-matching sub-mirror type y ∈ Y = {y_1, y_2, ..., y_λ}, and the description text generated from the labeling result is recorded as script_gen. An alternative formula is as follows:

script_gen = LLM(p_script)

The description text script_gen is then parsed (in a specific format G_script); after parsing succeeds, the sub-mirror corresponding to each description text is determined. A parsing sketch under an assumed output format is shown below.
Thus, the description text corresponding to each of the sub-mirrors can be obtained.
In addition, in practical applications, the information conveyed by a video may include not only visual information but also auditory information, that is, audio, including background music, sound effects, dubbing of the description texts, and the like. However, some scenes in which the target video is displayed may not support audio playback, for example because no matching audio playback device is available or because the display scene must be muted. Therefore, in the process of generating the target video, subtitles can be added to the video segment of each description text, so that the visual information conveyed by the target video contains both images and text. This enriches the types of visual information conveyed by the video, and even when the target video is displayed in a scene that does not support audio playback, the information the user wants to convey is not lost. Thus, in an alternative embodiment of this specification, step S104 may be implemented through the following steps, as shown in fig. 5:
S300: and aiming at each sub-mirror, determining candidate videos corresponding to the sub-mirrors according to the video materials corresponding to the sub-mirrors.
Specifically, the video materials input by the user for each description text include materials in video format and materials in image format. Materials in video format are screened out and used directly as candidate videos for the corresponding sub-mirrors; materials in image format are screened out and converted into videos of a preset duration, which are then used as candidate videos for the corresponding sub-mirrors. A minimal sketch of this screening follows.
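A minimal screening sketch; the `Material` type and the hypothetical `image_to_video` renderer are assumptions introduced for illustration.

```python
from dataclasses import dataclass

@dataclass
class Material:
    kind: str  # "video" or "image"
    path: str

def to_candidate_video(material: Material, preset_duration: float,
                       image_to_video) -> str:
    # Videos pass through unchanged; images are rendered as clips of the
    # preset duration. `image_to_video` is a hypothetical renderer, e.g.
    # repeating the still frame for `preset_duration` seconds.
    if material.kind == "video":
        return material.path
    if material.kind == "image":
        return image_to_video(material.path, preset_duration)
    raise ValueError(f"unsupported material kind: {material.kind}")
```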
Optionally, the video materials of the sub-mirrors input by the user in step S104 are V = [v_1, v_2, ..., v_n]. An alternative formula may be as follows:

v_i = H(y_i, s_i)

where H represents the user shooting and submitting video material for each sub-mirror y_i under the guidance of the description text s_i corresponding to that sub-mirror.
S302: and inputting the description text corresponding to the sub-mirror into a pre-trained voice synthesis model to obtain the audio corresponding to the sub-mirror output by the voice synthesis model.
In this step, the speech synthesis model used to generate the audio of the description text corresponding to the sub-mirror may be any existing speech synthesis model, such as a statistics-based, deep-learning-based, autoregressive, or adversarial-network-based speech synthesis model, which may be selected flexibly according to the specific application scenario and is not limited in this specification.
An alternative formula is as follows:
a_i = TTS(s_i)

where the formula adopts speech synthesis (Text To Speech, TTS) technology, s_i is the description text corresponding to each sub-mirror, and a_i is the audio corresponding to each sub-mirror.
S304: and generating, according to the description text corresponding to the sub-mirror, a subtitle corresponding to the sub-mirror with the same duration as the audio of the sub-mirror.
In a video display scene that does not support audio playback, the audio of the target video cannot be played; if the target video then conveys only picture information, the audio information is not delivered to the viewer, information is omitted in transmission, and the efficiency of recommending merchants to consumers with the target video is reduced. For this reason, a subtitle can be generated from the description text; by adding the subtitle corresponding to the description text to the candidate video corresponding to that description text, the information of the description text can still be conveyed through the subtitle even when audio cannot be played while the target video is displayed.
To give consumers watching the target video a good viewing experience, the subtitle of the description text corresponding to a sub-mirror generally needs to be aligned and synchronized with the audio of that description text: the audio duration of the sub-mirror should correspond to the presentation duration of its subtitle, that is, the timeline on which the subtitle plays in the candidate video corresponding to the sub-mirror should be aligned with the timeline of the audio. If the two timelines are misaligned, the subtitle timeline can be adjusted with the audio timeline as the reference, so that after adjustment the subtitle timeline and the audio timeline of the sub-mirror are aligned.
An alternative formula is as follows:
c_i = C(a_i, s_i)

where C represents generating, for each sub-mirror, a video subtitle c_i with the same duration as the audio a_i and the same content as the description text s_i.
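As a non-authoritative illustration, the following sketch generates subtitle cues whose total presentation duration equals the sub-mirror's audio duration, allocating time to each cue in proportion to its text length (the proportional split is an assumption):

```python
def make_subtitles(description: str, audio_duration: float,
                   max_chars: int = 20) -> list[tuple[float, float, str]]:
    # Split the description text into display-sized chunks and give each
    # chunk a time slice proportional to its length, so the subtitle track
    # spans exactly the audio duration.
    if not description:
        return []
    chunks = [description[i:i + max_chars]
              for i in range(0, len(description), max_chars)]
    total = sum(len(c) for c in chunks)
    cues, t = [], 0.0
    for chunk in chunks:
        dur = audio_duration * len(chunk) / total
        cues.append((t, t + dur, chunk))  # (start, end, text)
        t += dur
    return cues
```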
S306: and generating a video clip corresponding to the sub-mirror according to the candidate video corresponding to the sub-mirror, the subtitle corresponding to the sub-mirror and the audio corresponding to the sub-mirror.
Further, the subtitle corresponding to the sub-mirror is added to the candidate video corresponding to the sub-mirror, and then the audio corresponding to the sub-mirror is added to that candidate video, yielding a video segment of the description text that has picture, subtitle, and audio.
Alternatively, when the video segment corresponding to the sub-mirror is generated from the candidate video, subtitle, and audio corresponding to the sub-mirror, since the subtitle and the audio of the sub-mirror were already aligned in duration in the foregoing step S304, it is necessary in this step to determine whether the candidate video corresponding to the sub-mirror is aligned in duration with the audio corresponding to the sub-mirror. If there is a difference between the duration of the audio and the duration of the candidate video, the candidate video can be edited according to that duration difference; the editing may be trimming the candidate video, speeding it up or slowing it down, or other types of editing, which is not limited in this specification. The video segment corresponding to the sub-mirror is then generated from the subtitle corresponding to the sub-mirror, the audio corresponding to the sub-mirror, and the edited candidate video.
In another alternative embodiment, the audio corresponding to the sub-mirror may instead be processed according to the difference between its duration and that of the candidate video, so that the two durations become the same. Note, however, that because the audio corresponds one-to-one with the description text of the sub-mirror, and the content of the description text must not be lost, the audio may only be sped up or slowed down; the processed audio then loses no description-text content to clipping. Likewise, the subtitle corresponding to the sub-mirror corresponds to the audio and requires the same processing. The video segment corresponding to the sub-mirror is then generated from the processed subtitle, the processed audio, and the candidate video. A sketch of both duration-reconciliation options is shown below.
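A sketch of the two duration-reconciliation options, returning the new duration and a playback-speed factor; applying the speed option to the audio and subtitle track instead of the video follows the same arithmetic:

```python
def reconcile(video_duration: float, audio_duration: float,
              mode: str = "trim") -> tuple[float, float]:
    # Returns (new video duration, playback speed factor).
    if mode == "trim":
        # Option 1: clip the candidate video to the audio's length.
        return min(video_duration, audio_duration), 1.0
    if mode == "speed":
        # Option 2: speed up (>1) or slow down (<1); nothing is cut, which
        # is mandatory when the audio/subtitles are adjusted instead.
        return audio_duration, video_duration / audio_duration
    raise ValueError(f"unknown mode: {mode}")
```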
S308: and splicing the video clips corresponding to the sub-mirrors to obtain a target video.
When the video segments corresponding to the description texts are spliced, the pictures at the junctions of the segments may not be smooth; visual processing schemes such as visual effects and transition animations can be applied to the pictures at the junctions, so that the complete target video plays smoothly and the sense of stutter is reduced.
An alternative formula may be as follows:

v̂ = D({a_i, c_i, v_i | i = 1, ..., λ})

where v̂ represents the generated target video, and D represents synthesizing the target video using the audio a_i, the subtitle c_i, and the candidate video v_i of each sub-mirror.
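As an illustration of the synthesis step D, the following sketch muxes each sub-mirror's subtitle and audio onto its candidate video and splices the clips in order; `mux` and `concat` are hypothetical helpers (e.g. thin wrappers over a media toolkit), not a real library API:

```python
def synthesize(videos: list, subtitles: list, audios: list, mux, concat):
    # `mux` burns one sub-mirror's subtitle track and audio track onto its
    # candidate video; `concat` splices the per-sub-mirror clips in order.
    clips = [mux(v, c, a) for v, c, a in zip(videos, subtitles, audios)]
    return concat(clips)  # the target video
```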
The above is the method for generating a promotional video provided by one or more embodiments of this specification. Based on the same idea, this specification further provides a corresponding apparatus for generating a promotional video, as shown in fig. 6.
Fig. 6 is a schematic diagram of a device for generating a promotional video provided in the present specification, which specifically includes:
a keyword determining module 400, configured to determine keywords for characterizing features of a video to be generated in response to a video generation request of a user;
the display module 402 is configured to display a first guide word corresponding to each of the sub-mirrors of the video to be generated, a description text corresponding to each of the sub-mirrors, and controls for the user to input video materials corresponding to each of the sub-mirrors; the first guide word is used for guiding the user to input video materials corresponding to the various sub-mirrors through the controls according to descriptive texts corresponding to the various sub-mirrors respectively, and the descriptive texts are generated according to the keywords and comment texts which are associated with the user and correspond to the keywords;
And the target video generating module 404 is configured to generate a target video according to the description text corresponding to each of the sub-mirrors and the video material corresponding to each of the sub-mirrors, in response to the video material corresponding to each of the sub-mirrors input by the user through each of the controls.
Optionally, the apparatus further comprises:
the updating module 406 is specifically configured to display a description text corresponding to each of the sub-mirrors, each input field for the user to input an updating operation for the description text corresponding to each of the sub-mirrors, and a second guide word, where the second guide word is used to guide the user to input an updating operation for the description text corresponding to each of the sub-mirrors through the input field; responding to the updating operation input by the user through each input field, and updating the description text corresponding to each sub-mirror according to the updating operation to obtain updated description text corresponding to each sub-mirror;
optionally, the display module 402 is specifically configured to display, in response to a confirmation operation input by the user, a guide word corresponding to each of the sub-mirrors of the video to be generated, an updated description text corresponding to each of the sub-mirrors, and controls for the user to input video materials corresponding to each of the sub-mirrors.
Optionally, the apparatus further comprises:
the descriptive text labeling module 408 is specifically configured to find each comment text associated with the user and corresponding to the keyword; obtaining each description text according to the keywords, each comment text and the pre-trained natural language model; acquiring the attribute of the user, and determining each sub-mirror of the video to be generated according to the attribute of the user; and determining the description text corresponding to each of the sub-mirrors from the description text.
Optionally, the description text labeling module 408 is specifically configured to search, according to the obtained attribute of the user, a history service executed by the user; acquiring candidate comment texts related to the history service executed by the user from a service platform; and matching each candidate comment text with the keyword respectively, and taking the candidate comment text hit by the keyword in each candidate comment text as each comment text which is associated with the user and corresponds to the keyword.
Optionally, the description text labeling module 408 is specifically configured to generate a target prompt text according to the respective sub-mirrors and the respective description texts; and inputting the target prompt text into the pre-trained natural language model to obtain the description text which is output by the natural language model and corresponds to each of the sub-mirrors.
Optionally, the target video generating module 404 is specifically configured to determine, for each sub-mirror, a candidate video corresponding to the sub-mirror according to a video material corresponding to the sub-mirror; inputting the description text corresponding to the sub-mirror into a pre-trained voice synthesis model to obtain the audio corresponding to the sub-mirror output by the voice synthesis model; generating subtitles corresponding to the sub-mirrors with the same audio time length as the sub-mirrors according to the description text corresponding to the sub-mirrors; generating a video clip corresponding to the sub-mirror according to the candidate video corresponding to the sub-mirror, the subtitle corresponding to the sub-mirror and the audio corresponding to the sub-mirror; and splicing the video clips corresponding to the sub-mirrors to obtain a target video.
Optionally, the video material includes video and images;
optionally, the target video generating module 404 is specifically configured to, for each sub-mirror, when determining that the video material corresponding to the sub-mirror is a video, take the video material corresponding to the sub-mirror as the candidate video corresponding to the sub-mirror; and when the video material corresponding to the sub-mirror is determined to be an image, converting the video material corresponding to the sub-mirror into a video with preset duration, and taking the video material as a candidate video corresponding to the sub-mirror.
Optionally, the target video generating module 404 is specifically configured to clip the candidate video corresponding to the sub-mirror according to the difference between the duration of the audio corresponding to the sub-mirror and the duration of the candidate video corresponding to the sub-mirror; and generate a video clip corresponding to the sub-mirror according to the subtitle corresponding to the sub-mirror, the audio corresponding to the sub-mirror, and the clipped candidate video corresponding to the sub-mirror.
The present specification also provides a computer readable storage medium storing a computer program operable to perform the method of generating promotional video provided in fig. 1 described above.
The present specification also provides a schematic structural diagram of the electronic device shown in fig. 7. At the hardware level, as shown in fig. 7, the electronic device includes a processor, an internal bus, a network interface, a memory, and a non-volatile storage, and may of course also include hardware required by other services. The processor reads the corresponding computer program from the non-volatile storage into the memory and then runs it to implement the promotional video generation method described in fig. 1. Of course, other implementations, such as logic devices or combinations of hardware and software, are not excluded in this specification; that is, the execution subject of the following processing flows is not limited to each logic unit, but may also be hardware or logic devices.
In the 1990s, an improvement to a technology could be clearly distinguished as an improvement in hardware (for example, an improvement to a circuit structure such as a diode, a transistor, or a switch) or an improvement in software (an improvement to a method flow). However, with the development of technology, many improvements to method flows today can be regarded as direct improvements to hardware circuit structures. Designers almost always obtain the corresponding hardware circuit structure by programming an improved method flow into a hardware circuit. Therefore, it cannot be said that an improvement of a method flow cannot be implemented with hardware entity modules. For example, a programmable logic device (Programmable Logic Device, PLD) (such as a field programmable gate array (Field Programmable Gate Array, FPGA)) is an integrated circuit whose logic function is determined by the user's programming of the device. A designer programs to "integrate" a digital system onto a PLD, without requiring a chip manufacturer to design and fabricate an application-specific integrated circuit chip. Moreover, instead of manually manufacturing integrated circuit chips, this programming is now mostly implemented with "logic compiler" software, which is similar to the software compiler used in program development; the original code to be compiled must also be written in a specific programming language, called a hardware description language (Hardware Description Language, HDL). There is not just one HDL but many, such as ABEL (Advanced Boolean Expression Language), AHDL (Altera Hardware Description Language), Confluence, CUPL (Cornell University Programming Language), HDCal, JHDL (Java Hardware Description Language), Lava, Lola, MyHDL, PALASM, and RHDL (Ruby Hardware Description Language); VHDL (Very-High-Speed Integrated Circuit Hardware Description Language) and Verilog are currently the most commonly used. It should also be clear to those skilled in the art that a hardware circuit implementing a logic method flow can easily be obtained merely by slightly logic-programming the method flow into an integrated circuit using one of the above hardware description languages.
The controller may be implemented in any suitable manner; for example, the controller may take the form of a microprocessor or processor together with a computer-readable medium storing computer-readable program code (such as software or firmware) executable by the (micro)processor, logic gates, switches, an application-specific integrated circuit (Application Specific Integrated Circuit, ASIC), a programmable logic controller, or an embedded microcontroller. Examples of controllers include, but are not limited to, the following microcontrollers: ARC 625D, Atmel AT91SAM, Microchip PIC18F26K20, and Silicon Labs C8051F320; a memory controller may also be implemented as part of the control logic of a memory. Those skilled in the art also know that, in addition to implementing a controller in pure computer-readable program code, it is entirely possible to logically program the method steps so that the controller implements the same functions in the form of logic gates, switches, application-specific integrated circuits, programmable logic controllers, embedded microcontrollers, and the like. Such a controller may therefore be regarded as a hardware component, and the means included in it for implementing various functions may also be regarded as structures within the hardware component. Or even the means for implementing various functions may be regarded both as software modules implementing the method and as structures within the hardware component.
The system, apparatus, module or unit set forth in the above embodiments may be implemented in particular by a computer chip or entity, or by a product having a certain function. One typical implementation is a computer. In particular, the computer may be, for example, a personal computer, a laptop computer, a cellular telephone, a camera phone, a smart phone, a personal digital assistant, a media player, a navigation device, an email device, a game console, a tablet computer, a wearable device, or a combination of any of these devices.
For convenience of description, the above devices are described as being functionally divided into various units, respectively. Of course, the functions of each element may be implemented in one or more software and/or hardware elements when implemented in the present specification.
It will be appreciated by those skilled in the art that embodiments of the present description may be provided as a method, system, or computer program product. Accordingly, the present specification may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the present description can take the form of a computer program product on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, etc.) having computer-usable program code embodied therein.
The present description is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the specification. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
In one typical configuration, a computing device includes one or more processors (CPUs), input/output interfaces, network interfaces, and memory.
The memory may include forms of volatile memory in computer-readable media, such as random access memory (RAM), and/or non-volatile memory, such as read-only memory (ROM) or flash memory (flash RAM). Memory is an example of a computer-readable medium.
Computer-readable media include permanent and non-permanent, removable and non-removable media, and may implement information storage by any method or technology. The information may be computer-readable instructions, data structures, program modules, or other data. Examples of computer storage media include, but are not limited to, phase-change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technology, compact disc read-only memory (CD-ROM), digital versatile discs (DVD) or other optical storage, magnetic cassettes, magnetic tape or magnetic disk storage or other magnetic storage devices, or any other non-transmission medium, which can be used to store information accessible by a computing device. As defined herein, computer-readable media do not include transitory computer-readable media (transitory media), such as modulated data signals and carrier waves.
It should also be noted that the terms "comprises", "comprising", or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of other identical elements in the process, method, article, or apparatus that includes the element.
It will be appreciated by those skilled in the art that embodiments of the present description may be provided as a method, system, or computer program product. Accordingly, the present specification may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present description can take the form of a computer program product on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, etc.) having computer-usable program code embodied therein.
The description may be described in the general context of computer-executable instructions, such as program modules, being executed by a computer. Generally, program modules include routines, programs, objects, components, data structures, etc. that perform particular tasks or implement particular abstract data types. The specification may also be practiced in distributed computing environments where tasks are performed by remote processing devices that are linked through a communications network. In a distributed computing environment, program modules may be located in both local and remote computer storage media including memory storage devices.
In this specification, each embodiment is described in a progressive manner, and identical and similar parts of each embodiment are all referred to each other, and each embodiment mainly describes differences from other embodiments. In particular, for system embodiments, since they are substantially similar to method embodiments, the description is relatively simple, as relevant to see a section of the description of method embodiments.
The foregoing is merely exemplary of the present disclosure and is not intended to limit the disclosure. Various modifications and alterations to this specification will become apparent to those skilled in the art. Any modifications, equivalent substitutions, improvements, or the like, which are within the spirit and principles of the present description, are intended to be included within the scope of the claims of the present description.
Claims (10)
1. A method of generating promotional video, comprising:
determining keywords for characterizing a feature of the video to be generated in response to a video generation request of a user;
displaying a first guide word respectively corresponding to each sub-mirror of the video to be generated, a description text respectively corresponding to each sub-mirror, and each control used for the user to input video materials corresponding to each sub-mirror; the first guide word is used for guiding the user to input video materials corresponding to the various sub-mirrors through the controls according to descriptive texts corresponding to the various sub-mirrors respectively, and the descriptive texts are generated according to the keywords and comment texts which are associated with the user and correspond to the keywords;
And responding to the video materials corresponding to the various sub-mirrors input by the user through the controls, and generating a target video according to the description text respectively corresponding to the various sub-mirrors and the video materials corresponding to the various sub-mirrors.
2. The method of claim 1, wherein before presenting the first guide word respectively corresponding to each of the sub-mirrors of the video to be generated, the descriptive text respectively corresponding to each of the sub-mirrors, and each control for the video material corresponding to each of the sub-mirrors by the user, the method further comprises:
displaying descriptive texts corresponding to the sub-mirrors respectively, input columns used for the user to input updating operations of the descriptive texts corresponding to the sub-mirrors respectively, and second guide words, wherein the second guide words are used for guiding the user to input updating operations of the descriptive texts corresponding to the sub-mirrors respectively through the input columns;
responding to the updating operation input by the user through each input field, and updating the description text corresponding to each sub-mirror according to the updating operation to obtain updated description text corresponding to each sub-mirror;
Displaying the guide words respectively corresponding to the sub-mirrors of the video to be generated, the description text respectively corresponding to the sub-mirrors and the controls for the user to input the video materials corresponding to the sub-mirrors, wherein the controls specifically comprise:
and responding to the confirmation operation input by the user, displaying the guide words respectively corresponding to the sub-mirrors of the video to be generated, the updated description text respectively corresponding to the sub-mirrors and the controls for the user to input the video materials corresponding to the sub-mirrors.
3. The method of claim 1, wherein before presenting the first guide word respectively corresponding to each of the sub-mirrors of the video to be generated, the descriptive text respectively corresponding to each of the sub-mirrors, and each control for the video material corresponding to each of the sub-mirrors by the user, the method further comprises:
searching each comment text which is associated with the user and corresponds to the keyword;
obtaining each description text according to the keywords, each comment text and the pre-trained natural language model;
acquiring the attribute of the user, and determining each sub-mirror of the video to be generated according to the attribute of the user;
And determining the description text corresponding to each of the sub-mirrors from the description text.
4. The method of claim 3, wherein searching for each comment text associated with the user and corresponding to the keyword, specifically comprises:
searching a history service executed by the user according to the acquired attribute of the user;
acquiring candidate comment texts related to the history service executed by the user from a service platform;
and matching each candidate comment text with the keyword respectively, and taking the candidate comment text hit by the keyword in each candidate comment text as each comment text which is associated with the user and corresponds to the keyword.
5. The method according to claim 3, wherein determining the description text corresponding to each of the sub-mirrors from the description texts specifically comprises:
generating a target prompt text according to each sub-mirror and each description text;
and inputting the target prompt text into the pre-trained natural language model to obtain the description text which is output by the natural language model and corresponds to each of the sub-mirrors.
6. The method of claim 1, wherein generating the target video according to the descriptive text corresponding to each of the sub-mirrors and the video material corresponding to each of the sub-mirrors specifically comprises:
For each sub-mirror, determining candidate videos corresponding to the sub-mirrors according to video materials corresponding to the sub-mirrors;
inputting the description text corresponding to the sub-mirror into a pre-trained voice synthesis model to obtain the audio corresponding to the sub-mirror output by the voice synthesis model;
generating subtitles corresponding to the sub-mirrors with the same audio time length as the sub-mirrors according to the description text corresponding to the sub-mirrors;
generating a video clip corresponding to the sub-mirror according to the candidate video corresponding to the sub-mirror, the subtitle corresponding to the sub-mirror and the audio corresponding to the sub-mirror;
and splicing the video clips corresponding to the sub-mirrors to obtain a target video.
7. The method of claim 6, wherein the video material comprises video and images;
for each sub-mirror, determining a candidate video corresponding to the sub-mirror according to the video material corresponding to the sub-mirror, wherein the candidate video specifically comprises:
for each sub-mirror, when the video material corresponding to the sub-mirror is determined to be video, the video material corresponding to the sub-mirror is taken as a candidate video corresponding to the sub-mirror;
and when the video material corresponding to the sub-mirror is determined to be an image, converting the video material corresponding to the sub-mirror into a video with preset duration, and taking the video material as a candidate video corresponding to the sub-mirror.
8. The method of claim 6, wherein generating the video clip corresponding to the sub-mirror based on the candidate video corresponding to the sub-mirror, the subtitle corresponding to the sub-mirror, and the audio corresponding to the sub-mirror, specifically comprises:
editing the candidate video corresponding to the sub-mirror according to the difference between the duration of the audio corresponding to the sub-mirror and the duration of the candidate video corresponding to the sub-mirror;
and generating a video clip corresponding to the sub-mirror according to the subtitle corresponding to the sub-mirror, the audio corresponding to the sub-mirror and the candidate video corresponding to the sub-mirror after clipping.
9. A computer-readable storage medium, characterized in that the storage medium stores a computer program which, when executed by a processor, implements the method of any of the preceding claims 1-8.
10. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, characterized in that the processor implements the method of any of the preceding claims 1-8 when executing the program.