CN114286181A - Video optimization method and device, electronic equipment and storage medium - Google Patents

Video optimization method and device, electronic equipment and storage medium

Info

Publication number
CN114286181A
CN114286181A (application CN202111238981.XA); granted publication CN114286181B
Authority
CN
China
Prior art keywords
video
target
optimized
template
templates
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202111238981.XA
Other languages
Chinese (zh)
Other versions
CN114286181B (en)
Inventor
许奂杰
吴恒冠
李岳光
严计升
董浩
林璟
林琴
张浩鑫
芦清林
杨秀金
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tencent Technology Shenzhen Co Ltd
Original Assignee
Tencent Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tencent Technology Shenzhen Co Ltd
Priority to CN202111238981.XA
Publication of CN114286181A
Application granted
Publication of CN114286181B
Legal status: Active
Anticipated expiration

Classifications

    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02T CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T 10/00 Road transport of goods or passengers
    • Y02T 10/10 Internal combustion engine [ICE] based vehicles
    • Y02T 10/40 Engine management systems

Landscapes

  • Two-Way Televisions, Distribution Of Moving Picture Or The Like (AREA)

Abstract

The application relates to the technical field of video processing, and in particular to a video optimization method and apparatus, an electronic device, and a storage medium. In response to an optimization request for a video to be optimized, label recognition is performed on the video to be optimized to obtain a corresponding video label set, where the optimization request includes the platform type of at least one target application platform and the original size of the video to be optimized. Based on the obtained at least one platform type, the original size, and the video label set, at least one target template set corresponding to each target application platform is obtained, where each target template has a target size meeting the release size condition of the corresponding target application platform. The video to be optimized is then filled into each target template contained in the obtained at least one target template set to obtain each optimized target video, so that video distortion is avoided when the video size is adjusted and the video click rate is improved.

Description

Video optimization method and device, electronic equipment and storage medium
Technical Field
The present application relates to the field of video processing technologies, and in particular, to a video optimization method and apparatus, an electronic device, and a storage medium.
Background
At present, with the development of video technology and network technology, the same target video can be put in different application platforms.
However, different application platforms may have different release size requirements; for example, the release size of the same target video on a game platform differs from its release size on a social platform. Therefore, to ensure that the delivered target video meets the delivery size requirements of different application platforms, the target video needs to be optimized before delivery.
In the related art, when a target video is optimized, its original ratio (e.g., aspect ratio) is usually adjusted based on the target ratio of the application platform to determine the delivery size of the target video, so as to meet the video delivery requirements of different application platforms.
However, when the target ratio of the application platform differs greatly from the original ratio of the target video, optimizing the target video in this way loses part of the video elements during the adjustment, so the adjusted target video is distorted, cannot achieve the expected playing effect, and suffers a reduced click rate.
Disclosure of Invention
The embodiment of the application provides a video optimization method and device, electronic equipment and a storage medium, so that video distortion of a target video in a size adjustment process is avoided, and the video click rate is improved.
The video optimization method provided by the embodiment of the application comprises the following steps:
responding to an optimization request aiming at a video to be optimized, performing label identification on the video to be optimized, and obtaining a video label set corresponding to the video to be optimized, wherein the optimization request comprises a platform type of at least one target application platform and an original size of the video to be optimized;
respectively obtaining a target template set corresponding to each of the at least one target application platform based on the obtained at least one platform type, the original size and the video tag set; each target template has a target size which meets the release size condition of the corresponding target application platform;
and filling the video to be optimized into each target template contained in the obtained at least one target template set respectively to obtain each optimized target video.
The video optimization device provided by the embodiment of the application comprises:
the video optimizing method comprises a first identification unit, a second identification unit and a third identification unit, wherein the first identification unit is used for responding to an optimizing request aiming at a video to be optimized, carrying out label identification on the video to be optimized and obtaining a video label set corresponding to the video to be optimized, and the optimizing request comprises a platform type of at least one target application platform and an original size of the video to be optimized;
a second identification unit, configured to obtain, based on the obtained at least one platform type, the original size, and the video tag set, a target template set corresponding to each of the at least one target application platform respectively; each target template has a target size which meets the release size condition of the corresponding target application platform;
and the optimization unit is used for respectively filling the video to be optimized into each target template contained in the obtained at least one target template set to obtain each optimized target video.
Optionally, the second identification unit is configured to:
respectively executing the following operations aiming at the at least one target application platform:
determining a candidate template set corresponding to one target application platform according to the platform type and the original size of the target application platform, wherein the candidate template set comprises a plurality of candidate templates, and each candidate template corresponds to one template label set;
and matching the video label set with the template label sets respectively corresponding to the candidate templates to obtain the target templates and generate a target template set comprising the target templates.
Optionally, when performing tag identification on the video to be optimized and obtaining a video tag set of the video to be optimized, the first identification unit is configured to:
obtaining video characteristics of the video to be optimized, wherein the video characteristics are at least one of image content characteristics, audio content characteristics and text content characteristics;
determining matching probability between the video features and candidate features corresponding to the candidate labels by adopting a trained label recognition model, taking the candidate labels meeting the matching probability condition as video labels of the video to be optimized, and generating a video label set containing the video labels;
the label recognition model is obtained through iterative training based on a training sample set, the training sample set comprises a plurality of video samples to be optimized, and the video samples to be optimized respectively correspond to video sample characteristics and video labels of the samples.
Optionally, the optimization unit is configured to:
for each target template contained in the obtained at least one target template set, respectively executing the following operations:
performing equal proportion adjustment on the original size of the video to be optimized based on the size of a filling area corresponding to a target filling area in a target template;
and filling the adjusted video to be optimized into the target filling area to obtain the optimized target video.
Optionally, after obtaining the target template sets respectively corresponding to the at least one target application platform, respectively, before filling the video to be optimized into the target templates included in the obtained at least one target template set, the method further includes a processing unit, where the processing unit is configured to:
for the at least one target template set, respectively performing the following operations:
respectively sending the playing parameters corresponding to the target templates to a client so that the client generates preview videos based on the playing parameters and the videos to be optimized, and responding to a selection instruction for each preview video to send the target template identification corresponding to each selected preview video to a server;
and receiving each target template identifier returned by the client, and regenerating a target template set containing the target template corresponding to each target template identifier.
Optionally, the second identification unit is configured to:
respectively executing the following operations aiming at the at least one target application platform:
obtaining each target template corresponding to one target application platform based on the platform type corresponding to the target application platform, the original size and the video label set;
when the number of the templates of each target template is determined to be not less than the threshold value of the number of the templates, generating a target template set containing each target template;
and when the number of the templates is determined to be smaller than the threshold value of the number of the templates, acquiring the universal templates with the corresponding number, and generating a target template set comprising the universal templates and the target templates.
Optionally, the video tag is any one of the following: a video storyline tab, a video style tab, a presentation object tab, or a key content tab.
An electronic device provided by an embodiment of the present application includes a processor and a memory, where the memory stores program codes, and when the program codes are executed by the processor, the processor is caused to execute the steps of any one of the video optimization methods described above.
Embodiments of the present application provide a computer program product or computer program comprising computer instructions stored in a computer readable storage medium. The processor of the computer device reads the computer instructions from the computer readable storage medium, and the processor executes the computer instructions to cause the computer device to perform the steps of any of the video optimization methods described above.
An embodiment of the present application provides a computer-readable storage medium, which includes program code for causing an electronic device to perform the steps of any one of the video optimization methods described above when the program code runs on the electronic device.
The beneficial effects of this application are as follows:
the embodiment of the application provides a video optimization method and device, electronic equipment and a storage medium. When the video to be optimized is adjusted to the target size corresponding to the target application platform, the target template set corresponding to at least one target application platform can be determined based on the platform type corresponding to at least one target application platform, the original size of the video to be optimized and the video label set, the video to be optimized is filled into the target templates contained in each target template set, and the video to be optimized can be adjusted from the original size to the target size specified by the target application platform on the premise that the original proportion of the video to be optimized is not changed. Therefore, even if the target proportion of the target application platform is greatly different from the original proportion of the video to be optimized, all video elements in the original video can be reserved, video distortion is avoided, the expected playing effect can be achieved, and the video click rate is improved.
Additional features and advantages of the application will be set forth in the description which follows, and in part will be obvious from the description, or may be learned by the practice of the application. The objectives and other advantages of the application may be realized and attained by the structure particularly pointed out in the written description and claims hereof as well as the appended drawings.
Drawings
The accompanying drawings, which are included to provide a further understanding of the application and are incorporated in and constitute a part of this application, illustrate embodiment(s) of the application and together with the description serve to explain the application and not to limit the application. In the drawings:
fig. 1 is a schematic view of an application scenario according to an embodiment of the present application;
fig. 2 is a flowchart of an implementation of a video optimization method in an embodiment of the present application;
FIG. 3 is a schematic view of a first interface in an embodiment of the present application;
fig. 4 is a schematic flowchart of determining each video tag of a video to be optimized in an embodiment of the present application;
FIG. 5 is a schematic flow chart illustrating the determination of a target template in an embodiment of the present application;
FIG. 6 is a first schematic diagram of a target template in an embodiment of the present application;
FIG. 7 is a flow chart of determining a set of target templates in an embodiment of the present application;
FIG. 8 is a diagram of a generic template in an embodiment of the present application;
FIG. 9 is a flowchart illustrating obtaining a new set of target templates in an embodiment of the present application;
FIG. 10 is a second interface diagram in an embodiment of the present application;
fig. 11 is a schematic diagram illustrating an effect of a first target video in an embodiment of the present application;
fig. 12 is a schematic diagram illustrating an effect of a second target video in the embodiment of the present application;
fig. 13 is a schematic diagram illustrating an effect of a third target video in the embodiment of the present application;
FIG. 14 is a flow chart of a method of advertisement optimization in an embodiment of the present application;
FIG. 15 is another flow chart of a video optimization method according to an embodiment of the present application;
fig. 16 is a schematic structural diagram illustrating a video optimization apparatus according to an embodiment of the present application;
fig. 17 is a schematic diagram of a hardware component structure of an electronic device to which an embodiment of the present application is applied.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present application clearer, the technical solutions of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are some embodiments, but not all embodiments, of the technical solutions of the present application. All other embodiments obtained by a person skilled in the art without any inventive step based on the embodiments described in the present application are within the scope of the protection of the present application.
Some concepts related to the embodiments of the present application are described below.
And (3) video to be optimized: the video to be optimized represents various forms of video information available in the internet, such as game advertisement videos, startup advertisements of video application software and the like.
Target video: the target video is a video whose size satisfies a target size, where the target size is a size meeting the launch size condition of the target application platform, e.g., a target size of 9:16.
Video label set: the video tag set comprises a plurality of video tags, and each video tag is used for representing specific content and characteristics of a video.
Specifically, a video tag may be a video plot tag, which represents the plot of the video, such as driving or education; a video style tag, which represents the visual style of the video, such as a lovely style or a cheerful style; a display object tag, which represents the commodity displayed in the video, such as an automobile or soda water; or a key content tag, which represents key content capable of attracting the target object to purchase the product, such as a video end-frame image. The embodiment of the present application is not limited in this respect.
Target application platform: a target application platform is a platform hosting application programs in which different video creative forms are delivered, such as a game application platform or a video playing application platform; each target application platform corresponds to a different video delivery size condition.
An application program is a computer program that can complete one or more services. Some application programs need to be installed on the user's terminal device before use, while others, such as applets and web pages embedded in some social applications, do not. An applet can be used without downloading or installing: the user can open it by scanning or searching.
Target template set: the target template set is a set of target templates corresponding to the target application platform, the target template set comprises a plurality of target templates, and each target template has a target size meeting the release size condition of the corresponding target application platform.
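Following the concept definitions above, a minimal sketch of how a target template and a target template set might be represented; the field names are illustrative assumptions, not structures defined in the patent.

```python
from dataclasses import dataclass, field

@dataclass
class TargetTemplate:
    template_id: str
    target_size: str                  # e.g. "9:16"; meets the platform's launch size condition
    fill_region: tuple                # (x, y, width, height) of the area the video is filled into
    template_tags: set = field(default_factory=set)

@dataclass
class TargetTemplateSet:
    platform_type: str                # the target application platform this set belongs to
    templates: list = field(default_factory=list)
```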
The following briefly introduces the design concept of the embodiments of the present application:
at present, with the development of video technology and network technology, the same target video can be launched in different application platforms, for example, for the same advertisement video, the same target video can be launched in a game platform, and can also be launched in a video playing platform.
However, different application platforms may have different requirements for the placement size of a video. For example, if the target video is an advertisement video, its placement size in a game platform is 9:16 while its placement size in a social platform is 3:4, so the placement size in the game platform differs from that in the social platform. Therefore, in order to ensure that the target videos delivered to different target application platforms meet the placement size requirements of those platforms, the placement size of the target video needs to be optimized before delivery, so that the video size of the optimized target video satisfies the placement size of each target application platform.
In the related art, when optimizing a target video, an original scale of the target video is adjusted, for example, the target video is stretched, based on a target scale (e.g., an aspect ratio) of an application platform, so that the target video can be adjusted from the original scale to a target scale required when the application platform is released.
However, when the target proportion of the application platform differs greatly from the original proportion of the target video, optimizing the target video in the above way loses part of the video elements during the size adjustment, which distorts the video content; after such a target video is put on the application platform, it cannot achieve the expected playing effect, and its click rate is also reduced.
Moreover, if the video needs to be delivered to several different target application platforms, its size must be adjusted multiple times to meet their different delivery size requirements, multiple versions of the target video must be stored, and the adjustment process depends on professional designers; this consumes a large amount of time and resources and makes it difficult to track how the same video performs on different application platforms.
In view of this, embodiments of the present application provide a video optimization method, an apparatus, an electronic device, and a storage medium. The method obtains a video label set by analyzing the video content, determines a target template set corresponding to each target application platform based on the platform type and original size corresponding to at least one target application platform and the video label set of the video to be optimized, and optimizes the video to be optimized based on each target template set, so that the optimized target videos can meet the delivery size requirements of different target application platforms, video distortion is avoided, the threshold for making and delivering videos is greatly reduced, and the video delivery effect is enhanced.
The preferred embodiments of the present application will be described below with reference to the accompanying drawings of the specification, it should be understood that the preferred embodiments described herein are merely for illustrating and explaining the present application, and are not intended to limit the present application, and that the embodiments and features of the embodiments in the present application may be combined with each other without conflict.
Fig. 1 is a schematic view of an application scenario in the embodiment of the present application. The application scenario diagram includes a client 110 and a server 120. The client 110 in the embodiment of the present application is a device installed with a video resizing application.
The video resizing application related to the embodiment of the application may be software, a web page, an applet, or the like, and the server is a server for video resizing corresponding to the software, the web page, the applet, or the like.
It should be noted that the video optimization method in the embodiment of the present application may be executed by a server or a client alone, or may be executed by both the server and the client. When the server and the client execute together, for example, the client may send a triggered optimization request and a video to be optimized to the server, and after the server determines a corresponding target template set, the video to be optimized is optimized based on each target template included in the determined target template set, so that each optimized target video is sent to the client for display. Hereinafter, the examples are mainly implemented by the server and the client, and the examples are not limited in this respect.
The following takes advertising videos as an example:
(1) The target object selects, in the client, the target application platforms on which the advertisement video is expected to be launched, namely the game platform and the video playing platform, thereby triggering the generation of an optimization request in the client; the client then sends the triggered optimization request and the advertisement video uploaded in the client to the server.
The optimization request comprises a platform type corresponding to the game platform, a platform type corresponding to the video playing platform and the original size of the advertisement video.
(2) After obtaining the advertisement video and the optimization request, the server identifies the video content of the advertisement video, thereby obtaining a video label set of the advertisement video.
The video tag set comprises a vehicle driving scene tag and an automobile industry tag.
(3) And matching and obtaining a target template set corresponding to the game platform and a target template set corresponding to the video playing platform from all candidate templates based on the platform type corresponding to the game platform, the platform type corresponding to the video playing platform, the original size of the advertisement video, the vehicle driving scene tag and the automobile industry tag.
(4) And respectively filling the original advertisement videos into each target template contained in each target template set, thereby obtaining the target advertisement videos meeting the launching size condition of the game platform and the target advertisement videos meeting the launching size condition of the video playing platform.
(5) And sending each target advertisement video to the client, so that the client can display each generated target advertisement video in different target application platforms after receiving each target advertisement video.
It should be noted that the above description is only one of the application scenarios, and there are other scenarios in the actual service.
In an alternative embodiment, the client 110 and the server 120 may communicate over a communication network.
In an alternative embodiment, the communication network is a wired network or a wireless network.
In this embodiment, the client 110 is a computer device used by a user, and the computer device may be a computer device having a certain computing capability and running instant messaging software and a website or social contact software and a website, such as a personal computer, a mobile phone, a tablet computer, a notebook, an e-book reader, and a vehicle-mounted terminal. Each client 110 is connected to a server 120 through a wireless network, and the server 120 is a server or a server cluster or a cloud computing center formed by a plurality of servers, or is a virtualization platform.
It should be noted that fig. 1 is only an example, and the number of the clients and the servers is not limited in practice, and is not specifically limited in the embodiment of the present application.
The video optimization method provided by the exemplary embodiment of the present application is described below with reference to the accompanying drawings in conjunction with the application scenarios described above, and it should be noted that the application scenarios described above are only shown for the convenience of understanding the spirit and principles of the present application, and the embodiments of the present application are not limited in this respect.
Referring to fig. 2, a flowchart of an implementation of a video optimization method in the embodiment of the present application is shown, which is described by taking a server as an execution subject, and a specific implementation flow of the method is as follows:
step 20: and responding to an optimization request aiming at the video to be optimized, and performing label identification on the video to be optimized to obtain a video label set corresponding to the video to be optimized.
Wherein the optimization request comprises the platform type of at least one target application platform and the original size of the video to be optimized.
In the embodiment of the application, the target object uploads the video to be optimized to the client, and the client forwards the video to be optimized to the server. The client identifies the video size of the video to be optimized to obtain its original size; meanwhile, in response to a platform type selection operation triggered by the target object, the client obtains the platform type corresponding to at least one target application platform, generates an optimization request including the at least one platform type and the original size, and sends the generated optimization request to the server.
The platform type selection operation is a selection operation triggered by a target object in the client for at least one application platform.
For example, referring to fig. 3, as a first interface schematic diagram in this embodiment of the application, in an operation interface of a client, operation controls corresponding to three application platforms, which are a "game platform" operation control, a "social platform" operation control, and a "video playing platform" operation control, are displayed, and when a target object clicks the "game platform" operation control, the "social platform" operation control, and the "video playing platform" operation control, a platform type selection operation is triggered and generated, so that the client responds to the platform type selection operation to acquire a game platform type, a social platform type, and a video playing platform type.
Meanwhile, the operation interface also comprises a video uploading operation control, so that the target object uploads the video to be optimized to the client by clicking the video uploading operation control, and the size of the video to be optimized is identified to obtain the original size of the video to be optimized.
The operation interface further includes a 'confirm' operation control and a 'reset' operation control. When the target object clicks the 'confirm' operation control, an optimization request containing the game platform type, the social platform type, the video playing platform type, and the original size of the video to be optimized is generated; when the target object clicks the 'reset' operation control, the obtained game platform type, social platform type, and video playing platform type are deleted.
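A minimal sketch of how the client might assemble the optimization request described above; the field names ("platform_types", "original_size") and the JSON transport are illustrative assumptions, not details given in the patent.

```python
import json

def build_optimization_request(selected_platform_types, video_width, video_height):
    """Package the selected platform types and the original size of the video to be optimized."""
    return json.dumps({
        "platform_types": selected_platform_types,          # e.g. game / social / video playing
        "original_size": {"width": video_width, "height": video_height},
    })

# Triggered after the 'confirm' control is clicked in the interface of fig. 3
request_payload = build_optimization_request(
    ["game", "social", "video_playing"], 1920, 1080)
print(request_payload)
```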
Then, after the server obtains the video to be optimized and the optimization request aiming at the video to be optimized, the server performs label identification on the video to be optimized by adopting a preset label identification mode, obtains each video label of the video to be optimized, and generates a video label set comprising each video label.
It should be noted that the video tag set in the embodiment of the present application is used to mark video content, style, and scenario of a video to be optimized, and the video tag set can reflect video content information of the video to be optimized.
Optionally, in this embodiment of the present application, a possible implementation manner is provided for obtaining a video tag set, and a process of obtaining a tag set corresponding to a video to be optimized in this embodiment of the present application is described in detail below, including:
s201: and obtaining the video characteristics of the video to be optimized.
Wherein the video feature is at least one of an image content feature, an audio content feature, and a text content feature.
In the embodiment of the application, a preset feature extraction mode is adopted to extract the features of the video to be optimized, so that the video features of the video to be optimized are obtained.
The video features in the embodiment of the present application include at least one of image content features, audio content features, and text content features; for example, the video features may include only image content features, or may include image content features, audio content features, and text content features. The type and number of video features are not limited in the embodiment of the present application.
In a specific implementation, if the video features include features of several different dimensions, these features need to be spliced to obtain the video features. For example, if image content features, audio content features, and text content features are obtained through the preset feature extraction mode, they are spliced through a preset feature splicing mode to obtain the video features of the video to be optimized.
It should be noted that, in the embodiment of the present application, when performing feature extraction on a video to be optimized, feature extraction may be performed on the video to be optimized from different dimensions through different feature extraction models, so as to obtain features of the video to be optimized in different dimensions. The following describes in detail a process of obtaining video features in the embodiment of the present application, taking video features as image content features, audio content features, and text content features as examples, and includes:
s2011: and adopting an image recognition model to extract the characteristics of the video to be optimized so as to obtain the image content characteristics of the video to be optimized.
The image recognition model may be a behavior recognition model, and the behavior recognition model may be, for example, a Temporal Shift Module (TSM).
S2012: And adopting an audio recognition model to extract the characteristics of the video to be optimized so as to obtain the audio content characteristics of the video to be optimized.
The audio recognition model may be, for example, a VGGish model. In the embodiment of the present application, the principle of extracting features from the video to be optimized with the VGGish model is as follows: within a fixed time segment, a spectrogram corresponding to the audio of the video to be optimized is extracted and input into the VGGish model for classification. The VGGish model in the embodiment of the present application is composed of 11 convolutional layers and is down-sampled 3 times, and the audio content features are the features output by the penultimate fully connected layer of the VGGish model.
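A minimal sketch of the audio preprocessing described above: the audio track is cut into fixed time segments and a (log-)mel spectrogram is computed per segment before being fed to a VGGish-style classifier. The use of librosa and the segment length, sample rate, and mel-band values are illustrative assumptions, not parameters stated in the patent.

```python
import librosa

def audio_to_log_mel(audio_path: str, segment_seconds: float = 0.96,
                     sr: int = 16000, n_mels: int = 64) -> list:
    """Split the audio into fixed-length segments and return one log-mel spectrogram per segment."""
    y, sr = librosa.load(audio_path, sr=sr)
    seg_len = int(segment_seconds * sr)
    segments = [y[i:i + seg_len] for i in range(0, len(y) - seg_len + 1, seg_len)]
    return [librosa.power_to_db(
                librosa.feature.melspectrogram(y=seg, sr=sr, n_mels=n_mels))
            for seg in segments]
```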
S2013: and adopting a text recognition model to extract the characteristics of the video to be optimized so as to obtain the text content characteristics of the video to be optimized.
The text recognition model may be a language representation model, for example the transformer-based Bidirectional Encoder Representations from Transformers (BERT) model, which is used to encode the input text and output a vector representation in which semantic information is fused into each word/character of the text.
It should be noted that, when feature extraction is performed on the video to be optimized, there is no limitation on the execution order of S2011 to S2013 in this embodiment: S2011 may be executed first, S2012 may be executed first, or S2013 may be executed first, and they may also be executed simultaneously, which is not limited in this embodiment of the present application.
S2014: and splicing the video content characteristics, the audio content characteristics and the text content characteristics to obtain the video characteristics of the video to be optimized.
In the embodiment of the application, after the video content characteristics, the audio content characteristics and the text content characteristics are obtained, the video content characteristics, the audio content characteristics and the text content characteristics are spliced by adopting a preset characteristic splicing mode, and the video characteristics of the video to be optimized are obtained.
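A minimal sketch of the splicing step S2014, assuming each modality has already been reduced to a fixed-length embedding (a TSM-style image embedding, a VGGish-style 128-dimensional audio embedding, a BERT-style 768-dimensional text embedding); the concrete dimensions and the simple concatenation are illustrative assumptions.

```python
import numpy as np

def splice_video_features(image_feat: np.ndarray,
                          audio_feat: np.ndarray,
                          text_feat: np.ndarray) -> np.ndarray:
    """Concatenate the per-modality embeddings into one video feature vector."""
    return np.concatenate([image_feat, audio_feat, text_feat], axis=-1)

image_feat = np.random.rand(2048)   # e.g. TSM image-content embedding
audio_feat = np.random.rand(128)    # e.g. VGGish penultimate-layer embedding
text_feat = np.random.rand(768)     # e.g. BERT text embedding
video_feat = splice_video_features(image_feat, audio_feat, text_feat)
print(video_feat.shape)             # (2944,)
```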
S202: and determining the matching probability between the video features and the candidate features corresponding to the candidate labels by adopting the trained label recognition model, respectively using the candidate labels meeting the matching probability condition as the video labels of the video to be optimized, and generating a video label set containing the video labels.
The label recognition model is obtained through iterative training based on a training sample set, the training sample set comprises a plurality of video samples to be optimized, and the video samples to be optimized respectively correspond to video sample characteristics and video labels of the samples.
First, the training process of the label recognition model is introduced as follows: when the label recognition model is trained, firstly, a training sample set is obtained, wherein the training sample set comprises a plurality of video samples to be optimized, video sample characteristics corresponding to each video sample to be optimized and each sample video label.
In this embodiment, the video sample features include at least one of image content features, audio content features, and text content features.
In addition, it should be noted that, in the embodiment of the present application, the video features obtained in the practical application process of the tag identification model are consistent with the video sample features in the training process, for example, in the practical application process, the obtained video features include image content features and audio content features, and then, in the training process, the video sample features also include image content features and audio content features.
After the training sample set is obtained, inputting the training sample set into an initial label recognition model, and performing iterative training on the initial label recognition model according to the video sample characteristics and each sample video label until the objective function of the initial label recognition model is converged to obtain a trained label recognition model.
Wherein the objective function is the minimization of a cross-entropy function between the model's prediction for each video sample feature and the corresponding sample video tag.
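Written out, one conventional multi-label form of this cross-entropy objective (an assumption; the patent does not give the formula explicitly) is:

```latex
\min_{\theta}\; \mathcal{L}(\theta)
= -\frac{1}{N}\sum_{i=1}^{N}\sum_{j=1}^{K}
\Big[\, y_{ij}\,\log p_{\theta}(x_i)_j
     + (1-y_{ij})\,\log\big(1-p_{\theta}(x_i)_j\big) \Big],
```

where $x_i$ denotes the video sample features of the $i$-th video sample to be optimized, $y_{ij}\in\{0,1\}$ indicates whether the $j$-th candidate tag is a sample video tag of that sample, and $p_{\theta}(x_i)_j$ is the matching probability predicted by the model with parameters $\theta$.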
It should be noted that the tag identification model in the embodiment of the present application is continuously updated iteratively, for example, every 24 hours.
Then, after the trained tag recognition model is obtained, the obtained video features are input into the trained tag recognition model, and the matching probability between the video features and the candidate features corresponding to each candidate tag is determined, so that a matching probability is obtained for every candidate tag. Then, based on the obtained matching probabilities, the candidate tags meeting the matching probability condition are selected from the candidate tags and used as the video tags of the video to be optimized, and a video tag set comprising these video tags is generated.
In a specific implementation, the matching probability condition may be a probability threshold. Therefore, whether a candidate tag is used as a video tag of the video to be optimized may be determined by judging whether its matching probability is not smaller than the preset probability threshold.
For example, referring to fig. 4, which is a schematic flow chart of determining each video tag of a video to be optimized in the embodiment of the present application, assume that the video feature of the video to be optimized is X, the candidate tags are A1, A2, and A3, and the preset probability threshold is 80%. The video feature X is input into the trained tag recognition model and matched with the candidate features of candidate tags A1, A2, and A3, yielding a matching probability of 30% between X and the candidate feature of A1, 80% between X and the candidate feature of A2, and 90% between X and the candidate feature of A3. Candidate tag A2 (matching probability 80%) and candidate tag A3 (matching probability 90%) are therefore used as video tags of the video to be optimized, and a video tag set including candidate tag A2 and candidate tag A3 is generated.
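A minimal sketch of the threshold filter in the example above; the tag names and probabilities mirror that example, and the helper name is illustrative.

```python
def select_video_tags(match_probs: dict, prob_threshold: float = 0.80) -> set:
    """Keep every candidate tag whose matching probability is not below the threshold."""
    return {tag for tag, p in match_probs.items() if p >= prob_threshold}

match_probs = {"A1": 0.30, "A2": 0.80, "A3": 0.90}
video_tag_set = select_video_tags(match_probs)
print(video_tag_set)   # {'A2', 'A3'}
```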
The video tags can be video scenario tags, video style tags, display object tags, or key content tags: a video scenario tag represents the story content of the video to be optimized, a video style tag represents the image style of the video, a display object tag represents the commodity displayed by the video, and a key content tag represents the benefit points of the video, that is, the content the video mainly intends to express.
It should be noted that, in the embodiment of the present application, a video tag set includes a plurality of video tags, and certainly, in an actual application process, a video tag may also be included.
In this way, by understanding the video content of the video to be optimized and obtaining its video label set based on the trained label recognition model, it can be ensured in the subsequent template matching process that the matched target templates are related to the video content, which helps improve the click rate of the generated target videos.
Step 21: and respectively obtaining a target template set corresponding to each of at least one target application platform based on the obtained at least one platform type, the original size and the video label set.
Wherein each target template has a target size that satisfies a launch size condition of a corresponding target application platform.
In the embodiment of the application, after the video tag of the video to be optimized is obtained, the target template set corresponding to at least one target application platform can be respectively obtained according to the obtained at least one platform type, the original size and the video tag set.
Optionally, in this embodiment of the present application, a possible implementation manner is provided for obtaining a target template set corresponding to at least one target application platform, and when step 21 is executed, a target template set corresponding to at least one target application platform needs to be obtained, specifically, taking any one target application platform (hereinafter referred to as a target application platform i) as an example, a process of obtaining the target template set is described as follows:
s211: and determining a candidate template set corresponding to the target application platform i according to the platform type and the original size of the target application platform i.
The candidate template set comprises a plurality of candidate templates, and each candidate template corresponds to one template label set.
In the embodiment of the application, firstly, the candidate templates meeting the launch size requirement of the target application platform i are determined from the template database according to the platform type of the target application platform i. Because each candidate template corresponds to an adjustable original size, the original size of the video to be optimized is then matched with the adjustable original size corresponding to each candidate template, several candidate templates are obtained through this matching, and a candidate template set containing the matched candidate templates is generated.
It should be noted that each candidate template corresponds to a template tag set, and the candidate templates are labeled in advance.
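A minimal sketch of step S211 under these assumptions: the template database is a list of records carrying the platform types (launch sizes) they serve, the adjustable original sizes they accept, and a pre-labeled template tag set. The record fields are illustrative, not the patent's data model.

```python
def select_candidate_templates(template_db: list, platform_type: str,
                               original_size: str) -> list:
    """Filter templates by platform type (launch size) and adjustable original size."""
    return [t for t in template_db
            if platform_type in t["platform_types"]
            and original_size in t["adjustable_original_sizes"]]

template_db = [
    {"id": "tpl_1", "platform_types": {"game"}, "adjustable_original_sizes": {"16:9"},
     "template_tags": {"education", "writing test paper"}},
    {"id": "tpl_2", "platform_types": {"social"}, "adjustable_original_sizes": {"16:9", "4:3"},
     "template_tags": {"car", "vehicle driving"}},
]
print([t["id"] for t in select_candidate_templates(template_db, "game", "16:9")])  # ['tpl_1']
```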
S212: and matching the video label set with the template label set corresponding to each candidate template respectively to obtain each target template, and generating a target template set comprising each target template.
In the embodiment of the application, the video tag sets are respectively matched with the template tag sets corresponding to the candidate templates, so that each target template capable of being matched is determined from the candidate templates, and a target template set comprising each target template is generated.
For example, for a characteristic scene, a financial instrument panel and a writing test paper scene in the 'education and finance' industry, a template containing materials with industry characteristics can be adopted for filling; aiming at special scenes, namely mouth broadcast and situation drama scenes in the financial and network service industries, a focus figure can be adopted to fill the special scenes along with the template; aiming at a characteristic scene 'commodity feature' scene in the 'e-commerce' industry, a 'selling point display' template can be adopted for filling; for a wide 'filling' scene in all industries, a depopulation template can be adopted; for a video multi-shot scene in all industries, a highlight shot display template and a hierarchical simulcast template can be adopted, which is not limited in the embodiment of the application.
For another example, referring to fig. 5, which is a schematic diagram of the process for determining a target template in the embodiment of the present application, assume that the video tag set includes two video tags, an education tag and a writing-test-paper tag, and that the template database includes four candidate templates: candidate template A, candidate template B, candidate template C, and candidate template D. The template tag set of candidate template A includes the education tag and the writing-test-paper tag; the template tag set of candidate template B includes a car tag, a driving school test tag, and a vehicle driving tag; the template tag set of candidate template C includes the education tag and the writing-test-paper tag; and the template tag set of candidate template D includes a live broadcast tag and a lovely style tag. The education tag and the writing-test-paper tag in the video tag set are matched against the template tag set of each candidate template to determine which template tag sets also include the education tag and the writing-test-paper tag. The candidate templates whose template tag sets include both tags, namely candidate template A and candidate template C, are obtained through matching, candidate template A and candidate template C are used as target templates, and a target template set including these target templates is generated.
For example, referring to fig. 6, which is a first schematic diagram of a target template in an embodiment of the present application, the template tag set corresponding to the candidate template shown in fig. 6 includes the education tag and the writing-test-paper tag, so that during video optimization the video to be optimized is filled into the to-be-optimized-video adding area of the template.
In the embodiment of the present application, in order to improve the accuracy of template recommendation, a candidate template is used as a matched target template when each video tag in the video tag set matches a corresponding video tag in the template tag set of the candidate template and vice versa, that is, when the two tag sets match each other.
Certainly, in an actual application process, when each video tag in the video tag set is included in the template tag set, the candidate template may be used as a target template obtained through matching, for example, when the video tag set includes two video tags, which are an automobile tag and a vehicle driving tag, and the template tag set corresponding to the candidate template includes an automobile tag, a vehicle driving tag, and a driving test tag, it is determined that each video tag in the video tag set at this time is included in the template tag set of the candidate template, and therefore, the candidate template may be used as the target template, which is not limited in this embodiment of the application.
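A minimal sketch of the looser matching rule just described (every video tag contained in the template tag set), reusing the tags of the fig. 5 example; the names and data are illustrative.

```python
def match_target_templates(video_tags: set, candidate_templates: dict) -> list:
    """Return the candidate templates whose tag sets cover all video tags."""
    return [name for name, template_tags in candidate_templates.items()
            if video_tags.issubset(template_tags)]

video_tags = {"education", "writing test paper"}
candidate_templates = {
    "A": {"education", "writing test paper"},
    "B": {"car", "driving school test", "vehicle driving"},
    "C": {"education", "writing test paper"},
    "D": {"live broadcast", "lovely style"},
}
print(match_target_templates(video_tags, candidate_templates))  # ['A', 'C']
```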
Optionally, in this embodiment of the application, the corresponding target template may be determined based on the video audience crowd information, the video tag set, the original size, and the platform type, so that the accuracy of determining the target template may be improved, and the video audience crowd information may be, for example, children, the elderly, and the like.
Optionally, in this embodiment of the application, a corresponding duration clipping scheme, an effect enhancement scheme, and the like may also be determined based on the video audience crowd information, the video tag set, the original size, and the platform type, which is not limited in this embodiment of the application.
For example, assuming that the advertisement duration that the target application platform can play is 10 seconds, and the video duration of the video to be optimized is 15 seconds, it is necessary to determine a corresponding duration clipping scheme based on the video tag set, the original size, and the platform type, and clip the video to be optimized to 10 seconds.
It should be noted that the effect enhancement scheme includes a sticker and a special effect, which are not limited in the embodiment of the present application.
Therefore, the target template for video size adjustment can be accurately pushed to the target object in a tag matching mode, the target template adaptive to video content can be recommended to the target object, size conversion is guaranteed, understandability of the converted target video can be improved, and click rate of the target video is improved.
Optionally, in the embodiment of the present application, a target template determined based on the video tag set is a target template that conforms to the video content of the video to be optimized. However, when no target template adapted to the video content of the video to be optimized is matched, the choices available to the target object for video optimization are reduced. Therefore, in order to ensure that the target object can obtain more target templates, thereby offering more template choices and improving the selectivity of video optimization, the embodiment of the application provides generic templates when the number of target templates does not reach the template number threshold. A possible implementation manner is provided; referring to fig. 7, which is a flowchart for determining a target template set in the embodiment of the present application, the process of obtaining the target template set is described below, again taking the target application platform i as an example:
s70: and obtaining each target template corresponding to the target application platform i based on the platform type, the original size and the video label set corresponding to the target application platform i.
In the embodiment of the application, each target template corresponding to the target application platform i is obtained based on the platform type, the original size and the video label set corresponding to the target application platform i.
It should be noted that the obtaining manner of each target template corresponding to the target application platform i may adopt the manner from S211 to S212, which is not described herein in detail.
In addition, after each target template corresponding to the target application platform i is obtained, the target template set is not generated immediately; the number of templates is checked first.
S71: and when the number of the templates of each target template is determined to be not less than the threshold value of the number of the templates, generating a target template set containing each target template.
In the embodiment of the present application, it is determined whether the template number of each target template of the target application platform i is smaller than a preset template number threshold, which may specifically be divided into the following two cases: in the first case: the template number of each target template of the target application platform i is not less than a preset template number threshold value; in the second case: and the template number of each target template of the target application platform i is less than a preset template number threshold value. Both of these cases will be described in detail later.
Specifically, when S71 is executed, if it is determined that the number of templates of each target template is not less than the threshold value of the number of templates, it is determined that each target template corresponding to the target application platform i at this time is the template for performing the video optimization processing, and therefore, a target template set including each target template is generated.
For example, assuming that the preset threshold value of the number of templates is 10, after the number of each target template is counted, the number of templates of each target template is determined to be 15, and then a target template set including 15 target templates is generated.
S72: and when the number of the templates is determined to be smaller than the threshold value of the number of the templates, acquiring the universal templates with the corresponding number, and generating a target template set comprising the universal templates and the target templates.
In the embodiment of the application, whether the number of templates of each target template is smaller than a preset threshold value is judged, and when the number of templates is smaller than the threshold value of the number of templates, the corresponding number of general templates are obtained, and a target template set comprising each general template and each target template is generated.
Assuming the video to be optimized is an advertisement video, a universal (generic) template is suitable for all video advertisement materials and can be delivered in various advertising industries and at various traffic placements, whereas a non-universal target template is only suitable for part of the video advertisement materials, needs to be used on the basis of understanding the video content, and is a customized template.
For example, if it is determined that a customized non-generic template cannot be output, or the number of adapted non-generic templates is less than the template number threshold, generic templates are output to make up the difference, and finally 12 target templates are output per video.
Optionally, in the embodiment of the present application, a possible implementation manner is provided for obtaining a corresponding number of general templates, and a process of obtaining a corresponding number of general templates in the embodiment of the present application is described in detail below, including:
and calculating a difference value between the threshold value of the number of the templates and the number of the templates, and randomly acquiring the number of the universal templates corresponding to the difference value from each universal template so as to generate a target template set comprising each universal template and each target template.
It should be noted that, in the embodiment of the present application, the obtained general template is a template capable of meeting the delivery size requirement of the target application platform i.
For example, referring to fig. 8, which shows a diagram of a generic template in the embodiment of the present application, the target size of the generic template is 9:16 and the original size of the video to be optimized that can be filled is 16:9. That is, when the original size of the video to be optimized is 16:9, the video to be optimized can be adjusted from the original size of 16:9 to the target size of 9:16 through the generic template, so as to meet the target size requirement of the target application platform.
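The fallback logic of S71 and S72 can be summarized as a simple count-and-pad rule. The Python sketch below illustrates it under stated assumptions: the threshold value of 10, the random selection of generic templates, and the list-of-template-objects representation are all illustrative choices, since the embodiment only requires that the shortfall be filled with generic templates meeting the platform's delivery size requirement.

```python
import random

# Assumed value; the embodiment only requires "a preset template number threshold".
TEMPLATE_COUNT_THRESHOLD = 10


def build_target_template_set(matched_templates, generic_templates):
    """Count-and-pad rule of S71/S72 (illustrative sketch).

    Both arguments are assumed to be lists of template objects that already
    satisfy the delivery size requirement of target application platform i.
    """
    if len(matched_templates) >= TEMPLATE_COUNT_THRESHOLD:
        # First case: enough customized templates, use them directly.
        return list(matched_templates)

    # Second case: fill the shortfall with randomly chosen generic templates.
    shortfall = TEMPLATE_COUNT_THRESHOLD - len(matched_templates)
    padding = random.sample(generic_templates, min(shortfall, len(generic_templates)))
    return list(matched_templates) + padding
```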
Optionally, in this embodiment of the present application, the number of obtained target templates may be large, and rendering every target template with the video to be optimized to obtain each target video would increase the amount of computation. Therefore, a possible implementation is provided in which only the target templates selected by the target object are rendered to obtain the target videos, so as to reduce the amount of computation. Referring to fig. 9, which is a schematic flow diagram for obtaining a new target template set in this embodiment of the present application, the process includes:
s90: and respectively sending the playing parameters corresponding to the target templates to the client so that the client generates preview videos based on the playing parameters and the videos to be optimized, and responding to a selection instruction for each preview video and sending the target template identification corresponding to each selected preview video to the server.
In the embodiment of the application, the server sends the playing parameters corresponding to the target templates to the client using a communication protocol agreed with the client. After receiving the playing parameters, the client generates a preview video under each playing parameter based on the playing parameters and the video to be optimized, and displays the generated preview videos to the target object in a preset display mode. The target object can then trigger a selection instruction for the displayed preview videos; in response to the selection instruction, the client determines each selected preview video and sends the target template identifier corresponding to each selected preview video to the server.
For example, referring to fig. 10, which is a second interface schematic diagram in the embodiment of the present application, the target templates obtained through template matching are a target template A "portrait video 9:16" meeting the delivery size requirement of the application platform X, and a target template B "landscape video 16:9" and a target template C "landscape video 16:9" meeting the delivery size requirement of the target application platform Y. The target object may therefore view, in a preview area, the preview videos generated using the target template A, the target template B and the target template C, and may select the target template that needs to be rendered, so that the server renders the video to be optimized accordingly.
S91: and receiving each target template identifier returned by the client, and regenerating a target template set containing the target template corresponding to each target template identifier.
In the embodiment of the application, each target template identifier returned by the client is received, the target template corresponding to each returned identifier is determined, and a target template set containing these target templates is regenerated. In this way, client-side rendering only needs to map the target template into the agreed communication protocol for display, and does not need to actually call the FFmpeg video rendering framework for synthesis rendering, so the rendering speed is high. With front-end and back-end rendering separated, the speed of the client-side rendering and splicing display process is guaranteed, the stability of the server-side rendering, synthesis and storage process is maintained, and the waiting time of the user is reduced.
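As a rough illustration of the S90/S91 round trip, the sketch below shows a server-side view: play parameters go out to the client, selected template identifiers come back, and the target template set is regenerated from them. The payload field names and the Template structure are assumptions for illustration; the embodiment does not fix a concrete message format.

```python
from dataclasses import dataclass


@dataclass
class Template:
    template_id: str
    play_params: dict  # parameters the client uses to build a lightweight preview


def build_preview_payload(target_templates):
    """S90 (server side): package the play parameters for each target template
    so the client can render previews without calling the real rendering framework."""
    return [
        {"template_id": t.template_id, "play_params": t.play_params}
        for t in target_templates
    ]


def regenerate_target_template_set(selected_ids, target_templates):
    """S91 (server side): keep only the templates whose identifiers the client
    returned, i.e. the ones the target object actually selected."""
    by_id = {t.template_id: t for t in target_templates}
    return [by_id[i] for i in selected_ids if i in by_id]
```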
Step 22: and filling the video to be optimized into each target template contained in the obtained at least one target template set respectively to obtain each optimized target video.
In the embodiment of the application, videos to be optimized are respectively filled into each target template contained in at least one obtained target template set, so that each optimized target video is obtained.
For example, fig. 11 is a schematic effect view of a first target video in the embodiment of the present application, showing a target video filled in an education scene; fig. 12 is a schematic effect view of a second target video, showing a target video filled in a reading scene; and fig. 13 is a schematic effect view of a third target video, showing the effect in a filled food scene.
Optionally, in this embodiment of the present application, a possible implementation manner is provided for executing step 22: the original size may be adjusted in equal proportion so that the video to be optimized meets the size requirement of the filling area while being guaranteed not to be distorted. The manner of performing video filling in this embodiment of the present application is described in detail below, and includes:
s221: and carrying out equal-proportion adjustment on the original size of the video to be optimized based on the size of a filling area corresponding to a target filling area in a target template.
In the embodiment of the application, based on the size of the filling area corresponding to the target filling area in one target template, the original size of the video to be optimized is adjusted in equal proportion, so that the original size of the video to be optimized is adjusted to the size of the target filling area, that is, the adjusted video to be optimized and the target filling area have the same aspect ratio.
S222: and filling the adjusted video to be optimized into the target filling area to obtain the optimized target video.
In the embodiment of the application, since the aspect ratio of the adjusted video to be optimized is the same as that of the target filling region, the adjusted video to be optimized is filled into the target filling region, so that the optimized target video is obtained.
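A minimal sketch of the equal-proportion adjustment in S221 and S222 is given below, assuming the fill region size is given in pixels; the helper name and the min-scale rule (used when the region's aspect ratio differs slightly from the source) are illustrative assumptions.

```python
def fit_to_fill_region(original_w, original_h, region_w, region_h):
    """Scale the source by a single factor so it fits the template's fill
    region without distortion (the aspect ratio of the source is preserved)."""
    scale = min(region_w / original_w, region_h / original_h)
    return round(original_w * scale), round(original_h * scale)


# When the fill region has the same aspect ratio as the source, the scaled
# frame matches the region exactly, e.g. a 1280x720 source in a 720x405 region:
print(fit_to_fill_region(1280, 720, 720, 405))  # -> (720, 405)
```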
For example, the video to be optimized is a landscape video with a width of 1280 pixels and a height of 720 pixels, while the target size of the target application platform is a width of 720 pixels and a height of 1280 pixels. It is therefore determined that the original size of the video to be optimized cannot meet the delivery size requirement of the target application platform, and the video size needs to be converted with one click. Through the label identification model, the video scenario label of the video to be optimized is determined to be internet e-commerce, the video style label is light and fast-paced, and the display object label is an e-commerce product, so the target template obtained through label matching is a customized template of the e-commerce industry with a customized "red packet rain" effect. If there is no customized e-commerce template under the current traffic specification, a generic template is recommended, such as a three-grid or accordion template. Then, after receiving the playing parameters for the e-commerce template sent by the server, the client plays the video to be optimized according to the playing parameters so that the user can preview the video without actually rendering a finished video material; finally, the target template is selected, the target template identifier corresponding to the target template is sent to the server, and the final target video is generated and stored in the material library of the server.
In the embodiment of the application, the current video to be optimized is first understood based on the label identification model; then, on the basis of this understanding, the target template for size conversion, the duration adjustment scheme and the process for enriching the video effects are predicted, that is, various conversion parameters of the video to be optimized are obtained; finally, the video to be optimized is optimized through video rendering to obtain the optimized target video. In this way, abundant target templates are adopted and different target templates are matched according to different industries, traffic positions and advertisement contents, realizing one-click video delivery, greatly lowering the threshold for video production and delivery, and improving the video delivery effect.
Based on the foregoing embodiment, a specific example is used to describe the video optimization method in the embodiment of the present application, and referring to fig. 14, a flowchart of the advertisement optimization method in the embodiment of the present application is shown, which includes:
1. The target object selects, in the client, each target application platform on which the advertisement is expected to be delivered, namely a game application platform and a video playing application platform, triggers the generation of an optimization request by clicking a "confirm" operation control, and uploads the advertisement to be optimized to the client by clicking and dragging.
2. Label identification is performed on the advertisement to be optimized, and a video label set of the advertisement to be optimized is determined, the video label set containing an education label and an after-class tutoring label.
3. Label matching is performed based on the education label and the after-class tutoring label to obtain a corresponding target template set, the target template set containing a target template A, a target template B and a target template C, where the target template A and the target template C are generic templates.
4. The playing parameters of the target template A, the target template B and the target template C are sent to the client, so that the client generates preview advertisements according to these playing parameters and the advertisement to be optimized. The target object can then select the corresponding target template, namely the target template B, based on the generated preview advertisements, and click a "confirm" operation control in the operation interface, so that the client sends the target template identifier corresponding to the target template B to the server.
5. And determining a target template B based on the target template identification, and optimizing the advertisement to be optimized based on the target template B to obtain the target advertisement.
Based on the foregoing embodiment, referring to fig. 15, another flowchart of a video optimization method in the embodiment of the present application is shown, including:
step 150: and obtaining a video plot label, a visual style label, a display object label and a key content label by adopting the trained label identification model.
In the embodiment of the application, shot segmentation, video color identification, video multi-label matching, main body detection, subtitle identification, filling removal, single-target tracking and video cover image identification may be adopted to obtain the video scenario label, the visual style label, the display object label, the key content label and other attribute labels.
Step 151: and respectively obtaining a target template set corresponding to at least one target application platform based on the obtained at least one platform type, original size, video scenario label, visual style label, display object label and key content label.
The target template can be a video depopulation template, a focus following template, an intelligent drawing template, an intelligent color taking template, a three-grid template, an accordion template, a split-mirror simulcast template and a fuzzy filling template.
Step 152: and rendering the video to be optimized based on each target template contained in the target template set to obtain each target video.
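Step 152 is where the server-side synthesis with the FFmpeg video rendering framework (mentioned above) would take place. As a hedged illustration only, the snippet below composes a landscape source onto a 9:16 canvas using the stock scale and pad filters; the actual filter graphs of templates such as the three-grid or fuzzy-filling template are not disclosed in the embodiment and would be considerably richer.

```python
import subprocess


def render_on_vertical_canvas(src_path, dst_path, canvas_w=720, canvas_h=1280):
    """Illustrative render step: scale a landscape source to the canvas width
    and pad it vertically so it is centered on a 9:16 canvas."""
    filter_graph = (
        f"scale={canvas_w}:-2,"                    # fit width, keep aspect ratio
        f"pad={canvas_w}:{canvas_h}:0:(oh-ih)/2"   # center on the vertical canvas
    )
    subprocess.run(
        ["ffmpeg", "-y", "-i", src_path, "-vf", filter_graph, dst_path],
        check=True,
    )
```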
Based on the same inventive concept, the embodiment of the application also provides a video optimization device. As shown in fig. 16, which is a schematic structural diagram of the video optimization apparatus 1600, the video optimization apparatus may include:
the first identification unit 1601 is configured to perform label identification on a video to be optimized in response to an optimization request for the video to be optimized, and obtain a video label set corresponding to the video to be optimized, where the optimization request includes a platform type of at least one target application platform and an original size of the video to be optimized;
a second identifying unit 1602, configured to obtain target template sets respectively corresponding to at least one target application platform based on the obtained at least one platform type, original size, and video tag set; each target template has a target size which meets the release size condition of the corresponding target application platform;
the optimizing unit 1603 is configured to respectively fill the video to be optimized into each target template included in the obtained at least one target template set, so as to obtain each optimized target video.
Optionally, the second identifying unit 1602 is configured to:
respectively executing the following operations aiming at least one target application platform:
determining a candidate template set corresponding to a target application platform according to the platform type and the original size of the target application platform, wherein the candidate template set comprises a plurality of candidate templates, and each candidate template corresponds to a template label set;
and matching the video label set with the template label set corresponding to each candidate template respectively to obtain each target template, and generating a target template set comprising each target template.
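The matching performed by the second identification unit 1602 can be pictured as a label-set comparison. The sketch below keeps every candidate whose template label set shares at least one label with the video label set; the "any shared label" rule and the template_label_set attribute name are illustrative assumptions, since the embodiment does not fix the exact matching criterion (a ranked overlap score would fit the same structure).

```python
def match_candidate_templates(video_label_set, candidate_templates):
    """Keep the candidates whose template label set overlaps the video label set."""
    video_labels = set(video_label_set)
    return [
        candidate
        for candidate in candidate_templates
        if video_labels & set(candidate.template_label_set)  # non-empty intersection
    ]
```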
Optionally, when performing label identification on the video to be optimized to obtain a video label set of the video to be optimized, the first identifying unit 1601 is configured to:
obtaining video characteristics of a video to be optimized, wherein the video characteristics are at least one of image content characteristics, audio content characteristics and text content characteristics;
determining matching probability between the video features and the candidate features corresponding to the candidate labels by adopting a trained label recognition model, taking the candidate labels meeting the matching probability condition as video labels of the video to be optimized, and generating a video label set containing the video labels;
the label recognition model is obtained through iterative training based on a training sample set, the training sample set comprises a plurality of video samples to be optimized, and the video samples to be optimized respectively correspond to video sample characteristics and video labels of the samples.
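For the first identification unit 1601, the description only requires a trained model that yields a matching probability between the video features and each candidate label's features, together with a threshold on that probability. The sketch below uses cosine similarity mapped to [0, 1] as the matching probability; the similarity measure, the 0.5 threshold and the feature dictionary layout are assumptions, not the model actually trained in the embodiment.

```python
import numpy as np


def predict_video_labels(video_feature, candidate_label_features, prob_threshold=0.5):
    """Score the video feature against each candidate label feature and keep the
    labels whose matching probability meets the threshold."""
    v = np.asarray(video_feature, dtype=float)
    v = v / np.linalg.norm(v)
    video_labels = []
    for label, feature in candidate_label_features.items():
        f = np.asarray(feature, dtype=float)
        f = f / np.linalg.norm(f)
        probability = (float(v @ f) + 1.0) / 2.0  # map cosine from [-1, 1] to [0, 1]
        if probability >= prob_threshold:
            video_labels.append(label)
    return video_labels
```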
Optionally, the optimizing unit 1603 is configured to:
for each target template contained in the obtained at least one target template set, respectively executing the following operations:
based on the size of a filling area corresponding to a target filling area in a target template, carrying out equal-proportion adjustment on the original size of a video to be optimized;
and filling the adjusted video to be optimized into the target filling area to obtain the optimized target video.
Optionally, after the target template sets respectively corresponding to the at least one target application platform are obtained, and before the video to be optimized is filled into each target template contained in the obtained at least one target template set, the apparatus further includes a processing unit 1604, where the processing unit 1604 is configured to:
respectively executing the following operations aiming at least one target template set:
respectively sending the playing parameters corresponding to the target templates to the client so that the client generates preview videos based on the playing parameters and the videos to be optimized, and sending the target template identifications corresponding to the selected preview videos to the server in response to selection instructions aiming at the preview videos;
and receiving each target template identifier returned by the client, and regenerating to generate a target template set containing the target template corresponding to each target template identifier.
Optionally, the second identifying unit 1602 is configured to:
respectively executing the following operations aiming at least one target application platform:
obtaining each target template corresponding to one target application platform based on the platform type, the original size and the video label set corresponding to the target application platform;
when the number of the templates of each target template is determined to be not less than the threshold value of the number of the templates, generating a target template set containing each target template;
and when the number of the templates is determined to be smaller than the threshold value of the number of the templates, acquiring the universal templates with the corresponding number, and generating a target template set comprising the universal templates and the target templates.
Optionally, the video tag is any one of the following: a video storyline tab, a video style tab, a presentation object tab, or a key content tab.
As will be appreciated by one skilled in the art, aspects of the present application may be embodied as a system, method or program product. Accordingly, various aspects of the present application may be embodied in the form of: an entirely hardware embodiment, an entirely software embodiment (including firmware, microcode, etc.) or an embodiment combining hardware and software aspects that may all generally be referred to herein as a "circuit," "module" or "system."
In some possible embodiments, a video optimization device according to the present application may include at least a processor and a memory. Wherein the memory stores program code which, when executed by the processor, causes the processor to perform the steps of the video optimization method according to various exemplary embodiments of the present application described in the specification. For example, a processor may perform the steps as shown in fig. 2.
Based on the same inventive concept as the method embodiments, an embodiment of the present application further provides an electronic device. In one embodiment, the electronic device may be a server, such as the server 120 shown in FIG. 1. In this embodiment, the electronic device may be configured as shown in FIG. 17, and may include a memory 1701, a communication module 1703, and one or more processors 1702.
The memory 1701 is used to store computer programs executed by the processor 1702. The memory 1701 may mainly include a storage program area and a storage data area, wherein the storage program area may store an operating system, a program required for running an instant messaging function, and the like; the storage data area can store various instant messaging information, operation instruction sets and the like.
The memory 1701 may be a volatile memory, such as a random-access memory (RAM); the memory 1701 may also be a non-volatile memory, such as a read-only memory (ROM), a flash memory, a hard disk drive (HDD) or a solid-state drive (SSD); or the memory 1701 is any other medium that can be used to carry or store desired program code in the form of instructions or data structures and that can be accessed by a computer, but is not limited thereto. The memory 1701 may be a combination of the above memories.
The processor 1702 may include one or more central processing units (CPUs), a digital processing unit, and the like. The processor 1702 is configured to implement the above-described video optimization method when invoking a computer program stored in the memory 1701.
The communication module 1703 is used for communicating with the terminal device and other servers.
The embodiment of the present application does not limit the specific connection medium among the memory 1701, the communication module 1703 and the processor 1702. In the embodiment of the present application, the memory 1701 and the processor 1702 are connected through the bus 1704 in FIG. 17, where the bus 1704 is depicted by a thick line; the connection manner between other components is merely illustrative and not limiting. The bus 1704 may be divided into an address bus, a data bus, a control bus, and the like. For ease of description, only one thick line is depicted in FIG. 17, but this does not mean that there is only one bus or only one type of bus.
The memory 1701 stores therein a computer storage medium having stored therein computer-executable instructions for implementing the video optimization method of the embodiments of the present application. The processor 1702 is configured to perform the video optimization method described above, as shown in fig. 2.
In some possible embodiments, the aspects of the video optimization method provided in the present application may also be implemented in the form of a program product including program code for causing a computer device to perform the steps in the video optimization method according to various exemplary embodiments of the present application described above in this specification when the program product is run on the computer device, for example, the computer device may perform the steps as shown in fig. 2.
The program product may employ any combination of one or more readable media. The readable medium may be a readable signal medium or a readable storage medium. A readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples (a non-exhaustive list) of the readable storage medium include: an electrical connection having one or more wires, a portable disk, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
The program product of embodiments of the present application may employ a portable compact disc read only memory (CD-ROM) and include program code, and may be run on a computing device. However, the program product of the present application is not limited thereto, and in this document, a readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with a command execution system, apparatus, or device.
A readable signal medium may include a propagated data signal with readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A readable signal medium may also be any readable medium that is not a readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with a command execution system, apparatus, or device.
Program code embodied on a readable medium may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing.
Program code for carrying out operations of the present application may be written in any combination of one or more programming languages, including an object-oriented programming language such as Java or C++ and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The program code may execute entirely on the user computing device, partly on the user computing device, as a stand-alone software package, partly on the user computing device and partly on a remote computing device, or entirely on the remote computing device or server. In the case of remote computing devices, the remote computing device may be connected to the user computing device through any kind of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or may be connected to an external computing device (e.g., through the internet using an internet service provider).
It should be noted that although several units or sub-units of the apparatus are mentioned in the above detailed description, such division is merely exemplary and not mandatory. Indeed, the features and functions of two or more units described above may be embodied in one unit, according to embodiments of the application. Conversely, the features and functions of one unit described above may be further divided into embodiments by a plurality of units.
Further, while the operations of the methods of the present application are depicted in the drawings in a particular order, this does not require or imply that these operations must be performed in this particular order, or that all of the illustrated operations must be performed, to achieve desirable results. Additionally or alternatively, certain steps may be omitted, multiple steps combined into one step execution, and/or one step broken down into multiple step executions.
As will be appreciated by one skilled in the art, embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
While the preferred embodiments of the present application have been described, additional variations and modifications in those embodiments may occur to those skilled in the art once they learn of the basic inventive concepts. Therefore, it is intended that the appended claims be interpreted as including preferred embodiments and all alterations and modifications as fall within the scope of the application.
It will be apparent to those skilled in the art that various changes and modifications may be made in the present application without departing from the spirit and scope of the application. Thus, if such modifications and variations of the present application fall within the scope of the claims of the present application and their equivalents, the present application is intended to include such modifications and variations as well.

Claims (15)

1. A method for video optimization, comprising:
responding to an optimization request aiming at a video to be optimized, performing label identification on the video to be optimized, and obtaining a video label set corresponding to the video to be optimized, wherein the optimization request comprises a platform type of at least one target application platform and an original size of the video to be optimized;
respectively obtaining a target template set corresponding to each of the at least one target application platform based on the obtained at least one platform type, the original size and the video tag set; each target template has a target size which meets the release size condition of the corresponding target application platform;
and filling the video to be optimized into each target template contained in the obtained at least one target template set respectively to obtain each optimized target video.
2. The method of claim 1, wherein obtaining a set of target templates corresponding to each of the at least one target application platform based on the obtained at least one platform type, the original size, and the set of video tags respectively comprises:
respectively executing the following operations aiming at the at least one target application platform:
determining a candidate template set corresponding to one target application platform according to the platform type and the original size of the target application platform, wherein the candidate template set comprises a plurality of candidate templates, and each candidate template corresponds to one template label set;
and matching the video label set with the template label sets respectively corresponding to the candidate templates to obtain the target templates and generate a target template set comprising the target templates.
3. The method of claim 1, wherein performing label recognition on the video to be optimized to obtain a video label set of the video to be optimized comprises:
obtaining video characteristics of the video to be optimized, wherein the video characteristics are at least one of image content characteristics, audio content characteristics and text content characteristics;
determining matching probability between the video features and candidate features corresponding to the candidate labels by adopting a trained label recognition model, taking the candidate labels meeting the matching probability condition as video labels of the video to be optimized, and generating a video label set containing the video labels;
the label recognition model is obtained through iterative training based on a training sample set, the training sample set comprises a plurality of video samples to be optimized, and the video samples to be optimized respectively correspond to video sample characteristics and video labels of the samples.
4. The method according to claim 1, 2 or 3, wherein the filling the video to be optimized into each target template included in the obtained at least one target template set respectively to obtain each optimized target video comprises:
for each target template contained in the obtained at least one target template set, respectively executing the following operations:
performing equal proportion adjustment on the original size of the video to be optimized based on the size of a filling area corresponding to a target filling area in a target template;
and filling the adjusted video to be optimized into the target filling area to obtain the optimized target video.
5. The method according to claim 1, 2 or 3, wherein after obtaining the target template sets corresponding to the at least one target application platform respectively, before filling the video to be optimized into the target templates included in the obtained at least one target template set respectively, further comprising:
for the at least one target template set, respectively performing the following operations:
respectively sending the playing parameters corresponding to the target templates to a client so that the client generates preview videos based on the playing parameters and the videos to be optimized, and responding to a selection instruction for each preview video to send the target template identification corresponding to each selected preview video to a server;
and receiving each target template identifier returned by the client, and regenerating a target template set containing the target template corresponding to each target template identifier.
6. The method of claim 5, wherein obtaining a set of target templates corresponding to each of the at least one target application platform based on the obtained at least one platform type, the original size, and the set of video tags respectively comprises:
respectively executing the following operations aiming at the at least one target application platform:
obtaining each target template corresponding to one target application platform based on the platform type corresponding to the target application platform, the original size and the video label set;
when the number of the templates of each target template is determined to be not less than the threshold value of the number of the templates, generating a target template set containing each target template;
and when the number of the templates is determined to be smaller than the threshold value of the number of the templates, acquiring the universal templates with the corresponding number, and generating a target template set comprising the universal templates and the target templates.
7. A method as claimed in claim 1, 2 or 3, wherein the video tag is any one of: a video storyline tab, a video style tab, a presentation object tab, or a key content tab.
8. A video optimization apparatus, comprising:
the video optimizing method comprises a first identification unit, a second identification unit and a third identification unit, wherein the first identification unit is used for responding to an optimizing request aiming at a video to be optimized, carrying out label identification on the video to be optimized and obtaining a video label set corresponding to the video to be optimized, and the optimizing request comprises a platform type of at least one target application platform and an original size of the video to be optimized;
a second identification unit, configured to obtain, based on the obtained at least one platform type, the original size, and the video tag set, a target template set corresponding to each of the at least one target application platform respectively; each target template has a target size which meets the release size condition of the corresponding target application platform;
and the optimization unit is used for respectively filling the video to be optimized into each target template contained in the obtained at least one target template set to obtain each optimized target video.
9. The apparatus of claim 8, wherein the second identification unit is to:
respectively executing the following operations aiming at the at least one target application platform:
determining a candidate template set corresponding to one target application platform according to the platform type and the original size of the target application platform, wherein the candidate template set comprises a plurality of candidate templates, and each candidate template corresponds to one template label set;
and matching the video label set with the template label sets respectively corresponding to the candidate templates to obtain the target templates and generate a target template set comprising the target templates.
10. The apparatus of claim 8, wherein when tag identification is performed on the video to be optimized, and a video tag set of the video to be optimized is obtained, the first identification unit is configured to:
obtaining video characteristics of the video to be optimized, wherein the video characteristics are at least one of image content characteristics, audio content characteristics and text content characteristics;
determining matching probability between the video features and candidate features corresponding to the candidate labels by adopting a trained label recognition model, taking the candidate labels meeting the matching probability condition as video labels of the video to be optimized, and generating a video label set containing the video labels;
the label recognition model is obtained through iterative training based on a training sample set, the training sample set comprises a plurality of video samples to be optimized, and the video samples to be optimized respectively correspond to video sample characteristics and video labels of the samples.
11. The apparatus according to claim 8, 9 or 10, wherein the optimization unit is configured to:
for each target template contained in the obtained at least one target template set, respectively executing the following operations:
performing equal proportion adjustment on the original size of the video to be optimized based on the size of a filling area corresponding to a target filling area in a target template;
and filling the adjusted video to be optimized into the target filling area to obtain the optimized target video.
12. The apparatus according to claim 8, 9 or 10, wherein after obtaining the target template sets corresponding to the at least one target application platform respectively, before filling the video to be optimized into the target templates included in the obtained at least one target template set respectively, the apparatus further includes a processing unit, the processing unit is configured to:
for the at least one target template set, respectively performing the following operations:
respectively sending the playing parameters corresponding to the target templates to a client so that the client generates preview videos based on the playing parameters and the videos to be optimized, and responding to a selection instruction for each preview video to send the target template identification corresponding to each selected preview video to a server;
and receiving each target template identifier returned by the client, and regenerating a target template set containing the target template corresponding to each target template identifier.
13. An electronic device, characterized in that it comprises a processor and a memory, wherein the memory stores program code which, when executed by the processor, causes the processor to carry out the steps of the method of any of claims 1-7.
14. A computer-readable storage medium, characterized in that it comprises program code for causing an electronic device to perform the steps of the method of any of claims 1-7, when the storage medium is run on the electronic device.
15. A computer program product comprising computer instructions, characterized in that the computer instructions, when executed by a processor, implement the steps of the method according to any one of claims 1 to 7.
CN202111238981.XA 2021-10-25 2021-10-25 Video optimization method and device, electronic equipment and storage medium Active CN114286181B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111238981.XA CN114286181B (en) 2021-10-25 2021-10-25 Video optimization method and device, electronic equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111238981.XA CN114286181B (en) 2021-10-25 2021-10-25 Video optimization method and device, electronic equipment and storage medium

Publications (2)

Publication Number Publication Date
CN114286181A true CN114286181A (en) 2022-04-05
CN114286181B CN114286181B (en) 2023-08-15

Family

ID=80868899

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111238981.XA Active CN114286181B (en) 2021-10-25 2021-10-25 Video optimization method and device, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN114286181B (en)

Citations (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9602881B1 (en) * 2016-01-14 2017-03-21 Echostar Technologies L.L.C. Apparatus, systems and methods for configuring a mosaic of video tiles
US20180376178A1 (en) * 2017-06-21 2018-12-27 Google Inc. Dynamic custom interstitial transition videos for video streaming services
CN109168028A (en) * 2018-11-06 2019-01-08 北京达佳互联信息技术有限公司 Video generation method, device, server and storage medium
CN110602544A (en) * 2019-09-12 2019-12-20 腾讯科技(深圳)有限公司 Video display method and device, electronic equipment and storage medium
CN110662103A (en) * 2019-09-26 2020-01-07 北京达佳互联信息技术有限公司 Multimedia object reconstruction method and device, electronic equipment and readable storage medium
CN111683280A (en) * 2020-06-04 2020-09-18 腾讯科技(深圳)有限公司 Video processing method and device and electronic equipment
CN111739128A (en) * 2020-07-29 2020-10-02 广州筷子信息科技有限公司 Target video generation method and system
US20200404173A1 (en) * 2018-07-23 2020-12-24 Tencent Technology (Shenzhen) Company Limited Video processing method and apparatus, terminal device, server, and storage medium
WO2021073315A1 (en) * 2019-10-14 2021-04-22 北京字节跳动网络技术有限公司 Video file generation method and device, terminal and storage medium
CN113473182A (en) * 2021-09-06 2021-10-01 腾讯科技(深圳)有限公司 Video generation method and device, computer equipment and storage medium
WO2021196281A1 (en) * 2020-03-30 2021-10-07 北京金堤科技有限公司 Multimedia file generation method and apparatus, storage medium and electronic device

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115348459A (en) * 2022-08-16 2022-11-15 支付宝(杭州)信息技术有限公司 Short video processing method and device
CN115379259A (en) * 2022-08-18 2022-11-22 百度在线网络技术(北京)有限公司 Video processing method and device, electronic equipment and storage medium
CN115379259B (en) * 2022-08-18 2024-04-26 百度在线网络技术(北京)有限公司 Video processing method, device, electronic equipment and storage medium

Also Published As

Publication number Publication date
CN114286181B (en) 2023-08-15

Similar Documents

Publication Publication Date Title
CN110458918B (en) Method and device for outputting information
US20210342385A1 (en) Interactive method and system of bullet screen easter eggs
CN114286181B (en) Video optimization method and device, electronic equipment and storage medium
CN112948708B (en) Short video recommendation method
CN111144937A (en) Advertisement material determination method, device, equipment and storage medium
CN110225398B (en) Multimedia object playing method, device and equipment and computer storage medium
CN111400518A (en) Method, device, terminal, server and system for generating and editing works
CN111836118B (en) Video processing method, device, server and storage medium
CN112995749A (en) Method, device and equipment for processing video subtitles and storage medium
US20230291978A1 (en) Subtitle processing method and apparatus of multimedia file, electronic device, and computer-readable storage medium
CN111897950A (en) Method and apparatus for generating information
CN110569429A (en) method, device and equipment for generating content selection model
CN115496820A (en) Method and device for generating image and file and computer storage medium
CN111144974B (en) Information display method and device
CN112182281B (en) Audio recommendation method, device and storage medium
CN115100582A (en) Model training method and device based on multi-mode data
CN109816023B (en) Method and device for generating picture label model
CN111914850B (en) Picture feature extraction method, device, server and medium
CN112954453A (en) Video dubbing method and apparatus, storage medium, and electronic device
CN109947526B (en) Method and apparatus for outputting information
US20190384466A1 (en) Linking comments to segments of a media presentation
CN113505268A (en) Interactive processing method and device
CN113641853A (en) Dynamic cover generation method, device, electronic equipment, medium and program product
WO2022188563A1 (en) Dynamic cover setting method and system
CN112818914B (en) Video content classification method and device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant