CN113490027A - Short video production generation processing method and equipment and computer storage medium - Google Patents


Info

Publication number
CN113490027A
CN113490027A
Authority
CN
China
Prior art keywords
video
sub
segment
short
short video
Prior art date
Legal status
Pending
Application number
CN202110766814.6A
Other languages
Chinese (zh)
Inventor
孔祥兰
Current Assignee
Wuhan Yirong Xinke Technology Co ltd
Original Assignee
Wuhan Yirong Xinke Technology Co ltd
Priority date
Filing date
Publication date
Application filed by Wuhan Yirong Xinke Technology Co ltd filed Critical Wuhan Yirong Xinke Technology Co ltd
Priority to CN202110766814.6A
Publication of CN113490027A
Legal status: Pending

Classifications

    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04N - PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 21/00 - Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N 21/40 - Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N 21/43 - Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N 21/4302 - Content synchronisation processes, e.g. decoder synchronisation
    • H04N 21/4307 - Synchronising the rendering of multiple content streams or additional data on devices, e.g. synchronisation of audio on a mobile phone with the video output on the TV screen
    • H04N 21/43072 - Synchronising the rendering of multiple content streams on the same device
    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04N - PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 5/00 - Details of television systems
    • H04N 5/222 - Studio circuitry; Studio devices; Studio equipment
    • H04N 5/2224 - Studio circuitry, devices or equipment related to virtual studio applications
    • H04N 5/2226 - Determination of depth image, e.g. for foreground/background separation
    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04N - PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 5/00 - Details of television systems
    • H04N 5/222 - Studio circuitry; Studio devices; Studio equipment
    • H04N 5/262 - Studio circuits, e.g. for mixing, switching-over, change of character of image, other special effects; Cameras specially adapted for the electronic generation of special effects
    • H04N 5/278 - Subtitling

Abstract

The invention discloses a short video production generation processing method, a device, and a computer storage medium. The method acquires the foreground and background images of each video frame image in every sub-video of a short video material and detects their contrasts; it analyzes the comprehensive image contrast of each sub-video and calculates the subtitle contrast in each sub-video. It also recognizes and audits the text information in each sub-video, acquires the audio appearance time and audio disappearance time in each sub-video, and calculates the subtitle appearance speed in each sub-video. The text information in each sub-video is then given the corresponding subtitle contrast setting and the corresponding subtitle appearance speed adjustment. This improves the overall viewing quality of short videos and meets people's diversified viewing needs.

Description

Short video production generation processing method and equipment and computer storage medium
Technical Field
The invention relates to the technical field of short video production and generation, in particular to a short video production and generation processing method, equipment and a computer storage medium.
Background
With the continuous popularization of mobile terminals and ever faster network access, short videos have gradually won users' favor thanks to characteristics such as short duration, fast pace, and large information volume. People like to generate and edit short videos during production, which greatly enriches short video content.
At present, the existing short video production generation processing technology has the following defects:
1. Most existing short video production generation processing methods add subtitles to short videos manually, which lowers the degree of intelligence of the process and often leaves the played audio mismatched with the subtitles. This harms the normal viewing effect and reduces people's viewing experience of, and interest in, short videos;
2. Existing short video production generation processing methods set the subtitle contrast according to manual production experience and cannot set it intelligently according to the image contrast of the short video. As a result, the contrast difference between subtitles and images may not be obvious, which lowers the overall viewing quality of the short video and fails to meet people's diversified viewing needs;
in order to solve the above problems, a short video production generation processing method, a device, and a computer storage medium are now designed.
Disclosure of Invention
The invention aims to provide a short video production generation processing method, equipment, and a computer storage medium. A short video material is divided into sub-videos; the foreground and background images of each video frame image in each sub-video are obtained and processed; the contrasts of the processed foreground and background images are detected; the comprehensive image contrast of each sub-video is analyzed and the subtitle contrast in each sub-video is calculated; the text information in each sub-video is recognized and audited; the audio appearance time and audio disappearance time in each sub-video are acquired and the subtitle appearance speed in each sub-video is calculated; finally, the text information in each sub-video is given the corresponding subtitle contrast setting and the corresponding subtitle appearance speed adjustment within its appearance time period. This solves the problems identified in the background art.
The purpose of the invention can be realized by the following technical scheme:
a short video production generation processing method comprises the following steps:
s1, short video material division: dividing the imported short video material into sub-videos of each segment according to a set division rule, and numbering the sub-videos in sequence;
s2, video frame image segmentation: respectively segmenting each video frame image in each segment of sub-video in the short video material by acquiring each video frame image in each segment of sub-video in the short video material to obtain a foreground image and a background image of each video frame image in each segment of sub-video in the short video material;
s3, video frame image processing: respectively processing foreground images and background images of all video frame images in all sections of sub-videos in a short video material by adopting an image processing technology to obtain foreground processed images and background processed images of all video frame images in all sections of sub-videos in the short video material;
s4, detecting image contrast: analyzing the comprehensive image contrast of each segment of sub-video in the short video material by respectively detecting the foreground processing image contrast and the background processing image contrast of each video frame image in each segment of sub-video in the short video material;
s5, subtitle contrast analysis: calculating the contrast ratio of the subtitles in each segment of sub-video in the short video material by extracting the standard contrast ratio of the video image and the subtitles in the short video stored in the storage database;
s6, identifying video and audio information: respectively identifying audio information in each segment of sub-video in the short video material through a voice identification technology, and converting to obtain text information in each segment of sub-video in the short video material;
S7, text information auditing: performing semantic association auditing and structure association auditing on the text information in each segment of sub-video in the short video material; if the semantic association or the structure association of the text information in a certain sub-video fails the audit, the text information in that sub-video is appropriately corrected manually;
s8, acquiring audio appearance and disappearance time: calculating the time period of text information in each segment of sub-video in the short video material by acquiring the audio appearing time and audio disappearing time in each segment of sub-video in the short video material, and calculating the subtitle appearing speed in each segment of sub-video in the short video material;
s9, video subtitle setting processing: the text information in each segment of sub-video in the short video material is set according to the corresponding subtitle contrast, and the text information in each segment of sub-video in the short video material is subjected to corresponding subtitle occurrence speed adjustment processing within the corresponding occurrence time period.
Further, in step S1, the short video material is divided into sub-videos at the pauses in the video's speech, and the sub-videos are numbered sequentially according to the video playing order; the numbers of the sub-videos in the short video material are 1, 2, ..., i, ..., n, where n is the total number of sub-videos.
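The division rule of step S1 can be sketched as follows. This is a hypothetical minimal implementation in which the voice-pause timestamps are assumed to be already known; the patent does not specify how pauses are detected.

```python
# Hypothetical sketch of step S1: dividing a short video into sub-videos at
# voice pauses and numbering them in playing order. The pause timestamps and
# the splitting policy are assumptions, not taken from the patent.
def divide_by_voice_pauses(duration, pause_times):
    """Split [0, duration] at each pause time; return numbered sub-videos."""
    cuts = [0.0] + sorted(t for t in pause_times if 0.0 < t < duration) + [duration]
    return [
        {"number": i + 1, "start": cuts[i], "end": cuts[i + 1]}
        for i in range(len(cuts) - 1)
    ]

segments = divide_by_voice_pauses(30.0, [8.5, 17.0, 24.2])
# Four numbered sub-videos: (0, 8.5), (8.5, 17.0), (17.0, 24.2), (24.2, 30.0)
```

Each returned sub-video carries its sequential number, matching the 1, 2, ..., n numbering used by the later steps.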
Further, step S2 includes counting the foreground images of each video frame image in each segment of sub-video in the short video material to form a foreground image set P_iA = (p_ia1, p_ia2, ..., p_iaj, ..., p_iam), where p_iaj represents the foreground image of the jth video frame image in the ith segment of sub-video in the short video material; and likewise counting the background images of each video frame image in each segment of sub-video to form a background image set P_iB = (p_ib1, p_ib2, ..., p_ibj, ..., p_ibm), where p_ibj represents the background image of the jth video frame image in the ith segment of sub-video in the short video material.
Further, the image processing technique adopted in step S3 includes performing geometric normalization processing on the foreground image and the background image of each video frame image in each segment of sub-video in the short video material, respectively, transforming into the foreground image and the background image of each video frame image in each segment of sub-video in a fixed standard form, performing optimization enhancement processing on the foreground image of each video frame image in each segment of sub-video after transformation, and performing blurring processing on the background image of each video frame image in each segment of sub-video after transformation.
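The three processing operations named here (geometric normalization, foreground enhancement, background blurring) can be illustrated with a pure-Python sketch on grayscale images stored as nested lists. The concrete choices below (nearest-neighbour resize, linear contrast stretch, 3x3 box blur) are illustrative assumptions, not the patent's prescribed algorithms.

```python
# A minimal pure-Python sketch of step S3's preprocessing on grayscale images.
# Nearest-neighbour resize, linear contrast stretch and 3x3 box blur are
# assumed stand-ins for the patent's normalisation/enhancement/blurring.
def resize_nearest(img, h, w):
    """Geometric normalisation: resample to a fixed h x w standard form."""
    H, W = len(img), len(img[0])
    return [[img[r * H // h][c * W // w] for c in range(w)] for r in range(h)]

def contrast_stretch(img):
    """Foreground enhancement: stretch intensities to the full 0-255 range."""
    lo = min(min(row) for row in img)
    hi = max(max(row) for row in img)
    if hi == lo:
        return [row[:] for row in img]
    return [[(v - lo) * 255 // (hi - lo) for v in row] for row in img]

def box_blur(img):
    """Background blurring: average each pixel with its 3x3 neighbourhood."""
    H, W = len(img), len(img[0])
    out = []
    for r in range(H):
        row = []
        for c in range(W):
            vals = [img[rr][cc]
                    for rr in range(max(0, r - 1), min(H, r + 2))
                    for cc in range(max(0, c - 1), min(W, c + 2))]
            row.append(sum(vals) // len(vals))
        out.append(row)
    return out

normalised = resize_nearest([[10, 10], [10, 20]], 2, 2)
foreground = contrast_stretch(normalised)
background = box_blur(normalised)
```

In a real pipeline these operations would be applied per frame, to the foreground and background images obtained in step S2.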
Further, the step S4 includes the following steps:
S41, detecting the foreground processed image contrast of each video frame image in each segment of sub-video in the short video material to form a foreground processed image contrast set k_iA = (k_ia1, k_ia2, ..., k_iaj, ..., k_iam), where k_iaj represents the contrast of the processed foreground image of the jth video frame image in the ith segment of sub-video in the short video material;
S42, detecting the background processed image contrast of each video frame image in each segment of sub-video in the short video material to form a background processed image contrast set k_iB = (k_ib1, k_ib2, ..., k_ibj, ..., k_ibm), where k_ibj represents the contrast of the processed background image of the jth video frame image in the ith segment of sub-video in the short video material;
S43, calculating the comprehensive image contrast of each segment of sub-video in the short video material as

K_i = (1/m) * Σ_{j=1}^{m} (α·k_iaj + β·k_ibj)

where K_i represents the comprehensive image contrast of the ith segment of sub-video in the short video material, m represents the number of video frame images in the ith segment of sub-video, and α and β represent the weighting coefficients of the foreground image and the background image in a video frame image respectively, with α + β = 1.
Further, the calculation formula of the subtitle contrast in each segment of sub-video in the short video material is

k′_i = λ_std · K_i

where k′_i represents the subtitle contrast in the ith sub-video within the short video material and λ_std represents the standard contrast ratio between the video image and the subtitles in a short video.
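A minimal sketch of the S43 and S5 calculations, assuming the comprehensive image contrast K_i is the weighted per-frame average of foreground and background contrasts (with α + β = 1) and the subtitle contrast k′_i scales K_i by the stored standard ratio; the original equations survive only as images, so this reading, and the coefficient values used, are assumptions.

```python
# Sketch of the reconstructed S43/S5 formulas; alpha, beta and lam_std are
# invented example values, not values from the patent.
def comprehensive_contrast(fg, bg, alpha=0.6, beta=0.4):
    """K_i = (1/m) * sum_j (alpha * k_iaj + beta * k_ibj), alpha + beta = 1."""
    assert len(fg) == len(bg) and abs(alpha + beta - 1.0) < 1e-9
    m = len(fg)
    return sum(alpha * a + beta * b for a, b in zip(fg, bg)) / m

def subtitle_contrast(k_i, lam_std=1.5):
    """k'_i = lam_std * K_i, lam_std being the stored standard ratio."""
    return lam_std * k_i

K = comprehensive_contrast([0.5, 0.7], [0.3, 0.1], alpha=0.5, beta=0.5)
# ((0.5*0.5 + 0.5*0.3) + (0.5*0.7 + 0.5*0.1)) / 2 = 0.4
```

Per-frame contrasts would in practice come from the contrast detection of steps S41 and S42.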
Further, the speech recognition technique adopted in step S6 includes the following steps:
s61, filtering and framing preprocessing are carried out on the audio information in each segment of sub-video in the short video material, and redundant information is removed;
s62, extracting key information which influences voice recognition and characteristic information expressing voice meaning in the audio information in each section of sub-video;
S63, recognizing words from the feature information in the audio information of each segment of sub-video at the smallest recognition unit, and recognizing them sequentially according to the grammar corresponding to the audio information of each segment of sub-video;
and S64, connecting the recognized words in each segment of sub-video according to semantic analysis, and adjusting sentence composition according to the meaning of a sentence to obtain text information in each segment of sub-video in the short video material.
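The framing part of step S61 can be sketched as below; the frame length and hop size are generic placeholder values, and a real system would frame raw audio samples like this before the feature extraction of S62.

```python
# Illustrative sketch of S61's framing step: splitting an audio sample
# sequence into fixed-length, overlapping frames. Frame length and hop size
# are placeholder values, not taken from the patent.
def frame_audio(samples, frame_len=4, hop=2):
    """Return overlapping frames; a trailing chunk shorter than a frame is dropped."""
    return [samples[i:i + frame_len]
            for i in range(0, len(samples) - frame_len + 1, hop)]

frames = frame_audio([1, 2, 3, 4, 5, 6, 7, 8], frame_len=4, hop=2)
# frames == [[1, 2, 3, 4], [3, 4, 5, 6], [5, 6, 7, 8]]
```

Overlapping frames keep speech transitions visible to the downstream feature extraction.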
Further, the step S8 includes the following steps:
S81, obtaining the audio appearance time in each segment of sub-video in the short video material to form an audio appearance time set t = (t_1, t_2, ..., t_i, ..., t_n), where t_i represents the audio appearance time in the ith sub-video in the short video material;
S82, obtaining the audio disappearance time in each segment of sub-video in the short video material to form an audio disappearance time set t′ = (t′_1, t′_2, ..., t′_i, ..., t′_n), where t′_i represents the audio disappearance time in the ith sub-video in the short video material;
S83, extracting the number of words of the text information in each segment of sub-video in the short video material, and calculating the subtitle appearance speed in each segment of sub-video in the short video material as

v_i = x_i / (t′_i - t_i)

where v_i represents the subtitle appearance speed in the ith sub-video within the short video material and x_i represents the number of words of the text information in the ith sub-video within the short video material.
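A sketch of the S83 calculation, reading the subtitle appearance speed as the word count divided by the audio duration of the sub-video; the sample times and word count below are invented for illustration.

```python
# Sketch of S83: v_i = x_i / (t'_i - t_i), words per second. Sample values
# are invented, not from the patent.
def subtitle_speed(word_count, t_appear, t_disappear):
    """Subtitle appearance speed for one sub-video."""
    duration = t_disappear - t_appear
    if duration <= 0:
        raise ValueError("audio disappearance time must follow appearance time")
    return word_count / duration

v = subtitle_speed(12, t_appear=3.0, t_disappear=9.0)
# 12 words over 6 seconds -> 2.0 words per second
```

The resulting speed is what step S9 uses to pace each subtitle within its appearance time period.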
An apparatus, comprising: a processor, and a memory and a network interface connected to the processor; the network interface is connected to a nonvolatile memory in a server; at runtime, the processor calls the computer program from the nonvolatile memory through the network interface and executes it via the memory, thereby performing the above short video production generation processing method.
A computer storage medium storing a computer program which, when run in the memory of a server, implements the above short video production generation processing method.
Has the advantages that:
(1) The invention provides a short video production generation processing method, equipment, and a computer storage medium. Dividing the short video material into sub-videos, and acquiring and processing the foreground and background images of each video frame image in each sub-video, reduces the time and workload required for image analysis. Detecting the contrasts of the processed foreground and background images, analyzing the comprehensive image contrast of each sub-video, calculating the subtitle contrast in each sub-video, and applying the corresponding subtitle contrast setting to the text information in each sub-video realizes intelligent setting of short video subtitle contrast. This keeps the contrast difference between subtitles and images obvious, improves the overall viewing quality of short videos, and thereby meets people's diversified viewing needs.
(2) The invention improves the accuracy and reliability of text information recognition in short videos by recognizing and auditing the text information in each sub-video of the short video material. By acquiring the audio appearance time and audio disappearance time in each sub-video, calculating the subtitle appearance speed in each sub-video, and adjusting the text information in each sub-video to the corresponding subtitle appearance speed within the corresponding appearance time period, it ensures that the played audio matches the subtitles and avoids harming the normal viewing effect. This raises the degree of intelligence of short video production generation processing and increases people's viewing experience of, and interest in, short videos.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings used in the description of the embodiments will be briefly introduced below, and it is obvious that the drawings in the following description are only some embodiments of the present invention, and it is obvious for those skilled in the art that other drawings can be obtained according to the drawings without creative efforts.
FIG. 1 is a schematic diagram of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Referring to fig. 1, a method for generating and processing a short video includes the following steps:
s1, short video material division: the imported short video materials are divided into segments of sub-videos according to set division rules and are numbered in sequence.
In this embodiment, in step S1, the short video material is divided into sub-videos at the pauses in the video's speech, and the sub-videos are numbered sequentially according to the video playing order; the numbers of the sub-videos in the short video material are 1, 2, ..., i, ..., n, where n is the total number of sub-videos.
S2, video frame image segmentation: the method comprises the steps of obtaining each video frame image in each segment of sub-video in the short video material, and respectively segmenting each video frame image in each segment of sub-video in the short video material to obtain a foreground image and a background image of each video frame image in each segment of sub-video in the short video material.
In this embodiment, step S2 includes counting the foreground images of each video frame image in each segment of sub-video in the short video material to form a foreground image set P_iA = (p_ia1, p_ia2, ..., p_iaj, ..., p_iam), where p_iaj represents the foreground image of the jth video frame image in the ith segment of sub-video in the short video material; and likewise counting the background images of each video frame image in each segment of sub-video to form a background image set P_iB = (p_ib1, p_ib2, ..., p_ibj, ..., p_ibm), where p_ibj represents the background image of the jth video frame image in the ith segment of sub-video in the short video material.
S3, video frame image processing: the foreground image and the background image of each video frame image in each segment of sub-video in the short video material are respectively processed by adopting an image processing technology, so that the foreground processing image and the background processing image of each video frame image in each segment of sub-video in the short video material are obtained.
In this embodiment, the image processing technique adopted in step S3 includes performing geometric normalization processing on the foreground image and the background image of each video frame image in each segment of sub-video in the short video material, respectively, transforming into the foreground image and the background image of each video frame image in each segment of sub-video in a fixed standard form, performing optimization enhancement processing on the foreground image of each video frame image in each segment of sub-video after transformation, and performing blurring processing on the background image of each video frame image in each segment of sub-video after transformation.
Specifically, the short video material is divided into the sub-videos, the foreground image and the background image of each video frame image in each sub-video in the short video material are obtained, and image processing is performed, so that the time and the task amount required by image analysis are reduced.
S4, detecting image contrast: and analyzing the comprehensive image contrast of each segment of sub-video in the short video material by respectively detecting the foreground processing image contrast and the background processing image contrast of each video frame image in each segment of sub-video in the short video material.
In this embodiment, the step S4 includes the following steps:
S41, detecting the foreground processed image contrast of each video frame image in each segment of sub-video in the short video material to form a foreground processed image contrast set k_iA = (k_ia1, k_ia2, ..., k_iaj, ..., k_iam), where k_iaj represents the contrast of the processed foreground image of the jth video frame image in the ith segment of sub-video in the short video material;
S42, detecting the background processed image contrast of each video frame image in each segment of sub-video in the short video material to form a background processed image contrast set k_iB = (k_ib1, k_ib2, ..., k_ibj, ..., k_ibm), where k_ibj represents the contrast of the processed background image of the jth video frame image in the ith segment of sub-video in the short video material;
S43, calculating the comprehensive image contrast of each segment of sub-video in the short video material as

K_i = (1/m) * Σ_{j=1}^{m} (α·k_iaj + β·k_ibj)

where K_i represents the comprehensive image contrast of the ith segment of sub-video in the short video material, m represents the number of video frame images in the ith segment of sub-video, and α and β represent the weighting coefficients of the foreground image and the background image in a video frame image respectively, with α + β = 1.
S5, subtitle contrast analysis: and calculating the contrast ratio of the subtitles in each segment of sub-video in the short video material by extracting the standard contrast ratio of the video image and the subtitles in the short video stored in the storage database.
In this embodiment, the calculation formula of the subtitle contrast in each segment of sub-video in the short video material is

k′_i = λ_std · K_i

where k′_i represents the subtitle contrast in the ith sub-video within the short video material and λ_std represents the standard contrast ratio between the video image and the subtitles in a short video.
Specifically, by detecting the contrasts of the processed foreground and background images of each video frame image in each sub-video, analyzing the comprehensive image contrast of each sub-video, and calculating the subtitle contrast in each sub-video, the invention keeps the contrast difference between subtitles and images sufficiently obvious, improves the overall viewing quality of the short video, meets people's diversified viewing needs, and provides a reliable reference for setting the subtitle contrast in each sub-video later.
S6, identifying video and audio information: and respectively identifying the audio information in each segment of sub-video in the short video material through a voice identification technology, and converting to obtain the text information in each segment of sub-video in the short video material.
In this embodiment, the speech recognition technique adopted in step S6 includes the following steps:
s61, filtering and framing preprocessing are carried out on the audio information in each segment of sub-video in the short video material, and redundant information is removed;
s62, extracting key information which influences voice recognition and characteristic information expressing voice meaning in the audio information in each section of sub-video;
S63, recognizing words from the feature information in the audio information of each segment of sub-video at the smallest recognition unit, and recognizing them sequentially according to the grammar corresponding to the audio information of each segment of sub-video;
and S64, connecting the recognized words in each segment of sub-video according to semantic analysis, and adjusting sentence composition according to the meaning of a sentence to obtain text information in each segment of sub-video in the short video material.
S7, text information auditing: by respectively performing semantic association audit and structure association audit on the text information in each segment of sub-video in the short video material, if the semantic association or structure association of the text information in a certain segment of sub-video in the short video material is not met, the text information in the segment of sub-video is appropriately corrected.
In this embodiment, if both the semantic association and the structure association of the text information in a certain sub-video of the short video material are satisfied, the text information in that sub-video passes the audit; if either the semantic association or the structure association is not satisfied, the audit fails and the text information in that sub-video is appropriately corrected manually.
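The audit decision of step S7 reduces to a simple rule, sketched below with placeholder boolean inputs standing in for the actual semantic and structural checks, which the patent does not specify.

```python
# Sketch of the S7 decision rule: pass only when both checks succeed;
# otherwise the sub-video's text is flagged for manual correction.
# The boolean inputs are placeholders for real semantic/structural analyses.
def audit_text(semantic_ok, structure_ok):
    """Return the audit outcome for one sub-video's text information."""
    return "pass" if semantic_ok and structure_ok else "flag for manual correction"

outcome = audit_text(semantic_ok=True, structure_ok=False)
# -> "flag for manual correction"
```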
Specifically, the accuracy and reliability of text information identification in the short video are improved by identifying and auditing the text information in each segment of sub-video in the short video material.
S8, acquiring audio appearance and disappearance time: the method comprises the steps of obtaining the audio appearing time and audio disappearing time in each section of sub-video in the short video material, counting the time period of the text information appearing in each section of sub-video in the short video material, and calculating the subtitle appearing speed in each section of sub-video in the short video material.
In this embodiment, the step S8 includes the following steps:
S81, obtaining the audio appearance time in each segment of sub-video in the short video material to form an audio appearance time set t = (t_1, t_2, ..., t_i, ..., t_n), where t_i represents the audio appearance time in the ith sub-video in the short video material;
S82, obtaining the audio disappearance time in each segment of sub-video in the short video material to form an audio disappearance time set t′ = (t′_1, t′_2, ..., t′_i, ..., t′_n), where t′_i represents the audio disappearance time in the ith sub-video in the short video material;
S83, extracting the number of words of the text information in each segment of sub-video in the short video material, and calculating the subtitle appearance speed in each segment of sub-video in the short video material as

v_i = x_i / (t′_i - t_i)

where v_i represents the subtitle appearance speed in the ith sub-video within the short video material and x_i represents the number of words of the text information in the ith sub-video within the short video material.
Specifically, by acquiring the audio appearance time and audio disappearance time in each segment of sub-video in the short video material, the method calculates the subtitle appearance speed in each segment of sub-video, thereby ensuring that the short video's playing audio matches its subtitles, avoiding disruption of normal viewing, and providing a reliable reference for later adjustment of the subtitles in each sub-video.
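The speed computation above can be sketched as follows (a hedged illustration: the word counts and timestamps are invented, and the relation vi = xi / (t'i - ti), in words per second, is inferred from the quantities defined in s81-s83):

```python
def subtitle_speed(word_count: int, t_appear: float, t_disappear: float) -> float:
    """Subtitle appearance speed v_i = x_i / (t'_i - t_i), in words per second."""
    duration = t_disappear - t_appear
    if duration <= 0:
        raise ValueError("audio disappearance time must be later than appearance time")
    return word_count / duration


# Word counts x_i and audio appearance/disappearance times (t_i, t'_i)
# for three hypothetical sub-videos, in seconds.
x = [24, 40, 15]
t = [0.0, 8.0, 21.0]
t_prime = [6.0, 18.0, 26.0]
speeds = [subtitle_speed(xi, ti, tpi) for xi, ti, tpi in zip(x, t, t_prime)]
```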
S9, video subtitle setting processing: the text information in each segment of sub-video in the short video material is set to its corresponding subtitle contrast, and the appearance speed of the text information in each segment of sub-video is adjusted accordingly within its corresponding appearance time period.
Specifically, by setting the text information in each segment of sub-video in the short video material to its corresponding subtitle contrast, the method realizes intelligent setting of the short video's subtitle contrast; by adjusting the appearance speed of the corresponding subtitles within the corresponding appearance time period, it raises the degree of intelligence of the short video production and generation processing and enhances people's viewing experience of, and interest in, the short video.
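One way to realize the speed adjustment in S9 is to pace the recognized words evenly across the audio window of their sub-video, so that the subtitles stay matched to the playing audio; a minimal sketch with invented timings:

```python
def schedule_subtitle(text: str, t_appear: float, t_disappear: float):
    """Spread the words of one sub-video's text evenly over its audio
    window [t_appear, t_disappear), matching the subtitle appearance
    speed v = word_count / (t_disappear - t_appear)."""
    words = text.split()
    step = (t_disappear - t_appear) / len(words)
    return [(round(t_appear + k * step, 3), word) for k, word in enumerate(words)]


# Schedule a six-word subtitle across a 3-second audio window.
timeline = schedule_subtitle("short video subtitles matched to audio", 2.0, 5.0)
```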
An apparatus, comprising: a processor, and a memory and a network interface both connected to the processor; the network interface is connected to a nonvolatile memory in a server; when running, the processor retrieves the computer program from the nonvolatile memory through the network interface and runs it through the memory, so as to execute the short video production and generation processing method described above.
A computer storage medium is burned with a computer program; when the program runs in the memory of a server, it implements the short video production and generation processing method described above.
The foregoing is merely exemplary and illustrative of the principles of the present invention; those skilled in the art may make various modifications, additions and substitutions to the specific embodiments described herein without departing from the principles of the invention or exceeding the scope of the appended claims.

Claims (10)

1. A short video production generation processing method, characterized by comprising the following steps:
s1, short video material division: dividing the imported short video material into sub-videos of each segment according to a set division rule, and numbering the sub-videos in sequence;
s2, video frame image segmentation: respectively segmenting each video frame image in each segment of sub-video in the short video material by acquiring each video frame image in each segment of sub-video in the short video material to obtain a foreground image and a background image of each video frame image in each segment of sub-video in the short video material;
s3, video frame image processing: respectively processing foreground images and background images of all video frame images in all sections of sub-videos in a short video material by adopting an image processing technology to obtain foreground processed images and background processed images of all video frame images in all sections of sub-videos in the short video material;
s4, detecting image contrast: analyzing the comprehensive image contrast of each segment of sub-video in the short video material by respectively detecting the foreground processing image contrast and the background processing image contrast of each video frame image in each segment of sub-video in the short video material;
s5, subtitle contrast analysis: calculating the contrast ratio of the subtitles in each segment of sub-video in the short video material by extracting the standard contrast ratio of the video image and the subtitles in the short video stored in the storage database;
s6, identifying video and audio information: respectively identifying audio information in each segment of sub-video in the short video material through a voice identification technology, and converting to obtain text information in each segment of sub-video in the short video material;
s7, text information auditing: performing semantic association audit and structure association audit on text information in each segment of sub-video in the short video material respectively, and if the semantic association or the structure association of the text information in a certain segment of sub-video in the short video material is not accordant, performing appropriate correction on the text information in the segment of sub-video manually;
s8, acquiring audio appearance and disappearance time: calculating the time period of text information in each segment of sub-video in the short video material by acquiring the audio appearing time and audio disappearing time in each segment of sub-video in the short video material, and calculating the subtitle appearing speed in each segment of sub-video in the short video material;
s9, video subtitle setting processing: the text information in each segment of sub-video in the short video material is set according to the corresponding subtitle contrast, and the text information in each segment of sub-video in the short video material is subjected to corresponding subtitle occurrence speed adjustment processing within the corresponding occurrence time period.
2. The short video production generation processing method according to claim 1, wherein: in the step S1, the short video material is divided into sub-videos according to the video voice pause sequence, and the sub-videos are numbered sequentially according to the video playing order, the numbers of the sub-videos in the short video material being 1, 2, ..., i, ..., n.
3. The short video production generation processing method according to claim 1, wherein: the step S2 includes counting the foreground images of each video frame image in each segment of sub-video in the short video material, forming the foreground image set PiA = (pia1, pia2, ..., piaj, ..., piam), where piaj represents the foreground image of the j-th video frame image in the i-th segment of sub-video in the short video material; and meanwhile counting the background images of each video frame image in each segment of sub-video in the short video material, forming the background image set PiB = (pib1, pib2, ..., pibj, ..., pibm), where pibj represents the background image of the j-th video frame image in the i-th sub-video in the short video material.
4. The short video production generation processing method according to claim 1, wherein: the image processing technique adopted in step S3 includes performing geometric normalization processing on the foreground image and the background image of each video frame image in each segment of sub-video in the short video material, respectively, to transform the foreground image and the background image into the foreground image and the background image of each video frame image in each segment of sub-video in a fixed standard form, performing optimization enhancement processing on the foreground image of each video frame image in each segment of sub-video after transformation, and performing blurring processing on the background image of each video frame image in each segment of sub-video after transformation.
5. The short video production generation processing method according to claim 1, wherein: the step S4 includes the following steps:
s41, detecting the foreground processing image contrast of each video frame image in each segment of sub-video in the short video material, forming the foreground processing image contrast set kiA = (kia1, kia2, ..., kiaj, ..., kiam), where kiaj represents the foreground processing image contrast of the j-th video frame image in the i-th segment of sub-video in the short video material;
s42, detecting the background processing image contrast of each video frame image in each segment of sub-video in the short video material, forming the background processing image contrast set kiB = (kib1, kib2, ..., kibj, ..., kibm), where kibj represents the background processing image contrast of the j-th video frame image in the i-th segment of sub-video in the short video material;
s43, calculating the comprehensive image contrast of each segment of sub-video in the short video material as

Ki = (1/m) * Σ(j=1..m) (α·kiaj + β·kibj)

where Ki represents the comprehensive image contrast of the i-th segment of sub-video in the short video material, m represents the number of video frame images in the i-th segment of sub-video, and α and β represent the weighting coefficients corresponding to the foreground image and the background image in the video frame images respectively, with α + β = 1.
6. The short video production generation processing method according to claim 1, wherein: the subtitle contrast in each segment of sub-video in the short video material is calculated as

k'i = λstandard · Ki

where k'i represents the subtitle contrast in the i-th sub-video in the short video material, Ki represents the comprehensive image contrast of the i-th sub-video, and λstandard represents the standard contrast ratio of video image to subtitles in a short video.
7. The short video production generation processing method according to claim 1, wherein: the speech recognition technique employed in step S6 includes the steps of:
s61, filtering and framing preprocessing are carried out on the audio information in each segment of sub-video in the short video material, and redundant information is removed;
s62, extracting key information which influences voice recognition and characteristic information expressing voice meaning in the audio information in each section of sub-video;
s63, recognizing words as the minimum unit according to the characteristic information in the audio information in each segment of sub-video, and recognizing the words sequentially according to the grammar corresponding to the audio information in each segment of sub-video;
and S64, connecting the recognized words in each segment of sub-video according to semantic analysis, and adjusting sentence composition according to the meaning of a sentence to obtain text information in each segment of sub-video in the short video material.
8. The short video production generation processing method according to claim 1, wherein: the step S8 includes the following steps:
s81, obtaining the audio appearance time in each segment of sub-video in the short video material, forming the audio appearance time set t = (t1, t2, ..., ti, ..., tn), where ti represents the audio appearance time in the i-th sub-video in the short video material;
s82, obtaining the audio disappearance time in each segment of sub-video in the short video material, forming the audio disappearance time set t' = (t'1, t'2, ..., t'i, ..., t'n), where t'i represents the audio disappearance time in the i-th sub-video in the short video material;
s83, extracting the number of words of the text information in each segment of sub-video in the short video material, and calculating the subtitle appearance speed in each segment of sub-video as

vi = xi / (t'i - ti)

where vi represents the subtitle appearance speed in the i-th sub-video in the short video material and xi represents the number of words of the text information in the i-th sub-video in the short video material.
9. An apparatus, characterized by comprising: a processor, and a memory and a network interface both connected to the processor; the network interface is connected to a nonvolatile memory in a server; when running, the processor retrieves a computer program from the nonvolatile memory through the network interface and runs the computer program through the memory, so as to execute a short video production generation processing method according to any one of claims 1 to 8.
10. A computer storage medium, characterized in that: the computer storage medium is burned with a computer program, and the computer program, when running in a memory of a server, implements a short video production generation processing method according to any one of claims 1 to 8.
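The contrast computations in claims 5 and 6 can be sketched as follows (the weighting values and sample contrasts are invented, and the multiplicative relation between the standard ratio λstandard and the comprehensive contrast is an assumption, since the original formula images are not reproduced in this text):

```python
def comprehensive_contrast(fg_contrasts, bg_contrasts, alpha=0.6, beta=0.4):
    """Comprehensive image contrast of one sub-video (claim 5):
    K_i = (1/m) * sum_j (alpha * k_iaj + beta * k_ibj), with alpha + beta = 1."""
    if abs(alpha + beta - 1.0) > 1e-9:
        raise ValueError("weighting coefficients must satisfy alpha + beta = 1")
    m = len(fg_contrasts)
    return sum(alpha * a + beta * b for a, b in zip(fg_contrasts, bg_contrasts)) / m


def subtitle_contrast(k_comprehensive, lambda_standard):
    """Subtitle contrast of one sub-video (claim 6); the multiplicative
    form is an assumption about the elided formula."""
    return lambda_standard * k_comprehensive


# Two video frames with hypothetical foreground/background contrasts.
K = comprehensive_contrast([0.8, 0.6], [0.5, 0.5], alpha=0.5, beta=0.5)
k_sub = subtitle_contrast(K, 1.2)
```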
CN202110766814.6A 2021-07-07 2021-07-07 Short video production generation processing method and equipment and computer storage medium Pending CN113490027A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110766814.6A CN113490027A (en) 2021-07-07 2021-07-07 Short video production generation processing method and equipment and computer storage medium


Publications (1)

Publication Number Publication Date
CN113490027A true CN113490027A (en) 2021-10-08

Family

ID=77941740

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110766814.6A Pending CN113490027A (en) 2021-07-07 2021-07-07 Short video production generation processing method and equipment and computer storage medium

Country Status (1)

Country Link
CN (1) CN113490027A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114466137A (en) * 2022-01-11 2022-05-10 隗建华 Remote real-time synchronous monitoring system for short video shooting

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1879403A (en) * 2003-11-10 2006-12-13 皇家飞利浦电子股份有限公司 Adaptation of close-captioned text based on surrounding video content
CN101345820A (en) * 2008-08-01 2009-01-14 中兴通讯股份有限公司 Image brightness reinforcing method
CN102999901A (en) * 2012-10-17 2013-03-27 中国科学院计算技术研究所 Method and system for processing split online video on the basis of depth sensor
US20130129206A1 (en) * 2011-05-31 2013-05-23 John W. Worthington Methods and Apparatus for Improved Display of Foreground Elements
US20140050355A1 (en) * 2012-08-20 2014-02-20 Behavioral Recognition Systems, Inc. Method and system for detecting sea-surface oil
CN107124561A (en) * 2016-12-27 2017-09-01 福建新大陆自动识别技术有限公司 A kind of bar code image exposure adjustment system and method based on CMOS
CN111986656A (en) * 2020-08-31 2020-11-24 上海松鼠课堂人工智能科技有限公司 Teaching video automatic caption processing method and system




Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20211008