WO2022078363A1 - Method and apparatus for generating video preview content, computer apparatus, and storage medium - Google Patents
- Publication number
- WO2022078363A1 (application PCT/CN2021/123447)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- image
- video
- similarity
- preview content
- images
- Prior art date
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/80—Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
- H04N21/85—Assembly of content; Generation of multimedia applications
- H04N21/854—Content authoring
- H04N21/8549—Creating video summaries, e.g. movie trailer
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/40—Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
- H04N21/43—Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
- H04N21/44—Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs
- H04N21/44008—Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs involving operations for analysing video streams, e.g. detecting features or characteristics in the video stream
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/74—Image or video pattern matching; Proximity measures in feature spaces
- G06V10/761—Proximity, similarity or dissimilarity measures
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/40—Scenes; Scene-specific elements in video content
- G06V20/46—Extracting features or characteristics from the video content, e.g. video fingerprints, representative shots or key frames
- G06V20/47—Detecting features for summarising video content
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/40—Scenes; Scene-specific elements in video content
- G06V20/49—Segmenting video sequences, i.e. computational techniques such as parsing or cutting the sequence, low-level clustering or determining units such as shots or scenes
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/60—Type of objects
- G06V20/62—Text, e.g. of license plates, overlay texts or captions on TV images
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/40—Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
- H04N21/45—Management operations performed by the client for facilitating the reception of or the interaction with the content or administrating data related to the end-user or to the client device itself, e.g. learning user preferences for recommending movies, resolving scheduling conflicts
- H04N21/454—Content or additional data filtering, e.g. blocking advertisements
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/40—Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
- H04N21/45—Management operations performed by the client for facilitating the reception of or the interaction with the content or administrating data related to the end-user or to the client device itself, e.g. learning user preferences for recommending movies, resolving scheduling conflicts
- H04N21/454—Content or additional data filtering, e.g. blocking advertisements
- H04N21/4545—Input to filtering algorithms, e.g. filtering a region of the image
Definitions
- the present disclosure relates to the field of project construction, and in particular, to a method and device for generating video preview content, a computer device and a storage medium.
- Video content is used more and more widely. When faced with a large amount of video content, users often need a quick way to browse it. For example, on a video information list page, it is undesirable to embed the entire video in each list item; a video clip or an image is preferred as the preview content. In this case, a method is needed to generate, from the original video, either a shorter video or a GIF (Graphics Interchange Format) image.
- a method for generating video preview content comprising:
- parsing the video content to be processed, obtaining all video image frame information of the video content to be processed, and generating an ordered image frame list;
- processing the image frame list by means of image recognition, and filtering out the video's opening titles and closing credits;
- generating video preview content based on the filtered image frame list.
- the method for generating video preview content further includes:
- the similarity between adjacent images is calculated based on the image frame list, and the next image in the adjacent images whose similarity is greater than a predetermined similarity threshold is filtered out.
- processing the image frame list by means of image recognition and filtering out the video's opening titles and closing credits includes:
- wherein the text ratio is the ratio of the area occupied by all text to the image area.
- calculating the similarity between adjacent images based on the image frame list, and filtering out the next image in the adjacent images whose similarity is greater than a predetermined similarity threshold includes:
- the generating video preview content based on the filtered image frame list includes:
- determining, from the filtered image frame list, the images to be screened in the determined number includes:
- the image with the largest weight value in the segment is used as the image to be screened.
- the video preview content is at least one of a Graphics Interchange Format picture and a new video.
- an apparatus for generating video preview content comprising:
- the image frame list generation module is used to parse the video content to be processed, obtain all video image frame information of the video content to be processed, and generate an ordered image frame list;
- an image recognition module used for processing the image frame list by means of image recognition, and filtering out the video's opening titles and closing credits;
- the preview content generation module is used to generate video preview content based on the filtered image frame list.
- the video preview content generating apparatus further includes:
- the image similarity calculation module is configured to calculate the similarity between adjacent images based on the image frame list, and filter out the next image in the adjacent images whose similarity is greater than a predetermined similarity threshold.
- the image recognition module is configured to perform image recognition on each image in the ordered image frame list and determine the position of the region containing text in each image; for each image, determine whether the text ratio is greater than a predetermined threshold, wherein the text ratio is the ratio of the area occupied by all text to the image area; and filter out images whose text ratio is greater than the predetermined threshold.
- the image similarity calculation module is configured to calculate, for each image in the image frame list, the similarity between adjacent images, wherein the adjacent images are the current image and the next image; determine whether there are adjacent images with a similarity greater than a predetermined similarity threshold; and, if so, filter out the later image of each adjacent pair whose similarity is greater than the predetermined similarity threshold, until no adjacent images with a similarity greater than the predetermined similarity threshold remain in the image frame list.
- the preview content generation module is configured to determine the number of images to be screened according to the length of the video to be processed and the type of video preview content; determine that number of images to be screened from the filtered image frame list; generate an ordered list of image frames to be screened from those images; and generate video preview content according to that list.
- when determining the images to be screened from the filtered image frame list, the preview content generation module may be configured to calculate the screening weight of each image in the filtered image frame list from the similarity between each image and the next image; divide the filtered image frame list into a predetermined number of segments, wherein the predetermined number is the number of images to be screened and the sum of the screening weights of all images is equal across segments; and, for each segment, take the image with the largest weight value in the segment as an image to be screened.
- the video preview content is at least one of a Graphics Interchange Format picture and a new video.
- a computer apparatus comprising:
- the processor is configured to execute the instructions, so that the computer apparatus executes the operation of implementing the method for generating video preview content according to any one of the above embodiments.
- a non-transitory computer-readable storage medium wherein the computer-readable storage medium stores computer instructions, and when the instructions are executed by a processor, implement the method described in any of the foregoing embodiments.
- FIG. 1 is a schematic diagram of some embodiments of the disclosed video preview content generation method.
- FIG. 2 is a schematic diagram of other embodiments of the method for generating video preview content of the present disclosure.
- FIG. 3 is a schematic diagram of some embodiments of an apparatus for generating video preview content of the present disclosure.
- FIG. 4 is a schematic diagram of other embodiments of the apparatus for generating video preview content of the present disclosure.
- FIG. 5 is a schematic diagram of still other embodiments of the apparatus for generating video preview content of the present disclosure.
- the method of the related art includes: extracting multiple image frames of a video file, adding the extracted multiple image frames to a thumbnail image set, and generating a dynamic thumbnail according to the image frames in the thumbnail image set.
- the quality of the preview image generated by the related art is not good enough, and manual intervention may be required if one wishes to generate a preview image of higher quality.
- the related-art method cannot distinguish the text content at the beginning and end of a video, and cannot detect rapid changes in video content (for example, in surveillance video most of the content is unchanged, and only the parts that change need to be extracted).
- the present disclosure provides a method and device for generating video preview content, a computer device, and a storage medium, and the present disclosure will be described below through specific embodiments.
- FIG. 1 is a schematic diagram of some embodiments of the disclosed video preview content generation method.
- this embodiment can be executed by a video preview content generating apparatus or a computer apparatus of the present disclosure.
- the method may include steps 11-13, wherein:
- Step 11 Parse the video content to be processed, obtain all video image frame information of the video content to be processed, and generate an ordered image frame list.
- step 11 may include: reading the content of the video file to be processed, extracting all video frame image information in the video file, and obtaining an ordered image frame list information according to the time sequence of the video.
- Step 12 Process the image frame list by means of image recognition, and filter out the video's opening titles and closing credits.
- step 12 may include steps 121-123, wherein:
- Step 121 Perform image recognition on each image in the ordered image frame list, and determine the location of the region containing text in each image.
- Step 122 for each image, determine whether the text ratio is greater than a predetermined threshold, wherein the text ratio is the ratio of the area occupied by all the text to the image area.
- Step 123 Filter out images whose ratio of the area occupied by all the characters to the image area is greater than a predetermined threshold.
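- Steps 121-123 can be illustrated with a minimal Python sketch. It assumes text regions have already been detected and are available as bounding boxes; the `Frame` class, the `text_boxes` field and the function names are hypothetical, and the 10% default follows the example threshold given for the embodiment of FIG. 2. Overlapping boxes are simply summed here, which is an assumption.

```python
from dataclasses import dataclass
from typing import List, Tuple

# (x, y, width, height) of one detected text region.
# The text detector itself is out of scope; its output is assumed.
Box = Tuple[int, int, int, int]


@dataclass
class Frame:
    index: int              # position in the ordered image frame list
    width: int
    height: int
    text_boxes: List[Box]   # hypothetical detector output


def text_ratio(frame: Frame) -> float:
    """Area occupied by all text regions divided by the image area."""
    text_area = sum(w * h for (_x, _y, w, h) in frame.text_boxes)
    return text_area / (frame.width * frame.height)


def filter_text_heavy(frames: List[Frame], threshold: float = 0.10) -> List[Frame]:
    """Keep only frames whose text ratio does not exceed the threshold."""
    return [f for f in frames if text_ratio(f) <= threshold]


frames = [
    Frame(0, 100, 100, [(0, 0, 100, 40)]),  # title card: 40% text
    Frame(1, 100, 100, [(10, 90, 50, 5)]),  # subtitle only: 2.5% text
    Frame(2, 100, 100, []),                 # no text
]
kept = filter_text_heavy(frames)
print([f.index for f in kept])  # the title card (index 0) is dropped
```

- A ratio test of this kind removes any text-heavy frame, not only frames at the start or end of the video, so the threshold has to be chosen with subtitles and on-screen captions in mind.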
- Step 13 Generate video preview content based on the filtered image frame list.
- the video preview content may be at least one of a Graphics Interchange Format picture and a new video.
- the quality of the preview content generated based on the video can be improved, and the title and ending text content of the video can be ignored.
- the above-mentioned embodiments of the present disclosure filter core images through image recognition to generate GIF images or new videos.
- FIG. 2 is a schematic diagram of other embodiments of the method for generating video preview content of the present disclosure.
- this embodiment can be executed by a video preview content generating apparatus or a computer apparatus of the present disclosure.
- the method may include steps 20-29, wherein:
- Step 20 Parse the video content to be processed, obtain all video image frame information of the video content to be processed, and generate an ordered image frame list.
- step 20 may include: reading the content of the video file to be processed, extracting all video frame image information in the video file, and obtaining an ordered image frame list information according to the time sequence of the video.
- Step 21 using an image recognition method to process the image frame list.
- step 21 may include: performing the following cycle of steps 211 to 216 on the image frame list obtained in step 20:
- Step 211 Perform image recognition on the current image (picture) to obtain the location of the region containing the text in the current image.
- step 211 may include: extracting a plurality of image regions containing text from the current image.
- Step 212 Calculate the area occupied by all the characters, and then sum them up.
- Step 213 Calculate the ratio of the area of the area occupied by the text to the area of the current image.
- Step 214 check whether the ratio is greater than a predetermined threshold.
- the predetermined threshold may be configured, for example, the predetermined threshold may be configured as 10%.
- Step 215 if the ratio is greater than a predetermined threshold, mark the current image as not participating in subsequent image screening.
- Step 216 Check whether there is a next image after the current image. If yes, continue the loop: take the next image as the current image and execute step 211 again. If not, go to step 22.
- Step 22 Calculate the similarity between every two adjacent images based on the image frame list, and determine adjacent images whose similarity is greater than a predetermined similarity threshold.
- step 22 may include: performing the following loop on the image frame list obtained in step 21:
- Step 221 Obtain the current image and the next image to perform similarity calculation to obtain a similarity ratio.
- step 221 may include: using the current image and the next image as a reference image and a query image, respectively; dividing the reference image and the query image into small regions; extracting a feature quantity from each small region as the small-region feature quantities of the reference image and the query image; comparing the small-region feature quantities of the reference image with those of the query image, and calculating the similarity of each pair of small-region feature quantities as a small-region similarity; and calculating the image similarity between the query image and the reference image by weighting the small-region similarities with small-region weight values derived from local-region weight values.
- Step 222 check whether the similarity of the two images is greater than the similarity threshold.
- the similarity threshold may be configured, for example, the similarity threshold may be configured as 50%.
- Step 223 If the similarity of the two images is greater than the similarity threshold, go to step 224.
- Step 224 Mark the next image after the current image as not participating in the subsequent image screening.
- Step 225 Check whether there is a next image after the current image. If yes, continue the loop: take the next image as the current image and go to step 221; if not, go to step 23.
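- The region-based comparison described for step 221 can be sketched as follows. This is only an illustration under stated assumptions: images are small grayscale grids, the per-region feature is mean brightness, and the region weights are supplied by the caller. The patent text does not specify the feature or how the weights are derived, and the names `region_means` and `image_similarity` are hypothetical.

```python
from typing import List, Optional

Image = List[List[int]]  # grayscale pixels in 0..255, row-major


def region_means(img: Image, grid: int) -> List[float]:
    """Split the image into grid x grid small regions and return the mean
    brightness of each region (an assumed, deliberately simple feature).
    Requires grid <= min(height, width) so that no region is empty."""
    h, w = len(img), len(img[0])
    feats = []
    for gy in range(grid):
        for gx in range(grid):
            y0, y1 = gy * h // grid, (gy + 1) * h // grid
            x0, x1 = gx * w // grid, (gx + 1) * w // grid
            pixels = [img[y][x] for y in range(y0, y1) for x in range(x0, x1)]
            feats.append(sum(pixels) / len(pixels))
    return feats


def image_similarity(reference: Image, query: Image, grid: int = 2,
                     weights: Optional[List[float]] = None) -> float:
    """Weighted average of per-region similarities, in the range 0..1."""
    ref, qry = region_means(reference, grid), region_means(query, grid)
    if weights is None:
        weights = [1.0] * len(ref)  # assumption: all regions count equally
    sims = [1.0 - abs(r - q) / 255.0 for r, q in zip(ref, qry)]
    return sum(wt * s for wt, s in zip(weights, sims)) / sum(weights)


black = [[0] * 4 for _ in range(4)]
# Same as `black`, except the top-left 2x2 region is white.
patched = [[255 if (x < 2 and y < 2) else 0 for x in range(4)] for y in range(4)]

print(image_similarity(black, black))    # identical images
print(image_similarity(black, patched))  # one of four regions differs
```

- Giving a region a larger weight makes a change in that region lower the overall similarity more, which is one way to make the comparison more sensitive to the parts of the frame that matter.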
- Step 23 from the image frame list obtained in step 22, filter out images that have been marked as not participating in subsequent screening, and obtain a new ordered image list.
- Step 24 Repeat steps 22 and 23 for the ordered image list obtained in step 23 to filter images that do not participate in the screening, until no images are marked as not participating in the subsequent screening in step 23.
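- The repeat-until-stable loop of steps 22-24 can be sketched in Python as follows. The similarity function is passed in by the caller, and the name `drop_similar_neighbours` is hypothetical. Within one pass, every adjacent pair of the current list is compared, and the later image of each too-similar pair is marked, as in steps 221-225.

```python
from typing import Callable, List, Sequence, TypeVar

T = TypeVar("T")


def drop_similar_neighbours(frames: Sequence[T],
                            similarity: Callable[[T, T], float],
                            threshold: float = 0.5) -> List[T]:
    """Repeatedly remove the later frame of every adjacent pair whose
    similarity exceeds the threshold, until a pass marks nothing."""
    current = list(frames)
    while True:
        # Step 22: mark the later frame of each too-similar adjacent pair.
        marked = {i + 1 for i in range(len(current) - 1)
                  if similarity(current[i], current[i + 1]) > threshold}
        if not marked:
            return current  # Step 24 stop condition: nothing was marked
        # Step 23: build the new ordered list without the marked frames.
        current = [f for i, f in enumerate(current) if i not in marked]


# Toy example: a "frame" is one brightness value in 0..100.
def toy_similarity(a: int, b: int) -> float:
    return 1.0 - abs(a - b) / 100.0


print(drop_similar_neighbours([0, 60, 100, 30], toy_similarity))
```

- In the toy run the first pass removes 100 (too similar to 60), which makes 60 and 30 adjacent; the second pass then removes 30. This is why a single pass is not enough.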
- Step 25 Determine the number of images to be screened according to the length of the video to be processed and the content type of the video preview.
- the number of images to be screened may be a configuration list.
- step 25 may include: if pictures are to be generated in the end, configuring the number stepwise according to the length of the video, for example: 5 pictures for a video of up to 1 minute, 10 pictures for a 1-5 minute video, 15 pictures for a 5-20 minute video, a further configured count for a 20-30 minute video, and 30 pictures for a video over 30 minutes; if a new video is to be generated in the end, the number of images can be larger.
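- A stepwise configuration of this kind can be sketched as a lookup table. The 5, 10, 15 and 30 counts follow the example in the text; the count for 20-30 minute videos is not recoverable from the text, so the value 20 below is a placeholder assumption, as are the `video_multiplier` parameter and all names.

```python
from bisect import bisect_left
from typing import List, Tuple

# (upper bound of the tier in seconds, number of images for a GIF)
GIF_TIERS: List[Tuple[float, int]] = [
    (60, 5),               # up to 1 minute
    (5 * 60, 10),          # 1-5 minutes
    (20 * 60, 15),         # 5-20 minutes
    (30 * 60, 20),         # 20-30 minutes: ASSUMED value, not from the text
    (float("inf"), 30),    # over 30 minutes
]


def images_to_select(duration_s: float,
                     tiers: List[Tuple[float, int]] = GIF_TIERS,
                     video_multiplier: int = 1) -> int:
    """Look up the image count for a video of the given length.
    video_multiplier > 1 is a hypothetical way to ask for more images
    when the preview is a new video rather than a GIF."""
    bounds = [bound for bound, _count in tiers]
    tier = bisect_left(bounds, duration_s)  # first tier whose bound >= duration
    return tiers[tier][1] * video_multiplier


print(images_to_select(45))     # short clip
print(images_to_select(180))    # 3-minute video
print(images_to_select(7200))   # 2-hour video
```

- Keeping the tiers in a list rather than in code matches the statement that the number of images to be screened may be a configuration list.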
- Step 26 from the image frame list obtained in step 24, according to the similarity between each image and the next image, calculate the screening weight of each image in the filtered image frame list.
- step 26 may include steps 261-263, wherein:
- Step 261 Calculate the screening weight of each image in the ordered image list obtained in step 24 by using the similarity ratio between each image and the next image.
- step 261 may include: calculating the screening weight of each image according to formula (1).
- Step 262 adding and summing the screening weights of each image to obtain a total weight.
- Step 263 then calculate the weight position of each image, the algorithm is as follows: add all the weights in front of the position of each image to obtain the weight position of the image.
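- Steps 261-263 can be sketched as follows. Formula (1) is not reproduced in this text, so the weight used here, one minus the similarity to the next image, is an assumption chosen so that frames followed by a large change weigh more; giving the last image a weight of 1.0 is likewise assumed. The weight position is computed as described: the sum of all weights in front of the image.

```python
from itertools import accumulate
from typing import List, Tuple


def screening_weights(similarities_to_next: List[float]) -> List[float]:
    """similarities_to_next[i] is the similarity between image i and image i+1,
    so it is one element shorter than the image list.
    ASSUMPTION: weight = 1 - similarity; the last image gets weight 1.0."""
    return [1.0 - s for s in similarities_to_next] + [1.0]


def weight_positions(weights: List[float]) -> Tuple[List[float], float]:
    """Return (positions, total): positions[i] is the sum of all weights
    before image i (step 263), total is the sum of all weights (step 262)."""
    if not weights:
        return [], 0.0
    running = list(accumulate(weights))     # inclusive running sums
    return [0.0] + running[:-1], running[-1]


weights = screening_weights([0.5, 0.25, 0.0])
positions, total = weight_positions(weights)
print(weights)
print(positions, total)
```

- With these positions, every image occupies an interval on a line from 0 to the total weight, which is what the segmentation in step 27 operates on.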
- Step 27 Divide the image frame list obtained in step 24 into a predetermined number of segments, where the predetermined number is the number of images to be screened, and the sum of the screening weights of all images in each segment is equal.
- step 27 may include steps 271-272, wherein:
- Step 271 according to the required number of images to be screened obtained in Step 25, perform an average segmentation between 0 and the total weight.
- Step 272 and then divide the ordered image list into the same segments according to the weight range of each segment.
- Step 28 For each segment, the image with the largest weight value in the segment is used as the image to be screened.
- step 28 may include: performing a loop of steps 281 and 282 from the ordered image list obtained in step 27:
- Step 281 Find the image with the largest weight value in the current segment, select that image, and add it to a new ordered image list, keeping the images in order.
- Step 282 determine whether there is a next segment of the ordered image list. If yes, repeat step 281; if not, go to step 29.
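- Steps 27 and 28 can be sketched together: the range from 0 to the total weight is cut into equal parts, each image is assigned to the part that contains its weight position, and the heaviest image of each part is selected. The function name and the handling of ties (first image wins) and of empty segments (skipped) are assumptions, since the text does not cover them.

```python
from typing import Dict, List


def select_by_weight_segments(weights: List[float], count: int) -> List[int]:
    """Return the indices of the selected images, in their original order."""
    total = sum(weights)
    if count <= 0 or total <= 0:
        return []
    segment_width = total / count          # step 271: equal split of 0..total
    best: Dict[int, int] = {}              # segment number -> index of heaviest image
    position = 0.0                         # sum of the weights before image i
    for i, w in enumerate(weights):
        # Step 272: the segment is decided by the image's weight position.
        segment = min(int(position / segment_width), count - 1)
        # Steps 281-282: remember the heaviest image seen in this segment.
        if segment not in best or w > weights[best[segment]]:
            best[segment] = i
        position += w
    return [best[s] for s in sorted(best)]


weights = [0.5, 1.0, 0.5, 0.25, 0.75, 1.0]
print(select_by_weight_segments(weights, 2))
```

- Because segments have equal weight rather than equal frame counts, stretches of the video where consecutive frames differ strongly receive more segments, and therefore more selected images, than static stretches.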
- Step 29 generating video preview content from the image list obtained in step 28.
- the video preview content may be at least one of a Graphics Interchange Format picture and a new video.
- step 29 may include: step 291-step 293, wherein:
- step 291 the image list obtained in step 28 is used to determine which content format needs to be generated in the end:
- Step 292 If a GIF image is to be generated: call the GIF image generation module to generate a GIF from the image list.
- step 292 may include: acquiring the individual picture materials for generating the GIF picture; generating an animation from the individual picture materials; extracting each frame of the animation; and rendering each frame of the animation into the GIF image.
- Step 293 if a video is generated: the image frame list obtained in step 28 is formed into a new image frame list, the audio content is removed, and a new video is regenerated.
- the above embodiments of the present disclosure can filter core pictures through image recognition and image similarity to generate GIF pictures or new videos.
- the above embodiments of the present disclosure can improve the quality of the preview content generated based on the video, and can ignore the text content of the intro and credits of the video, as well as large segments of the same video content.
- the above embodiments of the present disclosure can distinguish the text content of the video at the beginning and the end of the video, and can perceive the rapid change of the video content.
- the above-mentioned embodiments of the present disclosure can realize the quick preview of the video file, so that the user can know the main information of the video file in a short time, thereby improving the user experience.
- FIG. 3 is a schematic diagram of some embodiments of an apparatus for generating video preview content of the present disclosure.
- the apparatus for generating video preview content of the present disclosure may include an image frame list generation module 31, an image recognition module 32 and a preview content generation module 33, wherein:
- the image frame list generation module 31 is configured to parse the video content to be processed, obtain all video image frame information of the to-be-processed video content, and generate an ordered image frame list.
- the image frame list generation module 31 may be configured to read the content of the video file to be processed, extract all video frame image information in the video file, and obtain an ordered image frame list according to the time sequence of the video information.
- the image recognition module 32 is configured to process the image frame list by using an image recognition method, and filter out the video's opening titles and closing credits.
- the image recognition module 32 may be configured to perform image recognition on each image in the ordered image frame list, and determine the location of the region containing text in each image; for each image, determine whether the text ratio is greater than a predetermined threshold, wherein the text ratio is the ratio of the area occupied by all the text to the image area; and filter out the images whose text ratio is greater than the predetermined threshold.
- the preview content generation module 33 is configured to generate video preview content based on the filtered image frame list.
- the video preview content may be at least one of a Graphics Interchange Format picture and a new video.
- the quality of the preview content generated based on the video can be improved, and the title and ending text content of the video can be ignored.
- the above-mentioned embodiments of the present disclosure filter core images through image recognition to generate GIF images or new videos.
- FIG. 4 is a schematic diagram of other embodiments of the apparatus for generating video preview content of the present disclosure. Compared with the embodiment of FIG. 3 , the apparatus for generating video preview content of the present disclosure in the embodiment of FIG. 4 may further include an image similarity calculation module 34, wherein:
- the image similarity calculation module 34 is used to calculate the similarity between adjacent images based on the image frame list, filter out the later image of each adjacent pair whose similarity is greater than a predetermined similarity threshold, and then instruct the preview content generation module 33 to perform the operation of generating video preview content based on the filtered image frame list.
- the image similarity calculation module 34 may be configured to calculate, for each image in the image frame list, the similarity between adjacent images, where the adjacent images are the current image and the next image; determine whether there are adjacent images with a similarity greater than a predetermined similarity threshold; if so, filter out the later image of each adjacent pair whose similarity is greater than the predetermined similarity threshold; and then instruct the preview content generation module 33 to perform the operation of generating video preview content based on the filtered image frame list.
- the preview content generation module 33 may be configured to determine the number of images to be screened according to the length of the video to be processed and the type of video preview content; determine that number of images to be screened from the filtered image frame list; generate an ordered list of image frames to be screened; and generate video preview content according to that list.
- when determining the images to be screened from the filtered image frame list, the preview content generation module 33 may be configured to calculate the screening weight of each image in the filtered image frame list from the similarity between each image and the next image; divide the filtered image frame list into a predetermined number of segments, wherein the predetermined number is the number of images to be screened and the sum of the screening weights of all images is equal across segments; and, for each segment, take the image with the largest weight value in the segment as an image to be screened.
- the apparatus for generating video preview content is configured to perform operations for implementing the method for generating video preview content as described in any of the above embodiments (eg, the embodiment in FIG. 1 or FIG. 2 ).
- the above embodiments of the present disclosure can distinguish the text content of the video at the beginning and the end of the video, and can perceive the rapid change of the video content.
- the above-mentioned embodiments of the present disclosure can realize the quick preview of the video file, so that the user can know the main information of the video file in a short time, thereby improving the user experience.
- FIG. 5 is a schematic diagram of still other embodiments of the apparatus for generating video preview content of the present disclosure.
- the apparatus for generating video preview content of the present disclosure may include a memory 51 and a processor 52, wherein:
- the memory 51 is used to store instructions.
- the processor 52 is configured to execute the instructions, so that the computer apparatus executes the operation of implementing the method for generating video preview content as described in any of the foregoing embodiments (eg, the embodiment in FIG. 1 or FIG. 2 ).
- the above embodiments of the present disclosure can filter core pictures through image recognition and image similarity to generate GIF pictures or new videos.
- the above embodiments of the present disclosure can improve the quality of the preview content generated based on the video, and can ignore the text content of the intro and credits of the video, as well as large segments of the same video content.
- a non-transitory computer-readable storage medium stores computer instructions which, when executed by a processor, implement the method for generating video preview content described in any of the foregoing embodiments (e.g., the embodiment of FIG. 1 or FIG. 2).
- a GIF picture or a new video can be generated by filtering core pictures through image recognition and image similarity.
- the above embodiments of the present disclosure can improve the quality of the preview content generated based on the video, and can ignore the text content of the intro and credits of the video, as well as large segments of the same video content.
- the above embodiments of the present disclosure can distinguish the text content of the video at the beginning and the end of the video, and can perceive the rapid change of the video content.
- the above embodiments of the present disclosure can realize a quick preview of the video file, so that the user can know the main information of the video file in a short time, thereby improving the user experience.
- the video preview content generating apparatus described above can be implemented as a general purpose processor, a programmable logic controller (PLC), a digital signal processor (DSP), an application specific integrated circuit (ASIC), a field programmable gate array (FPGA) or other programmable logic device, discrete gate or transistor logic devices, discrete hardware components, or any suitable combination thereof.
- PLC programmable logic controller
- DSP digital signal processor
- ASIC application specific integrated circuit
- FPGAs Field Programmable Gate Arrays
Landscapes
- Engineering & Computer Science (AREA)
- Multimedia (AREA)
- Theoretical Computer Science (AREA)
- Databases & Information Systems (AREA)
- Signal Processing (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Computing Systems (AREA)
- Computer Security & Cryptography (AREA)
- Health & Medical Sciences (AREA)
- Artificial Intelligence (AREA)
- Evolutionary Computation (AREA)
- General Health & Medical Sciences (AREA)
- Medical Informatics (AREA)
- Software Systems (AREA)
- Two-Way Televisions, Distribution Of Moving Picture Or The Like (AREA)
- Television Signal Processing For Recording (AREA)
Abstract
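The present disclosure relates to a method and apparatus for generating video preview content, a computer apparatus, and a storage medium. The method for generating video preview content includes: parsing the video content to be processed, obtaining all video image frame information of the video content to be processed, and generating an ordered image frame list; processing the image frame list by means of image recognition to filter out the opening and closing credits of the video; and generating video preview content based on the filtered image frame list. The present disclosure can improve the quality of preview content generated from a video and can ignore the text content of the video's opening and closing credits.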
Description
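Cross-Reference to Related Applications
This application is based on and claims priority to CN application No. 202011092575.2, filed on October 13, 2020, the disclosure of which is incorporated into this application in its entirety.
The present disclosure relates to the field of project construction, and in particular to a method and apparatus for generating video preview content, a computer apparatus, and a storage medium.
Video content is used more and more widely. When faced with a large amount of video content, a way of browsing videos quickly is often needed. For example, on a video information list page, it is undesirable for every video item in the list to embed the entire video; it is preferable to use a clip of the video or a single image as the preview. In this case, a method is needed to generate, from a video, either a video shorter than the original or a GIF (Graphics Interchange Format) image.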
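Summary
According to one aspect of the present disclosure, a method for generating video preview content is provided, including:
parsing the video content to be processed, obtaining all video image frame information of the video content to be processed, and generating an ordered image frame list;
processing the image frame list by means of image recognition to filter out the opening and closing credits of the video; and
generating video preview content based on the filtered image frame list.
In some embodiments of the present disclosure, the method for generating video preview content further includes:
calculating the similarity between adjacent images based on the image frame list, and filtering out the latter image of any pair of adjacent images whose similarity is greater than a predetermined similarity threshold.
In some embodiments of the present disclosure, processing the image frame list by means of image recognition to filter out the opening and closing credits of the video includes:
performing image recognition on each image in the ordered image frame list to determine the positions of the regions containing text in each image;
for each image, determining whether the text ratio is greater than a predetermined threshold, where the text ratio is the ratio of the area occupied by all text to the area of the image; and
filtering out the images for which the ratio of the area occupied by all text to the area of the image is greater than the predetermined threshold.
In some embodiments of the present disclosure, calculating the similarity between adjacent images based on the image frame list and filtering out the latter image of any pair of adjacent images whose similarity is greater than the predetermined similarity threshold includes:
calculating the similarity between adjacent images for each image in the image frame list, where the adjacent images are the current image and the next image;
determining whether there are adjacent images whose similarity is greater than the predetermined similarity threshold; and
if there are adjacent images whose similarity is greater than the predetermined similarity threshold, filtering out the latter image of those adjacent images, until no adjacent images in the image frame list have a similarity greater than the predetermined similarity threshold.
In some embodiments of the present disclosure, generating video preview content based on the filtered image frame list includes:
determining the number of images to be selected according to the length of the video to be processed and the type of video preview content;
determining, from the filtered image frame list, that number of images to be selected;
generating an ordered list of selected image frames from the images to be selected; and
generating the video preview content from the ordered list of selected image frames.
In some embodiments of the present disclosure, determining, from the filtered image frame list, that number of images to be selected includes:
calculating a selection weight for each image in the filtered image frame list using the similarity between each image and the next image;
dividing the filtered image frame list into a predetermined number of segments, where the predetermined number is the number of images to be selected and the sum of the selection weights of all images in each segment is equal; and
for each segment, taking the image with the largest weight value in the segment as an image to be selected.
In some embodiments of the present disclosure, the video preview content is at least one of a Graphics Interchange Format image and a new video.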
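According to another aspect of the present disclosure, an apparatus for generating video preview content is provided, including:
an image frame list generation module, configured to parse the video content to be processed, obtain all video image frame information of the video content to be processed, and generate an ordered image frame list;
an image recognition module, configured to process the image frame list by means of image recognition to filter out the opening and closing credits of the video; and
a preview content generation module, configured to generate video preview content based on the filtered image frame list.
In some embodiments of the present disclosure, the apparatus for generating video preview content further includes:
an image similarity calculation module, configured to calculate the similarity between adjacent images based on the image frame list and filter out the latter image of any pair of adjacent images whose similarity is greater than a predetermined similarity threshold.
In some embodiments of the present disclosure, the image recognition module is configured to perform image recognition on each image in the ordered image frame list to determine the positions of the regions containing text in each image; for each image, determine whether the text ratio is greater than a predetermined threshold, where the text ratio is the ratio of the area occupied by all text to the area of the image; and filter out the images for which the ratio of the area occupied by all text to the area of the image is greater than the predetermined threshold.
In some embodiments of the present disclosure, the image similarity calculation module is configured to calculate the similarity between adjacent images for each image in the image frame list, where the adjacent images are the current image and the next image; determine whether there are adjacent images whose similarity is greater than the predetermined similarity threshold; and, if there are, filter out the latter image of those adjacent images, until no adjacent images in the image frame list have a similarity greater than the predetermined similarity threshold.
In some embodiments of the present disclosure, the preview content generation module is configured to determine the number of images to be selected according to the length of the video to be processed and the type of video preview content; determine, from the filtered image frame list, that number of images to be selected; generate an ordered list of selected image frames from the images to be selected; and generate the video preview content from the ordered list of selected image frames.
In some embodiments of the present disclosure, when determining, from the filtered image frame list, that number of images to be selected, the preview content generation module may be configured to calculate a selection weight for each image in the filtered image frame list using the similarity between each image and the next image; divide the filtered image frame list into a predetermined number of segments, where the predetermined number is the number of images to be selected and the sum of the selection weights of all images in each segment is equal; and, for each segment, take the image with the largest weight value in the segment as an image to be selected.
In some embodiments of the present disclosure, the video preview content is at least one of a Graphics Interchange Format image and a new video.
According to another aspect of the present disclosure, a computer apparatus is provided, including:
a memory, configured to store instructions; and
a processor, configured to execute the instructions so that the computer apparatus performs operations implementing the method for generating video preview content of any of the above embodiments.
According to another aspect of the present disclosure, a non-transitory computer-readable storage medium is provided, where the computer-readable storage medium stores computer instructions which, when executed by a processor, implement the method for generating video preview content of any of the above embodiments.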
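To describe the technical solutions in the embodiments of the present disclosure or in the related art more clearly, the drawings needed for describing the embodiments or the related art are briefly introduced below. Obviously, the drawings described below show only some embodiments of the present disclosure, and a person of ordinary skill in the art can obtain other drawings from them without creative effort.
FIG. 1 is a schematic diagram of some embodiments of the method for generating video preview content of the present disclosure.
FIG. 2 is a schematic diagram of other embodiments of the method for generating video preview content of the present disclosure.
FIG. 3 is a schematic diagram of some embodiments of the apparatus for generating video preview content of the present disclosure.
FIG. 4 is a schematic diagram of other embodiments of the apparatus for generating video preview content of the present disclosure.
FIG. 5 is a schematic diagram of still other embodiments of the apparatus for generating video preview content of the present disclosure.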
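The technical solutions in the embodiments of the present disclosure are described clearly and completely below with reference to the drawings. Obviously, the described embodiments are only some, not all, of the embodiments of the present disclosure. The following description of at least one exemplary embodiment is merely illustrative and is in no way intended to limit the present disclosure or its application or use. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments in the present disclosure without creative effort fall within the scope of protection of the present disclosure.
Unless specifically stated otherwise, the relative arrangement of components and steps, the numerical expressions, and the numerical values set forth in these embodiments do not limit the scope of the present disclosure.
It should also be understood that, for ease of description, the dimensions of the parts shown in the drawings are not drawn to actual scale.
Techniques, methods, and devices known to a person of ordinary skill in the relevant art may not be discussed in detail, but where appropriate they should be regarded as part of the granted specification.
In all examples shown and discussed herein, any specific value should be interpreted as merely exemplary and not as a limitation. Other examples of the exemplary embodiments may therefore have different values.
Note that similar reference numerals and letters denote similar items in the following drawings, so once an item is defined in one drawing, it need not be discussed further in subsequent drawings.
Through research, the inventors found that the related art extracts some of the image frames in a video and compresses them into a GIF image. For example, a related-art method includes: extracting multiple image frames from a video file, adding the extracted image frames to a thumbnail image set, and generating a dynamic thumbnail from the image frames in the thumbnail image set.
However, the quality of the preview image generated by the related art is not good enough, and manual intervention may be required to generate a higher-quality preview. For example, related-art methods cannot distinguish the text content of a video's opening and closing credits, nor can they detect rapid changes in video content (for example, in surveillance video most of the content is nearly identical, and the parts that change need to be captured).
In view of at least one of the above technical problems, the present disclosure provides a method and apparatus for generating video preview content, a computer apparatus, and a storage medium, which are described below through specific embodiments.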
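FIG. 1 is a schematic diagram of some embodiments of the method for generating video preview content of the present disclosure. Preferably, this embodiment may be performed by the apparatus for generating video preview content or the computer apparatus of the present disclosure. The method may include steps 11 to 13:
Step 11: parse the video content to be processed, obtain all video image frame information of the video content to be processed, and generate an ordered image frame list.
In some embodiments of the present disclosure, step 11 may include: reading the content of the video file to be processed, extracting all video frame image information in the video file, and obtaining an ordered image frame list in the time order of the video.
Step 12: process the image frame list by means of image recognition to filter out the opening and closing credits of the video.
In some embodiments of the present disclosure, step 12 may include steps 121 to 123:
Step 121: perform image recognition on each image in the ordered image frame list to determine the positions of the regions containing text in each image.
Step 122: for each image, determine whether the text ratio is greater than a predetermined threshold, where the text ratio is the ratio of the area occupied by all text to the area of the image.
Step 123: filter out the images for which the ratio of the area occupied by all text to the area of the image is greater than the predetermined threshold.
Step 13: generate video preview content based on the filtered image frame list.
In some embodiments of the present disclosure, the video preview content may be at least one of a Graphics Interchange Format image and a new video.
The method for generating video preview content provided by the above embodiments of the present disclosure can improve the quality of preview content generated from a video and can ignore the text content of the video's opening and closing credits. The above embodiments select core images by means of image recognition to generate a GIF image or a new video.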
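FIG. 2 is a schematic diagram of other embodiments of the method for generating video preview content of the present disclosure. Preferably, this embodiment may be performed by the apparatus for generating video preview content or the computer apparatus of the present disclosure. The method may include steps 20 to 29:
Step 20: parse the video content to be processed, obtain all video image frame information of the video content to be processed, and generate an ordered image frame list.
In some embodiments of the present disclosure, step 20 may include: reading the content of the video file to be processed, extracting all video frame image information in the video file, and obtaining an ordered image frame list in the time order of the video.
Step 21: process the image frame list by means of image recognition.
In some embodiments of the present disclosure, step 21 may include looping over the image frame list obtained in step 20 with steps 211 to 216 below:
Step 211: perform image recognition on the current image (picture) to obtain the positions of the regions containing text in the current image.
In some embodiments of the present disclosure, step 211 may include extracting multiple image regions containing text from the current image.
Step 212: calculate the area occupied by each text region and sum the areas.
Step 213: calculate the ratio of the area occupied by text to the area of the current image.
Step 214: check whether the ratio is greater than a predetermined threshold.
In some embodiments of the present disclosure, the predetermined threshold is configurable; for example, it may be set to 10%.
Step 215: if the ratio is greater than the predetermined threshold, mark the current image as excluded from subsequent image selection.
Step 216: check whether there is a next image after the current image. If so, continue the loop: take the next image as the current image and return to step 211. If not, go to step 22.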
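The loop in steps 211 to 216 can be sketched as follows. This is a minimal illustration, not the patent's implementation: the `Frame` type and its `text_boxes` field are hypothetical, and the text detector that would fill `text_boxes` is assumed to exist elsewhere. Box areas are simply summed as in step 212, so overlapping boxes would be counted twice.

```python
from dataclasses import dataclass, field


@dataclass
class Frame:
    """One decoded frame plus the text boxes found in it (hypothetical type)."""
    index: int
    width: int
    height: int
    # Each box is (x, y, w, h) in pixels, as a text detector might report it.
    text_boxes: list = field(default_factory=list)
    excluded: bool = False


def text_ratio(frame):
    """Steps 212-213: sum the text-box areas and divide by the frame area."""
    text_area = sum(w * h for (_x, _y, w, h) in frame.text_boxes)
    return text_area / (frame.width * frame.height)


def mark_text_heavy(frames, threshold=0.10):
    """Steps 214-216: mark every frame whose text ratio exceeds the threshold."""
    for frame in frames:
        if text_ratio(frame) > threshold:
            frame.excluded = True
    return frames
```

The default of `0.10` mirrors the 10% example value given above; frames are only marked here, and the marked frames are dropped in a later step.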
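Step 22: calculate the similarity between every two adjacent images based on the image frame list, and identify adjacent images whose similarity is greater than a predetermined similarity threshold.
In some embodiments of the present disclosure, step 22 may include running the following loop over the image frame list obtained in step 21:
Step 221: take the current image and the next image, calculate their similarity, and obtain a similarity ratio.
In some embodiments of the present disclosure, step 221 may include: taking the current image and the next image as the reference image and the query image, respectively; dividing the reference image and the query image into small regions; extracting a feature quantity from each small region, as the small-region feature quantities of the query image and the reference image; comparing the small-region feature quantities of the reference image with those of the query image and calculating the similarity of the feature quantities of each small region, as the small-region similarity; and calculating the image similarity between the query image and the reference image by weighting the small-region similarities with per-region weight values derived from local-region weight values.
Step 222: check whether the similarity of the two images is greater than the similarity threshold.
In some embodiments of the present disclosure, the similarity threshold is configurable; for example, it may be set to 50%.
Step 223: if the similarity of the two images is greater than the similarity threshold, perform step 224.
Step 224: mark the image following the current image as excluded from subsequent image selection.
Step 225: check whether there is another image after the current image. If so, continue the loop: take the next image as the current image and return to step 221; if not, go to step 23.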
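The region-based comparison described for step 221 might look like the sketch below. The text does not say which feature quantity or which region weights are used, so two assumptions are made purely for illustration: the feature of a region is its mean grayscale intensity, and the weights default to uniform. Images are plain lists of rows of 0 to 255 values, and each side must have at least `blocks_per_side` pixels.

```python
def block_means(image, blocks_per_side):
    """Split a grayscale image into a square grid and return each block's mean."""
    height, width = len(image), len(image[0])
    means = []
    for by in range(blocks_per_side):
        y0 = by * height // blocks_per_side
        y1 = (by + 1) * height // blocks_per_side
        for bx in range(blocks_per_side):
            x0 = bx * width // blocks_per_side
            x1 = (bx + 1) * width // blocks_per_side
            pixels = [image[y][x] for y in range(y0, y1) for x in range(x0, x1)]
            means.append(sum(pixels) / len(pixels))
    return means


def image_similarity(reference, query, blocks_per_side=2, weights=None):
    """Weighted average of per-block similarities, in the range 0.0 to 1.0."""
    ref = block_means(reference, blocks_per_side)
    qry = block_means(query, blocks_per_side)
    if weights is None:
        weights = [1.0] * len(ref)  # assumption: uniform region weights
    # Per-block similarity: 1.0 for identical means, 0.0 for opposite extremes.
    block_sims = [1.0 - abs(r - q) / 255.0 for r, q in zip(ref, qry)]
    return sum(w * s for w, s in zip(weights, block_sims)) / sum(weights)
```

Blocks are listed row by row from the top left, so a `weights` list can emphasise, for example, the centre of the frame over its borders.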
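Step 23: from the image frame list obtained in step 22, filter out the images already marked as excluded from subsequent selection, obtaining a new ordered image list.
Step 24: repeat steps 22 and 23 on the ordered image list obtained in step 23 to filter out excluded images, until no image is marked as excluded from subsequent selection.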
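Steps 22 to 24 amount to repeating a mark-then-filter pass until nothing changes. The sketch below assumes a caller-supplied `similarity(a, b)` function returning a value between 0 and 1; the function name and the list-of-frames representation are illustrative choices, not taken from the text.

```python
def drop_similar_neighbors(frames, similarity, threshold=0.5):
    """Repeat the mark-then-filter pass until no adjacent pair exceeds the threshold."""
    while True:
        excluded = set()
        # Steps 221-225: compare each frame with the next and mark the later one.
        for i in range(len(frames) - 1):
            if similarity(frames[i], frames[i + 1]) > threshold:
                excluded.add(i + 1)
        if not excluded:
            return frames
        # Step 23: rebuild the ordered list without the marked frames.
        frames = [f for i, f in enumerate(frames) if i not in excluded]
```

Each pass that marks anything removes at least one frame and never removes the first one, so the loop always terminates. The default of `0.5` mirrors the 50% example threshold.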
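Step 25: determine the number of images to be selected according to the length of the video to be processed and the type of video preview content.
In some embodiments of the present disclosure, the number of images to be selected may be a configuration list.
In some embodiments of the present disclosure, step 25 may include: if an image is ultimately to be generated, configuring the number in tiers according to the video length, for example: 5 images for a 1-minute video, 10 images for a 1 to 5 minute video, 15 images for a 5 to 20 minute video, 20 to 30 minutes, 30 images for a video of more than 30 minutes; if a new video is ultimately to be generated, the number of images can be somewhat larger.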
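A tiered configuration of this kind could be expressed as a small lookup table. In the sketch below the tier boundaries are treated as inclusive upper bounds, which is an assumption, and the count for the 20 to 30 minute tier is deliberately left as `None` because the text does not state it; anyone using the table has to supply that value themselves.

```python
# (upper bound in minutes, inclusive; frame count). None marks a tier whose
# count is not stated in the text and must be configured by the caller.
GIF_TIERS = [
    (1, 5),
    (5, 10),
    (20, 15),
    (30, None),
    (float("inf"), 30),
]


def frames_to_select(duration_minutes, tiers=GIF_TIERS):
    """Return the configured frame count for the first tier covering the duration."""
    for upper_bound, count in tiers:
        if duration_minutes <= upper_bound:
            if count is None:
                raise ValueError("no frame count configured for this duration tier")
            return count
    raise ValueError("duration not covered by any tier")
```

A separate table with larger counts could be passed as `tiers` when the preview is a new video rather than a GIF.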
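Step 26: for the image frame list obtained in step 24, calculate the selection weight of each image in the filtered image frame list using the similarity between each image and the next image.
In some embodiments of the present disclosure, step 26 may include steps 261 to 263:
Step 261: using the similarity ratio between each image and the next image, calculate the selection weight of each image in the ordered image list obtained in step 24.
In some embodiments of the present disclosure, step 261 may include calculating the selection weight of each image according to formula (1).
selection weight = 1 / (similarity ratio × 100)    (1)
Step 262: sum the selection weights of all images to obtain the total weight.
Step 263: then calculate the weight position of each image as follows: add up all the weights that precede the image's position to obtain the weight position of that image.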
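Formula (1) and the running sum of steps 262 and 263 can be sketched as below. Two things are assumed because the text leaves them open: the caller supplies one strictly positive similarity ratio per image (including some value for the last image, which has no next image), and ratios are expressed as fractions such as 0.5 rather than percentages.

```python
def selection_weights(similarities):
    """Formula (1): weight = 1 / (similarity ratio * 100), one per image."""
    return [1.0 / (s * 100.0) for s in similarities]


def weight_positions(weights):
    """Step 263: an image's position is the sum of the weights of all images before it."""
    positions, running = [], 0.0
    for w in weights:
        positions.append(running)
        running += w
    return positions, running  # running is now the total weight (step 262)
```

For example, a similarity ratio of 0.5 gives a weight of 0.02 and a ratio of 0.1 gives 0.1, so images that differ more from their successor carry more weight.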
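Step 27: divide the image frame list obtained in step 24 into a predetermined number of segments, where the predetermined number is the number of images to be selected and the sum of the selection weights of all images in each segment is equal.
In some embodiments of the present disclosure, step 27 may include steps 271 and 272:
Step 271: divide the range from 0 to the total weight into equal segments, one for each of the images to be selected as determined in step 25.
Step 272: then, according to the weight range of each segment, divide the ordered image list into the same segments.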
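Steps 271 and 272 can be sketched by mapping each image's weight position onto equal-width ranges between 0 and the total weight. The function name and the use of list indices to identify images are illustrative assumptions.

```python
def split_by_weight(positions, total_weight, segment_count):
    """Assign each image index to one of segment_count equal-width weight ranges."""
    width = total_weight / segment_count
    segments = [[] for _ in range(segment_count)]
    for index, position in enumerate(positions):
        # Clamp so rounding at the upper edge cannot produce an out-of-range segment.
        seg = min(int(position / width), segment_count - 1)
        segments[seg].append(index)
    return segments
```

Because images are discrete, the weight sums per segment are only approximately equal, and a segment can end up empty when a single image's weight spans more than one range.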
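Step 28: for each segment, take the image with the largest weight value in the segment as an image to be selected.
In some embodiments of the present disclosure, step 28 may include looping over the segments of the ordered image list obtained in step 27 with steps 281 and 282:
Step 281: find the image with the largest weight value in the current segment, select it, and append it to a new ordered image list; images are added in order, so earlier images come first.
Step 282: check whether there is a next segment of the ordered image list. If so, repeat step 281; if not, go to step 29.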
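The loop of steps 281 and 282 reduces to one `max` per segment. In this sketch each segment is a list of image indices and `weights` holds the selection weight of every image; skipping empty segments and breaking ties in favour of the earlier image are assumptions, since the text does not cover either case.

```python
def pick_representatives(segments, weights):
    """For each segment of image indices, keep the index with the largest weight."""
    selected = []
    for segment in segments:
        if not segment:
            continue  # assumption: an empty segment contributes no image
        # max() returns the first maximal item, so ties go to the earlier image.
        selected.append(max(segment, key=lambda i: weights[i]))
    return selected
```

Since the segments are visited in order and each holds ascending indices, the result is already in the time order of the video.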
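Step 29: generate the video preview content from the image list obtained in step 28.
In some embodiments of the present disclosure, the video preview content may be at least one of a Graphics Interchange Format image and a new video.
In some embodiments of the present disclosure, step 29 may include steps 291 to 293:
Step 291: for the image list obtained in step 28, determine which form of content is ultimately to be generated.
Step 292: if a GIF image is to be generated, call the GIF image generation module to generate a GIF from the image list.
In some embodiments of the present disclosure, step 292 may include: obtaining the single-picture materials used to generate the GIF image; generating an animation from the single-picture materials; extracting each frame of the animation; and rendering the GIF image from the frames of the animation.
Step 293: if a video is to be generated, form a new image frame list from the image frame list obtained in step 28, remove the audio content, and generate a new video.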
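The above embodiments of the present disclosure can select core images by means of image recognition and image similarity to generate a GIF image or a new video. The above embodiments can improve the quality of preview content generated from a video and can ignore the text content of the opening and closing credits as well as long stretches of near-identical video content.
The above embodiments of the present disclosure can distinguish the text content of a video's opening and closing credits and can detect rapid changes in video content.
The above embodiments of the present disclosure enable a quick preview of a video file, so that the user can learn the main information of the video file in a short time, which improves the user experience.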
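FIG. 3 is a schematic diagram of some embodiments of the apparatus for generating video preview content of the present disclosure. As shown in FIG. 3, the apparatus may include an image frame list generation module 31, an image recognition module 32, and a preview content generation module 33:
The image frame list generation module 31 is configured to parse the video content to be processed, obtain all video image frame information of the video content to be processed, and generate an ordered image frame list.
In some embodiments of the present disclosure, the image frame list generation module 31 may be configured to read the content of the video file to be processed, extract all video frame image information in the video file, and obtain an ordered image frame list in the time order of the video.
The image recognition module 32 is configured to process the image frame list by means of image recognition to filter out the opening and closing credits of the video.
In some embodiments of the present disclosure, the image recognition module 32 may be configured to perform image recognition on each image in the ordered image frame list to determine the positions of the regions containing text in each image; for each image, determine whether the text ratio is greater than a predetermined threshold, where the text ratio is the ratio of the area occupied by all text to the area of the image; and filter out the images for which the ratio of the area occupied by all text to the area of the image is greater than the predetermined threshold.
The preview content generation module 33 is configured to generate video preview content based on the filtered image frame list.
In some embodiments of the present disclosure, the video preview content may be at least one of a Graphics Interchange Format image and a new video.
The apparatus for generating video preview content provided by the above embodiments of the present disclosure can improve the quality of preview content generated from a video and can ignore the text content of the video's opening and closing credits. The above embodiments select core images by means of image recognition to generate a GIF image or a new video.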
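FIG. 4 is a schematic diagram of other embodiments of the apparatus for generating video preview content of the present disclosure. Compared with the embodiment of FIG. 3, the apparatus of the embodiment of FIG. 4 may further include an image similarity calculation module 34:
The image similarity calculation module 34 is configured to calculate the similarity between adjacent images based on the image frame list, filter out the latter image of any pair of adjacent images whose similarity is greater than a predetermined similarity threshold, and then instruct the preview content generation module 33 to perform the operation of generating video preview content based on the filtered image frame list.
In some embodiments of the present disclosure, the image similarity calculation module 34 may be configured to calculate the similarity between adjacent images for each image in the image frame list, where the adjacent images are the current image and the next image; determine whether there are adjacent images whose similarity is greater than the predetermined similarity threshold; if there are, filter out the latter image of those adjacent images and then, for the current image frame list, perform again the operation of calculating the similarity between adjacent images for each image in the image frame list; and, if there are none, instruct the preview content generation module 33 to perform the operation of generating video preview content based on the filtered image frame list.
In some embodiments of the present disclosure, the preview content generation module 33 may be configured to determine the number of images to be selected according to the length of the video to be processed and the type of video preview content; determine, from the filtered image frame list, that number of images to be selected; generate an ordered list of selected image frames from the images to be selected; and generate the video preview content from the ordered list of selected image frames.
In some embodiments of the present disclosure, when determining, from the filtered image frame list, that number of images to be selected, the preview content generation module 33 may be configured to calculate a selection weight for each image in the filtered image frame list using the similarity between each image and the next image; divide the filtered image frame list into a predetermined number of segments, where the predetermined number is the number of images to be selected and the sum of the selection weights of all images in each segment is equal; and, for each segment, take the image with the largest weight value in the segment as an image to be selected.
In some embodiments of the present disclosure, the apparatus for generating video preview content is configured to perform operations implementing the method for generating video preview content of any of the above embodiments (for example, the embodiment of FIG. 1 or FIG. 2).
The above embodiments of the present disclosure can distinguish the text content of a video's opening and closing credits and can detect rapid changes in video content.
The above embodiments of the present disclosure enable a quick preview of a video file, so that the user can learn the main information of the video file in a short time, which improves the user experience.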
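FIG. 5 is a schematic diagram of still other embodiments of the apparatus for generating video preview content of the present disclosure. As shown in FIG. 5, the apparatus may include a memory 51 and a processor 52:
The memory 51 is configured to store instructions.
The processor 52 is configured to execute the instructions so that the computer apparatus performs operations implementing the method for generating video preview content of any of the above embodiments (for example, the embodiment of FIG. 1 or FIG. 2).
The above embodiments of the present disclosure can select core images by means of image recognition and image similarity to generate a GIF image or a new video. The above embodiments can improve the quality of preview content generated from a video and can ignore the text content of the opening and closing credits as well as long stretches of near-identical video content.
According to another aspect of the present disclosure, a non-transitory computer-readable storage medium is provided, where the computer-readable storage medium stores computer instructions which, when executed by a processor, implement the method for generating video preview content of any of the above embodiments (for example, the embodiment of FIG. 1 or FIG. 2).
The non-transitory computer-readable storage medium provided by the above embodiments of the present disclosure makes it possible to select core images by means of image recognition and image similarity to generate a GIF image or a new video. The above embodiments can improve the quality of preview content generated from a video and can ignore the text content of the opening and closing credits as well as long stretches of near-identical video content.
The above embodiments of the present disclosure can distinguish the text content of a video's opening and closing credits and can detect rapid changes in video content.
The above embodiments of the present disclosure enable a quick preview of a video file, so that the user can learn the main information of the video file in a short time, which improves the user experience.
The apparatus for generating video preview content described above may be implemented as a general-purpose processor, programmable logic controller (PLC), digital signal processor (DSP), application-specific integrated circuit (ASIC), field-programmable gate array (FPGA) or other programmable logic device, discrete gate or transistor logic device, discrete hardware component, or any suitable combination thereof for performing the functions described in this application.
The present disclosure has now been described in detail. Some details well known in the art have not been described, to avoid obscuring the concept of the present disclosure. From the above description, a person skilled in the art will fully understand how to implement the technical solutions disclosed herein.
A person of ordinary skill in the art will understand that all or some of the steps of the above embodiments may be implemented by hardware, or by a program instructing the relevant hardware. The program may be stored in a computer-readable storage medium, which may be a read-only memory, a magnetic disk, an optical disc, or the like.
The description of the present disclosure is given for the purposes of illustration and description and is not intended to be exhaustive or to limit the disclosure to the form disclosed. Many modifications and variations will be apparent to a person of ordinary skill in the art. The embodiments were chosen and described to better explain the principles and practical application of the present disclosure, and to enable a person of ordinary skill in the art to understand the disclosure and design various embodiments with various modifications suited to particular uses.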
Claims (16)
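- A method for generating video preview content, including: parsing the video content to be processed, obtaining all video image frame information of the video content to be processed, and generating an ordered image frame list; processing the image frame list by means of image recognition to filter out the opening and closing credits of the video; and generating video preview content based on the filtered image frame list.
- The method for generating video preview content according to claim 1, further including: calculating the similarity between adjacent images based on the image frame list, and filtering out the latter image of any pair of adjacent images whose similarity is greater than a predetermined similarity threshold.
- The method for generating video preview content according to claim 1 or 2, where processing the image frame list by means of image recognition to filter out the opening and closing credits of the video includes: performing image recognition on each image in the ordered image frame list to determine the positions of the regions containing text in each image; for each image, determining whether the text ratio is greater than a predetermined threshold, where the text ratio is the ratio of the area occupied by all text to the area of the image; and filtering out the images for which the ratio of the area occupied by all text to the area of the image is greater than the predetermined threshold.
- The method for generating video preview content according to claim 1 or 2, where calculating the similarity between adjacent images based on the image frame list and filtering out the latter image of any pair of adjacent images whose similarity is greater than the predetermined similarity threshold includes: calculating the similarity between adjacent images for each image in the image frame list, where the adjacent images are the current image and the next image; determining whether there are adjacent images whose similarity is greater than the predetermined similarity threshold; and, if there are, filtering out the latter image of those adjacent images, until no adjacent images in the image frame list have a similarity greater than the predetermined similarity threshold.
- The method for generating video preview content according to claim 1 or 2, where generating video preview content based on the filtered image frame list includes: determining the number of images to be selected according to the length of the video to be processed and the type of video preview content; determining, from the filtered image frame list, that number of images to be selected; generating an ordered list of selected image frames from the images to be selected; and generating the video preview content from the ordered list of selected image frames.
- The method for generating video preview content according to claim 5, where determining, from the filtered image frame list, that number of images to be selected includes: calculating a selection weight for each image in the filtered image frame list using the similarity between each image and the next image; dividing the filtered image frame list into a predetermined number of segments, where the predetermined number is the number of images to be selected and the sum of the selection weights of all images in each segment is equal; and, for each segment, taking the image with the largest weight value in the segment as an image to be selected.
- The method for generating video preview content according to claim 1 or 2, where the video preview content is at least one of a Graphics Interchange Format image and a new video.
- An apparatus for generating video preview content, including: an image frame list generation module, configured to parse the video content to be processed, obtain all video image frame information of the video content to be processed, and generate an ordered image frame list; an image recognition module, configured to process the image frame list by means of image recognition to filter out the opening and closing credits of the video; and a preview content generation module, configured to generate video preview content based on the filtered image frame list.
- The apparatus for generating video preview content according to claim 8, further including: an image similarity calculation module, configured to calculate the similarity between adjacent images based on the image frame list and filter out the latter image of any pair of adjacent images whose similarity is greater than a predetermined similarity threshold.
- The apparatus for generating video preview content according to claim 8 or 9, where the image recognition module is configured to perform image recognition on each image in the ordered image frame list to determine the positions of the regions containing text in each image; for each image, determine whether the text ratio is greater than a predetermined threshold, where the text ratio is the ratio of the area occupied by all text to the area of the image; and filter out the images for which the ratio of the area occupied by all text to the area of the image is greater than the predetermined threshold.
- The apparatus for generating video preview content according to claim 9, where the image similarity calculation module is configured to calculate the similarity between adjacent images for each image in the image frame list, where the adjacent images are the current image and the next image; determine whether there are adjacent images whose similarity is greater than the predetermined similarity threshold; and, if there are, filter out the latter image of those adjacent images, until no adjacent images in the image frame list have a similarity greater than the predetermined similarity threshold.
- The apparatus for generating video preview content according to claim 8 or 9, where the preview content generation module is configured to determine the number of images to be selected according to the length of the video to be processed and the type of video preview content; determine, from the filtered image frame list, that number of images to be selected; generate an ordered list of selected image frames from the images to be selected; and generate the video preview content from the ordered list of selected image frames.
- The apparatus for generating video preview content according to claim 12, where, when determining, from the filtered image frame list, that number of images to be selected, the preview content generation module may be configured to calculate a selection weight for each image in the filtered image frame list using the similarity between each image and the next image; divide the filtered image frame list into a predetermined number of segments, where the predetermined number is the number of images to be selected and the sum of the selection weights of all images in each segment is equal; and, for each segment, take the image with the largest weight value in the segment as an image to be selected.
- The apparatus for generating video preview content according to claim 8 or 9, where the video preview content is at least one of a Graphics Interchange Format image and a new video.
- A computer apparatus, including: a memory, configured to store instructions; and a processor, configured to execute the instructions so that the computer apparatus performs operations implementing the method for generating video preview content according to any one of claims 1 to 7.
- A non-transitory computer-readable storage medium, where the computer-readable storage medium stores computer instructions which, when executed by a processor, implement the method for generating video preview content according to any one of claims 1 to 7.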
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US18/248,670 US20230396861A1 (en) | 2020-10-13 | 2021-10-13 | Method and device for generating video preview content, computer device and storage medium |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011092575.2A CN112291618B (zh) | 2020-10-13 | 2020-10-13 | Method and device for generating video preview content, computer device and storage medium |
CN202011092575.2 | 2020-10-13 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2022078363A1 (zh) | 2022-04-21 |
Family
ID=74496688
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2021/123447 WO2022078363A1 (zh) | Method and device for generating video preview content, computer device and storage medium | 2020-10-13 | 2021-10-13 |
Country Status (3)
Country | Link |
---|---|
US (1) | US20230396861A1 (zh) |
CN (1) | CN112291618B (zh) |
WO (1) | WO2022078363A1 (zh) |
Families Citing this family (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
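CN112291618B (zh) * | 2020-10-13 | 2023-04-07 | 北京沃东天骏信息技术有限公司 | Method and device for generating video preview content, computer device and storage medium |
CN114205632A (zh) * | 2021-12-17 | 2022-03-18 | 深圳Tcl新技术有限公司 | Video preview method and apparatus, electronic device, and computer-readable storage medium |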
Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20030210261A1 (en) * | 2002-05-07 | 2003-11-13 | Peng Wu | Scalable video summarization |
US20160188997A1 (en) * | 2014-12-29 | 2016-06-30 | Neon Labs Inc. | Selecting a High Valence Representative Image |
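CN105761263A (zh) * | 2016-02-19 | 2016-07-13 | 浙江大学 | Video key frame extraction method based on shot boundary detection and clustering |
CN107454454A (zh) * | 2017-08-30 | 2017-12-08 | 微鲸科技有限公司 | Information display method and apparatus |
CN110853124A (zh) * | 2019-09-17 | 2020-02-28 | Oppo广东移动通信有限公司 | Method and apparatus for generating a GIF animation, electronic device, and medium |
CN111523566A (zh) * | 2020-03-31 | 2020-08-11 | 易视腾科技股份有限公司 | Method and apparatus for locating target video clips |
CN112291618A (zh) * | 2020-10-13 | 2021-01-29 | 北京沃东天骏信息技术有限公司 | Method and device for generating video preview content, computer device and storage medium |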
Family Cites Families (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
KR20030026529A (ko) * | 2001-09-26 | 2003-04-03 | 엘지전자 주식회사 | 키프레임 기반 비디오 요약 시스템 |
CN103546828B (zh) * | 2012-07-16 | 2019-02-22 | 腾讯科技(深圳)有限公司 | 节目预览的生成方法及装置 |
JP6485452B2 (ja) * | 2014-03-27 | 2019-03-20 | ノーリツプレシジョン株式会社 | 画像処理装置 |
CN104540000B (zh) * | 2014-12-04 | 2017-10-17 | 广东欧珀移动通信有限公司 | 一种动态缩略图的生成方法及终端 |
KR101777242B1 (ko) * | 2015-09-08 | 2017-09-11 | 네이버 주식회사 | 동영상 컨텐츠의 하이라이트 영상을 추출하여 제공하는 방법과 시스템 및 기록 매체 |
US9972360B2 (en) * | 2016-08-30 | 2018-05-15 | Oath Inc. | Computerized system and method for automatically generating high-quality digital content thumbnails from digital video |
CN109327698B (zh) * | 2018-11-09 | 2020-09-15 | 杭州网易云音乐科技有限公司 | 动态预览图的生成方法、系统、介质和电子设备 |
CN110166828A (zh) * | 2019-02-19 | 2019-08-23 | 腾讯科技(深圳)有限公司 | 一种视频处理方法和装置 |
CN110532983A (zh) * | 2019-09-03 | 2019-12-03 | 北京字节跳动网络技术有限公司 | 视频处理方法、装置、介质和设备 |
2020
- 2020-10-13 CN CN202011092575.2A patent/CN112291618B/zh active Active
2021
- 2021-10-13 WO PCT/CN2021/123447 patent/WO2022078363A1/zh active Application Filing
- 2021-10-13 US US18/248,670 patent/US20230396861A1/en active Pending
Also Published As
Publication number | Publication date |
---|---|
US20230396861A1 (en) | 2023-12-07 |
CN112291618B (zh) | 2023-04-07 |
CN112291618A (zh) | 2021-01-29 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
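WO2022078363A1 (zh) | Method and device for generating video preview content, computer device and storage medium | |
JP6462919B2 (ja) | Apparatus and method for automatic image editing by image analysis, and computer-readable recording medium | |
KR101605983B1 (ko) | Image recomposition using face detection | |
CN107222795B (zh) | Video summary generation method based on multi-feature fusion | |
CN107430780B (zh) | Method for output creation based on video content characteristics | |
US10108643B2 (en) | Graphical interface device, graphical interface method and medium | |
KR20160013984A (ko) | Touch-optimized design for video editing | |
CN110832583A (zh) | System and method for generating a summary storyboard from multiple image frames | |
CN109791556B (zh) | Method for automatically creating a collage from mobile video | |
JP2006303707A (ja) | Image processing apparatus and image processing method | |
CN113014957B (zh) | Video shot segmentation method and apparatus, medium, and computer device | |
JP5984880B2 (ja) | Image processing apparatus | |
CN113297416A (zh) | Video data storage method and apparatus, electronic device, and readable storage medium | |
JP6793169B2 (ja) | Thumbnail output device, thumbnail output method, and thumbnail output program | |
WO2021259333A1 (zh) | Video processing method, apparatus, device, and computer-readable storage medium | |
US20230351571A1 (en) | Image analysis system and image analysis method | |
CN112511766A (zh) | Video editing method and system based on bullet-comment NLP, electronic device, and storage medium | |
JP6142551B2 (ja) | Image editing device and image editing program | |
JP2012022413A (ja) | Image processing apparatus, image processing method, and program | |
JP2008065792A (ja) | Image processing apparatus and method, and program | |
JP7530087B2 (ja) | Cut table creation device and program | |
WO2024176573A1 (ja) | Data augmentation device, data augmentation method, and program | |
JP5977342B2 (ja) | Electronic comic data compression device, method, and program | |
KR101747705B1 (ko) | Method and apparatus for detecting graphic shots in documentary video | |
JP6399145B2 (ja) | Image editing device and moving image display method |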
Legal Events
Date | Code | Title | Description |
---|---|---|---|
 | 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 21879406; Country of ref document: EP; Kind code of ref document: A1 |
 | WWE | Wipo information: entry into national phase | Ref document number: 18248670; Country of ref document: US |
 | NENP | Non-entry into the national phase | Ref country code: DE |
 | 122 | Ep: pct application non-entry in european phase | Ref document number: 21879406; Country of ref document: EP; Kind code of ref document: A1 |