CN110399848A

CN110399848A - Video cover generation method, device and electronic equipment

Info

Publication number: CN110399848A
Application number: CN201910692667.5A
Authority: CN
Inventors: 黄凯; 王长虎
Original assignee: Beijing ByteDance Network Technology Co Ltd
Current assignee: Beijing ByteDance Network Technology Co Ltd
Priority date: 2019-07-30
Filing date: 2019-07-30
Publication date: 2019-11-01

Abstract

Embodiments of the present disclosure provide a video cover generation method, a device, and electronic equipment, belonging to the technical field of image processing. The method comprises: parsing a target video to obtain multiple video frame images; performing object detection on each video frame image to obtain the target object information contained in that video frame image; determining a quality score for each video frame image based on the types and number of target objects included in the target object information; and, based on the quality score of each video frame image, selecting from the multiple video frame images a target video frame image to serve as the video cover. With the processing scheme of the present disclosure, a high-quality video cover can be generated automatically.

Description

Video cover generation method, device and electronic equipment

Technical field

The present disclosure relates to the technical field of image processing, and more particularly to a video cover generation method, a device, and electronic equipment.

Background art

Image processing is the technology of processing images with a computer to achieve a desired result. It originated in the 1920s and today generally refers to digital image processing. The main content of image processing technology falls into three parts: image compression; enhancement and restoration; and matching, description, and recognition. Common processing operations include image digitization, image coding, image enhancement, image restoration, image segmentation, and image analysis. Image processing uses a computer to process image information so as to satisfy human visual perception or application requirements. It is widely used, mostly in surveying and mapping science, atmospheric science, astronomy, photo beautification, and improving the recognizability of images.

One application scenario of image processing is selecting a video frame from a video to serve as that video's cover. How to select, from numerous video frames, a frame that is representative and of high picture quality to serve as the video cover is a technical problem that needs to be solved.

Summary of the invention

In view of this, embodiments of the present disclosure provide a video cover generation method, a device, and electronic equipment that at least partly solve the problems existing in the prior art.

In a first aspect, an embodiment of the present disclosure provides a video cover generation method, comprising:

parsing a target video to obtain multiple video frame images;

performing object detection on each video frame image to obtain the target object information contained in the video frame image;

determining a quality score for each video frame image based on the types and number of target objects included in the target object information; and

based on the quality score of each video frame image, selecting from the multiple video frame images a target video frame image to serve as the video cover.

According to a specific implementation of the embodiment of the present disclosure, parsing the target video to obtain multiple video frame images comprises:

obtaining the similarity between any two adjacent video frames in the target video;

judging whether the similarity is greater than a preset threshold; and

if so, selecting one of the two video frames as one of the multiple video frame images.

According to a specific implementation of the embodiment of the present disclosure, obtaining the similarity between any two adjacent video frames in the target video comprises:

calculating a histogram value for each of the two adjacent video frame images;

determining a normalization coefficient between the histogram values of the two video frame images; and

determining, based on the normalization coefficient, the similarity between the two adjacent video frames in the target video.

According to a specific implementation of the embodiment of the present disclosure, performing object detection on each video frame image to obtain the target object information contained in the video frame image comprises:

detecting the number of target objects in the video frame;

adding the detected number of target objects to the target object information;

detecting the types of the target objects in the video frame; and

adding the detected type information of the target objects to the target object information.

According to a specific implementation of the embodiment of the present disclosure, determining the quality score of each video frame image based on the types and number of target objects included in the target object information comprises:

obtaining the weight value corresponding to each type of target object; and

determining the quality score of each video frame image by weighted summation, based on the weight values and the number of target objects.

According to a specific implementation of the embodiment of the present disclosure, selecting from the multiple video frame images the target video frame image to serve as the video cover, based on the quality score of each video frame image, comprises:

obtaining the video frame image with the highest quality score among the multiple video frames;

taking the video frame image with the highest quality score as the target video frame image of the video cover;

when multiple video frame images share the highest quality score, using a CNN to score the quality of those highest-scoring video frame images; and

based on the CNN quality scoring result, selecting from the multiple video frame images the target video frame image to serve as the video cover.

According to a specific implementation of the embodiment of the present disclosure, after the target video frame image to serve as the video cover is selected from the multiple video frame images based on the quality score of each video frame image, the method further comprises:

performing a cropping operation on the target video frame image; and

taking the cropped target video frame image as the video cover.

In a second aspect, an embodiment of the present disclosure provides a video cover generation device, comprising:

a parsing module, configured to parse a target video to obtain multiple video frame images;

a detection module, configured to perform object detection on each video frame image to obtain the target object information contained in the video frame image;

a determining module, configured to determine the quality score of each video frame image based on the types and number of target objects included in the target object information; and

an execution module, configured to select from the multiple video frame images, based on the quality score of each video frame image, a target video frame image to serve as the video cover.

In a third aspect, an embodiment of the present disclosure further provides an electronic device, comprising:

at least one processor; and

a memory communicatively connected to the at least one processor; wherein

the memory stores instructions executable by the at least one processor, and the instructions are executed by the at least one processor so that the at least one processor can perform the video cover generation method of the first aspect or of any implementation of the first aspect.

In a fourth aspect, an embodiment of the present disclosure further provides a non-transitory computer-readable storage medium storing computer instructions, the computer instructions being used to cause a computer to perform the video cover generation method of the first aspect or of any implementation of the first aspect.

In a fifth aspect, an embodiment of the present disclosure further provides a computer program product comprising a computer program stored on a non-transitory computer-readable storage medium, the computer program comprising program instructions which, when executed by a computer, cause the computer to perform the video cover generation method of the first aspect or of any implementation of the first aspect.

The video cover generation scheme in the embodiments of the present disclosure includes: parsing a target video to obtain multiple video frame images; performing object detection on each video frame image to obtain the target object information contained in the video frame image; determining the quality score of each video frame image based on the types and number of target objects included in the target object information; and, based on the quality score of each video frame image, selecting from the multiple video frame images a target video frame image to serve as the video cover. With the scheme of the present disclosure, a high-quality, representative video frame can be selected as the cover image of the video based on the number and types of target objects in the video frames.

Brief description of the drawings

In order to explain the technical solutions of the embodiments of the present disclosure more clearly, the drawings needed in the embodiments are briefly described below. Obviously, the drawings described below show only some embodiments of the present disclosure, and a person of ordinary skill in the art can obtain other drawings from them without creative effort.

Fig. 1 is a schematic flowchart of video cover generation provided by an embodiment of the present disclosure;

Fig. 2 is another schematic flowchart of video cover generation provided by an embodiment of the present disclosure;

Fig. 3 is another schematic flowchart of video cover generation provided by an embodiment of the present disclosure;

Fig. 4 is another schematic flowchart of video cover generation provided by an embodiment of the present disclosure;

Fig. 5 is a schematic structural diagram of a video cover generation device provided by an embodiment of the present disclosure;

Fig. 6 is a schematic diagram of an electronic device provided by an embodiment of the present disclosure.

Detailed description of the embodiments

The embodiments of the present disclosure are described in detail below with reference to the accompanying drawings.

The embodiments of the present disclosure are illustrated below by way of specific examples, and those skilled in the art can readily understand other advantages and effects of the present disclosure from the content disclosed in this specification. Obviously, the described embodiments are only some of the embodiments of the present disclosure, not all of them. The present disclosure can also be implemented or applied through other, different embodiments, and the details in this specification can be modified or changed in various ways based on different viewpoints and applications without departing from the spirit of the present disclosure. It should be noted that, in the absence of conflict, the following embodiments and the features in them can be combined with one another. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments in the present disclosure, without creative effort, fall within the scope of protection of the present disclosure.

It should be noted that various aspects of embodiments within the scope of the appended claims are described below. It should be apparent that the aspects described herein can be embodied in a wide variety of forms, and that any specific structure and/or function described herein is merely illustrative. Based on the present disclosure, those skilled in the art will understand that an aspect described herein can be implemented independently of any other aspect, and that two or more of these aspects can be combined in various ways. For example, any number of the aspects set forth herein can be used to implement a device and/or practice a method. In addition, such a device can be implemented and/or such a method practiced using structures and/or functionality other than one or more of the aspects set forth herein.

It should also be noted that the illustrations provided in the following embodiments only explain the basic concept of the present disclosure in a schematic way. The drawings show only the components related to the present disclosure rather than being drawn according to the number, shape, and size of the components in an actual implementation. In an actual implementation, the form, quantity, and proportion of each component can change arbitrarily, and the component layout may also be more complex.

In addition, specific details are provided in the following description to facilitate a thorough understanding of the examples. However, those skilled in the art will understand that the aspects can be practiced without these specific details.

An embodiment of the present disclosure provides a video cover generation method. The video cover generation method provided in this embodiment can be executed by a computing device, which can be implemented as software or as a combination of software and hardware, and which can be integrated in a server, a terminal device, or the like.

Referring to Fig. 1, a video cover generation method provided by an embodiment of the present disclosure includes the following steps:

S101: parse the target video to obtain multiple video frame images.

The target video is a video file that records audio and image data. It can be a video file of any format; for example, the target video can be a video file in mpg, mp4, rm, rmvb, or wax format, or a video file in another format.

The target video contains video frames, which together make up all the pictures in the target video. For example, for a video with a frame rate of 30 fps played at normal speed, 1 second of video can be split into 30 video frames. Of course, depending on actual needs, more video frames can be obtained by frame interpolation among the 30 video frames, or only some of the 30 video frames can be selected.
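As a non-limiting illustration of the frame counts discussed above, the following Python sketch computes how many frames a clip splits into and selects a subset of them. The function names `frame_count` and `subsample` and the fixed-step selection rule are assumptions made for illustration; the disclosure does not prescribe how the subset of frames is chosen.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def frame_count(fps: float, seconds: float) -> int:
    """Number of frames a clip of the given length splits into at normal speed."""
    return int(round(fps * seconds))


def subsample(frames: Sequence[T], step: int) -> List[T]:
    """Keep every `step`-th frame, starting with the first (hypothetical rule)."""
    if step < 1:
        raise ValueError("step must be >= 1")
    return list(frames[::step])


# One second of 30 fps video yields 30 frames; keeping every 10th leaves 3.
one_second = list(range(frame_count(30, 1)))
selected = subsample(one_second, 10)
```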

To improve the efficiency of parsing, the video frames in the target video can be screened while the target video is being parsed, and the video frames that meet the screening conditions are used as the parsed images.

The video frames contained in the target video can be screened in various ways. As one approach, screening can be performed by judging whether adjacent video frames in the target video are similar. Specifically, the similarity between any two adjacent video frames in the target video can be obtained and compared with a preset threshold (for example, 95%). When the similarity is greater than the preset threshold, the image content changes little between the two adjacent video frames, so either one of the two video frames can be selected as one of the multiple video frame images.
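The adjacent-frame screening described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of the disclosure: frames are modelled as flat sequences of 0-255 gray values, `pixel_similarity` is a toy similarity measure chosen only so the example is self-contained, and the sketch assumes that a frame that is not similar to its predecessor is kept, which the text does not state explicitly.

```python
from typing import Callable, List, Optional, Sequence

Frame = Sequence[int]  # hypothetical representation: flat list of 0-255 gray values


def pixel_similarity(a: Frame, b: Frame) -> float:
    """Toy similarity in [0, 1]: 1 minus the mean absolute difference over 255."""
    if len(a) != len(b) or len(a) == 0:
        raise ValueError("frames must be non-empty and of equal size")
    diff = sum(abs(x - y) for x, y in zip(a, b))
    return 1.0 - diff / (255.0 * len(a))


def filter_similar_frames(
    frames: Sequence[Frame],
    similarity: Callable[[Frame, Frame], float] = pixel_similarity,
    threshold: float = 0.95,
) -> List[Frame]:
    """Keep the first frame of every run of near-duplicate adjacent frames."""
    kept: List[Frame] = []
    previous: Optional[Frame] = None
    for frame in frames:
        # A frame is dropped only when it is more similar to the frame
        # immediately before it than the preset threshold allows.
        if previous is None or similarity(previous, frame) <= threshold:
            kept.append(frame)
        previous = frame
    return kept


frames = [[0, 0], [0, 1], [255, 255], [255, 254]]
candidates = filter_similar_frames(frames)
```

Here the second and fourth frames differ from their predecessors by a single gray level, so only the first and third frames remain as candidates.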

S102: perform object detection on each video frame image to obtain the target object information contained in the video frame image.

After the multiple video frame images are obtained, they can be used as candidate video cover images. To screen these candidate video frame images further, the multiple video frame images can be processed by object detection in order to select the video frame images that contain target objects.

As an example, object detection can be performed on all the video frames contained in the target video. Object detection determines whether a video frame contains a target object (for example, a person or a thing). When the video frame contains a target object, the type of the target object (for example, person, car, or tree) is detected further. In this way, the number and types of target objects contained in each of the multiple video frames are counted, and the number and types of target objects contained in each video frame are saved as the target object information.
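One possible shape for the saved target object information is sketched below. The `TargetObjectInfo` dataclass and the `build_target_object_info` helper are hypothetical names introduced for illustration, and the list of labels stands in for the output of whichever object detector is used; the disclosure does not specify a data format or a detector.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class TargetObjectInfo:
    """Number and types of target objects found in one video frame."""
    total: int = 0
    counts_by_type: Dict[str, int] = field(default_factory=dict)


def build_target_object_info(detected_labels: List[str]) -> TargetObjectInfo:
    """Summarise detector output (one label per detected object) for a frame."""
    counts = Counter(detected_labels)
    return TargetObjectInfo(total=sum(counts.values()), counts_by_type=dict(counts))


# Labels as an assumed detector might report them for a single frame.
info = build_target_object_info(["person", "car", "person"])
empty = build_target_object_info([])
```

A frame with no detections yields a total of zero and an empty type table, which lets later steps treat frames without target objects uniformly.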

Object detection on the video frames can use any of a variety of object detection methods for images; the manner of object detection is not limited here.

S103: determine the quality score of each video frame image based on the types and number of target objects included in the target object information.

After object detection has been completed on the multiple video frames, the quality of each video frame image can be evaluated based on the types and number of target objects detected in it, yielding the quality score of each video frame image.

As an example, the quality score of each video frame can be calculated using weight scores, with a different weight score defined for each type of target object present in a video frame. For example, a car is worth 5 points, a face 4 points, a person 3 points, and text 2 points. If all four of these target objects appear in one video frame image at the same time, the video frame image scores 5 + 4 + 3 + 2 = 14 points.

Of course, depending on actual needs, different scores can be assigned to different types of target object, and these scores are used as weights to calculate, by weighted summation, the quality score over all the target objects contained in the video frame image.
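The weighted summation can be illustrated with the example weights given above (car 5, face 4, person 3, text 2). The function name `quality_score`, the `WEIGHTS` table layout, and the choice to give unlisted object types a weight of zero are assumptions made for this sketch.

```python
from typing import Dict, Mapping

# Example weights taken from the text; real deployments would choose their own.
WEIGHTS: Dict[str, int] = {"car": 5, "face": 4, "person": 3, "text": 2}


def quality_score(
    counts_by_type: Mapping[str, int],
    weights: Mapping[str, int] = WEIGHTS,
    default_weight: int = 0,
) -> int:
    """Weighted sum: each object type's weight times its count in the frame."""
    return sum(
        weights.get(object_type, default_weight) * count
        for object_type, count in counts_by_type.items()
    )


# One car, one face, one person and one piece of text: 5 + 4 + 3 + 2 = 14.
score = quality_score({"car": 1, "face": 1, "person": 1, "text": 1})
```

Because the count multiplies the weight, a frame with two faces contributes 8 points for faces rather than 4, which is one reasonable reading of scoring by both type and number.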

S104: based on the quality score of each video frame image, select from the multiple video frame images the target video frame image to serve as the video cover.

After the quality score of each of the multiple video frame images is obtained, one or more video frame images can be selected from the multiple video frame images as the cover image of the target video based on these quality scores. For example, the quality scores of the multiple video frames can be sorted in descending order, and one or more video frame images can be selected from the multiple video frame images, from highest to lowest, as the cover image of the target video.

When only one video frame image is to be selected as the cover image of the target video, several video frame images may share the same highest quality score. In that case, the quality of the video frame images with identical quality scores needs to be evaluated further, so that one of them can be selected as the cover image of the target video. As an example, a CNN can be used to score the quality of the video frame images with identical quality scores, and the video frame image with the higher CNN quality score is selected as the cover image of the target video.
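The selection with a tie-break can be sketched as follows. The `tie_breaker` callable is a stand-in for the CNN quality scorer mentioned in the text; its name and signature are assumptions, and any function that maps a frame to a number could be supplied in its place.

```python
from typing import Callable, Sequence, TypeVar

F = TypeVar("F")


def select_cover(
    frames: Sequence[F],
    scores: Sequence[float],
    tie_breaker: Callable[[F], float],
) -> F:
    """Pick the highest-scoring frame; consult `tie_breaker` only on a tie."""
    if not frames or len(frames) != len(scores):
        raise ValueError("frames and scores must be non-empty and equally long")
    best = max(scores)
    candidates = [f for f, s in zip(frames, scores) if s == best]
    if len(candidates) == 1:
        return candidates[0]
    # Several frames share the top score: rank them with the secondary scorer.
    return max(candidates, key=tie_breaker)


# Hypothetical secondary scores standing in for a CNN's output.
secondary = {"frame_a": 0.2, "frame_b": 0.9, "frame_c": 1.0}
cover = select_cover(["frame_a", "frame_b", "frame_c"], [14, 14, 9], secondary.__getitem__)
```

In this example `frame_c` has the best secondary score but is never considered, because the secondary scorer ranks only the frames tied at the top weighted score.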

Through the scheme in steps S101 to S104, the quality of the selected video frames of the target video can be evaluated based on different strategies, and a video frame can be selected as the cover image of the target video based on the result of the quality evaluation, thereby ensuring the quality of the target video's cover image.

Referring to Fig. 2, according to a specific implementation of the embodiment of the present disclosure, parsing the target video to obtain multiple video frame images may include the following steps:

S201: obtain the similarity between any two adjacent video frames in the target video.

Specifically, any two adjacent frame images can be converted into matrices to obtain the image matrices corresponding to the two adjacent frame images, and the similarity between the two video frames is obtained by comparing the similarity between the two image matrices.

S202: judge whether the similarity is greater than a preset threshold.

The similarity of the images reflects the degree of change between two video frames. When the similarity between two video frames is greater than the preset threshold (for example, 95%), there is almost no change between the two video frames. A preset threshold can therefore be set and compared with the similarity value.

S203: if so, select one of the two video frames as one of the multiple video frame images.

When the similarity value is greater than the preset threshold, either one of the two video frame images can be selected as one of the multiple video frame images, and the other video frame image need not be processed.

Through steps S201 to S203, the efficiency of video frame image selection can be further improved.

The similarity between two video frame images can be calculated in several ways. Referring to Fig. 3, according to a specific implementation of the embodiment of the present disclosure, obtaining the similarity between any two adjacent video frames in the target video may include the following steps:

S301: calculate a histogram value for each of the two adjacent video frame images.

S302: determine the normalization coefficient between the histogram values of the two video frame images.

After the histogram values of the two video frame images are obtained, the normalized correlation coefficient between the two histograms can be obtained by calculating the Bhattacharyya distance or the histogram intersection distance of the two histograms.

S303: based on the normalization coefficient, determine the similarity between the two adjacent video frames in the target video.
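Steps S301 to S303 can be sketched in plain Python as follows. This is an illustrative sketch under stated assumptions: frames are modelled as flat lists of 0-255 gray values, the 16-bin histogram is an arbitrary choice, and the two measures shown are the Bhattacharyya coefficient (the quantity from which the Bhattacharyya distance is derived) and the histogram intersection, both computed on normalized histograms so that identical histograms give 1.0. The text does not say which measure or bin count is used in practice.

```python
import math
from typing import List, Sequence


def gray_histogram(pixels: Sequence[int], bins: int = 16) -> List[float]:
    """Normalized histogram of 0-255 gray values; the entries sum to 1."""
    if not pixels:
        raise ValueError("pixels must be non-empty")
    hist = [0] * bins
    for p in pixels:
        hist[min(p * bins // 256, bins - 1)] += 1
    return [h / len(pixels) for h in hist]


def bhattacharyya_coefficient(h1: Sequence[float], h2: Sequence[float]) -> float:
    """Overlap of two normalized histograms: 1.0 if identical, 0.0 if disjoint."""
    return sum(math.sqrt(a * b) for a, b in zip(h1, h2))


def histogram_intersection(h1: Sequence[float], h2: Sequence[float]) -> float:
    """Sum of bin-wise minima of two normalized histograms, in [0, 1]."""
    return sum(min(a, b) for a, b in zip(h1, h2))


dark = gray_histogram([0, 0, 0, 0])
bright = gray_histogram([255, 255, 255, 255])
mixed = gray_histogram([0, 0, 255, 255])
```

Either value can then be compared with the preset threshold as the similarity of the two frames. Note that histograms ignore where pixels are located, so two frames with the same tonal distribution but different layouts will look identical to these measures.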

According to a specific implementation of the embodiment of the present disclosure, performing object detection on each video frame image to obtain the target object information contained in the video frame image comprises: detecting the number of target objects in the video frame; and adding the detected number of target objects to the target object information.

According to a specific implementation of the embodiment of the present disclosure, performing object detection on each video frame image to obtain the target object information contained in the video frame image comprises: detecting the types of the target objects in the video frame; and adding the detected type information of the target objects to the target object information.

According to a specific implementation of the embodiment of the present disclosure, determining the quality score of each video frame image based on the types and number of target objects included in the target object information comprises: obtaining the weight value corresponding to each type of target object; and determining the quality score of each video frame image by weighted summation, based on the weight values and the number of target objects.

According to a specific implementation of the embodiment of the present disclosure, selecting from the multiple video frame images the target video frame image to serve as the video cover, based on the quality score of each video frame image, comprises: obtaining the video frame image with the highest quality score among the multiple video frames; and taking the video frame image with the highest quality score as the target video frame image of the video cover.

Referring to Fig. 4, according to a specific implementation of the embodiment of the present disclosure, selecting from the multiple video frame images the target video frame image to serve as the video cover, based on the quality score of each video frame image, comprises:

S401: when multiple video frame images share the highest quality score, use a CNN to score the quality of those highest-scoring video frame images.

As one approach, a pre-trained convolutional neural network (CNN) can be used to score the quality of the video frame images with the highest quality score among the multiple video frames. During quality scoring, a comprehensive evaluation can be made from multiple aspects such as image quality, color, environment, and facial expression, so as to give each of the highest-scoring video frame images a specific quality score.

S402: based on the CNN quality scoring result, select from the multiple video frame images the target video frame image to serve as the video cover.

After the quality scores are obtained, one or more candidate cover images with high quality scores can be selected by sorting and used as the final cover image of the target video.

According to a specific implementation of the embodiment of the present disclosure, after the target video frame image to serve as the video cover is selected from the multiple video frame images based on the quality score of each video frame image, the method further comprises: performing a cropping operation on the target video frame image, and taking the cropped target video frame image as the video cover. Through the cropping operation, the size of the video cover image can be further optimized while the target objects present in the video frame are retained.
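One way to crop while retaining every detected target object is to crop to the union of the objects' bounding boxes, optionally padded by a margin. The sketch below illustrates that idea only; the box format `(left, top, right, bottom)` with exclusive right and bottom edges, the `margin` parameter, and the function names are assumptions, and the disclosure does not specify how the crop region is chosen.

```python
from typing import List, Sequence, Tuple

Box = Tuple[int, int, int, int]  # (left, top, right, bottom), right/bottom exclusive


def crop_box(boxes: Sequence[Box], width: int, height: int, margin: int = 0) -> Box:
    """Smallest region containing every box, padded and clamped to the image."""
    if not boxes:
        return (0, 0, width, height)  # nothing detected: keep the whole frame
    left = max(0, min(b[0] for b in boxes) - margin)
    top = max(0, min(b[1] for b in boxes) - margin)
    right = min(width, max(b[2] for b in boxes) + margin)
    bottom = min(height, max(b[3] for b in boxes) + margin)
    return (left, top, right, bottom)


def crop(image: Sequence[Sequence[int]], box: Box) -> List[List[int]]:
    """Crop a row-major 2D image (list of rows) to the given box."""
    left, top, right, bottom = box
    return [list(row[left:right]) for row in image[top:bottom]]


# Two hypothetical detections in a 100x50 frame, padded by 5 pixels.
region = crop_box([(10, 10, 20, 20), (30, 5, 40, 15)], width=100, height=50, margin=5)
patch = crop([[1, 2, 3], [4, 5, 6], [7, 8, 9]], (1, 0, 3, 2))
```

A production system would usually also enforce the cover's target aspect ratio, which this sketch leaves out.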

Corresponding to the above method embodiments, and referring to Fig. 5, an embodiment of the present disclosure further provides a video cover generation device 50, comprising:

a parsing module 501, configured to parse a target video to obtain multiple video frame images;

a detection module 502, configured to perform object detection on each video frame image to obtain the target object information contained in the video frame image;

a determining module 503, configured to determine the quality score of each video frame image based on the types and number of target objects included in the target object information; and

an execution module 504, configured to select from the multiple video frame images, based on the quality score of each video frame image, a target video frame image to serve as the video cover.

Through the scheme in the present application, the quality of the selected video frames of the target video can be evaluated based on different strategies, and a video frame can be selected as the cover image of the target video based on the result of the quality evaluation, thereby ensuring the quality of the target video's cover image.

The device shown in Fig. 5 can correspondingly execute the content of the above method embodiments. For the parts not described in detail in this embodiment, refer to the content recorded in the above method embodiments; details are not repeated here.
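The four-module structure of the device can be pictured as a small pipeline in which each module is a pluggable callable. The class name `VideoCoverGenerator` and the callable signatures below are hypothetical and are chosen only to mirror modules 501 to 504; the toy lambdas in the usage example stand in for real parsing, detection, scoring, and selection logic.

```python
from typing import Any, Callable, List, Sequence


class VideoCoverGenerator:
    """Chains parsing (501), detection (502), scoring (503) and selection (504)."""

    def __init__(
        self,
        parse: Callable[[Any], Sequence[Any]],
        detect: Callable[[Any], Any],
        score: Callable[[Any], float],
        select: Callable[[Sequence[Any], Sequence[float]], Any],
    ) -> None:
        self.parse = parse
        self.detect = detect
        self.score = score
        self.select = select

    def generate(self, video: Any) -> Any:
        frames = self.parse(video)                       # parsing module
        infos: List[Any] = [self.detect(f) for f in frames]    # detection module
        scores: List[float] = [self.score(i) for i in infos]   # determining module
        return self.select(frames, scores)               # execution module


# Toy stand-ins: a "video" is a string, "frames" are words, score is word length.
generator = VideoCoverGenerator(
    parse=lambda video: video.split(),
    detect=lambda frame: len(frame),
    score=lambda info: float(info),
    select=lambda frames, scores: frames[scores.index(max(scores))],
)
cover = generator.generate("a bbb cc")
```

Keeping the modules as injected callables makes it straightforward to swap, for example, the detector or the tie-breaking strategy without touching the rest of the pipeline.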

Referring to Fig. 6, an embodiment of the present disclosure further provides an electronic device 60, comprising:

at least one processor; and

a memory communicatively connected to the at least one processor; wherein

the memory stores instructions executable by the at least one processor, and the instructions are executed by the at least one processor so that the at least one processor can perform the video cover generation method in the foregoing method embodiments.

An embodiment of the present disclosure further provides a non-transitory computer-readable storage medium storing computer instructions, the computer instructions being used to cause a computer to perform the video cover generation method in the foregoing method embodiments.

An embodiment of the present disclosure further provides a computer program product comprising a computer program stored on a non-transitory computer-readable storage medium, the computer program comprising program instructions which, when executed by a computer, cause the computer to perform the video cover generation method in the foregoing method embodiments.

Reference is now made to Fig. 6, which shows a schematic structural diagram of an electronic device 60 suitable for implementing the embodiments of the present disclosure. The electronic device in the embodiments of the present disclosure can include, but is not limited to, mobile terminals such as mobile phones, laptop computers, digital broadcast receivers, PDAs (personal digital assistants), PADs (tablet computers), PMPs (portable media players), and vehicle-mounted terminals (for example, vehicle-mounted navigation terminals), as well as fixed terminals such as digital TVs and desktop computers. The electronic device shown in Fig. 6 is only an example and should not impose any limitation on the functions and scope of use of the embodiments of the present disclosure.

As shown in Fig. 6, the electronic device 60 may include a processing unit (for example, a central processing unit or a graphics processor) 601, which can perform various appropriate actions and processing according to a program stored in a read-only memory (ROM) 602 or a program loaded from a storage device 608 into a random access memory (RAM) 603. The RAM 603 also stores the various programs and data needed for the operation of the electronic device 60. The processing unit 601, the ROM 602, and the RAM 603 are connected to one another through a bus 604. An input/output (I/O) interface 605 is also connected to the bus 604.

In general, the following devices can be connected to the I/O interface 605: input devices 606 including, for example, a touch screen, touch pad, keyboard, mouse, image sensor, microphone, accelerometer, and gyroscope; output devices 607 including, for example, a liquid crystal display (LCD), speaker, and vibrator; storage devices 608 including, for example, a magnetic tape and hard disk; and a communication device 609. The communication device 609 can allow the electronic device 60 to communicate with other devices, wirelessly or by wire, to exchange data. Although the figure shows an electronic device 60 with various devices, it should be understood that it is not required to implement or have all of the devices shown. More or fewer devices may alternatively be implemented or provided.

In particular, according to embodiments of the present disclosure, the process described above with reference to the flowcharts can be implemented as a computer software program. For example, an embodiment of the present disclosure includes a computer program product comprising a computer program carried on a computer-readable medium, the computer program containing program code for executing the method shown in the flowchart. In such an embodiment, the computer program can be downloaded and installed from a network through the communication device 609, installed from the storage device 608, or installed from the ROM 602. When the computer program is executed by the processing unit 601, the above functions defined in the method of the embodiments of the present disclosure are performed.

It should be noted that the above computer-readable medium of the present disclosure can be a computer-readable signal medium, a computer-readable storage medium, or any combination of the two. A computer-readable storage medium can be, for example, but is not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the above. More specific examples of computer-readable storage media can include, but are not limited to: an electrical connection with one or more wires, a portable computer disk, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above. In the present disclosure, a computer-readable storage medium can be any tangible medium that contains or stores a program, where the program can be used by or in connection with an instruction execution system, apparatus, or device. In the present disclosure, a computer-readable signal medium can include a data signal propagated in baseband or as part of a carrier wave, in which computer-readable program code is carried. Such a propagated data signal can take various forms, including but not limited to an electromagnetic signal, an optical signal, or any suitable combination of the above. A computer-readable signal medium can also be any computer-readable medium other than a computer-readable storage medium, and it can send, propagate, or transmit a program for use by or in connection with an instruction execution system, apparatus, or device. The program code contained on a computer-readable medium can be transmitted over any suitable medium, including but not limited to: electric wire, optical cable, RF (radio frequency), or any suitable combination of the above.

The above computer-readable medium may be included in the above electronic equipment; it may also exist separately without being assembled into the electronic equipment.

The above computer-readable medium carries one or more programs which, when executed by the electronic equipment, cause the electronic equipment to: obtain at least two Internet Protocol addresses; send to a node evaluation device a node evaluation request including the at least two Internet Protocol addresses, wherein the node evaluation device selects an Internet Protocol address from the at least two Internet Protocol addresses and returns it; and receive the Internet Protocol address returned by the node evaluation device; wherein the obtained Internet Protocol addresses indicate edge nodes in a content distribution network.

Alternatively, the above computer-readable medium carries one or more programs which, when executed by the electronic equipment, cause the electronic equipment to: receive a node evaluation request including at least two Internet Protocol addresses; select an Internet Protocol address from the at least two Internet Protocol addresses; and return the selected Internet Protocol address; wherein the received Internet Protocol addresses indicate edge nodes in a content distribution network.

Computer program code for carrying out the operations of the present disclosure may be written in one or more programming languages or a combination thereof, including object-oriented programming languages such as Java, Smalltalk, and C++, as well as conventional procedural programming languages such as the "C" language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on a remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or may be connected to an external computer (for example, through the Internet using an Internet service provider).

The flowcharts and block diagrams in the accompanying drawings illustrate the architecture, functionality, and operation of possible implementations of systems, methods, and computer program products according to various embodiments of the present disclosure. In this regard, each block in a flowchart or block diagram may represent a module, program segment, or portion of code that contains one or more executable instructions for implementing the specified logical function. It should also be noted that, in some alternative implementations, the functions noted in the blocks may occur in an order different from that noted in the drawings. For example, two blocks shown in succession may in fact be executed substantially in parallel, or they may sometimes be executed in the reverse order, depending on the functionality involved. It should also be noted that each block in the block diagrams and/or flowcharts, and combinations of blocks in the block diagrams and/or flowcharts, can be implemented by a dedicated hardware-based system that performs the specified functions or operations, or by a combination of dedicated hardware and computer instructions.

The units described in the embodiments of the present disclosure may be implemented in software or in hardware. The name of a unit does not, in some cases, constitute a limitation on the unit itself; for example, the first obtaining unit may also be described as "a unit that obtains at least two Internet Protocol addresses".

It should be understood that each part of the present disclosure may be implemented in hardware, software, firmware, or a combination thereof.

The above are merely specific embodiments of the present disclosure, but the protection scope of the present disclosure is not limited thereto. Any change or substitution that a person skilled in the art could readily conceive within the technical scope disclosed herein shall fall within the protection scope of the present disclosure. Therefore, the protection scope of the present disclosure shall be subject to the protection scope of the claims.

Claims

1. A video cover generation method, characterized by comprising:

parsing a target video to obtain a plurality of video frame images;

performing object detection on each video frame image to obtain target object information contained in the video frame image;

determining a quality score for each video frame image based on the type and number of target objects included in the target object information; and

selecting, based on the quality score of each video frame image, a target video frame image from the plurality of video frame images as the video cover.
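
For illustration only, the detect, score, and select steps of claim 1 can be sketched as follows. The names `select_cover`, `detect`, and `score` are hypothetical; the detector and the scoring function are passed in as callables because the claim does not fix a particular detector or formula, and frames are treated as opaque values that have already been parsed from the video.

```python
from typing import Callable, Dict, Sequence, TypeVar

Frame = TypeVar("Frame")


def select_cover(
    frames: Sequence[Frame],
    detect: Callable[[Frame], Dict[str, int]],
    score: Callable[[Dict[str, int]], float],
) -> Frame:
    """Return the frame whose detected objects yield the highest quality score.

    `frames` stands in for the parsed video frame images, `detect` maps a
    frame to {object type: count}, and `score` turns that mapping into a
    number. All three are illustrative assumptions, not part of the claim.
    """
    if not frames:
        raise ValueError("no video frames to choose from")
    # max() returns the first frame among frames with equal scores.
    return max(frames, key=lambda frame: score(detect(frame)))
```

In this sketch a tie is resolved by frame order; that choice is an assumption made to keep the example short.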

2. The method according to claim 1, characterized in that parsing the target video to obtain the plurality of video frame images comprises:

determining whether the similarity is greater than a preset threshold; and

if so, selecting one of the two video frames as one of the plurality of video frame images.
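
A minimal sketch of similarity-based frame thinning, under stated assumptions: frames are flattened 8-bit grayscale pixel lists, similarity is one minus the normalized mean absolute pixel difference, and dissimilar frames are all kept. The claim specifies none of these; `similarity`, `drop_similar_neighbours`, and the default threshold of 0.9 are hypothetical.

```python
from typing import List, Sequence

Pixels = Sequence[int]  # flattened 8-bit grayscale frame (assumed representation)


def similarity(a: Pixels, b: Pixels) -> float:
    """Similarity in [0, 1]: 1 minus the normalized mean absolute pixel difference."""
    if len(a) != len(b) or not a:
        raise ValueError("frames must be non-empty and equally sized")
    mean_abs_diff = sum(abs(x - y) for x, y in zip(a, b)) / len(a)
    return 1.0 - mean_abs_diff / 255.0


def drop_similar_neighbours(frames: Sequence[Pixels], threshold: float = 0.9) -> List[Pixels]:
    """Keep the first frame of every run of adjacent near-duplicate frames."""
    kept: List[Pixels] = []
    for index, frame in enumerate(frames):
        # Skip a frame that is too similar to the frame immediately before it.
        if index > 0 and similarity(frames[index - 1], frame) > threshold:
            continue
        kept.append(frame)
    return kept
```

Keeping the earlier frame of a similar pair is an arbitrary choice here; the claim only requires that one of the two be selected.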

3. The method according to claim 2, characterized in that obtaining the similarity between any two adjacent video frames in the target video comprises:

4. The method according to claim 1, characterized in that performing object detection on each video frame image to obtain the target object information contained in the video frame image comprises:

detecting the number of target objects in the video frame;

5. The method according to claim 1, characterized in that performing object detection on each video frame image to obtain the target object information contained in the video frame image comprises:

detecting the type of target objects in the video frame;

6. The method according to claim 1, characterized in that determining the quality score of each video frame image based on the type and number of target objects included in the target object information comprises:

obtaining weight values corresponding to different types of target objects; and

determining the quality score of each video frame image by weighted summation, based on the weight values and the number of target objects.
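
The weighted summation can be illustrated with a short sketch. The weight table `EXAMPLE_WEIGHTS`, the object type names, and the rule that unweighted types contribute nothing are assumptions made for the example; the claim does not give concrete values.

```python
from typing import Dict, Mapping

# Hypothetical weights per object type; the claim does not specify values.
EXAMPLE_WEIGHTS: Dict[str, float] = {"person": 3.0, "pet": 2.0, "car": 1.0}


def quality_score(counts: Mapping[str, int], weights: Mapping[str, float]) -> float:
    """Weighted sum over detected types: weight(type) * number of objects."""
    # Types without a configured weight contribute nothing (an assumption).
    return sum(weights.get(obj_type, 0.0) * n for obj_type, n in counts.items())
```

With these example weights, a frame containing two persons and one car scores 3.0 × 2 + 1.0 × 1 = 7.0.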

7. The method according to claim 1, characterized in that selecting, based on the quality score of each video frame image, the target video frame image from the plurality of video frame images as the video cover comprises:

8. The method according to claim 1, characterized in that selecting, based on the quality score of each video frame image, the target video frame image from the plurality of video frame images as the video cover comprises:

when multiple video frames share the highest quality score, scoring the quality of those video frames using a CNN network; and

selecting, based on the CNN network quality scoring result, the target video frame image from the plurality of video frame images as the video cover.
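
The tie-breaking step of claim 8 might look like the sketch below. `cnn_score` is a hypothetical stand-in for a trained CNN quality model, which is not reproduced here, and `pick_cover` is an illustrative name.

```python
from typing import Callable, Dict, TypeVar

Frame = TypeVar("Frame")


def pick_cover(scores: Dict[Frame, float], cnn_score: Callable[[Frame], float]) -> Frame:
    """Pick the top-scoring frame, consulting the CNN scorer only on ties."""
    if not scores:
        raise ValueError("no scored frames")
    best = max(scores.values())
    tied = [frame for frame, value in scores.items() if value == best]
    if len(tied) == 1:
        return tied[0]
    # Several frames share the highest score: rank only those with the CNN scorer.
    return max(tied, key=cnn_score)
```

Calling the secondary scorer only on the tied frames keeps the more expensive model off the common path where a single frame already wins.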

9. The method according to claim 1, characterized in that, after selecting, based on the quality score of each video frame image, the target video frame image from the plurality of video frame images as the video cover, the method further comprises:

performing a cropping operation on the target video frame image; and

using the cropped target video frame image as the video cover.
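
The claim does not say how the crop region is chosen. As one possible illustration, the sketch below takes a centered crop of a given size from an image stored as rows of pixel values; `center_crop` and its parameters are hypothetical.

```python
from typing import List, Sequence


def center_crop(image: Sequence[Sequence[int]], target_w: int, target_h: int) -> List[List[int]]:
    """Return the centered target_w x target_h region of a row-major image."""
    height = len(image)
    width = len(image[0]) if height else 0
    if not (0 < target_w <= width and 0 < target_h <= height):
        raise ValueError("crop size must fit inside the image")
    top = (height - target_h) // 2
    left = (width - target_w) // 2
    # Slice the selected rows, then the selected columns within each row.
    return [list(row[left:left + target_w]) for row in image[top:top + target_h]]
```

A content-aware crop, for example one centered on the detected target objects, would be an equally valid reading of the claim.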

10. A video cover generation device, characterized by comprising:

a detection module, configured to perform object detection on each video frame image to obtain target object information contained in the video frame image;

a determining module, configured to determine a quality score for each video frame image based on the type and number of target objects included in the target object information; and

an execution module, configured to select, based on the quality score of each video frame image, a target video frame image from the plurality of video frame images as the video cover.

11. Electronic equipment, characterized in that the electronic equipment comprises:

at least one processor; and

a memory communicatively connected to the at least one processor; wherein

the memory stores instructions executable by the at least one processor, the instructions being executed by the at least one processor to enable the at least one processor to perform the video cover generation method according to any one of claims 1-9.

12. A non-transitory computer-readable storage medium storing computer instructions, the computer instructions being used to cause a computer to perform the video cover generation method according to any one of claims 1-9.