CN110351597A - Video clipping method, apparatus, and electronic device - Google Patents
Video clipping method, apparatus, and electronic device
- Publication number
- CN110351597A CN110351597A CN201810308302.3A CN201810308302A CN110351597A CN 110351597 A CN110351597 A CN 110351597A CN 201810308302 A CN201810308302 A CN 201810308302A CN 110351597 A CN110351597 A CN 110351597A
- Authority
- CN
- China
- Prior art keywords
- video
- classification
- image
- image set
- chosen
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Withdrawn
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/213—Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods
- G06F18/2135—Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods based on approximation criteria, e.g. principal component analysis
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/40—Scenes; Scene-specific elements in video content
- G06V20/41—Higher-level, semantic clustering, classification or understanding of video scenes, e.g. detection, labelling or Markovian modelling of sport events or news items
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/40—Scenes; Scene-specific elements in video content
- G06V20/46—Extracting features or characteristics from the video content, e.g. video fingerprints, representative shots or key frames
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/40—Scenes; Scene-specific elements in video content
- G06V20/48—Matching video sequences
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
- G06V40/16—Human faces, e.g. facial parts, sketches or expressions
- G06V40/168—Feature extraction; Face representation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
- G06V40/16—Human faces, e.g. facial parts, sketches or expressions
- G06V40/172—Classification, e.g. identification
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/40—Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
- H04N21/43—Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
- H04N21/44—Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream, rendering scenes according to MPEG-4 scene graphs
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/80—Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
- H04N21/83—Generation or processing of protective or descriptive data associated with content; Content structuring
- H04N21/845—Structuring of content, e.g. decomposing content into time segments
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/80—Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
- H04N21/83—Generation or processing of protective or descriptive data associated with content; Content structuring
- H04N21/845—Structuring of content, e.g. decomposing content into time segments
- H04N21/8456—Structuring of content, e.g. decomposing content into time segments by decomposing the content in the time domain, e.g. in time segments
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Multimedia (AREA)
- General Physics & Mathematics (AREA)
- Physics & Mathematics (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Data Mining & Analysis (AREA)
- Signal Processing (AREA)
- Evolutionary Computation (AREA)
- Oral & Maxillofacial Surgery (AREA)
- Bioinformatics & Computational Biology (AREA)
- Bioinformatics & Cheminformatics (AREA)
- General Engineering & Computer Science (AREA)
- Artificial Intelligence (AREA)
- Life Sciences & Earth Sciences (AREA)
- Evolutionary Biology (AREA)
- Health & Medical Sciences (AREA)
- General Health & Medical Sciences (AREA)
- Human Computer Interaction (AREA)
- Software Systems (AREA)
- Computational Linguistics (AREA)
- Television Signal Processing For Recording (AREA)
Abstract
The embodiments of the invention disclose a video clipping method, apparatus, and electronic device. The method comprises: obtaining a video image of a video to be clipped at each of multiple time points to form an image set; classifying all video images in the image set; and obtaining the video images of a chosen classification from the image set and generating a corresponding classification video from the video images of the chosen classification. Through the embodiments of the invention, the video clipping operation can be made convenient, personalized classification videos can be clipped, user experience is improved, and users' individualized requirements are met.
Description
Technical field
This application relates to the field of video processing, and in particular to a video clipping method, apparatus, and electronic device.
Background technique
Video clipping is non-linear editing of a video source: materials such as added pictures, background music, special effects, and scenes are remixed with the video, the video source is cut and merged, and, through secondary encoding, a new video with different expressive power is generated.
With the rapid development of the Internet, users share videos through applications on electronic devices. At present, the social platforms do not support videos of long duration, so if a video shot by a user is too long, the video must be cut. Moreover, people increasingly pursue personalization and prefer the videos they share to be distinctive, which also requires personalized cutting and editing of the video.
The normal practice of the current mainstream video editing software on the market is to trim the beginning and end of a video: the user chooses the video content segment between one time point and another, and the video and audio data of this segment are used to generate a new video file according to the container rules of a certain video format.
This method meets the basic need of users to cut videos, but it also has certain drawbacks. Editing software on a PC (Personal Computer) can apply the method repeatedly to choose and cut multiple segments, but on a mobile phone the operation cannot be too complicated because it is limited by the interaction design; most mobile software only supports selecting one continuous section of the video and cannot select an arbitrary number of segments, which limits the user's operations and results in a poor user experience.
Summary of the invention
The embodiments of the invention provide a video processing method, apparatus, and electronic device to improve user experience.
An embodiment of the invention provides a video clipping method, comprising:
obtaining a video image of a video to be clipped at each of multiple time points to form an image set;
classifying all video images in the image set;
obtaining the video images of a chosen classification from the image set, and generating a corresponding classification video from the video images of the chosen classification.
An embodiment of the invention also provides a video clipping apparatus, comprising:
an image set obtaining module, configured to obtain a video image of a video to be clipped at each of multiple time points to form an image set;
a classification module, configured to classify all video images in the image set;
a clipping module, configured to obtain the video images of a chosen classification from the image set and generate a corresponding classification video from the video images of the chosen classification.
An embodiment of the invention also provides an electronic device, comprising:
a processor; and
a memory for storing instructions executable by the processor;
wherein the processor is configured to perform the following operations:
obtaining a video image of a video to be clipped at each of multiple time points to form an image set;
classifying all video images in the image set;
obtaining the video images of a chosen classification from the image set, and generating a corresponding classification video from the video images of the chosen classification.
The embodiments of the invention include: obtaining a video image of a video to be clipped at each of multiple time points to form an image set; classifying all video images in the image set; and obtaining the video images of a chosen classification from the image set and generating a corresponding classification video from the video images of the chosen classification. Through the embodiments of the invention, the video clipping operation can be made convenient, personalized classification videos can be clipped, user experience is improved, and users' individualized requirements are met.
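The claimed steps can be illustrated with a minimal end-to-end sketch. Everything here is an assumption for illustration only: the one-second sampling, the stub classify() rule, and all function names are invented stand-ins, not the patent's implementation.

```python
# Minimal sketch of the claimed pipeline. Frames are represented by their
# timestamps in seconds; classify() is a stand-in for the deep learning
# classifier described later in the specification.

def build_image_set(duration, n=1):
    """Step 1: one sampling time point every n seconds."""
    return [t for t in range(0, duration, n)]

def classify(t):
    """Stub classifier: pretend even seconds show class 'A', odd ones 'B'."""
    return "A" if t % 2 == 0 else "B"

def clip_classification(duration, chosen):
    """Steps 2-3: classify every sampled frame, keep the chosen class."""
    image_set = build_image_set(duration)
    labeled = [(t, classify(t)) for t in image_set]
    return [t for t, label in labeled if label == chosen]
```

With the stub rule, clip_classification(6, "A") keeps the frames sampled at seconds 0, 2, and 4; a real system would replace classify() with the trained model of step 102.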
Other features and advantages of the invention will be set forth in the following description and will in part become apparent from the description or be understood by implementing the invention. The objectives and other advantages of the invention can be achieved and obtained by the structures particularly pointed out in the description, the claims, and the accompanying drawings.
Detailed description of the invention
The accompanying drawings are provided for a further understanding of the technical solution of the invention and constitute part of the specification; together with the embodiments of the application, they serve to explain the technical solution of the invention and do not constitute a limitation of it.
Fig. 1 is a flowchart of the video clipping method of an embodiment of the invention;
Fig. 2 is a schematic diagram of the video clipping method of an embodiment of the invention;
Fig. 3 is a flowchart of the video clipping method of an application example of the invention;
Fig. 4 is a schematic diagram of the GUI (Graphical User Interface) of the electronic device of an application example of the invention;
Fig. 5 is a schematic diagram of the GUI after the user selects classification A;
Fig. 6 is a schematic diagram of the user selecting clipping and obtaining a new video of classification A;
Fig. 7 is a schematic diagram of a video clipping apparatus of an embodiment of the invention;
Fig. 8 is a schematic diagram of a video clipping apparatus of another embodiment of the invention.
Specific embodiment
The embodiments of the invention are described in detail below with reference to the accompanying drawings. It should be noted that, in the absence of conflict, the embodiments of the application and the features in the embodiments may be combined with one another in any manner.
The steps shown in the flowcharts of the accompanying drawings may be executed in a computer system such as a set of computer-executable instructions. Moreover, although a logical order is shown in the flowcharts, in some cases the steps shown or described may be executed in an order different from the one herein.
With the rapid development of the Internet, users who share videos through applications on electronic devices want personalized ways of cutting and editing them. In addition, with the spread of monitoring equipment, massive amounts of video are produced, finding video content of interest in massive video becomes more and more difficult, and searching manually is time-consuming and laborious. The embodiments of the invention propose recognizing and classifying video content, so that the video of a classification of interest can then be extracted.
As shown in Fig. 1 and Fig. 2, the video clipping method of the embodiment of the invention comprises:
Step 101: obtaining a video image of the video to be clipped at each of multiple time points to form an image set.
Wherein, the video to be clipped may be a scene recorded by the user through a video recording application on the electronic device, may be a video the user obtained through other channels, such as by downloading from the network, or may be obtained by automatic recording on monitoring equipment.
The multiple time points may be consecutive time points, for example one every n seconds, where n is a number greater than 0 and may be set differently according to the time length of the video to be clipped. For example, in order to obtain as many video images as possible for image recognition, n may be set to 1, i.e., one video image is obtained every second.
The video image, also called a video frame image, may be a key frame (I frame), a P frame, or a B frame of the video, determined according to the compression of the video to be clipped. The video recorded by video recording software on an electronic device usually uses inter-frame compression, and the video image taken at a time point is usually a key frame image (I frame).
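The sampling of time points described here (one every n seconds, n > 0) can be sketched as follows; the helper name, and the idea of handing each point to a decoder such as OpenCV's VideoCapture to fetch the nearest decodable frame, are illustrative assumptions rather than part of the specification.

```python
def sample_points(duration_s, n=1.0):
    """Time points at which to grab a frame: one every n seconds (n > 0),
    as described for step 101. A decoder (e.g. OpenCV's VideoCapture)
    would then be asked to seek each point and return the nearest
    decodable frame, typically the key frame (I frame)."""
    if n <= 0:
        raise ValueError("n must be greater than 0")
    points, t = [], 0.0
    while t < duration_s:
        points.append(round(t, 3))  # keep timestamps readable
        t += n
    return points
```

For a 10-second video with n = 2.5, this yields the four time points 0.0, 2.5, 5.0, and 7.5.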
In one embodiment, after step 101, the method may further include: generating the thumbnail corresponding to each video image in the image set; and it may further include: displaying the thumbnail corresponding to each video image in the image set.
In an embodiment of the invention, the user can understand the content of the video to be clipped through the corresponding thumbnails. By generating a thumbnail corresponding to each of the multiple video frame images, storage space of the electronic device is saved when the thumbnails are displayed, the user is guaranteed to see accurate video content, and user experience is improved.
Step 102: classifying all video images in the image set.
In one embodiment, before the classification, the method may further include: preprocessing all video images in the image set.
Wherein, the preprocessing may include: inputting the image data of the video image, performing data type conversion on the image data, and performing data normalization and whitening.
Wherein, the image data may be input by image decoding or by reading the image file;
performing data type conversion on the image data means converting the type of the image data into a data type better suited to the classification algorithm, for example converting an int type (integer) into a float type (floating point);
for data normalization, common methods include simple rescaling, per-sample mean subtraction, and feature standardization (making all features in the data set have zero mean and unit variance). Simple rescaling readjusts the value of each dimension of the data so that the final data vector falls in the interval [0,1] or [-1,1]. For color images, the color channels have broadly similar statistics, so when processing color images, feature scaling is usually applied to the data: the pixel values obtained lie in the interval [0,255], and the common treatment is to divide these pixel values by 255 so that they are scaled into the interval [0,1];
whitening generally includes PCA (Principal Component Analysis) whitening and ZCA (Zero-phase Component Analysis) whitening. PCA whitening makes the variance of each dimension of the data equal to 1, while ZCA whitening makes the variances of all dimensions identical. PCA whitening can be used for dimensionality reduction as well as for decorrelation, whereas ZCA whitening is mainly used for decorrelation while keeping the whitened data as close as possible to the original input data. In PCA/ZCA whitening, the covariance matrix becomes the identity matrix; the features must first be zero-centered, and then a suitable epsilon (a regularization term with a low-pass filtering effect on the data) must be chosen; for color images, a sufficiently large epsilon is generally chosen for PCA/ZCA whitening.
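The rescaling and whitening steps above can be sketched with NumPy. This is a generic implementation of the textbook technique, not the patent's code; the eps default and the function names are assumptions.

```python
import numpy as np

def simple_rescale(pixels):
    """Simple rescaling: scale [0, 255] pixel values into [0, 1]."""
    return np.asarray(pixels, dtype=np.float64) / 255.0

def pca_whiten(X, eps=1e-5):
    """PCA whitening of an (n_samples, n_features) matrix: zero-center,
    rotate into the eigenbasis of the covariance, and rescale every
    dimension to (near) unit variance. eps is the regularization term
    described above."""
    X = X - X.mean(axis=0)                   # zero-center each feature
    cov = X.T @ X / X.shape[0]               # sample covariance
    eigvals, eigvecs = np.linalg.eigh(cov)   # ascending eigenvalues
    X_pca = (X @ eigvecs) / np.sqrt(eigvals + eps)
    # ZCA whitening rotates back so the result stays close to the input:
    X_zca = X_pca @ eigvecs.T
    return X_pca, X_zca
```

After whitening, the sample covariance of the output is (up to eps) the identity matrix, which is the property the text describes.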
In addition, in order to enlarge the data set for deep learning, image preprocessing may also include one or more of the following:
image flipping: performing a series of flips on the input picture, such as left-right flips, up-down flips, and diagonal flips, to expand the amount of data so that images of all angles are present, which can also alleviate recognition errors;
color transformation: for example, adjusting the brightness, contrast, saturation, and hue of the image.
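The flip and color transformations can be sketched on a toy image represented as nested lists of pixel values; the clamping range and function names are illustrative assumptions.

```python
def flip_horizontal(image):
    """Left-right flip: reverse each row of pixels."""
    return [row[::-1] for row in image]

def flip_vertical(image):
    """Up-down flip: reverse the order of the rows."""
    return image[::-1]

def adjust_brightness(image, delta):
    """Simple brightness shift, clamped to the valid [0, 255] range."""
    return [[max(0, min(255, p + delta)) for p in row] for row in image]

def augment(image):
    """One original image expands into several variants for training."""
    return [image, flip_horizontal(image), flip_vertical(image),
            adjust_brightness(image, 30)]
```

Each source image thus contributes four training samples; contrast, saturation, and hue adjustments would extend the list the same way.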
In the embodiments of the invention, all video images in the image set may be classified by a deep learning algorithm.
In an embodiment of the invention, an artificial intelligence method is introduced, and the frame images of the video are classified through deep learning. Deep learning is learning based on deep neural networks, a deep neural network being a neural network with multiple hidden layers. By constructing a machine learning model with many hidden layers and massive training data, more useful features are learned, ultimately improving the accuracy of classification or prediction. Wherein, feature learning transforms the feature representation of a sample in the original space, through layer-by-layer feature transformation, into a new feature space, making classification or prediction easier. Deep learning can automatically learn feature expressions, which may contain thousands of parameters, from big data. For example, the convolutional network model used by Hinton's research group in the 2012 ImageNet ILSVRC competition had a feature representation containing 60 million parameters learned from over a million samples.
The deep learning training process may include:
1) Bottom-up unsupervised learning, which is a feature learning process.
The parameters of each layer are trained in order using unlabeled data (or labeled data); this step can be regarded as an unsupervised training process and is what most distinguishes deep learning from a traditional neural network.
Wherein, the first layer is first trained with unlabeled or labeled data, learning the parameters of the first layer (this layer can be regarded as the hidden layer of a three-layer neural network that minimizes the difference between output and input);
after layer n-1 has been learned, the output of layer n-1 is used as the input of layer n, and layer n is trained, thereby obtaining the parameters of each layer.
2) Top-down supervised learning, which is a process of tuning the whole network.
Based on the parameters of each layer obtained in the first step, the network is trained with labeled data, errors are propagated top-down, and the parameters of the entire multi-layer network model are further fine-tuned; this step is a supervised training process.
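The bottom-up, layer-by-layer training of step 1) can be sketched with tied-weight linear autoencoders in NumPy. Using linear layers (rather than the deep nonlinear networks the text envisions) is a simplifying assumption purely to keep the sketch short; each layer is trained to minimize the difference between its reconstruction and its input, and the next layer is then trained on its output.

```python
import numpy as np

rng = np.random.default_rng(1)

def train_layer(data, hidden, steps=400, lr=0.005):
    """Greedily train one tied-weight linear autoencoder layer so that
    decoding its output reproduces its input (the 'smallest difference
    between output and input' criterion described above)."""
    n = len(data)
    W = rng.normal(scale=0.1, size=(data.shape[1], hidden))
    for _ in range(steps):
        err = data @ W @ W.T - data                      # reconstruction error
        grad = 2 * (data.T @ err @ W + err.T @ data @ W) / n
        W -= lr * grad                                   # gradient descent
    return W

def recon_loss(data, W):
    return float(((data @ W @ W.T - data) ** 2).mean())

X = rng.normal(size=(128, 16))
W1 = train_layer(X, 8)    # layer 1 is learned on the raw input
H1 = X @ W1               # layer 1's output becomes layer 2's input
W2 = train_layer(H1, 4)   # layer 2 is learned on layer 1's output
```

Step 2) would follow by fine-tuning all layers jointly with labeled data and backpropagated errors.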
The feature representations learned from ImageNet have very strong generalization ability and can be successfully applied to tasks on other data sets. The training sets of many applications are small; in this case, deep learning can be used in three ways:
(1) The model trained on ImageNet can be used as a starting point, and the target training set and backpropagation can be used to continue training it, adapting the model to the specific application; ImageNet plays the role of pre-training.
(2) If the target training set is not big enough, the network parameters of the lower layers can be fixed, keeping the result of training on ImageNet, and only the upper layers are updated. This is because the network parameters of the bottom layers are the hardest to update, and the bottom-layer filters learned from ImageNet usually describe a variety of different local edges and textures that generalize well to images in general.
(3) The model trained on ImageNet can be used directly, taking the output of the highest hidden layer as the feature representation instead of commonly hand-designed features.
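Method (2) — freezing the lower layers and updating only the upper layers — can be sketched in NumPy. A fixed random projection stands in for the filters pretrained on ImageNet; that substitution, and every name below, is an assumption purely for illustration, since a real system would load actual pretrained weights.

```python
import numpy as np

rng = np.random.default_rng(0)

# "Pretrained" lower layers: a frozen random projection standing in for
# bottom-layer filters learned on ImageNet (illustrative assumption).
W_frozen = rng.normal(size=(8, 4))

def features(x):
    return np.maximum(x @ W_frozen, 0.0)  # frozen ReLU features

X = rng.normal(size=(64, 8))
y = (X[:, 0] > 0).astype(float).reshape(-1, 1)
W_top = np.zeros((4, 1))  # only this top layer is trainable

def log_loss(W):
    p = 1 / (1 + np.exp(-(features(X) @ W)))
    return float(-np.mean(y * np.log(p + 1e-9) + (1 - y) * np.log(1 - p + 1e-9)))

loss_before = log_loss(W_top)
F = features(X)
for _ in range(200):  # gradient descent on the top layer only
    p = 1 / (1 + np.exp(-(F @ W_top)))
    W_top -= 0.1 * F.T @ (p - y) / len(X)
loss_after = log_loss(W_top)
```

W_frozen is never touched by the update loop, which is exactly the division of labor method (2) describes.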
Wherein, the ImageNet data set is currently the largest data set for image recognition in the world and one of the most widely applied in the field of deep learning for images. The ImageNet data set has more than 14 million pictures covering more than 20,000 categories, of which more than a million pictures are annotated with a specific category; the data set is maintained by dedicated staff, and the annotations are updated every year.
In one embodiment, after step 102, the method may further include: setting a classification marker for the video images of each classification. The classification marker can indicate the classification to which a video image belongs.
In one embodiment, after step 102, the method may further include: selecting one video image from the video images of each classification to generate a thumbnail.
In the embodiment of the invention, one video image may be selected from all video frame images of a classification as the video image of that classification, and the thumbnail corresponding to the video image of each classification is generated. The selection may follow a preset rule, for example selecting the first, a middle, or the last of the video frame images of each classification; a video image may also be selected at random from the video frame images of each classification.
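The preset selection rule can be sketched as follows; the (timestamp, label) frame representation and the function name are assumptions for illustration.

```python
def representative_frames(labeled_frames, rule="first"):
    """Pick one frame per classification for its thumbnail, using a
    preset rule: 'first', 'middle', or 'last' (random selection would
    work the same way)."""
    by_class = {}
    for timestamp, label in labeled_frames:
        by_class.setdefault(label, []).append(timestamp)
    pick = {"first": 0, "middle": None, "last": -1}[rule]
    return {label: frames[len(frames) // 2] if pick is None else frames[pick]
            for label, frames in by_class.items()}
```

Given frames at seconds 0, 2, 3 labeled "A" and 1, 4 labeled "B", the default rule picks second 0 for "A" and second 1 for "B".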
In one embodiment, the method may further include: displaying the thumbnail corresponding to each classification.
By displaying the thumbnail corresponding to each classification, the user can see which classifications of content the video to be clipped contains and can understand the video to be clipped from different sides, improving user experience.
Step 103: obtaining the video images of the chosen classification from the image set, and generating the corresponding classification video from the video images of the chosen classification.
In one embodiment, before step 103, the method may further include: receiving a first user instruction, the first user instruction being used to indicate the chosen classification.
Wherein, the first user instruction may be a selection instruction of the user; for example, the user may select a classification through an input tool such as a touch screen, a mouse, or a keyboard.
In one embodiment, obtaining the video images of the chosen classification from the image set comprises: obtaining, from the image set, the video images marked with the chosen classification.
Wherein, the video images of the chosen classification can be obtained according to the classification markers.
In one embodiment, after the video images of the chosen classification are obtained from the image set, the method may further include: generating the thumbnails corresponding to all video images of the chosen classification; and it may further include: displaying the thumbnails corresponding to all video images of the classification.
In an embodiment of the invention, the user can learn, through the thumbnails of all video images under the chosen classification, what content the selected classification contains, so the user can understand and check the video from the angle of the classification, enhancing the user's personalized experience.
In one embodiment, before the corresponding classification video is generated from the video images of the chosen classification, the method may further include: receiving a second user instruction, the second user instruction being used to indicate generating the classification video.
Wherein, the second user instruction may be a confirmation instruction of the user; upon receiving the user's confirmation instruction, generation of the classification video begins.
In one embodiment, generating the corresponding classification video from the video images of the chosen classification comprises: merging and encoding the video clips corresponding to the video images of the classification to generate the classification video.
Wherein, each video image corresponds to the video clip of its time point interval; for example, if in step 101 one video image is obtained every second, then each video image corresponds to a one-second video clip.
In this step, the video clips corresponding to the video images of the classification are merged and secondarily encoded in the video format generated as needed, generating the classification video.
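Merging the per-time-point clips of the chosen classification into maximal segments can be sketched as follows; in practice each (start, end) segment would then be cut, concatenated, and secondarily encoded with a tool such as ffmpeg, which this sketch deliberately leaves out.

```python
def merge_segments(timestamps, clip_len=1):
    """Merge the per-frame clips of the chosen classification into
    maximal (start, end) segments. Frames at seconds 3, 4, 5 and 9
    become the segments (3, 6) and (9, 10)."""
    segments = []
    for t in sorted(timestamps):
        if segments and segments[-1][1] == t:
            # contiguous with the previous segment: extend it
            segments[-1] = (segments[-1][0], t + clip_len)
        else:
            segments.append((t, t + clip_len))
    return segments
```

Producing whole segments rather than one-second slices keeps the number of cut points, and therefore the re-encoding work, to a minimum.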
Through the embodiments of the invention, the video clipping operation can be made convenient, personalized classification videos can be clipped, user experience is improved, and users' individualized requirements are met.
An application example is described below.
Fig. 3 shows the video clipping method of an application example of the invention, and Figs. 4-6 are schematic diagrams of the GUI of the electronic device of the application example of the invention.
As shown in Figs. 4-5, the GUI includes a video display area 401, a video frame thumbnail display area 402 of the video to be clipped, a classification thumbnail display area 403, a thumbnail display area 404 of all images under a classification, a current playback moment 405 of the video, a total duration 406 of the video, and a clip control 407.
Wherein, the video display area 401 is used for displaying the played video.
The video frame thumbnail display area 402 of the video to be clipped is used to show the thumbnails of the video frame images of the video to be clipped; the user can be shown all thumbnails completely by sliding them.
The classification thumbnail display area 403 is used to show the thumbnail corresponding to the images of each classification.
The thumbnail display area 404 of all images is used to show all thumbnails under a classification.
The current playback moment 405 is used to show the current progress moment during video playback.
The total duration 406 of the video is used to show the total duration of the video; after a classification has been selected, it shows the duration of the new video obtained by merging the video clips corresponding to all images of that classification.
Clip 407 is the clip button; after it is clicked, the video clips corresponding to all images of the classification are merged and encoded to generate the new classification video.
Referring to Fig. 3, the video clipping method of the application example of the invention comprises:
Step 301: obtaining the video to be clipped and starting to execute the clipping operation.
The executing subject of this application example may be a video clipping application in an electronic device.
Wherein, the application may be a software program running on the electronic device, and the electronic device may be a personal computer, a cloud device, or a mobile device such as a smartphone or a tablet computer.
Wherein, the video to be clipped is a video that needs to be clipped; it may be a scene recorded by the user through video recording software on the electronic device, or a video obtained by the user through other channels, such as downloaded from the network or obtained from monitoring equipment.
Wherein, the user triggers the clipping operation on the video to be clipped; the triggering manner may be selecting a video of any length from the file folder of the electronic device and importing it into the video clipping application, or selecting a video of any length and invoking its additional processing function such as clipping.
Step 302: obtaining a video image of the video to be clipped at each of multiple time points to form an image set.
In the embodiment of the invention, the user can obtain the content of the video to be clipped through the video images at each of the multiple time points. The obtained video frame image may be a key frame (I frame), a P frame, or a B frame of the video, determined by the background operating system of the video clipping application according to the compression of the video to be clipped; the video recorded by video recording software on an electronic device usually uses inter-frame compression, and the video image taken at a time point is usually a key frame image (I frame).
Wherein, the multiple time points may be consecutive time points, for example one every n seconds, where n is a number greater than 0 and may be set differently by the background operating system of the video clipping application according to the time length of the video to be clipped.
Step 310: for the obtained image set, generating the thumbnail corresponding to each video image in the image set.
In the embodiment of the invention, the user can understand the content of the video to be clipped through the corresponding thumbnails. By generating a thumbnail corresponding to each of the multiple video frame images, storage space of the electronic device is saved when the thumbnails are displayed, the user is guaranteed to see accurate video content, and user experience is improved.
Step 311: displaying all the thumbnails.
In the application example of the invention, the user can be shown all thumbnails completely by sliding them; the display region is the thumbnail display area 402 of the video to be clipped in Fig. 4.
Step 303: pre-process each video image in the image set obtained in step 302 (which may include converting the data type of the image data, and performing data normalization and whitening), in preparation for image recognition.
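As a minimal sketch of the pre-processing in step 303, the following standardizes one image: the raw (e.g. 8-bit integer) pixel values are cast to float, then normalized to zero mean and unit variance. Full whitening (PCA/ZCA decorrelation across a batch of images) is omitted here; the function name and flat-list representation are assumptions for the sketch:

```python
import math

def preprocess(pixels):
    """Standardize one image for recognition: data type conversion
    followed by data normalization (zero mean, unit variance)."""
    data = [float(p) for p in pixels]          # data type conversion
    mean = sum(data) / len(data)
    var = sum((p - mean) ** 2 for p in data) / len(data)
    std = math.sqrt(var) or 1.0                # guard against flat images
    return [(p - mean) / std for p in data]    # data normalization

normed = preprocess([0, 128, 255])
# the normalized values sum to ~0 and have unit variance
```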
Step 304: classify all video images in the image set by a deep learning algorithm, recognizing the video images into different categories.
Here, deep learning covers both feature learning and classification-model learning, and the concrete implementation differs with the application scenario.
For example, for surveillance video of a garage or a highway, feature extraction of the recognized images may target the license plate numbers of passing vehicles, with intelligent recognition and classification performed by plate number. License plate recognition consists mainly of three parts:
first, plate localization, generally using color-based localization, feature-based localization, and the like;
next, plate segmentation, generally using the projection method;
finally, character recognition, for which convolutional neural networks give relatively good results. The number of training samples is large, and the characters in the training set include all digits and letters as well as the Chinese characters of some province abbreviations. The trained network can then be used for character recognition.
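The projection method mentioned for plate segmentation can be sketched as follows: sum the ink in each column of the binarized plate image and cut wherever the vertical projection drops to zero. This is an illustrative sketch only (names, the 0/1 row-list representation, and the toy plate are assumptions); localization and the CNN character recognizer are not shown:

```python
def segment_by_projection(binary):
    """Split a binarized plate image (list of rows of 0/1) into
    character column spans using the vertical projection method."""
    cols = len(binary[0])
    # vertical projection: ink count per column
    proj = [sum(row[c] for row in binary) for c in range(cols)]
    spans, start = [], None
    for c, v in enumerate(proj):
        if v and start is None:
            start = c                      # character begins
        elif not v and start is not None:
            spans.append((start, c))       # character ends at blank column
            start = None
    if start is not None:
        spans.append((start, cols))        # character runs to the edge
    return spans

# Two 2-column "characters" separated by a blank column.
plate = [[1, 1, 0, 1, 1],
         [1, 0, 0, 0, 1]]
print(segment_by_projection(plate))  # [(0, 2), (3, 5)]
```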
As another example, for surveillance video of a public place such as a kindergarten, facial features are extracted from the recognized images, and faces are intelligently recognized and classified using a face recognition algorithm.
A deep learning model can learn a layered feature representation of face images, which may include:
the bottom layer, which learns filters from raw pixels that capture local edge and texture features;
the middle layers, whose filters, built by combining the various edge filters, describe different types of facial parts;
and the top layer, whose global features describe the entire face.
Deep learning thus provides a distributed feature representation: in the highest hidden layer, each neuron represents an attribute classifier, for example for gender, ethnicity, or hair color.
In the field of face recognition, Labeled Faces in the Wild (LFW), created in 2007, is a well-known face recognition test set. LFW collects face photos of more than 5,000 celebrities from the internet to evaluate the performance of face recognition algorithms under unconstrained conditions. The photos vary widely in lighting, expression, pose, age, occlusion, and so on. The LFW test set contains 6,000 image pairs: 3,000 positive pairs, in which the two images belong to the same person, and 3,000 negative pairs, in which the two images belong to different people. Random guessing achieves 50% accuracy; the classic Eigenface algorithm achieves only about 60% on this test set; among non-deep-learning algorithms the best recognition rate is 96.33%; deep learning currently reaches 99.47%.
Step 305: label the video images by category according to the image recognition results.
Step 312: generate the thumbnails of the video images corresponding to each category.
Here, one image may be selected from all video images of each category as the representative video image of that category, and the thumbnail corresponding to each category's representative video image generated.
In addition, since step 310 has already generated thumbnails for all video images, in this step the thumbnail for each category's video image can also be taken directly from the existing thumbnails.
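Steps 305 and 312 together can be sketched as follows: given the per-frame category labels and the per-frame thumbnails already produced in step 310, pick one frame per category (here, the earliest) as its representative thumbnail. All names are assumptions for this sketch:

```python
def thumbnails_by_category(labels, thumbs):
    """Map each category label to a representative thumbnail,
    taking the first frame seen for each category."""
    reps = {}
    for label, thumb in zip(labels, thumbs):
        reps.setdefault(label, thumb)  # keep the earliest frame per class
    return reps

labels = ["A", "B", "A", "C", "B"]
thumbs = ["t0", "t1", "t2", "t3", "t4"]
print(thumbnails_by_category(labels, thumbs))  # {'A': 't0', 'B': 't1', 'C': 't3'}
```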
Step 313: display the category thumbnails.
Referring to Fig. 4, the category-thumbnail display region is region 403 in Fig. 4. In this application example, the video images are divided into three categories: A, B, and C.
From the corresponding category thumbnails the user can see which categories of content the video to be clipped contains, allowing the user to understand the video from different perspectives and improving the user experience.
Step 306: receive a first user instruction selecting a category.
The first user instruction may be a selection instruction from the user. In this application example, the user selects a category from the category-thumbnail display region 403 of Fig. 4.
Step 307: extract all video images labeled with the selected category.
Step 314: generate the thumbnails corresponding to all video images of the selected category.
Since step 310 has already generated thumbnails for all video images, in this step all thumbnails of the selected category can also be taken directly from the existing thumbnails.
Step 315: display all thumbnails of the selected category.
In this application example, the display region is region 404 in Fig. 5, which shows the thumbnails of all images under the selected category; after the user selects category A, region 404 shows all thumbnails labeled A.
From the thumbnails of all images under a category, the user can see what content the selected category contains, allowing the user to understand and browse the video from the perspective of categories and enhancing the user's personalized experience.
Step 308: receive a second user instruction confirming the clip.
The second user instruction may be a confirmation instruction from the user. In this application example, the user can tap the clip icon 407 in Fig. 5 to confirm clipping all video segments under the selected category.
Step 309: merge and encode the video segments corresponding to all images of the selected category, generating a new category video.
As shown in Fig. 6, the user selects clipping and obtains the new video for category A.
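A minimal sketch of the merging logic in step 309: turn the per-time-point labels into the (start, end) second ranges to cut for one category, merging consecutive sampled points of that category into a single segment. The actual cut/concatenate/encode step (e.g. an ffmpeg-style pipeline) is outside this sketch, and all names are assumptions:

```python
def clip_ranges(labels, interval_s, category):
    """Return the (start, end) second ranges covering all sampled
    time points labeled with `category`, merging consecutive hits."""
    ranges, start = [], None
    for i, label in enumerate(labels + [None]):  # sentinel flushes the tail
        if label == category and start is None:
            start = i * interval_s               # segment begins
        elif label != category and start is not None:
            ranges.append((start, i * interval_s))  # segment ends
            start = None
    return ranges

# Frames sampled every 2 s, labeled A, A, B, A: category A gives two segments.
print(clip_ranges(["A", "A", "B", "A"], 2, "A"))  # [(0, 4), (6, 8)]
```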
The embodiment of the present invention also provides a video clipping device, which implements the above embodiments and implementation modes; descriptions already given are not repeated. As used below, the term "module" may refer to software, hardware, or a combination of the two that realizes a predetermined function. Although the devices described in the following embodiments can be realized in software, realizations in hardware, or in a combination of software and hardware, are also possible and contemplated.
As shown in Fig. 7, the video clipping device of the embodiment of the present invention comprises:
an image-set acquisition module 701 for obtaining a video image of the video to be clipped at each of multiple time points to form an image set;
a classification module 702 for classifying all video images in the image set; and
a clipping module 703 for obtaining, from the image set, the video images of a selected category and generating a corresponding category video according to the video images of the selected category.
Through the embodiment of the present invention, the video clipping operation is made convenient, personalized category videos can be clipped out, the user experience is improved, and users' personalized needs are met.
As shown in Fig. 8, in one embodiment, the device further comprises:
a first generation module 704 for generating a thumbnail corresponding to each video image in the image set.
In one embodiment, the device further comprises:
a first display module 705 for displaying the thumbnail corresponding to each video image in the image set.
In one embodiment, the classification module 702 is configured to:
classify all video images in the image set by a deep learning algorithm.
In one embodiment, the classification module 702 is further configured to pre-process all video images in the image set before they are classified by the deep learning algorithm.
In one embodiment, the classification module 702 is further configured to set a classification label on the video images of each category after all video images in the image set have been classified;
the clipping module is then configured to obtain, from the image set, the video images labeled with the selected category.
In one embodiment, the device further comprises:
a second generation module 706 for selecting one video image from the video images of each category and generating its thumbnail.
In one embodiment, the device further comprises:
a second display module 707 for displaying the thumbnail corresponding to each category.
In one embodiment, the device further comprises:
a first receiving module 708 for receiving a first user instruction, the first user instruction indicating the selected category.
In one embodiment, the device further comprises:
a third generation module 709 for generating the thumbnails corresponding to all video images of the selected category.
In one embodiment, the device further comprises:
a third display module 710 for displaying the thumbnails corresponding to all video images of the selected category.
In one embodiment, the device further comprises:
a second receiving module 711 for receiving a second user instruction, the second user instruction indicating generation of the category video.
In one embodiment, the clipping module 703 is configured to:
merge and encode the video segments corresponding to the video images of the category to generate the category video.
The embodiment of the present invention also provides an electronic device, which may be a PC, a cloud device, or a mobile device such as a smartphone or a tablet computer. The electronic device comprises:
a processor; and
a memory for storing instructions executable by the processor;
wherein the processor is configured to perform the following operations:
obtaining a video image of the video to be clipped at each of multiple time points to form an image set;
classifying all video images in the image set; and
obtaining, from the image set, the video images of a selected category, and generating a corresponding category video according to the video images of the selected category.
In one embodiment, the processor is configured to perform the following operations:
after obtaining the video image of the video to be clipped at each of the multiple time points and forming the image set, generating a thumbnail corresponding to each video image in the image set.
In one embodiment, the processor is configured to perform the following operations:
after generating the thumbnail corresponding to each video image in the image set, displaying the thumbnail corresponding to each video image in the image set.
In one embodiment, the processor is configured to perform the following operations:
classifying all video images in the image set by a deep learning algorithm.
In one embodiment, the processor is configured to perform the following operations:
before classifying all video images in the image set by the deep learning algorithm, pre-processing all video images in the image set.
In one embodiment, the processor is configured to perform the following operations:
inputting the image data of the video images, converting the data type of the image data, and performing data normalization and whitening on the image data.
In one embodiment, the processor is configured to perform the following operations:
after classifying all video images in the image set, setting a classification label on the video images of each category;
obtaining, from the image set, the video images labeled with the selected category.
In one embodiment, the processor is configured to perform the following operations:
after classifying all video images in the image set, selecting one video image from the video images of each category and generating its thumbnail.
In one embodiment, the processor is configured to perform the following operations:
after selecting one video image from the video images of each category and generating its thumbnail, displaying the thumbnail corresponding to each category.
In one embodiment, the processor is configured to perform the following operations:
before obtaining the video images of the selected category from the image set, receiving a first user instruction, the first user instruction indicating the selected category.
In one embodiment, the processor is configured to perform the following operations:
after obtaining the video images of the selected category from the image set, generating the thumbnails corresponding to all video images of the selected category.
In one embodiment, the processor is configured to perform the following operations:
after generating the thumbnails corresponding to all video images of the selected category, displaying the thumbnails corresponding to all video images of the selected category.
In one embodiment, the processor is configured to perform the following operations:
before generating the corresponding category video according to the video images of the selected category, receiving a second user instruction, the second user instruction indicating generation of the category video.
In one embodiment, the processor is configured to perform the following operations:
merging and encoding the video segments corresponding to the video images of the category to generate the category video.
The embodiment of the present invention also provides a computer-readable storage medium storing computer-executable instructions, the computer-executable instructions being used to execute the video clipping method.
In this embodiment, the storage medium may include, but is not limited to, a USB flash drive, a read-only memory (ROM), a random access memory (RAM), a removable hard disk, a magnetic disk, an optical disc, or any other medium that can store program code.
Obviously, those skilled in the art should understand that the modules or steps of the embodiments of the present invention described above can be realized by a general-purpose computing device; they can be concentrated on a single computing device or distributed over a network composed of multiple computing devices. Optionally, they can be realized by program code executable by a computing device, and thus can be stored in a storage device and executed by the computing device. In some cases, the steps shown or described can be performed in an order different from that given here, or the modules or steps can be fabricated as separate integrated circuit modules, or multiple of them can be fabricated as a single integrated circuit module. Thus, the embodiments of the present invention are not limited to any specific combination of hardware and software.
Although the embodiments disclosed herein are as above, their content is only for ease of understanding the present invention and is not intended to limit it. Any person skilled in the art of the present invention may make modifications and variations in the form and details of implementation without departing from the spirit and scope disclosed by the present invention, but the scope of patent protection of the present invention shall still be subject to the scope defined by the appended claims.
Claims (30)
1. A video clipping method, comprising:
obtaining a video image of a video to be clipped at each of multiple time points to form an image set;
classifying all video images in the image set; and
obtaining, from the image set, the video images of a selected category, and generating a corresponding category video according to the video images of the selected category.
2. The method of claim 1, wherein, after obtaining the video image of the video to be clipped at each of the multiple time points to form the image set, the method further comprises:
generating a thumbnail corresponding to each video image in the image set.
3. The method of claim 2, wherein, after generating the thumbnail corresponding to each video image in the image set, the method further comprises:
displaying the thumbnail corresponding to each video image in the image set.
4. The method of claim 1, wherein classifying all video images in the image set comprises:
classifying all video images in the image set by a deep learning algorithm.
5. The method of claim 4, wherein, before classifying all video images in the image set by the deep learning algorithm, the method further comprises:
pre-processing all video images in the image set.
6. The method of claim 5, wherein pre-processing all video images in the image set comprises:
inputting the image data of the video images, converting the data type of the image data, and performing data normalization and whitening on the image data.
7. The method of claim 1, wherein:
after classifying all video images in the image set, the method further comprises setting a classification label on the video images of each category; and
obtaining, from the image set, the video images of the selected category comprises obtaining, from the image set, the video images labeled with the selected category.
8. The method of claim 1, wherein, after classifying all video images in the image set, the method further comprises:
selecting one video image from the video images of each category and generating its thumbnail.
9. The method of claim 8, wherein, after selecting one video image from the video images of each category and generating its thumbnail, the method further comprises:
displaying the thumbnail corresponding to each category.
10. The method of claim 1, wherein, before obtaining the video images of the selected category from the image set, the method further comprises:
receiving a first user instruction, the first user instruction indicating the selected category.
11. The method of claim 1, wherein, after obtaining the video images of the selected category from the image set, the method further comprises:
generating the thumbnails corresponding to all video images of the selected category.
12. The method of claim 11, wherein, after generating the thumbnails corresponding to all video images of the selected category, the method further comprises:
displaying the thumbnails corresponding to all video images of the selected category.
13. The method of claim 1, wherein, before generating the corresponding category video according to the video images of the selected category, the method further comprises:
receiving a second user instruction, the second user instruction indicating generation of the category video.
14. The method of claim 1, wherein generating the corresponding category video according to the video images of the selected category comprises:
merging and encoding the video segments corresponding to the video images of the category to generate the category video.
15. The method of any one of claims 1 to 14, wherein the multiple time points are consecutive time points.
16. The method of any one of claims 1 to 14, wherein the video images are key frame images.
17. A video clipping device, comprising:
an image-set acquisition module for obtaining a video image of a video to be clipped at each of multiple time points to form an image set;
a classification module for classifying all video images in the image set; and
a clipping module for obtaining, from the image set, the video images of a selected category and generating a corresponding category video according to the video images of the selected category.
18. The device of claim 17, further comprising:
a first generation module for generating a thumbnail corresponding to each video image in the image set.
19. The device of claim 18, further comprising:
a first display module for displaying the thumbnail corresponding to each video image in the image set.
20. The device of claim 17, wherein the classification module is configured to:
classify all video images in the image set by a deep learning algorithm.
21. The device of claim 20, wherein:
the classification module is further configured to pre-process all video images in the image set before they are classified by the deep learning algorithm.
22. The device of claim 17, wherein:
the classification module is further configured to set a classification label on the video images of each category after all video images in the image set have been classified; and
the clipping module is configured to obtain, from the image set, the video images labeled with the selected category.
23. The device of claim 17, further comprising:
a second generation module for selecting one video image from the video images of each category and generating its thumbnail.
24. The device of claim 23, further comprising:
a second display module for displaying the thumbnail corresponding to each category.
25. The device of claim 17, further comprising:
a first receiving module for receiving a first user instruction, the first user instruction indicating the selected category.
26. The device of claim 17, further comprising:
a third generation module for generating the thumbnails corresponding to all video images of the selected category.
27. The device of claim 26, further comprising:
a third display module for displaying the thumbnails corresponding to all video images of the selected category.
28. The device of claim 17, further comprising:
a second receiving module for receiving a second user instruction, the second user instruction indicating generation of the category video.
29. The device of claim 17, wherein the clipping module is configured to:
merge and encode the video segments corresponding to the video images of the category to generate the category video.
30. An electronic device, comprising:
a processor; and
a memory for storing instructions executable by the processor;
wherein the processor is configured to perform the following operations:
obtaining a video image of a video to be clipped at each of multiple time points to form an image set;
classifying all video images in the image set; and
obtaining, from the image set, the video images of a selected category, and generating a corresponding category video according to the video images of the selected category.
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810308302.3A CN110351597A (en) | 2018-04-08 | 2018-04-08 | A kind of method, apparatus and electronic equipment of video clipping |
PCT/CN2019/081749 WO2019196795A1 (en) | 2018-04-08 | 2019-04-08 | Video editing method, device and electronic device |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810308302.3A CN110351597A (en) | 2018-04-08 | 2018-04-08 | A kind of method, apparatus and electronic equipment of video clipping |
Publications (1)
Publication Number | Publication Date |
---|---|
CN110351597A true CN110351597A (en) | 2019-10-18 |
Family
ID=68163482
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201810308302.3A Withdrawn CN110351597A (en) | 2018-04-08 | 2018-04-08 | A kind of method, apparatus and electronic equipment of video clipping |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN110351597A (en) |
WO (1) | WO2019196795A1 (en) |
Cited By (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110856037A (en) * | 2019-11-22 | 2020-02-28 | 北京金山云网络技术有限公司 | Video cover determination method and device, electronic equipment and readable storage medium |
CN110910470A (en) * | 2019-11-11 | 2020-03-24 | 广联达科技股份有限公司 | Method and device for generating high-quality thumbnail |
CN113395542A (en) * | 2020-10-26 | 2021-09-14 | 腾讯科技(深圳)有限公司 | Video generation method and device based on artificial intelligence, computer equipment and medium |
CN114302224A (en) * | 2021-12-23 | 2022-04-08 | 新华智云科技有限公司 | Intelligent video editing method, device, equipment and storage medium |
WO2022081081A1 (en) * | 2020-10-15 | 2022-04-21 | 脸萌有限公司 | Video distribution system and method, computing device, and user equipment |
CN117177006A (en) * | 2023-09-01 | 2023-12-05 | 湖南广播影视集团有限公司 | CNN algorithm-based short video intelligent manufacturing method |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN115460446A (en) * | 2022-08-19 | 2022-12-09 | 上海爱奇艺新媒体科技有限公司 | Alignment method and device for multiple paths of video signals and electronic equipment |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102819528A (en) * | 2011-06-10 | 2012-12-12 | 中国电信股份有限公司 | Method and device for generating video abstraction |
CN103079117A (en) * | 2012-12-30 | 2013-05-01 | 信帧电子技术(北京)有限公司 | Video abstract generation method and video abstract generation device |
CN106937120A (en) * | 2015-12-29 | 2017-07-07 | 北京大唐高鸿数据网络技术有限公司 | Object-based monitor video method for concentration |
CN107566907A (en) * | 2017-09-20 | 2018-01-09 | 广东欧珀移动通信有限公司 | video clipping method, device, storage medium and terminal |
Family Cites Families (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
TW520602B (en) * | 2001-06-28 | 2003-02-11 | Ulead Systems Inc | Device and method of editing video program |
US7954065B2 (en) * | 2006-12-22 | 2011-05-31 | Apple Inc. | Two-dimensional timeline display of media items |
CN105763884A (en) * | 2014-12-18 | 2016-07-13 | 广州市动景计算机科技有限公司 | Video processing method, device and apparatus |
CN104796781B (en) * | 2015-03-31 | 2019-01-18 | 小米科技有限责任公司 | Video clip extracting method and device |
CN107295377B (en) * | 2017-07-14 | 2020-12-01 | 程工 | Film production method, device and system |
- 2018-04-08: CN CN201810308302.3A patent CN110351597A (en), not active (Withdrawn)
- 2019-04-08: WO PCT/CN2019/081749 patent WO2019196795A1 (en), active (Application Filing)
Patent Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102819528A (en) * | 2011-06-10 | 2012-12-12 | 中国电信股份有限公司 | Method and device for generating video abstraction |
CN103079117A (en) * | 2012-12-30 | 2013-05-01 | 信帧电子技术(北京)有限公司 | Video abstract generation method and video abstract generation device |
CN106937120A (en) * | 2015-12-29 | 2017-07-07 | 北京大唐高鸿数据网络技术有限公司 | Object-based monitor video method for concentration |
CN107566907A (en) * | 2017-09-20 | 2018-01-09 | 广东欧珀移动通信有限公司 | video clipping method, device, storage medium and terminal |
Cited By (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110910470A (en) * | 2019-11-11 | 2020-03-24 | 广联达科技股份有限公司 | Method and device for generating high-quality thumbnail |
CN110910470B (en) * | 2019-11-11 | 2023-07-07 | 广联达科技股份有限公司 | Method and device for generating high-quality thumbnail |
CN110856037A (en) * | 2019-11-22 | 2020-02-28 | 北京金山云网络技术有限公司 | Video cover determination method and device, electronic equipment and readable storage medium |
WO2022081081A1 (en) * | 2020-10-15 | 2022-04-21 | 脸萌有限公司 | Video distribution system and method, computing device, and user equipment |
US11838576B2 (en) | 2020-10-15 | 2023-12-05 | Lemon Inc. | Video distribution system, method, computing device and user equipment |
CN113395542A (en) * | 2020-10-26 | 2021-09-14 | 腾讯科技(深圳)有限公司 | Video generation method and device based on artificial intelligence, computer equipment and medium |
CN114302224A (en) * | 2021-12-23 | 2022-04-08 | 新华智云科技有限公司 | Intelligent video editing method, device, equipment and storage medium |
CN114302224B (en) * | 2021-12-23 | 2023-04-07 | 新华智云科技有限公司 | Intelligent video editing method, device, equipment and storage medium |
CN117177006A (en) * | 2023-09-01 | 2023-12-05 | 湖南广播影视集团有限公司 | CNN algorithm-based short video intelligent manufacturing method |
Also Published As
Publication number | Publication date |
---|---|
WO2019196795A1 (en) | 2019-10-17 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110351597A (en) | A kind of method, apparatus and electronic equipment of video clipping | |
Deng et al. | Image aesthetic assessment: An experimental survey | |
Williams et al. | Images as data for social science research: An introduction to convolutional neural nets for image classification | |
Botha et al. | Fake news and deepfakes: A dangerous threat for 21st century information security | |
Karayev et al. | Recognizing image style | |
KR102290419B1 (en) | Method and Appratus For Creating Photo Story based on Visual Context Analysis of Digital Contents | |
Höferlin et al. | Inter-active learning of ad-hoc classifiers for video visual analytics | |
CN110889672B (en) | Student card punching and class taking state detection system based on deep learning | |
US8548249B2 (en) | Information processing apparatus, information processing method, and program | |
JP2011215963A (en) | Electronic apparatus, image processing method, and program | |
Li et al. | Videography-based unconstrained video analysis | |
Li et al. | Data-driven affective filtering for images and videos | |
Zhang et al. | A comprehensive survey on computational aesthetic evaluation of visual art images: Metrics and challenges | |
Vonikakis et al. | A probabilistic approach to people-centric photo selection and sequencing | |
Ismail et al. | Deepfake video detection: YOLO-Face convolution recurrent approach | |
Yang et al. | A comprehensive survey on image aesthetic quality assessment | |
Verma et al. | Age prediction using image dataset using machine learning | |
Somaini | On the altered states of machine vision: Trevor Paglen, Hito Steyerl, Grégory Chatonsky | |
CN106657817A (en) | Processing method applied to mobile phone platform for automatically making album MV | |
CN112040273B (en) | Video synthesis method and device | |
Dao et al. | Robust event discovery from photo collections using Signature Image Bases (SIBs) | |
Debnath et al. | Computational approaches to aesthetic quality assessment of digital photographs: state of the art and future research directives | |
Yang et al. | Learning the synthesizability of dynamic texture samples | |
Tian et al. | Relative aesthetic quality ranking | |
Sasireka | Comparative analysis on video retrieval technique using machine learning |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
WW01 | Invention patent application withdrawn after publication | Application publication date: 20191018 ||