CN110347876A - Video classification methods, device, terminal device and computer readable storage medium - Google Patents
- Publication number
- CN110347876A (application CN201910628108.8A)
- Authority
- CN
- China
- Prior art keywords
- face
- video
- video frame
- classification
- feature
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/70—Information retrieval; Database structures therefor; File system structures therefor of video data
- G06F16/71—Indexing; Data structures therefor; Storage structures
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/70—Information retrieval; Database structures therefor; File system structures therefor of video data
- G06F16/75—Clustering; Classification
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/0002—Inspection of images, e.g. flaw detection
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/40—Scenes; Scene-specific elements in video content
- G06V20/41—Higher-level, semantic clustering, classification or understanding of video scenes, e.g. detection, labelling or Markovian modelling of sport events or news items
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/40—Scenes; Scene-specific elements in video content
- G06V20/46—Extracting features or characteristics from the video content, e.g. video fingerprints, representative shots or key frames
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
- G06V40/16—Human faces, e.g. facial parts, sketches or expressions
- G06V40/161—Detection; Localisation; Normalisation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
- G06V40/16—Human faces, e.g. facial parts, sketches or expressions
- G06V40/168—Feature extraction; Face representation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
- G06V40/16—Human faces, e.g. facial parts, sketches or expressions
- G06V40/172—Classification, e.g. identification
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/30—Subject of image; Context of image processing
- G06T2207/30168—Image quality inspection
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/30—Subject of image; Context of image processing
- G06T2207/30196—Human being; Person
- G06T2207/30201—Face
Abstract
The application belongs to the technical field of video classification and provides a video classification method, apparatus, terminal device and computer-readable storage medium, comprising: performing an image-quality screening operation on a video file to obtain video frames whose image quality meets requirements; performing a face-detection operation on the video frames to determine face information in each frame, the face information including the number and positions of faces; cropping out the corresponding number of faces according to the face number and positions, and extracting a representative face feature from each face; and classifying the faces according to the representative face features. This method greatly increases the speed at which representative face features are extracted from video, and because the faces are classified according to the representative features obtained, it also improves the accuracy of the classification results.
Description
Technical field
The application belongs to the technical field of video classification, and in particular relates to a video classification method, apparatus, terminal device and computer-readable storage medium.
Background technique
Person classification is a subtask of video classification: given a video clip, the video is classified according to the persons the clip contains.
Current mainstream video classification algorithms are based either on long short-term memory networks (Long Short-Term Memory, LSTM) or on 3D convolutional neural networks (3D Convolutional Neural Networks, 3D-CNN). Classical LSTM-based schemes usually combine an LSTM with a CNN, exploiting both the LSTM's strength in temporal feature extraction and the CNN's strength in spatial feature extraction, while a 3D-CNN is derived by extending a 2D CNN. Both approaches perform well on video classification tasks, but the models they use are so complex that they are difficult to deploy directly in a production environment. The 3D-CNN scheme in particular expands the trainable parameters of the convolutional network by one dimension, causing the number of parameters to increase sharply; this large volume of network parameters makes network training excessively complex and consumes large amounts of computing resources.
A new method is therefore needed to solve the above technical problems.
Summary of the invention
In view of this, the embodiments of the present application provide a video classification method, apparatus, terminal device and computer-readable storage medium, to solve the prior-art problem that the faces in a video file are difficult to classify quickly.
The first aspect of the embodiments of the present application provides a video classification method, comprising:
performing an image-quality screening operation on a video file to obtain video frames whose image quality meets requirements;
performing a face-detection operation on the video frames to determine face information in each frame, the face information including the number and positions of faces;
cropping out the corresponding number of faces according to the face number and positions, and extracting a representative face feature from each face;
classifying the faces according to the representative face features.
The second aspect of the embodiments of the present application provides a video classification apparatus, comprising:
an image-quality screening unit, configured to perform an image-quality screening operation on a video file to obtain video frames whose image quality meets requirements;
a face-detection unit, configured to perform a face-detection operation on the video frames and determine face information in each frame, the face information including the number and positions of faces;
a representative-feature extraction unit, configured to crop out the corresponding number of faces according to the face number and positions, and to extract a representative face feature from each face;
a face classification unit, configured to classify the faces according to the representative face features.
The third aspect of the embodiments of the present application provides a terminal device comprising a memory, a processor, and a computer program stored in the memory and runnable on the processor, wherein the processor, when executing the computer program, implements the steps of the method of the first aspect.
The fourth aspect of the embodiments of the present application provides a computer-readable storage medium storing a computer program which, when executed by a processor, implements the steps of the method of the first aspect.
Compared with the prior art, the embodiments of the present application have the following beneficial effects: because the image-quality screening operation is performed on the video file first, redundant video frames are effectively reduced; and because face detection and cropping are performed on the screened frames, representative face features can be determined quickly from the cropped faces. Classification based on these representative features therefore requires no complex model, which greatly increases the speed of face classification; and because the faces are classified according to the representative features obtained, the accuracy of the classification results is also improved.
Brief description of the drawings
In order to explain the technical solutions in the embodiments of the present application more clearly, the drawings needed in the description of the embodiments or of the prior art are briefly introduced below. Obviously, the drawings described below show only some embodiments of the present application; those of ordinary skill in the art can obtain other drawings from them without creative effort.
Fig. 1 is a schematic flowchart of a first video classification method provided by an embodiment of the present application;
Fig. 2 is a schematic flowchart of a second video classification method provided by an embodiment of the present application;
Fig. 3 is a schematic structural diagram of a video classification apparatus provided by an embodiment of the present application;
Fig. 4 is a schematic diagram of a terminal device provided by an embodiment of the present application.
Specific embodiment
In the following description, specific details such as particular system structures and techniques are set forth for the purpose of illustration rather than limitation, so as to provide a thorough understanding of the embodiments of the present application. However, it will be clear to those skilled in the art that the application may also be implemented in other embodiments without these specific details. In other instances, detailed descriptions of well-known systems, devices, circuits and methods are omitted, so as not to obscure the description of the present application with unnecessary detail.
In order to illustrate the technical solutions described herein, specific embodiments are described below.
It should be appreciated that, when used in this specification and the appended claims, the term "comprising" indicates the presence of the described features, wholes, steps, operations, elements and/or components, but does not preclude the presence or addition of one or more other features, wholes, steps, operations, elements, components and/or sets thereof.
It should also be understood that the terminology used in this specification is for the purpose of describing particular embodiments only and is not intended to limit the application. As used in this specification and the appended claims, the singular forms "a", "an" and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise.
It should be further understood that the term "and/or" used in this specification and the appended claims refers to and encompasses any and all possible combinations of one or more of the associated listed items.
As used in this specification and the appended claims, the term "if" may be interpreted, depending on the context, as "when" or "once" or "in response to determining" or "in response to detecting". Similarly, the phrases "if it is determined" or "if [a described condition or event] is detected" may be interpreted, depending on the context, as "once it is determined" or "in response to determining" or "once [the described condition or event] is detected" or "in response to detecting [the described condition or event]".
In particular implementations, the terminal device described in the embodiments of the present application includes, but is not limited to, portable devices such as mobile phones, laptop computers or tablet computers with touch-sensitive surfaces (for example, touch-screen displays and/or touch pads). It should also be understood that in some embodiments the device is not a portable communication device but a desktop computer with a touch-sensitive surface (for example, a touch-screen display and/or a touch pad).
In the following discussion, a terminal device including a display and a touch-sensitive surface is described. It should be understood, however, that the terminal device may include one or more other physical user-interface devices, such as a physical keyboard, a mouse and/or a joystick.
The terminal device supports various applications, such as one or more of the following: a drawing application, a presentation application, a word-processing application, a website creation application, a disc burning application, a spreadsheet application, a game application, a telephone application, a video conferencing application, an email application, an instant messaging application, an exercise support application, a photo management application, a digital camera application, a digital video camera application, a web browsing application, a digital music player application and/or a video player application.
The various applications that can be executed on the terminal device may use at least one common physical user-interface device, such as the touch-sensitive surface. One or more functions of the touch-sensitive surface and the corresponding information displayed on the terminal may be adjusted and/or changed between applications and/or within a corresponding application. In this way, a common physical architecture of the terminal (for example, the touch-sensitive surface) can support various applications with user interfaces that are intuitive and transparent to the user.
Embodiment:
Fig. 1 shows a flowchart of a first video classification method provided by an embodiment of the present application. In this embodiment the faces contained in a video file are classified; the details are as follows:
Step S11: perform an image-quality screening operation on a video file to obtain video frames whose image quality meets requirements.
Image-quality screening of a video frame includes: judging whether the edges of foreground objects meet requirements, whether the noise of the frame meets requirements, whether the brightness meets requirements, whether the dispersion of grey values meets requirements, and so on.
In this embodiment, any one or several of the image-quality screening operations listed above (or screening operations not listed above) may be used to screen the frames of the video file; no limitation is imposed here. Because a preliminary screening operation is first performed on the video file, the number of frames to be processed subsequently can be effectively reduced.
In some embodiments, because the key frames of a video file contain more information, in order to further reduce redundant frames, step S11 is specifically: performing the image-quality screening operation on the key frames of the video file to obtain the video frames whose image quality meets requirements. In this embodiment, because the screening is performed on key frames, the frames that pass the screening are also key frames.
Step S12: perform a face-detection operation on the video frames to determine face information in each frame, the face information including the number and positions of faces.
Specifically, a face detector is used to detect the positions and number of faces in a frame. For example, each detected face is framed at its position (for instance with a rectangle); if a frame contains two or more faces, each detected face is framed separately.
Step S13: crop out the corresponding number of faces according to the face number and positions, and extract a representative face feature from each face.
A representative face feature is the representative part of a face's features. In this step, because representative features are extracted only from the cropped faces rather than from the whole frame, the speed of representative-feature extraction is increased.
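The cropping in step S13 can be sketched with array slicing; the point is that the feature extractor then sees only the small face crops, never the full frame. The feature extractor itself is left abstract here:

```python
import numpy as np

# Sketch of step S13's cropping, assuming the frame is an H x W grayscale
# array and each face position is an (x, y, w, h) rectangle from step S12.
# Features are then extracted from these crops only, which is what speeds
# the extraction up relative to processing whole frames.

def crop_faces(frame, boxes):
    return [frame[y:y + h, x:x + w] for (x, y, w, h) in boxes]

frame = np.zeros((100, 100))
crops = crop_faces(frame, [(10, 20, 30, 40), (50, 60, 20, 20)])
# Each crop has shape (h, w): (40, 30) and (20, 20) here.
```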
Step S14: classify the faces according to the representative face features.
Specifically, the faces are classified according to the representative features obtained and a classifier (such as a support vector machine), thereby achieving a semantic understanding of the frames.
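The patent names a support vector machine only as one possible classifier. The sketch below substitutes a dependency-free nearest-centroid rule to illustrate the same compare-and-assign step; the class names and vectors are invented for the example:

```python
import numpy as np

# Step S14 sketch: assign a representative face feature to a class.
# A nearest-centroid rule stands in for the SVM mentioned in the text;
# it shows the same "compare feature against known classes" step.

def classify(feature, class_reps):
    names = list(class_reps)
    dists = [float(np.linalg.norm(feature - class_reps[n])) for n in names]
    return names[dists.index(min(dists))]

reps = {"person_a": np.array([1.0, 0.0]), "person_b": np.array([0.0, 1.0])}
label = classify(np.array([0.9, 0.1]), reps)
```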
In the embodiments of the present application, an image-quality screening operation is performed on a video file to obtain video frames whose image quality meets requirements; a face-detection operation is performed on those frames to determine face information, including the number and positions of faces; the corresponding number of faces is cropped out and a representative feature is extracted from each; and the faces are classified according to the representative features. Because the video file is screened first, redundant frames are effectively reduced; and because detection and cropping are performed on the screened frames, representative features can be determined quickly from the cropped faces. Classification based on these features therefore requires no complex model, which greatly increases the speed of face classification; and because the faces are classified according to the representative features obtained, the accuracy of the classification results is also improved.
To further improve the accuracy of the extracted representative features, Fig. 2 shows a flowchart of a second video classification method provided by an embodiment of the present application. In this embodiment, a face-alignment operation must also be performed after the faces are cropped out, and the representative features are extracted only from the aligned faces. Steps S21 and S22 are identical to steps S11 and S12 above and are not described again here:
Step S21: perform an image-quality screening operation on a video file to obtain video frames whose image quality meets requirements.
Step S22: perform a face-detection operation on the video frames to determine face information in each frame, the face information including the number and positions of faces.
Step S23: crop out the corresponding number of faces according to the face number and positions, and perform an alignment operation on the cropped faces.
The alignment operation of this step means performing operations such as translation or rotation on the cropped faces, so that the faces afterwards meet a preset requirement, for example facing the user frontally.
In some embodiments, because of the movement of persons and the like, the angle at which a face faces the user may vary, while usually only frontal faces are wanted. In this case the alignment operation is specifically translating and/or rotating the face so that, after the operation, it faces the user frontally. To carry out the alignment, the face information in the frame is arranged to also include face anchor points, and step S23 comprises: cropping out the corresponding number of faces according to the face number and positions, then rotating each cropped face according to its anchor points to correct the face into a frontal face; the frontal face is taken as the aligned face.
In this embodiment, the face anchor points identify fixed features of the face, such as the eye corners, mouth corners and nose. During face alignment, the face is first cropped out of the whole image using its bounding box, then rotated and corrected into a frontal face according to the anchor points, so as to filter out profile or tilted faces.
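One common way to realize the rotation correction from anchor points, used here purely as an illustration since the patent does not fix a specific algorithm, is to compute the in-plane angle of the line joining the two eye anchors; rotating the crop by its negative levels the eyes:

```python
import math

# Alignment sketch for step S23, assuming the face anchor points include
# the two eye corners. The angle of the line joining them gives the
# in-plane tilt; rotating the crop by -angle levels the eyes, which
# approximates correcting a tilted (not profile) face toward frontal.

def roll_angle(left_eye, right_eye):
    dx = right_eye[0] - left_eye[0]
    dy = right_eye[1] - left_eye[1]
    return math.degrees(math.atan2(dy, dx))  # 0.0 means already level

# Right eye sits 10 px lower (larger y) than the left eye, 40 px apart.
angle = roll_angle((30, 40), (70, 50))
```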
Step S24: extract the face features of the aligned faces, cluster the extracted features, perform feature pooling on each resulting cluster, and obtain the representative face feature under each class.
Specifically, a face feature is a group of abstract numbers, and clustering means grouping the output face features. For example, suppose there are 10 face features, i.e. 10 vectors of 100 dimensions, and after a clustering algorithm runs, these 10 vectors are divided into three groups A, B and C, with groups A and B each containing 3 vectors and group C containing 4. Within each group, taking group A as an example, the 3 vectors of group A are mean-pooled, i.e. the vectors are added together and divided by 3. The centre vector of group A obtained in this way serves as a representative face feature; the other groups obtain their representative features by the same process.
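The worked example above (10 vectors of 100 dimensions, clustered into groups A, B and C of sizes 3, 3 and 4, then mean-pooled) can be sketched as follows; the clustering itself is assumed to have been done by any clustering algorithm, so only its group labels appear here:

```python
import numpy as np

# Step S24 sketch, mirroring the worked example: 10 face-feature vectors
# of 100 dimensions, already split by some clustering algorithm into
# groups A (3 vectors), B (3) and C (4). Mean pooling a group (add its
# vectors, divide by the group size) gives the group's centre vector,
# which is taken as that face's representative feature.

def mean_pool(features, labels):
    feats = np.asarray(features, dtype=float)
    return {g: feats[[i for i, lab in enumerate(labels) if lab == g]].mean(axis=0)
            for g in set(labels)}

rng = np.random.default_rng(0)
feats = rng.normal(size=(10, 100))
labels = ["A"] * 3 + ["B"] * 3 + ["C"] * 4
reps = mean_pool(feats, labels)   # one representative feature per group
```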
Step S25: input the representative face features into a pre-trained classifier to obtain the face class corresponding to each representative feature.
In this step the classifier assigns faces whose representative features are similar or identical to the same class. For example, when the representative feature of one face is identical to that of another face, the classifier model will judge that the two faces belong to the same class.
In this embodiment, because the representative features are extracted only after the alignment operation has been performed on the faces, it is guaranteed that all representative features are extracted from the same angle, which improves their accuracy.
In some embodiments, in order to determine quickly which frames meet the image-quality requirements, step S21 (or step S11) comprises:
A1: taking the video frame as the unit, calculate at least one of the following parameter values for each frame of the video file: the image mean, the image standard deviation and the image average gradient.
The image mean is the average value of the image's pixels; the image mean of a frame reflects its average brightness, and the greater the average brightness, the better the image quality. For example, suppose the image mean of a frame F is to be calculated and the image size is (M, N), where M is the number of rows of the image matrix and N the number of columns; then the image mean u of the frame F is calculated as:

u = (1 / (M·N)) · Σ_{i=1..M} Σ_{j=1..N} F(i, j)
The image standard deviation is the dispersion of the pixel grey values relative to the mean; the larger the standard deviation, the more dispersed the grey-value distribution in the image and the better the image quality. Suppose the image standard deviation of the frame F is to be calculated and the image size is (M, N), where M is the number of rows and N the number of columns; then the image standard deviation std of the frame F is calculated as:

std = sqrt( (1 / (M·N)) · Σ_{i=1..M} Σ_{j=1..N} (F(i, j) − u)² )
The image average gradient reflects the detail contrast and texture variation in the image and, to a certain extent, its sharpness. The image average gradient ∇G is calculated as:

∇G = (1 / ((M−1)·(N−1))) · Σ_{i=1..M−1} Σ_{j=1..N−1} sqrt( (Δx F(i, j)² + Δy F(i, j)²) / 2 )

where Δx F(i, j) and Δy F(i, j) denote the first-order differences of the pixel (i, j) in the x and y directions respectively.
A2: if each calculated parameter value is greater than or equal to the corresponding preset parameter threshold, determine that the image quality of the corresponding frame meets the requirements. The preset parameter thresholds include at least one of: an image-mean threshold, an image-standard-deviation threshold and an image-average-gradient threshold.
In this step, if the only parameter calculated for a frame is the image mean, it is only judged whether the image mean is greater than or equal to the preset image-mean threshold; if so, the frame's image quality is determined to meet the requirements. If the calculated parameters are the image mean and the image standard deviation, it must be judged both whether the image mean is greater than or equal to the preset image-mean threshold and whether the standard deviation is greater than or equal to the preset standard-deviation threshold; only if both hold is the frame's image quality determined to meet the requirements. The judgment rule for other combinations of parameters is similar and is not described again here.
As can be seen from the formulas above for the image mean, image standard deviation and image average gradient, the computational complexity of the screening performed in the embodiments of the present application is low, so the qualifying video frames can be determined quickly.
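Under the formulas above, the A1/A2 screening reduces to a few array operations. The sketch below uses illustrative threshold values, since the patent does not give numeric ones:

```python
import numpy as np

# Quality-screening sketch: the three low-cost statistics from step A1,
# computed for an M x N grayscale frame, then compared against preset
# thresholds as in step A2. Threshold values are illustrative only.

def frame_stats(F):
    F = np.asarray(F, dtype=float)
    u = F.mean()                       # image mean (average brightness)
    std = F.std()                      # grey-value dispersion
    gx = np.diff(F, axis=1)[:-1, :]    # horizontal first differences
    gy = np.diff(F, axis=0)[:, :-1]    # vertical first differences
    avg_grad = np.sqrt((gx ** 2 + gy ** 2) / 2).mean()
    return u, std, avg_grad

def passes(F, u_min=30.0, std_min=10.0, grad_min=1.0):
    u, std, g = frame_stats(F)
    return bool(u >= u_min and std >= std_min and g >= grad_min)
```

A flat grey frame fails (no dispersion, no gradient), while a frame with a strong horizontal ramp passes all three checks.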
In some embodiments, because a corresponding video tag can be obtained after classification, a specified operation can be executed on other specified files according to the video tag. In this case, after step S14 the method comprises: executing a specified operation on specified files according to the video tags obtained after classification, the specified operation including retrieval and management. In this embodiment, because a retrieval operation is executed on specified files according to the video tags obtained after classification, intelligent retrieval of the content of video files is achieved.
In some embodiments, if the specified operation is management, executing the specified operation on specified files according to the video tags obtained after classification comprises: receiving custom information entered by the user, categorizing the video tags obtained after classification according to the custom information, and executing a categorization operation on the specified files according to the categorization results. The custom information may be, for example, family, friend or colleague. After the video tags have been categorized by the custom information, the corresponding categorization operation can be carried out on the specified files according to the categorized tags; for example, when a specified file includes the video tag "family", the file is classified as a video file of the "family" category.
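This management operation can be sketched as a grouping of files by user-defined category; all tag, category and file names below are invented for the example:

```python
# Management-operation sketch: the user's custom information ("family",
# "friend", "colleague" in the text) maps video tags onto categories,
# and the specified files are then grouped by the resulting category.

def categorize(file_tags, custom_map):
    """file_tags: {filename: video tag}; custom_map: {tag: user category}."""
    groups = {}
    for name, tag in file_tags.items():
        groups.setdefault(custom_map.get(tag, "other"), []).append(name)
    return groups

groups = categorize(
    {"v1.mp4": "mom", "v2.mp4": "boss", "v3.mp4": "dad"},
    {"mom": "family", "dad": "family", "boss": "colleague"},
)
```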
In practice, when displaying a stored video file, a terminal device (such as a mobile phone) usually selects the first frame of the video file as its cover. However, the first frame usually contains little information: for example, when the video records a person, the first frame may show only the person's back, or some other scenery. The user therefore has difficulty obtaining useful information from the displayed cover.
In order to let the user quickly learn what a stored video file records, in some embodiments the method comprises, after step S14: selecting one frame from the video frames after face classification as the cover frame of the video file.
In this step, any frame may be selected from the classified frames as the cover frame, i.e. the frame displayed for the video file. Because the cover frame is selected from the frames after face classification, it is guaranteed to contain face information, and faces carry a relatively large amount of information, so the user can quickly obtain more information from the cover frame.
Further, because a video file usually contains multiple faces whose importance to the file differs, in order to select the frame containing the more important face as the cover frame, selecting one frame from the frames after face classification as the cover frame comprises:
B1: count the number of frames each class contains.
If the video file contains two or more faces, the faces are divided into the corresponding classes according to the steps above, i.e. the frames under one class (or category) contain the same face. The more frames a class contains, the larger its proportion of the video file and the greater the importance of the corresponding face.
B2: select one frame from the class containing the largest number of frames as the cover frame of the video file.
In this embodiment, because the cover frame is selected from the class containing the largest number of frames, it is ensured that the cover frame of the video file carries more information.
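Steps B1 and B2 together can be sketched as a frequency count followed by a pick from the winning class:

```python
from collections import Counter

# Cover-frame sketch for steps B1/B2: count how many frames each face
# class contains, then take a frame from the largest class, so the cover
# shows the face with the biggest share of the video.

def pick_cover(frame_classes):
    """frame_classes: {frame index: face class label}."""
    top_class = Counter(frame_classes.values()).most_common(1)[0][0]   # B1
    return next(f for f, c in frame_classes.items() if c == top_class) # B2

cover = pick_cover({0: "a", 1: "b", 2: "a", 3: "a", 4: "b"})
```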
In some embodiments, the user wishes to sort the captured video files, i.e., to place the video files containing the same face into the same folder. In this case, after step S14, the method comprises:
storing the video files in which faces of the same class appear into the same folder.
In this embodiment, after the faces in multiple video files have been classified, the video files containing faces of the same class can be stored in the same folder. For example, after the faces in multiple video files are classified, different folders are created, and the video files containing faces of the same class are stored in the same folder.
It should be understood that the sequence numbers of the steps in the above embodiments do not imply an execution order; the execution order of each process should be determined by its function and internal logic, and should not constitute any limitation on the implementation process of the embodiments of the present application.
Fig. 3 shows a structural schematic diagram of a video classification apparatus provided by an embodiment of the present application. The video classification apparatus is applied to a terminal device; for ease of description, only the parts relevant to the embodiment of the present application are shown.
The video classification apparatus comprises: an image quality screening unit 31, a face detection unit 32, a face representative feature extraction unit 33, and a face classification unit 34. Wherein:
the image quality screening unit 31 is configured to perform an image quality screening operation on a video file to obtain video frames whose image quality meets requirements.
Wherein, screening video frames by image quality includes: judging whether the edges of foreground objects meet requirements, whether the noise of the video frame meets requirements, whether the brightness values meet requirements, whether the dispersion of the gray values meets requirements, and so on.
In some embodiments, since the key frames of a video file contain more information, in order to further reduce redundant video frames, the image quality screening unit 31 is specifically configured to:
perform the image quality screening operation on the key frames of the video file to obtain the video frames whose image quality meets requirements.
The face detection unit 32 is configured to perform a face detection operation on the video frames to determine the face information in the video frames, the face information including the number and positions of the faces in each video frame.
The face representative feature extraction unit 33 is configured to crop out the corresponding number of faces according to the face number and positions, and to extract the representative feature of each face.
The face classification unit 34 is configured to classify the faces according to their representative features.
In the embodiment of the present application, since the image quality screening operation is performed on the video file first, redundant video frames can be effectively reduced. Moreover, by performing operations such as face detection and cropping on the screened video frames, the face representative features can be quickly determined from the cropped faces. Therefore, when classifying faces according to their representative features, no complicated model is required, which greatly improves the speed of face classification; and since the faces are classified according to the obtained representative features, the accuracy of the classification results is improved.
In order to further improve the accuracy of the extracted face representative features, the face representative feature extraction unit 33 comprises:
a face alignment module, configured to crop out the corresponding number of faces according to the face number and positions, and to perform an alignment operation on the cropped faces;
a face feature clustering module, configured to extract the face features of the aligned faces, cluster the extracted face features, and perform feature pooling on each resulting cluster to obtain the face representative feature of the corresponding class.
Accordingly, the face classification unit 34 is specifically configured to:
input the face representative features into a pre-trained classifier model to obtain the face class corresponding to each face representative feature.
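The clustering-and-pooling step of the face feature clustering module can be sketched as follows. The patent does not fix a clustering algorithm or a pooling operator; the greedy cosine-similarity scheme, the threshold value, and mean pooling below are illustrative assumptions.

```python
import numpy as np

def cluster_and_pool(features, sim_threshold=0.8):
    """Greedily cluster L2-normalised face feature vectors by cosine
    similarity, then average-pool (feature pooling) each cluster to obtain
    the face representative feature of the corresponding class.
    """
    feats = np.asarray(features, dtype=float)
    feats = feats / np.linalg.norm(feats, axis=1, keepdims=True)
    labels = -np.ones(len(feats), dtype=int)
    centroids = []                     # first member of each cluster (unit norm)
    for i, f in enumerate(feats):
        for k, c in enumerate(centroids):
            if f @ c >= sim_threshold:  # cosine similarity of unit vectors
                labels[i] = k
                break
        if labels[i] == -1:            # no close cluster: start a new one
            labels[i] = len(centroids)
            centroids.append(f.copy())
    # Feature pooling: the mean feature of each cluster is its representative.
    reps = {k: feats[labels == k].mean(axis=0) for k in set(labels.tolist())}
    return labels, reps
```

The representative features in `reps` are what would then be fed to the pre-trained classifier model.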
In some embodiments, the face information in the video frame further includes face landmark points. In this case, the face alignment module is specifically configured to:
crop out the corresponding number of faces according to the face number and positions, and then rotate the cropped faces according to the face landmark points so as to normalize the faces into frontal faces, the frontal faces serving as the aligned faces.
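One plausible form of this landmark-based rotation (the patent only says the cropped face is rotated by its landmark points until it is upright) is to level the inter-eye line. The sketch below, an assumption rather than the patent's method, computes a 2x3 affine matrix of the same form OpenCV's `warpAffine` consumes.

```python
import numpy as np

def eye_alignment_matrix(left_eye, right_eye):
    """Build an affine matrix that rotates about the eye midpoint so the
    line between the two eye landmarks becomes horizontal."""
    lx, ly = left_eye
    rx, ry = right_eye
    angle = np.arctan2(ry - ly, rx - lx)         # tilt of the inter-eye line
    cx, cy = (lx + rx) / 2.0, (ly + ry) / 2.0    # rotate about the midpoint
    c, s = np.cos(-angle), np.sin(-angle)        # rotate by -angle to level it
    # Rotation about (cx, cy) expressed as a 2x3 affine matrix.
    return np.array([[c, -s, cx - c * cx + s * cy],
                     [s,  c, cy - s * cx - c * cy]])
```

Applying the matrix to the landmarks themselves (as homogeneous points) maps both eyes to the same y-coordinate, confirming the face is levelled.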
In some embodiments, in order to quickly determine the video frames whose image quality meets requirements, the image quality screening unit 31 comprises:
a video frame parameter calculation module, configured to calculate, on a per-frame basis, at least one of the following parameter values of the video frames of the video file: image mean, image standard deviation, and image average gradient;
an image quality judgment module, configured to determine that the image quality of the corresponding video frame meets requirements if the calculated parameter values are all greater than or equal to the preset parameter thresholds, the preset parameter thresholds including at least one of the following: an image mean threshold, an image standard deviation threshold, and an image average gradient threshold.
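The three per-frame parameters and the threshold test can be sketched as follows for a grayscale frame. The gradient operator (mean magnitude of horizontal/vertical finite differences) and the threshold values are illustrative assumptions; the patent fixes neither.

```python
import numpy as np

def quality_metrics(gray):
    """Return (image mean, image standard deviation, image average gradient)
    for a grayscale frame, using finite differences for the gradient."""
    gray = np.asarray(gray, dtype=float)
    gx = np.abs(np.diff(gray, axis=1)).mean()   # horizontal differences
    gy = np.abs(np.diff(gray, axis=0)).mean()   # vertical differences
    return gray.mean(), gray.std(), (gx + gy) / 2.0

def quality_ok(gray, mean_t=40.0, std_t=10.0, grad_t=2.0):
    """A frame passes when every computed parameter is greater than or equal
    to its preset threshold (threshold values here are placeholders)."""
    m, s, g = quality_metrics(gray)
    return m >= mean_t and s >= std_t and g >= grad_t
```

A uniformly black frame fails all three tests, while a high-contrast frame passes: the mean rejects under-exposed frames, the standard deviation rejects flat low-contrast frames, and the average gradient rejects blurred frames.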
In some embodiments, since a video label of the corresponding face can be obtained after classification, a specified operation can be performed on designated files according to the video labels. In this case, the video classification apparatus 3 further comprises:
a specified operation execution unit, configured to perform a specified operation on designated files according to the video labels obtained after classification, the specified operation including retrieval and management.
In this embodiment, since a retrieval operation is performed on designated files according to the video labels obtained after classification, intelligent retrieval of the file content of video files is realized.
In some embodiments, if the specified operation is management, the specified operation execution unit is specifically configured to:
receive custom information input by the user, categorize the video labels obtained after classification according to the custom information, and perform a categorization operation on the designated files according to the categorization results.
Wherein, the custom information may be, for example, family member, friend, or colleague. After the video labels are categorized by the custom information, the corresponding categorization operation can be performed on the designated files according to the categorized video labels; for example, when a designated file includes the video label "family member", the designated file is classified as a video file of the "family member" category.
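The management operation described above can be sketched as follows. The data shapes (a file-to-label mapping and a user-supplied label-to-category mapping) are illustrative assumptions.

```python
def categorize_files(file_labels, custom_map):
    """Group designated files by the user's custom information: each video
    label obtained after classification is mapped to a custom category
    (e.g. "family member"), and files are collected per category.

    file_labels: mapping from file name to its video label (hypothetical);
    custom_map: user-supplied mapping from video label to custom category.
    """
    categories = {}
    for filename, label in file_labels.items():
        category = custom_map.get(label, "other")   # unlabelled -> "other"
        categories.setdefault(category, []).append(filename)
    return categories
```

The "other" fallback for labels the user did not categorize is a design choice of this sketch, not something the patent specifies.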
In order to allow the user to quickly learn the shot content of stored video files, in some embodiments, the video classification apparatus 3 further comprises:
an arbitrary frame selection unit, configured to select a video frame from the video frames after face classification as the cover frame of the video file.
Further, since a video file generally contains multiple faces, and different faces differ in their importance to the video file, in order to select the video frame containing the more important face as the cover frame of the video file, the arbitrary frame selection unit comprises:
a video frame counting module, configured to count the number of video frames included in each class;
an important video frame selection module, configured to select one video frame from the class containing the largest number of video frames as the cover frame of the video file.
In some embodiments, the user wishes to sort the captured video files, i.e., to place the video files containing the same face into the same folder. In this case, the video classification apparatus 3 further comprises:
a same-class face sorting unit, configured to store the video files in which faces of the same class appear into the same folder.
In this embodiment, after the faces in multiple video files have been classified, the video files containing faces of the same class can be stored in the same folder.
Fig. 4 is a schematic diagram of a terminal device provided by an embodiment of the present application. As shown in Fig. 4, the terminal device 4 of this embodiment comprises: a processor 40, a memory 41, and a computer program 42 stored in the memory 41 and executable on the processor 40. When executing the computer program 42, the processor 40 implements the steps in each of the above method embodiments, such as steps S11 to S14 shown in Fig. 1. Alternatively, when executing the computer program 42, the processor 40 implements the functions of each module/unit in each of the above apparatus embodiments, such as the functions of modules 31 to 34 shown in Fig. 3.
Illustratively, the computer program 42 may be divided into one or more modules/units, which are stored in the memory 41 and executed by the processor 40 to complete the present application. The one or more modules/units may be a series of computer program instruction segments capable of completing specific functions, the instruction segments being used to describe the execution process of the computer program 42 in the terminal device 4. For example, the computer program 42 may be divided into an image quality screening unit, a face detection unit, a face representative feature extraction unit, and a face classification unit, with the specific functions of each unit as follows:
the image quality screening unit is configured to perform an image quality screening operation on a video file to obtain video frames whose image quality meets requirements;
the face detection unit is configured to perform a face detection operation on the video frames to determine the face information in the video frames, the face information including the number and positions of the faces;
the face representative feature extraction unit is configured to crop out the corresponding number of faces according to the face number and positions, and to extract the representative feature of each face;
the face classification unit is configured to classify the faces according to their representative features.
The terminal device 4 may be a computing device such as a desktop computer, a notebook computer, a palmtop computer, or a cloud server. The terminal device may include, but is not limited to, the processor 40 and the memory 41. Those skilled in the art will understand that Fig. 4 is only an example of the terminal device 4 and does not constitute a limitation on the terminal device 4, which may include more or fewer components than shown, or combine certain components, or include different components; for example, the terminal device may also include input/output devices, network access devices, buses, and the like.
The so-called processor 40 may be a central processing unit (CPU), or another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, discrete hardware components, etc. The general-purpose processor may be a microprocessor, or the processor may be any conventional processor.
The memory 41 may be an internal storage unit of the terminal device 4, such as a hard disk or internal memory of the terminal device 4. The memory 41 may also be an external storage device of the terminal device 4, such as a plug-in hard disk, a smart media card (SMC), a secure digital (SD) card, or a flash card equipped on the terminal device 4. Further, the memory 41 may include both an internal storage unit and an external storage device of the terminal device 4. The memory 41 is used to store the computer program and other programs and data required by the terminal device. The memory 41 may also be used to temporarily store data that has been output or is to be output.
Those skilled in the art can clearly understand that, for convenience and brevity of description, the division into the above functional units and modules is illustrated only by example. In practical applications, the above functions may be allocated to different functional units or modules as needed, i.e., the internal structure of the apparatus may be divided into different functional units or modules to complete all or part of the functions described above. The functional units in the embodiments may be integrated into one processing unit, or each unit may exist physically alone, or two or more units may be integrated into one unit; the integrated unit may be implemented in the form of hardware or in the form of a software functional unit. In addition, the specific names of the functional units and modules are only for the convenience of distinguishing them from each other, and are not intended to limit the protection scope of the present application. For the specific working processes of the units and modules in the above system, reference may be made to the corresponding processes in the foregoing method embodiments, which will not be repeated here.
In the above embodiments, each embodiment is described with its own emphasis; for parts not detailed in one embodiment, reference may be made to the relevant descriptions of other embodiments.
Those of ordinary skill in the art will appreciate that the units and algorithm steps described in connection with the embodiments disclosed herein can be implemented by electronic hardware, or by a combination of computer software and electronic hardware. Whether these functions are implemented in hardware or software depends on the specific application and design constraints of the technical solution. Skilled artisans may use different methods to implement the described functions for each specific application, but such implementations should not be considered beyond the scope of the present application.
In the embodiments provided in the present application, it should be understood that the disclosed apparatus/terminal device and method may be implemented in other ways. For example, the apparatus/terminal device embodiments described above are merely illustrative; the division of the modules or units is only a logical function division, and there may be other division manners in actual implementation, e.g., multiple units or components may be combined or integrated into another system, or some features may be ignored or not executed. In addition, the mutual couplings or direct couplings or communication connections shown or discussed may be indirect couplings or communication connections through some interfaces, apparatuses, or units, and may be electrical, mechanical, or in other forms.
The units described as separate components may or may not be physically separated, and the components shown as units may or may not be physical units, i.e., they may be located in one place or distributed across multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, each functional unit in each embodiment of the present application may be integrated into one processing unit, or each unit may exist physically alone, or two or more units may be integrated into one unit. The above integrated unit may be implemented in the form of hardware or in the form of a software functional unit.
If the integrated module/unit is implemented in the form of a software functional unit and sold or used as an independent product, it may be stored in a computer-readable storage medium. Based on this understanding, all or part of the processes in the methods of the above embodiments of the present application may also be completed by instructing relevant hardware through a computer program. The computer program may be stored in a computer-readable storage medium, and when executed by a processor, the computer program can implement the steps of each of the above method embodiments. The computer program includes computer program code, which may be in source code form, object code form, executable file form, or certain intermediate forms. The computer-readable medium may include: any entity or apparatus capable of carrying the computer program code, a recording medium, a USB flash drive, a removable hard disk, a magnetic disk, an optical disk, a computer memory, a read-only memory (ROM), a random access memory (RAM), an electric carrier signal, a telecommunication signal, a software distribution medium, etc. It should be noted that the content contained in the computer-readable medium may be appropriately increased or decreased according to the requirements of legislation and patent practice in a jurisdiction; for example, in certain jurisdictions, according to legislation and patent practice, the computer-readable medium does not include electric carrier signals and telecommunication signals.
The above embodiments are only used to illustrate the technical solutions of the present application, not to limit them. Although the present application has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that the technical solutions described in the foregoing embodiments can still be modified, or some of their technical features can be equivalently replaced; such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present application, and should all be included within the protection scope of the present application.
Claims (10)
1. A video classification method, characterized by comprising:
performing an image quality screening operation on a video file to obtain video frames whose image quality meets requirements;
performing a face detection operation on the video frames to determine the face information in the video frames, the face information including the number and positions of the faces;
cropping out the corresponding number of faces according to the face number and positions, and extracting a representative feature of each face;
classifying the faces according to their representative features.
2. The video classification method according to claim 1, characterized in that the cropping out the corresponding number of faces according to the face number and positions and extracting a representative feature of each face comprises:
cropping out the corresponding number of faces according to the face number and positions, and performing an alignment operation on the cropped faces;
extracting the face features of the aligned faces, clustering the extracted face features, and performing feature pooling on each resulting cluster to obtain the face representative feature of the corresponding class;
accordingly, the classifying the faces according to their representative features is specifically:
inputting the face representative features into a pre-trained classifier to obtain the face class corresponding to each face representative feature.
3. The video classification method according to claim 2, characterized in that the face information in the video frame further includes face landmark points, and the cropping out the corresponding number of faces according to the face number and positions and performing an alignment operation on the cropped faces specifically comprises:
cropping out the corresponding number of faces according to the face number and positions, and then rotating the cropped faces according to the face landmark points so as to normalize the faces into frontal faces, the frontal faces serving as the aligned faces.
4. The video classification method according to claim 1, characterized in that the performing an image quality screening operation on a video file to obtain video frames whose image quality meets requirements comprises:
calculating, on a per-frame basis, at least one of the following parameter values of the video frames of the video file: image mean, image standard deviation, and image average gradient;
if the calculated parameter values are all greater than or equal to preset parameter thresholds, determining that the image quality of the corresponding video frame meets requirements, the preset parameter thresholds including at least one of the following: an image mean threshold, an image standard deviation threshold, and an image average gradient threshold.
5. The video classification method according to claim 1, characterized in that after the classifying the faces according to their representative features, the method comprises:
performing a specified operation on designated files according to the video labels obtained after classification, the specified operation including retrieval and management.
6. The video classification method according to claim 5, characterized in that if the specified operation is management, the performing a specified operation on designated files according to the video labels obtained after classification comprises:
receiving custom information input by a user, and categorizing the video labels obtained after classification according to the custom information;
performing a categorization operation on the designated files according to the categorization results.
7. The video classification method according to any one of claims 1 to 6, characterized in that after the classifying the faces according to their representative features, the method comprises:
storing the video files in which faces of the same class appear into the same folder.
8. A video classification apparatus, characterized by comprising:
an image quality screening unit, configured to perform an image quality screening operation on a video file to obtain video frames whose image quality meets requirements;
a face detection unit, configured to perform a face detection operation on the video frames to determine the face information in the video frames, the face information including the number and positions of the faces;
a face representative feature extraction unit, configured to crop out the corresponding number of faces according to the face number and positions, and to extract a representative feature of each face;
a face classification unit, configured to classify the faces according to their representative features.
9. A terminal device, comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, characterized in that the processor, when executing the computer program, implements the steps of the method according to any one of claims 1 to 7.
10. A computer-readable storage medium storing a computer program, characterized in that the computer program, when executed by a processor, implements the steps of the method according to any one of claims 1 to 7.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910628108.8A CN110347876A (en) | 2019-07-12 | 2019-07-12 | Video classification methods, device, terminal device and computer readable storage medium |
Publications (1)
Publication Number | Publication Date |
---|---|
CN110347876A true CN110347876A (en) | 2019-10-18 |
Family
ID=68175903
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910628108.8A Pending CN110347876A (en) | 2019-07-12 | 2019-07-12 | Video classification methods, device, terminal device and computer readable storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110347876A (en) |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104408429A (en) * | 2014-11-28 | 2015-03-11 | 北京奇艺世纪科技有限公司 | Method and device for extracting representative frame of video |
US20170039419A1 (en) * | 2015-08-05 | 2017-02-09 | Canon Kabushiki Kaisha | Information processing apparatus and control method of the same |
CN107644213A (en) * | 2017-09-26 | 2018-01-30 | 司马大大(北京)智能系统有限公司 | Video person extraction method and device |
CN109151501A (en) * | 2018-10-09 | 2019-01-04 | 北京周同科技有限公司 | A kind of video key frame extracting method, device, terminal device and storage medium |
Non-Patent Citations (1)
Title |
---|
都伊林: "《智能安防新发展与应用》", 31 May 2018 * |
Cited By (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111177086A (en) * | 2019-12-27 | 2020-05-19 | Oppo广东移动通信有限公司 | File clustering method and device, storage medium and electronic equipment |
CN111553191A (en) * | 2020-03-30 | 2020-08-18 | 深圳壹账通智能科技有限公司 | Video classification method and device based on face recognition and storage medium |
CN111881755A (en) * | 2020-06-28 | 2020-11-03 | 腾讯科技(深圳)有限公司 | Method and device for cutting video frame sequence |
CN111881755B (en) * | 2020-06-28 | 2022-08-23 | 腾讯科技(深圳)有限公司 | Method and device for cutting video frame sequence |
CN112101154A (en) * | 2020-09-02 | 2020-12-18 | 腾讯科技(深圳)有限公司 | Video classification method and device, computer equipment and storage medium |
CN112101154B (en) * | 2020-09-02 | 2023-12-15 | 腾讯科技(深圳)有限公司 | Video classification method, apparatus, computer device and storage medium |
CN113705386A (en) * | 2021-08-12 | 2021-11-26 | 北京有竹居网络技术有限公司 | Video classification method and device, readable medium and electronic equipment |
CN115205768A (en) * | 2022-09-16 | 2022-10-18 | 山东百盟信息技术有限公司 | Video classification method based on resolution self-adaptive network |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110347876A (en) | Video classification methods, device, terminal device and computer readable storage medium | |
Ma et al. | Pyramidal feature shrinking for salient object detection | |
CN105095902B (en) | Picture feature extracting method and device | |
CN109583449A (en) | Character identifying method and Related product | |
CN109215037A (en) | Destination image partition method, device and terminal device | |
CN108109152A (en) | Medical Images Classification and dividing method and device | |
CN111325271B (en) | Image classification method and device | |
CN109376645A (en) | A kind of face image data preferred method, device and terminal device | |
CN108228844A (en) | A kind of picture screening technique and device, storage medium, computer equipment | |
CN108961267A (en) | Image processing method, picture processing unit and terminal device | |
CN110232318A (en) | Acupuncture point recognition methods, device, electronic equipment and storage medium | |
CN109739223A (en) | Robot obstacle-avoiding control method, device and terminal device | |
CN111126347B (en) | Human eye state identification method, device, terminal and readable storage medium | |
CN109165316A (en) | A kind of method for processing video frequency, video index method, device and terminal device | |
Madan et al. | Synthetically trained icon proposals for parsing and summarizing infographics | |
CN108898082A (en) | Image processing method, picture processing unit and terminal device | |
CN110751218A (en) | Image classification method, image classification device and terminal equipment | |
CN107517312A (en) | A kind of wallpaper switching method, device and terminal device | |
CN109214333A (en) | Convolutional neural networks structure, face character recognition methods, device and terminal device | |
WO2024016812A1 (en) | Microscopic image processing method and apparatus, computer device, and storage medium | |
CN108133020A (en) | Video classification methods, device, storage medium and electronic equipment | |
CN108932703A (en) | Image processing method, picture processing unit and terminal device | |
CN110263741A (en) | Video frame extraction method, apparatus and terminal device | |
Cambuim et al. | An efficient static gesture recognizer embedded system based on ELM pattern recognition algorithm | |
CN108985215A (en) | A kind of image processing method, picture processing unit and terminal device |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | ||
Application publication date: 20191018 |