CN110347876A - Video classification methods, device, terminal device and computer readable storage medium - Google Patents


Info

Publication number
CN110347876A
Authority
CN
China
Prior art keywords
face
video
video frame
classification
feature
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910628108.8A
Other languages
Chinese (zh)
Inventor
康健
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangdong Oppo Mobile Telecommunications Corp Ltd
Original Assignee
Guangdong Oppo Mobile Telecommunications Corp Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangdong Oppo Mobile Telecommunications Corp Ltd filed Critical Guangdong Oppo Mobile Telecommunications Corp Ltd
Priority to CN201910628108.8A priority Critical patent/CN110347876A/en
Publication of CN110347876A publication Critical patent/CN110347876A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/70Information retrieval; Database structures therefor; File system structures therefor of video data
    • G06F16/71Indexing; Data structures therefor; Storage structures
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/70Information retrieval; Database structures therefor; File system structures therefor of video data
    • G06F16/75Clustering; Classification
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/0002Inspection of images, e.g. flaw detection
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/40Scenes; Scene-specific elements in video content
    • G06V20/41Higher-level, semantic clustering, classification or understanding of video scenes, e.g. detection, labelling or Markovian modelling of sport events or news items
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/40Scenes; Scene-specific elements in video content
    • G06V20/46Extracting features or characteristics from the video content, e.g. video fingerprints, representative shots or key frames
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16Human faces, e.g. facial parts, sketches or expressions
    • G06V40/161Detection; Localisation; Normalisation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16Human faces, e.g. facial parts, sketches or expressions
    • G06V40/168Feature extraction; Face representation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16Human faces, e.g. facial parts, sketches or expressions
    • G06V40/172Classification, e.g. identification
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/30Subject of image; Context of image processing
    • G06T2207/30168Image quality inspection
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/30Subject of image; Context of image processing
    • G06T2207/30196Human being; Person
    • G06T2207/30201Face

Abstract

The present application is applicable to the technical field of video classification and provides a video classification method, an apparatus, a terminal device, and a computer-readable storage medium. The method comprises: performing an image quality screening operation on a video file to obtain video frames whose image quality meets a requirement; performing a face detection operation on the video frames to determine face information in the video frames, the face information including the number and positions of faces; cropping out the corresponding number of faces according to the number and positions of faces, and extracting representative face features from the cropped faces; and classifying the faces according to the representative face features. The method greatly increases the speed at which representative face features are extracted from video, and because the faces are classified according to the representative features obtained, the accuracy of the classification result is also improved.

Description

Video classification method, apparatus, terminal device and computer-readable storage medium
Technical field
The present application belongs to the technical field of video classification, and in particular relates to a video classification method, an apparatus, a terminal device, and a computer-readable storage medium.
Background
Person classification is a subtask of video classification: given a video clip, the video is classified according to the persons it contains.
The current mainstream video classification algorithms are based either on long short-term memory networks (Long Short-Term Memory, LSTM) or on 3D convolutional neural networks (3D Convolutional Neural Networks, 3D-CNN). Classical LSTM-based schemes usually combine an LSTM model with a CNN model, making full use of both the temporal feature extraction of the LSTM and the spatial feature extraction of the CNN. 3D-CNNs, in turn, are derived by extending 2D CNNs. Both approaches perform well on video classification tasks, but because the models used are so complex they are difficult to apply directly in a production environment. This is especially true of the 3D-CNN scheme, which extends the trainable parameters of the convolutional network by one dimension, sharply increasing the number of parameters; the large number of network parameters in turn makes network training excessively complex and consumes a great deal of computing resources.
Therefore, it is desirable to provide a new method to solve the above technical problems.
Summary of the invention
In view of this, embodiments of the present application provide a video classification method, an apparatus, a terminal device, and a computer-readable storage medium, to solve the problem in the prior art that it is difficult to classify the faces in a video file quickly.
A first aspect of the embodiments of the present application provides a video classification method, comprising:
performing an image quality screening operation on a video file to obtain video frames whose image quality meets a requirement;
performing a face detection operation on the video frames to determine face information in the video frames, the face information including the number and positions of faces;
cropping out the corresponding number of faces according to the number and positions of faces, and extracting representative face features from the faces; and
classifying the faces according to the representative face features.
A second aspect of the embodiments of the present application provides a video classification apparatus, comprising:
an image quality screening unit, configured to perform an image quality screening operation on a video file to obtain video frames whose image quality meets a requirement;
a face detection unit, configured to perform a face detection operation on the video frames and determine face information in the video frames, the face information including the number and positions of faces;
a representative face feature extraction unit, configured to crop out the corresponding number of faces according to the number and positions of faces, and extract representative face features from the faces; and
a face classification unit, configured to classify the faces according to the representative face features.
A third aspect of the embodiments of the present application provides a terminal device, comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor, when executing the computer program, implements the steps of the method according to the first aspect.
A fourth aspect of the embodiments of the present application provides a computer-readable storage medium storing a computer program, wherein the computer program, when executed by a processor, implements the steps of the method according to the first aspect.
Compared with the prior art, the embodiments of the present application have the following beneficial effects:
Because the image quality screening operation is performed on the video file first, redundant video frames are effectively reduced; and because the face detection and cropping operations are performed on the screened video frames, representative face features can be determined quickly from the cropped faces. Classifying the faces according to these representative features therefore requires no complex model, which greatly increases the speed of face classification; and because the faces are classified according to the representative features obtained, the accuracy of the classification result is also improved.
Brief description of the drawings
To explain the technical solutions in the embodiments of the present application more clearly, the accompanying drawings needed in the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below show only some embodiments of the present application; for those of ordinary skill in the art, other drawings can be obtained from these drawings without creative effort.
Fig. 1 is a schematic flowchart of a first video classification method provided by an embodiment of the present application;
Fig. 2 is a schematic flowchart of a second video classification method provided by an embodiment of the present application;
Fig. 3 is a schematic structural diagram of a video classification apparatus provided by an embodiment of the present application;
Fig. 4 is a schematic diagram of a terminal device provided by an embodiment of the present application.
Detailed description of the embodiments
In the following description, specific details such as particular system structures and techniques are set forth for purposes of illustration rather than limitation, so as to provide a thorough understanding of the embodiments of the present application. However, it will be apparent to those skilled in the art that the present application may also be implemented in other embodiments without these specific details. In other instances, detailed descriptions of well-known systems, apparatuses, circuits and methods are omitted so as not to obscure the description of the present application with unnecessary detail.
To illustrate the technical solutions described herein, specific embodiments are described below.
It should be understood that, when used in this specification and the appended claims, the term "comprising" indicates the presence of the described features, integers, steps, operations, elements and/or components, but does not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components and/or sets thereof.
It should also be understood that the terminology used in this specification is for the purpose of describing particular embodiments only and is not intended to limit the application. As used in this specification and the appended claims, the singular forms "a", "an" and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise.
It should be further understood that the term "and/or" used in this specification and the appended claims refers to and encompasses any and all possible combinations of one or more of the associated listed items.
As used in this specification and the appended claims, the term "if" may be interpreted, depending on the context, as "when" or "once" or "in response to determining" or "in response to detecting". Similarly, the phrase "if it is determined" or "if [the described condition or event] is detected" may be interpreted, depending on the context, as "once it is determined" or "in response to determining" or "once [the described condition or event] is detected" or "in response to detecting [the described condition or event]".
In specific implementations, the terminal device described in the embodiments of the present application includes, but is not limited to, portable devices such as mobile phones, laptop computers or tablet computers having a touch-sensitive surface (for example, a touch-screen display and/or a touchpad). It should also be understood that, in some embodiments, the device is not a portable communication device but a desktop computer having a touch-sensitive surface (for example, a touch-screen display and/or a touchpad).
In the following discussion, a terminal device including a display and a touch-sensitive surface is described. However, it should be understood that the terminal device may include one or more other physical user-interface devices such as a physical keyboard, a mouse and/or a joystick.
The terminal device supports various applications, such as one or more of the following: a drawing application, a presentation application, a word-processing application, a website creation application, a disc burning application, a spreadsheet application, a game application, a telephone application, a video conferencing application, an email application, an instant messaging application, an exercise support application, a photo management application, a digital camera application, a digital video camera application, a web browsing application, a digital music player application and/or a video player application.
The various applications that can be executed on the terminal device may use at least one common physical user-interface device such as the touch-sensitive surface. One or more functions of the touch-sensitive surface and the corresponding information displayed on the terminal may be adjusted and/or changed between applications and/or within a corresponding application. In this way, a common physical architecture of the terminal (for example, the touch-sensitive surface) can support the various applications with user interfaces that are intuitive and transparent to the user.
Embodiment:
Fig. 1 shows the flowchart of a first video classification method provided by an embodiment of the present application. This embodiment mainly classifies the faces contained in a video file, and the details are as follows:
Step S11: perform an image quality screening operation on a video file to obtain video frames whose image quality meets a requirement.
The image quality screening of a video frame includes judging whether the edges of foreground objects meet a requirement, whether the noise of the video frame meets a requirement, whether the brightness value meets a requirement, whether the dispersion of the gray values meets a requirement, and so on.
In this embodiment, any one or several of the image quality screening operations enumerated above (or image quality screening operations not enumerated above) may be used to screen the video frames of the video file, which is not limited here.
Because a preliminary screening operation is first performed on the video file, the number of video frames to be processed subsequently can be effectively reduced.
In some embodiments, because the key frames of a video file contain more information, in order to further reduce redundant video frames, the above step S11 is specifically:
performing the image quality screening operation on the key frames of the video file to obtain video frames whose image quality meets the requirement.
In this embodiment, because the image quality screening operation is performed on the key frames, the screened video frames are also key frames.
Step S12: perform a face detection operation on the video frames to determine face information in the video frames, the face information including the number and positions of faces.
Specifically, a face detector is used to detect the positions and the number of faces in a video frame. For example, the position of a detected face is framed (for example, circled with a rectangle). If a video frame contains two or more faces, each detected face is framed separately.
Step S13: crop out the corresponding number of faces according to the number and positions of faces, and extract representative face features from the faces.
A representative face feature refers to a representative feature among the facial features. In this step, because the representative features are extracted only from the cropped faces, there is no need to extract them from the entire video frame, which increases the speed of representative feature extraction.
Step S14: classify the faces according to the representative face features.
Specifically, the faces are classified according to the obtained representative face features and a classifier (for example, a support vector machine), thereby realizing a semantic understanding of the video frames.
In the embodiment of the present application, an image quality screening operation is performed on a video file to obtain video frames whose image quality meets a requirement; a face detection operation is performed on the video frames to determine face information in the video frames, the face information including the number and positions of faces; the corresponding number of faces is cropped out according to the number and positions of faces, and representative face features are extracted from the faces; and the faces are classified according to the representative face features. Because the image quality screening operation is performed on the video file first, redundant video frames are effectively reduced; and because face detection and cropping are performed on the screened video frames, the representative face features can be determined quickly from the cropped faces. Classification based on these features therefore requires no complex model, which greatly increases the speed of face classification; and because the faces are classified according to the representative features obtained, the accuracy of the classification result is also improved.
To further improve the accuracy of the extracted representative face features, Fig. 2 shows the flowchart of a second video classification method provided by an embodiment of the present application. In this embodiment, after the faces are cropped out, a face alignment operation also needs to be performed, and the representative face features are finally extracted from the aligned faces. Steps S21 and S22 are identical to the above steps S11 and S12 and are not described again here:
Step S21: perform an image quality screening operation on a video file to obtain video frames whose image quality meets a requirement.
Step S22: perform a face detection operation on the video frames to determine face information in the video frames, the face information including the number and positions of faces.
Step S23: crop out the corresponding number of faces according to the number and positions of faces, and perform an alignment operation on the cropped faces.
The alignment operation of this step refers to performing operations such as translation or rotation on the cropped faces, so that the faces after the operation meet a preset requirement, for example, the requirement of facing the user frontally.
In some embodiments, because of the movement of persons and the like, the angles at which the faces face the user may differ, while usually only frontal faces are needed. In this case, the alignment operation is specifically performing operations such as translation and/or rotation on a face so that the face after the operation faces the user frontally. To perform the alignment operation, the face information in the video frame is set to further include face anchor points. In this case, step S23 includes:
cropping out the corresponding number of faces according to the number and positions of faces, and rotating the cropped faces according to the face anchor points to correct each face into a frontal face, the frontal face serving as the aligned face.
In this embodiment, the face anchor points are used to identify fixed features of the face, such as the positions of the eye corners, the mouth corners and the nose. In the face alignment process, the face is first cropped out of the whole image using the face frame, and is then rotated and corrected into a frontal face according to the face anchor points, so as to filter out profile or tilted faces.
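The patent gives no code for the rotation correction; the sketch below shows only the in-plane part of it under an assumed landmark convention (eye coordinates as (x, y) tuples): compute the roll angle from the two eye anchor points and rotate all landmarks about the eye midpoint so the inter-eye line becomes horizontal. Out-of-plane (profile) correction is not covered here.

```python
import numpy as np

def eye_alignment_angle(left_eye, right_eye):
    """Roll angle (radians) needed to make the inter-eye line horizontal."""
    dx = right_eye[0] - left_eye[0]
    dy = right_eye[1] - left_eye[1]
    return np.arctan2(dy, dx)

def align_landmarks(landmarks, left_eye, right_eye):
    """Rotate landmark points about the eye midpoint so the eyes become
    horizontal -- the in-plane rotation step of face alignment."""
    theta = eye_alignment_angle(left_eye, right_eye)
    c, s = np.cos(-theta), np.sin(-theta)
    R = np.array([[c, -s], [s, c]])           # rotation by -theta
    center = (np.asarray(left_eye, float) + np.asarray(right_eye, float)) / 2
    return (np.asarray(landmarks, float) - center) @ R.T + center
```

In practice the same rotation would be applied to the cropped face pixels (e.g. via an affine warp); rotating only the anchor points keeps the sketch dependency-free.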
Step S24: extract the facial features of the aligned faces, cluster the extracted facial features, perform feature pooling on each category obtained by clustering, and obtain the representative face feature under the corresponding category.
Specifically, a facial feature is a set of abstract numbers, and clustering refers to clustering the output facial features. Suppose there are 10 facial features, that is, 10 vectors of 100 dimensions. Suppose that, after clustering, the 10 vectors are divided into three groups A, B and C, where groups A and B each contain 3 vectors and group C contains 4 vectors. Within each group, taking group A as an example, the 3 vectors of group A are mean-pooled, that is, the vectors are added together and then divided by 3. The center vector of group A obtained in this way serves as the representative face feature. The other groups obtain their representative face features by the same process.
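The patent does not name a clustering algorithm; a minimal sketch of the cluster-then-mean-pool step, assuming plain Lloyd k-means with a deterministic farthest-point initialisation, could look like this:

```python
import numpy as np

def _farthest_point_init(X, k):
    # Deterministic initialisation: start from the first sample, then
    # repeatedly pick the point farthest from all chosen centers.
    centers = [X[0]]
    for _ in range(k - 1):
        d = np.min([((X - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(X[int(np.argmax(d))])
    return np.array(centers)

def representative_features(features, k, iters=20):
    """Cluster feature vectors into k groups (Lloyd k-means) and
    mean-pool each group; the pooled center of each group serves as the
    representative face feature for that group (step S24)."""
    X = np.asarray(features, dtype=float)
    centers = _farthest_point_init(X, k)
    for _ in range(iters):
        labels = np.argmin(((X[:, None, :] - centers[None, :, :]) ** 2).sum(-1), axis=1)
        for j in range(k):
            members = X[labels == j]
            if len(members):
                centers[j] = members.mean(axis=0)   # mean pooling of the group
    return labels, centers
```

With the example in the text (10 vectors of 100 dimensions split into groups of 3, 3 and 4), the returned `centers` are exactly the three group means.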
Step S25: input the representative face features into a pre-trained classifier to obtain the face category corresponding to each representative face feature.
In this step, through the classifier, the faces whose representative features are similar or identical can be assigned to the same category. For example, when the representative feature of one face is identical to that of another face, the classifier model will determine that the two faces belong to the same category.
In this embodiment, because the representative features are extracted only after the alignment operation is performed on the faces, it is guaranteed that the representative face features are all extracted from the same angle, which improves the accuracy of the extracted representative features.
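The patent only says the classifier is pre-trained (suggesting, for example, a support vector machine). As an illustrative stand-in, not the patent's actual model, a nearest-centroid classifier shows the shape of step S25: fit on labeled representative features, then assign each new feature to the closest category centroid.

```python
import numpy as np

class NearestCentroidClassifier:
    """Illustrative stand-in for the pre-trained classifier of step S25:
    each representative face feature is assigned to the person category
    whose training-set centroid it is closest to."""

    def fit(self, features, categories):
        X = np.asarray(features, dtype=float)
        y = np.asarray(categories)
        self.classes_ = np.unique(y)
        self.centroids_ = np.array([X[y == c].mean(axis=0) for c in self.classes_])
        return self

    def predict(self, features):
        X = np.asarray(features, dtype=float)
        d = ((X[:, None, :] - self.centroids_[None, :, :]) ** 2).sum(-1)
        return self.classes_[np.argmin(d, axis=1)]
```

The category names and feature values below are hypothetical; in the patent's pipeline the inputs would be the pooled representative features from step S24.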
In some embodiments, in order to quickly determine the video frames whose image quality meets the requirement, step S21 (or step S11) comprises:
A1: taking the video frame as a unit, calculate at least one of the following parameter values of each video frame of the video file: the image mean, the image standard deviation and the image average gradient.
The image mean is the average value of the image pixels; the image mean of a video frame reflects the average brightness of the video frame, and the greater the average brightness, the better the image quality. For example, suppose the image mean of video frame F is calculated, the image size is (M, N), M is the number of rows of the image matrix and N the number of columns; then the image mean u of frame F is calculated as:
u = (1 / (M·N)) · Σ_{i=1..M} Σ_{j=1..N} F(i, j)
The image standard deviation refers to the dispersion of the image pixel gray values relative to the mean; the larger the standard deviation, the more dispersed the gray-level distribution in the image and the better the image quality. Suppose the image standard deviation of video frame F is calculated, the image size is (M, N), M is the number of rows of the image matrix and N the number of columns; then the image standard deviation std of frame F is calculated as:
std = sqrt( (1 / (M·N)) · Σ_{i=1..M} Σ_{j=1..N} (F(i, j) − u)² )
The image average gradient reflects the detail contrast and texture variation in an image and, to a certain extent, its clarity. The image average gradient ḡ is calculated as:
ḡ = (1 / ((M−1)·(N−1))) · Σ_{i=1..M−1} Σ_{j=1..N−1} sqrt( ((Δx F(i, j))² + (Δy F(i, j))²) / 2 )
In the above formula, Δx F(i, j) and Δy F(i, j) denote the first differences of pixel (i, j) in the x and y directions, respectively.
A2: if each calculated parameter value is greater than or equal to the corresponding preset parameter threshold, determine that the image quality of the corresponding video frame meets the requirement, the preset parameter thresholds including at least one of the following: an image mean threshold, an image standard deviation threshold and an image average gradient threshold.
In this step, if the parameter value calculated for a video frame is the image mean, it is only judged whether the image mean is greater than or equal to the preset image mean threshold; if it is, the image quality of the video frame is determined to meet the requirement. If the calculated parameter values are the image mean and the image standard deviation, it is necessary to judge both whether the image mean is greater than or equal to the preset image mean threshold and whether the image standard deviation is greater than or equal to the preset image standard deviation threshold; only if both conditions hold is the image quality of the video frame determined to meet the requirement. For other combinations of calculated parameter values, the judgment rule is similar and is not described again here.
From the above calculation formulas of the image mean, image standard deviation and image average gradient, it can be seen that when the embodiment of the present application performs image quality screening, the corresponding computational complexity is low, so that the screened set of video frames can be determined quickly.
In some embodiments, because corresponding video tags can be obtained after classification, a specified operation can be performed on other designated files according to the video tags. In this case, after step S14, the method comprises:
performing a specified operation on designated files according to the video tags obtained after classification, the specified operation including retrieval and management.
In this embodiment, because a retrieval operation is performed on the designated files according to the video tags obtained after classification, intelligent retrieval of the content of video files is realized.
In some embodiments, if the specified operation is management, performing a specified operation on designated files according to the video tags obtained after classification comprises:
receiving custom information input by the user, categorizing the video tags obtained after classification according to the custom information, and performing a categorization operation on the designated files according to the categorization result.
The custom information may be family, friends, colleagues and so on. After the video tags are categorized by the custom information, a corresponding categorization operation can be performed on the designated files according to the categorized video tags. For example, when a designated file includes the video tag "family", the designated file is classified as a video file of the "family" category.
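A minimal sketch of this management operation, with hypothetical tag and file names: the user's custom information maps face tags to groups ("family", "friends", ...), and each video file is then filed under every group one of its tags belongs to.

```python
def categorize_by_labels(video_labels, custom_groups):
    """Group video files by user-defined categories applied to the
    face tags obtained after classification.

    video_labels:  {video file -> list of face tags found in it}
    custom_groups: {group name -> list of face tags, from user input}
    """
    # Invert the user's custom information: face tag -> group name.
    label_to_group = {lbl: g for g, lbls in custom_groups.items() for lbl in lbls}
    folders = {}
    for video, labels in video_labels.items():
        for lbl in labels:
            g = label_to_group.get(lbl)
            if g is not None:
                folders.setdefault(g, []).append(video)
    return folders
```

All identifiers here ("face_a", "v1.mp4", ...) are illustrative placeholders, not names from the patent.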
In practice, when displaying stored video files, a terminal device (such as a mobile phone) generally selects the first frame of a video file as its cover. However, the first frame of a video file usually contains little information; for example, when the video file captures a person, its first frame may contain only the person's back, or an image of other scenery. It is therefore difficult for the user to obtain useful information from the displayed cover of a video file. In order to let the user quickly know what content a stored video file captures, in some embodiments, after step S14, the method comprises:
selecting a video frame from the video frames after face classification as the cover frame of the video file.
In this step, any video frame may be selected from the classified video frames as the cover frame of the video file.
The cover frame in this step refers to the video frame displayed for a video file. Because the cover frame is selected from the video frames after face classification, it is guaranteed that the cover frame contains face information, and faces carry a relatively large amount of information, so that the user can quickly obtain more information from the cover frame.
Further, a video file generally contains multiple faces, and different faces differ in their importance to the video file. In order to select the video frame containing the more important face as the cover frame, the selecting of a video frame from the video frames after face classification as the cover frame of the video file comprises:
B1: counting the number of video frames included in each category;
If the video file includes two or more faces, the faces are divided into corresponding categories according to the above steps, i.e., the video frames under one category (or class) contain the same face. The more video frames a category includes, the larger the proportion of the video file it occupies, and the more important the corresponding face is.
B2: selecting a video frame from the category containing the largest number of video frames as the cover frame of the video file.
In this embodiment, since the cover frame is selected from the category containing the largest number of video frames, it is ensured that the cover frame of the video file carries a larger amount of information.
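Steps B1 and B2 can be sketched in a few lines. The following is a minimal illustration only, assuming the classification step yields (category, frame_index) pairs; the function name and data layout are hypothetical and not taken from the patent:

```python
from collections import Counter

def pick_cover_frame(classified_frames):
    """classified_frames: list of (category_id, frame_index) pairs
    produced by the face classification step.
    B1: count frames per category; B2: pick a frame from the category
    that contains the most frames (the most important face)."""
    counts = Counter(cat for cat, _ in classified_frames)       # B1
    largest_cat, _ = counts.most_common(1)[0]                   # dominant face
    # B2: any frame of the dominant category may serve as the cover;
    # here we simply take the first one in frame order.
    return next(idx for cat, idx in classified_frames if cat == largest_cat)

frames = [("face_A", 3), ("face_B", 7), ("face_A", 12), ("face_A", 20)]
cover = pick_cover_frame(frames)   # frame 3, from the largest category "face_A"
```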
In some embodiments, the user wishes to sort the captured video files, i.e., to place video files containing the same face into the same folder. In this case, after step S14, the method comprises:
Storing the video files in which faces of the same category appear into the same folder.
In this embodiment, after the faces of multiple video files have been classified, the video files containing faces of the same category can be stored into the same folder. For example, after the faces of multiple video files are classified, different folders are created, and the video files containing faces of the same category are stored into the same folder.
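A minimal sketch of this folder organization, assuming a mapping from each video file path to its assigned face category is already available; the function name and folder layout are illustrative only:

```python
import os
import shutil

def group_videos_by_face(video_categories, root="sorted_videos"):
    """video_categories: dict mapping a video file path to the face
    category assigned to it, e.g. {"v1.mp4": "face_A"}.
    Moves videos of the same face category into the same folder,
    creating one folder per category under `root`."""
    for path, category in video_categories.items():
        target_dir = os.path.join(root, category)
        os.makedirs(target_dir, exist_ok=True)   # one folder per face category
        shutil.move(path, os.path.join(target_dir, os.path.basename(path)))
```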
It should be understood that the sequence numbers of the steps in the above embodiments do not imply an execution order; the execution order of the processes should be determined by their functions and internal logic, and should not constitute any limitation on the implementation process of the embodiments of the present application.
Fig. 3 shows a structural schematic diagram of a video classification apparatus provided by an embodiment of the present application. The video classification apparatus is applied in a terminal device; for ease of description, only the parts relevant to this embodiment are shown.
The video classification apparatus comprises: an image quality screening unit 31, a face detection unit 32, a face representative feature extraction unit 33, and a face classification unit 34. Wherein:
the image quality screening unit 31 is configured to perform an image quality screening operation on a video file to obtain video frames whose image quality meets requirements;
wherein screening a video frame for image quality includes: judging whether the edges of foreground objects meet requirements, whether the noise of the video frame meets requirements, whether the brightness value meets requirements, whether the dispersion of gray values meets requirements, and the like.
In some embodiments, since the key frames of a video file contain more information, in order to further reduce redundant video frames, the image quality screening unit 31 is specifically configured to:
perform the image quality screening operation on the key frames of the video file to obtain video frames whose image quality meets requirements.
The face detection unit 32 is configured to perform a face detection operation on the video frames to determine information of faces in the video frames, the information of a face in a video frame including the number and positions of faces;
the face representative feature extraction unit 33 is configured to cut out the corresponding number of faces according to the face number and positions, and extract the representative feature of each face;
the face classification unit 34 is configured to classify the faces according to their representative features.
In the embodiments of the present application, an image quality screening operation is first performed on the video file, which effectively reduces redundant video frames; operations such as face detection and cropping are then performed on the screened video frames, so that the face representative features can be quickly determined from the cropped faces. When the classification of faces is realized according to these representative features, no complicated model is required, which greatly improves the speed of face classification; moreover, since the faces are classified according to the obtained representative features, the accuracy of the classification results is also improved.
In order to further improve the accuracy of the extracted face representative features, the face representative feature extraction unit 33 comprises:
a face alignment module, configured to cut out the corresponding number of faces according to the face number and positions, and perform an alignment operation on the cut-out faces;
a face feature clustering module, configured to extract the face features of the aligned faces, cluster the extracted face features, and perform feature pooling on each cluster obtained, to obtain the face representative feature of each corresponding category.
Accordingly, the face classification unit 34 is specifically configured to:
input the face representative features into a pre-trained classifier model to obtain the face category corresponding to each face representative feature.
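The clustering and feature-pooling steps can be illustrated as follows. The patent does not fix a particular clustering algorithm, so this sketch uses a simple greedy cosine-similarity grouping of normalised feature vectors followed by average pooling; the threshold value and function name are assumptions. The pooled representative of each cluster is what would then be fed to the pre-trained classifier:

```python
import numpy as np

def cluster_and_pool(features, threshold=0.8):
    """Greedily clusters L2-normalised face feature vectors by cosine
    similarity, then average-pools each cluster; each pooled mean serves
    as that category's face representative feature."""
    clusters = []                                  # list of lists of vectors
    for f in features:
        f = f / np.linalg.norm(f)
        for c in clusters:
            rep = np.mean(c, axis=0)
            rep = rep / np.linalg.norm(rep)
            if float(f @ rep) >= threshold:        # same face -> same cluster
                c.append(f)
                break
        else:
            clusters.append([f])                   # start a new cluster
    # feature pooling: one representative vector per cluster
    return [np.mean(c, axis=0) for c in clusters]
```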
In some embodiments, the information of a face in a video frame further includes face anchor points. In this case, the face alignment module is specifically configured to:
cut out the corresponding number of faces according to the face number and positions, and then rotate the cut-out faces according to the face anchor points so as to normalize each face into a frontal face, the frontal face serving as the aligned face.
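As a simple illustration of rotating a face according to anchor points, the angle needed to make the line through the two eye landmarks horizontal can be computed as below. The (x, y) landmark format and two-eye convention are assumptions; the patent only states that the cut-out faces are rotated according to face anchor points:

```python
import math

def eye_alignment_angle(left_eye, right_eye):
    """Angle in degrees by which a cropped face must be rotated so that
    the line through its two eye landmarks becomes horizontal, i.e. the
    face is normalised toward an upright, frontal pose."""
    dx = right_eye[0] - left_eye[0]
    dy = right_eye[1] - left_eye[1]
    return math.degrees(math.atan2(dy, dx))
```

In a full pipeline this angle would drive an affine rotation of the cropped face about the midpoint between the eyes, e.g. with an image-processing library's rotation routine.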
In some embodiments, in order to quickly determine the video frames whose image quality meets requirements, the image quality screening unit 31 comprises:
a video frame parameter calculating module, configured to calculate, on a per-frame basis, at least one of the following parameter values for the video frames of the video file: image mean, image standard deviation, and image average gradient;
an image quality judgment module, configured to determine that the image quality of the video frame corresponding to the parameter values meets requirements if the calculated parameter values are all greater than or equal to preset parameter thresholds, the preset parameter thresholds including at least one of the following: an image mean threshold, an image standard deviation threshold, and an image average gradient threshold.
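A sketch of the two modules above, computing the three named parameters for a grayscale frame and comparing each against a preset threshold; the concrete threshold values are placeholders, not taken from the patent:

```python
import numpy as np

def frame_quality_ok(frame, mean_thr=40.0, std_thr=10.0, grad_thr=2.0):
    """Keeps a grayscale frame only if its image mean, image standard
    deviation, and image average gradient all reach their preset
    thresholds (brightness, gray-value dispersion, and sharpness)."""
    frame = np.asarray(frame, dtype=np.float64)
    mean = frame.mean()                      # overall brightness
    std = frame.std()                        # dispersion of gray values
    gy, gx = np.gradient(frame)              # per-pixel gradients
    avg_grad = np.mean(np.hypot(gx, gy))     # average gradient ~ sharpness
    return mean >= mean_thr and std >= std_thr and avg_grad >= grad_thr
```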
In some embodiments, since video tags corresponding to the faces can be obtained after classification, a specified operation can be performed on other designated files according to these video tags. In this case, the video classification apparatus 3 further comprises:
a specified operation execution unit, configured to perform a specified operation on designated files according to the video tags obtained after classification, the specified operation including retrieval and management.
In this embodiment, since a retrieval operation is performed on the designated files according to the video tags obtained after classification, intelligent retrieval of the contents of video files is realized.
In some embodiments, if the specified operation is management, the specified operation execution unit is specifically configured to:
receive self-defined information input by the user, sort the video tags obtained after classification according to the self-defined information, and perform a categorizing operation on the designated files according to the sorting results.
The self-defined information may be, for example, "family", "friend", or "colleague". After the video tags are sorted according to the self-defined information, a corresponding categorizing operation can be performed on the designated files according to the sorted video tags. For example, when a designated file includes the video tag "family", that file is classified as a video file of the "family" category.
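As a toy illustration of sorting video tags according to self-defined information, the mapping from per-face tags to user-defined groups might look like the following; all tag and group names here are invented for the example, not taken from the patent:

```python
def categorize_tags(video_tags, groupings):
    """video_tags: dict mapping a video file to its list of face tags.
    groupings: user's self-defined information, mapping a group name
    (e.g. "household") to the face tags it covers (e.g. ["dad", "mom"]).
    Returns the sorted list of groups each video belongs to."""
    tag_to_group = {tag: group
                    for group, tags in groupings.items()
                    for tag in tags}
    result = {}
    for video, tags in video_tags.items():
        # a video belongs to every group that one of its face tags maps to
        result[video] = sorted({tag_to_group[t] for t in tags if t in tag_to_group})
    return result
```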
In order to let the user quickly learn the shot content of a stored video file, in some embodiments the video classification apparatus 3 further comprises:
an arbitrary frame selecting unit, configured to select a video frame from the video frames after face classification as the cover frame of the video file.
Further, since a video file generally contains multiple faces and different faces differ in their importance to the video file, in order to select the video frame containing the more important face as the cover frame, the arbitrary frame selecting unit comprises:
a video frame quantity statistics module, configured to count the number of video frames included in each category;
an important video frame selecting module, configured to select a video frame from the category containing the largest number of video frames as the cover frame of the video file.
In some embodiments, the user wishes to sort the captured video files, i.e., to place video files containing the same face into the same folder. In this case, the video classification apparatus 3 further comprises:
a same-category face sorting unit, configured to store the video files in which faces of the same category appear into the same folder.
In this embodiment, after the faces of multiple video files have been classified, the video files containing faces of the same category can be stored into the same folder.
Fig. 4 is a schematic diagram of a terminal device provided by an embodiment of the present application. As shown in Fig. 4, the terminal device 4 of this embodiment comprises: a processor 40, a memory 41, and a computer program 42 stored in the memory 41 and executable on the processor 40. When executing the computer program 42, the processor 40 implements the steps in the above method embodiments, such as steps S11 to S14 shown in Fig. 1; alternatively, when executing the computer program 42, the processor 40 implements the functions of the modules/units in the above apparatus embodiments, such as the functions of modules 31 to 34 shown in Fig. 3.
Illustratively, the computer program 42 may be divided into one or more modules/units, which are stored in the memory 41 and executed by the processor 40 to complete the present application. The one or more modules/units may be a series of computer program instruction segments capable of completing specific functions, the instruction segments being used to describe the execution process of the computer program 42 in the terminal device 4. For example, the computer program 42 may be divided into an image quality screening unit, a face detection unit, a face representative feature extraction unit, and a face classification unit, whose specific functions are as follows:
the image quality screening unit is configured to perform an image quality screening operation on a video file to obtain video frames whose image quality meets requirements;
the face detection unit is configured to perform a face detection operation on the video frames to determine information of faces in the video frames, the information of a face in a video frame including the number and positions of faces;
the face representative feature extraction unit is configured to cut out the corresponding number of faces according to the face number and positions, and extract the representative feature of each face;
the face classification unit is configured to classify the faces according to their representative features.
The terminal device 4 may be a computing device such as a desktop computer, a notebook, a palmtop computer, or a cloud server. The terminal device may include, but is not limited to, the processor 40 and the memory 41. Those skilled in the art will understand that Fig. 4 is only an example of the terminal device 4 and does not constitute a limitation on the terminal device 4, which may include more or fewer components than shown, or combine certain components, or include different components; for example, the terminal device may further include input/output devices, network access devices, buses, and the like.
The processor 40 may be a central processing unit (CPU), or may be another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, etc. A general-purpose processor may be a microprocessor, or any conventional processor.
The memory 41 may be an internal storage unit of the terminal device 4, such as a hard disk or internal memory of the terminal device 4. The memory 41 may also be an external storage device of the terminal device 4, such as a plug-in hard disk, a smart media card (SMC), a secure digital (SD) card, or a flash card equipped on the terminal device 4. Further, the memory 41 may include both an internal storage unit and an external storage device of the terminal device 4. The memory 41 is used to store the computer program and other programs and data required by the terminal device. The memory 41 may also be used to temporarily store data that has been output or is to be output.
Those skilled in the art can clearly understand that, for convenience and brevity of description, only the division of the above functional units and modules is illustrated as an example; in practical applications, the above functions may be allocated to different functional units and modules as required, i.e., the internal structure of the apparatus may be divided into different functional units or modules to complete all or part of the functions described above. The functional units in the embodiments may be integrated in one processing unit, or each unit may exist physically alone, or two or more units may be integrated in one unit; the integrated units may be implemented in the form of hardware or in the form of software functional units. In addition, the specific names of the functional units and modules are only for the convenience of distinguishing them from each other and are not intended to limit the protection scope of the present application. For the specific working process of the units and modules in the above system, reference may be made to the corresponding processes in the foregoing method embodiments, which will not be repeated here.
In the above embodiments, each embodiment is described with its own emphasis; for parts not detailed or recorded in one embodiment, reference may be made to the relevant descriptions of the other embodiments.
Those of ordinary skill in the art will appreciate that the units and algorithm steps of the examples described in connection with the embodiments disclosed herein can be implemented by electronic hardware, or by a combination of computer software and electronic hardware. Whether these functions are implemented in hardware or software depends on the specific application and the design constraints of the technical solution. Skilled artisans may use different methods to implement the described functions for each particular application, but such implementations should not be considered to be beyond the scope of the present application.
In the embodiments provided in the present application, it should be understood that the disclosed apparatus/terminal device and method may be implemented in other ways. For example, the apparatus/terminal device embodiments described above are only illustrative; for instance, the division of the modules or units is only a logical function division, and there may be other division manners in actual implementation, e.g., multiple units or components may be combined or integrated into another system, or some features may be ignored or not executed. In addition, the mutual coupling or direct coupling or communication connection shown or discussed may be an indirect coupling or communication connection through some interfaces, apparatuses or units, and may be in electrical, mechanical or other forms.
The units described as separate components may or may not be physically separated, and the components shown as units may or may not be physical units, i.e., they may be located in one place or distributed over multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, the functional units in the embodiments of the present application may be integrated in one processing unit, or each unit may exist physically alone, or two or more units may be integrated in one unit. The above integrated unit may be implemented in the form of hardware or in the form of a software functional unit.
If the integrated module/unit is implemented in the form of a software functional unit and sold or used as an independent product, it may be stored in a computer-readable storage medium. Based on this understanding, the implementation of all or part of the processes in the above embodiment methods of the present application may also be completed by instructing relevant hardware through a computer program. The computer program may be stored in a computer-readable storage medium, and when executed by a processor, the computer program can implement the steps of the above method embodiments. The computer program includes computer program code, which may be in the form of source code, object code, an executable file, or some intermediate form. The computer-readable medium may include: any entity or device capable of carrying the computer program code, a recording medium, a USB flash drive, a removable hard disk, a magnetic disk, an optical disk, a computer memory, a read-only memory (ROM), a random access memory (RAM), an electric carrier signal, a telecommunication signal, a software distribution medium, etc. It should be noted that the content contained in the computer-readable medium may be appropriately increased or decreased according to the requirements of legislation and patent practice in the jurisdiction; for example, in some jurisdictions, according to legislation and patent practice, the computer-readable medium does not include electric carrier signals and telecommunication signals.
The above embodiments are only used to illustrate the technical solutions of the present application, not to limit them. Although the present application has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that they may still modify the technical solutions recorded in the foregoing embodiments, or make equivalent replacements for some of the technical features therein; and such modifications or replacements do not depart the essence of the corresponding technical solutions from the spirit and scope of the technical solutions of the embodiments of the present application, and shall all be included within the protection scope of the present application.

Claims (10)

1. A video classification method, characterized by comprising:
performing an image quality screening operation on a video file to obtain video frames whose image quality meets requirements;
performing a face detection operation on the video frames to determine information of faces in the video frames, the information of a face in a video frame including the number and positions of faces;
cutting out the corresponding number of faces according to the face number and positions, and extracting the representative feature of each face;
classifying the faces according to their representative features.
2. The video classification method according to claim 1, characterized in that the cutting out of the corresponding number of faces according to the face number and positions and the extracting of the representative feature of each face comprise:
cutting out the corresponding number of faces according to the face number and positions, and performing an alignment operation on the cut-out faces;
extracting the face features of the aligned faces, clustering the extracted face features, and performing feature pooling on each cluster obtained, to obtain the face representative feature of each corresponding category;
accordingly, the classifying of the faces according to their representative features is specifically:
inputting the face representative features into a pre-trained classifier to obtain the face category corresponding to each face representative feature.
3. The video classification method according to claim 2, characterized in that the information of a face in a video frame further includes face anchor points, and the cutting out of the corresponding number of faces according to the face number and positions and the performing of the alignment operation on the cut-out faces specifically comprise:
cutting out the corresponding number of faces according to the face number and positions, and then rotating the cut-out faces according to the face anchor points so as to normalize each face into a frontal face, the frontal face serving as the aligned face.
4. The video classification method according to claim 1, characterized in that the performing of the image quality screening operation on the video file to obtain video frames whose image quality meets requirements comprises:
calculating, on a per-frame basis, at least one of the following parameter values of the video frames of the video file: image mean, image standard deviation, and image average gradient;
if the calculated parameter values are all greater than or equal to preset parameter thresholds, determining that the image quality of the video frame corresponding to the parameter values meets requirements, the preset parameter thresholds including at least one of the following: an image mean threshold, an image standard deviation threshold, and an image average gradient threshold.
5. The video classification method according to claim 1, characterized in that after the classifying of the faces according to their representative features, the method comprises:
performing a specified operation on designated files according to video tags obtained after classification, the specified operation including retrieval and management.
6. The video classification method according to claim 5, characterized in that if the specified operation is management, the performing of the specified operation on designated files according to the video tags obtained after classification comprises:
receiving self-defined information input by a user, and sorting the video tags obtained after classification according to the self-defined information;
performing a categorizing operation on the designated files according to the sorting results.
7. The video classification method according to any one of claims 1 to 6, characterized in that after the classifying of the faces according to their representative features, the method comprises:
storing the video files in which faces of the same category appear into the same folder.
8. A video classification apparatus, characterized by comprising:
an image quality screening unit, configured to perform an image quality screening operation on a video file to obtain video frames whose image quality meets requirements;
a face detection unit, configured to perform a face detection operation on the video frames to determine information of faces in the video frames, the information of a face in a video frame including the number and positions of faces;
a face representative feature extraction unit, configured to cut out the corresponding number of faces according to the face number and positions, and extract the representative feature of each face;
a face classification unit, configured to classify the faces according to their representative features.
9. A terminal device, comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, characterized in that the processor, when executing the computer program, implements the steps of the method according to any one of claims 1 to 7.
10. A computer-readable storage medium storing a computer program, characterized in that the computer program, when executed by a processor, implements the steps of the method according to any one of claims 1 to 7.
CN201910628108.8A 2019-07-12 2019-07-12 Video classification methods, device, terminal device and computer readable storage medium Pending CN110347876A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910628108.8A CN110347876A (en) 2019-07-12 2019-07-12 Video classification methods, device, terminal device and computer readable storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910628108.8A CN110347876A (en) 2019-07-12 2019-07-12 Video classification methods, device, terminal device and computer readable storage medium

Publications (1)

Publication Number Publication Date
CN110347876A true CN110347876A (en) 2019-10-18

Family

ID=68175903

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910628108.8A Pending CN110347876A (en) 2019-07-12 2019-07-12 Video classification methods, device, terminal device and computer readable storage medium

Country Status (1)

Country Link
CN (1) CN110347876A (en)


Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104408429A (en) * 2014-11-28 2015-03-11 北京奇艺世纪科技有限公司 Method and device for extracting representative frame of video
US20170039419A1 (en) * 2015-08-05 2017-02-09 Canon Kabushiki Kaisha Information processing apparatus and control method of the same
CN107644213A (en) * 2017-09-26 2018-01-30 司马大大(北京)智能系统有限公司 Video person extraction method and device
CN109151501A (en) * 2018-10-09 2019-01-04 北京周同科技有限公司 A kind of video key frame extracting method, device, terminal device and storage medium


Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Du Yilin (都伊林): 《智能安防新发展与应用》 [New Developments and Applications of Intelligent Security], 31 May 2018 *

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111177086A (en) * 2019-12-27 2020-05-19 Oppo广东移动通信有限公司 File clustering method and device, storage medium and electronic equipment
CN111553191A (en) * 2020-03-30 2020-08-18 深圳壹账通智能科技有限公司 Video classification method and device based on face recognition and storage medium
CN111881755A (en) * 2020-06-28 2020-11-03 腾讯科技(深圳)有限公司 Method and device for cutting video frame sequence
CN111881755B (en) * 2020-06-28 2022-08-23 腾讯科技(深圳)有限公司 Method and device for cutting video frame sequence
CN112101154A (en) * 2020-09-02 2020-12-18 腾讯科技(深圳)有限公司 Video classification method and device, computer equipment and storage medium
CN112101154B (en) * 2020-09-02 2023-12-15 腾讯科技(深圳)有限公司 Video classification method, apparatus, computer device and storage medium
CN113705386A (en) * 2021-08-12 2021-11-26 北京有竹居网络技术有限公司 Video classification method and device, readable medium and electronic equipment
CN115205768A (en) * 2022-09-16 2022-10-18 山东百盟信息技术有限公司 Video classification method based on resolution self-adaptive network


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20191018