CN108090497B - Video classification method and device, storage medium and electronic equipment


Info

Publication number
CN108090497B
Authority
CN
China
Prior art keywords
point set
feature point
classification
video
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CN201711464317.0A
Other languages
Chinese (zh)
Other versions
CN108090497A (en)
Inventor
陈岩
刘耀勇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangdong Oppo Mobile Telecommunications Corp Ltd
Original Assignee
Guangdong Oppo Mobile Telecommunications Corp Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangdong Oppo Mobile Telecommunications Corp Ltd
Priority to CN201711464317.0A
Publication of CN108090497A
Application granted
Publication of CN108090497B
Status: Expired - Fee Related

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00: Arrangements for image or video recognition or understanding
    • G06V10/40: Extraction of image or video features
    • G06V10/44: Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00: Pattern recognition
    • G06F18/20: Analysing
    • G06F18/24: Classification techniques

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • Physics & Mathematics (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Multimedia (AREA)
  • Image Analysis (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The application discloses a video classification method, a video classification apparatus, a storage medium, and an electronic device. The method includes the following steps: acquiring multiple frames of images of a video file; extracting feature points from each frame of image in the multiple frames of images to obtain a feature point set; selecting feature points related to scene features from the feature point set to obtain a first feature point set, and selecting feature points related to object features from the feature point set to obtain a second feature point set; and classifying, by a classification model, the video file according to the first feature point set and the second feature point set. Because two feature point sets, related to scene features and to object features respectively, are obtained from the video file and then input into the classification model, the classification is based on richer input data and is therefore more accurate.

Description

Video classification method and device, storage medium and electronic equipment
Technical Field
The present application relates to the field of communications technologies, and in particular to a video classification method and apparatus, a storage medium, and an electronic device.
Background
As video files proliferate, people first screen them by category before watching, and then select interesting video files from the corresponding category. This requires that video files be classified effectively, so that each file is presented under an appropriate category.
Existing video file classification mainly sets category labels for the video files first and then sorts each file into the corresponding video category according to its label. However, the category labels may be set inaccurately or incompletely, resulting in inaccurate video classification.
Disclosure of Invention
The application provides a video classification method and apparatus, a storage medium, and an electronic device, which classify video files automatically and reasonably and improve the accuracy of video classification.
In a first aspect, an embodiment of the present application provides a video classification method, which is applied to an electronic device, and the method includes:
acquiring multiple frames of images of a video file;
extracting feature points from each frame of image in the multiple frames of images to obtain a feature point set;
selecting feature points related to scene features from the feature point set to obtain a first feature point set, and selecting feature points related to object features from the feature point set to obtain a second feature point set;
and classifying, by a classification model, the video file according to the first feature point set and the second feature point set.
In a second aspect, an embodiment of the present application provides a video classification apparatus, which is applied to an electronic device, and the apparatus includes:
the image acquisition module is used for acquiring multiple frames of images of a video file;
the feature point set acquisition module is used for extracting feature points from each frame of image in the multiple frames of images to obtain a feature point set;
the feature point set selection module is used for selecting feature points related to scene features from the feature point set to obtain a first feature point set, and selecting feature points related to object features from the feature point set to obtain a second feature point set;
and the processing module is used for classifying, by a classification model, the video file according to the first feature point set and the second feature point set.
In a third aspect, an embodiment of the present application provides a storage medium, on which a computer program is stored, which, when running on a computer, causes the computer to execute the video classification method described above.
In a fourth aspect, an embodiment of the present application provides an electronic device, which includes a processor and a memory, where the memory stores a computer program, and the processor is configured to execute the video classification method described above by calling the computer program.
According to the video classification method and apparatus, the storage medium, and the electronic device described above, multiple frames of images of the video file are acquired; feature points are extracted from each frame of image in the multiple frames of images to obtain a feature point set; feature points related to scene features are selected from the feature point set to obtain a first feature point set, and feature points related to object features are selected from the feature point set to obtain a second feature point set; and the classification model classifies the video file according to the first feature point set and the second feature point set. Because two feature point sets, related to scene features and to object features respectively, are obtained from the video file and then input into the classification model, the classification is based on richer input data and is therefore more accurate.
Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present application, the drawings used in the description of the embodiments will be briefly introduced below. It is obvious that the drawings in the following description are only some embodiments of the application, and that for a person skilled in the art, other drawings can be derived from them without inventive effort.
Fig. 1 is a schematic view of an application scenario of a video classification apparatus according to an embodiment of the present application.
Fig. 2 is a first schematic flowchart of the video classification method according to an embodiment of the present application.
Fig. 3 is a second schematic flowchart of the video classification method according to an embodiment of the present application.
Fig. 4 is a third schematic flowchart of the video classification method according to an embodiment of the present application.
Fig. 5 is a fourth schematic flowchart of the video classification method according to an embodiment of the present application.
Fig. 6 is a fifth schematic flowchart of the video classification method according to an embodiment of the present application.
Fig. 7 is a sixth schematic flowchart of the video classification method according to an embodiment of the present application.
Fig. 8 is a seventh schematic flowchart of the video classification method according to an embodiment of the present application.
Fig. 9 is a first schematic structural diagram of the video classification apparatus according to an embodiment of the present application.
Fig. 10 is a second schematic structural diagram of the video classification apparatus according to an embodiment of the present application.
Fig. 11 is a third schematic structural diagram of the video classification apparatus according to an embodiment of the present application.
Fig. 12 is a schematic structural diagram of an electronic device according to an embodiment of the present application.
Fig. 13 is another schematic structural diagram of an electronic device according to an embodiment of the present application.
Detailed Description
Referring to the drawings, wherein like reference numbers refer to like elements, the principles of the present application are illustrated as being implemented in a suitable computing environment. The following description is based on illustrated embodiments of the application and should not be taken as limiting the application with respect to other embodiments that are not detailed herein.
In the description that follows, specific embodiments of the present application are described with reference to steps and symbols executed by one or more computers, unless indicated otherwise. These steps and operations are at times referred to as being computer-executed: the computer's processing unit manipulates electronic signals that represent data in a structured form. This manipulation transforms the data or maintains it at locations in the computer's memory system, which reconfigures or otherwise alters the computer's operation in a manner well known to those skilled in the art. The data structures in which the data is maintained are physical locations of the memory that have particular properties defined by the data format. However, while the principles of the application are described in these terms, this is not meant as a limitation; those of ordinary skill in the art will recognize that various steps and operations described below may also be implemented in hardware.
The term module, as used herein, may be considered a software object executing on the computing system. The various components, modules, engines, and services described herein may be viewed as objects implemented on the computing system. The apparatus and method described herein may be implemented in software, but may also be implemented in hardware, and are within the scope of the present application.
The terms "first", "second", and "third", etc. in this application are used to distinguish between different objects and not to describe a particular order. Furthermore, the terms "include" and "have," as well as any variations thereof, are intended to cover non-exclusive inclusions. For example, a process, method, system, article, or apparatus that comprises a list of steps or modules is not limited to only those steps or modules listed, but rather, some embodiments may include other steps or modules not listed or inherent to such process, method, article, or apparatus.
Reference herein to "an embodiment" means that a particular feature, structure, or characteristic described in connection with the embodiment can be included in at least one embodiment of the application. The appearances of the phrase in various places in the specification are not necessarily all referring to the same embodiment, nor are separate or alternative embodiments mutually exclusive of other embodiments. It is explicitly and implicitly understood by one skilled in the art that the embodiments described herein can be combined with other embodiments.
Referring to fig. 1, fig. 1 is a schematic view of an application scenario of the video classification apparatus according to an embodiment of the present application. For example, the video classification apparatus acquires multiple frames of images of a video file; extracts feature points from each frame of image to obtain a feature point set; selects feature points related to scene features from the feature point set to obtain a first feature point set, and feature points related to object features to obtain a second feature point set; and inputs the first feature point set and the second feature point set into a classification model, which classifies the video file according to the two sets. For example, video files may be categorized by starring actor, or sorted into categories such as sports videos.
The execution subject of the video classification method may be the video classification apparatus provided in the embodiments of the present application, or an electronic device integrating the video classification apparatus, where the video classification apparatus may be implemented in hardware or software. It can be understood that the execution subject of the embodiments of the present application may be a terminal device such as a smartphone or a tablet computer.
The embodiments of the present application are described from the perspective of a video classification apparatus, which may specifically be integrated in an electronic device. The video classification method includes the following steps: acquiring multiple frames of images of a video file; extracting feature points from each frame of image in the multiple frames of images to obtain a feature point set; selecting feature points related to scene features from the feature point set to obtain a first feature point set, and selecting feature points related to object features from the feature point set to obtain a second feature point set; and classifying, by a classification model, the video file according to the first feature point set and the second feature point set.
Referring to fig. 2, fig. 2 is a first flowchart illustrating a video classification method according to an embodiment of the present disclosure. The video classification method provided by the embodiment of the application is applied to electronic equipment, and the specific process can be as follows:
Step 101, acquiring multiple frames of images of a video file.
A video file is a multimedia file containing real-time audio and video information, and is one of the important forms of Internet multimedia content. Common container formats include AVI, MPEG, and the like.
The method may obtain consecutive frames of the video file, or sample frames at a preset frequency, for example one frame every 1 second; other frequencies are equally possible, such as one frame every 2 seconds, every 1 minute, or every 5 minutes. Alternatively, a run of consecutive frames may be taken every 1, 5, or 10 minutes, and the runs combined into the final set of frames. Consecutive frames are correlated in the time dimension, which makes them more accurate as reference data for video classification.
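For illustration only, the following is a minimal sampling sketch in Python using OpenCV; the patent names no library, and the 1-second interval and the fallback frame rate are assumptions:

```python
import cv2  # OpenCV: an assumed choice, the patent names no library

def sample_frames(video_path, interval_sec=1.0):
    """Grab one frame every interval_sec seconds from a video file."""
    cap = cv2.VideoCapture(video_path)
    fps = cap.get(cv2.CAP_PROP_FPS) or 25.0  # fall back if FPS is unreported
    step = max(int(round(fps * interval_sec)), 1)
    frames, index = [], 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        if index % step == 0:  # keep every step-th frame
            frames.append(frame)
        index += 1
    cap.release()
    return frames
```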
Referring to fig. 3, fig. 3 is a second flowchart illustrating a video classification method according to an embodiment of the present disclosure. The steps for acquiring the multi-frame image of the video file provided by the embodiment of the application can be as follows:
Step 1011, dividing the video file into multiple sub video files of equal playing duration.
First, the total playing duration of the video file is obtained, together with the number of sub video files to be divided. Dividing the total duration by that number gives the playing duration of each sub video file, from which the start time and end time of each sub video file follow. The segments between each pair of corresponding start and end times are then taken out, dividing the video file into multiple sub video files of equal playing duration. A combined sketch of steps 1011 and 1012 is given after step 1012 below.
Step 1012, acquiring multiple frames of images of the start time period of each sub video file.
Only the frames of the start time period of each sub video file are acquired. Frames from other time periods of each sub video file could be used as well, but frames of the start time period are easy to acquire and avoid the problem of an insufficient number of frames.
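A minimal sketch of steps 1011 and 1012, again assuming OpenCV; the segment count and frames-per-start values below are illustrative assumptions:

```python
import cv2  # assumed library; num_segments and frames_per_start are illustrative

def segment_bounds(total_sec, num_segments):
    """Step 1011: split the total playing duration into equal (start, end) spans."""
    sub_len = total_sec / num_segments
    return [(i * sub_len, (i + 1) * sub_len) for i in range(num_segments)]

def start_period_frames(video_path, num_segments=10, frames_per_start=5):
    """Step 1012: take a few frames from the start time period of each segment."""
    cap = cv2.VideoCapture(video_path)
    fps = cap.get(cv2.CAP_PROP_FPS) or 25.0
    total_sec = cap.get(cv2.CAP_PROP_FRAME_COUNT) / fps
    frames = []
    for start, _end in segment_bounds(total_sec, num_segments):
        cap.set(cv2.CAP_PROP_POS_MSEC, start * 1000.0)  # seek to segment start
        for _ in range(frames_per_start):
            ok, frame = cap.read()
            if not ok:
                break
            frames.append(frame)
    cap.release()
    return frames
```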
Step 102, extracting feature points from each frame of image in the multiple frames of images to obtain a feature point set.
From the multiple frames of images obtained from the video file, one or more feature points are extracted from each frame of image by a feature point extraction algorithm, thereby obtaining the feature point set of the multiple frames of images.
The feature point extraction algorithm may be, for example, the scale-invariant feature transform (SIFT) or the speeded-up robust features (SURF) algorithm.
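As a sketch only, SIFT extraction with OpenCV might look as follows (SIFT is available in the main OpenCV package from version 4.4; SURF would require the contrib build):

```python
import cv2

def extract_feature_points(frames):
    """Detect SIFT keypoints in every frame and pool them into one feature point set."""
    sift = cv2.SIFT_create()
    feature_point_set = []
    for frame in frames:
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        keypoints, descriptors = sift.detectAndCompute(gray, None)
        if descriptors is not None:
            feature_point_set.extend(descriptors)  # one 128-d vector per point
    return feature_point_set
```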
Step 103, selecting feature points related to scene features from the feature point set to obtain a first feature point set, and selecting feature points related to object features from the feature point set to obtain a second feature point set.
After the feature point set of the multiple frames of images is obtained, the feature points related to scene features are selected from it, and all such feature points form the first feature point set. Likewise, the feature points related to object features are selected, and all such feature points form the second feature point set.
Step 104, classifying, by the classification model, the video file according to the first feature point set and the second feature point set.
The first feature point set and the second feature point set corresponding to the multiple frames of images are input into the classification model, and the classification model classifies the video file according to the two sets. Because a video file is dynamic, images of adjacent frames are correlated, so using images of adjacent frames improves the accuracy of video classification. The adjacent frames also reveal how the feature points change over time, which further improves accuracy; a sports video file, for example, shows characteristic changes in motion feature points.
Referring to fig. 4, fig. 4 is a third flowchart illustrating a video classification method according to an embodiment of the present disclosure. The classification model provided by the embodiment of the application classifies the video files according to the first feature point set and the second feature point set, and the specific process can be as follows:
step 1041, obtaining a plurality of classification labels of the classification model, wherein the classification labels respectively correspond to a video category.
First, the classification labels of the classification model are obtained, that is, the categories into which the classification model can sort video files. The classification labels may include movie video, sports video, education video, fun video, and the like; they may also include finer labels such as basketball video, football video, and sprint video. The classification labels may form multiple levels: a broader first-level label followed by several second-level labels, for instance a first-level label of sports video with a second-level label of basketball video. More levels, such as three or four, may of course be included.
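One plain representation of such a multi-level label hierarchy; the tree below is a hypothetical example assembled only from the categories named in the text:

```python
# Hypothetical two-level classification label tree; categories from the text.
CLASSIFICATION_LABELS = {
    "movie video": [],
    "sports video": ["basketball video", "football video", "sprint video"],
    "education video": [],
    "fun video": [],
}
```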
Step 1042, inputting the first feature point set and the second feature point set into the classification model to obtain the probability value of the video file matching each classification label.
The first feature point set and the second feature point set are input into the classification model, and the classification model predicts, from the input information, the probability value of the video file matching each classification label.
Step 1043, obtaining the classification labels whose probability values are greater than a preset probability value as target classification labels.
The probability value of the video file matching each classification label is compared with the preset probability value in turn, and the classification labels whose probability values are greater than the preset probability value are selected.
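A minimal sketch of steps 1042 and 1043; the probability values and the 0.5 preset threshold are illustrative assumptions:

```python
def select_target_labels(label_probs, preset_probability=0.5):
    """Keep every classification label whose matching probability exceeds the preset value."""
    return [label for label, p in label_probs.items() if p > preset_probability]

# Example with made-up model outputs:
# select_target_labels({"sports video": 0.83, "movie video": 0.12})
# -> ["sports video"]
```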
Step 1044, classifying the video file according to the target classification labels.
The video file is classified according to the target classification label. If the target classification label is the sports video label, the video file is sorted into sports videos. Concretely, the video file may be stored in the storage space allocated to sports videos. Alternatively, a set may be maintained for each category of video files, storing the storage link addresses of the files in that category; when video files are viewed, the videos are displayed in different areas according to their categories, for example in different folders, one folder per category. Note that a video file may match several target classification labels and thus belong to several categories; each category then has its own set, holding the storage link addresses of all files in that category. The physical storage therefore does not need to be arranged together, and no space is wasted.
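A sketch of the link-address index described above, under the assumption that each category maps to a set of storage link addresses rather than to a physical storage area:

```python
from collections import defaultdict

# Each category keeps only storage link addresses, so video files need not be
# physically gathered into one contiguous storage space.
category_index = defaultdict(set)

def file_video(link_address, target_labels):
    """Record a video file under every target classification label it matched."""
    for label in target_labels:
        category_index[label].add(link_address)
```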
Referring to fig. 5, fig. 5 is a fourth flowchart illustrating a video classification method according to an embodiment of the present disclosure. In the video classification method provided in the embodiment of the present application, after the step of classifying the video file according to the first feature point set and the second feature point set by the classification model, the method further includes:
and 1051, identifying the face image of each frame of image in the multi-frame images to obtain a face image set.
Face recognition is performed on each frame of image to obtain the corresponding face images. The face images of the multiple frames of images are then combined into a face image set, which contains every face image appearing in the multiple frames of images.
Step 1052, calculating the frequency of occurrence of each face image in the face image set, and determining the face image with the highest frequency of occurrence as the target face image.
The frequency of occurrence of each face image in the face image set, that is, the number of frames in which each face appears, is calculated, and the face image with the highest frequency of occurrence is determined as the target face image.
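A sketch of step 1052, with the face-matching step abstracted behind a hypothetical face_id callable; the patent does not specify how two face images are recognized as the same person:

```python
from collections import Counter

def most_frequent_face(face_image_set, face_id):
    """Return the identity key of the face that appears in the most frames.

    face_id is a hypothetical callable mapping a face image to an identity
    key, e.g. via a face-embedding model plus nearest-neighbour matching.
    """
    counts = Counter(face_id(img) for img in face_image_set)
    target_id, _count = counts.most_common(1)[0]
    return target_id
```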
Step 1053, identifying the target face image to obtain the face label information of the target face image.
The target face image is recognized to obtain the corresponding face label information, such as the name of the person in the target face image.
Step 1054, adding the face label information to the file name of the video file.
The face label information is added to the file name of the video file. For example, if the target face image is that of Liu Dehua, face label information such as the name Liu Dehua is added to the file name of the video file, and the video is classified by lead actor into the movies of Liu Dehua.
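A minimal sketch of step 1054, assuming the face label is a filesystem-safe string:

```python
from pathlib import Path

def add_face_label_to_filename(video_path, face_label):
    """Append face label information (e.g. an actor's name) to the video file name."""
    p = Path(video_path)
    new_path = p.with_name(f"{p.stem}_{face_label}{p.suffix}")
    p.rename(new_path)  # assumes face_label contains no path separators
    return new_path
```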
Referring to fig. 6, fig. 6 is a fifth flowchart illustrating a video classification method according to an embodiment of the present disclosure. In the video classification method provided in the embodiment of the present application, after the step of classifying the video file according to the first feature point set and the second feature point set by the classification model, the method further includes:
and 1055, acquiring first label information according to the first feature point set.
And 1056, acquiring second label information according to the second feature point set.
Step 1057, add the first tag information and the second tag information to the file name of the video file.
The first label information is acquired according to the first feature point set, which corresponds to the scene features; for example, if the scene features correspond to the Lakers' basketball court, the first label information may be "Lakers home court". The second label information is acquired according to the second feature point set, which corresponds to the object features; for example, if the object feature points correspond to a car, the second label information may be a car label.
Referring to fig. 7, fig. 7 is a sixth flowchart illustrating a video classification method according to an embodiment of the present application. The classification method provided by the embodiment of the application can have the following specific processes:
Step 201, acquiring multiple frames of images of a video file.
Step 202, extracting feature points from each frame of image in the multiple frames of images to obtain multiple sub feature point sets corresponding to the multiple frames of images.
Step 203, calculating the frequency of occurrence of each feature point in the multiple sub feature point sets.
Step 204, combining the multiple sub feature point sets into a feature point set, and obtaining the weight value corresponding to each feature point according to its frequency of occurrence.
Step 205, selecting feature points related to scene features from the feature point set to obtain a first feature point set, and selecting feature points related to object features from the feature point set to obtain a second feature point set.
Step 206, multiplying the first feature point set and the second feature point set by the corresponding weight values and inputting the result into the classification model.
Step 207, classifying, by the classification model, the video file according to the multiplied first feature point set and second feature point set.
Different weight values are set according to the frequency of occurrence of the feature points: the more frequently a feature point occurs, the more important it is; the less frequently it occurs, the lower its importance. Raising the proportion of frequently occurring feature points improves the accuracy of video classification.
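A sketch of the frequency-based weighting of steps 203 and 204; treating feature points as hashable keys (e.g. quantized descriptors) and normalizing by the maximum count are illustrative assumptions:

```python
from collections import Counter

def frequency_weights(sub_feature_point_sets):
    """Weight each feature point by how often it occurs across the sub-sets."""
    counts = Counter(pt for subset in sub_feature_point_sets for pt in subset)
    max_count = max(counts.values())
    # A point seen in every sub-set gets weight 1.0; rarer points get less.
    return {pt: c / max_count for pt, c in counts.items()}
```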
Referring to fig. 8, fig. 8 is a seventh flowchart illustrating a video classification method according to an embodiment of the present disclosure. The classification method provided by the embodiment of the application can have the following specific processes:
Step 301, acquiring multiple frames of images of a video file.
Step 302, acquiring a foreground image and a background image of each frame of image.
Step 303, acquiring foreground feature points according to the foreground image, and background feature points according to the background image.
Step 304, extracting feature points from each frame of image in the multiple frames of images to obtain a feature point set.
Step 305, selecting feature points related to scene features from the feature point set to obtain a first feature point set, and selecting feature points related to object features from the feature point set to obtain a second feature point set.
Step 306, converting the first feature point set and the second feature point set into a first feature point vector and a second feature point vector.
Step 307, setting the weight of the foreground feature points to be greater than the weight of the background feature points.
Step 308, multiplying the first feature point vector and the second feature point vector by the corresponding weight values and inputting them into the classification model.
Step 309, classifying, by the classification model, the video file according to the multiplied first feature point vector and second feature point vector.
The feature points of the foreground image are more important than those of the background image. Raising the proportion of the foreground feature points improves the accuracy of video classification.
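A sketch of the weighting of steps 307 and 308; the 1.0 and 0.5 values are assumptions, since the patent only requires the foreground weight to exceed the background weight:

```python
import numpy as np

def weight_feature_vectors(vectors, is_foreground, fg_weight=1.0, bg_weight=0.5):
    """Scale each feature point vector by its foreground/background weight.

    vectors: (N, D) array of feature point vectors.
    is_foreground: length-N boolean array marking foreground feature points.
    """
    weights = np.where(np.asarray(is_foreground), fg_weight, bg_weight)
    return np.asarray(vectors, dtype=float) * weights[:, None]
```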
As can be seen from the above, the video classification method provided in the embodiments of the present application acquires multiple frames of images of a video file; extracts feature points from each frame of image in the multiple frames of images to obtain a feature point set; selects feature points related to scene features from the feature point set to obtain a first feature point set, and feature points related to object features to obtain a second feature point set; and classifies the video file with the classification model according to the first feature point set and the second feature point set. Because two feature point sets, related to scene features and to object features respectively, are obtained from the video file and then input into the classification model, the classification is based on richer input data and is therefore more accurate.
Referring to fig. 9, fig. 9 is a schematic view illustrating a first structure of a video classification apparatus according to an embodiment of the present disclosure. The video classification apparatus 500 is applied to an electronic device, and the video classification apparatus 500 includes an image obtaining module 501, a feature point set obtaining module 502, a feature point set selecting module 503, and a processing module 504. Wherein:
the image obtaining module 501 is configured to obtain a multi-frame image of a video file.
A video file is a multimedia file containing real-time audio and video information, and is one of the important forms of Internet multimedia content. Common container formats include AVI, MPEG, and the like.
The module may obtain consecutive frames of the video file, or sample frames at a preset frequency, for example one frame every 1 second, every 2 seconds, every 1 minute, or every 5 minutes; it may also take a run of consecutive frames every 1, 5, or 10 minutes and combine the runs into the final set of frames.
The feature point set obtaining module 502 is configured to extract feature points of each frame of image in multiple frames of images to obtain a feature point set.
From the multiple frames of images obtained from the video file, one or more feature points are extracted from each frame of image by a feature point extraction algorithm, thereby obtaining the feature point set of the multiple frames of images.
The feature point extraction algorithm may be, for example, the scale-invariant feature transform (SIFT) or the speeded-up robust features (SURF) algorithm.
The feature point set selecting module 503 is configured to select feature points related to scene features from the feature point set to obtain a first feature point set, and select feature points related to object features from the feature point set to obtain a second feature point set.
After the feature point set of the multiple frames of images is obtained, the feature points related to scene features are selected from it, and all such feature points form the first feature point set; likewise, the feature points related to object features are selected, and all such feature points form the second feature point set.
And the processing module 504 is configured to classify the video file according to the first feature point set and the second feature point set by the classification model.
The first feature point set and the second feature point set corresponding to the multiple frames of images are input into the classification model, and the classification model classifies the video file according to the two sets. Because a video file is dynamic, images of adjacent frames are correlated, so using images of adjacent frames improves the accuracy of video classification. The adjacent frames also reveal how the feature points change over time, which further improves accuracy; a sports video file, for example, shows characteristic changes in motion feature points.
Referring to fig. 10, fig. 10 is a schematic view illustrating a second structure of a video classification apparatus according to an embodiment of the present application. In this embodiment, the feature point set obtaining module 502 includes a sub-feature point set submodule 5021, a calculating submodule 5022, and a weight value obtaining submodule 5023. Wherein:
the sub feature point set submodule 5021 is used for extracting feature points from each frame of image in the multiple frames of images to obtain multiple sub feature point sets corresponding to the multiple frames of images;
the calculating submodule 5022 is used for calculating the frequency of occurrence of each feature point in the multiple sub feature point sets;
the weight value obtaining submodule 5023 is used for combining the multiple sub feature point sets into a feature point set, and obtaining the weight value corresponding to each feature point according to its frequency of occurrence;
the processing module 504 is further configured to multiply the first feature point set and the second feature point set by the corresponding weight values and input the result into the classification model.
Different weight values are set according to the frequency of occurrence of the feature points: the more frequently a feature point occurs, the more important it is; the less frequently it occurs, the lower its importance. Raising the proportion of frequently occurring feature points improves the accuracy of video classification.
Referring to fig. 11, fig. 11 is a schematic diagram illustrating a third structure of a video classification apparatus according to an embodiment of the present application. In this embodiment, the processing module 504 includes a classification tag obtaining sub-module 5041, a probability value obtaining sub-module 5042, a classification tag selecting sub-module 5043, and a classification sub-module 5044. Wherein:
the classification label obtaining sub-module 5041 is configured to obtain multiple classification labels of the classification model, where each classification label corresponds to one video category.
First, the classification labels of the classification model are obtained, that is, the categories into which the classification model can sort video files. The classification labels may include movie video, sports video, education video, fun video, and the like; they may also include finer labels such as basketball video, football video, and sprint video. The classification labels may form multiple levels: a broader first-level label followed by several second-level labels, for instance a first-level label of sports video with a second-level label of basketball video. More levels, such as three or four, may of course be included.
And the probability value obtaining submodule 5042 is configured to input the first feature point set and the second feature point set into the classification model, so as to obtain a probability value of the video file matching each classification label.
The first feature point set and the second feature point set are input into the classification model, and the classification model predicts, from the input information, the probability value of the video file matching each classification label.
The classification label selection submodule 5043 is configured to obtain a classification label with a probability value greater than a preset probability value as a target classification label.
The probability value of the video file matching each classification label is compared with the preset probability value in turn, and the classification labels whose probability values are greater than the preset probability value are selected.
The classification sub-module 5044 is configured to classify the video file according to the target classification tag.
The video file is classified according to the target classification label. If the target classification label is the sports video label, the video file is sorted into sports videos. Concretely, the video file may be stored in the storage space allocated to sports videos. Alternatively, a set may be maintained for each category of video files, storing the storage link addresses of the files in that category; when video files are viewed, the videos are displayed in different areas according to their categories, for example in different folders, one folder per category. Note that a video file may match several target classification labels and thus belong to several categories; each category then has its own set, holding the storage link addresses of all files in that category. The physical storage therefore does not need to be arranged together, and no space is wasted.
In some embodiments, the apparatus further includes a foreground/background image acquisition module and a feature point extraction module. The foreground/background image acquisition module is used for acquiring a foreground image and a background image of each frame of image. The feature point extraction module is used for acquiring foreground feature points according to the foreground image and background feature points according to the background image.
The processing module includes a conversion submodule, a weight value setting submodule, a merging submodule, and a processing submodule. The conversion submodule is used for converting the first feature point set and the second feature point set into a first feature point vector and a second feature point vector. The weight value setting submodule is used for setting the weight of the foreground feature points to be greater than the weight of the background feature points. The merging submodule is used for multiplying the first feature point vector and the second feature point vector by the corresponding weight values and inputting them into the classification model. The processing submodule is used for classifying, by the classification model, the video file according to the multiplied first feature point vector and second feature point vector.
The feature points of the foreground image are more important than those of the background image. Raising the proportion of the foreground feature points improves the accuracy of video classification.
In some embodiments, the apparatus further comprises a face image set acquisition module, a target face image acquisition module, a face tag information acquisition module, and a file name renaming module. The face image set acquisition module is used for identifying the face image of each frame of image in the multi-frame images to obtain a face image set. And the target face image acquisition module is used for calculating the occurrence frequency of each face image in the face image set and determining the face image with the highest occurrence frequency as the target face image. And the face label information acquisition module is used for identifying the target face image and obtaining the face label information of the target face image. And the file name renaming module is used for adding the face label information into the file name of the video file.
Face recognition is performed on each frame of image to obtain the corresponding face images, which are combined into a face image set containing every face image appearing in the multiple frames of images. The target face image is recognized to obtain the corresponding face label information, such as the name of the person in the image. The face label information is then added to the file name of the video file; for example, if the target face image is that of Liu Dehua, face label information such as the name Liu Dehua is added to the file name, and the video is classified by lead actor into the movies of Liu Dehua.
In some embodiments, the apparatus further comprises a first tag information obtaining module, a second tag information obtaining module, and a file name renaming module. The first tag information acquisition module is used for acquiring first tag information according to the first feature point set. And the second label information acquisition module is used for acquiring second label information according to the second feature point set. And the file name renaming module is used for adding the first label information and the second label information into the file name of the video file.
The first label information is acquired according to the first feature point set, which corresponds to the scene features; for example, if the scene features correspond to the Lakers' basketball court, the first label information may be "Lakers home court". The second label information is acquired according to the second feature point set, which corresponds to the object features; for example, if the object feature points correspond to a car, the second label information may be a car label.
In some embodiments, the image acquisition module includes a partitioning sub-module and an image acquisition sub-module. The division submodule is used for dividing the video file into a plurality of sub video files with equal playing time. And the image acquisition submodule is used for acquiring the multi-frame images of the initial time period of each sub video file.
First, the total playing duration of the video file and the number of sub video files to be divided are obtained; dividing the total duration by that number gives the playing duration of each sub video file, from which the start and end times of each sub video file follow, and the video file is divided accordingly into multiple sub video files of equal playing duration. Only the frames of the start time period of each sub video file are acquired; frames of other time periods could be used as well, but frames of the start time period are easy to acquire and avoid the problem of an insufficient number of frames.
As can be seen from the above, the video classification apparatus provided in the embodiments of the present application acquires multiple frames of images of a video file; extracts feature points from each frame of image to obtain a feature point set; selects feature points related to scene features from the feature point set to obtain a first feature point set, and feature points related to object features to obtain a second feature point set; and classifies the video file with the classification model according to the first feature point set and the second feature point set. Because two feature point sets, related to scene features and to object features respectively, are obtained from the video file and then input into the classification model, the classification is based on richer input data and is therefore more accurate.
In specific implementation, the above modules may be implemented as independent entities, or may be combined arbitrarily to be implemented as the same or several entities, and specific implementation of the above modules may refer to the foregoing method embodiments, which are not described herein again.
In the embodiment of the present application, the video classification apparatus and the video classification method in the above embodiments belong to the same concept, and any method provided in the video classification method embodiment may be run on the video classification apparatus, and a specific implementation process thereof is described in detail in the embodiment of the video classification method, and is not described herein again.
The embodiment of the application also provides the electronic equipment. Referring to fig. 12, the electronic device 600 includes a processor 601 and a memory 602. The processor 601 is electrically connected to the memory 602.
The processor 601 is the control center of the electronic device 600; it connects the various parts of the entire electronic device using various interfaces and lines, and performs the various functions of the electronic device 600 and processes data by running or loading the computer program stored in the memory 602 and calling the data stored in the memory 602, thereby monitoring the electronic device 600 as a whole.
The memory 602 may be used to store software programs and units, and the processor 601 executes various functional applications and data processing by running the computer programs and units stored in the memory 602. The memory 602 may mainly include a program storage area and a data storage area, where the program storage area may store an operating system, a computer program required by at least one function (such as a sound playing function or an image playing function), and the like, and the data storage area may store data created according to the use of the electronic device, and the like. Further, the memory 602 may include high-speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other non-volatile solid-state storage device. Accordingly, the memory 602 may also include a memory controller to provide the processor 601 with access to the memory 602.
In the embodiment of the present application, the processor 601 in the electronic device 600 loads instructions corresponding to one or more processes of the computer program into the memory 602 according to the following steps, and the processor 601 runs the computer program stored in the memory 602, thereby implementing various functions as follows:
acquiring multiple frames of images of a video file;
extracting feature points from each frame of image in the multiple frames of images to obtain a feature point set;
selecting feature points related to scene features from the feature point set to obtain a first feature point set, and selecting feature points related to object features from the feature point set to obtain a second feature point set;
and classifying, by the classification model, the video file according to the first feature point set and the second feature point set.
In some embodiments, the processor 601 is further configured to perform the following steps:
acquiring multiple classification labels of the classification model, where each classification label corresponds to one video category;
inputting the first feature point set and the second feature point set into the classification model to obtain the probability value of the video file matching each classification label;
acquiring the classification labels whose probability values are greater than a preset probability value as target classification labels;
and classifying the video file according to the target classification labels.
In some embodiments, the processor 601 is further configured to perform the following steps:
extracting feature points from each frame of image in the multiple frames of images to obtain multiple sub feature point sets corresponding to the multiple frames of images;
calculating the frequency of occurrence of each feature point in the multiple sub feature point sets;
combining the multiple sub feature point sets into a feature point set, and obtaining the weight value corresponding to each feature point according to its frequency of occurrence;
multiplying the first feature point set and the second feature point set by the corresponding weight values and inputting the result into the classification model;
and classifying, by the classification model, the video file according to the multiplied first feature point set and second feature point set.
The processor 601 is further configured to perform the following steps:
acquiring a foreground image and a background image of each frame of image;
acquiring foreground feature points according to the foreground image, and background feature points according to the background image;
converting the first feature point set and the second feature point set into a first feature point vector and a second feature point vector;
setting the weight of the foreground feature points to be greater than the weight of the background feature points;
multiplying the first feature point vector and the second feature point vector by the corresponding weight values and inputting them into the classification model;
and classifying, by the classification model, the video file according to the multiplied first feature point vector and second feature point vector.
In some embodiments, the processor 601 is further configured to perform the following steps:
identifying the face images in each frame of image in the multiple frames of images to obtain a face image set;
calculating the frequency of occurrence of each face image in the face image set, and determining the face image with the highest frequency of occurrence as the target face image;
identifying the target face image to obtain the face label information of the target face image;
and adding the face label information to the file name of the video file.
In some embodiments, the processor 601 is further configured to perform the following steps:
acquiring first label information according to the first feature point set;
acquiring second label information according to the second feature point set;
and adding the first label information and the second label information into the file name of the video file.
In some embodiments, the processor 601 is further configured to perform the following steps:
dividing the video file into multiple sub video files of equal playing duration;
acquiring multiple frames of images of the start time period of each sub video file.
As can be seen from the above, the electronic device provided in the embodiments of the present application acquires multiple frames of images of a video file; extracts feature points from each frame of image to obtain a feature point set; selects feature points related to scene features from the feature point set to obtain a first feature point set, and feature points related to object features to obtain a second feature point set; and classifies the video file with the classification model according to the first feature point set and the second feature point set. Because two feature point sets, related to scene features and to object features respectively, are obtained from the video file and then input into the classification model, the classification is based on richer input data and is therefore more accurate.
Referring also to fig. 13, in some embodiments, the electronic device 600 may further include: a display 603, a radio frequency circuit 604, an audio circuit 605, and a power supply 606. The display 603, the rf circuit 604, the audio circuit 605 and the power supply 606 are electrically connected to the processor 601, respectively.
The display 603 may be used to display information entered by or provided to the user, as well as various graphical user interfaces, which may be made up of graphics, text, icons, video, and any combination thereof. The display 603 may include a display panel, which in some embodiments may be configured in the form of a liquid crystal display (LCD), an organic light-emitting diode (OLED) display, or the like.
The rf circuit 604 may be used for transceiving rf signals to establish wireless communication with a network device or other electronic devices through wireless communication, and for transceiving signals with the network device or other electronic devices.
The audio circuit 605 may be used to provide an audio interface between the user and the electronic device through a speaker, microphone.
The power supply 606 may be used to power various components of the electronic device 600. In some embodiments, the power supply 606 may be logically connected to the processor 601 through a power management system, so as to implement functions of managing charging, discharging, and power consumption management through the power management system.
Although not shown in fig. 13, the electronic device 600 may further include a camera, a Bluetooth unit, and the like, which are not described in detail herein.
It can be understood that the electronic device of the embodiment of the present application may be a terminal device such as a smart phone or a tablet computer.
An embodiment of the present application further provides a storage medium storing a computer program which, when run on a computer, causes the computer to execute the video classification method in any of the above embodiments, for example: acquiring multi-frame images of a video file; extracting the feature points of each frame of image in the multi-frame images to obtain a feature point set; selecting feature points related to scene features from the feature point set to obtain a first feature point set, and selecting feature points related to object features from the feature point set to obtain a second feature point set; and classifying, by the classification model, the video file according to the first feature point set and the second feature point set.
In the embodiment of the present application, the storage medium may be a magnetic disk, an optical disk, a Read Only Memory (ROM), a Random Access Memory (RAM), or the like.
In the foregoing embodiments, the descriptions of the respective embodiments have respective emphasis, and for parts that are not described in detail in a certain embodiment, reference may be made to related descriptions of other embodiments.
It should be noted that, for the video classification method of the embodiment of the present application, it can be understood by a person skilled in the art that all or part of the process for implementing the video classification method of the embodiment of the present application can be implemented by controlling related hardware through a computer program, where the computer program can be stored in a computer-readable storage medium, such as a memory of an electronic device, and executed by at least one processor in the electronic device, and during the execution process, the process of the embodiment of the video classification method can be included. The storage medium may be a magnetic disk, an optical disk, a read-only memory, a random access memory, etc.
In the video classification device according to the embodiment of the present application, the functional units may be integrated into one processing chip, each unit may exist alone physically, or two or more units may be integrated into one unit. The integrated unit can be realized in the form of hardware or in the form of a software functional unit. The integrated unit, if implemented as a software functional unit and sold or used as a stand-alone product, may also be stored in a computer-readable storage medium, such as a read-only memory, a magnetic disk, or an optical disk.
The foregoing detailed description is directed to the video classification method, device, storage medium, and electronic device provided in the embodiments of the present application. Specific examples are used herein to explain the principles and implementations of the present application, and the descriptions of the foregoing embodiments are only intended to help understand the method and core ideas of the present application. Meanwhile, those skilled in the art may, according to the idea of the present application, vary the specific embodiments and the application scope. In summary, the content of this specification should not be construed as a limitation of the present application.

Claims (11)

1. A video classification method, applied to an electronic device, the method comprising:
acquiring multi-frame images of a video file;
extracting the feature points of each frame of image in the multi-frame images to obtain a plurality of sub-feature point sets corresponding to the multi-frame images;
calculating the occurrence frequency of each feature point in the plurality of sub-feature point sets;
combining the plurality of sub-feature point sets to form a feature point set, and obtaining a weight value corresponding to each feature point according to the occurrence frequency of each feature point;
selecting feature points related to scene features from the feature point set to obtain a first feature point set, and selecting feature points related to object features from the feature point set to obtain a second feature point set;
multiplying the first feature point set and the second feature point set by the corresponding weight values and inputting the weighted sets into a classification model;
and the classification model classifies the video file according to the weighted first feature point set and second feature point set.
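As a non-authoritative illustration of the frequency-derived weights recited in claim 1, consider the following Python sketch; it assumes feature points have been quantized to discrete ids (for instance with a bag-of-visual-words codebook), which the claim itself does not require:

```python
from collections import Counter

def merge_and_weight(sub_feature_sets):
    """Merge per-frame feature point sets and derive a weight for each
    feature point from its occurrence frequency across the frames.

    sub_feature_sets: one iterable of discrete feature ids per frame.
    """
    counts = Counter(f for frame in sub_feature_sets for f in frame)
    total = sum(counts.values())
    return {feature: count / total for feature, count in counts.items()}

weights = merge_and_weight([["sky", "sand"], ["sky", "dog"], ["sky"]])
# {'sky': 0.6, 'sand': 0.2, 'dog': 0.2} -- the recurring point weighs most.
```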
2. The video classification method according to claim 1, wherein the step of classifying the video file according to the first feature point set and the second feature point set by the classification model comprises:
obtaining a plurality of classification labels of the classification model, each classification label corresponding to one video category;
inputting the first feature point set and the second feature point set into the classification model to obtain a probability value of the video file matching each classification label;
acquiring each classification label whose probability value is greater than a preset probability value as a target classification label;
and classifying the video file according to the target classification label.
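The threshold-based label selection of claim 2 admits an equally small sketch; the label set and threshold below are illustrative assumptions:

```python
CLASS_LABELS = ["sports", "travel", "food", "music"]  # assumed label set
PROB_THRESHOLD = 0.5                                  # assumed preset value

def target_labels(probabilities):
    """Keep each classification label whose matching probability exceeds
    the preset threshold; the video is then filed under those labels."""
    return [label for label, p in zip(CLASS_LABELS, probabilities)
            if p > PROB_THRESHOLD]

print(target_labels([0.80, 0.10, 0.55, 0.05]))
# ['sports', 'food']
```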
3. The video classification method of claim 1, further comprising:
performing face recognition on each frame of image in the multi-frame images to obtain a face image set;
calculating the occurrence frequency of each face image in the face image set, and determining the face image with the highest occurrence frequency as a target face image;
identifying the target face image to obtain face label information of the target face image;
and adding the face label information to the file name of the video file.
4. The video classification method of claim 1, further comprising:
acquiring first label information according to the first feature point set;
acquiring second label information according to the second feature point set;
and adding the first label information and the second label information to the file name of the video file.
5. The video classification method according to claim 1, wherein the step of acquiring multi-frame images of a video file comprises:
dividing the video file into a plurality of sub video files of equal playing duration;
and acquiring multiple frames of images from the starting time period of each sub video file.
6. A video classification method, applied to an electronic device, the method comprising:
acquiring multi-frame images of a video file;
extracting the feature points of each frame of image in the multi-frame images to obtain a feature point set, wherein a foreground image and a background image of each frame of image are obtained, foreground feature points are acquired from the foreground image, and background feature points are acquired from the background image;
selecting feature points related to scene features from the feature point set to obtain a first feature point set, and selecting feature points related to object features from the feature point set to obtain a second feature point set;
converting the first feature point set and the second feature point set into a first feature point vector and a second feature point vector;
setting the weight of the foreground feature points to be greater than the weight of the background feature points;
multiplying the first feature point vector and the second feature point vector by the corresponding weight values and inputting the weighted vectors into a classification model;
and the classification model classifies the video file according to the weighted first feature point vector and second feature point vector.
7. A video classification device, applied to an electronic device, the device comprising:
an image acquisition module, used for acquiring multi-frame images of a video file;
a feature point set acquisition module, used for extracting the feature points of each frame of image in the multi-frame images to obtain a feature point set, the feature point set acquisition module comprising a sub-feature point set sub-module, a calculation sub-module, and a weight value acquisition sub-module;
the sub-feature point set sub-module being used for extracting the feature points of each frame of image in the multi-frame images to obtain a plurality of sub-feature point sets corresponding to the multi-frame images;
the calculation sub-module being used for calculating the occurrence frequency of each feature point in the plurality of sub-feature point sets;
the weight value acquisition sub-module being used for combining the plurality of sub-feature point sets to form the feature point set, and obtaining a weight value corresponding to each feature point according to the occurrence frequency of each feature point;
a feature point set selection module, used for selecting feature points related to scene features from the feature point set to obtain a first feature point set, and selecting feature points related to object features from the feature point set to obtain a second feature point set;
and a processing module, used for multiplying the first feature point set and the second feature point set by the corresponding weight values, inputting the weighted sets into a classification model, and classifying the video file according to the weighted first feature point set and second feature point set by using the classification model.
8. The video classification device according to claim 7, wherein the processing module comprises:
a classification label acquisition submodule, used for acquiring a plurality of classification labels of the classification model, each classification label corresponding to one video category;
a probability value acquisition submodule, used for inputting the first feature point set and the second feature point set into the classification model to obtain a probability value of the video file matching each classification label;
a classification label selection submodule, used for acquiring each classification label whose probability value is greater than a preset probability value as a target classification label;
and a classification submodule, used for classifying the video file according to the target classification label.
9. A video classification device, applied to an electronic device, the device comprising:
an image acquisition module, used for acquiring multi-frame images of a video file;
a feature point set acquisition module, used for extracting the feature points of each frame of image in the multi-frame images to obtain a feature point set;
a foreground and background image acquisition module, used for acquiring a foreground image and a background image of each frame of image;
a feature point extraction module, used for acquiring foreground feature points from the foreground image and background feature points from the background image;
a feature point set selection module, used for selecting feature points related to scene features from the feature point set to obtain a first feature point set, and selecting feature points related to object features from the feature point set to obtain a second feature point set;
and a processing module, used for classifying the video file according to the first feature point set and the second feature point set by means of the classification model, the processing module comprising a conversion sub-module, a weight value setting sub-module, a merging sub-module, and a processing sub-module;
the conversion sub-module being used for converting the first feature point set and the second feature point set into a first feature point vector and a second feature point vector;
the weight value setting sub-module being used for setting the weight of the foreground feature points to be greater than the weight of the background feature points;
the merging sub-module being used for multiplying the first feature point vector and the second feature point vector by the corresponding weight values and then inputting the weighted vectors into the classification model;
and the processing sub-module being used for classifying the video file according to the weighted first feature point vector and second feature point vector by using the classification model.
10. A storage medium having a computer program stored thereon, wherein, when the computer program is run on a computer, the computer program causes the computer to execute the video classification method according to any one of claims 1 to 6.
11. An electronic device comprising a processor and a memory, the memory storing a computer program, wherein the processor is configured to execute the video classification method according to any one of claims 1 to 6 by invoking the computer program.
CN201711464317.0A 2017-12-28 2017-12-28 Video classification method and device, storage medium and electronic equipment Expired - Fee Related CN108090497B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201711464317.0A CN108090497B (en) 2017-12-28 2017-12-28 Video classification method and device, storage medium and electronic equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201711464317.0A CN108090497B (en) 2017-12-28 2017-12-28 Video classification method and device, storage medium and electronic equipment

Publications (2)

Publication Number Publication Date
CN108090497A CN108090497A (en) 2018-05-29
CN108090497B true CN108090497B (en) 2020-07-07

Family

ID=62179811

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201711464317.0A Expired - Fee Related CN108090497B (en) 2017-12-28 2017-12-28 Video classification method and device, storage medium and electronic equipment

Country Status (1)

Country Link
CN (1) CN108090497B (en)

Families Citing this family (19)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108764208B (en) * 2018-06-08 2021-06-08 Oppo广东移动通信有限公司 Image processing method and device, storage medium and electronic equipment
CN108921040A (en) * 2018-06-08 2018-11-30 Oppo广东移动通信有限公司 Image processing method and device, storage medium, electronic equipment
CN110580428A (en) * 2018-06-08 2019-12-17 Oppo广东移动通信有限公司 image processing method, image processing device, computer-readable storage medium and electronic equipment
CN108804658B (en) * 2018-06-08 2022-06-10 Oppo广东移动通信有限公司 Image processing method and device, storage medium and electronic equipment
CN108875619B (en) * 2018-06-08 2021-09-07 Oppo广东移动通信有限公司 Video processing method and device, electronic equipment and computer readable storage medium
CN108921204B (en) * 2018-06-14 2023-12-26 平安科技(深圳)有限公司 Electronic device, picture sample set generation method, and computer-readable storage medium
CN108830235B (en) * 2018-06-21 2020-11-24 北京字节跳动网络技术有限公司 Method and apparatus for generating information
CN108881740B (en) * 2018-06-28 2021-03-02 Oppo广东移动通信有限公司 Image method and device, electronic equipment and computer readable storage medium
CN109272041B (en) * 2018-09-21 2021-10-22 联想(北京)有限公司 Feature point selection method and device
CN109740019A (en) * 2018-12-14 2019-05-10 上海众源网络有限公司 A kind of method, apparatus to label to short-sighted frequency and electronic equipment
CN109710802B (en) * 2018-12-20 2021-11-02 百度在线网络技术(北京)有限公司 Video classification method and device
CN110046278B (en) * 2019-03-11 2021-10-15 北京奇艺世纪科技有限公司 Video classification method and device, terminal equipment and storage medium
CN110163115B (en) * 2019-04-26 2023-10-13 腾讯科技(深圳)有限公司 Video processing method, device and computer readable storage medium
CN110222649B (en) * 2019-06-10 2020-12-18 北京达佳互联信息技术有限公司 Video classification method and device, electronic equipment and storage medium
CN110348367B (en) * 2019-07-08 2021-06-08 北京字节跳动网络技术有限公司 Video classification method, video processing device, mobile terminal and medium
CN110580508A (en) * 2019-09-06 2019-12-17 捷开通讯(深圳)有限公司 video classification method and device, storage medium and mobile terminal
CN111177466B (en) * 2019-12-23 2024-03-26 联想(北京)有限公司 Clustering method and device
CN111432138B (en) * 2020-03-16 2022-04-26 Oppo广东移动通信有限公司 Video splicing method and device, computer readable medium and electronic equipment
CN111553191A (en) * 2020-03-30 2020-08-18 深圳壹账通智能科技有限公司 Video classification method and device based on face recognition and storage medium

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102207966B (en) * 2011-06-01 2013-07-10 华南理工大学 Video content quick retrieving method based on object tag
CN105302906A (en) * 2015-10-29 2016-02-03 小米科技有限责任公司 Information labeling method and apparatus
US11055537B2 (en) * 2016-04-26 2021-07-06 Disney Enterprises, Inc. Systems and methods for determining actions depicted in media contents based on attention weights of media content frames
CN107194419A (en) * 2017-05-10 2017-09-22 百度在线网络技术(北京)有限公司 Video classification methods and device, computer equipment and computer-readable recording medium

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102819528A (en) * 2011-06-10 2012-12-12 中国电信股份有限公司 Method and device for generating video abstraction
CN106599907A (en) * 2016-11-29 2017-04-26 北京航空航天大学 Multi-feature fusion-based dynamic scene classification method and apparatus
CN107316035A (en) * 2017-08-07 2017-11-03 北京中星微电子有限公司 Object identifying method and device based on deep learning neutral net

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Multimedia Event Detection; Yongqing Sun et al.; TRECVID 2016; 2016-12-12; pp. 1-9 *

Also Published As

Publication number Publication date
CN108090497A (en) 2018-05-29

Similar Documents

Publication Publication Date Title
CN108090497B (en) Video classification method and device, storage medium and electronic equipment
CN108319489B (en) Application page starting method and device, storage medium and electronic equipment
CN111241340B (en) Video tag determining method, device, terminal and storage medium
US9247199B2 (en) Method of providing information-of-users' interest when video call is made, and electronic apparatus thereof
CN108965981B (en) Video playing method and device, storage medium and electronic equipment
CN108228776B (en) Data processing method, data processing device, storage medium and electronic equipment
WO2019105457A1 (en) Image processing method, computer device and computer readable storage medium
CN112269522A (en) Image processing method, image processing device, electronic equipment and readable storage medium
CN110022397B (en) Image processing method, image processing device, storage medium and electronic equipment
CN108174270B (en) Data processing method, data processing device, storage medium and electronic equipment
CN107133361B (en) Gesture recognition method and device and terminal equipment
CN111538852A (en) Multimedia resource processing method, device, storage medium and equipment
CN108052506B (en) Natural language processing method, device, storage medium and electronic equipment
CN109726726B (en) Event detection method and device in video
CN107688587B (en) Media information display method and device
CN110475139B (en) Video subtitle shielding method and device, storage medium and electronic equipment
CN108062405B (en) Picture classification method and device, storage medium and electronic equipment
CN112333554B (en) Multimedia data processing method and device, electronic equipment and storage medium
CN114638742A (en) Scene picture processing method and electronic equipment
CN108052525B (en) Method and device for acquiring audio information, storage medium and electronic equipment
CN113919852A (en) Product infringement judgment method and device, terminal equipment and storage medium
CN113593614A (en) Image processing method and device
CN112199561A (en) Application search method and device
CN114341946A (en) Identification method, identification device, electronic equipment and storage medium
CN108958929B (en) Method and device for applying algorithm library, storage medium and electronic equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
CB02 Change of applicant information

Address after: No. 18 Wusha Beach Road, Chang'an Town, Dongguan 523860, Guangdong Province

Applicant after: GUANGDONG OPPO MOBILE TELECOMMUNICATIONS Corp.,Ltd.

Address before: No. 18 Wusha Beach Road, Chang'an Town, Dongguan 523860, Guangdong Province

Applicant before: GUANGDONG OPPO MOBILE TELECOMMUNICATIONS Corp.,Ltd.

GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20200707