CN110879974A - Video classification method and device - Google Patents
- Publication number: CN110879974A (application number CN201911058829.6A)
- Authority
- CN
- China
- Prior art keywords
- video
- classified
- vector
- classification
- visual
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
- G—PHYSICS > G06—COMPUTING; CALCULATING OR COUNTING > G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING > G06V20/00—Scenes; Scene-specific elements > G06V20/40—Scenes; Scene-specific elements in video content > G06V20/46—Extracting features or characteristics from the video content, e.g. video fingerprints, representative shots or key frames > G06V20/47—Detecting features for summarising video content
- G—PHYSICS > G06—COMPUTING; CALCULATING OR COUNTING > G06F—ELECTRIC DIGITAL DATA PROCESSING > G06F16/00—Information retrieval; Database structures therefor; File system structures therefor > G06F16/30—Information retrieval of unstructured textual data > G06F16/35—Clustering; Classification > G06F16/353—Clustering; Classification into predefined classes
- G—PHYSICS > G06—COMPUTING; CALCULATING OR COUNTING > G06F—ELECTRIC DIGITAL DATA PROCESSING > G06F16/00—Information retrieval; Database structures therefor; File system structures therefor > G06F16/70—Information retrieval of video data > G06F16/75—Clustering; Classification
- G—PHYSICS > G06—COMPUTING; CALCULATING OR COUNTING > G06F—ELECTRIC DIGITAL DATA PROCESSING > G06F18/00—Pattern recognition > G06F18/20—Analysing > G06F18/24—Classification techniques
- G—PHYSICS > G06—COMPUTING; CALCULATING OR COUNTING > G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING > G06V20/00—Scenes; Scene-specific elements > G06V20/40—Scenes; Scene-specific elements in video content
Abstract
The invention discloses a video classification method and device, relating to the field of data processing, and aims to solve the problem that existing video classification processes are low in efficiency and accuracy. The technical scheme provided by the embodiments of the invention comprises the following steps: acquiring a feature vector of each key frame in a video to be classified according to the key frames in the video to be classified; acquiring a visual classification vector of the video to be classified according to the feature vectors of the key frames; acquiring a text classification vector of the video to be classified according to the texts contained in the image frames of the video to be classified; and substituting the visual classification vector and the text classification vector into a preset classification model to obtain the category of the video to be classified. The scheme can be applied to fields such as targeted video recommendation.
Description
Technical Field
The present invention relates to the field of data processing, and in particular, to a video classification method and apparatus.
Background
In recent years, with the rapid development of internet short-video platforms, videos of all kinds, such as movies, food, science and technology, travel, education and games, have shown explosive growth. These videos come from a wide range of sources, are cheap to produce, are uploaded in large daily volumes, and spread extremely quickly, which poses great challenges for video classification.
In the prior art, videos are generally classified manually or by extracting keywords from their titles. However, manual classification consumes a great deal of manpower and material resources and is therefore inefficient; moreover, a title may not accurately summarize the content of a video, so classification by keyword extraction has low accuracy. Purely visual classification methods, in turn, cannot handle categories that require semantic understanding, such as horoscopes, workplace management and emotion, which also results in low video classification accuracy.
Disclosure of Invention
In view of the above, the main objective of the present invention is to solve the problem of low efficiency and accuracy of the existing video classification method.
In one aspect, a video classification method provided in an embodiment of the present invention includes: acquiring a feature vector of each key frame in the video to be classified according to the key frames in the video to be classified; acquiring a visual classification vector of the video to be classified according to the feature vector of each key frame in the video to be classified; acquiring a text classification vector of the video to be classified according to texts contained in image frames in the video to be classified; and substituting the visual classification vector and the text classification vector into a preset classification model to obtain the category of the video to be classified.
On the other hand, an embodiment of the present invention provides a video classification apparatus, including:
the characteristic acquisition module is used for acquiring a characteristic vector of each key frame in the video to be classified according to the key frames in the video to be classified;
the visual classification module is connected with the feature acquisition module and used for acquiring a visual classification vector of the video to be classified according to the feature vector of each key frame in the video to be classified;
the text classification module is used for acquiring a text classification vector of the video to be classified according to texts contained in image frames in the video to be classified;
and the category acquisition module is respectively connected with the visual classification module and the text classification module and used for substituting the visual classification vector and the text classification vector into a preset classification model to acquire the category of the video to be classified.
In summary, the video classification method and apparatus provided by the present invention achieve video classification by respectively obtaining the visual classification vector and the text classification vector, and substituting the visual classification vector and the text classification vector into the classification model. According to the technical scheme provided by the embodiment of the invention, the visual classification vector and the text classification vector are used as the parameters of video classification together, so that the accuracy of video classification is improved, and the problems of low efficiency and accuracy of the conventional video classification method are solved. In addition, the visual classification vector is obtained according to the feature vector of the key frame, and the text classification vector contains deeper semantic information, so that the accuracy of video classification can be further improved.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, and it is obvious that the drawings in the following description are only some embodiments of the present invention, and for those skilled in the art, other drawings can be obtained according to these drawings without creative efforts.
Fig. 1 is a flowchart of a video classification method according to embodiment 1 of the present invention;
fig. 2 is a flowchart of a video classification method according to embodiment 2 of the present invention;
fig. 3 is a first schematic structural diagram of a video classification apparatus according to embodiment 3 of the present invention;
FIG. 4 is a schematic diagram of a visual classification module of the video classification apparatus shown in FIG. 3;
fig. 5 is a schematic structural diagram of a video classification apparatus according to embodiment 3 of the present invention.
Detailed Description
The technical solutions of the present invention will be described clearly and completely with reference to the accompanying drawings, and it is to be understood that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Example 1
As shown in fig. 1, the present invention provides a video classification method, including:
Step 101: acquiring a feature vector of each key frame in the video to be classified according to the key frames in the video to be classified.
In this embodiment, the key frame in step 101 is also called an I-frame (intra-coded frame): a frame that completely retains its image data in the compressed video, so that decoding it requires only its own image data. Because the key frames of a video to be classified have little similarity to one another, a set of key frames can comprehensively represent the video; extracting feature vectors from the key frames therefore improves the accuracy of classifying the video.
Specifically, the process of obtaining the feature vectors through step 101 includes: extracting key frames from the video to be classified according to a preset rule, where the preset rule includes one of duration, interval, weight and click rate; and acquiring the feature vector of each key frame in the video to be classified. The feature vector may be obtained by using a preset image classifier to extract features from each key frame; or by acquiring the feature points of each key frame and deriving the feature vector from the feature points. Methods for determining feature points and feature vectors include: Scale-Invariant Feature Transform (SIFT), Speeded-Up Robust Features (SURF), ORB (Oriented FAST and Rotated BRIEF), and neural-network methods such as ResNet (Deep Residual Network), Xception (depthwise separable convolutions), I3D (Inflated 3D ConvNet), P3D (Pseudo-3D Residual Networks) and TSN (Temporal Segment Networks).
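The text does not fix a concrete sampling rule; as a hedged illustration, interval-based key-frame selection (one of the preset rules named above) might look like the following sketch, where the function name is hypothetical:

```python
def select_key_frames(num_frames: int, interval: int) -> list[int]:
    """Pick frame indices at a fixed interval.

    A real decoder would instead take the I-frames the codec already marks;
    this sketch only illustrates the interval-based preset rule.
    """
    return list(range(0, num_frames, interval))

# A 100-frame clip sampled every 25 frames yields 4 key frames.
indices = select_key_frames(100, 25)
```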
Step 102: acquiring a visual classification vector of the video to be classified according to the feature vector of each key frame in the video to be classified.
In this embodiment, the process of obtaining the visual classification vector through step 102 includes: combining the feature vectors of all key frames in the video to be classified by rows to obtain a feature map; and fusing each line of data in the feature map into a single value to obtain the visual classification vector of the video to be classified. The fusion may be performed by calculating the average value of each line of data in the feature map, or by calculating the maximum value of each line. Each line may also be fused into a single value by other methods, such as taking the minimum value, which are not detailed here.
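The fusion step can be sketched with NumPy. The translated text speaks of fusing each "line"; read consistently with the worked example in embodiment 3 (four (1, 2048) vectors fused into a (2048, 1) vector), the fusion runs across the key frames for each feature dimension, i.e. axis 0 below. The toy numbers are illustrative, and the feature dimension is shortened to 4 for brevity:

```python
import numpy as np

# Four per-keyframe feature vectors stacked by rows into a feature map
# (the embodiment uses 2048-dimensional features; 4 dimensions shown here).
features = np.array([
    [1.0, 2.0, 3.0, 4.0],
    [2.0, 4.0, 6.0, 8.0],
    [0.0, 2.0, 3.0, 0.0],
    [1.0, 0.0, 0.0, 4.0],
])

mean_fused = features.mean(axis=0)  # average fusion across key frames
max_fused = features.max(axis=0)    # max fusion across key frames
```

Either `mean_fused` or `max_fused` then serves as the visual classification vector.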
Step 103: acquiring a text classification vector of the video to be classified according to the texts contained in the image frames of the video to be classified.
In this embodiment, the process of obtaining the text classification vector in step 103 may be performed after the visual classification vector is obtained, as shown in fig. 1; it may also be performed before, or simultaneously with, the process of obtaining the visual classification vector, which is not limited here. The process includes: extracting image frames from the video to be classified; recognizing the characters in the image frames to obtain the texts they contain; and combining the texts contained in all image frames and then classifying the combined text to obtain the text classification vector of the video to be classified. The texts contained in all image frames may be combined by splicing them into one long text. The text recognition method may be CRNN (Convolutional Recurrent Neural Network) or CTPN (Connectionist Text Proposal Network); the text classification method may be TextCNN (a convolutional-neural-network text classifier), FastText (a fast text classification algorithm) or LSTM (Long Short-Term Memory network).
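The text-combination step described above can be sketched as follows; the helper name is hypothetical, and a real pipeline would pass the resulting long text to a TextCNN, FastText or LSTM classifier rather than stop here:

```python
def combine_frame_texts(frame_texts: list[str]) -> str:
    """Splice the text recognized in each image frame into one long text,
    skipping frames in which no characters were recognized."""
    return " ".join(t for t in frame_texts if t)

# Frames 0 and 2 contained on-screen captions; frame 1 contained none.
long_text = combine_frame_texts(["new lipstick haul", "", "swatch comparison"])
```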
Step 104: substituting the visual classification vector and the text classification vector into a preset classification model to obtain the category of the video to be classified.
In this embodiment, the preset classification model in step 104 may be generated in advance using a model such as a neural network. The category of the video to be classified may be obtained through step 104 as follows: concatenating the visual classification vector and the text classification vector into a single row vector to obtain the vector to be classified; and substituting the vector to be classified into the preset classification model to obtain the category of the video to be classified.
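The concatenation in step 104 can be sketched with NumPy. The 2048 and 1024 dimensions follow the worked example in embodiment 3; the zero and one placeholder values are purely illustrative:

```python
import numpy as np

visual_vec = np.zeros(2048)  # visual classification vector (placeholder values)
text_vec = np.ones(1024)     # text classification vector (placeholder values)

# The vector to be classified: a single 3072-dimensional row vector that a
# preset classification model would take as input.
row_vec = np.concatenate([visual_vec, text_vec])
```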
In summary, the video classification method provided by the present invention realizes video classification by respectively obtaining the visual classification vector and the text classification vector, and substituting the visual classification vector and the text classification vector into the classification model. According to the technical scheme provided by the embodiment of the invention, the visual classification vector and the text classification vector are used as the parameters of video classification together, so that the accuracy of video classification is improved, and the problems of low efficiency and accuracy of the conventional video classification method are solved. In addition, the visual classification vector is obtained according to the feature vector of the key frame, and the text classification vector contains deeper semantic information, so that the accuracy of video classification can be further improved.
Example 2
As shown in fig. 2, an embodiment of the present invention provides a video classification method, including:
Steps 201 to 203: obtaining the visual classification vector and the text classification vector of the video to be classified; these steps are similar to steps 101 to 103 shown in fig. 1 and are not repeated here.
Step 204: acquiring a plurality of video samples, and the visual classification vector, text classification vector and category value corresponding to each video sample.
Step 205: training an initial classifier according to the visual classification vector, text classification vector and category value corresponding to each video sample to obtain the classification model.
In this embodiment, the initial classifier in step 205 may adopt a convolutional neural network model or another model, which is not limited here.
Step 206: substituting the visual classification vector and the text classification vector into the preset classification model to obtain the category of the video to be classified. The process is similar to step 104 shown in fig. 1 and is not described again here.
In summary, the video classification method provided by the present invention realizes video classification by respectively obtaining the visual classification vector and the text classification vector, and substituting the visual classification vector and the text classification vector into the classification model. According to the technical scheme provided by the embodiment of the invention, the visual classification vector and the text classification vector are used as the parameters of video classification together, so that the accuracy of video classification is improved, and the problems of low efficiency and accuracy of the conventional video classification method are solved. In addition, the visual classification vector is obtained according to the feature vector of the key frame, and the text classification vector contains deeper semantic information, so that the accuracy of video classification can be further improved.
Example 3
As shown in fig. 3, an embodiment of the present invention provides a video classification apparatus, including:
the feature obtaining module 301 is configured to obtain a feature vector of each key frame in the video to be classified according to the key frame in the video to be classified;
the visual classification module 302 is connected with the feature acquisition module and is used for acquiring a visual classification vector of the video to be classified according to the feature vector of each key frame in the video to be classified;
the text classification module 303 is configured to obtain a text classification vector of the video to be classified according to a text included in an image frame in the video to be classified;
and the category obtaining module 304 is connected to the visual classification module and the text classification module, respectively, and is configured to substitute the visual classification vector and the text classification vector into a preset classification model to obtain a category of the video to be classified.
In this embodiment, the process of classifying videos through the feature obtaining module 301 to the category obtaining module 304 is similar to that provided in the first embodiment of the present invention, and is not repeated here.
Further, as shown in fig. 4, a visual classification module 302 in the video classification apparatus according to the embodiment of the present invention includes:
the vector combination submodule 3021 is configured to combine feature vectors of all key frames in a video to be classified according to rows to obtain a feature map;
and the vector fusion submodule 3022 is connected to the vector combination submodule, and is configured to fuse each line of data in the feature map into one data, so as to obtain a visual classification vector of the video to be classified.
Further, as shown in fig. 5, the video classification apparatus provided in the embodiment of the present invention further includes:
a sample obtaining module 305, configured to obtain a plurality of video samples, and a visual classification vector, a text classification vector, and a category value corresponding to each video sample;
and the training module 306 is connected to the sample acquisition module and the category acquisition module, and is configured to train the initial classifier according to the visual classification vector, the text classification vector and the category value corresponding to each video sample, so as to obtain a classification model.
When the video classification apparatus provided in this embodiment further includes the sample obtaining module 305 and the training module 306, the process of implementing video classification is similar to that provided in the second embodiment of the present invention, and details are not repeated here.
In summary, the video classification apparatus provided in the present invention realizes video classification by respectively obtaining the visual classification vector and the text classification vector, and substituting the visual classification vector and the text classification vector into the classification model. According to the technical scheme provided by the embodiment of the invention, the visual classification vector and the text classification vector are used as the parameters of video classification together, so that the accuracy of video classification is improved, and the problems of low efficiency and accuracy of the conventional video classification method are solved. In addition, the visual classification vector is obtained according to the feature vector of the key frame, and the text classification vector contains deeper semantic information, so that the accuracy of video classification can be further improved.
Specifically, the video samples used in training cover 20 categories: dance, music, gourmet food, makeup, dance, sports, handicrafts, pets, mother and baby, drawing, lifestyle, outfits and architecture, games, animation, fitness, emotion, constellation, travel, digital products, and furniture. The category values of these classes are the integers 0 through 19, respectively. The process of training the classifier may specifically include:
and extracting N frames (N is more than or equal to 3) of key frames from each video sample, wherein N is 4 as an example. And training the video initial classifier through the extracted key frames and the corresponding class values to obtain a video classification model. The video initial classifier may employ models of Resnet50(Residual Network 50, depth Residual Network 50), Resnet101(Residual Network 101, depth Residual Network 101), Xceptance (depth separable convolution), and the like.
Taking a video sample as an example, the process of obtaining the row vector of each video sample includes:
Extracting image frames from the video samples; recognizing the characters in the image frames to obtain the texts contained in the image frames (the recognition method may be CRNN, CTPN or the like); and training the initial text classifier with the texts contained in the image frames and the corresponding category values to obtain a text classification model. The initial text classifier may adopt models such as TextCNN, FastText or LSTM.
Key frames are acquired from the video sample; take the extraction of the following 4 key frames as an example:
key frame 0, dimension (255, 255, 3);
key frame 1, dimension (255, 255, 3);
key frame 2, dimension (255, 255, 3);
key frame 3, dimension (255, 255, 3).
Substituting each key frame in the video sample into the video classification model yields a feature vector for each key frame, each with dimension (1, 2048):
[-1.4759,-0.6063,1.2209,……,0.3973,-0.1676,2.7899]
[-0.7009,-0.4696,1.7640,……,1.1952,1.3861,0.2387]
[1.0831,-1.9600,0.8904,……,0.3973,-0.1676,2.7899]
[0.1322,0.6038,2.6935,……,0.3889,1.4386,1.0443]
The 4 feature vectors are combined by rows to obtain a feature map. Each line of data in the feature map is then fused into a single value; taking average fusion as an example, this yields a visual classification vector of dimension (2048, 1): [-0.2404, -0.6080, 1.6422, ……, 0.4514, 0.6358, 1.1756].
Image frames are extracted from the video, for example at equal intervals of 1 s. Recognizing the characters in the image frames yields a long text; in this example the recognized text is the transcript of a makeup video: "Domestic brands, highlighter... Hello everyone, I am a makeup blogger who recreates celebrity looks. Recently I have been trying to use things up, but I habitually keep buying: my old highlighter and my ColourPop are not used up yet, and I have already bought two domestic-brand highlighters. There is more than one reason for buying them: the packaging looks good and the price is cheap. The first one has a little-deer design and two shades; the rainbow shade has a particularly fine powder, and swept on with a brush it gives a subtle rainbow, polarized-light effect. The second one has a more European-American style case; opening the laser packaging reveals three shades, including a dark pink. Swept onto the face it gives a natural glow, and even without full makeup a little highlighter makes the facial features look more three-dimensional, so I am sharing these affordable products with you." This text is input into the text classification model to obtain a text classification vector of dimension (1024, 1): [0.0107, 0.2644, 0.4699, ……, 0.5430, 0.8514, 0.6103].
The visual classification vector and the text classification vector are concatenated to obtain a row vector of dimension (3072, 1).
The initial classifier is trained according to the row vectors and the corresponding category values of all video samples to obtain the classification model.
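The (row vector, category value) training pairs described above can be illustrated with a toy stand-in classifier. The patent trains a neural model; the nearest-centroid classifier below is only a minimal, hedged sketch of how such pairs are consumed, with hypothetical function names and toy 3-dimensional vectors in place of the (3072, 1) row vectors:

```python
import numpy as np

def train_centroids(row_vectors, labels):
    """Toy stand-in for training the initial classifier:
    one mean vector ("centroid") per category value."""
    classes = sorted(set(labels))
    return {c: np.mean([v for v, l in zip(row_vectors, labels) if l == c], axis=0)
            for c in classes}

def classify(centroids, vec):
    """Return the category value whose centroid is nearest to vec."""
    vec = np.asarray(vec)
    return min(centroids, key=lambda c: float(np.linalg.norm(centroids[c] - vec)))

# Toy "row vectors" for two categories (values 0 and 1).
samples = [np.array([0.0, 0.0, 1.0]), np.array([0.1, 0.0, 0.9]),
           np.array([1.0, 1.0, 0.0]), np.array([0.9, 1.1, 0.0])]
labels = [0, 0, 1, 1]
model = train_centroids(samples, labels)
```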
Finally, all videos can be classified through the video classification model, the text classification model and the classification model, and the classification process is similar to that provided by the first embodiment of the invention and is not repeated herein.
The above description is only for the specific embodiments of the present invention, but the scope of the present invention is not limited thereto, and any person skilled in the art can easily conceive of the changes or substitutions within the technical scope of the present invention, and all the changes or substitutions should be covered within the scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.
Claims (10)
1. A method of video classification, comprising:
acquiring a feature vector of each key frame in the video to be classified according to the key frames in the video to be classified;
acquiring a visual classification vector of the video to be classified according to the feature vector of each key frame in the video to be classified;
acquiring a text classification vector of the video to be classified according to texts contained in image frames in the video to be classified;
and substituting the visual classification vector and the text classification vector into a preset classification model to obtain the category of the video to be classified.
2. The method for video classification according to claim 1, wherein the obtaining the visual classification vector of the video to be classified comprises:
combining the feature vectors of all key frames in the video to be classified according to rows to obtain a feature map;
and respectively fusing each line of data in the feature map into one data to obtain the visual classification vector of the video to be classified.
3. The video classification method according to claim 2, wherein the step of respectively fusing each line of data in the feature map into one data to obtain the visual classification vector of the video to be classified comprises:
respectively calculating the average value of each line of data in the feature map to obtain a visual classification vector of the video to be classified; or,
and respectively calculating the maximum value of each line of data in the feature map to obtain the visual classification vector of the video to be classified.
4. The method of claim 1, wherein before said substituting the visual classification vector and the text classification vector into a preset classification model, the method further comprises:
acquiring a plurality of video samples, and a visual classification vector, a text classification vector and a category value corresponding to each video sample;
and training the initial classifier according to the visual classification vector, the text classification vector and the class value corresponding to each video sample to obtain the classification model.
5. The method according to any one of claims 1 to 4, wherein the obtaining the feature vector of each key frame in the video to be classified comprises:
respectively extracting the features of each key frame in the video to be classified by using a preset image classifier to obtain the feature vector of each key frame in the video to be classified; or,
and acquiring the feature point of each key frame in the video to be classified, and acquiring the feature vector of each key frame in the video to be classified according to the feature points.
6. The method according to any one of claims 1 to 4, wherein the obtaining the feature vector of each key frame in the video to be classified comprises:
extracting key frames from the video to be classified according to a preset rule; the preset rules include: one of duration, weight, interval and click rate;
and acquiring the feature vector of each key frame in the video to be classified.
7. The video classification method according to any one of claims 1 to 4, wherein obtaining the text classification vector of the video to be classified comprises:
extracting image frames from the video to be classified;
identifying characters in the image frame to obtain texts contained in the image frame;
and combining texts contained in all image frames in the video to be classified, and then classifying the texts to obtain a text classification vector of the video to be classified.
8. A video classification apparatus, comprising:
the characteristic acquisition module is used for acquiring a characteristic vector of each key frame in the video to be classified according to the key frames in the video to be classified;
the visual classification module is connected with the feature acquisition module and used for acquiring a visual classification vector of the video to be classified according to the feature vector of each key frame in the video to be classified;
the text classification module is used for acquiring a text classification vector of the video to be classified according to texts contained in image frames in the video to be classified;
and the category acquisition module is respectively connected with the visual classification module and the text classification module and used for substituting the visual classification vector and the text classification vector into a preset classification model to acquire the category of the video to be classified.
9. The video classification device of claim 8, wherein the visual classification module comprises:
the vector combination submodule is used for combining the feature vectors of all key frames in the video to be classified row by row to obtain a feature map;
and the vector fusion submodule is connected with the vector combination submodule and is used for fusing each row of data in the feature map into a single value to obtain the visual classification vector of the video to be classified.
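The two submodules of claim 9 — stack the per-frame feature vectors as the rows of a feature map, then fuse each row into one value — can be sketched in a few lines. The claim does not specify the fusion function, so the mean is used here as one plausible assumption.

```python
def visual_classification_vector(feature_vectors):
    """Combine per-frame feature vectors row by row into a feature map,
    then fuse each row into a single value.

    Fusion by arithmetic mean is an illustrative assumption; the claim
    leaves the fusion function open.
    """
    feature_map = [list(v) for v in feature_vectors]  # one row per key frame
    return [sum(row) / len(row) for row in feature_map]

vec = visual_classification_vector([[1.0, 3.0], [2.0, 4.0], [0.0, 0.0]])
```

Note that under this reading the visual classification vector has one entry per key frame, since each row of the feature map corresponds to one frame.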
10. The video classification apparatus according to claim 8 or 9, wherein the apparatus further comprises:
the system comprises a sample acquisition module, a classification module and a classification module, wherein the sample acquisition module is used for acquiring a plurality of video samples and a visual classification vector, a text classification vector and a category value corresponding to each video sample;
and the training module is respectively connected with the sample acquisition module and the category acquisition module and is used for training the initial classifier according to the visual classification vector, the text classification vector and the category value corresponding to each video sample to obtain the classification model.
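The training step of claim 10 — fit an initial classifier on the concatenated visual and text classification vectors of labelled video samples — can be sketched with a nearest-centroid classifier. This is a hypothetical stand-in: the patent names an "initial classifier" but does not fix its type, and the class name, sample format and category labels below are all illustrative.

```python
class NearestCentroidClassifier:
    """Minimal stand-in for the claims' 'initial classifier': average the
    concatenated visual + text vector per category, then assign new videos
    to the category with the nearest centroid (squared Euclidean distance).
    """

    def fit(self, samples):
        # samples: iterable of (visual_vec, text_vec, category) triples
        sums, counts = {}, {}
        for visual_vec, text_vec, category in samples:
            x = visual_vec + text_vec          # concatenate the two vectors
            acc = sums.setdefault(category, [0.0] * len(x))
            for i, v in enumerate(x):
                acc[i] += v
            counts[category] = counts.get(category, 0) + 1
        self.centroids = {c: [v / counts[c] for v in acc]
                          for c, acc in sums.items()}
        return self

    def predict(self, visual_vec, text_vec):
        x = visual_vec + text_vec
        return min(self.centroids,
                   key=lambda c: sum((a - b) ** 2
                                     for a, b in zip(x, self.centroids[c])))

model = NearestCentroidClassifier().fit([
    ([1.0, 0.0], [0.9], "sports"),
    ([0.0, 1.0], [0.1], "news"),
])
```

Prediction then mirrors claim 8's category acquisition module: the visual and text classification vectors of a new video are fed to the trained model to obtain its category.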
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201911058829.6A CN110879974B (en) | 2019-11-01 | 2019-11-01 | Video classification method and device |
Publications (2)
Publication Number | Publication Date |
---|---|
CN110879974A true CN110879974A (en) | 2020-03-13 |
CN110879974B CN110879974B (en) | 2020-10-13 |
Family
ID=69728219
Citations (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102222101A (en) * | 2011-06-22 | 2011-10-19 | 北方工业大学 | Method for video semantic mining |
CN104657468A (en) * | 2015-02-12 | 2015-05-27 | 中国科学院自动化研究所 | Fast video classification method based on images and texts |
US20160026872A1 (en) * | 2014-07-23 | 2016-01-28 | Microsoft Corporation | Identifying presentation styles of educational videos |
CN108241729A (en) * | 2017-09-28 | 2018-07-03 | 新华智云科技有限公司 | Screen the method and apparatus of video |
CN108763325A (en) * | 2018-05-04 | 2018-11-06 | 北京达佳互联信息技术有限公司 | A kind of network object processing method and processing device |
CN109660865A (en) * | 2018-12-17 | 2019-04-19 | 杭州柚子街信息科技有限公司 | Make method and device, medium and the electronic equipment of video tab automatically for video |
CN110019950A (en) * | 2019-03-22 | 2019-07-16 | 广州新视展投资咨询有限公司 | Video recommendation method and device |
CN110162669A (en) * | 2019-04-04 | 2019-08-23 | 腾讯科技(深圳)有限公司 | Visual classification processing method, device, computer equipment and storage medium |
Cited By (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111488489A (en) * | 2020-03-26 | 2020-08-04 | 腾讯科技(深圳)有限公司 | Video file classification method, device, medium and electronic equipment |
CN111488489B (en) * | 2020-03-26 | 2023-10-24 | 腾讯科技(深圳)有限公司 | Video file classification method, device, medium and electronic equipment |
CN111556377A (en) * | 2020-04-24 | 2020-08-18 | 珠海横琴电享科技有限公司 | Short video labeling method based on machine learning |
CN111859024A (en) * | 2020-07-15 | 2020-10-30 | 北京字节跳动网络技术有限公司 | Video classification method and device and electronic equipment |
CN114157906A (en) * | 2020-09-07 | 2022-03-08 | 北京达佳互联信息技术有限公司 | Video detection method and device, electronic equipment and storage medium |
CN114157906B (en) * | 2020-09-07 | 2024-04-02 | 北京达佳互联信息技术有限公司 | Video detection method, device, electronic equipment and storage medium |
EP4053802A1 (en) * | 2021-03-05 | 2022-09-07 | Beijing Baidu Netcom Science Technology Co., Ltd. | Video classification method and apparatus, device and storage medium |
US12094208B2 (en) | 2021-03-05 | 2024-09-17 | Beijing Baidu Netcom Science Technology Co., Ltd. | Video classification method, electronic device and storage medium |
CN113901253A (en) * | 2021-09-22 | 2022-01-07 | 成都飞机工业(集团)有限责任公司 | Component process route determining method, device, equipment and storage medium |
CN117708376A (en) * | 2023-07-17 | 2024-03-15 | 荣耀终端有限公司 | Video processing method, readable storage medium and electronic device |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110879974B (en) | Video classification method and device | |
Chen et al. | Sketchygan: Towards diverse and realistic sketch to image synthesis | |
CN110781347B (en) | Video processing method, device and equipment and readable storage medium | |
Hong et al. | Inferring semantic layout for hierarchical text-to-image synthesis | |
CN110166827B (en) | Video clip determination method and device, storage medium and electronic device | |
CN109635680B (en) | Multitask attribute identification method and device, electronic equipment and storage medium | |
Kucer et al. | Leveraging expert feature knowledge for predicting image aesthetics | |
CN110083729B (en) | Image searching method and system | |
CN112102157B (en) | Video face changing method, electronic device and computer readable storage medium | |
Dimitropoulos et al. | Classification of multidimensional time-evolving data using histograms of grassmannian points | |
Wang et al. | Early action prediction with generative adversarial networks | |
CN111160134A (en) | Human-subject video scene analysis method and device | |
CN114332466B (en) | Continuous learning method, system, equipment and storage medium for image semantic segmentation network | |
CN115129934A (en) | Multi-mode video understanding method | |
CN113515669A (en) | Data processing method based on artificial intelligence and related equipment | |
More et al. | Seamless nudity censorship: an image-to-image translation approach based on adversarial training | |
CN116665083A (en) | Video classification method and device, electronic equipment and storage medium | |
CN115222858A (en) | Method and equipment for training animation reconstruction network and image reconstruction and video reconstruction thereof | |
CN115909390B (en) | Method, device, computer equipment and storage medium for identifying low-custom content | |
Baghel et al. | Image conditioned keyframe-based video summarization using object detection | |
Song et al. | Hierarchical LSTMs with adaptive attention for visual captioning | |
Liu et al. | A3GAN: An attribute-aware attentive generative adversarial network for face aging | |
Wang et al. | From attributes to faces: a conditional generative network for face generation | |
Tan et al. | RGBD-FG: A large-scale RGB-D dataset for fine-grained categorization | |
CN114328990B (en) | Image integrity recognition method, device, computer equipment and storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||