CN110807369A - Efficient short video content intelligent classification method based on deep learning and attention mechanism - Google Patents
- Publication number
- CN110807369A (Application CN201910952622.7A)
- Authority
- CN
- China
- Prior art keywords
- attention mechanism
- unit
- dimensional
- module
- short video
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/40—Scenes; Scene-specific elements in video content
- G06V20/41—Higher-level, semantic clustering, classification or understanding of video scenes, e.g. detection, labelling or Markovian modelling of sport events or news items
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/048—Activation functions
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
Abstract
The invention discloses an efficient intelligent short video content classification method based on deep learning and an attention mechanism, and relates to an efficient method for intelligently classifying short videos according to their content. The core model of the method consists of a two-dimensional convolutional neural network and a pseudo-three-dimensional convolutional neural network connected in series, which extract shallow spatial information and high-dimensional spatial and temporal information respectively; the probability that a video belongs to each category is obtained through a normalized exponential function, and the final predicted classification is obtained from these probabilities. The method balances running time and prediction accuracy, can be used for real-time content supervision and classification of short videos, and its results can serve as a reference for short video recommendation.
Description
Technical Field
The invention belongs to the technical field of computer vision, and particularly relates to an intelligent short video content classification method based on deep learning and attention mechanism.
Background
Short video has become the fastest-growing form of internet content distribution in recent years: content is diverse and the barriers to production and publication are low. The sheer volume of short videos makes supervision difficult, and illegal content such as violent or pornographic videos is easily mixed in. By automatically classifying short video content with deep learning, the invention can assist short video platforms in reviewing and supervising videos uploaded by users; the resulting categories can also serve as a reference factor for short video recommendation, so that related short videos are recommended according to a user's viewing history, improving the competitiveness of the short video platform.
Deep learning has become one of the main approaches to automatic video content classification, but such models are hard to train, have large parameter counts and high time cost, and classify inefficiently, which makes them difficult to apply in practical engineering. In particular, a three-dimensional convolutional neural network can integrate the temporal information of a video, unlike a two-dimensional convolutional neural network, but it is difficult to train and scale because of training difficulty and high hardware resource requirements. In the prior art, "Zolfaghari M, Singh K, Brox T. ECO: Efficient Convolutional Network for Online Video Understanding [J]. 2018" uses a series connection of a two-dimensional convolutional neural network and a three-dimensional convolutional neural network for real-time classification of videos.
Disclosure of Invention
In order to solve the problem of an automatic video content classification method in the prior art, the invention provides an efficient short video content intelligent classification method based on a deep learning and attention mechanism.
In order to achieve the purpose, the invention adopts the following technical scheme:
an efficient short video content intelligent classification method based on deep learning and attention mechanism comprises the following steps:
step 1, simply preprocessing an original short video, and quickly converting the video into a picture by using an FFmpeg tool;
step 2, determining the number N of pictures of the input model according to the requirements of accuracy and time, uniformly extracting input images from all the images at equal intervals to form a section of ordered input frames, and cutting the input frames into the size of 224 pixels multiplied by 224 pixels;
step 3, inputting the pictures processed in the step 2 into a two-dimensional convolutional neural network with an attention mechanism, outputting shallow feature representation diagrams, and stacking the shallow feature representation diagrams according to the time and channel sequence to form a feature diagram sequence X with time and space information;
step 4, inputting the output result of the step 3 into a pseudo three-dimensional convolution neural network with an attention mechanism, and learning time information and high-dimensional space information; the pseudo three-dimensional convolutional neural network comprises a plurality of unit modules which are sequentially arranged, each unit module comprises a plurality of convolutional layers which are sequentially arranged and an attention mechanism module which is positioned behind the last convolutional layer, the attention mechanism module is used for recalibrating time and space information to obtain the weights of all channels in the unit module, the weight of each channel is multiplied by the output of the last convolutional layer to obtain the output result of the attention mechanism module in the unit module, the output result is output to the next unit module, and the attention mechanism module of the last unit module outputs high-dimensional characteristics;
and 5, inputting the high-dimensional features obtained in the step 4 into a full-connection layer to obtain the probability that the video belongs to each category, and obtaining the final prediction classification according to the probability.
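The equal-interval frame sampling of step 2 can be sketched as follows (a minimal illustration written for this edit, not code from the patent; the function name is our own):

```python
def uniform_sample_indices(total_frames, n):
    """Pick n frame indices spread evenly across the video, as in step 2:
    one frame from the centre of each of n equal-length segments."""
    if n > total_frames:
        raise ValueError("cannot sample more frames than the video contains")
    step = total_frames / n
    # centre of each of the n equal-length segments
    return [int(step * i + step / 2) for i in range(n)]

# e.g. a 300-frame short video with N = 8 input frames
indices = uniform_sample_indices(300, 8)
```

The selected frames are then cropped to 224 × 224 pixels before being fed to the network, as the step describes.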
The feature map sequence with temporal and spatial information output by step 3 is X = {x_c^d | c = 1, …, C; d = 1, …, D}, wherein: X represents the sequence of stacked feature maps output by step 3, x represents a unit feature map in X, subscript c represents the channel index, superscript d represents the time dimension, and x_c^d is the unit feature map of channel c with time dimension d.
Further, in step 4, the convolutional layer structure is obtained by taking a classical network and replacing its convolution kernels with pseudo-three-dimensional convolution kernels. Usable classical networks include the Inception network, the Residual Network (ResNet) and the like.
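The idea of a pseudo-three-dimensional kernel — factorizing a full 3×3×3 spatio-temporal convolution into a 1×3×3 spatial convolution followed by a 3×1×1 temporal convolution — can be illustrated with a small numpy sketch (our own illustration, assuming a P3D-style serial factorization; not code from the patent):

```python
import numpy as np

def conv3d_valid(x, k):
    """Naive 'valid'-mode 3-D cross-correlation, as used in CNNs."""
    D, H, W = x.shape
    d, h, w = k.shape
    out = np.zeros((D - d + 1, H - h + 1, W - w + 1))
    for t in range(out.shape[0]):
        for i in range(out.shape[1]):
            for j in range(out.shape[2]):
                out[t, i, j] = np.sum(x[t:t + d, i:i + h, j:j + w] * k)
    return out

rng = np.random.default_rng(0)
x = rng.standard_normal((8, 10, 10))   # (time, height, width) feature volume
k_s = rng.standard_normal((1, 3, 3))   # spatial 1x3x3 kernel
k_t = rng.standard_normal((3, 1, 1))   # temporal 3x1x1 kernel

# Pseudo-3D: spatial convolution, then temporal convolution (9 + 3 = 12 weights).
pseudo = conv3d_valid(conv3d_valid(x, k_s), k_t)

# For a separable kernel this equals one full 3x3x3 convolution (27 weights).
k_full = k_t * k_s                     # broadcasting gives the 3x3x3 outer product
full = conv3d_valid(x, k_full)
```

When the full kernel happens to be separable the two paths agree exactly, while the factorized form uses 12 weights instead of 27; in general the factorization restricts the kernel family, trading some expressiveness for the easier training that the text cites as motivation.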
Further, the output U of each attention mechanism module in step 4 is represented as U = {u_c^d}, wherein: u represents a unit feature map in U, subscript c represents the channel index, superscript d represents the time dimension, and the calculation process of each attention mechanism module is described as follows:

z_c = (1 / (D·W·H)) · Σ_{k=1}^{D} Σ_{i=1}^{W} Σ_{j=1}^{H} x_c(i, j, k)

Z = [z_1, z_2, …, z_c]

s = σ(δ(Z, W_1), W_2)

s = [s_1, s_2, …, s_c]

u_c^d = s_c · x_c^d

where D is the time dimension of the feature sequence, W and H are the width and height of each feature map, i, j, k are the indices in the spatial horizontal (width), spatial vertical (height) and time dimensions of the image, x_c(i, j, k) is the pixel with index (i, j, k) in the unit feature map of channel c, z_c is the global mean of the unit feature map x_c of channel c, Z is the vector of global means of all channels, i.e. Z = [z_1, z_2, …, z_c], s is the vector of weights of all channels in a unit module, s_c is the weight of channel c in a unit module, W_1 and W_2 are respectively the parameters of two fully connected layers, δ represents the ReLU activation function, and σ represents the Sigmoid activation function.
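The squeeze-and-excitation computation of the attention mechanism module can be sketched in numpy as follows (a minimal single-module illustration with random weights; the variable names and the channel-reduction ratio are our own assumptions, not values from the patent):

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def se_attention(x, w1, w2):
    """x: feature volume of shape (C, D, H, W); w1, w2: FC-layer weights.
    Returns the re-weighted volume u with u[c] = s_c * x[c], and s."""
    z = x.mean(axis=(1, 2, 3))          # squeeze: global mean per channel, z_c
    hidden = np.maximum(0.0, w1 @ z)    # delta: ReLU after the first FC layer
    s = sigmoid(w2 @ hidden)            # sigma: per-channel weights in (0, 1)
    return s[:, None, None, None] * x, s

rng = np.random.default_rng(1)
C, D, H, W = 16, 4, 7, 7
x = rng.standard_normal((C, D, H, W))
r = 4                                   # assumed channel-reduction ratio
w1 = rng.standard_normal((C // r, C))
w2 = rng.standard_normal((C, C // r))
u, s = se_attention(x, w1, w2)
```

The output keeps the shape of the input, so the module can be dropped after the last convolutional layer of each unit module exactly as the text describes.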
Further, in step 5, the category with the maximum predicted probability is selected as the video's classification label, or all categories whose predicted probability exceeds a threshold are selected as the video's labels.
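The label-selection rule of step 5 — normalized exponential function, then either argmax or thresholding — can be sketched as follows (illustrative only; the logits and the threshold value are our own assumptions):

```python
import numpy as np

def softmax(logits):
    """Normalized exponential function over the class scores."""
    e = np.exp(logits - np.max(logits))  # shift for numerical stability
    return e / e.sum()

logits = np.array([1.2, 0.3, 2.5, -0.7])   # hypothetical FC-layer output
probs = softmax(logits)

top_label = int(np.argmax(probs))                            # single-label rule
multi_labels = [i for i, p in enumerate(probs) if p > 0.15]  # threshold rule
```

The two rules correspond to the single-label and multi-label variants the paragraph describes.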
Compared with the prior art, the invention has the following beneficial effects:
the method combines the characteristics of the two-dimensional convolutional neural network and the three-dimensional convolutional neural network, adopts a mode of connecting the two-dimensional convolutional neural network and the pseudo three-dimensional convolutional network in series, and can improve the prediction accuracy and the model robustness, increase the attention module, improve the model performance and give consideration to the prediction accuracy.
Drawings
FIG. 1 is a data transmission flow diagram of the present invention;
FIG. 2 is a diagram of the overall framework of the model of the present invention;
FIG. 3 is a detailed diagram of a two-dimensional convolutional neural network layer incorporating an attention mechanism in the present invention;
FIG. 4 is a detailed view of a pseudo three-dimensional convolutional neural network layer incorporating an attention mechanism in the present invention;
fig. 5 is a flow chart of the present invention.
Detailed Description
The invention will be further elucidated with reference to specific embodiments.
As shown in fig. 1, the overall network framework is a series connection of two convolutional neural networks, and finally the prediction probability of each category is output.
As shown in fig. 2, the specific process is as follows: first, frames are uniformly extracted from the short video, with the number of extracted frame images set to N: all video frames are divided into N parts, one frame is randomly extracted from each part, the frames are arranged in time order, and they are fed into the two-dimensional convolutional neural network. The attention mechanism adopts the Squeeze-and-Excitation module proposed in J. Hu, L. Shen, and G. Sun, "Squeeze-and-excitation networks," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2018, pp. 7132-7141.
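The segment-wise random sampling just described (divide all frames into N parts, pick one frame at random from each) differs from plain equal-interval sampling and can be sketched as follows (our own minimal illustration, not code from the patent):

```python
import random

def segment_random_sample(total_frames, n, rng=random):
    """Split [0, total_frames) into n equal segments and draw one random
    frame index from each segment, returning them in time order."""
    bounds = [round(total_frames * i / n) for i in range(n + 1)]
    return [rng.randrange(bounds[i], bounds[i + 1]) for i in range(n)]

random.seed(42)  # seeded only so the sketch is reproducible
indices = segment_random_sample(300, 8)
```

At training time the randomness acts as temporal data augmentation, while the per-segment structure still guarantees coverage of the whole clip.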
As shown in fig. 3, the details of the two-dimensional convolutional neural network layer are given, including a detailed diagram of the network structure and the convolution kernel parameters; here the two-dimensional convolutional neural network adopts an Inception network as an example. The two-dimensional convolutional neural network layer finally outputs shallow features. As shown in fig. 2, the shallow feature map sequence, ordered by channel and time, is input to the pseudo-three-dimensional convolutional neural network layer.
As shown in fig. 4, in this embodiment a residual network is used as an example for the pseudo-three-dimensional convolutional network, with its convolution kernels changed to pseudo-three-dimensional. The attention mechanism adopts a Squeeze-and-Excitation attention module adapted to the three-dimensional network. The obtained high-dimensional features are input into two consecutive fully connected layers, the prediction probability of each category is finally output using a Sigmoid function, and the prediction ends.
As shown in fig. 5, in particular, a method for intelligently classifying high-efficiency short video contents based on deep learning and attention mechanism includes the following steps:
step 1, simply preprocessing an original short video, and quickly converting the video into a picture by using an FFmpeg tool;
step 2, determining the number N of pictures of the input model according to the requirements of accuracy and time, uniformly extracting input images from all the images at equal intervals to form a section of ordered input frames, and cutting the input frames into the size of 224 pixels multiplied by 224 pixels;
step 3, inputting the pictures processed in the step 2 into a two-dimensional convolutional neural network with an attention mechanism, outputting shallow feature representation diagrams, and stacking the shallow feature representation diagrams according to the time and channel sequence to form a feature diagram sequence X with time and space information;
specifically, the feature map sequence with temporal and spatial information output by step 3 is X = {x_c^d | c = 1, …, C; d = 1, …, D}, wherein: X represents the sequence of stacked feature maps output by step 3, x represents a unit feature map in X, subscript c represents the channel index, superscript d represents the time dimension, and x_c^d is the unit feature map of channel c with time dimension d.
Step 4, inputting the output result of the step 3 into a pseudo three-dimensional convolution neural network with an attention mechanism, and learning time information and high-dimensional space information; the pseudo three-dimensional convolutional neural network comprises a plurality of unit modules which are sequentially arranged, each unit module comprises a plurality of convolutional layers which are sequentially arranged and an attention mechanism module which is positioned behind the last convolutional layer, the attention mechanism module is used for recalibrating time and space information to obtain the weights of all channels in the unit module, the weight of each channel is multiplied by the output of the last convolutional layer to obtain the output result of the attention mechanism module in the unit module, the output result is output to the next unit module, and the attention mechanism module of the last unit module outputs high-dimensional characteristics;
specifically, the output U of each attention mechanism module in step 4 is represented as U = {u_c^d}, wherein: u represents a unit feature map in U, subscript c represents the channel index, superscript d represents the time dimension, and the calculation process of each attention mechanism module is described as follows:

z_c = (1 / (D·W·H)) · Σ_{k=1}^{D} Σ_{i=1}^{W} Σ_{j=1}^{H} x_c(i, j, k)

Z = [z_1, z_2, …, z_c]

s = σ(δ(Z, W_1), W_2)

s = [s_1, s_2, …, s_c]

u_c^d = s_c · x_c^d

where D is the time dimension of the feature sequence, W and H are the width and height of each feature map, i, j, k are the indices in the spatial horizontal (width), spatial vertical (height) and time dimensions of the image, x_c(i, j, k) is the pixel with index (i, j, k) in the unit feature map of channel c, z_c is the global mean of the unit feature map x_c of channel c, Z is the vector of global means of all channels, i.e. Z = [z_1, z_2, …, z_c], s is the vector of weights of all channels in a unit module, s_c is the weight of channel c in a unit module, W_1 and W_2 are respectively the parameters of two fully connected layers, δ represents the ReLU activation function, and σ represents the Sigmoid activation function.
And step 5, the high-dimensional features obtained in step 4 are input into the fully connected layers to obtain the probability that the video belongs to each category, and the final predicted classification is obtained from these probabilities; specifically, the category with the maximum predicted probability is selected as the video's classification label, or all categories whose predicted probability exceeds a threshold are selected as the video's labels.
The invention discloses an efficient method for intelligently classifying short videos according to their content. The core model of the method consists of a two-dimensional convolutional neural network and a pseudo-three-dimensional convolutional neural network connected in series, which extract shallow spatial information and high-dimensional spatial and temporal information respectively; the probability that a video belongs to each category is obtained through a normalized exponential function, and the final predicted classification is obtained from these probabilities. The method balances running time and prediction accuracy, can be used for real-time content supervision and classification of short videos, and its results can serve as a reference for short video recommendation.
The above description is only of the preferred embodiments of the present invention, and it should be noted that: it will be apparent to those skilled in the art that various modifications and adaptations can be made without departing from the principles of the invention and these are intended to be within the scope of the invention.
Claims (4)
1. An efficient short video content intelligent classification method based on deep learning and attention mechanism is characterized by comprising the following steps:
step 1, simply preprocessing an original short video, and quickly converting the video into a picture by using an FFmpeg tool;
step 2, determining the number N of pictures of the input model according to the requirements of accuracy and time, uniformly extracting input images from all the images at equal intervals to form a section of ordered input frames, and cutting the input frames into the size of 224 pixels multiplied by 224 pixels;
step 3, inputting the pictures processed in the step 2 into a two-dimensional convolutional neural network with an attention mechanism, outputting shallow feature representation diagrams, and stacking the shallow feature representation diagrams according to the time and channel sequence to form a feature diagram sequence X with time and space information;
step 4, inputting the output result of the step 3 into a pseudo three-dimensional convolution neural network with an attention mechanism, and learning time information and high-dimensional space information; the pseudo three-dimensional convolutional neural network comprises a plurality of unit modules which are sequentially arranged, each unit module comprises a plurality of convolutional layers which are sequentially arranged and an attention mechanism module which is positioned behind the last convolutional layer, the attention mechanism module is used for recalibrating time and space information to obtain the weights of all channels in the unit module, the weight of each channel is multiplied by the output of the last convolutional layer to obtain the output result of the attention mechanism module in the unit module, the output result is output to the next unit module, and the attention mechanism module of the last unit module outputs high-dimensional characteristics;
and 5, inputting the high-dimensional features obtained in the step 4 into two continuous full-connection layers to obtain the probability that the video belongs to each category, and obtaining the final prediction classification according to the probability.
2. The method for intelligently classifying high-efficiency short video contents based on deep learning and attention mechanism as claimed in claim 1, wherein the feature map sequence with temporal and spatial information output in step 3 is X = {x_c^d | c = 1, …, C; d = 1, …, D}, wherein: X represents the sequence of stacked feature maps output by step 3, x represents a unit feature map in X, subscript c represents the channel index, superscript d represents the time dimension, and x_c^d is the unit feature map of channel c with time dimension d.
3. The method for intelligently classifying high-efficiency short video contents based on deep learning and attention mechanism as claimed in claim 1, wherein the output U of each attention mechanism module in step 4 is represented as U = {u_c^d}, wherein: u represents a unit feature map in U, subscript c represents the channel index, superscript d represents the time dimension, and the calculation process of each attention mechanism module is described as follows:

z_c = (1 / (D·W·H)) · Σ_{k=1}^{D} Σ_{i=1}^{W} Σ_{j=1}^{H} x_c(i, j, k)

Z = [z_1, z_2, …, z_c]

s = σ(δ(Z, W_1), W_2)

s = [s_1, s_2, …, s_c]

u_c^d = s_c · x_c^d

wherein D is the time dimension of the feature sequence, W and H are respectively the width and height of each feature map, i, j, k are respectively the spatial horizontal, spatial vertical and time-dimension indices of each feature map, x_c(i, j, k) is the pixel with index (i, j, k) in the unit feature map of channel c, z_c is the global mean of the unit feature map x_c of channel c, Z is the vector of global means of all channels, i.e. Z = [z_1, z_2, …, z_c], s is the vector of weights of all channels in a unit module, s_c is the weight of channel c in a unit module, W_1 and W_2 are respectively the parameters of two consecutive fully connected layers, δ represents the ReLU activation function, and σ represents the Sigmoid activation function.
4. The method for intelligently classifying high-efficiency short video contents based on deep learning and attention mechanism as claimed in claim 1, wherein in said step 5, the category with the highest prediction probability value is selected as the video classification label, or all the prediction categories larger than the threshold are selected as the video labels.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910952622.7A CN110807369B (en) | 2019-10-09 | 2019-10-09 | Short video content intelligent classification method based on deep learning and attention mechanism |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910952622.7A CN110807369B (en) | 2019-10-09 | 2019-10-09 | Short video content intelligent classification method based on deep learning and attention mechanism |
Publications (2)
Publication Number | Publication Date |
---|---|
CN110807369A true CN110807369A (en) | 2020-02-18 |
CN110807369B CN110807369B (en) | 2024-02-20 |
Family
ID=69487993
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910952622.7A Active CN110807369B (en) | 2019-10-09 | 2019-10-09 | Short video content intelligent classification method based on deep learning and attention mechanism |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110807369B (en) |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111259874A (en) * | 2020-05-06 | 2020-06-09 | 成都派沃智通科技有限公司 | Campus security video monitoring method based on deep learning |
CN112948708A (en) * | 2021-03-05 | 2021-06-11 | 清华大学深圳国际研究生院 | Short video recommendation method |
CN113343865A (en) * | 2021-06-15 | 2021-09-03 | 陕西师范大学 | Face image classification method based on layered pseudo-three-dimensional attention convolution neural network |
CN114268782A (en) * | 2020-09-16 | 2022-04-01 | 镇江多游网络科技有限公司 | Attention migration-based 2D-to-3D video conversion method and device and storage medium |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109670453A (en) * | 2018-12-20 | 2019-04-23 | 杭州东信北邮信息技术有限公司 | A method of extracting short video subject |
CN110020682A (en) * | 2019-03-29 | 2019-07-16 | 北京工商大学 | A kind of attention mechanism relationship comparison net model methodology based on small-sample learning |
CN110032926A (en) * | 2019-02-22 | 2019-07-19 | 哈尔滨工业大学(深圳) | A kind of video classification methods and equipment based on deep learning |
CN110175580A (en) * | 2019-05-29 | 2019-08-27 | 复旦大学 | A kind of video behavior recognition methods based on timing cause and effect convolutional network |
US10402978B1 (en) * | 2019-01-25 | 2019-09-03 | StradVision, Inc. | Method for detecting pseudo-3D bounding box based on CNN capable of converting modes according to poses of objects using instance segmentation and device using the same |
Non-Patent Citations (1)
Title |
---|
MOHAMMADREZA ZOLFAGHARI et al.: "Efficient Convolutional Network for Online Video Understanding", pages 1 - 24 * |
Also Published As
Publication number | Publication date |
---|---|
CN110807369B (en) | 2024-02-20 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110807369B (en) | Short video content intelligent classification method based on deep learning and attention mechanism | |
CN112001339B (en) | Pedestrian social distance real-time monitoring method based on YOLO v4 | |
US20220180199A1 (en) | Neural network model compression method and apparatus, storage medium, and chip | |
CN110363716B (en) | High-quality reconstruction method for generating confrontation network composite degraded image based on conditions | |
Remez et al. | Class-aware fully convolutional Gaussian and Poisson denoising | |
Jin et al. | Statistical study on perceived JPEG image quality via MCL-JCI dataset construction and analysis | |
CN104113789B (en) | On-line video abstraction generation method based on depth learning | |
CN112653899B (en) | Network live broadcast video feature extraction method based on joint attention ResNeSt under complex scene | |
CN111163338B (en) | Video definition evaluation model training method, video recommendation method and related device | |
CN112836646B (en) | Video pedestrian re-identification method based on channel attention mechanism and application | |
CN109218134B (en) | Test case generation system based on neural style migration | |
CN111339818B (en) | Face multi-attribute recognition system | |
CN105718932A (en) | Colorful image classification method based on fruit fly optimization algorithm and smooth twinborn support vector machine and system thereof | |
CN111160356A (en) | Image segmentation and classification method and device | |
CN109062811B (en) | Test case generation method based on neural style migration | |
CN111079864A (en) | Short video classification method and system based on optimized video key frame extraction | |
CN116229323A (en) | Human body behavior recognition method based on improved depth residual error network | |
CN116580184A (en) | YOLOv 7-based lightweight model | |
CN113420179B (en) | Semantic reconstruction video description method based on time sequence Gaussian mixture hole convolution | |
CN107729821B (en) | Video summarization method based on one-dimensional sequence learning | |
CN109508639A (en) | Road scene semantic segmentation method based on multiple dimensioned convolutional neural networks with holes | |
CN110490053B (en) | Human face attribute identification method based on trinocular camera depth estimation | |
CN116152699B (en) | Real-time moving target detection method for hydropower plant video monitoring system | |
CN112132207A (en) | Target detection neural network construction method based on multi-branch feature mapping | |
CN111652238B (en) | Multi-model integration method and system |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| PB01 | Publication | ||
| SE01 | Entry into force of request for substantive examination | ||
| GR01 | Patent grant | ||