CN108537195A - A human activity recognition method based on a single-frame representation model - Google Patents
A human activity recognition method based on a single-frame representation model
- Publication number
- CN108537195A (application CN201810344993.2A)
- Authority
- CN
- China
- Prior art keywords
- model
- loss
- final
- optical flow
- single frame
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Withdrawn
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/40—Scenes; Scene-specific elements in video content
- G06V20/41—Higher-level, semantic clustering, classification or understanding of video scenes, e.g. detection, labelling or Markovian modelling of sport events or news items
- G06V20/42—Higher-level, semantic clustering, classification or understanding of video scenes, e.g. detection, labelling or Markovian modelling of sport events or news items of sport video content
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/40—Scenes; Scene-specific elements in video content
- G06V20/46—Extracting features or characteristics from the video content, e.g. video fingerprints, representative shots or key frames
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Data Mining & Analysis (AREA)
- Life Sciences & Earth Sciences (AREA)
- Artificial Intelligence (AREA)
- Multimedia (AREA)
- Software Systems (AREA)
- Computational Linguistics (AREA)
- Evolutionary Computation (AREA)
- General Engineering & Computer Science (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Health & Medical Sciences (AREA)
- Biomedical Technology (AREA)
- Biophysics (AREA)
- Evolutionary Biology (AREA)
- General Health & Medical Sciences (AREA)
- Molecular Biology (AREA)
- Computing Systems (AREA)
- Mathematical Physics (AREA)
- Bioinformatics & Computational Biology (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Image Analysis (AREA)
Abstract
The present invention proposes a human activity recognition method based on a single-frame representation model. Its main contents include: preprocessing, a single-frame representation model, an activity recognition model, and model optimization and training. The process is as follows: an optical flow image is first generated for each video frame of the input video; all video frames and the corresponding optical flow images are then fed into a single-frame representation model, which generates a model representation for each frame; next, a long short-term memory model generates the predicted final activity label from the representations produced in the previous step; finally, a fully connected layer with a Softmax activation function determines the final activity label. The present invention solves the problems of previous methods, namely low recognition accuracy for human activities, long computation time, and inability to accurately recognize group activities; it can accurately recognize certain group activities with higher recognition accuracy and less computation time.
Description
Technical field
The present invention relates to the field of activity recognition, and more particularly to a human activity recognition method based on a single-frame representation model.
Background technology
With the progress of embedded system design technology, powerful cameras have been embedded in various smart devices, and wireless cameras can easily be deployed at places such as street corners, traffic lights, large stadiums, and railway stations. The resulting large volume of video has attracted a large number of researchers to study human activity recognition. Human activity recognition can be applied to law enforcement responsible for monitoring large crowd activities: rapid human activity recognition and analysis of the large volume of video shot by street surveillance makes it possible to quickly identify suspicious or criminal behavior. Similarly, if a television broadcast could automatically identify the highlights of a match video, rather than the entire three- or four-hour match, sports fans who cannot watch a major event live could better enjoy the match. In addition, after a large-scale natural disaster, efficient human activity recognition on real-time video shot by drones can help paramedics quickly locate victims waving for rescue on rooftops. However, previous human activity recognition methods not only have low recognition accuracy for human activities and require a long computation time, but also cannot accurately recognize group activities.
The human activity recognition method based on a single-frame representation model proposed in the present invention first generates an optical flow image for each video frame of the input video. All video frames and the corresponding optical flow images are then fed into a single-frame representation model, which generates a model representation for each frame. Next, a long short-term memory model generates the predicted final activity label from the representations produced in the previous step; finally, a fully connected layer with a Softmax activation function determines the final activity label. The present invention solves the problems of previous methods, namely low recognition accuracy for human activities, long computation time, and inability to accurately recognize group activities; it can accurately recognize certain group activities with higher recognition accuracy and less computation time.
Summary of the invention
In view of the low recognition accuracy, long computation time, and inability to accurately recognize group activities of previous human activity recognition methods, the purpose of the present invention is to provide a human activity recognition method based on a single-frame representation model. An optical flow image is first generated for each video frame of the input video; all video frames and the corresponding optical flow images are then fed into a single-frame representation model, which generates a model representation for each frame; next, a long short-term memory model generates the predicted final activity label from the representations produced in the previous step; finally, a fully connected layer with a Softmax activation function determines the final activity label.
To solve the above problems, the present invention provides a human activity recognition method based on a single-frame representation model, whose main contents include:
(1) preprocessing;
(2) a single-frame representation model;
(3) an activity recognition model;
(4) model optimization and training.
Wherein, the preprocessing refers to taking as input the original frames (which provide environmental information) and their corresponding optical flow images (which provide motion information). The video frames at time t and time t-1 are fed into FlowNet 2.0 to compute the optical flow, because FlowNet 2.0 has the best performance in generating optical flow images. The optical flow information is finally visualized as a color (three-channel) image, i.e., the optical flow image; each frame of the video (except the first frame) generates one optical flow image.
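A minimal sketch of this preprocessing step in Python. The patent specifies FlowNet 2.0 for flow computation; since FlowNet 2.0 requires its own network weights, the sketch below substitutes OpenCV's Farnebäck optical flow as a stand-in, and visualizes the flow field as a three-channel color image using the common HSV convention (hue encodes direction, value encodes magnitude):

```python
import cv2
import numpy as np

def flow_to_color(flow):
    """Visualize a 2-channel flow field as a 3-channel color image:
    hue encodes flow direction, value encodes flow magnitude."""
    mag, ang = cv2.cartToPolar(flow[..., 0], flow[..., 1])
    hsv = np.zeros((*flow.shape[:2], 3), dtype=np.uint8)
    hsv[..., 0] = ang * 180 / np.pi / 2                        # hue
    hsv[..., 1] = 255                                          # saturation
    hsv[..., 2] = cv2.normalize(mag, None, 0, 255, cv2.NORM_MINMAX)
    return cv2.cvtColor(hsv, cv2.COLOR_HSV2BGR)

def frames_with_flow(video_path):
    """Yield (frame_t, flow_image_t) pairs; the first frame is skipped
    because it has no preceding frame to compute flow from."""
    cap = cv2.VideoCapture(video_path)
    ok, prev = cap.read()
    prev_gray = cv2.cvtColor(prev, cv2.COLOR_BGR2GRAY)
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        flow = cv2.calcOpticalFlowFarneback(
            prev_gray, gray, None, 0.5, 3, 15, 3, 5, 1.2, 0)
        yield frame, flow_to_color(flow)
        prev_gray = gray
    cap.release()
```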
Wherein, the single-frame representation model includes two convolutional neural network (CNN) feature extractors (one for the video frame, the other for the optical flow image) and one long short-term memory (LSTM) model. Although any CNN model can serve as the feature extractor, VGG16 is used for simplicity of explanation, and the size of the video frames and optical flow images is fixed at (224 × 224 × 3). After the optical flow image at time t is obtained, it and the corresponding video frame are fed into the two CNN models to extract features; here the last four layers of VGG16 are removed. A global average pooling layer and a global max pooling layer are then added, and the outputs of these global pooling layers are supplied to the LSTM model, i.e., LSTM1; this means LSTM1 has 4 input steps, each of 512 dimensions. A fully connected layer with a Softmax activation function is added to the output of the final step of LSTM1, and it generates a feature representation for each input video frame. The loss of the entire single-frame representation model can be computed with categorical cross-entropy.
Further, the CNN feature extractors contain all layers of VGG16 from "block1_conv2" to "block5_pool", where the output size of the "block5_pool" layer is (7 × 7 × 512).
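A minimal Keras sketch of this single-frame representation model under the stated dimensions. NUM_ACTIVITIES, the layer names, and the ordering of the four pooled vectors fed to LSTM1 are illustrative assumptions, not taken from the patent text:

```python
import tensorflow as tf
from tensorflow.keras import Model, layers
from tensorflow.keras.applications import VGG16

NUM_ACTIVITIES = 8  # hypothetical number of final activity classes

def vgg_extractor(name):
    """VGG16 truncated at block5_pool: include_top=False drops the last
    four layers (flatten, fc1, fc2, predictions), leaving (7, 7, 512)."""
    base = VGG16(weights="imagenet", include_top=False,
                 input_shape=(224, 224, 3))
    return Model(base.input, base.output, name=name)

def build_single_frame_model():
    frame_in = layers.Input((224, 224, 3), name="video_frame")
    flow_in = layers.Input((224, 224, 3), name="flow_image")

    f = vgg_extractor("vgg_frame")(frame_in)   # (7, 7, 512)
    o = vgg_extractor("vgg_flow")(flow_in)     # (7, 7, 512)

    # Global average + global max pooling on each stream gives four
    # 512-d vectors, stacked as the 4 input steps of LSTM1.
    steps = layers.Lambda(lambda t: tf.stack(t, axis=1))([
        layers.GlobalAveragePooling2D()(f),
        layers.GlobalMaxPooling2D()(f),
        layers.GlobalAveragePooling2D()(o),
        layers.GlobalMaxPooling2D()(o),
    ])                                         # (batch, 4, 512)

    h = layers.LSTM(200, name="lstm1")(steps)  # single layer, 200 units
    rep = layers.Dense(NUM_ACTIVITIES, activation="softmax",
                       name="fc1")(h)          # per-frame representation
    return Model([frame_in, flow_in], rep)
```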
Further, regarding the loss of the single-frame representation model: LSTM1 is a single-layer LSTM with 200 hidden units, and the output dimension of fully connected layer 1 (FC1) is set to the number of final activities; the reference label during training is set to the vector of the final activity label. The single-frame representation model is then trained as a classification task, in which the feature representation is the probability distribution of each video frame. The loss at time t, denoted loss_{1,t}, can therefore be computed with categorical cross-entropy:

$$loss_{1,t} = -\sum_{i} g_{t,i}\,\log(p_{t,i}) \qquad (1)$$

where g is the reference label and p is the predicted value. At test time, the model generates a probability vector as the representation of each video frame.
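For concreteness, a worked instance of Eq. (1) in NumPy, using a hypothetical one-hot reference label over four classes:

```python
import numpy as np

def single_frame_loss(g_t, p_t, eps=1e-12):
    """Categorical cross-entropy of Eq. (1) for the frame at time t:
    loss_{1,t} = -sum_i g_{t,i} * log(p_{t,i})."""
    p_t = np.clip(p_t, eps, 1.0)   # guard against log(0)
    return -np.sum(g_t * np.log(p_t))

g = np.array([0.0, 0.0, 1.0, 0.0])   # reference label (true class: index 2)
p = np.array([0.1, 0.2, 0.6, 0.1])   # predicted probability vector
print(single_frame_loss(g, p))       # -log(0.6) ≈ 0.511
```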
Wherein, the activity recognition model predicts the final activity label from the sequence of generated single-frame representations. The activity recognition model is an LSTM network, i.e., LSTM2, which takes the single-frame representations as input; the number of input steps of LSTM2 therefore equals the number of input video frames. The output of the final step of LSTM2 is then fed into a fully connected layer, and the final activity label is predicted using a Softmax activation function. The loss of the entire activity recognition model can be computed with categorical cross-entropy.
Further, regarding the loss of the activity recognition model: LSTM2 is also a single-layer LSTM with 200 hidden units, and the output dimension of FC2 is set to the number of activity classes. To train this classification model, categorical cross-entropy is used for loss_2:

$$loss_2 = -\sum_{i} g_{i}\,\log(p_{i}) \qquad (2)$$

where g is the reference label and p is the predicted value.
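A matching Keras sketch of LSTM2 under the same assumptions as above (NUM_ACTIVITIES and num_frames are illustrative; the single-frame model's probability vectors are assumed to be precomputed):

```python
from tensorflow.keras import Model, layers

NUM_ACTIVITIES = 8  # hypothetical, must match the single-frame model

def build_activity_model(num_frames):
    # One LSTM2 input step per video frame; each step is that frame's
    # probability-vector representation from the single-frame model.
    reps_in = layers.Input((num_frames, NUM_ACTIVITIES), name="frame_reps")
    h = layers.LSTM(200, name="lstm2")(reps_in)   # single layer, 200 units
    out = layers.Dense(NUM_ACTIVITIES, activation="softmax",
                       name="fc2")(h)             # final activity label
    model = Model(reps_in, out)
    model.compile(optimizer="rmsprop", loss="categorical_crossentropy")
    return model
```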
Wherein, the model optimization and training refer to implementing the optimization and training of the model in the Python programming language using the Keras library with a TensorFlow backend.
Further, the optimization of the model means that the model is optimized as a multi-task learning model; the overall loss can then be computed as:

$$loss = \frac{1}{timestep}\sum_{t=1}^{timestep} loss_{1,t} + \lambda \cdot loss_2 \qquad (3)$$

where timestep is the number of frames on which the model's prediction of the final activity label is based, loss_{1,t} is the loss of the single-frame representation generated at time t, and loss_2 is the loss of the final activity prediction. The parameter λ balances the single-frame representation loss against the final activity classification loss; considering that the final purpose is to predict the final activity, λ = 2 is set to assign a higher weight to the final activity prediction. Optimization then proceeds with the objective of minimizing this loss.
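A small sketch of Eq. (3) as reconstructed above, combining hypothetical per-frame losses with the final activity loss:

```python
import numpy as np

LAMBDA = 2.0  # assigns higher weight to the final activity prediction

def total_loss(frame_losses, activity_loss, lam=LAMBDA):
    """Multi-task loss of Eq. (3): mean single-frame representation loss
    over timestep frames plus lambda times the final activity loss."""
    return float(np.mean(frame_losses)) + lam * activity_loss

# Example: timestep = 3 frames with losses loss_{1,t}, and loss_2 = 0.4.
print(total_loss([0.7, 0.5, 0.6], 0.4))   # 0.6 + 2.0 * 0.4 = 1.4
```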
Further, the training of the model means that, to accelerate the training process and obtain better performance, VGG16 weights pre-trained on the ImageNet dataset are loaded. The model is then trained with the RMSProp optimizer, with a learning rate of 0.001 and a fuzz factor of 10^{-8}, until the loss converges; the optimizer is then changed to stochastic gradient descent (SGD) with a learning rate of 0.0001. The RMSProp optimizer helps the model converge quickly, and SGD with a small learning rate helps fine-tune the model.
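A sketch of this two-phase training schedule in Keras; the epoch counts and the randomly generated placeholder data are assumptions for illustration only:

```python
import numpy as np
import tensorflow as tf
from tensorflow.keras.optimizers import RMSprop, SGD

NUM_ACTIVITIES, NUM_FRAMES = 8, 10            # hypothetical
model = build_activity_model(NUM_FRAMES)      # from the sketch above

# Placeholder training data standing in for real frame representations.
x = np.random.rand(32, NUM_FRAMES, NUM_ACTIVITIES).astype("float32")
y = tf.keras.utils.to_categorical(
    np.random.randint(0, NUM_ACTIVITIES, 32), NUM_ACTIVITIES)

# Phase 1: RMSProp (lr = 0.001, fuzz factor epsilon = 1e-8) until convergence.
model.compile(optimizer=RMSprop(learning_rate=0.001, epsilon=1e-8),
              loss="categorical_crossentropy", metrics=["accuracy"])
model.fit(x, y, epochs=30)

# Phase 2: fine-tune with small-learning-rate SGD (lr = 0.0001).
model.compile(optimizer=SGD(learning_rate=0.0001),
              loss="categorical_crossentropy", metrics=["accuracy"])
model.fit(x, y, epochs=10)
```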
Description of the drawings
Fig. 1 is a system flow chart of the human activity recognition method based on a single-frame representation model of the present invention.
Fig. 2 is a practical application diagram of the human activity recognition method based on a single-frame representation model of the present invention.
Specific embodiments
It should be noted that, in the absence of conflict, the embodiments of the present application and the features in the embodiments may be combined with each other. The invention is further described in detail below with reference to the drawings and specific embodiments.
Fig. 1 is a system flow chart of the human activity recognition method based on a single-frame representation model of the present invention. The method mainly includes preprocessing, the single-frame representation model, the activity recognition model, and model optimization and training.
Each of these components proceeds as described above in the summary of the invention.
Fig. 2 is a practical application diagram of the human activity recognition method based on a single-frame representation model of the present invention. Human activity recognition can be applied to law enforcement responsible for monitoring large crowd activities: rapid human activity recognition and analysis of the large volume of video shot by street surveillance makes it possible to quickly identify suspicious or criminal behavior. Similarly, if a television broadcast could automatically identify the highlights of a match video, rather than the entire three- or four-hour match, sports fans who cannot watch a major event live could better enjoy the match. In addition, after a large-scale natural disaster, efficient human activity recognition on real-time video shot by drones can help paramedics quickly locate victims waving for rescue on rooftops.
For those skilled in the art, the present invention is not limited to the details of the above embodiments; the present invention can be realized in other specific forms without departing from its spirit and scope. In addition, those skilled in the art may make various modifications and variations to the present invention without departing from its spirit and scope, and such improvements and modifications should also be regarded as falling within the protection scope of the present invention. Therefore, the appended claims are intended to be interpreted as covering the preferred embodiments and all changes and modifications that fall within the scope of the present invention.
Claims (10)
1. A robust and efficient human activity recognition method, characterized in that it mainly includes: preprocessing (1); a single-frame representation model (2); an activity recognition model (3); and model optimization and training (4).
2. The preprocessing (1) according to claim 1, characterized in that the original frames (which provide environmental information) and their corresponding optical flow images (which provide motion information) are taken as input; the video frames at time t and time t-1 are then fed into FlowNet 2.0 to compute the optical flow, because FlowNet 2.0 has the best performance in generating optical flow images; the optical flow information is finally visualized as a color (three-channel) image, i.e., the optical flow image, where each frame of the video (except the first frame) generates one optical flow image.
3. The single-frame representation model (2) according to claim 1, characterized in that it includes two convolutional neural network (CNN) feature extractors (one for the video frame, the other for the optical flow image) and one long short-term memory (LSTM) model; although any CNN model can serve as the feature extractor, VGG16 is used for simplicity of explanation, and the size of the video frames and optical flow images is fixed at (224 × 224 × 3); after the optical flow image at time t is obtained, it and the corresponding video frame are fed into the two CNN models to extract features, where the last four layers of VGG16 are removed; a global average pooling layer and a global max pooling layer are then added, and the outputs of these global pooling layers are supplied to the LSTM model, i.e., LSTM1, which means LSTM1 has 4 input steps, each of 512 dimensions; a fully connected layer with a Softmax activation function is added to the output of the final step of LSTM1 and generates a feature representation for each input video frame; the loss of the entire single-frame representation model can be computed with categorical cross-entropy.
4. The CNN feature extractors according to claim 3, characterized in that they contain all layers of VGG16 from "block1_conv2" to "block5_pool", where the output size of the "block5_pool" layer is (7 × 7 × 512).
5. The loss of the single-frame representation model according to claim 3, characterized in that LSTM1 is a single-layer LSTM with 200 hidden units, and the output dimension of fully connected layer 1 (FC1) is set to the number of final activities, with the reference label during training set to the vector of the final activity label; the single-frame representation model is trained as a classification task, in which the feature representation is the probability distribution of each video frame; the loss at time t, denoted loss_{1,t}, can be computed with categorical cross-entropy:

$$loss_{1,t} = -\sum_{i} g_{t,i}\,\log(p_{t,i}) \qquad (1)$$

where g is the reference label and p is the predicted value; at test time, the model generates a probability vector as the representation of each video frame.
6. The activity recognition model (3) according to claim 1, characterized in that the final activity label is predicted from the sequence of generated single-frame representations; the activity recognition model is an LSTM network, i.e., LSTM2, which takes the single-frame representations as input, so the number of input steps of LSTM2 equals the number of input video frames; the output of the final step of LSTM2 is then fed into a fully connected layer, and the final activity label is predicted using a Softmax activation function; the loss of the entire activity recognition model can be computed with categorical cross-entropy.
7. The loss of the activity recognition model according to claim 6, characterized in that LSTM2 is also a single-layer LSTM with 200 hidden units, and the output dimension of FC2 is set to the number of activity classes; to train this classification model, categorical cross-entropy is used for loss_2:

$$loss_2 = -\sum_{i} g_{i}\,\log(p_{i}) \qquad (2)$$

where g is the reference label and p is the predicted value.
8. The model optimization and training (4) according to claim 1, characterized in that the optimization and training of the model are implemented in the Python programming language using the Keras library with a TensorFlow backend.
9. The optimization of the model according to claim 8, characterized in that the model is optimized as a multi-task learning model, and the overall loss can then be computed as:

$$loss = \frac{1}{timestep}\sum_{t=1}^{timestep} loss_{1,t} + \lambda \cdot loss_2 \qquad (3)$$

where timestep is the number of frames on which the model's prediction of the final activity label is based, loss_{1,t} is the loss of the single-frame representation generated at time t, and loss_2 is the loss of the final activity prediction; the parameter λ balances the single-frame representation loss against the final activity classification loss; considering that the final purpose is to predict the final activity, λ = 2 is set to assign a higher weight to the final activity prediction; optimization then proceeds with the objective of minimizing this loss.
10. The training of the model according to claim 8, characterized in that, to accelerate the training process and obtain better performance, VGG16 weights pre-trained on the ImageNet dataset are loaded; the model is then trained with the RMSProp optimizer, with a learning rate of 0.001 and a fuzz factor of 10^{-8}, until the loss converges; the optimizer is then changed to stochastic gradient descent (SGD) with a learning rate of 0.0001; the RMSProp optimizer helps the model converge quickly, and SGD with a small learning rate helps fine-tune the model.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810344993.2A CN108537195A (en) | 2018-04-17 | 2018-04-17 | A human activity recognition method based on a single-frame representation model
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810344993.2A CN108537195A (en) | 2018-04-17 | 2018-04-17 | A human activity recognition method based on a single-frame representation model
Publications (1)
Publication Number | Publication Date |
---|---|
CN108537195A true CN108537195A (en) | 2018-09-14 |
Family
ID=63481122
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201810344993.2A Withdrawn CN108537195A (en) | 2018-04-17 | 2018-04-17 | A kind of mankind's activity recognition methods indicating model based on single frames |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN108537195A (en) |
Patent Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN105303193A (en) * | 2015-09-21 | 2016-02-03 | 重庆邮电大学 | People counting system for processing single-frame image |
CN107292247A (en) * | 2017-06-05 | 2017-10-24 | 浙江理工大学 | A human behavior recognition method and device based on a residual network |
Non-Patent Citations (1)
Title |
---|
XIN LI, MOOI CHOO CHUAH: "ReHAR: Robust and Efficient Human Activity Recognition", arXiv *
Cited By (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111539988A (en) * | 2020-04-15 | 2020-08-14 | 京东方科技集团股份有限公司 | Visual odometer implementation method and device and electronic equipment |
CN111539988B (en) * | 2020-04-15 | 2024-04-09 | 京东方科技集团股份有限公司 | Visual odometer implementation method and device and electronic equipment |
CN111695501A (en) * | 2020-06-11 | 2020-09-22 | 青岛大学 | Equipment soft fault detection method based on operating system kernel calling data |
CN111695501B (en) * | 2020-06-11 | 2021-08-10 | 青岛大学 | Equipment soft fault detection method based on operating system kernel calling data |
CN112417989A (en) * | 2020-10-30 | 2021-02-26 | 四川天翼网络服务有限公司 | Invigilator violation identification method and system |
CN112381182A (en) * | 2020-12-11 | 2021-02-19 | 大连海事大学 | Daily activity prediction method based on interactive multi-task model |
CN112381182B (en) * | 2020-12-11 | 2024-01-19 | 大连海事大学 | Daily activity prediction method based on interactive multitasking model |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN108537195A (en) | A human activity recognition method based on a single-frame representation model | |
US10558892B2 (en) | Scene understanding using a neurosynaptic system | |
Jiang et al. | Deepurbanmomentum: An online deep-learning system for short-term urban mobility prediction | |
Wang et al. | Detection of unwanted traffic congestion based on existing surveillance system using in freeway via a CNN-architecture trafficnet | |
CN109191453A (en) | Method and apparatus for generating image category detection model | |
US11593610B2 (en) | Airport noise classification method and system | |
CN109902880A (en) | An urban crowd flow prediction method based on a Seq2Seq generative adversarial network | |
Jeon et al. | Artificial intelligence for traffic signal control based solely on video images | |
CN114049513A (en) | Knowledge distillation method and system based on multi-student discussion | |
CN113095346A (en) | Data labeling method and data labeling device | |
CN110413838A (en) | An unsupervised video summarization model and its construction method | |
CN109961041B (en) | Video identification method and device and storage medium | |
KR20220004491A (en) | Artificial intelligence based tree data management system and tree data management method | |
CN112446331A (en) | Knowledge distillation-based spatio-temporal two-stream segmented network behavior recognition method and system | |
Gaihua et al. | A serial-parallel self-attention network joint with multi-scale dilated convolution | |
CN117116048A (en) | Knowledge-driven traffic prediction method based on knowledge representation model and graph neural network | |
US11734557B2 (en) | Neural network with frozen nodes | |
CN115082752A (en) | Target detection model training method, device, equipment and medium based on weak supervision | |
CN115565146A (en) | Perception model training method and system for acquiring bird's-eye-view features based on an autoencoder | |
Jain et al. | Smart city management system using IoT with deep learning | |
Zheng et al. | A deep learning–based approach for moving vehicle counting and short-term traffic prediction from video images | |
Li et al. | Multi-branch semantic GAN for infrared image generation from optical image | |
CN116955698A (en) | Matching model training method and device, electronic equipment and storage medium | |
CN113255563B (en) | Scenic spot crowd flow control system and method | |
CN115965890A (en) | Method, device and equipment for video content recognition and model training |
Legal Events
Date | Code | Title | Description
---|---|---|---
| PB01 | Publication | |
| SE01 | Entry into force of request for substantive examination | |
| WW01 | Invention patent application withdrawn after publication | Application publication date: 20180914 |