CN108537195A - Human activity recognition method based on a single-frame representation model - Google Patents

Human activity recognition method based on a single-frame representation model

Info

Publication number
CN108537195A
CN108537195A (application CN201810344993.2A)
Authority
CN
China
Prior art keywords
model
loss
final
optical flow
single frame
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
CN201810344993.2A
Other languages
Chinese (zh)
Inventor
夏春秋 (Xia Chunqiu)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Vision Technology Co Ltd
Original Assignee
Shenzhen Vision Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Vision Technology Co Ltd filed Critical Shenzhen Vision Technology Co Ltd
Priority to CN201810344993.2A priority Critical patent/CN108537195A/en
Publication of CN108537195A publication Critical patent/CN108537195A/en
Withdrawn legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/40 Scenes; Scene-specific elements in video content
    • G06V20/41 Higher-level, semantic clustering, classification or understanding of video scenes, e.g. detection, labelling or Markovian modelling of sport events or news items
    • G06V20/42 Higher-level, semantic clustering, classification or understanding of video scenes, e.g. detection, labelling or Markovian modelling of sport events or news items of sport video content
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/40 Scenes; Scene-specific elements in video content
    • G06V20/46 Extracting features or characteristics from the video content, e.g. video fingerprints, representative shots or key frames

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Multimedia (AREA)
  • Software Systems (AREA)
  • Computational Linguistics (AREA)
  • Evolutionary Computation (AREA)
  • General Engineering & Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Evolutionary Biology (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Image Analysis (AREA)

Abstract

The present invention proposes a human activity recognition method based on a single-frame representation model. Its main contents comprise: preprocessing, a single-frame representation model, an activity recognition model, and model optimization and training. The process is as follows: first, one optical flow image is generated for each video frame of the input video; then, all video frames and their corresponding optical flow images are fed into a single-frame representation model to generate model representations; next, a long short-term memory (LSTM) model generates the predicted final activity label from the representations produced in the previous step; finally, a fully connected layer with a Softmax activation function determines the final activity label. The present invention addresses the problems that previous human activity recognition methods had low recognition accuracy, required long computation time, and could not accurately recognize group activities; it can accurately recognize certain group activities with higher recognition accuracy and less required computation time.

Description

Human activity recognition method based on a single-frame representation model
Technical field
The present invention relates to the field of activity recognition, and more particularly to a human activity recognition method based on a single-frame representation model.
Background art
With advances in embedded system design, powerful cameras have been embedded in various smart devices, and wireless cameras can easily be deployed at street corners, traffic lights, large stadiums, railway stations, and similar locations. The resulting large volume of video has attracted many researchers to study human activity recognition. Human activity recognition can be applied to law enforcement tasked with monitoring large crowd activities: by performing rapid human activity recognition and analysis on the large volume of video captured by street surveillance, suspicious or criminal behavior can be identified quickly. Similarly, if television broadcasters could automatically identify the highlights of a match video, rather than broadcasting the entire three-to-four-hour match, sports fans who cannot watch major events live could better enjoy the match. In addition, after a large-scale natural disaster, efficient human activity recognition on real-time video captured by drones can allow rescue workers to quickly locate victims waving for rescue on rooftops. However, previous human activity recognition methods not only have low recognition accuracy and require long computation time, but also cannot accurately recognize group activities.
The human activity recognition method based on a single-frame representation model proposed in the present invention first generates one optical flow image for each video frame of the input video; all video frames and their corresponding optical flow images are then fed into a single-frame representation model to generate model representations; next, a long short-term memory (LSTM) model generates the predicted final activity label from the representations produced in the previous step; finally, a fully connected layer with a Softmax activation function determines the final activity label. The present invention addresses the problems that previous methods had low recognition accuracy for human activities, required long computation time, and could not accurately recognize group activities; it can accurately recognize certain group activities with higher recognition accuracy and less required computation time.
Summary of the invention
In view of the low recognition accuracy for human activities, the long required computation time, and the inability to accurately recognize group activities of previous methods, the purpose of the present invention is to provide a human activity recognition method based on a single-frame representation model. One optical flow image is first generated for each video frame of the input video; all video frames and their corresponding optical flow images are then fed into a single-frame representation model to generate model representations; next, a long short-term memory (LSTM) model generates the predicted final activity label from the representations produced in the previous step; finally, a fully connected layer with a Softmax activation function determines the final activity label.
To solve the above problems, the present invention provides a human activity recognition method based on a single-frame representation model, the main contents of which comprise:
(1) preprocessing;
(2) a single-frame representation model;
(3) an activity recognition model;
(4) model optimization and training.
The preprocessing refers to inputting the raw frames (providing environment information) and their corresponding optical flow images (providing motion information). The video frames at time t and time t-1 are fed into FlowNet 2.0 to compute the optical flow, because FlowNet 2.0 performs best at generating optical flow images. The optical flow information is finally visualized as a color (three-channel) image, i.e. the optical flow image; every frame of the video (except the first frame) thus generates one optical flow image. A minimal sketch of the flow color-coding step follows.
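The following Python sketch illustrates the visualization step only. FlowNet 2.0 is a separate trained network, so OpenCV's Farneback dense optical flow stands in for it here; the HSV color-coding mirrors the three-channel flow image described above.

```python
import cv2
import numpy as np

def flow_to_color(prev_frame, next_frame):
    """Visualize dense optical flow between two frames as a three-channel
    color image (hue encodes direction, brightness encodes magnitude)."""
    prev_gray = cv2.cvtColor(prev_frame, cv2.COLOR_BGR2GRAY)
    next_gray = cv2.cvtColor(next_frame, cv2.COLOR_BGR2GRAY)
    # Stand-in for FlowNet 2.0: Farneback dense optical flow.
    flow = cv2.calcOpticalFlowFarneback(prev_gray, next_gray, None,
                                        0.5, 3, 15, 3, 5, 1.2, 0)
    mag, ang = cv2.cartToPolar(flow[..., 0], flow[..., 1])
    hsv = np.zeros_like(prev_frame)
    hsv[..., 0] = ang * 180 / np.pi / 2                        # direction -> hue
    hsv[..., 1] = 255                                          # full saturation
    hsv[..., 2] = cv2.normalize(mag, None, 0, 255, cv2.NORM_MINMAX)
    return cv2.cvtColor(hsv, cv2.COLOR_HSV2BGR)                # optical flow image
```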
The single-frame representation model comprises two convolutional neural network (CNN) feature extractors (one for video frames, the other for optical flow images) and one long short-term memory (LSTM) model. Although any CNN model can serve as the feature extractor in the model, VGG16 is used for simplicity of explanation, and the size of the video frames and optical flow images is fixed at (224 × 224 × 3). After the optical flow image at time t is obtained, it and the corresponding video frame are input to the two CNN models to extract features; the last four layers of VGG16 are removed here. Then a global average pooling layer and a global max pooling layer are added, and the outputs of these global pooling layers are fed to the LSTM model, i.e. LSTM1, which means LSTM1 has 4 input steps, each of dimension 512. A fully connected layer with a Softmax activation function is added on the output of LSTM1's final step, generating a feature representation for each input video frame. The loss of the entire single-frame representation model can be computed with the categorical cross-entropy loss. A sketch of this architecture is given below.
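A minimal Keras sketch of the single-frame representation model, assuming TensorFlow 2.x; NUM_ACTIVITIES is a hypothetical class count, and the wiring follows the description above (VGG16 truncated at "block5_pool", global average and max pooling, a 4-step LSTM1, and a Softmax fully connected layer).

```python
import tensorflow as tf
from tensorflow.keras import layers, Model
from tensorflow.keras.applications import VGG16

NUM_ACTIVITIES = 10  # hypothetical number of final activity classes

# include_top=False drops VGG16's last four layers (flatten + three dense
# layers), so each extractor ends at "block5_pool" with output (7, 7, 512).
frame_cnn = VGG16(weights="imagenet", include_top=False, input_shape=(224, 224, 3))
flow_cnn = VGG16(weights="imagenet", include_top=False, input_shape=(224, 224, 3))
frame_cnn._name = "frame_vgg16"  # rename so the two sub-models do not clash
flow_cnn._name = "flow_vgg16"

frame_in = layers.Input((224, 224, 3), name="video_frame")
flow_in = layers.Input((224, 224, 3), name="flow_image")

# Global average and global max pooling of both extractors yield four
# 512-d vectors, i.e. the 4 input steps of LSTM1.
steps = [
    layers.GlobalAveragePooling2D()(frame_cnn(frame_in)),
    layers.GlobalMaxPooling2D()(frame_cnn(frame_in)),
    layers.GlobalAveragePooling2D()(flow_cnn(flow_in)),
    layers.GlobalMaxPooling2D()(flow_cnn(flow_in)),
]
seq = layers.Lambda(lambda xs: tf.stack(xs, axis=1))(steps)  # (batch, 4, 512)

h = layers.LSTM(200)(seq)                                    # LSTM1
rep = layers.Dense(NUM_ACTIVITIES, activation="softmax")(h)  # FC1 + Softmax

single_frame_model = Model([frame_in, flow_in], rep)
single_frame_model.compile(optimizer="rmsprop",
                           loss="categorical_crossentropy")  # eq. (1)
```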
Further, the CNN feature extractors contain all layers of VGG16 from "block1_conv2" to "block5_pool", where the output size of the "block5_pool" layer is (7 × 7 × 512).
Further, regarding the loss of the single-frame representation model: LSTM1 is a single-layer LSTM with 200 hidden units, and the output dimension of fully connected layer (FC layer) 1 is set to the number of final activities; the reference label during training is set to the vector of the final activity label. The single-frame representation model is then trained as a classification task, where the feature representation is a probability distribution for each video frame. The loss at time t is therefore denoted loss_{1,t} and can be computed with the categorical cross-entropy loss:
loss_{1,t} = -Σ_i g_{t,i} log(p_{t,i})   (1)
where g is the reference label and p is the predicted value. In the test phase, the model generates one probability vector as the representation of each video frame. A tiny numeric sketch of this loss follows.
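As a small numeric illustration of eq. (1), with made-up values:

```python
import numpy as np

def categorical_cross_entropy(g, p, eps=1e-12):
    """Eq. (1)/(2): g is the one-hot reference label, p the predicted
    probability distribution; eps guards against log(0)."""
    return -np.sum(g * np.log(p + eps))

g = np.array([0.0, 1.0, 0.0])           # reference label for one frame
p = np.array([0.1, 0.7, 0.2])           # model's predicted distribution
print(categorical_cross_entropy(g, p))  # -log(0.7) ~= 0.357
```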
The activity recognition model predicts the final activity label from the generated sequence of single-frame representations. The activity recognition model is an LSTM network, i.e. LSTM2, which takes the single-frame representations as input; therefore, the number of input steps of LSTM2 equals the number of current input video frames. The output of LSTM2's final step is then input to a fully connected layer, and a Softmax activation function is used to predict the final activity label. The loss of the entire activity recognition model can be computed with the categorical cross-entropy loss. A sketch follows.
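A matching Keras sketch of the activity recognition model, under the same assumptions (TensorFlow 2.x, hypothetical NUM_ACTIVITIES); the variable-length frame axis reflects that LSTM2's step count equals the number of input frames.

```python
from tensorflow.keras import layers, Model

NUM_ACTIVITIES = 10  # hypothetical, must match the single-frame model

# LSTM2 consumes the sequence of single-frame representations; `None`
# lets the number of input steps equal the number of video frames.
rep_seq = layers.Input((None, NUM_ACTIVITIES), name="frame_representations")
h = layers.LSTM(200)(rep_seq)                                  # LSTM2
label = layers.Dense(NUM_ACTIVITIES, activation="softmax")(h)  # FC2 + Softmax

activity_model = Model(rep_seq, label)
activity_model.compile(optimizer="rmsprop",
                       loss="categorical_crossentropy")        # eq. (2)
```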
Further, regarding the loss of the activity recognition model: LSTM2 is also a single-layer LSTM with 200 hidden units, and the output of FC2 is set to the number of activity classes. To train this classification model, the categorical cross-entropy loss is used to train loss_2:
loss_2 = -Σ_i g_i log(p_i)   (2)
where g is the reference label and p is the predicted value.
The model optimization and training are implemented with the Python programming language and the Keras library on the TensorFlow learning system.
Further, the optimization of the model refers to optimizing the model as a multi-task learning model; the overall loss can then be computed as:

loss = Σ_{t=1}^{timestep} loss_{1,t} + λ · loss_2   (3)

where timestep is the number of frames on which the model's prediction of the final activity label is based, loss_{1,t} is the loss of the single-frame representation generated at time t, and loss_2 is the loss of the final activity prediction. The λ parameter balances the single-frame representation loss against the final activity classification loss; considering that the ultimate purpose is to predict the final activity, λ = 2 is set to assign a higher weight to the final activity prediction. Optimization then proceeds with the goal of minimizing the loss. A sketch of this combined loss follows.
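A minimal sketch of eq. (3), assuming the per-frame losses are collected into a tensor; in Keras the same balance is commonly expressed with `loss_weights=[1.0, 2.0]` in `compile` on a two-output model.

```python
import tensorflow as tf

LAMBDA = 2.0  # assigns higher weight to the final activity prediction

def overall_loss(frame_losses, final_loss, lam=LAMBDA):
    """Eq. (3): combine the single-frame representation losses loss_{1,t}
    with the final activity classification loss loss_2."""
    return tf.reduce_sum(frame_losses) + lam * final_loss
```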
Further, the training of the model refers to the following: to accelerate the training process and obtain better performance, VGG16 weights pre-trained on the ImageNet dataset are loaded; the model is then trained with the RMSProp optimizer, with a learning rate of 0.001 and a fuzz factor of 10^-8, until the loss converges. The optimizer is then changed to stochastic gradient descent (SGD) with a learning rate of 0.0001. The RMSProp optimizer helps the model converge quickly, and SGD with a small learning rate helps fine-tune the model. A sketch of this two-stage schedule follows.
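The two-stage schedule can be sketched as follows; `model` and `train_ds` are hypothetical stand-ins for the combined model and a prepared training dataset.

```python
from tensorflow.keras.optimizers import RMSprop, SGD

# Stage 1: RMSProp with learning rate 0.001 and fuzz factor (epsilon)
# 1e-8 helps the model converge quickly.
model.compile(optimizer=RMSprop(learning_rate=0.001, epsilon=1e-8),
              loss="categorical_crossentropy")
model.fit(train_ds, epochs=20)   # train until the loss converges

# Stage 2: SGD with a small learning rate of 0.0001 fine-tunes the model.
model.compile(optimizer=SGD(learning_rate=0.0001),
              loss="categorical_crossentropy")
model.fit(train_ds, epochs=10)
```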
Description of the drawings
Fig. 1 is a system flow chart of the human activity recognition method based on a single-frame representation model of the present invention.
Fig. 2 is a practical application diagram of the human activity recognition method based on a single-frame representation model of the present invention.
Detailed description of the embodiments
It should be noted that, where no conflict arises, the embodiments of this application and the features in the embodiments may be combined with each other. The invention is described in further detail below with reference to the drawings and specific embodiments.
Fig. 1 is a system flow chart of the human activity recognition method based on a single-frame representation model of the present invention, mainly comprising preprocessing, the single-frame representation model, the activity recognition model, and the optimization and training of the model.
The preprocessing, the single-frame representation model and its loss, the activity recognition model and its loss, and the model optimization and training are as described above in the summary of the invention.
Fig. 2 is a practical application diagram of the human activity recognition method based on a single-frame representation model of the present invention. As described in the background art, human activity recognition can be applied to law enforcement monitoring large crowd activities, quickly identifying suspicious or criminal behavior from street surveillance video; to television broadcasts, automatically identifying the highlights of a match video rather than the entire three-to-four-hour match, so that sports fans who cannot watch major events live can better enjoy the match; and, after a large-scale natural disaster, to efficient human activity recognition on real-time video captured by drones, allowing rescue workers to quickly locate victims waving for rescue on rooftops.
For those skilled in the art, the present invention is not limited to the details of the above embodiments, and the present invention can be realized in other specific forms without departing from the spirit and scope of the present invention. In addition, those skilled in the art can make various modifications and variations to the present invention without departing from its spirit and scope, and these improvements and modifications should also be regarded as falling within the protection scope of the present invention. Therefore, the appended claims are intended to be interpreted as including the preferred embodiments and all changes and modifications falling within the scope of the present invention.

Claims (10)

1. A robust and efficient human activity recognition method, characterized in that it mainly comprises: preprocessing (1); a single-frame representation model (2); an activity recognition model (3); and model optimization and training (4).
2. The preprocessing (1) according to claim 1, characterized in that the raw frames (providing environment information) and their corresponding optical flow images (providing motion information) are input; the video frames at time t and time t-1 are then fed into FlowNet 2.0 to compute the optical flow, because FlowNet 2.0 performs best at generating optical flow images; the optical flow information is finally visualized as a color (three-channel) image, i.e. the optical flow image, wherein every frame of the video (except the first frame) generates one optical flow image.
3. indicating model (two) based on the single frames described in claims 1, which is characterized in that including two convolutional neural networks (CNN) feature extractor (one is used for video frame, another is used for light stream image) and one long short-term memory (LSTM) model, Although any CNN models all can serve as the feature extractor in model, explains to simplify, replaced using VGG16, and And the size of fixed video frame and light stream image is (224 × 224 × 3);After obtaining the light stream image of time t, by it and accordingly Video frame be input to two CNN models to extract feature, to remove last four layers of VGG16 here;Then, one is added The average pond layer of the overall situation and a global maximum pond layer, LSTM models, i.e. LSTM1 are supplied to by the output of these global pool layers, this Mean that LSTM1 has 4 input steps, each step there are 512 dimensions, one is added in the output of LSTM1 final steps The full articulamentum of a band Softmax activation primitives, it will be that each input video frame generates character representation;Entire single frames indicates mould The loss of type can use classification cross entropy costing bio disturbance.
4. The CNN feature extractors according to claim 3, characterized in that they contain all layers of VGG16 from "block1_conv2" to "block5_pool", where the output size of the "block5_pool" layer is (7 × 7 × 512).
5. The loss of the single-frame representation model according to claim 3, characterized in that LSTM1 is a single-layer LSTM with 200 hidden units, and the output dimension of fully connected layer (FC layer) 1 is set to the number of final activities; the reference label during training is set to the vector of the final activity label; the single-frame representation model is trained as a classification task, where the feature representation is a probability distribution for each video frame; the loss at time t is therefore denoted loss_{1,t} and can be computed with the categorical cross-entropy loss:
loss_{1,t} = -Σ_i g_{t,i} log(p_{t,i})   (1)
where g is the reference label and p is the predicted value; in the test phase, the model generates one probability vector as the representation of each video frame.
6. The activity recognition model (3) according to claim 1, characterized in that the final activity label is predicted from the generated sequence of single-frame representations; the activity recognition model is an LSTM network, i.e. LSTM2, which takes the single-frame representations as input; therefore, the number of input steps of LSTM2 equals the number of current input video frames; the output of LSTM2's final step is then input to a fully connected layer, and a Softmax activation function is used to predict the final activity label; the loss of the entire activity recognition model can be computed with the categorical cross-entropy loss.
7. The loss of the activity recognition model according to claim 6, characterized in that LSTM2 is also a single-layer LSTM with 200 hidden units, and the output of FC2 is set to the number of activity classes; to train this classification model, the categorical cross-entropy loss is used to train loss_2:
loss_2 = -Σ_i g_i log(p_i)   (2)
where g is the reference label and p is the predicted value.
8. The model optimization and training (4) according to claim 1, characterized in that the optimization and training of the model are implemented with the Python programming language and the Keras library on the TensorFlow learning system.
9. The optimization of the model according to claim 6, characterized in that the model is optimized as a multi-task learning model; the overall loss can then be computed as:

loss = Σ_{t=1}^{timestep} loss_{1,t} + λ · loss_2   (3)

where timestep is the number of frames on which the model's prediction of the final activity label is based, loss_{1,t} is the loss of the single-frame representation generated at time t, and loss_2 is the loss of the final activity prediction; the λ parameter balances the single-frame representation loss against the final activity classification loss; considering that the ultimate purpose is to predict the final activity, λ = 2 is set to assign a higher weight to the final activity prediction; optimization then proceeds with the goal of minimizing the loss.
10. The training of the model according to claim 6, characterized in that, in order to accelerate the training process and obtain better performance, VGG16 weights pre-trained on the ImageNet dataset are loaded; the model is then trained with the RMSProp optimizer, with a learning rate of 0.001 and a fuzz factor of 10^-8, until the loss converges; the optimizer is then changed to stochastic gradient descent (SGD) with a learning rate of 0.0001; the RMSProp optimizer helps the model converge quickly, and SGD with a small learning rate helps fine-tune the model.
CN201810344993.2A 2018-04-17 2018-04-17 Human activity recognition method based on a single-frame representation model Withdrawn CN108537195A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810344993.2A CN108537195A (en) 2018-04-17 2018-04-17 Human activity recognition method based on a single-frame representation model

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810344993.2A CN108537195A (en) 2018-04-17 2018-04-17 Human activity recognition method based on a single-frame representation model

Publications (1)

Publication Number Publication Date
CN108537195A true CN108537195A (en) 2018-09-14

Family

ID=63481122

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810344993.2A Withdrawn CN108537195A (en) 2018-04-17 2018-04-17 Human activity recognition method based on a single-frame representation model

Country Status (1)

Country Link
CN (1) CN108537195A (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111539988A (en) * 2020-04-15 2020-08-14 京东方科技集团股份有限公司 Visual odometer implementation method and device and electronic equipment
CN111695501A (en) * 2020-06-11 2020-09-22 青岛大学 Equipment soft fault detection method based on operating system kernel calling data
CN112381182A (en) * 2020-12-11 2021-02-19 大连海事大学 Daily activity prediction method based on interactive multi-task model
CN112417989A (en) * 2020-10-30 2021-02-26 四川天翼网络服务有限公司 Invigilator violation identification method and system

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105303193A (en) * 2015-09-21 2016-02-03 重庆邮电大学 People counting system for processing single-frame image
CN107292247A (en) * 2017-06-05 2017-10-24 浙江理工大学 A kind of Human bodys' response method and device based on residual error network

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105303193A (en) * 2015-09-21 2016-02-03 重庆邮电大学 People counting system for processing single-frame image
CN107292247A (en) * 2017-06-05 2017-10-24 浙江理工大学 A kind of Human bodys' response method and device based on residual error network

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
XIN LI, MOOI CHOO CHUAH: "ReHAR: Robust and Efficient Human Activity Recognition", arXiv *

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111539988A (en) * 2020-04-15 2020-08-14 京东方科技集团股份有限公司 Visual odometer implementation method and device and electronic equipment
CN111539988B (en) * 2020-04-15 2024-04-09 京东方科技集团股份有限公司 Visual odometer implementation method and device and electronic equipment
CN111695501A (en) * 2020-06-11 2020-09-22 青岛大学 Equipment soft fault detection method based on operating system kernel calling data
CN111695501B (en) * 2020-06-11 2021-08-10 青岛大学 Equipment soft fault detection method based on operating system kernel calling data
CN112417989A (en) * 2020-10-30 2021-02-26 四川天翼网络服务有限公司 Invigilator violation identification method and system
CN112381182A (en) * 2020-12-11 2021-02-19 大连海事大学 Daily activity prediction method based on interactive multi-task model
CN112381182B (en) * 2020-12-11 2024-01-19 大连海事大学 Daily activity prediction method based on interactive multitasking model

Similar Documents

Publication Publication Date Title
CN108537195A (en) Human activity recognition method based on a single-frame representation model
US10558892B2 (en) Scene understanding using a neurosynaptic system
Jiang et al. Deepurbanmomentum: An online deep-learning system for short-term urban mobility prediction
Wang et al. Detection of unwanted traffic congestion based on existing surveillance system using in freeway via a CNN-architecture trafficnet
CN109191453A (en) Method and apparatus for generating image category detection model
US11593610B2 (en) Airport noise classification method and system
CN109902880A (en) A kind of city stream of people's prediction technique generating confrontation network based on Seq2Seq
Jeon et al. Artificial intelligence for traffic signal control based solely on video images
CN114049513A (en) Knowledge distillation method and system based on multi-student discussion
CN113095346A (en) Data labeling method and data labeling device
CN110413838A (en) A kind of unsupervised video frequency abstract model and its method for building up
CN109961041B (en) Video identification method and device and storage medium
KR20220004491A (en) Artificial intelligence based tree data management system and tree data management method
CN112446331A (en) Knowledge distillation-based space-time double-flow segmented network behavior identification method and system
Gaihua et al. A serial-parallel self-attention network joint with multi-scale dilated convolution
CN117116048A (en) Knowledge-driven traffic prediction method based on knowledge representation model and graph neural network
US11734557B2 (en) Neural network with frozen nodes
CN115082752A (en) Target detection model training method, device, equipment and medium based on weak supervision
CN115565146A (en) Perception model training method and system for acquiring aerial view characteristics based on self-encoder
Jain et al. Smart city management system using IoT with deep learning
Zheng et al. A deep learning–based approach for moving vehicle counting and short-term traffic prediction from video images
Li et al. Multi-branch semantic GAN for infrared image generation from optical image
CN116955698A (en) Matching model training method and device, electronic equipment and storage medium
CN113255563B (en) Scenic spot people flow control system and method
CN115965890A (en) Method, device and equipment for video content recognition and model training

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
WW01 Invention patent application withdrawn after publication
WW01 Invention patent application withdrawn after publication

Application publication date: 20180914