CN114140879A - Behavior identification method and device based on multi-head cascade attention network and time convolution network - Google Patents

Behavior identification method and device based on multi-head cascade attention network and time convolution network Download PDF

Info

Publication number
CN114140879A
CN114140879A (application CN202111446154.XA)
Authority
CN
China
Prior art keywords
attention
network
video
local
self
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111446154.XA
Other languages
Chinese (zh)
Inventor
Guo Yuanjun (郭媛君)
Yang Zhile (杨之乐)
Chen Xuejian (陈雪健)
Feng Wei (冯伟)
Wang Yao (王尧)
Wu Chengke (吴承科)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Institute of Advanced Technology of CAS
Original Assignee
Shenzhen Institute of Advanced Technology of CAS
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Institute of Advanced Technology of CAS filed Critical Shenzhen Institute of Advanced Technology of CAS
Priority to CN202111446154.XA priority Critical patent/CN114140879A/en
Publication of CN114140879A publication Critical patent/CN114140879A/en
Pending legal-status Critical Current

Links

Images

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G06F18/2148 Generating training patterns; Bootstrap methods, e.g. bagging or boosting characterised by the process organisation or structure, e.g. boosting cascade
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Software Systems (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Mathematical Physics (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Image Analysis (AREA)

Abstract

The invention provides a behavior identification method and device based on a multi-head cascade attention network and a time convolution network, comprising the following steps: collecting a video and extracting video feature information from it; capturing local attention weights in a self-attention manner; capturing further feature information in the video with a multi-head attention mechanism; weighting the feature values in feature space by linear transformation and normalization to increase the diversity of the self-attention features; integrating the local features into multiple global representations using the local attention weights, and learning attention weights with the self-attention features as input; extracting time-series features with a multi-stage time convolution network to refine the prediction; and analyzing the prediction through an expert system to obtain the final behavior category. The method overcomes the limitations of existing recognition approaches: its monitoring results are accurate and timely, and it is unlikely to be affected by external factors such as dust and volatile gases.

Description

Behavior identification method and device based on multi-head cascade attention network and time convolution network
Technical Field
The invention belongs to the field of behavior identification, and particularly relates to a behavior identification method, a behavior identification system, an electronic device and application based on a multi-head cascade attention network and a time convolution network.
Background
With the development of monitoring technology, image information acquired by an optical camera can be input into a computer, where computer vision techniques perform real-time information processing and pattern recognition on the image sequences in the video, according to a pre-designed algorithm, to detect smoking behavior. Compared with manual supervision and traditional smoke-sensor alarms, a computer-vision-based smoking detection system offers a wide monitoring range, high utilization of monitoring resources, automatic localization of smokers, and automatic alarms.
Traditional smoking detection generally relies on manual supervision, smoke sensors, or wearable devices. These approaches have several limitations: first, smoke in outdoor scenes is heavily diluted and cannot be sensed by a smoke sensor; second, wearable devices are costly to deploy; third, manual inspection requires substantial manpower. Moreover, traditional physical detection methods cannot locate smokers in real time.
Smoking detection and intervention have used different available technologies in the past few years, including sensors, computer vision, wearable sensory computing technologies, and the like. Due to the characteristics of low cigarette concentration and easiness in dispersion, the smoke detection equipment based on the sensor is limited by the size and the sealing degree of a use space, is easily interfered by external factors such as dust, volatile gas and the like, and cannot be applied to smoking behavior detection in most public places. Meanwhile, the traditional smoke sensing equipment cannot position smokers in real time and cannot effectively guarantee the effective operation of smoke prohibition work.
Therefore, a monitoring means which is low in cost, efficient and capable of determining the target action in real time is urgently needed to be developed.
Disclosure of Invention
In order to overcome the defects of the prior art, the invention provides a behavior identification method and device based on a multi-head cascade attention network and a time convolution network, which are used for solving at least one of the technical problems.
In order to achieve the purpose, the invention adopts the specific scheme that:
a behavior identification method based on a multi-head cascade attention network and a time convolution network comprises the following steps:
collecting a video and extracting video characteristic information in the video;
learning at least 1 attention feature through the video feature information, and capturing a local attention weight in a self-attention mode;
capturing other characteristic information in the video by adopting a multi-head attention mechanism;
weighting the characteristic values in the characteristic space by adopting a linear transformation and normalization method, and increasing the diversity of the self-attention characteristics;
adopting a multi-head cascade attention network, integrating local features into a plurality of global representations by using the local attention weight, and learning attention weight by taking the self-attention feature as input;
acquiring a first action label corresponding to the video characteristic information according to the attention weight, and extracting time sequence characteristics of the first action label according to a multi-stage time convolution network to improve a prediction result;
and analyzing the prediction result through an expert system to obtain a final behavior category.
The acquiring a video and extracting video feature information in the video includes:
The video is represented by its K segments I = [I_1, I_2, …, I_K], and a feature extraction network with parameter θ_1 extracts the feature information of the video:

X = [x_1, x_2, …, x_K] = [r(I_1; θ_1), …, r(I_K; θ_1)]

where I_i ∈ R^(H×W×3×L), H and W are the height and width of the input video segment, respectively, and L is the length of the video segment.
The "learning at least 1 attention feature through the video feature information and capturing a local attention weight in a self-attention manner" includes:
The video feature information is input into the next two fully connected layers: the first fully connected layer learns the self-attention weights, and the second, combined with data normalization, yields the multiple learned attention features.
The self-attention weight α_ij for the input is defined as a softmax over the video segments:

α_ij = exp(w_i^T x_j) / Σ_{t=1}^{k} exp(w_i^T x_t)
Each output of the first FC layer is the weighted sum of the original features with the attention weights of the i-th head attention module, defined as follows:

y_i = Σ_{j=1}^{k} α_ij x_j
where k is the number of video segments; x_j is the feature information of the j-th video frame; w is a parameter of the fully connected layer of the global attention module.
The weighting of the feature values in the feature space by adopting the linear transformation and normalization method comprises the following steps:
The linear transformation is performed by the following procedure (the defining equation is given only as an image in the source): each self-attention output y_i is linearly transformed and normalized to y_i′, for i = 1, 2, …, N, where y_i′ is obtained by a linear transformation of the fully connected layer output y_i, and N is the number of self-attention modules.
The "adopting a multi-head cascade attention network, integrating local features into a plurality of global representations by using the local attention weight, and learning the attention weight by using the self-attention feature as an input" includes:
learning attention weights by connecting the video representation and the cascaded layers of self-attention features, each attention weight being defined as follows, taking the self-attention features as input:
βi=sigmoid(wT[yi′;G])
where w is a parameter of the fully connected layer of the global attention module; [y_i′; G] denotes that y_i′ and G are connected by a concatenation operator; i = 1, 2, 3, ….
The "acquiring a first action tag corresponding to the video feature information according to the attention weight and extracting a time series feature of the first action tag according to a multi-stage time convolution network" includes:
introducing a multi-stage time convolution network to finish the task of dividing the time action, and introducing expansion convolution in the time convolution network;
each stage in the time convolutional network takes an initial prediction from a previous stage and refines it.
A behavior recognition system based on a multi-head cascade attention network and a time convolution network comprises:
the multi-head cascade network module is used for acquiring local attention weights in the video and integrating local features into a plurality of global representations according to the local attention weights;
and the action logic combination module performs data interaction with the multi-head cascade network module and is used for acquiring behavior classification of the video information.
The multi-head cascade network module comprises:
the local attention module is connected with the outside and used for learning a plurality of attention weights of each network segment from network segment characteristics generated by the backbone of the multi-head cascade network module, capturing the importance of the local characteristics in a self-attention mode and obtaining local attention weights;
the global attention module is in data interaction with the local attention module and is used for integrating local features into a plurality of global representations by using the local attention weight values and then learning secondary attention of global information in a relational manner;
and the global attention module performs data interaction with the action logic combination module and is used for performing behavior recognition and classification.
An electronic device based on behavior recognition of a multi-headed cascade attention network and a time convolution network, comprising:
a storage medium for storing a computer program;
a processing unit, which exchanges data with the storage medium, and is used for executing the computer program through the processing unit when performing behavior recognition, so as to perform the steps of the behavior recognition method based on the multi-head cascade attention network and the time convolution network according to any one of claims 1 to 6.
The behavior identification method based on the multi-head cascade attention network and the time convolution network is applied to smoking monitoring.
Has the advantages that: the invention has the following advantages:
the method comprises the steps of firstly collecting video clips in monitoring videos of various public places, and marking the obtained videos to form a data set. And inputting the marked data set into a multi-head cascade attention network for pre-training to obtain pre-training weights, testing, training again and updating the pre-training weights, so that the accuracy of the network for identifying and positioning smoking behaviors achieves a better effect. The method effectively solves the limitation of the existing identification method, and has the advantages of accurate and timely monitoring result and low possibility of being influenced by external factors such as dust, volatile gas and the like.
The system of the invention completes the identification and classification of behaviors in a video by constructing two layers of attention modules combined with a time convolution network. The two layers comprise a local attention module and a global attention module: the local attention module captures the importance of local features in a self-attention manner to obtain local attention weights; the global attention module then integrates the local features into multiple global representations and learns secondary attention over the global information in a relational manner; finally, a multi-stage time convolution network performs the final identification and classification. The system has a simple structure, and the two-stage recognition yields accurate results.
Drawings
FIG. 1 is a flow chart of the method of the present invention.
Fig. 2 is a schematic diagram of a multi-head cascade network according to the present invention.
FIG. 3 is a block diagram of a multi-stage time convolutional network action logic combination.
Fig. 4 is a block diagram of an electronic device based on behavior recognition of a multi-headed cascade attention network and a time convolution network.
Detailed Description
The technical solution in the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present invention.
In this context, each self-attention module obtains a self-attention feature, and all the self-attention features are collectively called attention features.

Specific example I: in this embodiment, the behavior identification method and device based on a multi-head cascade attention network and a time convolution network according to the present invention are described in detail, taking real-time smoking monitoring as an example.
The specific technical flow chart of the invention is shown in the attached figures 1-3, and the detailed scheme of the behavior identification method based on the multi-head cascade attention network and the time convolution network comprises the following steps:
s1, first using I ═ I1,I2...Ik]k segments represent the video, and then feature information of the video is extracted through a feature extraction network with a parameter theta: x ═ X1,x2,...xK]=[r(I1;θ1),...,r(IK;θ1)]Wherein, Ii∈RH *W*3*LH and W are the height and width of the incoming video segment, respectively, and L is the length of the video segment.
S2: The extracted video feature information is input into the next two fully connected layers; the first fully connected layer learns the self-attention weights, while the second, combined with data normalization, learns the multiple attention features, so that each self-attention module obtains its own attention weights. First, the video of K frames is divided into {I_1, I_2, …, I_K}, and the features X = [x_1, x_2, …, x_K] of the K video frames are obtained through the feature extraction network r(·; θ_1) with parameter θ_1. The self-attention weight α_ij of the input video segment feature x_j is defined as a softmax over the segments:

α_ij = exp(w_i^T x_j) / Σ_{t=1}^{k} exp(w_i^T x_t)
Each output of the first fully connected layer is the weighted sum of the original features with the attention weights of the i-th head attention module, defined as follows:

y_i = Σ_{j=1}^{k} α_ij x_j
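Step S2 can be sketched as follows; the per-head parameters w_i, the softmax form of α_ij, and all dimensions are assumptions made for illustration:

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def local_self_attention(X, W):
    """X: (k, d) segment features; W: (n_heads, d) per-head parameters.
    Returns alpha: (n_heads, k) softmax weights over segments, and
    y: (n_heads, d), where y_i = sum_j alpha_ij * x_j."""
    scores = W @ X.T                                 # (n_heads, k)
    alpha = np.apply_along_axis(softmax, 1, scores)  # row-wise softmax
    y = alpha @ X                                    # per-head weighted sum
    return alpha, y

rng = np.random.default_rng(1)
X = rng.random((5, 16))    # k = 5 segments, d = 16
W = rng.random((3, 16))    # n_heads = 3 self-attention modules
alpha, y = local_self_attention(X, W)
print(alpha.shape, y.shape)  # (3, 5) (3, 16)
```

Each row of `alpha` sums to 1, so y_i is a convex combination of the segment features, one per attention head.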
s3, because a single-head self-attention weight feature may intelligently reflect the feature of a certain aspect of the video, for this reason, a multi-head attention mechanism is adopted to capture the feature of more aspects of the video.
S4: To prevent these self-attention features from always attending to similar signals, and to increase their diversity, we weight or shift the feature values in feature space by linear transformation and normalization. This makes the features differ from one another in value and distribution while preserving scale invariance, which also benefits network optimization. The process is defined as follows:
Each self-attention output y_i is linearly transformed and normalized to y_i′ (the defining equation is given only as an image in the source), where y_i′ is obtained by linearly transforming the fully connected layer output y_i.
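A minimal sketch of one plausible reading of this linear-transformation-and-normalization step, standardising each head's output to zero mean and unit variance; the exact transform in the patent is shown only as an image, so this form is an assumption:

```python
import numpy as np

def diversify_heads(Y):
    """Y: (N, d) outputs of the N self-attention modules.
    Standardise each head's output to zero mean and unit variance,
    so the heads stay on a common scale while differing in direction."""
    mu = Y.mean(axis=1, keepdims=True)
    sigma = Y.std(axis=1, keepdims=True) + 1e-8  # avoid division by zero
    return (Y - mu) / sigma

Y = np.array([[1.0, 2.0, 3.0],
              [10.0, 10.0, 40.0]])
Y_prime = diversify_heads(Y)
print(Y_prime.mean(axis=1))  # approximately [0, 0]
```

After standardisation the heads share a common scale, which is the stated goal (scale invariance) even if the patent's exact operator differs.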
S5: The global attention module learns the attention weights by concatenating the video representation and the approximate video representations as input. This module operates mainly on global features, and each attention weight can be defined in the form:
βi=sigmoid(wT[yi′;G])
where w is a parameter of the fully connected layer of the global attention module, and [y_i′; G] indicates that y_i′ and G are connected by a concatenation operator.
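The global attention weights β_i = sigmoid(w^T [y_i′; G]) can be computed directly; the dimensions and random parameters below are illustrative assumptions:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def global_attention(Y_prime, G, w):
    """beta_i = sigmoid(w^T [y_i'; G]) for each normalized head output y_i';
    G is the global video representation, w the global-module FC parameter."""
    return np.array([sigmoid(w @ np.concatenate([y, G])) for y in Y_prime])

rng = np.random.default_rng(2)
Y_prime = rng.random((3, 8))   # N = 3 head outputs, each 8-dim (toy sizes)
G = rng.random(8)              # global representation
w = rng.random(16)             # matches the concatenated [y_i'; G] length
beta = global_attention(Y_prime, G, w)
print(beta.shape)  # (3,): one weight in (0, 1) per head
```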
S6: After the first action label is obtained, a multi-stage time convolution network is appended at the end to extract the time-series features of the result. The effect of this combination is a gradual refinement of the predictions of the earlier stages.
On the basis, a multi-stage time convolution network is introduced to complete the division task of the time action; in order to reduce the number of parameters that need to be processed, a dilation convolution is introduced into this time convolution network. In this multi-stage model, each stage takes an initial prediction from the previous stage and refines it. Using such a multi-stage architecture helps to provide more context to predict class labels for each segment. In addition, since the output of each stage is an initial prediction, the network can capture the dependency relationship between the action classes and learn the possible action sequences, which helps to reduce over-segmentation errors.
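The multi-stage refinement idea (each stage smooths the previous stage's frame-wise predictions with dilated convolutions whose dilation grows per stage) can be sketched as follows; the toy depthwise smoothing kernel stands in for the learned dilated convolutions of an MS-TCN-style network and is an assumption:

```python
import numpy as np

def dilated_smooth(x, kernel, dilation):
    """Depthwise 1-D dilated convolution over a (T, C) sequence, 'same' length."""
    T, _ = x.shape
    K = kernel.shape[0]
    pad = dilation * (K - 1) // 2
    xp = np.pad(x, ((pad, pad), (0, 0)))
    out = np.zeros_like(x)
    for t in range(T):
        for k in range(K):
            out[t] += kernel[k] * xp[t + k * dilation]
    return out

def multi_stage_refine(probs, n_stages=3):
    """Each stage takes the previous stage's frame-wise class probabilities
    and refines them; dilation grows per stage, mimicking the MS-TCN idea."""
    kernel = np.array([0.25, 0.5, 0.25])  # fixed smoothing kernel (toy)
    for s in range(n_stages):
        probs = dilated_smooth(probs, kernel, dilation=2 ** s)
        probs = probs / probs.sum(axis=1, keepdims=True)  # keep valid distributions
    return probs

rng = np.random.default_rng(3)
p0 = rng.random((20, 4))
p0 = p0 / p0.sum(axis=1, keepdims=True)   # 20 frames, 4 action classes
p = multi_stage_refine(p0)
print(p.shape)  # (20, 4)
```

The smoothing across increasingly long temporal contexts is what reduces over-segmentation: isolated one-frame label flips get absorbed by their neighbours.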
S7: After the action class labels are obtained, the logical relations between the actions can be analyzed through an expert system, yielding the final, accurate behavior category.
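The expert-system step can be illustrated with a toy rule base over hypothetical sub-action labels; the labels and ordering rules below are invented for illustration and do not come from the patent:

```python
def expert_system(label_sequence):
    """Toy rule base for the action-logic analysis: a 'smoking' verdict requires
    the hypothetical sub-actions to occur in a plausible temporal order."""
    rules = [("raise_hand", "hold_to_mouth"), ("hold_to_mouth", "exhale_smoke")]
    idx = {a: label_sequence.index(a) for a in set(label_sequence)}
    for before, after in rules:
        if before not in idx or after not in idx or idx[before] > idx[after]:
            return "non_smoking"
    return "smoking"

print(expert_system(["raise_hand", "hold_to_mouth", "exhale_smoke"]))  # smoking
print(expert_system(["exhale_smoke", "raise_hand"]))                   # non_smoking
```

A real expert system would encode many more rules (durations, co-occurring objects, scene context), but the structure, ordered predicates over predicted action labels, is the same.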
Specific example II:
the invention also discloses an embodiment: as shown in fig. 4, a behavior recognition system based on a multi-head cascade attention network and a time convolution network includes: a multi-head cascade network module 100 and an action logic combination module 200; the multi-head cascade network module 100 is configured to obtain a local attention weight in a video and integrate local features into a plurality of global representations according to the local attention weight; the action logic combination module 200 performs data interaction with the multi-head cascade network module, and is used for behavior classification of video information.
The multi-head cascade network module 100 includes: a local attention module 101 and a global attention module 102; the local attention module 101 is connected with the outside, and is configured to learn a plurality of attention weights of each network segment from network segment features generated by a backbone of the multi-head cascade network module, and capture the importance of the local features in a self-attention manner to obtain local attention weights; the global attention module 102 performs data interaction with the local attention module 101, and is configured to integrate local features into a plurality of global representations by using the local attention weights, and then learn secondary attention of global information in a relational manner; the global attention module 102 performs data interaction with the action logic combination module 200 for behavior recognition and classification.
Specific example III:
the invention also provides an embodiment: an electronic device based on behavior recognition of a multi-headed cascade attention network and a time convolution network, comprising: a storage medium and a processing unit; a storage medium for storing a computer program; the processing unit exchanges data with the storage medium, and is used for executing the computer program through the processing unit when performing behavior recognition, so as to perform the steps of the behavior recognition method based on the multi-head cascade attention network and the time convolution network.
In the electronic device, the storage medium is preferably a storage device such as a mobile hard disk, a solid-state disk, or a USB flash disk; the processing unit, preferably a CPU, exchanges data with the storage medium and executes the computer program when performing behavior recognition, so as to perform the above-mentioned steps of behavior recognition based on the multi-head cascade attention network and the time convolution network.
The CPU described above can execute various appropriate actions and processes according to a program stored in the storage medium. The electronic device also includes peripherals: an input part such as a keyboard and a mouse, and an output part such as a cathode ray tube (CRT) or liquid crystal display (LCD) and a speaker. In particular, according to the disclosed embodiments of the invention, the processes described in FIG. 1 may be implemented as computer software programs.
An embodiment provided by the invention comprises a computer program product including a computer program carried on a computer-readable medium, the computer program containing program code for executing the method shown in the flowchart of FIG. 1. The computer program may be downloaded and installed from a network. When executed by the CPU, the computer program performs the above-described functions defined in the system of the present invention.
The present invention also provides a computer-readable storage medium having a computer program stored therein; the computer program, when executed, performs the steps of the behavior recognition method based on the multi-headed cascade attention network and the time convolution network as described above.
In the present invention, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. In the present invention, however, a computer readable signal medium may include a propagated data signal with computer readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: wireless, wire, fiber optic cable, RF, etc., or any suitable combination of the foregoing.
The above description is only an embodiment of the present invention, and the scope of the present invention is not limited thereto; any modification or substitution readily conceivable by a person skilled in the art within the technical scope disclosed herein shall fall within the protection scope of the present invention. Therefore, the protection scope of the present invention is subject to the protection scope of the claims.

Claims (10)

1. A behavior identification method based on a multi-head cascade attention network and a time convolution network is characterized by comprising the following steps:
collecting a video and extracting video characteristic information in the video;
learning at least 1 attention feature through the video feature information, and capturing a local attention weight in a self-attention mode;
capturing other characteristic information in the video by adopting a multi-head attention mechanism;
weighting the characteristic values in the characteristic space by adopting a linear transformation and normalization method, and increasing the diversity of the self-attention characteristics;
adopting a multi-head cascade attention network, integrating local features into a plurality of global representations by using the local attention weight, and learning attention weight by taking the self-attention feature as input;
acquiring a first action label corresponding to the video characteristic information according to the attention weight, and extracting time sequence characteristics of the first action label according to a multi-stage time convolution network to improve a prediction result;
and analyzing the prediction result through an expert system to obtain a final behavior category.
2. The behavior identification method based on the multi-head cascade attention network and the time convolution network as claimed in claim 1, wherein said "acquiring a video and extracting video feature information in the video" comprises:
the video is represented by its K segments I = [I_1, I_2, …, I_K], and a feature extraction network with parameter θ_1 extracts the feature information of the video:

X = [x_1, x_2, …, x_K] = [r(I_1; θ_1), …, r(I_K; θ_1)]

where I_i ∈ R^(H×W×3×L), H and W are the height and width of the input video segment, respectively, and L is the length of the video segment.
3. The method according to claim 1, wherein the "learning at least 1 attention feature through the video feature information and capturing local attention weight in a self-attention manner" includes:
inputting the video feature information into the next two fully connected layers, wherein the first fully connected layer learns the self-attention weights and the second, combined with data normalization, yields the multiple learned attention features;
the video of K frames is first divided into { I }1,I2,…,IkA feature extraction network r (-) through a parameter theta1) Obtaining the characteristic X ═ X of the video K frame1,x2,…,xk];
Input video clip feature XjSelf-attention weight of (a)ijIs defined as follows:
Figure FDA0003384048180000021
each output of the first FC layer is the weighted sum of the original features with the attention weights of the i-th head attention module, defined as follows:

y_i = Σ_{j=1}^{k} α_ij x_j
where k is the number of video segments; x_j is the feature information of the j-th video frame; w is a parameter of the fully connected layer of the global attention module.
4. The behavior identification method based on the multi-head cascade attention network and the time convolution network as claimed in claim 1, wherein the "weighting feature values in feature space by using linear transformation and normalization" comprises:
the linear transformation is performed by the following procedure (the defining equation is given only as an image in the source): each self-attention output y_i is linearly transformed and normalized to y_i′, where y_i′ is obtained by linear transformation of the fully connected layer output y_i, and N is the number of self-attention modules.
5. The method according to claim 1, wherein the step of integrating local features into a plurality of global representations by using the local attention weight value and learning the attention weight value by using the self-attention feature as an input comprises the steps of:
learning attention weights by connecting the video representation and the cascaded layers of self-attention features, each attention weight being defined as follows, taking the self-attention features as input:
βi=sigmoid(wT[yi′;G])
where w is a parameter of the fully connected layer of the global attention module; [y_i′; G] denotes that y_i′ and G are connected by a concatenation operator; i = 1, 2, 3, ….
6. The method according to claim 1, wherein the step of obtaining a first action tag corresponding to the video feature information according to the attention weight and extracting a time series feature of the first action tag according to a multi-stage time convolution network comprises:
introducing a multi-stage time convolution network to finish the task of dividing the time action, and introducing expansion convolution in the time convolution network;
each stage in the time convolutional network takes an initial prediction from a previous stage and refines it.
7. A behavior recognition system based on a multi-head cascade attention network and a time convolution network is characterized by comprising:
a multi-head cascade network module, configured to acquire local attention weights from the video and to integrate local features into a plurality of global representations according to the local attention weights;
and an action logic combination module, which exchanges data with the multi-head cascade network module and is configured to obtain the behavior classification of the video information.
8. The behavior recognition system based on the multi-head cascade attention network and the time convolution network as claimed in claim 7, wherein the multi-head cascade network module comprises:
a local attention module, connected to the external input, configured to learn a plurality of attention weights for each segment from the segment features generated by the backbone of the multi-head cascade network module, capturing the importance of local features in a self-attention manner to obtain the local attention weights;
a global attention module, which exchanges data with the local attention module and is configured to integrate the local features into a plurality of global representations using the local attention weights, and then to learn a second-level attention over the global information in a relational manner;
and the global attention module exchanges data with the action logic combination module for behavior recognition and classification.
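The data flow among the three claimed modules can be summarized in a short sketch. The module implementations below are placeholder lambdas, purely hypothetical stand-ins used to show how segment features pass from the local attention module through the global attention module to the action logic combination module.

```python
def recognize(video_segments, local_attention, global_attention, classifier):
    # Claimed pipeline: local attention weights each segment's features,
    # global attention integrates them into a global representation,
    # and the action logic combination module outputs the behavior class.
    local_feats = [local_attention(seg) for seg in video_segments]
    global_repr = global_attention(local_feats)
    return classifier(global_repr)

# Toy stand-ins for the three modules (not the patent's networks).
label = recognize(
    [[0.2, 0.8], [0.6, 0.4]],
    local_attention=lambda seg: max(seg),       # importance of local feature
    global_attention=lambda feats: sum(feats),  # integrate into global repr
    classifier=lambda g: "action" if g > 1.0 else "background",
)
```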
9. An electronic device based on behavior recognition of a multi-headed cascade attention network and a time convolution network, comprising:
a storage medium for storing a computer program;
a processing unit, which exchanges data with the storage medium and is configured to execute the computer program when performing behavior recognition, so as to carry out the steps of the behavior recognition method based on the multi-head cascade attention network and the time convolution network according to any one of claims 1 to 6.
10. Use of the behavior recognition method based on the multi-head cascade attention network and the time convolution network according to any one of claims 1 to 6 in smoking-behavior monitoring.
CN202111446154.XA 2021-11-30 2021-11-30 Behavior identification method and device based on multi-head cascade attention network and time convolution network Pending CN114140879A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111446154.XA CN114140879A (en) 2021-11-30 2021-11-30 Behavior identification method and device based on multi-head cascade attention network and time convolution network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111446154.XA CN114140879A (en) 2021-11-30 2021-11-30 Behavior identification method and device based on multi-head cascade attention network and time convolution network

Publications (1)

Publication Number Publication Date
CN114140879A true CN114140879A (en) 2022-03-04

Family

ID=80386065

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111446154.XA Pending CN114140879A (en) 2021-11-30 2021-11-30 Behavior identification method and device based on multi-head cascade attention network and time convolution network

Country Status (1)

Country Link
CN (1) CN114140879A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117649630A (en) * 2024-01-29 2024-03-05 武汉纺织大学 Examination room cheating behavior identification method based on monitoring video stream
CN117649630B (en) * 2024-01-29 2024-04-26 武汉纺织大学 Examination room cheating behavior identification method based on monitoring video stream

Similar Documents

Publication Publication Date Title
CN110728209B (en) Gesture recognition method and device, electronic equipment and storage medium
CN109086811B (en) Multi-label image classification method and device and electronic equipment
CN110796018B (en) Hand motion recognition method based on depth image and color image
CN113537070B (en) Detection method, detection device, electronic equipment and storage medium
CN116579616B (en) Risk identification method based on deep learning
CN112861575A (en) Pedestrian structuring method, device, equipment and storage medium
CN112215831B (en) Method and system for evaluating quality of face image
CN116310850B (en) Remote sensing image target detection method based on improved RetinaNet
CN111199238A (en) Behavior identification method and equipment based on double-current convolutional neural network
CN116229052B (en) Method for detecting state change of substation equipment based on twin network
Geng et al. An improved helmet detection method for YOLOv3 on an unbalanced dataset
CN116524189A (en) High-resolution remote sensing image semantic segmentation method based on coding and decoding indexing edge characterization
CN111310837A (en) Vehicle refitting recognition method, device, system, medium and equipment
CN114724140A (en) Strawberry maturity detection method and device based on YOLO V3
CN114140879A (en) Behavior identification method and device based on multi-head cascade attention network and time convolution network
CN112712005B (en) Training method of recognition model, target recognition method and terminal equipment
CN111428567B (en) Pedestrian tracking system and method based on affine multitask regression
CN117152815A (en) Student activity accompanying data analysis method, device and equipment
CN115719428A (en) Face image clustering method, device, equipment and medium based on classification model
CN116309343A (en) Defect detection method and device based on deep learning and storage medium
CN116189286A (en) Video image violence behavior detection model and detection method
CN112633089B (en) Video pedestrian re-identification method, intelligent terminal and storage medium
CN117407557B (en) Zero sample instance segmentation method, system, readable storage medium and computer
Priyadharsini et al. Performance Investigation of Handwritten Equation Solver using CNN for Betterment
CN114998990B (en) Method and device for identifying safety behaviors of personnel on construction site

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination