CN111104855B - Workflow identification method based on time sequence behavior detection

Info

Publication number: CN111104855B
Authority: CN (China)
Prior art keywords: candidate, network, time, time sequence, video
Legal status: Active (granted)
Application number: CN201911097168.8A
Other languages: Chinese (zh)
Other versions: CN111104855A
Inventors: 胡海洋, 王庆文, 李忠金, 陈洁, 俞佳成, 张力, 余嘉伟, 周美玲, 陈振辉
Current Assignee: Hangzhou Dianzi University
Original Assignee: Hangzhou Dianzi University
Priority date / filing date: 2019-11-11
Application filed by Hangzhou Dianzi University
Priority to CN201911097168.8A
Publication of CN111104855A: 2020-05-05
Application granted; publication of CN111104855B: 2023-09-12

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • G06F18/241Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2415Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/40Scenes; Scene-specific elements in video content
    • G06V20/41Higher-level, semantic clustering, classification or understanding of video scenes, e.g. detection, labelling or Markovian modelling of sport events or news items
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/50Context or environment of the image
    • G06V20/52Surveillance or monitoring of activities, e.g. for recognising suspicious objects
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02PCLIMATE CHANGE MITIGATION TECHNOLOGIES IN THE PRODUCTION OR PROCESSING OF GOODS
    • Y02P90/00Enabling technologies with a potential contribution to greenhouse gas [GHG] emissions mitigation
    • Y02P90/30Computing systems specially adapted for manufacturing

Abstract

The invention discloses a workflow identification method based on time sequence behavior detection. The invention provides a sparse sampling method for time sequence video, which reduces useless data and accelerates the framework as a whole. Meanwhile, to improve both recognition speed and recognition accuracy, a three-dimensional residual network is used to extract features, ensuring fast and efficient spatio-temporal feature extraction. To avoid missing candidate segments in the time sequence candidate sub-network, the invention replaces NMS with Soft-NMS, thereby preserving the recall rate of the detection results. Through these strategies, the framework provided by the invention is better suited to workflow identification in a complex factory production environment. The method solves the temporal localization of actions in video, makes effective use of the large volume of intelligent surveillance video generated in the factory environment, detects the categories of activities in the video and the time slices in which they occur through a neural network, models the workflow, and thereby further optimizes the whole production flow.

Description

Workflow identification method based on time sequence behavior detection
Technical Field
The invention belongs to computer vision and relates to the application of deep learning to the recognition of factory production operation behaviors; it is used to recognize the type of a production operation and the time slice in which it occurs. In current industrial production, intelligent surveillance generates vast amounts of valuable video data every day. To make full use of these data, a workflow identification method is designed to automatically extract features from large volumes of video and identify both the type of each industrial production operation and the time slice in which it takes place.
Background
With the development of information technology and manufacturing technology, intelligent manufacturing has become an important trend in the field of industrial production, and workflow identification, as a major technical direction of intelligent manufacturing, is evolving rapidly. A workflow is generally regarded as a sequence of independent activities. Traditional workflow identification mainly relies on process mining, i.e., extracting and analysing records of business execution from the system logs generated by a business process information system, so that business processes or production decisions can be adjusted in time.
Thanks to the development of computer vision technology, current workflow identification mainly films the various production activities on a production line with cameras in the workshop, then processes and analyses the video to achieve rapid detection of the industrial process. A factory workshop, however, exhibits obvious lighting changes and heavy occlusion between moving objects, so the workflow identification scene is special compared with common scenes, and traditional recognition methods that rely on target object detection are difficult to apply. Moreover, because surveillance video is real-time video, workflow identification imposes a real-time requirement on recognition speed.
Meanwhile, as factory production places higher demands on workflow identification, different tasks in a workflow often have different execution times, and there is no clear boundary between the start and the end of a task; workflow identification based purely on behavior recognition cannot localize activities in time. Therefore, the present invention shifts the emphasis of workflow identification from behavior recognition to time sequence behavior detection. Unlike workflow identification based on behavior recognition, a workflow identification method based on time sequence behavior detection also locates each activity in time, i.e., it determines the start time and end time of the activity. The key points of this task are the following two: 1. Many methods adopt a framework that classifies candidate segments; for such methods it is important that the candidate segments have high quality, i.e., the number of candidate segments should be reduced while ensuring that the recognition result remains correct. 2. The category information of each time sequence segment, i.e., the category of the behavior, must be obtained accurately.
However, production operation behavior recognition has its own complexity and specificity. Deep learning approaches have achieved tremendous success in image processing, and many classification architectures based on convolutional neural networks have been designed to handle workflow recognition in unprocessed long videos. The invention designs a workflow method based on time sequence behavior detection to detect the category of actions in unprocessed long factory videos and the time slices in which those actions occur.
Disclosure of Invention
The invention discloses a workflow identification method based on time sequence behavior detection. Compared with general video scenes, workflow identification is complex and special because the manufacturing environment in which it takes place involves frequent background lighting changes, severe occlusion between objects, various noise interferences, and workers performing activities for long durations. Because of the complex factory environment, a worker carrying out one production activity for a long period may generate a large number of useless video frames. Aiming at this phenomenon, the invention provides a sparse sampling method for time sequence video, which reduces useless data and accelerates the framework as a whole. Meanwhile, to improve both recognition speed and recognition accuracy, a three-dimensional residual network is used to extract features, ensuring fast and efficient spatio-temporal feature extraction. To avoid missing candidate segments in the time sequence candidate sub-network, the invention replaces NMS with Soft-NMS, thereby preserving the recall rate of the detection results. Through these strategies, the framework provided by the invention is better suited to workflow identification in a complex factory production environment.
The method comprises the following specific steps:
Step (1): process the video to be processed with a sparse sampling strategy, in which consecutive frames of the video are divided into segments and one frame is randomly sampled within each segment, thereby avoiding video redundancy.
Step (2): extract features with a three-dimensional residual network, which mainly serves to reduce training time and model size.
Step (3): obtain candidate activity segments with an anchor mechanism, forming anchor segments.
Step (4): judge through a classification network whether the candidate anchor segments contain actions, and determine the boundaries of the anchor segments through a boundary regression network, thereby obtaining candidate list I.
Step (5): remove the highly overlapping, low-confidence activity segments from candidate list I with the Soft-NMS method to obtain the final candidate list II.
Step (6): through maximum pooling, convert candidate features of arbitrary length into feature I with the fixed dimension 512 × 1 × 4.
Step (7): feed feature I with the fixed dimension into two fully connected branches simultaneously; two consecutive fully connected layers are followed by a softmax classifier that judges the activity category, and the other two consecutive fully connected layers are followed by a regression layer that refines the time period in which the candidate activity occurs.
Step (8): model the workflow according to the obtained action categories and the generated activity segments, thereby further optimizing the whole production flow.
The invention has the following beneficial effects:
The main innovations of the workflow identification method based on time sequence behavior detection provided by the invention are: 1) the input video is processed by a sparse sampling method; 2) features of the input video are extracted by a three-dimensional residual neural network; 3) highly overlapping, low-confidence candidate segments are processed by a Soft-NMS method.
To avoid the redundant frames produced when a certain production activity is performed for a long time, the input video is processed by the sparse sampling method provided by the invention. The three-dimensional residual neural network serves to reduce training time and shrink the model. To suppress highly overlapping, low-confidence candidate segments, the invention uses the Soft-NMS method to improve the quality of the candidate segments.
The method solves the temporal localization of actions in video, makes effective use of the large volume of intelligent surveillance video generated in the factory environment, detects the categories of activities in the video and the time slices in which they occur through a neural network, models the workflow, and thereby further optimizes the whole production flow.
Drawings
Fig. 1 is a schematic diagram of the structure of the three-dimensional residual neural network.
Fig. 2 is a schematic diagram of the anchor mechanism employed in the present invention.
Fig. 3 shows the overall flow of the present invention from input to output.
Detailed Description
The invention is further described below with reference to the drawings and examples.
Related concept definition and symbol description
f_t: the video frame of the video at time t.
a_k: the size of the k-th anchor at a given temporal position.
L_cls: a multi-class softmax loss function, used to determine the class of activity segments in the workflow.
L_reg: an L1 smoothing loss function, used to optimize the relative offset between a candidate segment and the ground truth.
PLIST: a candidate list containing confidence scores.
RLIST: the return list obtained after screening by Soft-NMS.
ROI: a region of interest.
softmax: a multi-class classifier; the probability of each class is p_i = exp(z_i) / Σ_j exp(z_j), where z_i is the score of class i.
as shown in fig. 1-3, a workflow identification method based on time sequence behavior detection specifically comprises the following implementation steps:
step (1), avoiding redundancy generated during long-time operation by a video sparse sampling mode, wherein the specific sampling mode is as follows:
1-1. Decomposing the original video into a sequence of successive video frames { f 1 ,f 2 ,f 3 ,…,f t }。
And 1-2, taking continuous 4 frames as an interval, and randomly reading one frame in one interval at a time, so that the video frames are prevented from being acquired at the same position at each time while the time sequence redundancy is avoided.
And 1-3, taking the obtained continuous random frames as training samples to be input into a three-dimensional residual neutral network.
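As an illustrative, non-limiting sketch of the sampling in step (1), the following Python snippet groups decoded frames into 4-frame intervals and keeps one randomly chosen frame per interval; the use of OpenCV for decoding and the function name are assumptions, not part of the disclosed method.

```python
# Hypothetical sketch of the step (1) sparse sampling strategy.
import random
import cv2

def sparse_sample(video_path, segment_len=4):
    """Split the frame sequence into consecutive 4-frame intervals and
    keep one randomly chosen frame from each interval."""
    cap = cv2.VideoCapture(video_path)
    buffer, sampled = [], []
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        buffer.append(frame)
        if len(buffer) == segment_len:
            sampled.append(random.choice(buffer))  # random position inside the interval
            buffer = []
    cap.release()
    return sampled  # consecutive random frames fed to the 3D residual network
```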
Step (2): extract spatio-temporal features with a three-dimensional residual network. To increase speed while keeping the model small, the invention adopts a three-dimensional residual neural network (typically ResNet-18) to extract the spatio-temporal features of the input video frames; to ensure computational efficiency and end-to-end training, the time sequence candidate sub-network and the behavior classification sub-network share these spatio-temporal features (see fig. 1).
2-1. Compress the dimensions of the input video frames to 112 × 112 to maximize GPU utilization.
2-2. Avoid vanishing or exploding gradients through residual blocks, allowing the network depth to be increased.
2-3. Input consecutive RGB video frames of size 3 × 112 × 112 into the three-dimensional residual neural network; the network finally outputs spatio-temporal features of size 512 × L/8 × 7.
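For illustration only, the following sketch uses torchvision's r3d_18 as a stand-in for the three-dimensional residual neural network of step (2); stripping its pooling and classification head exposes a shared feature map with 512 channels and temporal length L/8, which is the role the spatio-temporal feature plays here. The choice of library, the 16-frame clip length, and the 7 × 7 spatial layout of the stand-in are assumptions.

```python
# Sketch: a 3D ResNet-18 backbone used as a shared spatio-temporal feature extractor.
import torch
import torch.nn as nn
from torchvision.models.video import r3d_18

backbone = r3d_18(weights=None)
feature_extractor = nn.Sequential(            # everything up to (not including) avgpool/fc
    backbone.stem, backbone.layer1, backbone.layer2, backbone.layer3, backbone.layer4
)

clip = torch.randn(1, 3, 16, 112, 112)        # batch x RGB x L frames x 112 x 112
features = feature_extractor(clip)
print(features.shape)                         # torch.Size([1, 512, 2, 7, 7]): 512 channels, L/8 time steps
```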
Step (3), an anchor point mechanism is adopted to acquire anchor point fragments with different sizes (see figure 2).
3-1. For the spatio-temporal features, the time sequence candidate sub-network can quickly generate anchor segments of different sizes and judge the probability that the video within an anchor segment is a target or background; it is used to initially generate candidate list I. An anchor segment is expressed as:
anchor = {c_i, l_i}
where c_i denotes the center position of the anchor segment and l_i denotes its length in time.
3-2. Anchor segments are distributed over the spatio-temporal features of temporal length L/8, and k anchors are placed at each temporal position of the features, so that in the time sequence candidate sub-network each temporal position carries a sequence of k anchor segments of different lengths. That is, the increasing sequence of anchor segment lengths at a given temporal position is:
{a_1, a_2, a_3, …, a_k}
where a_k is the k-th anchor at that temporal position.
3-3. If f frames are read per second (fps = f), the temporal lengths covered by these anchor segments at a given position are:
{a_1·8/f, a_2·8/f, a_3·8/f, …, a_k·8/f}
From these anchor segments of different lengths, the boundary regression network can then determine the temporal position of an anchor segment.
Step (4): judge through a classification network whether the anchor segments contain actions, and determine their boundaries through a boundary regression network, thereby obtaining candidate list I. A series of operations are performed on the spatio-temporal features generated in step 2-3, and the generated candidate activity segments serve as the input of the next-stage behavior classification network.
4-1. Add a three-dimensional convolution kernel of size 3 × 3 × 3 to expand the spatio-temporal receptive field.
4-2. Add a three-dimensional max-pooling kernel of size 1 × H/16 × W/16 to generate a feature map that contains only temporal features.
4-3. After adding two 1 × 1 convolution kernels, the final feature map size is 512 × L/8 × 1.
4-4. Candidate list I (candidate list PLIST) is obtained through the boundary regression network and the behavior classification network. The specific loss function is as follows:
L = (1/N_cls)·Σ_i L_cls(p_i, p_i*) + λ·(1/N_reg)·Σ_i p_i*·L_reg(t_i, t_i*)
wherein p_i and p_i* denote the predicted and ground-truth classification of anchor segment i, and t_i and t_i* the predicted and ground-truth boundary offsets; N_cls is the classification normalization value, i.e., the batch size; N_reg is the regression normalization value, i.e., the number of anchor segments; i is the index of an anchor segment in the feature map; λ is a weight used to balance the two losses, and since the cls term and the reg term carry almost equal weight, λ is set to 1.
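The joint objective of step 4-4 can be sketched as follows, assuming a standard cross-entropy term for L_cls and a smooth-L1 term for L_reg as defined above; the variable names and the positive-anchor masking are illustrative assumptions.

```python
# Sketch: classification loss normalised by N_cls plus smooth-L1 regression loss
# normalised by N_reg, balanced by lambda = 1.
import torch
import torch.nn.functional as F

def proposal_loss(cls_logits, cls_labels, reg_pred, reg_target, pos_mask, lam=1.0):
    """cls_logits: (N, 2) action/background scores for N anchor segments,
    reg_pred/reg_target: (N, 2) centre/length offsets, pos_mask: (N,) bool."""
    n_cls = cls_logits.shape[0]                       # N_cls: batch of sampled anchor segments
    n_reg = max(int(pos_mask.sum()), 1)               # N_reg: anchor segments contributing to regression
    l_cls = F.cross_entropy(cls_logits, cls_labels, reduction="sum") / n_cls
    l_reg = F.smooth_l1_loss(reg_pred[pos_mask], reg_target[pos_mask],
                             reduction="sum") / n_reg
    return l_cls + lam * l_reg
```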
Step (5): remove the highly overlapping, low-confidence activity segments from candidate list I (candidate list PLIST) with the Soft-NMS method to obtain the final candidate list II (return list RLIST). The method lowers the confidence of overlapping segments linearly instead of removing them outright, which preserves precision while avoiding, as far as possible, the loss of segments with lower scores. The specific procedure is as follows:
5-1. Select the candidate activity segment M with the highest confidence from the candidate list PLIST, delete it from PLIST and put it into the return list RLIST.
5-2. For each candidate b_i remaining in the candidate list PLIST with confidence score s_i, if the overlap between b_i and M is greater than the threshold, reduce its confidence linearly, i.e.:
s_i ← s_i·(1 − iou(M, b_i))
where iou is the intersection over union, i.e., the ratio of the intersection of M and b_i to their union.
5-3. Repeat steps 5-1 and 5-2 until the candidate list PLIST is empty.
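A minimal sketch of the Soft-NMS screening of step (5) on one-dimensional temporal segments; segments are (start, end, score) tuples, and the overlap and score thresholds shown are illustrative assumptions.

```python
# Sketch: Soft-NMS with linear confidence decay for temporal segments.
def temporal_iou(a, b):
    """Intersection over union of two 1-D temporal segments (start, end)."""
    inter = max(0.0, min(a[1], b[1]) - max(a[0], b[0]))
    union = (a[1] - a[0]) + (b[1] - b[0]) - inter
    return inter / union if union > 0 else 0.0

def soft_nms(plist, iou_threshold=0.5, score_threshold=0.001):
    rlist = []                                           # return list RLIST
    plist = list(plist)                                  # candidate list PLIST
    while plist:
        m = max(plist, key=lambda seg: seg[2])           # highest-confidence candidate M
        plist.remove(m)
        rlist.append(m)
        rescored = []
        for start, end, score in plist:
            iou = temporal_iou((start, end), m)
            if iou > iou_threshold:
                score = score * (1.0 - iou)              # s_i <- s_i * (1 - iou(M, b_i))
            if score > score_threshold:
                rescored.append((start, end, score))
        plist = rescored
    return rlist
```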
Step (6): obtain a feature of fixed dimension through maximum pooling. ROI pooling is used to extract the fixed-dimension feature I from the spatio-temporal features, i.e., the input of size 512 × L/8 × 7 is max-pooled over a 1 × 4 grid to obtain the final unified dimension 512 × 1 × 4.
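One possible sketch of the ROI pooling of step (6), assuming the shared feature map carries a 7 × 7 spatial layout that is collapsed before adaptive temporal pooling down to the fixed 512 × 1 × 4 feature I; the exact pooling layout in the disclosed framework may differ.

```python
# Sketch: ROI pooling of a candidate segment to a fixed-size feature I.
import torch
import torch.nn.functional as F

def roi_temporal_pool(features, segment, out_len=4):
    """features: (512, T, 7, 7) spatio-temporal map with T = L/8;
    segment: (start, end) in feature-map units; returns a (512, 1, out_len) tensor."""
    start = int(segment[0])
    end = max(int(segment[1]), start + 1)
    roi = features[:, start:end]                            # (512, t, 7, 7) slice of the candidate
    roi = roi.amax(dim=(2, 3))                               # spatial max pooling -> (512, t)
    roi = F.adaptive_max_pool1d(roi.unsqueeze(0), out_len)   # temporal pooling -> (1, 512, out_len)
    return roi.squeeze(0).unsqueeze(1)                       # (512, 1, out_len) fixed feature I

feat = torch.randn(512, 16, 7, 7)                            # e.g. L = 128 frames -> T = 16
feature_I = roi_temporal_pool(feat, (3, 11))
print(feature_I.shape)                                       # torch.Size([512, 1, 4])
```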
Step (7): feed feature I with the fixed dimension into two fully connected branches simultaneously; two consecutive fully connected layers are followed by a softmax classifier that judges the activity category, and the other two consecutive fully connected layers are followed by a regression layer that refines the time period in which the candidate activity occurs. This is realized as follows:
7-1. Add two fully connected layers.
7-2. Add a boundary regression network for boundary correction, and perform behavior classification through a behavior classification network to obtain the target action category. The specific loss function is as follows:
L = (1/N_cls)·Σ_i L_cls(p_i, p_i*) + λ·(1/N_reg)·Σ_i p_i*·L_reg(t_i, t_i*)
wherein N_cls is the classification normalization value, i.e., the batch size; N_reg is the regression normalization value, i.e., the number of candidate segments; λ is the weight balancing the two losses, still set to 1.
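An illustrative sketch of the two fully connected branches of step (7): one branch ends in a softmax activity classifier, the other in a boundary regression layer. The hidden width and the number of activity classes are assumptions.

```python
# Sketch: parallel classification and boundary-regression heads on feature I.
import torch
import torch.nn as nn

class ActivityHeads(nn.Module):
    def __init__(self, in_dim=512 * 1 * 4, hidden=1024, num_classes=21):
        super().__init__()
        self.cls_head = nn.Sequential(                  # two FC layers + softmax classifier
            nn.Linear(in_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, num_classes),
        )
        self.reg_head = nn.Sequential(                  # two FC layers + boundary regression
            nn.Linear(in_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, 2),                       # refined (centre, length) offsets
        )

    def forward(self, feature_i):
        x = feature_i.flatten(1)                        # (N, 512*1*4)
        cls_scores = torch.softmax(self.cls_head(x), dim=1)
        reg_offsets = self.reg_head(x)
        return cls_scores, reg_offsets

heads = ActivityHeads()
scores, offsets = heads(torch.randn(8, 512, 1, 4))
print(scores.shape, offsets.shape)                      # torch.Size([8, 21]) torch.Size([8, 2])
```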
Step (8): determine the relations among tasks according to the target action categories and the generated time slices, so that the tasks in the workflow are identified and the production efficiency of the plant's overall equipment can be analysed.

Claims (1)

1. The workflow identification method based on time sequence behavior detection is characterized by comprising the following steps:
the method comprises the steps of (1) processing a video to be processed by using a sparse sampling strategy, wherein consecutive frames of the video are divided into segments and random sampling is carried out within each segment so as to avoid video redundancy;
step (2), extracting features by using a three-dimensional residual network, reducing training time and reducing the size of the model;
step (3), obtaining candidate activity segments by using an anchor mechanism to form anchor segments;
step (4), judging whether the candidate anchor segments contain actions through a classification network, and determining the boundaries of the anchor segments through a boundary regression network, thereby obtaining candidate list I;
step (5), removing the highly overlapping, low-confidence activity segments from candidate list I by using a Soft-NMS method to obtain the final candidate list II;
step (6), through a maximum pooling method, changing candidate features of arbitrary length into feature I with the fixed dimension 512 × 1 × 4;
step (7), inputting feature I with the fixed dimension into two fully connected branches simultaneously, wherein two consecutive fully connected layers are connected to a softmax classifier for judging the activity category, and the other two consecutive fully connected layers are connected to a regression layer for refining the time period in which the candidate activity occurs;
step (8), modeling the workflow according to the obtained action categories and the generated activity segments, thereby further optimizing the whole production flow;
the relevant concept definitions and notation are as follows:
f_t: the video frame of the video at time t;
a_k: the size of the k-th anchor at a given temporal position;
L_cls: a multi-class softmax loss function, used to determine the class of activity segments in the workflow;
L_reg: an L1 smoothing loss function, used to optimize the relative offset between a candidate segment and the ground truth;
PLIST: a candidate list containing confidence scores;
RLIST: the return list obtained after screening by Soft-NMS;
ROI: a region of interest;
softmax: a multi-class classifier; the probability of each class is p_i = exp(z_i)/Σ_j exp(z_j), where z_i is the score of class i;
the specific sampling mode of the step (1) is as follows:
1-1. decomposing the original video into a sequence of consecutive video frames {f_1, f_2, f_3, …, f_t};
1-2. taking every 4 consecutive frames as an interval and randomly reading one frame from each interval, so that temporal redundancy is avoided while preventing the video frame from being taken at the same position every time;
1-3. feeding the obtained sequence of random frames as training samples into the three-dimensional residual neural network;
the step (2) extracts spatio-temporal features by using a three-dimensional residual network: the spatio-temporal features of the input video frames are extracted, and in order to ensure computational efficiency and end-to-end training, the time sequence candidate sub-network and the behavior classification sub-network share the spatio-temporal features; this is specifically realized as follows:
2-1. compressing the dimensions of the input video frames to 112 × 112 to maximize GPU utilization;
2-2. avoiding vanishing or exploding gradients through residual blocks, allowing the depth of the network to be increased;
2-3. inputting consecutive RGB video frames of size 3 × 112 × 112 into the three-dimensional residual neural network, which finally outputs spatio-temporal features of size 512 × L/8 × 7;
the step (3) is specifically realized as follows:
3-1. for the spatio-temporal features, the time sequence candidate sub-network can quickly generate anchor segments of different sizes and judge the probability that the video within an anchor segment is a target or background, which is used to initially generate candidate list I, wherein an anchor segment is expressed as:
anchor = {c_i, l_i}
wherein c_i denotes the center position of the anchor segment and l_i denotes its length in time;
3-2. anchor segments are distributed over the spatio-temporal features of temporal length L/8, and k anchors are set at each temporal position of the features, so that in the time sequence candidate sub-network each temporal position carries a sequence of k anchor segments of different lengths; that is, the increasing sequence of anchor segment lengths at a given temporal position is:
{a_1, a_2, a_3, …, a_k}
wherein a_k is the k-th anchor at that temporal position;
3-3. if f frames are read per second (fps = f), the temporal lengths covered by these anchor segments at a given position are:
{a_1·8/f, a_2·8/f, a_3·8/f, …, a_k·8/f}
from these anchor segments of different lengths, the boundary regression network can determine the temporal position of an anchor segment;
the step (4) judges through a classification network whether the anchor segments contain actions and determines their boundaries through a boundary regression network, thereby obtaining candidate list I; a series of operations are carried out on the spatio-temporal features generated in step 2-3, and the generated candidate activity segments serve as the input of the next-stage behavior classification network, which is specifically realized as follows:
4-1. adding a three-dimensional convolution kernel of size 3 × 3 × 3 to expand the spatio-temporal receptive field;
4-2. adding a three-dimensional max-pooling kernel of size 1 × H/16 × W/16 for generating a feature map comprising only temporal features;
4-3. adding two convolution kernels of size 1 × 1, after which the size of the finally obtained feature map is 512 × L/8 × 1;
4-4. obtaining candidate list I, namely the candidate list PLIST, through the boundary regression network and the behavior classification network, wherein the specific loss function is as follows:
L = (1/N_cls)·Σ_i L_cls(p_i, p_i*) + λ·(1/N_reg)·Σ_i p_i*·L_reg(t_i, t_i*)
wherein p_i and p_i* denote the predicted and ground-truth classification of anchor segment i, and t_i and t_i* the predicted and ground-truth boundary offsets; N_cls is the classification normalization value, i.e., the batch size; N_reg is the regression normalization value, i.e., the number of anchor segments; i is the index of an anchor segment in the feature map; λ is a weight used to balance the two losses, and since the cls term and the reg term carry almost equal weight, λ is set to 1;
the step (5) uses a Soft-NMS method to remove the highly overlapping, low-confidence activity segments from candidate list I to obtain the final candidate list II, namely the return list RLIST; the specific process is as follows:
5-1. selecting the candidate activity segment M with the highest confidence from the candidate list PLIST, deleting it from the candidate list PLIST and putting it into the return list RLIST;
5-2. for each candidate b_i in the candidate list PLIST with confidence score s_i, if the overlap between b_i and M is greater than the threshold, the confidence is reduced linearly, namely:
s_i ← s_i·(1 − iou(M, b_i))
wherein iou is the intersection over union, i.e., the ratio of the intersection of M and b_i to their union;
5-3. repeating steps 5-1 and 5-2 until the candidate list PLIST is empty;
the step (6) obtains a feature of fixed dimension through a maximum pooling method; ROI pooling is used to extract the fixed-dimension feature I from the spatio-temporal features, i.e., the input of size 512 × L/8 × 7 is max-pooled over a 1 × 4 grid to obtain the final unified dimension 512 × 1 × 4;
the step (7) inputs feature I with the fixed dimension into two fully connected branches simultaneously, wherein two consecutive fully connected layers are connected to a softmax classifier for judging the activity category, and the other two consecutive fully connected layers are connected to a regression layer for refining the time period in which the candidate activity occurs, which is specifically realized as follows:
7-1. adding two fully connected layers;
7-2. adding a boundary regression network for boundary correction, and performing behavior classification through a behavior classification network to obtain the target action category, wherein the specific loss function is as follows:
L = (1/N_cls)·Σ_i L_cls(p_i, p_i*) + λ·(1/N_reg)·Σ_i p_i*·L_reg(t_i, t_i*)
wherein N_cls is the classification normalization value, i.e., the batch size; N_reg is the regression normalization value, i.e., the number of candidate segments; λ is the weight balancing the two losses, still set to 1.
CN201911097168.8A 2019-11-11 2019-11-11 Workflow identification method based on time sequence behavior detection Active CN111104855B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911097168.8A CN111104855B (en) 2019-11-11 2019-11-11 Workflow identification method based on time sequence behavior detection

Publications (2)

Publication Number Publication Date
CN111104855A CN111104855A (en) 2020-05-05
CN111104855B true CN111104855B (en) 2023-09-12

Family

ID=70420741

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911097168.8A Active CN111104855B (en) 2019-11-11 2019-11-11 Workflow identification method based on time sequence behavior detection

Country Status (1)

Country Link
CN (1) CN111104855B (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111860289B (en) * 2020-07-16 2024-04-02 北京思图场景数据科技服务有限公司 Time sequence action detection method and device and computer equipment
CN112149546A (en) * 2020-09-16 2020-12-29 珠海格力电器股份有限公司 Information processing method and device, electronic equipment and storage medium
CN113139530B (en) * 2021-06-21 2021-09-03 城云科技(中国)有限公司 Method and device for detecting sleep post behavior and electronic equipment thereof

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108573246A (en) * 2018-05-08 2018-09-25 北京工业大学 A kind of sequential action identification method based on deep learning
CN109409257A (en) * 2018-10-11 2019-03-01 北京大学深圳研究生院 A kind of video timing motion detection method based on Weakly supervised study
CN109784269A (en) * 2019-01-11 2019-05-21 中国石油大学(华东) One kind is based on the united human action detection of space-time and localization method
CN110188733A (en) * 2019-06-10 2019-08-30 电子科技大学 Timing behavioral value method and system based on the region 3D convolutional neural networks
CN110188654A (en) * 2019-05-27 2019-08-30 东南大学 A kind of video behavior recognition methods not cutting network based on movement

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP4221011B2 (en) * 2006-05-31 2009-02-12 株式会社日立製作所 Work motion analysis method, work motion analysis apparatus, and work motion analysis program
US8081809B2 (en) * 2006-11-22 2011-12-20 General Electric Company Methods and systems for optimizing high resolution image reconstruction

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant