CN115063710A - Time sequence analysis method based on double-branch attention mechanism TCN - Google Patents

Time sequence analysis method based on double-branch attention mechanism TCN

Info

Publication number
CN115063710A
Authority
CN
China
Prior art keywords
branch
attention
global
time sequence
causal
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210513520.7A
Other languages
Chinese (zh)
Inventor
张弘力
宋进
徐光洋
刘周
孙赫然
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Jilin Province Jilin Xiangyun Information Technology Co., Ltd.
Original Assignee
Jilin Province Jilin Xiangyun Information Technology Co., Ltd.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Jilin Province Jilin Xiangyun Information Technology Co., Ltd.
Priority to CN202210513520.7A
Publication of CN115063710A
Legal status: Pending


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/40 Scenes; Scene-specific elements in video content
    • G06V20/41 Higher-level, semantic clustering, classification or understanding of video scenes, e.g. detection, labelling or Markovian modelling of sport events or news items
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/40 Extraction of image or video features
    • G06V10/44 Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
    • G06V10/443 Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components by matching or filtering
    • G06V10/449 Biologically inspired filters, e.g. difference of Gaussians [DoG] or Gabor filters
    • G06V10/451 Biologically inspired filters, e.g. difference of Gaussians [DoG] or Gabor filters with interaction between the filter responses, e.g. cortical complex cells
    • G06V10/454 Integrating the filters into a hierarchical structure, e.g. convolutional neural networks [CNN]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/74 Image or video pattern matching; Proximity measures in feature spaces
    • G06V10/761 Proximity, similarity or dissimilarity measures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/764 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77 Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/80 Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level
    • G06V10/806 Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level of extracted features
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/40 Scenes; Scene-specific elements in video content
    • G06V20/46 Extracting features or characteristics from the video content, e.g. video fingerprints, representative shots or key frames
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/70 Labelling scene content, e.g. deriving syntactic or semantic representations
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/20 Movements or behaviour, e.g. gesture recognition

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Software Systems (AREA)
  • Artificial Intelligence (AREA)
  • Medical Informatics (AREA)
  • Computing Systems (AREA)
  • Databases & Information Systems (AREA)
  • Computational Linguistics (AREA)
  • Human Computer Interaction (AREA)
  • Social Psychology (AREA)
  • Psychiatry (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biodiversity & Conservation Biology (AREA)
  • Biomedical Technology (AREA)
  • Molecular Biology (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention belongs to the field of time sequence analysis and discloses a time sequence analysis method based on a dual-branch attention mechanism TCN. Step 1: perform input embedding on the time-series data of an input video. Step 2: based on the input embedding of Step 1, perform feature extraction with a dual-branch attention temporal module group. Step 3: based on the feature data of Step 2, process the features with a downstream task branch for video behavior analysis. The method is used to solve the insufficient long-range dependence modeling capability of the prior art when processing time-series data.

Description

Time sequence analysis method based on double-branch attention mechanism TCN
Technical Field
The invention belongs to the technical field of time sequence analysis, and particularly relates to a time sequence analysis method based on a double-branch attention mechanism TCN.
Background
In time-series data processing, long-term dependence describes how long the state at the current time can remain affected by states at earlier times. In time-series processing tasks such as speech recognition, text translation and video behavior analysis, whether the content of the next time period can be predicted accurately is determined by how well long-term dependency relationships are preserved, so establishing effective long-term dependence is the key point. The traditional way of extracting such dependency information is an unsupervised or semi-supervised method. BERT (Bidirectional Encoder Representations from Transformers) is a neural network used for modeling long-term dependence in sequence data; thanks to its large parameter count and network scale it can retain long-distance dependency relations, whereas models with few parameters, such as Long Short-Term Memory networks (LSTM), are limited by insufficient modeling capability for long-term dependence.
The attention mechanism in the Transformer has a strong capability of retaining long-term dependencies and is therefore well suited to processing sequence data. BERT, a language model built on the Transformer, likewise achieves the best results in natural language processing tasks. However, training the BERT model consumes a large amount of computational resources and training data, and the network converges slowly, which limits its application. Nevertheless, the attention mechanism can be applied in other time-series models to capture long-term dependency information and further improve their prediction accuracy.
RNNs, with their recurrent structure, have the ability to model sequence data, but they struggle to extract long-term dependency information efficiently. The Temporal Convolutional Network (TCN), which applies convolution to modeling sequence problems, can reach or even exceed the accuracy of RNN models, but its ability to capture long-distance dependence is still insufficient, because the convolution kernel computation it adopts limits the size of the receptive field when processing time-series problems.
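The receptive-field limitation can be made concrete: in a stack of causal dilated convolutions, each layer widens the receptive field by (kernel size - 1) × dilation. A minimal sketch, assuming the common TCN convention of a fixed kernel size and a doubling dilation schedule (neither is specified at this point in the text):

```python
# Receptive field of stacked causal dilated convolutions, as in a TCN.
# With kernel size k and per-layer dilations d_1..d_L, each layer widens the
# receptive field by (k - 1) * d_i, so in total: 1 + (k - 1) * sum(d_i).
def tcn_receptive_field(kernel_size: int, dilations: list) -> int:
    return 1 + (kernel_size - 1) * sum(dilations)

# Typical schedule: dilation doubles every layer (1, 2, 4, ..., 128).
dilations = [2 ** i for i in range(8)]
rf = tcn_receptive_field(kernel_size=3, dilations=dilations)
print(rf)  # 1 + 2 * 255 = 511 timesteps
```

With 8 layers and kernel size 3 the network sees only 511 past timesteps; covering longer dependencies requires more layers or larger kernels, which is the limitation the attention branch below is meant to address.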
Therefore, when existing neural network models process time-series data, their modeling capability for long-range dependence still needs to be improved.
Disclosure of Invention
The invention provides a time sequence analysis method based on a dual-branch attention mechanism TCN, which is used to solve the insufficient long-distance dependence modeling capability of the prior art when processing time-series data.
The invention provides an electronic device.
The invention provides a computer-readable storage medium.
The invention is realized by the following technical scheme:
a timing analysis method based on a dual-branch attention mechanism (TCN) is characterized by comprising the following steps of:
step 1: performing input embedding processing on time series data of an input video;
and 2, step: based on the input embedding of the step 1, performing feature extraction by using a double-branch attention time sequence module group;
and step 3: and (3) processing by adopting a downstream task branch of video behavior analysis based on the characteristic data of the step (2).
In the method, the input time-series data in Step 1 may also be audio, text or images;
the time-series data of time length t is embedded into a sparse space of dimension c.
In the method, the dual-branch attention temporal module group in Step 2 performs feature extraction on the time-series data, and each temporal module maps the input sequence from its input dimension into a higher-dimensional space;
the dual-branch attention temporal module group comprises 4 temporal modules, each of which contains a causal dilated residual connection branch and a global attention branch.
In the method, the dimension of the data changes during processing by the dual-branch attention temporal modules as follows: the input time-series data of the network model is denoted {x_1, x_2, x_3, ..., x_n} and written X_{1×n}. First, each node of the input data is embedded into a c-dimensional space, with the embedding result denoted X_{c×n}; the data is then processed by the temporal modules, and the dimension change is expressed as
X_{1×n} → X_{c1×n} → X_{c2×n} → X_{c3×n} → X_{c4×n} = X_out
In the method, each temporal module fuses the feature extraction results of the sparse causal residual branch and the global attention branch, expressed as
X_temporal_block = X_causal_dilated + X_global_residual
where X_temporal_block denotes the output data of the temporal module, X_causal_dilated denotes the output data of the sparse causal branch, and X_global_residual denotes the output data of the global attention branch.
In the method, the causal dilated residual connection branch consists of 2 causal dilated convolution layers and 2 residual connections. The causal dilated convolution component preserves the structural settings of the original TCN; each residual connection consists of a linear layer, a softmax and a normalization layer, and is used to extract the similarity and dependency between the outputs of the causal convolution layers. The computation of this branch is expressed as
X_causal_dilated = f_linear1(X_cd1) + f_linear2(X_cd2) + X_cd2
where X_causal_dilated denotes the output data of the sparse causal residual branch, f_linear1 and f_linear2 denote the residual connection computations, and X_cd1, X_cd2 denote the output data of the sparse causal convolution layers.
In the method, the global attention branch works as follows.
First, features are extracted from the time-series data by a single 1-D convolution layer, and the feature extraction result of the convolution layer is position-encoded; since positional information in the time series is key to predicting subsequent states, the global positional information of the sequence is incorporated during encoding.
Second, the position-encoded features are fed into a multi-head attention layer to extract the global dependencies in the time-series data, and a residual connection containing a 1-D convolution operation is added to improve the training convergence speed of the global attention.
The computation of this branch is expressed as
X_global_residual = X_global_attention + X_conv1d
where X_global_residual denotes the output of the global attention branch, X_global_attention denotes the output of the global attention layer, and X_conv1d denotes the output of the 1-D convolution operation.
In the method, the downstream task branch in Step 3 adopts different branch processing according to the task.
The invention has the following beneficial effects:
By introducing convolution into time-series tasks, the invention resolves the problems that an RNN cannot be parallelized at scale, since it processes only one sequence element at a time, and that it is memory-intensive, since all intermediate results must be stored; this improves the convergence speed of model training.
By introducing an attention mechanism into the TCN, the invention effectively remedies the TCN's insufficient long-term dependence modeling capability caused by its limited receptive field, and positional encoding allows sequence information to be used more effectively.
Drawings
FIG. 1 is a flow chart of the method of the present invention.
FIG. 2 is a diagram of an attention mechanism TCN network model architecture of the present invention.
FIG. 3 is a diagram of the effect of recognizing human behavior in video according to the invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be described clearly and completely with reference to the accompanying drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be obtained by a person skilled in the art without making any creative effort based on the embodiments in the present invention, belong to the protection scope of the present invention.
A time sequence analysis method based on a dual-branch attention mechanism TCN comprises the following steps:
Step 1: perform input embedding on the time-series data of an input video;
Step 2: based on the input embedding of Step 1, perform feature extraction with a dual-branch attention temporal module group;
Step 3: based on the feature data of Step 2, process the features with a downstream task branch for video behavior analysis.
In the method, the input time-series data in Step 1 may also be audio, text or images;
the time-series data of time length t is embedded into a sparse space of dimension c.
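The embedding step can be sketched as a per-node linear projection from X_{1×n} to X_{c×n}; the (c × 1) weight matrix below is a hypothetical stand-in, since the text does not fix the form of the embedding layer:

```python
import numpy as np

# Input embedding sketch: each scalar node of a length-n series X_{1×n} is
# mapped into a c-dimensional space, producing X_{c×n}.
rng = np.random.default_rng(0)
n, c = 16, 8                  # sequence length (time length t) and embedding dim
x = rng.normal(size=(1, n))   # X_{1×n}
W = rng.normal(size=(c, 1))   # hypothetical learned embedding weights
b = np.zeros((c, 1))

x_embedded = W @ x + b        # X_{c×n}
print(x_embedded.shape)       # (8, 16)
```

In the patent's notation this realizes the first transition X_{1×n} → X_{c×n} before the temporal modules are applied.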
In the method, the dual-branch attention temporal module group in Step 2 performs feature extraction on the time-series data, and each temporal module maps the input sequence from its input dimension into a higher-dimensional space;
the dual-branch attention temporal module group comprises 4 temporal modules, each of which contains a causal dilated residual connection branch and a global attention branch.
In the method, the dimension of the data changes during processing by the dual-branch attention temporal modules as follows: the input time-series data of the network model is denoted {x_1, x_2, x_3, ..., x_n} and written X_{1×n}. First, each node of the input data is embedded into a c-dimensional space, with the embedding result denoted X_{c×n}; the data is then processed by the temporal modules, and the dimension change is expressed as
X_{1×n} → X_{c1×n} → X_{c2×n} → X_{c3×n} → X_{c4×n} = X_out
In the method, each temporal module fuses the feature extraction results of the sparse causal residual branch and the global attention branch, expressed as
X_temporal_block = X_causal_dilated + X_global_residual
where X_temporal_block denotes the output data of the temporal module, X_causal_dilated denotes the output data of the sparse causal branch, and X_global_residual denotes the output data of the global attention branch.
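The fusion above is an elementwise sum of two branch outputs of identical shape (c, n); a minimal sketch with placeholder values:

```python
import numpy as np

# Feature fusion in one temporal module:
# X_temporal_block = X_causal_dilated + X_global_residual, both of shape (c, n).
c, n = 8, 16
x_causal_dilated = np.ones((c, n))          # placeholder branch output
x_global_residual = 0.5 * np.ones((c, n))   # placeholder branch output

x_temporal_block = x_causal_dilated + x_global_residual
print(x_temporal_block.shape)               # (8, 16)
```

Because both branches preserve the (c, n) layout, the sum requires no reshaping; the fused result feeds the next temporal module.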
In the method, the causal dilated residual connection branch consists of 2 causal dilated convolution layers and 2 residual connections. The causal dilated convolution component preserves the structural settings of the original TCN; each residual connection consists of a linear layer, a softmax and a normalization layer, and is used to extract the similarity and dependency between the outputs of the causal convolution layers. The computation of this branch is expressed as
X_causal_dilated = f_linear1(X_cd1) + f_linear2(X_cd2) + X_cd2
where X_causal_dilated denotes the output data of the sparse causal residual branch, f_linear1 and f_linear2 denote the residual connection computations, and X_cd1, X_cd2 denote the output data of the sparse causal convolution layers.
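A sketch of this branch under stated assumptions: causal dilated convolution is implemented by left-padding, the residual connections f_linear1 and f_linear2 are reduced to plain linear maps (the patent's version also includes softmax and normalization), and all weights are random placeholders rather than trained parameters:

```python
import numpy as np

def causal_dilated_conv(x, w, dilation):
    """1-D causal dilated convolution of x (shape (c, n)) with per-channel
    kernel w (shape (c, k)): left-pad by (k - 1) * dilation so the output at
    position t never depends on future inputs."""
    c, n = x.shape
    k = w.shape[1]
    pad = (k - 1) * dilation
    xp = np.pad(x, ((0, 0), (pad, 0)))
    out = np.zeros((c, n))
    for j in range(k):
        out += w[:, j:j + 1] * xp[:, j * dilation : j * dilation + n]
    return out

rng = np.random.default_rng(1)
c, n, k = 4, 12, 3
x = rng.normal(size=(c, n))
w1 = rng.normal(size=(c, k))
w2 = rng.normal(size=(c, k))
L1 = rng.normal(size=(c, c))   # stand-in for f_linear1
L2 = rng.normal(size=(c, c))   # stand-in for f_linear2

x_cd1 = causal_dilated_conv(x, w1, dilation=1)
x_cd2 = causal_dilated_conv(x_cd1, w2, dilation=2)
x_causal_dilated = L1 @ x_cd1 + L2 @ x_cd2 + x_cd2   # branch formula
print(x_causal_dilated.shape)  # (4, 12)
```

The left-padding is what makes the convolution causal: perturbing a future timestep of the input cannot change any earlier output.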
In the method, the global attention branch takes global semantic similarity information into account and emphasizes the meaning of input entities at different positions by adding positional information. It works as follows.
First, features are extracted from the time-series data by a single 1-D convolution layer, and the feature extraction result of the convolution layer is position-encoded; since positional information in the time series is key to predicting subsequent states, the global positional information of the sequence is incorporated during encoding.
Second, the position-encoded features are fed into a multi-head attention layer to extract the global dependencies in the time-series data, and a residual connection containing a 1-D convolution operation is added to improve the training convergence speed of the global attention.
The computation of this branch is expressed as
X_global_residual = X_global_attention + X_conv1d
where X_global_residual denotes the output of the global attention branch, X_global_attention denotes the output of the global attention layer, and X_conv1d denotes the output of the 1-D convolution operation.
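A sketch of the branch under stated assumptions: a channel-mixing matrix stands in for the 1-D convolution, sinusoidal positional encoding is assumed (the text does not specify the encoding), and a single attention head replaces the multi-head layer for brevity; all weights are random placeholders:

```python
import numpy as np

def positional_encoding(c, n):
    """Sinusoidal positional encoding of shape (c, n), assumed here."""
    pos = np.arange(n)[None, :]                     # (1, n) time positions
    i = np.arange(c)[:, None]                       # (c, 1) channel indices
    angle = pos / np.power(10000.0, (2 * (i // 2)) / c)
    return np.where(i % 2 == 0, np.sin(angle), np.cos(angle))

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

rng = np.random.default_rng(2)
c, n = 8, 16
x = rng.normal(size=(c, n))

Wc = rng.normal(size=(c, c)) / np.sqrt(c)           # stand-in for the 1-D conv
x_conv1d = Wc @ x
h = x_conv1d + positional_encoding(c, n)            # add positional information

Wq, Wk, Wv = (rng.normal(size=(c, c)) / np.sqrt(c) for _ in range(3))
q, k_, v = Wq @ h, Wk @ h, Wv @ h                   # (c, n) each
attn = softmax((q.T @ k_) / np.sqrt(c), axis=-1)    # (n, n) pairwise similarity
x_global_attention = v @ attn.T                     # (c, n)

x_global_residual = x_global_attention + x_conv1d   # branch formula
print(x_global_residual.shape)                      # (8, 16)
```

The (n, n) attention map lets every timestep attend to every other one, which is how this branch supplies the global dependencies the causal convolutions cannot reach.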
In the method, the downstream task branch of Step 3 adopts different branch processing according to the task. Besides the downstream task branch for video behavior analysis, corresponding downstream task branches can be adopted for other tasks, such as speech recognition, text translation and image classification.
An electronic device comprises a processor, a communication interface, a memory and a communication bus, wherein the processor, the communication interface and the memory communicate with each other through the communication bus;
the memory is used for storing a computer program;
the processor is used for implementing the steps of the above method when executing the program stored in the memory.
A computer-readable storage medium stores a computer program which, when executed by a processor, carries out the above method steps.
Take the recognition of human behavior in video as an example: given a piece of video, for instance one in which a person is making a salad, the goal is to identify and analyze the behaviors occurring during the process and their corresponding start and stop times. The input video can be regarded as time-series data formed by a series of image frames. The input time-series data is first embedded; the dual-branch attention temporal module group then performs feature extraction on the embedding result and models the long-range dependencies between image frames, so as to predict the action class label for each image frame. On this basis, the downstream task branch combines the sequence information to determine the start and stop time of each action, and the returned recognition result is the name of every action in the video together with its start and stop times. The effect is shown in FIG. 3.
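The final step, recovering action names with start and stop times from per-frame labels, can be sketched by grouping consecutive identical labels (the label names here are illustrative, not from the patent):

```python
# Downstream sketch for behavior recognition: given the per-frame action labels
# predicted by the network, recover each action's name and start/stop frame by
# grouping runs of consecutive identical labels.
def frames_to_segments(labels):
    segments = []
    start = 0
    for i in range(1, len(labels) + 1):
        # Close a segment at the end of the list or when the label changes.
        if i == len(labels) or labels[i] != labels[start]:
            segments.append((labels[start], start, i - 1))
            start = i
    return segments

frame_labels = ["cut_tomato"] * 3 + ["mix_salad"] * 2 + ["cut_tomato"] * 2
print(frames_to_segments(frame_labels))
# [('cut_tomato', 0, 2), ('mix_salad', 3, 4), ('cut_tomato', 5, 6)]
```

Frame indices convert to wall-clock start/stop times by dividing by the video frame rate.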

Claims (10)

1. A time sequence analysis method based on a dual-branch attention mechanism TCN, characterized by comprising the following steps:
Step 1: perform input embedding on the time-series data of an input video;
Step 2: based on the input embedding of Step 1, perform feature extraction with a dual-branch attention temporal module group;
Step 3: based on the feature data of Step 2, process the features with a downstream task branch for video behavior analysis.
2. The time sequence analysis method based on the dual-branch attention mechanism TCN according to claim 1, wherein: in Step 1, the input time-series data may also be audio, text or images;
the time-series data of time length t is embedded into a sparse space of dimension c.
3. The time sequence analysis method based on the dual-branch attention mechanism TCN according to claim 1, wherein: the dual-branch attention temporal module group in Step 2 performs feature extraction on the time-series data, and each temporal module maps the input sequence from its input dimension into a higher-dimensional space;
the dual-branch attention temporal module group comprises 4 temporal modules, each of which contains a causal dilated residual connection branch and a global attention branch.
4. The time sequence analysis method based on the dual-branch attention mechanism TCN according to claim 3, wherein: the dimension of the data changes during processing by the dual-branch attention temporal modules as follows: the input time-series data of the network model is denoted {x_1, x_2, x_3, ..., x_n} and written X_{1×n}; first, each node of the input data is embedded into a c-dimensional space, with the embedding result denoted X_{c×n}; the data is then processed by the temporal modules, and the dimension change is expressed as
X_{1×n} → X_{c1×n} → X_{c2×n} → X_{c3×n} → X_{c4×n} = X_out
5. The time sequence analysis method based on the dual-branch attention mechanism TCN according to claim 4, wherein: the temporal module fuses the feature extraction results of the sparse causal residual branch and the global attention branch, expressed as
X_temporal_block = X_causal_dilated + X_global_residual
where X_temporal_block denotes the output data of the temporal module, X_causal_dilated denotes the output data of the sparse causal branch, and X_global_residual denotes the output data of the global attention branch.
6. The time sequence analysis method based on the dual-branch attention mechanism TCN according to claim 3, wherein: the causal dilated residual connection branch consists of 2 causal dilated convolution layers and 2 residual connections; the causal dilated convolution component preserves the structural settings of the original TCN; each residual connection consists of a linear layer, a softmax and a normalization layer, and is used to extract the similarity and dependency between the outputs of the causal convolution layers; the computation of this branch is expressed as
X_causal_dilated = f_linear1(X_cd1) + f_linear2(X_cd2) + X_cd2
where X_causal_dilated denotes the output data of the sparse causal residual branch, f_linear1 and f_linear2 denote the residual connection computations, and X_cd1, X_cd2 denote the output data of the sparse causal convolution layers.
7. The time sequence analysis method based on the dual-branch attention mechanism TCN according to claim 3, wherein the global attention branch works as follows:
first, features are extracted from the time-series data by a single 1-D convolution layer, and the feature extraction result of the convolution layer is position-encoded; since positional information in the time series is key to predicting subsequent states, the global positional information of the sequence is incorporated during encoding;
second, the position-encoded features are fed into a multi-head attention layer to extract the global dependencies in the time-series data, and a residual connection containing a 1-D convolution operation is added to improve the training convergence speed of the global attention;
the computation of this branch is expressed as
X_global_residual = X_global_attention + X_conv1d
where X_global_residual denotes the output of the global attention branch, X_global_attention denotes the output of the global attention layer, and X_conv1d denotes the output of the 1-D convolution operation.
8. The time sequence analysis method based on the dual-branch attention mechanism TCN according to claim 3, wherein: the downstream task branch of Step 3 adopts different branch processing according to different tasks.
9. An electronic device, characterized by comprising a processor, a communication interface, a memory and a communication bus, wherein the processor, the communication interface and the memory communicate with each other through the communication bus;
the memory is used for storing a computer program;
the processor is used for implementing the method steps of any one of claims 1 to 8 when executing the program stored in the memory.
10. A computer-readable storage medium, characterized in that a computer program is stored therein which, when executed by a processor, carries out the method steps of any one of claims 1 to 8.
CN202210513520.7A 2022-05-12 2022-05-12 Time sequence analysis method based on double-branch attention mechanism TCN Pending CN115063710A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210513520.7A CN115063710A (en) 2022-05-12 2022-05-12 Time sequence analysis method based on double-branch attention mechanism TCN

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210513520.7A CN115063710A (en) 2022-05-12 2022-05-12 Time sequence analysis method based on double-branch attention mechanism TCN

Publications (1)

Publication Number Publication Date
CN115063710A true CN115063710A (en) 2022-09-16

Family

ID=83199257

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210513520.7A Pending CN115063710A (en) 2022-05-12 2022-05-12 Time sequence analysis method based on double-branch attention mechanism TCN

Country Status (1)

Country Link
CN (1) CN115063710A (en)


Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116227365A (en) * 2023-05-06 2023-06-06 成都理工大学 Landslide displacement prediction method based on improved VMD-TCN
CN116227365B (en) * 2023-05-06 2023-07-07 成都理工大学 Landslide displacement prediction method based on improved VMD-TCN

Similar Documents

Publication Publication Date Title
CN111091839B (en) Voice awakening method and device, storage medium and intelligent device
CN112418292B (en) Image quality evaluation method, device, computer equipment and storage medium
CN112580328A (en) Event information extraction method and device, storage medium and electronic equipment
CN113515951A (en) Story description generation method based on knowledge enhanced attention network and group-level semantics
CN111738169A (en) Handwriting formula recognition method based on end-to-end network model
CN112883231B (en) Short video popularity prediction method, system, electronic equipment and storage medium
CN107463928A (en) Word sequence error correction algorithm, system and its equipment based on OCR and two-way LSTM
CN115577678B (en) Method, system, medium, equipment and terminal for identifying causal relationship of document-level event
CN111666931B (en) Mixed convolution text image recognition method, device, equipment and storage medium
CN114708436B (en) Training method of semantic segmentation model, semantic segmentation method, semantic segmentation device and semantic segmentation medium
CN115063710A (en) Time sequence analysis method based on double-branch attention mechanism TCN
CN116824694A (en) Action recognition system and method based on time sequence aggregation and gate control transducer
CN112507059B (en) Event extraction method and device in public opinion monitoring in financial field and computer equipment
CN117132923A (en) Video classification method, device, electronic equipment and storage medium
CN117095460A (en) Self-supervision group behavior recognition method and system based on long-short time relation predictive coding
CN117494762A (en) Training method of student model, material processing method, device and electronic equipment
CN114511813B (en) Video semantic description method and device
CN116129881A (en) Voice task processing method and device, electronic equipment and storage medium
CN113344060B (en) Text classification model training method, litigation state classification method and device
CN111476131B (en) Video processing method and device
CN114065210A (en) Vulnerability detection method based on improved time convolution network
CN111325068A (en) Video description method and device based on convolutional neural network
CN117373121B (en) Gesture interaction method and related equipment in intelligent cabin environment
CN113792163B (en) Multimedia recommendation method and device, electronic equipment and storage medium
CN111353282B (en) Model training, text rewriting method, device and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination