CN114998647B - Breast cancer full-size pathological image classification method based on attention multi-instance learning - Google Patents

Breast cancer full-size pathological image classification method based on attention multi-instance learning

Info

Publication number
CN114998647B
CN114998647B (Application CN202210526657.6A)
Authority
CN
China
Prior art keywords
network
stage
full
instances
feature
Prior art date
Legal status
Active
Application number
CN202210526657.6A
Other languages
Chinese (zh)
Other versions
CN114998647A (en)
Inventor
张建新
侯存巧
张冰冰
韩雨童
Current Assignee
Dalian Minzu University
Original Assignee
Dalian Minzu University
Priority date
Filing date
Publication date
Application filed by Dalian Minzu University filed Critical Dalian Minzu University
Priority to CN202210526657.6A priority Critical patent/CN114998647B/en
Publication of CN114998647A publication Critical patent/CN114998647A/en
Application granted granted Critical
Publication of CN114998647B publication Critical patent/CN114998647B/en

Links

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/049 Temporal neural networks, e.g. delay elements, oscillating neurons or pulsed inputs
    • G06N 3/08 Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Image Analysis (AREA)

Abstract

The breast cancer full-size pathological image classification method based on attention multi-instance learning comprises the following steps: step 1: acquiring a data set and labels; step 2: preprocessing the data set; step 3: constructing a two-stage full-size pathological image (WSI) classification network; step 4: saving the optimal weights of the two-stage network; step 5: calculating the accuracy of the network on the test set. The SAMIL of the present invention introduces a lightweight and efficient SA module that fuses spatial attention and channel attention, which capture pixel-level pairwise relationships and channel dependencies, respectively. SAMIL stacks MHA with LSTM to adaptively highlight the most distinctive instance features and better model the correlations between the selected instances, improving classification accuracy.

Description

Breast cancer full-size pathological image classification method based on attention multi-instance learning
Technical Field
The invention relates to the technical field of image classification methods, in particular to a breast cancer full-size pathological image classification method based on attention multi-instance learning.
Background
According to recent global cancer statistics, about 2.3 million new breast cancer cases were diagnosed in women in 2020, and breast cancer has surpassed lung cancer to become the most common cancer worldwide. At the same time, the digitization of full-size pathological images, i.e., whole-slide images (WSI) of hematoxylin-eosin (H&E) stained biopsy specimens, provides an exact reference for breast cancer diagnosis.
In recent years, with the breakthrough success of deep learning in various computer vision tasks, computer-aided WSI classification methods for cancer diagnosis have received growing attention. In particular, some researchers cast WSI classification as a weakly supervised task and introduce multi-instance learning (MIL) to address the massive scale of WSIs and the difficulty of pixel-level labeling in fully supervised learning. MIL solutions mainly focus on two key links: constructing an instance-level selection module, which computes the positive probability of slice-level images from the extracted depth features and takes the K slices with the highest probability as candidate instances; and designing an aggregation operator, which generates bag embeddings for computing the score of each bag. Although multi-instance learning has made great progress in full-size pathological image classification, existing methods have two defects: the feature correlation of each sub-feature is rarely described in the spatial or channel dimensions, which hinders the discovery of cancer cells in microscopic breast cancer lymph node metastases; and they are limited in capturing the dependencies between different instances that help classify a WSI.
Disclosure of Invention
The invention aims to provide a breast cancer full-size pathological image classification method based on attention multi-instance learning, which can acquire more discriminative patch-level representations and improve the accuracy of pathological image classification for breast cancer lymph node metastasis.
A breast cancer full-size pathological image classification method based on attention multi-instance learning comprises the following steps:
Step 1: acquiring a data set and labels: acquiring a data set and labels of breast cancer histopathological images, and randomly dividing them into a training set, a validation set, and a test set in proportion;
Step 2: preprocessing the data set: the divided data set is preprocessed based on an inverse binary thresholding operation; a background/tissue mask is generated for each WSI, the tissue region is cut into slices of size a × a, and the coordinate set of the slices is stored. To further reduce computation, a threshold p is introduced: the coordinates of a slice are saved only when the proportion of tissue region in the slice is larger than p. The processed WSI can then be expressed as $X'_i = \{x_{i,1}, x_{i,2}, \dots, x_{i,m}\}$, where m is the number of slices in each full-size breast cancer pathological image;
Step 3: constructing a two-stage full-size pathological image (WSI) classification network: the first stage performs instance selection, extracting slice features with an SA-ResNet network and selecting the K instances with the highest probability in each WSI by multi-instance learning; the second stage performs full-size-level prediction, using an aggregator built by stacking a multi-head attention (MHA) network with a long short-term memory (LSTM) network to reliably predict the whole WSI;
Step 31: in stage one, the SA-ResNet network performs feature extraction on slices: a slice $x' \in \mathbb{R}^{C\times H\times W}$ is taken as input to the pre-trained SA-ResNet network; after the ResNet residual structure, a feature matrix $X \in \mathbb{R}^{c\times h\times w}$ is obtained. The shuffle attention (SA) module divides X into G groups along the channel dimension, i.e., $X = [X_1, \dots, X_G]$ with $X_k \in \mathbb{R}^{c/G\times h\times w}$, and each $X_k$ is further split into two branches $X_{k1}, X_{k2} \in \mathbb{R}^{c/2G\times h\times w}$. One branch exploits inter-channel correlations and outputs a channel attention map; the other exploits spatial relationships between features and generates a spatial attention map. The results of the two branches are concatenated so that $X'_k$ has the same number of channels as $X_k$; all feature matrices $X'_k$ are then aggregated, and the final output of the SA module is $X_{out} \in \mathbb{R}^{c\times h\times w}$. Global average pooling of $X_{out}$ produces the slice feature vector $x_{gap}$.
Step 32: selecting patches to train the SA-ResNet network: after the feature vector of each slice is obtained, the probability of each slice is computed with a Softmax function, the slice probabilities within each full-size image are sorted, and the T patches with the highest probability in each full-size image are used to train the SA-ResNet network.
Step 33: obtaining the input V for full-size-level prediction: the slices in each WSI are predicted with the optimal stage-one pre-trained weights, the predicted probabilities are sorted, and the K instances with the highest probability in each full-size image are taken as the input $V = [v_1, \dots, v_K] \in \mathbb{R}^{K\times C}$ of full-size-level prediction.
Step 34: aggregating the K instances with the highest probability: MHA and LSTM are employed. For the i-th attention head $H_i$ in MHA, the computation is

$$H_i(V) = \operatorname{softmax}\big(W^{\top}\tanh(Z V^{\top})\big) \odot \tanh(Z V^{\top}),$$

where $V = [v_1, \dots, v_K] \in \mathbb{R}^{K\times C}$ denotes the selected top-K instance features, K the number of instances, $v_1, \dots, v_K$ the individual instance features with $v_j, v_k \in V$, and C the instance feature embedding dimension; the convolution kernels are $W \in \mathbb{R}^{D\times 1}$ and $Z \in \mathbb{R}^{D\times C}$, with D the feature embedding dimension; the hyperbolic tangent tanh is the activation function and $\odot$ denotes element-wise multiplication. For MHA, the outputs of all heads are concatenated and another convolution projects them back to the original dimension:

$$\tilde{V} = [H_1(V); \dots; H_h(V)]^{\top} W_{pro},$$

where $\tilde{V}$ denotes the top-K instances after feature enhancement, $W_{pro} \in \mathbb{R}^{(h\times D)\times C}$ is a convolution kernel, $^{\top}$ denotes matrix transpose, $H_1, \dots, H_h$ are the attention heads, h is the number of heads, and C and D are feature embedding dimensions.
Step 35: further modeling the dependencies between the selected Top-K instances: LSTM is further used to build interactions and fuse the interacted instances into a discriminative image-level representation. LSTM can capture both short-term and long-term dependencies. Given the input feature sequence $(v_1, \dots, v_K)$, the hidden layer of the LSTM is computed recursively from t = 1 to t = K:

$$f_t = \sigma(W_f v_t + U_f h_{t-1} + b_f), \quad i_t = \sigma(W_i v_t + U_i h_{t-1} + b_i), \quad o_t = \sigma(W_o v_t + U_o h_{t-1} + b_o),$$
$$\tilde{c}_t = \tanh(W_c v_t + U_c h_{t-1} + b_c), \quad c_t = f_t \odot c_{t-1} + i_t \odot \tilde{c}_t, \quad h_t = o_t \odot \tanh(c_t),$$

where $f_t, i_t, o_t$ denote the forget, input, and output gates, respectively; $W_{\{f,i,o,c\}}$ and $U_{\{f,i,o,c\}}$ are the weight matrices to be learned, $b_{\{f,i,o,c\}}$ the bias vectors, $h_{t-1}$ the hidden vector, and $c_t$ the memory cell; sigmoid $\sigma$ and hyperbolic tangent tanh are the activation functions. The output of the last LSTM step is used as the final bag-level representation vector for prediction.
Step 4: saving the optimal weights of the two-stage network: the data set is input into the two-stage classification network. The stage-one network is trained on the training set, with network parameters updated at every iteration and the validation set evaluated once every three iterations; the optimal stage-one weights are saved according to the best validation accuracy. The data set is then processed with the stage-one optimal weights, the K instances with the highest probability in each WSI are selected as stage-two input, the stage-two network is initialized with the stage-one optimal weights, validation is performed once after every training iteration, and the optimal stage-two weights are saved according to the best validation accuracy;
Step 5: calculating the accuracy of the network on the test set: the network is initialized with the two-stage optimal weights, the test set is input into the network to obtain a prediction for each WSI, the predictions are compared with the ground-truth labels, the numbers of correctly and incorrectly predicted WSIs are counted, and the accuracy of the network on the test set is calculated.
Compared with the prior art, the invention has the following beneficial effects:
(1) SAMIL introduces a lightweight and efficient SA module that fuses spatial attention and channel attention, which are used to capture pixel-level pairwise relationships and channel dependencies, respectively.
(2) SAMIL stacks MHA with LSTM to adaptively highlight the most distinctive instance features and better model the correlations between the selected instances, improving classification accuracy.
Drawings
Fig. 1 is an overall frame diagram of the SAMIL model.
Detailed Description
The experimental data used in the present invention come from the lymph node metastasis dataset of the 2016 Camelyon Grand Challenge (CAMELYON16). The dataset contains 399 complete full-size images, covering both normal and metastatic cases, for detecting metastases in H&E-stained tissue sections of sentinel lymph nodes of breast cancer patients.
As shown in the schematic diagram of the invention, the two-stage breast cancer full-size pathological image classification method based on attention multi-instance learning comprises the following steps:
Step 1: acquiring the data set and labels: the lymph node metastasis dataset is randomly divided at a ratio of 2:1:1 into a training set, a validation set, and a test set, with 204 training images, 95 validation images, and 100 test images.
Step 2: preprocessing the data set: the divided data set is preprocessed based on an inverse binary thresholding operation; a background/tissue mask is generated for each WSI, the tissue region is divided into slices of size 512 × 512, and the coordinate set of the slices is stored. To further reduce computation, a threshold of 0.4 is introduced: the coordinates of a slice are saved only when the proportion of tissue region in the slice is larger than 0.4. The processed WSI can then be expressed as $X'_i = \{x_{i,1}, x_{i,2}, \dots, x_{i,m}\}$, where m is the number of slices in each full-size breast cancer pathological image;
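For reference, the following is a minimal Python sketch of this masking-and-tiling step, assuming the WSI is read with OpenSlide and the inverse binary threshold is computed with Otsu's method; the function name and the downsampling factor ds are illustrative assumptions, not part of the invention:

```python
# A hypothetical sketch of Step 2: inverse binary thresholding to get a tissue
# mask, then keeping 512x512 tiles whose tissue fraction exceeds p = 0.4.
import cv2
import numpy as np
import openslide

def extract_tile_coords(wsi_path, tile=512, tissue_ratio=0.4, ds=32):
    slide = openslide.OpenSlide(wsi_path)
    w, h = slide.dimensions
    # Low-resolution grayscale thumbnail used to build the background/tissue mask.
    thumb = np.array(slide.get_thumbnail((w // ds, h // ds)).convert("L"))
    # Inverse binary (Otsu) thresholding: bright background -> 0, tissue -> 255.
    _, mask = cv2.threshold(thumb, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)
    step = tile // ds  # side length, in mask pixels, covered by one tile
    coords = []
    for ty in range(0, mask.shape[0] - step + 1, step):
        for tx in range(0, mask.shape[1] - step + 1, step):
            # Save the tile only when its tissue proportion is larger than 0.4.
            if (mask[ty:ty + step, tx:tx + step] > 0).mean() > tissue_ratio:
                coords.append((tx * ds, ty * ds))  # level-0 pixel coordinates
    return coords
```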
Step 3: constructing a two-stage full-size pathological image (WSI) classification network: the first stage performs instance selection, extracting slice features with the SA-ResNet network and selecting the 10 instances with the highest probability in each WSI by multi-instance learning; the second stage performs full-size-level prediction, using an aggregator built by stacking a multi-head attention (MHA) network with a long short-term memory (LSTM) network to reliably predict the whole WSI;
Step 31: in stage one, the SA-ResNet network performs feature extraction on slices: each slice $x_{i,j} \in \mathbb{R}^{3\times 512\times 512}$ is scaled to 224 × 224 × 3 pixels as input to the pre-trained SA-ResNet network. An SA module is inserted into each residual stage (e.g., conv2_x) of ResNet-50. Taking conv2_x as an example, the input of SA is the feature matrix $X \in \mathbb{R}^{256\times 56\times 56}$. The SA module first divides X into 64 groups along the channel dimension, i.e., $X = [X_1, \dots, X_k, \dots, X_{64}]$ with $X_k \in \mathbb{R}^{4\times 56\times 56}$, and each $X_k$ is further split into two branches $X_{k1}, X_{k2} \in \mathbb{R}^{2\times 56\times 56}$. One branch exploits inter-channel relationships and outputs a channel attention map $X'_{k1} \in \mathbb{R}^{2\times 56\times 56}$; the other exploits spatial relationships between features and generates a spatial attention map $X'_{k2} \in \mathbb{R}^{2\times 56\times 56}$. The two branches are concatenated to obtain $X'_k \in \mathbb{R}^{4\times 56\times 56}$, all feature matrices $X'_k$ are aggregated, and the final output of the SA module is $X_{out} \in \mathbb{R}^{256\times 56\times 56}$. The SA modules in the conv3_x, conv4_x, and conv5_x residual blocks work the same way; the feature vector generated by global average pooling of the final $X_{out}$ is $x_{gap} \in \mathbb{R}^{2048\times 1\times 1}$.
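For concreteness, the following is a minimal PyTorch sketch of such an SA module, written under the assumption that it follows the SA-Net shuffle-attention formulation (sigmoid-gated channel and spatial branches per group, with a channel shuffle as the aggregation); class and parameter names are illustrative:

```python
# A minimal sketch of a shuffle-attention (SA) module, assuming the SA-Net design.
import torch
import torch.nn as nn

class ShuffleAttention(nn.Module):
    def __init__(self, channels=256, groups=64):
        super().__init__()
        self.groups = groups
        half = channels // (2 * groups)  # channels per branch, e.g. 2 for 256/64
        self.gap = nn.AdaptiveAvgPool2d(1)
        # Affine weights for the channel-attention branch.
        self.cw = nn.Parameter(torch.zeros(1, half, 1, 1))
        self.cb = nn.Parameter(torch.ones(1, half, 1, 1))
        # Group norm plus affine weights for the spatial-attention branch.
        self.gn = nn.GroupNorm(half, half)
        self.sw = nn.Parameter(torch.zeros(1, half, 1, 1))
        self.sb = nn.Parameter(torch.ones(1, half, 1, 1))
        self.sigmoid = nn.Sigmoid()

    def forward(self, x):                        # x: (B, C, H, W), e.g. (B, 256, 56, 56)
        b, c, h, w = x.shape
        x = x.reshape(b * self.groups, -1, h, w)  # split into G groups along channels
        x1, x2 = x.chunk(2, dim=1)                # two branches per group
        # Channel attention: squeeze spatially, then recalibrate channels.
        xc = x1 * self.sigmoid(self.cw * self.gap(x1) + self.cb)
        # Spatial attention: normalize, then weight each spatial position.
        xs = x2 * self.sigmoid(self.sw * self.gn(x2) + self.sb)
        out = torch.cat([xc, xs], dim=1).reshape(b, c, h, w)
        # Channel shuffle aggregates information across groups.
        out = out.reshape(b, 2, c // 2, h, w).transpose(1, 2).reshape(b, c, h, w)
        return out
```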
Step 32: selecting patches to train the SA-ResNet network: after the feature vector of each slice is obtained, the probability of each slice is computed with a Softmax function, the slice probabilities within each full-size image are sorted, and the 2 patches with the highest probability in each full-size image are used to train the SA-ResNet network.
Step 33: obtaining the input V for full-size-level prediction: the slices in each WSI are predicted with the optimal stage-one pre-trained weights, the predicted probabilities are sorted, and the 10 instances with the highest probability in each full-size image are taken as the input $V = [v_1, \dots, v_{10}] \in \mathbb{R}^{10\times 2048}$ of the two-stage full-size-level prediction, with each instance feature $v_i \in \mathbb{R}^{2048\times 1}$.
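A hypothetical sketch of this selection step, assuming the stage-one network yields a positive probability and a 2048-dimensional feature vector per tile:

```python
# Select the stage-two input V: the 10 tiles with the highest stage-one
# positive probability in each WSI (feature dimension assumed to be 2048).
import torch

def select_top_k(features, probs, k=10):
    """features: (m, 2048) tile features; probs: (m,) positive probabilities."""
    idx = torch.topk(probs, k).indices   # indices of the k most suspicious tiles
    return features[idx]                 # V: (k, 2048)
```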
Step 34: aggregating the 10 instances with the highest probability: MHA and LSTM are employed. For the i-th attention head in the multi-head attention, the computation is

$$H_i(V) = \operatorname{softmax}\big(W^{\top}\tanh(Z V^{\top})\big) \odot \tanh(Z V^{\top}),$$

where $V = [v_1, \dots, v_{10}] \in \mathbb{R}^{10\times 2048}$ denotes the 10 selected instance features, $v_1, \dots, v_{10}$ the individual instance features with $v_j, v_k \in V$, and the convolution kernels are $W \in \mathbb{R}^{512\times 1}$ and $Z \in \mathbb{R}^{512\times 2048}$; the hyperbolic tangent tanh is the activation function. After the element-wise multiplication $\odot$, the key instances are highlighted according to the relationships between them. For MHA, all head outputs are concatenated and another convolution projects them back to the original dimension:

$$\tilde{V} = [H_1(V); \dots; H_h(V)]^{\top} W_{pro},$$

where $\tilde{V}$ denotes the 10 instances after feature enhancement, $W_{pro} \in \mathbb{R}^{(3\times 512)\times 2048}$ is a convolution kernel, $^{\top}$ denotes matrix transpose, $H_1, \dots, H_h$ are the attention heads, and h is the number of heads; in this study h = 3. The multi-head attention recalibrates all instance features from different representation subspaces, enriching the originally selected instances V.
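The sketch below illustrates one attention head and the multi-head projection under the dimensional reading given above (W ∈ R^{512×1}, Z ∈ R^{512×2048}, h = 3), in which each head reweights the tanh-projected instances by its softmax attention scores; this is an interpretation of the stated dimensions, not a verified implementation of the invention:

```python
# A sketch of the tanh-attention head and multi-head aggregation (assumed form).
import torch
import torch.nn as nn
import torch.nn.functional as F

class AttentionHead(nn.Module):
    def __init__(self, in_dim=2048, hid_dim=512):
        super().__init__()
        self.Z = nn.Linear(in_dim, hid_dim, bias=False)   # Z in R^{512x2048}
        self.w = nn.Linear(hid_dim, 1, bias=False)        # W in R^{512x1}

    def forward(self, v):                 # v: (K, 2048), K = 10 selected instances
        u = torch.tanh(self.Z(v))         # (K, 512) tanh-projected instances
        a = F.softmax(self.w(u), dim=0)   # (K, 1) attention scores over instances
        return a * u                      # (K, 512) highlighted instance features

class MultiHeadAggregator(nn.Module):
    def __init__(self, in_dim=2048, hid_dim=512, heads=3):
        super().__init__()
        self.heads = nn.ModuleList(AttentionHead(in_dim, hid_dim) for _ in range(heads))
        self.proj = nn.Linear(heads * hid_dim, in_dim, bias=False)  # W_pro

    def forward(self, v):                 # v: (K, 2048)
        out = torch.cat([h(v) for h in self.heads], dim=-1)  # (K, 3*512)
        return self.proj(out)             # V~: (K, 2048) enhanced instances
```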
Step 35: further modeling the dependencies between the 10 selected instances: LSTM is further used to build interactions and fuse the interacted instances into a discriminative image-level representation. LSTM can capture both short-term and long-term dependencies. Given the input feature sequence $(v_1, \dots, v_{10})$, the hidden layer of the LSTM is computed recursively from t = 1 to t = 10:

$$f_t = \sigma(W_f v_t + U_f h_{t-1} + b_f), \quad i_t = \sigma(W_i v_t + U_i h_{t-1} + b_i), \quad o_t = \sigma(W_o v_t + U_o h_{t-1} + b_o),$$
$$\tilde{c}_t = \tanh(W_c v_t + U_c h_{t-1} + b_c), \quad c_t = f_t \odot c_{t-1} + i_t \odot \tilde{c}_t, \quad h_t = o_t \odot \tanh(c_t),$$

where $f_t, i_t, o_t$ denote the forget, input, and output gates, respectively; $W_{\{f,i,o,c\}}$ and $U_{\{f,i,o,c\}}$ are the weight matrices to be learned, $b_{\{f,i,o,c\}}$ the bias vectors, $h_t$ the hidden vector, and $c_t$ the memory cell; sigmoid $\sigma$ and hyperbolic tangent tanh are the activation functions. In the feature fusion module, the invention stacks two LSTM layers so that the enhanced instances can interact more fully. The output of the last LSTM step is used as the final bag-level representation vector for prediction.
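A minimal sketch of the two-layer LSTM fusion head, assuming the bag-level prediction is made from the hidden state of the last time step; the class count of 2 (normal vs. metastatic) follows the dataset description, while other names are illustrative:

```python
# Two stacked LSTM layers fuse the enhanced instances into a bag-level vector.
import torch
import torch.nn as nn

class LSTMFusionHead(nn.Module):
    def __init__(self, dim=2048, classes=2):
        super().__init__()
        # Two stacked LSTM layers so the enhanced instances interact more fully.
        self.lstm = nn.LSTM(input_size=dim, hidden_size=dim, num_layers=2,
                            batch_first=True)
        self.fc = nn.Linear(dim, classes)

    def forward(self, v):                 # v: (B, K, 2048) enhanced instances
        h, _ = self.lstm(v)               # h: (B, K, 2048) hidden states per step
        return self.fc(h[:, -1])          # last step = bag-level representation
```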
Step 4: saving the optimal weights of the two-stage network: the data set is input into the two-stage classification network. The stage-one network is trained on the training set, with network parameters updated at every iteration and the validation set evaluated once every three iterations; the optimal stage-one weights are saved according to the best validation accuracy. During stage-one training, an Adam optimizer is used to alleviate gradient oscillation, with the learning rate set to 1e-4 and the weight decay to 1e-5. The data set is then processed with the stage-one optimal weights, the 10 instances with the highest probability in each WSI are selected as stage-two input, and the stage-two network is initialized with the stage-one optimal weights. Stage-two training also uses an Adam optimizer, with the learning rate set to 1e-4 and the weight decay to 1e-4; validation is performed once after every training iteration, and the optimal stage-two weights are saved according to the best validation accuracy;
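A hypothetical stage-two training loop with the optimizer settings stated above (Adam, learning rate 1e-4, weight decay 1e-4, validation after every iteration); the model, data loaders, and evaluate helper are assumed to exist and are passed in as arguments:

```python
# Sketch of stage-two training: keep the weights with the best validation accuracy.
import torch
import torch.nn as nn

def train_stage_two(model, train_loader, val_loader, evaluate, epochs=50):
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-4, weight_decay=1e-4)
    criterion = nn.CrossEntropyLoss()
    best_acc = 0.0
    for _ in range(epochs):
        model.train()
        for V, y in train_loader:            # V: (B, 10, 2048) top-10 instance features
            optimizer.zero_grad()
            loss = criterion(model(V), y)
            loss.backward()
            optimizer.step()
        acc = evaluate(model, val_loader)    # validate once per iteration (epoch)
        if acc > best_acc:                   # save weights at the best val accuracy
            best_acc = acc
            torch.save(model.state_dict(), "stage2_best.pt")
    return best_acc
```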
Step 5: calculating the accuracy of the network on the test set: the network is initialized with the two-stage optimal weights, the test set is input into the network to obtain a prediction for each WSI, the predictions are compared with the 100 ground-truth labels of the test set, and the numbers of correctly and incorrectly predicted WSIs are counted to calculate the accuracy of SAMIL on the test set.
Following the above steps, the invention provides a novel SAMIL model for the breast cancer WSI classification task. SAMIL uses a shuffle attention (SA) module to select discriminative instances and implements bag-level prediction with multi-head attention (MHA) stacked with LSTM, thereby exploring the benefits of attention mechanisms for solving the MIL problem. In addition, experimental results show that, compared with state-of-the-art MIL methods, the method performs excellently on the Camelyon16 dataset, reaching an accuracy of up to 96.56%.

Claims (1)

1. The breast cancer full-size pathological image classification method based on attention multi-instance learning is characterized by comprising the following steps: step 1: acquiring a data set and labels: acquiring a data set and labels of breast cancer histopathological images, and randomly dividing them into a training set, a validation set, and a test set in proportion; step 2: preprocessing the data set: preprocessing the divided data set based on an inverse binary thresholding operation, generating a background/tissue mask for each WSI, dividing the tissue region into slices of size a × a, and storing the coordinate set of the slices; to further reduce computation, a threshold p is introduced, and the coordinates of a slice are saved only when the proportion of tissue region in the slice is larger than p; the processed WSI can be expressed as $X'_i = \{x_{i,1}, x_{i,2}, \dots, x_{i,m}\}$, where m is the number of slices in each full-size breast cancer pathological image; step 3: constructing a two-stage full-size pathological image (WSI) classification network: the first stage performs instance selection, extracting slice features with an SA-ResNet network and selecting the K instances with the highest probability in each WSI by multi-instance learning; the second stage performs full-size-level prediction, using an aggregator built by stacking a multi-head attention (MHA) network with a long short-term memory (LSTM) network to reliably predict the whole WSI; step 4: saving the optimal weights of the two-stage network: inputting the data set into the two-stage classification network, training the stage-one network on the training set, updating network parameters at every iteration and evaluating the validation set once every three iterations, saving the optimal stage-one weights according to the best validation accuracy, processing the data set with the stage-one optimal weights, selecting the K instances with the highest probability in each WSI as stage-two input, initializing the stage-two network with the stage-one optimal weights, validating once after every training iteration, and saving the optimal stage-two weights according to the best validation accuracy; step 5: calculating the accuracy of the classification network on the test set: initializing the network with the two-stage optimal weights, inputting the test set into the classification network to obtain a prediction for each WSI, comparing the predictions with the ground-truth labels, counting the numbers of correctly and incorrectly predicted WSIs, and calculating the accuracy of the classification network on the test set; in step 3, step 31: in stage one, the SA-ResNet network performs feature extraction on slices: a slice $x' \in \mathbb{R}^{C\times H\times W}$ is taken as input to the pre-trained SA-ResNet network; after the ResNet residual structure, a feature matrix $X \in \mathbb{R}^{c\times h\times w}$ is obtained; the shuffle attention module divides X into G groups along the channel dimension, i.e., $X = [X_1, \dots, X_G]$ with $X_k \in \mathbb{R}^{c/G\times h\times w}$, and each $X_k$ is further split into two branches $X_{k1}, X_{k2} \in \mathbb{R}^{c/2G\times h\times w}$; one branch exploits inter-channel correlations and outputs a channel attention map, while the other exploits spatial relationships between features and generates a spatial attention map; the results of the two branches are concatenated so that $X'_k$ has the same number of channels as $X_k$, all feature matrices $X'_k$ are then aggregated, and the final output of the SA module is $X_{out} \in \mathbb{R}^{c\times h\times w}$; global average pooling of $X_{out}$ produces the slice feature vector $x_{gap}$; step 32: selecting patches to train the SA-ResNet network: after the feature vector of each slice is obtained, the probability of each slice is computed with a Softmax function, the slice probabilities within each full-size image are sorted, and the T patches with the highest probability in each full-size image are used to train the SA-ResNet network; step 33: obtaining the input V for full-size-level prediction: the slices in each WSI are predicted with the optimal stage-one pre-trained weights, the predicted probabilities are sorted, and the K instances with the highest probability in each full-size image are taken as the input $V = [v_1, \dots, v_K] \in \mathbb{R}^{K\times C}$ of full-size-level prediction; step 34: aggregating the K instances with the highest probability: MHA and LSTM are employed; for the i-th attention head $H_i$ in MHA, the computation is:

$$H_i(V) = \operatorname{softmax}\big(W^{\top}\tanh(Z V^{\top})\big) \odot \tanh(Z V^{\top}),$$
where $V = [v_1, \dots, v_K] \in \mathbb{R}^{K\times C}$ denotes the selected top-K instance features, K the number of instances, $v_1, \dots, v_K$ the individual instance features with $v_j, v_k \in V$, and C the instance feature embedding dimension; the convolution kernels are $W \in \mathbb{R}^{D\times 1}$ and $Z \in \mathbb{R}^{D\times C}$, with D the feature embedding dimension, and the hyperbolic tangent tanh is the activation function; after the element-wise multiplication, all head outputs are concatenated and another convolution projects them back to the original dimension:

$$\tilde{V} = [H_1(V); \dots; H_h(V)]^{\top} W_{pro},$$
where $\tilde{V}$ denotes the top-K instances after feature enhancement, $W_{pro} \in \mathbb{R}^{(h\times D)\times C}$ is a convolution kernel, $^{\top}$ denotes matrix transpose, $H_1, \dots, H_h$ are the attention heads, h is the number of heads, and C and D are feature embedding dimensions; step 35: further modeling the dependencies between the selected Top-K instances: LSTM is further used to build interactions and fuse the interacted instances into a discriminative image-level representation; LSTM can capture both short-term and long-term dependencies; given the input feature sequence $(v_1, \dots, v_K)$, the hidden layer of the LSTM is computed recursively from t = 1 to t = K:

$$f_t = \sigma(W_f v_t + U_f h_{t-1} + b_f), \quad i_t = \sigma(W_i v_t + U_i h_{t-1} + b_i), \quad o_t = \sigma(W_o v_t + U_o h_{t-1} + b_o),$$
$$\tilde{c}_t = \tanh(W_c v_t + U_c h_{t-1} + b_c), \quad c_t = f_t \odot c_{t-1} + i_t \odot \tilde{c}_t, \quad h_t = o_t \odot \tanh(c_t),$$

where $f_t, i_t, o_t$ denote the forget, input, and output gates, $W_{\{f,i,o,c\}}$ and $U_{\{f,i,o,c\}}$ the weight matrices to be learned, $b_{\{f,i,o,c\}}$ the bias vectors, $h_{t-1}$ the hidden vector, and $c_t$ the memory cell; sigmoid and hyperbolic tangent tanh are the activation functions, and the output of the last LSTM step is used as the final bag-level representation vector for prediction.
CN202210526657.6A 2022-05-16 2022-05-16 Breast cancer full-size pathological image classification method based on attention multi-instance learning Active CN114998647B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210526657.6A CN114998647B (en) 2022-05-16 2022-05-16 Breast cancer full-size pathological image classification method based on attention multi-instance learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210526657.6A CN114998647B (en) 2022-05-16 2022-05-16 Breast cancer full-size pathological image classification method based on attention multi-instance learning

Publications (2)

Publication Number Publication Date
CN114998647A CN114998647A (en) 2022-09-02
CN114998647B (en) 2024-05-07

Family

ID=83027208

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210526657.6A Active CN114998647B (en) 2022-05-16 2022-05-16 Breast cancer full-size pathological image classification method based on attention multi-instance learning

Country Status (1)

Country Link
CN (1) CN114998647B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117237781B * 2023-11-16 2024-03-19 Harbin Institute of Technology (Weihai) Attention mechanism-based double-element fusion space-time prediction method


Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110083705B * 2019-05-06 2021-11-02 University of Electronic Science and Technology of China Multi-hop attention depth model, method, storage medium and terminal for target emotion classification

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110415212A * 2019-06-18 2019-11-05 Ping An Technology (Shenzhen) Co., Ltd. Abnormal cell detection method, device and computer readable storage medium
CN114238577A * 2021-12-17 2022-03-25 Shangyu Advanced Research Institute of China Jiliang University Multi-task learning emotion classification method integrating a multi-head attention mechanism

Also Published As

Publication number Publication date
CN114998647A (en) 2022-09-02

Similar Documents

Publication Publication Date Title
CN108804530B (en) Subtitling areas of an image
CN107229757B (en) Video retrieval method based on deep learning and Hash coding
CN109033978B (en) Error correction strategy-based CNN-SVM hybrid model gesture recognition method
CN108764019A (en) A kind of Video Events detection method based on multi-source deep learning
CN104077742B (en) Human face sketch synthetic method and system based on Gabor characteristic
CN111325237B (en) Image recognition method based on attention interaction mechanism
CN111276240A (en) Multi-label multi-mode holographic pulse condition identification method based on graph convolution network
CN112597324A (en) Image hash index construction method, system and equipment based on correlation filtering
CN112163114B (en) Image retrieval method based on feature fusion
Salazar On Statistical Pattern Recognition in Independent Component Analysis Mixture Modelling
CN113868448A (en) Fine-grained scene level sketch-based image retrieval method and system
CN114998647B (en) Breast cancer full-size pathological image classification method based on attention multi-instance learning
CN116091946A (en) Yolov 5-based unmanned aerial vehicle aerial image target detection method
CN116363750A (en) Human body posture prediction method, device, equipment and readable storage medium
Ma et al. Dirichlet process mixture of generalized inverted dirichlet distributions for positive vector data with extended variational inference
CN113240033B (en) Visual relation detection method and device based on scene graph high-order semantic structure
CN113516019B (en) Hyperspectral image unmixing method and device and electronic equipment
Afzal et al. Discriminative feature abstraction by deep L2 hypersphere embedding for 3D mesh CNNs
Wei et al. A multiobjective group sparse hyperspectral unmixing method with high correlation library
CN113822134A (en) Instance tracking method, device, equipment and storage medium based on video
Deffo et al. CNNSFR: A convolutional neural network system for face detection and recognition
Termritthikun et al. Evolutionary neural architecture search based on efficient CNN models population for image classification
CN116188428A (en) Bridging multi-source domain self-adaptive cross-domain histopathological image recognition method
CN113887509B (en) Rapid multi-modal video face recognition method based on image set
CN114821631A (en) Pedestrian feature extraction method based on attention mechanism and multi-scale feature fusion

Legal Events

Code Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant