CN112150442A

CN112150442A - New crown diagnosis system based on deep convolutional neural network and multi-instance learning

Info

Publication number: CN112150442A
Application number: CN202011019778.9A
Authority: CN
Inventors: 杨光; 高远; 牛张明; 夏军; 江荧辉; 叶晴昊; 王旻浩
Original assignee: Digong Hangzhou Science And Technology Industry Co ltd
Current assignee: Digong Hangzhou Science And Technology Industry Co ltd
Priority date: 2020-09-25
Filing date: 2020-09-25
Publication date: 2020-12-29

Abstract

The invention provides a new crown diagnosis system based on a deep convolutional neural network and multi-instance learning.A feature extraction module packs a CT sequence of a patient and performs time-domain convolution, and simultaneously, infected slice instances are screened through weak supervised learning to obtain infected segments; a multi-branch network system, which is configured to input the characteristic sequences extracted from the CT sequence of the patient into a plurality of parallel branch networks, wherein the activation-like sequences output by different parallel branch networks have differences, so as to locate different infection segments, model the integrity of the specific case characteristics of the patient and diverge the attention induced by the weak supervised learning resistance, so as to enhance the accuracy and the robustness of the infection segments; the multi-instance learning module is configured to perform multi-instance bagging for feature fusion of time domain convolution to enhance the patient-specific case feature expression; the gated attention mechanism module is configured to perform adaptive instance feature weighted fusion to avoid gradient vanishing in multi-instance learning.

Description

New crown diagnosis system based on deep convolutional neural network and multi-instance learning

Technical Field

The invention relates to the technical field of deep learning and medical image processing, in particular to a new crown diagnosis system based on a deep convolutional neural network and multi-instance learning.

Background

The world health organization announces that the world has been in a pandemic state since 3/11/2020. By 30 days 8/2020, 2510 ten thousand COVID-19 cases have been recorded. To date, 1650 million have been cured and 844k patients succumb to infection. COVID-19 is a highly contagious disease that can cause fever, cough, myalgia, headache and gastrointestinal symptoms, and even acute respiratory distress or multiple organ failure in severe cases (c.huang et al, 2020). Therefore, it is important to make a rapid and accurate diagnosis of this new fatal disease. Currently, the method of choice by most clinicians is the Reverse Transcription Polymerase Chain Reaction (RTPCR) test (Xie et al, 2020), but its high false negative rate, low sensitivity and time consuming nature make it a poor choice for clinical treatment. In medical imaging tools, CT scanning has been found to be a very effective method of detecting and assessing the severity of various types of pneumonia (Chen et al, 2020).

CT (computed tomography) examination plays a crucial role in diagnosis of new coronary pneumonia, and once in a major epidemic area, the CT examination is used as a main basis for clinical diagnosis. However, the conventional CT examination has defects, and relatively hidden lesions are difficult to observe at the early stage and difficult to distinguish from other viral pneumonia and bacterial pneumonia. Generally, a diagnostic imaging physician observes and distinguishes CT examination images by human eyes, and makes subjective judgment according to the imaging performance and the personal experience of the physician. This is very restrictive and time consuming, has many subjective factors, and can only interpret a portion of the apparent image features. The automatic intelligent diagnosis system established by the deep learning technology can convert and abstract visual image information into deep characteristic information, so that the diagnosis accuracy is improved, a doctor is assisted to read the film, the diagnosis efficiency is greatly improved, and the burden of the doctor is reduced.

Many researchers recently used modern computer vision algorithms on CT scans to automate the diagnosis and evaluation of COVID 19. Although these studies have demonstrated encouraging results in terms of diagnosis of COVID-19 and detection of infected areas using chest CT, most of the existing methods are based on commonly used supervised learning schemes. This requires a significant amount of manual marking data work. However, in such outbreak, the clinician has only limited time to perform tedious manual drawing, so these supervised deep learning methods have great limitations, and often do not have enough samples to train, and thus have limited generalization. Moreover, most studies currently consider CT sections from all COVID-19 patients as positive, all at once, during training. However, for some patients with COVID-19 (usually in mildly infected patients), there may be a large number of healthy uninfected sections. CT scans of even severely infected COVID-19 patients usually contain some healthy slices (s.hu et al, 2020). Most of the methods are trained on image-level labels, and the noise of the rough labels can have serious influence on the performance and generalization of the model. Existing automated interpretation of deep learning based lung CT images typically requires extensive manual fine labeling for training, which is impractical, especially during pandemic periods.

Disclosure of Invention

The invention aims to provide a new coronary diagnosis system based on a deep convolutional neural network and multi-instance learning, and aims to solve the problem that the conventional lung CT image automatic interpretation based on deep learning usually needs a large amount of manual fine labels for training.

In order to solve the above technical problem, the present invention provides a new crown diagnosis system based on deep convolutional neural network and multi-instance learning, comprising:

the lung segmentation module is used for segmenting lung slices, packaging the lung slices to form a patient CT sequence, and inputting the patient CT sequence into the feature extraction module;

the characteristic extraction module is used for carrying out time domain convolution on the CT sequence of the patient and screening infected slice examples through weak supervised learning to obtain infected segments;

a multi-branch network system, which is configured to input the characteristic sequences extracted from the CT sequence of the patient into a plurality of parallel branch networks, wherein the activation-like sequences output by different parallel branch networks have differences, so as to locate different infection segments, model the integrity of the specific case characteristics of the patient and diverge the attention induced by the weak supervised learning resistance, so as to enhance the accuracy and the robustness of the infection segments;

a multi-instance learning module configured to perform multi-instance bagging for feature fusion of time-domain convolution to enhance the patient-specific case feature expression;

and the gated attention mechanism module is configured to perform adaptive instance feature weighted fusion to avoid gradient disappearance in multi-instance learning.

Optionally, in the new crown diagnosis system based on deep convolutional neural network and multi-instance learning, the lung segmentation module includes:

an open dataset configured to train a deep neural network module for lung delineation;

a deep neural network module configured to segment lung regions from the CT images for quantitative estimation of lung infection levels and to facilitate infection detection and classification;

based on the image enhancement method, a fixed-size sliding window W is used_Q，SFinding a covering pixel value, wherein Q is the size of a window, and S is the step length of the sliding process;

the lung segmentation network based on the multi-view U-Net comprises a multi-window voting post-processing program and a sequential information attention module, and utilizes the information of each view of a 3D volume and enhances the integrity of a 3D lung structure;

the lung segmentation model uses artificial labeling fact training, cross validation and testing on an open data set;

and using the trained lung segmentation model to extract the complete lung region of the detection object.

Optionally, in the new coronary diagnosis system based on deep convolutional neural network and multi-instance learning, the lung segmentation module further includes:

forming a plurality of 2D slices on an H axis, a W axis and a D axis respectively for a 3D CT image with the spatial resolution of H x W x D, respectively sending an image sequence of the 2D slices of each axis to a sub-network corresponding to the axis respectively, and training three sub-networks;

and fusing the information of the adjacent slices by adopting an attention mechanism, predicting the segmentation result of the current slice by using a feature map output by the attention mechanism, and then executing sliding window voting size on the prediction probability maps of the H axis, the W axis and the D axis to finally output.

Optionally, in the new crown diagnosis system based on deep convolutional neural network and multi-instance learning, the feature extraction module includes a deep convolutional neural network and a time-domain convolutional layer, where:

given a CT slice sequence of the axial surface of a patient, extracting a characteristic sequence through a depth convolution neural network

Where N represents the number of slices and D represents the feature dimension;

the extracted feature sequences provide a high-level representation of the patient-specific CT sequences and are fed into the time-domain convolution layer for fusion;

by using the time convolution layer and linear rectification (ReLU) active layer embedding characteristics, the expression is as follows:

Embed(X)＝max(θ_emd*X+b_emd，0) (1)

wherein denotes a convolution operation, θ_emdAnd b_emdAre the weights and biases of the time-domain filter,

representing the embedded spatio-temporal feature representation, F being the number of filters;

the time-domain convolution integrates information from between adjacent slices, enabling the network to capture the spatio-temporal structure in the sequence of slices.

Optionally, in the new crown diagnosis system based on deep convolutional neural network and multi-instance learning, in a multi-branch network system,

each parallel branch network inputs the feature sequence embedded by the feature extraction module into the corresponding time domain convolution layer and outputs a series of classification scores:

wherein

And

the weight and the bias of the kth branch classifier respectively;

each S^kThe class distribution is generated at each slice position by normalizing the exponential function (i.e. softmax) along the class dimension:

P^k＝softmax(S^k) (3)

P^kis a class activation sequence.

a cosine similarity based diversity loss function is applied above the class activation sequence of each branch:

the diversity loss function calculates cosine similarity between class activation sequences of every two adjacent branches and averages all branch pairs and classes;

a class c activation sequence representing the ith branch;

the class activation sequence scores from the multiple branches are averaged and made along the class dimension by normalizing the exponential function:

P^avgthe average activation-like sequence, including all activation parts, corresponds to all detected infection intervals;

add regularization terms to the original class score sequence and incorporate them into other loss functions while optimizing:

equipped with diversity loss and specification normalization.

Optionally, in the new crown diagnosis system based on the deep convolutional neural network and multi-instance learning, the gated attention mechanism module includes:

extracting the space-time characteristics of the CT sequence

Performing grouping convolution, and promoting the circulation of positive and negative gradients by using a hyperbolic function;

introducing a sigmoid activation function to perform parallel activation, performing point-to-point corresponding multiplication on the features activated by grouping convolution by using a gating mechanism, and avoiding gradient disappearance in a return process by parallel shunting of the gating mechanism, wherein attention is given to mathematical expression of weight:

wherein the content of the first and second substances,

and

representing the weight of the convolution of the packet,

weights, V, U, W, representing attention maps_attAre all learnable parameters.

Optionally, in the new crown diagnosis system based on the deep convolutional neural network and multi-instance learning, the gated attention mechanism module further includes:

acquired attention weight

For average class activation sequence P^avgAnd (3) calculating a weighted sum, inputting the weighted sum into a normalized exponential function to obtain a classified prediction of the patient:

wherein the content of the first and second substances,

representing the probability distribution of the classification, C being the total number of classes, C ═ 2;

substituting equation (8) into binary cross entropy yields multi-instance learning loss:

L_MIL＝-y_tlog(Prob)-(1-y_t)log(1-Prob) (9)

wherein, y_tE {0, 1} represents the true label of the patient, 0 represents the non-new crown, 1 represents the new crown; to obtainThe overall loss function is:

L＝L_MIL+αL_D+βL_norm

wherein alpha and beta are super parameters for balancing the contribution of the loss function in the training process;

the entire model is optimized by stochastic gradient descent to find the minimum loss L.

Optionally, in the new crown diagnosis system based on the deep convolutional neural network and multi-instance learning, the method further includes:

the infection sequence visualization module and the infection degree quantitative evaluation module are used for providing quick decision;

the inter-patient and patient-specific sequence clustering modules and the sample embedding module are used for providing an interactive consulting function.

Optionally, in the new crown diagnosis system based on deep convolutional neural network and multi-instance learning, the infection sequence visualization module provides a quick decision including:

step 1: let x_i，x_jAny two data points in the feature set; gauss distribution models the adjacency between data points, x_iAnd x_jThe probability of mutual neighbors is:

step 2: let y_iIs x_iLow-dimensional mapping of (2); the adjacency of data points in the low-dimensional space can be written as:

the q-profile is optimized by using a gradient descent.

In the new crown diagnosis system based on the deep convolutional neural network and the multi-instance learning, the CT sequence of a patient is packaged and subjected to time domain convolution, and infected slice instances are screened through weak supervised learning to obtain infected segments; the multi-branch network system is configured to input a feature sequence extracted from a CT sequence of a patient into a plurality of parallel branch networks to position different infection fragments and resist attention divergence induced by weak supervised learning, the multi-instance learning module performs feature fusion of time domain convolution by multi-instance bagging so as to enhance feature expression of a specific case of the patient, the gated attention mechanism module performs adaptive instance feature weighted fusion so as to avoid gradient disappearance in multi-instance learning, and an automatic intelligent diagnosis system established by using a deep learning technology converts visual image information into deep-level feature information so as to improve the accuracy of diagnosis on one hand and assist a doctor in reading the film on the other hand, thereby greatly improving the efficiency of diagnosis and reducing the burden of the doctor.

The invention provides a novel weak supervised learning technology without fine marking (such as infection sequence fragments, positions of infection in slices and the like), multi-instance bagging and time domain convolution feature fusion are carried out to enhance the feature expression of specific cases of patients, a gate control attention machine is provided to carry out self-adaptive instance feature weighted fusion to avoid the problem of gradient disappearance in multi-instance learning, multi-branch counterstudy induced attention divergence is carried out to enhance the accuracy and robustness of infection positioning, infection sequence visualization and infection degree quantitative evaluation help doctors to make quick decisions, and clustering of specific sequences among patients and sample embedding are convenient for doctors to carry out interactive consultation. The invention was pioneered compared to the single image-based discrimination methods that are currently popular in an attempt to train by packaging patient-specific CT slices using a weakly supervised learning approach. In the invention, a gating attention mechanism is introduced to solve the problem of gradient disappearance and carry out a large amount of verification, and the effectiveness of the method is tested and proved on a mainstream deep neural network backbone. In addition, automated visualization tools are built to evaluate the performance of these models and help clinicians make decisions faster and more accurate. The trained model can be flexibly deployed in a large scale, a large number of candidate slices can be used as input, the critical new coronary patients can be automatically analyzed and positioned, and abnormal slices are selected for further examination by clinicians.

Drawings

FIG. 1 is a block diagram of a split-flow multi-instance deep learning architecture according to an embodiment of the present invention;

FIG. 2 is a schematic diagram of a lung segmentation method according to an embodiment of the present invention;

FIG. 3 is a schematic view of a gated attention module according to an embodiment of the present invention;

FIG. 4 is a schematic diagram of a new crown diagnosis process based on deep convolutional neural network and multi-instance learning according to an embodiment of the present invention.

Detailed Description

The new crown diagnosis system based on deep convolutional neural network and multi-instance learning proposed by the present invention is further described in detail with reference to the accompanying drawings and specific embodiments. Advantages and features of the present invention will become apparent from the following description and from the claims. It is to be noted that the drawings are in a very simplified form and are not to precise scale, which is merely for the purpose of facilitating and distinctly claiming the embodiments of the present invention.

Furthermore, features from different embodiments of the invention may be combined with each other, unless otherwise indicated. For example, a feature of the second embodiment may be substituted for a corresponding or functionally equivalent or similar feature of the first embodiment, and the resulting embodiments are likewise within the scope of the disclosure or recitation of the present application.

The core idea of the invention is to provide a new coronary diagnosis system based on deep convolutional neural network and multi-instance learning, so as to solve the problem that the conventional lung CT image automatic interpretation based on deep learning usually needs a large amount of manual fine labels for training.

In order to realize the idea, the invention provides a new crown diagnosis system based on deep convolutional neural network and multi-instance learning, which comprises: the lung segmentation module is used for segmenting lung slices, packaging the lung slices to form a patient CT sequence, and inputting the patient CT sequence into the feature extraction module; the characteristic extraction module is used for carrying out time domain convolution on the CT sequence of the patient and screening infected slice examples through weak supervised learning to obtain infected segments; a multi-branch network system, which is configured to input the characteristic sequences extracted from the CT sequence of the patient into a plurality of parallel branch networks, wherein the activation-like sequences output by different parallel branch networks have differences, so as to locate different infection segments, model the integrity of the specific case characteristics of the patient and diverge the attention induced by the weak supervised learning resistance, so as to enhance the accuracy and the robustness of the infection segments; a multi-instance learning module configured to perform multi-instance bagging for feature fusion of time-domain convolution to enhance the patient-specific case feature expression; and the gated attention mechanism module is configured to perform adaptive instance feature weighted fusion to avoid gradient disappearance in multi-instance learning.

In an embodiment of the present invention, as shown in FIG. 1, a first attempt is made to pack a patient CT sequence and perform a time domain convolution while an infected slice instance is located by weakly supervised learning. In contrast to most fully supervised approaches (i.e. explicit labeling of infected lung segment intervals), the greatest challenge of weakly supervised learning is how to detect the whole and find infected instance segments without full annotation. To address this challenge, the present embodiment proposes a multi-branch network architecture. To model the integrity of patient-specific symptoms, a sequence of features extracted from an input sequence is input into a network of multiple parallel classification branches. Inspired by countertraining, a loss of diversity is designed to ensure differences between class activation sequences output by different branches, training each branch to locate a different infected segment. Finally, the complete infectious fragment is retrieved by aggregating activations from multiple branches. Furthermore, for multi-instance learning, another challenge is faced with overfitting due to gradient vanishing, introducing a gated attention mechanism to suppress unimportant sequence features, and minimizing binary cross entropy combined with diversity loss to learn network parameters. Through experiments, the gated attention mechanism effectively solves the over-fitting problem in multi-instance learning.

In order to achieve quantitative estimation of the lung infection degree and high precision of infection detection and classification, the present embodiment provides a lung segmentation method, which is implemented by first segmenting lung regions from CT images and using a deep neural network. The deep neural network for lung delineation is trained using an open data set (TCIA data set). The data may be accessed from public access rights of a cancer image archive (TCIA). A total of 60 3D CT lung scan images were retrieved and the lung anatomy was manually delineated. These open data sets were publicly available from scans obtained from three different institutions (MD anderson cancer center, slon-kattelin cancer center, and MAASTRO clinic), with 20 cases per institution. All data were scanned using a matrix of 512 x 512, field of view 500mm x 500mm, and reconstructed slice thicknesses varied within 1mm, 2.5mm or 3 mm.

Further, the lung segmentation is processed before and after as follows: unlike conventional pre-processing methods, the input slices are not normalized according to a predefined Hounsfield Unit (HU) window, but rather a more flexible approach is devised based on previously proposed image enhancement methods. Suggesting the use of a fixed size sliding window WQ_，s；

Where Q represents the size of the window and S represents the step size of the sliding process, rather than clipping based on the HU window, to find a window covering most of the pixel values. This may reduce the variance of data acquired from different centers and different scanners. A multi-view U-Net based lung segmentation network is proposed, which consists of a multi-window voting post-processing procedure and a sequential information attention module, as shown in fig. 2, to exploit the information of each view of the 3D volume and enhance the integrity of the 3D lung structure. The lung segmentation model was trained, cross-validated and tested using artificial annotation facts on the TCIA data set. And then, the trained lung segmentation model is used for extracting the complete lung region of the detection object.

A 3D CT image with spatial resolution H x W x D was sliced into multiple 2D slices in three axes and three subnetworks were trained (one for each axis). During the test, the 3D CT image under test is divided into three parts along the H, W and D axes, respectively, and each part (2D image sequence) is sent into the network corresponding to the current axis, respectively. The segmentation results are merged together to obtain a 3D segmentation result. For good performance of the 3D segmentation task, it is considered that image features of neighboring slices are fused in the middle layer. For different slices, the slice sequence information is different, so in order to predict the segmentation probability map of each slice, it is also necessary to adopt an attention mechanism to fuse the information of the adjacent slices, predict the segmentation result of the current slice by the feature map of attention output, and then perform sliding window voting size final output on the prediction probability map of 3 coordinates.

The new crown diagnosis system based on the deep convolutional neural network and the multi-instance learning further comprises a feature extraction module, and the feature extraction module is composed of two parts: deep convolutional neural networks and time-domain convolutional layers. In order to ensure the universality of the model, in the experiment, the mainstream convolutional neural network architecture is used as a framework for verification, and the verified neural network architecture comprises the following steps: ResNet50(lee and cholelet, 2020), ResNet18(He et al, 2015), inclusion V1(w.liu et al, 2014), inclusion V2 (szegydy et al, 2015), inclusion V3, szueze and Excite ResNet50(j.hu et al, 2017). Given a CT slice sequence of a patient axial surface, a characteristic sequence can be extracted through a deep convolution neural network

Where N represents the number of slices and D represents the feature dimension. The extracted feature sequences provide a high-level representation of the patient-specific CT sequence and are fed into the temporal convolution layer for fusion. Features are embedded by using a time convolution layer and a ReLU active layer, and the expression is as follows:

Embed(X)＝max(θ_emd*X+b_emd，0) (1)

representing the embedded spatio-temporal feature representation, F is the number of filters. The time-domain convolution integrates information from between adjacent slices, enabling the network to capture the spatio-temporal structure in the sequence of slices.

The multi-branch network of this embodiment includes a split confrontation learning module in which parallel K classification branches are designed to form complementary confrontations to capture the complete infected fragment. Each branch inputs the feature sequence embedded by the feature extraction module into a corresponding time domain convolution layer, and outputs a series of classification scores:

wherein

And

respectively the weight and bias of the kth branch classifier. Then, each S^kThe class distribution is generated at each slice position by normalizing the exponential function along the class dimension:

P^k＝softmax(S^k) (3)

P^kis a class activation sequence. To ensure the integrity of infection identification, it is desirable that the class activation sequences from multiple branches differ from each other. However, it was found from experiments that without constraints, the multi-branch classifier was easily over-fitted, so that each branch class activation sequence was concentrated in a single infected area and thus not all infected fragments could be captured. In order to avoid the degradation situation that the branches give the same result, the invention provides a diversity loss function based on cosine similarity, and is applied on the class activation sequence of each branch:

the diversity loss function will calculate the cosine similarity between the class activation sequences of every two adjacent branchesAnd averaged over all branch pairs and categories.

The class c activation sequence for the ith branch is indicated. By minimizing this loss of diversity, each branch can be encouraged to produce activation over different infection intervals. Then, the class activation sequence scores from the multiple branches are averaged and made along the class dimension by normalizing the exponential function:

P^avgthe average activation-like sequence, including all activation portions therein, corresponds to all detected infection intervals. It is noted from the experiments that class activation scores from certain branches, i.e. S^kAlmost all tend to zero, while those from other branches tend to explode, which severely undermines the training process so that the model fits very soon after training begins. More importantly, if one branch is dominant, the average class activation sequence will only respond to a single infection interval and not capture all the infected fragments. It is expected that these parallel branches may compete against each other to find different discriminative fragment features. Eventually the branches will converge to a balanced steady state and have comparable recognition capabilities. Similar ideas can be seen in training strategies to generate countermeasure networks (Ian et al, 2014). To apply this antagonistic learning between branches, one regularization term is added to the original class score sequence and incorporated into other loss functions for optimization:

with diversity loss and normative normalization, the multi-branch design can discover different new crown infection features without full supervision, capturing all infected slices.

For most new crown hazardsIn order to enable a model to notice infection examples of different fragments, the invention designs a group of gated attention mechanism modules, on one hand, important parts after space-time fusion of slice sequence characteristics can be found, and meanwhile, redundant characteristics can be ignored, and on the other hand, the gated attention mechanism is found to be helpful for enhancing the diversity of the expression of the characteristics of the multi-branch module of the shunting confrontation learning module. As shown in fig. 3, the spatial-temporal features of the CT sequence extracted by the feature extraction module are

Performing a packet convolution using two activation functions: hyperbolic function, i.e. tanh, and sigmoid function, i.e. σ. The hyperbolic function is used to facilitate the flow of positive and negative gradients. But tanh (x), for x ∈ [ -1, 1]Is substantially linear. This may inhibit the expression of relationships between instances learned by the model. In order to deal with the nonlinear limitation, a sigmoid function is introduced to carry out parallel activation, then a gating mechanism is used for carrying out point-to-point corresponding multiplication on the characteristics activated by grouping convolution, and the problem of gradient disappearance in the return process is avoided through parallel shunting of the gating mechanism, so the mathematical expression of attention weight can be written as follows:

wherein the content of the first and second substances,

and

representing the weight of the convolution of the packet,

weights, V, U, W, representing attention maps_attAre all learnable parameters.

This embodiment alsoIncluding a joint learning module that derives attention weights from a gated attention module

Average class activation sequence P in the learning module against split^avgThe weighted sum is then input to a normalized exponential function to derive a categorical prediction of the patient:

wherein the content of the first and second substances,

representing the probability distribution of the class, C is the total number of classes (in this study C ═ 2, i.e. new or non-new crowns.) finally, substituting (8) into the binary cross entropy yields the multi-instance learning loss:

L_MIL＝-y_tlog(Prob)-(1-y_t)log(1-Prob) (9)

wherein, y_tE {0, 1} represents the patient's true label, 0 represents the non-new crown, and 1 represents the new crown. Finally, the overall loss function is obtained as:

L＝L_MIL+αL_D+βL_norm

wherein, alpha and beta are super parameters used for balancing the contribution of the loss function in the training process. Finally, the whole model is optimized by stochastic gradient descent with the aim of finding the minimum loss L.

This embodiment also has feature space visualization functionality, and t-distributed random neighborhood embedding (t-SNE), developed by Maaten and Geoffrey Hinton (2008), is a powerful nonlinear dimension reduction technique that enables visualization of high-dimensional feature vectors by projecting it into 2D or 3D space. t-SNE can accurately reflect domain relationships of high-dimensional data at many different scales due to its ability to accurately capture the local structure of high-dimensional data while preserving the ability of global structures (e.g., clusters of various sizes). the core of t-SNE is divided into two phases: the first step is as follows: first, t-SNE creates a probability distribution in the high-dimensional space, the probability distribution representing the neighborhood structure across data points; the second step is that: secondly, the t-SNE establishes probability distribution in a lower dimensional space, the two probability distributions are expected to be similar as much as possible, the relative entropy of the two probability distributions is reduced through continuous optimization, and finally the projection of the high-dimensional data point in the lower dimensional space is obtained.

Step 1: let x_i，x_jAny two data points in the feature set. Modeling the adjacency relationships between data points using a Gaussian distribution, x_iAnd x_jThe probability of mutual neighbors is:

it is compared with the point x_iThe central gaussian probability density is proportional. Sigma_iIs equal to x_iA gaussian variance at the center. The likelihood that points are neighbors of each other can be described by the distance between them, the probability of selecting neighbors that are farther away from a reference point decreasing rapidly as the distance from the reference point increases. Sigma_iVarying with the density of dots, it is desirable to have σ in the dense region_iLower, but in sparse regions_iHigher. In this way, any given point can be prevented from being disproportionately affected (by limiting the number of neighbors for all points to be approximately equal). The number of neighbors of all points is driven by a hyper-parameter called confusability (usually chosen to be 5 to 50). The greater the confusion, the more local cluster structures of the high dimensional dataset remain.

Step 2: let y_iIs x_iLow-dimensional mapping of (2). It is desirable that the low-dimensional map represent a distribution similar to that in the high-dimensional space. In practice, however, it has been found that if a gaussian distribution is also used in a low dimensional space, the tail of the gaussian distribution is short and therefore squeezes nearby points together, leading to crowding problems (points tend to be crowded in the low dimensional space due to the cursing of dimensions). To make the distribution of points sparse in the low-dimensional mapping space, t-distribution is chosen instead of Gaussian because the nature of t-distribution is w-Cauchy distribution, compared to GaussianThe distribution, cauchy, has a longer tail. Then, the adjacency of the data points in the low-dimensional space can be written as:

finally, the q-distribution is optimized by minimizing the Kullback-Leibler (KL) dispersion between p and q using gradient descent. The KL dispersion is a measure of the difference of one probability distribution from another expected probability distribution.

In summary, the above embodiments have described in detail different configurations of the new crown diagnosis system based on the deep convolutional neural network and multi-instance learning, and it is understood that the present invention includes, but is not limited to, the configurations listed in the above embodiments, and any content that is transformed based on the configurations provided by the above embodiments falls within the scope of the present invention. One skilled in the art can take the contents of the above embodiments to take a counter-measure.

The embodiments in the present description are described in a progressive manner, each embodiment focuses on differences from other embodiments, and the same and similar parts among the embodiments are referred to each other. For the system disclosed by the embodiment, the description is relatively simple because the system corresponds to the method disclosed by the embodiment, and the relevant points can be referred to the method part for description.

The above description is only for the purpose of describing the preferred embodiments of the present invention, and is not intended to limit the scope of the present invention, and any variations and modifications made by those skilled in the art based on the above disclosure are within the scope of the appended claims.

Claims

1. A new crown diagnostic system based on deep convolutional neural network and multi-instance learning, comprising:

2. The new-crown diagnostic system based on deep convolutional neural network and multi-instance learning of claim 1, wherein the lung segmentation module comprises:

3. The new-crown diagnostic system based on deep convolutional neural network and multi-instance learning of claim 2, wherein the lung segmentation module further comprises:

4. The deep convolutional neural network and multi-instance learning based new crown diagnostic system of claim 3, wherein the feature extraction module comprises a deep convolutional neural network and a time domain convolutional layer, wherein:

Where N represents the number of slices and D represents the feature dimension;

by utilizing the embedding characteristics of the time convolution layer and the linear rectification active layer, the expression is as follows:

Embed(X)＝max(θ_emd*X+b_emd，0) (1)

5. The new crown diagnostic system based on deep convolutional neural network and multi-instance learning of claim 4, wherein in a multi-branch network architecture,

each parallel branch network inputs the feature sequence embedded by the feature extraction module into the corresponding time domain convolution layer and outputs a classification score:

wherein

And

the weight and the bias of the kth branch classifier respectively;

each S^kThe class distribution is generated at each slice position by normalizing the exponential function along the class dimension:

P^k＝softmax(S^k) (3)

P^kis a class activation sequence.

6. The new crown diagnostic system based on deep convolutional neural network and multi-instance learning of claim 5, wherein in a multi-branch network architecture,

a class c activation sequence representing the ith branch;

equipped with diversity loss and specification normalization.

7. The new crown diagnostic system based on deep convolutional neural network and multi-instance learning of claim 6, wherein the gated attention mechanism module comprises:

extracting the space-time characteristics of the CT sequence

wherein the content of the first and second substances,

and

representing the weight of the convolution of the packet,

weights, V, U, W, representing attention maps_attAre all learnable parameters.

8. The new-crown diagnostic system based on deep convolutional neural network and multi-instance learning of claim 7, wherein the gated attention mechanism module further comprises:

acquired attention weight

wherein the content of the first and second substances,

L_MIL＝-y_t log(Prob)-(1-y_t)log(1-Prob) (9)

wherein, y_tE {0, 1} represents the true label of the patient, 0 represents the non-new crown, 1 represents the new crown; the overall loss function is found to be:

L＝L_MIL+αL_D+βL_norm

9. The new-crown diagnostic system based on deep convolutional neural network and multi-instance learning of claim 8, further comprising:

10. The deep convolutional neural network and multi-instance learning based new crown diagnostic system of claim 9, wherein the infection sequence visualization module provides fast decisions comprising:

the q-profile is optimized by using a gradient descent.