CN117314888A

CN117314888A - Crohn disease detection method based on multi-example learning and pathological image

Info

Publication number: CN117314888A
Application number: CN202311452201.0A
Authority: CN
Inventors: 冯仕庭; 李雪华; 黄炳升; 毛仁; 叶子茵; 李飞; 王海鹏; 马锦婷
Original assignee: First Affiliated Hospital of Sun Yat Sen University
Current assignee: First Affiliated Hospital of Sun Yat Sen University
Priority date: 2023-11-02
Filing date: 2023-11-02
Publication date: 2023-12-29

Abstract

The application discloses a Crohn's disease detection method based on multi-example learning and pathological images. The method comprises: acquiring a plurality of pathological images and acquiring the slice feature corresponding to each pathological image, wherein the pathological images correspond to the same target object and each pathological image comprises a preset number of image blocks; determining a slice prediction result corresponding to each pathological image based on each slice feature; and determining an object feature of the target object based on the slice features, and determining an object prediction result of the target object based on the object feature and the slice features. After the slice features are extracted, the slice prediction results and the object feature of the target object are determined from them, and the object prediction result is then determined from both the slice-level features and the patient-level feature. Because the slice level and the patient level act jointly in the detection process, the detection accuracy for Crohn's disease is improved and the misdiagnosis rate is reduced.

Description

Crohn disease detection method based on multi-example learning and pathological image

Technical Field

The application relates to the technical field of biology, in particular to a Crohn disease detection method based on multi-example learning and pathological images.

Background

Crohn's disease (CD) and intestinal tuberculosis (ITB) are two intestinal diseases whose treatment and prognosis differ widely, even though they overlap in clinical symptoms, endoscopic features, histological features, microbiological features and serological indicators. Because these features are not unique to either CD or ITB, they have limited diagnostic value and cannot reliably distinguish the two. At present, clinical differentiation generally relies on a combination of examinations and the physician's experience, so the result depends heavily on that experience and its accuracy cannot be guaranteed.

There is thus a need for improvement and development in the prior art.

Disclosure of Invention

The technical problem to be solved by the application is to provide a Crohn disease detection method based on multi-example learning and pathological images aiming at the defects of the prior art.

To solve the above technical problem, a first aspect of embodiments of the present application provides a method for detecting crohn's disease based on multi-example learning and pathology images, the method including:

acquiring a plurality of pathological images, and acquiring slice characteristics corresponding to each pathological image, wherein the pathological images correspond to the same target object, and each pathological image comprises a preset number of image blocks;

Determining a slice prediction result corresponding to each pathological image based on each slice characteristic;

object features of a target object are determined based on the slice features, and an object prediction result of the target object is determined based on the object features and the slice features.

According to the above technical means, after the slice feature corresponding to each pathological image is extracted, the slice prediction result of the pathological image and the object feature of the target object are determined based on the slice features, and the object prediction result is then determined based on the slice-level slice features and the patient-level object feature. The slice level and the patient level thus act jointly in the process of detecting Crohn's disease, which provides clinicians with a more accurate and robust reference, reduces the misdiagnosis rate, and helps improve the treatment outcome for patients.

In one implementation, the method for detecting crohn's disease based on multi-example learning and pathology images, wherein the slice features include image block features of each image block; the determining the slice prediction result corresponding to each pathological image based on each slice characteristic specifically comprises the following steps:

for each slice feature, selecting a plurality of typical features from the slice feature;

selecting a plurality of reference features from a preset feature library based on the plurality of typical features;

interacting the plurality of typical features with the plurality of reference features to obtain a plurality of updated typical features and a plurality of updated reference features;

and determining a slice prediction result of the pathological image corresponding to the slice feature based on the slice feature, the plurality of updated typical features and the plurality of updated reference features.

According to the above technical means, selecting a plurality of typical features from each slice feature, and selecting a plurality of reference features from a preset feature library based on those typical features, allows the specific characteristics of the pathological image slice to be reflected accurately. Further, interacting the typical features with the reference features yields updated typical features and updated reference features, which enhances the complementarity and discriminability of the features. The slice prediction result is then determined from the slice feature, the updated typical features and the updated reference features together, which improves the detection accuracy for Crohn's disease.
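The text does not state how reference features are matched to typical features. As a minimal sketch, assuming cosine similarity as the matching criterion and a hypothetical helper named `select_references`, the selection from the feature library might look like this:

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def select_references(typical, library, k=2):
    """Return the k library entries most similar to any typical feature.

    Assumption: similarity is cosine similarity; the source only says the
    selection is "based on" the typical features.
    """
    scored = []
    for idx, ref in enumerate(library):
        best = max(cosine(t, ref) for t in typical)
        scored.append((best, idx))
    # Highest similarity first; ties broken by library order.
    scored.sort(key=lambda s: (-s[0], s[1]))
    return [library[idx] for _, idx in scored[:k]]
```

The value of `k` and the tie-breaking rule are illustrative choices, not taken from the application.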

In one implementation manner, the method for detecting the crohn's disease based on the multi-example learning and pathology image, wherein the selecting a plurality of typical features from the slice features specifically includes:

Modeling each image block feature in the slice features in a Gaussian function linear combination mode to obtain a feature distribution model;

the feature distribution model is iterated to select a number of typical features among the slice features.

According to the technical means, each image block feature in the slice feature is modeled by adopting a modeling mode of Gaussian function linear combination, so that the difference of pathological images is more accurately expressed. Meanwhile, a plurality of typical features are selected through iteration of the feature distribution model, so that the representativeness of the features is ensured, and the identification accuracy of the detection method is improved.

In one implementation manner, the method for detecting the crohn's disease based on the multi-example learning and the pathological image, wherein determining the slice prediction result of the pathological image corresponding to the slice feature based on the slice feature, the plurality of updated typical features and the plurality of updated reference features specifically includes:

inputting the slice feature, the plurality of updated typical features, and the plurality of updated reference features into a trained Transformer network;

and determining, through the Transformer network, a slice prediction result of the pathological image corresponding to the slice feature.

According to the above technical means, the self-attention mechanism of the Transformer network fully considers the correlation between each slice feature and the other features, so that latent correlations in the pathological image can be captured and the accuracy of differential diagnosis is improved.

In one implementation manner, the method for detecting the crohn's disease based on multi-example learning and pathological images, wherein after determining a slice prediction result of the pathological image corresponding to the slice feature based on the slice feature, a plurality of updated typical features and a plurality of updated reference features, the method further includes:

and updating the preset feature library with the plurality of typical features based on a confidence level.

According to the technical means, the characteristic library is ensured to be always kept up to date by updating the preset characteristic library by adopting typical characteristics based on the confidence level, and the latest or most representative pathological image characteristics are reflected;

meanwhile, as the feature library is continuously updated, the new or changed pathological image features are more accurately identified and matched, so that the overall diagnosis accuracy is improved;

and based on the dynamic update of the feature library, the risk of misdiagnosis or missed diagnosis caused by using the wrong features is reduced.
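One way to picture the confidence-based update is a bounded library that only accepts typical features from confidently predicted slices. The sketch below is an assumption throughout: the threshold of 0.9, the first-in-first-out eviction, and the name `update_library` do not come from the application.

```python
from collections import deque


def update_library(library, typical, confidence, threshold=0.9):
    """Add typical features to the library when the prediction is confident.

    library: a deque created with maxlen, so the oldest entries are
             dropped automatically once the library is full (assumed policy).
    typical: list of feature vectors selected from one slice.
    confidence: confidence of that slice's prediction, in [0, 1].
    """
    if confidence >= threshold:
        library.extend(typical)
    return library


# Usage: a library that holds at most three features.
library = deque(maxlen=3)
update_library(library, [[1.0], [2.0]], confidence=0.95)  # accepted
update_library(library, [[3.0]], confidence=0.50)         # rejected
update_library(library, [[4.0], [5.0]], confidence=0.99)  # accepted, evicts [1.0]
```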

In one implementation manner, the method for detecting the crohn's disease based on the multi-example learning and pathology image, wherein determining the object feature of the target object based on each slice feature specifically includes:

determining the attention scores corresponding to the slice features, and fusing the slice features based on the determined attention scores to obtain initial object features;

and interacting the initial object feature with each slice feature to obtain an object feature.

According to the technical means, by determining the attention scores corresponding to the slice features, proper weights are given to each slice feature, so that more important or relevant slice features in the object features are ensured to be weighted more, and the accuracy of the fusion effect is improved. In addition, by interacting the initial object features with the slice features, the interrelationship and influence between different slice features and the initial object features are considered, so that the obtained object features are more comprehensive and accurate.
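The attention-based fusion described above can be sketched as attention pooling: each slice feature receives a score, the scores are normalized with a softmax, and the slice features are summed with those weights. The scoring vector `score_vector` stands in for whatever learned scoring layer the network uses and is an assumption.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_slices(slice_feats, score_vector):
    """Fuse slice features into one initial object feature.

    The attention score of a slice is the dot product of its feature with
    score_vector (a stand-in for a learned layer, assumed here).
    Returns (fused_feature, attention_weights).
    """
    scores = [sum(w * x for w, x in zip(score_vector, f)) for f in slice_feats]
    weights = softmax(scores)
    dim = len(slice_feats[0])
    fused = [sum(wt * f[j] for wt, f in zip(weights, slice_feats))
             for j in range(dim)]
    return fused, weights
```

With a zero scoring vector every slice gets the same weight, so the fused feature reduces to the plain average of the slice features.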

In one implementation manner, the method for detecting the crohn's disease based on the multi-example learning and pathology image, wherein the interaction between the initial object feature and each slice feature, to obtain the object feature, specifically includes:

inputting the initial object feature and the slice features into a Transformer network;

and determining, through the Transformer network, the output weight corresponding to each slice feature, and weighting each slice feature based on the output weight to obtain the object feature.

According to the above technical means, the initial object feature and the slice features interact through a Transformer network, so that each slice feature is properly considered and weighted while the object feature is formed. The Transformer network allows the method to assign output weights to the slice features adaptively, emphasizing key slice features and weakening non-key ones, so that a more accurate and fine-grained object feature is obtained.
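The interaction step can be illustrated with a single scaled dot-product attention operation in which the initial object feature is the query and the slice features serve as both keys and values. This is a deliberately reduced sketch: a real Transformer layer also has learned projections, multiple heads and feed-forward blocks, none of which are shown, and the name `reweight_slices` is hypothetical.

```python
import math


def reweight_slices(initial_obj, slice_feats):
    """Weight slice features by their affinity to the initial object feature.

    Simplification (assumed): one attention head, no learned projections.
    Returns (object_feature, output_weights).
    """
    d = len(initial_obj)
    # Scaled dot product between the query and each slice feature.
    logits = [sum(q * k for q, k in zip(initial_obj, f)) / math.sqrt(d)
              for f in slice_feats]
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    z = sum(exps)
    weights = [e / z for e in exps]
    # Object feature is the weighted sum of the slice features.
    obj = [sum(w * f[j] for w, f in zip(weights, slice_feats))
           for j in range(d)]
    return obj, weights
```

Slices whose features align with the initial object feature receive larger output weights and therefore contribute more to the final object feature.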

Beneficial effects: compared with the prior art, the application provides a Crohn's disease detection method based on multi-example learning and pathological images. The method comprises: acquiring a plurality of pathological images and acquiring the slice feature corresponding to each pathological image, wherein the pathological images correspond to the same target object and each pathological image comprises a preset number of image blocks; determining a slice prediction result corresponding to each pathological image based on each slice feature; and determining an object feature of the target object based on the slice features, and determining an object prediction result of the target object based on the object feature and the slice features. After the slice features are extracted, the slice prediction results and the object feature of the target object are determined from them, and the object prediction result is then determined from both the slice-level features and the patient-level feature. Because the slice level and the patient level act jointly in the detection process, the detection accuracy for Crohn's disease is improved and the misdiagnosis rate is reduced.

Drawings

In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings that are needed in the description of the embodiments will be briefly introduced below, and it is obvious that the drawings in the following description are only some embodiments of the present application, and that other drawings may be obtained according to these drawings without creative effort for a person of ordinary skill in the art.

Fig. 1 is a flowchart of a method for detecting crohn's disease based on multiple example learning and pathology images provided herein.

Fig. 2 is a diagram of a pathological image classification network structure based on typical feature propagation provided in the present application.

Fig. 3 is a block diagram of slice feature-patient feature interaction based on the self-attention mechanism provided in the present application.

Fig. 4 is a schematic structural diagram of a crohn's disease detection device based on multiple example learning and pathology images provided herein.

Fig. 5 is a schematic structural diagram of a terminal device provided in the present application.

Detailed description of the preferred embodiments

The present application provides a method and related device for detecting crohn's disease based on multiple example learning and pathological images, and for making the purposes, technical solutions and effects of the present application clearer and more specific, the present application will be further described in detail below with reference to the accompanying drawings and examples. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the present application.

As used herein, the singular forms "a", "an" and "the" are intended to include the plural forms as well, unless expressly stated otherwise, as understood by those skilled in the art. It will be further understood that the terms "comprises" and/or "comprising," when used in this specification, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof. It will be understood that when an element is referred to as being "connected" or "coupled" to another element, it can be directly connected or coupled to the other element or intervening elements may also be present. Further, "connected" or "coupled" as used herein may include wirelessly connected or wirelessly coupled. The term "and/or" as used herein includes any and all combinations of one or more of the associated listed items.

It will be understood by those skilled in the art that all terms (including technical and scientific terms) used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this application belongs unless defined otherwise. It will be further understood that terms, such as those defined in commonly used dictionaries, should be interpreted as having a meaning that is consistent with their meaning in the context of the prior art and will not be interpreted in an idealized or overly formal sense unless expressly so defined herein.

It should be understood that the sequence number and the size of each step in this embodiment do not mean the sequence of execution, and the execution sequence of each process is determined by the function and the internal logic of each process, and should not constitute any limitation on the implementation process of the embodiment of the present application.

The inventors have found through study that Crohn's disease (CD) and intestinal tuberculosis (ITB) are two intestinal diseases whose treatment and prognosis differ widely, even though they overlap in clinical symptoms, endoscopic features, histological features, microbiological features and serological indicators. Because these features are not unique to either CD or ITB, they have limited diagnostic value and cannot reliably distinguish the two. At present, clinical differentiation generally relies on a combination of examinations and the physician's experience, so the result depends heavily on that experience and its accuracy cannot be guaranteed.

In order to solve the above problem, in an embodiment of the present application, a plurality of pathological images are acquired, and the slice feature corresponding to each pathological image is acquired, wherein the pathological images correspond to the same target object and each pathological image comprises a preset number of image blocks; a slice prediction result corresponding to each pathological image is determined based on each slice feature; and an object feature of the target object is determined based on the slice features, and an object prediction result of the target object is determined based on the object feature and the slice features. After the slice features are extracted, the slice prediction results and the object feature of the target object are determined from them, and the object prediction result is then determined from both the slice-level features and the patient-level feature. Because the slice level and the patient level act jointly in the detection process, the detection accuracy for Crohn's disease is improved and the misdiagnosis rate is reduced.

The application will be further described by the description of embodiments with reference to the accompanying drawings.

The present embodiment provides a method for detecting crohn's disease based on multi-example learning and pathology images, as shown in fig. 1, the method includes:

S10, acquiring a plurality of pathological images and acquiring slice characteristics corresponding to each pathological image, wherein the pathological images correspond to the same target object, and each pathological image comprises a preset number of image blocks.

Specifically, the pathological images mainly come from a hospital pathology department and are used for disease diagnosis and research. The pathological images are taken from different parts or different viewing angles of the same target object. In general the target object may be a particular cell or tissue sample, such as a breast cancer or lung cancer sample; in the implementation of the present application, the target object is a tissue sample of Crohn's disease or intestinal tuberculosis. Acquiring a plurality of pathological images of the same target object gives a more comprehensive view of its cell structure and morphological changes, which improves the accuracy of diagnosis.

Each pathological image is prepared by a pathologist or pathology technician using a pathology microtome and a high-resolution scanner. The pathological image is usually in color and has an extremely high resolution, so that the microstructure of cells and tissues can be clearly observed.

Each pathology image is divided into a number of image blocks of a preset number, for example 100 or 200. The blocking method is beneficial to processing and analyzing the pathology image with large size, and simultaneously, the analysis of the specific area is more convenient and efficient.

For each pathological image, the slice features mainly describe the medical properties and information of the image, such as cell morphology, tissue structure and staining. Slice features may be annotated manually by a pathologist or extracted automatically by techniques such as deep learning. For example, in implementations of the present application, the slice features of a Crohn's disease pathological image may include cell size, cell morphology, nuclear staining, and the like.

In practice, multiple pathological images of the same target object can help doctors to observe and evaluate it from different angles and levels, for example, it is of great value in early diagnosis and staging of cancer.

In one implementation, acquisition of several pathology images and extraction of slice features may be accomplished through an automated process. For example, image segmentation and feature extraction are performed using a deep learning model such as U-Net or VGG. In addition, for each image block, the method and flow of feature extraction is the same as the overall pathology image, except that the scope is defined within that image block.

Preferably, in an implementation manner of the present application, the method for acquiring a plurality of pathological images includes:

S11, reading the pathological image using the OpenSlide software library. To ensure that only the valid tissue portions of the image are processed and that blank or irrelevant regions are excluded, OpenCV is used to segment the tissue contours at a lower resolution. Then, at 40× magnification, the entire tissue region is cut into a plurality of image blocks of 512 × 512 pixels. The image blocks do not overlap, which ensures that each block is unique. For each image block, the pixel coordinates of its upper-left corner are saved for subsequent image processing and analysis.
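The tiling logic of step S11 can be sketched independently of the slide-reading library. In the sketch below the slide reading and contour segmentation are left out: the tissue mask is passed in as a plain 2D list, and the function name `tile_coordinates`, the `scale` parameter and the 50% tissue threshold are assumptions made for illustration.

```python
def tile_coordinates(width, height, mask, scale, tile=512, min_tissue=0.5):
    """Return upper-left (x, y) coordinates of non-overlapping tissue tiles.

    width, height: slide size in pixels at the tiling resolution.
    mask: low-resolution tissue mask, mask[row][col] is 1 for tissue.
    scale: number of full-resolution pixels covered by one mask pixel.
    min_tissue: minimum tissue fraction for a tile to be kept (assumed).
    """
    coords = []
    # Stepping by the tile size guarantees that tiles never overlap.
    for y in range(0, height - tile + 1, tile):
        for x in range(0, width - tile + 1, tile):
            # Range of mask cells covered by this tile.
            r0, r1 = y // scale, max(y // scale + 1, (y + tile) // scale)
            c0, c1 = x // scale, max(x // scale + 1, (x + tile) // scale)
            cells = [mask[r][c]
                     for r in range(r0, min(r1, len(mask)))
                     for c in range(c0, min(c1, len(mask[0])))]
            if cells and sum(cells) / len(cells) >= min_tissue:
                coords.append((x, y))
    return coords
```

Saving only the coordinates, as the step describes, keeps the index small and lets each block be re-read from the slide on demand.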

S12, re-reading the corresponding image blocks according to the coordinate information stored in step S11. To enhance the generalization ability of the model and increase the diversity of the dataset, the following augmentation techniques are applied randomly to each image block that is read:

(1) Image rotation: a random 90° rotation or a horizontal flip, which adds variety to the image while preserving its original characteristics.

(2) Affine transformation: scaling and rotation of the image. Specifically, the zoom range is set to 80% to 120% of the original image, and the rotation angle ranges from -30° to +30°, to increase the diversity of the images.

(3) Noise and blur: Gaussian noise is added to the image, which is then blurred with a Gaussian filter, to simulate image quality problems that may occur in practice.

(4) Image brightness and contrast: the brightness and contrast of the image are adjusted as needed to enhance or attenuate certain portions of the image to simulate different image acquisition conditions.

(5) Color adjustment: the hue and saturation of the image are adjusted to simulate color deviations due to device or environmental differences.

These augmentation techniques increase the diversity of the training data and improve the robustness of the model to real-world variation, which improves performance in the final application.
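The geometric part of the augmentation list can be sketched with the standard library alone. The parameter ranges (90° turns, horizontal flip, 80% to 120% zoom, -30° to +30° rotation) follow the text; the 50% flip probability and all function names are assumptions, and the noise, blur, brightness and color steps are omitted.

```python
import random


def sample_augmentation(rng):
    """Draw one random set of augmentation parameters for an image block."""
    return {
        "rot90_turns": rng.randrange(4),        # 0, 90, 180 or 270 degrees
        "hflip": rng.random() < 0.5,            # flip probability is assumed
        "scale": rng.uniform(0.8, 1.2),         # affine zoom, 80% to 120%
        "angle_deg": rng.uniform(-30.0, 30.0),  # affine rotation angle
    }


def rot90(block, turns):
    """Rotate a 2D list clockwise by 90 degrees, `turns` times."""
    for _ in range(turns % 4):
        block = [list(row) for row in zip(*block[::-1])]
    return block


def hflip(block):
    """Mirror a 2D list left to right."""
    return [row[::-1] for row in block]


def apply_geometric(block, params):
    """Apply the sampled 90-degree rotation and optional flip."""
    out = rot90(block, params["rot90_turns"])
    return hflip(out) if params["hflip"] else out
```

Passing an explicit `random.Random` instance keeps the augmentation reproducible when a seed is fixed.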

Specifically, the slice feature includes an image block feature of each image block.

The slice refers to a larger image divided into a plurality of smaller image blocks. These tiles may be square, rectangular, or any other shape, depending on the particular application and processing requirements. For example, in medical image processing, a large medical scan image may be divided into multiple small image blocks for more detailed feature extraction and analysis.

The image block features refer to a feature set obtained by extracting features of each image block. These features may include, but are not limited to: color features, texture features, shape features, edge features, etc. Different tiles will also differ in the resulting tile characteristics due to differences in their content, background and objects.

In a specific implementation, the slice feature extraction process is as follows:

Preprocessing: the large image is preprocessed using image processing algorithms such as Canny edge detection or the Sobel operator to enhance its feature expression.

Dividing the image: the preprocessed large image is segmented into a plurality of image blocks according to a predetermined size.

Feature extraction: features are extracted from each image block, for example using a trained feature extraction model such as VGG16 or MobileNet.

Combining features: the image block features of all image blocks are combined to form a composite feature representation of the slice.

It is noted that when processing different image blocks, the policies and parameters of feature extraction may need to be adjusted accordingly to ensure that the features of each image block are accurately extracted.

In a specific implementation of the present application, features of the pathological image blocks are extracted using a ResNet50 network model pre-trained on ImageNet. ImageNet is a large image database containing millions of annotated images across a wide variety of categories. A model pre-trained on ImageNet has already learned many generic visual features, which gives a specific downstream task a good starting point.

Before features are extracted from the pathological image blocks, the parameters of the ResNet50 model are first fine-tuned so that they better suit the characteristics and requirements of pathological images. Fine-tuning typically involves freezing certain layers of the model, such as the first 3 layers, and then continuing training on a specific pathological image dataset to update the remaining weights.

Each pathological image block is then fed into the fine-tuned ResNet50 model for forward propagation to extract its deep features. The deep features of each pathological image block reflect the cell and tissue information of the block and also incorporate the high-level abstract features learned by the model, which facilitates subsequent classification and analysis.

For convenience of subsequent study and analysis, the features of all image blocks are integrated and saved as a single ".pt" file. ".pt" is a file format commonly used in the PyTorch framework for saving model parameters or data. Storing the features of all image blocks in one file makes subsequent data loading and processing more efficient.

In this way, subsequent pathological image classification studies, such as machine learning or deep learning training using these features, can be conveniently performed to identify and classify different pathological conditions.

S20, as shown in fig. 2, slice prediction results corresponding to each pathological image are determined based on each slice characteristic.

Specifically, the slice features are a group of features extracted from a pathological image that reflect various information about the slice image, such as color, shape and texture. The features may be extracted by a trained deep neural network model, such as VGG16, VGG19 or ResNet.

When the pathological image is a pathological section image of human tissue, different feature extraction models can be selected according to the characteristics of different organs of the human body. For example, for lung pathology images, a ResNet model may be selected for feature extraction, and for liver pathology images, a VGG16 model may be selected for feature extraction.

The slice prediction result is obtained by feeding the slice features into a prediction model. The prediction model is a trained deep learning model, such as a convolutional neural network model, a long short-term memory (LSTM) model, or another suitable machine learning model. The prediction result may be a classification of the slice image, for example benign or malignant; it may also be a score for the slice image, for example the degree of invasion or the degree of malignant transformation.

For example, when a pathology image is obtained, features thereof are first extracted using the ResNet model, and then these features are input into a trained CNN model, thereby obtaining a slice prediction result of the pathology image. If a plurality of pathological images exist, processing each pathological image to obtain a respective slice prediction result of each pathological image.

It should be noted that when a plurality of pathology images are processed, the processing flow of each pathology image is the same. The slice characteristics of each pathological image are only required to be respectively input into a prediction model, so that respective slice prediction results can be obtained.

In one implementation, the determining the slice prediction result corresponding to each pathological image based on each slice feature specifically includes:

S21, for each slice feature, selecting a plurality of typical features from the slice feature;

Specifically, for the slice features obtained in S10, not all features are used; instead, to reduce computational complexity and improve the accuracy of the analysis, several of the most representative, or typical, features are selected in each slice for analysis. The features are selected according to their distinctiveness, how strongly they indicate a disease condition, or other relevant criteria.

For example, when examining pathological sections for Crohn's disease, typical features may include non-caseating granulomas in the submucosa of the digestive tract, ulcers extending deep into the muscular layer, thickening of the intestinal wall, and the like. Attributes of these features in the image, such as brightness, shape and size, differ from those of normal tissue, so they can be identified and selected.

To more efficiently perform feature selection, a model may be built to automatically select the most typical and critical features from each slice by training with previously annotated data using a machine learning method. For example, a convolutional neural network based on deep learning may be used to analyze the slice images and extract key features therefrom.

In one implementation manner of the present application, the selecting a plurality of typical features from the slice features specifically includes:

s211, modeling each image block feature in slice features in a Gaussian function linear combination mode to obtain a feature distribution model;

in particular, in actual clinical procedures, particularly in the pathology field, doctors often diagnose patients based on slices with typical lesion characteristics. In order to extract the typical lesion features from a large number of slices more efficiently and accurately, the application proposes a pathological image classification method based on typical feature propagation.

The pathological image classification method is characterized by an innovative network structure, and a plurality of typical characteristics can be accurately selected from slice characteristics aiming at each slice characteristic. Specifically, to achieve the acquisition of typical features, a desired maximum attention algorithm, EM attention for short, is employed.

The EM Attention algorithm is based on a Gaussian mixture model (GMM). The algorithm gathers feature statistics for each slice and models the image block features in the slice as a linear combination of Gaussian functions, so that typical lesion features in the slice are identified effectively.

For further explanation, considering various features and changes thereof in the slice, the EM Attention algorithm can distribute weights according to the distribution and importance of the features, ensure that only the features with the most representativeness and the diagnosis value are selected, not only improve the accuracy of diagnosis, but also greatly shorten the analysis time, and provide more visual and accurate references for doctors.

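Further, the mathematical formula for modeling may be:

$p(x_i) = \sum_{k=1}^{n} z_{ik}\, \mathcal{N}(x_i \mid \mu_k, \Sigma_k)$

where $n$ is the number of Gaussian functions, $x_i \in \mathbb{R}^{d}$ is the feature of the $i$-th image block, $p$ is the model built on the slice feature distribution, $\mu_k$, $\Sigma_k$ and $z_{ik}$ are respectively the mean, the covariance and the weight of the $k$-th Gaussian, and $d$ is 1024.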

In particular implementations of the present application, some specific configurations are provided for the expectation-maximization algorithm. First, the number of gaussian functions is set to 8, thereby ensuring that more feature distributions can be covered and captured during the computation. At the same time, the iteration number of the EM algorithm is set to be 4, so that a good balance between algorithm convergence and calculation resources is ensured.

In addition, a balance parameter is set to control the weight between the original feature and the updated feature, the parameter being set to 0.5. This means that equal attention is given to both types of features during the calculation process.

Notably, in the early stage of training, typical features may not yet have been adequately learned. A staged strategy is therefore adopted: during the first 50 training epochs, only the typical features within each slice are used for training; after 50 epochs have been completed, the typical features shared between slices are added to the training process. This avoids prematurely introducing typical features that may be inaccurate and that would degrade the training of the model.

S212, selecting a plurality of typical features from the slice features by iterating the feature distribution model.

Specifically, the EM algorithm randomly initializes the means in the initial stage. Each iteration goes through two key steps. The first is the E step (Expectation step), in which the algorithm computes the weights of the Gaussian functions, which represent the probability that each image block feature was generated by each Gaussian function. The second is the M step (Maximization step), which mainly updates the means of the Gaussian functions.

These two steps are performed alternately in order to capture global context information of the slice in a non-local manner. By constant iteration and adjustment, the model is made to learn gradually which parts or features in the slice are most likely to be typical or most diagnostically valuable.
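The alternating E and M steps described above can be sketched as follows. This is a minimal illustration, not the application's implementation: it assumes unit covariances and equal priors for all Gaussians (so the E step reduces to a softmax over negative half squared distances), and the function name `em_attention` and its parameters are hypothetical. The defaults of 8 Gaussians and 4 iterations follow the configuration stated in the text.

```python
import numpy as np

def em_attention(features, n_gaussians=8, n_iters=4, seed=0):
    """Return (means, responsibilities) for the patch features of one slice.

    features: array of shape (num_patches, d).
    Assumption: unit covariance and equal priors for every Gaussian.
    """
    rng = np.random.default_rng(seed)
    num_patches, d = features.shape
    # Random initialisation of the Gaussian means.
    means = rng.standard_normal((n_gaussians, d))
    z = np.full((num_patches, n_gaussians), 1.0 / n_gaussians)
    for _ in range(n_iters):
        # E step: z[i, k] is the probability that patch i came from Gaussian k.
        sq_dist = ((features[:, None, :] - means[None, :, :]) ** 2).sum(axis=2)
        logits = -0.5 * sq_dist
        logits -= logits.max(axis=1, keepdims=True)  # numerical stability
        z = np.exp(logits)
        z /= z.sum(axis=1, keepdims=True)
        # M step: each mean becomes the responsibility-weighted patch average.
        mass = z.sum(axis=0) + 1e-8
        means = (z.T @ features) / mass[:, None]
    # z comes from the last E step, i.e. before the final mean update.
    return means, z
```

Under these assumptions, the rows of `means` play the role of the typical features of the slice, and `z` gives the soft assignment of each image block to them.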

S22, selecting a plurality of reference features from a preset feature library based on the plurality of typical features.

Specifically, to obtain more accurate classification results, not only the typical features of the current slice are considered; a preset feature library is also used, which stores the typical features of high-confidence slices of each class. This strategy rests on the observation that, even among slices of the same class, typical features may differ subtly because the image blocks have different morphologies. The preset feature library therefore allows typical features to be propagated between different slices, making maximum use of the image differences among slices of each class and helping the network classify slices that are hard to distinguish.

The preset feature library exists in the form of a memory table, wherein typical features of each type of slice and corresponding prediction scores are recorded. When new slice data is entered, the typical characteristics and the prediction scores of the new slice data are obtained first, and then the new slice data are compared with the scores of the similar slices in the storage table.

Further, in one implementation, the score comparison process based on the preset feature library is as follows:

First, a prediction score threshold is preset for each slice class; it is obtained through statistical analysis of historical data. This design ensures that the typical features of a new slice qualify to enter the preset feature library, or to replace existing data in it, only when the prediction score of the new slice is above the preset threshold.

When new slice data are submitted to the system, their predictive scores are carefully compared with the scores of the same class slices in the pre-set feature library.

In addition, a weighting system is introduced to optimize score comparisons. Each feature is given a weight, considering that the importance of the different typical features in an actual clinical diagnosis may vary. The weights are determined based on the rarity of the features and their contribution to the classification. This means that the weights assigned in making the comparison of the scores are taken into account, thereby reflecting the true value of the individual features more accurately.

Thus, when a new slice needs to be classified, not only the typical features extracted directly from the current slice are considered, but similar reference features are retrieved in the preset feature library based on these typical features. The combination method not only improves the accuracy of classification, but also enhances the discrimination capability of the model for difficult-to-distinguish slices.
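As an illustration of retrieving reference features from the library based on the current typical features, the sketch below ranks library entries by cosine similarity. The similarity measure, the function name `retrieve_reference_features` and the parameter `top_m` are assumptions made for this example; the text does not specify how similarity is computed.

```python
import numpy as np

def retrieve_reference_features(typical, library, top_m=3):
    """Pick the library features most similar to the current typical features.

    typical: (n_typical, d) array; library: (n_library, d) array.
    Assumption: similarity is cosine similarity (not stated in the source).
    """
    t = typical / (np.linalg.norm(typical, axis=1, keepdims=True) + 1e-12)
    lib = library / (np.linalg.norm(library, axis=1, keepdims=True) + 1e-12)
    # Best similarity of each library entry to any of the typical features.
    best = (lib @ t.T).max(axis=1)
    order = np.argsort(-best)[:top_m]
    return library[order], order
```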

S23, the plurality of typical features and the plurality of reference features are interacted to obtain a plurality of updated typical features and a plurality of updated reference features.

Specifically, in order to further optimize and update the typical features and the reference features, a method based on a bidirectional random walk module is adopted, and the method is specially designed for the features in the slices and among the slices, so that the features are more representative and differentiated.

Among these, typical feature propagation plays a central role. Specifically, given a set of slice features, a random walk operation is used to propagate typical features into the current slice features, thereby achieving feature updating and optimization.

The core calculation formula of this propagation operation is:

$F^{*} = w \cdot Z^{*} \mu^{*} + (1 - w) \cdot F$

where $w$ is a parameter in the range $[0, 1]$ whose main role is to balance the weights of the original and propagated features. This balanced design ensures that the updated features do not deviate excessively from the original features, preserving the stability and continuity of the features. $Z^{*}$ denotes the weights of the Gaussian functions, and $\mu^{*}$ is either a typical feature of the current slice or a typical feature extracted from historical slices.

Through typical-feature propagation and the bidirectional random walk module, both updated intra-slice features and updated inter-slice features, namely $F^{*}$, are obtained. The feature system thus becomes more complete, which helps improve the accuracy and robustness of the model.
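The propagation formula can be written directly as a matrix expression. In the sketch below, the array shapes and the function name `propagate_typical_features` are assumptions for illustration; the default `w = 0.5` follows the balance parameter stated earlier in the text.

```python
import numpy as np

def propagate_typical_features(F, Z, mu, w=0.5):
    """Compute F* = w * Z mu + (1 - w) * F.

    F:  (num_patches, d) original image block features.
    Z:  (num_patches, n) weights of the n Gaussian functions per image block.
    mu: (n, d) typical features (from the current slice or the library).
    """
    if not 0.0 <= w <= 1.0:
        raise ValueError("w must lie in [0, 1]")
    return w * (Z @ mu) + (1.0 - w) * F
```

With `w = 0` the features are left unchanged, and with `w = 1` each image block feature is replaced entirely by its mixture of typical features.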

S24, determining a slice prediction result of the pathological image corresponding to the slice feature based on the slice feature, the plurality of updated typical features and the plurality of updated reference features.

Specifically, in order to ensure that the classification prediction result of the pathological image is accurate and stable, the current slice feature, the updated typical feature and the reference feature are integrated for judgment.

Firstly, obtaining the feature description of the current pathological image slice through a feature extraction network. The feature description represents key information such as pathological morphology, texture, color distribution and the like on the slice, and provides basic data for subsequent predictive analysis.

And secondly, selecting the updated typical features from the preset feature library. The typical features were previously selected and stored according to their high predictive scores, representing the most representative features of the pathological section.

Again, in addition to the typical features, reference features similar to the current slice feature are retrieved from the feature library. The reference features are compared and extracted from previous pathology section data with data in the existing library, providing additional clues about the current section pathology classification.

After the three kinds of characteristic data are obtained, a distinguishing stage is carried out.

In one implementation, the three features described above are considered and fused together using a deep learning classifier. Based on the fused features, the classifier generates a prediction result for the current pathological section, representing the most likely pathological category of the section.

To increase the accuracy of the predictions, a weighting system is used to assign different weights to the features from different sources. These weights are determined based on the importance of the features and the degree of contribution to the classification.

Notably, the specific implementation of the present application faces a multi-class classification task at the slice level. Because the numbers of samples of the various classes among the pathological slices are imbalanced, LADE is used as the loss function. The LADE loss function takes into account the number of samples of each class in the data set and weights the classes by importance, which increases the attention paid to minority classes and improves their classification performance.

Preferably, in an implementation manner of the present application, the determining, based on the slice feature, the plurality of updated typical features, and the plurality of updated reference features, a slice prediction result of a pathological image corresponding to the slice feature specifically includes:

S241, inputting the slice feature, the plurality of updated typical features and the plurality of updated reference features into a trained Transformer network.

The Transformer network, which is based on the self-attention mechanism, was originally designed for natural language processing and other sequence learning tasks. Over time, however, it has come to be widely used in computer vision because of its processing power and flexibility. Networks such as Vision Transformer and Swin Transformer have achieved remarkable results in image analysis.

Further, a key property of the Transformer network is that, in theory, it can handle sequences of arbitrary length. Through the self-attention mechanism, the Transformer network can attend to global information in the sequence and capture the relations between different elements, which is particularly important for pathological image analysis, since pathological images contain a large amount of fine detail that must be compared and analyzed in a global scope.

Typically, in computer vision applications, an image is first divided into image blocks, and the image blocks are converted into a sequence for the network to analyze. In this context, each image block may be regarded as a sequence element. For pathological images, after feature extraction, each image block feature is regarded as a sequence element, so that the input data structure provided to the Transformer is consistent with the input it was designed to expect.
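The conversion of an image into a sequence of image blocks can be illustrated as below. This is only a sketch of the general idea on raw pixels; in the present application the sequence elements are extracted image block features rather than pixels, and the function name and the `patch` size are hypothetical.

```python
import numpy as np

def image_to_patch_sequence(image, patch):
    """Split an (H, W, C) image into non-overlapping patch x patch blocks.

    Returns an array of shape (num_blocks, patch * patch * C), one row per
    block, ordered row by row. Border pixels that do not fill a block are dropped.
    """
    h, w, c = image.shape
    rows, cols = h // patch, w // patch
    image = image[: rows * patch, : cols * patch]
    # (rows, patch, cols, patch, C) -> (rows, cols, patch, patch, C)
    blocks = image.reshape(rows, patch, cols, patch, c).swapaxes(1, 2)
    return blocks.reshape(rows * cols, patch * patch * c)
```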

In summary, the individual image block features, the updated typical features, and the updated reference features of the pathological slice are provided as the input sequence to the Transformer network. The Transformer can effectively capture the correlations and links between these features, thereby providing a comprehensive analysis result for the whole pathological slice.

S242, determining a slice prediction result of the pathological image corresponding to the slice feature through the Transformer network.

Specifically, this step operates on the small image blocks described above, into which the whole pathological slice image is divided. These image blocks represent local features of the image and can be seen as elements analogous to the words of a vocabulary in the Transformer network.

Next, the image blocks are input into a Transformer encoder. The core of the encoder is a multi-head self-attention mechanism, which allows the model to consider the relationships between a given image block and all other image blocks at the same time, thereby capturing the interdependencies and context information between different locations in the image. The advantage of the self-attention mechanism is that it generates, for each image block, a representation weighted with respect to all other image blocks.

In addition to the multi-head self-attention mechanism, the encoder also contains a feed-forward neural network and residual connections. The feed-forward neural network further processes the self-attention output, enhancing the expressive power of the model, while the residual connections help to prevent the vanishing gradient problem that occurs when training deep networks.

Each image block is processed by the encoder to obtain an encoded representation that captures its relationships and context information with other image blocks. Eventually, all image block encodings will be combined or summarized to form a vector representation representing the entire slice.

Further, to achieve classification, the vector representation representing the entire slice is fed into a fully connected layer that outputs a predictive score for each class. And determining the final prediction result of the pathological image slice according to the score.
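The flow described above (self-attention, feed-forward network, residual connections, aggregation into a slice vector, and a fully connected classification layer) can be sketched as follows. This is a deliberately simplified illustration: it uses a single attention head and a single block, omits layer normalisation, aggregates by mean pooling, and uses random untrained weights. All of these choices and all names are assumptions for the example, not the configuration of the application.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def init_params(d, hidden, n_classes, seed=0):
    """Random, untrained weights used only to make the sketch runnable."""
    rng = np.random.default_rng(seed)
    shapes = {"Wq": (d, d), "Wk": (d, d), "Wv": (d, d),
              "W1": (d, hidden), "W2": (hidden, d), "Wc": (d, n_classes)}
    return {name: rng.standard_normal(shape) * 0.1 for name, shape in shapes.items()}

def encoder_block(tokens, params):
    """Single-head self-attention and feed-forward, each with a residual."""
    q, k, v = tokens @ params["Wq"], tokens @ params["Wk"], tokens @ params["Wv"]
    attn = softmax(q @ k.T / np.sqrt(q.shape[1]))    # (n_tokens, n_tokens)
    tokens = tokens + attn @ v                       # residual around attention
    hidden = np.maximum(tokens @ params["W1"], 0.0)  # ReLU feed-forward layer
    return tokens + hidden @ params["W2"], attn      # residual around feed-forward

def classify_slice(tokens, params):
    """Encode all tokens, pool them into one slice vector, score each class."""
    encoded, attn = encoder_block(tokens, params)
    slice_vector = encoded.mean(axis=0)              # assumed: mean pooling
    scores = softmax(slice_vector @ params["Wc"])    # one score per class
    return scores, attn
```

Here `tokens` would hold the image block features together with the updated typical and reference features, one row per sequence element; the class with the highest score is taken as the slice prediction.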

In another implementation, after determining a slice prediction result of the pathology image corresponding to the slice feature based on the slice feature, the plurality of updated representative features, and the plurality of updated reference features, the method further includes:

Specifically, after the typical features of each slice are acquired, a confidence-based strategy is adopted to update a preset feature library.

A specialized feature library is first created for preserving the typical features of high confidence slices under different classifications. The creation of the feature library follows a standard: i.e. different slices under the same classification may contain image blocks of the same kind but of different morphology. Therefore, by storing the typical features of each slice in the feature library and subsequently propagating the typical features, the image differences among different slices under the same classification are fully utilized, so that the model can more accurately identify the slices which are difficult to identify during subsequent training and classification.

More specifically, a feature library is used to record the typical features of each class of slices and their corresponding predictive scores. When new slice characteristics and prediction scores are generated, the new data can be compared with the scores of the similar slices in the characteristic library. If the new predicted score is higher than the score in the feature library, the new representative feature and its score will replace the original data and thus be updated into the feature library, a mechanism that ensures that the feature library is maintained in the most up-to-date, representative state at all times.

Further, to manage the feature library effectively and ensure that it does not consume too much memory, the following policy is adopted: under each class, only the typical features whose prediction scores rank in the top k are saved, where k is a preset parameter. For example, k may be 5, in which case the top five typical features are kept, or 10, in which case the top ten are kept. Preferably, in the specific implementation of the present application, the 10 typical features with the highest confidence are retained, which strikes a balance between retaining key information and saving computing resources.

In summary, through the score comparison and updating mechanism, not only is the high quality of data in the preset feature library ensured, but also the aim of gradually improving the classification precision of the model in the continuous learning process is fulfilled, so that the model is more accurate and stable when processing the difficult-to-distinguish slices.
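The score comparison and top-k update policy can be sketched with a small in-memory table. The class name `FeatureLibrary`, its methods and the per-class `thresholds` dictionary are hypothetical names introduced for illustration; the behaviour follows the rules stated in the text (a score must exceed the class threshold, and only the k best-scoring typical features per class are kept).

```python
class FeatureLibrary:
    """Per-class table of (score, typical feature) pairs, best score first."""

    def __init__(self, k=10, thresholds=None):
        self.k = k
        self.thresholds = thresholds or {}  # class label -> minimum score
        self.entries = {}                   # class label -> list of (score, feature)

    def update(self, label, score, feature):
        """Try to store a typical feature; return True if it was stored."""
        # Only scores strictly above the class threshold qualify.
        if score <= self.thresholds.get(label, float("-inf")):
            return False
        bucket = self.entries.setdefault(label, [])
        # A full bucket only accepts scores better than its current worst entry.
        if len(bucket) >= self.k and score <= bucket[-1][0]:
            return False
        bucket.append((score, feature))
        bucket.sort(key=lambda item: item[0], reverse=True)
        del bucket[self.k:]                 # keep only the top k
        return True

    def features(self, label):
        return [feature for _, feature in self.entries.get(label, [])]
```

For example, with `k=2` and a threshold of 0.5, scores 0.6 and 0.9 are stored, a later 0.55 is rejected because the bucket is full with better entries, and a later 0.7 displaces the 0.6 entry.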

S30, as shown in FIG. 3, determining object characteristics of a target object based on each slice characteristic, and determining an object prediction result of the target object based on the object characteristics and the slice characteristics.

In one implementation manner, the determining the object feature of the target object based on each slice feature specifically includes:

S31, determining the attention score corresponding to each slice feature, and fusing the slice features based on the determined attention scores to obtain initial object features;

For the field of pathology image analysis, it is a common challenge how to effectively fuse different slice features from the same subject (e.g., crohn's disease pathological tissue). Each slice may contain information that is valuable for diagnosis, but the importance of different slices may be different.

Therefore, a way is needed to determine the weight of each slice feature and to fuse the slice features with emphasis according to these weights, so as to obtain the corresponding object features.

In general, there are various alternatives for determining the respective weights of the slice features, such as:

Maximum slice strategy (MaxS): in this strategy, the predicted probability for each slice feature is first calculated, then the probabilities are ordered, and the slice feature with the highest probability is selected as the feature at the patient level. This strategy is based on the assumption that the slice with the highest probability often contains the most important information, and therefore, using this slice feature as the patient-level feature should give a better prediction.

Maximum minimum slice strategy (MaxMinS): in this strategy, the predicted probability of each slice feature is first calculated, then the probabilities are ordered, the two slice features with the largest and smallest probabilities are taken, and the features are averaged as patient-level features. This strategy is based on the assumption that slice features with the greatest and least probability often contain the most abundant information, and by averaging their features, a more comprehensive patient-level feature can be obtained.

Attention feature fusion strategy (AFS): in this strategy, the attention score of each slice feature is first calculated using an attention network, and then the individual slice features are fused based on the attention scores, resulting in a patient-level feature. This strategy is based on the assumption that the importance of individual slice features differs, and that an attention mechanism can perform a weighted fusion according to the importance of each feature, resulting in a more accurate patient-level feature. It is worth noting that, when the attention feature fusion strategy is employed, the slice-level network may be trained first and the patient-level network trained afterwards.
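The two simpler alternatives, MaxS and MaxMinS, can be sketched as follows. The function names are hypothetical; `features` holds one row per slice and `probs` holds the predicted probability of each slice.

```python
import numpy as np

def max_slice(features, probs):
    """MaxS: use the feature of the slice with the highest predicted probability."""
    return features[int(np.argmax(probs))]

def max_min_slice(features, probs):
    """MaxMinS: average the features of the highest- and lowest-probability slices."""
    hi, lo = int(np.argmax(probs)), int(np.argmin(probs))
    return (features[hi] + features[lo]) / 2.0
```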

Preferably, to address this issue, the present application introduces an attention feature fusion strategy, the core of which is the attention mechanism.

The attention mechanism is a technique for capturing the relative importance between the parts of the input. Attention mechanisms have been widely used in the fields of natural language processing, computer vision, and recommendation systems.

First, the respective attention score for each slice feature is determined. This process can be regarded as a learning process, in which the model learns the extent to which the features of the individual slices affect the diagnostic result by learning the training data, and assigns a corresponding attention score to each slice feature. This score represents the importance of the feature in all slice features, with higher values indicating that the feature is more important.

The slice features are then fused based on the determined attention scores. This is achieved by means of a weighted average, the weight of each slice feature being its corresponding attention score. In this way, the slice features from the same object are effectively fused while preserving the important information of the features.

Specifically, the set of feature vectors $\{h_j\}$ of the plurality of image blocks of the same target object is input into a fully connected neural network layer, and a logistic (sigmoid) activation function is applied, so that a weight $\alpha_j$ is output for each $h_j$. Then, the vectors are weighted-averaged based on these weights to finally obtain the patient feature $v_j$, where the weight formula is:

$\alpha_j = \mathrm{FC}(h_j \mid j = 1 \text{ to } N_i)$

In summary, the initial object features are obtained. The initial object features fuse all slice features from the same object, both preserving the important information of each slice feature and taking into account the relative importance between slice features. Thus, this feature can effectively improve the predictive performance of the model.
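A minimal sketch of the attention feature fusion step is given below. It assumes a single-output fully connected layer followed by a sigmoid, and it normalises by the sum of the weights so that the result is a weighted average; the normalisation, the function name and the parameter names are assumptions, since the text gives only the weight formula.

```python
import numpy as np

def attention_fusion(H, fc_weight, fc_bias=0.0):
    """Fuse feature vectors H of shape (N, d) into one patient-level vector.

    alpha_j = sigmoid(FC(h_j)); the fused feature is the alpha-weighted
    average of the h_j (assumed normalisation by the sum of the weights).
    """
    logits = H @ fc_weight + fc_bias           # (N,)
    alpha = 1.0 / (1.0 + np.exp(-logits))      # sigmoid, each weight in (0, 1)
    v = (alpha[:, None] * H).sum(axis=0) / alpha.sum()
    return v, alpha
```

In a trained model `fc_weight` and `fc_bias` would be learned; with all-zero weights every feature vector receives the same weight of 0.5 and the fusion reduces to a plain mean.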

S32, the initial object features and the slice features are interacted to obtain object features.

In particular, the interaction between the initial object features (patient features) and the slice features is critical for extracting useful diagnostic information when processing pathology images. In this application, to more effectively utilize the link between slice features and patient features, a feature interaction module based on a self-attention mechanism is designed.

In particular, the module treats the feature vector of the initial object and the feature vector of the slice as equally important information inputs, rather than treating them differently. The self-attention mechanism based feature interaction module allows for processing of feature vectors of the initial object and feature vectors of the slice simultaneously in the same processing stage and for automatically learning the inherent links between these features through the self-attention mechanism.

To achieve this goal, a Transformer network is employed as the main processing framework. The Transformer network has powerful feature extraction and relational modeling capabilities, and can process and integrate patient-level and slice-level features.

After the features pass through the Transformer network, fully connected layers are used to produce the patient-level and slice-level predictions. In addition, the attention matrix of the model is visualized, so that a clinician can intuitively see how the model distributes its attention when processing the two types of features.

The design of the characteristic interaction module based on the self-attention mechanism provides a universal network framework for the application and adapts to different slice network designs. The main advantage is that the correlation between slice and patient features is used more efficiently.

In a specific implementation manner, the interaction between the initial object feature and each slice feature to obtain the object feature specifically includes:

S321, inputting the initial object features and the slice features into a Transformer network.

S322, determining output weights corresponding to the slice features through the Transformer network, and weighting the slice features based on the output weights to obtain object features.

In particular, to better extract and analyze slice features in images, a Transformer network is used. The principal advantage of a Transformer network is that it is capable of handling complex dependencies between items in the input, suitable for various types of feature extraction and representation.

First, the initial object features and the individual slice features are provided as inputs to the Transformer network. The input portion of a Transformer network typically includes an embedding layer that converts these features into a continuous vector space so that the network can process them.

The core of the Transformer network is its self-attention mechanism. When processing one feature, the self-attention mechanism lets the model take all features in the input into account and dynamically assign them different weights, since each feature contributes differently to the result. Here, the weights can be understood as the importance of each slice feature in generating the object feature.

After the features pass through the self-attention mechanism, the feature vectors are further transformed and encoded by a feed-forward neural network. After passing through the Transformer network, the output weight of each slice feature is obtained.

Next, each slice feature is weighted according to these output weights. In other words, the proportion of each slice feature and of the initial object feature in the final object feature is determined by the weights obtained from the Transformer network, so that the information of all slice features is fully used while attention to the important features is ensured.

In summary, a feature is obtained that integrates the slice feature information and the initial object feature information. The integrated object features contain unique information of each individual slice, and integrate correlations between the slices and with the initial object features, so that comprehensive and accurate feature description is provided for subsequent pathological image classification.

In a specific implementation of the present application, a structure with 8 blocks and 8 attention heads is used, which is the same as a common Transformer encoder network structure. In this configuration, each block includes a self-attention layer and a feed-forward neural network layer, and each attention head represents a different focus of attention.

Notably, no separate classification token is provided in the implementation of the present application. Instead, a global context representation is generated by interacting and integrating all elements in the input sequence, relying on the Transformer network itself. Specifically, the Transformer network takes into account each image block feature in the pathological slice, as well as the corresponding updated typical features and updated reference features, to form a comprehensive context representation. This representation captures the complex interactions between slice features and initial object features, as well as their relative relationships in the global context.

Furthermore, in implementations of the present application, an Adam optimizer is optionally used to adjust the network parameters of both the slice feature network and the initial object feature network. The Adam optimizer uses adaptive learning rates and can provide an independent learning rate for each parameter, so that the model converges more quickly and stably during training.

Initially, the learning rate is set to 1e-4, a moderate value that keeps training stable without making it too slow. As training proceeds, a cosine annealing strategy gradually reduces the learning rate, which balances exploration and exploitation and lets the network adjust more finely as it approaches the optimum. Under this setting, the learning rate decays gradually within each period, where a period is defined as 50 epochs. At the end of a period the learning rate has fallen to approximately 1e-12, which is small enough to keep the model stable near the optimum.
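The decay within one period can be illustrated with the standard cosine annealing formula. The function name is hypothetical, and the sketch covers a single 50-epoch period only; how successive periods are chained over the whole training run (for example, whether the rate is reset at the start of each period) is not specified in the text and is left open here.

```python
import math

def cosine_annealed_lr(t, lr_max=1e-4, lr_min=1e-12, period=50):
    """Learning rate at position t (0 <= t <= period) within one period.

    Standard cosine annealing: starts at lr_max for t = 0 and reaches
    lr_min for t = period.
    """
    if not 0 <= t <= period:
        raise ValueError("t must lie within one period")
    return lr_min + 0.5 * (lr_max - lr_min) * (1.0 + math.cos(math.pi * t / period))
```

Halfway through the period the rate is about half of its initial value, and the decay is slowest at the two ends of the period.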

To prevent the overfitting phenomenon, a weight decay mechanism is introduced, and the coefficient of weight decay is set to 5e-4. When the weight is updated each time, a part of the weight of the model is restrained, so that the complexity of the network is controlled, and the risk of overfitting is reduced.

Under such a setting, the overall network is trained for 400 epochs. This set of optimization strategies supports both the training efficiency of the model and its generalization capability and stability.

Meanwhile, a five-fold cross-validation strategy is adopted in the implementation of the method to improve the generalization performance of the model. To keep the data distribution balanced, two principles are followed: first, all sample slices of the same patient are placed in the same fold; second, the number of slices in each fold is kept as consistent as possible. Based on this strategy, the entire data set is divided into five parts: three parts are used for model training, one part for model validation, and the last part for model testing.
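The two principles (patients are never split across folds, and folds hold similar numbers of slices) can be sketched with a greedy assignment. The greedy heuristic, the function names, and the rule used in `split_roles` to pick the validation fold are assumptions for illustration; the text states only the principles and the 3/1/1 division.

```python
from collections import defaultdict

def patient_grouped_folds(slice_patient_ids, n_folds=5):
    """Assign patients to folds so that slice counts per fold stay balanced.

    slice_patient_ids: one patient identifier per slice.
    Returns (folds, sizes): patient ids per fold and slice count per fold.
    """
    counts = defaultdict(int)
    for pid in slice_patient_ids:
        counts[pid] += 1
    folds = [[] for _ in range(n_folds)]
    sizes = [0] * n_folds
    # Greedy: place the largest patients first, always into the smallest fold.
    for pid, n in sorted(counts.items(), key=lambda item: (-item[1], str(item[0]))):
        target = sizes.index(min(sizes))
        folds[target].append(pid)
        sizes[target] += n
    return folds, sizes

def split_roles(folds, test_index):
    """Three folds for training, one for validation, one for testing.

    Assumption: the validation fold is the one following the test fold.
    """
    val_index = (test_index + 1) % len(folds)
    train = [pid for i, fold in enumerate(folds)
             if i not in (test_index, val_index) for pid in fold]
    return train, folds[val_index], folds[test_index]
```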

For training, all networks are trained from scratch without any pre-trained parameters. To let the model better capture the detailed features of pathological images, a two-stage training strategy is adopted: first the slice-level network is trained, and then the object-level network is trained on top of the slice-level network. Note that the parameters of the slice-level network are not frozen while the object-level network is trained, so the parameters of both networks can be adjusted dynamically throughout training, which helps improve model performance.

In summary, the present embodiment provides a method for detecting crohn's disease based on multi-example learning and pathology images, the method includes obtaining a plurality of pathology images, and obtaining slice features corresponding to each pathology image, where the plurality of pathology images correspond to a same target object, and each pathology image includes a preset number of image blocks; determining a slice prediction result corresponding to each pathological image based on each slice characteristic; object features of a target object are determined based on the slice features, and an object prediction result of the target object is determined based on the object features and the slice features. According to the method, after the slice characteristics corresponding to the pathological image are extracted, the slice prediction result and the target object characteristics of the pathological image are determined based on the slice characteristics, then the target object prediction result is determined based on the characteristics of the slice level and the characteristics of the patient level, and the slice and the patient level are jointly acted on the process of detecting the Crohn's disease, so that the detection accuracy of the Crohn's disease is improved, and the misdiagnosis rate is reduced.

Based on the above method for detecting Crohn's disease based on multi-example learning and pathological images, the present embodiment provides a device for detecting Crohn's disease based on multi-example learning and pathological images. As shown in Fig. 4, the device includes:

a feature extraction module 100, configured to obtain a plurality of pathological images and obtain the slice features corresponding to each pathological image, where the plurality of pathological images correspond to the same target object and each pathological image includes a preset number of image blocks;

a first prediction module 200, configured to determine a slice prediction result corresponding to each pathological image based on each slice feature; and

a second prediction module 300, configured to determine object features of the target object based on the slice features, and to determine an object prediction result of the target object based on the object features and the slice features.
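The data flow through the three modules can be sketched as follows. This is a rough illustration under strong assumptions: the function names, the use of raw block values as features, the mean pooling, and the sigmoid scoring are all hypothetical placeholders chosen for brevity, and they do not reproduce the feature extraction, feature interaction, or prediction networks of this application.

```python
from math import exp
from statistics import fmean
from typing import List

Vector = List[float]


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + exp(-x))


def mean_pool(vectors: List[Vector]) -> Vector:
    """Element-wise mean of equally sized vectors (placeholder aggregation)."""
    return [fmean(column) for column in zip(*vectors)]


def extract_slice_features(image_blocks: List[Vector]) -> List[Vector]:
    """Module 100 (placeholder): one feature vector per image block.
    Here the block values themselves are used as the features."""
    return [list(block) for block in image_blocks]


def predict_slice(slice_features: List[Vector]) -> float:
    """Module 200 (placeholder): one score per pathological image."""
    return sigmoid(fmean(mean_pool(slice_features)))


def predict_object(all_slice_features: List[List[Vector]]) -> float:
    """Module 300 (placeholder): derive an object feature from the slice
    features, then score the object from the object feature AND the slice
    features together."""
    slice_vectors = [mean_pool(features) for features in all_slice_features]
    object_feature = mean_pool(slice_vectors)
    combined = mean_pool([object_feature] + slice_vectors)
    return sigmoid(fmean(combined))


# Hypothetical input: two pathological images of the same target object,
# each with the same preset number (2) of image blocks.
images = [
    [[0.2, 0.4], [0.6, 0.8]],
    [[-0.5, 0.1], [0.3, -0.1]],
]
features = [extract_slice_features(blocks) for blocks in images]
slice_scores = [predict_slice(f) for f in features]
object_score = predict_object(features)
print(slice_scores, object_score)
```

The point of the sketch is only the order of operations: slice features are extracted once, reused for the per-image prediction, and reused again together with the derived object feature for the object-level prediction.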

Based on the above method for detecting Crohn's disease based on multi-example learning and pathological images, the present embodiment provides a computer-readable storage medium storing one or more programs, the one or more programs being executable by one or more processors to implement the steps of the method for detecting Crohn's disease based on multi-example learning and pathological images described in the above embodiments.

Based on the above method for detecting Crohn's disease based on multi-example learning and pathological images, the present application further provides a terminal device. As shown in Fig. 5, the terminal device includes at least one processor 20, a display screen 21, and a memory 22, and may also include a communication interface 23 and a bus 24. The processor 20, the display screen 21, the memory 22, and the communication interface 23 may communicate with each other via the bus 24. The display screen 21 is configured to display a user guidance interface preset in the initial setting mode. The communication interface 23 may transmit information. The processor 20 may invoke logic instructions in the memory 22 to perform the methods of the embodiments described above.

Further, the logic instructions in the memory 22 described above may be implemented in the form of software functional units and, when sold or used as a stand-alone product, may be stored in a computer-readable storage medium.

The memory 22, as a computer-readable storage medium, may be configured to store software programs and computer-executable programs, such as the program instructions or modules corresponding to the methods in the embodiments of the present disclosure. The processor 20 performs functional applications and data processing, that is, implements the methods of the embodiments described above, by running the software programs, instructions, or modules stored in the memory 22.

The memory 22 may include a program storage area and a data storage area. The program storage area may store an operating system and at least one application program required for a function; the data storage area may store data created through use of the terminal device, and the like. In addition, the memory 22 may include high-speed random access memory and may also include non-volatile memory, for example any of a variety of media capable of storing program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disk; a transitory storage medium may also be used.

In addition, the specific processes by which the instructions in the storage medium and in the terminal device are loaded and executed by the processor have been described in detail in the method above and are not repeated here.

Finally, it should be noted that the above embodiments are only intended to illustrate the technical solution of the present application and not to limit it. Although the present application has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that the technical solutions described in the foregoing embodiments can still be modified, or some of their technical features can be replaced by equivalents, and that such modifications and substitutions do not depart from the spirit and scope of the corresponding technical solutions.

Claims

1. A method for detecting Crohn's disease based on multi-example learning and pathological images, the method comprising:

2. The method of claim 1, wherein the slice features comprise image block features of each image block, and the determining of the slice prediction result corresponding to each pathological image based on each slice feature specifically comprises:

3. The method for detecting Crohn's disease based on multi-example learning and pathological images according to claim 2, wherein the selecting of representative features among the slice features specifically comprises:

4. The method for detecting Crohn's disease based on multi-example learning and pathological images according to claim 2, wherein the determining of a slice prediction result of the pathological image corresponding to the slice feature based on the slice feature, a plurality of updated representative features, and a plurality of updated reference features specifically comprises:

5. The method for detecting Crohn's disease based on multi-example learning and pathological images according to claim 2 or 4, wherein, after the determining of a slice prediction result of the pathological image corresponding to the slice feature based on the slice feature, a plurality of updated representative features, and a plurality of updated reference features, the method further comprises:

6. The method for detecting Crohn's disease based on multi-example learning and pathological images according to claim 1, wherein the determining of object features of the target object based on each slice feature specifically comprises:

7. The method for detecting Crohn's disease based on multi-example learning and pathological images according to claim 6, wherein the interacting of the initial object feature with each slice feature to obtain an object feature specifically comprises:

8. A Crohn's disease detection device based on multi-example learning and pathological images, the device comprising:

a feature extraction module, configured to obtain a plurality of pathological images and the slice features corresponding to each pathological image, wherein the plurality of pathological images correspond to the same target object and each pathological image comprises a preset number of image blocks;

a first prediction module, configured to determine a slice prediction result corresponding to each pathological image based on each slice feature; and

a second prediction module, configured to determine object features of the target object based on the slice features and to determine an object prediction result of the target object based on the object features and the slice features.

9. A computer-readable storage medium storing one or more programs executable by one or more processors to implement the steps of the method for detecting Crohn's disease based on multi-example learning and pathological images according to any one of claims 1 to 7.

10. A terminal device, comprising a processor and a memory, wherein:

the memory has stored thereon a computer-readable program executable by the processor; and

the processor, when executing the computer-readable program, implements the steps of the method for detecting Crohn's disease based on multi-example learning and pathological images according to any one of claims 1 to 7.