CN114565919B

CN114565919B - Tumor microenvironment spatial relationship modeling system and method based on digital pathological image

Info

Publication number: CN114565919B
Application number: CN202210060093.1A
Authority: CN
Inventors: 秦文健; 刁颂辉; 何佳慧; 侯嘉馨
Original assignee: Shenzhen Institute of Advanced Technology of CAS
Current assignee: Shenzhen Institute of Advanced Technology of CAS
Priority date: 2022-01-19
Filing date: 2022-01-19
Publication date: 2024-06-07
Anticipated expiration: 2042-01-19
Also published as: CN114565919A

Abstract

The invention discloses a tumor microenvironment spatial relationship modeling system and method based on digital pathological images. The system comprises: an image staining standardization module for determining the pixel distribution type of a pathological image, so as to color-standardize variations in staining distribution and obtain a staining-standardized image; a structural region segmentation module for detecting a region of interest in the staining-standardized image using a weakly supervised deep learning model, and further segmenting the region of interest to obtain a target structural region; a cell detection module for extracting multiple types of cell information from the target structural region; and a spatial relationship construction module for modeling a multilayer network with a multilayer graph to characterize the co-spatial distribution among multiple cell types, and performing cluster analysis on the multilayer graph to obtain a quantitative spatial distribution model. The invention can accurately reveal the association between intratumoral heterogeneity and the spatial distribution of cells and tissues in the tumor microenvironment, and provides a new approach to quantitative analysis of tumor evolution mechanisms.

Description

Tumor microenvironment spatial relationship modeling system and method based on digital pathological image

Technical Field

The invention relates to the technical field of medical image processing, in particular to a tumor microenvironment spatial relationship modeling system and method based on a digital pathological image.

Background

Tumor tissue is a complex structure consisting of cancer cells and the tumor microenvironment formed by surrounding non-cancer cells (such as stromal cells and lymphocytes), and its spatial heterogeneity is very complex. Although weakly supervised learning algorithms can identify and localize cancer cells, lymphocytes, stromal cells and other cell types (such as macrophages, T cells or non-discriminating cells) in digital pathology images, existing methods rely only on simple distance measurement, cell density statistics or clustering and therefore cannot fully express this heterogeneity. In addition, because the tumor microenvironment is rich in cell types, the cell spatial organization relationships currently constructed with graph neural networks are not suited to fully automatic, comprehensive quantitative analysis of the spatial organization of multiple cell types at the same time. A new multi-layer network topology clustering method for multiple cell types is therefore needed.

The tumor microenvironment governs the formation, development, metastasis and drug resistance of solid tumors. It results from the anti-tumor immune response produced by interactions between tumor cells and non-tumor cells such as stromal cells and immune cells, and strong clinical and experimental evidence supports its importance in cancer development and in mediating drug resistance. However, the effects of the complex anatomy and local microenvironment on metabolism and immune response have yet to be explored in depth. The interaction between a tumor and its microenvironment is difficult for a pathologist to capture through conventional visual inspection with qualitative or semi-quantitative parameters. Using computational analysis of digital pathology images to decipher the characteristics of the tumor microenvironment, particularly intratumoral spatial heterogeneity, therefore offers a new way to approach tumor microenvironment analysis. More importantly, it may uncover potential biomarkers related to cancer treatment, so that the most suitable precision treatment plan can be designed for each patient.

With the development of digital whole-slide pathology imaging and deep-learning-based pathology image processing algorithms, computational pathology can assist pathologists in examining a patient's histological data in a high-throughput, quantitative and objective manner. The cells found by automatic detection algorithms can be used to construct an intratumoral cell spatial relationship graph and, combined with spatial analysis methods, to assess tumor treatment response and prognosis accurately. Spatial analysis of the tumor microenvironment based on digital pathology usually applies a clustering algorithm to the cell features extracted from digital pathology images for spatial localization and morphological measurement, so as to describe the relationship between the spatial distribution pattern of immune cells and disease. For example, a convolutional neural network is first used to identify tumor-infiltrating lymphocytes (TILs) and segment tumor necrosis regions; an affinity propagation clustering algorithm is then used to model the spatial patterns of the infiltrating lymphocytes, and these patterns are described by extracting the corresponding cluster features, revealing the relationship between TIL patterns and immune subtype, tumor type, immune cell fractions and patient survival. Other studies quantify the spatial relationship between cancer and the components of the microenvironment by measuring cell distribution density with Euclidean distance, and explore its clinical significance. These studies indicate that image analysis can go beyond simple cell counts and analyze the tumor microenvironment on the basis of spatial distance.

To make better use of high-level spatial distribution information, Kun Huang et al. adopted topological spatial modeling of deep learning features based on Delaunay triangulation graphs: a stacked autoencoder first learns high-level semantic features of cells, K-means clustering then captures the spatial patterns of cell nuclei, and edge histogram statistics finally confirm that the spatial topological features of the kidney tumor microenvironment are significantly associated with survival time. The topological features were also shown to outperform clinical features and cell morphological features in survival prediction. Guanghua Xiao et al. constructed a regional spatial tissue map based on statistical cell density, used a deep convolutional network to identify cell types fully automatically, and finally computed 2 spatial distribution features to predict the survival of lung cancer patients.

In summary, the tumor microenvironment is very complex and spatially heterogeneous, and existing methods based on simple distance measurement, cell statistics or clustering cannot fully express it. Although recent research has tried to construct spatial organization relationships with graphs, spatial analysis of those graphs depends on manually extracted features such as the number of connections to adjacent nodes and edge histograms, so only a few simple cell correlations can be analyzed. Because the tumor microenvironment is rich in cell types, current spatial analysis methods struggle to perform fully automatic, comprehensive quantitative analysis of the spatial tissue distribution of multiple components at the same time (including structures such as blood vessels and different cell types such as lymphocytes and stromal cells).

Disclosure of Invention

The invention aims to overcome the above defects of the prior art by providing a tumor microenvironment spatial relationship modeling system and method based on digital pathological images, a novel technical scheme for multi-layer network topology clustering of multiple cell types.

According to a first aspect of the invention, a tumor microenvironment spatial relationship modeling system based on a digital pathology image is provided. The system comprises:

Image staining standardization module: configured to determine the pixel distribution type of a pathological image and, according to the overall distribution of the pixels of the pathological image, perform color standardization on variations in staining distribution to obtain a staining-standardized image;

Structural region segmentation module: configured to detect a region of interest in the staining-standardized image using a weakly supervised deep learning model, and then segment the region of interest to obtain a target structural region;

Cell detection module: configured to extract multiple types of cell information from the obtained target structural region;

Spatial relationship construction module: configured to model a multilayer network with a multilayer graph to characterize the co-spatial distribution among multiple cell types, and to perform cluster analysis on the multilayer graph to obtain a quantitative spatial distribution model, wherein the quantitative spatial distribution model quantitatively characterizes the interaction between tumor cells and the tumor microenvironment, the multilayer graph comprises intra-layer relationships and inter-layer interactions, nodes in the same layer represent cells of the same type, and connections between different layers represent spatial connection relationships between different types of cells or structures.

According to a second aspect of the invention, a tumor microenvironment spatial relationship modeling method based on a digital pathology image is provided. The method comprises the following steps:

Step S1: determining the pixel distribution type of the pathological image and, according to the overall distribution of the pixels of the pathological image, performing color standardization on variations in staining distribution to obtain a staining-standardized image;

Step S2: detecting a region of interest in the staining-standardized image using a weakly supervised deep learning model, and then segmenting the region of interest to obtain a target structural region;

Step S3: extracting multiple types of cell information from the obtained target structural region;

Step S4: modeling a multilayer network with a multilayer graph to characterize the co-spatial distribution among multiple cell types, and performing cluster analysis on the multilayer graph to obtain a quantitative spatial distribution model, wherein the quantitative spatial distribution model quantitatively characterizes the interaction between tumor cells and the tumor microenvironment, the multilayer graph comprises intra-layer relationships and inter-layer interactions, nodes in the same layer represent cells of the same type, and connections between different layers represent spatial connection relationships between different types of cells or structures.

Compared with the prior art, the invention has the following advantage. Because tumor cell types are rich and spatially heterogeneous, many components of the tumor microenvironment are strongly spatially correlated with cancer cells. The invention provides a topological space mathematical model of tumor cells and the tumor microenvironment, which can reveal the association between intratumoral heterogeneity and the spatial distribution of cells and tissues in the tumor microenvironment, and provides a new approach to quantitative analysis of tumor evolution mechanisms.

Other features of the present invention and its advantages will become apparent from the following detailed description of exemplary embodiments of the invention, which proceeds with reference to the accompanying drawings.

Drawings

The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments of the invention and together with the description, serve to explain the principles of the invention.

FIG. 1 is an architecture diagram of a digital pathology image-based tumor microenvironment spatial relationship modeling system according to one embodiment of the invention;

FIG. 2 is a flow chart of a method for modeling tumor microenvironment spatial relationship based on a digital pathology image according to one embodiment of the invention.

Detailed Description

Various exemplary embodiments of the present invention will now be described in detail with reference to the accompanying drawings. It should be noted that: the relative arrangement of the components and steps, numerical expressions and numerical values set forth in these embodiments do not limit the scope of the present invention unless it is specifically stated otherwise.

The following description of at least one exemplary embodiment is merely exemplary in nature and is in no way intended to limit the invention, its application, or uses.

Techniques, methods, and apparatus known to one of ordinary skill in the relevant art may not be discussed in detail, but are intended to be part of the specification where appropriate.

In all examples shown and discussed herein, any specific values should be construed as merely illustrative, and not a limitation. Thus, other examples of exemplary embodiments may have different values.

It should be noted that: like reference numerals and letters denote like items in the following figures, and thus once an item is defined in one figure, no further discussion thereof is necessary in subsequent figures.

Referring to FIG. 1, the provided tumor microenvironment spatial relationship modeling system based on digital pathological images comprises an image staining standardization module, a structural region segmentation module, a cell detection module and a spatial relationship construction module. The staining standardization module addresses the inconsistent color distribution of different slides; the structural region segmentation module combines the multi-scale imaging characteristics of pathological images and segments pathological regions and structures at high resolution through an adjustable weakly supervised learning method; the cell detection module detects and identifies each type of cell among small, clustered targets; the spatial relationship construction module identifies immune cell types through an image registration algorithm, constructs a topological spatial relationship model between the multiple types of structures and cells and the tumor cells through a multi-layer graph network, and thereby achieves quantitative analysis of the tumor microenvironment.

The function of each module and the specific embodiments will be described hereinafter.

(1) Image staining standardization module

The distribution of stained pixels in a pathological image generally follows a Gaussian distribution or a skew-normal distribution; which one applies can be determined with a self-supervised algorithm based on parameter estimation of the distribution model. The probability density function (PDF) of the multivariate skew-normal distribution is:

$$f(x;\theta) = 2\,\phi_d(x;\mu,\Sigma)\,\Phi\!\left(\lambda^{T}\Sigma^{-1/2}(x-\mu)\right) \qquad (1)$$

where $\theta = (\mu^{T}, a^{T}, \lambda^{T})^{T}$ is the unknown parameter vector, $a$ collects the elements of the upper triangular part of the matrix $\Sigma$, $\Sigma^{1/2}$ is the square root of the symmetric matrix $\Sigma$ (and $\Sigma^{-1/2}$ its inverse), $\phi_d(\cdot;\mu,\Sigma)$ is the PDF of the $d$-variate Gaussian distribution with mean vector $\mu$ and covariance matrix $\Sigma$, and $\Phi(\cdot)$ is the cumulative distribution function of the standard univariate Gaussian distribution. The probability density function of the multivariate Gaussian mixture distribution is:

$$p(x) = \sum_{k=1}^{K} \pi_k\,\mathcal{N}(x \mid \mu_k, \Sigma_k) \qquad (2)$$

where the parameters $\pi_k$ are called mixing coefficients, with $\sum_{k=1}^{K}\pi_k = 1$ and $0 \le \pi_k \le 1$. $\pi_k = p(k)$ is the prior probability of selecting the $k$-th distribution, and the density $\mathcal{N}(x \mid \mu_k, \Sigma_k) = p(x \mid k)$ is the probability of $x$ given the $k$-th distribution.
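As a minimal sketch of the two density families described above, restricted to one dimension for readability, the following Python evaluates a univariate skew-normal density, 2·φ(x; μ, σ)·Φ(λ(x − μ)/σ), and a Gaussian mixture density. The function names, the one-dimensional restriction and the midpoint-rule check are assumptions made for illustration; the source describes the multivariate case and gives no implementation.

```python
import math


def normal_pdf(x, mu, sigma):
    """Density of a univariate Gaussian with mean mu and standard deviation sigma."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def standard_normal_cdf(z):
    """Cumulative distribution function of the standard univariate Gaussian."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def skew_normal_pdf(x, mu, sigma, lam):
    """Univariate skew-normal density; lam = 0 recovers the plain Gaussian."""
    z = (x - mu) / sigma
    return 2.0 * normal_pdf(x, mu, sigma) * standard_normal_cdf(lam * z)


def gaussian_mixture_pdf(x, weights, mus, sigmas):
    """Mixture density: sum over k of weights[k] * N(x | mus[k], sigmas[k])."""
    if any(w < 0.0 or w > 1.0 for w in weights) or abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("mixing coefficients must lie in [0, 1] and sum to 1")
    return sum(w * normal_pdf(x, m, s) for w, m, s in zip(weights, mus, sigmas))


def integrate(f, lo, hi, steps=20000):
    """Midpoint-rule integral, used only to sanity-check normalisation."""
    h = (hi - lo) / steps
    return h * sum(f(lo + (i + 0.5) * h) for i in range(steps))
```

Setting `lam` to zero makes the skew factor equal to 1/2, so the skew-normal density collapses to the Gaussian; this is one way to see why a single parametric family can cover both pixel distribution types.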

In one embodiment, the actual distribution model type of the pathological image pixels can be confirmed with the Jarque-Bera test, a test based on the skewness and kurtosis of the pixel data. When the parameters of the distribution model are then solved for, a deep convolutional method can be used to estimate and update them, yielding the overall distribution of the pixels of the pathological image. Finally, color standardization of the image to be analyzed is achieved through the staining distribution variation model.
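The Jarque-Bera check mentioned above can be sketched as follows. The statistic is JB = n/6 · (S² + (K − 3)²/4), where S is the sample skewness and K the sample kurtosis; under normality it is asymptotically chi-square with 2 degrees of freedom. The function names, the 0.05 significance level and the synthetic "pixel" samples are assumptions for illustration only, not values taken from the source.

```python
import math
from statistics import NormalDist


def jarque_bera(values):
    """Return (jb, p_value, skewness, kurtosis) for a sequence of numbers."""
    n = len(values)
    if n < 3:
        raise ValueError("need at least 3 values")
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    if m2 == 0.0:
        raise ValueError("constant data has undefined skewness and kurtosis")
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    skewness = m3 / m2 ** 1.5
    kurtosis = m4 / m2 ** 2
    jb = n / 6.0 * (skewness ** 2 + (kurtosis - 3.0) ** 2 / 4.0)
    # The chi-square survival function with 2 degrees of freedom is exp(-x / 2).
    p_value = math.exp(-jb / 2.0)
    return jb, p_value, skewness, kurtosis


def choose_distribution(values, alpha=0.05):
    """Hypothetical decision rule: keep the Gaussian unless normality is rejected."""
    _, p_value, _, _ = jarque_bera(values)
    return "gaussian" if p_value >= alpha else "skewed"


# Deterministic toy samples standing in for pixel intensities (assumed data).
n = 500
probs = [(i + 0.5) / n for i in range(n)]
symmetric_pixels = [NormalDist(128.0, 20.0).inv_cdf(p) for p in probs]
skewed_pixels = [-30.0 * math.log(1.0 - p) for p in probs]  # exponential quantiles
```

On real slides the sample would be the intensities of one stain channel; for very large images a random subsample is usually enough, since the statistic grows with n and rejects normality for tiny deviations when n is huge.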

The colors of pathological images vary markedly because of differences in the preservation solutions, staining reagents and slide preparation processes of different manufacturers and in the digital scanners used. Staining standardization resolves the inconsistent color distribution caused by staining operations, staining conditions or imaging equipment, thereby improving the results of subsequent analysis and recognition.

(2) Structural region segmentation module

The structural region segmentation module takes the staining-standardized images and sequentially performs regularized encoding, regularized decoding, weakly supervised learning, region-of-interest detection and so on, to obtain a segmentation map that retains the structural regions.

Specifically, a key issue in building a weakly supervised segmentation model is extracting enough key feature encodings from limited information to assist segmentation effectively. Features of the key regions in the data are constructed from image-level labels, and irrelevant redundant information and noise are removed. The original data is abstracted into two classes of data matrices, a low-rank matrix carrying the target structure information and a sparse matrix carrying redundancy and noise; the two matrices are solved separately to obtain the feature information of the target structure. This approach is applied to a multi-scale pathological image segmentation model, and a regularizer is designed for the loss used in model training. For example, let the image be $I$ and its label $Y$, and let $f_\theta(I)$ be the output of a segmentation network parameterized by $\theta$; the optimization problem corresponding to training the convolutional neural network with a joint regularized loss can then be expressed as:

where the loss term measures the difference between the true and predicted values, $R(S)$ is the regularization loss, $S = f_\theta(I) \in [0,1]^{|\Omega| \times K}$ is the $K$-channel softmax segmentation output generated by the network, and $\lambda$ and $\mu$ are the weights set for the corresponding terms.

To bring the feature information of images at different magnifications into the learning process, the model input can fuse image information from several magnifications, paying different attention to cells at high magnification and to tissue at medium and low magnification, so that both the specificity and the generality of the data samples are considered and the key features of the data are learned. This also mimics the diagnostic workflow of a clinician: image features at different magnifications are given different attention weights so that the data characteristics at every magnification are fully taken into account. For example, the optimization function of the corresponding multi-magnification regularized loss is:

where $I_d$ denotes the image input at magnification $d$, and $f_{\theta,\eta}$ denotes feature computation with parameters $\theta$ under attention weight $\eta$; $\eta$ is computed primarily by $\mathrm{Softmax}(f_\theta(I))$. The parameters are learned and optimized by the deep convolutional network, whose capacity for efficient feature extraction lets the model learn the data prior better and faster.
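The attention weighting across magnifications can be illustrated with a small sketch: a softmax over one score per magnification gives the weights η, which are then used to fuse the per-magnification feature vectors. The scalar scores, the dictionary layout and the function names are assumptions for illustration; in the described system the scores would come from the trained network.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_magnifications(features_by_mag, scores_by_mag):
    """Weight each magnification's feature vector by its softmax attention.

    features_by_mag: {magnification: [feature values]} (equal lengths)
    scores_by_mag:   {magnification: scalar score}, a hypothetical stand-in
                     for the network output that produces the weights eta
    """
    mags = sorted(features_by_mag)
    eta = softmax([scores_by_mag[m] for m in mags])
    dim = len(features_by_mag[mags[0]])
    fused = [
        sum(w * features_by_mag[m][j] for w, m in zip(eta, mags))
        for j in range(dim)
    ]
    return dict(zip(mags, eta)), fused


# Toy usage: two magnifications with two-dimensional features.
weights, fused = fuse_magnifications(
    {10: [2.0, 4.0], 40: [4.0, 8.0]},
    {10: 0.0, 40: math.log(3.0)},
)
```

With these toy scores the 40x features receive three times the weight of the 10x features, so the fused vector lies three quarters of the way toward the high-magnification features.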

After the parameterized model has been learned, a weighted class activation map is obtained for the identified target category by fusing features from multiple scales, and the target tissue and structural regions are obtained after post-processing.

Weakly supervised learning replaces pixel-by-pixel ground-truth annotation with labels that are easier to obtain, reducing annotation cost and improving image segmentation efficiency. The structural region segmentation network may use AlexNet, VGG, GoogLeNet, ResNet or other types; the invention is not limited in this regard.

(3) Cell detection module

Because pathological images are very large, different types of cell information must be extracted from the target structural region once the cancer structural region of interest and the key features that discriminate cancerous from non-cancerous tissue have been obtained. (The key discrimination matrix is obtained by computing probability distances between the matrices corresponding to the high-dimensional features of different image types, with the tumor region as the reference: cancerous tissue lies close to it and everything else lies far away.) However, because the cells are numerous, widely distributed and each occupy only a small fraction of the image, detecting them accurately is a major challenge for any algorithm.

In one embodiment, based on the differences and correlations between cells and structures, encoding and decoding of each cell and structure in the image is implemented with a self-attention transformer network, and fusion analysis is performed in combination with the key discrimination matrix, as follows:

First, the feature maps $X$ extracted by the convolutional network are converted into visual tokens $T$:

$$T = \mathrm{SOFTMAX}_{HW}(X W_A)^{T} X \qquad (5)$$

where $W_A \in \mathbb{R}^{C \times L}$ is a learnable weight and $X \in \mathbb{R}^{HW \times C}$; $H$, $W$ and $C$ denote the size of the feature along each of its dimensions, $L$ denotes the number of visual tokens $T$, and $L \ll HW$.

After the visual tokens $T$ are obtained, the dependencies among them are modeled with self-attention, the result is projected back to the dimensions of an ordinary feature map, and the previously obtained key discrimination matrix $G$ is added:

$$X_{out} = X_{in} + \mathrm{SOFTMAX}_{L}\left((X_{in} W_Q)(T W_K)^{T}\right) T + G \qquad (6)$$

where $X_{in}$ denotes the image features obtained during tumor region detection on the multi-scale pathological image, $X_{out}$ denotes the final output of the cell detection module, and $W_Q$ and $W_K$ are learnable weight parameters. Once the feature relationships of the image have been constructed in this way, training on a large amount of data enables identification and localization of the different categories of cells and structures.
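Equations (5) and (6) can be traced with a small pure-Python sketch that uses nested lists as matrices. The shapes assumed here are X and X_in of size HW×C, W_A of size C×L, and G of size HW×C; the helper names and the toy values are assumptions for illustration, and the weights, which would be learned in practice, are fixed by hand.

```python
import math


def matmul(a, b):
    """Product of two matrices stored as lists of rows."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def transpose(a):
    return [list(row) for row in zip(*a)]


def softmax_rows(a):
    """Softmax applied independently to every row."""
    out = []
    for row in a:
        peak = max(row)
        exps = [math.exp(v - peak) for v in row]
        total = sum(exps)
        out.append([e / total for e in exps])
    return out


def tokenize(x, w_a):
    """Equation (5): softmax over the HW positions, then T = A^T X (L x C)."""
    attention = softmax_rows(transpose(matmul(x, w_a)))  # L x HW
    return matmul(attention, x)


def project(x_in, tokens, w_q, w_k, g):
    """Equation (6): X_out = X_in + softmax_L((X_in W_Q)(T W_K)^T) T + G."""
    scores = matmul(matmul(x_in, w_q), transpose(matmul(tokens, w_k)))  # HW x L
    update = matmul(softmax_rows(scores), tokens)  # HW x C
    return [
        [x_in[i][j] + update[i][j] + g[i][j] for j in range(len(x_in[0]))]
        for i in range(len(x_in))
    ]


# Toy usage: HW = 4 positions, C = 2 channels, L = 2 tokens.
x = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0]]
w_a = [[0.0, 0.0], [0.0, 0.0]]        # zero weights give uniform attention
identity = [[1.0, 0.0], [0.0, 1.0]]
g = [[0.0, 0.0] for _ in x]            # placeholder discrimination matrix
tokens = tokenize(x, w_a)
x_out = project(x, tokens, identity, identity, g)
```

With zero `w_a` every token is the mean of the rows of `x`, which makes the example easy to check by hand; a trained `w_a` would instead concentrate each token on a group of related positions.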

(4) Spatial relationship construction module

The spatial relationship construction module sequentially performs image tiling, image feature encoding, low-magnification rigid registration, high-magnification non-rigid registration, multi-layer graph network construction, graph-embedding dimensionality reduction, acquisition of point cloud distribution data, persistent homology modeling, feature cluster analysis and so on, finally producing a quantitative spatial distribution model. The following focuses on multi-layer graph network construction and feature cluster analysis.

Specifically, to analyze the co-spatial distribution among multiple cell types and quantitatively characterize the interactions between tumor cells and the tumor microenvironment, one embodiment models a multi-layer network with a multi-layer graph, that is, a collection of weighted single-layer graph adjacency matrices containing intra-layer relationships and inter-layer interactions. The implementation comprises constructing the multi-layer network from the multi-layer graph and clustering on that network, ultimately producing a spatial distribution expression model between tumor cells and the multiple components of the tumor microenvironment.

1) Modeling of spatial higher order relationships of a multi-layer network

A single-layer graph network is defined as $G = (V, E, \omega)$, where $V$ is the set of nodes and $E \subseteq V \times V$ is the set of edges. The total number of nodes in graph $G$ is $n = |V|$. $\omega$ assigns a weight to each edge, the weight of edge $e_{uv} \in E$ being denoted $\omega_{uv}$. The adjacency matrix $A$ is symmetric, i.e. $A_{ij} = A_{ji}$, and indicates whether each pair of nodes is connected, that is, it carries the information of the different cell nodes.

Building on the single-layer definition, a multi-layer network can be constructed that consists of $m$ non-overlapping layers, each modeled by a weighted graph $G_i$ with adjacency matrix $A_i$, $i = 1, \ldots, m$. The elements of the set $A = \{A_1, A_2, \ldots, A_m\}$ are called intra-layer matrices and represent the connections within a single layer. To model the relationship between two graphs $G_k$ and $G_l$ with adjacency matrices $A_k$ and $A_l$ ($k, l = 1, 2, \ldots, m$; $k \neq l$), a cross-layer adjacency matrix $A_{l,k}$ is used, which represents one-to-one symmetric connections between the nodes of the two related graphs. This yields a set $C_p = \{A_{l,k}\}$, $k \neq l$, of cross-layer adjacency matrices representing the edges between nodes in different layers, where $p$ is the number of connected graphs.

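In summary, the multi-layer network has a set of inter-layer edges connecting nodes across layers: for each such edge $e_{uv}$, $u \in V(G_k)$ and $v \in V(G_l)$ with $k \neq l$. The multi-layer network so defined has a block matrix structure:

$$\begin{pmatrix} A_1 & A_{12} & \cdots & A_{1m} \\ A_{21} & A_2 & \cdots & A_{2m} \\ \vdots & \vdots & \ddots & \vdots \\ A_{m1} & A_{m2} & \cdots & A_m \end{pmatrix}$$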

The diagonal blocks, the elements of the set $A$, are intra-layer matrices, and the off-diagonal blocks $A_{kl}$ ($k, l = 1, 2, \ldots, m$; $k \neq l$) are inter-layer connections linking nodes in layer $G_k$ to nodes in layer $G_l$. In one embodiment, nodes in the same layer represent cells of the same type, and connections between layers represent spatial connections between different types of cells or structures. Taking a vascular structure and tumor cells as an example, when the cell-structure inter-layer relation is established, the values of an off-diagonal block $A_{kl}$ can be obtained from spatial distance: tumor or immune cells close to a blood vessel are strongly connected to the structure layer, and those far from it are weakly connected. The diagonal blocks, the intra-layer matrices, are likewise obtained from the Euclidean distance between cells.
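A minimal sketch of such a block adjacency matrix, built from cell and structure coordinates, is given below. The source only states that weights derive from spatial distance and that nearby cells are more strongly connected; the exponential decay `exp(-d / scale)`, the cutoff `radius`, the function names and the toy coordinates are all assumptions for illustration.

```python
import math


def build_supra_adjacency(layers, radius=50.0, scale=25.0):
    """Build one symmetric matrix whose blocks follow the layer order.

    layers: list of (layer_name, [(x, y), ...]); nodes of one layer are
    contiguous, so diagonal blocks are intra-layer and off-diagonal blocks
    are inter-layer connections.
    """
    nodes = [(name, point) for name, points in layers for point in points]
    n = len(nodes)
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            distance = math.dist(nodes[i][1], nodes[j][1])
            if distance <= radius:
                # Assumed weighting: closer pairs get weights nearer to 1.
                weight = math.exp(-distance / scale)
                matrix[i][j] = matrix[j][i] = weight
    return nodes, matrix


def extract_block(nodes, matrix, row_layer, col_layer):
    """Return the sub-matrix linking row_layer nodes to col_layer nodes."""
    rows = [i for i, (name, _) in enumerate(nodes) if name == row_layer]
    cols = [j for j, (name, _) in enumerate(nodes) if name == col_layer]
    return [[matrix[i][j] for j in cols] for i in rows]


# Toy usage: two tumor cells and two vessel centroids (made-up coordinates).
layers = [
    ("tumor", [(0.0, 0.0), (10.0, 0.0)]),
    ("vessel", [(0.0, 5.0), (200.0, 200.0)]),
]
nodes, supra = build_supra_adjacency(layers)
tumor_vessel = extract_block(nodes, supra, "tumor", "vessel")
tumor_tumor = extract_block(nodes, supra, "tumor", "tumor")
```

The pairwise loop is quadratic in the number of nodes; for whole-slide cell counts a spatial index would be needed to find neighbours within the radius, which this sketch omits.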

After the multi-layer graph network is built, extracting meaningful information from such a complex network requires a large amount of computation and memory. To address both problems, the network is mapped into a low-dimensional space by node embedding while its structural information is preserved, for example by using a graph embedding method for dimensionality reduction.

2) Multi-layer network topology analysis method based on persistence graph clustering

To draw conclusions about tumor evolution from the node embeddings of the multiple tumor microenvironment cell types and cancer cells, the node embeddings must be clustered. Forming clusters based on shape dynamics helps to find persistent node clusters with similar patterns; in one embodiment, the concept of topological data analysis (TDA) is introduced into the topology analysis of complex multi-layer networks.

Given a weighted graph $G$, if a threshold $\epsilon_j > 0$ is selected and only the edges whose weights satisfy $\omega_{uv} \le \epsilon_j$ are retained, a graph $G_j$ with adjacency matrix $A_{uv} = \mathbb{1}_{\omega_{uv} \le \epsilon_j}$ is obtained. If the threshold is varied over $\epsilon_1 < \epsilon_2 < \ldots < \epsilon_n$, the resulting hierarchical nested sequence of graphs $G_1 \subseteq G_2 \subseteq \ldots \subseteq G_n$ is called a network filtration. Taking the widely used Vietoris-Rips (VR) simplicial complex as an example, the VR complex at threshold $\epsilon_j$ is defined as $VR_j = \{\sigma \subseteq V \mid \omega_{uv} \le \epsilon_j \text{ for all } u, v \in \sigma\}$. Network filtration evaluates the changes in network topology over a wide range of thresholds $\epsilon_j$; the goal is to detect the features that persist across different thresholds $\epsilon$, which characterize the internal spatial organization. The persistence graph clustering algorithm can thus obtain more accurate clustering results.
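The thresholding idea above can be sketched for the simplest topological feature, connected components. The code keeps the edges whose weight is at most a threshold, counts the components of the resulting graph, and also records the thresholds at which components merge. It assumes edge weights behave like distances (smaller means closer) and tracks only components, not the higher-dimensional simplices of a full Vietoris-Rips complex; the function names and the toy graph are illustrative.

```python
def find(parent, x):
    """Union-find root lookup with path halving."""
    while parent[x] != x:
        parent[x] = parent[parent[x]]
        x = parent[x]
    return x


def components_at(n_nodes, weighted_edges, eps):
    """Number of connected components after keeping edges with weight <= eps."""
    parent = list(range(n_nodes))
    for u, v, w in weighted_edges:
        if w <= eps:
            parent[find(parent, u)] = find(parent, v)
    return len({find(parent, i) for i in range(n_nodes)})


def merge_thresholds(n_nodes, weighted_edges):
    """Thresholds at which two components merge as eps grows.

    Edges are scanned in increasing weight order; an edge that joins two
    different components marks the threshold where one of them disappears.
    """
    parent = list(range(n_nodes))
    merges = []
    for u, v, w in sorted(weighted_edges, key=lambda edge: edge[2]):
        root_u, root_v = find(parent, u), find(parent, v)
        if root_u != root_v:
            parent[root_u] = root_v
            merges.append(w)
    return merges


# Toy graph: nodes 0-2 are close together, nodes 3-4 are close together,
# and the two groups are joined only by a long edge.
edges = [(0, 1, 1.0), (1, 2, 1.5), (3, 4, 2.0), (2, 3, 6.0), (0, 2, 1.8)]
thresholds = [0.5, 1.6, 2.5, 7.0]
counts = [components_at(5, edges, eps) for eps in thresholds]
merges = merge_thresholds(5, edges)
```

In this toy graph the component count stays at two for every threshold between 2.0 and 6.0, a long interval compared with the earlier merges; that long-lived pair of components is the kind of persistent feature the filtration is meant to expose.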

In summary, most current multi-layer network clustering methods embed the graph into Euclidean space by graph decomposition and do not explicitly consider local graph geometry and topology. The multi-layer network clustering method adopted in embodiments of the invention instead clusters the multi-layer network in an unsupervised manner, based on the similarity of data shape recorded at multiple resolutions. To quantify the shape dynamics of a multi-layer network across evolving similarity scales, the multi-lens tools of TDA are introduced into the clustering calculation. The core idea is that if the local neighborhoods of two points are similar in shape at all resolution scales, the two points are close enough to be placed in the same cluster. Persistence graph clustering therefore uses both the distance function and the local spatial information around each point, and can obtain more accurate clustering results for multi-layer graph networks.

Correspondingly, the invention also provides a tumor microenvironment spatial relationship modeling method based on digital pathological images, which implements the functions of each module in the system. For example, the method includes: step S110, determining the pixel distribution type of the pathological image and, according to the overall distribution of the pixels of the pathological image, performing color standardization on variations in staining distribution to obtain a staining-standardized image; step S120, detecting a region of interest in the staining-standardized image using a weakly supervised deep learning model, and then segmenting the region of interest to obtain a target structural region; step S130, extracting multiple types of cell information from the obtained target structural region; and step S140, modeling a multi-layer network with a multi-layer graph to characterize the co-spatial distribution among multiple cell types, and performing cluster analysis on the multi-layer graph to obtain a quantitative spatial distribution model. The multi-layer graph comprises intra-layer relationships and inter-layer interactions, nodes in the same layer represent cells of the same type, and connections between different layers represent spatial connection relationships between different types of cells or structures.

In summary, compared with the prior art, the invention has at least the following technical effects:

1) A fast multi-scale pathological image computation method is designed, based on weakly supervised learning with learned-regularization-constrained encoding and decoding. To address the difficulty of computing on a single pathological image of more than one hundred million pixels and the under-use of information at different magnification scales, the weakly supervised approach is combined with deep learning: large-scale data annotation is not needed and cross-scale information is fully used, enabling fast detection of pathological regions of interest and accurate localization of cell nuclei in digital whole-slide pathological images.

2) By combining the differences and correlations between cells and structures, a self-attention transformation network is used to encode and decode each cell and structure, achieving rapid detection and accurate identification of clustered, multi-type, small-target cells.

3) A tumor microenvironment topological space modeling method based on persistence graph clustering is provided, which further enables quantitative calculation of pathological diagnosis indexes. Conventional distance-based or statistical methods have difficulty expressing and analyzing the spatial organization of a complex tumor microenvironment. The invention introduces the concept of topological data analysis into complex multi-layer network clustering and proposes a topological space modeling method based on persistence graph clustering, which reveals the association between intra-tumor heterogeneity and the spatial distribution of cells and tissues in the tumor microenvironment, and provides a new quantitative analysis approach for studying tumor evolution mechanisms.

The present invention may be a system, method, and/or computer program product. The computer program product may include a computer readable storage medium having computer readable program instructions embodied thereon for causing a processor to implement aspects of the present invention.

The computer readable storage medium may be a tangible device that can hold and store instructions for use by an instruction execution device. The computer readable storage medium may be, for example, but is not limited to, an electronic storage device, a magnetic storage device, an optical storage device, an electromagnetic storage device, a semiconductor storage device, or any suitable combination of the foregoing. More specific examples (a non-exhaustive list) of the computer readable storage medium include the following: a portable computer disk, a hard disk, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), static random access memory (SRAM), portable compact disc read-only memory (CD-ROM), a digital versatile disk (DVD), a memory stick, a floppy disk, a mechanically encoded device such as punch cards or raised structures in a groove having instructions stored thereon, and any suitable combination of the foregoing. A computer readable storage medium, as used herein, is not to be construed as being transitory signals per se, such as radio waves or other freely propagating electromagnetic waves, electromagnetic waves propagating through waveguides or other transmission media (e.g., optical pulses through fiber optic cables), or electrical signals transmitted through wires.

The computer readable program instructions described herein may be downloaded from a computer readable storage medium to a respective computing/processing device or to an external computer or external storage device over a network, such as the internet, a local area network, a wide area network, and/or a wireless network. The network may include copper transmission cables, fiber optic transmissions, wireless transmissions, routers, firewalls, switches, gateway computers and/or edge servers. The network interface card or network interface in each computing/processing device receives computer readable program instructions from the network and forwards the computer readable program instructions for storage in a computer readable storage medium in the respective computing/processing device.

The computer program instructions for carrying out operations of the present invention may be assembly instructions, instruction set architecture (ISA) instructions, machine instructions, machine-dependent instructions, microcode, firmware instructions, state-setting data, or source code or object code written in any combination of one or more programming languages, including object-oriented programming languages such as Smalltalk, C++, and Python, and conventional procedural programming languages such as the "C" language or similar programming languages. The computer readable program instructions may be executed entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or may be connected to an external computer (for example, through the Internet using an Internet service provider). In some embodiments, electronic circuitry, such as programmable logic circuitry, field programmable gate arrays (FPGAs), or programmable logic arrays (PLAs), is personalized by using state information of the computer readable program instructions, and the electronic circuitry can execute the computer readable program instructions to implement aspects of the present invention.

Various aspects of the present invention are described herein with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the invention. It will be understood that each block of the flowchart illustrations and/or block diagrams, and combinations of blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer-readable program instructions.

These computer readable program instructions may be provided to a processor of a general purpose computer, special purpose computer, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions/acts specified in the flowchart and/or block diagram block or blocks. These computer readable program instructions may also be stored in a computer readable storage medium that can direct a computer, programmable data processing apparatus, and/or other devices to function in a particular manner, such that the computer readable medium having the instructions stored therein includes an article of manufacture including instructions which implement the function/act specified in the flowchart and/or block diagram block or blocks.

The computer readable program instructions may also be loaded onto a computer, other programmable data processing apparatus, or other devices to cause a series of operational steps to be performed on the computer, other programmable apparatus or other devices to produce a computer implemented process such that the instructions which execute on the computer, other programmable apparatus or other devices implement the functions/acts specified in the flowchart and/or block diagram block or blocks.

The flowcharts and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present invention. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of instructions, which comprises one or more executable instructions for implementing the specified logical function(s). In some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions. It is well known to those skilled in the art that implementation by hardware, implementation by software, and implementation by a combination of software and hardware are all equivalent.

The foregoing description of embodiments of the invention has been presented for purposes of illustration and description, and is not intended to be exhaustive or limited to the embodiments disclosed. Many modifications and variations will be apparent to those of ordinary skill in the art without departing from the scope and spirit of the various embodiments described. The terminology used herein was chosen to best explain the principles of the embodiments, their practical application, or the technical improvement over technologies found in the marketplace, or to enable others of ordinary skill in the art to understand the embodiments disclosed herein. The scope of the invention is defined by the appended claims.

Claims

1. A tumor microenvironment spatial relationship modeling system based on digital pathology images, comprising:

2. A tumor microenvironment spatial relationship modeling method based on digital pathological images comprises the following steps:

3. The method of claim 2, wherein the input of the weakly supervised deep learning model fuses image information at multiple magnifications, and the training process employs a multi-magnification regularized loss as the optimization objective, expressed as:

wherein I_d represents the image input at magnification d; f_θ,η represents the feature calculation with parameters θ under the attention weight η, where η is calculated as Softmax(f_θ(I)); the loss term is the loss between the true value and the predicted value; R(S) is the regularization loss, where S = f_θ(I) ∈ [0,1]^(|Ω|×K) and K represents the number of channels; λ and μ are preset weight parameters; I represents the image; and Y represents the label corresponding to the image.

4. The method according to claim 2, characterized in that step S3 comprises the sub-steps of:

converting the extracted feature map X of the target structure region into visual tokens T;

modeling the dependencies among the visual tokens T using a self-attention transformation, and projecting the result back to the dimensions of the ordinary feature map, expressed as:

X_out = X_in + SOFTMAX_L((X_in W_Q)(T W_K)^T) T + G

wherein G is a key judgment matrix, W_Q and W_K are weight parameters, X_in represents the image features obtained during multi-scale pathological image tumor region detection, and X_out represents the output result;

identifying and localizing different types of cells through data-driven learning, according to the feature relations constructed for the image.
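
The projection formula in claim 4 can be illustrated with a minimal pure-Python sketch. It assumes that X_in has shape (HW, C), that the visual tokens T have shape (L, C), that W_Q and W_K are C×C matrices, that the softmax is taken over the L tokens, and that G is an additive term with the same shape as X_in. These shapes, the helper names (matmul, transpose, softmax_rows, project_tokens), and the example values are assumptions made for illustration and are not specified in the claim.

```python
import math


def matmul(a, b):
    """Multiply two matrices stored as nested lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(a):
    return [list(col) for col in zip(*a)]


def softmax_rows(a):
    """Row-wise softmax, shifted by the row maximum for numerical stability."""
    result = []
    for row in a:
        peak = max(row)
        exps = [math.exp(v - peak) for v in row]
        total = sum(exps)
        result.append([e / total for e in exps])
    return result


def project_tokens(x_in, tokens, w_q, w_k, g=None):
    """Compute X_out = X_in + softmax((X_in W_Q)(T W_K)^T) T + G."""
    queries = matmul(x_in, w_q)                                 # (HW, C)
    keys = matmul(tokens, w_k)                                  # (L, C)
    attention = softmax_rows(matmul(queries, transpose(keys)))  # (HW, L)
    update = matmul(attention, tokens)                          # (HW, C)
    if g is None:
        # Assumption: with no G supplied, the additive term is zero.
        g = [[0.0] * len(x_in[0]) for _ in x_in]
    return [
        [x + u + b for x, u, b in zip(x_row, u_row, g_row)]
        for x_row, u_row, g_row in zip(x_in, update, g)
    ]


# Toy example: 3 pixel positions, 2 channels, 2 visual tokens (values are arbitrary).
identity = [[1.0, 0.0], [0.0, 1.0]]
x_in = [[0.5, 1.0], [1.5, -0.5], [0.0, 2.0]]
tokens = [[1.0, 0.0], [0.0, 1.0]]
x_out = project_tokens(x_in, tokens, identity, identity)
```

Each row of the attention matrix sums to one, so every pixel position receives a convex combination of the visual tokens on top of its original feature.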

5. The method of claim 2, wherein the multi-layer graph comprises m non-overlapping layers, each layer being modeled by a weighted graph G_i with adjacency matrix A_i, i = 1, …, m; the elements of the set A = {A_1, A_2, …, A_m} are referred to as intra-layer matrices and represent intra-layer connections; for modeling the relationship between two graphs G_k and G_l, whose adjacency matrices are denoted A_k and A_l respectively, a one-to-one symmetric interconnection between the nodes of the two related graphs is represented by the set of cross-layer adjacency matrices C_p = {A_l,k, k ≠ l}, which represents edges between nodes of different layers, p representing the number of relationship graphs, where k, l = 1, 2, …, m and k ≠ l.

6. The method of claim 5, wherein, for a multi-layer network constructed from the multi-layer graph and having a set of inter-layer connections that connect cross-layer nodes, each inter-layer edge (u, v) satisfying u ∈ V(G_k) and v ∈ V(G_l) with k ≠ l, the multi-layer network has a block matrix structure, expressed as:

wherein the diagonal elements, taken from the set A, are intra-layer matrices, and the off-diagonal elements A_kl (k, l = 1, 2, …, m; k ≠ l) represent inter-layer connections connecting nodes in layer G_k with nodes in layer G_l.

7. The method of claim 6, wherein, for the inter-layer connections, the values of the inter-layer off-diagonal elements A_kl are obtained based on spatial distance, and, for the intra-layer matrices, the values of the diagonal elements are obtained from the Euclidean distance between cells.
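
The block structure described in claims 5 to 7 can be sketched as follows. This is a minimal illustration under stated assumptions, not the claimed implementation: cells are given as (x, y) centroids grouped by type, intra-layer weights are the Euclidean distances between cells of the same type, and inter-layer weights keep the distance only when two cells of different types lie within a cutoff. The function name build_block_matrix, the parameter max_link_distance, the input format, and the cutoff rule are all hypothetical; the claims state only that inter-layer values are obtained based on spatial distance.

```python
import math


def build_block_matrix(layers, max_link_distance):
    """Build the block adjacency matrix of a multi-layer cell graph.

    layers: one list of (x, y) centroids per cell type (hypothetical input format).
    max_link_distance: assumed cutoff for creating inter-layer edges.
    Returns (matrix, layer_of_node). Nodes are ordered layer by layer, so the
    diagonal blocks are intra-layer and the off-diagonal blocks are inter-layer.
    """
    nodes = [(k, point) for k, points in enumerate(layers) for point in points]
    n = len(nodes)
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            (layer_i, p), (layer_j, q) = nodes[i], nodes[j]
            distance = math.dist(p, q)
            if layer_i == layer_j:
                weight = distance  # intra-layer: Euclidean distance between cells
            elif distance <= max_link_distance:
                weight = distance  # inter-layer: keep spatially close pairs (assumption)
            else:
                weight = 0.0       # inter-layer pairs that are too far apart are not linked
            matrix[i][j] = matrix[j][i] = weight  # symmetric connections
    return matrix, [k for k, _ in nodes]


# Toy example with two cell types (coordinates are arbitrary).
tumor_cells = [(0.0, 0.0), (3.0, 4.0)]
lymphocytes = [(0.0, 1.0), (30.0, 40.0)]
block_matrix, layer_of_node = build_block_matrix([tumor_cells, lymphocytes], 10.0)
```

Because the nodes are ordered layer by layer, rows and columns that share a layer index form the diagonal blocks A_k, and the remaining entries form the inter-layer blocks A_kl.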

8. The method of claim 5, further comprising reducing the dimensionality of the multi-layer network using graph embedding, and clustering the dimension-reduced multi-layer network according to the shape similarity of the local neighborhoods of two points across all resolution scales.

9. The method of claim 2, wherein the pixel distribution type of the pathological image is determined using a Jarque-Bera test.
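
As a rough illustration of the test named in claim 9, the sketch below computes the Jarque-Bera statistic from the sample skewness and kurtosis and labels a set of pixel intensities as Gaussian-like or not. The function names, the 0.05 significance level, and the two-way labeling are assumptions for illustration; the claim does not specify how the test result is mapped to a distribution type. The p-value uses the asymptotic chi-square distribution with two degrees of freedom, whose survival function is exp(-x/2).

```python
import math
import random


def jarque_bera(samples):
    """Return (JB statistic, asymptotic p-value) for a sequence of numbers."""
    n = len(samples)
    mean = sum(samples) / n
    # Biased central moments, as used in the standard JB definition.
    m2 = sum((x - mean) ** 2 for x in samples) / n
    m3 = sum((x - mean) ** 3 for x in samples) / n
    m4 = sum((x - mean) ** 4 for x in samples) / n
    if m2 == 0:
        raise ValueError("constant input has no defined skewness or kurtosis")
    skewness = m3 / m2 ** 1.5
    kurtosis = m4 / m2 ** 2
    jb = n / 6.0 * (skewness ** 2 + (kurtosis - 3.0) ** 2 / 4.0)
    # Chi-square with 2 degrees of freedom: P(X > x) = exp(-x / 2).
    p_value = math.exp(-jb / 2.0)
    return jb, p_value


def pixel_distribution_type(pixels, alpha=0.05):
    """Label the intensities 'gaussian' if normality is not rejected (assumed rule)."""
    _, p_value = jarque_bera(pixels)
    return "gaussian" if p_value >= alpha else "non-gaussian"


# Synthetic intensities standing in for the pixels of one stain channel.
rng = random.Random(0)
bell_shaped = [rng.gauss(128.0, 20.0) for _ in range(5000)]
skewed = [rng.expovariate(1 / 40.0) for _ in range(5000)]
jb_bell, _ = jarque_bera(bell_shaped)
jb_skewed, _ = jarque_bera(skewed)
```

A strongly skewed sample yields a much larger statistic than a bell-shaped one, which is what allows the two distribution types to be separated before color standardization.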

10. A computer readable storage medium having stored thereon a computer program, wherein the program when executed by a processor realizes the steps of the method according to any of claims 2 to 9.