CN116564401A - Model training, cell segmentation system, method and storage medium - Google Patents

Model training, cell segmentation system, method and storage medium

Info

Publication number
CN116564401A
CN116564401A (application CN202310412653.XA)
Authority
CN
China
Prior art keywords
data, cell, feature, positive sample, inputting
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310412653.XA
Other languages
Chinese (zh)
Inventor
杨梦�
杨悦羽霄
蓝琳琪
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
MGI Tech Co Ltd
Original Assignee
MGI Tech Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by MGI Tech Co Ltd filed Critical MGI Tech Co Ltd
Priority to CN202310412653.XA
Publication of CN116564401A
Legal status: Pending

Classifications

    • G: Physics
    • G16: Information and communication technology [ICT] specially adapted for specific application fields
    • G16B: Bioinformatics, i.e. ICT specially adapted for genetic or protein-related data processing in computational molecular biology
    • G16B15/00: ICT specially adapted for analysing two-dimensional or three-dimensional molecular structures, e.g. structural or functional relations or structure alignment
    • G06F18/00: Pattern recognition (G06: Computing; G06F: Electric digital data processing)
    • G06F18/213: Feature extraction, e.g. by transforming the feature space; summarisation; mappings, e.g. subspace methods
    • G06F18/214: Generating training patterns; bootstrap methods, e.g. bagging or boosting
    • G06F18/241: Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • Y02A90/10: Information and communication technologies [ICT] supporting adaptation to climate change, e.g. for weather forecasting or climate simulation

Abstract

Embodiments of the disclosure provide a model training and cell segmentation system, method, and storage medium. The model training system comprises a first processor configured to: acquire a first data set and a self-supervised training model; enhance the first data set to obtain first enhanced data and second enhanced data, input the first enhanced data into a first encoder to obtain first positive sample features, and input the second enhanced data into a second encoder to obtain second positive sample features; input the first positive sample features, the second positive sample features, and negative sample features constructed for the second positive sample features into a contrastive learning module to obtain a contrastive learning result; and adjust model parameters of the self-supervised training model according to the contrastive learning result to obtain a feature extraction model, wherein the feature extraction model is used to segment the at least two cells. With this technical scheme, cells can be segmented accurately.

Description

Model training, cell segmentation system, method and storage medium
Technical Field
Embodiments of the disclosure relate to the field of machine learning, and in particular to a model training and cell segmentation system, method, and storage medium.
Background
Spatial omics techniques can measure spatial context information together with molecular-level information (e.g., proteins), which helps to understand and characterize molecular differences in complex tissues.
On this basis, for spatial omics data obtained by such measurement, performing spatial microenvironment segmentation, that is, segmenting the at least two cells corresponding to the spatial omics data, helps to uncover the structure of the tumor microenvironment and thus to understand tumor-immune interaction mechanisms, which benefits clinical and translational medicine.
However, the segmentation accuracy achieved by currently adopted cell segmentation schemes leaves room for improvement.
Disclosure of Invention
Embodiments of the present disclosure provide a model training and cell segmentation system, method, and storage medium, to improve the accuracy of cell segmentation performed on a first data set generated using spatial omics principles.
According to an aspect of the present disclosure, there is provided a model training system, which may include:
a first processor configured to:
acquiring a first data set and a self-supervised training model to be trained, wherein the first data set comprises data produced by applying spatial omics principles for each cell of at least two cells, and the self-supervised training model comprises a first encoder, a second encoder, and a contrastive learning module;
enhancing the first data set to obtain first enhanced data and second enhanced data, inputting the first enhanced data into the first encoder to obtain first positive sample features, and inputting the second enhanced data into the second encoder to obtain second positive sample features;
inputting the first positive sample features, the second positive sample features, and negative sample features constructed for the second positive sample features into the contrastive learning module to obtain a contrastive learning result; and
adjusting model parameters of the self-supervised training model according to the contrastive learning result to obtain a feature extraction model, wherein the feature extraction model is used to segment the at least two cells.
According to another aspect of the present disclosure, there is provided a cell segmentation system, which may include:
a second processor configured to:
acquiring a feature extraction model trained by the model training system provided by any embodiment of the disclosure, and the first data set applied in the training process of the feature extraction model, wherein the first data set comprises data produced by applying spatial omics principles for each cell of at least two cells; and
inputting the first data set into the feature extraction model to obtain cell features of each cell, and clustering the cell features of each cell to obtain a segmentation result for the at least two cells.
According to another aspect of the present disclosure, there is provided a model training method, which may include:
acquiring a first data set and a self-supervised training model to be trained, wherein the first data set comprises data produced by applying spatial omics principles for each cell of at least two cells, and the self-supervised training model comprises a first encoder, a second encoder, and a contrastive learning module;
enhancing the first data set to obtain first enhanced data and second enhanced data, inputting the first enhanced data into the first encoder to obtain first positive sample features, and inputting the second enhanced data into the second encoder to obtain second positive sample features;
inputting the first positive sample features, the second positive sample features, and negative sample features constructed for the second positive sample features into the contrastive learning module to obtain a contrastive learning result; and
adjusting model parameters of the self-supervised training model according to the contrastive learning result to obtain a feature extraction model, wherein the feature extraction model is used to segment the at least two cells.
According to another aspect of the present disclosure, there is provided a cell segmentation method, which may include:
acquiring a feature extraction model trained according to the model training method provided by any embodiment of the disclosure, and the first data set applied in the training process of the feature extraction model, wherein the first data set comprises data produced by applying spatial omics principles for each cell of at least two cells; and
inputting the first data set into the feature extraction model to obtain cell features of each cell, and clustering the cell features of each cell to obtain a segmentation result for the at least two cells.
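The feature-then-cluster pipeline above leaves the clustering algorithm open. As one hedged illustration only (the disclosure does not prescribe k-means, and all sizes below are toy values), a minimal k-means over synthetic cell features can be sketched as:

```python
import numpy as np

def kmeans(feats, k, iters=20, seed=0):
    """Minimal k-means over cell features; the clustering algorithm is
    left open by the disclosure, so this choice is purely illustrative."""
    rng = np.random.default_rng(seed)
    centers = feats[rng.choice(len(feats), size=k, replace=False)]
    for _ in range(iters):
        # Assign each cell feature to its nearest cluster center.
        labels = np.argmin(((feats[:, None] - centers) ** 2).sum(-1), axis=1)
        for c in range(k):
            if (labels == c).any():
                centers[c] = feats[labels == c].mean(axis=0)
    return labels

# Two well-separated groups of toy "cell features" yield two segments.
feats = np.vstack([np.zeros((5, 2)), np.full((5, 2), 10.0)])
labels = kmeans(feats, k=2)
```

Each label here plays the role of a cell's segment assignment in the segmentation result.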
According to another aspect of the present disclosure, there is provided a computer-readable storage medium having stored thereon computer instructions which, when executed by a processor, implement the model training method or the cell segmentation method provided by any embodiment of the present disclosure.
According to the technical scheme of the embodiments of the disclosure, for the first encoder, the second encoder, and the contrastive learning module in the self-supervised training model to be trained, the first data set is enhanced; the resulting first enhanced data is input into the first encoder to obtain first positive sample features, and the resulting second enhanced data is input into the second encoder to obtain second positive sample features, where these two positive sample features form a positive sample pair. For negative sample features that can form a negative sample pair with the second positive sample features, the two positive sample features and the negative sample features are input into the contrastive learning module, so that the contrastive learning module performs contrastive learning over the positive and negative sample pairs to obtain a contrastive learning result. Model parameters of the self-supervised training model are then adjusted based on the contrastive learning result to obtain the trained feature extraction model. With this scheme, deep features of the first data set can be extracted by the feature extraction model trained through contrastive learning; compared with the original first data set or shallow features extracted from it, the deep features do not carry excessive interference information, so better segmentation accuracy can be obtained when cells are segmented using the deep features.
It should be understood that the description in this section is not intended to identify key or critical features of the embodiments of the disclosure, nor is it intended to be used to limit the scope of the disclosure. Other features of the present disclosure will become apparent from the following specification.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present disclosure, the drawings required for describing the embodiments are briefly introduced below. It is obvious that the drawings in the following description show only some embodiments of the present disclosure, and a person of ordinary skill in the art may obtain other drawings from them without inventive effort.
FIG. 1 is a flow chart of execution of a first processor in a model training system provided in accordance with an embodiment of the present disclosure;
FIG. 2 is a flowchart of the execution of a first processor in another model training system provided in accordance with an embodiment of the present disclosure;
FIG. 3 is a flowchart of the execution of a first processor in another model training system provided in accordance with an embodiment of the present disclosure;
FIG. 4 is a flowchart of the execution of a first processor in another model training system provided in accordance with an embodiment of the present disclosure;
FIG. 5 is a flowchart of the execution of a second processor in a cell segmentation system according to an embodiment of the present disclosure;
FIG. 6a is a schematic diagram of an example of data preprocessing in a cell segmentation system provided in accordance with an embodiment of the present disclosure;
FIG. 6b is a schematic diagram of an example of data fusion, feature extraction, and encoding in a cell segmentation system provided in accordance with an embodiment of the present disclosure;
FIG. 6c is a schematic diagram of a comparative learning example in a cell segmentation system provided in accordance with an embodiment of the present disclosure;
fig. 7 is a schematic diagram of the structure of an electronic device implementing the model training system or cell segmentation system in an embodiment of the present disclosure.
Detailed Description
In order that those skilled in the art will better understand the present disclosure, the technical solutions in the embodiments of the present disclosure are described clearly and completely below with reference to the accompanying drawings. It is apparent that the described embodiments are only some, not all, of the embodiments of the present disclosure. All other embodiments obtained by one of ordinary skill in the art without inventive effort, based on the embodiments in this disclosure, shall fall within the scope of the present disclosure.
It should be noted that the terms "first", "second", and the like in the description and claims of the present disclosure and in the foregoing figures are used to distinguish between similar objects, and are not necessarily used to describe a particular sequence or chronological order. It is to be understood that data so used may be interchanged where appropriate, so that the embodiments of the disclosure described herein can be implemented in sequences other than those illustrated or described herein. Terms such as "target" and "original" are used similarly and are not described in detail herein. Furthermore, the terms "comprises", "comprising", and "having", and any variations thereof, are intended to cover a non-exclusive inclusion, so that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
Fig. 1 is a flowchart of steps executed by a first processor in a model training system provided by an embodiment of the present disclosure. This embodiment is applicable to obtaining, through self-supervised learning, a feature extraction model for extracting deep features of a first data set, and is particularly applicable to obtaining the feature extraction model through contrastive learning. The steps may be performed by a model training apparatus, which may be implemented in software and/or hardware and may be integrated in the first processor.
Referring to fig. 1, a model training system provided by an embodiment of the present disclosure may include a first processor that may be configured to:
s110, acquiring a first data set and a self-supervision training model to be trained, wherein the first data set comprises data produced by applying a space histology principle aiming at each cell in at least two cells, and the self-supervision training model comprises a first encoder, a second encoder and a comparison learning module.
The first data set may comprise data produced using spatial omics principles for each of the at least two cells, and the data may comprise at least one of spatial data, protein data, and nucleic acid data. Taking protein data as an example, the protein data may be spatial proteomics data, which can be characterized by spatial coordinates and protein abundance information. Taking nucleic acid data as an example, in situ sequencing data (spatial genomics data) may be used, which can be characterized by nucleic acid presence sites and expression levels. In practical applications, optionally, in addition to the above spatial proteomics data and spatial genomics data, the data may specifically be at least one of spatial transcriptomics data, spatial mass spectrometry data, spatial metabolomics data, spatial epigenomics data, spatial multi-omics data, and the like, which is not specifically limited herein.
Illustratively, taking protein data as an example, the first data set can be represented by X ∈ ℝ^(N×D), where X is the first data set, N is the number of cells contained in X (i.e., the specific count of the at least two cells), and D is the number of proteins measured for each cell. In practice, the first data set may optionally include other data in addition to the protein data, such as the position data (x, y) of each cell, in which case X ∈ ℝ^(N×(D+2)), where D+2 is the number of protein columns plus the two position columns (x, y).
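As a non-limiting illustration of the X ∈ ℝ^(N×(D+2)) representation described above, the following sketch builds such a matrix by concatenating per-cell protein columns with the (x, y) position columns; the sizes and values are toy placeholders, not data from the disclosure:

```python
import numpy as np

rng = np.random.default_rng(0)
N, D = 5, 3                      # toy sizes: 5 cells, 3 proteins
protein = rng.random((N, D))     # per-cell protein abundance
xy = rng.random((N, 2))          # per-cell position data (x, y)

# First data set X in R^{N x (D+2)}: protein columns plus position columns.
X = np.concatenate([protein, xy], axis=1)
```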
It should be noted that many of the examples below use protein data, but this is merely illustrative and not a specific limitation on the first data set.
The self-supervised training model may be understood as a deep learning model to be trained that implements a feature extraction function; it comprises a first encoder, a second encoder, and a contrastive learning module, whose specific functions are described in the following steps. On this basis, the self-supervised training model may further include a first integration mapping module and a second integration mapping module, and/or a first nonlinear coding module and a second nonlinear coding module, and the like, which are not specifically limited herein.
S120, enhancing the first data set to obtain first enhanced data and second enhanced data, inputting the first enhanced data into the first encoder to obtain first positive sample features, and inputting the second enhanced data into the second encoder to obtain second positive sample features.
Because the self-supervised training model is to be trained by contrastive learning, the first data set is enhanced to obtain first enhanced data and second enhanced data that can be used to construct a positive sample pair. On this basis, the first enhanced data is input into the first encoder, which performs feature extraction on it to obtain first positive sample features. Similarly, the second enhanced data is input into the second encoder, which performs feature extraction on it to obtain second positive sample features. The combination of the first positive sample features and the second positive sample features obtained here serves as the positive sample pair for subsequent contrastive learning.
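The enhancement and dual-encoder step above can be sketched as follows. The disclosure fixes neither a concrete enhancement nor an encoder architecture, so additive noise and random linear maps are used here purely as hypothetical stand-ins:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.random((8, 16))          # toy first data set (8 cells, 16 features)

def enhance(x, rng):
    # The disclosure does not fix an enhancement; additive noise is assumed.
    return x + 0.01 * rng.standard_normal(x.shape)

# Stand-in encoders: two independent linear maps into a shared feature space.
W1 = rng.standard_normal((16, 4))
W2 = rng.standard_normal((16, 4))

first_enhanced, second_enhanced = enhance(X, rng), enhance(X, rng)
first_pos = first_enhanced @ W1    # first positive sample features
second_pos = second_enhanced @ W2  # second positive sample features
```

The rows of `first_pos` and `second_pos` for the same cell then form the positive sample pairs.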
S130, inputting the first positive sample features, the second positive sample features, and negative sample features constructed for the second positive sample features into the contrastive learning module to obtain a contrastive learning result.
The negative sample features may be understood as features constructed for the second positive sample features, i.e., features that can be combined with the second positive sample features into a negative sample pair. On this basis, the first positive sample features, the second positive sample features, and the negative sample features are input into the contrastive learning module, which performs contrastive learning over these three kinds of features to obtain a corresponding contrastive learning result.
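A common way to realize such a contrastive learning result over one positive pair and several negatives is an InfoNCE-style loss. The disclosure does not name a specific loss, so the following is an assumed sketch for illustration only:

```python
import numpy as np

def info_nce(q, k_pos, k_neg, tau=0.07):
    """InfoNCE-style contrastive loss for a single query feature.

    q     : (d,)   first positive sample feature (query)
    k_pos : (d,)   second positive sample feature (its positive key)
    k_neg : (M, d) negative sample features
    """
    q = q / np.linalg.norm(q)
    k_pos = k_pos / np.linalg.norm(k_pos)
    k_neg = k_neg / np.linalg.norm(k_neg, axis=1, keepdims=True)
    logits = np.concatenate(([q @ k_pos], k_neg @ q)) / tau
    logits = logits - logits.max()            # numerical stability
    p = np.exp(logits) / np.exp(logits).sum()
    return -np.log(p[0])                      # positive key sits at index 0

rng = np.random.default_rng(1)
loss = info_nce(rng.random(4), rng.random(4), rng.random((6, 4)))
```

Minimizing such a loss pulls the positive pair together and pushes the negatives away.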
S140, adjusting model parameters of the self-supervised training model according to the contrastive learning result to obtain a feature extraction model, wherein the feature extraction model is used to segment the at least two cells.
By adjusting the model parameters of the self-supervised training model according to the contrastive learning result, a feature extraction model can be obtained after multiple rounds of contrastive learning; the feature extraction model comprises the trained first encoder. On this basis, optionally, in the case that the self-supervised training model further includes the first and second integration mapping modules, and/or the first and second nonlinear coding modules, the feature extraction model may further include the trained first integration mapping module.
It should be noted that the feature extraction model obtained through contrastive learning training can extract deep features of the first data set. Compared with the original first data set or shallow features extracted from it, the deep features do not carry excessive interference information, which means that, compared with schemes that perform cell segmentation on the first data set or on shallow features, embodiments of the disclosure can better ensure segmentation accuracy by performing cell segmentation on the deep features.
According to the technical scheme of the embodiments of the disclosure, for the first encoder, the second encoder, and the contrastive learning module in the self-supervised training model to be trained, the first data set is enhanced; the resulting first enhanced data is input into the first encoder to obtain first positive sample features, and the resulting second enhanced data is input into the second encoder to obtain second positive sample features, where these two positive sample features form a positive sample pair. For negative sample features that can form a negative sample pair with the second positive sample features, the two positive sample features and the negative sample features are input into the contrastive learning module, so that the contrastive learning module performs contrastive learning over the positive and negative sample pairs to obtain a contrastive learning result. Model parameters of the self-supervised training model are then adjusted based on the contrastive learning result to obtain the trained feature extraction model. With this scheme, deep features of the first data set can be extracted by the feature extraction model trained through contrastive learning; compared with the original first data set or shallow features extracted from it, the deep features do not carry excessive interference information, so better segmentation accuracy can be obtained when cells are segmented using the deep features.
In an optional technical scheme, the second positive sample features are obtained during the current round of training the self-supervised training model based on the second enhanced data, and the first processor is further configured to:
acquire historical sample features obtained during historical rounds of training the self-supervised training model based on the second enhanced data, wherein the historical rounds occur before the current round; and
construct the negative sample features based on the historical sample features.
To ensure the training effect, multiple rounds of training may be performed based on the first data set, i.e., the model is iteratively trained multiple times on the first data set. On this basis, when the second positive sample features are obtained during the current round of training based on the second enhanced data, the historical sample features obtained during historical rounds of training based on the second enhanced data can be used as the negative sample features. In practice, there may be one, two, or more historical rounds, and correspondingly one, two, or more negative sample features. When there are two or more negative sample features, a negative sample queue can be constructed from them; by maintaining such a queue, enough negative sample features can be applied, which helps avoid model collapse and saves GPU resources.
In this technical scheme, negative sample pairs are constructed from the sample features obtained by encoding the second enhanced data in different training rounds; the invariance of the second enhanced data ensures the quality of the constructed negative sample pairs and thus the training effect.
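The negative sample queue described above can be sketched as a fixed-length FIFO buffer of historical sample features, in the spirit of momentum-contrast (MoCo) training; the class name and sizes below are hypothetical:

```python
from collections import deque

import numpy as np

class NegativeQueue:
    """Fixed-length FIFO buffer of historical sample features used as
    negative sample features (hypothetical illustration, MoCo-style)."""
    def __init__(self, maxlen):
        self.buf = deque(maxlen=maxlen)   # oldest features evicted first

    def push(self, feats):
        for f in feats:
            self.buf.append(f)

    def negatives(self):
        return np.stack(list(self.buf))

q = NegativeQueue(maxlen=4)
q.push(np.ones((3, 2)))    # features from a historical round
q.push(np.zeros((3, 2)))   # features from the next round; oldest evicted
```

The bounded length is what caps memory use, matching the GPU-resource argument above.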
In another optional technical scheme, the self-supervised training model may include a third branch, a fourth branch, and the contrastive learning module, the third branch including the first encoder and a first nonlinear coding module, and the fourth branch including the second encoder and a second nonlinear coding module, the first processor being further configured to:
input the first positive sample features into the first nonlinear coding module to obtain first nonlinear features, and input the second positive sample features into the second nonlinear coding module to obtain second nonlinear features;
wherein inputting the first positive sample features, the second positive sample features, and the negative sample features constructed for the second positive sample features into the contrastive learning module to obtain the contrastive learning result comprises:
inputting the first nonlinear features, the second nonlinear features, and negative sample features constructed for the second nonlinear features into the contrastive learning module to obtain the contrastive learning result.
Here the self-supervised training model adopts a two-branch structure in which the two branches have the same composition, each comprising an encoder and a nonlinear coding module. The two branches are exemplified as a third branch, including the first encoder and the first nonlinear coding module, and a fourth branch, including the second encoder and the second nonlinear coding module. The first and second nonlinear coding modules are substantially identical; they are named differently only to distinguish the modules on the two branches, without limiting their essential meaning.
A nonlinear coding module (projection head) may be understood as a module prepared specifically for contrastive learning; in practical applications, it may optionally comprise a first linear mapping unit, a rectified linear unit (ReLU), and a second linear mapping unit connected in sequence. The positive sample features extracted by each encoder are passed to the corresponding projection head, and the nonlinear features output by the projection head are then used for contrastive learning training, thereby ensuring the training effect.
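The linear mapping -> ReLU -> linear mapping structure of the nonlinear coding module can be sketched as follows; the weights are random stand-ins and the dimensions are assumptions, since the disclosure does not specify them:

```python
import numpy as np

def projection_head(x, W1, b1, W2, b2):
    # First linear mapping unit, then ReLU, then second linear mapping unit.
    h = np.maximum(x @ W1 + b1, 0.0)
    return h @ W2 + b2

rng = np.random.default_rng(0)
x = rng.random((8, 16))                          # positive sample features
W1, b1 = rng.standard_normal((16, 16)), np.zeros(16)
W2, b2 = rng.standard_normal((16, 4)), np.zeros(4)
z = projection_head(x, W1, b1, W2, b2)           # nonlinear features
```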
It should be noted that, since the projection head is created for, and closely tied to, the contrastive task, while the downstream task (cluster-based segmentation) is a different task, the projection head need not participate in the downstream task; accordingly, the trained feature extraction model does not need to include the projection head when extracting features.
Fig. 2 is a flowchart of steps executed by the first processor in another model training system provided by an embodiment of the present disclosure. This embodiment is optimized on the basis of the above technical schemes. In this embodiment, optionally, the data includes position data of each cell, and enhancing the first data set to obtain the first enhanced data and the second enhanced data includes: mapping the first data set onto a three-dimensional space according to the position data of each cell to obtain three-dimensional space data; and enhancing the three-dimensional space data to obtain the first enhanced data and the second enhanced data. Explanations of terms that are the same as or correspond to those of the above embodiments are not repeated herein.
Referring to fig. 2, the first processor of the present embodiment may be configured to:
S210, acquiring a first data set and a self-supervised training model to be trained, wherein the first data set comprises position data produced by applying spatial omics principles for each cell of at least two cells, and the self-supervised training model comprises a first encoder, a second encoder, and a contrastive learning module.
S220, mapping the first data set to a three-dimensional space according to the position data of each cell to obtain three-dimensional space data.
To better characterize the first data set, the position data of each cell may be utilized in addition to its protein data during modeling. Specifically, the first data set is mapped onto a three-dimensional space according to the position data of each cell to obtain three-dimensional space data; illustratively, the protein data of all cells can be filled into one three-dimensional volume according to the position data of each cell. In this way, when contrastive learning is performed on the three-dimensional space data, both the spatial omics semantics (i.e., gene expression information) and the real spatial information can be learned.
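The mapping step above can be sketched as filling each cell's protein vector into an (H, W, D) volume at its (x, y) position; the grid resolution and integer positions are simplifying assumptions, not requirements of the disclosure:

```python
import numpy as np

# Toy cells: integer (x, y) position data plus D protein channels per cell.
D, H, W = 3, 4, 4
cells_xy = np.array([[0, 0], [1, 2], [3, 3]])
protein = np.arange(len(cells_xy) * D, dtype=float).reshape(-1, D)

# Fill each cell's protein vector into an (H, W, D) volume at its position,
# turning the flat first data set into three-dimensional space data.
volume = np.zeros((H, W, D))
for (x, y), p in zip(cells_xy, protein):
    volume[x, y] = p
```

Positions without a cell simply remain zero in this sketch.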
S230, enhancing the three-dimensional space data to obtain first enhanced data and second enhanced data, inputting the first enhanced data to a first encoder to obtain first positive sample characteristics, and inputting the second enhanced data to a second encoder to obtain second positive sample characteristics.
S240, inputting the first positive sample characteristics, the second positive sample characteristics and the negative sample characteristics constructed aiming at the second positive sample characteristics into a contrast learning module to obtain a contrast learning result.
S250, adjusting model parameters of the self-supervised training model according to the contrast learning result to obtain a feature extraction model, wherein the feature extraction model is used to implement the segmentation of the at least two cells.
According to the above technical scheme, when the first data set is modeled, the real spatial information and the protein information (i.e., the protein data) are trained jointly, so that the deep features extracted by the trained feature extraction model better characterize the first data set, thereby ensuring the accuracy of cell segmentation.
In an optional technical scheme, enhancing the three-dimensional space data to obtain the first enhancement data and the second enhancement data includes:
cutting the three-dimensional space data to obtain at least two pieces of block space data; and
for each piece of block space data in the at least two pieces of block space data, enhancing the block space data to obtain the first enhancement data and the second enhancement data.
When the three-dimensional space data is relatively large, enhancement data obtained directly from it cannot be fed into the self-supervised training model. To solve this problem, the three-dimensional space data may be cut into at least two pieces of block space data. Since each piece of block space data is of moderate size, it can be enhanced to obtain enhancement data of a size suitable for input into the self-supervised training model. By dicing the three-dimensional space data in this way, spaces of arbitrary size can be handled, i.e., even large three-dimensional space data can be used for training.
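The dicing step can be sketched as follows; a minimal NumPy illustration in which the drop-remainder policy and the name `dice_into_patches` are assumptions:

```python
import numpy as np

def dice_into_patches(volume, patch=100):
    """Cut an (H, W, D) volume into non-overlapping (patch, patch, D) blocks.
    Trailing rows/columns that do not fill a whole patch are dropped here
    for simplicity; padding would be an equally valid choice."""
    H, W, D = volume.shape
    patches = []
    for i in range(0, H - patch + 1, patch):
        for j in range(0, W - patch + 1, patch):
            patches.append(volume[i:i + patch, j:j + patch, :])
    return patches
```

Each returned block corresponds to one piece of block space data (one training sample).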
FIG. 3 is a flowchart of the steps performed by the first processor in another model training system provided in an embodiment of the present disclosure. The present embodiment is optimized on the basis of the above-described block space data. In this embodiment, optionally, enhancing the block space data to obtain the first enhancement data and the second enhancement data may include: dividing the block space data to obtain at least two pieces of slice space data; and enhancing the at least two pieces of slice space data to obtain the first enhancement data and the second enhancement data. Explanations of terms that are the same as or correspond to those of the above embodiments are not repeated herein.
Referring to fig. 3, the first processor of the present embodiment may be configured to:
S310, acquiring a first data set and a self-supervised training model to be trained, wherein the first data set includes data, including position data, generated for each of at least two cells by applying spatial omics techniques, and the self-supervised training model includes a first encoder, a second encoder, and a contrast learning module.
S320, mapping the first data set into a three-dimensional space according to the position data of each cell to obtain three-dimensional space data, and cutting the three-dimensional space data to obtain at least two pieces of block space data.
S330, for each piece of block space data in the at least two pieces of block space data, dividing the block space data to obtain at least two pieces of slice space data.
Here, each piece of block space data in the at least two pieces of block space data may be divided to obtain at least two pieces of slice space data, each of which represents local information. In this way, when the at least two pieces of slice space data are enhanced, both the global information represented by the block space data (i.e., the global information of all slice space data under that block) and the local information represented by each slice can be exploited. Combining global and local information helps improve the model training effect, i.e., the deep features extracted by the trained feature extraction model better characterize the first data set. In practical applications, the global information and the local information may both be understood as semantic information.
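The division of one piece of block space data into slices can be sketched as a reshape; `split_into_tokens` and its argument names are illustrative:

```python
import numpy as np

def split_into_tokens(patch, token=10):
    """Split a (P, P, D) block into (P//token)**2 slices of shape
    (token, token, D), returned as an (n_tokens, token, token, D) array."""
    P, _, D = patch.shape
    n = P // token
    # carve the two spatial axes into (n, token) pairs, then group the
    # n x n grid positions together as the leading token axis
    t = patch.reshape(n, token, n, token, D).swapaxes(1, 2)
    return t.reshape(n * n, token, token, D)
```

With the sizes used later in the examples (a 100x100 block and 10x10 slices), this yields exactly 100 slices per block.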
S340, enhancing the at least two pieces of slice space data to obtain first enhancement data and second enhancement data, inputting the first enhancement data into the first encoder to obtain a first positive sample feature, and inputting the second enhancement data into the second encoder to obtain a second positive sample feature.
The at least two pieces of slice space data obtained by division may be taken as a whole and enhanced together, thereby obtaining the first enhancement data and the second enhancement data. In practical applications, the enhancement may be performed in various ways. For example, given the characteristics of the first data set, common image augmentations such as flipping, stretching, or color adjustment are not applicable. Instead, where the data includes protein data, a target dimension may be determined among the protein-number dimensions of the at least two pieces of slice space data, e.g., 0.5% of the protein-number dimensions selected at random; the data of the at least two pieces of slice space data in the target dimension is then perturbed to obtain the first enhancement data and the second enhancement data. Of course, the at least two pieces of slice space data may also be enhanced by other means, which is not limited herein.
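The perturbation-based enhancement described above can be sketched as follows; since the text only says the target dimensions are perturbed, the use of additive Gaussian noise here is an assumption:

```python
import numpy as np

def augment(tokens, frac=0.005, noise_std=1.0, rng=None):
    """Return one augmented view of `tokens` by perturbing a random subset of
    protein dimensions.  Call twice to get the two views used for contrast
    learning.

    tokens : (n_tokens, t, t, D) slice data
    frac   : fraction of the D protein dimensions to perturb (0.5% in the text)
    """
    rng = np.random.default_rng() if rng is None else rng
    D = tokens.shape[-1]
    k = max(1, int(round(frac * D)))
    dims = rng.choice(D, size=k, replace=False)
    view = tokens.astype(float).copy()
    # additive Gaussian noise on the chosen dimensions; the exact form of
    # the perturbation is not specified in the text, so this is illustrative
    view[..., dims] += rng.normal(0.0, noise_std, size=view[..., dims].shape)
    return view
```

Flipping or stretching would distort the spatial layout of cells, which is why a per-dimension perturbation is used instead.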
On this basis, optionally, considering the sparsity of the enhancement data, in order to ensure that the interactions inside the sparse enhancement data, i.e., the correlations between slice space data, can be learned, the first encoder and/or the second encoder may be implemented based on a Vision Transformer (ViT); that is, ViT is employed as the encoder instead of a convolutional structure, so that the correlations between slice space data can be learned through the self-attention mechanism.
S350, inputting the first positive sample feature, the second positive sample feature, and the negative sample feature constructed for the second positive sample feature into the contrast learning module to obtain a contrast learning result.
S360, adjusting model parameters of the self-supervised training model according to the contrast learning result to obtain a feature extraction model, wherein the feature extraction model is used to implement the segmentation of the at least two cells.
According to the above technical scheme, the block space data is divided to obtain at least two pieces of slice space data, so that subsequent enhancement can exploit both the global information represented by the block space data and the local information represented by each slice. This combination of global and local information helps the deep features extracted by the trained feature extraction model better characterize the first data set.
In an optional technical scheme, the self-supervised training model includes a first branch, a second branch, and a contrast learning module, wherein the first branch includes the first encoder and a first integration mapping module, and the second branch includes the second encoder and a second integration mapping module;
The first processor is further configured to:
acquiring the number of cells contained in each piece of slice space data in the at least two pieces of slice space data;
inputting the number and the first enhancement data into the first integration mapping module to obtain first mapping data, and inputting the number and the second enhancement data into the second integration mapping module to obtain second mapping data;
wherein inputting the first enhancement data into the first encoder to obtain the first positive sample feature and inputting the second enhancement data into the second encoder to obtain the second positive sample feature includes:
inputting the first mapping data into the first encoder to obtain the first positive sample feature, and inputting the second mapping data into the second encoder to obtain the second positive sample feature;
wherein the feature extraction model further includes the first integration mapping module after training is completed.
The self-supervised training model adopts a two-branch structure, and the two branches have the same composition, each comprising an integration mapping module and an encoder. The two branches are referred to herein as the first branch and the second branch; illustratively, the first branch includes the first encoder and the first integration mapping module, and the second branch includes the second encoder and the second integration mapping module. It should be noted that the first integration mapping module and the second integration mapping module are substantially the same: both are integration mapping (Embedding) modules, named differently only to distinguish the modules on the two branches, without limiting their essential meaning. The Embedding module may also be referred to as a fusion module.
When the block space data is divided, the number of cells contained in each piece of slice space data may differ, e.g., zero, one, or two, so the number of cells contained in each piece of slice space data may be acquired. The at least two pieces of slice space data and the per-slice cell counts are thus both available for use in the model training process. Further, to make subsequent feature extraction more computationally convenient, the above number and the first enhancement data may be input into the first integration mapping module for integration and mapping to obtain the first mapping data, and the above number and the second enhancement data may be input into the second integration mapping module to obtain the second mapping data. On this basis, the first mapping data and the second mapping data may be encoded separately for feature extraction. It should be noted that, since the Embedding module improves the computational convenience of subsequent feature extraction, the trained feature extraction model may also include the first integration mapping module.
On this basis, optionally, the first integration mapping module may implement data integration and data mapping as follows: summing the first enhancement data over the spatial height and width dimensions, and dividing the resulting sum by the above number to obtain integrated data; acquiring a predefined trainable parameter, and linearly mapping the integrated data based on the trainable parameter to obtain linear mapping data; and obtaining the first mapping data based on the integrated data and the linear mapping data.
Here, the first enhancement data is summed over the spatial height and width dimensions, and the resulting sum is divided by the number of cells contained in the corresponding piece of slice space data, thereby obtaining the integrated data. In practical applications, optionally, to keep the data distribution uniform, the integrated data may be layer-normalized (Layer Normalization) and updated based on the obtained normalization result.
Further, a predefined trainable parameter is acquired; optionally, the trainable parameter may relate to the protein-number dimension and a preset mapping dimension. The integrated data is then linearly mapped based on the trainable parameter to obtain the linear mapping data; for example, a vector of the mapping dimension is assigned to each protein-number dimension of the integrated data based on the trainable parameter. Still further, the first mapping data is obtained based on the integrated data and the linear mapping data; for example, the integrated data is multiplied by the linear mapping data and averaged, thereby obtaining the first mapping data.
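The integration-and-mapping computation described above can be sketched in plain NumPy; treating the lookup as a broadcast of the (D, embed_size) table and the handling of zero-cell slices are assumptions:

```python
import numpy as np

def layer_norm(x, eps=1e-5):
    """Normalize each row to zero mean and unit variance (Layer Normalization
    without learnable scale/shift, for illustration)."""
    mu = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mu) / np.sqrt(var + eps)

def embedding_module(tokens, counts, table):
    """Sketch of the integration mapping (Embedding) module.

    tokens : (n, t, t, D) slice data
    counts : (n,) number of cells per slice (zeros treated as 1 to avoid /0)
    table  : (D, embed_size) trainable lookup table
    Returns an (n, embed_size) representation.
    """
    # 1) sum over the spatial height/width dims, divide by the cell count
    e_x = tokens.sum(axis=(1, 2)) / np.maximum(counts, 1)[:, None]   # (n, D)
    e_x = layer_norm(e_x)
    # 2) "table lookup": each of the D dims has its own embed_size vector
    e_d = table[None, :, :]                                          # (1, D, embed_size)
    # 3) multiply, average over the protein dimension, then LayerNorm
    e = (e_x[:, :, None] * e_d).mean(axis=1)                         # (n, embed_size)
    return layer_norm(e)
```

In a real model the table would be a learned parameter and the LayerNorms would carry learnable scale and shift; both are omitted here to keep the sketch short.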
FIG. 4 is a flowchart of the steps performed by the first processor in another model training system provided by an embodiment of the present disclosure. The present embodiment is optimized on the basis of the slice space data described above. In this embodiment, optionally, after obtaining the first enhancement data and the second enhancement data, the first processor is further configured to: determine global semantic parameters of the block space data; and connect the global semantic parameters with the first enhancement data to obtain first connection data. Inputting the first enhancement data into the first encoder to obtain the first positive sample feature includes: inputting the first connection data into the first encoder to obtain the first positive sample feature. On this basis, optionally, the first processor is further configured to: connect the global semantic parameters with the second enhancement data to obtain second connection data. Inputting the second enhancement data into the second encoder to obtain the second positive sample feature may include: inputting the second connection data into the second encoder to obtain the second positive sample feature. Explanations of terms that are the same as or correspond to those of the above embodiments are not repeated herein.
Referring to fig. 4, the first processor of the present embodiment may be configured to:
S410, acquiring a first data set and a self-supervised training model to be trained, wherein the first data set includes data, including position data, generated for each of at least two cells by applying spatial omics techniques, and the self-supervised training model includes a first encoder, a second encoder, and a contrast learning module.
S420, mapping the first data set into a three-dimensional space according to the position data of each cell to obtain three-dimensional space data, and cutting the three-dimensional space data to obtain at least two pieces of block space data.
S430, for each piece of block space data in the at least two pieces of block space data, dividing the block space data to obtain at least two pieces of slice space data, and enhancing the at least two pieces of slice space data to obtain first enhancement data and second enhancement data.
S440, determining global semantic parameters of the block space data, connecting the global semantic parameters with the first enhancement data to obtain first connection data, and connecting the global semantic parameters with the second enhancement data to obtain second connection data.
The global semantic parameters can be understood as parameters that represent the global semantics of the block space data. In practical applications, optionally, they may be represented by a custom [CLS] tag; still optionally, the global semantic parameters may be updated during model training. Connecting the global semantic parameters with the first enhancement data yields first connection data carrying both global information and local information; correspondingly, connecting the global semantic parameters with the second enhancement data yields second connection data carrying both global information and local information.
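The connection step amounts to prepending a learnable [CLS] vector to the slice embeddings; a minimal sketch with illustrative names:

```python
import numpy as np

def prepend_cls(e, cls_token, pos_embed):
    """Prepend a learnable [CLS] vector to the slice embeddings and add
    positional encodings, producing the encoder input.

    e         : (n_tokens, embed_size) slice embeddings
    cls_token : (embed_size,) learnable global-semantics vector
    pos_embed : (n_tokens + 1, embed_size) positional encodings
    """
    x = np.concatenate([cls_token[None, :], e], axis=0)  # (n_tokens+1, embed)
    return x + pos_embed
```

The encoder output at position 0 then corresponds to the global ([CLS]) feature and the remaining positions to the local (slice) features.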
S450, inputting the first connection data into the first encoder to obtain a first positive sample feature, and inputting the second connection data into the second encoder to obtain a second positive sample feature.
Since the first connection data has both global information and local information, the first connection data may be encoded by the first encoder to obtain the first positive sample feature, so that the first positive sample feature also has both global information and local information. The second positive sample feature is similar.
S460, inputting the first positive sample feature, the second positive sample feature, and the negative sample feature constructed for the second positive sample feature into the contrast learning module to obtain a contrast learning result.
Since the first positive sample feature, the second positive sample feature, and the negative sample feature all carry both global and local information, contrast learning based on these three features can be performed both at the global level (i.e., between block space data) and at the local level (i.e., between slice space data). In other words, both the correlation of block space data in adjacent space and the correlation between slice space data can be considered, and the combination of global contrast learning and local contrast learning helps improve the model training effect.
S470, adjusting model parameters of the self-supervised training model according to the contrast learning result to obtain a feature extraction model, wherein the feature extraction model is used to implement the segmentation of the at least two cells.
According to the above technical scheme, the global semantic parameters are connected with the enhancement data, which represents local semantics, to obtain connection data carrying both global and local information. The positive sample features obtained by encoding the connection data therefore also carry both kinds of information, enabling the contrast learning module to perform contrast learning at both the global and local levels; this combination of global and local contrast learning helps improve the model training effect.
In an optional technical solution, the first positive sample feature includes a first positive global feature corresponding to the global semantic parameters and a first positive local feature corresponding to the first enhancement data;
the second positive sample features comprise second positive global features corresponding to global semantic parameters and second positive local features corresponding to second enhancement data;
the negative sample feature comprises a negative global feature corresponding to the second positive global feature and a negative local feature corresponding to the second positive local feature;
The contrast learning module performs contrast learning through the following steps:
forming a global positive sample pair based on the first positive global feature and the second positive global feature, forming a global negative sample pair based on the first positive global feature and the negative global feature, and performing contrast learning based on the global positive sample pair and the global negative sample pair to obtain global contrast learning loss;
forming a local positive sample pair based on the first positive local feature and the second positive local feature, forming a local negative sample pair based on the first positive local feature and the negative local feature, and performing contrast learning based on the local positive sample pair and the local negative sample pair to obtain local contrast learning loss;
and obtaining target contrast learning loss based on the global contrast learning loss and the local contrast learning loss, and taking the target contrast learning loss as a contrast learning result.
In other words, the global contrast learning loss can be obtained by constructing a global positive sample pair and a global negative sample pair and performing contrast learning on them; correspondingly, the local contrast learning loss can be obtained by constructing a local positive sample pair and a local negative sample pair and performing contrast learning on those. The global and local contrast learning losses are then combined to obtain the final target contrast learning loss, which is used as the contrast learning result to adjust the model parameters, thereby effectively combining global and local contrast learning.
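The combined global/local loss can be sketched with the standard InfoNCE form; the L2 normalization of features and the `eps`-weighted combination are assumptions consistent with the description (the example later sets the weight to 0.5):

```python
import numpy as np

def info_nce(q, k_pos, k_queue, tau):
    """InfoNCE loss for one query q: one positive key k_pos, negatives drawn
    from a queue.  Features are L2-normalized (cosine similarity), as is
    customary in MoCo-style training."""
    q = q / np.linalg.norm(q)
    k_pos = k_pos / np.linalg.norm(k_pos)
    k_queue = k_queue / np.linalg.norm(k_queue, axis=1, keepdims=True)
    pos = np.exp(np.dot(q, k_pos) / tau)
    neg = np.exp(k_queue @ q / tau).sum()
    return -np.log(pos / (pos + neg))

def target_loss(q_cls, k_cls, queue_cls, q_tok, k_tok, queue_tok,
                tau1=0.07, tau2=0.03, eps=0.5):
    """Weighted sum of the [CLS]-level (global) and token-level (local)
    contrast losses, with separate temperatures and negative queues."""
    l_cls = info_nce(q_cls, k_cls, queue_cls, tau1)
    l_tok = np.mean([info_nce(q, k, queue_tok, tau2)
                     for q, k in zip(q_tok, k_tok)])
    return eps * l_cls + (1 - eps) * l_tok
```

The two temperatures and queue sizes are kept separate because the global and local tasks operate on features of different granularity.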
FIG. 5 is a flowchart of the steps performed by the second processor in a cell segmentation system provided in an embodiment of the present disclosure. This embodiment may be applied to cell segmentation of the first data set through self-supervision and clustering. The steps may be performed by a cell segmentation apparatus, which may be implemented in software and/or hardware and integrated in a second processor; the second processor may be the same as or different from the first processor described above.
Referring to fig. 5, a cell segmentation system provided by an embodiment of the present disclosure may include a second processor that may be configured to:
S510, acquiring a feature extraction model trained by the model training system provided by any embodiment of the present disclosure, and the first data set applied in the training process of the feature extraction model, wherein the first data set includes data generated for each of at least two cells by applying spatial omics techniques.
S520, inputting the first data set into a feature extraction model to obtain the cell features of each cell, and clustering the cell features of each cell to obtain the segmentation result of at least two cells.
According to the above technical scheme, the feature extraction model obtained through self-supervised training is acquired, and the cell features of each cell contained in the first data set are extracted based on it; since the training process ensures that the extracted cell features are deep features, accurate segmentation of the at least two cells can be achieved by clustering the cell features of each cell.
In an alternative solution, the data includes position data of each cell, and the training process uses the position data of each cell;
inputting the first data set into a feature extraction model to obtain the cell features of each cell, and clustering the cell features of each cell to obtain the segmentation result of at least two cells, wherein the method comprises the following steps:
mapping the first data set into a three-dimensional space according to the position data of each cell to obtain three-dimensional space data, and inputting the three-dimensional space data into a feature extraction model to obtain the cell feature of each cell;
for each of the cell characteristics of each cell, determining a target cell corresponding to the cell characteristic from at least two cells based on the position data corresponding to the cell characteristic and the position data of each cell, and assigning the cell characteristic to the target cell;
Clustering the cell characteristics of each cell, and obtaining a segmentation result of at least two cells according to the obtained clustering result and the distribution result of the cell characteristics of each cell.
When the training process of the feature extraction model involves the position data of each cell, each cell feature among all cell features can be assigned to the corresponding cell (i.e., the target cell) according to the position data corresponding to that cell feature and the position data of each cell; the target cell can be understood as the cell possessing that cell feature. Further, the cell features of each cell are clustered, and the segmentation result of the at least two cells is obtained from the clustering result and the assignment result of the cell features (i.e., the cells to which the cell features respectively belong).
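The assignment-then-cluster step can be sketched as follows; the minimal k-means implementation (with farthest-point initialization) stands in for whatever clustering routine the system actually uses, and all names are illustrative:

```python
import numpy as np

def kmeans(features, k, iters=50):
    """Minimal k-means with deterministic farthest-point initialization."""
    centers = [features[0]]
    for _ in range(1, k):
        # next center: the point farthest from all chosen centers
        d = np.min([((features - c) ** 2).sum(-1) for c in centers], axis=0)
        centers.append(features[int(np.argmax(d))])
    centers = np.stack(centers)
    for _ in range(iters):
        labels = np.argmin(
            ((features[:, None, :] - centers[None, :, :]) ** 2).sum(-1), axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = features[labels == j].mean(axis=0)
    return labels

def segment_cells(cell_features, k):
    """cell_features: dict cell_id -> feature vector (already assigned to its
    target cell by position); returns dict cell_id -> cluster label."""
    ids = sorted(cell_features)
    feats = np.stack([cell_features[i] for i in ids])
    labels = kmeans(feats, k)
    return dict(zip(ids, labels.tolist()))
```

The cluster label of each cell is the segmentation result: cells sharing a label are taken to belong to the same type.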
In order to better understand the above-described respective technical solutions as a whole, an exemplary description thereof is given below in conjunction with specific examples. By way of example only, and not by way of limitation,
1) Data preprocessing, see fig. 6a:
given protein data X ε R N×(D+2) Where N is the number of cells contained in X, D is the number of proteins contained in each cell (fig. 6a is exemplified by d=46), d+ 2 represents the sum of the number of proteins and the position data (x, y). Converting X into three-dimensional space data X' E R according to (X, y) H×W×D Where H is the high after mapping and W is the wide after mapping, both of which can be determined from the maximum of all (x, y). Then, X' is cut into n non-overlapping diced spatial data (Patch), denoted as P i ∈R 100×100×D I=1, 2, …, n, P can be i Processed as one sample data. P (P) i Further divided into 100 pieces of slice space data (token) with length and width of 10, denoted as T E R 100×10×10×D And the number of cells contained in each token is recorded separately, denoted C.epsilon.R 100 ,C i The number of (C) ranges from 0 to 2 i ∈(0,2]I=1, …,100. For T, randomly selecting 0.5% of the dimension in D dimension (i.e. protein number dimension) for random interference to generate enhanced samples corresponding to token (i.e. first enhanced data Q Input And second enhancement data K Input )。
2) An Embedding module, see fig. 6b:
After data preprocessing, T and the corresponding C are input into the Embedding module. First, each T_i is summed over the spatial height and width dimensions and then divided by the corresponding C_i, i = 1, …, 100. To keep the data distribution uniform, Layer Normalization is then applied:

e_x = LayerNorm(e_x),

where e_x ∈ R^(100×D). Next, a trainable parameter of shape (D, embed_size) is defined so that each of the D dimensions of e_x is assigned an embed_size-dimensional vector, and an Embedding operation (table lookup) is performed, i.e., the index of each dimension of D is looked up in the trainable parameter and the corresponding vector is returned:

e_d = Embedding(e_x),

where e_d ∈ R^(100×D×embed_size). Finally, e_x and e_d are multiplied and averaged, and the result, after Layer Normalization, serves as the output of the Embedding module:

e = LayerNorm(e),

where e ∈ R^(100×embed_size) (FIG. 6b uses embed_size = 128).
3) ViT module (i.e. encoder), see fig. 6b:
A custom [CLS] tag is prepended at the first position of the output e to represent the global semantics of the current Patch; corresponding position codes are then assigned and added to e as the input of the encoder. The encoder here uses ViT (denoted V) to model the local correlations between the tokens within the Patch:

e = Concat([CLS], e),
X_e = V(e),

where X_e ∈ R^(101×embed_size).
4) Head Projection module, see FIG. 6b:
X_e is further nonlinearly encoded by the Head Projection module (H), whose specific structure may consist of the following units: a linear mapping, then a ReLU, and finally another linear mapping. X_e is input into the Head Projection module; the first position corresponds to [CLS], and the remaining positions correspond to the encodings of the tokens:

Y_CLS = H(X_e(:, 0, :)),
Y_tokens = H(X_e(:, 1:, :)),

where Y_CLS ∈ R^d and Y_tokens ∈ R^(100×d), with d the output dimension of the Head Projection module. The feature-vector representation of each cell obtained so far (i.e., the positive sample features) is denoted Y ∈ R^(N×d):

Y = f(X′).
It should be noted that f adopts a two-branch structure; each branch encapsulates an Embedding module, a ViT module, and a Head Projection module. The two branches are denoted f_Q (similar to the first branch above) and f_K (similar to the second branch above), with identical module structures. During the backward update, f_Q is updated by gradients, while f_K is updated only by momentum:

θ_k = m·θ_k + (1 − m)·θ_q,  m = 0.999,

where θ denotes the model parameters of f.
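The momentum (exponential moving average) update of the key branch can be sketched parameter-wise; the dict-of-parameters representation is illustrative:

```python
def momentum_update(theta_k, theta_q, m=0.999):
    """EMA update of the key branch f_K from the query branch f_Q:
    theta_k <- m * theta_k + (1 - m) * theta_q, applied per parameter."""
    return {name: m * theta_k[name] + (1 - m) * theta_q[name]
            for name in theta_k}
```

With m close to 1, f_K changes slowly, which keeps the negative features stored in the queues consistent across training rounds.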
For the first enhancement data Q_Input and the second enhancement data K_Input set forth above, the two branches respectively process the two enhancement inputs to generate the Patch's first positive global feature q_CLS and second positive global feature k_CLS, and the tokens' first positive local features q_tokens and second positive local features k_tokens:

q_CLS, q_tokens = f_Q(Q_Input),
k_CLS, k_tokens = f_K(K_Input),

where q_CLS, k_CLS ∈ R^d and q_tokens, k_tokens ∈ R^(100×d). q_CLS and k_CLS form a global positive sample pair, and q_tokens and k_tokens form a local positive sample pair.
5) MoCo module (i.e. contrast learning module), see fig. 6c:
Two negative sample queues are constructed, denoted k_queue1 and k_queue2. k_queue1 stores the k_CLS produced under the current f_K branch, which act as negative global features in the next round of training; q_CLS and k_queue1 form a global negative sample pair. k_queue2 stores the k_tokens produced under the current f_K branch, which act as negative local features in the next round of training; q_tokens and k_queue2 form a local negative sample pair. Each queue is updated first-in first-out: the features stored earliest are evicted and the current features are enqueued. Finally, two contrast tasks at different levels are performed: [CLS]-based and token-based contrast learning.
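The first-in-first-out negative queues can be sketched with a bounded deque; the class name and array layout are illustrative:

```python
from collections import deque
import numpy as np

class NegativeQueue:
    """FIFO queue of key features used as negatives, MoCo-style: each round
    the oldest entries are evicted and the current f_K features enqueued."""
    def __init__(self, maxlen):
        self.buf = deque(maxlen=maxlen)  # deque drops the oldest automatically

    def enqueue(self, feats):
        for f in np.atleast_2d(feats):
            self.buf.append(np.asarray(f, dtype=float))

    def as_array(self):
        return np.stack(self.buf) if self.buf else np.empty((0,))
```

Two such queues would be kept, one for k_CLS features and one for k_tokens features, with the lengths given in the parameter settings below.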
The loss function employed in this example is the InfoNCE loss, which consists of two parts: a [CLS]-based contrast learning loss L_CLS and a token-based contrast learning loss L_token. For a query q with positive key k+ and a queue of negative keys k−, each part takes the standard InfoNCE form

L = −log( exp(q·k+/τ) / ( exp(q·k+/τ) + Σ_{k−} exp(q·k−/τ) ) ),

and the final target loss is a weighted sum of the two:

L_target = ε·L_CLS + (1 − ε)·L_token,

where τ1 and τ2 are the temperature coefficients used to adjust L_CLS and L_token, respectively; k1 and k2 are the sizes of k_queue1 and k_queue2, respectively; and ε is set to 0.5.
6) Clustering
After contrast learning, the learned cell features of each cell are clustered for segmentation, which belongs to the downstream task. First, the learned token feature vector is fused with the corresponding [CLS] feature vector to form the final feature vector of the token; then, according to the (x, y) corresponding to the token, the feature vector is mapped back to the original position, the corresponding cell is found, and the final feature vector is assigned to it; next, KMeans is used for cluster segmentation.
To verify the validity of the above examples, experiments were performed based on CRC data and TNBC data, and the experimental procedure and experimental results are as follows:
data set and preprocessing method:
CRC data: data of 35 advanced colorectal cancer (CRC) patients obtained by CODEX, containing 140 tissue regions, 56 protein markers, and corresponding position data. The full data set contains 238,385 cells; after removing the functional protein markers, 46 proteins remain. In the preprocessing stage, the data is split by tissue region, mapped into a three-dimensional space, and diced: first cut into 100×100 Patches, then divided into 10×10 tokens, and finally data enhancement is applied.
TNBC data: data of 41 triple-negative breast cancer (TNBC) patients obtained by MIBI-TOF. The functional protein markers are first removed; the remaining 27 protein channels and corresponding position data are mapped into three-dimensional space data, which is then first cut into 128×128 Patches, then divided into 8×8 tokens, and finally data enhancement is applied.
Parameter setting: the encoding part uses 8 Transformer Blocks with 8 attention heads, and the Embedding layer and Head Projection layer dimensions are both set to 128. The length of the negative-sample queue storing [CLS] features is set to 64000, and the length of the negative-sample queue storing token features is set to 65536; τ1 and τ2 are set to 0.07 and 0.03, respectively. Training uses a batch size of 32 and a learning rate of 1e-6. Parameters are updated with the Adam optimizer over about 100 training epochs, and the model with the smaller loss is selected.
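Collected into one configuration sketch (the key names are illustrative; the values are the ones stated above):

```python
# Hyperparameters reported for the experiments; key names are illustrative.
config = {
    "transformer_layers": 8,
    "attention_heads": 8,
    "embedding_dim": 128,
    "head_projection_dim": 128,
    "cls_queue_len": 64000,    # negative-sample queue for [CLS] features
    "token_queue_len": 65536,  # negative-sample queue for token features
    "tau1": 0.07,              # temperature for the [CLS] loss
    "tau2": 0.03,              # temperature for the token loss
    "batch_size": 32,
    "learning_rate": 1e-6,
    "optimizer": "Adam",
    "epochs": 100,
}
```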
Training process: observe how the loss decreases, and select and save the model parameters of the better-performing encoder f.
The prediction process comprises the following steps: load the stored model parameters and retain only the module structure and parameters under the f_Q (query-encoder) branch, removing the Head Projection module, so that the output is just what the ViT module produces, i.e. the [CLS] feature vector and the feature vector of each token. The encoding vectors at padded positions are then masked, and for each token its feature vector, the corresponding [CLS] feature vector, its (x, y) coordinates, and the corresponding tissue-region encoding are recorded. The recorded data are then further processed: based on the mapping between (x, y) and the cell's original location, the corresponding cell is found, and the token's feature vector together with the corresponding [CLS] feature vector is assigned to it. Finally, the token feature vector and the [CLS] feature vector are summed according to the ratio to obtain the final feature vector, and KMeans clustering is applied to the final feature vectors to obtain the final segmentation result.
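The map-back-and-assign step can be sketched as below; the nearest-neighbour match and the 0.5 summing ratio are assumptions, since the text only says the cell is found via the mapping between (x, y) and its original location and that the two vectors are summed according to a ratio:

```python
import numpy as np

def assign_tokens_to_cells(token_xy, token_feats, cls_feats, cell_xy, ratio=0.5):
    """For each token, find the cell whose original (x, y) location is closest
    (nearest-neighbour matching is an assumption) and hand that cell a final
    feature vector: a ratio-weighted sum of the token and [CLS] features."""
    cell_feats = {}
    for (x, y), tf, cf in zip(token_xy, token_feats, cls_feats):
        d = np.linalg.norm(cell_xy - np.array([x, y], dtype=float), axis=1)
        cell_feats[int(d.argmin())] = ratio * tf + (1.0 - ratio) * cf
    return cell_feats
```

The returned mapping from cell index to final feature vector is what the KMeans clustering step consumes.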
Experimental analysis: the above experiments outline classical follicular structures, verify enriched T cell-macrophage interactions, and can extract the bounded layout of tertiary lymphoid structures without supervision, indicating classical Crohn's-like reaction behavior.
The embodiment of the disclosure provides a model training method. The embodiment is applicable to the case of obtaining a feature extraction model for extracting deep features of a first dataset through self-supervised learning, and is particularly applicable to the case of obtaining the feature extraction model through contrast learning. The method may be performed by a model training apparatus, which may be implemented in software and/or hardware, which may be integrated in the first processor as set forth in any of the embodiments described above.
The method of the embodiment of the disclosure specifically comprises the following steps:
S610, acquiring a first data set and a self-supervision training model to be trained, wherein the first data set comprises data produced by applying a space histology principle for each cell in at least two cells, and the self-supervision training model comprises a first encoder, a second encoder and a contrast learning module.
S620, enhancing the first data set to obtain first enhanced data and second enhanced data, inputting the first enhanced data to a first encoder to obtain first positive sample characteristics, and inputting the second enhanced data to a second encoder to obtain second positive sample characteristics.
S630, inputting the first positive sample feature, the second positive sample feature and the negative sample feature constructed by aiming at the second positive sample feature into a contrast learning module to obtain a contrast learning result.
S640, adjusting model parameters of the self-supervision training model according to the contrast learning result to obtain a feature extraction model, wherein the feature extraction model is used for realizing the segmentation process of the at least two cells.
According to the technical scheme of this embodiment of the disclosure, for the first encoder, second encoder and contrast learning module in the self-supervision training model to be trained, the first data set is enhanced; the resulting first enhanced data are input to the first encoder to obtain first positive sample features, and the resulting second enhanced data are input to the second encoder to obtain second positive sample features, the two positive sample features forming a positive sample pair. Negative sample features that can form a negative sample pair with the second positive sample features are then input, together with the two positive sample features, to the contrast learning module, which performs contrast learning over the positive and negative sample pairs to produce a contrast learning result. The model parameters of the self-supervision training model are then adjusted based on this result, yielding the trained feature extraction model. With a feature extraction model obtained through contrast learning, deep features of the data set can be extracted; compared with the original first data set or shallow features extracted from it, the deep features contain less interfering information, so better segmentation precision can be obtained when cell segmentation is performed on them.
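This arrangement, in which one encoder produces queries and the other produces keys whose history serves as the negatives, resembles momentum-contrast training; a sketch under that assumption (the EMA update rule and the FIFO queue below are not spelled out in the text):

```python
import numpy as np

def momentum_update(q_params, k_params, m=0.999):
    """Assumed MoCo-style rule: the second (key) encoder slowly tracks the
    first (query) encoder via an exponential moving average of parameters."""
    return [m * k + (1.0 - m) * q for q, k in zip(q_params, k_params)]

class NegativeQueue:
    """FIFO queue holding second-encoder features from earlier training
    rounds; its contents serve as the negative-sample features for the
    current round."""
    def __init__(self, max_len, dim):
        self.max_len = max_len
        self.feats = np.zeros((0, dim))

    def enqueue(self, keys):
        # Append the newest keys and drop the oldest beyond max_len.
        self.feats = np.concatenate([self.feats, keys])[-self.max_len:]
```

Keeping the negatives in a queue decouples the number of negatives from the batch size, which is why the queue lengths (64000 and 65536 in the experiments) can far exceed the batch size of 32.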
Embodiments of the present disclosure provide a cell segmentation method. This embodiment may be applied in the case of cell segmentation of the first dataset by self-supervision and clustering. The method may be performed by a cell segmentation apparatus, which may be implemented in software and/or hardware, which may be integrated in a second processor as set forth in any of the embodiments described above.
The method of the embodiment of the disclosure specifically comprises the following steps:
S710, acquiring a feature extraction model obtained through training by the model training method provided by any embodiment of the disclosure, and a first data set applied in the training process of the feature extraction model, wherein the first data set comprises data generated by applying the principle of space histology for each cell in at least two cells.
S720, inputting the first data set into the feature extraction model to obtain the cell features of each cell, and clustering the cell features of each cell to obtain the segmentation result of at least two cells.
In this technical scheme, the feature extraction model obtained through self-supervision training is acquired, and the cell features of each cell contained in the first data set are extracted with it; because the training process makes the extracted cell features deep features, accurate segmentation of the at least two cells can be realized by clustering the cell features of each cell.
As used herein, the term "sequencer" generally refers to a device used to determine the sequence of the genetic material of a sample. The sequencer can function in a variety of ways and based on a variety of techniques, including sequencing by primer extension using labeled or unlabeled nucleotides, sequencing by ligation, pyrosequencing, etc., for example using any of the Sanger dideoxy methods, nanopore sequencing, or "NexGen" sequencing methods of the art (e.g., MGI sequencing platforms, the ROCHE 454 sequencing platform, the ILLUMINA SOLEXA sequencing platform, the SOLiD sequencing platform of LIFE TECHNOLOGIES/APPLIED BIOSYSTEMS, the SMRT sequencing platform of PACIFIC BIOSCIENCES, the POLONATOR Polony sequencing platform, the COMPLETE GENOMICS sequencing platform, the INTELLIGENT BIOSYSTEMS sequencing platform, the HELICOS sequencing platform, or any other sequencer and system in the art).
Real-time fluorescent quantitative PCR (Quantitative Real-time PCR, QPCR) is a method that uses fluorescent chemistry to measure the total amount of product after each Polymerase Chain Reaction (PCR) cycle in a DNA amplification reaction, and quantitatively analyzes a specific DNA sequence in the sample under test by an internal-reference or external-reference method.
Immunofluorescence is a technique based on immunological, biochemical and microscopic methods. According to the principle of the antigen-antibody reaction, a known antigen or antibody is first labeled with a fluorescent group, and the fluorescent antibody (or antigen) is then used as a probe to detect the corresponding antigen (or antibody) in cells or tissues. The cells or tissues where the fluorescence is located can be visualized with a fluorescence microscope to determine the nature and location of the antigen or antibody, and the content can be determined using quantitative techniques such as flow cytometry. However, when detection is realized with optical-fiber conduction, the fiber attenuates the light, the conduction path is long, and the structure is complex.
The above model training system or cell segmentation system may be implemented in electronic devices intended to represent various forms of digital computers, such as laptop computers, desktop computers, workstations, personal digital assistants, servers, blade servers, mainframe computers, sequencers, gene sequencing systems, large population genomics one-stop technology platforms, PCR instruments, spatiotemporal-histology integrated machines, laboratory automation systems, sample preparation devices, dispensing devices, library production devices, pipetting devices, bead detection devices, nucleic acid purification devices, and other suitable computers. The electronic device may also represent various forms of mobile devices, such as personal digital processing, cellular telephones, smartphones, wearable devices, and other similar computing devices. The components shown herein, their connections and relationships, and their functions, are meant to be exemplary only, and are not meant to limit implementations of the disclosure described and/or claimed herein.
By way of example, an electronic device 10 as shown in fig. 7 includes at least one processor 11, and a memory, such as a Read Only Memory (ROM) 12, a Random Access Memory (RAM) 13, etc., communicatively coupled to the at least one processor 11, in which the memory stores computer programs executable by the at least one processor, the processor 11 may perform various suitable actions and processes in accordance with the computer programs stored in the Read Only Memory (ROM) 12 or the computer programs loaded from the storage unit 18 into the Random Access Memory (RAM) 13. In the RAM 13, various programs and data required for the operation of the electronic device 10 may also be stored. The processor 11, the ROM 12 and the RAM 13 are connected to each other via a bus 14. An input/output (I/O) interface 15 is also connected to bus 14.
Various components in the electronic device 10 are connected to the I/O interface 15, including: an input unit 16 such as a keyboard, a mouse, etc.; an output unit 17 such as various types of displays, speakers, and the like; a storage unit 18 such as a magnetic disk, an optical disk, or the like; and a communication unit 19 such as a network card, modem, wireless communication transceiver, etc. The communication unit 19 allows the electronic device 10 to exchange information/data with other devices via a computer network, such as the internet, and/or various telecommunication networks.
The processor 11 is a computing unit and may be any of various general and/or special purpose processing components with processing and computing capabilities. Some examples of processor 11 include, but are not limited to, a central processing unit (CPU), a graphics processing unit (GPU), various specialized artificial intelligence (AI) computing chips, various computing units running machine learning model algorithms, a digital signal processor (DSP), and any suitable processor, controller, microcontroller, etc. The processor 11 may perform the various methods and processes described above, such as the model training method or the cell segmentation method.
In some embodiments, the model training method or the cell segmentation method may be implemented as a computer program tangibly embodied on a computer-readable storage medium, such as the storage unit 18. In some embodiments, part or all of the computer program may be loaded and/or installed onto the electronic device 10 via the ROM 12 and/or the communication unit 19. When the computer program is loaded into RAM 13 and executed by processor 11, one or more steps of the model training method or cell segmentation method described above may be performed. Alternatively, in other embodiments, the processor 11 may be configured to perform the model training method or the cell segmentation method by any other suitable means (e.g., by means of firmware).
Various implementations of the systems and techniques described above may be implemented in digital electronic circuitry, integrated circuit systems, FPGAs (Field Programmable Gate Arrays), ASICs (Application-Specific Integrated Circuits), ASSPs (Application-Specific Standard Products), SOCs (Systems On Chip), CPLDs (Complex Programmable Logic Devices), computer hardware, firmware, software, and/or combinations thereof. These various embodiments may include implementation in one or more computer programs that may be executed and/or interpreted on a programmable system including at least one programmable processor, which may be a special- or general-purpose programmable processor that can receive data and instructions from, and transmit data and instructions to, a storage system, at least one input device, and at least one output device.
Program code for carrying out methods of the present disclosure may be written in any combination of one or more programming languages. These program code may be provided to a processor or controller of a general purpose computer, special purpose computer, or other programmable data processing apparatus such that the program code, when executed by the processor or controller, causes the functions/operations specified in the flowchart and/or block diagram to be implemented. The program code may execute entirely on the machine, partly on the machine, as a stand-alone software package, partly on the machine and partly on a remote machine or entirely on the remote machine or server.
In the context of this disclosure, a computer-readable storage medium may be a tangible medium that can contain or store a program for use by or in connection with an instruction execution system, apparatus, or device. The computer-readable storage medium may be a machine-readable signal medium or a machine-readable storage medium, and may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, RAM, ROM, EPROM (Erasable Programmable Read-Only Memory) or flash memory, an optical fiber, a CD-ROM (Compact Disc Read-Only Memory), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
To provide for interaction with a user, the systems and techniques described here can be implemented on an electronic device having: a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to a user; and a keyboard and a pointing device (e.g., a mouse or a trackball) through which a user can provide input to the electronic device. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user may be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user may be received in any form, including acoustic input, speech input, or tactile input.
The systems and techniques described here can be implemented in a computing system that includes a background component (e.g., as a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such background, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include: local Area Networks (LANs), wide Area Networks (WANs), blockchain networks, and the internet.
The computing system may include clients and servers. The client and server are typically remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other. The server can be a cloud server, also called a cloud computing server or a cloud host, and is a host product in a cloud computing service system, so that the defects of high management difficulty and weak service expansibility in the traditional physical hosts and VPS service are overcome.
It should be appreciated that various forms of the flows shown above may be used to reorder, add, or delete steps. For example, the steps recited in the present disclosure may be performed in parallel, sequentially, or in a different order, provided that the desired results of the technical solutions of the present disclosure are achieved, and are not limited herein.
The above detailed description should not be taken as limiting the scope of the present disclosure. It will be apparent to those skilled in the art that various modifications, combinations, sub-combinations and alternatives are possible, depending on design requirements and other factors. Any modifications, equivalent substitutions and improvements made within the spirit and principles of the present disclosure are intended to be included within the scope of the present disclosure.

Claims (16)

1. A model training system, comprising:
a first processor configured to:
acquiring a first data set and a self-supervision training model to be trained, wherein the first data set comprises data produced by applying a space histology principle for each cell in at least two cells, and the self-supervision training model comprises a first encoder, a second encoder and a comparison learning module;
Enhancing the first data set to obtain first enhancement data and second enhancement data, inputting the first enhancement data to the first encoder to obtain first positive sample characteristics, and inputting the second enhancement data to the second encoder to obtain second positive sample characteristics;
inputting the first positive sample characteristics, the second positive sample characteristics and the negative sample characteristics constructed for the second positive sample characteristics into the contrast learning module to obtain contrast learning results;
and adjusting model parameters of the self-supervision training model according to the comparison learning result to obtain a feature extraction model, wherein the feature extraction model is used for realizing the segmentation process of the at least two cells.
2. The system of claim 1, wherein the second positive sample feature is obtained during a current round of training of the self-supervised training model based on the second enhancement data;
the first processor is further configured to:
acquiring a history sample feature obtained in a training process of performing a history round on the self-supervision training model based on the second enhancement data, wherein the history round occurs before the current round;
And constructing and obtaining the negative sample characteristic based on the history sample characteristic.
3. The system of claim 1, wherein the data generated using the principles of space histology includes location data for each of the cells, and wherein the enhancing the first data set to obtain first enhanced data and second enhanced data comprises:
mapping the first data set to a three-dimensional space according to the position data of each cell to obtain three-dimensional space data;
and enhancing the three-dimensional space data to obtain first enhancement data and second enhancement data.
4. The system of claim 3, wherein the enhancing the three-dimensional spatial data to obtain first enhancement data and second enhancement data comprises:
cutting the three-dimensional space data to obtain at least two pieces of block space data;
and enhancing the block space data aiming at each block space data in the at least two block space data to obtain first enhancement data and second enhancement data.
5. The system of claim 4, wherein enhancing the diced space data to obtain first enhancement data and second enhancement data comprises:
Dividing the dicing space data to obtain at least two pieces of dicing space data;
and enhancing the at least two slice space data to obtain first enhancement data and second enhancement data.
6. The system of claim 5, wherein after the obtaining the first enhancement data and the second enhancement data, the first processor is further configured to:
determining global semantic parameters of the block space data;
connecting the global semantic parameters with the first enhancement data to obtain first connection data;
said inputting said first enhancement data to said first encoder resulting in a first positive sample characteristic comprising: and inputting the first connection data to the first encoder to obtain a first positive sample characteristic.
7. The system according to claim 6, wherein:
the first positive sample feature comprises a first positive global feature corresponding to the global semantic parameter and a first positive local feature corresponding to the first enhancement data;
the second positive sample feature comprises a second positive global feature corresponding to the global semantic parameter and a second positive local feature corresponding to the second enhancement data;
The negative sample feature includes a negative global feature corresponding to the second positive global feature and a negative local feature corresponding to the second positive local feature;
the contrast learning module performs contrast learning through the following steps:
forming a global positive sample pair based on the first positive global feature and the second positive global feature, forming a global negative sample pair based on the first positive global feature and the negative global feature, and performing contrast learning based on the global positive sample pair and the global negative sample pair to obtain global contrast learning loss;
forming a local positive sample pair based on the first positive local feature and the second positive local feature, forming a local negative sample pair based on the first positive local feature and the negative local feature, and performing contrast learning based on the local positive sample pair and the local negative sample pair to obtain local contrast learning loss;
and obtaining target contrast learning loss based on the global contrast learning loss and the local contrast learning loss, and taking the target contrast learning loss as the contrast learning result.
8. The system of claim 5, wherein the self-supervised training model comprises a first branch, a second branch, and the contrast learning module, the first branch comprising the first encoder and a first integration mapping module, the second branch comprising the second encoder and a second integration mapping module;
The first processor is further configured to:
acquiring the number of cells contained in each slice space data in the at least two slice space data;
inputting the quantity and the first enhancement data to the first integration mapping module to obtain first mapping data, and inputting the quantity and the second enhancement data to the second integration mapping module to obtain second mapping data;
the inputting the first enhancement data to the first encoder to obtain a first positive sample feature, and the inputting the second enhancement data to the second encoder to obtain a second positive sample feature, includes:
inputting the first mapping data to the first encoder to obtain a first positive sample characteristic, and inputting the second mapping data to the second encoder to obtain a second positive sample characteristic;
the feature extraction model further comprises the first integration mapping module after training is completed.
9. The system of claim 8, wherein the first integration mapping module performs data integration and data mapping by:
summing the first enhancement data in a space long dimension and a space wide dimension, and dividing the obtained summation result by the number to obtain integrated data;
Acquiring a predefined trainable parameter, and performing linear mapping on the integrated data based on the trainable parameter to obtain linear mapping data;
and obtaining the first mapping data based on the integrated data and the linear mapping data.
10. The system of claim 5, wherein the data further comprises protein data for each cell, the enhancing the at least two slice spatial data to obtain first enhanced data and second enhanced data, comprising:
determining a target dimension of the protein number dimensions of the at least two slice spatial data;
and the data of the at least two slice space data in the target dimension are interfered to obtain first enhancement data and second enhancement data.
11. The system of claim 1, wherein the self-supervised training model includes a third branch, a fourth branch, and the contrast learning module, the third branch including the first encoder and first nonlinear encoding module, and the fourth branch including the second encoder and second nonlinear encoding module, the first processor further configured to:
Inputting the first positive sample characteristic into the first nonlinear coding module to obtain a first nonlinear characteristic, and inputting the second positive sample characteristic into the second nonlinear coding module to obtain a second nonlinear characteristic;
the step of inputting the first positive sample feature, the second positive sample feature and the negative sample feature constructed by aiming at the second positive sample feature to the contrast learning module to obtain a contrast learning result comprises the following steps:
and inputting the first nonlinear characteristic, the second nonlinear characteristic and the negative sample characteristic constructed aiming at the second nonlinear characteristic into the contrast learning module to obtain a contrast learning result.
12. A cell segmentation system, comprising:
a second processor configured to:
acquiring a feature extraction model trained by a model training system according to any one of claims 1-11 and a first data set applied in the training process of the feature extraction model, wherein the first data set comprises data produced by applying the principle of space histology for each of at least two cells;
inputting the first data set into the feature extraction model to obtain the cell features of each cell, and clustering the cell features of each cell to obtain the segmentation result of the at least two cells.
13. The system of claim 12, wherein the data comprises location data for each cell, the training process applying the location data for each cell;
inputting the first data set into the feature extraction model to obtain the cell feature of each cell, and clustering the cell feature of each cell to obtain the segmentation result of the at least two cells, wherein the method comprises the following steps:
mapping the first data set into a three-dimensional space according to the position data of each cell to obtain three-dimensional space data, and inputting the three-dimensional space data into the feature extraction model to obtain the cell feature of each cell;
for each cell characteristic in the cell characteristics of each cell, determining a target cell corresponding to the cell characteristic from the at least two cells according to the position data corresponding to the cell characteristic and the position data of each cell, and distributing the cell characteristic to the target cell;
clustering the cell characteristics of each cell, and obtaining the segmentation result of the at least two cells according to the obtained clustering result and the distribution result of the cell characteristics of each cell.
14. A method of model training, comprising:
acquiring a first data set and a self-supervision training model to be trained, wherein the first data set comprises data produced by applying a space histology principle for each cell in at least two cells, and the self-supervision training model comprises a first encoder, a second encoder and a comparison learning module;
enhancing the first data set to obtain first enhancement data and second enhancement data, inputting the first enhancement data to the first encoder to obtain first positive sample characteristics, and inputting the second enhancement data to the second encoder to obtain second positive sample characteristics;
inputting the first positive sample characteristics, the second positive sample characteristics and the negative sample characteristics constructed for the second positive sample characteristics into the contrast learning module to obtain contrast learning results;
and adjusting model parameters of the self-supervision training model according to the comparison learning result to obtain a feature extraction model, wherein the feature extraction model is used for realizing the segmentation process of the at least two cells.
15. A method of cell segmentation comprising:
Acquiring a feature extraction model trained according to the model training method of claim 14 and a first data set applied in the training process of the feature extraction model, wherein the first data set comprises data generated by applying the principle of space histology for each of at least two cells;
inputting the first data set into the feature extraction model to obtain the cell features of each cell, and clustering the cell features of each cell to obtain the segmentation result of the at least two cells.
16. A computer readable storage medium storing computer instructions for causing a processor to perform the model training method of claim 14 or the cell segmentation method of claim 15 when executed.
CN202310412653.XA 2023-04-10 2023-04-10 Model training, cell segmentation system, method and storage medium Pending CN116564401A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310412653.XA CN116564401A (en) 2023-04-10 2023-04-10 Model training, cell segmentation system, method and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310412653.XA CN116564401A (en) 2023-04-10 2023-04-10 Model training, cell segmentation system, method and storage medium

Publications (1)

Publication Number Publication Date
CN116564401A true CN116564401A (en) 2023-08-08

Family

ID=87487043

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310412653.XA Pending CN116564401A (en) 2023-04-10 2023-04-10 Model training, cell segmentation system, method and storage medium

Country Status (1)

Country Link
CN (1) CN116564401A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116756577A (en) * 2023-08-18 2023-09-15 摩尔线程智能科技(北京)有限责任公司 Model training method, device, equipment and storage medium
CN116756577B (en) * 2023-08-18 2024-02-02 摩尔线程智能科技(北京)有限责任公司 Model training method, device, equipment and storage medium

Similar Documents

Publication Publication Date Title
Xie et al. SuperCT: a supervised-learning framework for enhanced characterization of single-cell transcriptomic profiles
CN109376267B (en) Method and apparatus for generating a model
de Oliveira et al. Comparing co-evolution methods and their application to template-free protein structure prediction
Zhang et al. Critical downstream analysis steps for single-cell RNA sequencing data
CN112364880A (en) Omics data processing method, device, equipment and medium based on graph neural network
Zhan et al. A parameter estimation method for biological systems modelled by ode/dde models using spline approximation and differential evolution algorithm
US11942189B2 (en) Drug efficacy prediction for treatment of genetic disease
CN116564401A (en) Model training, cell segmentation system, method and storage medium
CN114444619B (en) Sample generation method, training method, data processing method and electronic device
Todorov et al. Computational approaches for high‐throughput single‐cell data analysis
WO2018083142A1 (en) Systems and methods for encoding image features of high-resolution digital images of biological specimens
CN108681697A (en) Feature selection approach and device
JP2023143742A (en) Method for training point cloud processing model, point cloud instance segmentation method and device
JP2021157809A (en) Methods and systems for generating noncoding-coding gene co-expression networks
KR102425673B1 (en) How to reorder sequencing data reads
Choi et al. Joint inference of population assignment and demographic history
Wang et al. A novel stochastic block model for network-based prediction of protein-protein interactions
Dong et al. Integrating single-cell datasets with ambiguous batch information by incorporating molecular network features
Yuan et al. Self-organizing maps for cellular in silico staining and cell substate classification
CN112259157A (en) Protein interaction prediction method
Xu et al. TrichomeYOLO: A Neural Network for Automatic Maize Trichome Counting
CN111597336A (en) Processing method and device of training text, electronic equipment and readable storage medium
CN113032547B (en) Big data processing method and system based on artificial intelligence and cloud platform
CN115410642A (en) Biological relation network information modeling method and system
Macnair et al. Tree‐ensemble analysis assesses presence of multifurcations in single cell data

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination