CN107389536B

CN107389536B - Flow cell particle classification counting method based on density-distance center algorithm

Info

Publication number: CN107389536B
Application number: CN201710641341.0A
Authority: CN
Inventors: 陶靖
Original assignee: Shanghai Nano Derivatives Technology Co Ltd
Current assignee: Shanghai Nano Derivatives Technology Co Ltd
Priority date: 2017-07-31
Filing date: 2017-07-31
Publication date: 2020-03-31
Anticipated expiration: 2037-07-31
Also published as: CN107389536A

Abstract

The invention relates to a flow cytometry particle classification and counting method based on a density-distance center algorithm, comprising the following steps: 1) acquiring, with a flow cytometer, a flow data set of the cell particles to be classified and counted, the flow data set containing multi-dimensional data of the particles; 2) computing the local density and distance parameters of each particle in the flow data set according to the density-distance center algorithm, then screening and sorting the particles to obtain the initial cluster centers; 3) using the initial cluster centers as the initial values of a mixture model algorithm, clustering the particles according to the mixture model to obtain several classified particle clusters, and counting them. Compared with the prior art, the method offers high accuracy, good stability, adaptability to the distribution of flow data, suitability for classifying small-sample particle populations, and high calculation speed.

Description

Flow cell particle classification counting method based on density-distance center algorithm

Technical Field

The invention relates to the field of cell particle classification measurement, in particular to a flow cell particle classification counting method based on a density-distance center algorithm.

Background

Flow cytometry (FCM) is a quantitative analysis technique performed with a flow cytometer. It uses hydrodynamic focusing to align the cells or microparticles under analysis in single file so that they pass rapidly through the detection beam one by one. A high-precision optical system, electronic signal processing and computer data analysis then measure the multi-angle scattered light and multicolor fluorescence produced by each cell or microparticle, yielding physical and chemical characteristics such as the size, internal structure, nucleic acid content and protein content of tens of thousands of cells or microparticles in a short time. Because it is fast, accurate, high-throughput and multi-parametric, the flow cytometer is an important basic instrument for advanced research in biomedical science; it is also an important clinical examination device.

Multi-angle scattered light and multicolor fluorescence caused by each cell or particle are collected by an optical system and converted into electric signals by a photoelectric sensor, the electric signals are processed and sampled into digital signals, and the digital signals are stored and analyzed by a computer; the characteristic data of all cells or particles acquired by the flow cytometer is called flow data.

Traditionally, the analysis of flow data relies on experienced personnel who project the data onto a two-dimensional scattergram and then draw region gates around the populations of interest in order to classify and count them; this is known as manual gating. As flow cytometry continues to develop, the volume of flow data has multiplied, and automatic data analysis has become a main direction for the future development of the technology. Several automatic methods for the cluster analysis of flow data have been proposed in succession; they fall mainly into clustering methods based on probability distributions and clustering methods based on spatial information.

Clustering methods based on probability distributions are mainly finite mixture model algorithms. A Gaussian mixture model algorithm based on the Bayesian information criterion handles cell populations whose data are normally or near-normally distributed; the t-distribution mixture model algorithm converts non-normally distributed data to a near-normal distribution and replaces the Gaussian mixture model for the cluster analysis of flow data; and the skew t-distribution mixture model algorithm handles asymmetrically distributed data better. These mixture model clustering algorithms continue to develop, and the adaptability of the models to different data distributions keeps improving. However, the solutions found by mixture models such as the Gaussian, t and skew t mixtures are only locally optimal, so a clustering algorithm based on a finite mixture model depends on the position of its initial points (that is, the initial cluster centers). Because real data are often complex, for example when there are many noise points, mixture model clustering can misclassify particles, and the stability of the algorithm is not high.

Clustering methods based on spatial information, such as the K-means algorithm and the DBSCAN algorithm, are the other main approach to analyzing flow data, but their ability to cluster flow data is limited. Clustering algorithms based on finite mixture models are better suited to flow data and are more widely used. Because such an algorithm depends on the position of its initial points (that is, the initial cluster centers), it is sensitive to the initial values of the model. Both K-means and mixture model clustering usually select the initial cluster centers at random, and it is customary to place the initial centers as far from one another as possible. However, K-means itself converges only to a local optimum, so a random initial value may still lead to a poor local optimum; the initial values of the model are difficult to select stably, and the accuracy and stability of the result cannot be guaranteed.

In practice, flow data are often complex, and various adverse conditions make cluster analysis challenging. For example, when there are many noise points, earlier methods sometimes mistakenly group the noise points into a separate cluster. In addition, earlier methods offer no good solution for clusters that have a small sample size and a sparse distribution. For example, in the classification of leukocytes in human peripheral blood, monocytes usually account for 2% to 10% of total leukocytes and eosinophils for 1% to 6%, whereas lymphocytes account for about 40% and granulocytes, the most predominant group, for about 50%. In such multi-class clustering, the large-sample and small-sample classes differ greatly in size yet lie close to one another, and the difficulty is locating and distinguishing the small-sample classes. Because of its small sample size and sparse distribution, a small-sample population is easily disturbed by an adjacent dominant population and wrongly assigned to it, so it places high demands on the discrimination and stability of the algorithm.

Disclosure of Invention

The invention aims to overcome the defects of the prior art and provide a flow cytometry particle classification counting method based on a density-distance center algorithm.

The purpose of the invention can be realized by the following technical scheme:

A flow cytometry particle classification counting method based on a density-distance center algorithm comprises the following steps:

1) acquiring, with a flow cytometer, a flow data set of the cell particles to be classified and counted, the flow data set containing multi-dimensional data of the particles;

2) computing the local density and distance parameters of each particle in the flow data set according to the density-distance center algorithm, then screening and sorting the particles to obtain the initial cluster centers;

3) taking the initial cluster centers as the initial values of a mixture model algorithm, clustering the particles according to the mixture model to obtain a plurality of classified particle clusters, and counting them.

In step 1), when the data in the flow data set are two-dimensional, a two-dimensional scattergram is formed with the forward scattered light channel on the y-axis and the side scattered light channel on the x-axis, or with the side scattered light channel on the y-axis and the fluorescence channel on the x-axis; when the data are three-dimensional, a three-dimensional scattergram is formed with the forward scattered light channel on the x-axis, the side scattered light channel on the y-axis and the fluorescence channel on the z-axis.

The step 2) specifically comprises the following steps:

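21) for the flow data set $S = \{x_1, x_2, \ldots, x_i, \ldots, x_n\}$, defining the local density $\rho_i$ and the distance $\delta_i$ of the $i$-th particle $x_i$ as

$$\rho_i = \sum_{j \neq i} \chi(d_{ij} - d_c), \qquad \chi(x) = \begin{cases} 1, & x < 0 \\ 0, & x \geq 0 \end{cases}$$

$$\delta_i = \min_{j:\, \rho_j > \rho_i} d_{ij}$$

wherein $d_{ij}$ is the Euclidean distance from $x_i$ to $x_j$, $d_c$ is the truncation distance, and $\chi(x)$ is the indicator function given above;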

22) setting a local density threshold $\rho_0$ and excluding particles whose local density is less than the threshold;

23) arranging all remaining particles in a sequence in descending order of distance;

24) setting the number of clusters k and selecting the first k particles of the sequence, in order, as the initial cluster centers.

In step 21), when the $i$-th particle is the point with the highest local density, $\delta_i$ is the maximum of the distances from the $i$-th particle to all other points, namely:

$$\delta_i = \max_{j} d_{ij}$$

In step 21), when several particle points have the same local density, an increment approaching 0 is added to the local density, and the local density and distance parameters of each particle are then recalculated.

In the step 24), when the euclidean distance between the centers of the two clusters is smaller than the set threshold, the centers are regarded as the same cluster, and any one point in the centers of the two clusters is taken as a new cluster center, or a point with a higher local density in the centers of the two clusters is taken as a new cluster center.

In step 3), the mixture model algorithm comprises a Gaussian mixture model, a t-distribution mixture model and a skew t-distribution mixture model.

Compared with the prior art, the invention has the following advantages:

First, high accuracy and good stability: the density-distance center algorithm finds the initial center of each particle population, so the clustering process is accurate and stable, and misclassification caused by a locally optimal solution is avoided.

Second, adaptation to the distribution of flow data: clustering uses a mixture model (such as a Gaussian mixture model, a t-distribution mixture model or a skew t-distribution mixture model), which adapts effectively to the distribution characteristics of flow data.

Third, suitability for classifying small-sample particle populations: the method handles small-sample particle populations effectively and locates and classifies them with high accuracy.

Fourth, high calculation speed: the initial cluster centers determined by the density-distance center algorithm serve as the initial center values of the mixture model clustering algorithm, which speeds up the calculation.

Drawings

FIG. 1 is a flow chart of the method of the present invention.

FIG. 2 is a schematic diagram of Example I of the present invention, in which FIG. 2a is a distance-density distribution diagram, FIG. 2b is a two-dimensional scattergram, and FIG. 2c is the result after clustering.

FIG. 3 is a schematic diagram of Example II of the present invention, in which FIG. 3a is a distance-density distribution diagram, FIG. 3b is a two-dimensional scattergram, and FIG. 3c is the result after clustering.

FIG. 4 is a schematic diagram of Example III of the present invention, in which FIG. 4a is a distance-density distribution diagram, FIG. 4b is a two-dimensional scattergram, and FIG. 4c is the result after clustering.

Detailed Description

The invention is described in detail below with reference to the figures and specific embodiments.

The invention provides a mixture model clustering method for flow data based on density-distance centers. It applies the density-distance center algorithm to locate the initial cluster centers of the flow data, thereby ensuring the stability and accuracy of the finite mixture model result. Because the method combines approaches based on probability distributions with approaches based on spatial information (density and distance), it can distinguish small-sample populations well while offering strong noise resistance, high stability and good accuracy.

Fig. 1 shows the specific flow of the clustering method of the present invention for processing flow data. The clustering steps are described in detail below with reference to fig. 1:

In step 401, a flow cytometer is used to obtain the flow data set to be analyzed, that is, characteristic data of the cell particles, including the detected multi-angle scattered light and multicolor fluorescence. The flow data set contains multi-dimensional data of the particles. When the data are two-dimensional and comprise forward scattered light channel data and side scattered light channel data, the forward scattered light channel may be taken as the y-axis and the side scattered light channel as the x-axis to form a two-dimensional scattergram, as shown in fig. 2(b); if the data comprise side scattered light channel data and fluorescence channel data, the side scattered light channel may be taken as the y-axis and the fluorescence channel as the x-axis. When the data are three-dimensional and comprise forward scattered light, side scattered light and fluorescence channel data, a three-dimensional scattergram may be formed with the forward scattered light channel as the x-axis, the side scattered light channel as the y-axis and the fluorescence channel as the z-axis.

In step 402, for the flow data set to be analyzed, the local density and distance parameters of each particle are obtained by the density-distance center algorithm and plotted in a distance-density distribution diagram, as shown in fig. 2(a).

For a data set to be clustered, $S = \{x_1, x_2, \ldots, x_n\}$, two parameters are defined for the $i$-th particle $x_i$ ($i \in [1, n]$): the local density $\rho_i$ and the distance $\delta_i$. The local density reflects how densely the data are packed within a certain neighborhood and is defined as follows:

$$\rho_i = \sum_{j \neq i} \chi(d_{ij} - d_c) \tag{1}$$

wherein the function

$$\chi(x) = \begin{cases} 1, & x < 0 \\ 0, & x \geq 0 \end{cases} \tag{2}$$

The parameter $d_{ij}$ is the distance from $x_i$ to $x_j$, such as the spatial Euclidean distance. The parameter $d_c > 0$ is a truncation distance preset according to the actual sample data, for example $d_c = 5$. From formula (1), the local density $\rho_i$ is the number of data points in the data set, excluding $x_i$ itself, whose distance to $x_i$ is less than $d_c$.
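By way of a non-limiting illustration, the cut-off density described above can be sketched in a few lines of Python. The function name `local_density` and the sample coordinates are hypothetical and chosen only for this example; the Euclidean distance is used, as the text suggests.

```python
import math

def local_density(points, d_c):
    """rho_i = number of other points whose distance to point i is below d_c."""
    n = len(points)
    rho = [0] * n
    for i in range(n):
        for j in range(i + 1, n):
            # Each close pair contributes one count to both of its members.
            if math.dist(points[i], points[j]) < d_c:
                rho[i] += 1
                rho[j] += 1
    return rho

# Three tightly grouped particles and one isolated particle (made-up data).
points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (10.0, 10.0)]
rho = local_density(points, d_c=2.0)  # rho == [2, 2, 2, 0]
```

The isolated particle receives a density of 0, which is what later allows a density threshold to discard it as noise.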

The distance $\delta_i$ of a point is obtained by computing the distances from that point to all points whose local density is larger than its own and taking the minimum:

$$\delta_i = \min_{j:\, \rho_j > \rho_i} d_{ij} \tag{3}$$

If the point already has the maximum local density, $\delta_i$ is assigned the maximum of its distances to all other points:

$$\delta_i = \max_{j} d_{ij} \tag{4}$$
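The distance parameter can be sketched in the same way. The following is a minimal illustration that assumes the densities have already been computed and that the data set holds at least two particles; the function name `delta_distance` is hypothetical.

```python
import math

def delta_distance(points, rho):
    """delta_i = distance to the nearest point of higher density.

    A point that has no denser neighbour (the density maximum) receives the
    largest distance from itself to any other point instead.
    """
    n = len(points)
    delta = [0.0] * n
    for i in range(n):
        to_denser = [math.dist(points[i], points[j])
                     for j in range(n) if rho[j] > rho[i]]
        if to_denser:
            delta[i] = min(to_denser)
        else:
            delta[i] = max(math.dist(points[i], points[j])
                           for j in range(n) if j != i)
    return delta

# Made-up example: densities decrease from the first particle to the last.
delta = delta_distance([(0.0, 0.0), (1.0, 0.0), (5.0, 0.0)], rho=[3, 2, 1])
```

A particle that combines a high density with a large distance is far from any denser particle, which is why such particles are good candidates for cluster centers.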

According to formulas (1) to (4), a local density $\rho_i$ and a distance value $\delta_i$ can be obtained for each point $x_i$.

Specifically, if there are a plurality of particle points having the same local density, an increment approaching 0 is added to the local density, and then the local density and distance parameters of each particle are recalculated.
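The text does not state how the increment is chosen. One possible reading, given here only as an assumption, is to add a different tiny multiple of a small constant to each repeated value so that all densities become distinct; the name `break_density_ties` and the constant `eps` are hypothetical.

```python
def break_density_ties(rho, eps=1e-9):
    """Return densities in which repeated values differ by tiny increments."""
    seen = {}       # density value -> how many times it has occurred so far
    adjusted = []
    for value in rho:
        count = seen.get(value, 0)
        # The first occurrence is unchanged; later ones get +eps, +2*eps, ...
        adjusted.append(value + count * eps)
        seen[value] = count + 1
    return adjusted

adjusted = break_density_ties([2, 2, 2, 0])
```

After this adjustment the distance parameters are recomputed, so that only one particle is treated as the density maximum.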

In step 403, a local density threshold $\rho_0$ is set and each particle is tested against it. If the local density of a particle point is less than the threshold $\rho_0$, the particle point is deleted from the data set.

In step 404, all the remaining particles are arranged in a sequence with the distances from large to small.

In step 405, a cluster number k is set, and the first k particles are sequentially selected as the initial cluster center to be clustered according to the sequence.
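Steps 403 to 405 can be illustrated together. The sketch below assumes that the densities and distances are held in two parallel lists; the function name `select_initial_centers` and the numbers are hypothetical.

```python
def select_initial_centers(rho, delta, rho_0, k):
    """Indices of the k particles with the largest delta among those kept."""
    # Step 403: drop particles whose local density is below the threshold.
    survivors = [i for i in range(len(rho)) if rho[i] >= rho_0]
    # Step 404: order the remaining particles by distance, largest first.
    survivors.sort(key=lambda i: delta[i], reverse=True)
    # Step 405: the first k particles become the initial cluster centers.
    return survivors[:k]

rho = [5, 1, 4, 6, 0]
delta = [3.0, 9.0, 2.5, 7.0, 8.0]
centers = select_initial_centers(rho, delta, rho_0=2, k=2)  # centers == [3, 0]
```

In this made-up example, particles 1 and 4 have the largest distances but very low densities, which is typical of noise points; the density threshold removes them before the centers are chosen.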

For a given type of flow data to be analyzed, the number of clusters into which experimental samples of the same type are classified is known a priori and is the same from sample to sample, so the number of clusters is preset to a fixed value k, for example k = 4.

Let the cluster centers be $x_{c_j}$ ($j \in [1, k]$), where $c_j$ is the index of the $j$-th cluster center (that is, the index $i$ of the sequentially selected $\delta_i$) and $D$ is the set of indices of the cluster centers already selected. The specific formula is as follows:

$$c_j = \arg\max_{i \notin D} \delta_i$$

In particular, if the spatial Euclidean distance between two cluster centers is smaller than a predetermined threshold, the two are regarded as belonging to the same cluster, and either one of them, or the one with the larger local density, is taken as the new cluster center.
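The merging rule can be sketched as follows, using the variant that keeps the center with the larger local density. The function name `merge_close_centers` and the parameter `min_dist` are hypothetical, and the sketch simply returns the merged list; whether a replacement center is then drawn from the sequence is not stated in the text and is not modeled here.

```python
import math

def merge_close_centers(center_idx, points, rho, min_dist):
    """Collapse candidate centers that lie closer together than min_dist."""
    kept = []
    for c in center_idx:
        for pos, other in enumerate(kept):
            if math.dist(points[c], points[other]) < min_dist:
                # Same cluster: keep whichever candidate is denser.
                if rho[c] > rho[other]:
                    kept[pos] = c
                break
        else:
            # Far from every center kept so far, so it is a new center.
            kept.append(c)
    return kept

points = [(0.0, 0.0), (0.5, 0.0), (10.0, 0.0)]
merged = merge_close_centers([0, 1, 2], points, rho=[3, 5, 4], min_dist=1.0)
# merged == [1, 2]: candidates 0 and 1 are 0.5 apart, and 1 is the denser one.
```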

In step 406, the initial cluster centers are taken as the initial values of the mixture model algorithm, that is, as the location parameters $\mu_j$ of the respective t-distribution component density functions. Cluster analysis of the particles is then performed according to the mixture model, with parameters estimated by maximum likelihood.
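As an illustration of this hand-over, the sketch below builds one starting parameter set per component. Only the use of the selected centers as location parameters comes from the text; the equal mixing proportions, the identity scale matrices, the starting degrees of freedom `nu_0` and the function name `init_t_mixture` are assumptions made for the example.

```python
def init_t_mixture(points, center_idx, nu_0=4.0):
    """Starting parameters for a k-component t-mixture, one dict per component."""
    k = len(center_idx)
    p = len(points[0])
    params = []
    for c in center_idx:
        params.append({
            "pi": 1.0 / k,            # assumption: equal mixing proportions
            "mu": list(points[c]),    # location = density-distance center
            "sigma": [[1.0 if r == s else 0.0 for s in range(p)]
                      for r in range(p)],  # assumption: identity scale matrix
            "nu": nu_0,               # assumption: starting degrees of freedom
        })
    return params

params = init_t_mixture([(1.0, 2.0), (3.0, 4.0), (5.0, 6.0)], center_idx=[2, 0])
```

Because the locations already sit inside dense regions of the data, the subsequent iterative estimation starts close to a sensible solution rather than from a random position.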

Given the distribution characteristics of flow data, a clustering algorithm based on a finite mixture model is the more suitable choice. The Gaussian mixture model algorithm handles cell populations whose data are normally or near-normally distributed; the t-distribution mixture model algorithm can accommodate non-normally distributed data; and the skew t-distribution mixture model algorithm handles asymmetrically distributed data better. These mixture model clustering algorithms continue to develop, and the adaptability of the models to different data distributions keeps improving. The method of obtaining the initial cluster centers with the density-distance center algorithm can be applied to all of these mixture models (Gaussian, t-distribution and skew t-distribution). However, in view of the distribution characteristics of blood cell flow data, and considering implementation complexity and running efficiency, a t-distribution mixture model is adopted here for the cluster analysis.

The specific algorithm of the mixture model is described below:

1) Mixture model

Let $X$ be a $p$-dimensional random vector and let $x_1, x_2, \ldots, x_n$ be $n$ mutually independent $p$-dimensional sample observations of $X$. The probability density function of the multivariate mixture model with $k$ components that generates $X$ is defined as:

$$f(x; \Theta) = \sum_{i=1}^{k} \pi_i f(x; \theta_i) \tag{5}$$

wherein $k$ is the number of components of the mixture model; $\Theta = (\pi_1, \ldots, \pi_{k-1}, \theta_1, \ldots, \theta_k)$ is the unknown parameter matrix; $f(x; \theta_i)$ is the probability density function of the $i$-th component and $\theta_i$ is its unknown parameter vector; and $\pi_i$ is the mixing proportion, that is, the weight of the $i$-th component density in the mixture model, which satisfies

$$0 \leq \pi_i \leq 1, \qquad \sum_{i=1}^{k} \pi_i = 1$$
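The mixture density is a weighted sum of component densities. The sketch below illustrates this with one-dimensional Gaussian components, which are used here only to keep the example short and are not the components adopted in the method; the names `gaussian_pdf` and `mixture_pdf` are hypothetical.

```python
import math

def gaussian_pdf(x, mean, std):
    """Density of a one-dimensional normal distribution."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))

def mixture_pdf(x, weights, component_pdfs):
    """f(x; Theta) = sum over components of pi_i * f(x; theta_i)."""
    if any(w < 0 for w in weights) or abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("mixing proportions must be non-negative and sum to 1")
    return sum(w * pdf(x) for w, pdf in zip(weights, component_pdfs))

components = [lambda x: gaussian_pdf(x, 0.0, 1.0),
              lambda x: gaussian_pdf(x, 5.0, 2.0)]
density = mixture_pdf(0.0, [0.3, 0.7], components)
```

Since the mixing proportions sum to 1 and each component integrates to 1, the mixture is itself a valid probability density.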

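2) t-mixture model

If $f(x; \theta_i)$ in formula (5) is a t-distribution, then $f(x; \Theta)$ is a t-mixture model. The probability density function of the $p$-dimensional t-distribution has the form:

$$f(x; \mu, \Sigma, \upsilon) = \frac{\Gamma\!\left(\frac{\upsilon + p}{2}\right) |\Sigma|^{-1/2}}{(\pi \upsilon)^{p/2}\, \Gamma\!\left(\frac{\upsilon}{2}\right) \left[1 + \frac{\delta(x; \mu, \Sigma)}{\upsilon}\right]^{(\upsilon + p)/2}} \tag{6}$$

where $\mu$ is the location parameter, $\Sigma$ is a positive definite matrix, $\upsilon$ is the degree of freedom, $\delta(x; \mu, \Sigma) = (x - \mu)^T \Sigma^{-1} (x - \mu)$ is the square of the Mahalanobis distance between $x$ and $\mu$, and $\Gamma(x)$ is the Gamma function, defined as

$$\Gamma(x) = \int_0^{\infty} t^{x-1} e^{-t}\, dt$$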
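For two-channel data (p = 2) this density can be evaluated with the standard library alone. The following is a minimal sketch of the usual multivariate t density, specialized to two dimensions so that the matrix inverse can be written out explicitly; the function name `t_pdf_2d` is hypothetical.

```python
import math

def t_pdf_2d(x, mu, sigma, nu):
    """Bivariate t density with location mu, scale matrix sigma and dof nu."""
    p = 2
    (a, b), (c, d) = sigma
    det = a * d - b * c
    if a <= 0 or det <= 0:
        raise ValueError("sigma must be positive definite")
    dx0, dx1 = x[0] - mu[0], x[1] - mu[1]
    # Squared Mahalanobis distance, using the explicit inverse of a 2x2 matrix.
    maha = (d * dx0 * dx0 - (b + c) * dx0 * dx1 + a * dx1 * dx1) / det
    # Work in logarithms so that the Gamma functions cannot overflow.
    log_norm = (math.lgamma((nu + p) / 2.0) - math.lgamma(nu / 2.0)
                - 0.5 * math.log(det) - (p / 2.0) * math.log(math.pi * nu))
    return math.exp(log_norm - ((nu + p) / 2.0) * math.log1p(maha / nu))

identity = ((1.0, 0.0), (0.0, 1.0))
peak = t_pdf_2d((0.0, 0.0), (0.0, 0.0), identity, nu=4.0)
tail = t_pdf_2d((5.0, 0.0), (0.0, 0.0), identity, nu=4.0)
```

Far from the location parameter the t density decays polynomially rather than exponentially, so isolated noise points pull the fitted components less than they would under a Gaussian model.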

For the t-mixture model, each component density function is a $p$-dimensional t-distribution density function, and the mixture model is:

$$f(x; \Theta) = \sum_{j=1}^{k} \pi_j f(x; \mu_j, \Sigma_j, \upsilon_j) \tag{7}$$

If the flow data can be divided into $k$ classes, the t-mixture model assumes that the data consist of $k$ t-distributions, and the final clustering result is the $k$ flow cell populations corresponding to those $k$ t-distributions. By setting up a maximum likelihood estimate on the flow data samples, the mixture parameters can be obtained with the EM algorithm. Let $X_i = (x_{i1}, x_{i2}, \ldots, x_{ip})^T$ be a $p$-dimensional sample value in the flow data. Introduce for $X_i$ the component label vector $Z_i = (z_{i1}, z_{i2}, \ldots, z_{ik})^T$, which satisfies: $z_{ij} = 1$ when $X_i$ belongs to the $j$-th t-distribution, and $z_{ij} = 0$ otherwise. That is, $Z_i$ indicates which t-distribution the sample value $X_i$ belongs to. The complete data vector is then $X_c = (X^T, Z_1^T, Z_2^T, \ldots, Z_n^T)^T$, where $X = (X_1^T, X_2^T, \ldots, X_n^T)^T$. The corresponding log-likelihood function can be written as:

$$\ln L_c(\Theta \mid X_c) = \sum_{i=1}^{n} \sum_{j=1}^{k} z_{ij} \ln\!\left[\pi_j f(X_i; \mu_j, \Sigma_j, \upsilon_j)\right] \tag{8}$$
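The role of the label vector can be illustrated with a small sketch in which the labels are assumed to be known. The component densities are passed in as plain functions and, in the usage lines, are simple uniform densities chosen only so that the result is easy to check by hand; the names `one_hot` and `complete_log_likelihood` are hypothetical.

```python
import math

def one_hot(label, k):
    """Label vector Z_i: 1 at the position of the component, 0 elsewhere."""
    return [1 if j == label else 0 for j in range(k)]

def complete_log_likelihood(samples, labels, weights, component_pdfs):
    """Sum over i and j of z_ij * (ln pi_j + ln f_j(x_i))."""
    k = len(weights)
    total = 0.0
    for x, label in zip(samples, labels):
        z = one_hot(label, k)
        for j in range(k):
            if z[j]:  # only the component the sample belongs to contributes
                total += math.log(weights[j]) + math.log(component_pdfs[j](x))
    return total

# Uniform densities on [0, 2] and on [0, 0.5], used purely for illustration.
pdfs = [lambda x: 0.5, lambda x: 2.0]
value = complete_log_likelihood([0.1, 0.2], [0, 1], [0.25, 0.75], pdfs)
```

In practice the labels are unknown, which is why the estimation replaces each label by its conditional expectation.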

3) EM algorithm estimation

For the t-mixture model, parameter estimation with the EM algorithm proceeds as follows:

(1) E step: let $\Theta^{(t)}$ be the estimate at the $t$-th iteration; the conditional expectation of the log-likelihood function given $\Theta^{(t)}$ is then

$$Q(\Theta; \Theta^{(t)}) = E\!\left(\ln L_c(\Theta \mid X_c); \Theta^{(t)}\right) \tag{9}$$

(2) M step: from equation (8), compute the $\Theta^{(t+1)}$ that maximizes $Q(\Theta; \Theta^{(t)})$, that is,

$$\Theta^{(t+1)} = \arg\max_{\Theta} Q(\Theta; \Theta^{(t)}) \tag{10}$$

(3) Iterate equations (9) and (10) until the parameters converge, which gives the estimate of the parameter $\Theta$.
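The alternation of the two steps can be illustrated with a deliberately simplified model. The sketch below runs EM on a one-dimensional Gaussian mixture rather than on the t-mixture adopted in the method, and starts from supplied centers in the same way that the density-distance centers are supplied to the mixture model. The function name `em_gaussian_1d`, the pooled starting variance, the equal starting proportions and the stopping rule on the means are all assumptions made for the example.

```python
import math

def em_gaussian_1d(data, init_means, tol=1e-8, max_iter=500):
    """EM for a 1-D Gaussian mixture, started from the supplied means."""
    n, k = len(data), len(init_means)
    means = list(init_means)
    grand_mean = sum(data) / n
    grand_var = sum((x - grand_mean) ** 2 for x in data) / n
    variances = [grand_var] * k   # assumption: start from the pooled variance
    weights = [1.0 / k] * k       # assumption: equal starting proportions
    for _ in range(max_iter):
        # E step: posterior probability of each component for each sample.
        resp = []
        for x in data:
            dens = [w * math.exp(-0.5 * (x - m) ** 2 / v)
                    / math.sqrt(2.0 * math.pi * v)
                    for w, m, v in zip(weights, means, variances)]
            total = sum(dens)
            resp.append([d / total for d in dens])
        # M step: re-estimate proportions, means and variances.
        new_means = []
        for j in range(k):
            n_j = sum(r[j] for r in resp)
            weights[j] = n_j / n
            mean_j = sum(r[j] * x for r, x in zip(resp, data)) / n_j
            var_j = sum(r[j] * (x - mean_j) ** 2
                        for r, x in zip(resp, data)) / n_j
            variances[j] = max(var_j, 1e-6)  # floor keeps the variance positive
            new_means.append(mean_j)
        shift = max(abs(a - b) for a, b in zip(new_means, means))
        means = new_means
        if shift < tol:  # stop once the means no longer move
            break
    return weights, means, variances

data = [0.8, 0.9, 1.0, 1.1, 1.2, 4.8, 4.9, 5.0, 5.1, 5.2]
weights, means, variances = em_gaussian_1d(data, init_means=[1.0, 5.0])
```

With the starting means placed inside the two groups, the iteration settles on one component per group; a poorly placed start is what can leave EM in an unsatisfactory local optimum.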

The EM algorithm yields iterative update formulas for the corresponding parameters, in which the degree of freedom $\upsilon_j^{(t+1)}$ is the solution of a nonlinear equation.
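As a hedged point of reference, the updates commonly given in the literature for EM estimation of a multivariate t-mixture are sketched below. They are an assumption based on that standard formulation rather than a transcription of the formulas of this document, and the auxiliary symbols $\tau_{ij}$, $u_{ij}$ and $n_j$ and the digamma function $\psi$ are introduced here only for illustration.

```latex
\begin{aligned}
\tau_{ij}^{(t)} &= \frac{\pi_j^{(t)} f\bigl(X_i; \mu_j^{(t)}, \Sigma_j^{(t)}, \upsilon_j^{(t)}\bigr)}
                        {\sum_{l=1}^{k} \pi_l^{(t)} f\bigl(X_i; \mu_l^{(t)}, \Sigma_l^{(t)}, \upsilon_l^{(t)}\bigr)},
\qquad
u_{ij}^{(t)} = \frac{\upsilon_j^{(t)} + p}{\upsilon_j^{(t)} + \delta\bigl(X_i; \mu_j^{(t)}, \Sigma_j^{(t)}\bigr)}, \\[4pt]
\pi_j^{(t+1)} &= \frac{1}{n} \sum_{i=1}^{n} \tau_{ij}^{(t)},
\qquad
\mu_j^{(t+1)} = \frac{\sum_{i=1}^{n} \tau_{ij}^{(t)} u_{ij}^{(t)} X_i}{\sum_{i=1}^{n} \tau_{ij}^{(t)} u_{ij}^{(t)}}, \\[4pt]
\Sigma_j^{(t+1)} &= \frac{\sum_{i=1}^{n} \tau_{ij}^{(t)} u_{ij}^{(t)}
                    \bigl(X_i - \mu_j^{(t+1)}\bigr)\bigl(X_i - \mu_j^{(t+1)}\bigr)^{T}}
                   {\sum_{i=1}^{n} \tau_{ij}^{(t)}}, \\[4pt]
0 &= -\psi\!\left(\tfrac{\upsilon_j}{2}\right) + \ln\!\left(\tfrac{\upsilon_j}{2}\right) + 1
     + \frac{1}{n_j} \sum_{i=1}^{n} \tau_{ij}^{(t)} \bigl(\ln u_{ij}^{(t)} - u_{ij}^{(t)}\bigr)
     + \psi\!\left(\tfrac{\upsilon_j^{(t)} + p}{2}\right) - \ln\!\left(\tfrac{\upsilon_j^{(t)} + p}{2}\right),
\qquad n_j = \sum_{i=1}^{n} \tau_{ij}^{(t)}.
\end{aligned}
```

In this formulation $\tau_{ij}$ is the posterior probability that $X_i$ belongs to component $j$, $u_{ij}$ down-weights samples that lie far from $\mu_j$, and $\upsilon_j^{(t+1)}$ is the root of the last equation, which has no closed form and is found numerically.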

In step 407, clustering yields a plurality of particle clusters, which can be identified by different colors and are then classified and counted, as shown in fig. 2c.
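The counting itself is straightforward once every particle carries a cluster label. The sketch below assumes the labels are integers produced by the clustering step; the function name `count_clusters`, the label-to-name mapping and the sample labels are hypothetical.

```python
from collections import Counter

def count_clusters(labels, names=None):
    """Map each cluster to (particle count, percentage of all particles)."""
    counts = Counter(labels)
    total = len(labels)
    report = {}
    for label, n in sorted(counts.items()):
        name = names[label] if names else label
        report[name] = (n, 100.0 * n / total)
    return report

names = {0: "debris", 1: "lymphocytes", 2: "monocytes", 3: "granulocytes"}
labels = [0, 1, 1, 1, 1, 2, 3, 3, 3, 3]
report = count_clusters(labels, names)  # report["monocytes"] == (1, 10.0)
```

The percentages are what would be compared with the proportions obtained by manual gating.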

Example I:

Fig. 2 shows Example I of the method of the present invention. For the flow data sample to be processed, a two-dimensional scattergram is created from the measurement data of the forward scattered light channel (FSC) and the side scattered light channel (SSC), as shown in fig. 2b (side scattered light on the horizontal axis, forward scattered light on the vertical axis). The sample is a normal sample in which the monocyte population accounts for about 5% and the populations are clearly separated: the lymphocyte population is at the upper left, the erythrocyte debris at the lower left, the monocyte population at the upper middle, and the granulocyte population on the right.

The distance and local density parameters obtained for each particle by the density-distance center algorithm are plotted in the distance-density distribution diagram of fig. 2a, in which the horizontal axis represents local density and the vertical axis represents distance.

A local density threshold is set, and particles whose local density is below the threshold are excluded; all remaining particles are arranged in a sequence in descending order of distance; the number of clusters k is set to 4, and the first k particles of the sequence are selected in order as the initial cluster centers. The selected cluster centers are marked in fig. 2b by "o", "+", "Δ" and "□", respectively.

The 1st initial cluster center selected is X2719 in the data set, denoted Xc1;

the 2nd initial cluster center selected is X102 in the data set, denoted Xc2;

the 3rd initial cluster center selected is X3546 in the data set, denoted Xc3;

the 4th initial cluster center selected is X1568 in the data set, denoted Xc4.

The initial cluster centers thus obtained are used as the initial values of the mixture model, the flow data are solved iteratively according to the mixture model, and the parameters are estimated by maximum likelihood. The result of cluster analysis with the t-distribution mixture model is shown in fig. 2c, where each particle population is identified by a different color and then classified and counted. The data in fig. 2 contain many noise points; if they were solved with the mixture model alone, particles would easily be misclassified and the solution would fall into a local optimum. Determining the initial cluster centers with the density-distance center algorithm ensures the stability and accuracy of the finite mixture model result.

Taking the classification result of manual gating as the reference, the algorithm clusters the sample into 4 populations: red blood cell debris, lymphocytes, monocytes and granulocytes. For the monocytes, which comprise relatively few particles, the error of the algorithm relative to manual gating is 0.33%.

Example II:

Fig. 3 shows Example II of the method. For the flow data sample to be processed, a two-dimensional scattergram is created from the data of the forward scattered light channel (FSC) and the side scattered light channel (SSC), as shown in fig. 3b (side scattered light on the horizontal axis, forward scattered light on the vertical axis). In this sample the monocyte population is very small, about 2%, which corresponds to a disease state or an extreme condition.

The distance and local density parameters obtained for each particle by the density-distance center algorithm are plotted in the distance-density distribution diagram of fig. 3a, in which the horizontal axis represents local density and the vertical axis represents distance.

A local density threshold is set, and particles whose local density is below the threshold are excluded; all remaining particles are arranged in a sequence in descending order of distance; the number of clusters k is set to 4, and the first k particles of the sequence are selected in order as the initial cluster centers. The selected cluster centers are marked in fig. 3b by "o", "+", "Δ" and "□", respectively.

The initial cluster centers thus obtained are used as the initial values of the mixture model, the flow data are solved iteratively according to the mixture model, and the parameters are estimated by maximum likelihood. The result of cluster analysis with the t-distribution mixture model is shown in fig. 3c, where each particle population is identified by a different color and then classified and counted. In this sample the monocyte population is small and sparsely distributed, so it is easily disturbed by the adjacent dominant population and wrongly assigned to other populations. Determining the initial cluster centers with the density-distance center algorithm ensures the stability and accuracy of the finite mixture model result.

Taking the classification result of manual gating as the reference, the algorithm clusters the sample into 4 populations: red blood cell debris, lymphocytes, monocytes and granulocytes. For the monocytes, which comprise relatively few particles, the error of the algorithm relative to manual gating is 0.19%.

Example III:

Fig. 4 shows Example III of the method. For the flow data sample to be processed, a two-dimensional scattergram is created from the data of the forward scattered light channel (FSC) and the side scattered light channel (SSC), as shown in fig. 4b (side scattered light on the horizontal axis, forward scattered light on the vertical axis). The monocyte population of this sample is not only small (about 2%) but also lies close to the lymphocyte population and partially overlaps it.

The distance and local density parameters obtained for each particle by the density-distance center algorithm are plotted in the distance-density distribution diagram of fig. 4a, in which the horizontal axis represents local density and the vertical axis represents distance.

A local density threshold is set, and particles whose local density is below the threshold are excluded; all remaining particles are arranged in a sequence in descending order of distance; the number of clusters k is set to 4, and the first k particles of the sequence are selected in order as the initial cluster centers. The selected cluster centers are marked in fig. 4b by "o", "+", "Δ" and "□", respectively.

The initial cluster centers thus obtained are used as the initial values of the mixture model, the flow data are solved iteratively according to the mixture model, and the parameters are estimated by maximum likelihood. The result of cluster analysis with the t-distribution mixture model is shown in fig. 4c, where each particle population is identified by a different color and then classified and counted. In this sample the monocyte population is small, lies close to the lymphocyte population and partially overlaps it, so it is easily disturbed by the adjacent dominant population and wrongly assigned to the lymphocyte population. Determining the initial cluster centers with the density-distance center algorithm ensures the stability and accuracy of the finite mixture model result.

Taking the classification result of manual gating as the reference, the algorithm clusters the sample into 4 populations: red blood cell debris, lymphocytes, monocytes and granulocytes. For the monocytes, which comprise relatively few particles, the error of the algorithm relative to manual gating is 0.27%.

Taken together, the above examples show that the density-distance center algorithm gives stable results under various adverse distributions, such as small-sample populations and populations that lie close to one another. Determining the initial cluster centers with the density-distance center algorithm therefore yields accurate and reliable centers, handles the locating and classification of small-sample clusters well, effectively eliminates the interference of noise points, and ensures the stability and accuracy of the finite mixture model result; and because these centers serve as the initial center values of the mixture model clustering algorithm, the calculation is also faster.

Claims

1. A flow cytometry particle classification counting method based on a density-distance center algorithm is characterized by comprising the following steps:

2) obtaining the local density and distance parameters of each particle in the flow data set according to the density-distance center algorithm, then screening and sorting the particles to obtain the initial cluster centers, which specifically comprises the following steps:

when a plurality of particle points with the same local density exist, adding an increment approaching 0 to the local density, and then recalculating the local density and distance parameters of each particle;

24) setting a cluster number k, sequentially selecting the first k particles as initial cluster centers to be clustered according to the sequence, and when the Euclidean distance between the two cluster centers is smaller than a set threshold value, regarding the two cluster centers as the same cluster, and taking any one point in the two cluster centers as a new cluster center or taking a point with higher local density in the two cluster centers as a new cluster center;

2. The method for classifying and counting flow cytometry based on the density-distance center algorithm as claimed in claim 1, wherein in the step 1), when the data in the flow data set is two-dimensional data, the data of the forward scattering light channel is used as y-axis, and the data of the side scattering light channel is used as x-axis to form a two-dimensional scattergram; or taking the data of the side scattered light channel as a y axis and the data of the fluorescence channel as an x axis to form a two-dimensional scatter diagram; when the data in the streaming data set is three-dimensional data, the data of the forward scattering light channel is taken as an x-axis, the data of the side scattering light channel is taken as a y-axis, and the data of the fluorescence channel is taken as a z-axis to form a three-dimensional scattergram.

3. A flow cytometry particle classifying and counting method based on density-distance center algorithm as claimed in claim 1, wherein in step 21),

4. The flow cytometry particle classification and counting method based on the density-distance center algorithm as claimed in claim 1, wherein in step 3), the mixture model algorithm comprises a Gaussian mixture model, a t-distribution mixture model and a skew t-distribution mixture model.