CN107389536A - Flow cytometry particle classification and counting method based on a density-distance center algorithm
- Publication number: CN107389536A
- Application number: CN201710641341.0A
- Authority: CN (China)
- Legal status: Granted (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Classifications
- G—PHYSICS
- G01—MEASURING; TESTING
- G01N—INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
- G01N15/00—Investigating characteristics of particles; Investigating permeability, pore-volume or surface-area of porous materials
- G01N15/10—Investigating individual particles
- G01N15/14—Optical investigation techniques, e.g. flow cytometry
- G01N2015/1486—Counting the particles
Abstract
The present invention relates to a flow cytometry particle classification and counting method based on a density-distance center algorithm, comprising the following steps: 1) a flow data set of the cell particles to be classified and counted is acquired with a flow cytometer, the flow data set containing multidimensional data for each particle; 2) the local density and distance parameter of each particle in the flow data set are obtained by the density-distance center algorithm, and the particles are screened and sorted to obtain the initial cluster centers; 3) the initial cluster centers are used as the initial values of a mixture model algorithm, the particle population is clustered according to the mixture model, and the resulting particle clusters are counted. Compared with the prior art, the invention offers high accuracy, good stability, adaptation to the distribution of flow data, suitability for classifying small-sample populations, and fast computation.
Description
Technical field
The present invention relates to the field of cell particle classification and measurement, and in particular to a flow cytometry particle classification and counting method based on a density-distance center algorithm.
Background
Flow cytometry (FCM) is a technique for quantitative analysis using a flow cytometer. Using the principle of hydrodynamic focusing, the cells or particles under analysis are aligned in single file and passed rapidly, one by one, through a detection light beam. A high-precision optical system, electrical signal processing and computer data analysis measure the multi-angle scattered light and multicolor fluorescence excited by each cell or particle, so that physical and chemical characteristics such as size, internal structure, nucleic acid and protein content can be obtained for tens of thousands of individual cells or particles in a short time. Because it is fast, accurate, high-throughput and multi-parametric, the flow cytometer is an important basic research instrument for frontier research in the biomedical field, and also an important clinical testing instrument.
The multi-angle scattered light and multicolor fluorescence excited by each cell or particle are collected by the optical system and converted into electrical signals by photoelectric sensors. After electrical signal processing and sampling they become digital signals, which are stored by a computer for data analysis. The characteristic data of all cells or particles acquired by the flow cytometer are referred to as flow data.
Traditionally, the analysis of flow data relies on experienced operators who project the data onto two-dimensional scatter plots and then analyze the populations of interest by region gating, for example to classify and count them. This is known as manual gating. With the continuing development of flow cytometry, the volume of flow data has multiplied, and automated data analysis has become the main direction for the future development of the technique. A number of automated methods have been proposed for the cluster analysis of flow data. They fall mainly into clustering methods based on probability distributions and clustering methods based on spatial information.
Clustering methods based on probability distributions are mainly finite mixture model clustering algorithms. One example is the Gaussian mixture model algorithm based on the Bayesian information criterion, which handles cell populations made up of normally or nearly normally distributed data well. The t-distribution mixture model algorithm transforms non-Gaussian data to a nearly normal distribution and replaces the Gaussian mixture model in the cluster analysis of flow data. There is also the skew t-distribution mixture model algorithm, which handles asymmetrically distributed data better. These mixture model clustering algorithms have continued to develop, improving the adaptability of the models to different data distributions. However, the solutions obtained by mixture models based on the Gaussian, t- and skew t-distributions are only local optima, so clustering algorithms based on finite mixture models depend on the positions of the initial points (that is, the cluster centers). Because real data are often complex, for example when there are many noise points, mixture model clustering algorithms can misclassify, so their stability is poor.
Clustering methods based on spatial information are the other main class of methods for flow data analysis, for example the K-means and DBSCAN algorithms, but their ability to cluster flow data is limited. Clustering algorithms based on finite mixture models are better suited to flow data analysis and are more widely applied. Because such algorithms depend on the positions of the initial points (the cluster centers), they are very sensitive to the initial values of the model. Clustering algorithms based on K-means and on mixture models usually choose the initial cluster centers at random, and it is customary to place the initial cluster centers as far apart from one another as possible. However, K-means itself also finds only a locally optimal solution, so with random initial values it can still become trapped in a local optimum. The results are therefore very unstable with respect to the chosen initial values, and neither accuracy nor stability can be guaranteed.
In practice, flow data are often complex, and clustering them under adverse conditions is very challenging. When there are many noise points, for example, earlier methods sometimes mistakenly assign the noise points to a separate cluster. In addition, there is no good solution for clusters that are small in sample size and sparsely distributed. For example, in the white blood cell differential analysis of human peripheral blood, monocytes typically account for 2% to 10% of the total leukocyte count and eosinophils for 1% to 6%, whereas lymphocytes account for 40% and granulocytes for 50%, making up the majority. In such multi-class cluster analysis, the large-sample and small-sample clusters differ greatly in size and lie close to one another, and the difficulty is locating and distinguishing the small-sample clusters. Because a small-sample cluster has few events and is sparsely distributed, it is easily disturbed by adjacent dominant clusters and mistakenly absorbed into another cluster. Small-sample clusters therefore place very high demands on the discriminating power and stability of the algorithm.
Summary of the invention
The object of the present invention is to overcome the above defects of the prior art by providing a flow cytometry particle classification and counting method based on a density-distance center algorithm.
The object of the present invention can be achieved by the following technical solution:
A flow cytometry particle classification and counting method based on a density-distance center algorithm, comprising the following steps:
1) acquiring, with a flow cytometer, a flow data set of the cell particles to be classified and counted, the flow data set containing multidimensional data for each particle;
2) obtaining the local density and distance parameter of each particle in the flow data set according to the density-distance center algorithm, then screening and sorting the particles to obtain the initial cluster centers;
3) using the initial cluster centers as the initial values of a mixture model algorithm, clustering the particle population according to the mixture model to obtain multiple classified particle clusters, and counting them.
In step 1), when the data in the flow data set are two-dimensional, a two-dimensional scatter plot is formed with the forward scatter channel data as the y-axis and the side scatter channel data as the x-axis, or with the side scatter channel data as the y-axis and the fluorescence channel data as the x-axis. When the data in the flow data set are three-dimensional, a three-dimensional scatter plot is formed with the forward scatter channel data as the x-axis, the side scatter channel data as the y-axis and the fluorescence channel data as the z-axis.
Step 2) specifically comprises the following steps:
21) for the flow data set $S=\{x_1,x_2,\ldots,x_i,\ldots,x_n\}$, the local density $\rho_i$ and the distance $\delta_i$ of the i-th particle $x_i$ are defined respectively as:
$$\rho_i=\sum_{j\neq i}\chi(d_{ij}-d_c),\qquad \chi(x)=\begin{cases}1, & x<0\\ 0, & x\ge 0\end{cases}$$
$$\delta_i=\min_{j:\,\rho_j>\rho_i} d_{ij}$$
where $d_{ij}$ is the Euclidean distance from $x_i$ to $x_j$, $d_c$ is the cutoff distance, and $\chi(x)$ is an indicator function;
22) a local density threshold $\rho_0$ is set, and particles whose local density is below the threshold are excluded;
23) the remaining particles are sorted by distance $\delta_i$ in descending order;
24) the number of clusters k is set, and the first k particles in the sorted sequence are selected as the initial cluster centers.
In step 21), when the i-th particle is the point of maximum local density, $\delta_i$ is assigned the maximum of the distances from the i-th particle to all other points, that is:
$$\delta_i=\max_{j} d_{ij}$$
In step 21), when several particles have the same local density, an increment approaching 0 is added to that local density, and the local density and distance parameter of each particle are then recalculated.
In step 24), when the Euclidean distance between two cluster centers is less than a set threshold, they are regarded as belonging to the same cluster. Either of the two cluster centers is taken as the new cluster center, or the one with the larger local density is taken as the new cluster center.
In step 3), the mixture model algorithm includes the Gaussian mixture model, the t-distribution mixture model and the skew t-distribution mixture model.
Compared with the prior art, the present invention has the following advantages:
First, high accuracy and good stability: the initial center of each particle cluster is first found with the density-distance center algorithm, so the subsequent clustering is accurate and stable, and misclassification caused by a locally optimal solution is avoided.
Second, adaptation to the distribution of flow data: clustering with a mixture model (such as the Gaussian mixture model, the t-distribution mixture model or the skew t-distribution mixture model) adapts effectively to the distribution characteristics of flow data.
Third, suitability for classifying small-sample populations: the method handles small-sample populations effectively, locating and classifying them with high accuracy.
Fourth, fast computation: the initial cluster centers determined by the density-distance center algorithm serve as the initial center values of the mixture model clustering algorithm, which speeds up the computation.
Brief description of the drawings
Fig. 1 is a flow chart of the method of the present invention.
Fig. 2 is a schematic diagram of Embodiment I of the present invention, in which Fig. (2a) is the distance-density distribution plot, Fig. (2b) is the two-dimensional scatter plot, and Fig. (2c) is the clustering result.
Fig. 3 is a schematic diagram of Embodiment II of the present invention, in which Fig. (3a) is the distance-density distribution plot, Fig. (3b) is the two-dimensional scatter plot, and Fig. (3c) is the clustering result.
Fig. 4 is a schematic diagram of Embodiment III of the present invention, in which Fig. (4a) is the distance-density distribution plot, Fig. (4b) is the two-dimensional scatter plot, and Fig. (4c) is the clustering result.
Detailed description
The present invention is described in detail below with reference to the accompanying drawings and specific embodiments.
The present invention proposes a mixture model clustering method for flow data based on density-distance centers. The density-distance center algorithm is applied to locating the initial cluster centers of the flow data, which ensures the stability and accuracy of the finite mixture model result. The method combines approaches based on probability distributions with approaches based on spatial information (density and distance), so that small-sample clusters are distinguished more reliably, while noise immunity is strong, stability is high and accuracy is good.
Fig. 1 shows the detailed flow of the clustering method of the present invention for processing flow data. The clustering steps are described in detail below with reference to Fig. 1:
In step 401, the flow data set to be analyzed is acquired with a flow cytometer, for example the characteristic data of the cell particles, including the measured multi-angle scattered light and multicolor fluorescence. The flow data set to be analyzed contains multidimensional data for each particle. When the data in the flow data set are two-dimensional, for example comprising forward scatter channel data and side scatter channel data, a two-dimensional scatter plot can be formed with the forward scatter channel data as the y-axis and the side scatter channel data as the x-axis, as shown in Fig. (2b). If the data comprise side scatter channel data and fluorescence channel data, a two-dimensional scatter plot can be formed with the side scatter channel data as the y-axis and the fluorescence channel data as the x-axis. When the data in the flow data set are three-dimensional, for example comprising forward scatter channel data, side scatter channel data and fluorescence channel data, a three-dimensional scatter plot can be formed with the forward scatter channel data as the y-axis, the fluorescence channel data as the x-axis and the side scatter channel data as the z-axis.
In step 402, for the flow data set to be analyzed, the local density and distance parameter of each particle are obtained by the density-distance center algorithm and plotted in a distance-density distribution plot, as shown in Fig. (2a).
For the data set to be clustered, $S=\{x_1,x_2,\ldots,x_n\}$, two parameters are defined for the i-th particle $x_i$ ($i\in[1,n]$): the local density $\rho_i$ and the distance $\delta_i$. The local density reflects how densely the data are packed within a given neighborhood and is defined as
$$\rho_i=\sum_{j\neq i}\chi(d_{ij}-d_c) \tag{1}$$
where the function
$$\chi(x)=\begin{cases}1, & x<0\\ 0, & x\ge 0\end{cases} \tag{2}$$
The parameter $d_{ij}$ is the distance from $x_i$ to $x_j$, for example the Euclidean distance. The parameter $d_c>0$ is the cutoff distance, preset according to the actual sample data, for example $d_c=5$. From formula (1), the local density $\rho_i$ is the number of data points in the data set (excluding $x_i$ itself) whose distance to $x_i$ is less than $d_c$.
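As an illustration of the local density just described, the following minimal Python sketch counts, for each event, the other events that lie closer than the cutoff distance. The function name `local_density` and the toy coordinates are assumptions made for this example and do not come from the patent.

```python
import math

def local_density(points, d_c):
    """rho_i: number of OTHER points whose Euclidean distance to point i is below d_c."""
    rho = []
    for i, p in enumerate(points):
        count = sum(
            1
            for j, q in enumerate(points)
            if j != i and math.dist(p, q) < d_c
        )
        rho.append(count)
    return rho

# Hypothetical toy data: a tight group of three events and one isolated event.
events = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (10.0, 10.0)]
rho = local_density(events, d_c=5.0)
print(rho)
```

The double loop is quadratic in the number of events. For the tens of thousands of events in a real sample, a spatial index would probably be preferable, but that is outside what the text specifies.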
The distance $\delta_i$ of a point is obtained by computing its distances to all points with a higher local density and taking the minimum:
$$\delta_i=\min_{j:\,\rho_j>\rho_i} d_{ij} \tag{3}$$
If the point already has the maximum local density, $\delta_i$ is assigned the maximum of its distances to all other points:
$$\delta_i=\max_{j} d_{ij} \tag{4}$$
According to formulas (1)-(4), each point $x_i$ obtains a local density $\rho_i$ and a distance value $\delta_i$.
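The distance parameter can be sketched in the same way. The example below assumes the local densities have already been computed and are all distinct. The name `density_distance` and the sample values are illustrative assumptions, not part of the patent.

```python
import math

def density_distance(points, rho):
    """delta_i: distance to the nearest point of higher density,
    or the largest distance to any point for the densest point."""
    delta = []
    for i, p in enumerate(points):
        to_denser = [
            math.dist(p, q) for j, q in enumerate(points) if rho[j] > rho[i]
        ]
        if to_denser:
            delta.append(min(to_denser))
        else:
            # No denser point exists: this is the global density maximum.
            delta.append(max(math.dist(p, q) for q in points))
    return delta

# Hypothetical events on a line, with assumed (distinct) local densities.
events = [(0.0, 0.0), (1.0, 0.0), (5.0, 0.0), (6.0, 0.0)]
rho = [3, 2, 4, 1]
delta = density_distance(events, rho)
print(delta)
```

Events 0 and 2 are both dense and far from any denser event, so both receive a large distance value, which is what marks them as candidate cluster centers.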
In particular, if several particles have the same local density, an increment approaching 0 is added to that local density, and the local density and distance parameter of each particle are then recalculated.
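The text does not say how the near-zero increment is chosen. One possible reading, shown here purely as an assumption, is to add a different tiny multiple of a small constant `eps` to each repeated density value so that all densities become distinct, after which the distance parameter is recomputed.

```python
def break_density_ties(rho, eps=1e-9):
    """Make equal densities distinct by adding 0, eps, 2*eps, ... to repeats.
    The increment scheme is an assumption; the source only asks for an
    increment approaching 0."""
    seen = {}
    adjusted = []
    for value in rho:
        n = seen.get(value, 0)
        adjusted.append(value + n * eps)
        seen[value] = n + 1
    return adjusted

adjusted = break_density_ties([2, 2, 2, 0])
print(adjusted)
```

Because integer-valued densities differ by at least 1, increments of this size change only the ordering among tied particles and never the ordering between different density values, provided the number of ties times `eps` stays well below 1.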
In step 403, a local density threshold $\rho_0$ is set and each particle is tested against it. If the local density of a particle is less than the threshold $\rho_0$, the particle is deleted from the data set.
In step 404, the remaining particles are sorted by distance in descending order.
In step 405, the number of clusters k is set, and the first k particles in the sorted sequence are selected as the initial cluster centers.
For flow data from a given assay, the number of clusters to be classified in experimental samples of the same type is known a priori and is the same for all of them, so the number of clusters is preset to a fixed value k, for example k=4.
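Steps 403 to 405 can be sketched as a filter followed by a sort. The function name `select_initial_centers` and all numeric values below are hypothetical and chosen only to show the effect of the density threshold.

```python
def select_initial_centers(rho, delta, rho_0, k):
    """Return the indices of the k candidate centers.

    Particles with local density below rho_0 are dropped (step 403),
    the rest are sorted by delta in descending order (step 404),
    and the first k are kept (step 405).
    """
    candidates = [i for i in range(len(rho)) if rho[i] >= rho_0]
    candidates.sort(key=lambda i: delta[i], reverse=True)
    return candidates[:k]

# Hypothetical values: particle 4 is an isolated noise point with a large
# distance value but zero local density.
rho = [3, 2, 4, 1, 0]
delta = [5.0, 1.0, 5.0, 1.0, 9.0]
centers = select_initial_centers(rho, delta, rho_0=1, k=2)
print(centers)
```

Without the density threshold, the isolated particle 4 would have been ranked first because of its large distance value. Screening it out before sorting is what keeps noise points from becoming cluster centers.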
Let the cluster centers be $x_{c_j}$ ($j\in[1,k]$), where $c_j$ is the label of the j-th cluster center (the index i of the successively selected $\delta_i$) and D is the set of labels of the cluster centers already selected. The labels are then given by:
$$c_j=\arg\max_{i\notin D}\delta_i$$
In particular, if the Euclidean distance between two cluster centers is less than a set threshold, they are regarded as belonging to the same cluster. Either of the two cluster centers is taken as the new cluster center, or the one with the larger local density is taken as the new cluster center.
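The merging rule can be sketched as follows, using the second option described above (keep the center with the larger local density). The function name `merge_close_centers`, the separation threshold `min_sep` and the sample values are assumptions for illustration.

```python
import math

def merge_close_centers(centers, points, rho, min_sep):
    """Keep a center only if it is at least min_sep away from every denser
    center already kept, so that of two close centers the denser one survives."""
    kept = []
    for c in sorted(centers, key=lambda i: rho[i], reverse=True):
        if all(math.dist(points[c], points[m]) >= min_sep for m in kept):
            kept.append(c)
    return kept

# Hypothetical candidates: centers 0 and 1 are only 0.5 apart.
points = [(0.0, 0.0), (0.5, 0.0), (8.0, 8.0)]
rho = [5, 3, 4]
merged = merge_close_centers([0, 1, 2], points, rho, min_sep=2.0)
print(merged)
```

After merging, fewer than k centers may remain. The text does not state how this case is handled, so the sketch simply returns the reduced list.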
In step 406, the initial cluster centers are used as the initial values of the mixture model algorithm, that is, as the location parameters $\mu_j$ of the t-distribution component density functions. Cluster analysis of the particle population is then carried out according to the mixture model, with parameter estimation by the maximum likelihood algorithm.
Given the distribution characteristics of flow data, clustering algorithms based on finite mixture models are relatively well suited. The Gaussian mixture model algorithm handles cell populations made up of normally or nearly normally distributed data well. The t-distribution mixture model algorithm suits non-Gaussian data. The skew t-distribution mixture model algorithm handles asymmetrically distributed data better. These mixture model clustering algorithms have continued to develop, improving the adaptability of the models to different data distributions. The method of obtaining the initial cluster centers with the density-distance center algorithm can be applied to all of these mixture models (the Gaussian mixture model, the t-distribution mixture model and the skew t-distribution mixture model). However, in view of the distribution characteristics of blood cell flow data, and considering implementation complexity and run-time efficiency, the t-distribution mixture model is used for cluster analysis here.
The mixture model algorithm is described below:
1) Mixture model
Let X be a p-dimensional random vector and let $x_1,x_2,\ldots,x_n$ be n mutually independent p-dimensional random sample observations of X. The probability density function of the mixture model of k components that generates X is defined as:
$$f(x;\Theta)=\sum_{i=1}^{k}\pi_i f(x;\theta_i) \tag{5}$$
where k is the number of components of the mixture model; $\Theta=(\pi_1,\ldots,\pi_{k-1},\theta_1,\ldots,\theta_k)$ contains the unknown parameters; $f(x;\theta_i)$ is the probability density function of the i-th component and $\theta_i$ is its unknown parameter vector; $\pi_i$ is the mixing proportion, that is, the weight of the i-th component density in the mixture model, which satisfies $0\le\pi_i\le 1$ and $\sum_{i=1}^{k}\pi_i=1$.
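2) t-mixture model
If $f(x;\theta_i)$ in formula (5) is a t-distribution, then $f(x;\Theta)$ is a t-mixture model. The probability density function of the p-dimensional t-distribution has the form:
$$f(x;\mu,\Sigma,\upsilon)=\frac{\Gamma\!\left(\frac{\upsilon+p}{2}\right)|\Sigma|^{-1/2}}{(\pi\upsilon)^{p/2}\,\Gamma\!\left(\frac{\upsilon}{2}\right)\left[1+\delta(x;\mu,\Sigma)/\upsilon\right]^{(\upsilon+p)/2}} \tag{6}$$
where $\mu$ is the location parameter, $\Sigma$ is a positive definite matrix, $\upsilon$ is the degrees of freedom, $\delta(x;\mu,\Sigma)=(x-\mu)^T\Sigma^{-1}(x-\mu)$ is the squared Mahalanobis distance between x and $\mu$, and $\Gamma(x)$ is the Gamma function, defined as $\Gamma(x)=\int_0^{\infty}t^{x-1}e^{-t}\,dt$. For the t-mixture model, every component density function is a p-dimensional t-distribution density function, and the mixture takes the form:
$$f(x;\Theta)=\sum_{j=1}^{k}\pi_j f(x;\mu_j,\Sigma_j,\upsilon_j) \tag{7}$$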
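To make the t-distribution component density concrete, the sketch below evaluates it for the two-dimensional case (p = 2), which matches a two-channel scatter plot. It uses the closed-form inverse of a 2x2 matrix to obtain the squared Mahalanobis distance. The function name `t_pdf_2d` and the test values are assumptions for illustration, and `nu` stands for the degrees of freedom υ.

```python
import math

def t_pdf_2d(x, mu, sigma, nu):
    """Density of a bivariate t-distribution with location mu,
    2x2 positive definite scale matrix sigma and nu degrees of freedom."""
    (a, b), (c, d) = sigma
    det = a * d - b * c
    dx0, dx1 = x[0] - mu[0], x[1] - mu[1]
    # Squared Mahalanobis distance (x - mu)^T sigma^{-1} (x - mu).
    maha = (d * dx0 * dx0 - (b + c) * dx0 * dx1 + a * dx1 * dx1) / det
    p = 2
    # Work in log space so that the Gamma functions do not overflow.
    log_norm = (
        math.lgamma((nu + p) / 2)
        - math.lgamma(nu / 2)
        - (p / 2) * math.log(math.pi * nu)
        - 0.5 * math.log(det)
    )
    return math.exp(log_norm - ((nu + p) / 2) * math.log1p(maha / nu))

identity = ((1.0, 0.0), (0.0, 1.0))
peak = t_pdf_2d((0.0, 0.0), (0.0, 0.0), identity, nu=3.0)
off_peak = t_pdf_2d((1.0, 0.0), (0.0, 0.0), identity, nu=3.0)
print(peak, off_peak)
```

With an identity scale matrix, the density at the location parameter equals 1/(2π) for any degrees of freedom, because Γ(υ/2 + 1) = (υ/2)·Γ(υ/2). This gives a convenient sanity check for the normalizing constant.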
For flow data that can be divided into k clusters, the t-mixture model assumes that the data are composed of k t-distributions. The final clustering result is the k flow cell populations corresponding to the k t-distributions. A maximum likelihood estimate is set up from the flow data sample, and the mixture parameters of the maximum likelihood estimate can be obtained with the EM algorithm. Let $X_i$ be a p-dimensional sample value in the flow data, $X_i=(x_{i1},x_{i2},\ldots,x_{ip})^T$. Introduce the component label vector $Z_i=(z_{i1},z_{i2},\ldots,z_{ik})^T$ of $X_i$, which satisfies $z_{ij}=1$ when $X_i$ belongs to the j-th t-distribution and $z_{ij}=0$ otherwise. That is, $Z_i$ indicates which t-distribution the sample value $X_i$ belongs to. The complete data vector is then $X_C=(X^T,Z_1^T,Z_2^T,\ldots,Z_n^T)^T$, where $X=(X_1^T,X_2^T,\ldots,X_n^T)^T$. The corresponding log-likelihood function can be written as:
$$\ln L_c(\Theta\mid X_C)=\sum_{i=1}^{n}\sum_{j=1}^{k}z_{ij}\,\ln\!\big(\pi_j f(X_i;\mu_j,\Sigma_j,\upsilon_j)\big) \tag{8}$$
3) EM algorithm estimation
For the t-mixture model, parameter estimation with the EM algorithm proceeds as follows:
(1) E-step: let $\Theta^{(t)}$ be the estimate at the t-th iteration. The conditional expectation of the log-likelihood function given $\Theta^{(t)}$ is
$$Q(\Theta;\Theta^{(t)})=E\big(\ln L_c(\Theta\mid X_C);\Theta^{(t)}\big) \tag{9}$$
(2) M-step: find the $\Theta^{(t+1)}$ that maximizes $Q(\Theta;\Theta^{(t)})$ of formula (9), that is,
$$\Theta^{(t+1)}=\arg\max_{\Theta}Q(\Theta;\Theta^{(t)}) \tag{10}$$
(3) Formulas (9) and (10) are iterated in a loop until the parameters converge, giving the estimate of the parameter $\Theta$.
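The E-step and M-step loop can be illustrated with a deliberately simplified stand-in: a one-dimensional Gaussian mixture rather than the multivariate t-mixture used in the method. The point of the sketch is only to show how externally supplied initial centers enter the iteration as the starting values of the location parameters. All names and data values are assumptions for illustration.

```python
import math

def em_gaussian_1d(data, init_means, n_iter=50, var_floor=1e-6):
    """EM for a 1-D Gaussian mixture, started from the given initial means."""
    k = len(init_means)
    means = list(init_means)
    variances = [1.0] * k
    weights = [1.0 / k] * k
    for _ in range(n_iter):
        # E-step: tau[i][j] is the posterior probability that sample i
        # belongs to component j under the current parameters.
        tau = []
        for x in data:
            dens = [
                weights[j]
                * math.exp(-((x - means[j]) ** 2) / (2 * variances[j]))
                / math.sqrt(2 * math.pi * variances[j])
                for j in range(k)
            ]
            total = sum(dens)
            tau.append([d / total for d in dens])
        # M-step: re-estimate weights, means and variances from tau.
        for j in range(k):
            n_j = sum(t[j] for t in tau)
            weights[j] = n_j / len(data)
            means[j] = sum(t[j] * x for t, x in zip(tau, data)) / n_j
            var = sum(t[j] * (x - means[j]) ** 2 for t, x in zip(tau, data)) / n_j
            variances[j] = max(var_floor, var)
    # Assign each sample to its most probable component.
    labels = [max(range(k), key=lambda j: t[j]) for t in tau]
    return means, weights, labels

# Hypothetical 1-D data with two groups; the initial means play the role of
# the centers supplied by the density-distance step.
data = [0.9, 1.0, 1.1, 4.9, 5.0, 5.1]
means, weights, labels = em_gaussian_1d(data, init_means=[1.0, 5.0])
print(means, labels)
```

Because EM only climbs to a nearby local optimum, the final result depends on `init_means`. That dependence is the reason the method chooses the starting centers deterministically instead of at random.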
The EM algorithm yields iterative update formulas for the relevant parameters. The degrees of freedom $\upsilon_j^{(t+1)}$ are obtained as the solution of a nonlinear equation.
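The update formulas themselves are not reproduced in this text. For orientation only, the updates commonly used for multivariate t-mixtures in the literature (for example in the formulation by Peel and McLachlan) are given below, on the assumption that they correspond to what is meant here. The symbols $\tau_{ij}$, $u_{ij}$, $n_j$ and the digamma function $\psi$ are notation introduced for this sketch.

```latex
% E-step quantities, evaluated at the current estimate \Theta^{(t)}
\tau_{ij}^{(t)} = \frac{\pi_j^{(t)} f(X_i;\mu_j^{(t)},\Sigma_j^{(t)},\upsilon_j^{(t)})}
                       {\sum_{l=1}^{k}\pi_l^{(t)} f(X_i;\mu_l^{(t)},\Sigma_l^{(t)},\upsilon_l^{(t)})},
\qquad
u_{ij}^{(t)} = \frac{\upsilon_j^{(t)}+p}{\upsilon_j^{(t)}+\delta(X_i;\mu_j^{(t)},\Sigma_j^{(t)})}

% M-step updates
\pi_j^{(t+1)} = \frac{1}{n}\sum_{i=1}^{n}\tau_{ij}^{(t)},
\qquad
\mu_j^{(t+1)} = \frac{\sum_{i=1}^{n}\tau_{ij}^{(t)}u_{ij}^{(t)}X_i}{\sum_{i=1}^{n}\tau_{ij}^{(t)}u_{ij}^{(t)}}

\Sigma_j^{(t+1)} = \frac{\sum_{i=1}^{n}\tau_{ij}^{(t)}u_{ij}^{(t)}
                   (X_i-\mu_j^{(t+1)})(X_i-\mu_j^{(t+1)})^T}{\sum_{i=1}^{n}\tau_{ij}^{(t)}}

% Degrees of freedom: \upsilon_j^{(t+1)} solves
-\psi\!\left(\tfrac{\upsilon_j}{2}\right)+\ln\!\left(\tfrac{\upsilon_j}{2}\right)+1
+\frac{1}{n_j^{(t)}}\sum_{i=1}^{n}\tau_{ij}^{(t)}\left(\ln u_{ij}^{(t)}-u_{ij}^{(t)}\right)
+\psi\!\left(\tfrac{\upsilon_j^{(t)}+p}{2}\right)-\ln\!\left(\tfrac{\upsilon_j^{(t)}+p}{2}\right)=0,
\qquad n_j^{(t)}=\sum_{i=1}^{n}\tau_{ij}^{(t)}
```

In this formulation the weight $u_{ij}$ shrinks for events far from the component location, which is what makes the t-mixture less sensitive to outlying events than a Gaussian mixture.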
In step 407, clustering yields multiple particle clusters, which can be marked with different colors and counted by class, as shown in Fig. (2c).
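Once every event carries a cluster label, the differential count reduces to tallying labels and expressing each tally as a percentage of all events. The sketch below uses hypothetical labels and counts. The function name `count_populations` is an assumption.

```python
from collections import Counter

def count_populations(labels):
    """Return {cluster name: (event count, percentage of all events)}."""
    counts = Counter(labels)
    total = len(labels)
    return {name: (n, 100.0 * n / total) for name, n in counts.items()}

# Hypothetical cluster labels for 20 events.
labels = (
    ["lymphocyte"] * 8 + ["monocyte"] * 1 + ["granulocyte"] * 10 + ["debris"] * 1
)
stats = count_populations(labels)
for name, (n, pct) in stats.items():
    print(f"{name}: {n} events ({pct:.1f}%)")
```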
Embodiment I:
Fig. 2 shows specific embodiment I of the method. For the flow data sample to be processed, a two-dimensional scatter plot is built from the measurement data of the forward scatter channel (FSC) and the side scatter channel (SSC), as shown in Fig. (2b) (the horizontal axis is the side scatter channel and the vertical axis is the forward scatter channel). This is a normal sample in which the monocyte population accounts for 5% and all clusters are clearly separated: the lymphocyte population is at the upper left, debris at the lower left, the monocyte population at the upper middle, and the granulocyte population on the right.
The distance and local density parameters of each particle, obtained by the density-distance center algorithm, are plotted in the distance-density distribution plot shown in Fig. (2a), where the horizontal axis is local density and the vertical axis is distance.
A local density threshold is set and particles whose local density is below the threshold are excluded. The remaining particles are sorted by distance in descending order. The number of clusters is set to k=4, and the first k particles in the sorted sequence are selected as the initial cluster centers. The selected cluster centers are marked with "o", "+", "Δ" and " " respectively in Fig. (2b).
The 1st initial cluster center selected is $X_{2719}$ in the data set, denoted $X_{c_1}$;
The 2nd initial cluster center selected is $X_{102}$ in the data set, denoted $X_{c_2}$;
The 3rd initial cluster center selected is $X_{3546}$ in the data set, denoted $X_{c_3}$;
The 4th initial cluster center selected is $X_{1568}$ in the data set, denoted $X_{c_4}$.
The initial cluster centers obtained are used as the initial values of the mixture model, and the flow data are solved iteratively according to the mixture model, with parameter estimation by the maximum likelihood algorithm. The result of cluster analysis with the t-distribution mixture model is shown in Fig. (2c). Each particle cluster is marked with a different color and counted by class. The sample in Fig. 2 contains many noise points. If it were solved with the mixture model alone, misclassification would be likely because the solution can become trapped in a local optimum. Determining the initial cluster centers with the density-distance center algorithm ensures the stability and accuracy of the finite mixture model result.
Taking the classification results of manual gating as the reference, the present algorithm divides the sample into 4 clusters: red blood cell debris, lymphocytes, monocytes and granulocytes. Compared with the manual gating results, the error of the present algorithm for the monocytes, which are few in number, is 0.33%.
Embodiment II:
Fig. 3 shows specific embodiment II of the method. For the flow data sample to be processed, a two-dimensional scatter plot is built from the data of the forward scatter channel (FSC) and the side scatter channel (SSC), as shown in Fig. (3b) (the horizontal axis is side scatter and the vertical axis is forward scatter). The monocyte population in this sample is very small, accounting for 2%, which represents a patient sample or an extreme case.
The distance and local density parameters of each particle, obtained by the density-distance center algorithm, are plotted in the distance-density distribution plot shown in Fig. (3a), where the horizontal axis is local density and the vertical axis is distance.
A local density threshold is set and particles whose local density is below the threshold are excluded. The remaining particles are sorted by distance in descending order. The number of clusters is set to k=4, and the first k particles in the sorted sequence are selected as the initial cluster centers. The selected cluster centers are marked with "o", "+", "Δ" and " " respectively in Fig. (3b).
The initial cluster centers obtained are used as the initial values of the mixture model, and the flow data are solved iteratively according to the mixture model, with parameter estimation by the maximum likelihood algorithm. The result of cluster analysis with the t-distribution mixture model is shown in Fig. (3c). Each particle cluster is marked with a different color and counted by class. The monocyte population in this sample is very small and sparsely distributed, so it is easily disturbed by adjacent dominant clusters and mistakenly absorbed into another cluster. Determining the initial cluster centers with the density-distance center algorithm ensures the stability and accuracy of the finite mixture model result.
Taking the classification results of manual gating as the reference, the present algorithm divides the sample into 4 clusters: red blood cell debris, lymphocytes, monocytes and granulocytes. Compared with the manual gating results, the error of the present algorithm for the monocytes, which are few in number, is 0.19%.
Embodiment III:
Fig. 4 shows specific embodiment III of the method. For the flow data sample to be processed, a two-dimensional scatter plot is built from the data of the forward scatter channel (FSC) and the side scatter channel (SSC), as shown in Fig. (4b) (the horizontal axis is side scatter and the vertical axis is forward scatter). The monocyte population in this sample is not only small (accounting for 2%) but also lies very close to the lymphocyte population and partially overlaps it.
The distance and local density of each particle, obtained with the density-distance center algorithm, are plotted in a distance-density distribution diagram, as shown in Figure (4a), where the horizontal axis is local density and the vertical axis is distance.
A local density threshold is set, and particles whose local density is below the threshold are excluded. The remaining particles are sorted in descending order of distance. With the number of clusters set to k=4, the first k particles in this ordering are taken as the initial cluster centers. The selected cluster centers are marked with "o", "+", "Δ" and " " in Figure (4b).
The initial cluster centers serve as the initial values of the mixture model, which is then solved iteratively on the flow data, with parameters estimated by the maximum likelihood algorithm. The result of cluster analysis with the t-distribution mixture model is shown in Figure (4c). Each particle cluster is marked with a different color, and differential counts are computed. The monocyte population of this sample is very small, lies very close to the lymphocyte population and partially overlaps it, so it is easily disturbed by the adjacent dominant population and misassigned to the lymphocyte cluster. Determining the initial cluster centers with the density-distance center algorithm counters this and helps ensure the stability and accuracy of the finite mixture model result.
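The iterative solution described above can be illustrated with a small expectation-maximization loop. The sketch below is a simplification under stated assumptions: it fits a mixture of spherical Gaussians rather than the t-distribution mixture used in this embodiment, and the function name `em_spherical_gmm`, the starting variance of 1.0 and the fixed iteration count are all hypothetical choices for illustration.

```python
import math

def em_spherical_gmm(points, init_means, n_iter=50):
    """EM for a mixture of spherical Gaussians, started from given means."""
    dim = len(points[0])
    k = len(init_means)
    means = [list(m) for m in init_means]
    variances = [1.0] * k      # assumed starting variance per component
    weights = [1.0 / k] * k    # assumed equal starting weights
    for _ in range(n_iter):
        # E-step: responsibility of each component for each point,
        # computed in log space to avoid underflow.
        resp = []
        for p in points:
            logs = []
            for m, v, w in zip(means, variances, weights):
                sq = sum((a - b) ** 2 for a, b in zip(p, m))
                logs.append(math.log(w)
                            - 0.5 * dim * math.log(2 * math.pi * v)
                            - sq / (2 * v))
            top = max(logs)
            probs = [math.exp(value - top) for value in logs]
            total = sum(probs)
            resp.append([q / total for q in probs])
        # M-step: maximum-likelihood updates of weights, means, variances.
        for c in range(k):
            r_sum = max(sum(r[c] for r in resp), 1e-12)  # guard empty component
            weights[c] = r_sum / len(points)
            means[c] = [sum(r[c] * p[d] for r, p in zip(resp, points)) / r_sum
                        for d in range(dim)]
            sq_sum = sum(r[c] * sum((a - b) ** 2 for a, b in zip(p, means[c]))
                         for r, p in zip(resp, points))
            variances[c] = max(sq_sum / (dim * r_sum), 1e-6)  # variance floor
    # Assign each point to the component with the highest responsibility.
    labels = [max(range(k), key=lambda c: r[c]) for r in resp]
    return labels, means

# Toy data: two well-separated groups, each seeded with one of its own points.
points = [(0, 0), (0.2, 0.1), (0.1, 0.3), (5, 5), (5.2, 5.1), (5.1, 4.8)]
labels, means = em_spherical_gmm(points, init_means=[(0.1, 0.3), (5, 5)])
print(labels, means)
```

Because the loop only refines the means it is given, a poorly placed starting center can leave a small population absorbed by a larger neighbor, which is the failure the density-distance initialization is meant to avoid.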
Taking the classification result of manual gating as the reference, the sample clustered by this algorithm is divided into 4 groups: red blood cell fragments, lymphocytes, monocytes and granulocytes. Compared with manual gating, the error of this algorithm for the monocytes, the smallest population, is 0.27%.
Taken together, the embodiments show that the density-distance center algorithm gives stable results under difficult distributions, such as small-sample populations and populations that lie close to one another. Initial cluster centers determined by the density-distance center algorithm are accurate and reliable, handle the localization and classification of small-sample populations well, and effectively exclude the interference of noise points, which helps ensure the stability and accuracy of the finite mixture model result. Used as the initial center values of the mixture model clustering algorithm, they also speed up the computation.
Claims (7)
1. A flow cell particle classification and counting method based on a density-distance center algorithm, characterized by comprising the following steps:
1) acquiring, with a flow cytometer, a flow data set of the cell particles to be classified and counted, the flow data set comprising multidimensional data of the particles;
2) obtaining the local density and distance parameters of each particle in the flow data set according to the density-distance center algorithm, then screening and sorting the particles to obtain the initial cluster centers to be clustered;
3) using the initial cluster centers as the initial values of a mixture model algorithm, clustering the particles according to the mixture model to obtain multiple classified particle clusters, and computing count statistics.
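The count statistics at the end of step 3) amount to tallying the cluster label of every particle. A minimal sketch follows, assuming the labels are already available as a list; the function name `differential_count` and the label strings are hypothetical.

```python
from collections import Counter

def differential_count(labels):
    """Return {cluster: (count, percentage of all particles)}."""
    counts = Counter(labels)
    total = len(labels)
    return {name: (n, 100.0 * n / total) for name, n in counts.items()}

# Toy labels for four particles.
print(differential_count(["lymphocyte"] * 3 + ["monocyte"]))
```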
2. The flow cell particle classification and counting method based on a density-distance center algorithm according to claim 1, characterized in that, in step 1), when the data in the flow data set are two-dimensional, a two-dimensional scatter diagram is formed with the forward scatter channel data on the y-axis and the side scatter channel data on the x-axis, or with the side scatter channel data on the y-axis and the fluorescence channel data on the x-axis; when the data in the flow data set are three-dimensional, a three-dimensional scatter diagram is formed with the forward scatter channel data on the x-axis, the side scatter channel data on the y-axis and the fluorescence channel data on the z-axis.
3. The flow cell particle classification and counting method based on a density-distance center algorithm according to claim 1, characterized in that step 2) specifically comprises the following steps:
21) for the flow data set $S=\{x_1, x_2, \ldots, x_i, \ldots, x_n\}$, the local density $\rho_i$ and the distance $\delta_i$ of the i-th particle $x_i$ are defined respectively as:
$$\rho_i = \sum_j \chi(d_{ij} - d_c)$$

$$\chi(x) = \begin{cases} 1, & x < 0 \\ 0, & x \ge 0 \end{cases}$$

$$\delta_i = \min_{j:\,\rho_j > \rho_i} (d_{ij})$$
where $d_{ij}$ is the Euclidean distance from $x_i$ to $x_j$, $d_c$ is the cutoff distance, and $\chi(x)$ is the indicator function defined above;
22) a local density threshold $\rho_0$ is set, and particles whose local density is below the threshold are excluded;
23) the remaining particles are arranged in a sequence in descending order of distance;
24) the number of clusters k is set, and the first k particles of the sequence are taken as the initial cluster centers to be clustered.
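Steps 21) to 24) can be sketched in a few lines of plain Python. This is an illustration under assumptions, not the patented implementation: the function name `density_distance_centers` and the toy data are hypothetical, the density count excludes the particle itself (which shifts every density by the same constant), and the particle with no denser neighbor is given its largest distance to any other particle.

```python
import math

def density_distance_centers(points, d_c, rho_0, k):
    """Pick k initial cluster centers by local density and distance."""
    n = len(points)
    dist = [[math.dist(p, q) for q in points] for p in points]
    # Step 21: rho_i counts the other particles closer than the cutoff d_c.
    rho = [sum(1 for j in range(n) if j != i and dist[i][j] < d_c)
           for i in range(n)]
    delta = []
    for i in range(n):
        # delta_i: distance to the nearest particle of higher density.
        denser = [dist[i][j] for j in range(n) if rho[j] > rho[i]]
        # No denser particle exists: use the largest distance from i instead.
        delta.append(min(denser) if denser else max(dist[i]))
    # Step 22: drop particles whose density is below the threshold rho_0.
    kept = [i for i in range(n) if rho[i] >= rho_0]
    # Step 23: sort the survivors by distance, largest first.
    kept.sort(key=lambda i: delta[i], reverse=True)
    # Step 24: the first k particles are the initial centers.
    return kept[:k], rho, delta

# Toy data: two small groups and one isolated noise point at (10, 0).
points = [(0, 0), (0.1, 0), (0, 0.1), (0.1, 0.1), (0.05, 0.05),
          (5, 5), (5.1, 5), (5, 5.1), (5.05, 5.05), (10, 0)]
centers, rho, delta = density_distance_centers(points, d_c=0.08, rho_0=1, k=2)
print(centers)
```

The isolated point has a large distance value but zero density, so the threshold in step 22) removes it before it can be mistaken for a center.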
4. The flow cell particle classification and counting method based on a density-distance center algorithm according to claim 3, characterized in that, in step 21), when the i-th particle is the point of maximum local density, $\delta_i$ is assigned the maximum of the distances from the i-th particle to all other points, that is:
$$\delta_i = \max_j (d_{ij}).$$
5. The flow cell particle classification and counting method based on a density-distance center algorithm according to claim 3, characterized in that, in step 21), when multiple particles have the same local density, an increment approaching 0 is added to that local density, and the local density and distance parameters of each particle are then recalculated.
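One way to picture the tie-breaking increment is shown below. This is only a sketch: the function name `break_density_ties` and the increment size `eps` are hypothetical, and `eps` must stay above floating-point resolution for the density values in use.

```python
def break_density_ties(rho, eps=1e-9):
    """Add a tiny, distinct increment to each repeated density value."""
    seen = {}
    adjusted = []
    for value in rho:
        count = seen.get(value, 0)       # how many times this value occurred
        adjusted.append(value + count * eps)
        seen[value] = count + 1
    return adjusted

# Three particles share density 3; afterwards all densities are distinct.
print(break_density_ties([3, 1, 3, 3]))
```

With distinct densities, at most one particle lacks a denser neighbor, so the distance parameter no longer assigns the maximum distance to several tied particles at once.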
6. The flow cell particle classification and counting method based on a density-distance center algorithm according to claim 3, characterized in that, in step 24), when the Euclidean distance between two cluster centers is less than a set threshold, the two are regarded as belonging to the same cluster, and either of the two cluster centers is taken as the new cluster center, or the one with the larger local density is taken as the new cluster center.
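The merging rule can be sketched as a single pass over the candidate centers in order of decreasing density. The sketch implements the second option (keep the denser of two close centers); the function name `merge_close_centers` and the parameter `min_sep` for the distance threshold are hypothetical.

```python
import math

def merge_close_centers(centers, rho, points, min_sep):
    """Keep a candidate only if no denser kept center lies within min_sep."""
    kept = []
    # Visit candidates from densest to sparsest so the denser one survives.
    for c in sorted(centers, key=lambda i: rho[i], reverse=True):
        if all(math.dist(points[c], points[j]) >= min_sep for j in kept):
            kept.append(c)
    return kept

# Candidates 0 and 1 are only 0.2 apart; candidate 1 is denser and is kept.
points = [(0, 0), (0.2, 0), (5, 5)]
print(merge_close_centers([0, 1, 2], rho=[4, 6, 3], points=points, min_sep=1.0))
```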
7. The flow cell particle classification and counting method based on a density-distance center algorithm according to claim 1, characterized in that, in step 3), the mixture model algorithm includes the Gaussian mixture model, the t-distribution mixture model and the skew t-distribution mixture model.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710641341.0A CN107389536B (en) | 2017-07-31 | 2017-07-31 | Flow cell particle classification counting method based on density-distance center algorithm |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710641341.0A CN107389536B (en) | 2017-07-31 | 2017-07-31 | Flow cell particle classification counting method based on density-distance center algorithm |
Publications (2)
Publication Number | Publication Date |
---|---|
CN107389536A (en) | 2017-11-24 |
CN107389536B CN107389536B (en) | 2020-03-31 |
Family
ID=60343087
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201710641341.0A Active CN107389536B (en) | 2017-07-31 | 2017-07-31 | Flow cell particle classification counting method based on density-distance center algorithm |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN107389536B (en) |
Also Published As
Publication number | Publication date |
---|---|
CN107389536B (en) | 2020-03-31 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||