CN106548204A - The fast automatic grouping method of Flow cytometry data - Google Patents
- Publication number
- CN106548204A
- Authority
- CN
- China
- Prior art keywords
- principal component
- matrix
- data
- point
- cluster
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/23—Clustering techniques
- G06F18/232—Non-hierarchical techniques
- G06F18/2321—Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
- G06F18/23213—Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions with fixed number of clusters, e.g. K-means clustering
Landscapes
- Engineering & Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Artificial Intelligence (AREA)
- Life Sciences & Earth Sciences (AREA)
- Bioinformatics & Computational Biology (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Evolutionary Biology (AREA)
- Evolutionary Computation (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Probability & Statistics with Applications (AREA)
- Investigating Or Analysing Biological Materials (AREA)
Abstract
The invention provides a method for fast automatic grouping of flow cytometry data, the method comprising the following steps. Step 1: process the flow cytometry data using PCA, including the following sub-steps: 1) standardize the sample matrix X to obtain the standardized matrix X*; 2) compute its correlation matrix and perform an eigendecomposition to obtain the eigenvalues (λ1≥λ2≥…≥λp) and their corresponding eigenvectors a1, a2, …, ap; 3) determine the number k of principal components according to the variance contribution rate of the principal components; 4) from the eigenvectors corresponding to the first k principal components, U=[a1, a2, …, ak], obtain the matrix W=X*U (the product of X* and U) that projects the sample data onto the k principal component vectors. Step 2: cluster the flow cytometry data using an improved K-means algorithm to obtain group labels. Step 3: set the principal components with the largest contribution rates as coordinate axes and draw a scatter plot. Step 4: achieve automatic grouping.
Description
Technical field
The present invention relates to the biomedical field, and in particular to a method for fast automatic grouping of flow cytometry data.
Background
The flow cytometer has become one of the most important instruments for biological research and clinical diagnosis. Flow cytometry is a technique for fast, multiparameter analysis or sorting of suspended cells or other particles. A flow cytometer can measure various physicochemical properties of individual cells: from each cell it obtains scattered light signals (SC), which represent cell volume and granularity, and several fluorescence pulse signals (FL), which represent the content of each antigen, and it extracts characteristic parameters of these signals such as peak height, pulse width and area. The scattered light and fluorescence signals produced by each cell are recorded as a single event, and all events together form the complete flow data of the measured cell population.
Data analysis is one of the difficult points of flow cytometry; its main purpose is to identify and separate the cell subsets in a sample. Flow cytometry data are usually analysed visually with two-dimensional scatter plots that display the parameters of two measurement channels; the parameters can be forward scattered light (FSC), side scattered light (SSC) or fluorescence signals. A two-dimensional scatter plot, however, can only analyse two dimensions at a time, while multiparameter flow data are high-dimensional and large. If the flow data have n parameters, the number of scatter plots that can be drawn by choosing any two parameters as the horizontal and vertical axes is C(n, 2) = n(n-1)/2. In general, cell subsets are not clearly separated in a scatter plot whose axis parameters are chosen at random. The operator needs a high level of expertise to choose specific parameter combinations that give a satisfactory grouping result, and the process is tedious and time-consuming.
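To give a sense of scale for the number of candidate scatter plots mentioned above, the following minimal sketch enumerates the unordered axis pairs. It assumes the count is taken over unordered pairs of parameters; the function name `axis_pairs` is hypothetical, and the 11 parameter names are those listed for the example data set in this document.

```python
from itertools import combinations

# Parameter names as listed for the example data set in this document.
params = ["FITC-H", "PE-H", "APC-H",
          "FSC-A", "SSC-A", "FITC-A", "PE-A", "APC-A",
          "FITC-W", "PE-W", "APC-W"]

def axis_pairs(names):
    """Return every unordered pair of parameters usable as scatter-plot axes."""
    return list(combinations(names, 2))

pairs = axis_pairs(params)
n = len(params)
# The enumerated count agrees with the closed form n(n-1)/2.
print(len(pairs), n * (n - 1) // 2)
```

With 11 parameters an operator would already face 55 candidate plots, which illustrates why choosing the axes manually is slow.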
Summary of the invention
To solve the above problems, the object of the present invention is to provide a method for fast automatic grouping of flow cytometry data, the method comprising the following steps. Step 1: process the flow cytometry data using PCA, including the following sub-steps: 1) standardize the sample matrix X to obtain the standardized matrix X*; 2) compute its correlation matrix and perform an eigendecomposition to obtain the eigenvalues (λ1≥λ2≥…≥λp) and their corresponding eigenvectors a1, a2, …, ap; 3) determine the number k of principal components according to the variance contribution rate of the principal components; 4) from the eigenvectors corresponding to the first k principal components, U=[a1, a2, …, ak], obtain the matrix W=X*U (the product of X* and U) that projects the sample data onto the k principal component vectors. Step 2: cluster the flow cytometry data using an improved K-means algorithm to obtain group labels. Step 3: set the principal components with the largest contribution rates as coordinate axes and draw a scatter plot. Step 4: achieve automatic grouping.
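Sub-step 3) can be sketched as follows. The source does not state how the contribution rate is turned into a value of k, so this sketch assumes a cumulative-contribution cutoff; the function name `choose_k`, the 0.85 threshold and the eigenvalues are all hypothetical and for illustration only.

```python
def choose_k(eigenvalues, threshold=0.85):
    """Return the smallest k whose cumulative contribution rate reaches `threshold`.

    `threshold` is an assumed cutoff; the source does not specify one.
    Also returns the per-component contribution rates, largest first.
    """
    ordered = sorted(eigenvalues, reverse=True)
    total = sum(ordered)
    rates = [v / total for v in ordered]  # contribution rate of each component
    cumulative = 0.0
    for k, rate in enumerate(rates, start=1):
        cumulative += rate
        if cumulative >= threshold:
            return k, rates
    return len(ordered), rates

# Made-up eigenvalues, not taken from the patent's data.
k, rates = choose_k([4.0, 3.0, 2.0, 1.0], threshold=0.85)
print(k, rates)
```

For plotting, the method only needs the two or three components with the largest contribution rates, so k chosen this way is an upper bound on what is drawn rather than the number of axes.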
Preferably, Step 2 specifically comprises: determining one data point as the first initial cluster center; choosing the data point farthest from the first cluster center as the second cluster center; choosing the data point farthest from the first two cluster centers as the third cluster center; and so on, until n initial cluster centers have been determined. Finally, clustering is achieved by iterating on the distance from each data point to the cluster centers.
It should be understood that both the foregoing general description and the following detailed description are exemplary and explanatory, and should not be taken as limiting the claimed invention.
Brief description of the drawings
Further objects, functions and advantages of the present invention are illustrated by the following description of embodiments with reference to the accompanying drawings, in which:
Fig. 1 is a flow chart of the method of the present invention for fast automatic grouping of flow cytometry data;
Fig. 2 is a schematic diagram of the result obtained by drawing two-dimensional scatter plots with the traditional manual grouping method;
Fig. 3 shows the contribution rate and cumulative contribution rate of the principal components obtained after processing with the PCA method of the present invention;
Fig. 4 is a schematic diagram of the grouping result obtained with the method of the present invention.
Detailed description
The objects and functions of the present invention, and the methods for achieving them, are explained below with reference to exemplary embodiments. The present invention is not, however, limited to the exemplary embodiments disclosed below; it can be implemented in different forms. The description is intended only to help those skilled in the relevant art gain a comprehensive understanding of the specific details of the invention.
Embodiments of the present invention are described below with reference to the drawings. In the drawings, identical reference numerals denote identical or similar parts, or identical or similar steps.
The present invention proposes applying principal component analysis (PCA) to the analysis of multiparameter flow data. Dimensionality reduction and feature extraction are performed on the flow data, and the two principal component variables that best reflect the differences between cell subsets are used as the horizontal and vertical axes of a two-dimensional scatter plot, on which grouping and cluster analysis of the sample are carried out.
PCA is a commonly used multivariate statistical analysis technique. Following the maximum-variance principle, it uses a linear transformation to select a small number of significant variables that replace the original variables, reducing the dimensionality of the data while preserving as much of its useful information as possible. The PCA algorithm first standardizes the sample matrix X to obtain the standardized matrix X*; it then computes the correlation matrix and performs an eigendecomposition to obtain the eigenvalues (λ1≥λ2≥…≥λp) and their corresponding eigenvectors a1, a2, …, ap; next, it determines the number k of principal components according to the variance contribution rate of the principal components; finally, from the eigenvectors corresponding to the first k principal components, U=[a1, a2, …, ak], it obtains the matrix W=X*U (the product of X* and U) that projects the sample data onto the k principal component vectors. Multiparameter flow cytometry data are large and high-dimensional. The PCA method reduces the dimensionality and redundancy of the data; the principal component variables are taken as new feature variables, the coordinate axes are set automatically, the scatter plot is drawn, and automatic grouping is achieved.
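The four PCA sub-steps described above can be sketched with NumPy as follows. This is an illustrative sketch rather than the patent's implementation: the function name `pca_scores` and the synthetic input are hypothetical, and the sketch assumes no parameter column is constant (a constant column would make the standardization divide by zero).

```python
import numpy as np

def pca_scores(X, k):
    """Project samples onto the first k principal components of the correlation matrix."""
    X = np.asarray(X, dtype=float)
    # 1) Standardize each parameter (column) to zero mean and unit variance.
    X_std = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    # 2) Correlation matrix and its eigendecomposition (eigh suits symmetric matrices).
    R = np.corrcoef(X_std, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(R)
    order = np.argsort(eigvals)[::-1]          # eigh returns ascending order
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    # 3) Variance contribution rate of each component.
    contribution = eigvals / eigvals.sum()
    # 4) U holds the first k eigenvectors as columns; W = X* U are the scores.
    U = eigvecs[:, :k]
    W = X_std @ U
    return W, eigvals, contribution

# Synthetic data for illustration only (200 events, 5 parameters).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
X[:, 1] = 2 * X[:, 0] + 0.1 * rng.normal(size=200)  # make two columns correlated
W, eigvals, contribution = pca_scores(X, k=2)
print(W.shape, contribution)
```

The first two columns of `W` are the coordinates that would be used as the horizontal and vertical axes of the scatter plot.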
K-means is a typical distance-based clustering algorithm; it is fast, simple and efficient. The present method uses an improved K-means algorithm to achieve automatic gating of cells. The improvement lies mainly in how the positions of the initial cluster centers are determined. The traditional K-means algorithm usually selects n values at random as the initial cluster centers, which makes the clustering result unstable. In the present method, one data point is first determined as the first initial cluster center; the data point farthest from the first cluster center is then chosen as the second cluster center; next, the data point farthest from the first two cluster centers is chosen as the third cluster center; and so on, until n initial cluster centers have been determined. Finally, clustering is achieved by iterating on the distance from each data point to the cluster centers.
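The initialization and iteration described above can be sketched in plain Python. Two points are assumptions, because the source leaves them open: the first center defaults to the first data point (parameter `first`), and "farthest from the centers already chosen" is read as maximizing the distance to the nearest chosen center. The function names are hypothetical.

```python
import math

def farthest_first_centers(points, n, first=0):
    """Pick n initial centers: start at points[first], then repeatedly take the
    point whose nearest already-chosen center is farthest away (assumed reading)."""
    centers = [points[first]]
    while len(centers) < n:
        nxt = max(points, key=lambda p: min(math.dist(p, c) for c in centers))
        centers.append(nxt)
    return centers

def kmeans(points, n, first=0, max_iter=100):
    """Standard K-means iterations started from the farthest-first centers."""
    centers = farthest_first_centers(points, n, first)
    labels = None
    for _ in range(max_iter):
        # Assign every point to its nearest center.
        new_labels = [min(range(n), key=lambda j: math.dist(p, centers[j]))
                      for p in points]
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
        # Move each center to the mean of its members (keep it if the cluster is empty).
        for j in range(n):
            members = [p for p, l in zip(points, labels) if l == j]
            if members:
                centers[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return labels, centers

# Toy 2-D points for illustration; in the method these would be PCA scores.
points = [(0.0, 0.0), (0.1, 0.2), (5.0, 5.0), (5.2, 4.9), (0.0, 5.0), (0.2, 5.1)]
labels, centers = kmeans(points, n=3)
print(labels)
```

Because no random draw is involved, repeated runs on the same data give the same labels, which is the stability benefit the text attributes to the improved initialization.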
The method provided by the present invention achieves automatic grouping of flow cytometry data without manually setting the coordinate parameters of the scatter plot: the two or three principal components with the largest contribution rates obtained after processing are automatically set as the coordinate axes, and the flow cytometry data are then grouped automatically. In addition, cluster analysis of the processed flow data with the improved K-means clustering algorithm yields a class label for each event in the flow cytometry data, achieving the gating of different cell subsets. Fig. 1 is a flow chart of the method of the present invention for fast automatic grouping of flow cytometry data. The automatic grouping result of the method is consistent with the traditional manual grouping result, while the analysis time is far shorter than that of manual analysis; this improves both the efficiency of cell grouping and the reliability of the grouping result. The method has good application prospects in the analysis of multiparameter flow cytometry data and can also be applied to other fields of biomedical data analysis. Fig. 2 is a schematic diagram of the result obtained by drawing two-dimensional scatter plots with the traditional manual grouping method. Fig. 3 shows the contribution rate and cumulative contribution rate of the principal components obtained after processing with the PCA method of the present invention. Fig. 4 is a schematic diagram of the grouping result obtained with the present invention. A comparison of Fig. 2 and Fig. 4 shows that the grouping obtained with the present invention is better than that of the traditional grouping method.
Flow cytometry experimental data of human peripheral blood lymphocytes were used as the processing object. The sample contains 4811 cells and 3 lymphocyte surface differentiation antigens (CD3+, CD19+ and CD56+). The flow data of each cell comprise 11 parameters: pulse height (FITC-H, PE-H, APC-H), pulse area (FSC-A, SSC-A, FITC-A, PE-A, APC-A) and pulse width (FITC-W, PE-W, APC-W).
Table 1: Eigenvalues and eigenvectors of the two principal components with the largest contribution rates (PC1 and PC2)
Table 2: Accuracy of the PCA grouping results
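The source does not say how the grouping accuracy reported in Table 2 was computed. One common way to compare automatic cluster labels with manual gating is sketched below: each cluster is mapped to the manual population that is most frequent within it, and the fraction of matching events is reported. This is an assumption, the function name `grouping_accuracy` is hypothetical, and the labels in the example are made up.

```python
from collections import Counter

def grouping_accuracy(auto_labels, manual_labels):
    """Fraction of events that agree with manual gating after mapping each
    cluster to its majority manual population (assumed scoring rule)."""
    by_cluster = {}
    for a, m in zip(auto_labels, manual_labels):
        by_cluster.setdefault(a, Counter())[m] += 1
    # For each cluster, count the events belonging to its majority population.
    correct = sum(c.most_common(1)[0][1] for c in by_cluster.values())
    return correct / len(auto_labels)

# Made-up labels for eight events, for illustration only.
auto = [0, 0, 0, 1, 1, 2, 2, 2]
manual = ["CD3+", "CD3+", "CD19+", "CD19+", "CD19+", "CD56+", "CD56+", "CD56+"]
acc = grouping_accuracy(auto, manual)
print(acc)
```

Majority mapping can overstate agreement when two clusters map to the same population, so a one-to-one matching between clusters and populations may be preferable when the number of clusters equals the number of gated populations.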
Other embodiments of the present invention will be readily apparent to those skilled in the art from the description and practice of the invention disclosed here. The description and embodiments are to be regarded as exemplary only; the true scope and spirit of the invention are defined by the claims.
Claims (2)
1. A method for fast automatic grouping of flow cytometry data, the method comprising the following steps:
Step 1: process the flow cytometry data using PCA, including the following sub-steps:
1) standardize the sample matrix X to obtain the standardized matrix X*;
2) compute its correlation matrix and perform an eigendecomposition to obtain the eigenvalues (λ1≥λ2≥…≥λp) and their corresponding eigenvectors a1, a2, …, ap;
3) determine the number k of principal components according to the variance contribution rate of the principal components;
4) from the eigenvectors corresponding to the first k principal components, U=[a1, a2, …, ak], obtain the matrix W=X*U (the product of X* and U) that projects the sample data onto the k principal component vectors;
Step 2: cluster the flow cytometry data using an improved K-means algorithm to obtain group labels;
Step 3: set the principal components with the largest contribution rates as coordinate axes and draw a scatter plot;
Step 4: achieve automatic grouping.
2. The method according to claim 1, wherein Step 2 specifically comprises: determining one data point as the first initial cluster center; choosing the data point farthest from the first cluster center as the second cluster center; choosing the data point farthest from the first two cluster centers as the third cluster center; and so on, until n initial cluster centers have been determined; and finally achieving clustering by iterating on the distance from each data point to the cluster centers.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610943348.3A CN106548204A (en) | 2016-11-01 | 2016-11-01 | The fast automatic grouping method of Flow cytometry data |
Publications (1)
Publication Number | Publication Date |
---|---|
CN106548204A true CN106548204A (en) | 2017-03-29 |
Family
ID=58393603
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201610943348.3A Pending CN106548204A (en) | 2016-11-01 | 2016-11-01 | The fast automatic grouping method of Flow cytometry data |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN106548204A (en) |
Cited By (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108169105A (en) * | 2017-11-07 | 2018-06-15 | 山东卓越生物技术股份有限公司 | Leukocyte differential count processing method applied to cellanalyzer |
CN108287129A (en) * | 2018-03-22 | 2018-07-17 | 中国计量大学 | The detection device of multichannel fluorescence Spectra bioaerosol particle |
CN110197193A (en) * | 2019-03-18 | 2019-09-03 | 北京信息科技大学 | A kind of automatic grouping method of multi-parameter stream data |
CN112131937A (en) * | 2020-08-14 | 2020-12-25 | 中翰盛泰生物技术股份有限公司 | Automatic grouping method of fluorescent microspheres |
CN113188981A (en) * | 2021-04-30 | 2021-07-30 | 天津深析智能科技发展有限公司 | Automatic analysis method of multi-factor cytokine |
CN114136868A (en) * | 2021-12-03 | 2022-03-04 | 浙江博真生物科技有限公司 | Flow cytometry full-automatic clustering method based on density and nonparametric clustering |
CN117517176A (en) * | 2024-01-04 | 2024-02-06 | 成都棱镜泰克生物科技有限公司 | Automatic processing method and device for flow cytometry data |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101226190A (en) * | 2007-01-17 | 2008-07-23 | 深圳迈瑞生物医疗电子股份有限公司 | Automatic sorting method and apparatus for flow type cell art |
CN104200114A (en) * | 2014-09-10 | 2014-12-10 | 中国人民解放军军事医学科学院卫生装备研究所 | Flow cytometry data fast analysis method |
Non-Patent Citations (3)
Title |
---|
GERALD GREGORI等: "Hyperspectral Cytometry at the Single-Cell Level Using a 32-Channel Photodetector", 《CYTOMETRY PART A BANNER》 * |
MALCOLM F. WILKINS等: "Comparison of Five Clustering Algorithms to Classify Phytoplankton From Flow Cytometry Data", 《CYTOMETRY BANNER》 * |
周鹏等: "基于主成分分析和支持向量机的睡眠分期研究", 《生物医学工程学杂志》 * |
Cited By (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108169105A (en) * | 2017-11-07 | 2018-06-15 | 山东卓越生物技术股份有限公司 | Leukocyte differential count processing method applied to cellanalyzer |
CN108169105B (en) * | 2017-11-07 | 2020-12-18 | 山东卓越生物技术股份有限公司 | Leukocyte classification processing method applied to hematology analyzer |
CN108287129A (en) * | 2018-03-22 | 2018-07-17 | 中国计量大学 | The detection device of multichannel fluorescence Spectra bioaerosol particle |
CN110197193A (en) * | 2019-03-18 | 2019-09-03 | 北京信息科技大学 | A kind of automatic grouping method of multi-parameter stream data |
CN112131937A (en) * | 2020-08-14 | 2020-12-25 | 中翰盛泰生物技术股份有限公司 | Automatic grouping method of fluorescent microspheres |
CN113188981A (en) * | 2021-04-30 | 2021-07-30 | 天津深析智能科技发展有限公司 | Automatic analysis method of multi-factor cytokine |
CN114136868A (en) * | 2021-12-03 | 2022-03-04 | 浙江博真生物科技有限公司 | Flow cytometry full-automatic clustering method based on density and nonparametric clustering |
CN117517176A (en) * | 2024-01-04 | 2024-02-06 | 成都棱镜泰克生物科技有限公司 | Automatic processing method and device for flow cytometry data |
CN117517176B (en) * | 2024-01-04 | 2024-03-22 | 成都棱镜泰克生物科技有限公司 | Automatic processing method and device for flow cytometry data |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | Application publication date: 20170329 |