WO2018151680A1 - Methods and devices for identifying cell populations in data - Google Patents

Methods and devices for identifying cell populations in data

Info

Publication number
WO2018151680A1
Authority
WO
WIPO (PCT)
Prior art keywords
points
density
gating
density model
determining
Prior art date
Application number
PCT/SG2018/050073
Other languages
English (en)
Inventor
Hao Chen
JinMiao CHEN
Michael Poidinger
Anis LARBI
Xavier CAMOUS
Original Assignee
Agency For Science, Technology And Research
Priority date
Filing date
Publication date
Application filed by Agency For Science, Technology And Research
Publication of WO2018151680A1

Classifications

    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01N INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N15/00 Investigating characteristics of particles; Investigating permeability, pore-volume or surface-area of porous materials
    • G01N15/10 Investigating individual particles
    • G01N15/14 Optical investigation techniques, e.g. flow cytometry
    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01N INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N15/00 Investigating characteristics of particles; Investigating permeability, pore-volume or surface-area of porous materials
    • G01N15/10 Investigating individual particles
    • G01N2015/1006 Investigating individual particles for cytology
    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01N INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N15/00 Investigating characteristics of particles; Investigating permeability, pore-volume or surface-area of porous materials
    • G01N15/10 Investigating individual particles
    • G01N15/14 Optical investigation techniques, e.g. flow cytometry
    • G01N2015/1402 Data analysis by thresholding or gating operations performed on the acquired signals or stored data

Definitions

  • Various aspects relate generally to methods and devices for identifying clusters in multidimensional data, including in particle analysis such as gating of flow cytometry data.
  • Flow cytometry devices and other particle analyzers provide for the identification and characterization of particles based on certain predetermined parameters, e.g. optical parameters including light scatter and fluorescence.
  • particles, e.g. beads, in a fluid suspension are passed through a detection region where the particles are subjected to light, typically from one or more lasers, and the light scattering and fluorescence properties of the particles are measured by sensors in the detection region.
  • Particles are typically labeled with one or more fluorescent dyes of known properties in order to facilitate detection, and the sensors in the detection region are arranged in order to detect a plurality of different properties simultaneously, e.g. each of the used fluorescent dyes and one or more light scattering properties such as forward-scattered light (FSC), side-scattered light (SSC), etc.
  • sensors, e.g. photodetectors, obtain the data for the particles in real-time as they pass through the detection region, and transmit the data to computer readable media for data storage.
  • the data obtained is multidimensional in nature, wherein each particle may correspond to a point in a multidimensional space defined by the measured parameters. Populations, or clusters, of certain types of cells are identified based on their correlation to each other in this multidimensional space.
  • Manual gating plays a crucial role in flow cytometry data analysis due to its flexibility and intuitiveness; however, manual sequential gating to extract cell populations of interest becomes tedious and labor intensive at larger file sizes. And while automated gating mechanisms provide more expeditious results, current automated gating methods do not provide the accuracy and reliability of manual gating. As particle analyzers continue to improve and become capable of gathering larger amounts of data, e.g. greater than 40 channels in mass cytometry (each channel based on a measured stable isotope mass) or up to 50 characteristics per cell using a BD FACSymphony™ flow cytometer, improved automated gating methods are necessary in order to efficiently and accurately analyze data.
  • methods and devices for efficiently and effectively processing and analyzing data, including large sets of multidimensional data, are presented. These methods may include selecting at least one of 1D modalGate and/or 2D modalGate (together, referred to as modalGate) depending on the data to be analyzed. modalGate efficiently and effectively identifies population clusters in data by determining a density model of the sorted data;
  • FIG. 1 shows a basic configuration of a flow cytometer in some aspects
  • FIG. 2 shows an internal configuration of a computer for data acquisition and analysis in some aspects
  • FIG. 3 shows detection results using 1D modalGate on the left two charts, under 300, and the detection results using 2D modalGate on the right two charts;
  • FIG. 4 shows gating results using modalGate compared to manual gating in some aspects
  • FIGs. 5-9 show a comparison of the performance of modalGate with another automated gating method named flowDensity using expert manual gating as a benchmark in some aspects
  • FIG. 10 is a box plot chart showing the F1 measurement between modalGate and flowDensity in some aspects.
  • FIG. 11 shows a flowchart describing a method in some aspects.
  • the terms “at least one” and “one or more” may be understood to include any integer number greater than or equal to one, i.e. one, two, three, four, [...], etc.
  • the term “a plurality” may be understood to include any integer number greater than or equal to two, i.e. two, three, four, five, [...], etc.
  • the phrase “at least one of” with regard to a group of elements may be used herein to mean at least one element from the group consisting of the elements.
  • the phrase “at least one of” with regard to a group of elements may be used herein to mean a selection of: one of the listed elements, a plurality of one of the listed elements, a plurality of individual listed elements, or a plurality of a multiple of listed elements.
  • data may be understood to include information in any suitable analog or digital form, e.g., provided as a file, a portion of a file, a set of files, a signal or stream, a portion of a signal or stream, a set of signals or streams, a key and/or value used in KV database, and the like. Further, the term “data” may also be used to mean a reference to information, e.g., in form of a pointer.
  • circuit or circuitry as used herein are understood as any kind of logic-implementing entity, which may include special-purpose hardware or a processor executing software.
  • a circuit may thus be an analog circuit, digital circuit, mixed-signal circuit, logic circuit, processor, microprocessor, Central Processing Unit (CPU), Graphics Processing Unit (GPU), Digital Signal Processor (DSP), Field Programmable Gate Array (FPGA), integrated circuit, Application Specific Integrated Circuit (ASIC), etc., or any combination thereof.
  • Any other kind of implementation of the respective functions, which will be described below in further detail, may also be understood as a “circuit.” It is understood that any two (or more) of the circuits detailed herein may be realized as a single circuit with substantially equivalent functionality, and conversely that any single circuit detailed herein may be realized as two (or more) separate circuits with substantially equivalent functionality. Additionally, references to a “circuit” may refer to two or more circuits that collectively form a single circuit.
  • processor or “controller” as for example used herein may be understood as any kind of entity that allows handling data.
  • the data may be handled according to one or more specific functions executed by the processor or controller.
  • a processor or controller as used herein may be understood as any kind of circuit, e.g., any kind of analog or digital circuit.
  • handle or “handling” as for example used herein referring to data handling, file handling or request handling may be understood as any kind of operation, e.g., an I/O operation, as for example, storing (i.e. writing) and reading, or any kind of logic operation.
  • a processor or a controller may be or include an analog circuit, digital circuit, mixed-signal circuit, logic circuit, processor, microprocessor, Central Processing Unit (CPU), Graphics Processing Unit (GPU), Digital Signal Processor (DSP), Field Programmable Gate Array (FPGA), integrated circuit, Application Specific Integrated Circuit (ASIC), etc., or any combination thereof. Any other kind of implementation of the respective functions, which will be described below in further detail, may also be understood as a processor, controller, or logic circuit.
  • any two (or more) of the processors, controllers, or logic circuits detailed herein may be realized as a single entity with equivalent functionality or the like, and conversely that any single processor, controller, or logic circuit detailed herein may be realized as two (or more) separate entities with equivalent functionality or the like.
  • software refers to any type of executable instruction, including firmware.
  • system e.g., a storage system, measurement system, data analysis system, etc.
  • elements can be, by way of example and not of limitation, one or more mechanical components, one or more electrical components, one or more instructions (e.g., encoded in storage media), one or more processors, and the like.
  • storage e.g., a storage device, a primary storage, storage system, etc.
  • storage may be understood as any suitable type of memory or memory device, e.g., one or more of a solid state drive (SSD), hard disk drive (HDD), redundant array of independent disks (RAID), direct-connected NVM device, etc., or any combination thereof.
  • memory may be understood as a non-transitory computer-readable medium in which data or information can be stored for retrieval. It is appreciated that a single component referred to as “memory” or “a memory” may be composed of more than one different type of memory, and thus may refer to a collective component comprising one or more types of memory. It is readily understood that any single memory component may be separated into multiple collectively equivalent memory components, and vice versa. Furthermore, while memory may be depicted as separate from one or more other components (such as in the drawings), it is understood that memory may be integrated within another component, such as on a common integrated chip.
  • FIG. 1 shows a basic configuration of a flow cytometer 100. It is appreciated that FIG. 1 is exemplary in nature and may thus be simplified for purposes of this explanation.
  • Flow cytometry is a laser-based or impedance-based technology employed in cell counting, cell sorting, biomarker detection, protein engineering, and similar fields, which suspends cells 150 in a stream of fluid (sheath fluid 152) and passes the cells through an electronic detection apparatus.
  • a flow cytometer allows for simultaneous multi-parametric analysis of the physical and/or chemical characteristics of the particles at high flow rates, e.g. thousands of particles a second.
  • a flow cytometer has several main components: a flow cell 102, a measuring system (e.g. including one or more lasers 104), a detection region 110-116, an amplification system
  • the flow cell 102 has a liquid stream (sheath fluid) 152 which carries and aligns the cells 150 (the only one indicated is the one passing through the laser) so that they pass in single file through one or more light beams, i.e. from one or more lasers 104, for sensing.
  • the measuring system may employ measurement of impedance and optical systems, including lamps, high power lasers, low power lasers, diode lasers, etc., which result in light signals which are detected by the sensors, 110-116, of the detection region.
  • These sensors may include a forward-scatter light (FSC) detector, side-scatter light (SSC) detector, and one or more fluorescent (Fl) marker detectors, 114-116 (where N may be any integer greater than
  • the detection region may also include dichroic glass/mirrors, e.g. 120-122, for separating lights of different wavelengths for the fluorescent (Fl) marker sensors.
  • Each of the signals from the sensors may pass through analog to digital converters (ADC) to convert the analog measurements of the FSC, SSC, and/or dye-specific fluorescence signal into digital signals that can be processed by the computer 130, wherein the computer has a storage medium for data acquisition and one or more processors for data analysis. Data acquisition is performed by the computer physically connected to the flow cytometer, and includes software which handles the digital interface with the flow cytometer.
  • the software may be configured to adjust parameters (e.g., voltage, compensation) for testing, and also may assist in displaying initial sample information while acquiring sample data to ensure that parameters are set correctly.
  • this software may include executable instructions which when retrieved by a processor of the computer 130, execute the processes described herein.
  • the signal data obtained from the sensors in the detection region may be stored in data storage 202 in order to be analyzed. This may include converting one or more signals corresponding to each particle into n-dimensional data, where n is an integer greater than or equal to 1.
  • a processor 206 may plot the n-dimensional data in a Cartesian coordinate system (e.g. (x, y) or (x, y, z)) as shown in FIGs. 3-9 for data analysis.
  • methods and devices for performing analysis of data sets are disclosed.
  • the methods and devices presented herein provide unbiased, objective, fast, and fully automated methods for data analysis of sets of population (modal) density data.
  • FIG. 2 shows an exemplary internal configuration of a computer 130 for data acquisition and analysis in some aspects. It is appreciated that FIG. 2 is exemplary in nature and may thus be simplified for purposes of this explanation.
  • Data storage 202 may be one or more memory devices configured to receive data from the sensors 110-116 of flow cytometer 100. Data storage may be configured to store this data as raw FCS files.
  • Data analyzer 204 may include a processor 206 and a memory 208.
  • Processor 206 may be a single processor or multiple processors, and may be configured to retrieve and execute program code to perform the methods as described herein. Processor 206 may transmit and receive data over a software-level connection that is physically transmitted as signals.
  • Memory 208 may be a non-transitory computer readable medium storing instructions for subroutines 208a-208d, which may include executable instructions for performing the 1D modalGate and/or 2D modalGate methods described herein.
  • These executable instructions may include instructions for: determining a density model of sorted flow cytometry data 208a; calculating a derivative value at each of a plurality of points on the density model 208b; determining, using the calculated derivative values, one or more peaks in the density model and one or more points of inflection (POI) in the density model 208c; and identifying one or more gating points based on the one or more peaks, or the one or more points of inflection 208d.
  • Computer 130 may further be configured with an interface to the one or more lasers (not shown).
  • clusters i.e. populations
  • the performance of cell subset detection by clustering depends on the data "cleanness.”
  • Most clustering methods work optimally on FCS files that are obtained after several steps of pre-gating, such as removing doublets, dead cells, lineage+ cells, etc.
  • FCS raw flow cytometry standard
  • the disclosure herein may include initially conducting a quality control in order to remove low-quality cell events from the FCS files, e.g. using flowAI.
  • in flowAI, two methods to clean FCS files of unwanted events (i.e. data occurrences) are provided: 1) an automatic method that adopts algorithms for the detection of anomalies, and 2) an interactive method with a graphical user interface implemented in an R Shiny application.
  • the general approach behind these two methods of flowAI includes three steps to check and remove suspected anomalies attributed to 1) abrupt changes in the flow rate, 2) instability of signal acquisition, and 3) outliers in the lower limit and margin events in the upper limit of the dynamic range.
  • the first step i.e. flow rate check
  • the second step (i.e. signal acquisition check) verifies the stability of the signal acquired over time, which may include, for example, verifying the quality of signal acquisition using Levey-Jennings-type graphs, where fluorescence is plotted against time.
  • a stable signal acquisition should produce intensity values whose distribution is consistent throughout the course of the experiment.
  • the third step (i.e. outlier check in the ranges) is performed at both the lower and upper limit of the dynamic range of the acquired data, which accounts for the occurrence of “margin events,” i.e. measurements with a real value higher/lower than the upper/lower limits, respectively, causing an accumulation of signals which is not comparable with the rest of the acquired data.
  • the gating methods (modalGate) described herein may be performed.
  • a flexible and efficient automated gating algorithm, herein referred to as modalGate, is provided for analyzing populations of data.
  • modalGate can efficiently detect the modals (i.e. populations or clusters) and determine the gating boundary by tracking the density changes of selected markers.
  • this gating algorithm can be further customized to fulfill most gating purposes implemented by manual gating while achieving data analysis speeds associated with automated gating.
  • the algorithm disclosed herein may construct an automated gating pipeline by chaining multiple customized modalGates together to gate the data, e.g. flow cytometry data, in a hierarchical manner.
  • the pipeline is data driven and may automatically adjust to account for variation among samples, and may therefore be applied as a gating template to any FCS files with a similar staining panel.
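The chaining idea can be sketched as follows, assuming each gate is a function that receives only the events surviving the previous gates and returns a boolean mask for them. The function name `run_gating_template` and the dictionary-of-channels layout are hypothetical and are not taken from the disclosure.

```python
import numpy as np

def run_gating_template(events, gates):
    """Apply gates one after another, each on the survivors of the last.

    events: dict mapping a channel name to a 1D NumPy array (one value per event).
    gates:  list of (name, function) pairs; each function takes the dict of
            surviving events and returns a boolean array of the same length.
    Returns the final keep-mask over all events and the count left after each gate.
    """
    n_events = len(next(iter(events.values())))
    keep = np.ones(n_events, dtype=bool)
    counts = {}
    for name, gate in gates:
        survivors = {channel: values[keep] for channel, values in events.items()}
        passed = np.asarray(gate(survivors), dtype=bool)
        surviving_idx = np.flatnonzero(keep)
        keep[surviving_idx[~passed]] = False  # drop events rejected by this gate
        counts[name] = int(keep.sum())
    return keep, counts
```

Because every gate recomputes its boundary from the events it is handed, the same list of gates can be reused on another file with a similar staining panel.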
  • a gating template may be constructed to automatically gate myeloid cells from raw flow cytometry data, whereby the output matches that of manual gating with high precision and outperforms current automated gating methods such as flowDensity.
  • the modalGate algorithms automatically detect gating boundaries on markers by tracking changes in density.
  • Equation (2) where for appropriate functions is a kernel-
  • the derivative value d_j of f(x) at the edge point of each bin is calculated to track the slope change of the density using Equation (4).
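The density and its binned derivative can be sketched in Python. This is an illustration under stated assumptions, not the disclosed equations, which are not reproduced in this text: the bandwidth uses Silverman's rule of thumb, the derivative is a forward finite difference between neighbouring bin edges, and the function name is hypothetical. The Gaussian kernel, the fixed number of bins (about 512) and the bin size as domain width divided by the number of bins follow Examples 5 to 7 and 16.

```python
import numpy as np

def density_and_derivative(x, k=512):
    """Gaussian KDE of 1D data on k + 1 bin edges, plus the slope d_j of each bin.

    Assumptions (not from the source): Silverman's rule-of-thumb bandwidth
    and a forward difference for the derivative.
    """
    x = np.sort(np.asarray(x, dtype=float))
    n = x.size
    q75, q25 = np.percentile(x, [75, 25])
    spread = x.std(ddof=1)
    if q75 > q25:
        spread = min(spread, (q75 - q25) / 1.34)
    h = 0.9 * spread * n ** (-0.2)                 # assumed bandwidth rule
    edges = np.linspace(x[0], x[-1], k + 1)        # edge points of the k bins
    b_size = (x[-1] - x[0]) / k                    # bin size, as in Example 7
    u = (edges[:, None] - x[None, :]) / h
    f = np.exp(-0.5 * u ** 2).sum(axis=1) / (n * h * np.sqrt(2 * np.pi))
    d = np.diff(f) / b_size                        # slope between adjacent edges
    return edges, f, d, b_size
```

The dense `edges` by `x` matrix keeps the sketch short; for files with millions of events a chunked or FFT-based density estimate would be the more practical choice.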
  • Equation (5)
  • the local maximum or minimum points in d_j represent the change points, i.e. points of inflection, of density.
  • a local maximum in d_j is the place where the increasing rate of density reaches its maximum in a modal; a local minimum in d_j labels the point with the maximal decreasing rate of density in a modal.
  • Change points c, i.e. points of inflection, are identified using Equation (6):
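Peak and change-point detection from the derivative values can be sketched as below. The peak test follows Examples 8 to 10 (the derivative is positive on one side of the point and the product of the two neighbouring derivatives is negative); the change-point test is a plain local-extremum check on the derivative, as in Example 11. The exact form of Equations (5) and (6) is not reproduced in this text, so the indexing and the function name are assumptions.

```python
import numpy as np

def peaks_and_change_points(edges, d):
    """Locate density peaks and change points (points of inflection).

    edges: k + 1 bin edges; d: k slopes, where d[j] is the slope of the
    density between edges[j] and edges[j + 1].
    """
    edges = np.asarray(edges, dtype=float)
    d = np.asarray(d, dtype=float)
    # Peak at edges[j]: slope positive just before it and negative just after.
    j = np.arange(1, d.size)
    is_peak = (d[j - 1] > 0) & (d[j - 1] * d[j] < 0)
    peaks = edges[j[is_peak]]
    # Change point: bin whose slope is a local maximum or minimum of d.
    i = np.arange(1, d.size - 1)
    is_extremum = (d[i] - d[i - 1]) * (d[i + 1] - d[i]) < 0
    mid = 0.5 * (edges[:-1] + edges[1:])
    change_points = mid[i[is_extremum]]            # reported at the bin centre
    return peaks, change_points
```

For a single Gaussian-shaped modal this yields one peak at the mode and two change points, roughly one standard deviation on either side of it.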
  • 1D modalGate can efficiently detect the gating position (g) at either the minimal intersection point between any two adjacent peaks or the cutting point along the tail of a specified peak, shown by Equation (7).
  • k and l are two adjustable parameters, wherein k affects the sensitivity of detecting peaks and change points (points of inflection), and l determines the distance of the gate on a tail from the peak.
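The between-peaks case can be sketched as follows. Example 3 places the gate where the absolute derivative between two peaks is smallest; because the derivative is also close to zero right next to each peak, this sketch first restricts the search to the stretch between the steepest descent after the left peak and the steepest ascent before the right peak. That restriction, and the function name, are assumptions. The tail rule that uses l depends on Equation (7), which is not reproduced in this text, so it is not sketched.

```python
import numpy as np

def valley_gate(edges, d, left_peak, right_peak):
    """Gate between two adjacent peaks at the smallest absolute slope.

    edges: k + 1 bin edges; d: k slopes (d[j] belongs to bin j).
    The search window runs from the steepest descent to the steepest
    ascent between the peaks (an assumed way to exclude the peaks themselves).
    """
    edges = np.asarray(edges, dtype=float)
    d = np.asarray(d, dtype=float)
    mid = 0.5 * (edges[:-1] + edges[1:])
    between = np.flatnonzero((mid > left_peak) & (mid < right_peak))
    if between.size == 0:
        raise ValueError("no bins between the two peaks")
    start = between[np.argmin(d[between])]         # steepest descent
    stop = between[np.argmax(d[between])]          # steepest ascent
    if start > stop:
        raise ValueError("no valley found between the two peaks")
    window = np.arange(start, stop + 1)
    return mid[window[np.argmin(np.abs(d[window]))]]
```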
  • 2D modalGate extends the application of 1D modalGate to multi-dimensional, e.g. 2D, gating by implementing a bin-aligned density tracking.
  • the kernel density is determined using a bivariate normal
  • Equation (8), where the bandwidths h_x and h_y are calculated using the same method as h
  • the data is binned on x and y to a grid of k_x rows and k_y columns with bin size
  • the density track on the two-dimensional data is transformed to a density track on each bin of x or y, depending on the axis along which the density values are tracked.
  • 1D modalGate is applied to detect the gating point on each bin, and all the gating points of each bin are linked up using linear regression in order to obtain a gating line on the two-dimensional plot.
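The bin-by-bin idea can be sketched in Python under simplifying assumptions. Instead of the bivariate kernel density described above, this sketch slices the events into k_y bins along y, runs a 1D Gaussian density estimate on the x values of each slice, takes the density minimum between the two tallest peaks as that slice's gating point, and fits a straight line through the points with least squares. The bandwidth rule, the `min_events` threshold and all function names are hypothetical.

```python
import numpy as np

def kde_on_grid(x, k):
    """1D Gaussian KDE on k + 1 grid points (assumed rule-of-thumb bandwidth)."""
    x = np.asarray(x, dtype=float)
    h = 0.9 * x.std(ddof=1) * x.size ** (-0.2)
    grid = np.linspace(x.min(), x.max(), k + 1)
    u = (grid[:, None] - x[None, :]) / h
    f = np.exp(-0.5 * u ** 2).sum(axis=1) / (x.size * h * np.sqrt(2 * np.pi))
    return grid, f

def valley_1d(x, k=100):
    """Density minimum between the two tallest peaks, or None if fewer than two."""
    grid, f = kde_on_grid(x, k)
    d = np.diff(f)
    j = np.arange(1, d.size)
    peaks = j[(d[j - 1] > 0) & (d[j - 1] * d[j] < 0)]
    if peaks.size < 2:
        return None
    a, b = np.sort(peaks[np.argsort(f[peaks])[-2:]])
    return grid[a + np.argmin(f[a:b + 1])]

def gate_line_2d(x, y, k_x=100, k_y=30, min_events=50):
    """Fit g(y) = slope * y + intercept through the per-slice gating points."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    y_edges = np.linspace(y.min(), y.max(), k_y + 1)
    which = np.clip(np.digitize(y, y_edges) - 1, 0, k_y - 1)
    centres, gates = [], []
    for b in range(k_y):
        xs = x[which == b]
        if xs.size < min_events:                   # skip sparse slices
            continue
        g = valley_1d(xs, k_x)
        if g is not None:
            centres.append(0.5 * (y_edges[b] + y_edges[b + 1]))
            gates.append(g)
    if len(gates) < 2:
        raise ValueError("not enough slices with two peaks to fit a line")
    slope, intercept = np.polyfit(centres, gates, 1)
    return slope, intercept
```

An event would then fall on one side of the gate according to whether its x value is below or above `slope * y + intercept`.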
  • FIG. 3 shows detection results using 1D modalGate on the left two charts, under 300, and the detection results using 2D modalGate on the right two charts, under 350.
  • the top panel shows the plot of the derivative of the density plot against x and the bottom panel shows the density plot of exemplary data results, e.g. FSC, SSC, or any fluorescent marker data obtained from a flow cytometer.
  • Two peaks are detected in the density plot and labeled with black circles (corresponding to zero values on the derivative curve in the top panel), while the four vertical dashed lines labeled 302 represent the four detected points of inflection (i.e. change points) on the density plot.
  • the graphs under 350 illustrate the application of 2D modalGate on two markers, x and y, of an exemplary set of data.
  • the two markers may be, for example, FSC, SSC, or any fluorescent marker data obtained via a flow cytometer.
  • x has been divided into 100 bins and y has been divided into 30 bins.
  • 1D modalGate is applied to find the gating point between the two peaks.
  • the gating point on each bin is shown by the dashed line, where the solid line is drawn with the linear regression of all the gating points.
  • FIG. 4 shows the gating results using modalGate 450A-450E compared to manual gating 400A-400E in one exemplary comparison for pre-gating of myeloid cells from raw Flow Cytometry Standard (FCS) files.
  • the gating results of modalGate 450 match up very well with those of expert gating 400.
  • the methods described herein may be used for pre-gating for other types of cells, e.g. T-cells, B-cells, or the like, and include the use of other data obtained from flow cytometers (e.g. using other fluorescent markers or light scatter data).
  • This pre-gating includes five sequential gates (A-E) to remove the unwanted cells step by step, including beads and debris, doublets, CD45- cells, dead cells and Lineage positive cells. These pre-gating steps are necessary for all myeloid cell analysis using flow cytometry. However, the data usually vary between different files and samples, which makes a static gating template unsuitable for batches of FCS files. Manual adjustment is thus performed by FCS experts to gate the myeloid population on each file using software like FlowJo, and is shown in 400A-400E.
  • PBMC cells are gated. This takes two markers, FSC-A and SSC-A
  • 1D modalGate finds the boundary of two populations (two modals) with low SSC-A value, denoted as [...]. Then, for cells with [...]
  • 1D modalGate to find the boundary of beads at the tail of the first modal on FSC-A, denoted as xj.
  • 1D modalGate to find the boundary of the first modal, denoted as [...].
  • the gating of PBMC cells is to connect the point plus the vertical line of (0, yj)
  • the CD45+ cells i.e. leukocytes groups
  • 1D modalGate is applied to find the tail of the first modal of marker CD45 on the positive side.
  • the gating of alive cells is performed. 2D modalGate is used on the markers DAPI-A and SSC-A. DAPI-A is cut into 100 bins while SSC-A is cut into 30 bins. For bins of SSC-A with total density above the 50th percentile, the minimal intersection points between the first and second modals are detected and then combined with a linear regression line to get the gating for alive cells.
  • 450E the lineage negative cells are gated. 1D modalGate is applied to find the boundary of the first modal and the second modal on marker Lineage. Combining these five automated gates, 450A-450E, an automated gating template for extracting myeloid cells from raw FCS is constructed. Before loading the raw FCS data for automated gating, a quality control is performed with the default parameters using an in-house developed toolkit.
  • FIGs. 5-9 compare the performance of modalGate with another automated gating method named flowDensity. Expert manual gating is also shown and used as the benchmark.
  • flowDensity is designed to automate the 2D traditional gating scheme by choosing the best cut-off points using characteristics of the marker density distribution. While modalGate and flowDensity both rely on the density estimation of the obtained data, they differ in how they extract the characteristics of the density distribution that are used to determine the gating points.
  • in flowDensity, the number of peaks is directly detected from the density distribution, then the height and width of peaks, the standard deviation of the peak, percentile of the density distribution and the slope of the distribution curve are calculated as characteristics for aiding the determination of gating position.
  • modalGate, in contrast, uses the derivative values from the density distribution, i.e. determining the peaks and points of inflection.
  • FIG. 5 shows a comparison between modalGate (middle row) and flowDensity
  • FIG. 6 shows a comparison between modalGate (middle row) and flowDensity (bottom row) for gating a single cell population from 5 different FCS files. Expert manual gating (top row) is used as the benchmark.
  • FIG. 7 shows a comparison between modalGate (middle row) and flowDensity (bottom row) for gating the CD45+ population from 5 different FCS files. Expert manual gating (top row) is used as the benchmark.
  • FIG. 8 shows a comparison between modalGate (middle row) and flowDensity (bottom row) for gating a live cell population from 5 different FCS files. Expert manual gating (top row) is used as the benchmark.
  • FIG. 9 shows a comparison between modalGate (middle row) and flowDensity (bottom row) for gating the lineage negative cell population from 5 different FCS files. Expert manual gating (top row) is used as the benchmark.
  • FIG. 10 is a box plot chart 1000 showing the F1 measurement between modalGate (B) and flowDensity (A) in each of five pre-gating steps (i.e. PBMC, SingleCell, CD45pos, Alive, and LINneg) of myeloid cells.
  • Chart 1000 shows that modalGate is quite robust for different gates as well as for different files, as indicated by small interquartile ranges and few outliers in the F1 measurements. Also, the median F1 measurements approach 1 in all five gates for modalGate. In contrast, flowDensity shows a high interquartile range in several gates, e.g. in the PBMC, Alive cell and LINneg gates. This indicates that flowDensity does not perform as robustly as modalGate on different files, and in several cases its F1 measure is even lower than 0.5, e.g. in Alive cell gating.
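The text does not define the F1 measurement. Assuming it denotes the standard F1 score (the harmonic mean of precision and recall), with the expert manual gate taken as ground truth, it can be computed per gate and per file as sketched below; the function name and the boolean-mask inputs are hypothetical.

```python
import numpy as np

def f1_measure(auto_mask, manual_mask):
    """F1 score of an automated gate against a manual gate.

    Both inputs are boolean arrays with one entry per event, True where
    the event falls inside the gate. The manual gate is the benchmark.
    """
    auto = np.asarray(auto_mask, dtype=bool)
    manual = np.asarray(manual_mask, dtype=bool)
    true_pos = np.count_nonzero(auto & manual)
    if true_pos == 0:
        return 0.0
    precision = true_pos / np.count_nonzero(auto)
    recall = true_pos / np.count_nonzero(manual)
    return 2 * precision * recall / (precision + recall)
```

A value of 1 means the automated gate selects exactly the events the expert selected; values below 0.5 indicate that the two gates disagree on most of the selected events.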
  • Table 1 shows a more detailed statistical assessment between modalGate and flowDensity in five different gates of 5 FCS files described above.
  • modalGate demonstrates high accuracy and robustness in the pre-gating of the myeloid population, outperforming known gating methods such as flowDensity in testing. While shown as being used for pre-gating of myeloid data in some aspects, it is appreciated that modalGate may be used as an algorithm for building automated gating templates, thereby facilitating the automated and reproducible analysis of flow cytometry data.
  • FIG. 11 shows a flowchart 1100 for gating of population clusters in a data set in some aspects of this disclosure. It is appreciated that flowchart 1100 is exemplary in nature and may thus be simplified for purposes of this explanation.
  • the data is sorted according to one or more measured parameters.
  • a density model is determined based on the sorted data.
  • a derivative value at each of a plurality of points along the density model is calculated.
  • one or more gating points are identified based on the one or more peaks, or the one or more points of inflection, e.g. this may include using both one or more peaks and one or more points of inflection.
  • Example 1 a method for automated gating of flow cytometry data by a processing device, the method comprising sorting the flow cytometry data according to one or more parameters measured by a flow cytometer; determining a density model of the sorted flow cytometry data; calculating a derivative value at each of a plurality of points on the density model; determining, using the calculated derivative values: one or more peaks in the density model, and/or one or more points of inflection in the density model; and identifying one or more gating points based on the one or more peaks, or the one or more points of inflection.
  • Example 2 the subject matter of Example 1 may include when multiple peaks are determined in the density model, the method further comprising identifying at least one gating point between two of the multiple peaks based on the calculated derivative values of the density model between the two said peaks.
  • Example 3 the subject matter of Example 2 may include identifying the at least one gating point by determining a minimum absolute value of the calculated derivative values between the two said peaks.
  • Example 4 the subject matter of Examples 1-3 may include wherein the one or more parameters measured by the flow cytometer is selected from the group consisting of: forward scattered light (FSC); side scattered light (SSC); and a fluorescent activated marker.
  • Example 5 the subject matter of Examples 1-4 may include determining the density model by applying a Gaussian kernel density estimation.
  • Example 6 the subject matter of Examples 1-5 may include determining the plurality of points along a domain of the density model by using a fixed number of bins.
  • Example 7 the subject matter of Example 6 may include determining a size of each of the bins by dividing a difference of a maximum of the domain and a minimum of the domain by the fixed number of bins.
  • Example 8 the subject matter of Examples 1-7 may include determining at least one of the one or more peaks by using a first calculated derivative value on one side of a point and a second calculated derivative value on a second side of the point.
  • Example 9 the subject matter of Example 8 may include wherein the product of the first calculated derivative value and the second calculated derivative value is less than zero.
  • Example 10 the subject matter of Example 9 may include wherein the first calculated derivative value is greater than zero.
  • Example 11 the subject matter of Examples 1-10 may include determining the one or more points of inflection of the density model by determining a local maximum and/or a local minimum in the calculated derivative values.
  • Example 12 the subject matter of Examples 1-11 may include determining the one or more points of inflection according to: where j is a number of a point of the plurality of points on the density model, c is a point of inflection, d_j is a derivative value at j, and m is a determined peak.
  • Example 13 the subject matter of Examples 1-12 may include wherein the one or more gating points, g, of the sorted flow cytometry data are identified according to:
  • j is a number of a point of the plurality of points on the density model
  • c is a point of inflection
  • d_j is a derivative value at j
  • b_size is a fixed distance between each of the plurality of points on the density model
  • m is a determined peak
  • I is an adjustable parameter configured to determine a distance between the gating on a respective tail of a respective peak.
  • Example 14 the subject matter of Example 13 may include setting a default value of I to about 1.
  • Example 15 the subject matter of Examples 1-14 may include adjusting a sensitivity of the determining of the peaks and points of inflection by setting a pre-determined value for the number of the plurality of points along the density model.
  • Example 16 the subject matter of Example 15 may include setting the predetermined value to about 512.
  • Example 17 the subject matter of Example 1 may include sorting the flow cytometry data according to two parameters measured by a flow cytometer.
  • Example 18 the subject matter of Example 17 may include determining the density model by applying a bivariate normal kernel function to the sorted flow cytometry data.
  • Example 19 the subject matter of Examples 17-18 may include determining a plurality of density tracks in the density model along either one of an x-axis or a y-axis of the density model.
  • Example 20 the subject matter of Example 19 may include wherein determining the plurality of density tracks comprises determining an "A" number of bins along the x-axis of the density model and a "B" number of bins along the y-axis of the density model.
  • Example 21 the subject matter of Example 20 may include determining a bin size dimension along the x-axis, b_xsize, by dividing a difference of a maximum of the density model along the x-axis and a minimum of the density model along the x-axis by "A."
  • Example 22 the subject matter of Examples 20-21 may include determining a bin size dimension along the y-axis, b_ysize, by dividing a difference of a maximum of the density model along the y-axis and a minimum of the density model along the y-axis by "B."
  • Example 23 the subject matter of Examples 20-22 may include calculating a density value of each of the bins selected along the x-axis or y-axis of the density model.
  • Example 24 the subject matter of Example 23 may include wherein determining the plurality of density tracks comprises transforming the calculated density values into the plurality of density tracks along the selected axis of the density model.
  • Example 25 the subject matter of Example 24 may include wherein calculating the derivative value at each of the plurality of points on the density model comprises calculating the derivative value at each of the plurality of points along each of the plurality of density tracks.
  • Example 26 the subject matter of Example 25 may include wherein each of the plurality of points along each density track is an edge of a bin dimension along the non-selected axis.
  • Example 27 the subject matter of Examples 25-26 may include wherein one or more peaks and one or more points of inflection are determined on each of the plurality of density tracks.
  • Example 28 the subject matter of Example 27 may include wherein at least one gating point is identified on each of the plurality of density tracks, providing a multiplicity of gating points.
  • Example 29 the subject matter of Example 28 may include identifying a gating line comprising a linear regression of the multiplicity of gating points.
  • Example 30 machine-readable storage including machine-readable instructions which, when executed by a processor of a device, cause the device to implement a method as recited in any preceding Example.
  • Example 31 a system for performing flow cytometry experiments, the system comprising: a flow cytometer configured to measure one or more parameters of a sample comprising a plurality of particles; and a processing device configured to: obtain the measurements from the flow cytometer; sort the measurements according to the one or more parameters; determine a density model of the sorted measurements; calculate a derivative value at each of a plurality of points on the density model; determine, using the calculated derivative values: one or more peaks in the density model, and/or one or more points of inflection in the density model; and identify one or more gating points based on the one or more peaks, or the one or more points of inflection.
  • Example 32 the subject matter of Example 31 may include the processing device further configured to, when multiple peaks are determined in the density model, identify at least one gating point between two of the multiple peaks based on the calculated derivative values of the density model between the two said peaks.
  • Example 33 the subject matter of Example 32 may include the processing device further configured to identify the at least one gating point by determining a minimum absolute value of the calculated derivative values between the two said peaks.
  • Example 34 the subject matter of Examples 31-33 may include wherein the one or more parameters measured by the flow cytometer is selected from the group consisting of: forward scattered light (FSC); side scattered light (SSC); and a fluorescent activated marker.
  • Example 35 the subject matter of Examples 31-34 may include the processing device further configured to determine the density model by applying a Gaussian kernel density estimation.
  • Example 36 the subject matter of Examples 31-35 may include the processing device further configured to determine the plurality of points along a domain of the density model by using a fixed number of bins.
  • Example 37 the subject matter of Example 36 may include the processing device further configured to determine a size of each of the bins by dividing a difference of a maximum of the domain and a minimum of the domain by the fixed number of bins.
  • Example 38 the subject matter of Examples 31-37 may include the processing device further configured to determine at least one of the one or more peaks by using a first calculated derivative value on one side of a point and a second calculated derivative value on a second side of the point.
  • Example 39 the subject matter of Example 38 may include wherein the product of the first calculated derivative value and the second calculated derivative value is less than zero.
  • Example 40 the subject matter of Example 39 may include wherein the first calculated derivative value is greater than zero.
  • Example 41 the subject matter of Examples 31-40 may include the processing device further configured to determine the one or more points of inflection of the density model by determining local maximum and/or local minimum in the calculated derivative values.
  • Example 42 the subject matter of Examples 31-41 may include the processing device further configured to determine the one or more points of inflection according to:
  • Example 43 the subject matter of Examples 31-42 may include wherein the one or more gating points, g, of the sorted flow cytometry data are identified according to:
  • j is a number of a point of the plurality of points on the density model
  • c is a point of inflection
  • d_j is a derivative value at j
  • b_size is a fixed distance between each of the plurality of points on the density model
  • m is a determined peak
  • I is an adjustable parameter configured to determine a distance between the gating on a respective tail of a respective peak.
  • Example 44 the subject matter of Example 43 may include the processing device further configured to set a default value of I to about 1.
  • Example 45 the subject matter of Examples 31-44 may include the processing device further configured to adjust a sensitivity of the determining of the peaks and points of inflection by setting a pre-determined value for the number of the plurality of points along the density model.
  • Example 46 the subject matter of Example 45 may include the processing device further configured to set the pre-determined value to about 512.
  • Example 47 the subject matter of Example 31 may include the processing device further configured to sort the flow cytometry data according to two parameters measured by a flow cytometer.
  • Example 48 the subject matter of Example 47 may include the processing device further configured to determine the density model by applying a bivariate normal kernel function to the sorted flow cytometry data.
  • Example 49 the subject matter of Examples 47-48 may include the processing device further configured to determine a plurality of density tracks in the density model along either one of an x-axis or a y-axis of the density model.
  • Example 50 the subject matter of Example 49 may include wherein determining the plurality of density tracks comprises determining an "A" number of bins along the x-axis of the density model and a "B" number of bins along the y-axis of the density model.
  • Example 51 the subject matter of Example 50 may include the processing device further configured to determine a bin size dimension along the x-axis, b_xsize, by dividing a difference of a maximum of the density model along the x-axis and a minimum of the density model along the x-axis by "A."
  • Example 52 the subject matter of Examples 50-51 may include the processing device further configured to determine a bin size dimension along the y-axis, b_ysize, by dividing a difference of a maximum of the density model along the y-axis and a minimum of the density model along the y-axis by "B."
  • Example 53 the subject matter of Examples 50-52 may include the processing device further configured to calculate a density value of each of the bins selected along the x-axis or y-axis of the density model.
  • Example 54 the subject matter of Example 53 may include wherein determining the plurality of density tracks comprises transforming the calculated density values into the plurality of density tracks along the selected axis of the density model.
  • Example 55 the subject matter of Example 54 may include wherein calculating the derivative value at each of the plurality of points on the density model comprises calculating the derivative value at each of the plurality of points along each of the plurality of density tracks.
  • Example 56 the subject matter of Example 55 may include wherein each of the plurality of points along each density track is an edge of a bin dimension along the non-selected axis.
  • Example 57 the subject matter of Examples 55-56 may include wherein one or more peaks and one or more points of inflection are determined on each of the plurality of density tracks.
  • Example 58 the subject matter of Example 57 may include wherein at least one gating point is identified on each of the plurality of density tracks, providing a multiplicity of gating points.
  • Example 59 the subject matter of Example 58 may include the processing device further configured to identify a gating line comprising a linear regression of the multiplicity of gating points.
  • a device corresponding to a method detailed herein may include one or more components configured to perform each aspect of the related method.
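
The one-parameter steps in Examples 1-10 can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the applicants' implementation: the function names are hypothetical, the kernel bandwidth uses a normal-reference rule of thumb (the text names no bandwidth rule), ordering the values is only one reading of the "sorting" step, and the search for the gating point is restricted to the span between the steepest descent and the steepest ascent so that the near-zero slopes right next to the peaks are not picked up. That restriction is an addition made here, not something the Examples state.

```python
import numpy as np


def density_on_bins(values, n_bins=512):
    """Gaussian kernel density estimate evaluated at the edges of n_bins equal bins."""
    values = np.sort(np.asarray(values, dtype=float))  # one reading of "sorting"
    lo, hi = values[0], values[-1]
    b_size = (hi - lo) / n_bins  # Example 7: (max - min) / number of bins
    grid = lo + b_size * np.arange(n_bins + 1)
    # Assumption: normal-reference rule-of-thumb bandwidth.
    bandwidth = 1.06 * values.std(ddof=1) * values.size ** (-0.2)
    z = (grid[:, None] - values[None, :]) / bandwidth
    density = np.exp(-0.5 * z**2).sum(axis=1)
    density /= values.size * bandwidth * np.sqrt(2.0 * np.pi)
    return grid, density, b_size


def find_peaks(d):
    """Indices of grid points where the slope changes from positive to negative.

    d[i] is the slope between grid points i and i + 1, so grid point i + 1 has
    d[i] on one side and d[i + 1] on the other (Examples 8-10).
    """
    left, right = d[:-1], d[1:]
    return np.flatnonzero((left * right < 0) & (left > 0)) + 1


def gate_between_tallest_peaks(grid, density, d):
    """Gating point at the minimum absolute slope between the two tallest peaks."""
    peaks = find_peaks(d)
    if peaks.size < 2:
        raise ValueError("need at least two peaks")
    p1, p2 = np.sort(peaks[np.argsort(density[peaks])[-2:]])
    between = d[p1:p2]  # slopes of the segments lying between the two peaks
    # Assumption: search only between the steepest descent and steepest ascent.
    start, stop = sorted((int(np.argmin(between)), int(np.argmax(between))))
    k = p1 + start + int(np.argmin(np.abs(between[start:stop + 1])))
    return 0.5 * (grid[k] + grid[k + 1])  # midpoint of the flattest segment


# Synthetic two-population sample (illustrative data, not from the source).
rng = np.random.default_rng(0)
sample = np.concatenate([rng.normal(0.0, 1.0, 2000), rng.normal(6.0, 1.0, 2000)])
grid, density, b_size = density_on_bins(sample)
d = np.diff(density) / b_size  # derivative value between adjacent points
print("gating point:", gate_between_tallest_peaks(grid, density, d))
```

Raising `n_bins` gives more points along the density model and therefore finer peak detection, which corresponds to the sensitivity adjustment described in Examples 15-16.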

Landscapes

  • Chemical & Material Sciences (AREA)
  • Dispersion Chemistry (AREA)
  • Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Analytical Chemistry (AREA)
  • Biochemistry (AREA)
  • General Health & Medical Sciences (AREA)
  • General Physics & Mathematics (AREA)
  • Immunology (AREA)
  • Pathology (AREA)
  • Investigating Or Analysing Biological Materials (AREA)

Abstract

This invention relates to methods and devices for the automated identification of cell populations in flow cytometry data, comprising the following steps: sorting the flow cytometry data according to one or more parameters measured by a flow cytometer; determining a density model of the sorted flow cytometry data; calculating a derivative value at each of a plurality of points on the density model; determining, using the calculated derivative values: one or more peaks in the density model, and one or more points of inflection in the density model; and identifying one or more gating points based on the one or more peaks or the one or more points of inflection.
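
The inflection-point step can be illustrated with the criterion given in Example 11, where points of inflection are local maxima or minima of the calculated derivative values. The sketch below is an assumption-laden illustration: the function name is hypothetical, the density is an analytic standard normal curve rather than real cytometry data, and it does not reproduce the formula referenced in Example 12, which is missing from this text.

```python
import numpy as np


def inflection_points(grid, density):
    """Locations where the first derivative has a local maximum or minimum."""
    d = np.diff(density) / np.diff(grid)  # slope between adjacent grid points
    mid = 0.5 * (grid[:-1] + grid[1:])    # location each slope refers to
    step = np.diff(d)                     # change in slope from one segment to the next
    # A local extremum of d is where consecutive changes in slope differ in sign.
    turning = np.flatnonzero(step[:-1] * step[1:] < 0) + 1
    return mid[turning]


# Illustrative single-peak density: a standard normal curve on 512 bins.
grid = np.linspace(-4.0, 4.0, 513)
density = np.exp(-0.5 * grid**2) / np.sqrt(2.0 * np.pi)
print(inflection_points(grid, density))
```

For a single symmetric peak this yields one point on each tail, which is where a gate on a respective tail of a peak would be anchored according to the Examples.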
PCT/SG2018/050073 2017-02-15 2018-02-15 Procédés et dispositifs d'identification de populations cellulaires dans des données WO2018151680A1 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
SG10201701209X 2017-02-15
SG10201701209X 2017-02-15

Publications (1)

Publication Number Publication Date
WO2018151680A1 (fr) 2018-08-23

Family

ID=63170697

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/SG2018/050073 WO2018151680A1 (fr) 2017-02-15 2018-02-15 Procédés et dispositifs d'identification de populations cellulaires dans des données

Country Status (1)

Country Link
WO (1) WO2018151680A1 (fr)

Cited By (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2020128561A1 (fr) * 2018-12-21 2020-06-25 Premium Genetics (Uk) Ltd. Système et procédés pour identification de sous-populations
CN113158817A (zh) * 2021-03-29 2021-07-23 南京信息工程大学 一种基于快速密度峰聚类的客观天气分型方法
US11187224B2 (en) 2013-07-16 2021-11-30 Abs Global, Inc. Microfluidic chip
US11193879B2 (en) 2010-11-16 2021-12-07 1087 Systems, Inc. Use of vibrational spectroscopy for microfluidic liquid measurement
US11243494B2 (en) 2002-07-31 2022-02-08 Abs Global, Inc. Multiple laminar flow-based particle and cellular separation with laser steering
US11331670B2 (en) 2018-05-23 2022-05-17 Abs Global, Inc. Systems and methods for particle focusing in microchannels
WO2022139597A1 (fr) * 2020-12-23 2022-06-30 Engender Technologies Limited Systèmes et procédés de tri et de classification de particules
US11415503B2 (en) 2013-10-30 2022-08-16 Abs Global, Inc. Microfluidic system and method with focused energy apparatus
US11628439B2 (en) 2020-01-13 2023-04-18 Abs Global, Inc. Single-sheath microfluidic chip
US11889830B2 (en) 2019-04-18 2024-02-06 Abs Global, Inc. System and process for continuous addition of cryoprotectant

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20020029235A1 (en) * 2000-05-11 2002-03-07 Becton Dickinson And Company System for identifying clusters in scatter plots using smoothed polygons with optimal boundaries
US20050059046A1 (en) * 2003-06-18 2005-03-17 Applera Corporation Methods and systems for the analysis of biological sequence data
US20080221812A1 (en) * 2006-09-29 2008-09-11 Richard Pittaro Differentiation of flow cytometry pulses and applications
CN104200114A (zh) * 2014-09-10 2014-12-10 中国人民解放军军事医学科学院卫生装备研究所 流式细胞仪数据快速分析方法

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
GE Y. ET AL.: "flowPeaks: a fast unsupervised clustering for flow cytometry data via K-means and density peak finding", BIOINFORMATICS, vol. 28, no. 15, 17 May 2012 (2012-05-17), pages 2052 - 2058, XP055073112, [retrieved on 20180412] *
MALEK M. ET AL.: "flowDensity: reproducing manual gating of flow cytometry data by automated density-based cell population identification", BIOINFORMATICS, vol. 31, no. 4, 16 October 2014 (2014-10-16), pages 606 - 607, XP055537251, [retrieved on 20180412] *

Cited By (18)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11415936B2 (en) 2002-07-31 2022-08-16 Abs Global, Inc. Multiple laminar flow-based particle and cellular separation with laser steering
US11243494B2 (en) 2002-07-31 2022-02-08 Abs Global, Inc. Multiple laminar flow-based particle and cellular separation with laser steering
US11422504B2 (en) 2002-07-31 2022-08-23 Abs Global, Inc. Multiple laminar flow-based particle and cellular separation with laser steering
US11965816B2 (en) 2010-11-16 2024-04-23 1087 Systems, Inc. Use of vibrational spectroscopy for microfluidic liquid measurement
US11193879B2 (en) 2010-11-16 2021-12-07 1087 Systems, Inc. Use of vibrational spectroscopy for microfluidic liquid measurement
US11512691B2 (en) 2013-07-16 2022-11-29 Abs Global, Inc. Microfluidic chip
US11187224B2 (en) 2013-07-16 2021-11-30 Abs Global, Inc. Microfluidic chip
US11415503B2 (en) 2013-10-30 2022-08-16 Abs Global, Inc. Microfluidic system and method with focused energy apparatus
US11639888B2 (en) 2013-10-30 2023-05-02 Abs Global, Inc. Microfluidic system and method with focused energy apparatus
US11796449B2 (en) 2013-10-30 2023-10-24 Abs Global, Inc. Microfluidic system and method with focused energy apparatus
US11331670B2 (en) 2018-05-23 2022-05-17 Abs Global, Inc. Systems and methods for particle focusing in microchannels
WO2020128561A1 (fr) * 2018-12-21 2020-06-25 Premium Genetics (Uk) Ltd. Système et procédés pour identification de sous-populations
CN113260848A (zh) * 2018-12-21 2021-08-13 Abs全球公司 亚群识别的系统和方法
US11889830B2 (en) 2019-04-18 2024-02-06 Abs Global, Inc. System and process for continuous addition of cryoprotectant
US11628439B2 (en) 2020-01-13 2023-04-18 Abs Global, Inc. Single-sheath microfluidic chip
WO2022139597A1 (fr) * 2020-12-23 2022-06-30 Engender Technologies Limited Systèmes et procédés de tri et de classification de particules
CN113158817B (zh) * 2021-03-29 2023-07-18 南京信息工程大学 一种基于快速密度峰聚类的客观天气分型方法
CN113158817A (zh) * 2021-03-29 2021-07-23 南京信息工程大学 一种基于快速密度峰聚类的客观天气分型方法

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 18754848

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 18754848

Country of ref document: EP

Kind code of ref document: A1