CN117288661A - Method, medium and system for outputting cell mass removal signal by flow cytometer - Google Patents

Info

Publication number
CN117288661A
Authority
CN
China
Prior art keywords
pulse
matrix
fsc
signal
ssc
Legal status
Pending
Application number
CN202311243009.0A
Other languages
Chinese (zh)
Inventor
赵国强
纪存朋
孙树梁
王硕硕
孙谧
Current Assignee
Qingdao Ruisikeer Biotechnology Co ltd
Original Assignee
Qingdao Ruisikeer Biotechnology Co ltd
Application filed by Qingdao Ruisikeer Biotechnology Co ltd
Priority to CN202311243009.0A
Publication of CN117288661A

Classifications

    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01N INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N15/00 Investigating characteristics of particles; Investigating permeability, pore-volume, or surface-area of porous materials
    • G01N15/10 Investigating individual particles
    • G01N15/14 Electro-optical investigation, e.g. flow cytometers
    • G01N15/1429 Electro-optical investigation, e.g. flow cytometers using an analyser being characterised by its signal processing
    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01N INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N15/00 Investigating characteristics of particles; Investigating permeability, pore-volume, or surface-area of porous materials
    • G01N15/10 Investigating individual particles
    • G01N15/14 Electro-optical investigation, e.g. flow cytometers
    • G01N15/1434 Electro-optical investigation, e.g. flow cytometers using an analyser being characterised by its optical arrangement
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/213 Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods
    • G06F18/2135 Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods based on approximation criteria, e.g. principal component analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2411 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on the proximity to a decision surface, e.g. support vector machines
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/243 Classification techniques relating to the number of classes
    • G06F18/24323 Tree-organised classifiers
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/25 Fusion techniques
    • G06F18/253 Fusion techniques of extracted features
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning
    • G06N20/20 Ensemble learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2218/00 Aspects of pattern recognition specially adapted for signal processing
    • G06F2218/02 Preprocessing
    • G06F2218/04 Denoising
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2218/00 Aspects of pattern recognition specially adapted for signal processing
    • G06F2218/08 Feature extraction
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2218/00 Aspects of pattern recognition specially adapted for signal processing
    • G06F2218/12 Classification; Matching

Abstract

The invention provides a method, medium and system for outputting a cell mass removal signal by a flow cytometer, belonging to the technical field of flow cytometry. The method comprises the following steps: traversing each electric pulse signal in the FSC pulse set and collecting the electric pulse signals whose channel values are larger than a first threshold value to form an FSC large pulse set; screening the FSC matrix, the SSC pulse matrix and the fluorescence pulse matrix according to the FSC large pulse set; establishing an aggregation matrix from the resulting screened FSC matrix, screened SSC pulse matrix and screened fluorescence pulse matrix; analyzing the aggregation matrix with a cell mass analysis model to obtain the electric pulse signals that correspond to cell masses in the aggregation matrix; and deleting the electric pulse signals corresponding to cell masses from the electric pulse signal set. The invention addresses the technical problem that existing cell mass identification and removal methods rely on manually set thresholds and parameters and require the identification rules to be adjusted manually for different samples, so that they cannot be automated.

Description

Method, medium and system for outputting cell mass removal signal by flow cytometer
Technical Field
The invention belongs to the technical field of flow cytometry, and particularly relates to a method, medium and system for outputting a cell mass removal signal by a flow cytometer.
Background
Flow cytometry is a rapid, high-throughput single-cell analysis technique that is widely used in basic research and clinical testing. During measurement, the fluid stream arranges the cells in single file so that they pass one by one through the detection region illuminated by the laser, and the resulting optical signals are recorded. From the forward scattered light (FSC), side scattered light (SSC) and fluorescence signals, the size, internal structure and surface-marker characteristics of each cell can be analyzed.
Cell masses severely disturb quantitative flow cytometry analysis, which assumes single cells. A cell mass is formed by the aggregation of several cells, and its larger volume produces a stronger optical signal than a single cell. This can seriously affect subsequent quantitative analysis and cell classification. Therefore, identifying and removing cell masses is an important preprocessing step in flow cytometry data analysis.
The current methods for identifying cell clusters mainly comprise the following two types:
(1) A method based on pulse shape. The identification is performed by setting a threshold using pulse waveform information, such as pulse area, width, peak ratio, etc., generated when each cell or cell mass passes through the detection region. Such methods rely on parameter settings and empirical determinations.
(2) A snapshot image based method. A tiny image of each event is first captured, and then the cell mass is identified by image processing techniques. However, the method requires storing a large number of images, is complex in operation, high in cost and not wide in application range.
Generally, the existing cell mass identification and removal methods rely on manual setting of thresholds and parameters, and manual adjustment of identification rules is required for different samples, so that automation cannot be realized.
Disclosure of Invention
In view of the above, the invention provides a method, medium and system for outputting a cell mass removal signal by a flow cytometer, which can solve the technical problems that the existing cell mass identification and removal method depends on manually setting a threshold and parameters, and the identification rule needs to be manually adjusted for different samples, so that the automation cannot be realized.
The invention is realized in the following way:
In a first aspect, the present invention provides a method for outputting a cell mass removal signal by a flow cytometer, comprising the following steps:
S10, collecting an electric pulse signal set collected by a photoelectric converter of a flow cytometer, wherein the electric pulse signal set comprises an electric pulse signal set generated by forward scattered light, an electric pulse signal set generated by side scattered light and an electric pulse signal set generated by fluorescent signals, which are recorded as an FSC pulse set, an SSC pulse set and a fluorescence pulse set respectively; wherein each electric pulse signal comprises a time, a length, a width and an area;
S20, establishing an FSC matrix, an SSC pulse matrix and a fluorescence pulse matrix from the obtained FSC pulse set, SSC pulse set and fluorescence pulse set respectively;
S30, traversing each electric pulse signal in the FSC pulse set, and collecting the electric pulse signals whose channel values are larger than a first threshold value to form an FSC large pulse set;
S40, screening the FSC matrix, the SSC pulse matrix and the fluorescence pulse matrix according to the FSC large pulse set to obtain a screened FSC matrix, a screened SSC pulse matrix and a screened fluorescence pulse matrix;
S50, establishing an aggregation matrix from the screened FSC matrix, the screened SSC pulse matrix and the screened fluorescence pulse matrix;
S60, analyzing the aggregation matrix with a pre-trained cell mass analysis model to obtain the electric pulse signals that correspond to cell masses in the aggregation matrix;
S70, deleting the electric pulse signals corresponding to cell masses from the electric pulse signal set, and outputting the resulting electric pulse signal set to a data processing system of the flow cytometer.
The technical effect of the method for outputting the cell mass removal signal by the flow cytometer is as follows. The FSC, SSC and fluorescence signals of each cell or cell mass are related to one another. For example, a cell mass is relatively large and has a more complex internal structure, so its FSC and SSC values are related; a cell mass also carries relatively more fluorescent dye, so its fluorescence channel values differ from those of ordinary cells. Because the data volume is very large, a threshold selected by simple manual judgment may be inaccurate. The method therefore uses multi-channel fusion, performing multi-feature extraction and analysis on the aggregation matrix, so that cell mass signals can be determined more reliably.
Based on the technical scheme, the method for outputting the cell mass removal signal by the flow cytometer can be improved as follows:
the step of establishing the FSC matrix, the SSC pulse matrix and the fluorescence pulse matrix according to the obtained FSC pulse set, the SSC pulse set and the fluorescence pulse set respectively further comprises the step of eliminating noise points of the electric pulse signal.
The step of traversing each electric pulse signal in the FSC pulse set to obtain an electric pulse signal with a channel value larger than a first threshold value and forming an FSC large pulse set specifically comprises the following steps:
traversing the FSC pulse set, and extracting pulse signals with FSC channel values larger than a first threshold value as a first pulse set;
calculating the inter-pulse distance of the first pulse set, and merging adjacent pulses;
smoothing the first pulse set after combining the adjacent pulses;
reducing the dimension of the smoothed first pulse set by the PCA method;
taking the dimension-reduced first pulse set as the FSC large pulse set.
By default, the first threshold is set to 3 times the FSC baseline noise.
The step of screening the FSC matrix, the SSC pulse matrix and the fluorescent pulse matrix according to the FSC large pulse set to obtain a screened FSC matrix, a screened SSC pulse matrix and a screened fluorescent pulse matrix specifically comprises the following steps:
traversing the FSC matrix, and reserving a row corresponding to the FSC large pulse;
screening corresponding rows from SSC and fluorescent matrix according to FSC large pulse;
smoothing the filtered matrix;
outputting the filtered and smoothed FSC, SSC and fluorescent matrix.
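The screening steps above can be sketched as follows. This is a minimal illustration, not the patented implementation: it assumes each matrix row is `[t, w, a, h]` and that rows of the SSC and fluorescence matrices are matched to FSC large pulses by pulse time within a tolerance. The function name `screen_by_large_pulses`, the tolerance `tol` and the sample values are hypothetical, and the smoothing step is omitted.

```python
from typing import List, Sequence

Row = List[float]  # one pulse per row: [t, w, a, h]


def screen_by_large_pulses(matrix: Sequence[Row],
                           large_times: Sequence[float],
                           tol: float) -> List[Row]:
    """Keep rows whose time (column 0) lies within tol of any FSC large-pulse time."""
    return [list(row) for row in matrix
            if any(abs(row[0] - t) <= tol for t in large_times)]


# Illustrative data only (assumption: time-aligned channels).
FSC_matrix = [[0.0, 1.0, 5.0, 2.0], [1.0, 2.0, 95.0, 40.0], [3.0, 1.0, 60.0, 45.0]]
SSC_matrix = [[0.0, 1.0, 4.0, 1.5], [1.0, 1.8, 70.0, 30.0], [3.0, 1.1, 50.0, 33.0]]
FL_matrix = [[1.0, 1.5, 20.0, 9.0], [2.0, 1.0, 3.0, 1.0], [3.0, 1.2, 25.0, 11.0]]

large_times = [1.0, 3.0]  # times of the pulses in the FSC large pulse set
tol = 0.01                # hypothetical matching tolerance, same unit as t

screened_FSC = screen_by_large_pulses(FSC_matrix, large_times, tol)
screened_SSC = screen_by_large_pulses(SSC_matrix, large_times, tol)
screened_FL = screen_by_large_pulses(FL_matrix, large_times, tol)
```

Matching by time rather than by row index is a design choice made here because the pulse sets of the different channels are not stated to have equal sizes.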
The step of establishing an aggregation matrix by using the obtained screening FSC matrix, the screening SSC pulse matrix and the screening fluorescent pulse matrix specifically comprises the following steps:
transversely splicing screening smooth matrixes of different channels to form an aggregation matrix;
normalizing and standardizing the aggregation matrix;
and performing dimension reduction on the normalized and standardized aggregation matrix by adopting PCA to obtain a dimension reduction aggregation matrix fused with different channel information.
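A minimal sketch of the splicing and scaling steps follows. It assumes the screened matrices already have the same number of rows (one row per retained event) and interprets "normalizing and standardizing" as a per-column z-score; both are assumptions, as are the function names. The PCA step is not included here.

```python
import statistics
from typing import List

Matrix = List[List[float]]


def splice_horizontally(*matrices: Matrix) -> Matrix:
    """Concatenate the rows of several matrices side by side."""
    n_rows = len(matrices[0])
    if any(len(m) != n_rows for m in matrices):
        raise ValueError("all screened matrices must have the same number of rows")
    return [[value for m in matrices for value in m[i]] for i in range(n_rows)]


def standardize_columns(matrix: Matrix) -> Matrix:
    """Scale every column to zero mean and unit (population) standard deviation."""
    columns = list(zip(*matrix))
    means = [statistics.mean(c) for c in columns]
    stds = [statistics.pstdev(c) for c in columns]
    # Constant columns carry no information and are mapped to 0.0.
    return [[(v - mu) / sd if sd > 0 else 0.0
             for v, mu, sd in zip(row, means, stds)]
            for row in matrix]


# Illustrative values only.
screened_fsc = [[1.0, 2.0], [3.0, 4.0]]
screened_ssc = [[10.0], [30.0]]

aggregation_matrix = splice_horizontally(screened_fsc, screened_ssc)
scaled_matrix = standardize_columns(aggregation_matrix)
```

Scaling before dimension reduction keeps a channel with large raw values from dominating the fused features.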
The cell mass analysis model building and training method specifically comprises the following steps of:
collecting a plurality of historical aggregation matrix samples, and marking cell mass data to serve as training samples;
constructing a machine learning classification model;
and training the machine learning classification model by using a training sample to obtain a cell mass analysis model.
Further, the model used for constructing the machine learning classification model is an SVM or random forest.
Wherein, the step of deleting the electric pulse signals corresponding to the cell mass in the electric pulse signal set specifically comprises the following steps:
traversing an original electric pulse signal set;
judging whether each pulse is in the identified cell mass collection;
if so, deleting the pulse from the original set;
retaining the pulse of the non-cell mass to form a filtered output set;
returning the filtered set of electrical pulse signals.
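The deletion steps above amount to a membership filter. A minimal sketch, assuming each pulse carries a hypothetical `event_id` field and that the identified cell mass collection is available as a set of such ids:

```python
from typing import Dict, Iterable, List, Set


def remove_cell_mass_pulses(pulse_set: Iterable[Dict],
                            cell_mass_ids: Set[int]) -> List[Dict]:
    """Return the pulses whose event id is not in the identified cell mass set."""
    return [pulse for pulse in pulse_set if pulse["event_id"] not in cell_mass_ids]


# Illustrative data only.
original_set = [
    {"event_id": 0, "h": 12.0},
    {"event_id": 1, "h": 95.0},
    {"event_id": 2, "h": 14.0},
]
identified_cell_masses = {1}

filtered_set = remove_cell_mass_pulses(original_set, identified_cell_masses)
```

Building a new list instead of deleting in place leaves the original set unchanged, which avoids modifying a collection while traversing it.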
A second aspect of the present invention provides a computer readable storage medium having stored therein program instructions which, when executed, carry out the method of outputting a cell mass removal signal by a flow cytometer as described above.
A third aspect of the present invention provides a system for outputting a cell mass removal signal by a flow cytometer, wherein the system comprises the computer readable storage medium.
Compared with the prior art, the method, medium and system for outputting the cell mass removal signal by the flow cytometer have the beneficial effects that:
1. Realizing the automatic identification and removal of cell clusters
According to the invention, by constructing the pre-trained machine learning model, the cell clusters in the aggregation matrix can be automatically identified without manually setting identification rules and threshold values. This avoids the complexity of parameter adjustment and achieves intelligent automatic identification of cell clusters.
In contrast, existing methods based on threshold determination require a professional to adjust parameters for different samples, are very empirical, and cannot implement automatic batch processing. The automatic identification method of the invention greatly simplifies the operation flow.
2. Improving the accuracy of cell mass identification
The invention comprehensively utilizes multi-channel information such as FSC, SSC and the like to construct the aggregation matrix with strong expression capacity, and adopts a pre-training machine learning model, thereby ensuring the accuracy of cell mass identification.
Existing methods often produce false identifications or missed identifications because of improper threshold settings. The multi-channel aggregation matrix provides richer features, and combining it with a machine learning model can effectively improve recognition accuracy.
3. The operation complexity is reduced
Compared with cell mass identification based on image processing, the method directly models the pulse matrix, does not need to store and process a large number of images, and has higher calculation efficiency.
Meanwhile, compared with the method of traversing and judging the threshold value for many times, the machine learning model can be calculated in parallel, and the prediction speed is faster. The method is efficient in operation.
4. Improving robustness and the ability to adapt to different samples
The machine learning model can obtain better generalization capability through pre-training and adapt to different parameter changes.
The model is obtained based on the pre-training of various samples, can identify cell clusters of different sizes and types, and has strong robustness without adjusting parameters for each sample.
5. Lay a foundation for standardized automatic processing flow
The invention realizes standardization and automation of the step of cell mass identification. The flow cytometry analysis can establish a standard end-to-end flow, and automation from acquisition to analysis results is realized.
The user can obtain an accurate result after removing the cell clusters only by simple operation, and the detection efficiency is greatly improved.
6. The detection cost is reduced
The automation degree is improved, so that the workload of professional operators can be reduced. Meanwhile, the calculation efficiency is higher, and the instrument use time can be reduced.
The application of the invention can reduce the labor and equipment cost of flow cytometry detection and improve the detection efficiency.
In summary, the invention can solve the technical problems that the existing cell mass identification and removal method relies on manual setting of threshold and parameters, manual adjustment of identification rules is required for different samples, automation cannot be realized, and the storage and calculation amount of the image-based method is large.
Drawings
FIG. 1 is a flow chart of a method for outputting a cell mass removal signal by a flow cytometer according to the present invention.
Detailed Description
In order to make the purposes, technical solutions and advantages of the embodiments of the present invention more clear, the technical solutions in the embodiments of the present invention are clearly and completely described.
Referring to FIG. 1, a flow chart of a method for outputting a cell mass removal signal by a flow cytometer according to a first aspect of the present invention is provided, the method comprising the steps of:
S10, collecting an electric pulse signal set collected by a photoelectric converter of a flow cytometer, wherein the electric pulse signal set comprises an electric pulse signal set generated by forward scattered light, an electric pulse signal set generated by side scattered light and an electric pulse signal set generated by fluorescent signals, which are recorded as an FSC pulse set, an SSC pulse set and a fluorescence pulse set respectively; wherein each electric pulse signal comprises a time, a length, a width and an area;
S20, establishing an FSC matrix, an SSC pulse matrix and a fluorescence pulse matrix from the obtained FSC pulse set, SSC pulse set and fluorescence pulse set respectively;
S30, traversing each electric pulse signal in the FSC pulse set, and collecting the electric pulse signals whose channel values are larger than a first threshold value to form an FSC large pulse set;
S40, screening the FSC matrix, the SSC pulse matrix and the fluorescence pulse matrix according to the FSC large pulse set to obtain a screened FSC matrix, a screened SSC pulse matrix and a screened fluorescence pulse matrix;
S50, establishing an aggregation matrix from the screened FSC matrix, the screened SSC pulse matrix and the screened fluorescence pulse matrix;
S60, analyzing the aggregation matrix with a pre-trained cell mass analysis model to obtain the electric pulse signals that correspond to cell masses in the aggregation matrix;
S70, deleting the electric pulse signals corresponding to cell masses from the electric pulse signal set, and outputting the resulting electric pulse signal set to a data processing system of the flow cytometer.
Specifically, the implementation steps of S10 are described as follows:
in this step we collect the raw electrical signal collected by the photoelectric converter of the flow cytometer. The photoelectric converter generally adopts a device such as a photodiode or a phototransistor. As the cell sample flows through the laser beam, forward scattered light, side scattered light, and fluorescent signals are generated. The optical signals are converted into current signals by the photodiodes and then into voltage signals by the operational amplifiers, namely original electric pulse signals.
The raw electrical pulse signal contains all the optical information about each cellular or non-cellular event. Different types of electrical pulse signals can be obtained according to the scattering and fluorescence characteristics of cellular or non-cellular events in different channels. In the present invention, we divide these electrical pulse signals into three categories:
(1) FSC pulse set: the set of electrical pulse signals generated by forward scattered light (FSC). FSC mainly reflects cell size.
(2) SSC pulse set: the set of electrical pulse signals generated by side scattered light (SSC). SSC mainly reflects the internal structure and complexity of the cell.
(3) Fluorescence pulse set: the set of electrical pulse signals generated by the fluorescence detected in the different fluorescence channels. Different fluorescence channels correspond to different fluorescent labels.
Each electrical pulse signal contains the following information:
(1) Time: the time at which the pulse signal occurs.
(2) Length: the length of the pulse signal, i.e. its span along the X axis. It reflects the amount of photon flux.
(3) Width: the width of the pulse signal, i.e. its height along the Y axis. It is related to the pulse signal amplitude.
(4) Area: the area under the pulse signal curve. It is correlated with the number of photons detected.
In summary, the key of this step is to collect the original electrical pulse signals, and classify them into FSC pulse sets, SSC pulse sets and fluorescence pulse sets according to different optical channels, and these signals include the optical characteristic information of each cell/non-cell event, which lays a foundation for the subsequent identification of cell clusters.
The implementation of this step is described in detail below:
1. Configure the optical path system of the flow cytometer, including the laser, condenser lens, flow cell, etc., so that forward scattered light, side scattered light and fluorescence signals are generated when the laser irradiates the sample.
2. Select photodiodes or phototransistors as photoelectric converters, in a number chosen according to actual needs, so that the forward scattered light, the side scattered light and each fluorescence channel can all be detected.
3. Connect the output of each photodiode or phototransistor to an operational amplifier, and configure the amplification factor to amplify the current signal.
4. Connect the digital acquisition card and configure its sampling frequency to acquire the voltage signal output by the operational amplifier.
5. Run the flow cytometer with a cell sample carrying a predetermined fluorescent label. An optical signal is generated when the cell sample passes through the laser beam.
6. The photodiode converts the optical signal into a current signal, and the operational amplifier amplifies it and converts it into a voltage pulse signal.
7. The digital acquisition card samples the voltage pulse signal at the configured frequency to obtain the original electric pulse signal.
8. Determine from the photodiode location whether the electric pulse signal comes from the FSC channel, the SSC channel or a specific fluorescence channel.
9. According to this determination, sort the acquired original electric pulse signals into an FSC pulse set, an SSC pulse set and a fluorescence pulse set.
10. Analyze the electric pulse signals and extract characteristic information such as time, length, width and area, providing a basis for the subsequent identification of cell masses.
11. Store the processed electric pulse signals in the memory or on the hard disk of a computer for later use.
12. Repeat steps 5-11 until a sufficient number of cell samples has been collected, obtaining an original electric pulse signal set that contains cell mass information.
Through the above steps, the original electric pulse signals acquired by the photoelectric converter of the flow cytometer are obtained, classified by optical channel, and reduced to characteristic information, which lays a foundation for subsequent cell mass identification and completes step S10. Points to note include:
(1) The optical path system must be configured properly so that scattered light and fluorescence can be detected effectively.
(2) The photoelectric converter must be selected correctly and have a high response speed.
(3) The amplification factor must be appropriate to prevent waveform distortion.
(4) The sampling frequency must be set properly and satisfy the Nyquist sampling theorem.
(5) The extracted features must represent the signal characteristics effectively.
(6) The storage must guarantee sufficient access speed.
By paying attention to the points, the original electric pulse signal with good quality and rich information can be obtained, reliable basic data is provided for the subsequent cell mass identification, and the acquisition and processing of the step S10 are completed.
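As an illustration of the feature extraction in step S10, the following sketch derives the four quantities from one sampled voltage record. It is not the patented implementation: it assumes one pulse per record, a known baseline level, and a rectangle-rule area, and the function name and parameters are hypothetical. The dictionary keys follow the naming used above, where "length" is the X-axis span and "width" is the Y-axis height.

```python
from typing import Dict, List, Optional


def extract_pulse_features(samples: List[float],
                           sample_period: float,
                           baseline: float) -> Optional[Dict[str, float]]:
    """Extract time, length, width and area of the single pulse in a record."""
    above = [i for i, v in enumerate(samples) if v > baseline]
    if not above:
        return None  # no sample exceeds the baseline, so there is no pulse
    start, end = above[0], above[-1]
    segment = [v - baseline for v in samples[start:end + 1]]
    return {
        "time": start * sample_period,                  # onset of the pulse
        "length": (end - start + 1) * sample_period,    # span along the X axis
        "width": max(segment),                          # height along the Y axis
        "area": sum(segment) * sample_period,           # rectangle-rule integral
    }


# Illustrative record sampled every 0.5 time units.
features = extract_pulse_features([0.0, 0.0, 1.0, 3.0, 1.0, 0.0],
                                  sample_period=0.5, baseline=0.0)
```

A record with several pulses would first need to be split into individual pulses, which this sketch does not attempt.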
In the above technical solution, the step of establishing the FSC matrix, the SSC pulse matrix and the fluorescent pulse matrix according to the FSC pulse set, the SSC pulse set and the fluorescent pulse set obtained by the electrical pulse signal, respectively, further includes a step of eliminating noise points of the electrical pulse signal.
Specifically, the implementation steps of S20 are described as follows:
first, some variables and constants are defined:
FSC_pulse_set represents the FSC pulse set, comprising m pulse signals, denoted $\{psc_1, psc_2, \ldots, psc_m\}$;
SSC_pulse_set represents the SSC pulse set, comprising n pulse signals, denoted $\{pssc_1, pssc_2, \ldots, pssc_n\}$;
FL_pulse_set represents the fluorescence pulse set, comprising l pulse signals, denoted $\{pfl_1, pfl_2, \ldots, pfl_l\}$;
FSC_matrix represents the FSC matrix to be established, with dimension $m \times 4$;
SSC_matrix represents the SSC matrix to be established, with dimension $n \times 4$;
FL_matrix represents the fluorescence matrix to be established, with dimension $l \times 4$;
each pulse signal p consists of 4 features, $p = \{t, w, a, h\}$, where t is the pulse time, w the pulse width, a the pulse area and h the pulse height (in the terms of step S10, w is the X-axis span listed there as "length" and h is the Y-axis height listed there as "width");
then, each pulse signal $psc_i$ in FSC_pulse_set is traversed and written to the i-th row of FSC_matrix;
similarly, SSC_pulse_set and FL_pulse_set are traversed and their pulse signals are written into the corresponding matrices;
to this end, an FSC matrix, an SSC matrix and a fluorescence matrix are established from the pulse signal set.
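The matrix construction described above can be sketched as follows. The `Pulse` dataclass and the function name `build_pulse_matrix` are illustrative assumptions; only the row layout `[t, w, a, h]` and the identifiers FSC_pulse_set and FSC_matrix come from the text.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Pulse:
    t: float  # pulse time
    w: float  # pulse width
    a: float  # pulse area
    h: float  # pulse height


def build_pulse_matrix(pulse_set: List[Pulse]) -> List[List[float]]:
    """Write pulse i of the set into row i of an (m x 4) matrix."""
    return [[p.t, p.w, p.a, p.h] for p in pulse_set]


# Illustrative values only.
FSC_pulse_set = [Pulse(0.10, 2.0, 30.0, 20.0), Pulse(0.25, 2.5, 55.0, 35.0)]
FSC_matrix = build_pulse_matrix(FSC_pulse_set)
```

The same function applies unchanged to SSC_pulse_set and FL_pulse_set.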
Preferably, to further eliminate the influence of noise points, the matrix may be smoothed:
1) FSC matrix smoothing
Calculate, for each pulse $psc_i$, its distance $d(psc_i)$ on the time axis to the next pulse $psc_{i+1}$, i.e. the difference of their pulse times: $d(psc_i) = psc_{i+1}.t - psc_i.t$.
If $d(psc_i) < threshold$, then $psc_i$ and $psc_{i+1}$ are merged into one pulse.
Repeat the above steps until $d(psc_i) \geq threshold$ for every remaining pulse.
2) The SSC matrix and the fluorescence matrix are smoothed in the same way as the FSC matrix.
3) And carrying out mean filtering on the smoothed matrix to further eliminate random noise.
FSC_matrix=smooth(FSC_matrix);
SSC_matrix=smooth(SSC_matrix);
FL_matrix=smooth(FL_matrix);
Through the steps, the FSC matrix, the SSC matrix and the fluorescence matrix after the smoothing treatment are obtained, the matrixes reflect the pulse distribution condition of each channel, and a foundation is laid for subsequent recognition analysis.
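A minimal sketch of the merge-then-filter smoothing follows. The text does not say how the features of two merged pulses are combined, so the rule used here (keep the earlier time, add widths and areas, keep the larger height) is an assumption, as are the window size and the choice to leave the time column unfiltered. Unlike the `smooth(FSC_matrix)` call shown above, this version takes the merge threshold as an explicit argument.

```python
from typing import List

Row = List[float]  # one pulse per row: [t, w, a, h]


def merge_adjacent(matrix: List[Row], threshold: float) -> List[Row]:
    """Merge consecutive pulses whose time distance is below threshold."""
    merged: List[Row] = []
    for row in sorted(matrix, key=lambda r: r[0]):
        if merged and row[0] - merged[-1][0] < threshold:
            prev = merged[-1]
            # Assumed merge rule: earlier time, summed width and area, max height.
            merged[-1] = [prev[0], prev[1] + row[1], prev[2] + row[2],
                          max(prev[3], row[3])]
        else:
            merged.append(list(row))
    return merged


def mean_filter(matrix: List[Row], window: int = 3) -> List[Row]:
    """Moving average down the w, a and h columns; the time column is kept."""
    half = window // 2
    out: List[Row] = []
    for i, row in enumerate(matrix):
        neighbours = matrix[max(0, i - half):min(len(matrix), i + half + 1)]
        averaged = [sum(r[c] for r in neighbours) / len(neighbours)
                    for c in range(1, 4)]
        out.append([row[0]] + averaged)
    return out


def smooth(matrix: List[Row], threshold: float, window: int = 3) -> List[Row]:
    """Merge close pulses, then mean-filter the result."""
    return mean_filter(merge_adjacent(matrix, threshold), window)


# Illustrative values only.
raw = [[0.0, 1.0, 10.0, 5.0], [0.05, 1.0, 12.0, 7.0], [1.0, 2.0, 20.0, 9.0]]
merged_rows = merge_adjacent(raw, threshold=0.1)
smoothed_rows = smooth(raw, threshold=0.1)
```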
The matrix processing algorithm uses smoothing and filtering, which can effectively eliminate random errors caused by instrument noise and improve the accuracy of subsequent identification. Its time complexity is O(m+n+l) and its space complexity is O(m+n+l). When there are many cell samples the matrices become large, and dimension reduction with PCA or a similar method can be considered to reduce the amount of computation. Principal component analysis (PCA) is a common data analysis method that uses an orthogonal transformation to convert data described by linearly correlated variables into data described by a few linearly uncorrelated variables, called principal components. The number of principal components is typically smaller than the number of original variables, so PCA is often used to reduce the dimension of high-dimensional data and extract its main feature components.
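To make the PCA idea concrete, the sketch below finds only the first principal component by power iteration on the sample covariance matrix and projects the data onto it. This is an illustrative simplification using the standard library, not the method prescribed by the text; a production system would more likely use a numerical library and keep several components. It assumes at least two rows of data, and the function names are hypothetical.

```python
import math
from typing import List

Matrix = List[List[float]]


def center(data: Matrix) -> Matrix:
    """Subtract the column means from every row."""
    n = len(data)
    means = [sum(col) / n for col in zip(*data)]
    return [[v - mu for v, mu in zip(row, means)] for row in data]


def first_principal_component(data: Matrix, iterations: int = 200) -> List[float]:
    """Unit vector along the direction of largest variance (power iteration)."""
    centered = center(data)
    n, d = len(centered), len(centered[0])
    cov = [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)]
           for i in range(d)]
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break  # zero variance in the data: keep the current vector
        v = [x / norm for x in w]
    return v


def project(data: Matrix, component: List[float]) -> List[float]:
    """One-dimensional scores of the centered data along the component."""
    return [sum(x * c for x, c in zip(row, component)) for row in center(data)]


# Two perfectly correlated features collapse onto a single component.
data = [[0.0, 0.0], [1.0, 2.0], [2.0, 4.0], [3.0, 6.0]]
pc1 = first_principal_component(data)
scores = project(data, pc1)
```

In this example the two correlated columns are replaced by one score per row, which is the dimension reduction the paragraph describes.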
Furthermore, the threshold for pulse combining needs to be empirically determined based on specific instrument parameters, typically taken as 10-20% of the pulse width. When the matrix is smoothed, mean filtering, median filtering or Gaussian filtering can be adopted, and filter parameters are required to be selected according to actual conditions.
In conclusion, the step establishes a pulse matrix by traversing the pulse signals, and performs smooth filtering treatment, so that a foundation is laid for subsequent identification and analysis, the algorithm thought is clear and direct, the calculated amount is moderate, and better pulse expression can be obtained. The method can be widely applied to pulse signal processing in the fields of flow cytometry and the like.
In the above technical solution, the step of traversing each electric pulse signal in the FSC pulse set and obtaining the electric pulse signals with a channel value greater than a first threshold value to form an FSC large pulse set specifically includes:
traversing the FSC pulse set, and extracting pulse signals with FSC channel values larger than a first threshold value as a first pulse set;
calculating the inter-pulse distance of the first pulse set, and merging adjacent pulses;
smoothing the first pulse set after merging the adjacent pulses;
performing dimension reduction on the smoothed first pulse set by adopting the PCA method;
using the first pulse set after dimension reduction as the FSC large pulse set.
Specifically, the implementation steps of S30 are described as follows:
1. Defining variables and constants
- threshold_1: the first threshold of the FSC channel
- FSC_large_set: the FSC large pulse set to be formed
2. Traversing each pulse signal psc_i in FSC_pulse_set; if psc_i.h > threshold_1, psc_i is added to FSC_large_set;
3. Calculating the time distance between the pulse signals in FSC_large_set; if the distance between two pulse signals is smaller than threshold_2, the two pulse signals are merged;
4. Smoothing filtering
FSC_large_set is smoothed by applying mean filtering and median filtering.
5. Dimension reduction
A method such as PCA can optionally be used to reduce the dimension of FSC_large_set and thereby reduce redundant features.
Through the above steps, large pulse signals are extracted from the FSC pulse set to form the FSC large pulse set FSC_large_set, in preparation for the subsequent cell mass identification analysis.
By default, the first threshold threshold_1 is set to 3 times the FSC baseline noise.
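The thresholding and merging steps of S30 can be sketched as follows. Several details are assumptions made for illustration and are not specified in the text: the `Pulse` field names, measuring the inter-pulse distance from the end of one pulse to the start of the next, and the rule used to combine two pulses (earliest start, maximum height, summed area). The smoothing and PCA steps are omitted.

```python
from dataclasses import dataclass


@dataclass
class Pulse:
    time: float   # pulse start time
    h: float      # channel value (pulse height)
    width: float  # pulse duration
    area: float   # integrated pulse area


def extract_large_pulses(fsc_pulse_set, threshold_1, threshold_2):
    """Keep pulses with h > threshold_1, then merge pulses closer than threshold_2."""
    large = sorted((p for p in fsc_pulse_set if p.h > threshold_1),
                   key=lambda p: p.time)
    merged = []
    for p in large:
        if merged:
            last = merged[-1]
            gap = p.time - (last.time + last.width)  # assumed distance definition
            if gap < threshold_2:
                end = max(last.time + last.width, p.time + p.width)
                # Assumed merge rule: earliest start, max height, summed area.
                merged[-1] = Pulse(last.time, max(last.h, p.h),
                                   end - last.time, last.area + p.area)
                continue
        merged.append(p)
    return merged


baseline_noise = 2.0                 # made-up FSC baseline noise
threshold_1 = 3 * baseline_noise     # default: 3 times the baseline noise
threshold_2 = 3 * 1.0                # default: 3 times the pulse time length
FSC_pulse_set = [
    Pulse(0.0, 5.0, 1.0, 4.0),       # below threshold_1, dropped
    Pulse(2.0, 20.0, 1.0, 15.0),
    Pulse(3.5, 30.0, 1.0, 25.0),     # 0.5 after the previous pulse ends, merged
    Pulse(10.0, 40.0, 1.0, 30.0),    # far away, kept separate
]
FSC_large_set = extract_large_pulses(FSC_pulse_set, threshold_1, threshold_2)
```

Measuring the gap from the end of the previous (possibly already merged) pulse lets a chain of closely spaced pulses collapse into one event, which a start-to-start comparison would not do.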
Further, the first threshold threshold_1 is used to extract the pulse signals with larger FSC channel intensity from the FSC pulse set. Its determination needs to consider the following factors:
1. Instrument parameters
The baseline noise and sensitivity of the FSC channel vary from flow cytometer to flow cytometer, so an empirical range needs to be determined in advance from the parameters of the instrument.
2. Sample type
Different cell samples have different size distributions and therefore different FSC intensity ranges, so the threshold needs to be adjusted to the sample.
3. Data distribution
The distribution of the FSC pulse set is visualized or summarized statistically, and a threshold that effectively separates strong pulses from weak pulses is determined.
4. Trial-and-error method
Several candidate thresholds can be taken from the empirical range and their screening effect tested on samples, so that the best threshold is selected.
5. Adaptive threshold
A data-driven adaptive threshold algorithm, such as Otsu's method, may be designed to determine the threshold automatically from the distribution of the FSC pulse signals.
Taking the above factors into consideration, it is generally recommended that the first threshold threshold_1 be set in the range of 3-5 times the FSC baseline noise, with the specific value determined for the sample and instrument. The threshold can also be optimized by trial and error, or an automatic threshold selection algorithm can be designed.
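As an illustration of the adaptive option, the following is a minimal sketch of Otsu's method applied to a list of FSC channel values. The function name, the number of histogram bins and the sample data are assumptions; in practice the result would still be checked against the recommended range of 3-5 times the baseline noise.

```python
def otsu_threshold(values, bins=64):
    """Return the threshold that maximizes the between-class variance.

    A histogram-based sketch of Otsu's method; `bins` is an assumed
    parameter, not a value given in the text.
    """
    lo, hi = min(values), max(values)
    if lo == hi:
        return lo
    step = (hi - lo) / bins
    hist = [0] * bins
    for v in values:
        hist[min(int((v - lo) / step), bins - 1)] += 1
    centers = [lo + (i + 0.5) * step for i in range(bins)]
    total = len(values)
    sum_all = sum(c * n for c, n in zip(centers, hist))

    best_score, best_k = -1.0, 0
    w0, sum0 = 0, 0.0
    for k in range(bins - 1):
        w0 += hist[k]                 # number of samples in the "weak" class
        sum0 += centers[k] * hist[k]
        w1 = total - w0               # number of samples in the "strong" class
        if w0 == 0 or w1 == 0:
            continue
        mu0, mu1 = sum0 / w0, (sum_all - sum0) / w1
        score = w0 * w1 * (mu0 - mu1) ** 2  # proportional to between-class variance
        if score > best_score:
            best_score, best_k = score, k
    return lo + (best_k + 1) * step   # upper edge of the last "weak" bin


# Made-up FSC channel values: four weak pulses and four strong pulses.
fsc_heights = [1.0, 1.2, 0.9, 1.1, 9.0, 9.5, 10.0, 9.8]
threshold_1 = otsu_threshold(fsc_heights)
strong = [v for v in fsc_heights if v > threshold_1]
```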
In step S30, threshold_2 is the threshold used to judge whether two FSC large pulses are close enough in time to be merged into one pulse. By default, threshold_2 is 3 times the pulse time length;
Of course, threshold_2 can also be determined by considering the following factors:
1. Instrument acquisition frequency
The acquisition frequency of the flow cytometer determines the minimum time interval between two successive pulse events. If threshold_2 is below this interval, erroneous merging may result.
2. Physical width of cells or cell masses
The physical width of the cell mass should be considered, so that a too-small threshold_2 does not cause different cell masses to be merged.
3. Noise jitter effect
With a too-small threshold_2, noise jitter may also be merged into one pulse by mistake.
4. Statistical multi-sample distance distribution
The distribution of inter-pulse distances across a plurality of samples can be observed, and an appropriate interval selected as threshold_2.
5. Empirical parameter adjustment
Different parameters are tested manually, and the threshold with the best recognition effect is selected.
Taking the above factors into consideration, the recommended setting of threshold_2 is 3-5 times the time interval corresponding to the acquisition frequency; alternatively, testing proceeds step by step starting from 0.5 ms, and the threshold that merges cell mass pulses effectively without introducing excessive merging is selected.
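The two ways of choosing threshold_2 described above can be sketched as follows. The function names, the example acquisition frequency and the pulse timings are hypothetical, and the gap between pulses is again assumed to be measured from the end of one pulse to the start of the next.

```python
def threshold_2_from_rate(acquisition_hz, factor=3.0):
    """threshold_2 as a multiple (3-5 suggested) of the sampling interval."""
    return factor / acquisition_hz


def count_after_merge(start_times, widths, threshold_2):
    """Number of events left after merging pulses whose gap is below threshold_2."""
    count, prev_end = 0, None
    for start, width in sorted(zip(start_times, widths)):
        if prev_end is None or start - prev_end >= threshold_2:
            count += 1                      # this pulse starts a new event
            prev_end = start + width
        else:
            prev_end = max(prev_end, start + width)  # absorbed into the event
    return count


# Option 1: derive threshold_2 from an assumed 10 kHz acquisition frequency.
t2_seconds = threshold_2_from_rate(10_000, factor=3.0)

# Option 2: sweep candidate thresholds (in ms) and observe the merge behaviour.
starts_ms = [0.0, 1.5, 4.0]
widths_ms = [1.0, 1.0, 1.0]
sweep = {t: count_after_merge(starts_ms, widths_ms, t) for t in (0.3, 1.0, 2.0)}
```

The sweep only reports how many events survive for each candidate; deciding which count is correct still requires labeled samples or manual inspection, as the text notes.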
In the above technical solution, the steps of screening the FSC matrix, the SSC pulse matrix and the fluorescent pulse matrix according to the FSC large pulse set to obtain a screened FSC matrix, a screened SSC pulse matrix and a screened fluorescent pulse matrix specifically include:
traversing the FSC matrix, and retaining the rows corresponding to the FSC large pulses;
selecting the corresponding rows from the SSC pulse matrix and the fluorescent pulse matrix according to the FSC large pulses;
smoothing the screened matrices;
outputting the screened and smoothed FSC matrix, SSC pulse matrix and fluorescent pulse matrix.
Specifically, the implementation steps of S40 are described as follows:
1. Defining variables and constants
- FSC_large_set: the FSC large pulse set obtained in S30;
- FSC_matrix: the FSC matrix obtained in S20;
- SSC_matrix: the SSC pulse matrix obtained in S20;
- FL_matrix: the fluorescent pulse matrix obtained in S20;
- FSC_filter: the screened FSC matrix;
- SSC_filter: the screened SSC pulse matrix;
- FL_filter: the screened fluorescent pulse matrix;
2. Screening the FSC matrix
FSC_matrix is traversed; if the pulse signal psc_i of the current row exists in FSC_large_set, the row is retained, otherwise the row is deleted.
3. Screening the SSC pulse matrix and the fluorescent pulse matrix according to the same principle
4. Smoothing filtering
The screened matrices are smoothed:
FSC_filter=smooth(FSC_filter);
SSC_filter=smooth(SSC_filter);
FL_filter=smooth(FL_filter);
Through the above steps, the other matrices are screened using the FSC large pulse set and the screened matrices are smoothed, in preparation for the identification in the next step.
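The row screening of S40 can be sketched as follows, assuming that row i of every channel matrix describes the same event and that the FSC large pulse set is represented by the set of retained row indices. Both are assumptions of this sketch; the column layout and the values are invented, and the smoothing step is left out.

```python
def screen_rows(matrix, keep_ids):
    """Keep only the rows whose event index is in keep_ids, preserving order."""
    return [row for i, row in enumerate(matrix) if i in keep_ids]


# Made-up matrices: one row per event, columns are (height, width).
FSC_matrix = [[5.0, 1.0], [20.0, 1.2], [30.0, 1.1]]
SSC_matrix = [[2.0, 1.0], [8.0, 1.1], [9.0, 1.0]]
FL_matrix = [[0.5, 1.0], [3.0, 1.2], [4.0, 1.1]]

threshold_1 = 10.0
# Event indices of the FSC large pulses (stand-in for FSC_large_set).
large_ids = {i for i, row in enumerate(FSC_matrix) if row[0] > threshold_1}

FSC_filter = screen_rows(FSC_matrix, large_ids)
SSC_filter = screen_rows(SSC_matrix, large_ids)  # same rows as the FSC matrix
FL_filter = screen_rows(FL_matrix, large_ids)
```

Screening all three matrices with the same index set keeps their rows aligned, which the horizontal concatenation in S50 relies on.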
In the above technical solution, the step of establishing an aggregation matrix from the obtained screened FSC matrix, screened SSC pulse matrix and screened fluorescent pulse matrix specifically includes:
horizontally concatenating the screened and smoothed matrices of the different channels to form an aggregation matrix;
normalizing and standardizing the aggregation matrix;
and performing dimension reduction on the normalized and standardized aggregation matrix by adopting PCA to obtain a dimension-reduced aggregation matrix that fuses the information of the different channels.
Specifically, the implementation steps of S50 are described as follows:
1. Defining variables
- fusion_matrix: the aggregation matrix to be built
2. Matrix aggregation
The three screened matrices FSC_filter, SSC_filter and FL_filter are concatenated horizontally, row by row, to obtain fusion_matrix.
3. Normalization
Column-wise normalization is performed on fusion_matrix. Normalization eliminates the effect of the different magnitudes of the features.
4. Dimension reduction
The dimension of fusion_matrix can be reduced using PCA or a similar method.
At this point, the feature matrices of the different channels have been aggregated into one aggregation matrix, which has been normalized and reduced in dimension in preparation for the identification in the next step.
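The aggregation and normalization steps of S50 can be sketched as follows. The text does not state which normalization is used, so the z-score (zero mean, unit population standard deviation per column) is an assumption, as are the helper names and the sample values. The PCA step is not shown; it would normally be delegated to a numerical library.

```python
from statistics import mean, pstdev


def hstack(*matrices):
    """Concatenate matrices horizontally, row by row."""
    n_rows = len(matrices[0])
    if any(len(m) != n_rows for m in matrices):
        raise ValueError("all matrices must have the same number of rows")
    fused = []
    for i in range(n_rows):
        row = []
        for m in matrices:
            row.extend(m[i])
        fused.append(row)
    return fused


def standardize_columns(matrix):
    """Z-score each column; constant columns are left at zero."""
    columns = list(zip(*matrix))
    means = [mean(c) for c in columns]
    stds = [pstdev(c) or 1.0 for c in columns]  # avoid division by zero
    return [[(v - mu) / sd for v, mu, sd in zip(row, means, stds)]
            for row in matrix]


# Made-up screened matrices with aligned rows.
FSC_filter = [[20.0, 1.2], [30.0, 1.0]]
SSC_filter = [[8.0], [9.0]]
FL_filter = [[3.0], [3.0]]

fusion_matrix = hstack(FSC_filter, SSC_filter, FL_filter)
normalized = standardize_columns(fusion_matrix)
```

After standardization every non-constant column contributes on the same scale, so a large-valued channel such as FSC height does not dominate the later dimension reduction or classification.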
In the above technical solution, the steps of establishing and training the cell mass analysis model specifically include:
collecting a plurality of historical aggregation matrix samples, and marking cell mass data to serve as training samples;
constructing a machine learning classification model;
and training the machine learning classification model by using a training sample to obtain a cell mass analysis model.
The specific description is as follows:
1. Collecting and labeling a plurality of historical aggregation matrix samples
- collecting a plurality of aggregation matrix samples, produced in step S50, that contain single cells and cell masses;
- for each sample, labeling by manual judgment which rows correspond to single cells and which to cell masses;
2. Constructing the classification model
- taking each matrix row as a sample input;
- taking the labeled class as the label;
- building a machine learning classification model, such as an SVM or a random forest;
3. Training the model
- partitioning the data into a training set and a validation set;
- training the classification model and optimizing the model hyper-parameters;
- evaluating the model on the validation set;
4. Further, model evaluation is also included
- evaluating the classification performance on an independent test set;
- calculating metrics such as accuracy and recall.
Further, in the above technical solution, the model used for constructing the machine learning classification model is an SVM or a random forest.
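The evaluation metrics named in the model evaluation step can be computed as in the following sketch. Training the SVM or random forest itself is not shown. The label convention (1 for a cell mass, 0 for a single cell), the function name and the example predictions are assumptions made for illustration.

```python
def evaluate(y_true, y_pred, positive=1):
    """Accuracy, precision and recall for the cell mass (positive) class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    correct = sum(1 for t, p in pairs if t == p)
    return {
        "accuracy": correct / len(pairs),
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
    }


# Made-up labels for six rows of an independent test set.
y_true = [1, 0, 1, 1, 0, 0]   # manual labels: 1 = cell mass, 0 = single cell
y_pred = [1, 0, 0, 1, 1, 0]   # hypothetical model output
metrics = evaluate(y_true, y_pred)
```

For this application recall on the cell mass class indicates how many cell masses would actually be removed, while precision indicates how many single-cell events would be deleted by mistake.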
In the above technical solution, the step of deleting the electric pulse signal corresponding to the cell mass in the electric pulse signal set specifically includes:
traversing an original electric pulse signal set;
judging whether each pulse belongs to the set of identified cell masses;
if so, deleting the pulse from the original set;
retaining the pulses that do not belong to a cell mass to form a filtered output set;
returning the filtered set of electrical pulse signals.
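The deletion step can be sketched as a single pass over the original set. Representing each pulse as a dictionary with an "id" key, and the identified cell masses as a set of such ids, is an assumption of this sketch.

```python
def remove_cell_mass_pulses(pulse_set, cell_mass_ids):
    """Return the pulses whose id is not among the identified cell masses."""
    return [pulse for pulse in pulse_set if pulse["id"] not in cell_mass_ids]


# Made-up original pulse set and model output.
pulse_set = [
    {"id": 0, "h": 12.0},
    {"id": 1, "h": 35.0},
    {"id": 2, "h": 11.5},
    {"id": 3, "h": 40.0},
]
cell_mass_ids = {1, 3}        # pulses classified as cell masses
filtered_set = remove_cell_mass_pulses(pulse_set, cell_mass_ids)
```

Building a new list instead of deleting from the original while iterating avoids skipped elements and leaves the original set available for comparison.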
A second aspect of the present invention provides a computer readable storage medium having stored therein program instructions which, when executed, carry out the method for outputting a cell mass removal signal by a flow cytometer as described above.
A third aspect of the present invention provides a system for outputting a cell mass removal signal by a flow cytometer, wherein the system comprises the computer readable storage medium.
Specifically, the principle of the invention is as follows:
1. Multi-channel information fusion
The flow cytometer can acquire multiple parameters for each cell or cell mass simultaneously, such as FSC, SSC and fluorescence. The invention fuses the information matrices of the different channels to form an aggregation matrix that combines multiple features. The FSC, SSC and fluorescent signals of each cell or cell mass are related to one another. For example, a cell mass is relatively large and has a more complex internal structure, so its FSC and SSC values are correlated; at the same time, because a cell mass carries relatively more fluorescent dye, its fluorescent signal produces channel values different from those of ordinary cells. With simple manual judgment, the large amount of data can make the selected threshold inaccurate. The multi-channel fusion method instead performs multi-feature extraction and analysis on the aggregation matrix, so that cell mass signals can be determined more reliably.
The aggregation matrix describes each event with richer features than a single parameter does, which facilitates subsequent identification and classification. Fusing multimodal information is an important way to improve the representation.
2. Matrix representation that encodes subtle variations
The invention organizes the descriptive indices of each pulse event, such as pulse shape and time, in matrix form.
Compared with simple statistics, the matrix preserves small variations of the original pulses, especially the relations among the signals of different channels, and provides richer features for cell mass identification.
3. Pre-trained machine learning model
The invention uses a pre-trained machine learning model to carry out classification analysis on the aggregation matrix, and identifies the cell clusters therein.
Compared with a fixed threshold and a manual judgment rule, the machine learning model can learn complex distribution of samples and has stronger generalization capability on unknown data.
4. End-to-end automation flow
From the original signal to the final output, the invention establishes a fully automatic analysis flow that requires no manual intervention or parameter setting.
A standardized end-to-end flow is critical to achieving automated detection.
5. Balancing accuracy and efficiency
The invention considers identification accuracy and operating efficiency together: multi-channel aggregation first increases the amount of information, and dimension reduction then keeps the computation efficient.
The balance between accuracy and efficiency is also an important indicator of an automated detection system.
The foregoing is merely illustrative of the present invention, and the present invention is not limited thereto; any variation or substitution that a person skilled in the art can readily conceive falls within the protection scope of the present invention. Therefore, the protection scope of the invention is subject to the protection scope of the claims.

Claims (10)

1. A method for outputting a cell mass removal signal by a flow cytometer, comprising the steps of:
S10, acquiring the electric pulse signal set collected by a photoelectric converter of the flow cytometer, wherein the electric pulse signal set comprises an electric pulse signal set generated by forward scattered light, an electric pulse signal set generated by side scattered light and an electric pulse signal set generated by fluorescent signals, which are respectively recorded as an FSC pulse set, an SSC pulse set and a fluorescent pulse set; wherein each electric pulse signal comprises a time, a length, a width and an area;
S20, respectively establishing an FSC matrix, an SSC pulse matrix and a fluorescent pulse matrix according to the obtained FSC pulse set, SSC pulse set and fluorescent pulse set;
S30, traversing each electric pulse signal in the FSC pulse set, and acquiring electric pulse signals with channel values larger than a first threshold value to form an FSC large pulse set;
S40, screening the FSC matrix, the SSC pulse matrix and the fluorescent pulse matrix according to the FSC large pulse set to obtain a screened FSC matrix, a screened SSC pulse matrix and a screened fluorescent pulse matrix;
S50, establishing an aggregation matrix from the obtained screened FSC matrix, screened SSC pulse matrix and screened fluorescent pulse matrix;
S60, analyzing the aggregation matrix by utilizing a pre-trained cell mass analysis model to obtain the electric pulse signals corresponding to cell masses in the aggregation matrix;
S70, deleting the electric pulse signals corresponding to the cell masses from the electric pulse signal set, and outputting the resulting electric pulse signal set to a data processing system of the flow cytometer.
2. The method according to claim 1, wherein the step of establishing the FSC matrix, the SSC pulse matrix and the fluorescent pulse matrix from the FSC pulse set, the SSC pulse set and the fluorescent pulse set, respectively, further comprises a step of eliminating noise points from the electric pulse signals.
3. The method for outputting a cell mass removal signal by a flow cytometer of claim 1, wherein the step of traversing each electric pulse signal in the FSC pulse set and acquiring the electric pulse signals with channel values larger than a first threshold value to form the FSC large pulse set specifically comprises:
traversing the FSC pulse set, and extracting pulse signals with FSC channel values larger than a first threshold value as a first pulse set;
calculating the inter-pulse distance of the first pulse set, and merging adjacent pulses;
smoothing the first pulse set after combining the adjacent pulses;
performing dimension reduction on the smoothed first pulse set by adopting the PCA method;
using the first pulse set after dimension reduction as the FSC large pulse set.
4. The method for outputting a cell mass removal signal by a flow cytometer of claim 1, wherein the step of screening the FSC matrix, the SSC pulse matrix and the fluorescent pulse matrix according to the FSC large pulse set to obtain a screened FSC matrix, a screened SSC pulse matrix and a screened fluorescent pulse matrix specifically comprises:
traversing the FSC matrix, and retaining the rows corresponding to the FSC large pulses;
selecting the corresponding rows from the SSC pulse matrix and the fluorescent pulse matrix according to the FSC large pulses;
smoothing the screened matrices;
outputting the screened and smoothed FSC matrix, SSC pulse matrix and fluorescent pulse matrix.
5. The method for outputting a cell mass removal signal by a flow cytometer of claim 1, wherein the step of establishing an aggregation matrix from the obtained screened FSC matrix, screened SSC pulse matrix and screened fluorescent pulse matrix comprises:
horizontally concatenating the screened and smoothed matrices of the different channels to form an aggregation matrix;
normalizing and standardizing the aggregation matrix;
and performing dimension reduction on the normalized and standardized aggregation matrix by adopting PCA to obtain a dimension-reduced aggregation matrix that fuses the information of the different channels.
6. The method for outputting a cell mass removal signal by a flow cytometer of claim 1, wherein the step of establishing and training the cell mass analysis model comprises:
collecting a plurality of historical aggregation matrix samples, and marking cell mass data to serve as training samples;
constructing a machine learning classification model;
and training the machine learning classification model by using a training sample to obtain a cell mass analysis model.
7. The method for outputting a cell mass removal signal by a flow cytometer of claim 6, wherein the model used to construct the machine learning classification model is an SVM or a random forest.
8. The method for outputting a cell mass removal signal by a flow cytometer of claim 1, wherein the step of deleting the electric pulse signals corresponding to the cell masses from the electric pulse signal set comprises:
traversing the original electric pulse signal set;
judging whether each pulse belongs to the set of identified cell masses;
if so, deleting the pulse from the original set;
retaining the pulses that do not belong to a cell mass to form a filtered output set;
returning the filtered set of electric pulse signals.
9. A computer readable storage medium having stored therein program instructions which, when executed, carry out the method for outputting a cell mass removal signal by a flow cytometer of any one of claims 1-8.
10. A system for outputting a cell mass removal signal by a flow cytometer, comprising the computer readable storage medium of claim 9.
CN202311243009.0A 2023-09-25 2023-09-25 Method, medium and system for outputting cell mass removal signal by flow cytometer Pending CN117288661A (en)

Publications (1)

Publication Number Publication Date
CN117288661A true CN117288661A (en) 2023-12-26

Family

ID=89247493

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311243009.0A Pending CN117288661A (en) 2023-09-25 2023-09-25 Method, medium and system for outputting cell mass removal signal by flow cytometer

Country Status (1)

Country Link
CN (1) CN117288661A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination