CN112599199A - Analysis method suitable for 10x single cell transcriptome sequencing data
- Publication number: CN112599199A
- Application number: CN202011592574.4A
- Authority: CN (China)
- Prior art keywords: cells; cell; expression; sequencing data; clustering
- Legal status: Pending (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Classifications
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B—BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B30/00—ICT specially adapted for sequence analysis involving nucleotides or amino acids
Abstract
The invention discloses an analysis method suitable for 10x single cell transcriptome sequencing data, which comprises the steps of 1) sequencing data processing, 2) Seurat data filtering, 3) cell clustering and grouping, and 4) cell group marker gene analysis.
Description
Technical Field
The invention relates to the technical field of gene detection, in particular to an analysis method suitable for 10x single cell transcriptome sequencing data.
Background
A transcriptome is the collection of all transcripts produced by a given species or a specific cell type. Transcriptome studies examine gene function and gene structure at the global level and reveal the molecular mechanisms of specific biological processes and of disease development; they are widely applied in basic research, clinical diagnosis, drug development, and related fields.
A conventional transcriptome captures the transcription of all mRNA in a biological tissue sample at a given time and is usually taken as an important indicator of the state of that tissue or sample at that time. Differences between samples, tissues, species, and treatments change mRNA expression, which in turn regulates the life state of the organism or carries out particular cell functions.
However, the transcriptome of a sample or tissue is an average of the expression levels across all of its cells and cannot reflect the state of individual cells or of a particular cell population within the sample. The transcriptional state of single cells or of specific cell populations therefore needs to be studied in depth so that the state of the tissue can be described more finely and accurately. In immune or drug response research, for example, this allows immunotherapy or targeted therapy to be directed more precisely at particular cells or cell subsets, which is a prerequisite for precision medicine.
Disclosure of Invention
The invention provides an analysis method suitable for 10x single cell transcriptome sequencing data.
The scheme of the invention is as follows:
An analysis method suitable for 10x single cell transcriptome sequencing data, comprising the following steps:
1) sequencing data processing:
processing the raw data using the official 10x Genomics cellranger count pipeline:
processing the raw FASTQ files from the sequencer with the count program to obtain the number of reads expressed for each gene in each cell;
performing quality control on the sequencing result according to the information in the web page generated by the count program;
2) Seurat data filtering:
further removing low-quality cells from the expression data obtained in step 1) using the Seurat software package to obtain filtered cells;
3) cell clustering: clustering the filtered cells using the Seurat software package;
4) cell population marker gene analysis: analyzing the marker genes of each cell population using the Seurat software package to obtain the results.
As a preferred technical scheme, the main quality control indexes in step 1) comprise: the effective cell number, which should be close to the cell number estimated at library construction; the genome alignment rate, which is greater than 70%; and the gene set alignment rate, which is greater than 30%.
As a further preferred technical scheme, the main quality control indexes in step 1) comprise: the effective cell number, which should be close to the cell number estimated at library construction; the genome alignment rate, which is greater than 80%; and the gene set alignment rate, which is greater than 50%.
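The quality control indexes above can be sketched as a simple threshold check. This is only an illustration in plain Python, not part of the cellranger output: the metric names, the `passes_qc` helper, and the 20% tolerance used to decide whether the effective cell number is "close to" the estimate are all assumptions; only the 80%/50% and 70%/30% thresholds come from the text.

```python
# Alignment thresholds quoted from the text; the dictionary keys are hypothetical names.
CONVENTIONAL = {"genome_rate": 0.80, "gene_set_rate": 0.50}
RELAXED = {"genome_rate": 0.70, "gene_set_rate": 0.30}


def passes_qc(effective_cells, expected_cells, genome_rate, gene_set_rate,
              thresholds=CONVENTIONAL, cell_tolerance=0.20):
    """Return True if a run meets the alignment thresholds and cell-number check.

    cell_tolerance is an assumption: the text only says the effective cell
    number should be "close to" the number estimated at library construction.
    """
    cells_ok = abs(effective_cells - expected_cells) <= cell_tolerance * expected_cells
    return (cells_ok
            and genome_rate > thresholds["genome_rate"]
            and gene_set_rate > thresholds["gene_set_rate"])


# Made-up metrics for one run, for illustration only.
run = {"effective_cells": 4800, "expected_cells": 5000,
       "genome_rate": 0.76, "gene_set_rate": 0.42}

print(passes_qc(**run))                      # checked against the 80% / 50% thresholds
print(passes_qc(**run, thresholds=RELAXED))  # checked against the 70% / 30% thresholds
```

In this made-up run the alignment rates fall between the two threshold sets, so the run fails the stricter check and passes the looser one.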
As a preferred technical scheme, the low-quality cells removed in step 2) comprise cells with an excessively high number of expressed genes, cells with an excessively low number of expressed reads, and cells in which mitochondrial genes account for more than 10% of the expressed reads.
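A minimal sketch of the three filtering criteria follows, written in plain Python rather than with the Seurat package. The 10% mitochondrial cutoff is taken from the text; the `max_genes` and `min_reads` values, the field names, and the `filter_cells` helper are hypothetical, since the text does not state what counts as "excessively high" or "excessively low".

```python
def filter_cells(cells, max_genes=6000, min_reads=500, max_mito_fraction=0.10):
    """Keep cells that pass all three criteria described in the text.

    max_genes and min_reads are placeholder values (assumptions);
    max_mito_fraction=0.10 is the 10% threshold stated in the text.
    """
    kept = []
    for cell in cells:
        total = cell["total_reads"]
        mito_fraction = cell["mito_reads"] / total if total else 1.0
        if cell["n_genes"] > max_genes:        # too many expressed genes
            continue
        if total < min_reads:                  # too few expressed reads
            continue
        if mito_fraction > max_mito_fraction:  # more than 10% mitochondrial reads
            continue
        kept.append(cell)
    return kept


# Made-up per-cell summaries for illustration.
cells = [
    {"barcode": "c1", "n_genes": 2000, "total_reads": 8000, "mito_reads": 400},
    {"barcode": "c2", "n_genes": 9000, "total_reads": 50000, "mito_reads": 1000},
    {"barcode": "c3", "n_genes": 150, "total_reads": 300, "mito_reads": 10},
    {"barcode": "c4", "n_genes": 1800, "total_reads": 6000, "mito_reads": 900},
]

kept_barcodes = [cell["barcode"] for cell in filter_cells(cells)]
print(kept_barcodes)
```

Here `c2` is dropped for its gene count, `c3` for its read count, and `c4` for its 15% mitochondrial fraction, so only `c1` remains.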
As a preferred technical scheme, clustering the filtered cells using the Seurat software package specifically comprises:
normalizing and scaling the filtered cell expression matrix;
selecting the 2000 genes with the highest expression variation among cells as the input for principal component analysis;
performing principal component analysis on the selected genes using the PCA algorithm, and selecting the first 20 principal components as the input for cluster analysis;
constructing a K-nearest neighbor (KNN) graph from the PCA principal components, and clustering the cells using the Louvain algorithm;
and performing nonlinear dimensionality reduction using the UMAP algorithm, and visualizing the clustering result in the first two dimensions.
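The first two clustering steps (normalization and selection of highly variable genes) can be illustrated with a small standard-library sketch. This is a simplification under stated assumptions, not the Seurat implementation: the log-normalization with a scale factor of 10000 and the ranking of genes by plain variance are illustrative choices, the helper names are hypothetical, and the PCA, KNN graph, Louvain, and UMAP steps are not reproduced here.

```python
import math
from statistics import pvariance


def log_normalize(counts, scale_factor=10000):
    """counts: dict cell -> dict gene -> raw count.

    Each count is divided by the cell total, multiplied by scale_factor,
    and log1p-transformed (scale_factor is an assumed value).
    """
    normalized = {}
    for cell, gene_counts in counts.items():
        total = sum(gene_counts.values())
        normalized[cell] = {
            gene: math.log1p(count / total * scale_factor) if total else 0.0
            for gene, count in gene_counts.items()
        }
    return normalized


def top_variable_genes(normalized, n_top=2000):
    """Rank genes by variance across cells and return the n_top most variable."""
    genes = sorted({gene for gene_values in normalized.values() for gene in gene_values})
    variances = {
        gene: pvariance([gene_values.get(gene, 0.0) for gene_values in normalized.values()])
        for gene in genes
    }
    return sorted(genes, key=lambda gene: variances[gene], reverse=True)[:n_top]


# Made-up counts: gene B is expressed in only one of three cells.
counts = {
    "c1": {"A": 10, "B": 0, "C": 5},
    "c2": {"A": 10, "B": 20, "C": 5},
    "c3": {"A": 10, "B": 0, "C": 5},
}
normalized = log_normalize(counts)
most_variable = top_variable_genes(normalized, n_top=1)
print(most_variable)
```

With real data, `n_top=2000` would match the number of genes named in the text; the toy example uses `n_top=1` only because it has three genes.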
As a preferred technical scheme, analyzing the marker genes of each cell population using the Seurat software package specifically comprises:
comparing the expression level in the cells of each cell population with the mean expression level in the other cell populations to find genes that are highly expressed in that cell population and lowly expressed in the other cell populations;
screening the marker genes found;
displaying the expression of the marker genes in each cell population using the VlnPlot function of the Seurat software package.
As a preferred technical scheme, the marker genes are screened by retaining the results with an expression ratio in the cell population greater than 25% and a logFC greater than 0.25.
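The screening rule above can be sketched in plain Python as follows. The 25% and 0.25 thresholds come from the text. Everything else is an assumption made for illustration: "expression ratio" is read as the fraction of cells in the population with non-zero expression, the fold change is computed as a log2 difference of mean expression with a pseudocount of 1, and `screen_markers` is a hypothetical helper, not a Seurat function.

```python
import math


def screen_markers(expr, clusters, target, min_pct=0.25, min_logfc=0.25, pseudocount=1.0):
    """Return (gene, pct, logfc) tuples for genes that pass both thresholds.

    expr: dict gene -> dict cell -> normalized expression value.
    clusters: dict cell -> cluster label.
    The log base and pseudocount are assumptions, not taken from the text.
    """
    in_cells = [cell for cell, label in clusters.items() if label == target]
    out_cells = [cell for cell, label in clusters.items() if label != target]
    if not in_cells or not out_cells:
        raise ValueError("need at least one cell inside and outside the target cluster")

    markers = []
    for gene, values in expr.items():
        # Fraction of cells in the target population that express the gene.
        pct = sum(values.get(cell, 0.0) > 0 for cell in in_cells) / len(in_cells)
        mean_in = sum(values.get(cell, 0.0) for cell in in_cells) / len(in_cells)
        mean_out = sum(values.get(cell, 0.0) for cell in out_cells) / len(out_cells)
        logfc = math.log2(mean_in + pseudocount) - math.log2(mean_out + pseudocount)
        if pct > min_pct and logfc > min_logfc:
            markers.append((gene, round(pct, 2), round(logfc, 2)))
    return markers


# Made-up expression values for four cells in two clusters.
clusters = {"c1": 0, "c2": 0, "c3": 1, "c4": 1}
expr = {
    "G1": {"c1": 3.0, "c2": 2.0, "c3": 0.0, "c4": 0.0},  # high in cluster 0 only
    "G2": {"c1": 1.0, "c2": 1.0, "c3": 1.0, "c4": 1.0},  # same everywhere
    "G3": {"c1": 0.0, "c2": 0.0, "c3": 2.0, "c4": 2.0},  # high in cluster 1 only
}

markers_cluster0 = screen_markers(expr, clusters, target=0)
print(markers_cluster0)
```

Only `G1` is kept for cluster 0: `G2` has no fold change and `G3` is not expressed in that cluster at all.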
Owing to the above technical scheme, the analysis method suitable for 10x single cell transcriptome sequencing data proceeds as follows: 1) sequencing data processing: the raw data are processed using the official 10x Genomics cellranger count pipeline; the raw FASTQ files from the sequencer are processed with the count program to obtain the number of reads expressed for each gene in each cell, and quality control is performed on the sequencing result according to the information in the web page generated by the count program; 2) Seurat data filtering: low-quality cells are further removed from the expression data obtained in step 1) using the Seurat software package to obtain filtered cells; 3) cell clustering: the filtered cells are clustered using the Seurat software package; 4) cell population marker gene analysis: the marker genes of each cell population are analyzed using the Seurat software package to obtain the results.
The invention has the following advantages: 1. the method reduces the number of data preprocessing steps, increases analysis speed and efficiency, and is convenient to operate and use;
2. downstream analysis is performed with the Seurat software package for the R language, which is currently widely accepted, so that the analysis is more accurate and more precise;
3. quality control is performed on the cells by combining several parameters, which reduces the influence of low-quality cells on the analysis and improves its accuracy;
4. several optional differential analysis methods improve the sensitivity and accuracy of the marker gene search;
5. several forms of result display are provided and combined with the UMAP dimensionality reduction result, making the dynamic changes of marker genes across cells easier to understand.
Drawings
FIG. 1 is an analytical flow chart according to the present invention;
FIG. 2 is a scatter plot of the UMI number distribution of cells of the present invention;
FIG. 3 is a violin plot of single cell data quality control according to the present invention;
FIG. 4 is a UMAP scatter plot of cells of the present invention;
FIG. 5 is a gene expression distribution scattergram of the present invention;
FIG. 6 is a Marker gene GO enrichment factor graph of the invention.
Detailed Description
To address the above deficiencies, the present invention provides an analysis method suitable for 10x single cell transcriptome sequencing data, which solves the problems described in the Background.
An analysis method suitable for 10x single cell transcriptome sequencing data, comprising the following steps:
1) sequencing data processing:
processing the raw data using the official 10x Genomics cellranger count pipeline:
processing the raw FASTQ files from the sequencer with the count program to obtain the number of reads expressed for each gene in each cell;
performing quality control on the sequencing result according to the information in the web page generated by the count program;
2) Seurat data filtering:
further removing low-quality cells from the expression data obtained in step 1) using the Seurat software package to obtain filtered cells;
3) cell clustering: clustering the filtered cells using the Seurat software package;
4) cell population marker gene analysis: analyzing the marker genes of each cell population using the Seurat software package to obtain the results.
The main quality control indexes in step 1) comprise: the effective cell number, which should be close to the cell number estimated at library construction; the genome alignment rate, which is greater than 70%; and the gene set alignment rate, which is greater than 30%.
Alternatively, the main quality control indexes in step 1) comprise: the effective cell number, which should be close to the cell number estimated at library construction; the genome alignment rate, which is greater than 80%; and the gene set alignment rate, which is greater than 50%.
The low-quality cells removed in step 2) comprise cells with an excessively high number of expressed genes, cells with an excessively low number of expressed reads, and cells in which mitochondrial genes account for more than 10% of the expressed reads.
Clustering the filtered cells using the Seurat software package specifically comprises:
normalizing and scaling the filtered cell expression matrix;
selecting the 2000 genes with the highest expression variation among cells as the input for principal component analysis;
performing principal component analysis on the selected genes using the PCA algorithm, and selecting the first 20 principal components as the input for cluster analysis;
constructing a K-nearest neighbor (KNN) graph from the PCA principal components, and clustering the cells using the Louvain algorithm;
and performing nonlinear dimensionality reduction using the UMAP algorithm, and visualizing the clustering result in the first two dimensions.
Analyzing the marker genes of each cell population using the Seurat software package specifically comprises:
comparing the expression level in the cells of each cell population with the mean expression level in the other cell populations to find genes that are highly expressed in that cell population and lowly expressed in the other cell populations;
screening the marker genes found;
displaying the expression of the marker genes in each cell population using the VlnPlot function of the Seurat software package.
The marker genes are screened by retaining the results with an expression ratio in the cell population greater than 25% and a logFC greater than 0.25.
To make the technical means, inventive features, objectives, and effects of the invention easy to understand, the invention is further described below with reference to a specific embodiment.
Example:
Step one, sequencing data processing:
the raw data are processed using the official 10x Genomics cellranger count pipeline:
the raw FASTQ files from the sequencer are processed with the count program to obtain the number of reads expressed for each gene in each cell, as shown in FIG. 2;
quality control is performed on the sequencing result according to the information in the web page generated by the count program. The main indexes are: 1. the effective cell number, which should be close to the cell number estimated at library construction; 2. the genome alignment rate, which should conventionally be greater than 80%; 3. the gene set alignment rate, which should conventionally be greater than 50%. For species with poor genome assembly, the criteria can be relaxed to a genome alignment rate greater than 70% and a gene set alignment rate greater than 30%;
Step two, Seurat data filtering:
low-quality cells are further removed from the expression data obtained in step one using the Seurat software package:
removing cells with an excessively high number of expressed genes;
removing cells with an excessively low number of expressed reads;
removing cells in which mitochondrial genes account for more than 10% of the expressed reads, as shown in FIG. 3;
Step three, cell clustering:
the filtered cells are clustered using Seurat, specifically:
normalizing and scaling the filtered cell expression matrix;
selecting the 2000 genes with the highest expression variation among cells as the input for principal component analysis;
performing principal component analysis on the selected genes using the PCA algorithm, and selecting the first 20 principal components as the input for cluster analysis;
constructing a K-nearest neighbor (KNN) graph from the PCA principal components, and clustering the cells using the Louvain algorithm;
performing nonlinear dimensionality reduction using the UMAP algorithm, and visualizing the clustering result in the first two dimensions, as shown in FIG. 4;
Step four, cell population marker gene analysis:
the marker genes of each cell population are analyzed using Seurat, specifically:
comparing the expression level in the cells of each cell population with the mean expression level in the other cell populations to find genes that are highly expressed in that cell population and lowly expressed in the other cell populations;
screening the marker genes found, and retaining the results with an expression ratio in the cell population greater than 25% and a logFC greater than 0.25;
displaying the expression of the marker genes in each cell population using the VlnPlot function of Seurat, as shown in FIG. 5 and FIG. 6.
The foregoing shows and describes the general principles, essential features, and advantages of the invention. It will be understood by those skilled in the art that the present invention is not limited to the embodiments described above, which are described in the specification and illustrated only to illustrate the principle of the present invention, but that various changes and modifications may be made therein without departing from the spirit and scope of the present invention, which fall within the scope of the invention as claimed. The scope of the invention is defined by the appended claims and equivalents thereof.
Claims (7)
1. An analysis method suitable for 10x single cell transcriptome sequencing data, characterized by comprising the following steps:
1) sequencing data processing:
processing the raw data using the official 10x Genomics cellranger count pipeline:
processing the raw FASTQ files from the sequencer with the count program to obtain the number of reads expressed for each gene in each cell;
performing quality control on the sequencing result according to the information in the web page generated by the count program;
2) Seurat data filtering:
further removing low-quality cells from the expression data obtained in step 1) using the Seurat software package to obtain filtered cells;
3) cell clustering: clustering the filtered cells using the Seurat software package;
4) cell population marker gene analysis: analyzing the marker genes of each cell population using the Seurat software package to obtain the results.
2. The analysis method suitable for 10x single cell transcriptome sequencing data according to claim 1, characterized in that:
the main quality control indexes in step 1) comprise: the effective cell number, which should be close to the cell number estimated at library construction; the genome alignment rate, which is greater than 70%; and the gene set alignment rate, which is greater than 30%.
3. The analysis method suitable for 10x single cell transcriptome sequencing data according to claim 2, characterized in that:
the main quality control indexes in step 1) comprise: the effective cell number, which should be close to the cell number estimated at library construction; the genome alignment rate, which is greater than 80%; and the gene set alignment rate, which is greater than 50%.
4. The analysis method suitable for 10x single cell transcriptome sequencing data according to claim 1, characterized in that:
the low-quality cells removed in step 2) comprise cells with an excessively high number of expressed genes, cells with an excessively low number of expressed reads, and cells in which mitochondrial genes account for more than 10% of the expressed reads.
5. The analysis method suitable for 10x single cell transcriptome sequencing data according to claim 1, characterized in that clustering the filtered cells using the Seurat software package specifically comprises:
normalizing and scaling the filtered cell expression matrix;
selecting the 2000 genes with the highest expression variation among cells as the input for principal component analysis;
performing principal component analysis on the selected genes using the PCA algorithm, and selecting the first 20 principal components as the input for cluster analysis;
constructing a KNN graph from the PCA principal components, and clustering the cells using the Louvain algorithm;
and performing nonlinear dimensionality reduction using the UMAP algorithm, and visualizing the clustering result in the first two dimensions.
6. The analysis method suitable for 10x single cell transcriptome sequencing data according to claim 1, characterized in that analyzing the marker genes of each cell population using the Seurat software package specifically comprises:
comparing the expression level in the cells of each cell population with the mean expression level in the other cell populations to find genes that are highly expressed in that cell population and lowly expressed in the other cell populations;
screening the marker genes found;
displaying the expression of the marker genes in each cell population using the VlnPlot function of the Seurat software package.
7. The analysis method suitable for 10x single cell transcriptome sequencing data according to claim 6, characterized in that: the marker genes are screened by retaining the results with an expression ratio in the cell population greater than 25% and a logFC greater than 0.25.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011592574.4A CN112599199A (en) | 2020-12-29 | 2020-12-29 | Analysis method suitable for 10x single cell transcriptome sequencing data |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011592574.4A CN112599199A (en) | 2020-12-29 | 2020-12-29 | Analysis method suitable for 10x single cell transcriptome sequencing data |
Publications (1)
Publication Number | Publication Date |
---|---|
CN112599199A | 2021-04-02 |
Family
ID=75203357
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202011592574.4A (Pending; published as CN112599199A) | Analysis method suitable for 10x single cell transcriptome sequencing data | 2020-12-29 | 2020-12-29 |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112599199A (en) |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109979538A (en) * | 2019-03-28 | 2019-07-05 | 广州基迪奥生物科技有限公司 | A kind of analysis method based on the unicellular transcript profile sequencing data of 10X |
CN110060729A (en) * | 2019-03-28 | 2019-07-26 | 广州序科码生物技术有限责任公司 | A method of cell identity is annotated based on unicellular transcript profile cluster result |
WO2019200342A1 (en) * | 2018-04-12 | 2019-10-17 | The J. David Gladstone Institutes | Methods for treating apoe4/4-associated disorders |
CN111863138A (en) * | 2020-05-26 | 2020-10-30 | 浙江大学 | Human uterine tissue cell composition analysis model and establishing method and application thereof |
Non-Patent Citations (1)
Title |
---|
YULONG FU;XIAOHU HUANG;PENG ZHANG;JOYCE VAN DE LEEMPUT;ZHE HAN;: "Single-cell RNA sequencing identifies novel cell types in Drosophila blood", JOURNAL OF GENETICS AND GENOMICS, no. 04 * |
Cited By (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113188981A (en) * | 2021-04-30 | 2021-07-30 | 天津深析智能科技发展有限公司 | Automatic analysis method of multi-factor cytokine |
CN113257364A (en) * | 2021-05-26 | 2021-08-13 | 南开大学 | Single cell transcriptome sequencing data clustering method and system based on multi-objective evolution |
CN113611359A (en) * | 2021-08-13 | 2021-11-05 | 江苏先声医学诊断有限公司 | Method for improving strain assembly efficiency of metagenome nanopore sequencing data |
CN113674800A (en) * | 2021-08-25 | 2021-11-19 | 中国农业科学院蔬菜花卉研究所 | Cell clustering method based on single cell transcriptome sequencing data |
CN115424668A (en) * | 2022-11-02 | 2022-12-02 | 杭州联川基因诊断技术有限公司 | Single-cell transcriptome data availability analysis method, medium and equipment |
CN117079726A (en) * | 2023-10-16 | 2023-11-17 | 浙江大学长三角智慧绿洲创新中心 | Database visualization method based on single cells and related equipment |
CN117079726B (en) * | 2023-10-16 | 2024-01-30 | 浙江大学长三角智慧绿洲创新中心 | Database visualization method based on single cells and related equipment |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||