CN112735523A - System and detection method for identifying Arabidopsis thaliana cotyledon cell type

Info

Publication number
CN112735523A
Authority
CN
China
Prior art keywords
cell
cells
cotyledon
arabidopsis thaliana
type
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202011379750.6A
Other languages
Chinese (zh)
Inventor
孙旭武
肖云平
刘祉辛
殷昊
陆瑶
巴永兵
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Henan University
Original Assignee
Henan University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2020-12-01
Filing date
2020-12-01
Publication date
2021-04-30
Application filed by Henan University
Priority to CN202011379750.6A
Publication of CN112735523A
Legal status: Pending

Classifications

    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B30/00 ICT specially adapted for sequence analysis involving nucleotides or amino acids
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6876 Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes
    • C12Q1/6881 Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for tissue or cell typing, e.g. human leukocyte antigen [HLA] probes
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B40/00 ICT specially adapted for biostatistics; ICT specially adapted for bioinformatics-related machine learning or data mining, e.g. knowledge discovery or pattern finding
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B45/00 ICT specially adapted for bioinformatics-related data visualisation, e.g. displaying of maps or networks
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q2600/00 Oligonucleotides characterized by their use
    • C12Q2600/158 Expression markers

Abstract

The invention relates to a system and a detection method for identifying Arabidopsis thaliana cotyledon cell types based on single-cell sequencing. About 10,000 cells can be identified in about 10 minutes, which greatly reduces labor cost while maintaining annotation accuracy.

Description

System and detection method for identifying Arabidopsis thaliana cotyledon cell type
Technical Field
The invention belongs to the technical field of transcriptome sequencing, and in particular relates to a system and a detection method for identifying Arabidopsis thaliana cotyledon cell types based on single-cell transcriptome sequencing data.
Background
In high-throughput single-cell transcriptome sequencing analysis, cell type identification is a crucial step: it reveals the heterogeneity of complex cell populations and makes it possible to construct a cell atlas. Two approaches to cell type identification currently exist: manual identification based on specific marker genes (marker-based), and identification based on single-cell reference data sets (reference-based). With marker-based manual identification, researchers must consult a large body of literature to collect markers, which is time-consuming and labor-intensive, and many cell types or subtypes cannot be distinguished well by a few markers. For example, in "Reference-based analysis of lung single-cell sequencing reveals a transitional profibrotic macrophage", the CD27 gene alone could not accurately separate naive B cells from memory B cells, and among T cell subtypes the marker genes often differ only in expression level, so the cell type cannot be judged from the expression of a small number of markers. In contrast, identification against a reference data set with SingleR can distinguish cell subtypes well.
For Arabidopsis thaliana cotyledon samples, no directly usable reference data set currently exists for automatically and rapidly matching and identifying cell types. Manual identification by marker genes alone is time-consuming and labor-intensive, poorly automated, and not very accurate for similar cell types. It is therefore highly desirable to construct a single-cell reference data set suitable for identifying Arabidopsis thaliana cotyledon cell types and to establish a computer program for automated cell type identification.
Disclosure of Invention
In view of the above problems, the present invention aims to overcome the disadvantages of the prior art and to provide an analysis method for rapidly and objectively identifying Arabidopsis thaliana cotyledon cell types based on single-cell transcriptome sequencing data.
The invention provides a system for identifying Arabidopsis thaliana cotyledon cell types based on single-cell sequencing, characterized in that it comprises: a cell sequencing platform, a cell type database platform, and a data analysis and processing platform.
The cell sequencing platform is a single-cell transcriptome sequencing platform; gene expression data of the cells are obtained by single-cell transcriptome sequencing (scRNA-seq).
The cell type database platform is an Arabidopsis thaliana reference data platform constructed from the marker genes of mesophyll cells (MPC), meristemoid mother cells (MMC), early meristemoids (EM), late meristemoids (LM), guard mother cells (GMC), young guard cells (YGC), guard cells (GC), and pavement cells (PC). The marker genes of each cell type are as follows:
Mesophyll cells (MPC): RBCS, LHCB
Meristemoid mother cells (MMC): HDG2, POLAR, SPCH, TMM, MUTE, EPF2
Early meristemoids (EM): MUTE, BASL, SPCH, EPF2
Late meristemoids (LM): BASL, MUTE, EPF1
Guard mother cells (GMC): EPF1, HIC, FAMA, SCRM
Young guard cells (YGC): RBCS, FAMA, EPF1
Guard cells (GC): low expression of RBCS, FAMA, SCRM, and TMM
Pavement cells (PC): IQD5, RBCS.
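For readers who want to work with the marker list programmatically, it can be written as a named list in R. This is only an illustrative sketch: the object name `marker_genes` and the helper `cell_types_with` are hypothetical and do not appear in the patent; the gene symbols are transcribed from the list above.

```r
# Marker genes per cell type, transcribed from the list above.
marker_genes <- list(
  MPC = c("RBCS", "LHCB"),
  MMC = c("HDG2", "POLAR", "SPCH", "TMM", "MUTE", "EPF2"),
  EM  = c("MUTE", "BASL", "SPCH", "EPF2"),
  LM  = c("BASL", "MUTE", "EPF1"),
  GMC = c("EPF1", "HIC", "FAMA", "SCRM"),
  YGC = c("RBCS", "FAMA", "EPF1"),
  GC  = c("RBCS", "FAMA", "SCRM", "TMM"),  # listed as low expression in GC
  PC  = c("IQD5", "RBCS")
)

# Hypothetical helper: which cell types list a given marker gene?
cell_types_with <- function(gene, markers = marker_genes) {
  names(markers)[vapply(markers, function(g) gene %in% g, logical(1))]
}

# Markers shared by several cell types cannot separate them on their own.
shared_by_fama <- cell_types_with("FAMA")
all_markers <- unique(unlist(marker_genes))
```

Listing the cell types that share a gene such as FAMA makes the overlap between neighboring stomatal lineage stages explicit, which is the reason a few markers are not sufficient for reliable annotation.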
The cell type database platform described above is established as follows:
Based on single-cell transcriptome sequencing (scRNA-seq) data, a number of marker genes are collected to identify the cell types representing different stages of stomatal development. The specific cell types and marker genes are as follows:
Mesophyll cells (MPC): RBCS, LHCB
Meristemoid mother cells (MMC): HDG2, POLAR, SPCH, TMM, MUTE, EPF2
Early meristemoids (EM): MUTE, BASL, SPCH, EPF2
Late meristemoids (LM): BASL, MUTE, EPF1
Guard mother cells (GMC): EPF1, HIC, FAMA, SCRM
Young guard cells (YGC): RBCS, FAMA, EPF1
Guard cells (GC): low expression of RBCS, FAMA, SCRM, and TMM
Pavement cells (PC): IQD5, RBCS;
and a database platform (single-cell reference data set) suitable for identifying Arabidopsis thaliana cotyledon cell types is constructed.
The steps for constructing the database platform (single-cell reference data set) suitable for identifying Arabidopsis thaliana cotyledon cell types are as follows:
plotting the expression level of the relevant markers in single cells using the FeaturePlot() and VlnPlot() functions;
plotting a gene expression clustering heat map of the relevant markers in single cells using the pheatmap() function;
and determining the cell type composition of the Arabidopsis thaliana cotyledon from the expression level plots and the gene expression clustering heat map, obtaining the single-cell expression profile corresponding to each cotyledon cell type, and constructing the cell type identification reference data set.
The data analysis and processing platform identifies cell types using the SingleR() function, draws a cell type identification correlation heat map, counts the most abundant cell type in each cluster, and outputs the results and plots.
Preferably, the data analysis and processing steps are as follows:
Using the cell type identification reference data set constructed above, the SingleR package matches each group of cells in the data to be tested to its corresponding cell type by comparing the ranking of its significantly up-regulated genes against the reference data set. This allows rapid determination of Arabidopsis thaliana cotyledon cell types in subsequent high-throughput single-cell transcriptome sequencing. The specific operation steps are as follows:
importing the data to be tested;
loading the constructed database platform (single-cell reference data set) suitable for identifying Arabidopsis thaliana cotyledon cell types;
identifying cell types using the SingleR() function;
drawing the cell type identification correlation heat map;
counting the most abundant cell type in each cluster;
and outputting the results and plots.
The invention also provides a detection method for identifying Arabidopsis thaliana cotyledon cell types based on single-cell sequencing, characterized in that:
the method is based on a system for identifying Arabidopsis thaliana cotyledon cell types based on single-cell sequencing, the system comprising: a cell sequencing platform, a cell type database platform, and a data analysis and processing platform;
the cell sequencing platform is a single-cell transcriptome sequencing platform, and gene expression data of the cells are obtained by single-cell transcriptome sequencing (scRNA-seq);
the cell type database platform is an Arabidopsis thaliana reference data platform constructed from the marker genes of mesophyll cells (MPC), meristemoid mother cells (MMC), early meristemoids (EM), late meristemoids (LM), guard mother cells (GMC), young guard cells (YGC), guard cells (GC), and pavement cells (PC), wherein the marker genes of each cell type are as follows:
Mesophyll cells (MPC): RBCS, LHCB
Meristemoid mother cells (MMC): HDG2, POLAR, SPCH, TMM, MUTE, EPF2
Early meristemoids (EM): MUTE, BASL, SPCH, EPF2
Late meristemoids (LM): BASL, MUTE, EPF1
Guard mother cells (GMC): EPF1, HIC, FAMA, SCRM
Young guard cells (YGC): RBCS, FAMA, EPF1
Guard cells (GC): low expression of RBCS, FAMA, SCRM, and TMM
Pavement cells (PC): IQD5, RBCS;
the data analysis and processing platform identifies cell types using the SingleR() function, draws a cell type identification correlation heat map, counts the most abundant cell type in each cluster, and outputs the results and plots.
The cell type database platform described above is established as follows:
Based on single-cell transcriptome sequencing (scRNA-seq) data, a number of marker genes are collected to identify the cell types representing different stages of stomatal development, and a database platform (single-cell reference data set) suitable for identifying Arabidopsis thaliana cotyledon cell types is constructed.
The steps for constructing the database platform (single-cell reference data set) suitable for identifying Arabidopsis thaliana cotyledon cell types are as follows:
plotting the expression level of the relevant markers in single cells using the FeaturePlot() and VlnPlot() functions;
plotting a gene expression clustering heat map of the relevant markers in single cells using the pheatmap() function;
and determining the cell type composition of the Arabidopsis thaliana cotyledon from the expression level plots and the gene expression clustering heat map, obtaining the single-cell expression profile corresponding to each cotyledon cell type, and constructing the cell type identification reference data set.
The technical scheme of the invention is further elaborated as follows:
Using single-cell transcriptome sequencing (scRNA-seq) data, the method collects a number of marker genes from the existing literature to identify the cell types representing different stages of stomatal development, constructs a single-cell reference data set suitable for identifying Arabidopsis thaliana cotyledon cell types, and establishes a computer program for automated identification. The method specifically comprises the following steps:
1. The expression level of the relevant markers in single cells was plotted using the FeaturePlot() and VlnPlot() functions in the Seurat package (v3.0.0).
Mesophyll cells (MPC): RBCS, LHCB
Meristemoid mother cells (MMC): HDG2, POLAR, SPCH, TMM, MUTE, EPF2
Early meristemoids (EM): MUTE, BASL, SPCH, EPF2
Late meristemoids (LM): BASL, MUTE, EPF1
Guard mother cells (GMC): EPF1, HIC, FAMA, SCRM
Young guard cells (YGC): RBCS, FAMA, EPF1
Guard cells (GC): low expression of RBCS, FAMA, SCRM, and TMM
Pavement cells (PC): IQD5, RBCS
Figure BDA0002809058220000041
Figure BDA0002809058220000051
2. A gene expression clustering heat map of the relevant markers in single cells was plotted using the pheatmap() function in the pheatmap package.
library(pheatmap)
# topn_markers2vis: expression matrix of the marker genes to visualize (its construction is not shown in this text)
pdf("heatmap.pdf")
pheatmap(topn_markers2vis, cluster_rows = TRUE, cluster_cols = TRUE, show_rownames = TRUE)
dev.off()
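The text does not show how the `topn_markers2vis` matrix passed to pheatmap() is built. One plausible construction, sketched here in base R under the assumption that it holds the average expression of each marker gene per cluster, is shown below. The object names `expr`, `clusters`, and `avg_by_cluster` and all values are made up for illustration.

```r
# Toy expression matrix: rows are marker genes, columns are cells (made-up values).
expr <- matrix(
  c(5, 0, 1,   # cell1
    7, 1, 0,   # cell2
    0, 6, 2,   # cell3
    1, 8, 3),  # cell4
  nrow = 3,
  dimnames = list(c("EPF1", "RBCS", "TMM"), paste0("cell", 1:4))
)

# Cluster assignment of each cell, in column order.
clusters <- c("6", "6", "11", "11")

# Average each gene over the cells of every cluster -> genes x clusters matrix.
avg_by_cluster <- sapply(
  split(seq_along(clusters), clusters),
  function(idx) rowMeans(expr[, idx, drop = FALSE])
)
```

A matrix of this shape (genes in rows, clusters or cells in columns) is what a clustered heat map function expects as its first argument.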
3. The cell type composition of the Arabidopsis thaliana cotyledon was determined from the expression level plots, the gene expression clustering heat map, and related evidence; the single-cell expression profile corresponding to each cotyledon cell type was obtained; and the cell type identification reference data set was constructed.
library(SingleR)
library(Seurat)
library(scater)
library(dplyr)
# Load the annotated Seurat object; its meta.data carries a "celltype" column
ref_ob <- readRDS("celltype.rds")
# Extract raw counts and the per-cell cell type labels
ref.m <- GetAssayData(ref_ob, assay = "RNA", slot = "counts")
cell_metadata <- ref_ob@meta.data %>% select("celltype")
# Build a SingleCellExperiment, log-normalize it, and save it as the reference
ref.sce <- SingleCellExperiment(assays = list(counts = ref.m), colData = cell_metadata)
ref.sce <- logNormCounts(ref.sce)
saveRDS(ref.sce, "reference.rds")
4. Using the cell type identification reference data set constructed above, the SingleR package matches each group of cells in the data to be tested to its corresponding cell type by comparing the ranking of its significantly up-regulated genes against the reference data set. This is used to rapidly determine Arabidopsis thaliana cotyledon cell types in subsequent high-throughput single-cell transcriptome sequencing.
Figure BDA0002809058220000052
Figure BDA0002809058220000061
Figure BDA0002809058220000071
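The matching idea in step 4 can be illustrated with a toy rank-correlation sketch in base R. This is not SingleR's implementation (SingleR additionally selects marker genes between labels, scores each label with a quantile of per-sample correlations, and fine-tunes the result); it only shows how comparing gene rankings can assign a label. The function name `assign_labels` and all expression values are made up for illustration.

```r
# Toy reference: rows are genes, columns are per-label average profiles (made-up values).
ref_profiles <- matrix(
  c(9, 1, 2, 8,   # GMC: EPF1, RBCS, TMM, FAMA
    3, 9, 1, 6),  # YGC: EPF1, RBCS, TMM, FAMA
  nrow = 4,
  dimnames = list(c("EPF1", "RBCS", "TMM", "FAMA"), c("GMC", "YGC"))
)

# Toy query: two cells to annotate (made-up values).
query <- matrix(
  c(7, 0, 1, 5,   # cell_a
    2, 8, 0, 4),  # cell_b
  nrow = 4,
  dimnames = list(c("EPF1", "RBCS", "TMM", "FAMA"), c("cell_a", "cell_b"))
)

# Assign each query cell the reference label with the highest Spearman correlation.
assign_labels <- function(query, ref) {
  genes <- intersect(rownames(query), rownames(ref))
  scores <- cor(query[genes, , drop = FALSE],
                ref[genes, , drop = FALSE],
                method = "spearman")  # cells x labels score matrix
  best <- max.col(scores, ties.method = "first")
  setNames(colnames(ref)[best], colnames(query))
}

predicted <- assign_labels(query, ref_profiles)
```

Because Spearman correlation depends only on the rank order of genes within a cell, the comparison is insensitive to differences in sequencing depth between the query and the reference.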
In conclusion, the beneficial effects of the invention are as follows: based on single-cell transcriptome sequencing data, annotation of Arabidopsis thaliana cotyledon cell types can be completed quickly with the reference data set and the automated identification process, with about 10,000 cells identified in about 10 minutes. The innovation does not lie in the use of SingleR itself, but in constructing, for the first time, a reference data set of Arabidopsis thaliana cotyledon cell types for use with SingleR, so that subsequent researchers can quickly identify cotyledon cell types in single-cell sequencing results.
Drawings
FIG. 1 is a violin plot showing the expression level of the marker gene; the abscissa is the cell cluster number and the ordinate is the normalized gene expression value;
FIG. 2 is a FeaturePlot showing the expression level of the marker gene;
FIG. 3 is a schematic diagram of the process for automated identification of Arabidopsis thaliana cotyledon cell types in single-cell sequencing;
FIG. 4 shows the results of cell type identification using the present automated procedure.
Detailed Description
The invention is further described in detail with reference to the following specific examples and the accompanying drawings. Except for the contents specifically mentioned below, the procedures, conditions, and experimental methods for carrying out the invention are common general knowledge in the art, and the invention is not particularly limited in these respects.
Example 1, Manual identification
First, a large body of literature was consulted to collect marker genes, and an expression clustering heat map of these genes and a plot of their expression level in single cells (FeaturePlot) were drawn, so that the cell types representing different stages of stomatal development in the Arabidopsis thaliana cotyledon could be identified manually. The specific marker genes used are as follows:
Mesophyll cells (MPC): RBCS, LHCB
Meristemoid mother cells (MMC): HDG2, POLAR, SPCH, TMM, MUTE, EPF2
Early meristemoids (EM): MUTE, BASL, SPCH, EPF2
Late meristemoids (LM): BASL, MUTE, EPF1
Guard mother cells (GMC): EPF1, HIC, FAMA, SCRM
Young guard cells (YGC): RBCS, FAMA, EPF1
Guard cells (GC): low expression of RBCS, FAMA, SCRM, and TMM
Pavement cells (PC): IQD5, RBCS
The expression level of each gene in single cells was plotted using the following code:
Figure BDA0002809058220000081
Example 2, Identification method based on a SingleR reference data set
Based on the identified Arabidopsis thaliana cotyledon cell types, a reference data set of each cell type is constructed from its expression profile and is used to rapidly determine cotyledon cell types in high-throughput single-cell transcriptome sequencing. The specific operation steps are as follows:
Step 1, importing the data to be tested;
seurat_ob <- readRDS("seurat_ob.rds")
query.m <- GetAssayData(seurat_ob, assay = "RNA", slot = "counts")
query.sce <- SingleCellExperiment(assays = list(counts = query.m))
query.sce <- logNormCounts(query.sce)
Step 2, loading the constructed Arabidopsis thaliana reference data set;
ref.sce <- readRDS("reference.rds")
Step 3, identifying cell types using the SingleR() function;
# MulticoreParam comes from the BiocParallel package, so it is called with its namespace
pred <- SingleR(query.sce, ref.sce, labels = factor(ref.sce$celltype), BPPARAM = BiocParallel::MulticoreParam(workers = 10))
saveRDS(pred, "singleR.rds")
Step 4, drawing the cell type identification correlation heat map;
Figure BDA0002809058220000091
Figure BDA0002809058220000101
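Step 5, counting the most abundant cell type in each cluster;
seurat_ob <- SetIdent(seurat_ob, value = "clusters")
# main_celltyping_stat: per-cluster cell type counts with a cell_num column (its construction is not shown in this text)
top_celltype <- main_celltyping_stat %>% group_by(clusters) %>% top_n(1, cell_num)
write.table(top_celltype, "top_celltyping_statistics.xls", quote = FALSE, sep = "\t", row.names = FALSE)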
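The table `main_celltyping_stat` used in step 5 is not defined in the text shown here. As a minimal sketch, assuming it simply counts the cells of each predicted cell type within each cluster, it could be derived as follows in base R. The column names `clusters`, `raw_celltype`, and `cell_num` follow the surrounding code; the toy data and the object names `cells`, `stat`, and `top` are made up.

```r
# Toy per-cell table: cluster id and predicted label (made-up values).
cells <- data.frame(
  clusters     = c("6", "6", "6", "11", "11", "11", "11"),
  raw_celltype = c("GMC", "GMC", "YGC", "YGC", "YGC", "GMC", "YGC"),
  stringsAsFactors = FALSE
)

# Count cells per (cluster, label) pair.
stat <- as.data.frame(table(cells$clusters, cells$raw_celltype),
                      stringsAsFactors = FALSE)
names(stat) <- c("clusters", "raw_celltype", "cell_num")

# Keep the most frequent label within each cluster.
top <- do.call(
  rbind,
  lapply(split(stat, stat$clusters), function(d) d[which.max(d$cell_num), ])
)
```

Taking the majority label per cluster smooths over individual cells that receive a neighboring label, at the cost of hiding mixed clusters; inspecting `stat` before collapsing it shows how mixed each cluster is.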
Step 6, outputting the annotation result of each cluster and plotting.
from.id <- as.vector(top_celltype$clusters)
to.id <- as.vector(top_celltype$raw_celltype)
# Rename each cluster identity to its most abundant cell type
seurat_ob <- SetIdent(seurat_ob, value = plyr::mapvalues(x = Idents(seurat_ob), from = from.id, to = to.id))
seurat_ob <- StashIdent(seurat_ob, save.name = "celltype")
# Plot the annotated cells on the t-SNE embedding and save the figure
ggtsne2 <- DimPlot(object = seurat_ob, reduction = "tsne", pt.size = 1) + theme(plot.title = element_text(hjust = 0.5))
ggsave("celltyping.pdf", plot = ggtsne2)
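The relabeling in step 6 replaces each cluster id with its assigned cell type and leaves unmatched ids unchanged. The same behavior can be sketched in base R with a lookup vector; the values and the object names `cluster_ids`, `lookup`, and `relabelled` below are made up for illustration.

```r
# Toy cluster ids per cell and a cluster-to-cell-type mapping (made-up values).
cluster_ids <- factor(c("6", "11", "6", "3"))
from_id <- c("6", "11")
to_id   <- c("GMC", "YGC")

# Named lookup vector: names are cluster ids, values are cell types.
lookup <- setNames(to_id, from_id)

# Replace ids that have a mapping; leave the others as they are.
relabelled <- as.character(cluster_ids)
has_mapping <- relabelled %in% from_id
relabelled[has_mapping] <- lookup[relabelled[has_mapping]]
```

Cluster "3" has no entry in the mapping and therefore keeps its original id, which makes unannotated clusters easy to spot in the final plot.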
Results and analysis:
The SCRM gene is one of the marker genes of guard mother cells (GMC). The violin plot and FeaturePlot of its expression level in single cells (FIG. 1 and FIG. 2) show that the gene is expressed in both cluster 6 and cluster 11, differing only in expression level. Judging similar cell populations from the expression of a small number of markers is therefore not very accurate, and manual identification requires consulting a large body of literature to find more marker genes, which is time-consuming and labor-intensive.
With the reference data set and the automated program constructed by the invention, the cell types representing different stages of stomatal development in the Arabidopsis thaliana cotyledon are obtained quickly (FIG. 4) simply by inputting the data to be identified (the workflow is shown in FIG. 3). About 10,000 cells can be identified in about 10 minutes, which greatly reduces labor cost, and two similar cell types, guard mother cells (GMC) and young guard cells (YGC), corresponding to clusters 6 and 11, are well distinguished, ensuring annotation accuracy.
The foregoing is a general description of the invention. Although specific terms are employed herein, they are used in a generic and descriptive sense only and not for purposes of limitation; changes of form and substitution of equivalents may be made. Various changes or modifications may be made by one skilled in the art without departing from the scope of the invention as defined in the appended claims.

Claims (10)

1. A system for identifying Arabidopsis thaliana cotyledon cell types based on single-cell sequencing, the system comprising: a cell sequencing platform, a cell type database platform, and a data analysis and processing platform.
2. The system of claim 1, wherein the cell sequencing platform is a single-cell transcriptome sequencing platform, and gene expression data of the cells are obtained by single-cell transcriptome sequencing (scRNA-seq).
3. The system of claim 1, wherein the cell type database platform is an Arabidopsis thaliana reference data platform constructed from the marker genes of mesophyll cells (MPC), meristemoid mother cells (MMC), early meristemoids (EM), late meristemoids (LM), guard mother cells (GMC), young guard cells (YGC), guard cells (GC), and pavement cells (PC), wherein the marker genes of each cell type are as follows:
Mesophyll cells (MPC): RBCS, LHCB
Meristemoid mother cells (MMC): HDG2, POLAR, SPCH, TMM, MUTE, EPF2
Early meristemoids (EM): MUTE, BASL, SPCH, EPF2
Late meristemoids (LM): BASL, MUTE, EPF1
Guard mother cells (GMC): EPF1, HIC, FAMA, SCRM
Young guard cells (YGC): RBCS, FAMA, EPF1
Guard cells (GC): low expression of RBCS, FAMA, SCRM, and TMM
Pavement cells (PC): IQD5, RBCS.
4. The system for identifying Arabidopsis thaliana cotyledon cell types based on single-cell sequencing as claimed in claim 3, wherein the cell type database platform is established as follows:
based on single-cell transcriptome sequencing (scRNA-seq) data, a number of marker genes are collected to identify the cell types representing different stages of stomatal development, the specific cell types and marker genes being as follows:
Mesophyll cells (MPC): RBCS, LHCB
Meristemoid mother cells (MMC): HDG2, POLAR, SPCH, TMM, MUTE, EPF2
Early meristemoids (EM): MUTE, BASL, SPCH, EPF2
Late meristemoids (LM): BASL, MUTE, EPF1
Guard mother cells (GMC): EPF1, HIC, FAMA, SCRM
Young guard cells (YGC): RBCS, FAMA, EPF1
Guard cells (GC): low expression of RBCS, FAMA, SCRM, and TMM
Pavement cells (PC): IQD5, RBCS;
and a database platform (single-cell reference data set) suitable for identifying Arabidopsis thaliana cotyledon cell types is constructed.
5. The system for identifying Arabidopsis thaliana cotyledon cell types based on single-cell sequencing as claimed in claim 4, wherein the steps for constructing the database platform (single-cell reference data set) suitable for identifying Arabidopsis thaliana cotyledon cell types are as follows:
plotting the expression level of the relevant markers in single cells using the FeaturePlot() and VlnPlot() functions;
plotting a gene expression clustering heat map of the relevant markers in single cells using the pheatmap() function;
and determining the cell type composition of the Arabidopsis thaliana cotyledon from the expression level plots and the gene expression clustering heat map, obtaining the single-cell expression profile corresponding to each cotyledon cell type, and constructing the cell type identification reference data set.
6. The system of claim 1, wherein the data analysis and processing platform identifies cell types using the SingleR() function, draws a cell type identification correlation heat map, counts the most abundant cell type in each cluster, and outputs the results and plots.
7. The system for identifying Arabidopsis thaliana cotyledon cell types based on single-cell sequencing as claimed in claim 6, wherein the data analysis and processing steps are as follows:
using the constructed cell type identification reference data set, the SingleR package matches each group of cells in the data to be tested to its corresponding cell type by comparing the ranking of its significantly up-regulated genes against the reference data set, so as to rapidly determine Arabidopsis thaliana cotyledon cell types in subsequent high-throughput single-cell transcriptome sequencing, the specific operation steps being as follows:
importing the data to be tested;
loading the constructed database platform (single-cell reference data set) suitable for identifying Arabidopsis thaliana cotyledon cell types;
identifying cell types using the SingleR() function;
drawing the cell type identification correlation heat map;
counting the most abundant cell type in each cluster;
and outputting the results and plots.
8. A detection method for identifying Arabidopsis thaliana cotyledon cell types based on single-cell sequencing, characterized in that:
the method is based on a system for identifying Arabidopsis thaliana cotyledon cell types based on single-cell sequencing, the system comprising: a cell sequencing platform, a cell type database platform, and a data analysis and processing platform;
the cell sequencing platform is a single-cell transcriptome sequencing platform, and gene expression data of the cells are obtained by single-cell transcriptome sequencing (scRNA-seq);
the cell type database platform is an Arabidopsis thaliana reference data platform constructed from the marker genes of mesophyll cells (MPC), meristemoid mother cells (MMC), early meristemoids (EM), late meristemoids (LM), guard mother cells (GMC), young guard cells (YGC), guard cells (GC), and pavement cells (PC), wherein the marker genes of each cell type are as follows:
Mesophyll cells (MPC): RBCS, LHCB
Meristemoid mother cells (MMC): HDG2, POLAR, SPCH, TMM, MUTE, EPF2
Early meristemoids (EM): MUTE, BASL, SPCH, EPF2
Late meristemoids (LM): BASL, MUTE, EPF1
Guard mother cells (GMC): EPF1, HIC, FAMA, SCRM
Young guard cells (YGC): RBCS, FAMA, EPF1
Guard cells (GC): low expression of RBCS, FAMA, SCRM, and TMM
Pavement cells (PC): IQD5, RBCS;
the data analysis and processing platform identifies cell types using the SingleR() function, draws a cell type identification correlation heat map, counts the most abundant cell type in each cluster, and outputs the results and plots.
9. The detection method for identifying Arabidopsis thaliana cotyledon cell types based on single-cell sequencing according to claim 8, wherein the cell type database platform is established as follows:
based on single-cell transcriptome sequencing (scRNA-seq) data, a number of marker genes are collected to identify the cell types representing different stages of stomatal development,
and a database platform (single-cell reference data set) suitable for identifying Arabidopsis thaliana cotyledon cell types is constructed.
10. The detection method for identifying Arabidopsis thaliana cotyledon cell types based on single-cell sequencing according to claim 8, wherein the steps for constructing the database platform (single-cell reference data set) suitable for identifying Arabidopsis thaliana cotyledon cell types are as follows:
plotting the expression level of the relevant markers in single cells using the FeaturePlot() and VlnPlot() functions;
plotting a gene expression clustering heat map of the relevant markers in single cells using the pheatmap() function;
and determining the cell type composition of the Arabidopsis thaliana cotyledon from the expression level plots and the gene expression clustering heat map, obtaining the single-cell expression profile corresponding to each cotyledon cell type, and constructing the cell type identification reference data set.
CN202011379750.6A 2020-12-01 2020-12-01 System and detection method for identifying arabidopsis thaliana cotyledon cell type Pending CN112735523A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011379750.6A CN112735523A (en) 2020-12-01 2020-12-01 System and detection method for identifying arabidopsis thaliana cotyledon cell type

Publications (1)

Publication Number Publication Date
CN112735523A (en) 2021-04-30

Family

ID=75597119

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011379750.6A Pending CN112735523A (en) 2020-12-01 2020-12-01 System and detection method for identifying arabidopsis thaliana cotyledon cell type

Country Status (1)

Country Link
CN (1) CN112735523A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114295444A (en) * 2021-12-30 2022-04-08 河南大学 Frozen section method for peach fruit tissue space transcriptomics analysis

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20040014215A1 (en) * 2000-07-27 2004-01-22 Margit Menges Synchronised arabidopsis cell suspensions and uses thereof
US20140297194A1 (en) * 2013-04-02 2014-10-02 Yih-Sheng Yang Gene signatures for detection of potential human diseases
CN110060729A (en) * 2019-03-28 2019-07-26 广州序科码生物技术有限责任公司 A method of cell identity is annotated based on unicellular transcript profile cluster result
CN111243675A (en) * 2020-01-07 2020-06-05 广州基迪奥生物科技有限公司 Interactive cell heterogeneity analysis visualization platform and implementation method thereof
CN111951892A (en) * 2020-08-04 2020-11-17 荣联科技集团股份有限公司 Method for analyzing cell trajectory based on single cell sequencing data and electronic equipment

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
ZHIXIN LIU ET AL: "Global Dynamic Molecular Profiling of Stomatal Lineage Cell Development by Single-Cell RNA Sequencing", 《MOLECULAR PLANT》 *
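Zheng Guangmin et al.: "Intelligent analysis and databases for single-cell sequencing data", 《发育医学电子杂志》 (Journal of Developmental Medicine, Electronic Edition) *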

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20210430
