CN108090325B - Method for analyzing single cell sequencing data by applying beta-stability - Google Patents
Method for analyzing single cell sequencing data by applying beta-stability
- Publication number
- CN108090325B (application CN201611126940.0A)
- Authority
- CN
- China
- Prior art keywords
- cell
- cells
- gene expression
- zygotic
- stability
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B—BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B25/00—ICT specially adapted for hybridisation; ICT specially adapted for gene or protein expression
Abstract
The invention relates to a bioinformatics method for analyzing sequencing data, in particular to a method for analyzing single-cell sequencing data by applying beta-stability. Its aim is to analyze the stability and variability of gene expression levels among single cells on temporal and spatial scales by applying the beta-stability method to single-cell sequencing data. Taking a dynamic model of single-cell gene expression levels during embryonic development as an example, the gene expression levels of the single cells at each stage are analyzed with the beta-stability method, yielding a more accurate dynamic model of gene expression levels in early embryonic development.
Description
Technical Field
The invention relates to a bioinformatics technique for analyzing sequencing data, in particular to a method that applies beta-stability to analyze the stability and variability of single-cell gene expression levels on temporal and spatial scales, taking a dynamic model of single-cell gene expression levels in early embryonic development as an example.
Background
In 2001, the Human Genome Project completed a preliminary draft of the human genome, and the functions of genes began to be revealed. In the decade and more since, gene sequencing technology has developed rapidly and has been widely used in many fields. In the life sciences in particular, sequencing has become a powerful tool for exploring the fundamental mysteries of life, and many applications of sequencing technology have been derived from it. Single-cell sequencing is one such application that has emerged in recent years. Compared with conventional sequencing, its great advantage is that it sequences the genetic material of a single cell only, so that the heterogeneity of the genetic material of each cell can be detected to the greatest extent. This advantage gives single-cell sequencing great application value in the life sciences and medicine. For example, researchers have used single-cell sequencing to study the differences between tumor cells and normal cells, to detect variation in the genetic material of tumor cells, and to examine how the gene expression levels of tumor cells change across different periods. Single-cell sequencing of some pathogenic bacteria has also been used to develop specific vaccines. In addition, researchers perform single-cell sequencing on germ cells and embryonic cells to study the dynamic change of gene expression levels in the early development of organisms.
When single-cell sequencing is used to study changes in gene expression levels during the early development of organisms, the gene expression levels of single cells are measured at different stages: sperm cells, egg cells, zygotic cells, the two-cell stage, the four-cell stage, the eight-cell stage, and so on (see FIG. 1). To measure the gene expression level of each cell in early embryonic development, the sperm cell and the egg cell are sequenced first; then the zygotic cell produced by these two kinds of cells is sequenced, followed by the two-cell-stage, four-cell-stage, and eight-cell-stage cells that develop from the zygotic cell, so that the dynamic change of gene expression levels can be studied. However, under current experimental conditions the complete development of a single zygotic cell cannot be tracked. For example, once the genetic material of a zygotic cell has been extracted, the cell is destroyed and cannot develop further into a two-cell-stage embryo, so the sequenced two-cell-stage samples of genetic material have developed from other zygotic cells. As another example, in FIG. 1, once the genetic material of two-cell-stage cell A has been extracted, that cell is destroyed, so the four-cell-stage cells sampled afterwards (such as cell A and cell B) are derived from other two-cell-stage cells. It is therefore impossible to follow and study all of the two-cell-stage, four-cell-stage, and eight-cell-stage cells produced by the same zygotic cell.
Based on single-cell sequencing data, and taking single-cell gene expression data from early embryonic development as an example, the invention analyzes the gene expression levels of the single cells at each stage using the beta-stability method, thereby obtaining a more accurate dynamic model of gene expression levels.
Disclosure of Invention
The invention aims to:
provide a method for analyzing the stability and variability of gene expression levels among individual cells on temporal and spatial scales using beta-stability. Taking single-cell sequencing data from early embryonic development as an example, the gene expression level of the single cells at each stage of embryonic development is analyzed with the beta-stability method to obtain a more accurate dynamic model of gene expression levels.
To achieve this aim, the invention adopts the following technical scheme:
Starting from the sperm cell and the egg cell and continuing through the eight-cell stage, every individual cell in the embryonic development process is collected, and the transcriptome of each cell is sequenced by single-cell sequencing to determine the expression level of the genes in each cell. As shown in FIG. 1, single-cell sequencing yields gene expression data for 17 cells in total: the sperm cell, the egg cell, the zygotic cell, two-cell-stage cells A and B, four-cell-stage cells A to D, and eight-cell-stage cells 1 to 8. Owing to the limitations of experimental techniques, the zygotic cell sample does not develop from the collected sperm cell and egg cell samples; it develops from other sperm cells and egg cells of the same individual. To address this limitation, the inventors enumerate all "developmental pathways" starting from the zygotic cell by permutation and combination, and use them to construct a dynamic model of gene expression levels in early embryonic development.
The invention has the following effects:
A method is provided for analyzing the stability and variability of gene expression levels among single cells on temporal and spatial scales. When the method is used to analyze the dynamic change of gene expression levels in early embryonic development, a more accurate dynamic model of gene expression levels can be obtained within the limits of current experimental techniques.
Drawings
FIG. 1 is a diagram showing the distribution of cells at each stage of early embryonic development.
Detailed Description
As described above, owing to the limitations of the experimental technique, it is not possible in this experiment to determine whether four-cell-stage cell A is derived from two-cell-stage cell A or cell B (four-cell-stage cells B, C, and D face the same problem), nor from which of the four-cell-stage cells A to D cell 1 is derived (cells 2 to 8 face the same problem). Given these limitations, the inventors enumerated every combination. Each pathway runs from the zygotic cell through one two-cell-stage cell (A or B) and one four-cell-stage cell (A to D) to one eight-cell-stage cell (1 to 8), giving 2 × 4 × 8 = 64 possible developmental pathways:
(1) Zygotic cell → cell A → cell A → cell 1
(2) Zygotic cell → cell A → cell A → cell 2
(3) Zygotic cell → cell A → cell A → cell 3
(4) Zygotic cell → cell A → cell A → cell 4
(5) Zygotic cell → cell A → cell A → cell 5
(6) Zygotic cell → cell A → cell A → cell 6
(7) Zygotic cell → cell A → cell A → cell 7
(8) Zygotic cell → cell A → cell A → cell 8
(9) Zygotic cell → cell A → cell B → cell 1
(10) Zygotic cell → cell A → cell B → cell 2
(11) Zygotic cell → cell A → cell B → cell 3
(12) Zygotic cell → cell A → cell B → cell 4
(13) Zygotic cell → cell A → cell B → cell 5
(14) Zygotic cell → cell A → cell B → cell 6
(15) Zygotic cell → cell A → cell B → cell 7
(16) Zygotic cell → cell A → cell B → cell 8
(17) Zygotic cell → cell A → cell C → cell 1
(18) Zygotic cell → cell A → cell C → cell 2
(19) Zygotic cell → cell A → cell C → cell 3
(20) Zygotic cell → cell A → cell C → cell 4
(21) Zygotic cell → cell A → cell C → cell 5
(22) Zygotic cell → cell A → cell C → cell 6
(23) Zygotic cell → cell A → cell C → cell 7
(24) Zygotic cell → cell A → cell C → cell 8
(25) Zygotic cell → cell A → cell D → cell 1
(26) Zygotic cell → cell A → cell D → cell 2
(27) Zygotic cell → cell A → cell D → cell 3
(28) Zygotic cell → cell A → cell D → cell 4
(29) Zygotic cell → cell A → cell D → cell 5
(30) Zygotic cell → cell A → cell D → cell 6
(31) Zygotic cell → cell A → cell D → cell 7
(32) Zygotic cell → cell A → cell D → cell 8
(33) Zygotic cell → cell B → cell A → cell 1
(34) Zygotic cell → cell B → cell A → cell 2
(35) Zygotic cell → cell B → cell A → cell 3
(36) Zygotic cell → cell B → cell A → cell 4
(37) Zygotic cell → cell B → cell A → cell 5
(38) Zygotic cell → cell B → cell A → cell 6
(39) Zygotic cell → cell B → cell A → cell 7
(40) Zygotic cell → cell B → cell A → cell 8
(41) Zygotic cell → cell B → cell B → cell 1
(42) Zygotic cell → cell B → cell B → cell 2
(43) Zygotic cell → cell B → cell B → cell 3
(44) Zygotic cell → cell B → cell B → cell 4
(45) Zygotic cell → cell B → cell B → cell 5
(46) Zygotic cell → cell B → cell B → cell 6
(47) Zygotic cell → cell B → cell B → cell 7
(48) Zygotic cell → cell B → cell B → cell 8
(49) Zygotic cell → cell B → cell C → cell 1
(50) Zygotic cell → cell B → cell C → cell 2
(51) Zygotic cell → cell B → cell C → cell 3
(52) Zygotic cell → cell B → cell C → cell 4
(53) Zygotic cell → cell B → cell C → cell 5
(54) Zygotic cell → cell B → cell C → cell 6
(55) Zygotic cell → cell B → cell C → cell 7
(56) Zygotic cell → cell B → cell C → cell 8
(57) Zygotic cell → cell B → cell D → cell 1
(58) Zygotic cell → cell B → cell D → cell 2
(59) Zygotic cell → cell B → cell D → cell 3
(60) Zygotic cell → cell B → cell D → cell 4
(61) Zygotic cell → cell B → cell D → cell 5
(62) Zygotic cell → cell B → cell D → cell 6
(63) Zygotic cell → cell B → cell D → cell 7
(64) Zygotic cell → cell B → cell D → cell 8
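The enumeration above can be reproduced mechanically. The following is a minimal sketch, assuming the two-cell-stage cells are labeled A and B, the four-cell-stage cells A to D, and the eight-cell-stage cells 1 to 8; the constant and function names are illustrative and do not come from the patent.

```python
from itertools import product

# Stage labels as used in the pathway list (illustrative constants).
TWO_CELL = ["A", "B"]
FOUR_CELL = ["A", "B", "C", "D"]
EIGHT_CELL = [1, 2, 3, 4, 5, 6, 7, 8]


def enumerate_paths():
    """Return every (two-cell, four-cell, eight-cell) combination.

    product() varies the last argument fastest, which matches the
    numbering of the pathway list: pathways (1)-(8) share the first
    two cells, pathway (9) moves on to the next four-cell-stage cell.
    """
    return list(product(TWO_CELL, FOUR_CELL, EIGHT_CELL))


paths = enumerate_paths()
for number, (two, four, eight) in enumerate(paths, start=1):
    print(f"({number}) Zygotic cell → cell {two} → cell {four} → cell {eight}")
```

Filtering `paths` on the last element gives the eight pathways that end in a particular eight-cell-stage cell.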
A dynamic model of the cell's gene expression level is calculated for each of the 64 pathways above, and the amount of change in the dynamic model is calculated for each pathway. The inventors adopt the beta-stability indices proposed by Wang and Loreau to measure the dynamic change of single-cell gene expression levels during embryonic development. The indices comprise alpha-variability, beta-variability, and gamma-variability. The calculation proceeds as follows:
(1) Calculate the coefficient of variation (CV) of gene expression in each cell:
CV = σ / μ
where μ is the mean expression level of all genes in a single cell and σ² is the variance of the expression levels of all genes in that cell.
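As an illustration of step (1), the coefficient of variation of one cell is the standard deviation of its gene expression values divided by their mean. The sketch below assumes the population (not sample) variance is intended, which the text does not specify; the function name and the expression values are hypothetical.

```python
from statistics import mean, pstdev


def coefficient_of_variation(expression):
    """Return sigma / mu for the gene expression values of one cell.

    Assumption: sigma is the population standard deviation (pstdev).
    """
    mu = mean(expression)
    if mu == 0:
        raise ValueError("mean expression is zero; the CV is undefined")
    return pstdev(expression) / mu


# Hypothetical expression levels for the genes of a single cell.
cell_expression = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
cv = coefficient_of_variation(cell_expression)
print(f"CV = {cv:.3f}")
```

For these values the mean is 5 and the population standard deviation is 2, so the CV is 0.4.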
(2) Calculate the synchrony of gene expression changes among the cells at the different stages of each developmental pathway:
where S is the number of genes expressed in a single cell and ρ_S is the correlation of gene expression changes between different cells. One such coefficient is calculated for each pathway, so the same cell can have a different alpha-variability of gene expression in different pathways.
(3) Calculate the spatial synchrony of gene expression levels for each developmental pathway:
where m is the number of cells in each pathway and ρ_P is the correlation between the gene expression levels of the cells in each developmental pathway.
(4) Calculate the alpha-variability, beta-variability, and gamma-variability of gene expression changes for the cells in each developmental pathway.
The alpha-variability of gene expression can be calculated for each cell. For each developmental pathway, the beta-variability and gamma-variability are calculated from the alpha-variability of the cells at its four stages: the zygotic cell, the two-cell-stage cell, the four-cell-stage cell, and the eight-cell-stage cell. These values describe the dynamics of gene expression along each developmental pathway. The mean value v1 of the beta-stabilities of the eight developmental pathways that produce cell 1 (pathways 1, 9, 17, 25, 33, 41, 49, and 57) is calculated and used as the parameter of the dynamic model for cell 1. Similarly, the dynamic changes v2 to v8 of cells 2 to 8 are calculated and, together with v1, are used to construct a dynamic model of single-cell gene expression levels during embryonic development from the zygotic cell to the eight-cell stage.
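The formulas for steps (2) to (4) are not reproduced in this text, so the following is only a sketch under stated assumptions. It uses the variance-partition definitions from the Wang and Loreau framework, treating the cells of a pathway as the "patches" and taking variances and covariances across genes: alpha = (Σ σ_i / Σ μ_i)², gamma = Σ cov_ij / (Σ μ_i)², and beta = alpha / gamma (the reciprocal of a synchrony index). Whether the patent uses exactly these forms is an assumption; all function names and the random expression profiles are hypothetical placeholders.

```python
import random
from itertools import product
from math import sqrt
from statistics import mean


def covariance(x, y):
    """Population covariance of two equal-length expression profiles."""
    mx, my = mean(x), mean(y)
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / len(x)


def path_variability(profiles):
    """Return (alpha, beta, gamma) for the cells along one pathway.

    Assumed definitions (not taken from the patent text):
      alpha = (sum of per-cell standard deviations / sum of means)^2
      gamma = (sum of all pairwise covariances) / (sum of means)^2
      beta  = alpha / gamma
    """
    total_mean = sum(mean(p) for p in profiles)
    sum_sd = sum(sqrt(covariance(p, p)) for p in profiles)
    total_cov = sum(covariance(p, q) for p in profiles for q in profiles)
    if total_mean == 0 or total_cov <= 0:
        raise ValueError("variability is undefined for this pathway")
    alpha = (sum_sd / total_mean) ** 2
    gamma = total_cov / total_mean ** 2
    return alpha, alpha / gamma, gamma


def mean_beta_by_terminal_cell(zygote, two_cell, four_cell, eight_cell):
    """Average beta over the pathways ending in each eight-cell-stage cell."""
    betas = {k: [] for k in eight_cell}
    for two, four, eight in product(two_cell, four_cell, eight_cell):
        profiles = [zygote, two_cell[two], four_cell[four], eight_cell[eight]]
        betas[eight].append(path_variability(profiles)[1])
    return {k: mean(values) for k, values in betas.items()}


# Hypothetical placeholder data: 20 random expression values per cell.
rng = random.Random(0)


def random_profile():
    return [rng.uniform(1.0, 10.0) for _ in range(20)]


zygote = random_profile()
two_cell = {name: random_profile() for name in "AB"}
four_cell = {name: random_profile() for name in "ABCD"}
eight_cell = {k: random_profile() for k in range(1, 9)}

v = mean_beta_by_terminal_cell(zygote, two_cell, four_cell, eight_cell)
for k in sorted(v):
    print(f"v{k} = {v[k]:.3f}")
```

Under these assumed definitions beta is never below 1, and it equals 1 only when the expression profiles along a pathway vary in perfect synchrony.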
Claims (6)
1. A method for analyzing the stability and variability of gene expression levels in single-cell sequencing data using a beta-stability method, characterized in that: all single cells in the embryonic development process, from the sperm cell and the egg cell through the eight-cell stage, are collected; the transcriptome of each cell is sequenced by single-cell sequencing to determine the expression level of the genes in each cell; the single-cell sequencing yields gene expression data for the sperm cell, the egg cell, the zygotic cell, two-cell-stage cells A and B, four-cell-stage cells A to D, and eight-cell-stage cells 1 to 8; the zygotic cell sample is produced by the development of other sperm cells and egg cells of the same individual; all embryonic developmental pathways starting from the zygotic cell are enumerated by permutation and combination; and an accurate dynamic model of gene expression levels is obtained by integrating the dynamic models calculated for all pathways.
2. The method of claim 1, wherein: the beta-stability method includes alpha-variability, beta-variability, and gamma-variability.
3. The method of claim 1, wherein: the stability and variability of gene expression levels among single cells on temporal and spatial scales are analyzed for single-cell sequencing data.
4. The method of claim 1, wherein: the dynamic change of gene expression levels is analyzed using the beta-stability method for the single-cell sequencing data of each stage of early embryonic development.
5. The method of claim 4, wherein: all embryonic developmental pathways are enumerated, and the beta-stability method is applied to each developmental pathway to analyze the dynamic change of gene expression levels.
6. The method of any of claims 1-5, wherein: the algorithms and functions are implemented as a product, in any form of software, firmware, or hardware, that provides services.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201611126940.0A CN108090325B (en) | 2016-11-23 | 2016-11-23 | Method for analyzing single cell sequencing data by applying beta-stability |
Publications (2)
Publication Number | Publication Date |
---|---|
CN108090325A (en) | 2018-05-29 |
CN108090325B (en) | 2022-01-25 |
Family
ID=62170487
Family Applications (1)
Application Number | Status | Priority Date | Filing Date | Title |
---|---|---|---|---|
CN201611126940.0A (CN108090325B) | Active | 2016-11-23 | 2016-11-23 | Method for analyzing single cell sequencing data by applying beta-stability |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN108090325B (en) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109033743B (en) * | 2018-07-25 | 2021-01-01 | 上海交通大学 | Method for reducing technical noise in single-cell transcriptome data |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN105392894A (en) * | 2012-01-20 | 2016-03-09 | 深圳华大基因医学有限公司 | Method and system for determining whether copy number variation exists in sample genome, and computer readable medium |
CN105603062A (en) * | 2006-05-03 | 2016-05-25 | 人口诊断股份有限公司 | Method of evaluating genetic disorders |
CN105989249A (en) * | 2014-09-26 | 2016-10-05 | 叶承羲 | Method, system and device for assembling genomic sequence |
Family Cites Families (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
KR20070058440A (en) * | 2004-07-02 | 2007-06-08 | 헨리 엘 니만 | Copy choice recombination and uses thereof |
RS64230B1 (en) * | 2011-05-24 | 2023-06-30 | BioNTech SE | Individualized vaccines for cancer |
Non-Patent Citations (2)
Title |
---|
Biodiversity and ecosystem stability across scales in metacommunities; Shaopeng Wang et al.; Ecology Letters; 2016-05-31; pp. 1-4 * |
Detection of high variability in gene expression from single-cell RNA-seq profiling; Hung-I Harry Chen et al.; BMC Genomics; 2016-08-22; pp. 1-3 * |
Also Published As
Publication number | Publication date |
---|---|
CN108090325A (en) | 2018-05-29 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||