CN115394360B - Exhaustive analysis method for sequential biological big data - Google Patents

Exhaustive analysis method for sequential biological big data

Info

Publication number
CN115394360B
Authority
CN
China
Prior art keywords
analysis
data
trend
time
total number
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202210710202.XA
Other languages
Chinese (zh)
Other versions
CN115394360A (en)
Inventor
张际峰
杨士伟
刘海涛
汪承润
李茂业
蒋磊
张国超
刘芯茹
孟静
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Huainan Normal University
Original Assignee
Huainan Normal University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huainan Normal University filed Critical Huainan Normal University
Priority to CN202210710202.XA priority Critical patent/CN115394360B/en
Publication of CN115394360A publication Critical patent/CN115394360A/en
Application granted granted Critical
Publication of CN115394360B publication Critical patent/CN115394360B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G16INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16BBIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B25/00ICT specially adapted for hybridisation; ICT specially adapted for gene or protein expression
    • G16B25/20Polymerase chain reaction [PCR]; Primer or probe design; Probe optimisation
    • GPHYSICS
    • G16INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16BBIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B50/00ICT programming tools or database systems specially adapted for bioinformatics
    • G16B50/30Data warehousing; Computing architectures

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Theoretical Computer Science (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Spectroscopy & Molecular Physics (AREA)
  • Biophysics (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Medical Informatics (AREA)
  • Biotechnology (AREA)
  • General Health & Medical Sciences (AREA)
  • Genetics & Genomics (AREA)
  • Chemical & Material Sciences (AREA)
  • Molecular Biology (AREA)
  • Chemical Kinetics & Catalysis (AREA)
  • Bioethics (AREA)
  • Databases & Information Systems (AREA)
  • Investigating Or Analysing Biological Materials (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses an exhaustive analysis method for time-series biological omics big data, belonging to the fields of bioinformatics and big data. It sets out concrete analysis steps and an implementation scheme for time-series omics big data, and provides a reference for trend analysis, segmentation analysis and interaction analysis of such data in biological research.

Description

Exhaustive analysis method for sequential biological big data
Field of the invention
The invention belongs to bioinformatics and big data research within the life sciences, and in particular relates to an exhaustive analysis method for multiple kinds of biological data measured as multi-point time series.
Background art
With the explosive release of high-throughput omics data, omics data recorded in chronological order, namely time-series omics big data, are attracting more and more life science researchers. Because time-series data are continuous in time, all samples share the same background conditions apart from time, so only a single variable, time, needs to be considered in comparisons. Such data are commonly used to study biological processes that span a period of time, such as the continuous growth of plants, the continuous invasion of a host by a virus, or the continuous division of cells. Multi-point dynamic analysis of the process reveals how the measured object changes over time and what regularities it follows.
At present, the analysis of time-series omics big data focuses mostly on transcriptome data. Common methods include the Short Time-series Expression Miner (STEM), the K-means clustering algorithm, and the Mfuzz algorithm.
However, to our knowledge, these methods are rarely applied to data other than transcriptome data, and all of them rely on internal parameter settings to determine the total number of possible types, so none of them exhaustively enumerates the patterns in the feature data under study. As a result, some potentially significant data features are lost.
Based on this, the invention provides an analysis method for time-series feature data, which include quantifiable transcriptome, epigenome or proteome data; the study object can be a chromosome segment, a gene, a protein, a non-coding RNA, and so on. The analysis comprises three types: trend analysis, segmentation analysis and interaction analysis. The method exhausts the possible analysis types and uses statistical analysis to find feature data that change significantly under each type. It is a systematic analysis of various time-series data in a big data setting that separates true signals from false ones and uncovers the dynamic rules hidden in time-series data.
Summary of the invention
1. Problems to be solved by the invention
The invention aims to solve the following problems. First, at the overall level, it proposes an exhaustive analysis method for time-series biological big data, providing a comprehensive and systematic scheme for analysing time-series transcriptome, proteome and epigenome feature data. Second, the scheme traverses the possible behaviours of a study object between different time nodes from three perspectives: global time (trend analysis), local time (segmentation analysis), and comparison between pairs of study objects (interaction analysis). It explores how the study object changes over time and uncovers latent features and biological insights in time-series data.
The method helps address three shortcomings: few analysis methods currently exist for time-series omics big data, existing methods depend on subjective parameter choices, and few of them exhaustively cover the data features across all time nodes.
2. Technical solution
The invention provides an exhaustive analysis method for time-series biological big data, implemented as follows:
(1) Data preprocessing
Time-series omics data may come from a public database, such as transcriptome or methylation data in the GEO database, or directly from the identification results supplied by a biotechnology company, such as protein mass spectrometry data. The method generally requires omics data with no fewer than 3 time nodes, and the sample information for the different time nodes should be consistent or identical.
After the time-series omics data are obtained, they are preprocessed as follows:
(1) delete probes whose feature data have more than 80% missing values;
(2) merge identical probes, using the mean or median of their feature data;
(3) standardize the probe data of the different time points so that the feature data of each sample have similar numerical distributions;
(4) apply targeted processing according to the nature of the study object; for example, epigenomic methylation chip data must be converted to beta values before standardization;
(5) for every time point except the first, divide its feature value by the feature value of the immediately preceding time point, then transform the ratio by taking the natural logarithm or the logarithm to base 10 or 2; the result is defined as α;
(6) the fluctuation trend of a single study object is determined by comparing α with a threshold, which may be a single value such as α = 0 or a pair of opposite numbers such as α = ±0.2;
(7) when the threshold is α = 0, then between the two nodes under study α > 0 means the data feature rises, α < 0 means it falls, and α = 0 means it is unchanged. When the threshold is a pair of opposite numbers, for example ±0.5, α > 0.5 means the data feature is relatively up-regulated, α < -0.5 means it is relatively down-regulated, and α ∈ [-0.5, 0.5] means it is relatively unchanged.
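Preprocessing steps (5) to (7) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `log_ratios` and `classify`, the labels "up", "flat" and "down", and the sample values are hypothetical and do not come from the patent.

```python
import math

def log_ratios(values, base=math.e):
    """Return alpha for each adjacent pair of time points: log(v[i+1] / v[i])."""
    return [math.log(values[i + 1] / values[i], base) for i in range(len(values) - 1)]

def classify(alpha, threshold=0.0):
    """Map one alpha value to 'up', 'down' or 'flat' using a symmetric threshold."""
    if alpha > threshold:
        return "up"
    if alpha < -threshold:
        return "down"
    return "flat"

# Hypothetical feature values at 0h, 12h, 24h and 48h (all must be positive).
values = [10.0, 40.0, 40.0, 5.0]
alphas = log_ratios(values, base=2)
trend = [classify(a, threshold=0.5) for a in alphas]
print(alphas)
print(trend)
```

With `threshold=0.0` the same function reproduces the single-value case, where only an α of exactly 0 counts as unchanged.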
(2) Applying the exhaustive analysis method
The method examines the data from 3 aspects, implemented as follows:
(1) Perform trend analysis of the feature data, measuring "all possible trends" of the feature data of every single study object from a global or local time perspective:
First, sort out all possible trend cases. Compared with a given threshold, the log-ratio between two adjacent nodes can fall into 3 cases: greater than the threshold, less than it, or equal to it. The trend between the two nodes is accordingly rising, falling or unchanged, and this set of three possibilities exists between every pair of adjacent time nodes.
Second, count all possible trend cases. Since each pair of adjacent nodes has three possibilities, the total number of possible patterns over all time nodes is 3 raised to the power of (the total number of nodes minus 1). The case in which the value never changes must then be subtracted from this total, since it may be unrelated to the time course.
Third, select the "characteristically significant" trends. Depending on the needs of the analysis, screening can be based on the number of cases following a trend or on the specificity of the trend. By number, one can select the trends that account for the most cases, or those that account for 5% of cases; by specificity, one can select consistently rising and consistently falling trends, or trends that alternate between rising and falling, for the analysis of biological processes.
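The enumeration described in these three steps can be illustrated with a short Python sketch. The names `all_trends` and `is_monotonic` and the state labels are hypothetical choices made for this example; the counting rule itself (three states per interval, minus the all-unchanged case) is the one stated in the text.

```python
from itertools import product

STATES = ("up", "flat", "down")

def all_trends(n_nodes):
    """All trend patterns over n_nodes time nodes, excluding the all-flat pattern."""
    patterns = product(STATES, repeat=n_nodes - 1)
    return [p for p in patterns if any(s != "flat" for s in p)]

def is_monotonic(pattern):
    """True if every interval moves in the same direction (all up or all down)."""
    return set(pattern) in ({"up"}, {"down"})

trends = all_trends(4)
monotonic = [p for p in trends if is_monotonic(p)]
print(len(trends))   # 3**3 - 1 patterns for 4 time nodes
print(monotonic)
```

Screening by specificity then amounts to filtering this list with a predicate such as `is_monotonic`; screening by number would require counting how many study objects fall into each pattern.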
(2) Perform segmentation analysis of the feature data, measuring "all possible trends or interactions" of the feature data of every single study object from a local time perspective.
First, sort out all possible time segmentations. The method considers only a single cut point and does not consider two or more, because with more cut points the possible behaviours of the feature data become much more complicated. If a second cut is genuinely needed, the start of the segment of interest can be treated as the start of the time-series data. This helps reduce the complexity of the analysis.
Second, count the total number of possible segments. The time series is cut at an intermediate node, so the total depends on the number of intermediate nodes. The total number of segments is twice the number of cut points, and the number of cut points is the total number of time nodes minus 2.
Third, based on the segmentation, examine the time period of interest, or analyse further according to the monotonicity and characteristics of the trend, and compare the differences in number or in "sub-trends" between time periods.
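As a rough illustration of the single-cut segmentation, the following Python sketch lists the two segments produced by each interior cut point. It assumes that every cut splits the whole series from the first node to the last; the function names and the 4-node time points are hypothetical.

```python
def single_cut_segments(time_points):
    """List (cut, first_segment, second_segment) for every interior cut point."""
    first, last = time_points[0], time_points[-1]
    return [(cut, (first, cut), (cut, last)) for cut in time_points[1:-1]]

def segment_count(n_nodes):
    """Total number of segments: two per interior cut point."""
    return 2 * (n_nodes - 2)

# Four time nodes (in hours) have two interior cut points, hence four segments.
segments = single_cut_segments([0, 12, 24, 48])
for cut, before, after in segments:
    print(cut, before, after)
print(segment_count(4), segment_count(5))
```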
(3) Perform interaction analysis of the feature data, measuring the "interaction conditions" of the feature data between pairs or groups of study objects from a global or local time perspective:
First, examine the possible types of interaction, which fall into two major categories, "antagonism" and "synergy". The total number of antagonistic relationship groups in the interaction analysis is half of the total number of trends in the trend analysis.
Second, the analysis can be carried out between two single study objects or between two groups of study objects. "Synergistic" interactions can be screened further among objects that share the same trend in the trend analysis results, whereas "antagonistic" interactions must be screened among objects with "symmetric" trends.
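The claim that antagonistic groups number half of the trends can be checked with a small Python sketch. It assumes that a "symmetric" trend means the mirror image obtained by swapping rises and falls; the names `MIRROR`, `mirror` and `antagonistic_pairs` are hypothetical.

```python
from itertools import product

MIRROR = {"up": "down", "down": "up", "flat": "flat"}

def mirror(pattern):
    """Swap rises and falls to obtain the symmetric trend pattern."""
    return tuple(MIRROR[s] for s in pattern)

def antagonistic_pairs(n_nodes):
    """Unordered pairs {pattern, mirror(pattern)}, excluding the all-flat pattern."""
    pairs = set()
    for p in product(MIRROR, repeat=n_nodes - 1):
        if any(s != "flat" for s in p):
            pairs.add(frozenset((p, mirror(p))))
    return pairs

pairs = antagonistic_pairs(4)
print(len(pairs))  # half of the 26 trends for 4 time nodes
```

Every non-flat pattern differs from its mirror image, so the patterns pair off exactly and the count is always half the number of trends.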
3. Advantageous effects
The exhaustive analysis method for time-series biological big data has the following specific beneficial effects:
(1) The method includes a trend analysis that comprehensively sorts out the global or local trends of every single study object in the feature data, providing an effective scheme for screening trend features of particular interest and for finding the feature data that correspond to biological processes.
(2) The method can treat the time series in segments, giving a segmentation analysis that performs local analysis within a time period of interest or compares the periods before and after a cut point; the local analysis can itself include trend analysis and interaction analysis. This provides a way to find suitable time-series data features locally.
(3) The method includes an interaction analysis that screens for interactions that may exist between pairs of study objects, or between two groups of study objects, in time-series data. This helps screen, on the basis of data, for possible physical interactions between genes, chromosome segments or proteins, and provides an analysis scheme for research on biological macromolecule interactions.
Drawings
FIG. 1 shows time-series diagrams of the feature data of the invention.
With 4 time nodes (FIG. 1A-N) or 5 time nodes (FIG. 1O), the natural logarithm of the ratio of each time node to the previous one is compared with a user-defined α threshold to obtain the diagram of possible cases and the diagram of possible cut points.
Reference labels in the drawings: A: the values at the four time nodes 0 h (hour), 12 h, 24 h and 48 h remain unchanged throughout; B-N: with four time nodes there are 26 possible profiles (numbered in the form P2a), comprising 13 symmetric profile pairs (one pair per panel, as in FIG. 1B), which can be used to analyse "antagonistic" relationships in the interaction analysis; O: the 6 possible segmentation cases for 5 time nodes.
Detailed Description
For a further understanding of the present invention, reference should be made to the following detailed description of the invention taken in conjunction with the accompanying drawings.
Examples
For time-series transcriptome data with four time nodes (0 h, 12 h, 24 h and 48 h), this embodiment describes in detail the method's 3 analysis schemes: trend analysis, segmentation analysis and interaction analysis. The details are as follows:
(1) Acquisition of time-series transcriptome data
The time-series transcriptome data in this embodiment can be obtained from the public GEO data repository. When screening data sets, check that the time points are explicitly described, that each time point has sufficient biological replicates, and that there are no fewer than two time nodes. In this example, RNA-seq data for four time nodes (0 h, 12 h, 24 h and 48 h) were selected.
(2) Data preprocessing
For the transcriptome feature data in this embodiment, missing values are first filled in, probes with the same name are merged, and the data are standardized. Then the transcriptome values at each time node are averaged, the ratio between successive adjacent time nodes is computed, and its base-10 logarithm is taken. If the gene is named X and the time nodes are i and i+1, the result is written $X_{(i+1)/i}$; these values are collectively referred to as α. A threshold of ±0.5 was set for judging changes in the transcription level of each gene.
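The averaging and base-10 log-ratio step of this embodiment might look like the following Python sketch. It assumes that the averaging is taken over replicates within each time node, which the text does not state explicitly; the function name and the expression values are hypothetical.

```python
import math
from statistics import mean

def alpha_from_replicates(replicates_by_time):
    """Average replicates per time node, then take log10 of successive ratios."""
    means = [mean(reps) for reps in replicates_by_time]
    return [math.log10(means[i + 1] / means[i]) for i in range(len(means) - 1)]

# Hypothetical expression values for one gene X: two replicates at 0h, 12h, 24h, 48h.
gene_x = [[9.0, 11.0], [95.0, 105.0], [90.0, 110.0], [0.9, 1.1]]
alphas = alpha_from_replicates(gene_x)

# Apply the +/-0.5 threshold used in this embodiment.
calls = ["up" if a > 0.5 else "down" if a < -0.5 else "flat" for a in alphas]
print(alphas)
print(calls)
```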
(3) Exhaustive analysis method
The first is trend analysis. For a single gene X, compare $X_{(i+1)/i}$ with ±0.5 and plot how the value moves between the time nodes: $X_{(i+1)/i} > 0.5$ is a rise, $X_{(i+1)/i} < -0.5$ is a fall, and a value between -0.5 and 0.5 is unchanged. Traversing every gene in the data in this way yields the change types of all genes for 4 time nodes, of which there are $3^3 = 27$ in total. Excluding the all-unchanged case, the total number of trends in the trend analysis is 27 - 1 = 26.
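Traversing the genes and grouping them by trend can be sketched as below. The gene names, the α values and the compact string encoding (U for up, D for down, - for unchanged) are hypothetical and serve only to show the bookkeeping.

```python
from collections import Counter

def trend_pattern(alphas, threshold=0.5):
    """Encode a list of alpha values as a string such as 'U-D' (up, flat, down)."""
    return "".join("U" if a > threshold else "D" if a < -threshold else "-" for a in alphas)

# Hypothetical alpha values (log10 ratios) for four genes across three intervals.
alpha_table = {
    "geneA": [0.9, 0.7, 0.6],
    "geneB": [-0.8, -0.6, -1.2],
    "geneC": [0.8, 0.1, -0.9],
    "geneD": [0.1, -0.2, 0.3],
}
patterns = {gene: trend_pattern(a) for gene, a in alpha_table.items()}

# Drop the all-flat pattern, which is excluded from the 26 trends.
counts = Counter(p for p in patterns.values() if p != "---")
print(patterns)
print(counts)
```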
Consider the monotonically increasing and monotonically decreasing trends: P11a in FIG. 1K is monotonically increasing and P11b in FIG. 1K is monotonically decreasing. Monotonic trend analysis identifies genes that keep rising, or keep falling, across every time interval.
As another example, P8a and P8b in FIG. 1H represent gene groups whose trends are of the up-flat-down and down-flat-up types, respectively; their expression levels clearly change between time nodes. A traditional analysis that ignores the differences between time points, however, might conclude that these two trends show no significant fluctuation.
The second is segmentation analysis. According to the possible segmentation types in FIG. 1O, 5 time nodes (defined here as 0 h, 12 h, 24 h, 48 h and 72 h) give 3 × 2 = 6 segments, namely: 0h-12h and 12h-48h, 0h-24h and 24h-48h, and 0h-48h and 48h-72h.
For P13a and P13b in FIG. 1M, the 0h-12h segment and the 12h-48h segment have clearly opposite monotonicity: P13a increases monotonically in the 0h-12h segment and decreases monotonically in the 12h-48h segment, and P13b does the reverse. Likewise, P12a and P12b in FIG. 1L show opposite monotonic trends in the 0h-24h and 24h-48h segments. Compared with global analysis, local segmentation analysis makes it easier to mine significant, characteristic results for a "local time".
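Checking monotonicity on each side of a cut point can be sketched as follows. The profile used here is a hypothetical up, down, down pattern over 0h, 12h, 24h and 48h, similar in shape to what the text describes for P13a; it is not taken from the figure, and the function names are illustrative.

```python
def segment_direction(states):
    """Return 'up' or 'down' if all intervals agree, otherwise 'mixed'."""
    unique = set(states)
    if unique == {"up"}:
        return "up"
    if unique == {"down"}:
        return "down"
    return "mixed"

def split_at(pattern, cut_index):
    """Split interval states at the time node with index cut_index."""
    return pattern[:cut_index], pattern[cut_index:]

# Interval states for the nodes 0h, 12h, 24h, 48h; node index 1 is the 12h cut point.
profile = ("up", "down", "down")
before, after = split_at(profile, 1)
print(segment_direction(before), segment_direction(after))
print(segment_direction(profile))  # the global view hides the local monotonicity
```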
The third is interaction analysis. Based on the results of the trend and segmentation analyses, one or more genes of interest are selected for interaction analysis. In terms of time, the analysis can be global or local: the profiles in FIG. 1K-N fluctuate significantly over the whole time course, so their interactions may be global, whereas those in FIG. 1B-J show significant fluctuation only locally.
In FIG. 1K, P11a and P11b may have a significant "mutually inhibiting" antagonistic interaction, while within the P11a group or within the P11b group one can screen for "mutually promoting" synergistic interactions. These interactions may involve a pair of genes or two groups of genes. In this embodiment, four single-gene pairs with clear interactions were identified during the infection of the host insect, the wax moth, by Beauveria bassiana: BBA_05021 with BBA_08187 and BBA_02297 with BBA_00032 interact antagonistically, while BBA_05635 with BBA_00807 and BBA_02196 with BBA_07954 interact synergistically.
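Screening gene pairs for candidate interactions from their trend patterns can be sketched as below. The gene identifiers and patterns are hypothetical placeholders rather than the BBA genes of this embodiment, and a match only flags a candidate pair for further study, not a confirmed interaction.

```python
from itertools import combinations

MIRROR = {"U": "D", "D": "U", "-": "-"}

def relation(p, q):
    """Classify two trend patterns as 'synergistic', 'antagonistic' or None."""
    if set(p) == {"-"} or set(q) == {"-"}:
        return None  # all-flat profiles carry no trend information
    if p == q:
        return "synergistic"
    if q == "".join(MIRROR[c] for c in p):
        return "antagonistic"
    return None

# Hypothetical gene identifiers and patterns (U = up, D = down, - = flat).
patterns = {"g1": "UUU", "g2": "DDD", "g3": "UUU", "g4": "U-D"}
candidates = {
    (a, b): relation(patterns[a], patterns[b])
    for a, b in combinations(patterns, 2)
    if relation(patterns[a], patterns[b])
}
print(candidates)
```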

Claims (7)

1. An exhaustive analysis method for sequential biological big data, characterized by:
(1) The analysed data comprise quantifiable omics big data from the transcriptome, epigenome or proteome, and the number n of time-series nodes must satisfy n > 3;
(2) The subject is a segment of a chromosome, a gene, a protein, or a non-coding RNA;
(3) Data preprocessing is required before data analysis, and sequencing data for n time-series nodes are selected;
the preprocessing is as follows: for every time-series node except the first, the value of the sequencing data at that node is divided by the value at the immediately preceding node, the ratio is log-transformed, and the result is defined as α;
α is compared with a chosen threshold to define the trend: when the value 0 is chosen, α > 0 means that the data feature rises between the two points under study, α < 0 means that it falls, and α = 0 means that it is unchanged; when a pair of opposite numbers ±0.5 is chosen, α > 0.5 represents up-regulation of the data feature, α < -0.5 represents down-regulation, and α ∈ [-0.5, 0.5] represents no change;
(4) The method comprises: first, trend analysis from the global time perspective of a single study object, that is, measuring all possible trends of the data of every single study object from the global time perspective, including analysis of the monotonically increasing and monotonically decreasing trends; second, segmentation analysis from the local time perspective, that is, measuring how the data of every single study object change from the local time perspective, where, among all possible time segmentations of the time-series data, only a single segmentation is considered and two or more are not; or, when two segments are selected, the starting segment is used as the start of the time-series data; third, interaction analysis by comparison between paired study objects, that is, measuring the interactions between the data of a pair, or of two groups, of study objects on the basis of the existing trend and segmentation results; these are referred to as trend analysis, segmentation analysis and interaction analysis, respectively, 3 components in total.
2. An exhaustive analysis method for chronobiological big data according to claim 1, characterized in that: the total number of possible changes of the data follows a rule and satisfies an exponential equation; under the data processing method above, the total number Z of possible change types of the data for n nodes satisfies Formula 1:
$Z(n) = 3^{n-1}$ (Formula 1)
It follows that when the time-series data have n = 4 nodes, the total number of change types is $3^3$, that is, 27.
3. An exhaustive analysis method for chronobiological big data according to claim 2, characterized in that: the total number and the types of trend changes of the data are deterministic; the total number of trend changes follows a rule and is linearly related to the total number of change types, and the number T of trend changes of the data satisfies Formula 2:
$T(n) = Z(n) - 1$ (Formula 2)
By Formula 2, when the time-series data have n = 4 nodes, the total number of possible trend changes is 27 - 1, that is, 26.
4. An exhaustive analysis method for chronobiological big data according to claim 3, characterized in that: for interaction analysis, when the number of nodes is n, the total number P of interaction type pairs satisfies Formula 3:
$P(n) = T(n) / 2$ (Formula 3)
where T(n) is the total number of trend changes given by Formula 2. In the interaction analysis, when the data of study objects are symmetrically or consistently distributed, the study objects are presumed to be possibly mutually "antagonistic" or "synergistic", and the total number of antagonistic relationship groups or synergistic relationship groups in the interaction analysis is half of the total number of trends in the trend analysis.
5. An exhaustive analysis method for chronobiological big data according to claim 4, characterized in that: the analysis method is applied to time-series transcriptome data from the infection of the host insect, the wax moth, by Beauveria bassiana; RNA-seq data for four time nodes, namely 0 h, 12 h, 24 h and 48 h, are selected for analysis, and two clear single-gene pairs are found to be related, showing a significant "mutually inhibiting" antagonistic interaction and a "mutually promoting" synergistic interaction.
6. An exhaustive analysis method for chronobiological big data according to claim 1, characterized in that: for the cut points and the number of segment types in the segmentation analysis, when the number of nodes is n, there are n - 2 cut points, and the total number D of segment types satisfies Formula 4:
$D(n) = 2 \times (n - 2)$ (Formula 4)
The segment types of the segmentation analysis are obtained from Formula 4, and one or more of them can be selected for analysis; for 5 time nodes the number of segments is 2 × 3 = 6, that is, 2 × (5 - 2) = 6.
7. An exhaustive analysis method for chronobiological big data according to claim 1 or 3 or 4 or 6, characterized in that: among the trend analysis, segmentation analysis and interaction analysis of time-series data, segmentation analysis performs trend analysis and interaction analysis over a smaller time range, and interaction analysis is likewise based on trend analysis, in which "characteristically significant" trends are screened according to the number of trends and the specificity of the trends.
CN202210710202.XA 2022-06-22 2022-06-22 Exhaustive analysis method for sequential biological big data Active CN115394360B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210710202.XA CN115394360B (en) 2022-06-22 2022-06-22 Exhaustive analysis method for sequential biological big data

Publications (2)

Publication Number Publication Date
CN115394360A CN115394360A (en) 2022-11-25
CN115394360B CN115394360B (en) 2024-02-02

Family

ID=84115951

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210710202.XA Active CN115394360B (en) 2022-06-22 2022-06-22 Exhaustive analysis method for sequential biological big data

Country Status (1)

Country Link
CN (1) CN115394360B (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107832585A (en) * 2017-11-23 2018-03-23 南宁科城汇信息科技有限公司 A kind of RNAseq data analysing methods
CN109192252A (en) * 2018-08-23 2019-01-11 南开大学 Co-express purposes of the transcription group of period circadian rhythm in mechanism of drug action discovery
CN112201303A (en) * 2019-07-08 2021-01-08 广州基迪奥科技服务有限公司 Method and system for miRNA data and transcriptome data through analysis
CN112458179A (en) * 2020-10-27 2021-03-09 东阿阿胶股份有限公司 Donkey muscle development related gene excavation regulation method based on RNAseq technology

Also Published As

Publication number Publication date
CN115394360A (en) 2022-11-25

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant