WO2013097413A1 - Method and system for reconstructing diploid haplotypes (一种二倍体单体构建方法和系统) - Google Patents
Method and system for reconstructing diploid haplotypes
- Publication number
- WO2013097413A1 (application PCT/CN2012/076324)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- matrix
- annealing
- annealing process
- sequence
- haplotype
- Prior art date
Classifications
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B—BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B5/00—ICT specially adapted for modelling or simulations in systems biology, e.g. gene-regulatory networks, protein interaction networks or metabolic networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/10—Complex mathematical operations
- G06F17/16—Matrix or vector computation, e.g. matrix-matrix or matrix-vector multiplication, matrix factorization
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F30/00—Computer-aided design [CAD]
- G06F30/20—Design optimisation, verification or simulation
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B—BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B20/00—ICT specially adapted for functional genomics or proteomics, e.g. genotype-phenotype associations
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B—BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B20/00—ICT specially adapted for functional genomics or proteomics, e.g. genotype-phenotype associations
- G16B20/20—Allele or variant detection, e.g. single nucleotide polymorphism [SNP] detection
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B—BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B40/00—ICT specially adapted for biostatistics; ICT specially adapted for bioinformatics-related machine learning or data mining, e.g. knowledge discovery or pattern finding
Definitions
- The present invention relates to the field of bioinformatics, and in particular to a method and system for reconstructing diploid haplotypes.
Background
- SNP: single nucleotide polymorphism
- HapMap: Haplotype Map
- haplotypes based on chromosomal fragment information
- MFR: minimum fragment removal
- MEC: minimum error correction
- MSR: minimum SNP removal
- The greedy heuristic algorithm proposed by Levy et al.: the core idea is to use a greedy heuristic to minimize the difference between the known chromosome fragments and the reconstructed haplotypes. When the heterozygous SNP sites in the fragments contain no sequencing errors, the method quickly obtains the optimal haplotype. When sequencing errors occur at heterozygous SNP sites, the method takes longer and its results are less accurate.
- HapCUT: the core idea is to initialize a haplotype, build the chromosome fragment matrix, calculate the weights between SNP loci (based on MEC), construct a bipartite graph from these weights, and divide the SNPs into two classes so that the partition is optimal; the haplotype is then reconstructed from the optimal SNP classification. When the data contain many chromosome fragments and heterozygous SNP sites, the method runs longer, and its results are usually only locally optimal rather than globally optimal.
- ReFHap: this method resembles HapCUT in constructing a bipartite graph, but instead of classifying SNPs it divides all chromosome fragments into two classes according to their degree of similarity. The two sets of chromosome fragments are the final result, and the haplotypes are reconstructed from them. Although the method is fast and accurate, it still tends to fall into a locally optimal solution.
- One technical problem to be solved by one aspect of the invention is to provide a highly accurate and fast method and system for reconstructing diploid haplotypes.
- The objective function is defined over all fragment pairs with i ∈ S and j ∈ T; the pairwise term for (M, i, j) is the number of sites at which fragments i and j have different base types minus the number of sites at which they have the same base type.
- The simulated annealing process is performed based on the objective function and an initial reference temperature T0, and the final sets S and T are output when convergence is reached; the haplotype h is inferred from the final sets S and T by the minimum error correction model.
- The initial reference temperature T0 is determined from max − min and the initial acceptance probability p_r, where K groups of fragment subsets S and T are randomly generated from the matrix M, the objective function value is calculated for each group, max and min denote the largest and smallest of these values, and K is a natural number greater than or equal to 2.
- The Metropolis sampling stability criterion is used to decide whether to stop iterating at the current temperature and proceed to annealing.
- The convergence criterion of the simulated annealing process is: when the value of the objective function remains unchanged over a predetermined number of consecutive annealing steps, the entire algorithm is considered to have reached final convergence.
- The ternary characters {A, B, C} are {0, 1, -}.
- The method further comprises: filtering the SNP loci in the chromosome fragments to remove homozygous loci and SNP loci comprising more than two alleles.
- A diploid haplotype reconstruction system comprising: a sequence matrix construction module for constructing, from all sequence fragments that include at least one common site, an m×n sequence fragment matrix M composed of the ternary characters {A, B, C}, wherein in the sequence fragment matrix M the two alleles of an SNP site in a chromosome fragment are labeled A and B respectively, m is the number of rows of the matrix, indicating the number of chromosome fragments, and n is the number of columns of the matrix, indicating the number of heterozygous SNP sites; and an initial condition determination module that operates on the sequence fragments.
- The initial reference temperature T0 is determined from max − min and the initial acceptance probability p_r, where K groups of fragment subsets S and T are randomly generated from the matrix M, the objective function value is calculated for each group, max and min denote the largest and smallest of these values, and K is a natural number greater than or equal to 2.
- j is the number of anneals
- The simulated annealing iteration module uses the Metropolis sampling stability criterion to decide whether to stop iterating at the current temperature and proceed to annealing.
- The convergence criterion adopted by the simulated annealing iteration module is: when the value of the objective function remains unchanged over a predetermined number of consecutive annealing steps, the entire algorithm is considered to have reached final convergence.
- The system further comprises: an SNP site filtering module for filtering SNP sites in the chromosome fragments to remove homozygous sites and SNP sites comprising more than two alleles.
- The diploid haplotype reconstruction method and system of the present invention perform a simulated annealing process based on an objective function and an initial reference temperature, output the final sets S and T upon convergence, and infer the haplotype by the minimum error correction model, thereby obtaining a haplotype corresponding to the global optimal solution with high accuracy and high speed.
- Figure 1 is a flow chart of one embodiment of the diploid haplotype reconstruction method of the present invention.
- Figure 2 is a flow chart of another embodiment of the diploid haplotype reconstruction method of the present invention.
- Figure 3 is a flow chart of still another embodiment of the diploid haplotype reconstruction method of the present invention.
- Figure 4 shows the effect of different parameter values of the high-temperature annealing function on the cooling rate.
- Figure 5 shows the effect of different parameter values of the low-temperature annealing function on the cooling rate.
- Figure 6 shows the effect of the degree of overlap between chromosome fragments on the accuracy of the haplotype reconstruction results.
- Figure 7 shows the relationship between the error rate of the reconstructed haplotype and the SNP error rate at different sequencing depths.
- Figure 8 is a block diagram of one embodiment of the diploid haplotype reconstruction system of the present invention.
- Figure 9 is a structural diagram of another embodiment of the diploid haplotype reconstruction system of the present invention.
- Figure 10 shows an example of a bipartite graph.
Detailed description
- The basic idea of the present invention is to use a simulated annealing algorithm to partition the sequence fragments obtained by sequencing into the two sides of a bipartite graph according to their degree of difference, thereby reconstructing the haplotypes; the aim is fast and accurate haplotype reconstruction for a genome (e.g., human) from massive sequencing data.
- The simulated annealing algorithm derives from the principle of annealing solids in a physical system.
- The solid is heated to a sufficiently high temperature and then allowed to cool slowly.
- During heating, the particles inside the solid become disordered as the temperature rises, and the internal energy increases.
- During slow cooling, the particles become increasingly ordered, reaching an equilibrium state at each temperature and finally reaching the ground state at normal temperature, where the internal energy is minimal.
- The algorithm is a random search algorithm for solving large-scale optimization problems, based on the similarity between the process of solving an optimization problem and the annealing process of a physical system.
- The objective function being optimized corresponds to the internal energy of the metal.
- The state space of combinations of the independent variables corresponds to the internal energy state space of the metal; solving the problem means finding a combined state that minimizes (or maximizes) the objective function value.
- Simulated annealing uses the Metropolis criterion and appropriately controls the cooling process, with the aim of solving global optimization problems in polynomial time.
- Reconstructing haplotypes is a combinatorial optimization problem.
- The internal energy E is modeled as the objective function value f.
- The temperature T becomes the control parameter t, which yields the simulated annealing algorithm for optimization problems: starting from an initial solution i and an initial value t of the control parameter, repeat the iteration "generate a new solution → calculate the objective function difference → accept or discard" on the current solution while gradually decreasing t.
- The current solution when the algorithm terminates is the approximate optimal solution obtained. This is a heuristic random search process based on the Monte Carlo iterative solution method.
- A bipartite graph is a special model in graph theory: its vertex set can be divided into two disjoint subsets V1 and V2 such that every edge joins a vertex in V1 to a vertex in V2.
- Fig. 1 is a flow chart showing an embodiment of the diploid haplotype reconstruction method of the present invention.
- Step 102: construct an m×n sequence fragment matrix M composed of the ternary characters {A, B, C} from all sequence fragments that include at least one common site.
- In the sequence fragment matrix M, the two alleles of an SNP locus in a chromosome fragment are labeled A and B respectively; m is the number of rows of the matrix, indicating the number of chromosome fragments, and n is the number of columns of the matrix, indicating the number of heterozygous SNP sites.
- For example, the two alleles of an SNP locus in a chromosome fragment are converted to 0 and 1 in ASCII order, that is, the allele with the smaller ASCII code is represented by 0 and the one with the larger code by 1. All fragments containing at least one common site are grouped together to form a two-dimensional m×n matrix, denoted M, where m is the number of rows, representing the number of chromosome fragments, and n is the number of columns, representing the number of heterozygous SNP sites.
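The encoding described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `build_fragment_matrix`, the dict-per-fragment input format, and the use of `-` for sites a fragment does not cover are hypothetical choices, not taken from the patent text.

```python
from typing import Dict, List


def build_fragment_matrix(fragments: List[Dict[int, str]], n_sites: int) -> List[str]:
    """Encode fragments as rows over the characters '0', '1' and '-'.

    fragments: one dict per fragment, mapping site index -> observed base.
    Assumes the sites were already filtered to heterozygous, biallelic SNPs.
    """
    # Collect the alleles observed at each site across all fragments.
    alleles = [set() for _ in range(n_sites)]
    for frag in fragments:
        for site, base in frag.items():
            alleles[site].add(base)

    rows = []
    for frag in fragments:
        row = []
        for site in range(n_sites):
            base = frag.get(site)
            if base is None:
                row.append('-')  # site not covered by this fragment
            else:
                # Allele with the smaller ASCII code -> '0', the other -> '1'.
                row.append('0' if base == min(alleles[site]) else '1')
        rows.append(''.join(row))
    return rows


# Three toy fragments over three SNP sites.
frags = [{0: 'A', 1: 'C'}, {1: 'T', 2: 'G'}, {0: 'G', 2: 'C'}]
M = build_fragment_matrix(frags, 3)
```

Each row of `M` is one fragment and each column one SNP site, matching the m×n layout in the text.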
- The objective function is defined over all pairs i ∈ S, j ∈ T, where the pairwise term for (M, i, j) represents the difference between fragments i and j.
- Step 108: perform a simulated annealing process based on the objective function and the initial reference temperature T0, and output the final sets S and T upon convergence. The convergence condition can be adjusted as needed by those skilled in the art.
- The haplotype h is inferred from the final sets S and T by the minimum error correction (MEC) model.
- In this embodiment, the fragment matrix is constructed from the sequence fragments, the simulated annealing process is performed based on the objective function and the initial reference temperature, the final sets S and T are output upon convergence, and the haplotype is inferred by the minimum error correction model, so that a haplotype corresponding to the global optimal solution is obtained with high accuracy.
- The objective function considers both the sites with the same base type and the sites with different base types between fragments, and therefore avoids the under-use of information that results from scoring only the sites with different base types. This defect can be illustrated by the following example:
- Fragment f3: 1011010111111. If only the sites with different base types are considered, the difference between f2 and f3 is judged to be the smallest because the information in f2 is incomplete, whereas in fact a different fragment has the smallest difference from f3.
- The determination of the initial state (initial reference temperature) in one embodiment is described below.
- The initial temperature T0 is determined from the maximum difference, max − min, and the initial acceptance probability p_r.
- The matrix M is used to randomly generate K groups (for example, K between 30 and 200) of the two subsets S and T, and the objective function value of each group is calculated. The largest difference between these values is max − min, where max and min are the largest and smallest values.
- p_r is the initial acceptance probability (for example, p_r = 0.9).
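The exact expression for T0 did not survive in this text. As a hedged sketch, the code below samples K random partitions, takes the spread max − min of their objective values, and then applies the common simulated annealing heuristic T0 = −(max − min) / ln(p_r). That closed form is an assumption, not confirmed by the source; the function name `initial_temperature` and the callable `objective` are also hypothetical.

```python
import math
import random
from typing import Callable, Set


def initial_temperature(
    m: int,
    objective: Callable[[Set[int], Set[int]], float],
    k: int = 30,
    p_r: float = 0.9,
    seed: int = 0,
) -> float:
    """Estimate T0 from k random partitions of fragments 0..m-1 into S and T."""
    rng = random.Random(seed)
    values = []
    for _ in range(k):
        s = {i for i in range(m) if rng.random() < 0.5}
        t = set(range(m)) - s
        values.append(objective(s, t))
    delta_max = max(values) - min(values)
    # ASSUMPTION: the usual heuristic, chosen so that a move that worsens the
    # objective by delta_max is accepted with probability p_r at temperature T0.
    return -delta_max / math.log(p_r)


# Toy objective that favours balanced partitions (for illustration only).
t0 = initial_temperature(6, lambda s, t: len(s) * len(t), k=30, p_r=0.9)
```

With p_r close to 1 the logarithm is small and negative, so T0 is large and positive, which makes almost every early move acceptable.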
- Fig. 2 is a flow chart showing another embodiment of the diploid haplotype reconstruction method of the present invention.
- An m×n sequence fragment matrix M composed of the ternary characters {0, 1, -} is constructed from all sequence fragments containing at least one common site.
- Step 204: initialize two fragment sets S and T according to the sequence fragment matrix M, and determine the scoring matrix s(a1, a2).
- a1 and a2 are respectively the base types of the i-th and j-th sequence fragments in the matrix M at the same site coordinate.
- The scoring matrix gives a score for the degree of difference between two fragments, obtained as the number of sites at which their base types differ minus the number of sites at which their base types are the same; the larger the score, the greater the difference between the two fragments.
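A minimal sketch of the pairwise score and the resulting objective follows. The individual entries of the scoring matrix are not legible in this text, so the values used here (+1 for differing bases, −1 for matching bases, 0 when either fragment has `-`) are an assumption consistent with the "different minus same" description; the names `pair_score` and `objective` are hypothetical.

```python
from typing import List, Set


def pair_score(row_i: str, row_j: str) -> int:
    """Number of differing sites minus number of matching sites."""
    score = 0
    for a1, a2 in zip(row_i, row_j):
        if a1 == '-' or a2 == '-':
            continue  # ASSUMPTION: a site missing in either fragment scores 0
        score += 1 if a1 != a2 else -1
    return score


def objective(M: List[str], S: Set[int], T: Set[int]) -> int:
    """Sum of pairwise scores over all i in S and j in T."""
    return sum(pair_score(M[i], M[j]) for i in S for j in T)


M = ["0101", "01-1", "1010", "10-0"]
good = objective(M, {0, 1}, {2, 3})  # fragments grouped by haplotype
bad = objective(M, {0, 2}, {1, 3})   # haplotypes mixed within each set
```

Grouping fragments from the same haplotype together makes every cross pair maximally different, so the first partition scores higher than the mixed one.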
- The annealing process is an important part of the algorithm and affects the acceptance probability of state transitions. To obtain the global optimal solution more accurately and to avoid falling into a local optimum in the later stage of annealing, the annealing process is divided into two parts: a high-temperature annealing process and a low-temperature annealing process.
- The range of the optimal solution is locked by the high-temperature annealing process.
- The purpose of the high-temperature annealing process is to quickly lock the range of the optimal solution (that is, the maximum value of the objective function) and narrow the solution interval.
- The corresponding model perturbation uses a random number uniformly distributed in [0, 1] and a fluctuation range [A_i, B_i]. For example, with 50 sequence fragments, the fluctuation range is [1, 50].
- Step 212: when the high-temperature annealing process is stable, proceed to the low-temperature annealing process; when convergence is reached, output the final sets S and T.
- In the low-temperature model perturbation, the random number is again uniformly distributed in [0, 1], [A_i, B_i] is the fluctuation range, and m_i ∈ [A_i, B_i].
- As can be seen from the annealing function, the system undergoes a certain degree of reheating (tempering), which helps it escape a local optimum it may have entered during the high-temperature annealing process.
- Step 214: infer the haplotype from the final sets S and T using the minimum error correction model.
- Once one haplotype has been inferred, the other haplotype can be obtained from it by the complementary relationship.
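One simple way to realise this step is a per-column majority vote over the fragments of one set, followed by complementing the result. The sketch below assumes the {0, 1, -} encoding; the tie-breaking rule (ties go to `0`) and the name `infer_haplotypes` are assumptions and not taken from the patent.

```python
from typing import List, Set, Tuple


def infer_haplotypes(M: List[str], S: Set[int]) -> Tuple[str, str]:
    """Majority vote per column over the rows in S, plus the complement."""
    n = len(M[0])
    h = []
    for j in range(n):
        zeros = sum(1 for i in S if M[i][j] == '0')
        ones = sum(1 for i in S if M[i][j] == '1')
        # ASSUMPTION: ties (including columns with no coverage) resolve to '0'.
        h.append('1' if ones > zeros else '0')
    h1 = ''.join(h)
    # At heterozygous sites the second haplotype carries the other allele.
    h2 = ''.join('1' if c == '0' else '0' for c in h1)
    return h1, h2


M = ["0101", "01-1", "1010", "10-0", "0111"]
h1, h2 = infer_haplotypes(M, {0, 1, 4})
```

Choosing the majority allele in each column minimises the number of entries in S that would need correcting, which is the idea behind the minimum error correction model.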
- The annealing process (both high-temperature and low-temperature) is divided into two closely connected steps. (1) Metropolis sampling stability criterion: when the value of the objective function is stable at a given annealing temperature t, the iteration stops and annealing proceeds. At a given temperature t, the Metropolis sampling criterion is applied as follows: randomly draw a sequence fragment v from S or T; if v ∈ S, move it to T (T ∪ {v}), and if v ∈ T, move it to S (S ∪ {v}); then calculate the objective function value after the move. If the value becomes larger, the move is accepted. If the value becomes smaller or is unchanged, the value of the acceptance probability function is calculated and compared with a random number between 0 and 1 to determine whether to accept the move.
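A single Metropolis move as described above might look like the following sketch. The acceptance probability is garbled in this text; the form exp(Δ/t), with Δ ≤ 0 the change in the objective, is the standard Metropolis rule for a maximisation problem and is an assumption here. The helper names and the +1/−1/0 pairwise scoring are likewise hypothetical.

```python
import math
import random
from typing import List, Set, Tuple


def pair_score(a: str, b: str) -> int:
    # Differing sites minus matching sites; '-' entries are skipped (assumption).
    return sum(1 if x != y else -1 for x, y in zip(a, b) if x != '-' and y != '-')


def objective(M: List[str], S: Set[int], T: Set[int]) -> int:
    return sum(pair_score(M[i], M[j]) for i in S for j in T)


def metropolis_step(
    M: List[str], S: Set[int], T: Set[int], t: float, rng: random.Random
) -> Tuple[Set[int], Set[int]]:
    """Move one random fragment to the other set and accept or reject the move."""
    v = rng.choice(sorted(S | T))
    if v in S:
        new_S, new_T = S - {v}, T | {v}
    else:
        new_S, new_T = S | {v}, T - {v}
    delta = objective(M, new_S, new_T) - objective(M, S, T)
    # Improvements are always accepted; otherwise accept with probability
    # exp(delta / t), where delta <= 0 (ASSUMED form of the garbled formula).
    if delta > 0 or rng.random() < math.exp(delta / t):
        return new_S, new_T
    return S, T


M = ["0101", "01-1", "1010", "10-0"]
S, T = metropolis_step(M, {0, 1}, {2, 3}, t=1e-9, rng=random.Random(0))
```

At a very low temperature, as in the usage line, any move away from the best partition is rejected, so the sets come back unchanged; at high temperature most moves are accepted.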
- The SNP sites are filtered to remove homozygous sites and SNP sites containing more than two alleles.
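As a rough illustration of this filter, the sketch below keeps only sites at which exactly two distinct alleles are observed across the fragments. Judging zygosity from the fragments alone is a simplifying assumption (a real pipeline would more likely rely on genotype calls), and the name `heterozygous_biallelic_sites` is hypothetical.

```python
from typing import Dict, List, Set


def heterozygous_biallelic_sites(fragments: List[Dict[int, str]]) -> List[int]:
    """Return the site indices at which exactly two alleles are observed."""
    observed: Dict[int, Set[str]] = {}
    for frag in fragments:
        for site, base in frag.items():
            observed.setdefault(site, set()).add(base)
    # One allele -> treated as homozygous; three or more -> multi-allelic.
    return sorted(site for site, bases in observed.items() if len(bases) == 2)


frags = [
    {0: 'A', 1: 'C', 2: 'G'},
    {0: 'A', 1: 'T', 2: 'C'},
    {1: 'C', 2: 'T'},
]
kept = heterozygous_biallelic_sites(frags)
```

In the toy data, site 0 shows a single allele and site 2 shows three, so only site 1 is kept.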
- Fig. 3 is a flow chart showing still another embodiment of the diploid haplotype reconstruction method of the present invention.
- Step 300: filter the SNP sites to remove homozygous sites and SNP sites comprising more than two alleles.
- Step 302: construct the fragment matrix M.
- Step 304: determine the objective function and the initial reference temperature according to the sequence fragment matrix M and the two fragment sets S and T.
- Step 306: determine whether the fragment sets S and T satisfy the algorithm convergence criterion. If so, output the result (step 326); otherwise, continue to step 308.
- Step 308: determine whether the sampling stability criterion is met. If yes, proceed to step 310; otherwise, continue to step 314.
- Step 310: determine whether the current process is the high-temperature annealing process. If so, continue to step 312a and anneal using the high-temperature annealing function; otherwise, continue to step 312b and anneal using the low-temperature annealing function.
- Step 314: generate a new state from the current state.
- Step 316: determine whether the acceptance function is satisfied. If so, accept the new state (step 320); otherwise, keep the current state (step 318). Return to step 308.
- When step 308 determines that the sampling stability criterion is satisfied, the temperature is lowered; lowering the temperature requires either the high-temperature annealing process or the low-temperature annealing process.
- Step 310 therefore determines whether the current process is the high-temperature annealing process (step 312a) or the low-temperature annealing process (step 312b): the high-temperature annealing function is used in the former case and the low-temperature annealing function in the latter.
- The algorithm does not interleave the high-temperature and low-temperature processes while reconstructing the haplotype. The two processes are strictly separated: the high-temperature annealing process is followed by the low-temperature annealing process, the algorithm never goes from the low-temperature annealing process back to the high-temperature annealing process, and the reheating step also belongs to the low-temperature process.
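The overall flow can be sketched as a two-phase loop. The patent's actual high-temperature and low-temperature annealing functions are not legible in this text, so the sketch substitutes plain geometric cooling with two different rates and omits the reheating step. The parameters `hot_rate`, `cold_rate`, `switch_t`, `moves_per_t` and `patience`, the fixed number of moves per temperature (standing in for the sampling stability check), and the exp(Δ/t) acceptance rule are all assumptions.

```python
import math
import random
from typing import List, Set, Tuple


def pair_score(a: str, b: str) -> int:
    # Differing sites minus matching sites; '-' entries are skipped (assumption).
    return sum(1 if x != y else -1 for x, y in zip(a, b) if x != '-' and y != '-')


def objective(M: List[str], S: Set[int], T: Set[int]) -> int:
    return sum(pair_score(M[i], M[j]) for i in S for j in T)


def anneal(M: List[str], t0: float, hot_rate: float = 0.5, cold_rate: float = 0.9,
           switch_t: float = 1.0, moves_per_t: int = 20, patience: int = 5,
           seed: int = 0) -> Tuple[Set[int], Set[int]]:
    """Two-phase annealing that maximises the objective; returns the best S, T."""
    rng = random.Random(seed)
    m = len(M)
    S, T = set(range(0, m, 2)), set(range(1, m, 2))  # arbitrary initial split
    cur = objective(M, S, T)
    best = (cur, set(S), set(T))
    t, unchanged, hot = t0, 0, True
    while unchanged < patience:  # convergence: objective stable for `patience` steps
        before = cur
        for _ in range(moves_per_t):  # sampling at a fixed temperature
            v = rng.randrange(m)
            new_S, new_T = (S - {v}, T | {v}) if v in S else (S | {v}, T - {v})
            new = objective(M, new_S, new_T)
            if new > cur or rng.random() < math.exp((new - cur) / t):
                S, T, cur = new_S, new_T, new
                if cur > best[0]:
                    best = (cur, set(S), set(T))
        unchanged = unchanged + 1 if cur == before else 0
        if hot and t <= switch_t:
            hot = False  # one-way switch: never return to the high-temperature phase
        t = max(t * (hot_rate if hot else cold_rate), 1e-6)
    return best[1], best[2]


M = ["0101", "01-1", "1010", "10-0"]
S, T = anneal(M, t0=10.0)
```

The one-way `hot` flag mirrors the strict separation described above: once the low-temperature phase starts, the loop never switches back to the high-temperature cooling rate.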
- Figure 3 shows the flow of the haplotype reconstruction method based on simulated annealing with the objective function defined above.
- The following examples illustrate the setting of the corresponding parameters in the annealing process and the appropriate data types.
- Example parameter values: different values of the two parameters of the annealing functions affect the cooling rate.
- Example 1, evaluation index SE: the applicant uses an exchange error rate (SE, also referred to as the reconstruction error rate) to evaluate the accuracy of the haplotype reconstruction results of the present invention.
- The formula for calculating SE is:
- SE = min{d(h_reconstruct, h_real1), d(h_reconstruct, h_real2)} / n, where h_reconstruct represents a haplotype in the reconstruction result, d(h_reconstruct, h_real1) and d(h_reconstruct, h_real2) represent the numbers of SNP mismatches between the reconstructed haplotype and the two standard haplotypes generated by simulation, and n is the number of SNPs in the reconstruction result. SE is thus the minimum number of SNP sites at which the reconstruction differs from the simulated true haplotypes, expressed as a fraction of n. The smaller the SE value, the more similar the haplotype reconstructed from the simulated data is to the true result, and the higher the accuracy.
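The SE formula translates directly into code. The sketch below assumes haplotypes are given as equal-length strings over {0, 1}; the function names are hypothetical.

```python
def mismatches(a: str, b: str) -> int:
    """Number of positions at which two equal-length haplotypes differ."""
    return sum(1 for x, y in zip(a, b) if x != y)


def reconstruction_error_rate(h_rec: str, h_real1: str, h_real2: str) -> float:
    """SE = min(d(h_rec, h_real1), d(h_rec, h_real2)) / n."""
    n = len(h_rec)
    return min(mismatches(h_rec, h_real1), mismatches(h_rec, h_real2)) / n


se = reconstruction_error_rate("0101", "0111", "1000")
```

Taking the minimum over the two true haplotypes means the reconstruction is scored against whichever one it actually corresponds to.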
- the overlap level of chromosome fragments is defined as:
- n denotes the number of SNP sites (i.e., the number of columns of the above sequence matrix)
- ⁇ N the number of SNP sites
- P_overlap_level reflects, across the whole sequence matrix M, the proportion of SNP loci covered by the sequence fragments and the adequacy of the available information. This value indicates the average number of times each SNP locus is reused, and also the degree of compactness between sequence fragments: the more SNP loci overlap one another (i.e., the more times SNP loci are reused), or the more compact the sequence fragments, the larger P_overlap_level is and the more information is available.
- Each set of data consists of 50 randomly generated pairs of standard haplotypes and the chromosome fragments generated from those haplotypes.
- Each standard haplotype contains the same number of SNPs, 200, and the same number of chromosome fragments is generated for each, but the number of SNPs contained in a chromosome fragment increases by 5 per data set, starting from 10.
- Deletion and transversion of SNPs in the chromosome fragments are also considered.
- When the simulated data are generated, the probability that each heterozygous SNP locus in a chromosome fragment is deleted is 0.9, which is much higher than in practice. If comparable results can still be obtained under this condition, the method of the present invention has practical significance. The transversion probability is set to 0.05.
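A fragment generator along these lines could look like the sketch below. It assumes haplotypes encoded over {0, 1}, treats a "transversion" as flipping the allele to the other value, and treats a deleted site as `-`; the function name `simulate_fragment` and its parameters are hypothetical, and only the probabilities 0.9 and 0.05 and the sizes 200 and 10 come from the text.

```python
import random


def simulate_fragment(hap: str, start: int, length: int,
                      p_del: float, p_flip: float, rng: random.Random) -> str:
    """Copy hap[start:start+length] into a row, with deletions and allele flips."""
    row = ['-'] * len(hap)
    for j in range(start, min(start + length, len(hap))):
        if rng.random() < p_del:
            continue  # SNP deleted from this fragment
        allele = hap[j]
        if rng.random() < p_flip:
            allele = '1' if allele == '0' else '0'  # transversion (allele flip)
        row[j] = allele
    return ''.join(row)


rng = random.Random(0)
hap = ''.join(rng.choice('01') for _ in range(200))  # 200 SNPs, as in the text
frag = simulate_fragment(hap, start=20, length=10, p_del=0.9, p_flip=0.05, rng=rng)
```

With a deletion probability of 0.9, a 10-SNP fragment keeps only about one informative site on average, which is why the text calls this setting much harsher than real data.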
- Across the 39 sets of simulated data generated, P_overlap_level (the degree of overlap) ranges from 0.18 to 0.90, and the corresponding SE (the reconstruction error rate) ranges from 0 to 0.03.
- In the corresponding figure, the abscissa is the degree of overlap (P_overlap_level) and the ordinate is the reconstruction error rate (SE).
- Example 2, relationship between SE and SNP transversion rate: to evaluate the relationship between the SNP transversion rate and SE, the applicant generated simulated data with SNP transversion rates ranging from low to high at different coverage depths.
- The SNP coverage depths are 10X, 20X, 30X, 40X, and 50X, and each depth contains 7 sets of simulated data with SNP transversion rates of 0.01, 0.05, 0.1, 0.2, 0.3, 0.4, and 0.5. Each set of data consists of 50 randomly generated pairs of standard haplotypes and the chromosome fragments generated from them.
- The SNP coverage depth is calculated from the following quantities:
- m is the number of chromosome fragments
- L is the average length of the chromosome fragments
- d is the SNP deletion rate
- N is the number of SNPs contained in the standard haplotype.
- Figure 8 is a block diagram showing one embodiment of the diploid haplotype reconstruction system of the present invention.
- The diploid haplotype reconstruction system in this embodiment includes: a sequence matrix construction module 81, which constructs a sequence fragment matrix composed of the ternary characters {A, B, C} from all sequence fragments including at least one common site.
- A haplotype determining module 84, in which, for each column j, j ∈ [1, n], m_{j,0} represents the number of 0s in the column, m_{j,1} represents the number of 1s in the column, and h_j denotes the haplotype value inferred for the column.
- The initial reference temperature is determined from max − min and the initial acceptance probability p_r, where max and min are the largest and smallest objective function values over the groups of S and T randomly generated from the matrix M.
- The simulated annealing iteration module uses the Metropolis sampling stability criterion to decide whether to stop iterating and proceed to annealing; the convergence criterion it adopts for the simulated annealing process is: when the value of the objective function remains unchanged over a predetermined number of consecutive annealing steps, the entire algorithm is considered to have reached final convergence.
- Fig. 9 is a structural diagram showing another embodiment of the diploid haplotype reconstruction system of the present invention.
- An SNP site filtering module 90 may further be included, for filtering SNP sites in the chromosome fragments to remove homozygous sites and SNP sites containing more than two alleles.
- An annealing stability determining unit 932 determines whether the high-temperature annealing process is stable and, when it is stable, transfers to the low-temperature annealing process; a low-temperature annealing performing unit 933 performs the low-temperature annealing process.
- The functional modules in Figures 8 and 9 can be implemented by separate computing devices or integrated into a single device. They are shown as boxes in Figures 8 and 9 to illustrate their function. These functional blocks can be implemented in hardware, software, firmware, middleware, microcode, hardware description languages, or any combination thereof. For example, one or more of the functional blocks can be implemented by code running on a microprocessor, a digital signal processor (DSP), or any other suitable computing device.
- The code can represent a procedure, a function, a subprogram, a program, a routine, a subroutine, a module, or any combination of instructions, data structures, or program statements.
- The code can be located on a computer-readable medium.
- The computer-readable medium can include one or more storage devices including, for example, RAM, flash memory, ROM, EPROM, EEPROM, registers, hard disk, removable hard disk, CD-ROM, or any other form of storage medium known in the art.
- The computer-readable medium can also include a carrier wave that encodes a data signal.
- This patent proposes a new haplotype reconstruction method and system that can determine the global optimal solution of the objective function in a short time and then complete the haplotype reconstruction.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US14/369,604 US20150120256A1 (en) | 2011-12-31 | 2012-05-31 | Method of reconstructing haplotype of diploid and system thereof |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201110456562.3 | 2011-12-31 | ||
CN201110456562 | 2011-12-31 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2013097413A1 (zh) | 2013-07-04 |
Family
ID=48696317
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2012/076324 WO2013097413A1 (zh) | Method and system for reconstructing diploid haplotypes (一种二倍体单体构建方法和系统) | 2011-12-31 | 2012-05-31 |
Country Status (2)
Country | Link |
---|---|
US (1) | US20150120256A1 (zh) |
WO (1) | WO2013097413A1 (zh) |
Families Citing this family (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20160034423A1 (en) * | 2014-08-04 | 2016-02-04 | Microsoft Corporation | Algorithm for Optimization and Sampling |
CN105046105B (zh) * | 2015-07-09 | 2018-02-02 | 天津诺禾医学检验所有限公司 | Chromosome-spanning haplotype map and method for constructing it |
US10176296B2 (en) * | 2017-05-17 | 2019-01-08 | International Business Machines Corporation | Algebraic phasing of polyploids |
CN111383714B (zh) * | 2018-12-29 | 2023-07-28 | 安诺优达基因科技(北京)有限公司 | Method for simulating a sequencing library for a target disease and its application |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20030190652A1 (en) * | 2002-01-25 | 2003-10-09 | De La Vega Francisco M. | Methods of validating SNPs and compiling libraries of assays |
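CN101256602A (zh) * | 2008-03-18 | 2008-09-03 | 中南大学 | Individual haplotype reconstruction method based on an optimized solution set |
CN102191311A (zh) * | 2010-03-10 | 2011-09-21 | 常州楚天生物科技有限公司 | Construction and application of a universal oligonucleotide sequence library |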
Family Cites Families (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2002101631A1 (en) * | 2001-06-08 | 2002-12-19 | President And Fellows Of Harvard College | Haplotype determination |
US20130196862A1 (en) * | 2009-07-17 | 2013-08-01 | Natera, Inc. | Informatics Enhanced Analysis of Fetal Samples Subject to Maternal Contamination |
US8877442B2 (en) * | 2010-12-07 | 2014-11-04 | The Board Of Trustees Of The Leland Stanford Junior University | Non-invasive determination of fetal inheritance of parental haplotypes at the genome-wide scale |
- 2012-05-31: US application US14/369,604 (published as US20150120256A1), not active, abandoned
- 2012-05-31: WO application PCT/CN2012/076324 (published as WO2013097413A1), active, application filing
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
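CN106526338A (zh) * | 2016-10-19 | 2017-03-22 | 天津大学 | Indoor ray-tracing parameter correction method based on simulated annealing |
CN110430593A (zh) * | 2019-08-17 | 2019-11-08 | 胡洋 | User task offloading method for edge computing |
CN110430593B (zh) * | 2019-08-17 | 2022-05-13 | 胡洋 | User task offloading method for edge computing |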
Also Published As
Publication number | Publication date |
---|---|
US20150120256A1 (en) | 2015-04-30 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Pouyet et al. | Background selection and biased gene conversion affect more than 95% of the human genome and bias demographic inferences | |
WO2013097413A1 (zh) | Method and system for reconstructing diploid haplotypes (一种二倍体单体构建方法和系统) | |
Willems et al. | Population-scale sequencing data enable precise estimates of Y-STR mutation rates | |
Cariou et al. | Is RAD‐seq suitable for phylogenetic inference? An in silico assessment and optimization | |
Li et al. | Low-coverage sequencing: implications for design of complex trait association studies | |
Sun et al. | Identifying splicing sites in eukaryotic RNA: support vector machine approach | |
Tirosh et al. | Comparative analysis indicates regulatory neofunctionalization of yeast duplicates | |
Hovmöller et al. | Effects of missing data on species tree estimation under the coalescent | |
US11322225B2 (en) | Systems and methods for determining effects of therapies and genetic variation on polyadenylation site selection | |
US20190338349A1 (en) | Methods and systems for high fidelity sequencing | |
Bellos et al. | cnvHiTSeq: integrative models for high-resolution copy number variation detection and genotyping using population sequencing data | |
Wang et al. | CNVeM: copy number variation detection using uncertainty of read mapping | |
Sun et al. | Introducing heuristic information into ant colony optimization algorithm for identifying epistasis | |
Rhee et al. | Survey of computational haplotype determination methods for single individual | |
Morris | Direct analysis of unphased SNP genotype data in population‐based association studies via Bayesian partition modelling of haplotypes | |
JP2022549737A (ja) | In vitro受精に関する多遺伝子リスクスコア | |
Wang et al. | Role of SNPs in the Biogenesis of Mature miRNAs | |
JP2023506084A (ja) | ゲノム瘢痕アッセイ及び関連する方法 | |
CN106570350B (zh) | 单核苷酸多态位点分型算法 | |
Fadziso et al. | Identical by Descent (IBD): Investigation of the Genetic Ties between Africans, Denisovans, and Neandertals | |
Paşaniuc et al. | Imputation-based local ancestry inference in admixed populations | |
Zhao et al. | Large-scale study of long non-coding RNA functions based on structure and expression features | |
Olyaee et al. | AROHap: An effective algorithm for single individual haplotype reconstruction based on asexual reproduction optimization | |
Treangen et al. | A novel heuristic for local multiple alignment of interspersed DNA repeats | |
Kimmel et al. | Association mapping and significance estimation via the coalescent |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 12863346 Country of ref document: EP Kind code of ref document: A1 |
|
WWE | Wipo information: entry into national phase |
Ref document number: 14369604 Country of ref document: US |
|
NENP | Non-entry into the national phase |
Ref country code: DE |
|
32PN | Ep: public notification in the ep bulletin as address of the adressee cannot be established |
Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC, FORM 1205N DATED 26-11-2014 |
|
122 | Ep: pct application non-entry in european phase |
Ref document number: 12863346 Country of ref document: EP Kind code of ref document: A1 |