AU2022319125A1 - Quality score calibration of basecalling systems - Google Patents
Quality score calibration of basecalling systems Download PDFInfo
- Publication number
- AU2022319125A1 AU2022319125A1 AU2022319125A AU2022319125A AU2022319125A1 AU 2022319125 A1 AU2022319125 A1 AU 2022319125A1 AU 2022319125 A AU2022319125 A AU 2022319125A AU 2022319125 A AU2022319125 A AU 2022319125A AU 2022319125 A1 AU2022319125 A1 AU 2022319125A1
- Authority
- AU
- Australia
- Prior art keywords
- base
- range
- sensor data
- value
- intensity
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Links
- 238000000034 method Methods 0.000 claims abstract description 155
- 238000012163 sequencing technique Methods 0.000 claims description 181
- 238000003860 storage Methods 0.000 claims description 84
- 238000012545 processing Methods 0.000 claims description 61
- 230000015654 memory Effects 0.000 claims description 54
- 238000010606 normalization Methods 0.000 claims description 51
- 230000008569 process Effects 0.000 claims description 43
- 238000013507 mapping Methods 0.000 claims description 30
- 239000012491 analyte Substances 0.000 claims description 21
- 238000004590 computer program Methods 0.000 claims description 9
- 238000006243 chemical reaction Methods 0.000 description 118
- 239000010410 layer Substances 0.000 description 110
- 230000006870 function Effects 0.000 description 82
- 238000013528 artificial neural network Methods 0.000 description 75
- 239000011159 matrix material Substances 0.000 description 68
- 125000003729 nucleotide group Chemical group 0.000 description 59
- 239000002773 nucleotide Substances 0.000 description 55
- 238000005070 sampling Methods 0.000 description 49
- 150000007523 nucleic acids Chemical class 0.000 description 44
- 239000012530 fluid Substances 0.000 description 39
- 102000039446 nucleic acids Human genes 0.000 description 39
- 108020004707 nucleic acids Proteins 0.000 description 39
- 238000003062 neural network model Methods 0.000 description 33
- 239000000758 substrate Substances 0.000 description 31
- 108020004414 DNA Proteins 0.000 description 30
- 102000053602 DNA Human genes 0.000 description 30
- 230000002123 temporal effect Effects 0.000 description 29
- 229920001519 homopolymer Polymers 0.000 description 28
- 239000000126 substance Substances 0.000 description 27
- 230000003321 amplification Effects 0.000 description 24
- 238000003199 nucleic acid amplification method Methods 0.000 description 24
- 238000005286 illumination Methods 0.000 description 23
- 238000001514 detection method Methods 0.000 description 21
- 238000003491 array Methods 0.000 description 20
- 239000000523 sample Substances 0.000 description 20
- 238000004458 analytical method Methods 0.000 description 19
- 230000005284 excitation Effects 0.000 description 19
- 239000007787 solid Substances 0.000 description 19
- 238000012549 training Methods 0.000 description 19
- 238000004891 communication Methods 0.000 description 17
- 238000010586 diagram Methods 0.000 description 17
- 239000000463 material Substances 0.000 description 17
- 238000013139 quantization Methods 0.000 description 16
- 230000008859 change Effects 0.000 description 15
- 239000000243 solution Substances 0.000 description 15
- 238000005516 engineering process Methods 0.000 description 13
- 238000004166 bioassay Methods 0.000 description 11
- 102000040430 polynucleotide Human genes 0.000 description 11
- 108091033319 polynucleotide Proteins 0.000 description 11
- 239000002157 polynucleotide Substances 0.000 description 11
- 108091028043 Nucleic acid sequence Proteins 0.000 description 10
- 239000011324 bead Substances 0.000 description 10
- 238000013135 deep learning Methods 0.000 description 10
- 238000003384 imaging method Methods 0.000 description 10
- 102000004169 proteins and genes Human genes 0.000 description 10
- 108090000623 proteins and genes Proteins 0.000 description 10
- 230000004044 response Effects 0.000 description 10
- 238000013459 approach Methods 0.000 description 9
- 238000004422 calculation algorithm Methods 0.000 description 9
- 239000003153 chemical reaction reagent Substances 0.000 description 9
- 238000002161 passivation Methods 0.000 description 9
- 230000002441 reversible effect Effects 0.000 description 9
- 229920002477 rna polymer Polymers 0.000 description 9
- 230000003287 optical effect Effects 0.000 description 8
- 239000000203 mixture Substances 0.000 description 7
- 238000012384 transportation and delivery Methods 0.000 description 7
- 241000588626 Acinetobacter baumannii Species 0.000 description 6
- 241000193755 Bacillus cereus Species 0.000 description 6
- 241000894006 Bacteria Species 0.000 description 6
- 102000004190 Enzymes Human genes 0.000 description 6
- 108090000790 Enzymes Proteins 0.000 description 6
- 108091034117 Oligonucleotide Proteins 0.000 description 6
- 238000003556 assay Methods 0.000 description 6
- 238000013527 convolutional neural network Methods 0.000 description 6
- 230000000694 effects Effects 0.000 description 6
- 150000002500 ions Chemical class 0.000 description 6
- 238000003786 synthesis reaction Methods 0.000 description 6
- 230000001133 acceleration Effects 0.000 description 5
- 238000001444 catalytic combustion detection Methods 0.000 description 5
- 239000012634 fragment Substances 0.000 description 5
- 239000011521 glass Substances 0.000 description 5
- 238000010348 incorporation Methods 0.000 description 5
- 239000012528 membrane Substances 0.000 description 5
- 238000002360 preparation method Methods 0.000 description 5
- 239000002699 waste material Substances 0.000 description 5
- OPTASPLRGRRNAP-UHFFFAOYSA-N cytosine Chemical compound NC=1C=CNC(=O)N=1 OPTASPLRGRRNAP-UHFFFAOYSA-N 0.000 description 4
- UYTPUPDQBNUYGX-UHFFFAOYSA-N guanine Chemical compound O=C1NC(N)=NC2=C1N=CN2 UYTPUPDQBNUYGX-UHFFFAOYSA-N 0.000 description 4
- 230000036541 health Effects 0.000 description 4
- 230000003993 interaction Effects 0.000 description 4
- 239000007788 liquid Substances 0.000 description 4
- 238000007726 management method Methods 0.000 description 4
- 238000005457 optimization Methods 0.000 description 4
- -1 polypropylene Polymers 0.000 description 4
- 239000011148 porous material Substances 0.000 description 4
- 239000000376 reactant Substances 0.000 description 4
- RWQNBRDOKXIBIV-UHFFFAOYSA-N thymine Chemical compound CC1=CNC(=O)NC1=O RWQNBRDOKXIBIV-UHFFFAOYSA-N 0.000 description 4
- 108091093088 Amplicon Proteins 0.000 description 3
- 108010014303 DNA-directed DNA polymerase Proteins 0.000 description 3
- 102000016928 DNA-directed DNA polymerase Human genes 0.000 description 3
- XUIMIQQOPSSXEZ-UHFFFAOYSA-N Silicon Chemical compound [Si] XUIMIQQOPSSXEZ-UHFFFAOYSA-N 0.000 description 3
- 238000007792 addition Methods 0.000 description 3
- 230000015572 biosynthetic process Effects 0.000 description 3
- 239000007853 buffer solution Substances 0.000 description 3
- 239000003086 colorant Substances 0.000 description 3
- 239000003814 drug Substances 0.000 description 3
- 125000000524 functional group Chemical group 0.000 description 3
- 239000000499 gel Substances 0.000 description 3
- 238000004519 manufacturing process Methods 0.000 description 3
- 230000002093 peripheral effect Effects 0.000 description 3
- 230000005855 radiation Effects 0.000 description 3
- 229910052710 silicon Inorganic materials 0.000 description 3
- 239000010703 silicon Substances 0.000 description 3
- 238000012546 transfer Methods 0.000 description 3
- 230000001131 transforming effect Effects 0.000 description 3
- 229930024421 Adenine Natural products 0.000 description 2
- GFFGJBXGBJISGV-UHFFFAOYSA-N Adenine Chemical compound NC1=NC=NC2=C1N=CN2 GFFGJBXGBJISGV-UHFFFAOYSA-N 0.000 description 2
- 108091028732 Concatemer Proteins 0.000 description 2
- 206010028980 Neoplasm Diseases 0.000 description 2
- 238000012952 Resampling Methods 0.000 description 2
- 229910052581 Si3N4 Inorganic materials 0.000 description 2
- VYPSYNLAJGMNEJ-UHFFFAOYSA-N Silicium dioxide Chemical compound O=[Si]=O VYPSYNLAJGMNEJ-UHFFFAOYSA-N 0.000 description 2
- ISAKRJDGNUQOIC-UHFFFAOYSA-N Uracil Chemical compound O=C1C=CNC(=O)N1 ISAKRJDGNUQOIC-UHFFFAOYSA-N 0.000 description 2
- JLCPHMBAVCMARE-UHFFFAOYSA-N [3-[[3-[[3-[[3-[[3-[[3-[[3-[[3-[[3-[[3-[[3-[[5-(2-amino-6-oxo-1H-purin-9-yl)-3-[[3-[[3-[[3-[[3-[[3-[[5-(2-amino-6-oxo-1H-purin-9-yl)-3-[[5-(2-amino-6-oxo-1H-purin-9-yl)-3-hydroxyoxolan-2-yl]methoxy-hydroxyphosphoryl]oxyoxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(5-methyl-2,4-dioxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(6-aminopurin-9-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(6-aminopurin-9-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(6-aminopurin-9-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(6-aminopurin-9-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxyoxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(5-methyl-2,4-dioxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(4-amino-2-oxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(5-methyl-2,4-dioxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(5-methyl-2,4-dioxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(6-aminopurin-9-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(6-aminopurin-9-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(4-amino-2-oxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(4-amino-2-oxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(4-amino-2-oxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(6-aminopurin-9-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(4-amino-2-oxopyrimidin-1-yl)oxolan-2-yl]methyl [5-(6-aminopurin-9-yl)-2-(hydroxymethyl)oxolan-3-yl] hydrogen phosphate Polymers Cc1cn(C2CC(OP(O)(=O)OCC3OC(CC3OP(O)(=O)OCC3OC(CC3O)n3cnc4c3nc(N)[nH]c4=O)n3cnc4c3nc(N)[nH]c4=O)C(COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3CO)n3cnc4c(N)ncnc34)n3ccc(N)nc3=O)n3cnc4c(N)ncnc34)n3ccc(N)nc3=O)n3ccc(N)nc3=O)n3ccc(N)nc3=O)n3cnc4c(N)ncnc34)n3cnc4c(N)ncnc34)n3cc(C)c(=O)[nH]c3=O)n3cc(C)c(=O)[nH]c3=O)n3ccc(N)nc3=O)n3cc(C)c(=O)[nH]c3=O)n3cnc4c3nc(N)[nH]c4=O)n3cnc4c(N)ncnc34)n3cnc4c(N)ncnc34)n3cnc4c(N)ncnc34)n3cnc4c(N)ncnc34)O2)c(=O)[nH]c1=O JLCPHMBAVCMARE-UHFFFAOYSA-N 0.000 description 2
- 230000004913 activation Effects 0.000 description 2
- 229960000643 adenine Drugs 0.000 description 2
- 230000000712 assembly Effects 0.000 description 2
- 238000000429 assembly Methods 0.000 description 2
- 238000005415 bioluminescence Methods 0.000 description 2
- 230000029918 bioluminescence Effects 0.000 description 2
- 239000000872 buffer Substances 0.000 description 2
- 201000011510 cancer Diseases 0.000 description 2
- 238000012512 characterization method Methods 0.000 description 2
- 238000001311 chemical methods and process Methods 0.000 description 2
- 229940104302 cytosine Drugs 0.000 description 2
- 230000001419 dependent effect Effects 0.000 description 2
- 238000013461 design Methods 0.000 description 2
- 230000008030 elimination Effects 0.000 description 2
- 238000003379 elimination reaction Methods 0.000 description 2
- 238000000605 extraction Methods 0.000 description 2
- 230000002068 genetic effect Effects 0.000 description 2
- 238000013090 high-throughput technology Methods 0.000 description 2
- 239000000017 hydrogel Substances 0.000 description 2
- 230000003100 immobilizing effect Effects 0.000 description 2
- 230000006872 improvement Effects 0.000 description 2
- 238000005259 measurement Methods 0.000 description 2
- 230000007246 mechanism Effects 0.000 description 2
- 230000004048 modification Effects 0.000 description 2
- 238000012986 modification Methods 0.000 description 2
- 230000001537 neural effect Effects 0.000 description 2
- 239000004033 plastic Substances 0.000 description 2
- 229920003023 plastic Polymers 0.000 description 2
- 229920002401 polyacrylamide Polymers 0.000 description 2
- 229920000642 polymer Polymers 0.000 description 2
- 238000011160 research Methods 0.000 description 2
- HQVNEWCFYHHQES-UHFFFAOYSA-N silicon nitride Chemical compound N12[Si]34N5[Si]62N3[Si]51N64 HQVNEWCFYHHQES-UHFFFAOYSA-N 0.000 description 2
- 238000001308 synthesis method Methods 0.000 description 2
- 229940113082 thymine Drugs 0.000 description 2
- 239000012780 transparent material Substances 0.000 description 2
- 230000000007 visual effect Effects 0.000 description 2
- 235000012431 wafers Nutrition 0.000 description 2
- PHIYHIOQVWTXII-UHFFFAOYSA-N 3-amino-1-phenylpropan-1-ol Chemical compound NCCC(O)C1=CC=CC=C1 PHIYHIOQVWTXII-UHFFFAOYSA-N 0.000 description 1
- 240000001436 Antirrhinum majus Species 0.000 description 1
- 238000012935 Averaging Methods 0.000 description 1
- 238000001712 DNA sequencing Methods 0.000 description 1
- 229920002307 Dextran Polymers 0.000 description 1
- 206010013710 Drug interaction Diseases 0.000 description 1
- 241000196324 Embryophyta Species 0.000 description 1
- 108700028146 Genetic Enhancer Elements Proteins 0.000 description 1
- 102100034343 Integrase Human genes 0.000 description 1
- 102000003960 Ligases Human genes 0.000 description 1
- 108090000364 Ligases Proteins 0.000 description 1
- 238000007476 Maximum Likelihood Methods 0.000 description 1
- 241001465754 Metazoa Species 0.000 description 1
- 239000004743 Polypropylene Substances 0.000 description 1
- 239000004793 Polystyrene Substances 0.000 description 1
- 108010092799 RNA-directed DNA polymerase Proteins 0.000 description 1
- 241000191025 Rhodobacter Species 0.000 description 1
- 108091028664 Ribonucleotide Proteins 0.000 description 1
- 230000009471 action Effects 0.000 description 1
- 230000006978 adaptation Effects 0.000 description 1
- 230000003044 adaptive effect Effects 0.000 description 1
- 239000012790 adhesive layer Substances 0.000 description 1
- 239000011543 agarose gel Substances 0.000 description 1
- 230000002776 aggregation Effects 0.000 description 1
- 238000004220 aggregation Methods 0.000 description 1
- 230000009435 amidation Effects 0.000 description 1
- 238000007112 amidation reaction Methods 0.000 description 1
- 238000000137 annealing Methods 0.000 description 1
- 230000000692 anti-sense effect Effects 0.000 description 1
- 238000009360 aquaculture Methods 0.000 description 1
- 244000144974 aquaculture Species 0.000 description 1
- 238000013473 artificial intelligence Methods 0.000 description 1
- 230000008901 benefit Effects 0.000 description 1
- 239000003012 bilayer membrane Substances 0.000 description 1
- 230000008827 biological function Effects 0.000 description 1
- 230000005540 biological transmission Effects 0.000 description 1
- 238000004364 calculation method Methods 0.000 description 1
- 238000006555 catalytic reaction Methods 0.000 description 1
- 238000007385 chemical modification Methods 0.000 description 1
- 239000003795 chemical substances by application Substances 0.000 description 1
- 238000004140 cleaning Methods 0.000 description 1
- 239000011248 coating agent Substances 0.000 description 1
- 238000000576 coating method Methods 0.000 description 1
- 230000000295 complement effect Effects 0.000 description 1
- 239000002299 complementary DNA Substances 0.000 description 1
- 238000010276 construction Methods 0.000 description 1
- 230000001276 controlling effect Effects 0.000 description 1
- 238000001816 cooling Methods 0.000 description 1
- 238000012937 correction Methods 0.000 description 1
- 238000013144 data compression Methods 0.000 description 1
- 230000018044 dehydration Effects 0.000 description 1
- 238000006297 dehydration reaction Methods 0.000 description 1
- 239000005547 deoxyribonucleotide Substances 0.000 description 1
- 125000002637 deoxyribonucleotide group Chemical group 0.000 description 1
- 238000011161 development Methods 0.000 description 1
- 238000009792 diffusion process Methods 0.000 description 1
- 238000007865 diluting Methods 0.000 description 1
- 238000010494 dissociation reaction Methods 0.000 description 1
- 208000018459 dissociative disease Diseases 0.000 description 1
- 238000009826 distribution Methods 0.000 description 1
- 229940079593 drug Drugs 0.000 description 1
- 239000000975 dye Substances 0.000 description 1
- 230000007613 environmental effect Effects 0.000 description 1
- 230000002255 enzymatic effect Effects 0.000 description 1
- 230000032050 esterification Effects 0.000 description 1
- 238000005886 esterification reaction Methods 0.000 description 1
- 238000006266 etherification reaction Methods 0.000 description 1
- 238000011156 evaluation Methods 0.000 description 1
- 230000007717 exclusion Effects 0.000 description 1
- 238000001914 filtration Methods 0.000 description 1
- 239000007850 fluorescent dye Substances 0.000 description 1
- 239000007789 gas Substances 0.000 description 1
- 230000014509 gene expression Effects 0.000 description 1
- PCHJSUWPFVWCPO-UHFFFAOYSA-N gold Chemical compound [Au] PCHJSUWPFVWCPO-UHFFFAOYSA-N 0.000 description 1
- 239000010931 gold Substances 0.000 description 1
- 229910052737 gold Inorganic materials 0.000 description 1
- 238000010438 heat treatment Methods 0.000 description 1
- 239000001257 hydrogen Substances 0.000 description 1
- 229910052739 hydrogen Inorganic materials 0.000 description 1
- 230000002209 hydrophobic effect Effects 0.000 description 1
- 125000002887 hydroxy group Chemical group [H]O* 0.000 description 1
- 238000011065 in-situ storage Methods 0.000 description 1
- 230000010354 integration Effects 0.000 description 1
- 239000004816 latex Substances 0.000 description 1
- 229920000126 latex Polymers 0.000 description 1
- 239000003446 ligand Substances 0.000 description 1
- 239000004973 liquid crystal related substance Substances 0.000 description 1
- 238000007477 logistic regression Methods 0.000 description 1
- 229940052961 longrange Drugs 0.000 description 1
- 238000004020 luminiscence type Methods 0.000 description 1
- 238000010801 machine learning Methods 0.000 description 1
- 238000002493 microarray Methods 0.000 description 1
- 230000000813 microbial effect Effects 0.000 description 1
- 238000002156 mixing Methods 0.000 description 1
- 230000000877 morphologic effect Effects 0.000 description 1
- 230000008450 motivation Effects 0.000 description 1
- 238000007899 nucleic acid hybridization Methods 0.000 description 1
- 230000005257 nucleotidylation Effects 0.000 description 1
- 230000008520 organization Effects 0.000 description 1
- 230000003647 oxidation Effects 0.000 description 1
- 238000007254 oxidation reaction Methods 0.000 description 1
- 230000001717 pathogenic effect Effects 0.000 description 1
- 239000008188 pellet Substances 0.000 description 1
- 230000002085 persistent effect Effects 0.000 description 1
- 230000002974 pharmacogenomic effect Effects 0.000 description 1
- 230000026731 phosphorylation Effects 0.000 description 1
- 238000006366 phosphorylation reaction Methods 0.000 description 1
- 230000000704 physical effect Effects 0.000 description 1
- 238000006116 polymerization reaction Methods 0.000 description 1
- 229920001184 polypeptide Polymers 0.000 description 1
- 229920001155 polypropylene Polymers 0.000 description 1
- 229920002223 polystyrene Polymers 0.000 description 1
- 239000000843 powder Substances 0.000 description 1
- 238000007781 pre-processing Methods 0.000 description 1
- 102000004196 processed proteins & peptides Human genes 0.000 description 1
- 108090000765 processed proteins & peptides Proteins 0.000 description 1
- 238000003672 processing method Methods 0.000 description 1
- 230000001902 propagating effect Effects 0.000 description 1
- 239000002096 quantum dot Substances 0.000 description 1
- 239000011541 reaction mixture Substances 0.000 description 1
- 238000010223 real-time analysis Methods 0.000 description 1
- 230000008707 rearrangement Effects 0.000 description 1
- 102000005962 receptors Human genes 0.000 description 1
- 108020003175 receptors Proteins 0.000 description 1
- 230000009467 reduction Effects 0.000 description 1
- 238000006722 reduction reaction Methods 0.000 description 1
- 230000001105 regulatory effect Effects 0.000 description 1
- 230000010076 replication Effects 0.000 description 1
- 238000002165 resonance energy transfer Methods 0.000 description 1
- 239000002336 ribonucleotide Substances 0.000 description 1
- 125000002652 ribonucleotide group Chemical group 0.000 description 1
- 238000007363 ring formation reaction Methods 0.000 description 1
- 238000012502 risk assessment Methods 0.000 description 1
- 238000005096 rolling process Methods 0.000 description 1
- 238000005464 sample preparation method Methods 0.000 description 1
- 238000005204 segregation Methods 0.000 description 1
- 239000004065 semiconductor Substances 0.000 description 1
- 238000000926 separation method Methods 0.000 description 1
- 238000007841 sequencing by ligation Methods 0.000 description 1
- 239000000377 silicon dioxide Substances 0.000 description 1
- 238000001179 sorption measurement Methods 0.000 description 1
- 241000894007 species Species 0.000 description 1
- 238000001228 spectrum Methods 0.000 description 1
- 238000010561 standard procedure Methods 0.000 description 1
- 238000006467 substitution reaction Methods 0.000 description 1
- 230000002194 synthesizing effect Effects 0.000 description 1
- 238000012360 testing method Methods 0.000 description 1
- 230000009466 transformation Effects 0.000 description 1
- 230000007704 transition Effects 0.000 description 1
- 125000002264 triphosphate group Chemical group [H]OP(=O)(O[H])OP(=O)(O[H])OP(=O)(O[H])O* 0.000 description 1
- 229940035893 uracil Drugs 0.000 description 1
- 238000010200 validation analysis Methods 0.000 description 1
- 238000012795 verification Methods 0.000 description 1
- 238000005406 washing Methods 0.000 description 1
Classifications
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B—BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B40/00—ICT specially adapted for biostatistics; ICT specially adapted for bioinformatics-related machine learning or data mining, e.g. knowledge discovery or pattern finding
- G16B40/10—Signal processing, e.g. from mass spectrometry [MS] or from PCR
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B—BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B30/00—ICT specially adapted for sequence analysis involving nucleotides or amino acids
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Q—MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
- C12Q1/00—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
- C12Q1/68—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
- C12Q1/6869—Methods for sequencing
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/0464—Convolutional networks [CNN, ConvNet]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/084—Backpropagation, e.g. using gradient descent
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/09—Supervised learning
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B—BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B40/00—ICT specially adapted for biostatistics; ICT specially adapted for bioinformatics-related machine learning or data mining, e.g. knowledge discovery or pattern finding
- G16B40/20—Supervised data analysis
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Life Sciences & Earth Sciences (AREA)
- Health & Medical Sciences (AREA)
- Theoretical Computer Science (AREA)
- Biophysics (AREA)
- General Health & Medical Sciences (AREA)
- Data Mining & Analysis (AREA)
- Medical Informatics (AREA)
- Evolutionary Computation (AREA)
- Software Systems (AREA)
- Artificial Intelligence (AREA)
- Molecular Biology (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Spectroscopy & Molecular Physics (AREA)
- Biotechnology (AREA)
- General Engineering & Computer Science (AREA)
- Chemical & Material Sciences (AREA)
- Bioinformatics & Computational Biology (AREA)
- Evolutionary Biology (AREA)
- Computing Systems (AREA)
- Mathematical Physics (AREA)
- General Physics & Mathematics (AREA)
- Computational Linguistics (AREA)
- Biomedical Technology (AREA)
- Proteomics, Peptides & Aminoacids (AREA)
- Analytical Chemistry (AREA)
- Public Health (AREA)
- Epidemiology (AREA)
- Databases & Information Systems (AREA)
- Bioethics (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Organic Chemistry (AREA)
- Zoology (AREA)
- Wood Science & Technology (AREA)
- Genetics & Genomics (AREA)
- Biochemistry (AREA)
- Microbiology (AREA)
- Immunology (AREA)
- Signal Processing (AREA)
Applications Claiming Priority (5)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US202163226707P | 2021-07-28 | 2021-07-28 | |
US63/226,707 | 2021-07-28 | ||
US17/839,387 | 2022-06-13 | ||
US17/839,387 US20230029970A1 (en) | 2021-07-28 | 2022-06-13 | Quality score calibration of basecalling systems |
PCT/US2022/038729 WO2023009758A1 (en) | 2021-07-28 | 2022-07-28 | Quality score calibration of basecalling systems |
Publications (1)
Publication Number | Publication Date |
---|---|
AU2022319125A1 true AU2022319125A1 (en) | 2024-01-18 |
Family
ID=83149575
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
AU2022319125A Pending AU2022319125A1 (en) | 2021-07-28 | 2022-07-28 | Quality score calibration of basecalling systems |
Country Status (7)
Country | Link |
---|---|
EP (1) | EP4377960A1 (ko) |
JP (1) | JP2024532049A (ko) |
KR (1) | KR20240037882A (ko) |
AU (1) | AU2022319125A1 (ko) |
CA (1) | CA3223746A1 (ko) |
IL (1) | IL309786A (ko) |
WO (1) | WO2023009758A1 (ko) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN118053503A (zh) * | 2024-01-11 | 2024-05-17 | 中国农业科学院农业基因组研究所 | 一种入侵生物多组学数据库构建方法及系统 |
Family Cites Families (32)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CA2044616A1 (en) | 1989-10-26 | 1991-04-27 | Roger Y. Tsien | Dna sequencing |
US5641658A (en) | 1994-08-03 | 1997-06-24 | Mosaic Technologies, Inc. | Method for performing amplification of nucleic acid with two primers bound to a single solid support |
US6090592A (en) | 1994-08-03 | 2000-07-18 | Mosaic Technologies, Inc. | Method for performing amplification of nucleic acid on supports |
WO1998044152A1 (en) | 1997-04-01 | 1998-10-08 | Glaxo Group Limited | Method of nucleic acid sequencing |
EP3034626A1 (en) | 1997-04-01 | 2016-06-22 | Illumina Cambridge Limited | Method of nucleic acid sequencing |
AR021833A1 (es) | 1998-09-30 | 2002-08-07 | Applied Research Systems | Metodos de amplificacion y secuenciacion de acido nucleico |
US20020150909A1 (en) | 1999-02-09 | 2002-10-17 | Stuelpnagel John R. | Automated information processing in randomly ordered arrays |
AU2001282881B2 (en) | 2000-07-07 | 2007-06-14 | Visigen Biotechnologies, Inc. | Real-time sequence determination |
WO2002044425A2 (en) | 2000-12-01 | 2002-06-06 | Visigen Biotechnologies, Inc. | Enzymatic nucleic acid synthesis: compositions and methods for altering monomer incorporation fidelity |
AR031640A1 (es) | 2000-12-08 | 2003-09-24 | Applied Research Systems | Amplificacion isotermica de acidos nucleicos en un soporte solido |
US7057026B2 (en) | 2001-12-04 | 2006-06-06 | Solexa Limited | Labelled nucleotides |
US20040002090A1 (en) | 2002-03-05 | 2004-01-01 | Pascal Mayer | Methods for detecting genome-wide sequence variations associated with a phenotype |
SI3363809T1 (sl) | 2002-08-23 | 2020-08-31 | Illumina Cambridge Limited | Modificirani nukleotidi za polinukleotidno sekvenciranje |
PT3147292T (pt) | 2002-08-23 | 2018-11-22 | Illumina Cambridge Ltd | Nucleótidos identificados |
GB0321306D0 (en) | 2003-09-11 | 2003-10-15 | Solexa Ltd | Modified polymerases for improved incorporation of nucleotide analogues |
EP2789383B1 (en) | 2004-01-07 | 2023-05-03 | Illumina Cambridge Limited | Molecular arrays |
EP3415641B1 (en) | 2004-09-17 | 2023-11-01 | Pacific Biosciences Of California, Inc. | Method for analysis of molecules |
EP1828412B2 (en) | 2004-12-13 | 2019-01-09 | Illumina Cambridge Limited | Improved method of nucleotide detection |
JP4990886B2 (ja) | 2005-05-10 | 2012-08-01 | ソレックサ リミテッド | 改良ポリメラーゼ |
US8045998B2 (en) | 2005-06-08 | 2011-10-25 | Cisco Technology, Inc. | Method and system for communicating using position information |
GB0514936D0 (en) | 2005-07-20 | 2005-08-24 | Solexa Ltd | Preparation of templates for nucleic acid sequencing |
GB0517097D0 (en) | 2005-08-19 | 2005-09-28 | Solexa Ltd | Modified nucleosides and nucleotides and uses thereof |
US7405281B2 (en) | 2005-09-29 | 2008-07-29 | Pacific Biosciences Of California, Inc. | Fluorescent nucleotide analogs and uses therefor |
GB0522310D0 (en) | 2005-11-01 | 2005-12-07 | Solexa Ltd | Methods of preparing libraries of template polynucleotides |
EP2021503A1 (en) | 2006-03-17 | 2009-02-11 | Solexa Ltd. | Isothermal methods for creating clonal single molecule arrays |
EP4105644A3 (en) | 2006-03-31 | 2022-12-28 | Illumina, Inc. | Systems and devices for sequence by synthesis analysis |
US20080242560A1 (en) | 2006-11-21 | 2008-10-02 | Gunderson Kevin L | Methods for generating amplified nucleic acid arrays |
US7595882B1 (en) | 2008-04-14 | 2009-09-29 | Geneal Electric Company | Hollow-core waveguide-based raman systems and methods |
PT3623481T (pt) | 2011-09-23 | 2021-10-15 | Illumina Inc | Composições para sequenciação de ácidos nucleicos |
US20160110499A1 (en) * | 2014-10-21 | 2016-04-21 | Life Technologies Corporation | Methods, systems, and computer-readable media for blind deconvolution dephasing of nucleic acid sequencing data |
CN111971748A (zh) * | 2018-01-26 | 2020-11-20 | 宽腾矽公司 | 用于测序装置的机器学习使能脉冲及碱基判定 |
US11783917B2 (en) * | 2019-03-21 | 2023-10-10 | Illumina, Inc. | Artificial intelligence-based base calling |
-
2022
- 2022-07-28 AU AU2022319125A patent/AU2022319125A1/en active Pending
- 2022-07-28 KR KR1020237043770A patent/KR20240037882A/ko unknown
- 2022-07-28 EP EP22761681.0A patent/EP4377960A1/en active Pending
- 2022-07-28 JP JP2023579782A patent/JP2024532049A/ja active Pending
- 2022-07-28 WO PCT/US2022/038729 patent/WO2023009758A1/en active Application Filing
- 2022-07-28 IL IL309786A patent/IL309786A/en unknown
- 2022-07-28 CA CA3223746A patent/CA3223746A1/en active Pending
Also Published As
Publication number | Publication date |
---|---|
WO2023009758A1 (en) | 2023-02-02 |
IL309786A (en) | 2024-02-01 |
EP4377960A1 (en) | 2024-06-05 |
KR20240037882A (ko) | 2024-03-22 |
CA3223746A1 (en) | 2023-02-02 |
JP2024532049A (ja) | 2024-09-05 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
EP4107737B1 (en) | Knowledge distillation and gradient pruning-based compression of artificial intelligence-based base caller | |
US20220300811A1 (en) | Neural network parameter quantization for base calling | |
CN115136244A (zh) | 基于人工智能的多对多碱基判读 | |
AU2022319125A1 (en) | Quality score calibration of basecalling systems | |
US20230041989A1 (en) | Base calling using multiple base caller models | |
US20230029970A1 (en) | Quality score calibration of basecalling systems | |
US20230026084A1 (en) | Self-learned base caller, trained using organism sequences | |
US20220415445A1 (en) | Self-learned base caller, trained using oligo sequences | |
CN117529780A (zh) | 碱基检出系统的质量分数校准 | |
KR20230157230A (ko) | 염기 호출을 위한 타일 위치 및/또는 사이클 기반 가중치 세트 선택 | |
KR20240027608A (ko) | 유기체 서열을 사용하여 훈련된 자체-학습 염기 호출자 | |
JP2024529843A (ja) | 複数のベースコーラモデルを使用するベースコール | |
WO2022197752A1 (en) | Tile location and/or cycle based weight set selection for base calling | |
CN117546248A (zh) | 使用多个碱基检出器模型的碱基检出 | |
CN117546249A (zh) | 使用寡核苷酸序列训练的自学碱基检出器 | |
EP4405955A1 (en) | Compressed state-based base calling |