KR101977976B1 - Method for increasing accuracy of analysis by removing primer sequence in amplicon-based next-generation sequencing - Google Patents
- Publication number
- KR101977976B1
- Authority
- KR
- South Korea
- Prior art keywords
- sequence
- read
- primer
- sample
- primer sequence
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B—BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B30/00—ICT specially adapted for sequence analysis involving nucleotides or amino acids
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B—BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B40/00—ICT specially adapted for biostatistics; ICT specially adapted for bioinformatics-related machine learning or data mining, e.g. knowledge discovery or pattern finding
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Q—MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
- C12Q1/00—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
- C12Q1/68—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
- C12Q1/6844—Nucleic acid amplification reactions
- C12Q1/6853—Nucleic acid amplification reactions using modified primers or templates
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Q—MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
- C12Q1/00—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
- C12Q1/68—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
- C12Q1/6869—Methods for sequencing
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B—BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B25/00—ICT specially adapted for hybridisation; ICT specially adapted for gene or protein expression
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B—BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B25/00—ICT specially adapted for hybridisation; ICT specially adapted for gene or protein expression
- G16B25/20—Polymerase chain reaction [PCR]; Primer or probe design; Probe optimisation
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B—BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B40/00—ICT specially adapted for biostatistics; ICT specially adapted for bioinformatics-related machine learning or data mining, e.g. knowledge discovery or pattern finding
- G16B40/30—Unsupervised data analysis
Landscapes
- Life Sciences & Earth Sciences (AREA)
- Health & Medical Sciences (AREA)
- Physics & Mathematics (AREA)
- Engineering & Computer Science (AREA)
- Chemical & Material Sciences (AREA)
- Medical Informatics (AREA)
- Biophysics (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Biotechnology (AREA)
- General Health & Medical Sciences (AREA)
- Bioinformatics & Computational Biology (AREA)
- Evolutionary Biology (AREA)
- Organic Chemistry (AREA)
- Spectroscopy & Molecular Physics (AREA)
- Theoretical Computer Science (AREA)
- Proteomics, Peptides & Aminoacids (AREA)
- Molecular Biology (AREA)
- Genetics & Genomics (AREA)
- Zoology (AREA)
- Wood Science & Technology (AREA)
- Data Mining & Analysis (AREA)
- Analytical Chemistry (AREA)
- Bioethics (AREA)
- Evolutionary Computation (AREA)
- Software Systems (AREA)
- Public Health (AREA)
- Artificial Intelligence (AREA)
- Epidemiology (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Databases & Information Systems (AREA)
- Immunology (AREA)
- Microbiology (AREA)
- Biochemistry (AREA)
- General Engineering & Computer Science (AREA)
- Chemical Kinetics & Catalysis (AREA)
- Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)
Abstract
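The method according to the present invention for increasing the accuracy of read data analysis by removing primers in amplicon-based next-generation sequencing (NGS) analyzes data quickly and removes only the primer sequences precisely, and is therefore useful for increasing the efficiency and accuracy of read data analysis.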
Description
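Fig. 2(a) is a schematic diagram showing part of the arrangement of amplicons designed in the BRCA2 gene according to one embodiment of the present invention, and Fig. 2(b) shows part of the reads of (a) as actual sequences.
Fig. 3 shows combinations of amplicon primers according to one embodiment of the present invention.
Fig. 4 is a graph comparing the time taken to complete primer removal by the method of the present invention and by a previously known program.
Fig. 5 is a graph of the number of reads usable for analysis after primer removal by the method of the present invention and by a previously known program.
Fig. 6 shows the accuracy obtained by aligning the reads after primer removal by the method of the present invention and by a previously known program.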
Sample # | Read count | Sample # | Read count | Sample # | Read count |
Sample 1 | 38329 | Sample 9 | 36601 | Sample 17 | 43909 |
Sample 2 | 42871 | Sample 10 | 39747 | Sample 18 | 41652 |
Sample 3 | 38410 | Sample 11 | 35189 | Sample 19 | 46255 |
Sample 4 | 38881 | Sample 12 | 40638 | Sample 20 | 43950 |
Sample 5 | 40867 | Sample 13 | 41649 | Sample 21 | 50263 |
Sample 6 | 36741 | Sample 14 | 40010 | Sample 22 | 40038 |
Sample 7 | 39031 | Sample 15 | 31768 | Sample 23 | 49956 |
Sample 8 | 39541 | Sample 16 | 41566 | Sample 24 | 49082 |
Sample # | Read count | Sample # | Read count | Sample # | Read count |
Sample 1 | 36310 | Sample 9 | 34568 | Sample 17 | 41585 |
Sample 2 | 40414 | Sample 10 | 37438 | Sample 18 | 39466 |
Sample 3 | 36552 | Sample 11 | 33268 | Sample 19 | 43909 |
Sample 4 | 36807 | Sample 12 | 38278 | Sample 20 | 41498 |
Sample 5 | 38281 | Sample 13 | 38973 | Sample 21 | 47681 |
Sample 6 | 34406 | Sample 14 | 37417 | Sample 22 | 37799 |
Sample 7 | 36934 | Sample 15 | 30169 | Sample 23 | 47449 |
Sample 8 | 37460 | Sample 16 | 39332 | Sample 24 | 46518 |
Sample # | Read count | Sample # | Read count | Sample # | Read count |
Sample 1 | 248 | Sample 9 | 219 | Sample 17 | 291 |
Sample 2 | 285 | Sample 10 | 228 | Sample 18 | 284 |
Sample 3 | 224 | Sample 11 | 221 | Sample 19 | 291 |
Sample 4 | 259 | Sample 12 | 238 | Sample 20 | 304 |
Sample 5 | 274 | Sample 13 | 264 | Sample 21 | 311 |
Sample 6 | 232 | Sample 14 | 248 | Sample 22 | 242 |
Sample 7 | 238 | Sample 15 | 210 | Sample 23 | 296 |
Sample 8 | 277 | Sample 16 | 222 | Sample 24 | 299 |
Sample # | Read count | Sample # | Read count | Sample # | Read count |
Sample 1 | 41 | Sample 9 | 43 | Sample 17 | 48 |
Sample 2 | 53 | Sample 10 | 43 | Sample 18 | 43 |
Sample 3 | 37 | Sample 11 | 35 | Sample 19 | 52 |
Sample 4 | 39 | Sample 12 | 43 | Sample 20 | 42 |
Sample 5 | 51 | Sample 13 | 42 | Sample 21 | 51 |
Sample 6 | 35 | Sample 14 | 34 | Sample 22 | 38 |
Sample 7 | 37 | Sample 15 | 19 | Sample 23 | 40 |
Sample 8 | 55 | Sample 16 | 42 | Sample 24 | 52 |
Read | Pair1 | Pair2 | Save |
Read1 | BRCA2_10_07_FOR | BRCA2_10_07_REV | O |
Read2 | BRCA2_10_07_FOR | BRCA2_10_09_REV | X |
Read3 | BRCA2_10_07_REV | BRCA2_10_07_FOR | O |
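The save/discard decision in the table above can be sketched in a few lines of Python. This is a minimal illustration and not the patented implementation: it assumes that primer names end in `_FOR` or `_REV`, that everything before that suffix identifies the amplicon, and that "O" means the read pair is kept. The function names are hypothetical.

```python
from typing import Tuple


def split_primer_name(name: str) -> Tuple[str, str]:
    """Split 'BRCA2_10_07_FOR' into ('BRCA2_10_07', 'FOR')."""
    amplicon, _, direction = name.rpartition("_")
    return amplicon, direction


def is_valid_pair(primer1: str, primer2: str) -> bool:
    """Keep a read pair only if both primers belong to the same amplicon
    and one is the forward primer while the other is the reverse primer."""
    amp1, dir1 = split_primer_name(primer1)
    amp2, dir2 = split_primer_name(primer2)
    return amp1 == amp2 and {dir1, dir2} == {"FOR", "REV"}


# The three rows of the table above.
pairs = {
    "Read1": ("BRCA2_10_07_FOR", "BRCA2_10_07_REV"),
    "Read2": ("BRCA2_10_07_FOR", "BRCA2_10_09_REV"),
    "Read3": ("BRCA2_10_07_REV", "BRCA2_10_07_FOR"),
}
for read, (p1, p2) in pairs.items():
    print(read, "O" if is_valid_pair(p1, p2) else "X")
```

Comparing the two directions as a set makes the check independent of which mate carries the forward primer, which is why Read3 is kept even though its primers appear in reverse order.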
Sample | cutadapt | Present invention |
sample1 | 238s | 57s |
sample2 | 294s | 70s |
sample3 | 360s | 63s |
sample4 | 234s | 63s |
sample5 | 372s | 64s |
sample6 | 224s | 58s |
sample7 | 236s | 65s |
sample8 | 242s | 66s |
sample9 | 220s | 58s |
sample10 | 234s | 65s |
sample11 | 207s | 55s |
sample12 | 244s | 76s |
sample13 | 248s | 79s |
sample14 | 243s | 73s |
sample15 | 190s | 50s |
sample16 | 258s | 74s |
sample17 | 265s | 81s |
sample18 | 260s | 76s |
sample19 | 274s | 98s |
sample20 | 264s | 87s |
sample21 | 303s | 98s |
sample22 | 241s | 66s |
sample23 | 306s | 97s |
sample24 | 303s | 95s |
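The per-sample times above can be condensed into a mean and an overall ratio. The sketch below only restates the table; the numbers are copied from it, and the variable names are illustrative.

```python
from statistics import mean

# Primer-removal times in seconds for samples 1-24, copied from the table above.
cutadapt_s = [238, 294, 360, 234, 372, 224, 236, 242,
              220, 234, 207, 244, 248, 243, 190, 258,
              265, 260, 274, 264, 303, 241, 306, 303]
invention_s = [57, 70, 63, 63, 64, 58, 65, 66,
               58, 65, 55, 76, 79, 73, 50, 74,
               81, 76, 98, 87, 98, 66, 97, 95]

mean_cutadapt = mean(cutadapt_s)
mean_invention = mean(invention_s)
# Ratio of total running times across all 24 samples.
speedup = sum(cutadapt_s) / sum(invention_s)

print(f"cutadapt mean time: {mean_cutadapt:.2f} s")
print(f"present invention mean time: {mean_invention:.2f} s")
print(f"overall ratio: {speedup:.2f}x")
```

The ratio of totals is used rather than the mean of per-sample ratios so that no single sample dominates the summary.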
Sample | cutadapt | % of raw reads | Present invention | % of raw reads |
sample1 | 35036 | 91.409% | 36419 | 95.017% |
sample2 | 39092 | 91.185% | 40752 | 95.057% |
sample3 | 34896 | 90.851% | 36813 | 95.842% |
sample4 | 35406 | 91.062% | 37105 | 95.432% |
sample5 | 37260 | 91.174% | 38606 | 94.467% |
sample6 | 33463 | 91.078% | 34673 | 94.371% |
sample7 | 35823 | 91.781% | 37209 | 95.332% |
sample8 | 36024 | 91.105% | 37792 | 95.577% |
sample9 | 33242 | 90.823% | 34830 | 95.161% |
sample10 | 35851 | 90.198% | 37709 | 94.873% |
sample11 | 31867 | 90.560% | 33524 | 95.268% |
sample12 | 36886 | 90.767% | 38559 | 94.884% |
sample13 | 37757 | 90.655% | 39279 | 94.310% |
sample14 | 36404 | 90.987% | 37699 | 94.224% |
sample15 | 28907 | 90.994% | 30398 | 95.687% |
sample16 | 37932 | 91.257% | 39596 | 95.261% |
sample17 | 39847 | 90.749% | 41924 | 95.479% |
sample18 | 37623 | 90.327% | 39793 | 95.537% |
sample19 | 41852 | 90.481% | 44252 | 95.670% |
sample20 | 40095 | 91.229% | 41844 | 95.208% |
sample21 | 45740 | 91.001% | 48043 | 95.583% |
sample22 | 36428 | 90.984% | 38079 | 95.107% |
sample23 | 45304 | 90.688% | 47785 | 95.654% |
sample24 | 44659 | 90.989% | 46869 | 95.491% |
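The percentage columns can be reproduced by dividing each usable-read count by the sample's raw read count. The sketch below assumes that the first read-count table in the Description lists the raw reads per sample; the source does not label that table, but the percentages computed this way agree with the printed values for the samples shown. The helper name `percent_of_raw` is hypothetical.

```python
# sample: (raw reads, usable reads after cutadapt, usable reads after the present method)
# Raw counts are taken from the first read-count table (an assumption, see above).
counts = {
    "sample1": (38329, 35036, 36419),
    "sample2": (42871, 39092, 40752),
    "sample15": (31768, 28907, 30398),
}


def percent_of_raw(usable: int, raw: int) -> float:
    """Share of raw reads that remain usable after primer removal, in percent."""
    return 100.0 * usable / raw


for name, (raw, cutadapt, invention) in counts.items():
    print(f"{name}: cutadapt {percent_of_raw(cutadapt, raw):.3f}%, "
          f"present invention {percent_of_raw(invention, raw):.3f}%")
```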
Claims (13)
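- A method for increasing the accuracy of read data analysis through primer removal in amplicon-based next-generation sequencing, the method comprising:
(a) obtaining reads through amplicon-based next-generation sequencing;
(b) determining the primer sequence within each read sequence by analyzing the primer sequences and the read sequences; and
(c) removing the determined primer sequence,
wherein step (b) comprises:
(i) extracting read sequences that match a primer sequence perfectly;
(ii) extracting, from the read sequences not extracted in step (i), read sequences that match a primer sequence within a reference error value (%); and
(iii) determining, for the read sequences not extracted in step (ii), the primer sequence information of the read from primer sequence information internal to the read,
and wherein the primer sequence information internal to the read in step (iii) is information corresponding to the primer sequence of another read that is present inside the read sequence.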
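- (Deleted)
- The method of claim 1, wherein the read sequence in step (b) is one from which 1 to 65% of the 5' portion has been removed.
- The method of claim 1, wherein the sequence comparison in step (i) compares the 20 bp to 70 bp at the 5' end of the read sequence with the primer sequence to check whether they match.
- The method of claim 1, wherein the sequence comparison in step (i) uses the Aho-Corasick algorithm.
- The method of claim 1, wherein the reference error value (%) in step (ii) is 0.1% to 10%.
- (Deleted)
- The method of claim 1, wherein determining the primer sequence in step (b) comprises determining and storing the read information and the primer information when, in the sequence analysis results of a first read and a second read, the primers of the two reads match as Forward (5') and Reverse (3'), respectively.
- The method of claim 1, further comprising summarizing and reporting, over all read sequences, the ratio of reads for which a primer sequence was determined in step (b) to reads for which it was not.
- The method of claim 1, further comprising, when the next-generation sequencing is amplicon-based, reporting whether the data are abnormal on the basis of the amplicon yield results.
- The method of claim 10, wherein the amplicon yield results compare the amplicon yield predicted from the primer matching results of the test sample with the amplicon yield of the test sample relative to an actual control sample.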
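- A computer system comprising a computer-readable medium encoded with a plurality of instructions for controlling a computing system to perform primer sequence removal in next-generation sequencing (NGS) by a method comprising:
(a) obtaining reads through amplicon-based next-generation sequencing;
(b) determining the primer sequence within each read sequence by analyzing the primer sequences and the read sequences; and
(c) removing the determined primer sequence,
wherein step (b) comprises:
(i) extracting read sequences that match a primer sequence perfectly;
(ii) extracting, from the read sequences not extracted in step (i), read sequences that match a primer sequence within a reference error value (%); and
(iii) determining, for the read sequences not extracted in step (ii), the primer sequence information of the read from primer sequence information internal to the read.
- (Deleted)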
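The three-stage primer determination recited in step (b) of the claims, together with the determined/undetermined report, can be sketched as follows. This is one possible reading under stated assumptions, not the patented implementation: stage (i) uses a plain prefix check in place of the Aho-Corasick algorithm named in the claims, stage (ii) counts substitutions only (Hamming distance) against the 5' end of the read, and stage (iii) relies on a hypothetical lookup table `internal_hints` that maps a primer found inside a read to the primer the read should be assigned to. All names, sequences, and the default `max_error` of 10% (the upper end of the claimed 0.1% to 10% range) are illustrative.

```python
from typing import Dict, List, Optional, Tuple


def mismatches(a: str, b: str) -> int:
    """Number of differing positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def assign_primer(read: str, primers: Dict[str, str],
                  internal_hints: Dict[str, str],
                  max_error: float = 0.10) -> Optional[str]:
    """Return the name of the primer assigned to `read`, or None."""
    # Stage (i): perfect match at the 5' end of the read.
    for name, seq in primers.items():
        if read.startswith(seq):
            return name
    # Stage (ii): best match whose mismatch rate is within the error threshold.
    best_name, best_mm = None, None
    for name, seq in primers.items():
        window = read[:len(seq)]
        if len(window) < len(seq):
            continue  # read is shorter than the primer
        mm = mismatches(window, seq)
        if mm / len(seq) <= max_error and (best_mm is None or mm < best_mm):
            best_name, best_mm = name, mm
    if best_name is not None:
        return best_name
    # Stage (iii): look for another amplicon's primer inside the read and
    # use the (hypothetical) hint table to infer the read's own primer.
    for internal_name, owner in internal_hints.items():
        if read.find(primers[internal_name], 1) != -1:
            return owner
    return None


def summarize(reads: List[str], primers: Dict[str, str],
              internal_hints: Dict[str, str]) -> Tuple[int, int]:
    """Count reads with and without a determined primer."""
    assigned = [assign_primer(r, primers, internal_hints) for r in reads]
    determined = sum(a is not None for a in assigned)
    return determined, len(reads) - determined


# Toy data: two made-up primers; B_FOR inside a read implies the read is from A_FOR.
primers = {"A_FOR": "ACGTACGTAC", "B_FOR": "TTGGCCAATT"}
internal_hints = {"B_FOR": "A_FOR"}
reads = [
    "ACGTACGTACGGGG",            # stage (i): exact match
    "ACGTACGTAAGGGG",            # stage (ii): one mismatch in ten bases
    "NNNNNNNNNNGGTTGGCCAATTCC",  # stage (iii): internal primer only
    "GGGGGGGGGGGGGG",            # no primer determined
]
print(summarize(reads, primers, internal_hints))
```

Running the cheap exact check first means that the slower approximate and internal searches only see the reads that earlier stages could not resolve, which mirrors the order in which the claims narrow down the unassigned reads.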
Priority Applications (3)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
KR1020170101540A KR101977976B1 (ko) | 2017-08-10 | 2017-08-10 | Method for increasing accuracy of analysis by removing primer sequence in amplicon-based next-generation sequencing |
US16/637,880 US20200216888A1 (en) | 2017-08-10 | 2018-08-09 | Method for increasing accuracy of analysis by removing primer sequence in amplicon-based next-generation sequencing |
PCT/KR2018/009088 WO2019031867A1 (ko) | 2017-08-10 | 2018-08-09 | Method for increasing accuracy of analysis by removing primer sequence in amplicon-based next-generation sequencing |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
KR1020170101540A KR101977976B1 (ko) | 2017-08-10 | 2017-08-10 | Method for increasing accuracy of analysis by removing primer sequence in amplicon-based next-generation sequencing |
Publications (2)
Publication Number | Publication Date |
---|---|
KR20190017161A (ko) | 2019-02-20 |
KR101977976B1 (ko) | 2019-05-14 |
Family
ID=65272333
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
KR1020170101540A Active KR101977976B1 (ko) | Method for increasing accuracy of analysis by removing primer sequence in amplicon-based next-generation sequencing | 2017-08-10 | 2017-08-10 |
Country Status (3)
Country | Link |
---|---|
US (1) | US20200216888A1 (ko) |
KR (1) | KR101977976B1 (ko) |
WO (1) | WO2019031867A1 (ko) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
KR20210114279A (ko) | 2020-03-10 | 2021-09-23 | Samsung Life Public Welfare Foundation | Method for improving the labeling accuracy of unique molecular identifiers |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
KR20240133412A (ko) | 2023-02-28 | 2024-09-04 | SML Genetree Co., Ltd. | Method and apparatus for detecting human papillomavirus (HPV) types using amplicon-based next-generation sequencing |
Family Cites Families (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US8209130B1 (en) * | 2012-04-04 | 2012-06-26 | Good Start Genetics, Inc. | Sequence assembly |
HK1211058A1 (en) * | 2012-07-24 | 2016-05-13 | Natera, Inc. | Highly multiplex PCR methods and compositions |
KR20170023979A (ko) * | 2014-06-26 | 2017-03-06 | 10X Genomics, Inc. | Processes and systems for nucleic acid sequence assembly |
-
2017
- 2017-08-10 KR KR1020170101540A patent/KR101977976B1/ko active Active
-
2018
- 2018-08-09 US US16/637,880 patent/US20200216888A1/en active Pending
- 2018-08-09 WO PCT/KR2018/009088 patent/WO2019031867A1/ko active Application Filing
Non-Patent Citations (2)
Title |
---|
A. M. Bolger et al., "Trimmomatic: a flexible trimmer for Illumina sequence data", Bioinformatics, vol. 30, no. 15, pp. 2114-2120, 2014.* |
S. Zucca et al., "Analysis of amplicon-based NGS data from neurological disease gene panels: a new method for allele drop-out management", BMC Bioinformatics, 17(Suppl 12), pp. 87-98, 2016.* |
Also Published As
Publication number | Publication date |
---|---|
WO2019031867A1 (ko) | 2019-02-14 |
KR20190017161A (ko) | 2019-02-20 |
US20200216888A1 (en) | 2020-07-09 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Kumar et al. | Next-generation sequencing and emerging technologies | |
Logsdon et al. | Long-read human genome sequencing and its applications | |
JP7637139B2 (ja) | Systems and methods for automating RNA expression calls in a cancer prediction pipeline | |
US11193175B2 (en) | Normalizing tumor mutation burden | |
McElhoe et al. | Development and assessment of an optimized next-generation DNA sequencing approach for the mtgenome using the Illumina MiSeq | |
TWI793586B (zh) | Single-molecule sequencing of plasma DNA | |
US20210343367A1 (en) | Methods for detecting mutation load from a tumor sample | |
CN107849612A (zh) | Alignment and variant sequencing analysis pipeline | |
Lange et al. | Analysis pipelines for cancer genome sequencing in mice | |
JP2022505050A (ja) | Methods and reagents for efficient genotyping of large numbers of samples via pooling | |
CN109461473B (zh) | Method and device for acquiring fetal free DNA concentration | |
US20200176081A1 (en) | Method for detecting gene rearrangement by using next generation sequencing | |
CN105483210A (zh) | Method for detecting RNA editing sites | |
EP3210145A1 (en) | A computational method for the identification of variants in nucleic acid sequences | |
KR101638473B1 (ko) | Method for detecting deleted gene groups based on next-generation sequencing | |
KR101977976B1 (ko) | Method for increasing accuracy of analysis by removing primer sequence in amplicon-based next-generation sequencing | |
JP2025028203A (ja) | Correction of deamination-induced sequence errors | |
Cliften | Base calling, read mapping, and coverage analysis | |
KR102347463B1 (ko) | Method and apparatus for detecting false-positive variants in nucleic acid sequence analysis | |
KR20230154658A (ko) | Method and apparatus for generating seed sequences for ITD analysis in NGS analysis | |
Copeland | Computational Analysis of High-replicate RNA-seq Data in Saccharomyces Cerevisiae: Searching for New Genomic Features | |
HK40004815B (en) | Method and device for acquiring fetal free dna concentration | |
HK40004815A (en) | Method and device for acquiring fetal free dna concentration |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| A201 | Request for examination | |
| PA0109 | Patent application | Patent event code: PA01091R01D; Comment text: Patent Application; Patent event date: 20170810 |
| PA0201 | Request for examination | |
| A302 | Request for accelerated examination | |
| PA0302 | Request for accelerated examination | Patent event date: 20180911; Patent event code: PA03022R01D; Comment text: Request for Accelerated Examination; Patent event date: 20170810; Patent event code: PA03021R01I; Comment text: Patent Application |
| E902 | Notification of reason for refusal | |
| PE0902 | Notice of grounds for rejection | Comment text: Notification of reason for refusal; Patent event date: 20181213; Patent event code: PE09021S01D |
| PG1501 | Laying open of application | |
| E701 | Decision to grant or registration of patent right | |
| PE0701 | Decision of registration | Patent event code: PE07011S01D; Comment text: Decision to Grant Registration; Patent event date: 20190425 |
| GRNT | Written decision to grant | |
| PR0701 | Registration of establishment | Comment text: Registration of Establishment; Patent event date: 20190507; Patent event code: PR07011E01D |
| PR1002 | Payment of registration fee | Payment date: 20190508; End annual number: 3; Start annual number: 1 |
| PG1601 | Publication of registration | |
| PR1001 | Payment of annual fee | Payment date: 20220405; Start annual number: 4; End annual number: 6 |