WO2016080750A1 - Gene panel for detecting cancer genome mutant - Google Patents

Publication number
WO2016080750A1
Authority
WO
WIPO (PCT)
Prior art keywords
cancer
genomic dna
polynucleotide
variation
nucleotide sequence
Prior art date
Application number
PCT/KR2015/012384
Other languages
French (fr)
Korean (ko)
Inventor
배준설
박웅양
김나영
Original Assignee
사회복지법인 삼성생명공익재단
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 사회복지법인 삼성생명공익재단
Priority claimed from KR1020150161608A (published as KR20160059446A)
Publication of WO2016080750A1

Classifications

    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids

Definitions

  • a composition for use in detecting mutations in genomic DNA of cancer cells and a method for detecting mutations in genomic DNA of cancer cells.
  • Cancer takes many forms depending on the tissue and cells in which it develops, and its causes are equally varied. Each tumor can carry a variety of genomic variations, and many studies have reported that somatic mutations can significantly affect the development and progression of cancer. Accordingly, methods of detecting genome mutations in cancer cells have attracted great attention. In addition, the detected mutation information can greatly help in selecting a customized anticancer agent for a cancer patient.
  • Conventional approaches use a separate detection method for each type of genome variation (e.g., real-time PCR for SNV, CGH array for CNV, or FISH for translocation), so detecting the variations takes a long time and incurs a large cost. Therefore, there is a need for a method capable of easily and quickly detecting various genome variations occurring in cancer patients with high sensitivity and accuracy.
  • One aspect is to provide a composition for use in detecting mutations in genomic DNA of cancer cells.
  • Another aspect is to provide a method for detecting mutations in genomic DNA of cancer cells.
  • One aspect includes a first polynucleotide comprising a contiguous nucleotide sequence selected from the nucleotide sequence of a TERT gene, or a complementary polynucleotide thereof; ABL1, AKT1, AKT2, AKT3, ALK, APC, ARID1A, ARID1B, ARID2, ATM, ATRX, AURKA, AURKB, BCL2, BRAF, BRCA1, BRCA2, CDH1, CDK4, CDK6, CDKN2A, CSF1R, CTNNBEGFR, DDR EPHB4, ERBB2, ERBB3, ERBB4, EZH2, FBXW7, FGFR1, FGFR2, FGFR3, FLT3, GNA11, GNAQ, GNAS, HNF1A, HRAS, IDH1, IDH2, IGF1R, ITK, JAK1, JAK2, JAK3, JAK2, MDM2, MET, MLH1, MPL, MTOR, NF1, NOTCH1, NPM1, NRAS,
  • polynucleotide refers to a nucleotide polymer of any length, where polynucleotides can be used interchangeably with nucleic acid, or oligonucleotide.
  • gene refers to a structural unit that determines genetic information, and includes a structural gene having information that determines the amino acid sequence of a protein or the nucleotide sequence of a functional RNA (tRNA, rRNA, etc.) and/or a regulatory gene that controls the expression of a structural gene (e.g., promoters, repressors, operators, etc.).
  • gene is understood herein to mean the single strand comprising a nucleotide sequence that is transcribed to produce a product of the gene.
  • nucleotide sequence of a gene may mean a nucleotide sequence, contained in the single strand that is transcribed to produce a product of the gene, that encodes a function, and/or a nucleotide sequence that controls the expression of a structural gene.
  • contiguous nucleotide sequence means a sequence of two or more adjacent nucleotides.
  • complementary means having a degree of complementarity capable of selectively hybridizing to the above-described nucleotide sequence under certain specific hybridization or annealing conditions.
  • complementary may encompass both substantially complementary and perfectly complementary, and may specifically mean completely complementary.
  • exon refers to a region of a DNA sequence that contains protein synthesis information, for example, a nucleic acid molecule that encodes some or all of the expressed protein.
  • intron refers to a nucleic acid segment that is transcribed into RNA but is removed from the RNA by splicing before translation into protein.
  • the mutation may be a mutation relative to standard genomic DNA.
  • the variation may include variation in the copy number of a gene or variation in the nucleotide sequence relative to standard genomic DNA.
  • the variation in the copy number of the gene may be, for example, a copy number variation (CNV).
  • Variations in the nucleotide sequence may include substitution, insertion, deletion, or translocation of one or more nucleotides relative to standard genomic DNA. Substitution of one or more nucleotides can be, for example, a single nucleotide variation (SNV).
  • the mutation of the genomic DNA of the cancer cell may be, more specifically, one or more selected from the group consisting of a single nucleotide variation, an insertion-deletion (indel), a copy number variation, and a translocation.
  • the genome variation may, most specifically, include all of a single nucleotide variation, an indel, a copy number variation, and a translocation.
  • Whereas a single nucleotide polymorphism refers to a single-base difference that occurs across multiple populations within one species, a single nucleotide variation (SNV) refers to a single-base difference that appears in a sequence or in a small population within a species; for example, it may mean a difference from the standard base sequence observed in sequencing data.
  • insertion-deletion (indel) refers to the insertion or deletion of a nucleotide sequence that can change the number of nucleotides in a gene.
  • translocation refers to the phenomenon in which a portion of a chromosome breaks off and the fragment attaches to another portion of the same chromosome or to another chromosome, changing the shape of the chromosome.
  • the single nucleotide variation or indel may occur, for example, in the promoter region of the TERT gene and at one or more positions selected from Table 1 below.
  • the promoter region of the TERT gene may be, for example, at positions 1295163-1296162 of human chromosome 5.
  • the copy number variation may occur at one or more positions selected from, for example, Table 1 below.
  • the translocation may occur at one or more positions selected from Table 2 below.
  • the first, second, or third polynucleotides may specifically be single stranded.
  • the first, second, or third polynucleotides may include DNA, RNA, peptide nucleic acid (PNA), locked nucleic acid (LNA), zip nucleic acid (ZNA), bridged nucleic acid (BNA), and nucleotide analogues.
  • the polynucleotide may specifically be DNA or RNA, and more specifically RNA.
  • a polynucleotide consisting of RNA was used to detect mutations in the cancer genome.
  • a polynucleotide consisting of RNA binds more strongly than other types of polynucleotide, so the hybridization time is shortened and the detection sensitivity is high. It may therefore be advantageous for detecting large-region mutations such as nucleotide deletions of 25 bp or more.
  • the first, second, or third polynucleotide may be, for example, 75 to 200, 80 to 200, 90 to 200, 100 to 200, 100 to 180, 100 to 160, 100 to 140, 100 to 120, 110 to 180, 110 to 160, 110 to 140, 110 to 130, or 110 to 120 nucleotides in size. Specifically, it may be 110 to 120 nucleotides in size, and more specifically 120 nucleotides in size.
  • when the first, second, or third polynucleotide is 75 nucleotides or shorter, the accuracy of capturing a target region is low, and when it is 200 nucleotides or longer, the synthesis cost increases.
  • the first, second, or third polynucleotides are economically sized and optimized for detecting mutations in genomic DNA.
  • the first, second, or third polynucleotide may specifically mean a population consisting of two or more polynucleotides.
  • each of the sequences of the polynucleotides constituting the population includes a portion of the corresponding gene nucleotide sequence, and there is no corresponding gene nucleotide sequence region not included in the polynucleotide. This means that the entire nucleotide sequence of a gene of interest can be covered by the polynucleotides that make up the population.
  • the term “covered by polynucleotides” herein means that the polynucleotide comprises a particular nucleotide sequence of a gene or a sequence complementary to that sequence.
  • when the first, second, or third polynucleotide is a polynucleotide population, one nucleotide in the nucleotide sequence of the gene may be covered by two or more, specifically three or more, of the polynucleotides constituting the population.
  • any one polynucleotide in the population and the polynucleotide that includes the nearest neighboring nucleotide sequence of the gene may share, for example, 50 to 150, 60 to 140, 70 to 120, 70 to 110, 70 to 100, 70 to 90, 70 to 80, or 80 identical nucleotides. Therefore, the oncogene can be sequenced with high coverage using the composition according to one aspect.
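The overlapping layout described above can be illustrated with a minimal Python sketch. It assumes a 120-nucleotide probe and an 80-nucleotide overlap between neighbours (values taken from the text); the function names, the half-open coordinate convention, and the rule of anchoring the last probe to the region end are assumptions made only for this illustration, not the design procedure used by the applicant.

```python
def tile_probes(region_start, region_end, probe_len=120, overlap=80):
    """Return half-open (start, end) probe intervals tiling [region_start, region_end)."""
    step = probe_len - overlap  # distance between neighbouring probe starts
    if step <= 0:
        raise ValueError("overlap must be smaller than probe_len")
    if region_end - region_start < probe_len:
        raise ValueError("region is shorter than one probe")
    probes = []
    start = region_start
    while start + probe_len < region_end:
        probes.append((start, start + probe_len))
        start += step
    # Assumption: anchor the final probe to the region end so no base is left uncovered.
    probes.append((region_end - probe_len, region_end))
    return probes


def coverage_depth(probes, position):
    """Count how many probes cover a single position."""
    return sum(1 for start, end in probes if start <= position < end)
```

With these defaults, neighbouring probes start 40 nucleotides apart, so interior positions are covered by three probes, while positions near the two ends of the region are covered by fewer.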
  • the first polynucleotide may specifically include a contiguous nucleotide sequence selected from the nucleotide sequence of the promoter region of the TERT gene.
  • promoter refers to a DNA region that is present in the upstream region of a structural gene and to which RNA polymerase binds to initiate transcription.
  • the promoter region of the TERT gene may be, for example, at positions 1295163-1296162 of human chromosome 5.
  • each of the genes related to the second polynucleotide may be shown in Table 1 below.
  • the genes listed in Table 1 are all derived from humans.
  • Region means the number of target exon regions (but may include regions other than exons).
  • intron region of each gene associated with the third polynucleotide may be shown in Table 2 below.
  • the genes listed in Table 2 are all derived from humans.
  • the first, second, or third polynucleotide of the present invention may specifically bind to the sequence of the target gene.
  • the specific binding properties of such polynucleotides can be used to effectively separate target genes or fragments thereof from a mixture.
  • the polynucleotide can be named as a probe.
  • probe refers to a substance that specifically detects a particular substance, site, condition, and the like.
  • the first polynucleotide may be for detecting a single nucleotide variation and/or an indel.
  • the second polynucleotide may be for detecting a single nucleotide variation, an indel, and/or a copy number variation.
  • the third polynucleotide may be for detecting gene translocation. Therefore, because the composition includes all of the first to third polynucleotides, it has the benefit of being able to detect single nucleotide variation (SNV), indel, copy number variation (CNV), and translocation in the cancer cell genome all together.
  • the first polynucleotide may include one or more sequences selected from the group consisting of the nucleotide sequences of SEQ ID NOs: 1 to 16, and more specifically, may include all of the polynucleotides having the respective sequences of SEQ ID NOs: 1 to 16.
  • the second polynucleotide may include one or more sequences selected from the group consisting of SEQ ID NOs: 17 to 7266, and more specifically, may include all of the polynucleotides having the respective sequences of SEQ ID NOs: 17 to 7266.
  • the third polynucleotide may include one or more sequences selected from the group consisting of SEQ ID NOs: 7267 to 8102, and more specifically, may include all of the polynucleotides having the respective sequences of SEQ ID NOs: 7267 to 8102.
  • the third polynucleotide has a length of 75 nucleotides or more (e.g., 120 nucleotides) and is designed to cover the intron regions of 5 genes (ALK, RET, EWSR1, ROS1, TMPRSS2), so that new regions and genes in which translocations have occurred can be detected.
  • the first, second, or third polynucleotide may further include a moiety for isolation or purification of the polynucleotide.
  • the moiety may be attached to one or more of the nucleotides that comprise the polynucleotide.
  • the moiety may comprise one or more selected from the group consisting of biotin, avidin, and streptavidin.
  • the moiety for example, biotin, avidin or streptavidin may include magnetic beads, or a substance specifically binding to the moiety may include magnetic beads.
  • the separation or purification may be by a substance or magnetic field that specifically binds to the moiety.
  • biotin was attached to one or more bases of the polynucleotide, the biotin-attached polynucleotide (probe) was hybridized with genomic DNA, streptavidin-coated magnetic beads were then bound to it, and the polynucleotides hybridized with genomic DNA were isolated using a magnetic field.
  • the cancer cells in the composition may be isolated from cancer patients.
  • the cancer may be, for example, a solid cancer. Specifically, the cancer may be one or more selected from the group consisting of liver cancer, glioblastoma, ovarian cancer, colon cancer, head and neck cancer, bladder cancer, kidney cell cancer, gastric cancer, breast cancer, metastatic cancer, prostate cancer, pancreatic cancer and lung cancer.
  • the present invention is not limited thereto, and the composition is applicable to all carcinomas.
  • the composition may be for use in the search for an anticancer agent that is effective in reducing the viability of cancer cells. Reduction of the viability of the cancer cells is understood to include removal of cancer cells, inhibition or delay of metastasis or growth of cancer cells, and the like.
  • the composition may also be for use in the search for an anticancer agent effective for treating cancer in a patient with cancer cells.
  • the effective anticancer agent may mean an anticancer agent having excellent viability reduction or cancer treatment effect when compared with an anticancer agent selected without using the composition.
  • the composition may be in a liquid state.
  • the liquid may be an aqueous solution.
  • the composition may further comprise a buffer.
  • the composition may be one containing first, second, and third polynucleotides in one container.
  • the composition for detecting the mutation of the cancer cell genome according to the invention may be in the form of a kit.
  • the kit comprises a first polynucleotide comprising a contiguous nucleotide sequence selected from the nucleotide sequence of the TERT gene, or a complementary polynucleotide thereof; ABL1, AKT1, AKT2, AKT3, ALK, APC, ARID1A, ARID1B, ARID2, ATM, ATRX, AURKA, AURKB, BCL2, BRAF, BRCA1, BRCA2, CDH1, CDK4, CDK6, CDKN2A, CSF1R, CTNNBEGFR, DDR EPHB4, ERBB2, ERBB3, ERBB4, EZH2, FBXW7, FGFR1, FGFR2, FGFR3, FLT3, GNA11, GNAQ, GNAS, HNF1A, HRAS, IDH1, IDH2, IGF1R, ITK, JAK1, JAK2, JAK3, JAK2, MDM
  • the kit may further comprise known materials required for the polynucleotide to hybridize with the genomic nucleic acid. For example, it may further include reagents, buffers, cofactors, and/or substrates required for hybridization of the nucleic acid. Also, when the kit is used for a PCR amplification process, it may optionally include reagents required for PCR amplification, such as buffers, DNA polymerases, DNA polymerase cofactors and dNTPs, and when the kit is used for an immunoassay, it may optionally comprise a secondary antibody and a substrate of the label.
  • the kit may further include instructions for use to amplify the target nucleic acid, and may be manufactured in a number of separate packaging or compartments containing the reagent components described above.
  • the detection method comprises contacting the genomic DNA derived from a cancer cell with a composition for use in detecting a mutation of the genomic DNA of the cancer cell, thereby hybridizing the genomic DNA with the first, second, or third polynucleotide in the composition.
  • the composition is as described above.
  • the genome variation is as described above.
  • the genome variation may specifically be one or more of a single nucleotide variation, an insertion-deletion variation, a copy number variation, and a translocation, and more specifically, may include all of a single nucleotide variation, an insertion-deletion variation, a copy number variation, and a translocation.
  • the cancer may for example be a solid cancer.
  • the cancer may be at least one selected from the group consisting of liver cancer, glioblastoma, ovarian cancer, colon cancer, head and neck cancer, bladder cancer, kidney cell cancer, gastric cancer, breast cancer, metastatic cancer, prostate cancer, pancreatic cancer and lung cancer.
  • the genomic DNA derived from the cancer cell may be genomic DNA or fragment thereof isolated from a biological sample.
  • the sample may be any one or more selected from the group consisting of blood, saliva, urine, feces, tissues, cells, and biopsies.
  • the sample may be one that contains a stored biological sample or genomic DNA isolated therefrom.
  • the sample may be stored by a known method.
  • the genomic DNA may be DNA or RNA derived from frozen-stored tissue or from formalin-fixed paraffin-embedded tissue stored at room temperature. Methods of separating genomic DNA from biological samples are well known.
  • the cancer cell may be isolated from a cancer patient.
  • the sample may be isolated from cells, tissues, organs, or body fluids of a cancer patient, in which case the sample can be obtained by biopsy using conventional methods, for example, methods well known to those skilled in the relevant medical field.
  • the genomic DNA contained in the sample may be fragmented to any size.
  • the method may further comprise the step of fragmenting the genomic DNA derived from cancer cells before or in the contacting step.
  • fragmentation can be carried out by methods well known to those skilled in the art.
  • genomic DNA can be fragmented by the use of ultrasound.
  • the detection method may include ligation of a sequence for amplification at both ends of the fragmented genomic DNA after fragmentation of the genomic DNA.
  • the method of ligation of the sequence (eg, paired-end tag, universal tag) for the amplification can be performed by those skilled in the art by appropriately selecting a known technique.
  • the hybridization may be performed by a known method. For example, it can be performed by incubating the polynucleotide and genomic DNA in a buffer known to be suitable for hybridization of nucleic acids. Hybridization can be carried out at an appropriate temperature. Suitable temperatures for hybridization can be, for example, 40 to 80 ° C, 50 to 75 ° C, 60 to 70 ° C, or 62 to 67 ° C, specifically 65 ° C. In addition, the hybridization temperature is not limited thereto, and may be appropriately selected according to the sequence and length of the polynucleotide included in the composition. Hybridization time can be, for example, for 1 hour to 12 hours (overnight). In one embodiment the polynucleotides included in the composition hybridize with fragments of genomic DNA having the sequence of the genes they target.
  • the method may further comprise separating the hybridization product of genomic DNA and the first, second, or third polynucleotide.
  • the separating step of the hybridization product may be to separate the hybridization product from the contact product obtained in the contacting step before the step of identifying the nucleotide sequence of the genomic DNA in the hybridization product.
  • the separation may be using a moiety for separation or purification attached to the polynucleotide.
  • the separation or purification may be by a substance or magnetic field that specifically binds to the moiety.
  • streptavidin-coated magnetic beads were used to separate the hybridization product of genomic DNA with the biotin-attached first, second, or third polynucleotide.
  • the separation allows selective detection of genomic DNA hybridized with polynucleotides. This may be called "target capture”.
  • the separation may further include separating the genomic DNA from the hybridization product, that is, separating the hybridized polynucleotide from the genomic DNA.
  • Isolation of genomic DNA from hybridization products can be performed, for example, by separating the strands at high temperature and then amplifying with primers specific for the target DNA.
  • the high temperature may be 80 to 110 ° C, 90 to 100 ° C or 95 ° C.
  • the detection method may further comprise amplifying the genomic DNA by PCR, using the genomic DNA as a template and a hybrid primer or universal primer complementary to the amplification sequence attached to each genomic DNA fragment.
  • the nucleotide sequence can be confirmed using the amplified genomic DNA, for example by next generation sequencing (NGS).
  • the detection method includes comparing a nucleotide sequence of the identified genomic DNA with a standard nucleotide sequence.
  • the term “reference nucleotide sequence” may refer to a human genomic sequence that does not include a mutation and that serves as the reference for identifying mutations.
  • a human genome sequence published in a database of the National Center for Biotechnology Information (NCBI), specifically NCBI37.1 or UCSC hg19 (GRCh37), may be used as the standard sequence.
  • the comparison between the base sequence of the genomic DNA and the standard sequence can be performed using various known sequence comparison analysis programs, for example, Maq, Bowtie, SOAP, GSNAP and the like.
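As a toy illustration of what comparing a sample sequence with the standard sequence means at the single-base level, the following Python sketch lists positions where two already-aligned, equal-length sequences differ. It is not a substitute for the alignment programs named above, which also handle gaps, base qualities and mapping; the function name and the `offset` parameter are hypothetical.

```python
def find_single_base_differences(reference, sample, offset=0):
    """List (position, reference_base, sample_base) for each mismatch.

    Assumes both sequences are already aligned and of equal length;
    positions where either base is 'N' are skipped.
    """
    if len(reference) != len(sample):
        raise ValueError("sequences must be aligned to the same length")
    differences = []
    for index, (ref_base, sample_base) in enumerate(zip(reference.upper(), sample.upper())):
        if ref_base != sample_base and "N" not in (ref_base, sample_base):
            differences.append((offset + index, ref_base, sample_base))
    return differences
```

The `offset` argument lets the caller report positions in the coordinate system of the standard sequence rather than relative to the start of the compared fragment.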
  • the detection method may further comprise comparing the number of copies of genomic DNA of a particular region caused by cancer development and progression with a level (control level) obtained using a standard reference.
  • the detection method may further include determining that the copy number of the genomic DNA is increased when the level of the amplified genomic DNA amount is increased compared to the control level.
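The comparison with a control level can be sketched in Python as a per-region log2 ratio. This is only an illustration: the source does not describe how its in-house program works, so the input format (already normalised read depths per region), the function name, and the gain and loss thresholds are all assumptions.

```python
import math


def copy_number_calls(sample_depth, control_depth, gain_log2=0.58, loss_log2=-1.0):
    """Compare normalised depths per region against a control level.

    sample_depth / control_depth: dicts mapping region name -> normalised depth.
    Thresholds are illustrative assumptions, not values from the source.
    Returns a dict mapping region -> (log2 ratio or None, call).
    """
    calls = {}
    for region, control in control_depth.items():
        sample = sample_depth.get(region, 0)
        if control <= 0 or sample <= 0:
            calls[region] = (None, "no_call")  # ratio undefined
            continue
        log2_ratio = math.log2(sample / control)
        if log2_ratio >= gain_log2:
            call = "gain"
        elif log2_ratio <= loss_log2:
            call = "loss"
        else:
            call = "neutral"
        calls[region] = (log2_ratio, call)
    return calls
```

A region is reported as a gain only when its level exceeds the control level by the chosen margin, which mirrors the determination step described above in the simplest possible form.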
  • the detection method includes identifying mutations in genomic DNA.
  • the mutation check may be performed using a known mutation detection program, for example, GATK, SAMtools, MoDIL, SegSeq, PEMer, VariationHunter, Pindel, BreakDancer, and MuTect, but is not limited thereto.
  • single nucleotide variations and indels were identified using the GATK-2.2.9 algorithm, and CNVs were identified by comparing the signal intensity of cancer tissue specimens with the signal intensity of a reference cell line.
  • CNV detection was performed with an in-house program that detects CNVs by comparing relative values.
  • translocation was identified using the CIGAR algorithm, which extracts discordant reads separately during BAM file generation.
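The idea of collecting discordant reads and inspecting their CIGAR strings can be sketched as follows. The CIGAR operation letters follow the SAM format; everything else (the dictionary-based read records, the function names, the `min_support` threshold, and restricting the search to pairs whose mates map to different chromosomes) is an assumption for illustration and does not reproduce the algorithm used in the source.

```python
import re
from collections import Counter

# CIGAR operations defined by the SAM format: M, I, D, N, S, H, P, =, X
CIGAR_PATTERN = re.compile(r"(\d+)([MIDNSHP=X])")


def parse_cigar(cigar):
    """Split a CIGAR string such as '60M40S' into (length, operation) pairs."""
    return [(int(length), op) for length, op in CIGAR_PATTERN.findall(cigar)]


def soft_clipped_bases(cigar):
    """Total number of soft-clipped bases, a common hint of a breakpoint."""
    return sum(length for length, op in parse_cigar(cigar) if op == "S")


def candidate_fusions(reads, min_support=2):
    """Count read pairs whose mates map to different chromosomes.

    reads: iterable of dicts with 'chrom' and 'mate_chrom' keys (hypothetical
    record layout); '=' means same chromosome and '*' means unknown.
    """
    support = Counter()
    for read in reads:
        mate = read["mate_chrom"]
        if mate in ("=", "*") or mate == read["chrom"]:
            continue  # concordant or uninformative pair
        support[tuple(sorted((read["chrom"], mate)))] += 1
    return {pair: count for pair, count in support.items() if count >= min_support}
```

Requiring more than one supporting pair is one simple way to drop false positive calls; rearrangements within a single chromosome would need additional checks on distance and orientation that this sketch omits.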
  • the step of identifying mutation may further include comparing the extracted mutation information with a previously constructed cancer-related genetic mutation-related database to determine whether it is a known mutation or a newly-discovered mutation.
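A minimal sketch of this lookup step, assuming each mutation is represented as a `(chromosome, position, reference allele, alternate allele)` tuple and that the previously constructed database is available as a set of such tuples. Both the representation and the function name are hypothetical; the coordinates used with it would come from the actual analysis.

```python
def classify_variants(called_variants, known_variant_db):
    """Split called variants into known and newly discovered ones.

    called_variants: iterable of (chrom, pos, ref, alt) tuples.
    known_variant_db: set of tuples in the same form (assumed layout).
    """
    known, novel = [], []
    for variant in called_variants:
        if variant in known_variant_db:
            known.append(variant)
        else:
            novel.append(variant)
    return known, novel
```

Keeping the database as a set makes each membership check constant time, which matters little for a panel of this size but keeps the sketch simple.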
  • the detection method may further comprise the step of confirming a correlation between the identified mutation of the genomic DNA and the cancer treatment effect of an anticancer agent in the individual.
  • Checking the correlation may include identifying a mechanism of action of the anticancer agent, and / or a target targeted for the action. Accordingly, the detection method enables the identification of mutations in genomic DNA and anticancer agents associated with such mutations, and / or selection of anticancer agents, by confirming the correlation. Therefore, information may be provided for selecting an individual cancer treatment agent customized using the detection method.
  • the term “individual” refers to all animals classified as mammals that have or are suspected of having cancer, and includes livestock and farm animals, primates and humans, e.g., humans, non-human primates, cattle, horses, pigs, sheep, goats, dogs, cats or rodents. In particular, the subject is a human male or female of any age or race. “Subject” and “patient” are used interchangeably herein.
  • the genomic DNA may be obtained from cancer tissue or cancer cells of the subject.
  • the obtaining method may use a method known to those skilled in the art for separating genomic DNA from tissue or cells.
  • the method includes selecting a cancer drug that is associated with the mutation in the cancer drug database based on the identified correlation.
  • an algorithm for predicting the correlation between each type of data can be constructed.
  • the constructed algorithm can also be used to derive patient-specific cancer therapeutics from genetic information data of a patient and clinical information data of a patient without additional experiments in vitro and in vivo conditions.
  • the algorithm for selecting an individual cancer treatment agent from the genetic information data and the clinical information data of the present invention may include: receiving a result of analysis of genome variation of a sample; Receiving clinical information data of the patient; Selecting an individual cancer treatment agent from a cancer treatment DB based on the results of the genome variation analysis and the clinical information data of the sample; Accumulating the genetic information data of the cancer and the clinical information data of the individual and the selected personalized anticancer agent data corresponding thereto; And analyzing the correlation between the accumulated genetic information data, clinical information data, and patient-specific anticancer drug data.
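The steps listed above (receive variation results, receive clinical data, select from a treatment DB, accumulate, analyze) can be sketched in Python as below. The database layout, the variant keys, the agent names such as `agent_A`, and the class and function names are all placeholders invented for illustration; the source does not specify any of them.

```python
from collections import Counter, defaultdict


def select_agents(variants, clinical, treatment_db):
    """Pick agents whose DB entry matches a detected variant and the cancer type.

    treatment_db maps a variant key to a list of
    {"agent": str, "cancer_type": str or None} entries (assumed layout);
    a cancer_type of None means the entry applies to any cancer type.
    """
    selected = set()
    for variant in variants:
        for entry in treatment_db.get(variant, []):
            if entry["cancer_type"] in (None, clinical["cancer_type"]):
                selected.add(entry["agent"])
    return sorted(selected)


class CaseArchive:
    """Accumulates genetic, clinical and selected-agent data for later analysis."""

    def __init__(self):
        self.cases = []

    def add(self, variants, clinical, agents):
        self.cases.append(
            {"variants": list(variants), "clinical": dict(clinical), "agents": list(agents)}
        )

    def agent_counts_by_variant(self):
        """Very simple correlation summary: how often each agent co-occurs with each variant."""
        counts = defaultdict(Counter)
        for case in self.cases:
            for variant in case["variants"]:
                counts[variant].update(case["agents"])
        return {variant: dict(counter) for variant, counter in counts.items()}
```

The co-occurrence count stands in for the correlation analysis only as a placeholder; a real analysis would also need treatment outcomes, which this sketch does not model.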
  • the cancer therapeutic agent DB may be a database of correlations between known sequence variation information and cancer treatment agents.
  • an example of such known variation information is, in non-small cell cancer, the SNV (L861Q) of the epidermal growth factor receptor (EGFR) in which the leucine at position 861 is replaced by glutamine.
  • the algorithm can be updated by adding data, thereby improving the success rate of screening patient-specific anticancer drugs.
  • the composition according to one aspect can detect various genome variations of major cancer-related genes through a single experiment, which can be far more economical and efficient than conventional methods that use a different platform for each genome variation.
  • in addition, sequencing the major genes related to the induction and progression of cancer with high coverage ensures high resolution, enabling identification of low-frequency genome mutations that were not detected by conventional methods. Therefore, by using the composition according to one aspect, a variety of genetic variations in a sample containing the genome of cancer cells can be analyzed simultaneously with high sensitivity and accuracy, and a patient-specific cancer treatment can be searched for efficiently based on the analysis results.
  • FIG. 1 is a diagram illustrating a nucleotide variation analysis method of a cancer sample using the composition according to one aspect.
  • FIG. 3A shows some of the single nucleotide variations detected through NGS using the composition according to one aspect, and FIG. 3B shows the result of detecting an SNV in which cytosine (C) in an exon of the EGFR gene is substituted with thymine (T).
  • FIG. 4A is a table summarizing some of the results of detecting insertion-deletion mutations through NGS using the composition according to one aspect, and FIG. 4B shows the result of detecting an insertion-deletion mutation using the composition according to one aspect.
  • FIG. 5A shows the result of detecting copy number variation through NGS using the composition according to one aspect, and FIG. 5B is a table summarizing some of the results of detecting copy number variation through NGS using the composition according to one aspect.
  • FIG. 6A shows the result of detecting gene translocation through NGS using the composition according to one aspect, and FIG. 6B is a table listing some of the results of detecting gene translocation through NGS using a probe composition according to one aspect.
  • FIG. 7 is a graph showing the sensitivity of the detection result of a single nucleotide variation using the composition according to one aspect.
  • polynucleotides capable of detecting the promoter region of the TERT gene were constructed (Agilent, Santa Clara, USA; Table 3). Each polynucleotide is 120 bp in length, and two polynucleotides with adjacent SEQ ID NOs overlap by 80 bp (for example, the 80 bases at the 3' end of SEQ ID NO: 1 and the 80 bases at the 5' end of SEQ ID NO: 2 are identical to each other).
  • each of the 16 polynucleotides hybridizes with a portion of the promoter region of the TERT gene, and together they were designed to cover the entire sequence of the promoter region of the TERT gene.
  • the produced polynucleotide is a single strand of RNA and includes the nucleotides of the promoter region of the DNA strand from which the TERT gene is transcribed.
  • polynucleotides were prepared based on the sequence of the intron region. Specific information of each gene for producing the polynucleotide is shown in Table 2 above.
  • Each polynucleotide is 120 bp in length, as described above for TERT, and polynucleotides with adjacent sequence numbers are designed to overlap by 80 bp and to cover the entirety of each gene sequence.
  • that is, each nucleotide in the gene was designed to be included in three polynucleotides.
  • the produced polynucleotide is a single strand of RNA and contains the nucleotides of the DNA strand (antisense DNA) from which each gene is transcribed.
  • the sequence numbers of the polynucleotides produced for each gene are summarized in Table 3 below.
  • Beginning and ending means the beginning and end of SEQ ID NO.
  • Gene means a target gene to which each probe binds.
  • a variation refers to a variation detected by a probe.
  • Genomic DNA was isolated from various cancer patient-derived samples (tissue, blood, FFPE, FNA, etc.) using the QIAamp DNA Mini Kit (Qiagen, Valencia, CA, USA) for NGS experiments. Subsequently, the concentration, purity, and degradation of the isolated genomic DNA were determined using a Nanodrop 8000 UV-Vis spectrometer (Thermo Scientific Inc., DE, USA), a Qubit 2.0 Fluorometer (Life Technologies Inc., Grand Island, NY, USA) and a 2200 TapeStation Instrument (Agilent Technologies, Santa Clara, CA, USA). Samples meeting the QC criteria were used for the next step of the experiment.
  • Genomic DNA obtained from each tissue was sheared using Covaris S220 (Covaris, MA, USA), followed by end-repair, A-tailing, paired-end adapter ligation and amplification.
  • the sequencing library was then fabricated.
  • the hybridization time of the library was reacted at 65 ° C. for 24 hours using a composition containing all of the polynucleotides prepared to capture the 83 genomic regions selected in Example 1, and was captured by hybridization.
  • Genomic DNA library fragments were purified. Purification took advantage of the binding properties of streptavidin and biotin attached to the polynucleotide.
  • the captured library fragments were separated from the mixture using magnetic force. Then, the purified genomic DNA library fragment was amplified with an index barcode tag.
  • the primer containing the index barcode tag was amplified by PCR equipment under the following conditions.
  • The gene fragments captured in Example 3-1 were loaded onto an NGS sequencer (MiSeq, Illumina, USA) to obtain the sequence of each DNA fragment, and the reads were aligned to obtain sequence information for each gene in the cancer sample. Sequencing reactions used the TruSeq Rapid PE Cluster kit and TruSeq Rapid SBS kit (Illumina, USA) under 100 bp paired-end conditions (FIG. 1).
  • The sequencing reads obtained in Example 3-2 were aligned to the UCSC hg19 reference genome (http://genome.ucsc.edu) using the Burrows-Wheeler Aligner (BWA) algorithm. PCR duplicates were removed using Picard-tools-1.8 (http://picard.sourceforge.net/), and single nucleotide variations (SNVs) and insertion-deletion variations (indels) were identified using the GATK-2.2.9 algorithm (see FIGS. 3 and 4). CNVs were detected with an in-house program that compares values in cancer cells, as relative values, against a reference cell line (see FIG. 5). Translocations were identified by separately extracting discordant reads while generating the BAM file, evaluating possible fusion pairs using the CIGAR algorithm, and then removing false-positive calls to obtain the final translocation information.
  • BWA: Burrows-Wheeler Aligner.
  • FIG. 3A shows part of the single nucleotide variations detected through NGS using the composition according to one aspect. It is a table listing the gene (Gene), function (Function), variation type (Variation type), exon number in which the variation occurred (Exon), amino acid change (Amino acid change), SNP database record (dbSNP), chromosome on which the variation occurred (Chromosome), position of the variation (Position), reference nucleotide sequence (Reference), altered nucleotide sequence (Alteration), and frequency (VAF).
  • FIG. 3B is an enlarged IGV viewer image of positions 55,249,044 to 55,249,097 of chromosome 7 in the EGFR region, obtained using the probe composition according to the present invention.
  • It shows the detection of an SNV in which cytosine (C) at position 55,249,072 is substituted with thymine (T).
  • FIG. 4A shows part of the results of detecting insertion-deletion variations through NGS using the composition according to one aspect. It is a table listing the gene (Gene), function (Function), variation type (Variation type), exon number in which the variation occurred (Exon), amino acid change (Amino acid change), SNP database record (dbSNP), chromosome on which the variation occurred (Chromosome), position (Position), reference nucleotide sequence (Reference), altered nucleotide sequence (Alteration), and frequency (VAF). FIG. 4B is an enlarged IGV viewer image of positions 55,242,364 to 55,242,580 of chromosome 7 in the EGFR region, obtained using the probe composition according to the present invention; it shows a deletion of 15 nucleotides between positions 55,242,465 and 55,242,480.
  • FIG. 5A shows the result of detecting copy number variation (CNV) through NGS using the composition according to one aspect, after diluting tumor tissue with normal tissue to tumor purities of 100%, 50%, and 30%. It shows that copy number changes of CDK4 and MDM2 are detected even at a low tumor content of 30%.
  • FIG. 5B is a table summarizing part of the results of detecting copy number variation through NGS in actual tumor tissue samples using the composition according to one aspect. It shows that copy number changes of CDK4 and MDM2 are detected well.
  • FIG. 6A shows a result of detecting a gene translocation through NGS using the composition according to one aspect.
  • FIG. 6B is a table summarizing part of the results of detecting gene translocations through NGS using the probe composition according to the present invention.
  • The probe sequences covering the ALK intron accurately detected the region where the translocation occurred through joining with EML.
  • FIG. 7 is a graph showing the sensitivity of SNV detection using the composition according to one aspect.
  • When 1000x of sequencing data is produced, mutations with a frequency of 5% or more are detected with a sensitivity of 99% or more.
  • Sensitivity and accuracy were compared at each frequency using pooled samples prepared from four cancer cell lines and paired normal samples with known mutation information. A sensitivity of 100.0% was observed in samples with a tumor content of 30% or more, and the positive predictive value (PPV) was 75.0% for amplifications and 100.0% for deletions (Table 4).
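
The sensitivity and PPV figures quoted above follow the standard definitions, sensitivity = TP / (TP + FN) and PPV = TP / (TP + FP). The following is a minimal sketch of that arithmetic only; the counts are hypothetical and are not taken from Table 4.

```python
def sensitivity(true_pos: int, false_neg: int) -> float:
    """Fraction of known variants that were detected: TP / (TP + FN)."""
    total = true_pos + false_neg
    if total == 0:
        raise ValueError("no known variants to evaluate")
    return true_pos / total


def positive_predictive_value(true_pos: int, false_pos: int) -> float:
    """Fraction of reported calls that are real: TP / (TP + FP)."""
    total = true_pos + false_pos
    if total == 0:
        raise ValueError("no calls were made")
    return true_pos / total


# Hypothetical counts chosen only to illustrate the formulas.
amplification_ppv = positive_predictive_value(true_pos=3, false_pos=1)
deletion_ppv = positive_predictive_value(true_pos=2, false_pos=0)
print(f"amplification PPV: {amplification_ppv:.1%}")
print(f"deletion PPV: {deletion_ppv:.1%}")
```

With small call sets a single false positive moves the PPV substantially, so percentages of this kind are best read together with the underlying counts.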

Abstract

The present invention relates to a composition for detecting the mutation of cancer cell genomic DNA and a method for detecting the mutation of cancer cell genomic DNA. The composition, according to one aspect of the present invention, may be used for analyzing, at once with high sensitivity and accuracy, various gene mutations from a sample comprising a cancer cell genome, and on the basis of the results of such analysis, a patient-tailored cancer therapeutic agent may be efficiently explored.

Description

Gene panel for detecting cancer genome mutations
The present disclosure relates to a composition for use in detecting variations in the genomic DNA of cancer cells, and to a method for detecting variations in the genomic DNA of cancer cells.
Cancer is a disease whose types vary with the tissue and cells in which it arises, and its causes are likewise diverse. Each tumor can carry a variety of genomic variations, and many studies have reported that somatic mutations can strongly influence the onset and progression of cancer. Methods for detecting genomic variations in cancer cells have therefore attracted considerable attention. The detected mutation information can also be of great help in selecting a tailored anticancer agent for a cancer patient.
However, probes in existing genomic variation detection methods are designed to detect only a single genomic variation. Detecting the various somatic mutations that cause cancer therefore requires additional experiments for each further variation, and variations other than those already known cannot be discovered. In addition, existing methods use a separate detection method for each type of genomic variation (e.g., SNV: real-time PCR; CNV: CGH array; translocation: FISH), so detecting every type of variation in the cancer tissue of a single patient takes a long time and incurs a high cost. A method is therefore needed that can detect the various genomic variations occurring in cancer patients simply and quickly, with high sensitivity and accuracy.
One aspect is to provide a composition for use in detecting variations in the genomic DNA of cancer cells.
Another aspect is to provide a method for detecting variations in the genomic DNA of cancer cells.
One aspect provides a composition for use in detecting variations in the genomic DNA of cancer cells, the composition comprising: a first polynucleotide comprising a contiguous nucleotide sequence selected from the nucleotide sequence of the TERT gene, or a complementary polynucleotide thereof; a second polynucleotide comprising a contiguous nucleotide sequence selected from the nucleotide sequence of the exon region of each of the ABL1, AKT1, AKT2, AKT3, ALK, APC, ARID1A, ARID1B, ARID2, ATM, ATRX, AURKA, AURKB, BCL2, BRAF, BRCA1, BRCA2, CDH1, CDK4, CDK6, CDKN2A, CSF1R, CTNNB1, DDR2, EGFR, EPHB4, ERBB2, ERBB3, ERBB4, EZH2, FBXW7, FGFR1, FGFR2, FGFR3, FLT3, GNA11, GNAQ, GNAS, HNF1A, HRAS, IDH1, IDH2, IGF1R, ITK, JAK1, JAK2, JAK3, KDR, KIT, KRAS, MDM2, MET, MLH1, MPL, MTOR, NF1, NOTCH1, NPM1, NRAS, NTRK1, PDGFRA, PDGFRB, PIK3CA, PIK3R1, PTCH1, PTCH2, PTEN, PTPN11, RB1, RET, ROS1, SMAD4, SMARCB1, SMO, SRC, STK11, SYK, TOP1, TP53 and VHL genes, or a complementary polynucleotide thereof; and a third polynucleotide comprising a contiguous nucleotide sequence selected from the nucleotide sequence of the intron region of each of the ALK, RET, ROS1, EWSR1 and TMPRSS2 genes, or a complementary polynucleotide thereof.
The term "polynucleotide" means a polymer of nucleotides of any length; in this specification, polynucleotide is used interchangeably with nucleic acid or oligonucleotide.
The term "gene" means a structural unit that determines genetic information. It includes a structural gene carrying information that determines the amino acid sequence of a protein or the base sequence of a functional RNA (tRNA, rRNA, etc.), and/or a regulatory gene that controls the expression of a structural gene (for example, a promoter, repressor, or operator). In this specification, the term "gene" is understood to mean the single strand comprising the nucleotide sequence that is transcribed to produce the gene product. For example, the "nucleotide sequence of a gene" may mean a nucleotide sequence that encodes a function and/or a nucleotide sequence that controls the expression of a structural gene, contained in the single strand comprising the nucleotide sequence that is transcribed to produce the gene product.
The term "contiguous nucleotide sequence" means a sequence of two or more adjacent nucleotides.
The term "complementary" means having a degree of complementarity sufficient to hybridize selectively to the nucleotide sequence described above under particular hybridization or annealing conditions. The term encompasses both substantially complementary and perfectly complementary, and may specifically mean perfectly complementary.
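As an illustration of the "perfectly complementary" case for DNA, the sketch below computes a reverse complement using standard Watson-Crick pairing. The function names are hypothetical and the example sequence is arbitrary; the substantially complementary case, which tolerates mismatches, is not modeled.

```python
# Watson-Crick pairing for DNA: A pairs with T, C pairs with G.
_DNA_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the perfectly complementary strand, written 5' to 3'."""
    return seq.translate(_DNA_COMPLEMENT)[::-1]


def is_perfectly_complementary(a: str, b: str) -> bool:
    """True if strand b pairs with strand a at every position."""
    return b.upper() == reverse_complement(a).upper()


print(reverse_complement("ATGC"))
print(is_perfectly_complementary("ATGC", "GCAT"))
```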
The term "exon" means a region of a DNA sequence that carries the information for protein synthesis, for example a nucleic acid molecule that encodes part or all of an expressed protein.
The term "intron" means a nucleic acid segment that is transcribed into RNA but is spliced out of the endogenous RNA before the RNA is translated into protein.
The variation may be a variation relative to reference genomic DNA. Specifically, the variation may include a variation in the copy number of a gene or a variation in a nucleotide sequence relative to reference genomic DNA. The variation in gene copy number may be, for example, a copy number variation (CNV). The variation in nucleotide sequence may include a substitution, insertion, deletion, or translocation of one or more nucleotides relative to reference genomic DNA. The substitution of one or more nucleotides may be, for example, a single nucleotide variation (SNV). More specifically, the variation in the genomic DNA of the cancer cell may be one or more selected from the group consisting of single nucleotide variation, insertion-deletion variation, copy number variation, and translocation. Most specifically, the genomic variation may be single nucleotide variation, insertion-deletion variation, copy number variation, and translocation.
Whereas a single nucleotide polymorphism (SNP) refers to a single-base difference found across many populations within a species, the term "single nucleotide variation (SNV)" means a single-base difference found in one sequence or in a small population within a species; for example, it may mean a difference from the reference sequence observed in sequencing data.
The term "insertion-deletion variation (indel)" means the insertion or deletion of a base sequence that can change the number of nucleotides in a gene.
The term "copy number variation (CNV)" means a variation in genomic DNA in which a relatively large region of a particular chromosome is deleted, or is amplified and appears repeatedly; for example, it may be a variation in which a DNA segment of 1 kb or more is duplicated or partially deleted.
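One common way to express a copy number change from sequencing data is a log2 ratio of normalized read depth between a sample and a reference. The sketch below shows that generic calculation, together with the expected ratio when tumor DNA is mixed with normal DNA. It is an illustrative assumption, not the in-house program described in this document; the function names, the diploid baseline of 2 copies, and the example numbers are all hypothetical.

```python
import math


def log2_copy_ratio(sample_depth: float, reference_depth: float,
                    sample_total: float, reference_total: float) -> float:
    """log2 ratio of library-size-normalized depth over a target region."""
    sample_norm = sample_depth / sample_total
    reference_norm = reference_depth / reference_total
    return math.log2(sample_norm / reference_norm)


def expected_log2_ratio(tumor_copies: float, purity: float,
                        normal_copies: float = 2.0) -> float:
    """Expected log2 ratio when a fraction `purity` of cells are tumor cells."""
    if not 0.0 <= purity <= 1.0:
        raise ValueError("purity must be between 0 and 1")
    mixed = purity * tumor_copies + (1.0 - purity) * normal_copies
    return math.log2(mixed / normal_copies)


# Hypothetical amplification to 8 copies, observed at decreasing tumor purity.
for purity in (1.0, 0.5, 0.3):
    print(purity, round(expected_log2_ratio(8, purity), 3))
```

The loop illustrates why lower tumor purity makes amplifications harder to call: the same 8-copy gain produces a progressively smaller log2 ratio as normal DNA dilutes the signal.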
The term "translocation" means a phenomenon in which a break occurs in part of a chromosome and the resulting fragment joins another part of the same chromosome or a different chromosome, changing the form of the chromosome.
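The SNV and indel definitions above can be illustrated by comparing a reference allele with an altered allele. The sketch below assumes alleles written VCF-style, where an indel keeps one shared anchor base; the function name and category labels are hypothetical. CNVs and translocations cannot be classified this way because they are not described by a short allele pair.

```python
def classify_small_variant(ref: str, alt: str) -> str:
    """Classify a reference/altered allele pair by length and content."""
    if not ref or not alt:
        raise ValueError("alleles must be non-empty")
    if ref == alt:
        return "no change"
    if len(ref) == 1 and len(alt) == 1:
        return "SNV"          # single-base substitution, e.g. C to T
    if len(alt) > len(ref):
        return "insertion"    # bases gained relative to the reference
    if len(alt) < len(ref):
        return "deletion"     # bases lost relative to the reference
    return "multi-nucleotide substitution"


print(classify_small_variant("C", "T"))
print(classify_small_variant("A", "ATG"))
print(classify_small_variant("ATG", "A"))
```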
The single nucleotide variation or insertion-deletion variation may occur, for example, in the promoter region of the TERT gene and at one or more positions selected from Table 1 below. The promoter region of the TERT gene may be, for example, positions 1295163-1296162 of human chromosome 5. The copy number variation may occur, for example, at one or more positions selected from Table 1 below. The translocation may occur, for example, at one or more positions selected from Table 2 below.
The first, second, or third polynucleotide may specifically be single-stranded.
The first, second, or third polynucleotide may be one or more selected from DNA, RNA, peptide nucleic acid (PNA), locked nucleic acid (LNA), zip nucleic acid (ZNA), bridged nucleic acid (BNA), and nucleotide analogues. The polynucleotide may specifically be DNA or RNA, and more specifically RNA. In one example, polynucleotides consisting of RNA were used to detect variations in the cancer genome. When the polynucleotide is RNA, its binding strength is superior, which shortens the hybridization time and provides high detection sensitivity. RNA may therefore be advantageous for detecting variations that span large regions, such as nucleotide deletions of 25 bp or more.
The first, second, or third polynucleotide may be, for example, 75 to 200, 80 to 200, 90 to 200, 100 to 200, 100 to 180, 100 to 160, 100 to 140, 100 to 120, 110 to 180, 110 to 160, 110 to 140, 110 to 130, or 110 to 120 nucleotides in size. Specifically, it may be 110 to 120 nucleotides in size, and more specifically 120 nucleotides in size. When the first, second, or third polynucleotide is 75 nucleotides or shorter, the accuracy of capturing the target region is low; when it is 200 nucleotides or longer, the synthesis cost increases. The first, second, or third polynucleotide therefore has a size that is economical and optimized for detecting variations in genomic DNA.
The first, second, or third polynucleotide may specifically mean a population consisting of two or more polynucleotides. In this case, each polynucleotide in the population includes a portion of the nucleotide sequence of the corresponding gene, and no region of that gene's nucleotide sequence is left out of the population. In other words, the entire nucleotide sequence of the gene can be covered by the polynucleotides that make up the population. Here, "covered by polynucleotides" means that a polynucleotide includes a particular nucleotide sequence of the gene or a sequence complementary to it.
In addition, when the first, second, or third polynucleotide is a population of polynucleotides, a single nucleotide in the nucleotide sequence of the gene may be covered by two or more, specifically three or more, of the polynucleotides in the population.
In addition, when the first, second, or third polynucleotide is a population, any polynucleotide in the population and the polynucleotide containing the most closely adjacent nucleotide sequence of the gene may share, for example, 50 to 150, 60 to 140, 70 to 120, 70 to 110, 70 to 100, 70 to 90, 70 to 80, or 80 identical nucleotides. The composition according to one aspect can therefore be used to sequence cancer genes at high coverage.
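The overlap scheme described above (120-nucleotide polynucleotides sharing 80 nucleotides with their neighbors, so that one nucleotide falls within three of them) amounts to tiling a region with a 40-nucleotide step. The sketch below illustrates that arithmetic under stated assumptions: coordinates are 1-based and inclusive, the function names are hypothetical, and the 720-nucleotide region is an arbitrary example rather than a region from this document.

```python
def tile_probes(region_start: int, region_end: int,
                probe_len: int = 120, overlap: int = 80) -> list[tuple[int, int]]:
    """Tile a region with fixed-length probes that overlap their neighbors."""
    step = probe_len - overlap
    if step <= 0:
        raise ValueError("overlap must be smaller than probe length")
    if region_end - region_start + 1 < probe_len:
        raise ValueError("region is shorter than one probe")
    probes = []
    pos = region_start
    while pos + probe_len - 1 <= region_end:
        probes.append((pos, pos + probe_len - 1))
        pos += step
    # If the step does not divide the region evenly, anchor one last probe
    # to the region end so that no tail is left uncovered.
    if probes[-1][1] < region_end:
        probes.append((region_end - probe_len + 1, region_end))
    return probes


def probes_covering(position: int, probes: list[tuple[int, int]]) -> int:
    """Count how many probes include the given position."""
    return sum(start <= position <= end for start, end in probes)


probes = tile_probes(1, 720)
print(len(probes), probes[0], probes[1])
print(probes_covering(100, probes))
```

In this sketch interior positions fall within three probes, while positions within 80 nucleotides of either end of the region fall within fewer; extending the tiled region slightly beyond the target is one way to keep the target itself at threefold coverage.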
The first polynucleotide may specifically comprise contiguous nucleotides selected from the nucleotide sequence of the promoter region of the TERT gene.
The term "promoter" means a DNA region that lies upstream of a structural gene and to which RNA polymerase binds to initiate transcription. The promoter region of the TERT gene may be, for example, positions 1295163-1296162 of human chromosome 5.
Specific examples of each of the genes associated with the second polynucleotide may be as shown in Table 1 below. The genes listed in Table 1 are all of human origin.
Table 1. Specific examples of genes associated with the second polynucleotide
TargetID Interval Chr Start End Regions Size
ABL1 chr9:133589697-133761080 9 133589697 133761080 14 3,964
AKT1 chr14:105236668-105258990 14 105236668 105258990 13 1,703
AKT2 chr19:40739466-40771184 19 40739466 40771184 15 2,133
AKT3 chr1:243663035-244006482 1 243663035 244006482 14 1,764
ALK chr2:29416080-30143535 2 29416080 30143535 31 5,738
APC chr5:112043405-112198243 5 112043405 112198243 19 9,166
ARID1A chr1:27022885-27107257 1 27022885 27107257 20 7,260
ARID1B chr6:157099054-157529035 6 157099054 157529035 23 7,836
ARID2 chr12:46123610-46298871 12 46123610 46298871 24 6,067
ATM chr11:108098342-108236245 11 108098342 108236245 62 10,411
ATRX chrX:76763819-77041497 X 76763819 77041497 37 8,299
AURKA chr20:54945204-54963263 20 54945204 54963263 8 1,387
AURKB chr17:8108179-8113552 17 8108179 8113552 8 1,198
BCL2 chr18:60795848-60985909 18 60795848 60985909 2 793
BRAF chr7:140426284-140624513 7 140426284 140624513 21 2,799
BRCA1 chr17:41197685-41276123 17 41197685 41276123 24 6,184
BRCA2 chr13:32890588-32972917 13 32890588 32972917 26 10,777
CDH1 chr16:68771309-68868194 16 68771309 68868194 17 3,135
CDK4 chr12:58142298-58145510 12 58142298 58145510 7 1,052
CDK6 chr7:92244444-92462647 7 92244444 92462647 7 1,121
CDKN2A chr9:21968218-21994463 9 21968218 21994463 5 1,135
CSF1R chr5:149433622-149466000 5 149433622 149466000 22 3,449
CTNNB1 chr3:41265550-41280843 3 41265550 41280843 14 2,626
DDR2 chr1:162688844-162750046 1 162688844 162750046 16 3,064
EGFR chr7:55086961-55273320 7 55086961 55273320 32 4,734
EPHB4 chr7:100401073-100424662 7 100401073 100424662 17 3,304
ERBB2 chr17:37855803-37884307 17 37855803 37884307 28 4,360
ERBB3 chr12:56474075-56495849 12 56474075 56495849 29 4,745
ERBB4 chr2:212248330-213403264 2 212248330 213403264 29 4,552
EZH2 chr7:148504728-148544400 7 148504728 148544400 21 2,876
FBXW7 chr4:153244023-153332965 4 153244023 153332965 14 2,898
FGFR1 chr8:38271136-38318634 8 38271136 38318634 20 3,220
FGFR2 chr10:123239085-123353341 10 123239085 123353341 23 3,259
FGFR3 chr4:1795652-1809424 4 1795652 1809424 18 3,360
FLT3 chr13:28578179-28674657 13 28578179 28674657 25 3,504
GNA11 chr19:3094640-3121187 19 3094640 3121187 7 1,220
GNAQ chr9:80336229-80646161 9 80336229 80646161 8 1,289
GNAS chr20:57415152-57485894 20 57415152 57485894 17 4,436
HNF1A chr12:121416562-121440298 12 121416562 121440298 12 2,729
HRAS chr11:532626-534332 11 532626 534332 5 733
IDH1 chr2:209101793-209116285 2 209101793 209116285 8 1,408
IDH2 chr15:90627488-90645632 15 90627488 90645632 11 1,579
IGF1R chr15:99192801-99500681 15 99192801 99500681 21 4,524
ITK chr5:156607979-156679698 5 156607979 156679698 18 2,249
JAK1 chr1:65300235-65351957 1 65300235 65351957 24 3,945
JAK2 chr9:5021978-5126801 9 5021978 5126801 23 3,859
JAK3 chr19:17937542-17955236 19 17937542 17955236 23 3,987
KDR chr4:55946098-55991470 4 55946098 55991470 30 4,671
KIT chr4:55524172-55604733 4 55524172 55604733 22 3,379
KRAS chr12:25362719-25398328 12 25362719 25398328 5 787
MDM2 chr12:69202248-69233639 12 69202248 69233639 14 1,945
MET chr7:116335801-116436188 7 116335801 116436188 21 4,779
MLH1 chr3:37035029-37107120 3 37035029 37107120 20 2,711
MPL chr1:43803510-43818453 1 43803510 43818453 12 2,233
MTOR chr1:11166652-11319476 1 11166652 11319476 60 9,217
NF1 chr17:29422318-29705959 17 29422318 29705959 60 9,902
NOTCH1 chr9:139390513-139440248 9 139390513 139440248 34 8,348
NPM1 chr5:170814943-170837579 5 170814943 170837579 12 1,134
NRAS chr1:115251146-115258791 1 115251146 115258791 4 650
NTRK1 chr1:156785612-156851444 1 156785612 156851444 19 2,902
PDGFRA chr4:55106210-55161449 4 55106210 55161449 24 3,930
PDGFRB chr5:149495316-149516620 5 149495316 149516620 22 3,795
PIK3CA chr3:178916604-178952162 3 178916604 178952162 20 3,609
PIK3R1 chr5:67522494-67593439 5 67522494 67593439 19 2,779
PTCH1 chr9:98208655-98279112 9 98208655 98279112 27 5,131
PTCH2 chr1:45286351-45308614 1 45286351 45308614 24 4,171
PTEN chr10:89624217-89725239 10 89624217 89725239 9 1,392
PTPN11 chr12:112856906-112942578 12 112856906 112942578 16 2,112
RB1 chr13:48878039-49054217 13 48878039 49054217 27 3,327
RET chr10:43572697-43623727 10 43572697 43623727 20 3,777
ROS1 chr6:117609645-117746829 6 117609645 117746829 45 7,978
SMAD4 chr18:48573407-48604847 18 48573407 48604847 13 1,999
SMARCB1 chr22:24129347-24176377 22 24129347 24176377 9 1,392
SMO chr7:128828983-128852302 7 128828983 128852302 13 2,647
SRC chr20:36012547-36031792 20 36012547 36031792 12 1,869
STK11 chr19:1206903-1226656 19 1206903 1226656 9 1,482
SYK chr9:93606171-93657892 9 93606171 93657892 13 2,168
TOP1 chr20:39657698-39751947 20 39657698 39751947 21 2,718
TP53 chr17:7565247-7579922 17 7565247 7579922 14 1,697
VHL chr3:10183522-10191659 3 10183522 10191659 3 702
Total 1,555 287,164
* In Table 1, "Regions" is the number of targeted exon regions (regions other than exons may also be included).
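The Interval column of Table 1 combines the Chr, Start, and End columns in a single `chrN:start-end` string. The sketch below parses such a string and computes the genomic span, assuming 1-based inclusive coordinates (an assumption; the table does not state its convention). The function names are hypothetical. Note that the Size column is much smaller than the span for each gene, which is consistent with Size counting only the targeted regions rather than the whole interval.

```python
import re

# Matches strings such as "chr9:133589697-133761080" or "chrX:76763819-77041497".
_INTERVAL = re.compile(r"^chr(?P<chrom>[0-9]+|X|Y):(?P<start>\d+)-(?P<end>\d+)$")


def parse_interval(text: str) -> tuple[str, int, int]:
    """Split an interval string into (chromosome, start, end)."""
    match = _INTERVAL.match(text.replace(" ", ""))
    if match is None:
        raise ValueError(f"not a valid interval: {text!r}")
    start, end = int(match["start"]), int(match["end"])
    if start > end:
        raise ValueError("start must not exceed end")
    return match["chrom"], start, end


def span_length(start: int, end: int) -> int:
    """Length of an interval, assuming 1-based inclusive coordinates."""
    return end - start + 1


chrom, start, end = parse_interval("chr9:133589697-133761080")  # ABL1 row
print(chrom, start, end, span_length(start, end))
```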
상기 제3의 폴리뉴클레오티드와 관련된 각 유전자의 인트론 영역의 구체적 예는 하기 표 2에 나타낸 것일 수 있다. 표 2에 나열된 유전자들은 모두 인간으로부터 유래한다.Specific examples of the intron region of each gene associated with the third polynucleotide may be shown in Table 2 below. The genes listed in Table 2 are all derived from humans.
Table 2. Specific examples of the introns of each gene associated with the third polynucleotide
Gene     TargetID   Interval                  Chr.  Start      End        Regions  Size
ALK      ALK1       chr2:29445475-29446206    2     29445475   29446206   1        732
ALK      ALK2       chr2:29446396-29448325    2     29446396   29448325   1        1,930
ALK      ALK3       chr2:29448433-29449786    2     29448433   29449786   1        1,354
EWSR1    EWSR11     chr22:29670273-29674017   22    29670273   29674017   1        3,745
EWSR1    EWSR110    chr22:29693941-29694721   22    29693941   29694721   1        781
EWSR1    EWSR111    chr22:29694887-29695222   22    29694887   29695222   1        336
EWSR1    EWSR112    chr22:29695323-29695587   22    29695323   29695587   1        265
EWSR1    EWSR12     chr22:29674207-29682910   22    29674207   29682910   1        8,704
EWSR1    EWSR13     chr22:29678548-29682910   22    29678548   29682910   1        4,363
EWSR1    EWSR14     chr22:29683125-29684593   22    29683125   29684593   1        1,469
EWSR1    EWSR15     chr22:29684777-29687552   22    29684777   29687552   1        2,776
EWSR1    EWSR16     chr22:29687590-29688124   22    29687590   29688124   1        535
EWSR1    EWSR17     chr22:29688160-29688475   22    29688160   29688475   1        316
EWSR1    EWSR18     chr22:29688597-29692227   22    29688597   29692227   1        3,631
EWSR1    EWSR19     chr22:29692360-29693815   22    29692360   29693815   1        1,456
RET      RET1       chr10:43604680-43606653   10    43604680   43606653   1        1,974
RET      RET2       chr10:43606915-43607545   10    43606915   43607545   1        631
RET      RET3       chr10:43607674-43608299   10    43607674   43608299   1        626
RET      RET4       chr10:43608413-43609002   10    43608413   43609002   1        590
RET      RET5       chr10:43609125-43609926   10    43609125   43609926   1        802
RET      RET6       chr10:43610186-43612030   10    43610186   43612030   1        1,845
ROS1     ROS11      chr6:117641195-117642420  6     117641195  117642420  1        1,226
ROS1     ROS12      chr6:117642559-117645493  6     117642559  117645493  1        2,935
ROS1     ROS13      chr6:117645580-117647385  6     117645580  117647385  1        1,806
ROS1     ROS14      chr6:117647579-117650490  6     117647579  117650490  1        2,912
ROS1     ROS15      chr6:117650611-117658333  6     117650611  117658333  1        7,723
TMPRSS2  TMPRSS21   chr21:42852531-42860319   21    42852531   42860319   1        7,789
TMPRSS2  TMPRSS22   chr21:42860442-42861432   21    42860442   42861432   1        991
TMPRSS2  TMPRSS23   chr21:42861522-42866281   21    42861522   42866281   1        4,760
TMPRSS2  TMPRSS24   chr21:42866507-42870044   21    42866507   42870044   1        3,538
TMPRSS2  TMPRSS25   chr21:42870118-42880006   21    42870118   42880006   1        9,889
Total                                                                     31       82,430
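The Size column of Table 2 is consistent with End - Start + 1, that is, with 1-based inclusive coordinates. As a quick consistency check, the following minimal Python sketch recomputes the sizes of the three ALK targets from their Interval strings. The helper name `interval_size` is illustrative only and is not part of the disclosure.

```python
import re

def interval_size(interval: str) -> int:
    """Return the inclusive length of an interval such as 'chr2:29445475-29446206'."""
    match = re.fullmatch(r"chr\w+:(\d+)-(\d+)", interval.replace(" ", ""))
    if match is None:
        raise ValueError(f"unrecognized interval: {interval!r}")
    start, end = int(match.group(1)), int(match.group(2))
    return end - start + 1

# The three ALK intron targets listed in Table 2.
alk_intervals = [
    "chr2:29445475-29446206",
    "chr2:29446396-29448325",
    "chr2:29448433-29449786",
]
alk_sizes = [interval_size(interval) for interval in alk_intervals]
print(alk_sizes)       # [732, 1930, 1354]
print(sum(alk_sizes))  # 4016
```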
The first, second, or third polynucleotide of the present invention can bind specifically to the sequence of its target gene. This specific binding can be used to efficiently separate a target gene, or a fragment thereof, from a mixed sample. The polynucleotide may therefore be referred to as a probe. The term "probe" refers to a substance that specifically detects a particular substance, site, state, or the like.
In one embodiment, the first polynucleotide may be for detecting single nucleotide variations and/or insertion-deletion variations. In one embodiment, the second polynucleotide may be for detecting single nucleotide variations, insertion-deletion variations, and/or copy number variations. In one embodiment, the third polynucleotide may be for detecting gene translocations. Accordingly, by including all of the first to third polynucleotides, the composition of the present invention has the advantage of detecting single nucleotide variations (SNVs), insertion-deletion variations (indels), copy number variations (CNVs), and translocations in a cancer cell genome in a single experiment.
Specifically, the first polynucleotide may comprise one or more sequences selected from the group consisting of the nucleotide sequences of SEQ ID NOs: 1 to 16, and more specifically may comprise all of the polynucleotides having the respective sequences of SEQ ID NOs: 1 to 16.
Specifically, the second polynucleotide may comprise one or more sequences selected from the group consisting of the sequences of SEQ ID NOs: 17 to 7266, and more specifically may comprise all of the polynucleotides having the respective sequences of SEQ ID NOs: 17 to 7266.
Specifically, the third polynucleotide may comprise one or more sequences selected from the group consisting of the sequences of SEQ ID NOs: 7267 to 8102, and more specifically may comprise all of the polynucleotides having the sequences of SEQ ID NOs: 7267 to 8102. The third polynucleotides are at least 75 nucleotides long (for example, 120 nucleotides) and are designed to cover the intron regions of five genes (ALK, RET, EWSR1, ROS1, and TMPRSS2), so that new regions and genes involved in translocations with these five genes can be detected.
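The assignment of SEQ ID NOs to the three probe groups, together with the variation types each group is described as detecting, can be summarized in a small lookup. The following Python sketch is illustrative only; the names `PROBE_GROUPS`, `probe_group`, and `group_sizes` are hypothetical and do not appear in the disclosure.

```python
# (first SEQ ID NO, last SEQ ID NO, group name, variation types the group targets)
PROBE_GROUPS = [
    (1, 16, "first", ("SNV", "indel")),
    (17, 7266, "second", ("SNV", "indel", "CNV")),
    (7267, 8102, "third", ("translocation",)),
]

def probe_group(seq_id_no: int):
    """Return (group name, variation types) for a given SEQ ID NO."""
    for first, last, name, variations in PROBE_GROUPS:
        if first <= seq_id_no <= last:
            return name, variations
    raise ValueError(f"SEQ ID NO {seq_id_no} is outside the range 1 to 8102")

def group_sizes():
    """Number of sequences in each group, derived from the inclusive ranges."""
    return {name: last - first + 1 for first, last, name, _ in PROBE_GROUPS}

print(probe_group(7300))  # ('third', ('translocation',))
print(group_sizes())      # {'first': 16, 'second': 7250, 'third': 836}
```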
The first, second, or third polynucleotide may further comprise a moiety for separation or purification of the polynucleotide. The moiety may be attached to one or more of the nucleotides constituting the polynucleotide. The moiety may comprise one or more selected from the group consisting of biotin, avidin, and streptavidin. In addition, the moiety, for example biotin, avidin, or streptavidin, may comprise a magnetic bead, or a substance that binds specifically to the moiety may comprise a magnetic bead. The separation or purification may be effected by a substance that binds specifically to the moiety, or by a magnetic field. In one example, biotin was attached to one or more bases of the polynucleotide, the biotinylated polynucleotide (probe) was hybridized with genomic DNA, streptavidin particles coated on magnetic beads were then bound, and the polynucleotide hybridized with genomic DNA was separated using a magnetic field.
In the composition, the cancer cells may be isolated from a cancer patient. The cancer may be, for example, a solid cancer; specifically, the cancer may be one or more selected from the group consisting of liver cancer, glioma, ovarian cancer, colorectal cancer, head and neck cancer, bladder cancer, renal cell carcinoma, gastric cancer, breast cancer, metastatic cancer, prostate cancer, pancreatic cancer, and lung cancer. However, the cancer is not limited thereto, and the composition is applicable to all types of cancer.
The composition may be for use in searching for an anticancer agent effective in reducing the viability of cancer cells. A reduction in the viability of cancer cells is understood to include elimination of cancer cells and inhibition or delay of cancer cell metastasis or growth. The composition may also be for use in searching for an anticancer agent effective in treating cancer in a patient having cancer cells. The effective anticancer agent may mean an anticancer agent that reduces viability or treats cancer better than an anticancer agent selected without using the composition.
The composition may be in a liquid state, and the liquid may be an aqueous solution. The composition may further comprise a buffer. The composition may contain the first, second, and third polynucleotides in a single container.
The composition for detecting variations in a cancer cell genome according to the present invention may be in the form of a kit. The kit may comprise a first polynucleotide comprising a contiguous nucleotide sequence selected from the nucleotide sequence of the TERT gene, or a polynucleotide complementary thereto; a second polynucleotide comprising a contiguous nucleotide sequence selected from the nucleotide sequences of the exon regions of each of the ABL1, AKT1, AKT2, AKT3, ALK, APC, ARID1A, ARID1B, ARID2, ATM, ATRX, AURKA, AURKB, BCL2, BRAF, BRCA1, BRCA2, CDH1, CDK4, CDK6, CDKN2A, CSF1R, CTNNB1, DDR2, EGFR, EPHB4, ERBB2, ERBB3, ERBB4, EZH2, FBXW7, FGFR1, FGFR2, FGFR3, FLT3, GNA11, GNAQ, GNAS, HNF1A, HRAS, IDH1, IDH2, IGF1R, ITK, JAK1, JAK2, JAK3, KDR, KIT, KRAS, MDM2, MET, MLH1, MPL, MTOR, NF1, NOTCH1, NPM1, NRAS, NTRK1, PDGFRA, PDGFRB, PIK3CA, PIK3R1, PTCH1, PTCH2, PTEN, PTPN11, RB1, RET, ROS1, SMAD4, SMARCB1, SMO, SRC, STK11, SYK, TOP1, TP53, and VHL genes, or a polynucleotide complementary thereto; and a third polynucleotide comprising a contiguous nucleotide sequence selected from the nucleotide sequences of the intron regions of each of the ALK, RET, ROS1, EWSR1, and TMPRSS2 genes, or a polynucleotide complementary thereto. These polynucleotides are as described above.
The kit may further comprise known materials required for the polynucleotides to hybridize with genomic nucleic acid, for example reagents, buffers, cofactors, and/or substrates required for nucleic acid hybridization. When the kit is applied to a PCR amplification process, it may optionally comprise reagents required for PCR amplification, such as a buffer, a DNA polymerase, DNA polymerase cofactors, and dNTPs; when the kit is applied to an immunoassay, it may optionally comprise a secondary antibody and a substrate for the label. The kit may further comprise instructions for use in amplifying the target nucleic acid, and may be manufactured as a plurality of separate packages or compartments containing the reagent components described above.
Another aspect provides a method of detecting variations in the genomic DNA of cancer cells using the composition. In one embodiment, the detection method may comprise: contacting genomic DNA derived from a cancer cell with the composition for use in detecting variations in the genomic DNA of the cancer cell, to obtain a hybridization product of the genomic DNA and the first, second, or third polynucleotide in the composition; determining the nucleotide sequence of the genomic DNA in the hybridization product; and comparing the determined nucleotide sequence of the genomic DNA with a reference nucleotide sequence to identify variations in the genomic DNA.
In the detection method, the composition is as described above.
The genomic variation is as described above. Specifically, the genomic variation may be one or more of a single nucleotide variation, an insertion-deletion variation, a copy number variation, and a translocation, and more specifically may include all of single nucleotide variations, insertion-deletion variations, copy number variations, and translocations.
The cancer may be, for example, a solid cancer. Specifically, the cancer may be one or more selected from the group consisting of liver cancer, glioma, ovarian cancer, colorectal cancer, head and neck cancer, bladder cancer, renal cell carcinoma, gastric cancer, breast cancer, metastatic cancer, prostate cancer, pancreatic cancer, and lung cancer.
The genomic DNA derived from the cancer cell may be genomic DNA, or a fragment thereof, isolated from a biological sample. For example, the sample may be one or more selected from the group consisting of blood, saliva, urine, feces, tissue, cells, and biopsy material. The sample may comprise a stored biological sample or genomic DNA isolated therefrom, and the storage may be by any known method. The genomic DNA may be DNA or RNA derived from tissue stored frozen or from formalin-fixed, paraffin-embedded tissue stored at room temperature. Methods of isolating genomic DNA from biological samples are well known. The cancer cell may be isolated from a cancer patient. Accordingly, the sample may be isolated from the cells, tissues, organs, or body fluids of a cancer patient, in which case the sample may be obtained by a conventional method, for example by biopsy using methods well known to those skilled in the relevant medical art.
In one embodiment, the genomic DNA contained in the sample may be fragmented to an arbitrary size. The method may further comprise fragmenting the cancer cell-derived genomic DNA before or during the contacting step. The fragmentation may be performed by methods well known to those skilled in the art; for example, genomic DNA may be fragmented by sonication. The detection method may comprise, after fragmentation of the genomic DNA, ligating a sequence for amplification to both ends of the fragmented genomic DNA. The ligation of the sequence for amplification (for example, a paired-end tag or universal tag) may be performed by a person skilled in the art by appropriately selecting a known technique.
In the detection method, the hybridization may be performed by a known method, for example by incubating the polynucleotide and the genomic DNA in a buffer known to be suitable for nucleic acid hybridization. Hybridization may be performed at an appropriate temperature, for example 40 to 80°C, 50 to 75°C, 60 to 70°C, or 62 to 67°C, and specifically 65°C. The hybridization temperature is not limited thereto and may be selected appropriately according to the sequence and length of the polynucleotides included in the composition. The hybridization time may be, for example, 1 hour to 12 hours (overnight). In one embodiment, the polynucleotides included in the composition hybridize with fragments of genomic DNA having the sequences of the targeted genes.
The method may further comprise separating the hybridization product of the genomic DNA and the first, second, or third polynucleotide. This separation step may consist of separating the hybridization product from the contact product obtained in the contacting step, before the step of determining the nucleotide sequence of the genomic DNA in the hybridization product. The separation may use the moiety for separation or purification attached to the polynucleotide, and may be effected by a substance that binds specifically to the moiety, or by a magnetic field. In one embodiment, streptavidin-coated magnetic beads were used to separate the hybridization product of genomic DNA and the biotinylated first, second, or third polynucleotide. This separation allows genomic DNA hybridized with the polynucleotides to be detected selectively, and may be referred to as "target capture".
The separation may further comprise separating the genomic DNA from the hybridization product, that is, separating the hybridized polynucleotide from the genomic DNA. Separation of the genomic DNA from the hybridization product may be performed, for example, by separation at high temperature followed by amplification using primers specific to the target DNA. The high temperature may be 80 to 110°C, 90 to 100°C, or 95°C.
The detection method may further comprise amplifying the genomic DNA by PCR, using the separated hybridization product or the genomic DNA as a template and, as a primer, a universal primer complementary to the sequence for amplification attached to each genomic DNA fragment. The nucleotide sequence can be determined using the amplified genomic DNA.
The nucleotide sequence may be determined, for example, by a sequencing method, and specifically by next generation sequencing. The term "next generation sequencing (NGS)" refers to a technology that fragments the whole genome in a chip-based and PCR-based paired-end format and sequences the fragments at very high speed on the basis of a chemical reaction (hybridization). Next generation sequencing can generate a large amount of sequence data for the sample under analysis in a short time.
The detection method includes comparing the determined nucleotide sequence of the genomic DNA with a reference sequence. The term "reference nucleotide sequence" may mean a human genome sequence that contains no variations and serves as the reference for identifying variations. For example, a human gene sequence published in the database of the National Center for Biotechnology Information (NCBI) of the US National Institutes of Health, specifically NCBI37.1 or UCSC hg19 (GRCh37), may be used as the reference sequence. The comparison between the nucleotide sequence of the genomic DNA and the reference sequence may be performed using various known sequence comparison programs, for example Maq, Bowtie, SOAP, or GSNAP.
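To make the comparison step concrete, the following is a deliberately simplified Python sketch that reports the positions at which an already aligned, ungapped read differs from a reference sequence. It is not how the named alignment programs work internally, and the function name `find_mismatches` and the toy sequences are assumptions made purely for illustration.

```python
def find_mismatches(reference: str, read: str, offset: int):
    """List (1-based reference position, reference base, read base) per mismatch.

    Assumes the read is already aligned without gaps, starting at the
    0-based reference index `offset`.
    """
    if offset < 0 or offset + len(read) > len(reference):
        raise ValueError("read does not fit inside the reference at this offset")
    mismatches = []
    for i, read_base in enumerate(read):
        ref_base = reference[offset + i]
        if read_base != ref_base:
            mismatches.append((offset + i + 1, ref_base, read_base))
    return mismatches

reference = "ACGTACGTAC"  # toy reference, not a real genomic sequence
read = "GTATGT"           # aligned at offset 2; differs from "GTACGT" at one base
print(find_mismatches(reference, read, 2))  # [(6, 'C', 'T')]
```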
The detection method may further comprise comparing the copy number of genomic DNA in a specific region, altered by the development and progression of cancer, with the level obtained using a reference sample (control level). The detection method may further comprise determining that the copy number of the genomic DNA is increased when the level of the amplified genomic DNA is increased relative to the control level.
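One common way to express such a comparison against a control level is a log2 ratio of normalized read depth. The Python sketch below illustrates that idea under stated assumptions: depths are normalized by the panel-wide total, every region has non-zero depth, and the gain threshold of 0.58 (about a 1.5-fold increase) is a hypothetical value. It is not the in-house program referred to in this document.

```python
import math

def copy_number_log2_ratios(tumor_depth, reference_depth):
    """log2 of (tumor fraction / reference fraction) for each region.

    Both inputs map region name -> read depth; all depths must be > 0.
    """
    tumor_total = sum(tumor_depth.values())
    reference_total = sum(reference_depth.values())
    ratios = {}
    for region, depth in tumor_depth.items():
        tumor_fraction = depth / tumor_total
        reference_fraction = reference_depth[region] / reference_total
        ratios[region] = math.log2(tumor_fraction / reference_fraction)
    return ratios

def call_gains(ratios, threshold=0.58):
    """Regions whose log2 ratio exceeds the (hypothetical) gain threshold."""
    return sorted(region for region, ratio in ratios.items() if ratio > threshold)

# Toy depths chosen for illustration only; they are not measured data.
tumor = {"ERBB2": 400, "TP53": 100, "PTEN": 100}
reference = {"ERBB2": 100, "TP53": 100, "PTEN": 100}
print(call_gains(copy_number_log2_ratios(tumor, reference)))  # ['ERBB2']
```

Because the normalization uses the panel-wide total, a strong gain in one region lowers the apparent ratio of the others; a practical implementation would need a more robust baseline.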
The detection method includes identifying variations in the genomic DNA. The variations may be identified using a known variant detection program, for example GATK, SAMtools, MoDIL, SegSeq, PEMer, VariationHunter, Pindel, BreakDancer, or MuTect, but the method is not limited thereto. In one example, single nucleotide variations and insertion-deletion variations were identified using the GATK-2.2.9 algorithm. CNVs were detected with an in-house program developed to compare, as a relative value, the signal intensity of the cancer tissue specimen with the signal intensity obtained using a reference cell line. Translocations were identified using a CIGAR algorithm applied to discordant reads that were extracted separately during BAM file generation.
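The document does not describe the CIGAR algorithm in detail. As a rough illustration of how CIGAR strings and mate information can flag reads that may span a translocation breakpoint, the following Python sketch marks a read as a candidate when its mate maps to a different chromosome or when it carries a long soft clip. The function names and the `min_clip` threshold of 20 bases are hypothetical, and the sketch should not be read as the authors' method.

```python
import re

# CIGAR operations defined by the SAM format: M, I, D, N, S, H, P, =, X.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")

def parse_cigar(cigar: str):
    """Split a CIGAR string such as '60M40S' into [(60, 'M'), (40, 'S')]."""
    ops = [(int(length), op) for length, op in CIGAR_OP.findall(cigar)]
    if "".join(f"{length}{op}" for length, op in ops) != cigar:
        raise ValueError(f"malformed CIGAR string: {cigar!r}")
    return ops

def max_soft_clip(cigar: str) -> int:
    """Length of the longest soft-clipped segment, or 0 if there is none."""
    return max((length for length, op in parse_cigar(cigar) if op == "S"), default=0)

def is_translocation_candidate(chrom, mate_chrom, cigar, min_clip=20):
    """Flag reads whose mate is on another chromosome or that are heavily clipped."""
    return chrom != mate_chrom or max_soft_clip(cigar) >= min_clip

print(is_translocation_candidate("chr2", "chr2", "100M"))    # False
print(is_translocation_candidate("chr2", "chr2", "60M40S"))  # True
print(is_translocation_candidate("chr10", "chr2", "100M"))   # True
```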
The variation identification step may further comprise comparing the extracted variation information with an existing database of cancer-related gene variations to determine whether each variation is already known or newly discovered.
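A minimal sketch of this known-versus-novel classification is shown below, assuming variants are represented as (gene, change) tuples and the database is an in-memory set. Both representations, the function name `split_known_novel`, and the placeholder entry are assumptions for illustration; the EGFR entries are the two variants mentioned elsewhere in this document.

```python
def split_known_novel(variants, known_db):
    """Partition detected variants into (known, novel), preserving input order."""
    known = [variant for variant in variants if variant in known_db]
    novel = [variant for variant in variants if variant not in known_db]
    return known, novel

known_db = {("EGFR", "L858R"), ("EGFR", "L861Q")}
detected = [("EGFR", "L858R"), ("GENE_X", "placeholder_change")]  # second entry is a dummy

known, novel = split_known_novel(detected, known_db)
print(known)  # [('EGFR', 'L858R')]
print(novel)  # [('GENE_X', 'placeholder_change')]
```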
The detection method may further comprise determining the correlation between the identified variation in the genomic DNA and the cancer treatment effect of an anticancer agent in a subject. Determining the correlation may comprise identifying the mechanism of action of the anticancer agent and/or the target on which it acts. By determining this correlation, the detection method makes it possible to identify variations in genomic DNA and the anticancer agents associated with them, and/or to select an anticancer agent. The detection method can therefore provide information for selecting a cancer therapeutic agent tailored to the subject.
The term "subject" refers to any animal classified as a mammal that has, or is suspected of having, cancer, and includes livestock and farm animals, primates, and humans, for example humans, non-human primates, cattle, horses, pigs, sheep, goats, dogs, cats, or rodents. Specifically, the subject is a human male or female of any age or race. "Subject" and "patient" are used interchangeably herein.
In the step of obtaining genomic DNA from the subject, the genomic DNA may be obtained from cancer tissue or cancer cells of the subject. Any method known to those skilled in the art for isolating genomic DNA from tissue or cells may be used.
The method includes selecting, on the basis of the identified correlation, a cancer therapeutic agent associated with the variation from a cancer therapeutic agent database. By analyzing the correlations among genetic information on the cancer, clinical information on the patient, variation information for genomic DNA extracted from the patient, and information on cancer therapeutic agents related to genomic variations, an algorithm can be constructed that predicts the associations among these data. The constructed algorithm can also be used to derive a patient-tailored cancer therapeutic agent from the genetic information data of the cancer and the clinical information data of the patient, without additional experiments under in vitro and in vivo conditions. For example, the algorithm of the present invention for selecting a subject-tailored cancer therapeutic agent from genetic information data and clinical information data may be constructed by: receiving the results of genomic variation analysis of a sample; receiving clinical information data of the patient; selecting a subject-tailored cancer therapeutic agent from a cancer therapeutic agent database (DB) on the basis of the genomic variation analysis results and the clinical information data; accumulating the genetic information data of the cancer, the clinical information data of the subject, and the correspondingly selected subject-tailored anticancer agent data; and analyzing the correlations among the accumulated genetic information data, clinical information data, and patient-tailored anticancer agent data. The cancer therapeutic agent DB may be a DB in which correlations between known sequence variation information and cancer therapeutic agents are organized. For example, it is known that when an SNV in which leucine 858 in exon 21 of the epidermal growth factor receptor (EGFR) gene is substituted with arginine (L858R), or an SNV in which leucine 861 is substituted with glutamine (L861Q), is detected in a patient with non-small cell lung cancer (NSCLC), the therapeutic effect of gefitinib and erlotinib is increased (US 8,105,769). The algorithm can be updated by adding data, thereby improving the success rate of selecting patient-tailored anticancer agents.
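The selection and accumulation steps listed above can be sketched as a lookup against a variant-to-drug table followed by appending a case record. The Python sketch below is illustrative only: the database holds just the two EGFR associations cited in the text, and the names `DRUG_DB`, `select_therapies`, and `accumulate_case`, as well as the record layout, are assumptions rather than the disclosed implementation.

```python
# Associations taken from the EGFR example in the text; a real DB would be far larger.
DRUG_DB = {
    ("EGFR", "L858R"): ["gefitinib", "erlotinib"],
    ("EGFR", "L861Q"): ["gefitinib", "erlotinib"],
}

def select_therapies(variants, drug_db=DRUG_DB):
    """Map each detected variant that has a DB entry to its associated agents."""
    return {variant: drug_db[variant] for variant in variants if variant in drug_db}

def accumulate_case(records, patient_id, variants, clinical_info, drug_db=DRUG_DB):
    """Append one case so that later correlation analysis has data to work on."""
    records.append({
        "patient_id": patient_id,
        "variants": list(variants),
        "clinical_info": dict(clinical_info),
        "selected_agents": select_therapies(variants, drug_db),
    })
    return records

records = []
accumulate_case(records, "patient-001", [("EGFR", "L858R")], {"diagnosis": "NSCLC"})
print(records[0]["selected_agents"])  # {('EGFR', 'L858R'): ['gefitinib', 'erlotinib']}
```

Keeping each case as a plain record makes it straightforward to add new data later, which is the updating step the text describes.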
The composition according to one aspect can detect diverse genomic variations in major cancer-related genes in a single experiment, and can therefore be far more economical and efficient than conventional methods that use a different platform for each type of genomic variation. In addition, sequencing the major genes involved in cancer initiation and progression at high coverage provides high resolution, enabling the identification of low-frequency genomic variations that existing methods could not detect. Accordingly, the composition according to one aspect can be used to analyze diverse genetic variations simultaneously, with high sensitivity and accuracy, in a sample containing the genome of cancer cells, and patient-tailored cancer therapeutic agents can be searched for efficiently on the basis of the analysis results.
FIG. 1 is a schematic diagram of a method for analyzing nucleotide variations in a cancer sample using the composition according to one aspect.
FIG. 2 shows the genes that the composition according to one aspect can capture, grouped by variation type.
FIG. 3A shows some of the single nucleotide variations detected by NGS using the composition according to one aspect, and FIG. 3B shows the detection, using the composition according to one aspect, of an SNV in which cytosine (C) is substituted with thymine (T) in an exon of the EGFR gene.
FIG. 4A is a table summarizing some of the insertion-deletion variations detected by NGS using the composition according to one aspect, and FIG. 4B shows an insertion-deletion variation detected using the composition according to one aspect.
FIG. 5A shows copy number variations detected by NGS using the composition according to one aspect, and FIG. 5B is a table summarizing some of the copy number variations detected by NGS using the composition according to one aspect.
FIG. 6A shows a gene translocation detected by NGS using the composition according to one aspect, and FIG. 6B is a table summarizing some of the gene translocations detected by NGS using the probe composition according to one aspect.
FIG. 7 is a graph showing the sensitivity of single nucleotide variation detection using the composition according to one aspect.
Hereinafter, the present invention is described in more detail by way of examples. These examples are provided only to illustrate the present invention, and the scope of the present invention is not limited by them.
Example 1. Selection of Genes as Variation Detection Targets
Using cancer-related DBs (for example, My Cancer Genome (http://www.mycancergenome.org) and The Cancer Genome Atlas (TCGA) (http://cancergenome.nih.gov)) and the prior literature, an optimal number of hotspot genes that are frequently mutated in major cancers was selected so that variations in major cancer genes can be detected at reasonable cost and with maximized detection efficiency. As a result, ABL1, AKT1, AKT2, AKT3, ALK, APC, ARID1A, ARID1B, ARID2, ATM, ATRX, AURKA, AURKB, BCL2, BRAF, BRCA1, BRCA2, CDH1, CDK4, CDK6, CDKN2A, CSF1R, CTNNB1, DDR2, EGFR, EPHB4, ERBB2, ERBB3, ERBB4, EZH2, FBXW7, FGFR1, FGFR2, FGFR3, FLT3, GNA11, GNAQ, GNAS, HNF1A, HRAS, IDH1, IDH2, IGF1R, ITK, JAK1, JAK2, JAK3, KDR, KIT, KRAS, MDM2, MET, MLH1, MPL, MTOR, NF1, NOTCH1, NPM1, NRAS, NTRK1, PDGFRA, PDGFRB, PIK3CA, PIK3R1, PTCH1, PTCH2, PTEN, PTPN11, RB1, RET, ROS1, SMAD4, SMARCB1, SMO, SRC, STK11, SYK, TOP1, TP53, and VHL were selected as genes for detecting InDels and CNVs, and the TERT gene, in which SNPs and InDels can be detected, was additionally selected. ALK, RET, ROS1, EWSR1, and TMPRSS2 were selected as genes for detecting chromosomal translocations.
Example 2. Preparation of Probes for Cancer Sample Analysis
Sixteen polynucleotides (SEQ ID NOs: 1 to 16) capable of detecting the promoter region of the TERT gene were prepared (Agilent, Santa Clara, USA; Table 3). Each polynucleotide is 120 bp long, and the polynucleotides were designed so that two polynucleotides with adjacent SEQ ID NOs overlap by 80 bases (for example, the 80 bases at the 3' end of SEQ ID NO: 1 are identical to the 80 bases at the 5' end of SEQ ID NO: 2). Each of the 16 polynucleotides hybridizes to only a part of the promoter region of the TERT gene, but together they were designed to cover the entire nucleotide sequence of the promoter region. The polynucleotides are single-stranded RNA and include nucleotides of the promoter region of the DNA strand from which the TERT gene is transcribed.
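The tiling scheme described above (120-base probes, 80-base overlap between neighbors, hence a 40-base step) can be sketched as follows. This is an illustration only: the function name and the placeholder sequence are hypothetical, and the 720-base region length is an assumption derived from 16 probes at a 40-base step (120 + 15 × 40), not a figure stated in the text.

```python
def tile_probes(region, probe_len=120, overlap=80):
    """Tile a target region with fixed-length, overlapping probes.

    Returns a list of (start, sequence) tuples with 0-based start offsets.
    """
    if not 0 <= overlap < probe_len:
        raise ValueError("overlap must be smaller than the probe length")
    if len(region) < probe_len:
        raise ValueError("region is shorter than one probe")
    step = probe_len - overlap  # 40 bases for the 120/80 design
    starts = list(range(0, len(region) - probe_len + 1, step))
    # Add a final probe flush with the end if the step does not land there.
    if starts[-1] + probe_len < len(region):
        starts.append(len(region) - probe_len)
    return [(s, region[s:s + probe_len]) for s in starts]


def probes_covering(probes, position, probe_len=120):
    """Count probes whose span includes the given 0-based position."""
    return sum(1 for start, _ in probes if start <= position < start + probe_len)
```

With a 720-base placeholder region, `tile_probes` yields 16 probes, and any base away from the two ends is covered by three probes, which matches the three-fold inclusion that a 120-base probe with a 40-base step implies.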
In addition, polynucleotides capable of detecting the AKT1, AKT2, AKT3, ALK, APC, ARID1A, ARID1B, ARID2, ATM, ATRX, AURKA, BCL2, BRAF, BRCA1, BRCA2, CDH1, CDK4, CDK6, CDKN2A, CSF1R, CTNNB1, DDR2, EPHB4, ERBB2, ERBB3, ERBB4, EZH2, FBXW7, FGFR1, FGFR2, FGFR3, FLT3, GNA11, GNAS, HNF1A, HRAS, IDH1, IDH2, IGF1R, ITK, JAK1, JAK2, JAK3, KDR, KRAS, MDM2, MET, MLH1, MPL, MTOR, NF1, NOTCH1, NPM1, NRAS, NTRK1, PDGFRB, PIK3CA, PIK3R1, PTCH1, PTCH2, PTEN, PTPN11, RB1, RET, ROS1, SMAD4, SMO, SRC, STK11, SYK, TOP1, TP53, VHL, ALK, RET, ROS1, EWSR1, and TMPRSS2 genes were prepared (Agilent, Santa Clara, USA). Specific information on each gene used to prepare these polynucleotides is given in Table 1 above. For ABL1, AKT1, AKT2, AKT3, ALK, APC, ARID1A, ARID1B, ARID2, ATM, ATRX, AURKA, AURKB, BCL2, BRAF, BRCA1, BRCA2, CDH1, CDK4, CDK6, CDKN2A, CSF1R, CTNNB1, DDR2, EGFR, EPHB4, ERBB2, ERBB3, ERBB4, EZH2, FBXW7, FGFR1, FGFR2, FGFR3, FLT3, GNA11, GNAQ, GNAS, HNF1A, HRAS, IDH1, IDH2, IGF1R, ITK, JAK1, JAK2, JAK3, KDR, KIT, KRAS, MDM2, MET, MLH1, MPL, MTOR, NF1, NOTCH1, NPM1, NRAS, NTRK1, PDGFRA, PDGFRB, PIK3CA, PIK3R1, PTCH1, PTCH2, PTEN, PTPN11, RB1, RET, ROS1, SMAD4, SMARCB1, SMO, SRC, STK11, SYK, TOP1, TP53, and VHL, the polynucleotides were prepared based on the sequences of the exon regions of each gene.
For ALK, RET, ROS1, EWSR1, and TMPRSS2, polynucleotides were also prepared based on the sequences of the intron regions. Specific information on each gene used to prepare these polynucleotides is given in Table 2 above.
As described above for TERT, each polynucleotide is 120 bp long, the polynucleotides were designed so that two polynucleotides with adjacent SEQ ID NOs overlap by 80 bases, and together they cover the entire sequence of each gene. The design ensures that each nucleotide in a gene is included in three polynucleotides. The polynucleotides are single-stranded RNA and include nucleotides of the gene on the DNA strand from which each gene is transcribed (the antisense DNA strand). The SEQ ID NOs of the polynucleotides prepared for each gene are summarized in Table 3 below.
Summary of probe SEQ ID NOs by target gene

| Gene a) | Start No. b) | End No. | Detectable variation c) |
|---|---|---|---|
| TERT | 1 | 16 | SNV, Indel |
| ABL1 | 17 | 114 | SNV, Indel, CNV |
| AKT1 | 115 | 161 | SNV, Indel, CNV |
| AKT2 | 162 | 217 | SNV, Indel, CNV |
| AKT3 | 218 | 266 | SNV, Indel, CNV |
| ALK | 267 | 413 | SNV, Indel, CNV |
| APC | 414 | 638 | SNV, Indel, CNV |
| ARID1A | 639 | 804 | SNV, Indel, CNV |
| ARID1B | 805 | 974 | SNV, Indel, CNV |
| ARID2 | 975 | 1126 | SNV, Indel, CNV |
| ATM | 1127 | 1392 | SNV, Indel, CNV |
| ATRX | 1393 | 1607 | SNV, Indel, CNV |
| AURKA | 1608 | 1638 | SNV, Indel, CNV |
| AURKB | 1639 | 1669 | SNV, Indel, CNV |
| BCL2 | 1670 | 1687 | SNV, Indel, CNV |
| BRAF | 1688 | 1766 | SNV, Indel, CNV |
| BRCA1 | 1767 | 1924 | SNV, Indel, CNV |
| BRCA2 | 1925 | 2185 | SNV, Indel, CNV |
| CDH1 | 2186 | 2258 | SNV, Indel, CNV |
| CDK4 | 2259 | 2284 | SNV, Indel, CNV |
| CDK6 | 2285 | 2313 | SNV, Indel, CNV |
| CDKN2A | 2314 | 2340 | SNV, Indel, CNV |
| CSF1R | 2341 | 2430 | SNV, Indel, CNV |
| CTNNB1 | 2431 | 2493 | SNV, Indel, CNV |
| DDR2 | 2494 | 2565 | SNV, Indel, CNV |
| EGFR | 2566 | 2690 | SNV, Indel, CNV |
| EPHB4 | 2691 | 2769 | SNV, Indel, CNV |
| ERBB2 | 2770 | 2879 | SNV, Indel, CNV |
| ERBB3 | 2880 | 3000 | SNV, Indel, CNV |
| ERBB4 | 3001 | 3117 | SNV, Indel, CNV |
| EZH2 | 3118 | 3195 | SNV, Indel, CNV |
| FBXW7 | 3196 | 3262 | SNV, Indel, CNV |
| FGFR1 | 3263 | 3348 | SNV, Indel, CNV |
| FGFR2 | 3349 | 3440 | SNV, Indel, CNV |
| FGFR3 | 3441 | 3521 | SNV, Indel, CNV |
| FLT3 | 3522 | 3620 | SNV, Indel, CNV |
| GNA11 | 3621 | 3650 | SNV, Indel, CNV |
| GNAQ | 3651 | 3682 | SNV, Indel, CNV |
| GNAS | 3683 | 3796 | SNV, Indel, CNV |
| HNF1A | 3797 | 3857 | SNV, Indel, CNV |
| HRAS | 3858 | 3878 | SNV, Indel, CNV |
| IDH1 | 3879 | 3915 | SNV, Indel, CNV |
| IDH2 | 3916 | 3959 | SNV, Indel, CNV |
| IGF1R | 3960 | 4067 | SNV, Indel, CNV |
| ITK | 4068 | 4128 | SNV, Indel, CNV |
| JAK1 | 4129 | 4231 | SNV, Indel, CNV |
| JAK2 | 4232 | 4331 | SNV, Indel, CNV |
| JAK3 | 4332 | 4432 | SNV, Indel, CNV |
| KDR | 4433 | 4557 | SNV, Indel, CNV |
| KIT | 4558 | 4647 | SNV, Indel, CNV |
| KRAS | 4648 | 4669 | SNV, Indel, CNV |
| MDM2 | 4670 | 4721 | SNV, Indel, CNV |
| MET | 4722 | 4835 | SNV, Indel, CNV |
| MLH1 | 4836 | 4912 | SNV, Indel, CNV |
| MPL | 4913 | 4965 | SNV, Indel, CNV |
| MTOR | 4966 | 5206 | SNV, Indel, CNV |
| NF1 | 5207 | 5462 | SNV, Indel, CNV |
| NOTCH1 | 5463 | 5659 | SNV, Indel, CNV |
| NPM1 | 5660 | 5696 | SNV, Indel, CNV |
| NRAS | 5697 | 5714 | SNV, Indel, CNV |
| NTRK1 | 5715 | 5792 | SNV, Indel, CNV |
| PDGFRA | 5793 | 5890 | SNV, Indel, CNV |
| PDGFRB | 5891 | 5984 | SNV, Indel, CNV |
| PIK3CA | 5985 | 6074 | SNV, Indel, CNV |
| PIK3R1 | 6075 | 6145 | SNV, Indel, CNV |
| PTCH1 | 6146 | 6271 | SNV, Indel, CNV |
| PTCH2 | 6272 | 6374 | SNV, Indel, CNV |
| PTEN | 6375 | 6407 | SNV, Indel, CNV |
| PTPN11 | 6408 | 6470 | SNV, Indel, CNV |
| RB1 | 6471 | 6566 | SNV, Indel, CNV |
| RET | 6567 | 6655 | SNV, Indel, CNV |
| ROS1 | 6656 | 6847 | SNV, Indel, CNV |
| SMAD4 | 6848 | 6896 | SNV, Indel, CNV |
| SMARCB1 | 6897 | 6930 | SNV, Indel, CNV |
| SMO | 6931 | 6990 | SNV, Indel, CNV |
| SRC | 6991 | 7039 | SNV, Indel, CNV |
| STK11 | 7040 | 7073 | SNV, Indel, CNV |
| SYK | 7074 | 7127 | SNV, Indel, CNV |
| TOP1 | 7128 | 7205 | SNV, Indel, CNV |
| TP53 | 7206 | 7250 | SNV, Indel, CNV |
| VHL | 7251 | 7266 | SNV, Indel, CNV |
| ALK | 7267 | 7278 | Translocation |
| ALK | 7279 | 7308 | Translocation |
| ALK | 7309 | 7324 | Translocation |
| EWSR1 | 7325 | 7377 | Translocation |
| EWSR1 | 7378 | 7384 | Translocation |
| EWSR1 | 7385 | 7389 | Translocation |
| EWSR1 | 7390 | 7393 | Translocation |
| EWSR1 | 7394 | 7442 | Translocation |
| EWSR1 | 7443 | 7459 | Translocation |
| EWSR1 | 7460 | 7505 | Translocation |
| EWSR1 | 7506 | 7513 | Translocation |
| EWSR1 | 7514 | 7518 | Translocation |
| EWSR1 | 7519 | 7541 | Translocation |
| EWSR1 | 7542 | 7550 | Translocation |
| RET | 7551 | 7582 | Translocation |
| RET | 7583 | 7592 | Translocation |
| RET | 7593 | 7602 | Translocation |
| RET | 7603 | 7611 | Translocation |
| RET | 7612 | 7622 | Translocation |
| RET | 7623 | 7652 | Translocation |
| ROS1 | 7653 | 7671 | Translocation |
| ROS1 | 7672 | 7715 | Translocation |
| ROS1 | 7716 | 7740 | Translocation |
| ROS1 | 7741 | 7782 | Translocation |
| ROS1 | 7783 | 7793 | Translocation |
| TMPRSS2 | 7794 | 7849 | Translocation |
| TMPRSS2 | 7850 | 7860 | Translocation |
| TMPRSS2 | 7861 | 7917 | Translocation |
| TMPRSS2 | 7918 | 7955 | Translocation |
| TMPRSS2 | 7956 | 8102 | Translocation |
a): Gene means the target gene to which each probe binds.
b): Start No. and End No. mean the first and last SEQ ID NOs of the probes.
c): Detectable variation means the type of variation that the probes detect.
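The SEQ ID NO ranges in Table 3 can be read as inclusive intervals, so the number of probes in a range is end minus start plus one. The sketch below, which copies only a few rows of the table and uses hypothetical function names, shows that arithmetic and a simple lookup from a SEQ ID NO to its target gene.

```python
# A few rows copied from Table 3: (gene, start, end, detectable variation).
TABLE3_SUBSET = [
    ("TERT", 1, 16, "SNV, Indel"),
    ("ABL1", 17, 114, "SNV, Indel, CNV"),
    ("AKT1", 115, 161, "SNV, Indel, CNV"),
    ("EGFR", 2566, 2690, "SNV, Indel, CNV"),
    ("ALK", 7267, 7278, "Translocation"),
]


def probe_count(start, end):
    """Number of SEQ ID NOs in an inclusive range."""
    return end - start + 1


def find_target(seq_id, rows):
    """Return (gene, variation) for a SEQ ID NO, or None if no row contains it."""
    for gene, start, end, variation in rows:
        if start <= seq_id <= end:
            return gene, variation
    return None
```

Applied to the three blocks of the table, this gives 16 probes for SEQ ID NOs 1 to 16, 7250 for 17 to 7266, and 836 for 7267 to 8102.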
Example 3. Detection of Variations in Samples Using the Prepared Probes
3-1. Target capture and library preparation
For the NGS experiments, genomic DNA was isolated from various cancer patient-derived specimens (tissue, blood, FFPE, FNA, etc.) using the QIAamp DNA Mini Kit (Qiagen, Valencia, CA, USA). The concentration, purity, and degradation of the isolated genomic DNA were then checked using a Nanodrop 8000 UV-Vis spectrometer (Thermo Scientific Inc., DE, USA), a Qubit 2.0 Fluorometer (Life Technologies Inc., Grand Island, NY, USA), and a 2200 TapeStation Instrument (Agilent Technologies, Santa Clara, CA, USA). Specimens that met the QC criteria were used in the next step of the experiment.
Genomic DNA obtained from each tissue (~250 ng) was sheared using a Covaris S220 (Covaris, MA, USA), and a sequencing library was then prepared through end repair, A-tailing, paired-end adaptor ligation, and amplification. The library was hybridized at 65 °C for 24 hours with a composition containing all of the polynucleotides prepared to capture the 83 genomic regions selected in Example 1, and the genomic DNA library fragments captured by hybridization were purified. Purification relied on the binding between streptavidin and the biotin attached to the polynucleotides. Specifically, streptavidin-coated magnetic beads were bound to the biotin associated with the captured library fragments, and the captured library fragments were then separated from the mixture by magnetic force. The purified genomic DNA library fragments were then amplified together with an index barcode tag. Amplification with primers containing the index barcode tag was performed on a PCR instrument under the following conditions.
| Step | Temperature | Time |
|---|---|---|
| 1 | 98 ℃ | 45 sec |
| 2 | 98 ℃ | 15 sec |
| 3 | 60 ℃ | 30 sec |
| 4 | 72 ℃ | 30 sec |
| 5 | Repeat steps 2 to 4 a total of 13 times | |
| 6 | 72 ℃ | 5 min |
| 7 | 4 ℃ | Hold |
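As a small worked example, the cycling conditions above can be written as data and expanded into the run order. The dictionary keys and function names are hypothetical labels chosen for this sketch; the temperatures, times, and the 13 repeats are those listed in the table, and ramp times between steps are ignored.

```python
# Cycling conditions from the table; the final 4 C hold has no fixed duration.
PROGRAM = {
    "initial": (98, 45),                      # (temperature in C, seconds)
    "cycle": [(98, 15), (60, 30), (72, 30)],  # steps 2 to 4
    "cycles": 13,
    "final_extension": (72, 300),
    "hold_temperature": 4,
}


def expand(program):
    """List every timed (temperature, seconds) step in run order."""
    steps = [program["initial"]]
    for _ in range(program["cycles"]):
        steps.extend(program["cycle"])
    steps.append(program["final_extension"])
    return steps


def total_seconds(program):
    """Total programmed time, excluding ramping and the final hold."""
    return sum(seconds for _, seconds in expand(program))
```

Under these assumptions the timed steps add up to 45 + 13 × 75 + 300 = 1320 seconds, or 22 minutes, before the 4 ℃ hold.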
3-2. Sequencing
The gene fragments captured in Example 3-1 were loaded onto an NGS sequencer (MiSeq, Illumina, USA) to obtain the sequence of each DNA fragment, and the sequences were aligned to obtain sequence information for each gene in the cancer sample. The sequencing reaction was performed with the TruSeq Rapid PE Cluster kit and the TruSeq Rapid SBS kit (Illumina, USA) under 100 bp paired-end conditions (FIG. 1).
3-3. Variant calling
The sequencing reads obtained in Example 3-2 were aligned to the UCSC hg19 reference genome (http://genome.ucsc.edu) using the Burrows-Wheeler Aligner (BWA) algorithm. PCR duplicates were removed using Picard-tools-1.8 (http://picard.sourceforge.net/), and single nucleotide variations (SNVs) and insertion-deletion variations (indels) were identified using the GATK-2.2.9 algorithm (see FIGS. 3 and 4). CNVs were detected with an in-house program developed to compare the signal intensity of a cancer tissue specimen, as a relative value, with the signal intensity of a reference cell line (see FIG. 5). For translocations, discordant reads were extracted separately during BAM file generation, candidate fusion pairs were evaluated using the CIGAR algorithm, and false positive calls were then removed to identify the final translocation information (see FIG. 6).
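The text does not describe how the discordant reads are extracted. One common way, shown here only as a sketch and not as the method actually used, is to inspect the FLAG and RNEXT fields of each SAM record: a read is treated as discordant when it is paired, both mates are mapped, it is neither a secondary alignment nor a marked duplicate, and the aligner did not flag the pair as properly paired. The function names are hypothetical, and the bit values are those defined in the SAM format specification.

```python
# FLAG bits defined in the SAM format specification.
PAIRED = 0x1
PROPER_PAIR = 0x2
UNMAPPED = 0x4
MATE_UNMAPPED = 0x8
SECONDARY = 0x100
DUPLICATE = 0x400


def is_discordant(sam_line):
    """True for a mapped, paired, non-duplicate read not flagged as properly paired."""
    fields = sam_line.rstrip("\n").split("\t")
    flag = int(fields[1])
    if not (flag & PAIRED):
        return False
    if flag & (UNMAPPED | MATE_UNMAPPED | SECONDARY | DUPLICATE):
        return False
    return not (flag & PROPER_PAIR)


def mate_on_other_chromosome(sam_line):
    """True when RNEXT names a reference different from RNAME."""
    fields = sam_line.rstrip("\n").split("\t")
    rname, rnext = fields[2], fields[6]
    return rnext not in ("=", "*", rname)
```

Reads passing both checks are candidates for interchromosomal fusions; pairs on the same chromosome with an unexpected orientation or distance pass only the first check. Downstream filtering of false positives is outside the scope of this sketch.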
FIG. 3A shows some of the single nucleotide variations detected by NGS using the composition according to one aspect, as a table listing the gene (Gene), the functional class of the variation (Function), the variation type (Variation type), the number of the exon in which the variation occurred (Exon), the amino acid change (Amino acid change), the dbSNP entry (dbSNP), the chromosome on which the variation occurred (Chromosome), the position of the variation (Position), the reference nucleotide sequence (Reference), the altered nucleotide sequence (Alteration), and the frequency (VAF). FIG. 3B is an enlarged IGV viewer display of positions 55,249,044 to 55,249,097 of chromosome 7 in the EGFR region, obtained using the probe composition according to the present invention, and shows the detection of an SNV in which cytosine (C) at position 55,249,072 is substituted with thymine (T).
FIG. 4A shows some of the insertion-deletion variations detected by NGS using the composition according to one aspect, as a table listing the gene (Gene), the functional class of the variation (Function), the variation type (Variation type), the number of the exon in which the variation occurred (Exon), the amino acid change (Amino acid change), the dbSNP entry (dbSNP), the chromosome on which the variation occurred (Chromosome), the position of the variation (Position), the reference nucleotide sequence (Reference), the altered nucleotide sequence (Alteration), and the frequency (VAF). FIG. 4B is an enlarged IGV viewer display of positions 55,242,364 to 55,242,580 of chromosome 7 in the EGFR region, obtained using the probe composition according to the present invention, and shows the detection of a deletion of 15 nucleotides from position 55,242,465 to position 55,242,480.
FIG. 5A shows the results of detecting copy number variations (CNVs) by NGS using the composition according to one aspect, with the tumor tissue content (tumor purity) reduced from 100% to 50% and 30% by means of normal tissue. It shows that the copy numbers of CDK4 and MDM2 are detected even at a tumor content as low as 30%. FIG. 5B is a table summarizing some of the copy number variations detected by NGS in actual tumor tissue specimens using the composition according to one aspect. It shows that the copy numbers of CDK4 and MDM2 are detected reliably.
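Why detection becomes harder as tumor purity falls can be seen from a simple mixing model, given here as an illustrative sketch rather than as the in-house CNV program. It assumes that the normal cells carry two copies of the gene and that the signal is proportional to the average copy number in the mixture; the function names and the example copy number are hypothetical.

```python
import math


def expected_copy_number(tumor_cn, purity, normal_cn=2.0):
    """Average copy number in a mixture of tumor and normal cells."""
    if not 0.0 <= purity <= 1.0:
        raise ValueError("purity must be between 0 and 1")
    return purity * tumor_cn + (1.0 - purity) * normal_cn


def log2_ratio(tumor_cn, purity, normal_cn=2.0):
    """Expected log2 signal ratio of the mixture relative to a normal sample."""
    mixed = expected_copy_number(tumor_cn, purity, normal_cn)
    if mixed == 0.0:
        return float("-inf")  # complete loss in a pure tumor sample
    return math.log2(mixed / normal_cn)
```

For a hypothetical amplification to 10 copies, the average copy number falls from 10 at 100% purity to 0.3 × 10 + 0.7 × 2 = 4.4 at 30% purity, so the expected signal shrinks but stays above the two-copy baseline.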
FIG. 6A shows the result of detecting a gene translocation by NGS using the composition according to one aspect, and FIG. 6B is a table summarizing some of the gene translocations detected by NGS using the probe composition according to the present invention. The probes covering the ALK intron accurately detected the region where EML is joined and the translocation occurred.
3-4. Sensitivity measurement
For SNVs, 20 International HapMap specimens with known variation information were pooled at each frequency, and sensitivity and accuracy were measured against the pooled specimens. The sensitivity was 99.72%, and the concordance between the expected and measured variant frequencies was high, at 99.43 (Table 5).
FIG. 7 is a graph showing the sensitivity of SNV detection using the composition according to one aspect. It shows that, when 1000x sequencing data are generated, variations present at a frequency of 5% or higher are detected with a sensitivity of 99% or higher.
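The dependence of sensitivity on depth and variant frequency can be illustrated with a simplified binomial model. This is a sketch under stated assumptions, not the analysis behind FIG. 7: each read is assumed to carry the variant independently with probability equal to the variant allele frequency, sequencing errors are ignored, and the minimum number of supporting reads is a hypothetical calling threshold.

```python
import math


def detection_probability(depth, vaf, min_alt_reads):
    """P(at least min_alt_reads variant reads) under Binomial(depth, vaf)."""
    if not 0.0 < vaf < 1.0:
        raise ValueError("vaf must be strictly between 0 and 1")
    if min_alt_reads <= 0:
        return 1.0
    p_below = 0.0
    # Sum the probabilities of seeing fewer variant reads than the threshold.
    for k in range(min(min_alt_reads, depth + 1)):
        log_pmf = (
            math.lgamma(depth + 1)
            - math.lgamma(k + 1)
            - math.lgamma(depth - k + 1)
            + k * math.log(vaf)
            + (depth - k) * math.log(1.0 - vaf)
        )
        p_below += math.exp(log_pmf)
    return max(0.0, 1.0 - p_below)
```

With a hypothetical threshold of 20 supporting reads, a 5% variant at 1000x depth is expected to yield about 50 variant reads and is almost always detected under this model, whereas at 100x depth the same threshold would almost never be reached.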
For InDels, 28 cancer cell line specimens with known variation information were pooled at each frequency, and sensitivity and accuracy were measured against the pooled specimens. The sensitivity was 99.55%, and the positive predictive value (PPV) was high, at 96.36 (Table 5).
For CNVs, four cancer cell lines with known variation information and their paired normal specimens were pooled at each frequency, and sensitivity and accuracy were measured against the pooled specimens. The sensitivity was 100.0% in specimens with a tumor volume of 30% or more, and the positive predictive value (PPV) was 75.0% for amplifications and 100.0% for deletions (Table 4).
For translocations, sensitivity was measured for four known translocations using mixed specimens in which four cancer cell lines with known variation information were mixed with a normal cell line. The result was 96.9% (Table 5).
Sensitivity for each variation type using the composition for detecting genomic variations according to one aspect

| Variation | Sensitivity | Variation | Sensitivity |
|---|---|---|---|
| SNV | 99.72% | CNV | 100% |
| Indel | 99.55% | Translocation | 96.9% |
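For reference, sensitivity and positive predictive value are computed from counts of true positives (TP), false negatives (FN), and false positives (FP) as shown in the sketch below. The function names are hypothetical, and the underlying counts for the figures reported above are not given in the text.

```python
def sensitivity(true_positives, false_negatives):
    """Fraction of known variants that were detected: TP / (TP + FN)."""
    total = true_positives + false_negatives
    if total == 0:
        raise ValueError("no known variants to evaluate")
    return true_positives / total


def positive_predictive_value(true_positives, false_positives):
    """Fraction of calls that were real variants: TP / (TP + FP)."""
    total = true_positives + false_positives
    if total == 0:
        raise ValueError("no calls to evaluate")
    return true_positives / total
```

As a purely hypothetical illustration, three correct calls and one false call give a PPV of 0.75, and detecting every known variant gives a sensitivity of 1.0.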

Claims (20)

  1. 하기를 포함하는, 암 세포의 유전체 DNA의 변이를 검출하는데 사용하기 위한 조성물:Compositions for use in detecting mutations in genomic DNA of cancer cells, comprising:
    TERT 유전자의 뉴클레오티드 서열로부터 선택된 연속(contiguous) 뉴클레오티드 서열을 포함하는 제1 폴리뉴클레오티드, 또는 그의 상보적 폴리뉴클레오티드; A first polynucleotide comprising a contiguous nucleotide sequence selected from the nucleotide sequence of the TERT gene, or a complementary polynucleotide thereof;
    ABL1, AKT1, AKT2, AKT3, ALK, APC, ARID1A, ARID1B, ARID2, ATM, ATRX, AURKA, AURKB, BCL2, BRAF, BRCA1, BRCA2, CDH1, CDK4, CDK6, CDKN2A, CSF1R, CTNNB1, DDR2, EGFR, EPHB4, ERBB2, ERBB3, ERBB4, EZH2, FBXW7, FGFR1, FGFR2, FGFR3, FLT3, GNA11, GNAQ, GNAS, HNF1A, HRAS, IDH1, IDH2, IGF1R, ITK, JAK1, JAK2, JAK3, KDR, KIT, KRAS, MDM2, MET, MLH1, MPL, MTOR, NF1, NOTCH1, NPM1, NRAS, NTRK1, PDGFRA, PDGFRB, PIK3CA, PIK3R1, PTCH1, PTCH2, PTEN, PTPN11, RB1, RET, ROS1, SMAD4, SMARCB1, SMO, SRC, STK11, SYK, TOP1, TP53 및 VHL 유전자 각각의 엑손 영역의 뉴클레오티드 서열로부터 선택된 연속 뉴클레오티드 서열을 포함하는 제2 폴리뉴클레오티드, 또는 그의 상보적 폴리뉴클레오티드; 및ABL1, AKT1, AKT2, AKT3, ALK, APC, ARID1A, ARID1B, ARID2, ATM, ATRX, AURKA, AURKB, BCL2, BRAF, BRCA1, BRCA2, CDH1, CDK4, CDK6, CDKN2A, CSF1R, CTNNBEGFR, DDR EPHB4, ERBB2, ERBB3, ERBB4, EZH2, FBXW7, FGFR1, FGFR2, FGFR3, FLT3, GNA11, GNAQ, GNAS, HNF1A, HRAS, IDH1, IDH2, IGF1R, ITK, JAK1, JAK2, JAK3, JAK2, MDM2, MET, MLH1, MPL, MTOR, NF1, NOTCH1, NPM1, NRAS, NTRK1, PDGFRA, PDGFRB, PIK3CA, PIK3R1, PTCH1, PTCH2, PTEN, PTPN11, RB1, RET, ROS1, SMAD4, SMARCB1, SMO, SMORC A second polynucleotide comprising a contiguous nucleotide sequence selected from the nucleotide sequences of the exon region of each of the STK11, SYK, TOP1, TP53, and VHL genes, or a complementary polynucleotide thereof; And
    ALK, RET, ROS1, EWSR1 및 TMPRSS2 유전자 각각의 인트론 영역의 뉴클레오티드 서열로부터 선택된 연속 뉴클레오티드 서열을 포함하는 제3 폴리뉴클레오티드, 또는 그의 상보적 폴리뉴클레오티드.A third polynucleotide comprising a contiguous nucleotide sequence selected from the nucleotide sequences of the intron region of each of the ALK, RET, ROS1, EWSR1 and TMPRSS2 genes, or a complementary polynucleotide thereof.
  2. The composition of claim 1, wherein the variation comprises a variation in the copy number of a gene or a variation in a nucleotide sequence relative to standard genomic DNA.
  3. The composition of claim 2, wherein the variation in the nucleotide sequence comprises a substitution, insertion, deletion, or translocation of one or more nucleotide sequences relative to standard genomic DNA.
  4. The composition of claim 1, wherein the first polynucleotide comprises contiguous nucleotides selected from the nucleotide sequence of the promoter region of the TERT gene.
  5. The composition of claim 1, wherein the first, second, or third polynucleotide is one or more selected from DNA, RNA, peptide nucleic acid (PNA), locked nucleic acid (LNA), zip nucleic acid (ZNA), bridged nucleic acid (BNA), and nucleotide analogs.
  6. The composition of claim 1, wherein the first, second, or third polynucleotide is 75 to 200 nucleotides in length.
  7. The composition of claim 1, wherein the first polynucleotide comprises one or more sequences selected from the group consisting of the nucleotide sequences of SEQ ID NOs: 1 to 16.
  8. The composition of claim 1, wherein the second polynucleotide comprises one or more sequences selected from the group consisting of the nucleotide sequences of SEQ ID NOs: 17 to 7266.
  9. The composition of claim 1, wherein the third polynucleotide comprises one or more sequences selected from the group consisting of the nucleotide sequences of SEQ ID NOs: 7267 to 8102.
  10. The composition of claim 1, wherein a moiety for isolating or purifying the polynucleotide is attached to the first, second, or third polynucleotide.
  11. The composition of claim 1, which is for use in screening for an anticancer agent effective in reducing the viability of the cancer cells.
  12. The composition of claim 1, which is for use in screening for an anticancer agent effective in treating cancer in a cancer patient having the cancer cells.
  13. A method of detecting a variation in genomic DNA of a cancer cell, the method comprising:
    contacting genomic DNA derived from a cancer cell with the composition according to any one of claims 1 to 12 to obtain a hybridization product of the genomic DNA and the first, second, or third polynucleotide in the composition;
    identifying the nucleotide sequence of the genomic DNA in the hybridization product; and
    comparing the identified nucleotide sequence of the genomic DNA with a standard nucleotide sequence to identify a variation in the genomic DNA.
  14. The method of claim 13, wherein the genomic variation comprises a variation in the copy number of a gene or a variation in a nucleotide sequence relative to standard genomic DNA.
  15. The method of claim 13, wherein the genomic variation comprises one or more selected from the group consisting of a single nucleotide variation, an insertion-deletion (indel) variation, a copy number variation, and a translocation.
  16. The method of claim 13, wherein the cancer is one or more selected from the group consisting of liver cancer, glioma, ovarian cancer, colorectal cancer, head and neck cancer, bladder cancer, renal cell carcinoma, gastric cancer, breast cancer, metastatic cancer, prostate cancer, pancreatic cancer, and lung cancer.
  17. The method of claim 13, further comprising fragmenting the genomic DNA derived from the cancer cell before or during the contacting step.
  18. The method of claim 13, further comprising, before the step of identifying the nucleotide sequence of the genomic DNA in the hybridization product, separating the hybridization product of the genomic DNA and the first, second, or third polynucleotide from the contact product obtained in the contacting step.
  19. The method of claim 13, further comprising identifying a correlation between the identified variation in the genomic DNA and the cancer treatment effect of an anticancer agent in a subject.
  20. The method of claim 19, wherein identifying the correlation comprises identifying the mechanism of action of the anticancer agent or the target on which the anticancer agent acts.
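
The numeric limits in claims 6 to 9 and the comparison step of claim 13 can be illustrated with a short Python sketch. This is only an illustration under stated assumptions, not the applicants' analysis pipeline: the function names (`panel_group`, `has_claimed_length`, `find_substitutions`) are hypothetical, and the comparison assumes that the identified sequence has already been aligned to the standard sequence and has the same length, so it reports substitutions only and does not handle insertions, deletions, copy number changes, or translocations.

```python
from typing import List, Optional, Tuple

# SEQ ID NO ranges as recited in claims 7 to 9 (Python ranges exclude the end value).
PANEL_GROUPS = {
    "first polynucleotide (TERT)": range(1, 17),              # SEQ ID NOs: 1 to 16
    "second polynucleotide (exon regions)": range(17, 7267),  # SEQ ID NOs: 17 to 7266
    "third polynucleotide (intron regions)": range(7267, 8103),  # SEQ ID NOs: 7267 to 8102
}


def panel_group(seq_id: int) -> Optional[str]:
    """Return the polynucleotide group for a SEQ ID NO, or None if out of range."""
    for name, ids in PANEL_GROUPS.items():
        if seq_id in ids:
            return name
    return None


def has_claimed_length(probe: str) -> bool:
    """Check the 75 to 200 nucleotide length recited in claim 6."""
    return 75 <= len(probe) <= 200


def find_substitutions(standard: str, identified: str) -> List[Tuple[int, str, str]]:
    """List (1-based position, standard base, identified base) for each mismatch.

    Assumption: both sequences are pre-aligned and of equal length.
    """
    if len(standard) != len(identified):
        raise ValueError("sequences must be pre-aligned and of equal length")
    pairs = zip(standard.upper(), identified.upper())
    return [(i + 1, s, o) for i, (s, o) in enumerate(pairs) if s != o]


if __name__ == "__main__":
    print(panel_group(7267))
    print(has_claimed_length("A" * 120))
    print(find_substitutions("ACGTACGT", "ACGTTCGT"))
```

In practice, variation calling on captured sequencing reads relies on a dedicated aligner and variant caller; the position-by-position comparison above only shows the idea of checking an identified sequence against a standard sequence.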
PCT/KR2015/012384 2014-11-18 2015-11-18 Gene panel for detecting cancer genome mutant WO2016080750A1 (en)

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
KR10-2014-0161093 2014-11-18
KR20140161093 2014-11-18
KR1020150161608A KR20160059446A (en) 2014-11-18 2015-11-18 Cancer panel for identification of genomic variations in cancer
KR10-2015-0161608 2015-11-18

Publications (1)

Publication Number Publication Date
WO2016080750A1 (en) 2016-05-26

Family

ID=56014204

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/KR2015/012384 WO2016080750A1 (en) 2014-11-18 2015-11-18 Gene panel for detecting cancer genome mutant

Country Status (1)

Country Link
WO (1) WO2016080750A1 (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2018131777A1 (en) * 2017-01-10 2018-07-19 사회복지법인 삼성생명공익재단 Mutant gene marker specific for bone metastasis of lung cancer
WO2020026031A3 (en) * 2018-08-02 2020-03-19 Cambridge Epigenetix Limited Methods and systems for target enrichment
CN113168886A (en) * 2018-08-13 2021-07-23 豪夫迈·罗氏有限公司 Systems and methods for germline and somatic variant calling using neural networks

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR20110119512A (en) * 2010-04-27 2011-11-02 사회복지법인 삼성생명공익재단 Method of detecting gene mutation using a blocking primer
WO2012011952A2 (en) * 2010-07-20 2012-01-26 Board Of Trustees Of The University Of Arkansas Copy number variant-dependent genes as diagnostic tools, predictive biomarkers and therapeutic targets
WO2013033169A1 (en) * 2011-08-31 2013-03-07 Sanofi Methods of identifying genomic translocations associated with cancer

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
PRITCHARD ET AL., THE JOURNAL OF MOLECULAR DIAGNOSTICS, vol. 16, no. 1, January 2014 (2014-01-01), pages 56 - 67 *
WAGLE ET AL.: "High-throughput detection of actionable genomic alterations in clinical tumor samples by targeted, massively parallel sequencing", CANCER DISCOVERY, vol. 2, 2012, pages 82 - 93, XP055150657, DOI: doi:10.1158/2159-8290.CD-11-0184 *

Similar Documents

Publication Publication Date Title
JP7434215B2 (en) Method for capturing cell-free methylated DNA and its use
US20230141527A1 (en) Methods for attaching adapters to sample nucleic acids
JP2022519159A (en) Analytical method of circulating cells
US20200255913A1 (en) Novel Markers for Detecting Microsatellite Instability in Cancer and Determining Synthetic Lethality with Inhibition of the DNA Base Excision Repair Pathway
US10294529B2 (en) Microsatellite instability markers in detection of cancer
KR20210131432A (en) Optimization of multigene analysis of tumor samples
US11939636B2 (en) Methods and systems for improving patient monitoring after surgery
US11891653B2 (en) Compositions and methods for analyzing cell-free DNA in methylation partitioning assays
US11384382B2 (en) Methods of attaching adapters to sample nucleic acids
US11946106B2 (en) Methods and systems to improve the signal to noise ratio of DNA methylation partitioning assays
CA3035386A1 (en) Methods and composition for the prediction of the activity of enzastaurin
US20220154286A1 (en) Compositions and methods for analyzing dna using partitioning and base conversion
WO2020096248A1 (en) Manufacturing and detection method of probe for detecting mutations in lung cancer tissue cells
US20220340979A1 (en) Use of cell free bacterial nucleic acids for detection of cancer
WO2016080750A1 (en) Gene panel for detecting cancer genome mutant
KR20220021909A (en) Rapid aneuploidy detection
WO2020096247A1 (en) Method for preparing probe for detecting mutation derived from cells in tissues of breast cancer and detection method
EP3696278A1 (en) Method of determining the origin of nucleic acids in a mixed sample
US20240093292A1 (en) Quality control method
US20210214803A1 (en) Methods and systems for improving patient monitoring after surgery
WO2022126938A1 (en) Method for detecting polynucleotide variations
US20230313288A1 (en) Methods for sequence determination using partitioned nucleic acids
WO2018216905A2 (en) Method for generating frequency distribution of background allele in sequencing data obtained from acellular nucleic acid, and method for detecting mutation from acellular nucleic acid using same
CA3199829A1 (en) Compositions and methods for enriching methylated polynucleotides
Konnick et al. DNA in Neoplastic Disease Diagnosis

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 15860874

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 15860874

Country of ref document: EP

Kind code of ref document: A1