WO2016080750A1

WO2016080750A1 - Gene panel for detecting cancer genome mutant

Info

Publication number: WO2016080750A1
Application number: PCT/KR2015/012384
Authority: WO
Inventors: 배준설; 박웅양; 김나영
Original assignee: 사회복지법인 삼성생명공익재단
Priority date: 2014-11-18
Filing date: 2015-11-18
Publication date: 2016-05-26

Abstract

The present invention relates to a composition for detecting the mutation of cancer cell genomic DNA and a method for detecting the mutation of cancer cell genomic DNA. The composition, according to one aspect of the present invention, may be used for analyzing, at once with high sensitivity and accuracy, various gene mutations from a sample comprising a cancer cell genome, and on the basis of the results of such analysis, a patient-tailored cancer therapeutic agent may be efficiently explored.

Description

Gene panel for detecting cancer genome mutations

The present invention relates to a composition for use in detecting mutations in the genomic DNA of cancer cells, and to a method for detecting such mutations.

Cancer takes many forms depending on the tissue and cells in which it develops, and its causes are equally varied. Each tumor can carry a variety of genomic variations, and many studies have reported that somatic mutations can significantly affect the development and progression of cancer. Accordingly, methods for detecting genome mutations in cancer cells have attracted great attention. In addition, the detected mutation information can greatly help in selecting a tailored anticancer agent for a cancer patient.

However, existing genome mutation detection methods are designed to detect only one type of genome variation, so additional experiments are needed to detect the various somatic mutations that cause cancer, and no new variant can be found other than the targeted variant. In addition, the conventional approach uses a separate detection method for each type of variation (e.g., SNV: real-time PCR; CNV: CGH array; translocation: FISH), so detecting multiple variations takes a great deal of time and incurs a large cost. Therefore, there is a need for a method capable of easily and quickly detecting the various genome variations occurring in cancer patients with high sensitivity and accuracy.

One aspect is to provide a composition for use in detecting mutations in genomic DNA of cancer cells.

Another aspect is to provide a method for detecting mutations in genomic DNA of cancer cells.

One aspect provides a composition for use in detecting mutations in genomic DNA of cancer cells, the composition comprising: a first polynucleotide comprising a contiguous nucleotide sequence selected from the nucleotide sequence of the TERT gene, or a complementary polynucleotide thereof; a second polynucleotide comprising a contiguous nucleotide sequence selected from the nucleotide sequences of the exon region of each of the ABL1, AKT1, AKT2, AKT3, ALK, APC, ARID1A, ARID1B, ARID2, ATM, ATRX, AURKA, AURKB, BCL2, BRAF, BRCA1, BRCA2, CDH1, CDK4, CDK6, CDKN2A, CSF1R, CTNNB1, DDR2, EGFR, EPHB4, ERBB2, ERBB3, ERBB4, EZH2, FBXW7, FGFR1, FGFR2, FGFR3, FLT3, GNA11, GNAQ, GNAS, HNF1A, HRAS, IDH1, IDH2, IGF1R, ITK, JAK1, JAK2, JAK3, KDR, KIT, KRAS, MDM2, MET, MLH1, MPL, MTOR, NF1, NOTCH1, NPM1, NRAS, NTRK1, PDGFRA, PDGFRB, PIK3CA, PIK3R1, PTCH1, PTCH2, PTEN, PTPN11, RB1, RET, ROS1, SMAD4, SMARCB1, SMO, SRC, STK11, SYK, TOP1, TP53, and VHL genes, or a complementary polynucleotide thereof; and a third polynucleotide comprising a contiguous nucleotide sequence selected from the nucleotide sequences of the intron region of each of the ALK, RET, ROS1, EWSR1, and TMPRSS2 genes, or a complementary polynucleotide thereof.

The term "polynucleotide" refers to a nucleotide polymer of any length, where polynucleotides can be used interchangeably with nucleic acid, or oligonucleotide.

The term "gene" refers to a structural unit that determines genetic information. It includes a structural gene, which carries information determining the amino acid sequence of a protein or the nucleotide sequence of a functional RNA (tRNA, rRNA, etc.), and/or a regulatory gene, which controls the expression of a structural gene (e.g., promoters, repressors, operators, etc.). The term "gene" is understood herein to mean the single strand comprising the nucleotide sequence that is transcribed to produce the product of the gene. For example, "nucleotide sequence of a gene" may mean a nucleotide sequence, contained in the single strand that is transcribed to produce the gene product, that encodes a function, and/or a nucleotide sequence that controls the expression of a structural gene.

The term "contiguous nucleotide sequence" means a sequence of two or more adjacent nucleotides.

The term "complementary" means having a degree of complementarity capable of selectively hybridizing to the above-described nucleotide sequence under certain specific hybridization or annealing conditions. The term "complementary" may encompass both substantially complementary and perfectly complementary, and may specifically mean completely complementary.

The term "exon" refers to a region of a DNA sequence that contains protein synthesis information, for example, a nucleic acid molecule that encodes some or all of the expressed protein.

The term “intron” refers to a nucleic acid molecule segment that is transcribed into RNA but spliced from endogenous RNA before it is translated into protein.

The mutation may be defined relative to standard genomic DNA. Specifically, the variation may include variation in the copy number of a gene or variation in the nucleotide sequence with respect to standard genomic DNA. The variation in the copy number of the gene may be, for example, a copy number variation (CNV). Variations in the nucleotide sequence may include substitution, insertion, deletion, or translocation of one or more nucleotides relative to standard genomic DNA. Substitution of one or more nucleotides can be, for example, a single nucleotide variation (SNV). More specifically, the mutation of the genomic DNA of the cancer cell may be one or more selected from the group consisting of a single nucleotide variation, an insertion/deletion (indel), a copy number variation, and a translocation. Most specifically, the genome variation may include all of a single nucleotide variation, an indel, a copy number variation, and a translocation.

The term “single nucleotide variation (SNV)” refers to a single-base difference. Whereas a single nucleotide polymorphism is a single-base difference that occurs in a large population within one species, a single nucleotide variation is a single-base difference that appears in a sequence or in a small population within a species; for example, it may mean a difference from the standard base sequence shown in sequencing data.

The term “insertion/deletion (indel)” refers to the insertion or deletion of a nucleotide sequence, which can change the number of nucleotides in a gene.

The term “copy number variation (CNV)” refers to a variation in genomic DNA in which a relatively large region of a particular chromosome is repeated, missing, or amplified. For example, it may be a mutation in which a DNA fragment of 1 kb or more is duplicated or partly deleted.

The term "translocation" refers to the phenomenon in which a break occurs in a portion of a chromosome and the resulting fragment joins another portion of the same chromosome or another chromosome, changing the structure of the chromosome.
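The difference between a single nucleotide variation and an insertion or deletion, as defined above, can be illustrated by comparing the allele in the standard sequence with the allele that was observed. The following is a minimal sketch only: the function name `classify_small_variant` and the representation of alleles as plain strings are assumptions made for illustration and are not taken from the described method. Copy number variations and translocations cannot be recognized this way and are outside the sketch.

```python
def classify_small_variant(ref: str, alt: str) -> str:
    """Classify a small variant from its standard (ref) and observed (alt) alleles.

    Hypothetical helper for illustration; alleles are strings of A/C/G/T.
    """
    if not ref or not alt:
        raise ValueError("alleles must be non-empty")
    if ref == alt:
        return "no variation"
    if len(ref) == 1 and len(alt) == 1:
        return "SNV"        # one base substituted for another
    if len(alt) > len(ref):
        return "insertion"  # nucleotides gained relative to the standard
    if len(alt) < len(ref):
        return "deletion"   # nucleotides lost relative to the standard
    return "multi-nucleotide substitution"


if __name__ == "__main__":
    for ref, alt in [("A", "G"), ("A", "ATG"), ("ATG", "A")]:
        print(ref, alt, classify_small_variant(ref, alt))
```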

The single nucleotide variation or indel may occur, for example, in the promoter region of the TERT gene and at one or more positions selected from Table 1 below. The promoter region of the TERT gene may be, for example, at positions 1295163-1296162 of human chromosome 5. The copy number variation may occur, for example, at one or more positions selected from Table 1 below. The translocation may occur, for example, at one or more positions selected from Table 2 below.

The first, second, or third polynucleotide may specifically be single stranded.

The first, second, or third polynucleotide may be DNA, RNA, peptide nucleic acid (PNA), locked nucleic acid (LNA), zip nucleic acid (ZNA), bridged nucleic acid (BNA), or a nucleotide analogue. The polynucleotide may specifically be DNA or RNA, and more specifically RNA. In one embodiment, a polynucleotide consisting of RNA was used to detect mutations in the cancer genome. When the polynucleotide is RNA, its binding strength is superior to that of the other types, so the hybridization time is shortened and the detection sensitivity is high. Therefore, it may be advantageous for detecting large-region mutations such as nucleotide deletions of 25 bp or more.

The first, second, or third polynucleotide may be, for example, 75 to 200, 80 to 200, 90 to 200, 100 to 200, 100 to 180, 100 to 160, 100 to 140, 100 to 120, 110 to 180, 110 to 160, 110 to 140, 110 to 130, or 110 to 120 nucleotides in size. Specifically, it may be 110 to 120 nucleotides in size, and more specifically 120 nucleotides in size. When the first, second, or third polynucleotide has a size of 75 nucleotides or less, the accuracy of capturing a target region is low, and when it has a size of 200 nucleotides or more, the synthesis cost increases. Thus, the first, second, or third polynucleotides are sized economically and optimized for detecting mutations in genomic DNA.

The first, second, or third polynucleotide may specifically mean a population consisting of two or more polynucleotides. In this case, the sequence of each polynucleotide in the population includes a portion of the nucleotide sequence of the corresponding gene, and no region of that gene sequence is left out of the population. This means that the entire nucleotide sequence of a gene of interest can be covered by the polynucleotides that make up the population. The phrase "covered by polynucleotides" herein means that a polynucleotide comprises a particular nucleotide sequence of a gene or a sequence complementary to that sequence.

In addition, when the first, second, or third polynucleotide is a polynucleotide population, one nucleotide in the nucleotide sequence of the gene may be covered by two or more, specifically three or more, of the polynucleotides constituting the population.

Also, when the first, second, or third polynucleotide is a population, any polynucleotide in the population and the polynucleotide containing the nearest neighboring portion of the gene sequence may share, for example, 50 to 150, 60 to 140, 70 to 120, 70 to 110, 70 to 100, 70 to 90, 70 to 80, or 80 identical nucleotides. Therefore, cancer-related genes can be sequenced with high coverage using the composition according to one aspect.
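The relationship between probe length, the number of nucleotides shared by neighboring probes, and how many probes cover a given base can be sketched as follows. This is an illustrative tiling scheme, not the design procedure used for the actual probe set: the function names, the 1-based inclusive coordinates, and the choice to anchor the last probe at the end of the region are all assumptions. The defaults of 120 nucleotides and 80 shared nucleotides are the specific values mentioned in the text.

```python
def tile_probes(start, end, probe_len=120, overlap=80):
    """Return (start, end) intervals of probes tiled across [start, end].

    Coordinates are assumed to be 1-based and inclusive. Neighboring probes
    share `overlap` nucleotides; a final probe is anchored at the region end
    so that no base is left uncovered.
    """
    if not 0 <= overlap < probe_len:
        raise ValueError("overlap must be >= 0 and smaller than probe_len")
    if end - start + 1 < probe_len:
        raise ValueError("region is shorter than one probe")
    step = probe_len - overlap
    last_start = end - probe_len + 1
    starts = list(range(start, last_start + 1, step))
    if starts[-1] != last_start:
        starts.append(last_start)
    return [(s, s + probe_len - 1) for s in starts]


def coverage_at(probes, pos):
    """Count how many probes contain position `pos`."""
    return sum(1 for s, e in probes if s <= pos <= e)


if __name__ == "__main__":
    probes = tile_probes(1, 200)
    print(probes)
    print(coverage_at(probes, 100), coverage_at(probes, 1))
```

With a 40-nucleotide step, bases in the interior of the region fall inside three probes, while bases near the edges of the tiled region are covered by fewer. Whether the actual probe set extends beyond the target boundaries to compensate is not stated in the text.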

The first polynucleotide may specifically include a contiguous nucleotide sequence selected from the nucleotide sequence of the promoter region of the TERT gene.

The term “promoter” refers to a DNA region that is present upstream of a structural gene and to which RNA polymerase binds to initiate transcription. The promoter region of the TERT gene may be, for example, at positions 1295163-1296162 of human chromosome 5.
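As a small worked example, the promoter interval quoted above can be used to test whether a detected variant position falls inside the region targeted by the first polynucleotide. The sketch assumes that the coordinates are 1-based and inclusive; the names `TERT_PROMOTER`, `in_region`, and `region_length` are hypothetical.

```python
# Promoter interval quoted in the text (human chromosome 5).
# Assumption: coordinates are 1-based and both ends are inclusive.
TERT_PROMOTER = ("chr5", 1295163, 1296162)


def in_region(chrom, pos, region=TERT_PROMOTER):
    """Return True if (chrom, pos) lies inside the given region."""
    r_chrom, r_start, r_end = region
    return chrom == r_chrom and r_start <= pos <= r_end


def region_length(region=TERT_PROMOTER):
    """Length of the region in base pairs under the inclusive convention."""
    _, start, end = region
    return end - start + 1


if __name__ == "__main__":
    print(region_length())
    print(in_region("chr5", 1295163), in_region("chr5", 1296163))
```

Under the inclusive convention the quoted interval spans exactly 1,000 bp.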

Specific examples of each of the genes related to the second polynucleotide may be shown in Table 1 below. The genes listed in Table 1 are all derived from humans.

Table 1. Embodiments of genes associated with the second polynucleotide

TargetID	Interval	Chr	Start	End	Regions	Size (bp)
ABL1	chr9:133589697-133761080	9	133589697	133761080	14	3,964
AKT1	chr14:105236668-105258990	14	105236668	105258990	13	1,703
AKT2	chr19:40739466-40771184	19	40739466	40771184	15	2,133
AKT3	chr1:243663035-244006482	1	243663035	244006482	14	1,764
ALK	chr2:29416080-30143535	2	29416080	30143535	31	5,738
APC	chr5:112043405-112198243	5	112043405	112198243	19	9,166
ARID1A	chr1:27022885-27107257	1	27022885	27107257	20	7,260
ARID1B	chr6:157099054-157529035	6	157099054	157529035	23	7,836
ARID2	chr12:46123610-46298871	12	46123610	46298871	24	6,067
ATM	chr11:108098342-108236245	11	108098342	108236245	62	10,411
ATRX	chrX:76763819-77041497	X	76763819	77041497	37	8,299
AURKA	chr20:54945204-54963263	20	54945204	54963263	8	1,387
AURKB	chr17:8108179-8113552	17	8108179	8113552	8	1,198
BCL2	chr18:60795848-60985909	18	60795848	60985909	2	793
BRAF	chr7:140426284-140624513	7	140426284	140624513	21	2,799
BRCA1	chr17:41197685-41276123	17	41197685	41276123	24	6,184
BRCA2	chr13:32890588-32972917	13	32890588	32972917	26	10,777
CDH1	chr16:68771309-68868194	16	68771309	68868194	17	3,135
CDK4	chr12:58142298-58145510	12	58142298	58145510	7	1,052
CDK6	chr7:92244444-92462647	7	92244444	92462647	7	1,121
CDKN2A	chr9:21968218-21994463	9	21968218	21994463	5	1,135
CSF1R	chr5:149433622-149466000	5	149433622	149466000	22	3,449
CTNNB1	chr3:41265550-41280843	3	41265550	41280843	14	2,626
DDR2	chr1:162688844-162750046	1	162688844	162750046	16	3,064
EGFR	chr7:55086961-55273320	7	55086961	55273320	32	4,734
EPHB4	chr7:100401073-100424662	7	100401073	100424662	17	3,304
ERBB2	chr17:37855803-37884307	17	37855803	37884307	28	4,360
ERBB3	chr12:56474075-56495849	12	56474075	56495849	29	4,745
ERBB4	chr2:212248330-213403264	2	212248330	213403264	29	4,552
EZH2	chr7:148504728-148544400	7	148504728	148544400	21	2,876
FBXW7	chr4:153244023-153332965	4	153244023	153332965	14	2,898
FGFR1	chr8:38271136-38318634	8	38271136	38318634	20	3,220
FGFR2	chr10:123239085-123353341	10	123239085	123353341	23	3,259
FGFR3	chr4:1795652-1809424	4	1795652	1809424	18	3,360
FLT3	chr13:28578179-28674657	13	28578179	28674657	25	3,504
GNA11	chr19:3094640-3121187	19	3094640	3121187	7	1,220
GNAQ	chr9:80336229-80646161	9	80336229	80646161	8	1,289
GNAS	chr20:57415152-57485894	20	57415152	57485894	17	4,436
HNF1A	chr12:121416562-121440298	12	121416562	121440298	12	2,729
HRAS	chr11:532626-534332	11	532626	534332	5	733
IDH1	chr2:209101793-209116285	2	209101793	209116285	8	1,408
IDH2	chr15:90627488-90645632	15	90627488	90645632	11	1,579
IGF1R	chr15:99192801-99500681	15	99192801	99500681	21	4,524
ITK	chr5:156607979-156679698	5	156607979	156679698	18	2,249
JAK1	chr1:65300235-65351957	1	65300235	65351957	24	3,945
JAK2	chr9:5021978-5126801	9	5021978	5126801	23	3,859
JAK3	chr19:17937542-17955236	19	17937542	17955236	23	3,987
KDR	chr4:55946098-55991470	4	55946098	55991470	30	4,671
KIT	chr4:55524172-55604733	4	55524172	55604733	22	3,379
KRAS	chr12:25362719-25398328	12	25362719	25398328	5	787
MDM2	chr12:69202248-69233639	12	69202248	69233639	14	1,945
MET	chr7:116335801-116436188	7	116335801	116436188	21	4,779
MLH1	chr3:37035029-37107120	3	37035029	37107120	20	2,711
MPL	chr1:43803510-43818453	1	43803510	43818453	12	2,233
MTOR	chr1:11166652-11319476	1	11166652	11319476	60	9,217
NF1	chr17:29422318-29705959	17	29422318	29705959	60	9,902
NOTCH1	chr9:139390513-139440248	9	139390513	139440248	34	8,348
NPM1	chr5:170814943-170837579	5	170814943	170837579	12	1,134
NRAS	chr1:115251146-115258791	1	115251146	115258791	4	650
NTRK1	chr1:156785612-156851444	1	156785612	156851444	19	2,902
PDGFRA	chr4:55106210-55161449	4	55106210	55161449	24	3,930
PDGFRB	chr5:149495316-149516620	5	149495316	149516620	22	3,795
PIK3CA	chr3:178916604-178952162	3	178916604	178952162	20	3,609
PIK3R1	chr5:67522494-67593439	5	67522494	67593439	19	2,779
PTCH1	chr9:98208655-98279112	9	98208655	98279112	27	5,131
PTCH2	chr1:45286351-45308614	1	45286351	45308614	24	4,171
PTEN	chr10:89624217-89725239	10	89624217	89725239	9	1,392
PTPN11	chr12:112856906-112942578	12	112856906	112942578	16	2,112
RB1	chr13:48878039-49054217	13	48878039	49054217	27	3,327
RET	chr10:43572697-43623727	10	43572697	43623727	20	3,777
ROS1	chr6:117609645-117746829	6	117609645	117746829	45	7,978
SMAD4	chr18:48573407-48604847	18	48573407	48604847	13	1,999
SMARCB1	chr22:24129347-24176377	22	24129347	24176377	9	1,392
SMO	chr7:128828983-128852302	7	128828983	128852302	13	2,647
SRC	chr20:36012547-36031792	20	36012547	36031792	12	1,869
STK11	chr19:1206903-1226656	19	1206903	1226656	9	1,482
SYK	chr9:93606171-93657892	9	93606171	93657892	13	2,168
TOP1	chr20:39657698-39751947	20	39657698	39751947	21	2,718
TP53	chr17:7565247-7579922	17	7565247	7579922	14	1,697
VHL	chr3:10183522-10191659	3	10183522	10191659	3	702
Total					1,555	287,164

* In Table 1, Region means the number of target exon regions (but may include regions other than exons).

Specific examples of the intron region of each gene associated with the third polynucleotide may be shown in Table 2 below. The genes listed in Table 2 are all derived from humans.

Table 2. Embodiments of each gene intron associated with the third polynucleotide

Gene	TargetID	Interval	Chr	Start	End	Regions	Size (bp)
ALK	ALK1	chr2:29445475-29446206	2	29445475	29446206	1	732
ALK	ALK2	chr2:29446396-29448325	2	29446396	29448325	1	1,930
ALK	ALK3	chr2:29448433-29449786	2	29448433	29449786	1	1,354
EWSR1	EWSR11	chr22:29670273-29674017	22	29670273	29674017	1	3,745
EWSR1	EWSR110	chr22:29693941-29694721	22	29693941	29694721	1	781
EWSR1	EWSR111	chr22:29694887-29695222	22	29694887	29695222	1	336
EWSR1	EWSR112	chr22:29695323-29695587	22	29695323	29695587	1	265
EWSR1	EWSR12	chr22:29674207-29682910	22	29674207	29682910	1	8,704
EWSR1	EWSR13	chr22:29678548-29682910	22	29678548	29682910	1	4,363
EWSR1	EWSR14	chr22:29683125-29684593	22	29683125	29684593	1	1,469
EWSR1	EWSR15	chr22:29684777-29687552	22	29684777	29687552	1	2,776
EWSR1	EWSR16	chr22:29687590-29688124	22	29687590	29688124	1	535
EWSR1	EWSR17	chr22:29688160-29688475	22	29688160	29688475	1	316
EWSR1	EWSR18	chr22:29688597-29692227	22	29688597	29692227	1	3,631
EWSR1	EWSR19	chr22:29692360-29693815	22	29692360	29693815	1	1,456
RET	RET1	chr10:43604680-43606653	10	43604680	43606653	1	1,974
RET	RET2	chr10:43606915-43607545	10	43606915	43607545	1	631
RET	RET3	chr10:43607674-43608299	10	43607674	43608299	1	626
RET	RET4	chr10:43608413-43609002	10	43608413	43609002	1	590
RET	RET5	chr10:43609125-43609926	10	43609125	43609926	1	802
RET	RET6	chr10:43610186-43612030	10	43610186	43612030	1	1,845
ROS1	ROS11	chr6:117641195-117642420	6	117641195	117642420	1	1,226
ROS1	ROS12	chr6:117642559-117645493	6	117642559	117645493	1	2,935
ROS1	ROS13	chr6:117645580-117647385	6	117645580	117647385	1	1,806
ROS1	ROS14	chr6:117647579-117650490	6	117647579	117650490	1	2,912
ROS1	ROS15	chr6:117650611-117658333	6	117650611	117658333	1	7,723
TMPRSS2	TMPRSS21	chr21:42852531-42860319	21	42852531	42860319	1	7,789
TMPRSS2	TMPRSS22	chr21:42860442-42861432	21	42860442	42861432	1	991
TMPRSS2	TMPRSS23	chr21:42861522-42866281	21	42861522	42866281	1	4,760
TMPRSS2	TMPRSS24	chr21:42866507-42870044	21	42866507	42870044	1	3,538
TMPRSS2	TMPRSS25	chr21:42870118-42880006	21	42870118	42880006	1	9,889
Total						31	82,430
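The Interval and Size columns of Table 2 are related by simple arithmetic: each row is a single region, and its size equals End − Start + 1. The sketch below parses an interval string and recomputes the size for three rows copied from the table. The function names and the regular expression are illustrative assumptions. The same check does not apply to Table 1, where each gene's Size is the total of its target regions rather than the span of the whole interval.

```python
import re

# Matches interval strings such as "chr2:29445475-29446206".
_INTERVAL = re.compile(r"^(chr[0-9XY]+):(\d+)-(\d+)$")


def parse_interval(text):
    """Split 'chrN:start-end' into (chrom, start, end) with integer coordinates."""
    match = _INTERVAL.match(text.strip())
    if match is None:
        raise ValueError(f"not a valid interval: {text!r}")
    chrom, start, end = match.group(1), int(match.group(2)), int(match.group(3))
    if end < start:
        raise ValueError("interval end precedes start")
    return chrom, start, end


def interval_size(text):
    """Size in bp, treating both ends of the interval as inclusive."""
    _, start, end = parse_interval(text)
    return end - start + 1


# Three rows copied from Table 2: TargetID -> (Interval, listed Size).
ROWS = {
    "ALK1": ("chr2:29445475-29446206", 732),
    "RET1": ("chr10:43604680-43606653", 1974),
    "TMPRSS21": ("chr21:42852531-42860319", 7789),
}

if __name__ == "__main__":
    for target, (interval, listed) in ROWS.items():
        print(target, interval_size(interval), listed)
```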

The first, second, or third polynucleotide of the present invention may specifically bind to the sequence of the target gene. This specific binding property can be used to effectively separate target genes or fragments thereof from a mixture. Thus, the polynucleotide can be referred to as a probe. The term "probe" refers to a substance that specifically detects a particular substance, site, condition, and the like.

In one embodiment, the first polynucleotide may be for detecting a single nucleotide variation and/or an indel. In one embodiment, the second polynucleotide may be for detecting a single nucleotide variation, an indel, and/or a copy number variation. In one embodiment, the third polynucleotide may be for detecting a gene translocation. Therefore, because the composition of the present invention includes all of the first to third polynucleotides, it has the benefit that single nucleotide variations (SNV), indels, copy number variations (CNV), and translocations in the cancer cell genome can all be detected.

Specifically, the first polynucleotide may include one or more sequences selected from the group consisting of the nucleotide sequences of SEQ ID NOs: 1 to 16, and more specifically, may include all of the polynucleotides having the respective sequences of SEQ ID NOs: 1 to 16.

Specifically, the second polynucleotide may include one or more sequences selected from the group consisting of SEQ ID NOs: 17 to 7266, and more specifically, may include all of the polynucleotides having the respective sequences of SEQ ID NOs: 17 to 7266.

Specifically, the third polynucleotide may include one or more sequences selected from the group consisting of SEQ ID NOs: 7267 to 8102, and more specifically, may include all of the polynucleotides having the respective sequences of SEQ ID NOs: 7267 to 8102. The third polynucleotide has a length of 75 nucleotides or more (e.g., 120 nucleotides) and is designed to cover the intron regions of five genes (ALK, RET, EWSR1, ROS1, TMPRSS2), so that new regions and genes in which translocations have occurred can be detected.
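The SEQ ID NO ranges given above partition the probe set into three groups, and the number of sequences in each group follows directly from the range boundaries. The following sketch looks up the group for a given SEQ ID NO and counts the sequences per group; the dictionary and function names are hypothetical and serve only to illustrate the arithmetic.

```python
# SEQ ID NO ranges as stated in the text (both ends inclusive).
SEQ_ID_RANGES = {
    "first": (1, 16),       # TERT probes
    "second": (17, 7266),   # exon-region probes
    "third": (7267, 8102),  # intron-region probes
}


def probe_group(seq_id):
    """Return which polynucleotide group a SEQ ID NO belongs to."""
    for name, (low, high) in SEQ_ID_RANGES.items():
        if low <= seq_id <= high:
            return name
    raise ValueError(f"SEQ ID NO {seq_id} is outside the listed ranges")


def group_sizes():
    """Number of sequences in each group, from the inclusive ranges."""
    return {name: high - low + 1 for name, (low, high) in SEQ_ID_RANGES.items()}


if __name__ == "__main__":
    print(group_sizes())
    print(probe_group(17), probe_group(8102))
```

The ranges correspond to 16, 7,250, and 836 sequences respectively, or 8,102 in total.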

The first, second, or third polynucleotide may further include a moiety for isolation or purification of the polynucleotide. The moiety may be attached to one or more of the nucleotides that make up the polynucleotide. The moiety may comprise one or more selected from the group consisting of biotin, avidin, and streptavidin. In addition, the moiety, for example biotin, avidin, or streptavidin, may include magnetic beads, or a substance that specifically binds to the moiety may include magnetic beads. The separation or purification may be carried out by means of a substance that specifically binds to the moiety or by a magnetic field. In one embodiment, biotin was attached to one or more bases of the polynucleotide, the biotin-attached polynucleotide (probe) was hybridized with genomic DNA, streptavidin coated on magnetic beads was then bound to it, and the polynucleotides hybridized with genomic DNA were isolated using a magnetic field.

The cancer cells may be isolated from cancer patients. The cancer may be, for example, a solid cancer; specifically, it may be one or more selected from the group consisting of liver cancer, glioblastoma, ovarian cancer, colon cancer, head and neck cancer, bladder cancer, kidney cell cancer, gastric cancer, breast cancer, metastatic cancer, prostate cancer, pancreatic cancer, and lung cancer. However, the present invention is not limited thereto, and the composition is applicable to all carcinomas.

The composition may be for use in the search for an anticancer agent that is effective in reducing the viability of cancer cells. Reduction of the viability of the cancer cells is understood to include removal of cancer cells, inhibition or delay of metastasis or growth of cancer cells, and the like. The composition may also be for use in the search for an anticancer agent effective for treating cancer in a patient with cancer cells. The effective anticancer agent may mean an anticancer agent having excellent viability reduction or cancer treatment effect when compared with an anticancer agent selected without using the composition.

The composition may be in a liquid state. In addition, the liquid may be an aqueous solution. The composition may further comprise a buffer. The composition may be one containing first, second, and third polynucleotides in one container.

The composition for detecting mutations of the cancer cell genome according to the invention may be in the form of a kit. The kit comprises a first polynucleotide comprising a contiguous nucleotide sequence selected from the nucleotide sequence of the TERT gene, or a complementary polynucleotide thereof; a second polynucleotide comprising a contiguous nucleotide sequence selected from the nucleotide sequences of the exon region of each of the ABL1, AKT1, AKT2, AKT3, ALK, APC, ARID1A, ARID1B, ARID2, ATM, ATRX, AURKA, AURKB, BCL2, BRAF, BRCA1, BRCA2, CDH1, CDK4, CDK6, CDKN2A, CSF1R, CTNNB1, DDR2, EGFR, EPHB4, ERBB2, ERBB3, ERBB4, EZH2, FBXW7, FGFR1, FGFR2, FGFR3, FLT3, GNA11, GNAQ, GNAS, HNF1A, HRAS, IDH1, IDH2, IGF1R, ITK, JAK1, JAK2, JAK3, KDR, KIT, KRAS, MDM2, MET, MLH1, MPL, MTOR, NF1, NOTCH1, NPM1, NRAS, NTRK1, PDGFRA, PDGFRB, PIK3CA, PIK3R1, PTCH1, PTCH2, PTEN, PTPN11, RB1, RET, ROS1, SMAD4, SMARCB1, SMO, SRC, STK11, SYK, TOP1, TP53, and VHL genes, or a complementary polynucleotide thereof; and a third polynucleotide comprising a contiguous nucleotide sequence selected from the nucleotide sequences of the intron region of each of the ALK, RET, ROS1, EWSR1, and TMPRSS2 genes, or a complementary polynucleotide thereof. Each of these is as described above.

The kit may further comprise known materials required for the polynucleotide to hybridize with the genomic nucleic acid. For example, it may further include reagents, buffers, cofactors, and/or substrates required for hybridization of the nucleic acid. When the kit is used in a PCR amplification process, it may optionally include reagents required for PCR amplification, such as buffers, DNA polymerases, DNA polymerase cofactors, and dNTPs. When the kit is used in an immunoassay, it may optionally comprise a secondary antibody and a substrate of the label. In addition, the kit may further include instructions for amplifying the target nucleic acid, and may be manufactured as a number of separate packages or compartments containing the reagent components described above.

Another aspect provides a method of detecting mutations in genomic DNA of cancer cells using the composition. In one embodiment, the detection method comprises: contacting genomic DNA derived from a cancer cell with the composition for use in detecting a mutation of the genomic DNA of the cancer cell, thereby obtaining a hybridization product of the genomic DNA and the first, second, or third polynucleotide in the composition; identifying the nucleotide sequence of the genomic DNA in the hybridization product; and comparing the identified nucleotide sequence of the genomic DNA with a standard nucleotide sequence to identify the variation of the genomic DNA.
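The final comparing step can be illustrated in its simplest form: given a standard sequence and an observed sequence that have already been aligned to the same length, list the positions at which they differ. This is a deliberately reduced sketch and not the analysis pipeline of the described method; the function name `find_substitutions` and the `offset` parameter are assumptions. It reports substitutions only, since detecting indels, copy number variations, and translocations requires alignment-based or depth-based analysis that is not shown here.

```python
def find_substitutions(standard, observed, offset=1):
    """List single-base differences between two pre-aligned sequences.

    Returns (position, standard_base, observed_base) tuples. `offset` is the
    coordinate assigned to the first base, so the default gives 1-based
    positions. Both sequences must have the same length.
    """
    if len(standard) != len(observed):
        raise ValueError("sequences must be aligned to equal length")
    differences = []
    for index, (std_base, obs_base) in enumerate(zip(standard.upper(), observed.upper())):
        if std_base != obs_base:
            differences.append((offset + index, std_base, obs_base))
    return differences


if __name__ == "__main__":
    print(find_substitutions("ACGTACGT", "ACGTTCGT"))
```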

In the detection method, the composition is as described above.

The genome variation is as described above. The genome variation may specifically be one or more of a single nucleotide variation, an insertion-deletion variation, a copy number variation, and a translocation, and more specifically, may include all of a single nucleotide variation, an insertion-deletion variation, a copy number variation, and a translocation.

The cancer may for example be a solid cancer. Specifically, the cancer may be at least one selected from the group consisting of liver cancer, glioblastoma, ovarian cancer, colon cancer, head and neck cancer, bladder cancer, kidney cell cancer, gastric cancer, breast cancer, metastatic cancer, prostate cancer, pancreatic cancer and lung cancer.

The genomic DNA derived from the cancer cell may be genomic DNA, or a fragment thereof, isolated from a biological sample. For example, the sample may be one or more selected from the group consisting of blood, saliva, urine, feces, tissues, cells, and biopsies. The sample may be a stored biological sample or genomic DNA isolated therefrom, and the storage may be by a known method. The genomic DNA may be DNA or RNA derived from tissue kept in frozen storage or from formalin-fixed, paraffin-embedded tissue stored at room temperature. Methods of isolating genomic DNA from biological samples are well known. The cancer cell may be isolated from a cancer patient. Thus, the sample may be isolated from cells, tissues, organs, or body fluids of a cancer patient, in which case the sample may be obtained by biopsy using conventional methods, for example, methods well known to those skilled in the relevant medical art.

In one embodiment, the genomic DNA contained in the sample may be fragmented to any size. In addition, the method may further comprise fragmenting the genomic DNA derived from cancer cells before or during the contacting step. Such fragmentation can be carried out by methods well known to those skilled in the art. For example, genomic DNA can be fragmented by ultrasonication. After fragmentation, the detection method may include ligating a sequence for amplification to both ends of the fragmented genomic DNA. The ligation of the sequence for amplification (e.g., a paired-end tag or universal tag) can be performed by those skilled in the art by appropriately selecting a known technique.

In the detection method, the hybridization may be performed by a known method. For example, it can be performed by incubating the polynucleotides and genomic DNA in a buffer known to be suitable for hybridization of nucleic acids. Hybridization can be carried out at an appropriate temperature, for example, 40 to 80 °C, 50 to 75 °C, 60 to 70 °C, or 62 to 67 °C, specifically 65 °C. The hybridization temperature is not limited thereto, and may be appropriately selected according to the sequence and length of the polynucleotides included in the composition. The hybridization time can be, for example, 1 hour to 12 hours (overnight). In one embodiment, the polynucleotides included in the composition hybridize with fragments of genomic DNA having the sequence of the genes they target.

The method may further comprise separating the hybridization product of genomic DNA and the first, second, or third polynucleotide. In this step, the hybridization product is separated from the contact product obtained in the contacting step, before the nucleotide sequence of the genomic DNA in the hybridization product is identified. The separation may use a moiety for separation or purification attached to the polynucleotide, and may be carried out with a substance that specifically binds to the moiety or with a magnetic field. In one embodiment, magnetic beads coated with streptavidin were used to separate the hybridization product of genomic DNA and the biotin-labeled first, second, or third polynucleotide. The separation allows selective detection of genomic DNA hybridized with the polynucleotides. This may be called "target capture".

In addition, the separation may further include separating the genomic DNA from the hybridization product, that is, separating the hybridized polynucleotide and genomic DNA from each other. The genomic DNA can be isolated from the hybridization product, for example, by denaturation at high temperature followed by amplification using primers specific for the target DNA. The high temperature may be 80 to 110 °C, 90 to 100 °C, or 95 °C.

In addition, the detection method may further comprise amplifying the genomic DNA by PCR, using the genomic DNA as a template and a hybrid primer or universal primer complementary to the sequence for amplification attached to each genomic DNA fragment. The nucleotide sequence can then be identified using the amplified genomic DNA.

The nucleotide sequence can be identified, for example, by a sequencing method, specifically by next-generation sequencing. The term "next-generation sequencing (NGS)" refers to a technology that fragments the whole genome in a chip-based and PCR-based paired-end format and sequences the fragments at ultra-high speed on the basis of chemical hybridization. With next-generation sequencing, a large amount of sequence data can be generated for a sample in a short time.

The detection method includes comparing the identified nucleotide sequence of the genomic DNA with a standard nucleotide sequence. The term "standard nucleotide sequence" refers to a human genomic sequence that does not include a mutation and that serves as the reference for identifying mutations. For example, a human genome sequence published in a database of the National Center for Biotechnology Information (NCBI), specifically NCBI37.1 or UCSC hg19 (GRCh37), may be used as the standard sequence. The nucleotide sequence of the genomic DNA can be compared with the standard sequence using various known sequence comparison programs, for example, Maq, Bowtie, SOAP, and GSNAP.
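As a minimal sketch of the comparison step, the following Python fragment lists the positions at which an already-aligned sample sequence differs from the standard sequence. The function name `find_substitutions` and the toy sequences are illustrative assumptions; the aligners named above handle alignment, gaps, and base quality, all of which this sketch omits.

```python
def find_substitutions(reference: str, sample: str, start: int = 1):
    """Return (position, ref_base, alt_base) for every mismatch.

    Both sequences are assumed to be already aligned and of equal length;
    `start` is the coordinate of the first base.
    """
    if len(reference) != len(sample):
        raise ValueError("sequences must be aligned to the same length")
    return [
        (start + i, ref_base, alt_base)
        for i, (ref_base, alt_base) in enumerate(zip(reference.upper(), sample.upper()))
        if ref_base != alt_base
    ]


# Toy sequences (hypothetical, not taken from any real gene)
reference = "ACGTCCGTA"
sample = "ACGTTCGTA"
print(find_substitutions(reference, sample, start=101))
```

Each reported tuple corresponds to a candidate single nucleotide variation relative to the standard sequence.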

The detection method may further comprise comparing the copy number of genomic DNA in a particular region, which may change with cancer development and progression, with a level (control level) obtained using a standard reference. In addition, the detection method may further include determining that the copy number of the genomic DNA is increased when the amount of amplified genomic DNA is higher than the control level.
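One common way to express such a comparison is a log2 ratio of normalized read depth between the sample and the control. The sketch below assumes this approach; the source does not specify the normalization or the decision thresholds, so the function names, the depth values, and the `gain` and `loss` cut-offs are all hypothetical.

```python
import math


def log2_copy_ratio(sample_depth, control_depth, sample_total, control_total):
    """log2 ratio of library-size-normalized depth for one region."""
    sample_norm = sample_depth / sample_total
    control_norm = control_depth / control_total
    return math.log2(sample_norm / control_norm)


def classify(ratio, gain=0.58, loss=-1.0):
    """Label a region using hypothetical thresholds.

    0.58 is roughly log2(3/2) (three copies versus two),
    -1.0 is log2(1/2) (one copy versus two).
    """
    if ratio >= gain:
        return "increased"
    if ratio <= loss:
        return "decreased"
    return "unchanged"


# Hypothetical depths for one captured region, equal library sizes
ratio = log2_copy_ratio(4000, 1000, 1_000_000, 1_000_000)
print(round(ratio, 3), classify(ratio))
```

With equal library sizes, a fourfold depth difference gives a log2 ratio of 2 and the region is labeled as increased.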

The detection method includes identifying mutations in genomic DNA. The mutations may be identified using a known mutation detection program, for example, GATK, SAMtools, MoDIL, SegSeq, PEMer, VariationHunter, Pindel, BreakDancer, or MuTect, but is not limited thereto. In one embodiment, single nucleotide variations and indels were identified using the GATK-2.2.9 algorithm. CNVs were detected with an in-house program that compares the signal intensity of the cancer tissue specimen with that of a reference cell line and expresses the result as a relative value. Translocations were identified using the CIGAR algorithm after discordant reads were extracted separately during BAM file generation.

The step of identifying mutations may further include comparing the extracted mutation information with a previously constructed database of cancer-related genetic mutations to determine whether each mutation is known or newly discovered.

The detection method may further comprise confirming a correlation between the identified mutation of the genomic DNA and the therapeutic effect of an anticancer agent in the individual. Confirming the correlation may include identifying the mechanism of action of the anticancer agent and/or the target on which it acts. By confirming the correlation, the detection method enables identification of mutations in genomic DNA and the anticancer agents associated with them, and/or selection of anticancer agents. The detection method can therefore provide information for selecting a cancer treatment agent customized to the individual.

The term “individual” refers to any animal classified as a mammal that has or is suspected of having cancer, and includes livestock and farm animals, primates, and humans, e.g., humans, non-human primates, cattle, horses, pigs, sheep, goats, dogs, cats, or rodents. In particular, the subject is a human male or female of any age or race. "Subject" and "patient" are used interchangeably herein.

In the step of obtaining genomic DNA from the subject, the genomic DNA may be obtained from cancer tissue or cancer cells of the subject. The obtaining method may use a method known to those skilled in the art for separating genomic DNA from tissue or cells.

The method includes selecting, from a cancer drug database, a cancer drug associated with the mutation on the basis of the identified correlation. By analyzing the correlations among genetic information on cancer, clinical information on patients, mutation information on genomic DNA extracted from patients, and information on cancer drugs related to genome mutations, an algorithm for predicting the correlations among these data can be constructed. The constructed algorithm can also be used to derive patient-specific cancer therapeutics from a patient's genetic information and clinical information without additional in vitro or in vivo experiments. For example, the algorithm for selecting an individual cancer treatment agent from the genetic information data and the clinical information data may include: receiving the result of genome variation analysis of a sample; receiving clinical information data of the patient; selecting an individual cancer treatment agent from a cancer treatment DB based on the result of the genome variation analysis of the sample and the clinical information data; accumulating the genetic information data of the cancer, the clinical information data of the individual, and the corresponding data on the selected personalized anticancer agent; and analyzing the correlation among the accumulated genetic information data, clinical information data, and patient-specific anticancer drug data. In this case, the cancer treatment DB may be a database of correlations between known sequence variation information and cancer treatment agents. For example, in patients with non-small cell lung cancer (NSCLC), it is known that the therapeutic effect of gefitinib and erlotinib is increased when the SNV (L858R) in which the leucine at position 858 in exon 21 of the epidermal growth factor receptor (EGFR) gene is substituted with arginine, or the SNV (L861Q) in which the leucine at position 861 is substituted with glutamine, is detected (US 8,105,769). The algorithm can be updated by adding data, thereby improving the success rate of selecting patient-specific anticancer drugs.
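The selection and accumulation steps listed above can be sketched as a simple lookup. This is only an illustration under stated assumptions: the toy database contains just the EGFR example given in the text, and the names `DRUG_DB`, `select_agents`, and `accumulate`, as well as the patient identifier and the clinical fields, are hypothetical.

```python
# Toy cancer treatment DB: only the EGFR example from the text.
# A real database would hold many more variant-to-agent associations.
DRUG_DB = {
    ("EGFR", "L858R"): ["gefitinib", "erlotinib"],
    ("EGFR", "L861Q"): ["gefitinib", "erlotinib"],
}


def select_agents(variants, drug_db=DRUG_DB):
    """Map each detected (gene, change) pair to candidate agents.

    Variants absent from the database map to an empty list.
    """
    return {variant: list(drug_db.get(variant, [])) for variant in variants}


def accumulate(history, patient_id, variants, clinical, selection):
    """Store one patient's data so that correlations can be analyzed later."""
    history[patient_id] = {
        "variants": list(variants),
        "clinical": dict(clinical),
        "selection": selection,
    }
    return history


# Hypothetical patient record
detected = [("EGFR", "L858R")]
selection = select_agents(detected)
history = accumulate({}, "patient-001", detected, {"diagnosis": "NSCLC"}, selection)
print(selection)
```

Keeping the accumulated records separate from the lookup table reflects the idea in the text that the database can be updated as more data are added.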

The composition according to one aspect can detect various genome variations in major cancer-related genes in a single experiment, and is therefore more economical and efficient than conventional methods that use a different platform for each type of genome variation. In addition, sequencing the major genes involved in the induction and progression of cancer at high coverage provides high resolution, enabling identification of low-frequency genome variations that were not detected by conventional methods. Therefore, the composition according to one aspect makes it possible to analyze various genetic variations in a sample containing the genome of cancer cells simultaneously with high sensitivity and accuracy, and to search efficiently for a patient-specific cancer treatment based on the analysis results.

FIG. 1 is a diagram illustrating a method of analyzing nucleotide variation in a cancer sample using the composition according to one aspect.

FIG. 2 shows the genes that the composition according to one aspect can capture, grouped by type of mutation.

FIG. 3A shows some of the single nucleotide variations detected through NGS using the composition according to one aspect, and FIG. 3B shows the result of detecting an SNV in which cytosine (C) in an exon of the EGFR gene is substituted with thymine (T) using the composition according to one aspect.

FIG. 4A is a table summarizing some of the results of detecting insertion-deletion variations through NGS using the composition according to one aspect, and FIG. 4B shows an insertion-deletion variation detected using the composition according to one aspect.

FIG. 5A shows the result of detecting copy number variation through NGS using the composition according to one aspect, and FIG. 5B is a table summarizing some of the results of detecting copy number variation through NGS using the composition according to one aspect.

FIG. 6A shows the result of detecting a gene translocation through NGS using the composition according to one aspect, and FIG. 6B is a table listing some of the results of detecting gene translocations through NGS using the probe composition according to one aspect.

FIG. 7 is a graph showing the sensitivity of single nucleotide variation detection using the composition according to one aspect.

Hereinafter, the present invention will be described in more detail with reference to Examples. However, these examples are for illustrative purposes only, and the scope of the present invention is not limited by these examples.

Example 1. Selection of Genes for Mutation Detection Targets

Using cancer-related databases (e.g., My Cancer Genome (http://www.mycancergenome.org) and The Cancer Genome Atlas (TCGA) (http://cancergenome.nih.gov)) and prior literature, an optimal set of hotspot genes frequently mutated in major cancers was selected so that variations in major cancer genes could be detected at an economical cost and with maximal detection efficiency. As a result, ABL1, AKT1, AKT2, AKT3, ALK, APC, ARID1A, ARID1B, ARID2, ATM, ATRX, AURKA, AURKB, BCL2, BRAF, BRCA1, BRCA2, CDH1, CDK4, CDK6, CDKN2A, CSF1R, CTNNB1, DDR2, EGFR, EPHB4, ERBB2, ERBB3, ERBB4, EZH2, FBXW7, FGFR1, FGFR2, FGFR3, FLT3, GNA11, GNAQ, GNAS, HNF1A, HRAS, IDH1, IDH2, IGF1R, ITK, JAK1, JAK2, JAK3, KDR, KIT, KRAS, MDM2, MET, MLH1, MPL, MTOR, NF1, NOTCH1, NPM1, NRAS, NTRK1, PDGFRA, PDGFRB, PIK3CA, PIK3R1, PTCH1, PTCH2, PTEN, PTPN11, RB1, RET, ROS1, SMAD4, SMARCB1, SMO, SRC, STK11, SYK, TOP1, TP53, and VHL were selected, and the TERT gene was additionally selected for detecting SNPs and indels. ALK, RET, ROS1, EWSR1 and TMPRSS2 were selected as genes for detecting chromosomal translocation.

Example 2. Probe Preparation for Cancer Sample Analysis

Sixteen polynucleotides (SEQ ID NOS: 1-16) capable of detecting the promoter region of the TERT gene were constructed (Agilent, Santa Clara, USA; Table 3). Each polynucleotide is 120 bp in length, and two polynucleotides with adjacent SEQ ID NOs overlap by 80 bp (for example, the 80 bases at the 3' end of SEQ ID NO: 1 and the 80 bases at the 5' end of SEQ ID NO: 2 are identical). Each of the 16 polynucleotides hybridizes with a portion of the promoter region of the TERT gene, and together they were designed to cover the entire sequence of the promoter region. The produced polynucleotides are single-stranded RNA and include nucleotides of the promoter region of the DNA strand from which the TERT gene is transcribed.

In addition, polynucleotides capable of detecting the 80 genes ABL1, AKT1, AKT2, AKT3, ALK, APC, ARID1A, ARID1B, ARID2, ATM, ATRX, AURKA, AURKB, BCL2, BRAF, BRCA1, BRCA2, CDH1, CDK4, CDK6, CDKN2A, CSF1R, CTNNB1, DDR2, EGFR, EPHB4, ERBB2, ERBB3, ERBB4, EZH2, FBXW7, FGFR1, FGFR2, FGFR3, FLT3, GNA11, GNAQ, GNAS, HNF1A, HRAS, IDH1, IDH2, IGF1R, ITK, JAK1, JAK2, JAK3, KDR, KIT, KRAS, MDM2, MET, MLH1, MPL, MTOR, NF1, NOTCH1, NPM1, NRAS, NTRK1, PDGFRA, PDGFRB, PIK3CA, PIK3R1, PTCH1, PTCH2, PTEN, PTPN11, RB1, RET, ROS1, SMAD4, SMARCB1, SMO, SRC, STK11, SYK, TOP1, TP53, and VHL, together with the ALK, RET, ROS1, EWSR1 and TMPRSS2 genes targeted for translocation, were constructed (Agilent, Santa Clara, USA). Specific information on each gene used to produce the polynucleotides is shown in Table 1 above. For the 80 genes listed above, polynucleotides were prepared based on the sequence of the exon region of each gene.

In addition, in the case of ALK, RET, ROS1, EWSR1 and TMPRSS2, polynucleotides were prepared based on the sequence of the intron region. Specific information of each gene for producing the polynucleotide is shown in Table 2 above.

Each polynucleotide is 120 bp in length, as described above for TERT, and the polynucleotides were designed so that two polynucleotides with adjacent SEQ ID NOs overlap by 80 bp and together cover the entirety of each gene sequence. Because adjacent 120 bp polynucleotides are therefore offset by 40 bp, each nucleotide in a gene is included in three polynucleotides. The produced polynucleotides are single-stranded RNA and contain nucleotides of the DNA strand (antisense DNA) from which each gene is transcribed. The SEQ ID NOs of the polynucleotides produced for each gene are summarized in Table 3 below.
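The tiling scheme described above (120-base probes, 80-base overlap, hence a 40-base step) can be sketched as follows. The 720-base region length is a hypothetical value chosen only for illustration, and the function names are assumptions; the actual target coordinates are those of Tables 1 and 2.

```python
def tile_probes(region_length, probe_length=120, overlap=80):
    """Return half-open (start, end) intervals of overlapping probes."""
    if region_length < probe_length:
        raise ValueError("region shorter than one probe")
    step = probe_length - overlap  # 40 bases for the values in the text
    starts = list(range(0, region_length - probe_length + 1, step))
    # Anchor one extra probe at the end if the tail would be left uncovered.
    if starts[-1] + probe_length < region_length:
        starts.append(region_length - probe_length)
    return [(start, start + probe_length) for start in starts]


def coverage(region_length, probes):
    """Count how many probes include each base of the region."""
    depth = [0] * region_length
    for start, end in probes:
        for i in range(start, end):
            depth[i] += 1
    return depth


# Hypothetical 720-base target region
probes = tile_probes(720)
depth = coverage(720, probes)
print(len(probes), max(depth), depth[0], depth[100])
```

Away from the two ends of the region, every base falls inside three probes, which matches the statement that each nucleotide is included in three polynucleotides; the first and last 80 bases of a region are covered by fewer probes in this simplified sketch.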

Table 3. Summary of probe SEQ ID NOs by target gene

Gene a)	Start number b)	End number	Detectable mutation c)
TERT	1	16	SNV, Indel
ABL1	17	114	SNV, Indel, CNV
AKT1	115	161	SNV, Indel, CNV
AKT2	162	217	SNV, Indel, CNV
AKT3	218	266	SNV, Indel, CNV
ALK	267	413	SNV, Indel, CNV
APC	414	638	SNV, Indel, CNV
ARID1A	639	804	SNV, Indel, CNV
ARID1B	805	974	SNV, Indel, CNV
ARID2	975	1126	SNV, Indel, CNV
ATM	1127	1392	SNV, Indel, CNV
ATRX	1393	1607	SNV, Indel, CNV
AURKA	1608	1638	SNV, Indel, CNV
AURKB	1639	1669	SNV, Indel, CNV
BCL2	1670	1687	SNV, Indel, CNV
BRAF	1688	1766	SNV, Indel, CNV
BRCA1	1767	1924	SNV, Indel, CNV
BRCA2	1925	2185	SNV, Indel, CNV
CDH1	2186	2258	SNV, Indel, CNV
CDK4	2259	2284	SNV, Indel, CNV
CDK6	2285	2313	SNV, Indel, CNV
CDKN2A	2314	2340	SNV, Indel, CNV
CSF1R	2341	2430	SNV, Indel, CNV
CTNNB1	2431	2493	SNV, Indel, CNV
DDR2	2494	2565	SNV, Indel, CNV
EGFR	2566	2690	SNV, Indel, CNV
EPHB4	2691	2769	SNV, Indel, CNV
ERBB2	2770	2879	SNV, Indel, CNV
ERBB3	2880	3000	SNV, Indel, CNV
ERBB4	3001	3117	SNV, Indel, CNV
EZH2	3118	3195	SNV, Indel, CNV
FBXW7	3196	3262	SNV, Indel, CNV
FGFR1	3263	3348	SNV, Indel, CNV
FGFR2	3349	3440	SNV, Indel, CNV
FGFR3	3441	3521	SNV, Indel, CNV
FLT3	3522	3620	SNV, Indel, CNV
GNA11	3621	3650	SNV, Indel, CNV
GNAQ	3651	3682	SNV, Indel, CNV
GNAS	3683	3796	SNV, Indel, CNV
HNF1A	3797	3857	SNV, Indel, CNV
HRAS	3858	3878	SNV, Indel, CNV
IDH1	3879	3915	SNV, Indel, CNV
IDH2	3916	3959	SNV, Indel, CNV
IGF1R	3960	4067	SNV, Indel, CNV
ITK	4068	4128	SNV, Indel, CNV
JAK1	4129	4231	SNV, Indel, CNV
JAK2	4232	4331	SNV, Indel, CNV
JAK3	4332	4432	SNV, Indel, CNV
KDR	4433	4557	SNV, Indel, CNV
KIT	4558	4647	SNV, Indel, CNV
KRAS	4648	4669	SNV, Indel, CNV
MDM2	4670	4721	SNV, Indel, CNV
MET	4722	4835	SNV, Indel, CNV
MLH1	4836	4912	SNV, Indel, CNV
MPL	4913	4965	SNV, Indel, CNV
MTOR	4966	5206	SNV, Indel, CNV
NF1	5207	5462	SNV, Indel, CNV
NOTCH1	5463	5659	SNV, Indel, CNV
NPM1	5660	5696	SNV, Indel, CNV
NRAS	5697	5714	SNV, Indel, CNV
NTRK1	5715	5792	SNV, Indel, CNV
PDGFRA	5793	5890	SNV, Indel, CNV
PDGFRB	5891	5984	SNV, Indel, CNV
PIK3CA	5985	6074	SNV, Indel, CNV
PIK3R1	6075	6145	SNV, Indel, CNV
PTCH1	6146	6271	SNV, Indel, CNV
PTCH2	6272	6374	SNV, Indel, CNV
PTEN	6375	6407	SNV, Indel, CNV
PTPN11	6408	6470	SNV, Indel, CNV
RB1	6471	6566	SNV, Indel, CNV
RET	6567	6655	SNV, Indel, CNV
ROS1	6656	6847	SNV, Indel, CNV
SMAD4	6848	6896	SNV, Indel, CNV
SMARCB1	6897	6930	SNV, Indel, CNV
SMO	6931	6990	SNV, Indel, CNV
SRC	6991	7039	SNV, Indel, CNV
STK11	7040	7073	SNV, Indel, CNV
SYK	7074	7127	SNV, Indel, CNV
TOP1	7128	7205	SNV, Indel, CNV
TP53	7206	7250	SNV, Indel, CNV
VHL	7251	7266	SNV, Indel, CNV
ALK	7267	7278	Translocation
ALK	7279	7308	Translocation
ALK	7309	7324	Translocation
EWSR1	7325	7377	Translocation
EWSR1	7378	7384	Translocation
EWSR1	7385	7389	Translocation
EWSR1	7390	7393	Translocation
EWSR1	7394	7442	Translocation
EWSR1	7443	7459	Translocation
EWSR1	7460	7505	Translocation
EWSR1	7506	7513	Translocation
EWSR1	7514	7518	Translocation
EWSR1	7519	7541	Translocation
EWSR1	7542	7550	Translocation
RET	7551	7582	Translocation
RET	7583	7592	Translocation
RET	7593	7602	Translocation
RET	7603	7611	Translocation
RET	7612	7622	Translocation
RET	7623	7652	Translocation
ROS1	7653	7671	Translocation
ROS1	7672	7715	Translocation
ROS1	7716	7740	Translocation
ROS1	7741	7782	Translocation
ROS1	7783	7793	Translocation
TMPRSS2	7794	7849	Translocation
TMPRSS2	7850	7860	Translocation
TMPRSS2	7861	7917	Translocation
TMPRSS2	7918	7955	Translocation
TMPRSS2	7956	8102	Translocation

a) Gene: the target gene to which each probe binds.

b) Start number and end number: the first and last SEQ ID NOs of the probes for that gene.

c) Detectable mutation: the variation that can be detected by the probes.
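Because each row of Table 3 is a contiguous range of SEQ ID NOs, a probe number can be mapped back to its target gene with a binary search. The sketch below copies only four rows from the table for brevity; the names `ROWS`, `lookup`, and `probe_count` are illustrative assumptions.

```python
from bisect import bisect_right

# (start SEQ ID NO, end SEQ ID NO, gene, detectable variation)
# Excerpt of four rows from Table 3, sorted by start number.
ROWS = [
    (1, 16, "TERT", "SNV, Indel"),
    (17, 114, "ABL1", "SNV, Indel, CNV"),
    (2566, 2690, "EGFR", "SNV, Indel, CNV"),
    (7267, 7278, "ALK", "Translocation"),
]
STARTS = [row[0] for row in ROWS]


def lookup(seq_id):
    """Return (gene, variation) for a SEQ ID NO, or None if not in the excerpt."""
    i = bisect_right(STARTS, seq_id) - 1
    if i >= 0 and ROWS[i][0] <= seq_id <= ROWS[i][1]:
        return ROWS[i][2], ROWS[i][3]
    return None


def probe_count(start, end):
    """Number of probes in an inclusive SEQ ID NO range."""
    return end - start + 1


print(lookup(2600))        # a probe within the EGFR exon range
print(probe_count(1, 16))  # the TERT promoter probes
```

The TERT range of 1 to 16 yields 16 probes, consistent with the sixteen polynucleotides described in Example 2.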

Example 3. Detection of Mutations in Samples Using the Prepared Probes

3-1. Target Capture and Library Preparation

Genomic DNA for the NGS experiments was isolated from samples derived from various cancer patients (tissue, blood, FFPE, FNA, etc.) using the QIAamp DNA Mini Kit (Qiagen, Valencia, CA, USA). The concentration, purity, and degradation of the isolated genomic DNA were then determined using a Nanodrop 8000 UV-Vis spectrometer (Thermo Scientific Inc., DE, USA), a Qubit 2.0 Fluorometer (Life Technologies Inc., Grand Island, NY, USA), and a 2200 TapeStation Instrument (Agilent Technologies, Santa Clara, CA, USA). Samples meeting the QC criteria were used in the next step of the experiment.

Genomic DNA obtained from each tissue (~250 ng) was sheared using a Covaris S220 (Covaris, MA, USA), and a sequencing library was then prepared by end-repair, A-tailing, paired-end adapter ligation, and amplification. The library was hybridized at 65 °C for 24 hours with a composition containing all of the polynucleotides prepared to capture the 83 genomic regions selected in Example 1, and the genomic DNA library fragments captured by hybridization were purified. Purification took advantage of the binding between streptavidin and the biotin attached to the polynucleotides. Specifically, streptavidin-coated magnetic beads were bound to the biotin attached to the captured library fragments, and the captured fragments were then separated from the mixture using magnetic force. The purified genomic DNA library fragments were then amplified by PCR with primers containing an index barcode tag under the following conditions.

Step	Temperature	Time
1	98 °C	45 sec
2	98 °C	15 sec
3	60 °C	30 sec
4	72 °C	30 sec
5	Repeat steps 2 through 4 for a total of 13 cycles
6	72 °C	5 min
7	4 °C	Hold
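As a small worked calculation, the programmed hold time of this PCR profile can be added up as follows. The variable names are illustrative, and the total excludes instrument ramp times and the final 4 °C hold, which the table does not quantify.

```python
# Durations in seconds, taken from the cycling table above
INITIAL_DENATURATION = 45
CYCLE_STEPS = [15, 30, 30]   # steps 2, 3 and 4
CYCLES = 13
FINAL_EXTENSION = 5 * 60

total_seconds = INITIAL_DENATURATION + CYCLES * sum(CYCLE_STEPS) + FINAL_EXTENSION
print(total_seconds, total_seconds / 60)  # 1320 seconds, i.e. 22.0 minutes
```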

3-2. Sequencing

The library fragments captured in Example 3-1 were loaded onto an NGS sequencer (MiSeq, Illumina, USA) to obtain the sequence of each DNA fragment, and the reads were aligned to obtain sequence information for each gene in the cancer sample. Sequencing reactions were performed using the TruSeq Rapid PE Cluster Kit and TruSeq Rapid SBS Kit (Illumina, USA) under 100 bp paired-end conditions (FIG. 1).

3-3. Variant Calling

The sequencing reads obtained in Example 3-2 were aligned to the UCSC hg19 reference genome (http://genome.ucsc.edu) using the Burrows-Wheeler Aligner (BWA) algorithm. PCR duplicates were removed using Picard-tools-1.8 (http://picard.sourceforge.net/), and single nucleotide variations (SNVs) and insertion-deletion variations (indels) were identified using the GATK-2.2.9 algorithm (see FIGS. 3 and 4). CNVs were detected with an in-house program that compares the cancer sample with a reference cell line and expresses the difference as a relative value (see FIG. 5). Translocations were identified by extracting discordant reads separately while generating the BAM file, identifying possible fusion pairs using the CIGAR algorithm, and then removing false-positive calls to obtain the final translocation information.
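The source does not describe the in-house translocation procedure in detail. As a generic illustration of how CIGAR strings from SAM/BAM alignments can hint at a breakpoint, the sketch below flags reads with a long soft-clipped end. The function names and the `min_clip` threshold of 20 bases are assumptions, not the method used in this example.

```python
import re

# One CIGAR operation: a length followed by an operation code (SAM format)
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def parse_cigar(cigar):
    """Split a CIGAR string into (length, op) pairs; '*' means unavailable."""
    if cigar == "*":
        return []
    ops = [(int(n), op) for n, op in CIGAR_OP.findall(cigar)]
    if "".join(f"{n}{op}" for n, op in ops) != cigar:
        raise ValueError(f"malformed CIGAR string: {cigar!r}")
    return ops


def soft_clipped_bases(cigar):
    """Return (left, right) soft-clip lengths, ignoring hard clips."""
    ops = [(n, op) for n, op in parse_cigar(cigar) if op != "H"]
    left = ops[0][0] if ops and ops[0][1] == "S" else 0
    right = ops[-1][0] if len(ops) > 1 and ops[-1][1] == "S" else 0
    return left, right


def is_breakpoint_candidate(cigar, min_clip=20):
    """True if either end of the read has at least min_clip soft-clipped bases."""
    return max(soft_clipped_bases(cigar)) >= min_clip


for cigar in ["100M", "60M40S", "35S65M", "95M5S"]:
    print(cigar, soft_clipped_bases(cigar), is_breakpoint_candidate(cigar))
```

In practice such candidates would still need to be paired with mate information and filtered for false positives, as the text notes.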

FIG. 3A is a table showing some of the single nucleotide variations detected through NGS using the composition according to one aspect. It lists the gene (Gene), function (Function), variation type (Variation type), exon number where the variation occurred (Exon), amino acid change (Amino acid change), SNP database record (dbSNP), chromosome number where the variation occurred (Chromosome), position (Position), reference nucleotide sequence (Reference), altered nucleotide sequence (Alteration), and frequency (VAF). FIG. 3B is an enlarged IGV viewer image of positions 55,249,044 to 55,249,097 of chromosome 7 in the EGFR region, obtained using the probe composition according to the present invention, and shows the detection of an SNV in which cytosine (C) at position 55,249,072 is substituted with thymine (T).

FIG. 4A is a table showing some of the results of detecting insertion-deletion variations through NGS using the composition according to one aspect. It lists the gene (Gene), function (Function), variation type (Variation type), exon number where the variation occurred (Exon), amino acid change (Amino acid change), SNP database record (dbSNP), chromosome number where the variation occurred (Chromosome), position (Position), reference nucleotide sequence (Reference), altered nucleotide sequence (Alteration), and frequency (VAF). FIG. 4B is an enlarged IGV viewer image of positions 55,242,364 to 55,242,580 of chromosome 7 in the EGFR region, obtained using the probe composition according to the present invention, and shows a deletion of 15 nucleotides from 55,242,465 to 55,242,480.

FIG. 5A shows the result of detecting copy number variation (CNV) through NGS using the composition according to one aspect after the tumor tissue content (tumor purity) was adjusted to 100%, 50%, and 30% by mixing in normal tissue. The copy number changes of CDK4 and MDM2 were detected even at a tumor content as low as 30%. FIG. 5B is a table summarizing some of the results of detecting copy number variation through NGS in actual tumor tissue samples using the composition according to one aspect; the copy number changes of CDK4 and MDM2 were well detected.

FIG. 6A shows the result of detecting a gene translocation through NGS using the composition according to one aspect, and FIG. 6B is a table summarizing some of the results of detecting gene translocations through NGS using the probe composition according to the present invention. The probe sequences covering the intron of ALK accurately detected the region in which the translocation joining ALK to EML occurred.

3-4. Sensitivity Measurement

For SNVs, sensitivity and accuracy were measured against pooled samples at each frequency using 20 International HapMap samples with known variation information. As a result, the sensitivity was 99.72%, and the agreement between the measured and expected variation frequencies showed a high accuracy of 99.43 (Table 5).

FIG. 7 is a graph showing the sensitivity of SNV detection using the composition according to one aspect. When sequencing data at 1000x depth are produced, variations present at a frequency of 5% or more are detected with a sensitivity of 99% or more.
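The relationship between depth and detectable frequency can be illustrated with a simple binomial model: at 1000x depth, a variant at 5% allele frequency is expected to appear in about 50 reads. The sketch below computes the probability of observing at least a given number of variant reads. The model ignores sequencing errors and capture bias, and the threshold of 10 variant reads is a hypothetical calling criterion, not one stated in the source.

```python
from math import comb


def prob_at_least(k, depth, vaf):
    """P(X >= k) for X ~ Binomial(depth, vaf): chance of seeing >= k variant reads."""
    below = sum(
        comb(depth, i) * vaf**i * (1 - vaf) ** (depth - i)
        for i in range(k)
    )
    return 1.0 - below


MIN_ALT_READS = 10  # hypothetical calling threshold

for depth in (100, 1000):
    print(depth, prob_at_least(MIN_ALT_READS, depth, 0.05))
```

Under this simplified model the chance of reaching the threshold is low at 100x and very close to 1 at 1000x, which is in line with the trend shown in FIG. 7.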

For InDel mutations, 28 cancer cell line samples with known mutation information were used to measure sensitivity and accuracy compared to pooled samples at each frequency. As a result, it was confirmed that the sensitivity is 99.55%, and the positive prediction value (PPV) has a high accuracy of 96.36 (Table 5).

For CNV variability, sensitivity and accuracy were compared to pooled samples at each frequency using four cancer cell lines and normal paired samples with known mutation information. As a result, it was confirmed that a sensitivity of 100.0% was observed in a sample having a tumor volume of 30% or more, and PPV (positive prediction value) showed a high accuracy of 75.0% for amplification and 100.0% for deletion. Table 4).

For translocations, sensitivity was measured for four known translocations using mixed samples of four cancer cell lines and normal cell lines with known variant information. The measured sensitivity was high, at 96.9% (Table 5).
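The sensitivity and PPV figures reported above follow the standard definitions, which can be sketched as follows. The variant identifiers and counts in this example are hypothetical and serve only to show how called variants are compared against known variants.

```python
def evaluate_calls(known: set, called: set) -> dict:
    """Compare called variants with known variants.

    sensitivity = TP / (TP + FN), PPV = TP / (TP + FP).
    Returns NaN for a metric whose denominator is zero.
    """
    true_pos = len(known & called)    # known variants that were called
    false_neg = len(known - called)   # known variants that were missed
    false_pos = len(called - known)   # calls not in the known set

    def ratio(num: int, den: int) -> float:
        return num / den if den else float("nan")

    return {
        "sensitivity": ratio(true_pos, true_pos + false_neg),
        "ppv": ratio(true_pos, true_pos + false_pos),
    }


# Hypothetical variants written as (chromosome, position, ref, alt).
known = {("chr1", 100, "A", "T"), ("chr2", 200, "G", "C"),
         ("chr3", 300, "C", "A"), ("chr4", 400, "T", "G")}
called = {("chr1", 100, "A", "T"), ("chr2", 200, "G", "C"),
          ("chr3", 300, "C", "A"), ("chr5", 500, "G", "A")}
print(evaluate_calls(known, called))
```

In this made-up case three of four known variants are recovered and one call is unsupported, so both metrics equal 0.75.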

Sensitivity analysis results for each variation type using the composition for detecting genomic variation according to one aspect

Variation	Sensitivity	Variation	Sensitivity
SNV	99.72%	CNV	100%
Indel	99.55%	Translocation	96.9%

Claims

A composition for use in detecting mutations in genomic DNA of cancer cells, comprising:

A first polynucleotide comprising a contiguous nucleotide sequence selected from the nucleotide sequence of the TERT gene, or a complementary polynucleotide thereof;

A second polynucleotide comprising a contiguous nucleotide sequence selected from the nucleotide sequences of the exon region of each of the ABL1, AKT1, AKT2, AKT3, ALK, APC, ARID1A, ARID1B, ARID2, ATM, ATRX, AURKA, AURKB, BCL2, BRAF, BRCA1, BRCA2, CDH1, CDK4, CDK6, CDKN2A, CSF1R, CTNNBEGFR, DDR EPHB4, ERBB2, ERBB3, ERBB4, EZH2, FBXW7, FGFR1, FGFR2, FGFR3, FLT3, GNA11, GNAQ, GNAS, HNF1A, HRAS, IDH1, IDH2, IGF1R, ITK, JAK1, JAK2, JAK3, JAK2, MDM2, MET, MLH1, MPL, MTOR, NF1, NOTCH1, NPM1, NRAS, NTRK1, PDGFRA, PDGFRB, PIK3CA, PIK3R1, PTCH1, PTCH2, PTEN, PTPN11, RB1, RET, ROS1, SMAD4, SMARCB1, SMO, SMORC, STK11, SYK, TOP1, TP53, and VHL genes, or a complementary polynucleotide thereof; and

A third polynucleotide comprising a contiguous nucleotide sequence selected from the nucleotide sequences of the intron region of each of the ALK, RET, ROS1, EWSR1 and TMPRSS2 genes, or a complementary polynucleotide thereof.
The composition of claim 1, wherein the variation comprises a variation in the copy number of the gene or a variation in the nucleotide sequence relative to standard genomic DNA.
The composition of claim 2, wherein the variation in nucleotide sequence comprises substitution, insertion, deletion, or translocation of one or more nucleotide sequences relative to standard genomic DNA.
The composition of claim 1, wherein the first polynucleotide comprises a contiguous nucleotide sequence selected from the nucleotide sequence of the promoter region of the TERT gene.
The composition of claim 1, wherein the first, second, or third polynucleotide is at least one selected from DNA, RNA, peptide nucleic acid (PNA), locked nucleic acid (LNA), zip nucleic acid (ZNA), bridged nucleic acid (BNA), and nucleotide analogues.
The composition of claim 1, wherein the first, second, or third polynucleotide is 75 to 200 nucleotides in length.
The composition of claim 1, wherein the first polynucleotide comprises one or more sequences selected from the group consisting of nucleotide sequences of SEQ ID NOs: 1-16.
The composition of claim 1, wherein the second polynucleotide comprises one or more sequences selected from the group consisting of nucleotide sequences of SEQ ID NOs: 17-7266.
The composition of claim 1, wherein the third polynucleotide comprises one or more sequences selected from the group consisting of nucleotide sequences of SEQ ID NOs: 7267-8102.
The composition of claim 1, wherein the first, second, or third polynucleotide is attached with a moiety for isolation or purification of the polynucleotide.
The composition of claim 1, for use in the search for an anticancer agent effective for reducing the viability of the cancer cell.
The composition of claim 1, for use in the search for an anticancer agent effective for treating cancer in a cancer patient with cancer cells.
A method for detecting mutations in genomic DNA of cancer cells, comprising:

Contacting genomic DNA derived from a cancer cell with a composition according to any one of claims 1 to 12 to obtain a hybridization product of said genomic DNA with said first, second, or third polynucleotide in said composition;

Identifying the nucleotide sequence of the genomic DNA in the hybridization product; and

Comparing the identified nucleotide sequence of the genomic DNA with a standard nucleotide sequence to identify the mutation of the genomic DNA.
The method of claim 13, wherein the genomic variation comprises a variation in the copy number of the gene or a variation in the nucleotide sequence relative to standard genomic DNA.
The method of claim 13, wherein the genomic variation comprises one or more selected from the group consisting of single nucleotide variation, insertion/deletion (indel) variation, copy number variation, and translocation.
The method of claim 13, wherein the cancer is one or more selected from the group consisting of liver cancer, glioblastoma, ovarian cancer, colon cancer, head and neck cancer, bladder cancer, kidney cell cancer, gastric cancer, breast cancer, metastatic cancer, prostate cancer, pancreatic cancer, and lung cancer.
The method of claim 13, further comprising fragmenting genomic DNA from cancer cells prior to or in the contacting step.
The method of claim 13, further comprising, prior to the step of identifying the nucleotide sequence of the genomic DNA in the hybridization product, separating the hybridization product of the genomic DNA and the first, second, or third polynucleotide from the contact product obtained in the contacting step.
The method of claim 13, further comprising the step of identifying a correlation between the identified variation of the genomic DNA and the cancer therapeutic effect of an anticancer agent in the subject.
The method of claim 19, wherein identifying the correlation comprises identifying a mechanism of action or a target of action of the anticancer agent.